Cassandra installation and simple trial

Source: Internet
Author: User
Tags: cassandra

Official homepage:
http://cassandra.apache.org/

Introduction:
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
Cassandra was open sourced by Facebook in 2008, and is now developed by Apache committers and contributors from many companies.

Apache Cassandra is an open-source distributed NoSQL database system. It was initially developed by Facebook to store inbox data and other simply formatted data, and it combines the data model of Google Bigtable with the fully distributed architecture of Amazon Dynamo. Facebook open sourced Cassandra in 2008. Since then, because of its good scalability, Cassandra has been adopted by well-known Web 2.0 sites such as Digg and Twitter and has become a popular distributed structured data store.
Architecture
Cassandra uses the data model of Google Bigtable. Unlike traditional row-oriented relational databases, Cassandra is a column-oriented database in which columns are organized into column families, so adding a column is very convenient. For search and general structured data storage, this structure is rich and effective.
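The column-family model can be pictured as nested maps. The sketch below is only an illustration using plain Python dictionaries, not Cassandra's storage engine. The names users, author, and tester are borrowed from the examples later in this article, and the email column is hypothetical.

```python
# Illustrative only: a column family modeled as row key -> {column name: value}.
users = {
    "author": {"nick_name": "preftest", "age": "31"},
}

# Adding a column needs no schema change: just write it to the row.
users["author"]["email"] = "author@example.com"  # hypothetical column

# Rows in the same column family may carry different sets of columns.
users["tester"] = {"nick_name": "performance"}

for row_key in sorted(users):
    print(row_key, sorted(users[row_key]))
```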
Cassandra's system architecture follows Dynamo's. It is a fully peer-to-peer architecture based on an O(1) DHT (distributed hash table). Compared with traditional sharding-based database clusters, Cassandra can add or remove nodes almost seamlessly, which suits scenarios where the number of nodes changes quickly.
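To see why a hash ring lets nodes join with little disruption, here is a toy sketch in Python. It is not Cassandra's partitioner: the use of MD5 for tokens, the Ring class, and the first two node addresses are assumptions made for illustration, and lookups here use a binary search rather than a true O(1) scheme.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy hash ring: each key belongs to the first node at or after its token."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted node tokens
        self._owners = {}   # token -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def owner(self, key):
        i = bisect.bisect_left(self._tokens, token(key))
        # Wrap around to the first node when the key's token is past the last one.
        return self._owners[self._tokens[i % len(self._tokens)]]


keys = ["user%d" % n for n in range(1000)]
ring = Ring(["10.20.223.109", "10.20.223.110"])  # hypothetical node addresses
before = {k: ring.owner(k) for k in keys}

ring.add("10.20.223.111")
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every key that changed owner moved to the new node; the rest stayed put.
print(len(moved), all(after[k] == "10.20.223.111" for k in moved))
```

Only the keys that fall in the new node's arc of the ring change owner, which is the property that makes adding or removing a node cheap compared with re-sharding.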
Cassandra writes data to multiple nodes to ensure reliability. It is flexible in the trade-off among consistency, availability, and partition tolerance (CAP): when reading, you can require that all replicas agree (strong consistency), read from a single replica (high availability), or require that a majority of replicas agree (a compromise). This lets Cassandra cope with node failures, network failures, and multiple data centers.
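The three choices correspond to the consistency levels ONE, QUORUM, and ALL. As a rough sketch, the number of replicas that must respond can be computed as follows; the function name and the replication factor of 3 are assumptions chosen for illustration.

```python
def replicas_required(level, replication_factor):
    """Return how many replicas must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority: more than half of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unknown consistency level: %r" % level)


for level in ("ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, 3))
```

With a replication factor of 3, a QUORUM read needs 2 replicas and so still succeeds when one replica is down, whereas ALL does not.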
Features
Compared with other databases, Cassandra has three outstanding features:
Flexible schema: with Cassandra, as with a document store, you do not have to decide in advance which fields a record will contain. You can add or remove fields at will while the system is running. This is a large efficiency gain, especially in large deployments.
True scalability: Cassandra scales purely horizontally. To add capacity to the cluster, you point it at another machine. You do not have to restart any process, change application queries, or manually migrate any data.
Multi-data-center awareness: you can adjust the node layout so that if one data center burns down, a backup data center still holds at least one full replica of every record.
Some other features that make Cassandra more competitive:
Range queries: if you need more than plain key-value lookups, you can query a range of keys.
List data structures: super columns add a fifth dimension to the hybrid model, turning columns into lists. This is very convenient for things such as per-user indexes.
Distributed writes: you can read and write any data anywhere in the cluster at any time, and there is no single point of failure.
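A range query differs from a key-value lookup in that it returns every key between two bounds. The sketch below illustrates the idea on a sorted list of hypothetical row keys. It assumes keys are kept in sorted order and is not how Cassandra executes such a query.

```python
import bisect

# Illustrative only: row keys kept in sorted order (hypothetical keys).
row_keys = sorted(["alice", "bob", "carol", "dave", "erin"])


def key_range(keys, start, finish):
    """Return the keys k with start <= k <= finish from a sorted list."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_right(keys, finish)
    return keys[lo:hi]


print(key_range(row_keys, "bob", "dave"))  # ['bob', 'carol', 'dave']
```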
Reference:
http://zh.wikipedia.org/zh/Cassandra

Install on CentOS 5.5:
Cassandra requires the most stable version of Java 1.6 you can deploy.

Decompress the installation package:
tar zxvf apache-cassandra-0.7.0-bin.tar.gz
Create the default directories required by Cassandra:
mkdir -p /data/logs/cassandra
ln -s /data/logs/cassandra /var/log/cassandra
ln -s /data/logs/cassandra /var/lib/cassandra
ln -s /root/apache-cassandra-0.7.0 /usr/local/cassandra
cd /usr/local/cassandra

Start:
bin/cassandra -f
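Because the -f flag keeps Cassandra in the foreground, it can be handy to confirm from a second terminal that the node is accepting client connections before moving on. The helper below is a minimal sketch. The function name is hypothetical, and port 9160 is assumed to be the Thrift client port, so adjust it to match your configuration.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


# 9160 is assumed here as the Thrift client port; adjust to your configuration.
print(is_listening("localhost", 9160))
```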

Use the command line to connect to the local service:
bin/cassandra-cli --host localhost
Run the following commands in the CLI:
(1) Create a keyspace (the equivalent of a database).
create keyspace cassandra_test;
use cassandra_test;
(2) Create a column family (the equivalent of a table).
create column family users with comparator = UTF8Type and default_validation_class = UTF8Type;
(3) Insert data.
set users[author][nick_name] = 'preftest';
set users[author][age] = long(31);
(4) Query data.
get users[author];
count users['author'];
list users;

 

Use the Python client
Install the required libraries:
easy_install pycassa
easy_install thrift05
Note: If Python.h cannot be found while installing thrift05, install python-devel first:
yum install python-devel

Write a Python script to connect to Cassandra:
import pycassa
# 9160 is Cassandra's default Thrift port; change it if your node uses another one.
pool = pycassa.connect('cassandra_test', ['localhost:9160'])
col_fam = pycassa.ColumnFamily(pool, 'users')
col_fam.insert('tester', {'nick_name': 'performance', 'age': '31'})
print col_fam.get('author')
print col_fam.get_count('tester')

Reference:
http://pycassa.github.com/pycassa/tutorial.html

 

For details about creating a cluster, refer to:
Getting started:
http://wiki.apache.org/cassandra/GettingStarted

# After a new node is added, the consoles of both nodes detect it automatically
INFO 15:52:20,128 Node /10.20.223.111 is now part of the cluster
INFO 15:52:21,137 Started hinted handoff for endpoint /10.20.223.111
INFO 15:52:21,138 InetAddress /10.20.223.111 is now UP
INFO 15:52:21,139 Finished hinted handoff of 0 rows to endpoint /10.20.223.111
# Node failure detection
INFO 15:59:37,046 error writing to /10.20.223.111
INFO 15:59:42,051 InetAddress /10.20.223.111 is now dead.
# Node recovery detection
INFO 16:00:19,430 Node /10.20.223.111 has restarted, now UP again
INFO 16:00:19,430 Started hinted handoff for endpoint /10.20.223.111
INFO 16:00:19,431 Node /10.20.223.111 state jump to normal
INFO 16:00:19,431 Finished hinted handoff of 0 rows to endpoint /10.20.223.111
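Log lines like these can be scanned to track which nodes a console currently considers up or dead. The parser below is a minimal sketch that assumes the "is now up" and "is now dead" wording shown above. The function name and the regular expression are illustrative and not part of any Cassandra tooling.

```python
import re

# Matches the "is now up" / "is now dead" lines; tolerant of case and spacing.
STATE_LINE = re.compile(
    r"InetAddress\s*/(?P<ip>[\d.]+) is now (?P<state>up|dead)",
    re.IGNORECASE,
)


def node_states(lines):
    """Return the last reported state ('up' or 'dead') for each node address."""
    states = {}
    for line in lines:
        match = STATE_LINE.search(line)
        if match:
            states[match.group("ip")] = match.group("state").lower()
    return states


sample = [
    "INFO 15:52:21,138 InetAddress /10.20.223.111 is now UP",
    "INFO 15:59:42,051 InetAddress /10.20.223.111 is now dead.",
]
print(node_states(sample))  # {'10.20.223.111': 'dead'}
```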

 

Reference:
http://hi.baidu.com/higkoo/blog/item/e5e1cd34d278fba4d1a2d3d7.html

 
