Cassandra Cluster configuration

Source: Internet
Author: User
Tags: cassandra, log4j
1. Basic Configuration

First, prepare three or more computers. The following assumes three machines running Linux with the IP addresses 192.168.0.100, 192.168.0.101, and 192.168.0.102. Each machine needs a Java runtime environment installed. Then download the binary release package of Cassandra 0.7.

Pick one of the machines to configure first, and unpack the Cassandra release package:
$ tar -zxvf apache-cassandra-$VERSION.tar.gz
$ cd apache-cassandra-$VERSION


The main configuration file is conf/cassandra.yaml. Version 0.7 no longer uses an XML configuration file; if you are not familiar with the YAML format, it is best to learn the basics first.

The configuration file sets several directories by default:
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches


data_file_directories can list several directories at once; Cassandra spreads its data files across all of them. The logging configuration file, conf/log4j-server.properties, also sets a default log file location:
log4j.appender.R.File=/var/log/cassandra/system.log


The defaults are usually fine unless you have special storage requirements. That leaves two options: create the directories that the default configuration expects, or edit the configuration files to point at directories of your own.

For simplicity, here is the first option:
$ sudo mkdir -p /var/log/cassandra
$ sudo chown -R `whoami` /var/log/cassandra
$ sudo mkdir -p /var/lib/cassandra
$ sudo chown -R `whoami` /var/lib/cassandra


The `whoami` above is a Linux command that prints the username of the current login; the backticks substitute its output into the command line. If you do not plan to run Cassandra as the currently logged-in user, replace `whoami` with the specific username.
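The same steps can be wrapped in a small function so that they can be rehearsed in a scratch directory first. This is only a sketch: prepare_cassandra_dirs and its prefix argument are hypothetical names introduced here, not anything shipped with Cassandra.

```shell
#!/bin/sh
# Sketch: create the default Cassandra directories and give them to one user.
# prepare_cassandra_dirs is a hypothetical helper, not part of Cassandra.

prepare_cassandra_dirs() {
    prefix=$1   # "" for the real filesystem, or a scratch directory for a dry run
    owner=$2    # user (name or numeric id) that will run Cassandra
    for dir in /var/log/cassandra /var/lib/cassandra; do
        mkdir -p "$prefix$dir" || return 1
        chown -R "$owner" "$prefix$dir" || return 1
    done
}

# Real run (needs root, for example from a root shell):
#   prepare_cassandra_dirs "" cassandra
```

With an empty prefix the function touches the same paths as the commands above; with a scratch prefix it builds the same tree underneath that directory and needs no special privileges.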
2. Cluster Configuration

Because Cassandra uses a decentralized structure, a node that starts up needs some way to tell the existing cluster that a new node has joined. Cassandra's configuration file has a seeds setting for this purpose: seeds are machines that can reach all the other nodes in the cluster. If all the nodes are in the same data center and on the same subnet, it is enough to pick a few of the more stable machines. This example has only three machines, so I picked the first one as the seed node, configured as follows:
seeds:
    - 192.168.0.100
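To confirm which seeds a given cassandra.yaml lists (for instance after copying it between machines), the list can be read back with awk. This is a minimal sketch, assuming the seeds are written as a block-style YAML list directly under a top-level seeds: key; list_seeds is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the seed addresses found in a cassandra.yaml file.
# list_seeds is a hypothetical helper; it assumes a block-style list:
#   seeds:
#       - 192.168.0.100

list_seeds() {
    awk '
        /^seeds:/                { in_seeds = 1; next }
        in_seeds && /^ *- */     { sub(/^ *- */, ""); print; next }
        in_seeds && !/^ *(#|$)/  { in_seeds = 0 }   # first other setting ends the list
    ' "$1"
}

# Usage: list_seeds conf/cassandra.yaml
```

Comparing this output across the machines is a quick way to spot a node that was configured with a different seed list.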


Next, configure the IP address that the node uses to communicate with other nodes:
listen_address: 192.168.0.100


Note that this must be a specific IP address; an address such as 0.0.0.0 cannot be used here.

Then configure the IP address that Cassandra's Thrift clients (applications) connect to:
rpc_address: 192.168.0.100


This setting can be 0.0.0.0, which listens on all of the machine's network interfaces.

Cassandra's keyspaces and column families no longer need to be defined in the configuration file; they are created and maintained at run time.

Copy the configured Cassandra directory to the second and third machines, create the related directories there, and change listen_address and rpc_address to each machine's actual IP address. The configuration is then complete.
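The per-machine address change can be scripted rather than edited by hand. The sketch below assumes that listen_address and rpc_address each appear once, at the start of a line, in conf/cassandra.yaml, and it sets both to the same IP as in the example above. set_node_addresses is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: point listen_address and rpc_address in a cassandra.yaml at one IP.
# set_node_addresses is a hypothetical helper, not part of Cassandra.

set_node_addresses() {
    yaml=$1
    ip=$2
    # listen_address needs a specific address, so refuse the wildcard.
    case $ip in
        0.0.0.0) echo "listen_address needs a specific IP" >&2; return 1 ;;
    esac
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed -e "s/^listen_address:.*/listen_address: $ip/" \
        -e "s/^rpc_address:.*/rpc_address: $ip/" \
        "$yaml" > "$yaml.tmp" && mv "$yaml.tmp" "$yaml"
}

# On the second machine:
#   set_node_addresses conf/cassandra.yaml 192.168.0.101
```

All other settings, including the seeds list, are left untouched, so every node keeps pointing at the same seed.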
3. Starting the Cassandra Nodes and Managing the Cluster

The nodes can be started in any order, as long as the seed node is started:
$ bin/cassandra -f


The -f parameter makes Cassandra run in the foreground, which is convenient for debugging and for watching the log output. It is not needed in a production environment, where Cassandra runs as a daemon.

After all nodes are started, the cluster can be managed with the bin/nodetool tool, for example to view the status of all nodes:

$ bin/nodetool -host 192.168.0.101 ring


The results look similar to the following:
Address         Status  State   Load      Owns    Token
                                                  159559...
192.168.0.100   Up      Normal  49.27 KB  39.32%  563215...
192.168.0.101   Up      Normal  54.42 KB  16.81%  849292...
192.168.0.102   Up      Normal  73.14 KB  43.86%  159559...

The -host parameter specifies which node nodetool talks to. For the nodetool ring command it makes no difference which node that is, so you can pick any of them.
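The ring listing lends itself to a quick health check. The sketch below assumes the column layout shown above (address in the first column, status in the second); down_nodes is a hypothetical helper name, and other Cassandra versions may print the columns differently.

```shell
#!/bin/sh
# Sketch: list the nodes that `nodetool ring` does not report as Up.
# down_nodes is a hypothetical helper; it assumes column 1 is the node's
# IPv4 address and column 2 is its status, as in the listing above.

down_nodes() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $2 != "Up" { print $1 }'
}

# Usage:
#   bin/nodetool -host 192.168.0.101 ring | down_nodes
```

The header line and the wrapped token line are skipped because their first field is not an IPv4 address; empty output means every listed node is Up.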

From the results above you can see whether each node is online, its state, its data load, and its token (which can be thought of as the node's name; it is generated automatically the first time the node starts). We can use nodetool together with a token to manage a specific node. For example, to view the details of a given node:
$ bin/nodetool -host 192.168.0.101 info


The results look roughly as follows:
84929280487220726989221251643883950871
Load             : 54.42 KB
Generation No    : 1302057702
Uptime (seconds) : 591
Heap Memory (MB) : 212.14 / 1877.63

To view statistics about the data structures (keyspaces and column families) on a given node:
$ bin/nodetool -host 192.168.0.101 cfstats


The result:
Keyspace: Keyspace1
    Read Count: 0
    Write Count: 0
    Pending Tasks: 0
        Column Family: CF1
        SSTable count: 1

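Use the following command to remove a node that has gone offline (for example, the second machine was shut down or has failed). The token identifies the node to remove, and the command has to be sent to a node that is still running, here the first machine:
$ bin/nodetool -host 192.168.0.100 removetoken 84929280487220726989221251643883950871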


How does a node that went offline come back online? Nothing special is needed: just start Cassandra on it and it rejoins the cluster automatically.

In practice we may need to back up the data at intervals by creating a snapshot, which is very simple in Cassandra:
$ bin/nodetool -host 192.168.0.101 snapshot
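A snapshot taken this way covers only the node that nodetool connects to, so backing up the whole cluster means repeating the command for each node. The following is a minimal sketch; snapshot_all and the NODETOOL variable are hypothetical names, and the host list is the example cluster from this article.

```shell
#!/bin/sh
# Sketch: run `nodetool snapshot` against every node given on the command line.
# snapshot_all and NODETOOL are hypothetical names introduced for this example.

NODETOOL=${NODETOOL:-bin/nodetool}

snapshot_all() {
    status=0
    for host in "$@"; do
        # Keep going if one node fails, but remember the failure.
        "$NODETOOL" -host "$host" snapshot || status=1
    done
    return $status
}

# Usage:
#   snapshot_all 192.168.0.100 192.168.0.101 192.168.0.102
```

The function returns non-zero if any node could not be snapshotted, which makes it usable from a scheduled job that should report failures.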

4. Read and write test data

Using a client library together with unit tests is the preferred approach, but if you only want to know whether the cluster reads and writes data correctly, you can run a simple test with cassandra-cli:
$ bin/cassandra-cli -host 192.168.0.101


Then enter the following statements:
create keyspace Keyspace1;
use Keyspace1;
create column family Users with comparator=UTF8Type and default_validation_class=UTF8Type;
set Users[jsmith][first] = 'John';
set Users[jsmith][last] = 'Smith';
get Users[jsmith];

The statements above create a keyspace named "Keyspace1" and a column family named "Users", and then add an entry to Users. Normally you should see a result similar to the following:
=> (column=first, value=John, timestamp=1302059332540000)
=> (column=last, value=Smith, timestamp=1300874233834000)
Returned 2 results.
