Cassandra Cluster Deployment Planning

There is relatively little detailed information about Cassandra available in Chinese, so the following is a summary translated from foreign material; refer to it if you need it. It is not yet finished; I will keep writing and uploading more as I go.

When planning a Cassandra cluster deployment for a production environment, you must first consider the amount of data you plan to store, as well as the read/write load from the main front-end application systems under both normal and extreme conditions.

Hardware selection:

For any application system, choosing hardware sensibly means weighing the four basic resources, memory, CPU, hard disk, and network, against one another to reach the optimal configuration balance.

Memory:

The more memory a Cassandra node has, the better the read performance of the cluster. More memory allows for larger caches and reduces disk I/O during reads. More memory also allows for larger memtables, which hold recently written data and effectively reduce the number of files (indexes, SSTables, etc.) that must be scanned during a read. In production environments, 8GB~16GB per node is common; a minimum of 4GB is recommended, and some production clusters use 32GB or more. The ideal memory configuration ultimately depends on the volume of data that is read frequently.
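How much of that memory the Cassandra process actually uses is governed by the JVM heap settings in cassandra-env.sh. As a minimal sketch (assuming an 8GB node and a cassandra-env.sh that honours these variables; adjust to your own hardware and Cassandra version), the heap can be pinned explicitly instead of relying on the script's auto-sizing:

    # cassandra-env.sh (sketch): fix the heap instead of using the auto-calculated values
    MAX_HEAP_SIZE="4G"      # total JVM heap for memtables, caches, etc.
    HEAP_NEWSIZE="400M"     # young generation; often sized at roughly 100MB per CPU core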

CPU:

In real Cassandra cluster deployments, write-heavy applications tend to hit the memory limit before the CPU is overwhelmed. Cassandra's highly concurrent design means it benefits directly from more processor cores. Currently, nodes with 8 CPU cores often offer the best price/performance ratio. On virtualized platforms, consider cloud service providers that offer denser processor allocations with greater processing power. (The New York Times, for example, used Amazon's cloud platform to handle the analysis and storage of scans of all its historical newspapers.)

Hard disk:

Two important criteria for selecting hard disks are capacity (how much data to store) and I/O (read/write throughput). Optimizing on the storage side can effectively reduce the number of expensive SATA drives required, and adding nodes can expand the cluster's storage capacity and I/O performance (which, of course, often also means investing in more memory).

Solid-state drives (SSDs) are also an effective option for Cassandra clusters. With SSDs, Cassandra's sequential, streaming write pattern minimizes the write-amplification effects that would otherwise hurt write efficiency.

Cassandra persists data to disk through two operations: writes are first appended to the commit log, and the data held in memory (memtables) is periodically flushed to disk and written to SSTables (the storage files for column family data). It is strongly recommended to use separate disks for the commit log and the SSTable files. The commit log does not require much storage space, but giving it a dedicated disk helps absorb your write load. The data directories must not only hold the persisted data but also provide enough throughput for the expected read traffic as well as the I/O required by writes and compaction.

During background compaction and repair operations, Cassandra's disk usage can temporarily grow to as much as twice the size of the actual data directory. It is therefore recommended not to use more than 50% of each node's maximum disk capacity.

When weighing the impact of disk failure against data redundancy requirements, there are two suitable approaches:

The first is to use RAID0 and rely on Cassandra's replication mechanism to cope with disk failures: if a node's disk fails, the lost data can be recovered through a Cassandra repair operation. The second is to use RAID10, which handles single-disk failures purely in hardware and removes the need to rely on Cassandra's replication for that purpose.

If sufficient investment can be made in disk storage, RAID10 is recommended: it reduces the extra storage operations generated by Cassandra's replica backup strategy and lowers network load. Otherwise, RAID0 is recommended.

Network:

Cassandra is a distributed data storage platform: read/write requests are processed over the network, and data replication between nodes also takes place over the network. When selecting replicas, Cassandra prefers nodes on the same rack over nodes on different racks, and nodes in the same data center over nodes in different data centers.

Cassandra uses the following ports. If a firewall exists in the production environment, you must ensure that nodes in the same cluster can reach each other through these ports.

Port number    Description
7000           Inter-node communication port within the Cassandra cluster
9160           Thrift client access port
7199           JMX monitoring port (8080 in earlier versions)
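For example, on Linux nodes protected by iptables, rules along the following lines would let other cluster nodes reach these ports (a sketch only; 10.1.1.0/24 stands in for your cluster subnet, and your actual firewall tooling may differ):

    # Allow inter-node, Thrift client, and JMX traffic from the cluster subnet
    iptables -A INPUT -p tcp -s 10.1.1.0/24 --dport 7000 -j ACCEPT
    iptables -A INPUT -p tcp -s 10.1.1.0/24 --dport 9160 -j ACCEPT
    iptables -A INPUT -p tcp -s 10.1.1.0/24 --dport 7199 -j ACCEPT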

Capacity Planning

The methods described in this section can be used to estimate the data volume of a Cassandra cluster. For a better estimate, we first need an objective and comprehensive understanding of the data we will store in Cassandra, and of how that data will be modeled in Cassandra's storage structures (how the column families are laid out, the row keys, the number of columns per row, and so on).

Available disk capacity calculations

To calculate how much data your Cassandra cluster nodes can manage, calculate the usable disk capacity of each node and then sum the capacity of all nodes. Keep in mind that in a production cluster, the commit log and the data directories must be stored on separate disks. The following formulas can be used to estimate the minimum usable capacity.

Let's first calculate the raw capacity of the disks:

Raw capacity = disk size * number of disks

Next, account for the filesystem overhead of formatting the disks and for the RAID level of the disk array. For example, assuming RAID10 is used, the formula is as follows:

(Raw capacity * 0.9) / 2 = disk space available for data

In day-to-day operation, Cassandra's compaction and repair operations consume a certain amount of disk space. To balance performance requirements and cluster stability, it is strongly recommended to leave plenty of headroom and use no more than 50% of the available disk space. Keeping this in mind, the space actually used for data can be calculated with the following formula:

Disk space available for data * 0.5 = disk space actually used for data
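Putting the three formulas together, a quick back-of-the-envelope calculation might look like the following Python sketch (the disk sizes are made-up example values, not a recommendation):

    # Usable-capacity estimate for a single node (RAID10, example numbers)
    disk_size_gb = 1000        # capacity of each data disk, in GB
    disk_count = 4             # number of data disks in the node

    raw_capacity = disk_size_gb * disk_count      # 4000 GB raw
    usable_for_data = raw_capacity * 0.9 / 2      # ~1800 GB after formatting overhead and RAID10
    planned_data = usable_for_data * 0.5          # ~900 GB actually planned for data

    print("Plan for about %.0f GB of data per node" % planned_data)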

User data size calculation

Like all storage systems, once raw data is stored in Cassandra, the space it occupies tends to be larger than the original data size because of storage overhead; in general, it expands to roughly twice the original size. How much it expands also depends on the character set used. The following can be used to estimate on-disk data size (the capacity of transient in-memory data is not covered in this section).

Column overhead: every column in Cassandra carries 15 bytes of overhead. Because each row in a column family can have a different number of columns with different column names, the metadata of every column must be stored along with it. Counter columns and expiring (TTL) columns require an additional 8 bytes (23 bytes of overhead in total). The size of a regular column is therefore calculated as follows:

Total column size = column name size + column value size + 15 bytes

Row overhead: similar to columns, each row carries 23 bytes of overhead.

Primary key index overhead: each column family also maintains a primary key index over its row keys. This overhead becomes significant when there are many narrow rows (rows with only a few columns). The following formula approximates the space occupied by the primary key index:

Primary key index size = number of rows * (32 + average primary key size)

Replication overhead: the replication factor in the backup strategy plays an important role in disk capacity usage. If the replication factor is 1, there is no additional replication overhead (only one copy of the data exists in the cluster). Once the replication factor is greater than 1, the replication overhead can be calculated with the following formula:

Replication overhead = total data size * (replication factor - 1)
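The overhead formulas above can be combined into a simple estimator. The Python sketch below uses illustrative workload numbers that are assumptions, not measurements:

    # Rough on-disk size estimate for one column family (illustrative numbers)
    rows = 10_000_000
    columns_per_row = 20
    avg_column_name_bytes = 10
    avg_column_value_bytes = 100
    avg_row_key_bytes = 16
    replication_factor = 3

    column_size = avg_column_name_bytes + avg_column_value_bytes + 15   # regular column overhead
    row_size = columns_per_row * column_size + 23                       # per-row overhead
    primary_key_index = rows * (32 + avg_row_key_bytes)                 # row key index
    data_size = rows * row_size + primary_key_index

    replication_overhead = data_size * (replication_factor - 1)
    total = data_size + replication_overhead
    print("Estimated on-disk size: %.1f GB" % (total / 1024 ** 3))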

Choosing node configuration options

One of the main tasks of planning a Cassandra cluster is to understand and correctly set the configuration parameters of each node. This section explains the deployment configurations that apply to single-node, multi-node, and multi-data-center clusters.

The configuration properties mentioned in this section are found in the cassandra.yaml configuration file, and each cluster node must be configured correctly before it is started.

Storage settings:

By default, each node's data storage path is set to /var/lib/cassandra. In a production cluster deployment, you must modify the paths specified by the commitlog_directory and data_file_directories parameters to ensure that the commit log directory and the data file directories are on separate disks.
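For example, assuming the commit log disk is mounted at /disk1 and the data disk at /disk2 (hypothetical mount points), the relevant cassandra.yaml entries would look roughly like this:

    commitlog_directory: /disk1/cassandra/commitlog
    data_file_directories:
        - /disk2/cassandra/data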

Gossip settings:

The gossip settings specify how the nodes in the cluster communicate with one another and how nodes are recognized by the cluster.

cluster_name: The name of the cluster this node belongs to.

listen_address: The IP address or host name that other nodes in the Cassandra cluster use to connect to this node. It must be changed from localhost to the node's public IP.

seeds: A comma-separated list of seed node IP addresses. The value must be identical on every node. In a multi-data-center deployment, each data center must have at least one node in this list.

storage_port: The inter-node communication port (default 7000). It must be the same on every node.

initial_token: Determines the range of data this node is responsible for (computed with a consistent hash). It can be left unset, in which case a token is assigned automatically; the Cassandra load balance operation can then be run periodically to even out data distribution (see the documentation on ring ranges and calculating tokens for details).
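Taken together, a minimal cassandra.yaml fragment for one node might look like the sketch below (the cluster name, IP addresses, and seed list are placeholder values, and the exact seed syntax varies between Cassandra versions):

    cluster_name: 'MyCluster'
    listen_address: 175.56.12.105        # this node's own IP
    storage_port: 7000
    # initial_token left unset so that a token is assigned automatically
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "175.56.12.105,110.56.12.120"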

Clearing the gossip state of a node

Each node automatically caches its gossip state information and reloads it on the next restart, without waiting for gossip from the other nodes. To force the cached gossip state to be cleared, add the following option to the cassandra-env.sh script file:

-Dcassandra.load_ring_state=false

The cassandra-env.sh file is generally located in the /usr/share/cassandra directory or in the $CASSANDRA_HOME/conf directory.
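In practice the option is usually appended to the JVM_OPTS variable inside that script, for example (a sketch; remove the line again once the node has restarted with a clean state):

    JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"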

Partitioner settings:

In a real-world production environment, we want each node in the cluster to store roughly the same amount of data; this is called load balancing. It is achieved by configuring the partitioner on each node and setting the value of initial_token correctly.

It is strongly recommended that the RandomPartitioner (which is also the default) be used in all clusters. With this configuration, row keys are hashed into the range 0~2**127, and each node in the cluster is assigned a token in that range.

For Cassandra clusters with all nodes in the same data center, token values can be computed by dividing the hash range evenly by the total number of nodes in the cluster. In a multi-data-center deployment, each data center must be balanced separately. See Calculating Tokens for details on how tokens are distributed across single and multiple data centers.
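For the single-data-center case, the even split of the 0~2**127 range can be computed with a few lines of Python (a sketch of the commonly used token generator; node_count is simply the size of your cluster):

    # Evenly spaced initial_token values for RandomPartitioner
    node_count = 4
    for i in range(node_count):
        print("node %d: initial_token = %d" % (i, i * (2 ** 127 // node_count)))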

Snitch Settings:

The snitch tells Cassandra where each node sits in the network topology. It directly affects where replicas are placed and how requests are routed between replicas. The endpoint_snitch property configures the node's snitch; the snitch configuration of all nodes must be consistent.

For a single-data-center cluster, the SimpleSnitch policy is sufficient. But if the cluster may later grow into multiple racks and data centers, it is easier to set up the racks and data centers for each node from the start.

Configuring PropertyFileSnitch

PropertyFileSnitch requires that detailed network information for each node be defined in the cassandra-topology.properties configuration file.

The following is an example of this file describing two data centers, each containing two racks, plus a third logical data center used for analytics replication.

# Data Center 1
175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1
120.53.24.101=DC1:RAC2
120.55.16.200=DC1:RAC2
120.57.102.103=DC1:RAC2

# Data Center 2
110.56.12.120=DC2:RAC1
110.50.13.201=DC2:RAC1
110.54.35.184=DC2:RAC1
50.33.23.120=DC2:RAC2
50.45.14.220=DC2:RAC2
50.17.10.203=DC2:RAC2

# Analytics Replication Group
172.106.12.120=DC3:RAC1
172.106.12.121=DC3:RAC1
172.106.12.122=DC3:RAC1

# default for unknown nodes
default=DC3:RAC1

In the snitch configuration above, you can name your data centers and racks however you like. However, you must ensure that the names in the cassandra-topology.properties configuration file match the names used in the keyspace strategy_options definition.
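For example, with the data center names used above, a keyspace created from the cassandra-cli might reference them as follows (a sketch only; the keyspace name and replica counts are arbitrary, and the exact syntax depends on your Cassandra version):

    create keyspace MyKeyspace
        with placement_strategy = 'NetworkTopologyStrategy'
        and strategy_options = {DC1:2, DC2:2, DC3:1};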
