When you start a Cassandra cluster, you need to choose how data is divided among its nodes. This is the job of the partitioner.
All data managed by the cluster is represented as a ring. The ring is divided into as many ranges as there are nodes. When a node joins the cluster, it is assigned a token that determines its position in the ring and the range of data it is responsible for.
The data of a column family (the counterpart of a table in a relational database) is divided among the nodes according to this partitioning.
To configure the partitioner, you need to specify a partitioning policy:
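The ring lookup described above can be sketched in a few lines of Python. This is only an illustration, not Cassandra's code: the node names, the tiny token values, and the `owner` helper are all made up, and it assumes that each node owns the range ending at its own token, with the ring wrapping around after the last one.

```python
from bisect import bisect_left

# Hypothetical three-node ring: (token, node name), sorted by token.
RING = [(0, "node-a"), (100, "node-b"), (200, "node-c")]
TOKENS = [token for token, _ in RING]


def owner(key_token):
    """Return the node responsible for key_token.

    A key belongs to the first node whose token is >= the key's token.
    Past the largest token, the lookup wraps around to the first node.
    """
    index = bisect_left(TOKENS, key_token)
    return RING[index % len(RING)][1]


print(owner(50))   # falls in the range that ends at token 100
print(owner(250))  # beyond the last token, so it wraps around
```

Keeping the tokens sorted means a lookup is a binary search rather than a scan over all nodes.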
RandomPartitioner (random partitioning): This is the default partitioning policy in a Cassandra cluster (in versions before 1.2). It uses a consistent hashing algorithm, deriving each row's token from an MD5 hash of its key. The algorithm is explained in detail here:
http://www.cnblogs.com/coser/archive/2011/11/06/2238359.html
The advantage of this policy is that once tokens have been assigned, column family data is distributed evenly across the nodes of the cluster, which simplifies load balancing. Reads and writes of column family data are spread evenly as well.
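To see why hashing the keys gives an even spread, here is a minimal sketch. It is a simplification, not Cassandra's actual implementation: it assumes a token space of 0 to 2**127 - 1, reduces the MD5 digest into that space with a modulo, and gives four hypothetical nodes evenly spaced tokens. The names `md5_token` and `owner_index` and the `row-N` keys are invented for the example.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

RING_SIZE = 2 ** 127  # assumed token space: 0 .. 2**127 - 1
NODE_COUNT = 4        # hypothetical cluster size

# Evenly spaced tokens, one per node.
TOKENS = [i * RING_SIZE // NODE_COUNT for i in range(NODE_COUNT)]


def md5_token(key):
    """Map a row key to a token by hashing it with MD5 (simplified)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner_index(token):
    """Index of the first node whose token is >= token, wrapping around."""
    return bisect_left(TOKENS, token) % NODE_COUNT


# Sequential keys still scatter across the ring because they are hashed first.
counts = Counter(owner_index(md5_token(f"row-{i}")) for i in range(10_000))
for node in range(NODE_COUNT):
    print(f"node {node}: {counts[node]} rows")
```

Each node should end up with roughly a quarter of the rows, even though the keys themselves are sequential.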
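OrderedPartitioner (ordered partitioning): This policy is not recommended for a Cassandra cluster.
It keeps all keys in sorted order, so keys that are close together land on the same node, which can load the nodes unevenly.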
Let us look at the partitioning policy chosen for our system: