The previous section described the problems that relational databases (RDBMS) run into. This section introduces Cassandra and looks at whether it can solve them.
In this section we will cover:
What is Cassandra
Hash distribution of Cassandra data
Cassandra trade-offs under CAP
Cassandra Replication
Cassandra Adjustable Consistency
Cassandra Multi-Data center
What is Cassandra
Apache Cassandra is an open-source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database. It was created at Facebook, with a distributed design based on Amazon's Dynamo and a data model based on Google's Bigtable. Its main features are summarized below:
Distributed and decentralized
Distributed means that Cassandra can run on more than one machine while appearing to users as a single whole. Decentralized means that there is no single point of failure in Cassandra: every node is the same, and no node takes on special management tasks. In contrast to a master/slave structure, Cassandra's protocol is peer-to-peer, and it uses gossip to maintain a list of live and dead nodes.
Note: gossip is also described as an anti-entropy algorithm. Entropy is a concept from physics that measures disorder, and anti-entropy means seeking agreement amid disorder. This captures the character of gossip well: in a bounded network, each node communicates with other nodes at random, and after a period of seemingly chaotic communication the state of all nodes eventually agrees. Each node may know all other nodes or only a few neighbors. As long as the nodes are connected through the network, their state eventually becomes consistent, which is also how an epidemic spreads.
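The convergence described in the note can be illustrated with a small simulation. This is only a toy sketch, not Cassandra's actual gossip implementation: the node names, the per-node "heartbeat version" map, and the function names are all assumptions made for illustration.

```python
import random

def gossip_round(states, rng):
    """One round: every node exchanges state with one randomly chosen peer."""
    nodes = list(states)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        # Merge both views: for each endpoint keep the highest version seen.
        merged = dict(states[node])
        for endpoint, version in states[peer].items():
            if version > merged.get(endpoint, -1):
                merged[endpoint] = version
        states[node] = dict(merged)
        states[peer] = dict(merged)

def rounds_until_converged(num_nodes, seed=0, max_rounds=1000):
    """Return the round in which all nodes hold the same complete view.

    Assumes num_nodes >= 2. Returns None if max_rounds is exhausted.
    """
    rng = random.Random(seed)
    # Initially each (hypothetical) node only knows about itself.
    states = {f"node{i}": {f"node{i}": 1} for i in range(num_nodes)}
    for round_no in range(1, max_rounds + 1):
        gossip_round(states, rng)
        views = list(states.values())
        if len(views[0]) == num_nodes and all(v == views[0] for v in views):
            return round_no
    return None

if __name__ == "__main__":
    print("converged after rounds:", rounds_until_converged(8))
```

No node ever talks to a coordinator here. Random pairwise exchanges alone are enough for every node to end up with the same view, which is the property the note describes.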
High availability
From a general architecture point of view, the availability of a system is measured by its ability to satisfy requests. But computers can fail in many ways, from hardware faults to network outages. A system that needs to be highly available must therefore be made up of multiple networked computers, and the software running on them must be able to operate as a cluster, detect node failures, and move the work of a failed node to the remaining nodes. Cassandra is highly available: a failed node can be replaced without interrupting the system, and data can be distributed across multiple data centers to provide better local access performance and to keep the system from being completely paralyzed by a disaster such as a fire in one data center.
Linear scalability
Because Cassandra uses a peer-to-peer protocol, it is easy to scale horizontally, and performance increases linearly as nodes are added.
ACID support
Cassandra's consistency is tunable, ranging from strict consistency to eventual consistency. It also supports lightweight transactions through CAS (compare-and-set).
No single point of failure (SPOF)
Because every node is the same, no single node can bring the system down.
Easy to manage and operate
With Cassandra it is easy to add, remove, and replace nodes.
Hash distribution of Cassandra data
The data is partitioned around the ring.
All nodes store data and respond to queries (both reads and writes).
Data is located by its partition key.
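The three points above can be sketched as a small token ring. This is a simplified illustration under stated assumptions: MD5 is used only as a stand-in hash, each node is given a single evenly spaced token, and the names `token_for`, `Ring`, and `node0` to `node3` are hypothetical.

```python
import bisect
import hashlib

SPACE = 2 ** 64  # size of the illustrative token space

def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (MD5 is only a stand-in hash here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class Ring:
    def __init__(self, node_tokens):
        # node_tokens: {node_name: token}. A node owns the range of tokens
        # that ends at its own token.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        token = token_for(partition_key)
        # First node whose token is >= the key's token, wrapping past the end.
        index = bisect.bisect_left(self._tokens, token) % len(self._entries)
        return self._entries[index][1]

# Four hypothetical nodes with evenly spaced tokens.
ring = Ring({f"node{i}": (i + 1) * SPACE // 4 - 1 for i in range(4)})

if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3"):
        print(key, "->", ring.owner(key))
```

Any node can run the same calculation, so any node can work out where a row lives from its partition key alone.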
Cassandra trade-offs under CAP
When partition tolerance is required, consistency and high availability cannot both be guaranteed at the same time.
Cross-datacenter latency also leads to inconsistency.
Cassandra chooses availability and partition tolerance (its consistency is tunable).
CA:
Primarily supporting consistency and availability means that you will most likely need distributed transactions with two-phase commit. In other words, if the network partitions, the system may stop responding.
AP:
Primarily supporting availability and partition tolerance means that you may sometimes return less accurate data, but the system will always be available.
Cassandra Replication
Data is replicated automatically. You only need to choose how many servers hold a copy. This number of copies is called the "replication factor", or RF.
If a machine goes down, the writes it missed are replayed to it through "hinted handoff". (Hinted handoff will be covered in a later lesson.)
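To make the idea of replaying missed writes concrete, here is a minimal toy model. It is a sketch under assumptions, not how Cassandra stores or delivers hints internally: the `Node` class and the `write` and `replay_hints` functions are hypothetical names, and hints are kept in a plain in-memory list.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}   # rows stored on this replica
        self.hints = []  # (target_node, key, value) writes held for down replicas

def write(coordinator, replicas, key, value):
    """Send a write to every replica; keep a hint for each replica that is down."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
            acks += 1
        else:
            coordinator.hints.append((replica, key, value))
    return acks

def replay_hints(coordinator):
    """Deliver stored hints to replicas that are back up; keep the rest."""
    remaining = []
    for target, key, value in coordinator.hints:
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    coordinator.hints = remaining

# Replication factor 3: three hypothetical replicas, one of them down.
a, b, c = Node("a"), Node("b"), Node("c")
c.up = False
acks = write(a, [a, b, c], "k1", "v1")  # node a acts as the coordinator
c.up = True
replay_hints(a)                          # c now receives the write it missed

if __name__ == "__main__":
    print("acks:", acks, "c.data:", c.data)
```

The write succeeds on the two live replicas, and the third replica catches up once it is reachable again.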
Cassandra Adjustable Consistency
Each query can specify a consistency level: ALL, QUORUM, or ONE. The level determines how many replicas must respond.
Cassandra is often described as "eventually consistent", which is actually a bit misleading. Simply put, Cassandra gives up a little consistency in exchange for full availability. It is more accurately described as having "tunable consistency": for each request you can choose a level anywhere between strict consistency and eventual consistency, and so find the balance you need between the two.
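The balance between the two can be worked out with simple arithmetic. The sketch below assumes the usual rule of thumb that a read is guaranteed to overlap the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor. The function names are hypothetical, and only the three levels mentioned above are modeled.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a majority of the replicas."""
    return replication_factor // 2 + 1

def required_responses(level: str, replication_factor: int) -> int:
    """How many replicas must respond for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]

def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the written and read replica sets must share a replica."""
    w = required_responses(write_level, replication_factor)
    r = required_responses(read_level, replication_factor)
    return w + r > replication_factor

if __name__ == "__main__":
    rf = 3
    for w_level, r_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(w_level, r_level, read_sees_latest_write(w_level, r_level, rf))
```

With RF = 3, writing and reading at QUORUM touches 2 + 2 replicas, so at least one replica is in both sets. Writing and reading at ONE is faster, but the read may land on a replica that has not yet received the write.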
Cassandra Multi-Data center
Typical use case: clients write to the local data center (DC), and the data is replicated asynchronously to the other DCs.
Each keyspace has its own replication factor in each data center, which means that each data center is highly available on its own.
A data center can be physical or logical.
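A per-data-center replication factor is declared when the keyspace is created. The helper below is only an illustrative sketch that builds the CQL text for a keyspace using NetworkTopologyStrategy. The keyspace name `shop` and the data center names `dc_east` and `dc_west` are made up, and real data center names have to match the ones the cluster itself reports. It also computes the per-DC majority that Cassandra's LOCAL_QUORUM level would need.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with one replication factor per DC."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    return ("CREATE KEYSPACE " + keyspace
            + " WITH replication = {" + ", ".join(parts) + "};")

def local_quorums(replicas_per_dc: dict) -> dict:
    """Majority of the replicas inside each data center."""
    return {dc: rf // 2 + 1 for dc, rf in replicas_per_dc.items()}

if __name__ == "__main__":
    layout = {"dc_east": 3, "dc_west": 2}  # hypothetical data centers
    print(create_keyspace_cql("shop", layout))
    print(local_quorums(layout))
```

A client that writes with a quorum of its local data center only waits for nearby replicas, while the remote data center receives the same write asynchronously.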
This article is from the "Java Architect's Road" blog. Please keep this source attribution: http://eric100.blog.51cto.com/2535573/1786942
Cassandra Basic Introduction (2)-Cassandra Overview