Cassandra 0.7.0-beta1 was released a while ago. Today, after a rough look at the code, we found that the main changes are as follows:
1. Keyspaces and column families in the data model can be modified dynamically:
In earlier versions, if you wanted to modify a keyspace or column family in Cassandra, you had to stop Cassandra, modify the configuration file, and then restart Cassandra for the change to take effect.
In the current version, you only need to define the new keyspace and column family and then call the Thrift interface to send the new definitions to Cassandra.
The related struct and interface definitions can be found in the cassandra.thrift file:
/* Struct definitions. */

/* describes a column in a column family. */
struct ColumnDef {
    1: required binary name,
    2: required string validation_class,
    3: optional IndexType index_type,
    4: optional string index_name,
}

/* describes a column family. */
struct CfDef {
    1: required string keyspace,
    2: required string name,
    3: optional string column_type="Standard",
    4: optional string clock_type="Timestamp",
    5: optional string comparator_type="BytesType",
    6: optional string subcomparator_type="",
    7: optional string reconciler="",
    8: optional string comment="",
    9: optional double row_cache_size=0,
    10: optional bool preload_row_cache=0,
    11: optional double key_cache_size=200000,
    12: optional double read_repair_chance=1.0,
    13: optional list<ColumnDef> column_metadata,
    14: optional i32 gc_grace_seconds,
}

/* describes a keyspace. */
struct KsDef {
    1: required string name,
    2: required string strategy_class,
    3: optional map<string,string> strategy_options,
    4: required i32 replication_factor,
    5: required list<CfDef> cf_defs,
}

/* Related interface definitions. */

/** adds a column family. returns the new schema id. */
string system_add_column_family(1:required CfDef cf_def)
    throws (1:InvalidRequestException ire),

/** drops a column family. returns the new schema id. */
string system_drop_column_family(1:required string column_family)
    throws (1:InvalidRequestException ire),

/** renames a column family. returns the new schema id. */
string system_rename_column_family(1:required string old_name, 2:required string new_name)
    throws (1:InvalidRequestException ire),

/** adds a keyspace and any column families that are part of it. returns the new schema id. */
string system_add_keyspace(1:required KsDef ks_def)
    throws (1:InvalidRequestException ire),

/** drops a keyspace and any column families that are part of it. returns the new schema id. */
string system_drop_keyspace(1:required string keyspace)
    throws (1:InvalidRequestException ire),

/** renames a keyspace. returns the new schema id. */
string system_rename_keyspace(1:required string old_name, 2:required string new_name)
    throws (1:InvalidRequestException ire),
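The workflow described above (build the definitions, then send them through the schema calls) can be sketched in Python as follows. This is only an illustration: it does not talk to a real Cassandra node. The dataclasses mirror a few fields of the Thrift structs, while `FakeSchemaClient`, its UUID-based schema ID, and the `ExampleStrategy` placeholder are assumptions made up for this sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CfDef:
    """Mirrors a few fields of the Thrift CfDef struct."""
    keyspace: str
    name: str
    column_type: str = "Standard"
    comparator_type: str = "BytesType"


@dataclass
class KsDef:
    """Mirrors the required fields of the Thrift KsDef struct."""
    name: str
    strategy_class: str
    replication_factor: int
    cf_defs: List[CfDef] = field(default_factory=list)


class InvalidRequestException(Exception):
    """Stand-in for the Thrift exception of the same name."""


class FakeSchemaClient:
    """Hypothetical in-memory stand-in for the generated Thrift client."""

    def __init__(self) -> None:
        self.keyspaces: Dict[str, KsDef] = {}

    def _new_schema_id(self) -> str:
        # Assumption: the server hands back an opaque schema version string.
        return str(uuid.uuid4())

    def system_add_keyspace(self, ks_def: KsDef) -> str:
        if ks_def.name in self.keyspaces:
            raise InvalidRequestException(f"keyspace {ks_def.name} already exists")
        self.keyspaces[ks_def.name] = ks_def
        return self._new_schema_id()

    def system_add_column_family(self, cf_def: CfDef) -> str:
        ks = self.keyspaces.get(cf_def.keyspace)
        if ks is None:
            raise InvalidRequestException(f"unknown keyspace {cf_def.keyspace}")
        if any(cf.name == cf_def.name for cf in ks.cf_defs):
            raise InvalidRequestException(f"column family {cf_def.name} already exists")
        ks.cf_defs.append(cf_def)
        return self._new_schema_id()


client = FakeSchemaClient()
ks = KsDef(
    name="Keyspace1",
    strategy_class="ExampleStrategy",  # placeholder, not a real class name
    replication_factor=1,
    cf_defs=[CfDef(keyspace="Keyspace1", name="Standard1")],
)
first_id = client.system_add_keyspace(ks)
second_id = client.system_add_column_family(CfDef(keyspace="Keyspace1", name="Users"))
cf_names = [cf.name for cf in client.keyspaces["Keyspace1"].cf_defs]
```

With a real cluster, the same two calls would go through the Thrift-generated client instead of the stand-in, and no restart would be needed between them.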
2. Secondary indexes are added for querying by column value:
Like almost all key/value systems, Cassandra could only be queried by key. To find the rows holding a specific column value, you had to fetch all the data and traverse it, or use some other solution to improve query efficiency and avoid a full scan, for example the inverted index for Cassandra described in my previous article, or Lucandra.
To use secondary indexes in the new version, you specify in the column family definition which columns should be indexed, together with the index type (currently only IndexType.KEYS is supported).
When a column family with indexes is created, Cassandra creates a separate indexedcolumnfamily for each column in that column family that requires an index.
When data is written, it is stored not only in the column family it belongs to but also in the corresponding indexedcolumnfamily.
When data is queried by index, Cassandra reads the matching data directly from the indexedcolumnfamily.
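The write and read paths just described can be illustrated with a small Python toy model. It is not Cassandra's implementation: `IndexedStore` and its methods are hypothetical names, and the index here is a plain in-memory dictionary that maps each indexed column's values to the row keys holding them.

```python
from collections import defaultdict
from typing import Dict, List, Set


class IndexedStore:
    """Toy model: one column family plus one index structure per indexed column."""

    def __init__(self, indexed_columns: Set[str]) -> None:
        self.rows: Dict[str, Dict[str, str]] = {}
        # index[column][value] -> set of row keys currently holding that value
        self.index: Dict[str, Dict[str, Set[str]]] = {
            col: defaultdict(set) for col in indexed_columns
        }

    def insert(self, key: str, column: str, value: str) -> None:
        """Write to the main store and, for indexed columns, to the index too."""
        row = self.rows.setdefault(key, {})
        if column in self.index:
            old = row.get(column)
            if old is not None:
                # Drop the stale index entry before adding the new one.
                self.index[column][old].discard(key)
            self.index[column][value].add(key)
        row[column] = value

    def get_by_value(self, column: str, value: str) -> List[str]:
        """Answer a value query from the index alone, without scanning rows."""
        if column not in self.index:
            raise KeyError(f"column {column!r} is not indexed")
        # .get avoids creating an empty entry in the defaultdict on a miss.
        return sorted(self.index[column].get(value, set()))


store = IndexedStore(indexed_columns={"city"})
store.insert("user1", "city", "Beijing")
store.insert("user2", "city", "Shanghai")
store.insert("user3", "city", "Beijing")
store.insert("user1", "city", "Shanghai")  # overwrite moves user1 in the index
beijing = store.get_by_value("city", "Beijing")
shanghai = store.get_by_value("city", "Shanghai")
```

The extra work on every write is the price paid for answering value queries without touching unrelated rows.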
The related struct and interface definitions can be found in the cassandra.thrift file:
/* Struct definitions. */

enum IndexType {
    KEYS,
}

/* describes a column in a column family. */
struct ColumnDef {
    1: required binary name,
    2: required string validation_class,
    3: optional IndexType index_type,
    4: optional string index_name,
}

/* Interface definition. */

/** returns the subset of columns specified in SlicePredicate for the rows matching the IndexClause */
list<KeySlice> get_indexed_slices(1:required ColumnParent column_parent,
                                  2:required IndexClause index_clause,
                                  3:required SlicePredicate column_predicate,
                                  4:required ConsistencyLevel consistency_level=ONE)
    throws (1:InvalidRequestException ire, 2:UnavailableException ue, 3:TimedOutException te),
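To show the shape of such a request and its result, here is a rough Python sketch. The `IndexExpression` and `IndexClause` classes below are simplified stand-ins whose fields are assumptions for illustration (check cassandra.thrift for the real definitions), only equality matching is modelled, and the function scans an in-memory dictionary in sorted key order instead of consulting an index.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Row = Dict[str, str]


@dataclass
class IndexExpression:
    column_name: str
    value: str  # this sketch supports equality matching only


@dataclass
class IndexClause:
    expressions: List[IndexExpression]
    start_key: str = ""
    count: int = 100


def get_indexed_slices_sketch(rows: Dict[str, Row],
                              index_clause: IndexClause,
                              column_names: List[str]) -> List[Tuple[str, Row]]:
    """Return (key, selected columns) for rows matching every expression."""
    result: List[Tuple[str, Row]] = []
    for key in sorted(rows):  # simplification: plain string ordering of keys
        if key < index_clause.start_key:
            continue
        row = rows[key]
        if all(row.get(e.column_name) == e.value for e in index_clause.expressions):
            # Keep only the columns named by the predicate.
            result.append((key, {c: row[c] for c in column_names if c in row}))
        if len(result) >= index_clause.count:
            break
    return result


rows = {
    "user1": {"city": "Beijing", "age": "30", "name": "Li"},
    "user2": {"city": "Shanghai", "age": "25", "name": "Wang"},
    "user3": {"city": "Beijing", "age": "41", "name": "Zhao"},
}
clause = IndexClause(expressions=[IndexExpression("city", "Beijing")], count=10)
matches = get_indexed_slices_sketch(rows, clause, column_names=["name"])
```

In this model the index clause decides which rows come back, while the column list plays the role of the column predicate and decides which columns of each row are returned.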
3. The configuration file format has changed:
The new version of Cassandra uses the YAML format for configuration, which is more readable.
As a comparison, take the option that sets the cluster name. The two formats differ as follows:
Old version (storage-conf.xml):
<!--
 ~ The name of this cluster. This is mainly used to prevent machines in
 ~ one logical cluster from joining another.
-->
<ClusterName>Test Cluster</ClusterName>
New version (cassandra.yaml):
# name of the cluster
cluster_name: 'Test Cluster'
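As a small illustration of the difference, the following Python sketch reads the cluster name from both formats using only the standard library. The `<Storage>` wrapper element and the helper names are assumptions for this example, and the YAML reader is a minimal stand-in that handles only flat `key: value` lines, not a real YAML parser.

```python
import xml.etree.ElementTree as ET

OLD_XML = """<Storage>
  <ClusterName>Test Cluster</ClusterName>
</Storage>"""

NEW_YAML = """# name of the cluster
cluster_name: 'Test Cluster'
"""


def cluster_name_from_xml(text: str) -> str:
    """Read the ClusterName element from an XML configuration snippet."""
    name = ET.fromstring(text).findtext("ClusterName")
    if name is None:
        raise KeyError("ClusterName")
    return name.strip()


def cluster_name_from_yaml(text: str) -> str:
    """Read cluster_name from flat `key: value` lines (not a full YAML parser)."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        if key.strip() == "cluster_name":
            return value.strip().strip("'\"")
    raise KeyError("cluster_name")


old_name = cluster_name_from_xml(OLD_XML)
new_name = cluster_name_from_yaml(NEW_YAML)
```

Both snippets carry the same setting; the YAML version simply needs less markup around it.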
In addition, there are many other changes:
0.7-beta1
* sstable versioning (CASSANDRA-389)
* switched to slf4j logging (CASSANDRA-625)
* add (optional) expiration time for column (CASSANDRA-699)
* access levels for authentication/authorization (CASSANDRA-900)
* add ReadRepairChance to CF definition (CASSANDRA-930)
* fix heisenbug in system tests, especially common on OS X (CASSANDRA-944)
* convert to byte[] keys internally and all public APIs (CASSANDRA-767)
* ability to alter schema definitions on a live cluster (CASSANDRA-44)
* renamed configuration file to cassandra.xml, and log4j.properties to
  log4j-server.properties, which must now be loaded from
  the classpath (which is how our scripts in bin/ have always done it)
  (CASSANDRA-971)
* change get_count to require a SlicePredicate. create multi_get_count
  (CASSANDRA-744)
* re-organized endpointsnitch implementations and added SimpleSnitch
  (CASSANDRA-994)
* added preload_row_cache option (CASSANDRA-946)
* add CRC to commitlog header (CASSANDRA-999)
* removed deprecated batch_insert and get_range_slice methods (CASSANDRA-1065)
* add truncate thrift method (CASSANDRA-531)
* http mini-interface using mx4j (CASSANDRA-1068)
* optimize away copy of sliced row on memtable read path (CASSANDRA-1046)
* replace constant-size 2GB mmaped segments and special casing for index
  entries spanning segment boundaries, with SegmentedFile that computes
  segments that always contain entire entries/rows (CASSANDRA-1117)
* avoid reading large rows into memory during compaction (CASSANDRA-16)
* added hadoop OutputFormat (CASSANDRA-1101)
* efficient streaming (no more anticompaction) (CASSANDRA-579)
* split commitlog header into separate file and add size checksum to
  mutations (CASSANDRA-1179)
* avoid allocating a new byte[] for each mutation on replay (CASSANDRA-1219)
* revise HH schema to be per-endpoint (CASSANDRA-1142)
* add joining/leaving status to nodetool ring (CASSANDRA-1115)
* allow multiple repair sessions per node (CASSANDRA-1190)
* optimize away MessagingService for local range queries (CASSANDRA-1261)
* make framed transport the default so malformed requests can't OOM the
  server (CASSANDRA-475)
* significantly faster reads from row cache (CASSANDRA-1267)
* take advantage of row cache during range queries (CASSANDRA-1302)
* make GCGraceSeconds a per-ColumnFamily value (CASSANDRA-1276)
* keep persistent row size and column count statistics (CASSANDRA-1155)
* add IntegerType (CASSANDRA-1282)
* page within a single row during hinted handoff (CASSANDRA-1327)
* push DatacenterShardStrategy configuration into keyspace definition,
  eliminating datacenter.properties. (CASSANDRA-1066)
* optimize forward slices starting with '' and single-index-block name
  queries by skipping the column index (CASSANDRA-1338)
* streaming refactor (CASSANDRA-1189)
* faster comparison for UUID types (CASSANDRA-1043)
* secondary index support (CASSANDRA-749 and subtasks)
More about Cassandra: http://www.cnblogs.com/gpcuster/tag/Cassandra/