Cassandra CQL 3.0: How to Implement Dynamic Columns

Source: Internet
Author: User
Tags cassandra

1. A useful feature of Cassandra is that the columns of a row are sorted by column key, so once the row key is fixed, a range query within that "row" is cheap. Officially, each wide row can hold up to 2 billion columns, although according to eBay's engineers no more than millions are used in practice. All values of one row are stored together on the same server and are never split apart.

2. The column schema is not fixed in advance; columns can be added and deleted at any time. So in practice not only the column value but also the column key can be used as a place to store data. For example, suppose I collect a server's load value every five minutes. I can then design the table as follows:

                  | Hour + minute | Hour + minute | Hour + minute | ...
------------------+---------------+---------------+---------------+-----
Device_name + Day | Load value    | Load value    | Load value    | ...

That is, the server name plus the day is the row key, the hour and minute form the column key, and the server load value is the column value.
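The layout above can be modeled in a few lines of Python. This is only an illustrative sketch, not how Cassandra is implemented: the `WideRowStore` class and its method names are hypothetical, and an in-memory dict plus a sorted list stands in for the real storage structures.

```python
import bisect
from collections import defaultdict


class WideRowStore:
    """Toy model of wide rows: row key -> columns kept sorted by column key."""

    def __init__(self):
        # For each row key: a sorted list of column keys and a dict of values.
        self._keys = defaultdict(list)
        self._values = defaultdict(dict)

    def put(self, row_key, column_key, value):
        """Add or overwrite one column; no schema change is needed."""
        if column_key not in self._values[row_key]:
            bisect.insort(self._keys[row_key], column_key)
        self._values[row_key][column_key] = value

    def columns(self, row_key):
        """Return the (column key, value) pairs of one row in sorted order."""
        return [(k, self._values[row_key][k]) for k in self._keys[row_key]]


store = WideRowStore()
# Row key = device name + day, column key = hour + minute, value = load.
store.put("device1+20150701", "0010", 5.5)
store.put("device1+20150701", "0000", 1.0)
store.put("device1+20150701", "0005", 2.0)
print(store.columns("device1+20150701"))
```

Columns come back ordered by column key regardless of insertion order, which is the property that makes range queries within one row convenient.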

3. Since CQL 3.0, statements look much more like the SQL of a traditional relational database. For example, to create a users table:

CREATE TABLE users (
    user_id int PRIMARY KEY,
    name text,
    company text
);

Here the primary key, user_id, is in fact the row key that Cassandra uses in actual storage.

A record can then be inserted:

INSERT INTO users (user_id, name, company)
VALUES (1, 'John', 'Taobao');

4. From the above, the schema of the table now appears to be fixed, so how are the original dynamic columns implemented? The simplest way is to modify the schema with ALTER TABLE and then add the column, but changing the table structure every time is cumbersome and raises performance concerns.

In fact, CQL 3.0 provides another way to solve the problem:

Step back and consider why wide rows (that is, dynamic columns) exist at all: they organize a range of data under a single row so that it can be queried conveniently, without having to locate multiple row keys. This also makes it easy to see when dynamic columns are actually needed.

From a business perspective, the row key plus the dynamic column key uniquely identifies a value, much like a primary key in an RDBMS. In CQL 3.0, dynamic columns can be built with a CREATE TABLE statement like the following, which uses the example from paragraph 2:

CREATE TABLE device_load (
    device_and_day text,
    hour_and_minute text,
    load_value float,
    PRIMARY KEY (device_and_day, hour_and_minute)
);

That is, the data model combines the row key and the dynamic column key into the primary key: the first element of the primary key is the row key (the partition key in CQL terms), and the elements after it form the column key (the clustering columns).

With this form, Cassandra's underlying storage can be pictured as:

Row Key            | Columns
-------------------+--------------------+--------------------+--------------------+-----
                   | 0000: "Load value" | 0005: "Load value" | 0010: "Load value" | ...
device1+20150701   +--------------------+--------------------+--------------------+-----
                   | 1.0                | 2.0                | 5.5                | ...
-------------------+--------------------+--------------------+--------------------+-----
                   | 0000: "Load value" | 0005: "Load value" | 0010: "Load value" | ...
device2+20150701   +--------------------+--------------------+--------------------+-----
                   | 2.0                | 3.0                | 10.0               | ...
-------------------+--------------------+--------------------+--------------------+-----


In this case, once the row key is fixed, the column key can still be queried by range, for example:

SELECT load_value FROM device_load
WHERE device_and_day = 'device1+20150701'
  AND hour_and_minute >= '0000' AND hour_and_minute <= '1200';
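Conceptually, a range query inside one row amounts to a binary search over that row's sorted column keys. The following Python sketch illustrates the idea only; `slice_row` is a hypothetical helper, and the sample values after `0010` are made up for illustration.

```python
import bisect

# One wide row: column keys (hour + minute) kept in sorted order, with values.
# Sample data only; entries after "0010" are invented for this sketch.
column_keys = ["0000", "0005", "0010", "1155", "1200", "1205"]
load_values = [1.0, 2.0, 5.5, 3.0, 4.0, 6.0]


def slice_row(keys, values, low, high):
    """Return (key, value) pairs with low <= key <= high using binary search."""
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return list(zip(keys[start:end], values[start:end]))


result = slice_row(column_keys, load_values, "0000", "1200")
print(result)
```

Because the column keys are fixed-width strings such as `0005` and `1200`, string order matches time order, which is why the hour-and-minute key is zero-padded.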

5. The query pattern Cassandra supports is therefore to fix the row key and then search a range within that row. The row key itself does not support direct range lookups, only = and IN; a range lookup over row keys requires the token function.
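Why row keys need the token function for ranges can be sketched as follows: rows are placed by a hash of the row key, so the only order a scan can follow is token order, not key order. This is a rough illustration under stated assumptions. `toy_token` and `scan_token_range` are hypothetical names, and MD5 is used only because it ships with Python's standard library; it is not a claim about which hash a given cluster uses.

```python
import hashlib


def toy_token(row_key):
    """Stand-in for the partitioner hash (assumption: MD5, for illustration)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


row_keys = ["device1+20150701", "device2+20150701", "device3+20150701"]

# Rows are placed by token, so token order is the order a scan follows.
by_token = sorted(row_keys, key=toy_token)


def scan_token_range(keys, low, high):
    """Return row keys whose token lies in (low, high], in token order."""
    return [k for k in sorted(keys, key=toy_token) if low < toy_token(k) <= high]


# Paging through all rows: continue after the token of the last row key seen.
first = by_token[0]
rest = scan_token_range(row_keys, toy_token(first), 2**63 - 1)
print(first, rest)
```

The last step mirrors the usual paging pattern over row keys, where the next page is requested with a condition on the token of the last row key already seen.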

In addition, columns that are not part of the primary key do not support direct = queries; a secondary index has to be created to support them. A Cassandra index is not a B-tree-style index and does not support range queries; it behaves more like a hash index. My guess is that Cassandra's secondary index is implemented within each SSTable, which is why a global query cannot be implemented.
