SequoiaDB Series III: SequoiaDB's advanced features

Last Update: 2015-01-18 Source: Internet

Author: User


The previous article briefly covered SequoiaDB's basic CRUD operations; this one covers some slightly more advanced features.

The cluster environment deployed on my machine is still in place from last time, when I created a CS (collection space) named "foo", created a CL (collection) named "bar" in it, and inserted some data, so this article continues to use it.

First, let's look at the files in the database directory under SequoiaDB's installation directory:

~$ ls /opt/sequoiadb/database/data/11850

Among the files listed, we find two: foo.1.idx and foo.1.data.

They carry exactly the name of the CS we created. Is that a coincidence?

To verify, do the following:

Start two terminals: Terminal 1 and Terminal 2;
In Terminal 1, enter the SequoiaDB shell;
Connect to the database.

I won't write out these steps again; readers can carry them out on their own, which is good practice for the common operations.

First drop the collection space. Execute:

> db.dropCS("foo")

Then, in Terminal 2, list the files:

~$ ls /opt/sequoiadb/database/data/11850

This time, we find that the foo.1.data and foo.1.idx files are gone.

Go back to Terminal 1 and recreate the CS named "foo":

> db.createCS("foo")

Switch to Terminal 2 and list the files again:

~$ ls /opt/sequoiadb/database/data/11850

This time, the foo.1.data and foo.1.idx files are back.

Therefore, we can be fairly sure that a CS corresponds to one .data file and one .idx file, while a CL is a logical concept inside the data file, equivalent to a table in a relational database. So be careful when working with these files: unless it is a last resort, do not touch the *.data and *.idx files.
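The naming pattern observed above can be sketched in a few lines of plain JavaScript (meant for Node.js, not for the SequoiaDB shell). The helper name collectionSpacesFromFiles and the file "other.log" are hypothetical, and the "<cs>.1.data" / "<cs>.1.idx" pattern is assumed only from the two file names seen in this article:

```javascript
// Group data-directory file names by collection space (CS).
// Assumes names look like "<cs>.1.data" / "<cs>.1.idx", as seen above.
function collectionSpacesFromFiles(fileNames) {
  const spaces = {};
  for (const fileName of fileNames) {
    const match = /^(.+)\.1\.(data|idx)$/.exec(fileName);
    if (!match) continue; // not a CS data or index file
    const [, csName, kind] = match;
    spaces[csName] = spaces[csName] || { data: false, idx: false };
    spaces[csName][kind] = true;
  }
  return spaces;
}

// "other.log" is a made-up entry to show that unrelated files are skipped.
const listing = ["foo.1.data", "foo.1.idx", "other.log"];
console.log(collectionSpacesFromFiles(listing));
```

Feeding it the output of the ls command before and after dropping the CS would show the "foo" entry disappear and reappear.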

That was a bit of groundwork. Now for the topic of this article.

The operations above dropped the CS and then recreated one named "foo". The name is the same, but because the CS was dropped, its earlier contents are gone. So we need to create a CL named "bar" again.

Switch to Terminal 1 and create the CL:

> db.foo.createCL("bar")

First construct some data:

> docs = [
... {"name": "Milky", "age": },
... {"name": "Jim", "age": , "ip": "192.168.1.131"},
... {"name": "Tyle", "age": , "phone": "10086"},
... {"name": "Tony", "age": 33}]

Insert the data:

> db.foo.bar.insert(docs)

First, create an index

Sometimes there is a lot of data, but you want to find specific records as quickly as possible. This is when you need an index.

Before creating the index, run a query and check whether it is served by an ordinary table scan or by an index scan:

> db.foo.bar.find({"age": }).explain()

The result is:

{"Name": "foo.bar", "ScanType": "tbscan", "IndexName": "", "UseExtSort": false, "NodeName": "milky:11860", "ReturnNum": 0, "ElapsedTime": 0.000003, "IndexRead": 0, "DataRead": 0, "UserCPU": 0, "SysCPU": 0}

Then create an index:

> db.foo.bar.createIndex("Ageindex", {"age": 1})

Then run the query again:

> db.foo.bar.find({"age": }).explain()

The result is:

{"Name": "foo.bar", "ScanType": "ixscan", "IndexName": "Ageindex", "UseExtSort": false, "NodeName": "milky:11860", "ReturnNum": 0, "ElapsedTime": 0.000003, "IndexRead": 0, "DataRead": 0, "UserCPU": 0, "SysCPU": 0}

Because the data volume is so small, the difference in query time is not obvious. However, comparing the "ScanType" field in the two results shows that one query used tbscan (a table scan) and the other used ixscan (an index scan). If you have time, try inserting data at a much larger scale and running the queries again; the difference in elapsed time will then be obvious.
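To make the difference between the two scan types concrete, here is a toy model in plain JavaScript (for Node.js, not the SequoiaDB shell). It is only a sketch: the record set is generated for illustration, the function names are hypothetical, and a Map stands in for a real index structure, so it says nothing about SequoiaDB's internals.

```javascript
// Toy model of the two scan types; not SequoiaDB internals.
const records = [];
for (let i = 0; i < 100000; i++) {
  records.push({ name: "user" + i, age: i % 100 }); // generated sample data
}

// "tbscan": examine every record and test the condition.
function tableScan(rows, age) {
  let examined = 0;
  const hits = [];
  for (const row of rows) {
    examined++;
    if (row.age === age) hits.push(row);
  }
  return { hits, examined };
}

// Build an index on "age": a map from each age to the matching records.
function buildAgeIndex(rows) {
  const index = new Map();
  for (const row of rows) {
    if (!index.has(row.age)) index.set(row.age, []);
    index.get(row.age).push(row);
  }
  return index;
}

// "ixscan": look the value up in the index, touching only matching records.
function indexScan(index, age) {
  const hits = index.get(age) || [];
  return { hits, examined: hits.length };
}

const ageIndex = buildAgeIndex(records);
const viaTable = tableScan(records, 33);
const viaIndex = indexScan(ageIndex, 33);
console.log("tbscan examined", viaTable.examined, "records");
console.log("ixscan examined", viaIndex.examined, "records");
```

Both paths return the same records; the index path simply examines far fewer of them, which is why the gap in elapsed time only shows up on large collections.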

Second, delete the index

This is quite simple. A quick demonstration: knowing that the index to delete is named "Ageindex", call the interface:

> db.foo.bar.dropIndex("Ageindex")

Run the query again:

> db.foo.bar.find({"age": }).explain()

The result is:

{"Name": "foo.bar", "ScanType": "tbscan", "IndexName": "", "UseExtSort": false, "NodeName": "milky:11860", "ReturnNum": 0, "ElapsedTime": 0.000003, "IndexRead": 0, "DataRead": 0, "UserCPU": 0, "SysCPU": 0}

At this point the value of the ScanType field is "tbscan" again, indicating that the query did not use an index: the index was deleted successfully.
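The check performed by eye above can also be written down as a small helper. This is a plain JavaScript sketch (for Node.js); the function name usedIndex is hypothetical, and the two sample objects contain only the fields from the explain results in this article that the check needs.

```javascript
// Decide from an explain() result whether the query went through an index.
// The comparison is case-insensitive so "ixscan" and "Ixscan" both match.
function usedIndex(explainResult) {
  const scanType = String(explainResult.ScanType).toLowerCase();
  return scanType === "ixscan" && explainResult.IndexName !== "";
}

// Trimmed-down copies of the two kinds of result shown above.
const withIndex = { Name: "foo.bar", ScanType: "ixscan", IndexName: "Ageindex" };
const withoutIndex = { Name: "foo.bar", ScanType: "tbscan", IndexName: "" };

console.log(usedIndex(withIndex));
console.log(usedIndex(withoutIndex));
```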

Designing good indexes is an important skill for DBAs. How to create an efficient index is beyond the scope of this article and is not described here; please look it up yourself.

Third, count the number of records

This operation is also simple and straightforward.

Continuing from the operations above, execute:

> db.foo.bar.count()

Return:

4

At this point there are four records in the CL (indeed, we inserted 4).

Fourth, aggregation

In SQL, aggregation is simple syntax. In a NoSQL database without SQL statements, this is where aggregate shows its power.

Since this part involves a large number of matching operators, I use the example given on the SequoiaDB website to demonstrate:

Construct the data first:

> tom = {
... "no": ,
... "score": ,
... "interest": ["basketball", "football"],
... "major": "Computer Science and Technology",
... "dep": "Computer Academy",
... "info": {
... "name": "Tom",
... "age": ,
... "gender": "Male"
... }
... }
> sam = {
... "no": ,
... "score": ,
... "interest": ["Music"],
... "major": "Software Engineering",
... "dep": "Computer Academy",
... "info": {
... "name": "Sam",
... "age": ,
... "gender": "Male"
... }
... }
> db.foo.bar.insert(tom)
> db.foo.bar.insert(sam)

Then execute:

> db.foo.bar.aggregate({"$match": {"no": }}, {"$group": {"_id": "$major", "major": {"$first": "$major"}, "avg_age": {"$avg": "$info.age"}}})

Output:

{"major": "Software Engineering", "avg_age": 22}
{"major": "Computer Science and Technology", "avg_age": 25}

For details, please refer to the SequoiaDB website: Information Center >> Reference manual >> SequoiaDB JavaScript methods >> SdbCollection >> db.collectionspace.collection.aggregate.
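For readers who want to see what a match stage followed by a group stage computes, the following plain JavaScript sketch (for Node.js, not the SequoiaDB shell) emulates the idea with array operations. All sample records and the "no > 0" predicate are made up for illustration; they are not the values from the SequoiaDB website example.

```javascript
// Plain-JavaScript emulation of a $match + $group pipeline.
// All sample values below are made up for illustration.
const students = [
  { no: 1, major: "Physics", info: { name: "Ann", age: 20 } },
  { no: 2, major: "Physics", info: { name: "Bob", age: 24 } },
  { no: 3, major: "History", info: { name: "Cid", age: 30 } },
  { no: 0, major: "History", info: { name: "Dee", age: 50 } },
];

// $match stage: keep only records that satisfy the predicate.
const matched = students.filter((s) => s.no > 0);

// $group stage: bucket by major, tracking what $first and $avg need.
const groups = new Map();
for (const s of matched) {
  if (!groups.has(s.major)) {
    // The first record of a group supplies the $first value.
    groups.set(s.major, { major: s.major, sum: 0, count: 0 });
  }
  const g = groups.get(s.major);
  g.sum += s.info.age;
  g.count += 1;
}

// Finish $avg by dividing each group's sum by its record count.
const result = [...groups.values()].map((g) => ({
  major: g.major,
  avg_age: g.sum / g.count,
}));
console.log(result);
```

Each group ends up as one output record, which is why the aggregate call above prints one line per major.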

PS: A friend asked me privately why it is so hard to find the corresponding page on the official website. Pages in the SequoiaDB online documentation cannot be linked to directly, so I can only point to the Information Center as a whole. Much of it has to be found by browsing; if you are familiar with common database terminology, you will locate things a little faster. You can also go to the SequoiaDB community to voice your complaints.

Fifth, data splitting

In every application there is a distinction between hot data and cold data. Hot data is accessed frequently, while cold data is rarely accessed. Hot data therefore benefits from disks with better performance. If cold data is stored on the same high-performance disks as hot data, it occupies disk space there and, because it is rarely accessed, wastes resources. For this reason, hot data and cold data are usually stored separately.

This is the origin of data splitting.

Data splitting in SequoiaDB requires two data groups; the cold data is split off into the other group.

Recall that a data group was created when the cluster environment was first deployed. Here, you also need to create another group called "Colddatagroup".

For the deployment steps, refer to the first article in this series: SequoiaDB installation and deployment.

First, a schematic diagram of the split:

[Figure: schematic diagram of the data split]

Now create a new CS named "total" and, in this CS, a CL named "age" whose sharding key is "age" and whose sharding type is "range":

> db.createCS("total")
localhost:11810.total
Takes 0.193840s.
> db.total.createCL('age', {"ShardingKey": {"age": 1}, "ShardingType": "range"})
localhost:11810.total.hot
Takes 3.288509s.

Construct several records with different "age" values for this CL:

> docs = [
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": },
... {"age": 90}]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Takes 0.42086s.

Insert the data:

> db.total.age.insert(docs)
Takes 0.2119s.

Then define a rule: we don't care much about people aged 65 and over, so their records are treated as cold data. To move the records whose age value is greater than or equal to 65 to the Colddatagroup group, enter:

> db.total.age.split("Datagroup", "Colddatagroup", {"age": 65}, {"age": 100})

When the amount of data is large, this operation can take a long time, so wait patiently.
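The effect of a range split on where a record lives can be pictured with a toy router in plain JavaScript (for Node.js, not the SequoiaDB shell). This is only an illustration: the function name groupFor and the sample ages are made up, and the sketch assumes the lower bound (65) is inclusive and the upper bound (100) exclusive.

```javascript
// Toy router for a range split on "age". Group names are taken from the
// article; the routing logic is an illustration, not SequoiaDB code.
// Assumption: low bound inclusive, high bound exclusive.
const splitRange = { low: 65, high: 100 };

function groupFor(record) {
  const inColdRange =
    record.age >= splitRange.low && record.age < splitRange.high;
  return inColdRange ? "Colddatagroup" : "Datagroup";
}

// Made-up sample ages on both sides of the boundary.
const sample = [{ age: 12 }, { age: 64 }, { age: 65 }, { age: 90 }];
for (const record of sample) {
  console.log(record.age, "->", groupFor(record));
}
```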

Once the operation completes, we need to check whether the data was really split.

Open a new connection. This connection is special because it connects directly to a data node (remember the previous article mentioned this?). Doing so is not recommended in a production environment.

> node = new Sdb("milky", 18800)

This is a node in the Colddatagroup data group, the group that holds the records with age greater than or equal to 65 after the split.

Check which CLs are on this node:

> node.listCollections()

The result shows a CL whose name is exactly that of the collection in the Datagroup data group that served as the source of the split.

Query it:

> node.total.age.find()

Result output:

{"_id": {"$oid": "54ba9abe74b1303560000044"}, "age": 68}
{"_id": {"$oid": "54ba9abe74b1303560000045"}, "age": 79}
{"_id": {"$oid": "54ba9ca374b1303560000048"}, "age": 80}
{"_id": {"$oid": "54ba9abe74b1303560000046"}, "age": 85}
{"_id": {"$oid": "54ba9abe74b1303560000047"}, "age": 90}

These are the records whose age field is greater than or equal to 65.
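The same check can be expressed as a small plain JavaScript helper (for Node.js). It is only a sketch: the function name findMisplaced is hypothetical, and the records carry just the age values copied from the query output above, with the _id fields left out.

```javascript
// Return the records that do NOT belong in the cold group, i.e. age < low.
function findMisplaced(records, low) {
  return records.filter((record) => record.age < low);
}

// Ages copied from the query output above; _id fields omitted for brevity.
const coldRecords = [
  { age: 68 },
  { age: 79 },
  { age: 80 },
  { age: 85 },
  { age: 90 },
];

const misplaced = findMisplaced(coldRecords, 65);
console.log(misplaced.length === 0 ? "split looks correct" : misplaced);
```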

If you have doubts, you can also connect directly to a node in the Datagroup data group, query the records in the source group, and check that the age values there are all less than 65.

This brings the article to an end. Thank you for your patience in reading!

Next, we will get to the focus of this series: a brief analysis of the architecture of SequoiaDB. Please look forward to it!

=====>the end<=====


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
