Getting started with MongoDB (4): MongoDB Index

Last Update:2018-06-06 Source: Internet

Author: User

The previous article covered MongoDB's basic operations: inserting, deleting, querying, and updating documents. Without an index, a query has to scan every document in the collection and add each document that matches the query conditions to the result set. For a small collection this cost is negligible, but for a large collection a full scan is very expensive. Indexes exist to speed up queries. An index works like the table of contents of a book: it lets you go straight to the content you are looking for and avoids unnecessary searching.

1. Index type

You can create an index on a single field or on multiple fields, depending on your needs. The order of the fields in an index definition also matters. Indexes are created with the ensureIndex() method, which takes a document that specifies the fields to index and the direction of each one: 1 means ascending order and -1 means descending order. Note that ensureIndex() has been deprecated since MongoDB 3.0 in favor of createIndex(), which accepts the same arguments; the examples in this article keep ensureIndex().

1). Default Index

Remember "_id"? The values in this field must be unique. MongoDB creates an index on it by default, and this index cannot be deleted.

2). Single Column Index

An index created on a single field is a single column index. It speeds up queries on that key but does not help queries on other keys. For a single column index, the direction (ascending or descending) makes no difference to queries on the key. To create a single column index:

> db.people.ensureIndex({"name" : 1})

3). Composite Index

You can also create a composite index on multiple keys. Both the position of each key and its direction affect query efficiency. Consider the following two indexes:

> db.people.ensureIndex({"name" : 1, "age" : 1})
> db.people.ensureIndex({"age" : 1, "name" : 1})

The first index is organized by name, and entries with the same name are then ordered by age. It is therefore efficient for queries on {"name" : 1} and on {"name" : 1, "age" : 1}. The second index is organized by age, and entries with the same age are then ordered by name, so it is efficient for queries on {"age" : 1} and on {"age" : 1, "name" : 1}. In general, a composite index with many fields helps queries that use its leading keys.
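The leading-key rule above can be illustrated without a server. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: the helper name coversPrefix is hypothetical, and it only checks whether the fields used by a query form a leading run of a composite index's keys.

```javascript
// Toy model: a composite index is described by its ordered list of key names.
// coversPrefix is a hypothetical helper, not part of the MongoDB API.
function coversPrefix(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  if (wanted.size === 0) return false;
  // Walk the index keys in order and count how many leading keys the query uses.
  let matched = 0;
  for (const key of indexKeys) {
    if (!wanted.has(key)) break;
    matched++;
  }
  // Every query field must belong to that leading run.
  return matched === wanted.size;
}

const nameAge = ["name", "age"]; // models {"name" : 1, "age" : 1}
const ageName = ["age", "name"]; // models {"age" : 1, "name" : 1}

console.log(coversPrefix(nameAge, ["name"]));        // true
console.log(coversPrefix(nameAge, ["age"]));         // false
console.log(coversPrefix(ageName, ["age"]));         // true
console.log(coversPrefix(nameAge, ["age", "name"])); // true
```

The sketch ignores sort direction and range conditions; it is only meant to show why the position of a key inside a composite index matters.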

4). Embedded Document Index

You can also create indexes on keys inside embedded documents, in the same way as on ordinary keys. Composite indexes on embedded keys work as well:

> db.people.ensureIndex({"friends.name" : 1})
> db.people.ensureIndex({"friends.name" : 1, "friends.age" : 1})

Let's take a look at several other forms of indexes:

Unique index
> db.people.ensureIndex({"name" : 1}, {"unique" : true})
> db.people.ensureIndex({"name" : 1}, {"unique" : true, "dropDups" : true})

Sparse (loose) index
> db.people.ensureIndex({"name" : 1}, {"sparse" : true})

Multikey (multi-value) index
> db.people.find()
{"name" : ["mary", "rose"]}
> db.people.ensureIndex({"name" : 1})

A unique index guarantees that each value of the indexed key appears only once in the collection. If the field already contains duplicate values when the unique index is created, creation fails. In the MongoDB versions this article was written for, adding the dropDups option removes the duplicates: the first document found is kept and the other documents with the same value are deleted. The dropDups option was removed in MongoDB 3.0, so on later versions duplicates have to be cleaned up before the index is created.
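The uniqueness guarantee can be sketched in a few lines of plain JavaScript. This is only an illustration under simple assumptions, not how MongoDB stores indexes: UniqueIndex is a hypothetical class that maps each key value to at most one document and rejects a second document with the same value.

```javascript
// Toy model of a unique index: one key value may point to at most one document.
// UniqueIndex is a hypothetical class for illustration, not a MongoDB API.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.entries = new Map(); // key value -> document
  }

  insert(doc) {
    const key = doc[this.field];
    if (this.entries.has(key)) {
      throw new Error(`duplicate key: ${this.field} = ${JSON.stringify(key)}`);
    }
    this.entries.set(key, doc);
  }
}

const byName = new UniqueIndex("name");
byName.insert({ name: "mary", age: 10 });
try {
  byName.insert({ name: "mary", age: 12 }); // same name, so this is rejected
} catch (err) {
  console.log(err.message); // duplicate key: name = "mary"
}
```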

Some documents in a collection may lack certain fields. If we do not want such documents to be included in an index on one of those fields, we define the index as a sparse index with the sparse option. For example, when creating an index on name, we may find that some people in the database only have a student ID and no name; declaring the index as sparse leaves those documents out of it. Note that a document whose field is present but explicitly set to null still gets an entry in a sparse index.
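The difference between a sparse and a regular index can be sketched as follows. This is a toy model in plain JavaScript rather than MongoDB code: buildEntries is a hypothetical helper that produces one index entry per document and, in sparse mode, skips documents that do not contain the field at all.

```javascript
// Toy model: build index entries for one field, optionally skipping documents
// that do not contain the field (the "sparse" behaviour).
// buildEntries is a hypothetical helper, not a MongoDB API.
function buildEntries(docs, field, sparse) {
  const entries = [];
  docs.forEach((doc, position) => {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (sparse && !hasField) return; // sparse: leave this document out
    // A non-sparse index stores null for documents without the field.
    entries.push({ key: hasField ? doc[field] : null, position });
  });
  return entries;
}

const people = [
  { name: "mary", studentId: 1 },
  { studentId: 2 },             // no name field at all
  { name: null, studentId: 3 }, // name present but null
];

console.log(buildEntries(people, "name", true).length);  // 2
console.log(buildEntries(people, "name", false).length); // 3
```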

When the value of a key is an array, an index on that key is a multikey index: MongoDB generates one index entry for each value in the array. The array is effectively split into several independent index entries, all of which point to the same document.
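The "one entry per array element" idea can be sketched in plain JavaScript. This is an illustration only: multikeyEntries is a hypothetical helper, and the position number stands in for whatever reference a real index keeps to the document.

```javascript
// Toy model of a multikey index: an array value yields one entry per element,
// and every entry points back to the same document.
// multikeyEntries is a hypothetical helper, not a MongoDB API.
function multikeyEntries(docs, field) {
  const entries = [];
  docs.forEach((doc, position) => {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, position });
    }
  });
  // Keep the entries ordered by key, as an index would.
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.position - b.position
  );
}

const friends = [{ name: ["mary", "rose"] }, { name: "amy" }];
console.log(multikeyEntries(friends, "name"));
// [ { key: 'amy', position: 1 }, { key: 'mary', position: 0 }, { key: 'rose', position: 0 } ]
```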

2. Manage Indexes

Indexes exist to serve queries, and one can be created on any key. However, indexes take up storage space, so more indexes is not always better. Every insert, update, and delete also costs more, because the operation has to be applied not only to the documents but also to every index on the collection. Create indexes according to your actual needs, and delete any index that is no longer useful.

Indexes are created with ensureIndex(). Once an index exists, getIndexes() lists the indexes on the collection:

> db.people.ensureIndex({"name" : 1, "age" : 1})
> db.people.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "ns" : "test.people",
                "name" : "_id_"
        },
        {
                "v" : 1,
                "key" : {
                        "name" : 1,
                        "age" : 1
                },
                "ns" : "test.people",
                "name" : "name_1_age_1"
        }
]

The people collection now has two indexes: the default index on "_id", and the composite index on name and age. By default an index is named keyname1_dir_keyname2_dir_..., where keyname is an index key and dir is its direction (1 for ascending, -1 for descending). You can also give the index a custom name:

> db.people.ensureIndex({"name" : 1, "age" : 1}, {"name" : "myIndex"})
> db.people.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "ns" : "test.people",
                "name" : "_id_"
        },
        {
                "v" : 1,
                "key" : {
                        "name" : 1,
                        "age" : 1
                },
                "ns" : "test.people",
                "name" : "myIndex"
        }
]

Indexes are deleted with dropIndex():

Method 1:
> db.people.dropIndex({"name" : 1, "age" : 1})
{ "nIndexesWas" : 2, "ok" : 1 }

Method 2:
> db.runCommand({"dropIndexes" : "people", "index" : "myIndex"})
{ "nIndexesWas" : 2, "ok" : 1 }
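Because an index can be dropped by name, it helps to know which name an index gets when none is supplied. The default naming rule described earlier can be sketched in plain JavaScript; defaultIndexName is a hypothetical helper written for this illustration, and the built-in "_id" index is a special case that this rule does not cover (its name is "_id_").

```javascript
// Sketch of the default naming rule: join each key and its direction with
// underscores, in the order the keys appear in the index definition.
// defaultIndexName is a hypothetical helper, not a MongoDB API.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ name: 1, age: 1 })); // name_1_age_1
console.log(defaultIndexName({ number: 1 }));       // number_1
console.log(defaultIndexName({ age: -1, name: 1 })); // age_-1_name_1
```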

Index metadata is stored in the system.indexes collection of each database. You cannot insert documents into or delete documents from this collection directly; it can only be changed through ensureIndex() and dropIndex().

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.people", "name" : "_id_" }
{ "v" : 1, "key" : { "name" : 1, "age" : 1 }, "ns" : "test.people", "name" : "myIndex" }

Removing all documents from a collection does not delete its indexes; they still exist. Dropping the collection itself, however, also deletes its indexes.

3. Index Efficiency

When several indexes are defined, MongoDB reorders the conditions of a query as needed and chooses the most suitable index. For example, suppose we have created two indexes, {"name" : 1, "age" : 1} and {"age" : 1, "class" : 1}, and the query is find({"age" : 10, "name" : "mary"}). MongoDB treats it as find({"name" : "mary", "age" : 10}) and uses the index {"name" : 1, "age" : 1}.
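The selection in this example can be mimicked with a deliberately simple rule. The sketch below is plain JavaScript and an assumption-laden toy, not MongoDB's query planner, which is more involved: the hypothetical helper pickIndex picks the index whose leading keys cover the most query fields, regardless of the order in which the fields were written in the query.

```javascript
// Count how many leading keys of an index are used by the query.
function prefixLength(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  let length = 0;
  for (const key of indexKeys) {
    if (!wanted.has(key)) break;
    length++;
  }
  return length;
}

// Toy selection rule (hypothetical, not the real planner): prefer the index
// with the longest usable prefix; return null when no index helps.
function pickIndex(indexes, queryFields) {
  let best = null;
  let bestLength = 0;
  for (const keys of indexes) {
    const length = prefixLength(keys, queryFields);
    if (length > bestLength) {
      best = keys;
      bestLength = length;
    }
  }
  return best;
}

const indexes = [
  ["name", "age"],  // models {"name" : 1, "age" : 1}
  ["age", "class"], // models {"age" : 1, "class" : 1}
];

// The query lists age before name, yet the name/age index is chosen.
console.log(pickIndex(indexes, ["age", "name"])); // [ 'name', 'age' ]
```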

MongoDB provides the explain tool, which returns a lot of useful information about a query. Call explain() on the cursor to get the details. Next, we add 100,000 documents to the math collection and compare query efficiency before and after creating an index:

> var arr = [];
> for(var i = 0; i < 100000; i++){
... var doc = {};
... var value = Math.floor(Math.random() * 1000);
... doc["number"] = value;
... arr.push(doc);
... }
100000
> db.math.insert(arr)
> db.math.count()
100000
> db.math.find().limit(10)
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fe5"), "number" : 462 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fe6"), "number" : 123 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fe7"), "number" : 90 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fe8"), "number" : 46 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fe9"), "number" : 244 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fea"), "number" : 972 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61feb"), "number" : 925 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fec"), "number" : 110 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fed"), "number" : 739 }
{ "_id" : ObjectId("53a7f7c6e4fd24348ce61fee"), "number" : 945 }

A for loop adds 100,000 documents to the arr array, which is then inserted into the math collection in one batch; the last command shows the first 10 documents. Because the values are generated randomly, the number field contains duplicate values. We will query for the value 462:

Before creating an index:
> db.math.find({"number" : 462}).explain()
{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 94,
        "nscannedObjects" : 100000,
        "nscanned" : 100000,
        "nscannedObjectsAllPlans" : 100000,
        "nscannedAllPlans" : 100000,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 35,
        "indexBounds" : { },
        "server" : "server0.169:9352"
}

After creating an index:
> db.math.ensureIndex({"number" : 1})
> db.math.find({"number" : 462}).explain()
{
        "cursor" : "BtreeCursor number_1",
        "isMultiKey" : false,
        "n" : 94,
        "nscannedObjects" : 94,
        "nscanned" : 94,
        "nscannedObjectsAllPlans" : 94,
        "nscannedAllPlans" : 94,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : { "number" : [ [ 462, 462 ] ] },
        "server" : "server0.169:9352"
}

Here are the most useful fields. "cursor" shows which index was used, "nscanned" shows how many documents (or index entries) were scanned, "n" is the number of documents returned, and "millis" is the query time in milliseconds. Before the index is created, no index is used (BasicCursor): the query scans all 100,000 documents and takes 35 milliseconds. After the index is created, the number_1 index is used (BtreeCursor), since indexes are stored in a B-tree structure: the query scans only 94 entries and takes almost no time.
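The gap between the two explain() results can be reproduced in miniature without a database. The sketch below is plain JavaScript under stated assumptions: a sorted array with binary search stands in for the B-tree, the values are generated deterministically (i % 1000) instead of randomly so that the counts are predictable, and the names fullScan and indexScan are hypothetical.

```javascript
// 100,000 documents; each value 0..999 occurs exactly 100 times.
const docs = [];
for (let i = 0; i < 100000; i++) {
  docs.push({ _id: i, number: i % 1000 });
}

// Without an index: every document has to be examined.
function fullScan(docs, target) {
  let examined = 0;
  let n = 0;
  for (const doc of docs) {
    examined++;
    if (doc.number === target) n++;
  }
  return { n, examined };
}

// Stand-in for an index on "number": entries sorted by key.
const index = docs
  .map((doc, position) => ({ key: doc.number, position }))
  .sort((a, b) => a.key - b.key || a.position - b.position);

// With the index: binary search for the first matching entry,
// then walk forward only while the key still matches.
function indexScan(index, docs, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < target) lo = mid + 1;
    else hi = mid;
  }
  let examined = 0;
  const result = [];
  for (let i = lo; i < index.length && index[i].key === target; i++) {
    examined++;
    result.push(docs[index[i].position]);
  }
  return { n: result.length, examined };
}

console.log(fullScan(docs, 462));         // { n: 100, examined: 100000 }
console.log(indexScan(index, docs, 462)); // { n: 100, examined: 100 }
```

Both functions return the same number of matches, but the indexed lookup examines only the matching entries, which mirrors the drop in "nscanned" from 100000 to 94 in the output above.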

When there are several indexes, MongoDB automatically chooses one for the query. You can also use hint() to force a particular index. Here the index {"age" : 1, "name" : 1} is forced:

> db.people.find({"age" : {"$gt" : 10}, "name" : "mary"}).hint({"age" : 1, "name" : 1})

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
