MongoDB Learning Notes, Part 4: Indexes

Last Update:2015-12-30 Source: Internet

Author: User


Indexes speed up queries. Designing database indexes is like deciding how to organize the index of a book, with one advantage: you know in advance which queries will be run and what needs to be found quickly. For example, if every query includes the "date" key, you probably need an index on "date" (at least). If you query by user name, there is no need to index the "user_num" key, because it is never queried.
Suppose you want to look up documents by a single key:
> db.people.find({"username": "Mark"})
When a query uses only one key, indexing that key improves query speed. Here, "username" is the key to index. To create the index, use the ensureIndex method:
> db.people.ensureIndex({"username": 1})
An index needs to be created only once per collection. Creating the same index again has no effect.
An index on a key speeds up queries on that key. It may not help other queries, however, even ones that contain the indexed key. For example, the following query gets no benefit from the index created above:
> db.people.find({"date": date1}).sort({"date": 1, "username": 1})
The server must "read through the whole book" to find the desired date. This process is called a table scan: like looking for content in a book without an index, it starts at the first page and reads from front to back. In general, avoid making the server do table scans, because they are very slow on large collections.
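The difference between a table scan and an indexed lookup can be sketched in plain JavaScript. This is not MongoDB code: the `people` array, `tableScan`, `buildIndex`, and `indexLookup` are hypothetical names, and a sorted array with binary search stands in for the server's real index structure.

```javascript
// Hypothetical in-memory stand-in for a collection of 1000 documents.
const people = [];
for (let i = 0; i < 1000; i++) {
  people.push({ _id: i, username: "user" + String(i).padStart(4, "0") });
}

// Table scan: look at every document, front to back.
function tableScan(docs, username) {
  let examined = 0;
  const matches = [];
  for (const doc of docs) {
    examined++;
    if (doc.username === username) matches.push(doc);
  }
  return { matches, examined };
}

// "Index": (key, position) entries kept sorted by key.
function buildIndex(docs, key) {
  return docs
    .map((doc, pos) => ({ key: doc[key], pos }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Binary search for the first entry with key >= username,
// then collect the entries that match exactly.
function indexLookup(docs, index, username) {
  let examined = 0;
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid].key < username) lo = mid + 1;
    else hi = mid;
  }
  const matches = [];
  while (lo < index.length && index[lo].key === username) {
    matches.push(docs[index[lo].pos]);
    lo++;
  }
  return { matches, examined };
}

const usernameIndex = buildIndex(people, "username");
const scan = tableScan(people, "user0742");
const viaIndex = indexLookup(people, usernameIndex, "user0742");
console.log("table scan examined:", scan.examined);
console.log("index lookup examined:", viaIndex.examined);
```

The scan touches all 1000 documents, while the binary search needs only about log2(1000) comparisons to find the same document.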
★ Be sure to create an index that covers all the keys used in the query. For the query above, that means an index on both date and user name:
> db.people.ensureIndex({"date": 1, "username": 1})
The document passed to ensureIndex has the same form as the document passed to sort: a set of keys with a value of 1 or -1. When an index has multiple keys, the direction of each key needs to be considered.
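To make the meaning of such a key document concrete, here is a small sketch that turns a {key: 1 or -1} document into an ordering. The helper name `compareBySpec` and the sample documents are hypothetical; the sketch only illustrates how key order and direction combine, not how the server stores an index.

```javascript
// Build a comparator from a {key: 1 | -1} document, the same shape that
// sort() and ensureIndex() accept. Earlier keys take precedence.
function compareBySpec(spec) {
  const keys = Object.keys(spec);
  return (a, b) => {
    for (const key of keys) {
      if (a[key] < b[key]) return -spec[key];
      if (a[key] > b[key]) return spec[key];
    }
    return 0;
  };
}

const docs = [
  { date: 2, username: "bob" },
  { date: 1, username: "carol" },
  { date: 2, username: "alice" },
  { date: 1, username: "dave" },
];

// Ascending by date, then descending by username within each date.
const ordered = [...docs].sort(compareBySpec({ date: 1, username: -1 }));
console.log(ordered.map((d) => d.date + ":" + d.username).join(" "));
// prints "1:dave 1:carol 2:bob 2:alice"
```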
MongoDB's query optimizer reorders query terms to take advantage of indexes. For example, if you query for {"x": "foo", "y": "bar"} and have an index on {"y": 1, "x": 1}, MongoDB will figure that out and use the index.
The disadvantage of indexes is the extra overhead they add to every insert, update, and delete. The database must not only perform the operation but also record it in every index on the collection. Therefore, create as few indexes as possible.
The default maximum number of indexes per collection is 64.
"Scaling Indexes"
Suppose we have a collection that holds users' status updates. We want to query by user and date to pull up a user's most recent statuses. With what we have learned so far, we would create an index like this:
> db.status.ensureIndex({user: 1, date: -1})
This makes queries by user and date very fast, but it may not be the best choice.
Think again about the index of a book. The index entries are sorted by user name (ascending) and then by date (descending), so they look like this:
User 123 on March 13, 2010
User 123 on March 12, 2010
User 123 on March 11, 2010
User 123 on March 5, 2010
User 123 on March 4, 2010
User 124 on March 12, 2010
User 124 on March 11, 2010
...
This looks fine at this scale, but imagine the application has millions of users, each with dozens of status updates per day. If the index entries for each user's statuses take up about as much disk space as a page of paper, then every query for a user's latest statuses makes the database load a different page into memory. If the site becomes popular enough that the whole index no longer fits in memory, this will be very slow.
If you reverse the key order and change the index to {date: -1, user: 1}, the database can keep the last few days of the index in memory and swap less, so querying any user's latest statuses will be much faster.
Therefore, consider the following questions when building an index:
(1) What queries will be made? Which of those keys need to be indexed?
(2) What is the index direction for each key?
(3) How will this scale? Is there a different ordering of keys that would keep more of the frequently used data in memory?
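The effect of key order described above can be sketched with a toy index. Everything here is hypothetical (100 users, 30 days, one entry per user per day), and the "span" of index positions touched is only a rough proxy for how much of the index has to be in memory.

```javascript
// Comparator built from a {key: 1 | -1} document (hypothetical helper).
function compareBySpec(spec) {
  const keys = Object.keys(spec);
  return (a, b) => {
    for (const key of keys) {
      if (a[key] < b[key]) return -spec[key];
      if (a[key] > b[key]) return spec[key];
    }
    return 0;
  };
}

// One index entry per user per day: 100 users x 30 days = 3000 entries.
const entries = [];
for (let user = 0; user < 100; user++) {
  for (let date = 1; date <= 30; date++) {
    entries.push({ user, date });
  }
}

// How wide a stretch of the sorted index holds the entries for the latest day?
function spanOfLatestDay(spec) {
  const index = [...entries].sort(compareBySpec(spec));
  const positions = [];
  index.forEach((entry, i) => {
    if (entry.date === 30) positions.push(i);
  });
  return positions[positions.length - 1] - positions[0] + 1;
}

console.log(spanOfLatestDay({ user: 1, date: -1 })); // prints 2971
console.log(spanOfLatestDay({ date: -1, user: 1 })); // prints 100
```

With user first, the latest entries are scattered across almost the whole index. With date first, they sit together at the front.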
Indexing keys in embedded documents
Indexes can be created on keys in embedded documents in the same way as on normal keys. For example, to search the comments on blog posts by date, you can create an index on the "date" key in the array of embedded "comments" documents:
> db.blog.ensureIndex({"comments.date": 1})
Indexes on keys in embedded documents behave exactly like indexes on top-level keys, and the two can be combined in compound indexes.
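As a rough illustration of what a dotted key such as "comments.date" refers to, the sketch below collects the values found along that path, descending into arrays of embedded documents. The function `valuesAtPath` and the sample post are hypothetical and only mimic the idea; they are not how the server resolves paths internally.

```javascript
// Collect the values reachable through a dotted path, stepping into
// arrays of embedded documents along the way.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const value of current) {
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) {
          next.push(item[part]);
        }
      }
    }
    current = next;
  }
  return current;
}

const post = {
  title: "Hypothetical post",
  comments: [
    { author: "joe", date: "2010-03-11" },
    { author: "ann", date: "2010-03-13" },
  ],
};

console.log(valuesAtPath(post, "comments.date"));
// prints [ '2010-03-11', '2010-03-13' ]
```

Each value returned here corresponds to one entry that an index on that path would need for this document.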
Indexing for sorts
As a collection grows, you need indexes for any sort that queries perform on large amounts of data. If you call sort on a key that is not indexed, MongoDB has to pull all of the data into memory to sort it, so there is a limit to how much data can be sorted without an index.
Indexing the sort key lets MongoDB pull out the data in sorted order, so large data sets can be sorted without running out of memory.
Index names
Each index in a collection has a string name that uniquely identifies it; the server uses this name to delete or manipulate the index.
By default, index names have the form keyname1_dir1_keyname2_dir2_..._keynameN_dirN, where keynameX is a key in the index and dirX is that key's direction (1 or -1). This can get unwieldy if the index has many keys, but you can specify a custom name as one of the ensureIndex options:
> db.foo.ensureIndex({"a": 1, "b": 1, "c": 1, ..., "z": 1}, {"name": "alphabet"})
There is a limit on the number of characters in an index name, so particularly complex indexes must be given a custom name when they are created. You can call getLastError to check whether the index was created successfully and, if not, why.
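The default naming pattern is easy to reproduce. The helper below, `defaultIndexName`, is a hypothetical name for a sketch that joins each key with its direction in the order the keys appear.

```javascript
// Derive a name of the form keyname1_dir1_keyname2_dir2_... from a key document.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([key, direction]) => key + "_" + direction)
    .join("_");
}

console.log(defaultIndexName({ username: 1 }));           // prints "username_1"
console.log(defaultIndexName({ date: 1, username: -1 })); // prints "date_1_username_-1"
```

A 26-key index like the one above would get a name over a hundred characters long under this scheme, which is why a short custom name is useful.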
"Unique index"
A unique index guarantees that, for a given key, every document in the collection has a unique value. For example, to make sure that no two documents have the same value in the "username" key, create a unique index:
> db.people.ensureIndex({"username": 1}, {"unique": true})
Note: by default, insert does not check whether the document was actually inserted. To avoid silently losing documents that duplicate a value under a unique key, you may want to use safe inserts; then inserting such a document produces a duplicate key error.
The most familiar unique index is the one on "_id", which is created automatically whenever you create a normal collection. It differs from an ordinary unique index in only one respect: it cannot be deleted.
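The behavior of a unique index can be mimicked with a set of already-seen values. This is a minimal sketch, assuming an in-memory collection; the `UniqueCollection` class and its error message are hypothetical and do not reproduce the server's actual error format.

```javascript
// A toy collection that enforces uniqueness on one key.
class UniqueCollection {
  constructor(uniqueKey) {
    this.uniqueKey = uniqueKey;
    this.seen = new Set();
    this.docs = [];
  }

  insert(doc) {
    const value = doc[this.uniqueKey];
    if (this.seen.has(value)) {
      throw new Error("duplicate key error: " + this.uniqueKey + " = " + JSON.stringify(value));
    }
    this.seen.add(value);
    this.docs.push(doc);
  }
}

const users = new UniqueCollection("username");
users.insert({ username: "joe", age: 30 });
users.insert({ username: "ann", age: 25 });

let duplicateError = null;
try {
  users.insert({ username: "joe", age: 41 });
} catch (err) {
  duplicateError = err.message;
}
console.log(duplicateError);    // prints: duplicate key error: username = "joe"
console.log(users.docs.length); // prints 2
```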
Eliminating duplicates
When you build a unique index on an existing collection, some values may already be duplicated, in which case the index build fails. The dropDups option keeps the first document found and deletes any subsequent documents with duplicate values:
> db.people.ensureIndex({"username": 1}, {"unique": true, "dropDups": true})
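The keep-the-first rule can be sketched as follows. The function `dropDuplicates` is a hypothetical name, and "first" here simply means first in array order, which is an assumption of the sketch; which document the server encounters first is not something you should rely on.

```javascript
// Keep the first document seen for each value of `key`; drop later duplicates.
function dropDuplicates(docs, key) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    if (seen.has(doc[key])) continue;
    seen.add(doc[key]);
    kept.push(doc);
  }
  return kept;
}

const before = [
  { _id: 1, username: "joe" },
  { _id: 2, username: "ann" },
  { _id: 3, username: "joe" },
];
const after = dropDuplicates(before, "username");
console.log(after.map((d) => d._id)); // prints [ 1, 2 ]
```

Because the dropped documents are gone for good, this option is only appropriate when the duplicate data is expendable.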
Compound unique indexes
In a compound unique index, individual keys may repeat values, as long as the combination of values across all keys is unique.
Example: GridFS, the standard way of storing large files in MongoDB, uses a compound unique index. The collection that holds the file contents has a compound unique index on {files_id: 1, n: 1}, which allows documents like these:
{files_id: ObjectId("4b23c3ca7525f35f94b60a2d"), n: 1}
{files_id: ObjectId("4b23c3ca7525f35f94b60a2d"), n: 2}
{files_id: ObjectId("4b23c3ca7525f35f94b60a2d"), n: 3}
{files_id: ObjectId("4b23c3ca7525f35f94b60a2d"), n: 4}
Note that all of the "files_id" values are the same, but the "n" values differ. If you try to insert {files_id: ObjectId("4b23c3ca7525f35f94b60a2d"), n: 1} again, the database reports a duplicate key error.
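The same idea extends to several keys by checking the combination of values rather than each value separately. In this sketch the class `CompoundUniqueIndex` is hypothetical, and a plain string stands in for the ObjectId.

```javascript
// Uniqueness is enforced on the combination of values, not on each key alone.
class CompoundUniqueIndex {
  constructor(fields) {
    this.fields = fields;
    this.seen = new Set();
  }

  insert(doc) {
    const combined = JSON.stringify(this.fields.map((field) => doc[field]));
    if (this.seen.has(combined)) {
      throw new Error("duplicate key error: " + combined);
    }
    this.seen.add(combined);
  }
}

const chunksIndex = new CompoundUniqueIndex(["files_id", "n"]);
const fileId = "4b23c3ca7525f35f94b60a2d"; // string stand-in for the ObjectId

// Same files_id, different n: all four inserts are accepted.
for (let n = 1; n <= 4; n++) {
  chunksIndex.insert({ files_id: fileId, n });
}

// Repeating an existing (files_id, n) pair is rejected.
let rejected = false;
try {
  chunksIndex.insert({ files_id: fileId, n: 1 });
} catch (err) {
  rejected = true;
}
console.log(rejected); // prints true
```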
Explain: calling this method on a cursor gives you details about the query. Unlike most cursor methods, explain returns a document rather than the cursor itself.
> db.foo.find().explain()
explain returns the index used for the query (if any), the time taken, and statistics on the number of documents scanned.
For example, the index {"username": 1} works well for queries on that single key, but most queries are more complex. Suppose you run the following query and sort:
> db.people.find({"age": ...}).sort({"username": 1})
It may not be clear whether the database can use the existing index for this query, or how effective the index is if it does. explain shows which index the query used, how long it took, and how many documents the database had to scan to produce the results.
"An example of explain"
For a collection with only 64 documents and no indexes other than the "_id" index, run the simplest possible query ({}); the explain output looks similar to the following:
> db.people.find().explain()
{
    "cursor": "BasicCursor",
    "indexBounds": [],
    "nscanned": 64,
    "nscannedObjects": 64,
    "n": 64,
    "millis": 0,
    "allPlans": [
        {
            "cursor": "BasicCursor",
            "indexBounds": []
        }
    ]
}
The important points in the output are as follows:
"cursor": "BasicCursor"
This indicates that the query did not use an index (because there were no query criteria).
"nscanned": 64
This is the number of documents the database looked at. Ideally it should be as close as possible to the number of results returned.
"n": 64
This is the number of documents returned. The result here is ideal, because the number of documents scanned exactly matches the number returned. Of course, that is only because the query returns the whole collection; in other cases an exact match is hard to achieve.
"millis": 0
This is the number of milliseconds the database took to execute the query.
Now suppose there is an index on the "age" key and you want to find users between 20 and 30 years old. Running explain on this query gives:
> db.c.find({age: {$gt: 20, $lt: 30}}).explain()
{
    "cursor": "BtreeCursor age_1",
    "indexBounds": [
        [
            {
                "age": 20
            },
            {
                "age": 30
            }
        ]
    ],
    "nscanned": 14,
    "nscannedObjects": 12,
    "n": 12,
    "millis": 1,
    "allPlans": [
        {
            "cursor": "BtreeCursor age_1",
            "indexBounds": [
                [
                    {
                        "age": 20
                    },
                    {
                        "age": 30
                    }
                ]
            ]
        }
    ]
}
Because an index was used, several values in the explain output differ from the previous example:
"cursor": "BtreeCursor age_1"
Indexes are stored as B-trees, so a query that uses an index gets a cursor of type BtreeCursor.
This value also identifies the name of the index used: age_1. With this name, you can query the system.indexes collection for more information about the index (for example, whether it is unique and which keys it includes):
> db.system.indexes.find({"ns": "test.c", "name": "age_1"})
{
    "_id": ObjectId("4c0d211478b4eaaf7fb28565"),
    "ns": "test.c",
    "key": {
        "age": 1
    },
    "name": "age_1"
}
"allPlans": [...]
This key lists all of the query plans that MongoDB considered.
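Reading these fields can be automated. The sketch below summarizes the two explain outputs shown above; the `summarize` helper is hypothetical, and it assumes the field names and the "BtreeCursor <index name>" format exactly as they appear in this article's examples.

```javascript
// The key fields from the two explain outputs discussed above.
const fullScan = { cursor: "BasicCursor", nscanned: 64, nscannedObjects: 64, n: 64, millis: 0 };
const ageRange = { cursor: "BtreeCursor age_1", nscanned: 14, nscannedObjects: 12, n: 12, millis: 1 };

// Report whether an index was used, which one, and how many documents
// were scanned per document returned (1 is the ideal).
function summarize(explainOutput) {
  const prefix = "BtreeCursor ";
  const usedIndex = explainOutput.cursor.startsWith(prefix);
  const indexName = usedIndex ? explainOutput.cursor.slice(prefix.length) : null;
  const scannedPerResult =
    explainOutput.n === 0 ? null : explainOutput.nscanned / explainOutput.n;
  return { usedIndex, indexName, scannedPerResult };
}

console.log(summarize(fullScan));
console.log(summarize(ageRange));
```

For the second query, 14 scanned against 12 returned gives a ratio of roughly 1.17, which is close to ideal even though it is not a perfect match.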
A more complex example: suppose you have indexes on both {"username": 1, "age": 1} and {"age": 1, "username": 1}, and you query by user name and age:
> db.c.find({age: {$gt: 10}, username: "Sally"}).explain()
If you find that MongoDB is using a different index than the one you want, you can force it to use a particular index with hint. For example, to make MongoDB use the {"username": 1, "age": 1} index, you would write:
> db.c.find({"age": ..., "username": /.*/}).hint({"username": 1, "age": 1})
"Index management"
Metadata about indexes is stored in the system.indexes collection of each database. This is a reserved collection: you cannot insert documents into it or delete documents from it. It can only be modified through ensureIndex and the dropIndexes command.
The system.indexes collection holds detailed information about each index, and the system.namespaces collection also lists index names. If you look at that collection, you will find at least two documents for each collection: one for the collection itself and one for each index it contains. For a collection with only the standard "_id" index, system.namespaces would contain:
{"name": "test.foo"}
{"name": "test.foo.$_id_"}
If you add a compound index on the name and age keys, system.namespaces gains another document:
{"name": "test.foo.$name_1_age_1"}
"Modify Index"
You can add a new index to an existing collection at any time with ensureIndex:
> db.people.ensureIndex({"username": 1}, {"background": true})
The {"background": true} option makes the index build happen in the background while requests continue to be handled normally.
If this option is not used, the database blocks all other requests while the index is being built.
To remove an index, use the dropIndexes command with the index name.
You will usually need to look in the system.indexes collection to find the index name, because even auto-generated names vary somewhat from driver to driver.
> db.runCommand({"dropIndexes": "foo", "index": "alphabet"})
To remove all indexes, pass * as the value of "index":
> db.runCommand({"dropIndexes": "foo", "index": "*"})
Another way to remove all of a collection's indexes is to drop the collection.
"Geospatial Index"
One type of query is becoming increasingly common, especially with mobile devices: finding the N locations nearest to the current location.
MongoDB provides a special type of index for coordinate-plane queries: the geospatial index.
Suppose you want to find the nearest café to a given latitude and longitude. Because the search involves two dimensions, it needs a special index to be efficient. A geospatial index can also be created with ensureIndex:
> db.map.ensureIndex({"gps": "2d"})
Note that the value passed is "2d" rather than 1 or -1.
The value of the "gps" key must be a pair of values in some form: a two-element array or an embedded document with two keys. All of the following are valid:
{"gps": [0, 100]}
{"gps": {"x": -30, "y": 30}}
{"gps": {"latitude": -180, "longitude": 180}}
The key names can be anything; for example, {"gps": {"foo": 0, "bar": 1}} also works.
By default, a geospatial index assumes values range from -180 to 180 (convenient for latitude and longitude). To use a different range, specify the minimum and maximum values as ensureIndex options:
> db.star.trek.ensureIndex({"light-years": "2d"}, {"min": ..., "max": 1000})
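The accepted value forms and the range rule can be sketched as a small validator. The names `toPair` and `inRange` are hypothetical, and treating both bounds as inclusive is an assumption of this sketch rather than a statement about the server's exact boundary handling.

```javascript
// Normalize either accepted form (two-element array, or embedded document
// with two keys of any name) into an [a, b] pair.
function toPair(value) {
  if (Array.isArray(value)) {
    if (value.length !== 2) throw new Error("expected an array of two elements");
    return value;
  }
  const parts = Object.values(value);
  if (parts.length !== 2) throw new Error("expected a document with two keys");
  return parts;
}

// Check both coordinates against the index range (defaults: -180 to 180).
// Inclusive bounds are an assumption made for this sketch.
function inRange(value, min = -180, max = 180) {
  const [a, b] = toPair(value);
  return a >= min && a <= max && b >= min && b <= max;
}

console.log(toPair({ foo: 0, bar: 1 }));     // prints [ 0, 1 ]
console.log(inRange([0, 100]));              // prints true
console.log(inRange([0, 500]));              // prints false
console.log(inRange([0, 500], -1000, 1000)); // prints true
```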
There are two ways to run geospatial queries: as a normal query (with find) or as a database command.
Example using find:
> db.map.find({"gps": {"$near": [40, -73]}})
This returns the documents in the map collection, ordered from nearest to farthest from the point (40, -73).
If no limit is given, 100 documents are returned by default. Use limit to restrict the number of documents returned:
> db.map.find({"gps": {"$near": [40, -73]}}).limit(10)
The geoNear command does the same thing:
> db.runCommand({geoNear: "map", near: [40, -73], num: 10})
geoNear also returns the distance from the query point to each result document.
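What a nearest-first query returns can be imitated by sorting on distance. This is a brute-force sketch with hypothetical names (`near`, `places`); it assumes flat, straight-line distance on the coordinate plane and does not use an index at all, so it shows the result ordering rather than how the server finds it.

```javascript
// Return documents ordered nearest-first, each with its distance to the
// query point, limited to `limit` results (100 when no limit is given).
function near(docs, key, point, limit = 100) {
  const [px, py] = point;
  return docs
    .map((doc) => {
      const [x, y] = doc[key];
      return { doc, distance: Math.hypot(x - px, y - py) };
    })
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit);
}

const places = [
  { name: "far", gps: [50, -80] },
  { name: "here", gps: [40, -73] },
  { name: "close", gps: [41, -73] },
];

const nearestTwo = near(places, "gps", [40, -73], 2);
console.log(nearestTwo.map((r) => r.doc.name)); // prints [ 'here', 'close' ]
```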
MongoDB can find not only documents near a point but also documents within a given shape. To do so, use "$within" instead of "$near". "$within" accepts a growing number of shapes as its argument; currently it can find all points within a rectangle or a circle.
For a rectangle, use the "$box" option:
> db.map.find({"gps": {"$within": {"$box": [[10, 20], [15, 30]]}}})
The "$box" argument is a two-element array: the first element gives the coordinates of the lower-left corner, and the second gives the coordinates of the upper-right corner.
For a circle, use the "$center" option, which takes the center point and a radius:
> db.map.find({"gps": {"$within": {"$center": [[12, 25], 5]}}})
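The two shape tests amount to simple geometry. In this sketch, `withinBox` and `withinCenter` are hypothetical helpers that take arguments in the same shape as the "$box" and "$center" values above; counting points exactly on the boundary as inside is an assumption of the sketch.

```javascript
// Rectangle test: argument is [lowerLeft, upperRight].
function withinBox(point, [lowerLeft, upperRight]) {
  const [x, y] = point;
  return (
    x >= lowerLeft[0] && x <= upperRight[0] &&
    y >= lowerLeft[1] && y <= upperRight[1]
  );
}

// Circle test: argument is [center, radius].
function withinCenter(point, [center, radius]) {
  return Math.hypot(point[0] - center[0], point[1] - center[1]) <= radius;
}

const box = [[10, 20], [15, 30]];
const circle = [[12, 25], 5];

console.log(withinBox([12, 25], box));       // prints true
console.log(withinBox([16, 25], box));       // prints false
console.log(withinCenter([14, 27], circle)); // prints true
console.log(withinCenter([20, 25], circle)); // prints false
```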
Compound geospatial indexes
Example: to query on both "location" and "desc", you can create an index like this:
> db.map.ensureIndex({"location": "2d", "desc": 1})
You can then quickly find the nearest coffee shop:
> db.map.find({"location": {"$near": [-70, ...]}, "desc": "CoffeeShop"}).limit(1)

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
