MongoDB Database Index

Source: Internet
Author: User
Tags createindex mongodb






Indexes can often greatly improve query efficiency. Without an index, MongoDB must scan every document in the collection when reading data and pick out the records that match the query criteria. Such a full collection scan is very inefficient, especially on large amounts of data: a query can take tens of seconds or even several minutes, which is fatal to a website's performance. This article describes MongoDB indexes in detail





Introduction





How does an index improve query efficiency? The following uses the performance analysis function explain() to demonstrate



First, insert 100,000 documents






Next, without creating an index, look for documents whose time value is between 100 and 200






The explain() output shows a totalDocsExamined value of 100000, meaning 100,000 documents were examined; an nReturned value of 101, meaning 101 documents were returned; and an executionTimeMillis value of 39, meaning the query took 39ms



Next, create an index on the time field






Again, look for documents whose time value is between 100 and 200






The output shows that totalDocsExamined and nReturned are both 101 and executionTimeMillis is 0: only 101 documents were examined to return 101 documents, and the query time is close to 0. This shows that using an index greatly improves query speed
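The difference between the two explain() results can be reproduced with a small in-memory model. This is only a sketch in plain JavaScript, not MongoDB code: it assumes the 100,000 documents hold time values 0 to 99999 (the article does not show the inserted data), and it stands in for the index with a sorted array and a binary search. The names collectionScan, indexScan and lowerBound are hypothetical.

```javascript
// Hypothetical data set: 100,000 documents with time = 0..99999 (an assumption).
const docs = Array.from({ length: 100000 }, (_, i) => ({ time: i }));

// Without an index: every document has to be examined.
function collectionScan(allDocs, lo, hi) {
  let examined = 0;
  let returned = 0;
  for (const doc of allDocs) {
    examined += 1;
    if (doc.time >= lo && doc.time <= hi) returned += 1;
  }
  return { examined, returned };
}

// Toy "index": the documents ordered by time (a real index is a B-tree).
const timeIndex = [...docs].sort((a, b) => a.time - b.time);

// Binary search for the first position whose time is >= value.
function lowerBound(index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].time < value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// With the index: jump to the start of the range and walk to its end.
function indexScan(index, lo, hi) {
  let examined = 0;
  for (let i = lowerBound(index, lo); i < index.length && index[i].time <= hi; i += 1) {
    examined += 1;
  }
  return { examined, returned: examined };
}

const scanResult = collectionScan(docs, 100, 200);
const indexResult = indexScan(timeIndex, 100, 200);
console.log(scanResult);  // { examined: 100000, returned: 101 }
console.log(indexResult); // { examined: 101, returned: 101 }
```

The counts play the role of totalDocsExamined and nReturned; the timing itself is not modeled.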





Overview





An index is a special data structure that stores a small portion of the collection's data in an easy-to-traverse form. It stores the values of a specific field or set of fields, sorted by the field values specified in the index



An index speeds up the queries that use it, but it also brings some disadvantages



1. It consumes more disk space. If there are many indexes, the index files may take up more space than the data itself



2. When data is written or updated, the indexes must be maintained in addition to the write itself, which reduces write performance to some extent



However, for efficient querying, these costs are worthwhile. In many cases, degraded system performance is related to unreasonable index creation. Creating indexes sensibly therefore limits their negative impact





Index settings





"getIndexes()"



Use the getIndexes() method to list the indexes of a collection


db.collection_name.getIndexes()


The output shows two indexes, on "_id" and "time"






"createIndex()"


db.collection_name.createIndex({key: 1})


In this syntax, key is the field to index and 1 specifies an ascending index; to create a descending index, specify -1.






Of course, you can also create an index on multiple fields


db.collection_name.createIndex({k1: 1, k2: 1})


Before MongoDB 3.0, the ensureIndex() method was used. Where ensureIndex() is still available, it is just an alias of the createIndex() method



If the collection holds many documents, creating an index can take a while. If the system is heavily loaded and already contains many documents, do not create the index directly with this command; create the index before the database goes into use. Otherwise, the performance of the database is severely impacted



[Note] Creating an index is repeatable: if createIndex() is called again for an index that already exists, it simply returns success






createIndex() accepts optional parameters. The list of optional parameters is as follows:




Parameter | Type | Description
background | Boolean | Index creation blocks other database operations. Set background to true to build the index in the background. The default value is false.
unique | Boolean | Whether the index is unique. Specify true to create a unique index. The default value is false.
name | string | The name of the index. If not specified, MongoDB generates a name by joining the indexed field names and their sort orders.
dropDups | Boolean | Whether to delete duplicate records when creating a unique index. Specify true to delete them. The default value is false.
sparse | Boolean | Do not index documents that lack the indexed field. If set to true, documents that do not contain the field are not found through the index. The default value is false.
v | index version | The version number of the index. The default depends on the version of mongod running when the index is created.
weights | document | The index weight, a value between 1 and 99,999, indicating the score weight of a field relative to the other indexed fields.
expireAfterSeconds | integer | A value in seconds that sets the TTL, that is, the lifetime of documents in the collection.
default_language | string | For text indexes, determines the rules for stop words, stemming, and tokenizing. The default is English.
language_override | string | For text indexes, the name of the document field that contains the language to use for that document. The default value is language.



db.db_coll1.createIndex({time: 1}, {background: true})


"dropIndex()"



Use the db.collection_name.dropIndex({key: 1}) method to delete the specified index


db.collection_name.dropIndex({key: 1})





[Note] The _id index cannot be deleted






Besides using the key specification, you can also delete an index by its name



As shown below, the name of the {time: 1} index is "time_1", so the index can be deleted with db.db_coll1.dropIndex("time_1")
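The way a default name such as "time_1" comes about (field names joined with their sort orders, as the name parameter's description says) can be sketched in plain JavaScript. The helper defaultIndexName is hypothetical and only mirrors the naming rule; it does not talk to a database.

```javascript
// Hypothetical helper: derive the default index name from a key specification
// by joining each field name with its direction, using underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ time: 1 }));      // "time_1"
console.log(defaultIndexName({ x: 1, y: -1 }));  // "x_1_y_-1"
```

A name computed this way is what you would pass to dropIndex() when deleting an index by name, assuming no explicit name option was given at creation time.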






"dropIndexes()"



Use the db.collection_name.dropIndexes() method to delete all indexes (the _id index remains, since it cannot be deleted)


db.collection_name.dropIndexes()







Index properties





"TTL"



An expiring index, also known as a TTL index, is a special single-field index that automatically deletes documents after a certain period of time. In other words, documents in the collection have a validity period, and documents past that period are invalidated and removed. This suits data that does not need to be kept once it expires, such as machine-generated event data, logs, and session information



An expiring index is also created with the createIndex() method, with the expireAfterSeconds option in the second parameter specifying after how many seconds a document expires. The indexed field holds a date value or an array of date values


db.eventlog.createIndex({x: 1}, {expireAfterSeconds: 3600})


In the following example, the document containing the time field is deleted after 60s






There are a few things to note when using expiring indexes:



1. The value stored in the indexed field must be a date type: an ISODate or an array of ISODates. A timestamp cannot be used; otherwise the document is not deleted automatically



In the following example, time is set to an ISODate value, and the document is automatically deleted after 60s






In the following example, time is set to a timestamp value, and the document is not deleted after 60s






2. If an array of ISODates is specified, the document is deleted according to the earliest time in the array



3. An expiring index cannot be a composite index



4. The deletion time is not exact. The removal process is run by a background task every 60s, and the deletion itself takes some time, so there is a margin of error. Even if the expiration time is less than 60s away, the document may remain for about 60s or longer before it is deleted.
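Notes 1 and 2 above can be summarized as a small predicate. The following is a plain JavaScript sketch of the rule, not MongoDB's implementation; the function name isExpired and the sample dates are hypothetical, and the 60s background interval from note 4 is deliberately not modeled.

```javascript
// Hypothetical predicate: would a document be eligible for TTL deletion?
// fieldValue         value of the indexed field (Date, array of Dates, or anything else)
// expireAfterSeconds the TTL index option
// now                the current time as a Date
function isExpired(fieldValue, expireAfterSeconds, now) {
  const values = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
  const dates = values.filter((value) => value instanceof Date);
  if (dates.length === 0) return false; // non-date values (e.g. numeric timestamps) never expire
  // For an array of dates, the earliest one decides.
  const earliest = Math.min(...dates.map((date) => date.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const created = new Date("2024-01-01T00:00:00Z");
const later = new Date("2024-01-01T00:01:30Z");
const now = new Date("2024-01-01T00:02:00Z");

console.log(isExpired(created, 60, now));           // true: 60s have passed
console.log(isExpired(created, 3600, now));         // false: one hour has not passed
console.log(isExpired(created.getTime(), 60, now)); // false: a number is not a date
console.log(isExpired([later, created], 60, now));  // true: the earliest date decides
```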



"Uniqueness"



An index can be unique. To create a unique index, set the unique option to true; the default is false. A unique index ensures that the indexed field does not store duplicate values, that is, it enforces uniqueness of the indexed field. By default, MongoDB automatically creates a unique index on the _id field when a collection is created


db.collection_name.createIndex({key: 1}, {unique: true})


As shown, by default an index is not unique






Duplicate values cannot be inserted after setting unique: true






Common errors



As shown, field a has a unique index, yet duplicate values cannot be inserted for field b. This is because a is uniquely indexed: inserting the document {b: 10} is equivalent to inserting a: null, and inserting {b: 10} again is equivalent to inserting a: null a second time. The two a: null values are duplicates, which the unique index on a does not allow. Therefore, the duplicate {b: 10} cannot be inserted
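The reasoning behind this common error can be traced with a toy model of a unique index. This is a plain JavaScript sketch under the assumption stated in the text (a missing field is indexed as null); the class UniqueIndex is hypothetical and is not how MongoDB stores index keys.

```javascript
// Hypothetical model of a unique index on one field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  // Returns true if the document is accepted, false on a duplicate key.
  insert(doc) {
    // A document without the field is indexed under the key null.
    const key = this.field in doc ? doc[this.field] : null;
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

const uniqueOnA = new UniqueIndex("a");
const firstInsert = uniqueOnA.insert({ b: 10 });       // accepted, key null
const secondInsert = uniqueOnA.insert({ b: 10 });      // rejected, key null again
const thirdInsert = uniqueOnA.insert({ a: 1, b: 10 }); // accepted, key 1
console.log(firstInsert, secondInsert, thirdInsert);   // true false true
```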






"Sparsity"



An index can be sparse. To create a sparse index, set the sparse option to true; the default is false


db.collection_name.createIndex({key: 1}, {sparse: true})


Sparseness determines how MongoDB handles documents that lack the indexed field.



With a sparse index, also known as a gap index, documents that lack the indexed field have no entry in the index, which leaves gaps in the index.



Suppose an index on the x field is created in a collection, and a document without an x field is inserted. By default, MongoDB still creates an index entry for the missing field. If the index is created as a sparse index, no entry is created for that document



If many documents in the collection have no value for the indexed field, a sparse index reduces disk usage and speeds up insertion



$exists



Sparse indexes have some pitfalls. MongoDB provides the $exists operator, which tests whether a field exists






As shown, a sparse index {m: 1} was created, and find() still returns results when looking for documents in which the m field does not exist. This is because MongoDB does not use the sparse index for this query



If you use the hint() method to force the sparse index when looking for documents in which the indexed field does not exist, no results are returned. In other words, a sparse index cannot be used to find documents that lack the indexed field.
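The two outcomes described above (results without hint(), none with it) follow directly from what a sparse index contains. The sketch below models this in plain JavaScript with hypothetical sample documents; the field name m comes from the example, everything else is an assumption for illustration.

```javascript
// Hypothetical collection: two documents have m, one does not.
const collection = [
  { _id: 1, m: 5 },
  { _id: 2, n: 7 },
  { _id: 3, m: 9 },
];

// A sparse index on m only holds entries for documents that contain m.
const sparseIndexOnM = collection.filter((doc) => "m" in doc);

// {m: {$exists: false}} answered by scanning the whole collection.
const viaCollectionScan = collection.filter((doc) => !("m" in doc));

// The same condition answered only from the sparse index, as when hint() forces it.
const viaSparseIndex = sparseIndexOnM.filter((doc) => !("m" in doc));

console.log(viaCollectionScan.map((doc) => doc._id)); // [ 2 ]
console.log(viaSparseIndex.length);                   // 0
```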








Type of index





MongoDB supports creating an index on any field of the documents in a collection. By default, every collection has an index on the _id field. Depending on business needs, additional indexes can be created for important queries and operations. These can be single-field indexes, multi-field (composite) indexes, multi-key indexes, geospatial indexes, full-text indexes, and so on.



MongoDB supports 6 types of indexes:



1. _id Index



2. Single-Key index



3. Multi-Key index



4. Composite Index



5. Full-Text Indexing



6. Location Index



"_id Index"



The _id index is created by default for most collections, and MongoDB automatically generates a unique _id field for each inserted document



As shown, the _id index already exists before any other index is created






"Single Key Index"



The single-key index is the most common index. Unlike the _id index, it is not created automatically



For example, given a record of the form {x:1,y:2,z:3}, create an index on the x field; queries can then use x as a condition






"Multi-Key Index"



In MongoDB, you can create an index on an array field. MongoDB creates an index entry for each element of the array. Multi-key indexes support efficient queries on array fields. They can be created on arrays of strings, numbers, and nested documents



A multi-key index is created in the same way as a single-key index; the difference lies in the value of the field. The value of a single-key index field is a single value, such as a string, number, or date. The value of a multi-key index field holds multiple values, such as an array



If an indexed field holds an array value, MongoDB automatically treats the index as a multi-key index; you do not have to specify this explicitly. However, inserting array data does not by itself create an index: getIndexes() shows no index on the field unless you explicitly create one
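The statement that one index entry is created per array element can be pictured with a few lines of plain JavaScript. This is a simplified sketch: the helper multikeyEntries and the sample documents are hypothetical, and details such as empty arrays or repeated elements are not modeled.

```javascript
// Hypothetical helper: list the index entries one document contributes
// to an index on the given field.
function multikeyEntries(doc, field) {
  const value = doc[field];
  // An array contributes one entry per element; a scalar contributes one entry.
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, docId: doc._id }));
}

const arrayEntries = multikeyEntries({ _id: 1, tags: ["a", "b", "c"] }, "tags");
const scalarEntries = multikeyEntries({ _id: 2, tags: "a" }, "tags");

console.log(arrayEntries.length);  // 3
console.log(scalarEntries.length); // 1
```

Because every element has its own entry, a query for a single element value can locate the document through the index.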



"Composite Index"



MongoDB supports composite indexes, which combine multiple keys into one index. A composite index, also called a combined index, lets queries that match on multiple keys use the index. In addition, a composite index can also be used through its prefixes.



[Note] A composite index cannot contain more than 31 fields



For example, for a record {x:1,y:2,z:3} that needs to be queried by both x and y values, create a composite index on x and y. Queries can then use x and y as conditions




db.db_coll1.createIndex({x:1,y:1})
db.db_coll1.createIndex({x:-1,y:1})
db.db_coll1.createIndex({x:-1,y:-1})
db.db_coll1.createIndex({x:1,y:-1})
db.db_coll1.createIndex({y:1,x:1})
db.db_coll1.createIndex({y:-1,x:1})
db.db_coll1.createIndex({y:-1,x:-1})
db.db_coll1.createIndex({y:1,x:-1})




A composite index is created by specifying ascending or descending order for each field. For a single-key index the order is not particularly important, because MongoDB can traverse the index in either direction. For a composite index, the sort order determines whether the index can support a given sort



A composite index on x and y has 8 variants: different field orders and different ascending or descending combinations produce different indexes. The query optimizer builds query plans from the indexes we create and ultimately chooses the best index to query the data
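Why the directions matter for a composite index can be sketched as a small check: since an index can be walked forwards or backwards, a sort is supported when its directions match the index exactly or are its exact mirror. The function indexSupportsSort below is a hypothetical plain JavaScript model that only looks at key order and direction, not at the query filter.

```javascript
// Hypothetical check: can an index with the given key spec return documents
// already ordered as the sort spec requires?
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length > indexKeys.length) return false;

  let forward = true;  // sort matches the index order
  let backward = true; // sort matches the reversed index order
  sortKeys.forEach(([field, direction], position) => {
    const [indexField, indexDirection] = indexKeys[position];
    if (field !== indexField) {
      forward = false;
      backward = false;
      return;
    }
    if (direction !== indexDirection) forward = false;
    if (direction !== -indexDirection) backward = false;
  });
  return forward || backward;
}

console.log(indexSupportsSort({ x: 1, y: -1 }, { x: 1, y: -1 })); // true
console.log(indexSupportsSort({ x: 1, y: -1 }, { x: -1, y: 1 })); // true (walked backwards)
console.log(indexSupportsSort({ x: 1, y: -1 }, { x: 1, y: 1 }));  // false
console.log(indexSupportsSort({ x: 1 }, { x: -1 }));              // true (single key)
```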



An index prefix is a leading subset of the fields of a composite index



If the following index exists


{item: 1, location: 1, stock: 1}


The following index prefixes exist


{item: 1}
{item: 1, location: 1}


In MongoDB, the index is used when a query filters on the following fields


item field
item field + location field
item field + location field + stock field
item field + stock field (the index is used, but not efficiently)


The index is not used when a query filters on the following fields


location field
stock field
location field + stock field
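The prefix rule in the two lists above can be expressed as counting how many leading index fields the query filters on. The following plain JavaScript sketch uses a hypothetical helper, usablePrefixLength; a result of 0 corresponds to "the index is not used", and a result smaller than the number of filtered fields corresponds to the inefficient case.

```javascript
// Hypothetical helper: how many leading fields of the index does the query cover?
function usablePrefixLength(indexFields, queryFields) {
  let length = 0;
  while (length < indexFields.length && queryFields.includes(indexFields[length])) {
    length += 1;
  }
  return length;
}

const stockIndexFields = ["item", "location", "stock"];

console.log(usablePrefixLength(stockIndexFields, ["item"]));                      // 1
console.log(usablePrefixLength(stockIndexFields, ["item", "location"]));          // 2
console.log(usablePrefixLength(stockIndexFields, ["item", "location", "stock"])); // 3
console.log(usablePrefixLength(stockIndexFields, ["item", "stock"]));             // 1 (stock is not narrowed by the prefix)
console.log(usablePrefixLength(stockIndexFields, ["location", "stock"]));         // 0 (index not used)
```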



Full-Text Indexing





Create



A full-text index, also called a text index, is commonly used behind search boxes. When we enter a keyword such as "HTML" in a search box, not only articles with "HTML" in the title are found, but also articles with "HTML" in the body



To index a key that stores a string or an array of strings, include the key in the index specification and give it the value "text", as follows:


db.reviews.createIndex({comments: "text"})


If you need a full-text index on more than one field, create a composite text index




db.reviews.createIndex({subject: "text", comments: "text"})




If you need a full-text index on all fields, use the $** wildcard specifier


db.collection_name.createIndex({"$**": "text"})


[Note] A collection can have at most one text index



Use



To search using a full-text index, use the following format


db.collection_name.find({$text: {$search: ' ... '}})


Suppose the following data structure stores a complete article: author stores the author, title stores the title, and article stores the article content


{author: "", title: "", article: ""}


Now add some data and create a full-text index on all fields






Below, searching for 'huochai' finds 3 records






Searching for 'A2' finds only the 2nd record






Searching for 'A1 A2 A3' is equivalent to an OR relationship (A1 or A2 or A3) and finds 3 records






Searching for 'huochai -css' finds records that contain 'huochai' but not 'css', namely the 1st and 3rd






To search with an AND relationship, such as records that contain both huochai and css, wrap each term in escaped double quotes inside the search string: "\"huochai\" \"css\""



[note] Only double quotes are supported
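The search-string conventions described above (space-separated terms are OR'ed, a leading minus excludes a term, double-quoted parts must all be present) can be summarized with a small parser. This is a deliberately simplified plain JavaScript sketch; parseSearchString is a hypothetical helper, and real text search also applies stemming and stop-word removal, which are not modeled here.

```javascript
// Hypothetical parser for a $search string, following the rules in the text.
function parseSearchString(search) {
  const phrases = [];
  // Pull out the double-quoted parts first; each one is required.
  const rest = search.replace(/"([^"]+)"/g, (match, phrase) => {
    phrases.push(phrase.toLowerCase());
    return " ";
  });

  const terms = [];   // OR'ed terms
  const negated = []; // terms that must not appear
  for (const token of rest.split(/\s+/).filter(Boolean)) {
    if (token.startsWith("-") && token.length > 1) {
      negated.push(token.slice(1).toLowerCase());
    } else {
      terms.push(token.toLowerCase());
    }
  }
  return { terms, negated, phrases };
}

console.log(parseSearchString("A1 A2 A3"));
// { terms: [ 'a1', 'a2', 'a3' ], negated: [], phrases: [] }
console.log(parseSearchString("huochai -css"));
// { terms: [ 'huochai' ], negated: [ 'css' ], phrases: [] }
console.log(parseSearchString("\"huochai\" \"css\""));
// { terms: [], negated: [], phrases: [ 'huochai', 'css' ] }
```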






"Similarity"



A full-text index has the concept of similarity, which indicates how closely a record's content matches the search criteria



Pass the score projection as the second argument of the find() method. score is a number; the larger the number, the higher the similarity


db.collection_name.find({$text: {$search: ' ... '}}, {score: {$meta: "textScore"}})


Now, insert a document whose author is 'huochai'






Then search for 'huochai'; the result includes the similarity score






The following sorts the results by similarity, with the highest similarity first


sort({score: {$meta: "textScore"}})





Limitations



1. Each query can specify only one $text expression



2. A $text query cannot appear inside a $nor query



3. If the query contains $text, hint() no longer works



4. Only whole words can be searched; part of a word cannot be matched. Likewise, for full-text queries on Chinese text, only words or phrases separated by spaces within a passage can be found





Location Index





In general, a geospatial index can support features such as sorting restaurants by distance or filtering stores within a region



You can store the locations of points in MongoDB and, after creating a geospatial index, find other points by location. There are two types of geospatial index: the 2d index, for storing and finding points on a plane, and the 2dsphere index, for storing and finding points on a sphere



There are generally two kinds of lookup: finding points within a certain distance of a given point, and finding points contained within an area



"2D Index"



The index is created as follows


db.<collection>.createIndex({<location field>: "2d", <additional field>: <value>}, {<index-specification options>})


Options include the following parameters


{min: <lower bound>, max: <upper bound>, bits: <bit precision>}





MongoDB uses longitude and latitude to represent a location, in the order [longitude, latitude]. Longitude ranges over [-180, 180] and latitude over [-90, 90]



[Note] The default bounds allow documents with an invalid latitude greater than 90 or less than -90 to be inserted. However, the database's behavior is unpredictable for geospatial queries on such invalid points, so avoid inserting latitude values that are out of range






There are three ways to query the index: $near, $geoNear, and $geoWithin



The first is the $near query, which finds the points nearest to a given point; it returns 100 results by default


db.<collection>.find({<location field>: {$near: [<x>, <y>]}})





$maxDistance sets the maximum distance from the given point






$minDistance sets the minimum distance from the given point






The second is the $geoNear query, which is run with the runCommand command


db.runCommand({geoNear: <collection_name>, near: [x, y], minDistance: .., maxDistance: .., num: ...})





The third is the $geoWithin query, which finds points inside a shape



MongoDB has three such shapes: rectangles, circles, and polygons. They are used as follows


db.<collection>.find({<location field>: {$geoWithin: {$box | $polygon | $center: <coordinates>}}})


The first is the rectangle, represented with $box


{$box: [[x1,y1],[x2,y2]]}





The second is the circle, represented with $center


{$center: [[<x1>,<y1>],r]}





The third is the polygon, represented with $polygon


{$polygon: [[<x1>,<y1>],[<x2>,<y2>],[<x3>,<y3>],...]}
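What the three shapes test for can be sketched with flat-plane geometry in plain JavaScript. The functions withinBox, withinCenter and withinPolygon are hypothetical helpers that take the same coordinate layout as $box, $center and $polygon; they illustrate the idea only and assume points on the boundary of a box or circle count as inside.

```javascript
// Rectangle given as two opposite corners: [[x1, y1], [x2, y2]].
function withinBox([x, y], [[x1, y1], [x2, y2]]) {
  return (
    x >= Math.min(x1, x2) && x <= Math.max(x1, x2) &&
    y >= Math.min(y1, y2) && y <= Math.max(y1, y2)
  );
}

// Circle given as [[cx, cy], r].
function withinCenter([x, y], [[cx, cy], r]) {
  return Math.hypot(x - cx, y - cy) <= r;
}

// Polygon given as a list of vertices; uses the ray-casting rule:
// a point is inside if a ray from it crosses the edges an odd number of times.
function withinPolygon([x, y], vertices) {
  let inside = false;
  for (let i = 0, j = vertices.length - 1; i < vertices.length; j = i, i += 1) {
    const [xi, yi] = vertices[i];
    const [xj, yj] = vertices[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

const triangle = [[0, 0], [4, 0], [0, 4]];
console.log(withinBox([1, 1], [[0, 0], [2, 2]]));   // true
console.log(withinCenter([3, 4], [[0, 0], 5]));     // true (exactly on the circle)
console.log(withinPolygon([1, 1], triangle));       // true
console.log(withinPolygon([3, 3], triangle));       // false
```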





"2dsphere Index"



The 2dsphere index is created in the following way


db.collection_name.createIndex({a: "2dsphere"})


A position is no longer a simple longitude and latitude pair but a GeoJSON object, which can describe a point, a line, or a polygon. The format is as follows


{type: "", coordinates: [<coordinates>]}
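As an illustration of this format, the following plain JavaScript sketch builds a GeoJSON point and checks the longitude and latitude ranges given earlier. The helper geoJsonPoint and the sample coordinates are hypothetical; "Point" is the standard GeoJSON type name for a single position.

```javascript
// Hypothetical helper: build a GeoJSON Point in [longitude, latitude] order,
// rejecting values outside the valid ranges.
function geoJsonPoint(longitude, latitude) {
  if (longitude < -180 || longitude > 180) {
    throw new RangeError("longitude must be in [-180, 180]");
  }
  if (latitude < -90 || latitude > 90) {
    throw new RangeError("latitude must be in [-90, 90]");
  }
  return { type: "Point", coordinates: [longitude, latitude] };
}

const samplePoint = geoJsonPoint(116.4, 39.9);
console.log(JSON.stringify(samplePoint));
// {"type":"Point","coordinates":[116.4,39.9]}
```

A document would store such an object in the field covered by the 2dsphere index.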

