"Four" MongoDB index management

Source: Internet
Author: User
Tags create index createindex



I. Introduction to indexes



In MongoDB, indexes support efficient query execution. Without an index, MongoDB must scan every document in a collection to find the ones that match a query. With an appropriate index, MongoDB can use the index to limit the number of documents it has to inspect.



An index is a special data structure that stores a small portion of a collection's data in a form that is easy to traverse. It stores the values of a specific field or set of fields, ordered by value. Because the entries are ordered, an index supports efficient equality matches and range-based queries, and MongoDB can also use the ordering in the index to return sorted results.



Fundamentally, indexes in MongoDB are similar to indexes in relational databases. They are defined at the collection level, can be created on any field or sub-field of the documents, and use a B-tree data structure.



II. Index concepts



1. Index type



MongoDB offers several different types of indexes. You can create an index on any field of a document, including a field of an embedded document. In general, create indexes that support your application's common, user-facing queries, so that MongoDB scans as few documents as possible to find the matches. In the mongo shell, you create an index by calling the createIndex() method.



1) Single field index



MongoDB supports indexes on any field of the documents in a collection. By default, every collection has an index on the _id field, and applications and users can add further indexes to support important queries and operations. MongoDB supports both single-field indexes and compound indexes that cover multiple fields. Here is an example of a single-field index:



> db.friends.insert({"name" :"Alice","age":27}) // Insert a document into the friends collection
WriteResult({ "nInserted" : 1 })

> db.friends.createIndex({"name" :1}) // Create an ascending index on the name field
{
     "createdCollectionAutomatically" : false,
     "numIndexesBefore" : 1,
     "numIndexesAfter" : 2,
     "ok" : 1
}


Parameters of db.collection.createIndex(keys, options):


Parameter   Type       Description
keys        document   A document that contains the field and value pairs, where the field is the index key and the value describes the type of index for that field. For an ascending index on a field, specify a value of 1; for a descending index, specify a value of -1. MongoDB supports several different index types, including text, geospatial, and hashed indexes.
options     document   Optional. A document that contains a set of options that controls the creation of the index.

    • _id field index: When a collection is created, MongoDB by default creates an ascending unique index on the _id field, and this index cannot be dropped. The _id field is the primary key of the collection, so every document must have a unique _id value; you can store any unique value in it. The default value of _id is an ObjectId, generated automatically when the document is inserted. In a sharded cluster, if the _id field is not the shard key, your application must ensure that _id values are unique, otherwise errors will occur. A common way to do this is to rely on the standard auto-generated ObjectId.
    • Embedded field index: You can create an index on any field of an embedded document, just as you would on a top-level field. Note that an index on an embedded field is different from an index on the embedded document as a whole: the former refers to the field inside the embedded document with dot notation. Take a look at the following example:

> db.people.insert(
... {
... name:"John Doe",
... address: {
... street: "Main",
... zipcode:"53511",
... state: "WI"
... }
... }
... )
WriteResult({ "nInserted" : 1 })

> db.people.createIndex({"address.zipcode":1}) // Refer to the zipcode field as address.zipcode (dot notation); the key must be quoted.
{
     "createdCollectionAutomatically" : false,
     "numIndexesBefore" : 1,
     "numIndexesAfter" : 2,
     "ok" : 1
}
    • Embedded document index:

> db.factories.insert(
      { metro:
               {   
                    city: "New York",   
                    state:"NY" 
               }, 
          name: "Giant Factory" 
       })
WriteResult({ "nInserted" : 1 })
> db.factories.createIndex({metro:1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}


The metro field above is an embedded document that contains the embedded fields city and state. The index is created in the same way as an index on a top-level field.



The following query can use the index:



db.factories.find({ metro: { city: "New York", state: "NY" } })


{ "_id" : ObjectId("56189565d8624fafa91cbbc1"), "metro" : { "city" : "New York", "state" : "NY" }, "name" : "Giant Factory" }



When you run an equality match on a whole embedded document, the order of the fields matters. For example, a query that specifies state before city inside the metro document does not match the document above.
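
The order sensitivity can be illustrated outside the database with a small JavaScript sketch. The helper exactEmbeddedMatch is hypothetical (it is not a MongoDB API) and only handles flat embedded documents with primitive values. It mimics a whole-document equality match, such as a query like db.factories.find({ metro: { state: "NY", city: "New York" } }) with the field order reversed.

```javascript
// Hypothetical helper: compares two flat embedded documents the way a
// whole-document equality match does: same fields, same values, same order.
function exactEmbeddedMatch(stored, queried) {
  const a = Object.entries(stored);
  const b = Object.entries(queried);
  if (a.length !== b.length) return false;
  return a.every(([key, value], i) => key === b[i][0] && value === b[i][1]);
}

const stored = { city: "New York", state: "NY" };

// Same field order as the stored document: matches.
const sameOrder = exactEmbeddedMatch(stored, { city: "New York", state: "NY" });
// Same fields and values, but state comes first: no match.
const reversed = exactEmbeddedMatch(stored, { state: "NY", city: "New York" });

console.log(sameOrder, reversed);
```

If field order should not matter, querying the embedded fields individually with dot notation ("metro.city", "metro.state") avoids the problem, although such a query targets the embedded fields rather than the whole metro document.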





2) Compound index



MongoDB supports compound indexes, that is, indexes that contain multiple fields. A compound index can contain up to 31 fields. A hashed field cannot be part of a compound index.



Example:



> db.products.insert(
    {"item": "Banana",
      "category": ["food","produce","grocery"],
      "location": "4th Street Store",
      "stock": 4,
      "type": "cases",
      "arrival": "2015"}
      )
WriteResult({ "nInserted" : 1 })

> db.products.createIndex({"item":1,"stock":1}) // Create a compound index on the item and stock fields
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}


The following two queries can both use this compound index:



> db.products.find({"item":"Banana"})
> db.products.find({"item":"Banana","stock":4})


"Sort order"



The fields in an index can be sorted in ascending (1) or descending (-1) order. For a compound index, the sort order of each field matters, because it determines whether a sort operation can use the index.



For example, suppose the documents in the events collection contain the fields username and date.


    • Sort by username ascending, then date descending:

db.events.find().sort({ username: 1, date: -1 })
    • Sort by username descending, then date ascending:

db.events.find().sort({ username: -1, date: 1 })
    • Sort by username ascending, then date ascending:

db.events.find().sort({ username: 1, date: 1 })


Because the sort orders differ, these queries do not all use an index in the same way. Now create the following index:



db.events.createIndex({ "username": 1, "date": -1 })


Only the first two sorts can use this index: the first matches the index order, and the second is its exact reverse. The third sort cannot use it.
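
The rule behind this result can be sketched in plain JavaScript. MongoDB can traverse an index forwards or backwards, so a sort is supported when it matches the index key pattern or its exact inverse. The function indexSupportsSort below is a hypothetical helper written for illustration, not part of MongoDB, and it ignores refinements such as equality conditions on leading fields.

```javascript
// Hypothetical helper: can an index with key pattern `indexKeys` return
// documents already ordered by `sortSpec`? Both arguments are arrays of
// [field, direction] pairs so that field order is explicit.
function indexSupportsSort(indexKeys, sortSpec) {
  if (sortSpec.length === 0 || sortSpec.length > indexKeys.length) return false;
  // The sort fields must be a prefix of the index fields, in the same order.
  const sameFields = sortSpec.every(([field], i) => field === indexKeys[i][0]);
  if (!sameFields) return false;
  // Forward scan: every direction matches the index.
  const forward = sortSpec.every(([, dir], i) => dir === indexKeys[i][1]);
  // Backward scan: every direction is the inverse of the index.
  const backward = sortSpec.every(([, dir], i) => dir === -indexKeys[i][1]);
  return forward || backward;
}

const eventsIndex = [["username", 1], ["date", -1]];
const sortResults = [
  indexSupportsSort(eventsIndex, [["username", 1], ["date", -1]]),
  indexSupportsSort(eventsIndex, [["username", -1], ["date", 1]]),
  indexSupportsSort(eventsIndex, [["username", 1], ["date", 1]]),
];
console.log(sortResults); // [ true, true, false ]
```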



"Compound index prefix"



An index prefix is a leading subset of the fields of a compound index. Consider the following compound index:



{ "item": 1, "location": 1, "stock": 1 }


Its prefixes are:



    • { item: 1 }
    • { item: 1, location: 1 }


Queries on the following fields can make good use of the compound index:



    • the item field (a single condition that matches a prefix),
    • the item field and the location field (two conditions that match a prefix),
    • the item field, the location field, and the stock field (all index fields, prefix included).


Queries on the following fields cannot take advantage of the compound index:



    • the location field,
    • the stock field, or
    • the location and stock fields.
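
As a rough illustration of the prefix rule, the following JavaScript sketch computes which leading index fields a query constrains. The function usablePrefix is a hypothetical helper, not a MongoDB API. An empty result corresponds to the cases listed above that cannot use the index.

```javascript
// Hypothetical helper: returns the leading fields of the index that the
// query constrains. The scan stops at the first index field the query omits.
function usablePrefix(indexFields, queryFields) {
  const queried = new Set(queryFields);
  const prefix = [];
  for (const field of indexFields) {
    if (!queried.has(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const indexFields = ["item", "location", "stock"];

console.log(usablePrefix(indexFields, ["item"]));              // [ 'item' ]
console.log(usablePrefix(indexFields, ["item", "location"]));  // [ 'item', 'location' ]
console.log(usablePrefix(indexFields, ["location", "stock"])); // []
```

A query on item and stock yields the prefix [ 'item' ] in this model: the index can still narrow the search by item, but the stock condition does not benefit from the index ordering.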


If a collection has both a compound index and a single-field index on its first field, for example { a: 1, b: 1 } and { a: 1 }, the single-field index is redundant and can be dropped, because { a: 1 } is already a prefix of the compound index.



"Summary": Compound indexes in MongoDB behave much like compound indexes in relational databases.



3) Multikey index



For a field that holds an array, MongoDB creates an index entry for each element of the array. A multikey index makes queries on array fields efficient.


    • Create a multikey index

db.collection.createIndex({ <field>: <1 or -1> })


If the indexed field holds an array, MongoDB automatically builds a multikey index when the index is created; you do not need to specify the index type explicitly.


    • Multikey index bounds


The bounds of an index scan define the portion of the index that is searched during a query. When a query has multiple predicates on an index, MongoDB tries to combine their bounds, by either intersection or compounding, to produce a scan with smaller bounds.


    • Limitations of multikey indexes


In a compound multikey index, each document can have at most one indexed field whose value is an array. Conversely, if such a compound multikey index already exists, you cannot insert a document in which more than one of the indexed fields is an array. To illustrate:



{ _id: 1, a: [ 1, 2 ], b: [ 1, 2 ], category: "AB - both arrays" }
// Cannot create the index { a: 1, b: 1 }, because both indexed fields are arrays in this document


The index { a: 1, b: 1 } is allowed for the following documents, because each has at most one array among the indexed fields:



{ _id: 1, a: [1, 2], b: 1, category: "A array" }
{ _id: 2, a: 1, b: [1, 2], category: "B array" }
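
The behaviour described in this section can be modelled with a short JavaScript sketch. The function indexKeysFor is a hypothetical helper that lists the index keys a document would contribute; it only looks at top-level fields, ignores empty arrays, and uses its own error message rather than MongoDB's.

```javascript
// Hypothetical helper: lists the index keys a document would contribute to an
// index on `fields`. An array value yields one key per element (multikey).
function indexKeysFor(doc, fields) {
  const arrayFields = fields.filter((f) => Array.isArray(doc[f]));
  if (arrayFields.length > 1) {
    throw new Error(`more than one indexed array field: ${arrayFields.join(", ")}`);
  }
  // Start with a single empty key and extend it field by field.
  let keys = [[]];
  for (const f of fields) {
    const raw = doc[f] ?? null; // a missing field is treated as null
    const values = Array.isArray(raw) ? raw : [raw];
    keys = keys.flatMap((key) => values.map((v) => [...key, v]));
  }
  return keys;
}

console.log(indexKeysFor({ _id: 1, a: [1, 2], b: 1 }, ["a", "b"])); // [ [ 1, 1 ], [ 2, 1 ] ]
console.log(indexKeysFor({ _id: 2, a: 1, b: [1, 2] }, ["a", "b"])); // [ [ 1, 1 ], [ 1, 2 ] ]

try {
  indexKeysFor({ _id: 3, a: [1, 2], b: [1, 2] }, ["a", "b"]);
} catch (err) {
  console.log("rejected:", err.message);
}
```
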
    • Shard keys


A multikey index cannot be used as a shard key index. A hashed index cannot be multikey either.


    • Embedded documents in an array field


You can create a multikey index on fields of embedded documents stored in an array. Consider an inventory collection with the following documents:



{
  _id: 1,
  item: "abc",
  stock: [
    { size: "S", color: "red", quantity: 25 },
    { size: "S", color: "blue", quantity: 10 },
    { size: "M", color: "blue", quantity: 50 }
  ]
}
{
  _id: 2,
  item: "def",
  stock: [
    { size: "S", color: "blue", quantity: 20 },
    { size: "M", color: "blue", quantity: 5 },
    { size: "M", color: "black", quantity: 10 },
    { size: "L", color: "red", quantity: 2 }
  ]
}
{
  _id: 3,
  item: "ijk",
  stock: [
    { size: "M", color: "blue", quantity: 15 },
    { size: "L", color: "blue", quantity: 100 },
    { size: "L", color: "red", quantity: 25 }
  ]
}


Then create a multi-key index:



db.inventory.createIndex({ "stock.size": 1, "stock.quantity": 1 })


The following queries and sorts can use this index:



db.inventory.find( { "stock.size": "M" } )
db.inventory.find( { "stock.size": "S", "stock.quantity": { $gt: 20 } } )
db.inventory.find( ).sort( { "stock.size": 1, "stock.quantity": 1 } )
db.inventory.find( { "stock.size": "M" } ).sort( { "stock.quantity": 1 } )


4) Geospatial indexes



MongoDB provides a dedicated set of indexes and query mechanisms for handling geospatial information. This section describes the geospatial features of MongoDB.



Before you store geospatial data, you need to decide which type of surface to use for calculations. The type you choose affects how you store your data, which kind of index you build, and the syntax of your queries. MongoDB offers two surface types:



Spherical: To calculate geometry on an Earth-like sphere, store your location data on a spherical surface and use a 2dsphere index. Store the data as GeoJSON objects, with coordinates in longitude, latitude order.



Flat: To calculate distances on a Euclidean plane, store your location data as coordinate pairs and use a 2d index.


    • 2dsphere indexes


To create a geospatial index for GeoJSON-formatted data, use the db.collection.createIndex() method with the index type "2dsphere", using the following syntax:



db.collection.createIndex({ <location field>: "2dsphere" })


Here's a detailed demonstration:



First, create a places collection that stores location data as GeoJSON Point objects, as follows:



db.places.insert(
   {
      loc : { type: "Point", coordinates: [ -73.97, 40.77 ] },
      name: "Central Park",
      category : "Parks"
   }
)

db.places.insert(
   {
      loc : { type: "Point", coordinates: [ -73.88, 40.78 ] },
      name: "La Guardia Airport",
      category : "Airport"
   }
)


Then, create a 2dsphere index on the loc field:



db.places.createIndex({ loc: "2dsphere" })


You can also create a compound index that includes a 2dsphere index key:



db.places.createIndex( { loc : "2dsphere" , category : -1, name: 1 } )
db.places.createIndex( { category : 1 , loc : "2dsphere" } ) // Unlike a 2d index, a 2dsphere key does not have to be the first field.


"Caveat": Fields indexed with a 2dsphere index must hold geometry data in the form of GeoJSON objects or coordinate pairs.


    • 2d indexes


This index type is for data stored as points on a two-dimensional plane. It is typically used for the legacy coordinate pairs of MongoDB 2.2 and earlier, and is not described in detail here.


    • geoHaystack indexes


A geoHaystack index is a special index optimized to return results over small areas. It can improve query performance for data that uses flat geometry. For queries that use spherical geometry, a 2dsphere index is a better choice: it allows the fields to be in any order, whereas a geoHaystack index requires the first field to be the location field. It is not described in detail here.


    • 2d index internals: rarely needed; refer to the official documentation.


5) Hashed index



A hashed index maintains entries containing the hashes of the values of the indexed field. The hashing function collapses embedded documents and computes the hash of the entire value. Hashed indexes do not support multikey (array) fields.



Hashed indexes support equality queries but not range-based queries. You cannot create a compound index that includes a hashed field, and you cannot specify a unique constraint on a hashed index. You can, however, create both a hashed index and an ordinary ascending or descending index on the same field.



The following is an example of creating a hashed index:



db.collection.createIndex({ _id: "hashed" })


6) Text index



MongoDB provides text indexes to support text search on string content. A text index can be built on any field whose value is a string or an array of strings. A collection can have at most one text index.


    • Create a text index

db.reviews.createIndex({ comments: "text" })


A text index can also cover multiple fields, and a compound index can include a text index key:



db.reviews.createIndex(
   {
     subject: "text",
     comments: "text"
   }
 )


Assigning weights: A weight expresses the significance of an indexed field relative to the other indexed fields. For example, with weights of subject: 10 and comments: 5, a term matched in subject counts twice as much toward the score as a term matched in comments.



db.reviews.createIndex(
   {
     subject: "text",
     comments: "text"
   },
   {
     weights: { subject: 10, comments: 5 }
   }
 )


Wildcard: When you create a text index, you can also use the wildcard specifier:



db.collection.createIndex( { "$**": "text" } )
db.collection.createIndex( { a: 1, "$**": "text" } )


A text index created this way covers every field in the collection that contains string data. This is often useful for unstructured data, or when it is not known in advance which fields will hold text.


    • Limit



2. Index properties



In addition to the index types above, MongoDB supports several commonly used index properties.



1) TTL indexes



A TTL index is a special single-field index that MongoDB uses to automatically remove expired documents from a collection. Data expiration is useful for data such as machine-generated event data, logs, and session information.


    • Create TTL index

db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )
// A TTL index is defined by adding the expireAfterSeconds option
    • How TTL expiration works


A TTL index expires a document once the specified number of seconds has passed since the time stored in the indexed field. In other words, the expiration threshold is the indexed field's value plus the specified number of seconds. If the indexed field is an array that contains several date values, MongoDB uses the earliest date to calculate the threshold. If the indexed field does not hold a date (or an array containing dates), the document never expires, and the same applies if the document does not contain the indexed field at all.
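
The threshold calculation can be sketched in a few lines of JavaScript. The function ttlExpiry is a hypothetical helper written for this illustration; it is not how the server implements expiration, it only applies the rules stated above to a single document.

```javascript
// Hypothetical helper: returns the Date at which a document becomes eligible
// for deletion by a TTL index on `field`, or null if it never expires.
function ttlExpiry(doc, field, expireAfterSeconds) {
  const value = doc[field];
  // Accept a single value or an array, and keep only Date values.
  const dates = (Array.isArray(value) ? value : [value]).filter((v) => v instanceof Date);
  if (dates.length === 0) return null; // missing field or no date value: never expires
  // With several dates, the earliest one determines the threshold.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

const logEntry = { lastModifiedDate: new Date("2015-11-10T08:00:00Z") };
const expiry = ttlExpiry(logEntry, "lastModifiedDate", 3600);
console.log(expiry.toISOString()); // 2015-11-10T09:00:00.000Z

// A non-date value or a missing field means the document is never removed.
console.log(ttlExpiry({ lastModifiedDate: "yesterday" }, "lastModifiedDate", 3600)); // null
```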



A background TTL thread runs every 60 seconds, reads the values in the index, and deletes expired documents. While the TTL thread is active, the delete operations are visible in the output of db.currentOp(). A TTL index does not guarantee that data is deleted at the moment it expires; there may be a delay between expiration and removal.


    • Limitations

1. A TTL index is a single-field index; compound indexes do not support TTL.
2. The _id field does not support TTL indexes.
3. TTL indexes are not supported on capped collections.
4. You cannot use createIndex() to change the expireAfterSeconds value of an existing TTL index. Use the collMod command with the index option instead; otherwise you have to drop the index and recreate it.
5. If a field already has a non-TTL single-field index, you cannot create a TTL index on the same field. To convert a non-TTL index into a TTL index, you must drop the original index and recreate it with the expireAfterSeconds option.


2) Unique indexes



A unique index causes MongoDB to reject any document whose value for the indexed field duplicates an existing value. By default, the unique option is false when an index is created.


    • Create a unique index

db.members.createIndex({ "user_id": 1 }, { unique: true })


For a compound index with unique set to true, uniqueness is enforced on the combination of the indexed field values.



The unique constraint applies across separate documents in the collection. If the indexed field is inside an array of embedded documents, the index does not prevent a single document from containing the same value several times, as in the following example:



db.collection.createIndex( { "a.b": 1 }, { unique: true } )
db.collection.insert( { a: [ { b: 5 }, { b: 5 } ] } ) // This insert succeeds


If a document has no value for the uniquely indexed field, the index stores a null value for that document. Because of the unique constraint, MongoDB allows only one document without the indexed field; inserting a second one fails with a duplicate key error. To illustrate:



First create a unique index on the x field:



db.collection.createIndex({ "x": 1 }, { unique: true })


Next, insert a document without the x field:



db.collection.insert( { y: 1 } ) // Succeeds, because no document without an x value exists in the collection yet


Then insert another document that does not contain x:



db.collection.insert( { z: 1 } ) // Fails, because a null value for x was already indexed by the previous insert
WriteResult({
    "nInserted" : 0,
    "writeError" : {
        "code" : 11000,
        "errmsg" : "E11000 duplicate key error collection: test.collection index: x_1 dup key: { : null }"
    }
})
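
The null handling can be reproduced with a minimal in-memory model in JavaScript. The class UniqueIndex below is hypothetical and only supports primitive key values; it is meant to show why the second insert fails, not how MongoDB stores index entries.

```javascript
// Hypothetical in-memory model of a unique single-field index.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  insert(doc) {
    // A missing field is indexed as null, so it still takes part in uniqueness.
    const key = doc[this.field] ?? null;
    if (this.keys.has(key)) {
      throw new Error(`duplicate key for index on ${this.field}: ${JSON.stringify(key)}`);
    }
    this.keys.add(key);
  }
}

const uniqueOnX = new UniqueIndex("x");
uniqueOnX.insert({ y: 1 }); // accepted: the first document without x is indexed as null

try {
  uniqueOnX.insert({ z: 1 }); // rejected: null is already present in the index
} catch (err) {
  console.log(err.message);
}
```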


3) Partial indexes



A partial index indexes only the documents in a collection that satisfy a specified filter expression. Because it covers only a subset of the documents, a partial index needs less storage and costs less to create and maintain.


    • Create a partial index

db.restaurants.createIndex(
   { cuisine: 1, name: 1 },
   { partialFilterExpression: { rating: { $gt: 5 } } }
)


The optional partialFilterExpression parameter can be used with all index types.


    • Conditions for using a partial index

1. The query predicate must include the filter expression.
2. The documents matched by the query must be a subset of the documents covered by the partial index.


For the index above, the following queries show when the partial index can and cannot be used:



1. db.restaurants.find( { cuisine: "Italian", rating: { $gte: 8 } } ) // Can use the partial index, because the query's result set is a subset of the documents in the index
2. db.restaurants.find( { cuisine: "Italian" } ) // Cannot use the partial index, because condition 1 is not met: the query predicate does not include the filter expression
3. db.restaurants.find( { cuisine: "Italian", rating: { $lt: 8 } } ) // Cannot use the partial index, because using it would produce an incomplete result set
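
The subset condition can be made concrete with a deliberately simplified JavaScript sketch. It assumes a partial index whose filter has the form { <field>: { $gt: n } } and a query condition on the same numeric field; the function canUsePartialIndex is hypothetical and far less general than the real query planner.

```javascript
// Hypothetical, simplified check for a partial index whose filter is
// { <field>: { $gt: filterGt } }. `condition` is the query's condition on the
// same field, or undefined when the query does not mention the field.
function canUsePartialIndex(filterGt, condition) {
  if (condition == null) return false; // filter field absent from the query
  if (typeof condition === "number") return condition > filterGt; // equality match
  if (condition.$gt !== undefined) return condition.$gt >= filterGt;
  if (condition.$gte !== undefined) return condition.$gte > filterGt;
  return false; // no lower bound (for example only $lt): some matches lie outside the index
}

// The three queries above, against the filter { rating: { $gt: 5 } }.
const partialChecks = [
  canUsePartialIndex(5, { $gte: 8 }),
  canUsePartialIndex(5, undefined),
  canUsePartialIndex(5, { $lt: 8 }),
];
console.log(partialChecks); // [ true, false, false ]
```
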
    • Partial indexes with a unique constraint


For a partial index with a unique constraint, uniqueness is enforced only among the documents that satisfy the filter expression. Documents that do not satisfy it are not constrained.



{ "_id" : ObjectId("56424f1efa0358a27fa1f99a"), "username" : "david", "age" : 29 }
{ "_id" : ObjectId("56424f37fa0358a27fa1f99b"), "username" : "amanda", "age" : 35 }
{ "_id" : ObjectId("56424fe2fa0358a27fa1f99c"), "username" : "rajiv", "age" : 57 }

db.users.createIndex(
    { username: 1 },
    { unique: true, partialFilterExpression: { age: { $gte: 21 } } }
)
// The following three inserts fail: a document with the same username already exists, and age satisfies the filter
db.users.insert( { username: "david", age: 27 } )
db.users.insert( { username: "amanda", age: 25 } )
db.users.insert( { username: "rajiv", age: 32 } )
// The following inserts succeed: the documents do not satisfy the filter, so the unique constraint does not apply
db.users.insert( { username: "david", age: 20 } )
db.users.insert( { username: "amanda" } )
db.users.insert( { username: "rajiv", age: null } )
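
The same outcome can be reproduced with a small in-memory model in JavaScript. The class UniquePartialIndex is hypothetical, and the filter is passed as an ordinary function that stands in for partialFilterExpression.

```javascript
// Hypothetical model of a unique partial index: only documents that pass
// `filter` get an index entry, so only those are checked for duplicates.
class UniquePartialIndex {
  constructor(field, filter) {
    this.field = field;
    this.filter = filter;
    this.keys = new Set();
  }

  // Returns true if the insert is accepted, false on a duplicate key.
  tryInsert(doc) {
    if (!this.filter(doc)) return true; // outside the filter: not indexed, not constrained
    const key = doc[this.field] ?? null;
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

// Stand-in for partialFilterExpression: { age: { $gte: 21 } }
const adultsOnly = (doc) => typeof doc.age === "number" && doc.age >= 21;
const usernameIndex = new UniquePartialIndex("username", adultsOnly);

// Existing documents in the collection.
usernameIndex.tryInsert({ username: "david", age: 29 });
usernameIndex.tryInsert({ username: "amanda", age: 35 });
usernameIndex.tryInsert({ username: "rajiv", age: 57 });

const rejected = usernameIndex.tryInsert({ username: "david", age: 27 }); // false
const accepted = [
  usernameIndex.tryInsert({ username: "david", age: 20 }),
  usernameIndex.tryInsert({ username: "amanda" }),
  usernameIndex.tryInsert({ username: "rajiv", age: null }),
]; // [ true, true, true ]
console.log(rejected, accepted);
```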


4) Sparse indexes



A sparse index contains entries only for documents that have the indexed field, even if the field's value is null. The index skips every document that lacks the field, which is why it is called "sparse". A sparse index therefore does not cover all documents in the collection, whereas a non-sparse index does.



In MongoDB 3.2 and later, partial indexes are the preferred choice over sparse indexes.


    • Create a sparse index:

db.addresses.createIndex({ "xmpp_id": 1 }, { sparse: true })


If using a sparse index would produce an incomplete result set for a query or sort, MongoDB does not use it unless you explicitly specify the index with hint(). For example:



{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }

db.scores.createIndex( { score: 1 } , { sparse: true } )
db.scores.find( { score: { $lt: 90 } } ) // The document with userid "newbie" has no score field, so it is not in the sparse index and cannot match; only one document is returned.

{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }


Now look at a sort on the same collection:



db.scores.find().sort( { score: -1 } ) // Although the sort is on the indexed score field, MongoDB does not choose the sparse index, so the complete result set is returned.
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }

// To force the use of the sparse index, specify it explicitly with hint()
db.scores.find().sort( { score: -1 } ).hint( { score: 1 } )

{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
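
Why the hinted sort loses a document can be seen in a small JavaScript sketch. The function buildSparseIndex is a hypothetical stand-in for the index; it assumes numeric values in the indexed field and simply keeps the documents that have the field, ordered by it.

```javascript
const scores = [
  { userid: "newbie" },
  { userid: "abby", score: 82 },
  { userid: "nina", score: 90 },
];

// Hypothetical stand-in for a sparse index: entries exist only for documents
// that have the field (a field explicitly set to null would still count).
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => field in doc)
    .sort((a, b) => a[field] - b[field]);
}

const sparseOnScore = buildSparseIndex(scores, "score");

// Walking the index backwards mimics sort({ score: -1 }).hint({ score: 1 }):
// the document without a score never appears.
const viaSparseIndex = [...sparseOnScore].reverse().map((doc) => doc.userid);
console.log(viaSparseIndex); // [ 'nina', 'abby' ]
```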


Sparse indexes with a unique constraint:



The unique constraint applies only to documents that have the indexed field; it does not apply to documents that lack it, as follows:



{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }

db.scores.createIndex( { score: 1 } , { sparse: true, unique: true } )
// The following four inserts succeed
db.scores.insert( { "userid": "AAAAAAA", "score": 43 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 34 } )
db.scores.insert( { "userid": "CCCCCCC" } )
db.scores.insert( { "userid": "DDDDDDD" } )
// The following inserts violate the unique constraint
db.scores.insert( { "userid": "AAAAAAA", "score": 82 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 90 } )