MongoDB Learning--Index types and attributes

Source: Internet
Author: User


Index type


MongoDB indexes are divided into the following types: single-key index, compound index, multikey index, geospatial index, full-text index, and hash index.


Single-key index (Single Field Indexes)


An index created on a single key is a single-key index. It is the most common kind of index; the _id index that MongoDB creates by default is one example.



Example:

{
    "_id": ObjectId (...),
    "name": "Alice",
    "score": 27
}
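To create a single-key index on the score key of the document above, use the following statement (ensureIndex is the name used in MongoDB 2.6; since MongoDB 3.0 it is deprecated in favor of createIndex, which takes the same arguments):

db.users.ensureIndex({"score": 1})
The index stores the score values in ascending order, and each entry points to the document that holds that value.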
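
One rough way to picture an ascending single-key index is a sorted list of (key, _id) entries that is searched with binary search. The sketch below is plain JavaScript that runs in Node.js without a MongoDB server. The helper names buildIndex and findByKey and the extra users Bob and Carol are hypothetical, and the sorted array only stands in for the B-tree that MongoDB actually uses.

```javascript
// Toy model of an ascending single-key index on "score".
// Assumption: a sorted array stands in for MongoDB's B-tree.
function buildIndex(docs, key) {
  return docs
    .filter((d) => d[key] !== undefined)
    .map((d) => ({ key: d[key], id: d._id }))
    .sort((x, y) => x.key - y.key); // ascending, like {score: 1}
}

// Binary search for the first entry with the given key, then collect matches.
function findByKey(index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid;
  }
  const ids = [];
  while (lo < index.length && index[lo].key === value) ids.push(index[lo++].id);
  return ids;
}

// Hypothetical sample data; the _id values are simplified to numbers.
const users = [
  { _id: 1, name: "Alice", score: 27 },
  { _id: 2, name: "Bob", score: 12 },
  { _id: 3, name: "Carol", score: 27 },
];
const scoreIndex = buildIndex(users, "score");
console.log(scoreIndex.map((e) => e.key)); // [ 12, 27, 27 ]
console.log(findByKey(scoreIndex, 27));    // [ 1, 3 ]
```

Because the entries are kept in key order, the same structure can also answer range queries and sorts on score by walking neighboring entries.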

If you want to create a single-key index on a key of a subdocument, the example is as follows:

{
    "_id": ObjectId (...),
    "name": "John Doe",
    "address": {
        "street": "Main",
        "zipcode": "53511",
        "state": "WI"
    }
}
For the document above, the creation statement is as follows:

db.users.ensureIndex({"address.zipcode": 1})
If you want to create a single-key index on the entire subdocument, the example is as follows:

{
    _id: ObjectId (...),
    metro: {
        city: "New York",
        state: "NY"
    },
    name: "Giant Factory"
}
For the document above, the creation statement is as follows:

db.factories.ensureIndex({metro: 1})
The following query can use the index to find the document above:

db.factories.find({metro: {city: "New York", state: "NY"}})
The following query, however, finds nothing. A query on an entire subdocument must match exactly, including the order of the keys inside the subdocument:

db.factories.find({metro: {state: "NY", city: "New York"}})
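
The order-sensitive comparison described above can be imitated in plain JavaScript. This is a minimal sketch that assumes flat subdocuments with string or number values; embeddedDocEquals is a hypothetical helper, not a MongoDB function.

```javascript
// Order-sensitive comparison of two embedded documents (toy model).
// Assumption: values are strings or numbers, as in the metro example.
function embeddedDocEquals(a, b) {
  const ka = Object.keys(a);
  const kb = Object.keys(b);
  if (ka.length !== kb.length) return false;
  // Same keys in the same positions, with the same values.
  return ka.every((k, i) => k === kb[i] && a[k] === b[k]);
}

const stored = { city: "New York", state: "NY" };
console.log(embeddedDocEquals(stored, { city: "New York", state: "NY" })); // true
console.log(embeddedDocEquals(stored, { state: "NY", city: "New York" })); // false
```

When the key order should not matter, a query can instead use dot notation on the individual keys, for example {"metro.city": "New York", "metro.state": "NY"}, although such a query does not match against the whole-subdocument index in the same way.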
Compound Indexes
Indexes created on multiple keys are compound indexes.

Example:

{
    "_id": ObjectId (...),
    "userid": "aa1",
    "category": ["food", "produce", "grocery"],
    "location": "4th Street Store",
    "score": 4
}
To create a compound index on the document above, use the following statement:

db.products.ensureIndex({"userid": -1, "score": 1})
The index is sorted first by userid in descending order and then, within each userid, by score in ascending order.

This index can support the following sorts:

db.products.find().sort({userid: 1, score: -1});
db.products.find().sort({userid: -1, score: 1});
db.products.find().sort({userid: 1});
db.products.find().sort({userid: -1});
It cannot support the following sorts:

db.products.find().sort({userid: 1, score: 1});
db.products.find().sort({userid: -1, score: -1});
db.products.find().sort({score: 1});
db.products.find().sort({score: -1});
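
The pattern behind these two lists can be written down as a small check: the sort keys must form a prefix of the index keys, and their directions must either all match the index or all be reversed. The following is a simplified sketch of that rule in plain JavaScript. indexSupportsSort is a hypothetical helper, and it ignores the case where equality conditions in the query cover the leading index keys.

```javascript
// Simplified rule: a sort can use a compound index when its keys are a prefix
// of the index keys and the directions either all match or are all reversed.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;
  let sameDirection = true;
  let reversed = true;
  for (let i = 0; i < sortKeys.length; i++) {
    const [sortField, sortDir] = sortKeys[i];
    const [indexField, indexDir] = indexKeys[i];
    if (sortField !== indexField) return false; // not a prefix
    if (sortDir !== indexDir) sameDirection = false;
    if (sortDir !== -indexDir) reversed = false;
  }
  return sameDirection || reversed;
}

const index = { userid: -1, score: 1 };
console.log(indexSupportsSort(index, { userid: 1, score: -1 })); // true
console.log(indexSupportsSort(index, { userid: 1, score: 1 }));  // false
console.log(indexSupportsSort(index, { score: 1 }));             // false
```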
Multikey Index
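If you create an index on a key that holds an array, MongoDB automatically builds it as a multikey index; you do not have to request this explicitly.

Suppose the collection holds documents of the following two shapes:

{a: [1, 2], b: 1}
{a: 1, b: [1, 2]}
You can create the index {a: 1, b: 1}, which will be a compound multikey index. Within a single document, however, at most one of the indexed keys may hold an array, so a document such as {a: [1, 2], b: [1, 2]} cannot be inserted into this collection.

A multikey index stores one index entry for each element of the array.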

Example:

{
    "_id": ObjectId ("..."),
    "name": "Warm Weather",
    "author": "Steve",
    "tags": ["weather", "hot", "record", "april"]
}
For the document above, an index created on tags will be a multikey index.

If the document structure is as follows:

{
    "_id": ObjectId (...),
    "title": "Grocery Quality",
    "comments": [{
        author_id: ObjectId (...),
        date: Date (...),
        text: "Please expand the cheddar selection."
    }, {
        author_id: ObjectId (...),
        date: Date (...),
        text: "Please expand the mustard selection."
    }, {
        author_id: ObjectId (...),
        date: Date (...),
        text: "Please expand the olive selection."
    }]
}
An index on {"comments.text": 1} will also be a multikey index, and it is used by the following query:

db.feedback.find({"comments.text": "Please expand the olive selection."})
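
To make the idea of one index entry per array element concrete, here is a small plain JavaScript sketch. valuesAtPath and multikeyEntries are hypothetical helpers, the _id values are simplified to numbers, and the code only imitates which keys a multikey index would hold, not how MongoDB stores them.

```javascript
// Resolve a dotted path, descending into arrays along the way.
function valuesAtPath(value, path) {
  if (value === undefined) return [];
  if (Array.isArray(value)) {
    // At the end of the path the array elements themselves are the keys.
    return path.length === 0 ? value : value.flatMap((v) => valuesAtPath(v, path));
  }
  if (path.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  return valuesAtPath(value[path[0]], path.slice(1));
}

// One index entry per distinct element; each entry points back to the document.
function multikeyEntries(doc, field) {
  const keys = valuesAtPath(doc, field.split("."));
  return [...new Set(keys)].map((key) => ({ key, id: doc._id }));
}

const post = { _id: 1, tags: ["weather", "hot", "record", "april"] };
const feedback = {
  _id: 2,
  comments: [
    { text: "Please expand the cheddar selection." },
    { text: "Please expand the olive selection." },
  ],
};
console.log(multikeyEntries(post, "tags").length);               // 4
console.log(multikeyEntries(feedback, "comments.text").length);  // 2
```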
Geospatial Indexes and Queries
MongoDB supports several types of geospatial indexes. The most commonly used are the 2dsphere index (for maps of the earth's surface) and the 2d index (for flat maps and time-continuous data).

1) 2dsphere

2dsphere allows GeoJSON format (http://www.geojson.org) to specify points, lines and polygons.

Points can be represented by a two-element array of the form [longitude, latitude]:

{
    "name": "New York City",
    "loc": {
        "type": "Point",
        "coordinates": [50, 2]
    }
}
Lines can be represented by an array of points:

{
    "name": "Hudson River",
    "loc": {
        "type": "LineString",
        "coordinates": [[0, 1], [0, 2], [1, 2]]
    }
}
Polygons are represented by an array of closed lines (rings), where the first and last points of each ring are identical:

{
    "name": "New England",
    "loc": {
        "type": "Polygon",
        "coordinates": [[[0, 1], [0, 2], [1, 2], [0, 1]]]
    }
}
2dsphere supports Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon, and GeometryCollection.

The name of the loc field can be arbitrary, but the sub-objects are specified by GeoJSON and cannot be changed.

Use the "2dsphere" index type in ensureIndex to create a geospatial index:

db.world.ensureIndex({"loc": "2dsphere"})
Several kinds of geospatial query are available: intersection, within, and nearness. When querying, specify what you are looking for as a GeoJSON object of the form {"$geometry": geoJsonDesc}.

For intersection, use the $geoIntersects operator:

var place = {
    "type": "Polygon",
    "coordinates": [[[0, 1], [0, 3], [50, 2], [0, 1]]]
}
db.world.find({"loc": {"$geoIntersects": {"$geometry": place}}})
This finds all documents that intersect with place.

For containment (within), use the $geoWithin operator ($within is its older name, deprecated since MongoDB 2.4):

db.world.find({"loc": {"$geoWithin": {"$geometry": place}}})
For nearness, use the $near operator:

var place = {
    "type": "Point",
    "coordinates": [0, 3]
}
db.world.find({"loc": {"$near": {"$geometry": place}}})
Here place must be a point. Unlike $geoIntersects and $geoWithin, $near sorts its results automatically, from nearest to farthest.
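
The nearest-first ordering can be illustrated with a great-circle distance calculation. The sketch below is plain JavaScript and makes several assumptions: a perfectly spherical earth with a radius of 6,371 km, the haversine formula, and hypothetical documents named A, B, and C. It shows the ordering only and says nothing about how MongoDB computes or indexes distances internally.

```javascript
// Great-circle distance in meters between two [longitude, latitude] points.
// Assumption: a spherical earth with a radius of 6,371 km.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371000 * Math.asin(Math.sqrt(a));
}

// Return a copy of the documents ordered from nearest to farthest.
function sortByDistance(docs, center) {
  return [...docs].sort(
    (x, y) =>
      haversineMeters(x.loc.coordinates, center) -
      haversineMeters(y.loc.coordinates, center)
  );
}

const docs = [
  { name: "A", loc: { type: "Point", coordinates: [50, 2] } },
  { name: "B", loc: { type: "Point", coordinates: [0, 1] } },
  { name: "C", loc: { type: "Point", coordinates: [0, 2] } },
];
console.log(sortByDistance(docs, [0, 3]).map((d) => d.name)); // [ 'C', 'B', 'A' ]
```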

2) 2d

The 2d index is intended for flat surfaces, not spherical ones. If it is used on spherical data, heavy distortion appears near the poles.

Documents use a two-element array, not GeoJSON, for a key with a 2d index.

{
    "name": "Water Temple",
    "tile": [32, 22]
}
A 2d index can only index points. You can store an array of points, but it is kept as an array of points, not as a line. In particular, for a $geoWithin query (formerly $within), the document matches if any point in the array lies within the query range.

Use the "2d" index type in ensureIndex to create the index. You can also set the minimum and maximum boundary values and the precision. By default, the range is [-180, 180) and the precision is 26 bits, which is roughly equivalent to 2 feet or 60 centimeters:

db.places.ensureIndex({"tile": "2d"}, {"min": -90, "max": 90, "bits": 20})
This creates a spatial index covering a 180 by 180 area.

Querying a 2d index is much simpler than querying a 2dsphere index. You can use $near and $geoWithin directly, without a $geometry child object:

db.places.find({"tile": {"$near": [20, 21]}}).limit(10)
If no limit is added, a maximum of 100 documents is returned by default.

$geoWithin (formerly $within) can find all documents inside a given shape, which can be a rectangle ($box), a circle ($center), or a polygon ($polygon).

db.places.find({"tile": {"$geoWithin": {"$box": [[0, 0], [30, 30]]}}})
$box takes two elements: the first is the coordinates of the lower left corner of the rectangle, and the second is the coordinates of the upper right corner.

db.places.find({"tile": {"$geoWithin": {"$center": [[30, 30], 10]}}})
$center also takes two elements: the first is the coordinates of the center of the circle, and the second is its radius.

db.places.find({"tile": {"$geoWithin": {"$polygon": [[0, 0], [30, 30], [0, 25]]}}})
$polygon takes an array of points that specifies the polygon.

Both 2dsphere and 2d index keys can be combined with other keys to form a compound index:

db.world.ensureIndex({"name": 1, "loc": "2dsphere"})
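
The rectangle and circle tests amount to simple flat geometry. As a minimal sketch in plain JavaScript, withinBox and withinCenter below are hypothetical helpers that take the same argument shapes as $box and $center. Treating points exactly on the boundary as inside is an assumption of this sketch.

```javascript
// Is the point inside the rectangle given by its lower-left and upper-right corners?
function withinBox([x, y], [[minX, minY], [maxX, maxY]]) {
  return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

// Is the point inside the circle given by its center and radius (flat geometry)?
function withinCenter([x, y], [[cx, cy], radius]) {
  return Math.hypot(x - cx, y - cy) <= radius;
}

const tile = [32, 22]; // the "Water Temple" tile from the example document
console.log(withinBox(tile, [[0, 0], [30, 30]])); // false: x = 32 lies outside 0..30
console.log(withinCenter(tile, [[30, 30], 10]));  // true: the distance is about 8.25
```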
Text Indexes
The full-text index is used to search for text in documents. Regular expressions can also query strings, but on large blocks of text a regular expression search is very slow, and it cannot handle linguistic matching (for example, entry and entries should be considered a match). A full-text index makes text search fast and has built-in support for tokenization in multiple languages. Creating any index is expensive, and creating a full-text index is more expensive still, so create it in the background or while the application is offline.

{
    "_id": ObjectId ("55a0e30427c9370e525032e9"),
    "content": "This morning I had a cup of coffee.",
    "about": "beverage",
    "keywords": [
        "coffee"
    ]
}
{
    "_id": ObjectId ("55a0e31027c9370e525032ea"),
    "content": "Who doesn't like cake?",
    "about": "food",
    "keywords": [
        "cake",
        "food",
        "dessert"
    ]
}
For the documents above, create a full-text index on content:

db.article.ensureIndex({"content": "text"})
Use the full-text index to query content:

db.article.find({"$text": {"$search": "coffee"}})
To run a text search across every key that holds string data, use the wildcard specifier ($**). The following statement indexes the string content of all keys in all documents of the article collection and names the index TextIndex:

db.article.ensureIndex({"$**": "text"}, {"name": "TextIndex"})
The language of the indexed data determines how words are stemmed and which stop words are ignored. The default language is English. To specify a different language, use the default_language option when creating the full-text index.
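
The reason an index lookup beats scanning every string with a regular expression can be seen in a toy inverted index, which maps each word to the documents that contain it. The sketch below is plain JavaScript with hypothetical helpers buildTextIndex and search and simplified numeric _id values. It deliberately does no stemming and no stop-word removal, so it is far simpler than a real MongoDB text index.

```javascript
// Toy inverted index: maps each lower-cased word to the _ids that contain it.
// Assumption: no stemming and no stop-word removal, unlike a real text index.
function buildTextIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const words = String(doc[field]).toLowerCase().match(/[a-z0-9']+/g) || [];
    for (const word of new Set(words)) {
      if (!index.has(word)) index.set(word, []);
      index.get(word).push(doc._id);
    }
  }
  return index;
}

// A search is a single map lookup instead of a scan over every document.
function search(index, term) {
  return index.get(term.toLowerCase()) || [];
}

const articles = [
  { _id: 1, content: "This morning I had a cup of coffee." },
  { _id: 2, content: "Who doesn't like cake?" },
];
const contentIndex = buildTextIndex(articles, "content");
console.log(search(contentIndex, "coffee")); // [ 1 ]
console.log(search(contentIndex, "Cake"));   // [ 2 ]
```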

The following languages are supported (Chinese is not supported, at least in version 2.6):

da or danish
nl or dutch
en or english
fi or finnish
fr or french
de or german
hu or hungarian
it or italian
nb or norwegian
pt or portuguese
ro or romanian
ru or russian
es or spanish
sv or swedish
tr or turkish
Note: If you set the language to "none", text search uses a simple tokenizer with no stop words and no stemming.

For example, the following statement sets Spanish as the default language of the index:

db.quotes.ensureIndex({"content": "text"}, {"default_language": "spanish"})
Hash index
Hash indexes support equality queries but not range queries. A hash index cannot have a unique constraint, and in MongoDB 2.6 a hashed key cannot be part of a compound index. You can, however, create both a hash index and an ascending/descending (that is, non-hashed) index on the same key, and MongoDB will then automatically use the non-hashed index for range queries.

db.active.ensureIndex({"a": "hashed"})
The statement above creates a hash index on the a key of the active collection.
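
Why hashing helps equality lookups but not ranges can be sketched with a bucket map keyed by hash value. The code below is plain JavaScript. The 32-bit FNV-1a function is only a stand-in chosen for brevity, not the hash function MongoDB uses, and buildHashedIndex and findEqual are hypothetical helpers.

```javascript
// 32-bit FNV-1a hash, used here only as a stand-in hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Group documents into buckets keyed by the hash of the indexed value.
function buildHashedIndex(docs, key) {
  const buckets = new Map();
  for (const doc of docs) {
    const h = fnv1a(JSON.stringify(doc[key]));
    if (!buckets.has(h)) buckets.set(h, []);
    buckets.get(h).push(doc);
  }
  return buckets;
}

// Equality lookup: hash the value, then verify the candidates in that bucket.
function findEqual(buckets, key, value) {
  const bucket = buckets.get(fnv1a(JSON.stringify(value))) || [];
  return bucket.filter((doc) => doc[key] === value);
}

const active = [{ _id: 1, a: 10 }, { _id: 2, a: 11 }, { _id: 3, a: 12 }];
const hashed = buildHashedIndex(active, "a");
console.log(findEqual(hashed, "a", 11).map((d) => d._id)); // [ 2 ]
```

Neighboring values such as 10, 11, and 12 end up under unrelated hash values, so a condition like a > 10 cannot be answered by walking adjacent entries. That is the job of the ordinary ascending/descending index.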

Index attribute
MongoDB indexes can have the following attributes: TTL, unique, and sparse.

TTL Indexes
The TTL index is a special index, through which MongoDB will automatically remove the documents in the collection after a period of time. This is an ideal feature for certain types of information, such as machine-generated event data, logs, and session information. These data only need to be kept in the database for a limited time.

The TTL index has the following restrictions:

It does not support compound indexes.

The index key must hold date type data, or an array of dates.

If the key stores an array that contains several dates, the document expires when the lowest (that is, the earliest) date reaches the expiration threshold.

The TTL index cannot guarantee that expired data will be deleted immediately. There may be a delay between when the document expires and when MongoDB deletes the document from the database. The background task to delete expired data runs every 60 seconds. Therefore, after the document expires and before the background task runs or ends, the document will still exist in the collection. The duration of the delete operation actually depends on the load of your mongod instance. Therefore, between the two background tasks running, the expired data may continue to remain in the database for more than 60 seconds. In other respects, TTL indexes are ordinary indexes, and if possible, MongoDB will use these indexes to match any query.

db.token.ensureIndex({"lastUpdated": 1}, {"expireAfterSeconds": 60 * 60 * 24})
Documents in the token collection are deleted once their lastUpdated value is more than 24 hours old.
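
The expiration rule itself is a small date comparison. As a minimal sketch in plain JavaScript, the hypothetical helper isExpired below decides whether a document has passed its threshold at a given moment, using the earliest date when the key holds an array. It models only the rule, not the background task that performs the deletion, and the sample dates are made up.

```javascript
// Returns true when a document has passed its TTL threshold at time `now`.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (d) => d instanceof Date
  );
  if (dates.length === 0) return false; // non-date values never expire
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const ttl = 60 * 60 * 24; // 24 hours, as in the statement above
const now = new Date("2024-01-03T00:00:00Z");
const fresh = { lastUpdated: new Date("2024-01-02T12:00:00Z") };
const stale = { lastUpdated: new Date("2024-01-01T12:00:00Z") };
console.log(isExpired(fresh, "lastUpdated", ttl, now)); // false
console.log(isExpired(stale, "lastUpdated", ttl, now)); // true
```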

Unique Indexes
A unique index rejects any document whose value for the indexed key already exists in the index.

db.members.ensureIndex({"user_id": 1}, {unique: true})
By default, the unique attribute of a MongoDB index is false. If the unique constraint is placed on a compound index, MongoDB enforces uniqueness on the combination of values rather than on each individual value.

The limitation of uniqueness is for different documents in a collection. That is, the unique index can prevent different documents from storing the same value on the index key, but it does not prevent the same document from storing the elements or embedded documents with the same value in the array stored by the index key. In the case of storing duplicate data in the same document, duplicate values will only be stored in the index once.

For example, suppose a collection has a unique index on a.b:

db.collection.ensureIndex({"a.b": 1}, {unique: true})
If no other document in the collection has an a.b value of 5, the unique index allows the following document to be inserted:

db.collection.insert({a: [{b: 5}, {b: 5}]})
If a document does not contain the uniquely indexed key, the index stores a null value for that document by default. Because of the uniqueness constraint, MongoDB allows only one document that lacks the indexed key. If more than one document lacks the key or has no value for it, a duplicate key error is thrown and index creation fails. Combine the unique and sparse attributes to leave such documents out of the index and avoid this error.
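
These rules can be tried out with a toy model in plain JavaScript. UniqueIndex below is a hypothetical class, not a MongoDB API, and it is simplified to a single top-level key b that holds scalars or arrays of scalars instead of the dotted a.b path used above.

```javascript
// Toy unique index on a single top-level key (hypothetical, not a MongoDB API).
class UniqueIndex {
  constructor(key, { sparse = false } = {}) {
    this.key = key;
    this.sparse = sparse;
    this.seen = new Set();
  }

  insert(doc) {
    const has = Object.prototype.hasOwnProperty.call(doc, this.key);
    if (!has && this.sparse) return; // sparse: skip documents without the key
    const raw = has ? doc[this.key] : null; // a missing key is indexed as null
    // An array contributes each distinct element once.
    const keys = [...new Set(Array.isArray(raw) ? raw : [raw])];
    for (const k of keys) {
      if (this.seen.has(k)) throw new Error("duplicate key: " + k);
    }
    keys.forEach((k) => this.seen.add(k));
  }
}

const idx = new UniqueIndex("b");
idx.insert({ b: [5, 5] });    // allowed: duplicates inside one document
idx.insert({ name: "no b" }); // allowed: first document without the key
let error = null;
try {
  idx.insert({ name: "also no b" }); // second document without the key
} catch (e) {
  error = e.message;
}
console.log(error); // duplicate key: null
```

Constructing the index with { sparse: true } makes the model skip documents that lack the key, so any number of them can be inserted.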

Sparse Indexes
The sparse index will skip all documents that do not contain the indexed key. This index is called "sparse" because it does not include all documents in the collection. In contrast, a non-sparse index will index each document, and if a document does not contain an index key, store a null value for it.

db.addresses.ensureIndex({"xmpp_id": 1}, {"sparse": true})
If using a sparse index would produce an incomplete result set for a query or sort, MongoDB will not use that index unless the user names it explicitly with the hint() method. For example, the query {x: {$exists: false}} will not use a sparse index on the x key unless it is explicitly hinted.
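
The incomplete result is easy to reproduce in a toy model. In the plain JavaScript sketch below, all three helper functions and the sample documents are hypothetical. The sparse index holds entries only for documents that contain x, so answering an "x does not exist" query from the index alone misses exactly the documents being asked for.

```javascript
const docs = [{ _id: 1, x: 1 }, { _id: 2 }, { _id: 3, x: 3 }];

// A sparse index on a key holds entries only for documents that contain it.
function buildSparseIndex(collection, key) {
  return collection
    .filter((d) => Object.prototype.hasOwnProperty.call(d, key))
    .map((d) => ({ key: d[key], id: d._id }));
}

// Answering {x: {$exists: false}} from the sparse index alone: only documents
// with index entries are visible, and none of those lack the key.
function missingViaSparseIndex(collection, key) {
  const indexedIds = new Set(buildSparseIndex(collection, key).map((e) => e.id));
  return collection
    .filter((d) => indexedIds.has(d._id))
    .filter((d) => !Object.prototype.hasOwnProperty.call(d, key))
    .map((d) => d._id);
}

// Answering the same query with a full collection scan.
function missingViaScan(collection, key) {
  return collection
    .filter((d) => !Object.prototype.hasOwnProperty.call(d, key))
    .map((d) => d._id);
}

console.log(missingViaSparseIndex(docs, "x")); // []
console.log(missingViaScan(docs, "x"));        // [ 2 ]
```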

2dsphere (version 2), 2d and text indexes are always sparse.

A sparse compound index that contains only ascending/descending index keys indexes a document as long as the document contains at least one of those keys.

For a sparse compound index that includes a geospatial index key (such as 2dsphere or 2d) together with ascending/descending index keys, only the presence or absence of the geospatial key determines whether a document is indexed.

For a sparse compound index that contains a full-text index key together with ascending/descending index keys, only the presence of the full-text key determines whether a document is indexed.

An index that is both sparse and unique prevents documents in the collection from having duplicate values for the index key, while still allowing multiple documents to omit the key.

 

Reference: MongoDB 2.6 Chinese documentation
