This article is based on Kyle Banker's book MongoDB in Action. This section covers the basics of MongoDB indexes and some practical tips for using them.
Index types
Although all MongoDB indexes share the same storage structure, they are divided into unique indexes (unique), sparse indexes (sparse), and multikey indexes (multikey) according to different application-level requirements.
Unique Index
A unique index is created with the unique: true option. The command is as follows:
db.users.ensureIndex({username: 1}, {unique: true})
Once this unique index exists, inserting a username that is already present returns the following error:
E11000 duplicate key error index: gardening.users.$username_1 dup key: { : "kbanker" }
If you create a unique index on a collection that already holds data, and the indexed field already contains duplicate values, the creation fails. In that case you can add the dropDups option to force the duplicate documents to be deleted. The command is as follows:
db.users.ensureIndex({username: 1}, {unique: true, dropDups: true})
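The behaviour described above can be sketched in plain JavaScript. This is only a toy model, not how MongoDB stores or enforces indexes: the names UniqueIndexModel and buildWithDropDups are hypothetical, and the sketch assumes that dropDups keeps the first document seen for each key.

```javascript
// Toy model of a unique index: maps each key value to exactly one document.
// Illustration only; this is not MongoDB's internal implementation.
class UniqueIndexModel {
  constructor(field) {
    this.field = field;
    this.entries = new Map();
  }

  // Reject a document whose key is already present, mimicking error E11000.
  insert(doc) {
    const key = doc[this.field];
    if (this.entries.has(key)) {
      throw new Error('E11000 duplicate key error: ' + JSON.stringify(key));
    }
    this.entries.set(key, doc);
  }
}

// Build the index over existing documents. Assumption: the first document
// per key is kept and later duplicates are dropped (the dropDups effect).
function buildWithDropDups(field, docs) {
  const index = new UniqueIndexModel(field);
  const kept = [];
  for (const doc of docs) {
    if (!index.entries.has(doc[field])) {
      index.insert(doc);
      kept.push(doc);
    }
  }
  return { index, kept };
}

const existing = [
  { username: 'kbanker', n: 1 },
  { username: 'alice', n: 2 },
  { username: 'kbanker', n: 3 },
];
const { index, kept } = buildWithDropDups('username', existing);
console.log(kept.map((doc) => doc.n));
```

Because dropDups deletes data, it is worth checking which documents would be lost before running it on a real collection.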
Sparse Index
If some documents in a collection lack a field, or hold null in it, an ordinary index on that field still includes those documents and spends space on them. If you do not want such documents in the index, use a sparse index: it contains entries only for documents in which the indexed field is present. To create a sparse index, run the following command:
db.reviews.ensureIndex({user_id: 1}, {sparse: true})
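The difference between the two kinds of index can be sketched in plain JavaScript. This is a simplified model under the assumption that a regular index stores a null key for a document that lacks the field; buildIndexEntries and the sample reviews are hypothetical.

```javascript
// Toy model: list the entries a regular or sparse index would hold.
function buildIndexEntries(docs, field, options = {}) {
  const entries = [];
  docs.forEach((doc, position) => {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && options.sparse) {
      return; // sparse: a document without the field gets no entry
    }
    // regular index: a missing field is recorded under a null key
    entries.push({ key: hasField ? doc[field] : null, position });
  });
  return entries;
}

const reviews = [
  { text: 'great', user_id: 7 },
  { text: 'anonymous review' }, // no user_id field
  { text: 'ok', user_id: 9 },
];

const regular = buildIndexEntries(reviews, 'user_id');
const sparse = buildIndexEntries(reviews, 'user_id', { sparse: true });
console.log(regular.length, sparse.length);
```

The sparse index is smaller by exactly the number of documents that lack the field, which is where the space saving comes from.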
Multikey index
MongoDB can index a field whose value is an array. For example, given the following document, MongoDB can create an index on the tags field:
{name: "wheelbarrow",
 tags: ["tools", "gardening", "soil"]
}
When the index is built, the three values in tags produce three separate index entries. The entries for tools, gardening, and soil all point to the same document.
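As a rough illustration of how one document yields several index entries, here is a minimal plain-JavaScript sketch. The function multikeyEntries is hypothetical and only models the idea; it is not MongoDB's implementation.

```javascript
// Toy model: one index entry per array element, all pointing at the same document.
function multikeyEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, doc }));
}

const product = { name: 'wheelbarrow', tags: ['tools', 'gardening', 'soil'] };
const entries = multikeyEntries(product, 'tags');

for (const entry of entries) {
  console.log(entry.key, '->', entry.doc.name);
}
```

A query on any one of the three tag values can therefore find the document through the index.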
Index management
Index creation and Deletion
There are several ways to create and delete indexes. The two shown here are the lowest-level ones. You can create an index by inserting a document directly into the system.indexes collection:
spec = {ns: "green.users", key: {'addresses.zip': 1}, name: 'zip'}
db.system.indexes.insert(spec, true)
The preceding command creates an index by writing a record to system.indexes. The record contains the namespace of the collection to index, the key specification, and the index name.
After the index is created, run the following command to find the index:
db.system.indexes.find()
{ "_id" : ObjectId("4d2205c4051f853d46447e95"), "ns" : "green.users",
"key" : { "addresses.zip" : 1 }, "name" : "zip", "v" : 0 }
To delete a created index, run the following command:
use green
db.runCommand({deleteIndexes: "users", index: "zip"})
Index creation command
In practice there is a more convenient command for creating an index: ensureIndex. For example, to create a compound index on the open and close fields, use the following command:
db.values.ensureIndex({open: 1, close: 1})
This command runs the index build in two phases. The first phase sorts the values of the indexed fields. Indexes are organized as B-trees, and feeding the tree presorted data makes the second phase, inserting into the tree, much more efficient. The log shows output similar to the following:
Tue Jan 4 09:58:17 [conn1] building new index on { open: 1.0, close: 1.0 } for stocks.values
1000000/4308303 23%
2000000/4308303 46%
3000000/4308303 69%
4000000/4308303 92%
Tue Jan 4 09:59:13 [conn1] external sort used: 5 files in 55 secs
The second phase inserts the sorted data into the index structure to produce a usable index:
1200300/4308303 27%
2227900/4308303 51%
2837100/4308303 65%
3278100/4308303 76%
3783300/4308303 87%
4075500/4308303 94%
Tue Jan 4 10:00:16 [conn1] done building bottom layer, going to commit
Tue Jan 4 10:00:16 [conn1] done for 4308303 records 118.942secs
Tue Jan 4 10:00:16 [conn1] insert stocks.system.indexes 118942ms
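The two phases can be sketched in plain JavaScript as a sort followed by a left-to-right packing of the sorted entries into leaf nodes. This is only a simplified model suggested by the "building bottom layer" log line: the functions sortKeys and buildBottomLayer, the node size, and the sample data are all hypothetical, and the real build (an external sort spilling to files, then a full tree) is more involved.

```javascript
// Phase 1: extract the compound key from each document and sort the keys.
function sortKeys(docs, fields) {
  const compare = (a, b) => {
    for (let i = 0; i < a.key.length; i++) {
      if (a.key[i] < b.key[i]) return -1;
      if (a.key[i] > b.key[i]) return 1;
    }
    return 0;
  };
  return docs
    .map((doc, position) => ({ key: fields.map((f) => doc[f]), position }))
    .sort(compare);
}

// Phase 2: pack the sorted entries into fixed-size leaf nodes, left to right.
// Presorted input means each leaf is filled once and never revisited.
function buildBottomLayer(sortedEntries, nodeSize) {
  const leaves = [];
  for (let i = 0; i < sortedEntries.length; i += nodeSize) {
    leaves.push(sortedEntries.slice(i, i + nodeSize));
  }
  return leaves;
}

const values = [
  { open: 3, close: 1 },
  { open: 1, close: 5 },
  { open: 1, close: 2 },
  { open: 2, close: 9 },
  { open: 3, close: 0 },
];

const sorted = sortKeys(values, ['open', 'close']);
const leaves = buildBottomLayer(sorted, 2);
console.log(JSON.stringify(sorted.map((entry) => entry.key)));
console.log(leaves.length);
```

Sorting first is what lets the second phase append entries sequentially instead of searching for an insertion point for every key.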
Besides the log output, you can run the currentOp command in the shell to see what the current operation is doing, as in the following example:
> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 58,
      "active" : true,
      "lockType" : "write",
      "waitingForLock" : false,
      "secs_running" : 55,
      "op" : "insert",
      "ns" : "stocks.system.indexes",
      "query" : {
      },
      "client" : "127.0.0.1:53421",
      "desc" : "conn",
      "msg" : "index: (1/3) external sort 3999999/4308303 92%"
    }
  ]
}
The last field, msg, describes the index build. Here the sort phase is running and is 92% complete.
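To pick the index builds out of a longer currentOp listing, a small helper along the following lines may be useful. It is a sketch that assumes the msg field keeps the "index: ... done/total percent%" shape shown above; indexBuildProgress and the sample object are hypothetical.

```javascript
// Extract progress for operations whose msg field describes an index build.
// Assumes msg looks like "index: (1/3) external sort 3999999/4308303 92%".
function indexBuildProgress(currentOpResult) {
  return currentOpResult.inprog
    .filter((op) => typeof op.msg === 'string' && op.msg.startsWith('index:'))
    .map((op) => {
      const match = op.msg.match(/(\d+)\/(\d+)\s+(\d+)%/);
      return {
        opid: op.opid,
        ns: op.ns,
        done: match ? Number(match[1]) : null,
        total: match ? Number(match[2]) : null,
        percent: match ? Number(match[3]) : null,
      };
    });
}

// Sample input modelled on the output above, plus one unrelated operation.
const sample = {
  inprog: [
    {
      opid: 58,
      active: true,
      op: 'insert',
      ns: 'stocks.system.indexes',
      msg: 'index: (1/3) external sort 3999999/4308303 92%',
    },
    { opid: 59, active: true, op: 'query', ns: 'stocks.values' },
  ],
};

const builds = indexBuildProgress(sample);
console.log(JSON.stringify(builds));
```

In the shell, the same function could be called as indexBuildProgress(db.currentOp()).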
Create an index in the background
Building an index takes a write lock on the database. On a large data set, reads and writes against that database are blocked until the build finishes, which disrupts normal service. To avoid this, add the background: true option when creating the index so that the build runs in the background. A background build still needs the write lock, but it does not hold it exclusively for the whole build. Instead it pauses periodically to yield to other read and write operations, so the performance impact is much smaller. For example:
db.values.ensureIndex({open: 1, close: 1}, {background: true})
Create an index offline
However it is done, building an index puts some load on the database and therefore affects online service. If you want to build an index with no impact on online service at all, you can take a node out of the replica set, build the index on that node, and then add the node back to the replica set. There is only one condition: the index build must not take longer than the window of operations that the oplog retains. Otherwise, when the node comes back online it can no longer catch up with the primary, and a full resync is performed.
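The condition above amounts to a simple comparison, sketched here in plain JavaScript. The function canBuildOffline and its safetyFactor parameter are hypothetical, and the build time is whatever estimate you have (for instance from a test run). In the shell, db.getReplicationInfo() should report the oplog window in seconds as timeDiff, though that is worth confirming for your version.

```javascript
// Decide whether an offline index build fits inside the oplog window.
// safetyFactor (hypothetical) leaves headroom for catch-up after rejoining.
function canBuildOffline(estimatedBuildSecs, oplogWindowSecs, safetyFactor = 2) {
  return estimatedBuildSecs * safetyFactor <= oplogWindowSecs;
}

// Example: a build of about 119 seconds against a one-hour oplog window.
console.log(canBuildOffline(119, 3600));
// Example: a two-hour build against the same window does not fit.
console.log(canBuildOffline(7200, 3600));
```

If the check fails, the options are to enlarge the oplog or to accept a full resync of the node afterwards.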
Index backup
Both mongodump and mongoexport back up only the data, not the built indexes, so a restore still has to wait through a lengthy index rebuild. If you want a backup to include the indexes themselves, it is better to back up the data files.
Index compression
After an index has been in use for a while, repeated inserts, updates, and deletes leave it fragmented and loosely packed. In that case you can run the reIndex command to rebuild the indexes so that they take up less space.