MongoDB indexing skills

Source: Internet
Author: User

This article is based on Kyle Banker's book MongoDB in Action. It covers some basic knowledge and practical tips about MongoDB indexes.

Index types

Although all MongoDB indexes share the same storage structure, they are divided into unique indexes (unique), sparse indexes (sparse), and multi-value indexes (multikey) according to application-level requirements.

Unique Index

The unique index can be created with the unique: true option. The creation command is as follows:

db.users.ensureIndex({username: 1}, {unique: true})

After the preceding unique index is created, if you insert an existing username, the following error is returned:

E11000 duplicate key error index: gardening.users.$username_1 dup key: { : "kbanker" }
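The error message refers to the index as username_1. That is the default name MongoDB derives from the key specification when no name is given: each field name followed by its direction, joined with underscores. A minimal sketch of that naming rule in plain JavaScript follows; defaultIndexName is a hypothetical helper written for illustration, not a MongoDB function.

```javascript
// Hypothetical helper: derive the default index name from a key spec,
// e.g. { username: 1 } -> "username_1".
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ username: 1 }));        // username_1
console.log(defaultIndexName({ open: 1, close: -1 })); // open_1_close_-1
```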

If you create a unique index on a collection that already holds data, and the indexed field already contains duplicate values, the creation fails. To force the duplicate documents to be deleted, add the dropDups option. The command is as follows:

db.users.ensureIndex({username: 1}, {unique: true, dropDups: true})
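Because this option deletes documents, it may be worth checking which values are duplicated before using it. The following is a minimal in-memory sketch in plain JavaScript, assuming the documents have already been loaded into an array; findDuplicateValues and the sample data are hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper: list the values of `field` that occur more than once.
function findDuplicateValues(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const value = doc[field];
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  // Keep only the values seen at least twice.
  return [...counts].filter(([, n]) => n > 1).map(([value]) => value);
}

// Made-up sample documents for illustration.
const users = [
  { username: "kbanker" },
  { username: "alice" },
  { username: "kbanker" },
];

console.log(findDuplicateValues(users, "username")); // [ 'kbanker' ]
```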

Sparse index

If some documents in a collection lack a field, or hold null for it, a regular index on that field still contains an entry for each of those documents, and those entries take up space. If you do not want documents without the field to take part in the index, use a sparse index. A sparse index only contains entries for documents that actually have the indexed field. To create a sparse index, run the following command:

db.reviews.ensureIndex({user_id: 1}, {sparse: true})
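The difference between a regular index and one created with sparse: true can be pictured with a toy model in plain JavaScript. This is only a sketch: indexedDocs and the sample documents are hypothetical. It assumes the behavior described in the MongoDB documentation, where a regular index stores a missing field as null and a sparse index skips only the documents that lack the field, so a field explicitly set to null is still indexed.

```javascript
// Toy model of which documents receive an index entry. Not MongoDB code.
function indexedDocs(docs, field, { sparse = false } = {}) {
  return docs
    // A sparse index skips documents that do not have the field at all.
    .filter((doc) => !sparse || field in doc)
    // A regular index records a missing field under the key null.
    .map((doc) => ({ key: field in doc ? doc[field] : null, id: doc._id }));
}

// Made-up sample documents for illustration.
const reviews = [
  { _id: 1, user_id: 42 },
  { _id: 2 },                // field missing
  { _id: 3, user_id: null }, // field present but null
];

console.log(indexedDocs(reviews, "user_id").length);                   // 3
console.log(indexedDocs(reviews, "user_id", { sparse: true }).length); // 2
```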

Multi-value index

MongoDB can index fields that hold arrays. For example, given the following document, MongoDB can create an index on the tags field:

{ name: "wheelbarrow",
  tags: ["tools", "gardening", "soil"]
}

When the index is built, the three values in tags are split into three independent index entries. The entries for tools, gardening, and soil all point to the same document.
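The way one document fans out into several index entries can be sketched in plain JavaScript. This is a toy model under stated assumptions, not MongoDB's implementation: multikeyEntries is a hypothetical helper, and the _id value is made up for illustration.

```javascript
// Toy model: produce one index entry per array element,
// each pointing back to the same document.
function multikeyEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, id: doc._id }));
}

// The _id is an assumption added so entries have something to point to.
const product = { _id: 1, name: "wheelbarrow", tags: ["tools", "gardening", "soil"] };

console.log(multikeyEntries(product, "tags"));
```

A query on any one of the three tag values would then find an entry that leads to this document.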

Index management

Index creation and deletion

There are several ways to create and delete indexes. The two shown first are the most primitive. You can create an index by inserting a document into the system.indexes collection:

spec = {ns: "green.users", key: {'addresses.zip': 1}, name: 'zip'}
db.system.indexes.insert(spec, true)

The preceding command creates an index by writing a record to system.indexes. The record contains the namespace of the collection to index, the index key, and the index name.

After the index is created, run the following command to find the index:

db.system.indexes.find()
{ "_id" : ObjectId("4d2205c4051f853d46447e95"), "ns" : "green.users",
  "key" : { "addresses.zip" : 1 }, "name" : "zip", "v" : 0 }

To delete a created index, run the following command:

use green
db.runCommand({deleteIndexes: "users", index: "zip"})

Index creation command

In fact, there is a more convenient command for creating an index: ensureIndex (later MongoDB releases call it createIndex). For example, to create a compound index on the open and close fields, use the following command:

db.values.ensureIndex({open: 1, close: 1})

This command triggers a two-phase index build. The first phase sorts the values of the indexed fields. Indexes are organized as B-trees, and feeding the tree sorted data makes the second phase, the insertion into the tree, more efficient. In the log, the first phase produces output similar to the following:

Tue Jan 4 09:58:17 [conn1] building new index on { open: 1.0, close: 1.0 } for stocks.values
1000000/4308303 23%
2000000/4308303 46%
3000000/4308303 69%
4000000/4308303 92%
Tue Jan 4 09:59:13 [conn1] external sort used : 5 files in 55 secs

The second phase inserts the sorted data into the index structure to produce a usable index:

1200300/4308303 27%
2227900/4308303 51%
2837100/4308303 65%
3278100/4308303 76%
3783300/4308303 87%
4075500/4308303 94%
Tue Jan 4 10:00:16 [conn1] done building bottom layer, going to commit
Tue Jan 4 10:00:16 [conn1] done for 4308303 records 118.942secs
Tue Jan 4 10:00:16 [conn1] insert stocks.system.indexes 118942ms
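The two phases in the log, an external sort followed by building the bottom layer of the tree, can be pictured with a small sketch in plain JavaScript. This is a toy model under stated assumptions, not MongoDB's implementation: arrays stand in for the temporary files of the external sort, and sortedRuns, mergeRuns, buildLeaves, and the sample keys are hypothetical.

```javascript
// Phase 1a: split the keys into chunks and sort each chunk (a "run").
// A real external sort would write each run to a temporary file.
function sortedRuns(keys, runSize) {
  const runs = [];
  for (let i = 0; i < keys.length; i += runSize) {
    runs.push(keys.slice(i, i + runSize).sort((a, b) => a - b));
  }
  return runs;
}

// Phase 1b: merge the sorted runs into one ordered sequence.
function mergeRuns(runs) {
  const positions = runs.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let r = 0; r < runs.length; r++) {
      if (positions[r] >= runs[r].length) continue; // run exhausted
      if (best === -1 || runs[r][positions[r]] < runs[best][positions[best]]) best = r;
    }
    if (best === -1) return merged; // every run exhausted
    merged.push(runs[best][positions[best]++]);
  }
}

// Phase 2: pack the ordered keys into leaf nodes from left to right.
function buildLeaves(sortedKeys, leafSize) {
  const leaves = [];
  for (let i = 0; i < sortedKeys.length; i += leafSize) {
    leaves.push(sortedKeys.slice(i, i + leafSize));
  }
  return leaves;
}

const keys = [42, 7, 19, 3, 88, 61, 5, 23];
const runs = sortedRuns(keys, 3);
const leaves = buildLeaves(mergeRuns(runs), 4);
console.log(leaves); // [ [ 3, 5, 7, 19 ], [ 23, 42, 61, 88 ] ]
```

Because the keys arrive in order, each leaf is filled once and never has to be split, which is the efficiency gain that sorting first buys.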

Besides reading the log, you can run the currentOp command in the shell to get information about the operations currently in progress, as shown in the following example:

> db.currentOp()

{
  "inprog" : [
    {
      "opid" : 58,
      "active" : true,
      "lockType" : "write",
      "waitingForLock" : false,
      "secs_running" : 55,
      "op" : "insert",
      "ns" : "stocks.system.indexes",
      "query" : {
      },
      "client" : "127.0.0.1:53421",
      "desc" : "conn",
      "msg" : "index: (1/3) external sort 3999999/4308303 92%"
    }
  ]
}

The last field, msg, describes the index build. Here the build is in the sorting phase, which is 92% complete.
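If you want to track the build from a script, the progress message can be taken apart with a regular expression. The sketch below is plain JavaScript and assumes the message keeps the shape shown in the example output; parseIndexProgress is a hypothetical helper, and the format may differ between MongoDB versions.

```javascript
// Hypothetical helper: parse a progress message such as
// "index: (1/3) external sort 3999999/4308303 92%".
function parseIndexProgress(msg) {
  const match = /\((\d+)\/(\d+)\)\s+(.+?)\s+(\d+)\/(\d+)\s+(\d+)%/.exec(msg);
  if (!match) return null; // message is not in the expected shape
  const [, step, steps, phase, done, total, percent] = match;
  return {
    step: Number(step),
    steps: Number(steps),
    phase,
    done: Number(done),
    total: Number(total),
    percent: Number(percent),
  };
}

const progress = parseIndexProgress("index: (1/3) external sort 3999999/4308303 92%");
console.log(progress.phase, progress.percent); // external sort 92
```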

Create an index in the background

Creating an index takes a write lock on the database. If the dataset is large, reads and writes on the database are blocked until the build finishes, which disrupts normal service. To avoid this, add the background: true option when creating the index so that the build runs in the background. A background build still needs a write lock, but it does not hold the lock exclusively: it pauses to yield to other read and write operations, so the performance impact is far less severe. For example:

db.values.ensureIndex({open: 1, close: 1}, {background: true})
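The idea of a build that pauses to let other operations through can be pictured with a generator in plain JavaScript. This is only an analogy, not how the server is implemented: buildInBatches, runInterleaved, the batch size, and the operation labels are all hypothetical.

```javascript
// Toy model: an index build that yields after each batch of documents.
function* buildInBatches(totalDocs, batchSize) {
  for (let done = 0; done < totalDocs; ) {
    done = Math.min(done + batchSize, totalDocs);
    yield done; // pause point: give other operations a turn
  }
}

// Run the build, letting one queued operation through at each pause.
function runInterleaved(build, pendingOps) {
  const timeline = [];
  for (const done of build) {
    timeline.push(`indexed ${done}`);
    if (pendingOps.length > 0) timeline.push(pendingOps.shift());
  }
  return timeline.concat(pendingOps);
}

console.log(runInterleaved(buildInBatches(5, 2), ["read A", "write B"]));
// [ 'indexed 2', 'read A', 'indexed 4', 'write B', 'indexed 5' ]
```

In the foreground case the equivalent model would run all batches before touching the queue, which is why queued reads and writes stall until the build is done.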

Create an index offline

However it is done, building an index puts some load on the database and therefore affects online service. If you want to build an index with no impact on online service at all, you can take a node of the replica set out of the cluster, build the index on that node, and add the node back to the replica set once the build is complete. There is only one condition: the index build must not take longer than the time span covered by the oplog. Otherwise, when the node comes back online it can no longer catch up with the primary, and a full resync is required.
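The oplog condition amounts to a simple comparison, sketched below in plain JavaScript. The function names, the sample numbers, and the safety factor are assumptions made for illustration; in practice the oplog window has to be measured on your own deployment.

```javascript
// Time span covered by the oplog, from the timestamps (in seconds)
// of its oldest and newest entries.
function oplogWindowSeconds(firstEntryTs, lastEntryTs) {
  return lastEntryTs - firstEntryTs;
}

// Hypothetical check: the build must fit inside the oplog window.
// safetyFactor is an assumed margin, not a MongoDB setting.
function canRejoinWithoutResync(buildSeconds, windowSeconds, safetyFactor = 2) {
  return buildSeconds * safetyFactor <= windowSeconds;
}

const windowSecs = oplogWindowSeconds(0, 24 * 3600);            // oplog covers 24 hours
console.log(canRejoinWithoutResync(2 * 3600, windowSecs));  // true: a 2-hour build fits
console.log(canRejoinWithoutResync(13 * 3600, windowSecs)); // false: too close to the limit
```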

Index backup

Both mongodump and mongoexport back up data only, not the built index structures. At most the index definitions are saved (as mongodump does), so during restoration you still have to wait for the indexes to be rebuilt, which can take a long time. Therefore, if you want the backup to include the indexes themselves, it is better to back up the data files.

Index compression

After an index has been in use for a while, repeated inserts, deletes, and updates leave it loosely packed. In that case, you can use the reIndex command to rebuild the index so that it takes up less space.
