18 principles for boosting MongoDB performance (development and design phase)


MongoDB is a high-performance database, but in the course of using it you will occasionally run into performance problems. MongoDB is relatively new compared to relational databases such as SQL Server, MySQL, and Oracle, and many people are not yet familiar with it, so developers and DBAs tend to focus on implementing functionality and ignore performance requirements. In fact, just as with SQL Server, MySQL, and Oracle, the design of database objects, the creation of indexes, and the optimization of statements all have a huge impact on MongoDB's performance.

To get the most out of MongoDB's performance, here is a simple summary of the following 18 principles; you are welcome to add to and improve them.

(1) It is recommended to use the default value for the _id key in a document; avoid saving custom values to _id.

Interpretation: Every MongoDB document has an _id key, which defaults to an ObjectId (an identifier containing a timestamp, machine ID, process ID, and counter). There is a large difference in insert speed between letting MongoDB generate _id and specifying it yourself: supplying a custom _id slows down inserts.
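
For illustration, a minimal mongo-shell sketch (the users collection and its fields are assumptions) contrasting the default ObjectId with a custom _id:

    // Default: the driver/server generates an ObjectId for _id automatically
    db.users.insertOne({ name: "alice" })
    // Custom _id (discouraged above): the application must supply the value and keep it unique
    db.users.insertOne({ _id: "user-0001", name: "bob" })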

(2) It is recommended to use short field names.

Interpretation: Unlike relational databases, every document in a MongoDB collection stores its field names, so long field names require more storage space.
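
As a hypothetical illustration (the events collection and field names are made up), both documents below carry the same data, but the second repeats much shorter field names in every document and therefore uses less storage:

    db.events.insertOne({ applicationSourceIdentifier: "web", creationTimestamp: new Date() })
    db.events.insertOne({ src: "web", ts: new Date() })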

(3) MongoDB indexes can speed up document queries, updates, deletes, and sorts, so create appropriate indexes based on business needs.
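
A minimal sketch of creating a single-field index in the mongo shell (the orders collection and customerId field are assumptions):

    // Ascending index on customerId to support queries, sorts, updates, and deletes that filter on it
    db.orders.createIndex({ customerId: 1 })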

(4) Every index takes up some space and adds overhead to insert operations, so it is recommended to keep the number of indexes per collection within 5.

(5) For queries that involve multiple keys, creating a compound index on those keys is a good solution. The order of the keys in a compound index matters; make sure you understand the index's leftmost-prefix principle.

Interpretation: For example, create the compound index {a: 1, b: 1, c: 1} on the test collection and execute the following 7 query statements:

    1. db.test.find({a: "Hello"})
    2. db.test.find({b: "Sogo", a: "Hello"})
    3. db.test.find({a: "Hello", b: "Sogo", c: "666"})
    4. db.test.find({c: "666", a: "Hello"})
    5. db.test.find({b: "Sogo", c: "666"})
    6. db.test.find({b: "Sogo"})
    7. db.test.find({c: "666"})
    • Of the queries above, statements 1, 2, 3, and 4 can use the index.
    • A query must include the leftmost index field to use the index; what matters is the order in which the index was created, not the order of the fields in the query.
    • Aim for the smallest number of indexes that covers the most queries.
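
The compound index above would be created as sketched below; explain() can then confirm which of the seven queries actually use it (a sketch in the mongo shell):

    db.test.createIndex({ a: 1, b: 1, c: 1 })
    // An "IXSCAN" stage in the winning plan shows the compound index is used (queries 1-4);
    // queries that omit the leftmost field a (queries 5-7) fall back to a "COLLSCAN"
    db.test.find({ b: "Sogo" }).explain("queryPlanner")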

(6) A TTL index (time-to-live index) can be used to expire documents: each document is given a lifetime, and once it exceeds that lifetime it is deleted.

Interpretation: The field a TTL index is created on must be of date type. A TTL index is a single-field index and cannot be a compound index. A background thread runs roughly every 60 seconds to remove expired documents. TTL indexes are not supported on capped (fixed-size) collections.
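
A minimal TTL sketch (the sessions collection and createdAt field are assumptions); documents expire roughly one hour after their createdAt value:

    // createdAt must hold a Date; the background TTL monitor removes expired documents about every 60 seconds
    db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
    db.sessions.insertOne({ user: "alice", createdAt: new Date() })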

(7) A sparse index is recommended when you need to index a field in a collection but a large number of documents in the collection do not contain that key.

Interpretation: Indexes are dense by default, meaning every document has an index entry even if the indexed field is missing. A sparse index contains entries only for documents that include the indexed key.
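
A minimal sparse-index sketch, assuming a users collection where only some documents have a nickname field:

    // Only documents that actually contain nickname get an entry in this index
    db.users.createIndex({ nickname: 1 }, { sparse: true })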

(8) When creating a text index, specify "text" for the field instead of 1 or -1. Each collection can have only one text index, but that index can cover any number of fields.

Interpretation: Text search is much faster; it is recommended to use a text index in place of inefficient multi-field queries on a collection's documents.
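
A text-index sketch, assuming an articles collection with title and body fields:

    // At most one text index per collection, but it may cover several fields
    db.articles.createIndex({ title: "text", body: "text" })
    // $text queries then use the text index instead of scanning several fields with regexes
    db.articles.find({ $text: { $search: "mongodb performance" } })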

(9) If findOne is used and multiple documents match the query, it returns only the first document in the collection's natural order. If you need more than one document, use the find method.
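
For illustration (the users collection and city field are assumptions):

    // Returns only the first matching document in natural order
    db.users.findOne({ city: "Beijing" })
    // Returns a cursor over all matching documents
    db.users.find({ city: "Beijing" })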

(10) If a query does not need to return the entire document, or is only used to determine whether a key exists, limit the returned fields with a projection to reduce network traffic and the client's memory usage.

Interpretation: You can explicitly include a returned field by setting {key: 1}, or exclude a field by setting {key: 0}.
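
A minimal projection sketch (the orders collection and its fields are assumptions):

    // Return only status (and suppress _id) instead of transferring the whole document
    db.orders.find({ customerId: 42 }, { status: 1, _id: 0 })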

(11) Except for prefix-style (left-anchored) queries, regular-expression queries cannot use indexes and take longer to execute than most other selectors; use them sparingly.
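
For illustration, assuming a users collection with an index on name:

    // Left-anchored prefix regex: can use the index on name
    db.users.find({ name: /^alva/ })
    // Unanchored regex: cannot narrow the index efficiently and examines far more entries
    db.users.find({ name: /alva/ })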

(12) In an aggregation pipeline, place $match before $group so that $match reduces the number of documents the $group operator has to process.
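
A sketch of the recommended pipeline order (the orders collection and its status, customerId, and amount fields are assumptions):

    // $match filters documents first, so $group has far fewer documents to process
    db.orders.aggregate([
      { $match: { status: "shipped" } },
      { $group: { _id: "$customerId", total: { $sum: "$amount" } } }
    ])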

(13) Modifying documents with update operators usually gives better performance, because the client does not need a round trip to the server to fetch and rewrite the whole document, and less time is spent serializing and transferring data.
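
A sketch of modifying a document in place with an update operator instead of fetching and rewriting it (the pages collection and views field are assumptions):

    // $inc changes a single field on the server; no round trip to read the document first
    db.pages.updateOne({ _id: "home" }, { $inc: { views: 1 } })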

(14) Bulk inserts (batch inserts) reduce the number of submissions to the server and improve performance. However, the BSON size of a bulk submission must not exceed 48 MB.
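
A minimal bulk-insert sketch; in current mongo shells insertMany plays the role of the batch insert described above (the logs collection is an assumption):

    // One submission to the server for all three documents; keep the total BSON size under the limit
    db.logs.insertMany([
      { level: "info", msg: "service started" },
      { level: "warn", msg: "slow query detected" },
      { level: "info", msg: "service stopped" }
    ])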

(15) Never sort too much data at once; MongoDB currently supports in-memory sorting only for result sets within 32 MB. If you must sort, limit the size of the result set as much as possible.
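
For illustration, constrain the result set and sort on an indexed field so the in-memory sort limit is not hit (a sketch; the events collection and an index on ts are assumptions):

    // Sorting on an indexed field and limiting the result keeps the sort cheap
    db.events.find({ type: "click" }).sort({ ts: -1 }).limit(100)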

(16) Some query operators can lead to poor performance, such as $ne, $not, $exists, $nin, and $or; avoid them in business code as far as possible.

a) $exists: because of the loose document structure, the query has to traverse every document;

b) $ne: if the negated value matches the majority of documents, the entire index is scanned;

c) $not: may prevent the query optimizer from knowing which index to use, so it often degrades to a full-collection scan;

d) $nin: results in a full-collection scan;

e) $or: runs one query per condition and then merges the result sets; consider $in instead, as sketched below.
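
For example, a $or over several equality conditions on the same field can usually be rewritten as a single $in (a sketch; the orders collection and status field are assumptions):

    // Instead of running one sub-query per condition and merging the result sets...
    db.orders.find({ $or: [ { status: "new" }, { status: "paid" }, { status: "shipped" } ] })
    // ...a single $in on the same field is evaluated as one query and can use an index on status
    db.orders.find({ status: { $in: ["new", "paid", "shipped"] } })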

(17) Capped (fixed-size) collections can be used for logging: inserts are faster, and the oldest data is automatically evicted as new data is inserted. Consider this feature during requirements analysis and design; it improves performance and removes the need for explicit deletes.

Interpretation: Capped collections must be created explicitly, with a maximum size specified and, optionally, a maximum number of documents. Whichever limit is reached first, inserting a new document pushes the oldest document out.
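
A minimal capped-collection sketch (the applog name and the size and max values are illustrative):

    // Up to 100 MB of storage and at most 1,000,000 documents; the oldest documents are evicted first
    db.createCollection("applog", { capped: true, size: 100 * 1024 * 1024, max: 1000000 })
    db.applog.insertOne({ ts: new Date(), msg: "service started" })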

(18) The amount of data in a collection's documents affects query performance; to keep it at an appropriate level, archive data regularly.
