MongoDB: Those Pitfalls

MongoDB is currently the hottest NoSQL document database and offers some great features: automatic failover, automatic sharding, a schemaless data model, and, in most cases, great performance. But in the course of using MongoDB heavily at Mint, we ran into quite a few problems. Below is a summary of several pitfalls we encountered. Special note: we currently run MongoDB 2.4.10. We upgraded to MongoDB 2.6.0, but the problems persisted, so we rolled back to 2.4.10.

MongoDB database-level lock

Pitfall index: 5 stars (out of 5)

MongoDB's locking mechanism is very different from that of typical relational databases such as MySQL (InnoDB) and Oracle. InnoDB and Oracle provide row-level locks, whereas MongoDB provides only database-level locks. This means that while a MongoDB write lock is held, all other read and write operations on that database have to wait.

At first glance, database-level locks look like a serious problem under high concurrency, yet MongoDB can still sustain high concurrency and high performance. Although its lock granularity is very coarse, its lock handling differs greatly from that of relational databases, mainly in the following ways:

    • MongoDB has no full transaction support, and operations are atomic only at the level of a single document, so each operation is usually small;
    • The time a MongoDB lock is actually held is the time needed to compute and change data in memory, which is usually short;
    • MongoDB locks can be yielded temporarily: when an operation has to wait for slow I/O, it can give up the lock and reacquire it after the I/O completes.

That things are usually fine does not mean they are always fine. If data is handled carelessly, an operation can still hold the write lock for a long time, for example the foreground index build discussed below. When that happens, the entire database is completely blocked and cannot serve any reads or writes, which is very serious.

To solve the problem, avoid operations that hold the write lock for a long time. If such operations on some collection are hard to avoid, consider moving that collection into a separate MongoDB database. Because locks on different databases are isolated from each other, separating the collection prevents operations on one collection from blocking everything else.
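
The effect of per-database locking can be illustrated with a toy model. The sketch below is plain Ruby and does not talk to MongoDB: the ToyServer class, its methods, and the database names main and requests are hypothetical, and a plain Mutex stands in for MongoDB's readers-writer lock. It only shows that a long write in one database leaves a second database available.

```ruby
# Toy model: one exclusive lock per database name. MongoDB 2.4 really uses a
# readers-writer lock per database; a Mutex is a deliberate simplification.
class ToyServer
  def initialize
    # Lazily create one lock for each database name that is used.
    @locks = Hash.new { |hash, db_name| hash[db_name] = Mutex.new }
  end

  # Returns true if an operation on db_name could start right now.
  def available?(db_name)
    lock = @locks[db_name]
    return false unless lock.try_lock

    lock.unlock
    true
  end

  # Holds the lock of db_name while the block runs (stands in for a long write).
  def with_write_lock(db_name)
    @locks[db_name].synchronize { yield }
  end
end

server = ToyServer.new
main_blocked = nil
requests_usable = nil

server.with_write_lock("main") do
  main_blocked = !server.available?("main")
  requests_usable = server.available?("requests")
end

puts "main blocked during the write: #{main_blocked}"
puts "requests usable during the write: #{requests_usable}"
```

In this model, moving a hot collection into its own database corresponds to giving it its own lock, so its long writes no longer stall the collections left in main.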

Database blocking due to index builds

Pitfall index: 3 stars

As mentioned above, MongoDB locks at the database level, and index builds are an easy way to hold the write lock for a long time. Building an index in the foreground holds the write lock for the whole build (without yielding), so if the collection is large, the build takes a long time and easily causes problems.

The solution is simple. MongoDB offers two ways to build an index: in the background, which does not hold the write lock for a long time, and in the foreground (the default), which does. Build the index in the background to solve the problem.
For example, to index a large collection posts,
never use

db.posts.ensureIndex({user_id: 1})

Instead, use

db.posts.ensureIndex({user_id: 1}, {background: 1})

Unreasonable use of embedded documents

Pitfall index: 5 stars

Embedded documents are one of the places where MongoDB differs significantly from relational databases. One document can embed other child documents, which makes it convenient to retrieve and modify the parent together with its children in a single collection.

For example, Mint has a Group document, and a user's request to join a group is modeled as a GroupRequest document. We initially embedded GroupRequest inside Group.
The Ruby code looks like this (using the Mongoid ODM):

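class Group
  include Mongoid::Document
  ...
  embeds_many :group_requests
  ...
end

class GroupRequest
  include Mongoid::Document
  ...
  embedded_in :group
  ...
end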

This usage dropped us into a pit we almost could not climb out of. It caused nearly two weeks of system problems: at peak hours the system often stalled for several minutes at a time, and the worst incident even brought MongoDB down.

After careful analysis, we found that for some active groups, group_requests were added (when new requests arrived) and changed (when requests were approved or rejected) unusually often, and these operations frequently held the write lock for a long time, blocking the entire database. The reason is that when a group_request is added and the space pre-allocated for the Group is not enough, space has to be reallocated (both in memory and on disk), which takes a long time. In addition, Group has many indexes, and moving the Group document triggers a large number of index updates, which is also time-consuming. Combined, these caused the write lock to be held for a long time.

The solution is simple to state: replace the embedded association with an ordinary foreign-key association, as in a relational database. Adding or modifying a group_request then touches only the GroupRequest document, which is simple and fast and avoids holding the write lock for a long time. When the amount of associated data is not fixed or changes frequently, you must avoid embedded associations, or you will pay dearly.
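
The reasoning above can be illustrated with a toy model in plain Ruby. It is not MongoDB's storage engine: the ToyCollection class, the 1.5 padding factor, the document sizes, and the index counts are all assumptions chosen for illustration. It only counts how often a growing document outgrows its allocated space and must be moved, and how many index entries each move touches.

```ruby
# Toy model only: it does NOT reproduce MongoDB's allocator. It counts how
# often a growing record no longer fits its allocated space and is relocated.
class ToyCollection
  attr_reader :moves, :index_updates

  # index_count and padding are illustrative assumptions.
  def initialize(index_count:, padding: 1.5)
    @index_count = index_count
    @padding = padding
    @size = {}       # document id => current size in bytes
    @allocated = {}  # document id => allocated size in bytes
    @moves = 0
    @index_updates = 0
  end

  def insert(id, bytes)
    @size[id] = bytes
    @allocated[id] = (bytes * @padding).ceil
  end

  # Grow an existing document, relocating it if it no longer fits.
  def grow(id, extra_bytes)
    @size[id] += extra_bytes
    return if @size[id] <= @allocated[id]

    @allocated[id] = (@size[id] * @padding).ceil
    @moves += 1
    @index_updates += @index_count # each index entry must point to the new location
  end
end

# Embedded style: every new request makes the single Group document bigger.
groups = ToyCollection.new(index_count: 10)
groups.insert(:group_1, 1_000)
100.times { groups.grow(:group_1, 200) }

# Referenced style: every request is its own small document; Group never grows.
requests = ToyCollection.new(index_count: 2)
100.times { |i| requests.insert("request_#{i}", 200) }

puts "embedded:   #{groups.moves} moves, #{groups.index_updates} index updates"
puts "referenced: #{requests.moves} moves, #{requests.index_updates} index updates"
```

In this model the embedded Group is moved repeatedly and every move costs one update per index, while the referenced layout never moves an existing document.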

Unreasonable use of Array fields

Pitfall index: 4 stars

The MongoDB Array field type is a distinctive feature that lets a single document store simple one-to-many relationships.

Mint has one use case that ran into a serious performance problem. Here is the code:

class User
  include Mongoid::Document
  ...
  field :follower_user_ids, type: Array
  ...
end

User stores the IDs of the people a user follows in an Array field, follower_user_ids. A user follows anywhere from 10 to 3,000 people, and the list changes frequently. As with the embedded documents above, frequent additions and modifications to follower_user_ids caused many long-held database write locks, and MongoDB performance dropped sharply.

Solution: we moved follower_user_ids into Redis, an in-memory database, which avoids frequent changes to User documents in MongoDB and solved the problem completely. If you do not use Redis, you can instead create a UserFollower collection linked by a foreign key.

Those are the pitfalls for now. All of them are traps that can do serious damage, so take extra care when using MongoDB to avoid falling in.

Regarding the problems encountered in practice, here are my own impressions and a few things to watch out for.

1. Be sure to create indexes sensibly. Many people are misled by the marketing into thinking that Mongo reads are fast by nature, so after migrating from MySQL they even forget to create indexes. When a table (collection) is large, missing indexes hurt performance badly. Creating an index is simple: if you do not want to bother with the shell, declare it in the model with index({xxx: 1}, {unique: true, background: true}) and run the rake command rake db:mongoid:create_indexes. The command does not recreate indexes that already exist.

2. When querying a large table, return only the columns you need. The original poster talked mostly about write performance; perhaps because our scenario is different, we ran into many query performance problems instead. This needs little explanation, since relational databases have the same issue. When the fields of a single collection hold a lot of data, performance problems arise very easily. In Rails the fix is simple: query with only, for example User.where(xxx).only(:f1, :f2).

3. Try to return all the required data at once and avoid get_more and cursor operations. When you iterate over a query, Mongo first returns one batch for you to iterate over. When that batch is exhausted, Mongoid issues a get_more command to advance the cursor and fetch the next batch. Advancing the cursor is slow, especially when you return many columns. The size of each batch is controlled by batchSize, and you can change its default value.

4. Try to avoid Array fields in models. The original poster has explained one reason, but we also hit a query problem: once you use an array, your queries inevitably use the $in operator, and $in often cannot make good use of indexes. The same is true in relational databases, so it must be avoided on large tables.

5. Do not use inheritance between models that map directly to the database. What does that mean? ModelB < ModelA, where both are Mongo documents. Why not? Because Mongoid's internal implementation creates only one collection, DocumentA, and uses a _type field in DocumentA to identify DocumentB. When you query ModelB, Mongoid generates a query against DocumentA with a condition like _type in [xxxx]. As you can see, that is an in operation again. If you discover this late in the project, there is really nothing you can do except despair :).

6. Transactions, transactions, transactions. MongoDB does not support transactions, so think it through and weigh the pros and cons. Some of our features had to be transactional, and the only way I could come up with was very ugly: record every model created or updated, along with its ID and the updated data, and if an exception occurs, undo the updates and creations. It is really troublesome. In a relational database that supports transactions, all of this is very simple.

7. Master-slave replication is not very mature. This may be because I have not researched it deeply enough, but I still believe it is not very mature, and at times it is downright scary. If anyone here has experience with it, I would welcome more discussion.
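
The manual rollback described in point 6 might look roughly like the following. This is a minimal sketch in plain Ruby, not the code we actually run: a Hash stands in for a MongoDB collection, and UndoLog, with_manual_rollback, and the sample keys are hypothetical names. Every create or update records a compensating step, and on an exception the steps are replayed in reverse order.

```ruby
# Minimal sketch of a manual undo log. A plain Hash stands in for a MongoDB
# collection; all class, method, and key names here are hypothetical.
class UndoLog
  def initialize(store)
    @store = store
    @steps = [] # compensating actions, oldest first
  end

  # Create a record and remember how to delete it again.
  def create(id, attrs)
    @store[id] = attrs.dup
    @steps << -> { @store.delete(id) }
  end

  # Update a record and remember its previous contents.
  def update(id, changes)
    previous = @store.fetch(id).dup
    @store[id] = previous.merge(changes)
    @steps << -> { @store[id] = previous }
  end

  # Undo every recorded step, newest first.
  def rollback
    @steps.reverse_each(&:call)
    @steps.clear
  end
end

# Runs the block; if it raises, undoes the recorded changes and re-raises.
def with_manual_rollback(store)
  log = UndoLog.new(store)
  begin
    yield log
  rescue StandardError
    log.rollback
    raise
  end
end

store = { user_1: { balance: 100 } }

begin
  with_manual_rollback(store) do |log|
    log.update(:user_1, { balance: 50 })
    log.create(:order_1, { total: 50 })
    raise "payment failed"
  end
rescue RuntimeError => e
  puts "rolled back after: #{e.message}"
end

p store
```

Unlike a real transaction, this gives no isolation: other clients can see the intermediate state, and a failure during the rollback itself leaves the data half undone.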
