MongoDB Paging optimization


MongoDB paging itself is very simple; this article is mainly about the problems paging runs into and how to optimize it.

From traditional web pages to mobile APIs, we all face the same constraints, such as the size of an Ajax GET response, which force you to paginate.

For example, my project uses Ratchet as its H5 framework; its push.js loads other pages via Ajax GET, and a page that is too large causes errors.

Pagination description

In a typical list API, pull-to-refresh fetches the latest items, and pulling up loads the next page.

A common API needs two interfaces:

    • get_latest(model, count)
    • get_with_page(number, size)

get_latest returns the newest data; this is the interface behind the usual pull-to-refresh. Because two refreshes can be far apart in time, the data it returns usually replaces whatever is in the current list.

Usual procedure

    • If another request arrives within n seconds (for example n = 30s), show a "recently updated" hint or simply return the current data; there is no need to query again
    • If new data is fetched, it replaces the current list so the data stays consistent (a minimal sketch follows this list)
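
A minimal mongo-shell sketch of that flow, assuming a hypothetical users collection and the 30-second window mentioned above (getLatest, refresh and lastRefresh are illustrative names, not part of the article's API):

// get_latest: the newest `count` documents, newest first (ObjectIds are roughly time-ordered)
function getLatest(count) {
    return db.users.find().sort({ _id: -1 }).limit(count).toArray();
}

// Throttled refresh: if the previous refresh was less than 30 seconds ago,
// keep the current list instead of querying again
var lastRefresh = 0;
function refresh(count) {
    if (Date.now() - lastRefresh < 30 * 1000) return null;   // caller keeps what it has
    lastRefresh = Date.now();
    return getLatest(count);
}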

How do you tell that you have reached the last page?

The common approach is to fetch the total count, divide it by the page size, and check whether the current page number equals the number of pages minus one:

total_pages = ceil(all_count / pagesize), and page n (counting from 0) is the last page when n == total_pages - 1

With a small data set you will not feel it, but once the volume is large, what do you think that count(*) costs?
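
Spelled out in mongo-shell terms, the count-based check looks roughly like this (a sketch; pageSize and currentPage are hypothetical, and the count() call is exactly the part that hurts on a big collection):

// Count-based last-page check -- simple, but count() gets expensive as the collection grows
var pageSize = 10;
var currentPage = 0;                               // whatever page the client is on
var totalCount = db.users.count();                 // the costly full count
var totalPages = Math.ceil(totalCount / pageSize);
var isLastPage = (currentPage >= totalPages - 1);  // pages numbered from 0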

A better practice is to page by _id and let the front end decide from the number of documents returned each time: if the count equals pageSize, there may be another page to fetch; if it is less than pageSize, there is no more data, i.e. you have reached the end. At that point, just disable the "next page" button.
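
Reusing the pageSize from the sketch above, the front-end check boils down to this (disableNextPageButton() is a hypothetical UI hook):

// Called after each page of results arrives
function reachedLastPage(docs) {
    // Fewer documents than pageSize means there is nothing left to fetch
    return docs.length < pageSize;
}

// if (reachedLastPage(page)) disableNextPageButton();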

Using skip() and limit()

// Page 1
db.users.find().limit(10)
// Page 2
db.users.find().skip(10).limit(10)
// Page 3
db.users.find().skip(20).limit(10)
...

In general, the code to retrieve page n looks like this:

db.users.find().skip(pagesize*(n-1)).limit(pagesize)

Of course, this assumes that no documents are inserted or deleted between two queries; does that actually hold in your system?

Most OLTP systems cannot guarantee that nothing changes in between, so skip is little more than a toy and not of much use.

Besides, skip + limit is only suitable for small amounts of data; once the data grows it slows down, even with indexes and other tuning, because the server still has to walk through every skipped document. Its flaw is that obvious.
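
You can see the cost for yourself by comparing executionStats for a deep skip against the _id range query introduced in the next section (the collection name, page depth and last_id value here are made up for illustration):

// Deep page via skip: the skipped documents are still scanned and discarded
db.users.find().sort({ _id: 1 }).skip(100000).limit(10).explain("executionStats")

// Range query on _id: the index seek jumps straight to the right place
var lastId = ObjectId("55940ae59c39572851075bfd");   // _id of the last document on the previous page
db.users.find({ _id: { $gt: lastId } }).sort({ _id: 1 }).limit(10).explain("executionStats")

// Compare totalKeysExamined / totalDocsExamined in the two outputs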

If you are dealing with large data sets, you need to consider another approach.

Using find() and limit()

Since skip() gives us no good way to handle large data sets, we need an alternative to skip.

The idea is to anchor the query on something that orders the documents, such as a timestamp or the _id field.

Here we will use _id (a timestamp works the same way; check whether your schema has a field such as created_at).

_id is a MongoDB ObjectId. An ObjectId occupies 12 bytes of storage, two hexadecimal characters per byte, so it prints as a 24-character string made up of a timestamp, machine id, process id, counter and so on. There is a section below on how it is composed and why it is unique.

The general idea of paging with _id is as follows:

    1. On the current page, find the _id of the last record and note it down as last_id
    2. Use that last_id as the query condition and fetch the records whose _id is greater than last_id as the next page

So, isn't it easy?

The code is as follows

// Page 1
db.users.find().limit(pageSize);
// Find the _id of the last document on this page
last_id = ...
// Page 2
users = db.users.find({ _id: { $gt: last_id } }).limit(pageSize);
// Update last_id with the _id of the last document on this page
last_id = ...

This is just demo code; here is what the query looks like in the Robomongo 0.8.4 client:

db.usermodels.find({ "_id": { "$gt": ObjectId("55940ae59c39572851075bfd") } }).limit(20).sort({ _id: -1 })

With this approach, the two interfaces we need to implement become (a minimal sketch follows the list):

    • get_latest(model, count)
    • get_next_page_with_last_id(last_id, size)
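
Continuing the sketch from earlier, the second interface could look like this in the mongo shell (the function name, the users collection and the ascending-_id convention are assumptions for illustration):

// get_next_page_with_last_id: records whose _id is greater than lastId,
// i.e. the page that follows the one ending at lastId
function getNextPageWithLastId(lastId, size) {
    return db.users.find({ _id: { $gt: lastId } })
                   .sort({ _id: 1 })
                   .limit(size)
                   .toArray();
}

// Usage: pass the _id of the last document on the current page
// var page2 = getNextPageWithLastId(lastIdOfPage1, 20);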

To better understand why paging on _id works, it helps to know how an ObjectId is composed.

About ObjectId composition

As mentioned above, _id is a MongoDB ObjectId: a 12-byte structure made up of a timestamp, machine id, process id, counter and so on.

![ObjectId structure](http://images.blogjava.net/blogjava_net/dongbule/46046/o_111.PNG)

TimeStamp

The first 4 bytes are a UNIX timestamp, an int. Take the first 4 bytes of the ObjectId in the example above, "4df2dcec", and convert them from hexadecimal to decimal: 1307761900. That number is a timestamp. To make it easier to read, convert it into the usual time format:

$ date -d '1970-01-01 UTC 1307761900 sec' -u
Sat Jun 11 03:11:40 UTC 2011

So the first 4 bytes actually hide the time the document was created. Because the timestamp sits at the front of the string, ObjectIds are roughly ordered by insert time, which is useful in several ways, for example when the field is used as an index to speed up queries. Another benefit is that some client drivers can parse the insertion time back out of the ObjectId. It also explains why, when you create several ObjectIds in quick succession, the first few characters barely change: they encode the current time. Some users worry about clock synchronization between servers, but the exact value of this timestamp does not really matter, as long as it keeps increasing.
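
In the mongo shell you can read that timestamp back directly with the standard getTimestamp() helper (the ObjectId below is the example value used earlier in this article):

// Extract the creation time hidden in the first 4 bytes
ObjectId("55940ae59c39572851075bfd").getTimestamp()
// ISODate("2015-07-01T15:44:37Z")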

Machine

The next three bytes, 2cdcd2 in the example, are a unique identifier of the host, usually a hash of the machine's hostname. Different hosts produce different machine hashes, which ensures there are no collisions in a distributed deployment. This is also why ObjectIds generated on the same machine all share the same middle section.

Pid

The machine bytes above ensure that ObjectIds generated on different machines do not collide, while the PID ensures that ObjectIds generated by different MongoDB processes on the same machine do not collide either. The next two bytes, 9362, are the identifier of the process that generated the ObjectId.

Increment

The preceding nine bytes guarantee that ObjectIds generated in the same second by different processes on different machines do not collide. The last three bytes, a8b817, are an auto-incrementing counter that guarantees uniqueness within the same second, allowing 256^3 = 16,777,216 unique ObjectIds per second.

Client generation

Generating ObjectIds has one more big advantage: MongoDB can generate them on the server, but they can also be generated by the client driver. If you read the documentation carefully you will notice that MongoDB's design trades space for time everywhere. Comparing ObjectIds is lightweight, but generating them on the server still takes time, so whatever can be moved from the server to the client driver should be moved there; pushing this work to the client reduces the load on the server. The other reason is that scaling the application layer is far easier than scaling the database layer.
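
As a small illustration (the shell is itself a client): an insert without an _id gets one assigned locally before the command is ever sent to the server. The document below is hypothetical demo data.

// The shell/driver fills in _id on the client side before sending the insert
var doc = { nickname: "demo" };
db.usermodels.insert(doc);
doc._id   // already populated with a freshly generated ObjectId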

Summary

The thinking behind MongoDB's ObjectId generation is worth borrowing in many situations, especially in large-scale distributed development: how to generate identifiers cheaply, how to shift the generation load around, and how to trade space for time to squeeze the most out of generation.

The point of saying all this is simple: MongoDB's _id is unique, on a single machine and across a cluster alike. That is all you really need to understand here.

Performance optimization

Index

Build indexes according to your business needs; see the official documentation: http://docs.mongodb.org/manual/core/indexes/
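
For the _id-based paging above no extra index is needed, since _id is always indexed. If you page on another field instead, such as the hypothetical created_at mentioned earlier, give it its own index, for example:

// Hypothetical: a descending index on created_at for time-based paging
db.usermodels.createIndex({ created_at: -1 })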

About explain

If you know execution plans from relational databases, MongoDB's explain will feel familiar; if not, here are a few words about it.

explain is a command MongoDB provides for inspecting how a query executes, and it is the main tool for performance tuning.

http://docs.mongodb.org/manual/reference/method/cursor.explain/

db.usermodels.find({ "_id": { "$gt": ObjectId("55940ae59c39572851075bfd") } }).explain()

/* 0 */
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "xbm-wechat-api.usermodels",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "_id" : {
        "$gt" : ObjectId("55940ae59c39572851075bfd")
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "_id" : 1
        },
        "indexName" : "_id_",
        "isMultiKey" : false,
        "direction" : "forward",
        "indexBounds" : {
          "_id" : [
            "(ObjectId('55940ae59c39572851075bfd'), ObjectId('ffffffffffffffffffffffff')]"
          ]
        }
      }
    },
    "rejectedPlans" : []
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 5,
    "executionTimeMillis" : 0,
    "totalKeysExamined" : 5,
    "totalDocsExamined" : 5,
    "executionStages" : {
      "stage" : "FETCH",
      "nReturned" : 5,
      "executionTimeMillisEstimate" : 0,
      "works" : 6,
      "advanced" : 5,
      "needTime" : 0,
      "needFetch" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 5,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 5,
        "executionTimeMillisEstimate" : 0,
        "works" : 5,
        "advanced" : 5,
        "needTime" : 0,
        "needFetch" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "_id" : 1
        },
        "indexName" : "_id_",
        "isMultiKey" : false,
        "direction" : "forward",
        "indexBounds" : {
          "_id" : [
            "(ObjectId('55940ae59c39572851075bfd'), ObjectId('ffffffffffffffffffffffff')]"
          ]
        },
        "keysExamined" : 5,
        "dupsTested" : 0,
        "dupsDropped" : 0,
        "seenInvalidated" : 0,
        "matchTested" : 0
      }
    },
    "allPlansExecution" : []
  },
  "serverInfo" : {
    "host" : "iZ251uvtr2b",
    "port" : 27017,
    "version" : "3.0.3",
    "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105"
  }
}

Field Description:

The queryPlanner.winningPlan.inputStage.stage field shows the query strategy:

    • IXSCAN means the query used an index scan
    • COLLSCAN means a full collection scan, which is the slow case you want to avoid

The index name that the 2.6 explain output showed in the cursor field now appears in queryPlanner.winningPlan.inputStage.indexName.

In 3.0, executionStats.totalDocsExamined shows the total number of documents examined; it replaces nscanned from 2.6, i.e. the number of documents scanned.

    • nReturned: number of documents returned
    • executionTimeMillis: execution time in milliseconds
    • indexBounds: the index bounds that were used

Profiling

MongoDB also has a profiling feature; the second argument to setProfilingLevel is the slow-operation threshold in milliseconds:

db.setProfilingLevel(2, 20)

The profiler has three levels:

    • 0: profiling off
    • 1: log slow operations, by default those slower than 100 ms
    • 2: log all operations

The records are stored in the system.profile collection by default; query them with:

db['system.profile'].find()
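
For example, to pull the most recent operations slower than the 20 ms threshold set above out of system.profile (a sketch; millis and ts are standard fields of profile documents):

// Most recent operations that took longer than 20 ms
db.system.profile.find({ millis: { $gt: 20 } })
                 .sort({ ts: -1 })
                 .limit(5)
                 .pretty()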

Summary

    • explain lets you analyze performance while you are still writing code, during development
    • the profiler catches slow statements at runtime, making it easy to locate problems in the live product

Whatever problem you are targeting, the solution comes down to:

    • Adapt the schema structure to your business
    • Optimize the indexes

With this knowledge, I believe you can test the performance of your paging statements on your own.

End of full text

Welcome to follow my WeChat public account "node full stack".
