The method and performance of MongoDB paging query

Source: Internet
Author: User
Tags tojson mongodb gui

This article focuses on the paging query of MongoDB

Traditional SQL Paging

Traditional SQL paging, all the solutions are almost row_number, for the needs of a variety of sorting, complex query scenarios, Row_number is the killer. In addition, for the current web is very popular Poll/push load paging way, generally use time stamp to achieve paging. These two kinds of paging can be said that the former is universal, even LINQ generated by the paging is row_number, it is conceivable that it is more versatile. The latter is best both in terms of performance and complexity, as long as a simple time stamp is available.

MongoDB Paging

Into the MONGO of ideas, paging is actually not difficult, what is rare? In fact, there is nothing, see understand also that, and SQL paging ideas are consistent.

First of all, the following use case for this article, I imported the following Entity data in the database, where cus_id, amount I was born into an orderly number, pour the number of records is 200w:

public class test{        
<summary>
Primary key ObjectId is the primary key type that the MongoDB comes with
</summary>
Public ObjectId Id {get; set;}
<summary>
Customer number
</summary> [Bsonelement ("cust_id")]
public string CustomerId {get; set;}
<summary>/// total </summary>
[Bsonelement ("Amount")]
public int Amount {get; set;}
<summary>//Status/// </summary>
[Bsonelement ("status")]
public string Status {get; set;}}

The following operations are based on the MongoDB GUI tool see reference 3

First look at the parameters and results of the paging needs, the general paging needs parameters are:

    • PageIndex Current Page
    • PageSize number of records per page
    • Queryparam[] Other query fields

So according to Row_number's paging idea, that is, taking pageindex*pagesize to (Pageindex*pagesize + pageSize), we use LINQ to say:

Query. Where (xxx...xxx). Skip (pageindex*pagesize). Take (PageSize)

Find the data, there is really skip function, but also the limit function See Resources 1, 2, so easy to achieve such a paging query:

Db.test.find ({xxx...xxx}). Sort ({"Amount": 1}). Skip. Limit (10)//Ignore query statements here

Quite efficient, almost a few milliseconds out of the results, and sure enough is the NoSQL efficiency first class. But slowly, I use only 10 of the data, and there is not a lot of data. I added the data to 100000, the efficiency is about 20ms. If it is so simple to study the end of the words, it is really too disappointing the program ape to delve into the spirit. SQL paging scheme, the scheme can have a lot of, efficiency is also different, that MONGO this kind of, the answer is obviously not such. Is there a problem with performance in addition to efficiency? In the Redis chapter, I have eaten such a loss and used the keys in disorder.

After reviewing some of the information, we found that all the information was said:

Do not easily use skip to do queries, otherwise the large amount of data will lead to a sharp decline in performance, this is because skip is a number of one, more natural slow.

So skip is going to be avoided, so how do you avoid it? First of all, this paper reviews the post-timestamp paging scheme of SQL paging, which uses the ordered nature of the fields, and uses the query to fetch the data, which can avoid the large number of numbers directly. In other words, if it comes with such a condition, then the efficiency of the query will improve. Let's check it out:

Here we assume the query 100,001th data, the amount value of this data is: 2399927, we write two statements are as follows:

Db.test.sort ({"Amount": 1}). Skip (100000). Limit ()  //183msdb.test.find ({amount:{$gt: 2399927}}). sort ({" Amount ": 1}". Limit (Ten)  //53ms

The results have been attached to the comments, it is obvious that the latter performance is the former One-third, the gap is very large. The theory of skip efficiency is also proved.

C # implementation

The statement and efficiency of MONGODB paging have been discussed above, so we can implement the C # driver version.

The official Bson driver is used in this article, see Reference 4. The MONGO driver comes with another way, a native query like ADO, and LINQ, where both of us implement

Scenario One: Native query implementation of conditional queries

var query = Query<test>. GT (item = Item. Amount, 2399927);                
var result = collection. Find (query). Setlimit (100). SetSortOrder (sortby.ascending ("Amount")). ToList ();
Console.WriteLine (Result. First (). ToJson ());//bson comes with ToJson

Scenario Two: Skip native query implementation

var result = collection. FindAll (). Setskip (100000). Setlimit (100). SetSortOrder (sortby.ascending ("Amount"));
Console.WriteLine (Result. ToList (). First (). ToJson ());

Scenario Three: Linq conditional query

var result = collection. Asqueryable<test> (). An item by item. Amount). Where (item = Item. Amount > 2399927). Take (100);
Console.WriteLine (Result. First (). ToJson ());

Scenario Four: Linq Skip version

var result = collection. Asqueryable<test> (). An item by item. Amount). Skip (100000). Take (100);
Console.WriteLine (Result. First (). ToJson ());
Performance Comparison Reference

The test code here later I upload, the concrete implementation is to use the old Zhao (my idol ah ~) Codetimer to calculate performance. In addition, I ran the code with the Testdriven plugin to run.

Scenario One: Pagination gt-limit{"_id": ObjectId ("5472e383fc46de17c45d4682"), "cust_id": "A12399997", "Amount": 2399928, "stat US ":" B "}
Time Elapsed: 1,322MSCPU Cycles: 4,442,427,252gen 0: 0Gen 1: 0Gen 2: 0
Scenario Two: Pagination skip-limit{"_id": ObjectId ("5472e383fc46de17c45d4682"), "cust_id": "A12399997", "Amount": 2399928, "St ATUs ":" B "}
Time Elapsed: 95msCPU Cycles: 18,280,728gen 0: 0Gen 1: 0Gen 2: 0
Scenario Three: Paginatilinq on Linq where{"_id": ObjectId ("5472e383fc46de17c45d4682"), "cust_id": "A12399997", "Amount": 2399928, "Status": "B"}

Time Elapsed: 76ms CPU cycles:268,734,988 Gen 0:0 Gen 1:0 Gen 2:0

Scenario Four: Pagination Linq skip{"_id": ObjectId ("5472e383fc46de17c45d4682"), "cust_id": "A12399997", "Amount": 2399928, "stat US ":" B "}
Time Elapsed: 97msCPU Cycles: 30,834,648gen 0: 0Gen 1: 0Gen 2: 0

This is too different from the theory, the first why and behind the gap so big? At first I thought it was C # MONGO drive problem, try to change the driver is similar. These days I was looking at "MongoDB in action" when I found the article mentioned:

MongoDB loads the index and metadata of the document into memory based on the query, and suggests that the size of the document metadata is always kept smaller than the machine memory, or performance degrades.

After noticing the above theory, I replaced my test plan, the first time to perform the exclusion, and then compare, found that the results are normal.

Revised results of programme I:

Pagination gt-limit{"_id": ObjectId ("5472e383fc46de17c45d4682"), "cust_id": "A12399997", "Amount": 2399928, "status": "B"}
Time Elapsed: 18msCPU Cycles: 54,753,796gen 0: 0Gen 1: 0Gen 2: 0

Summarize

This article, based on the Skip page and the Ordered field query paging two different scenarios for comparison. The latter simply uses the query results to improve performance without having to count the numbers in turn. Skip although less efficient but general-purpose, ordered fields of the query, you need to design the page when the field to do some processing, at least the page number can be obtained to this field. Here I attach a way, is the combination of the two, we can take the last page of the data on each display, combined with skip to deal with paging, so it is relatively better. This is not a concrete implementation.

The method and performance of MongoDB paging query

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.