I've been a bit busy lately and there is a lot to write up. The Redis series really ought to have parts four, five, and six... but I haven't finished reading "Redis in Action" yet, so I'll summarize once I'm done; otherwise the posts would be too thin, and that wouldn't be fair to readers.
The Redis posts were something of an introduction to NoSQL-style products; this time we turn in a different direction and look at a true NoSQL database: MongoDB. Getting into MongoDB turned my view of data management and program design upside down. How so? It is a great fit for object-oriented program design, but for someone like me whose data-driven, table-driven thinking is deeply ingrained, it takes real effort to keep up with MongoDB's way of doing things. I don't mean calling an API or writing a few queries; I mean program design and domain modeling: if you model in OO terms, how do you make the design suit display, querying, aggregation, and so on. In short, with MongoDB the important thing is the design of the program itself; you can almost ignore how Mongo stores things, because MongoDB really is that convenient.
Enough rambling. For introductory material, installation, and so on, please scroll to the end of the article, where I've attached some references; I'll share more later if needed. This article focuses on paged queries in MongoDB. Why? Because paging is a common killer: get it wrong and the customers complain, and so does the manager.
Traditional SQL Paging
In traditional SQL paging, nearly every solution comes down to ROW_NUMBER; for requirements with all kinds of sorting and complex query conditions, ROW_NUMBER is the workhorse. In addition, for the pull/push "load more" style of paging that is so popular on the web today, a timestamp is generally used to implement the paging. Of the two, the former is universal (even the paging that LINQ generates uses ROW_NUMBER, which shows how general it is), while the latter wins on both performance and simplicity, as long as a simple timestamp is available.
MongoDB Paging
With the Mongo mindset, paging is actually not difficult at all. What's hard about it? Nothing, really; once you see it, the idea is the same as SQL paging.
First, the use case for this article. I imported data for the following entity into the database, where cust_id and amount are generated as ordered numbers, and the total number of records is 2,000,000:
```csharp
public class Test
{
    /// <summary>
    /// Primary key. ObjectId is the primary key type that comes with MongoDB.
    /// </summary>
    public ObjectId Id { get; set; }

    /// <summary>
    /// Customer number
    /// </summary>
    [BsonElement("cust_id")]
    public string CustomerId { get; set; }

    /// <summary>
    /// Amount
    /// </summary>
    [BsonElement("Amount")]
    public int Amount { get; set; }

    /// <summary>
    /// Status
    /// </summary>
    [BsonElement("Status")]
    public string Status { get; set; }
}
```
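For completeness, here is a minimal sketch of how such a data set could be generated and bulk-inserted with the legacy C# driver. The connection string, database and collection names, and the exact starting values for cust_id and amount are illustrative assumptions, not the article's original import script:

```csharp
using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

class Seeder
{
    static void Main()
    {
        // Assumed connection details; adjust to your own environment.
        var client = new MongoClient("mongodb://localhost:27017");
        var collection = client.GetServer()
                               .GetDatabase("testdb")
                               .GetCollection<Test>("test");

        const int total = 2000000;   // 2 million records, as in the article
        const int batchSize = 10000;
        var random = new Random();

        for (var start = 0; start < total; start += batchSize)
        {
            // cust_id and amount are generated as ordered numbers (values illustrative).
            var batch = Enumerable.Range(start, batchSize).Select(i => new Test
            {
                CustomerId = "A" + (12300000 + i),
                Amount = 2300000 + i,
                Status = random.Next(2) == 0 ? "A" : "B"
            });
            collection.InsertBatch(batch);   // batch insert keeps round-trips down
        }
    }
}
```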
The following operations are performed in a MongoDB GUI tool (see Reference 3).
First, look at what a paging requirement needs. The usual parameters are:
- `pageIndex`: the current page
- `pageSize`: the number of records per page
- `queryParam[]`: other query fields
So, following ROW_NUMBER's paging idea, that is, taking the records from pageIndex*pageSize to pageIndex*pageSize + pageSize, in LINQ we would say:

```csharp
query.Where(xxx...xxx).Skip(pageIndex * pageSize).Take(pageSize)
```
Checking the documentation, MongoDB really does have a skip function as well as a limit function (see References 1 and 2), so this kind of paging query is easy to write:

```
db.test.find({xxx...xxx}).sort({"amount": 1}).skip(pageIndex * pageSize).limit(pageSize)   // the actual query condition is omitted here
```
Quite efficient: the result comes back in a few milliseconds. NoSQL efficiency is first-class, as expected. But wait: at that point I only had 10 records, hardly a lot of data. After growing the collection to 100,000 records the query still took only about 20ms. If the investigation ended with something this simple, it would hardly live up to a programmer's urge to dig deeper. For SQL paging there are many schemes with very different efficiency; is Mongo's story really this simple? Obviously not. And apart from raw speed, are there hidden performance problems? In the Redis article I had already paid for a lesson like that by carelessly using KEYS.
After going through some material, I found it all said the same thing:
Do not casually use skip for queries; with large data sets it causes a sharp drop in performance, because skip counts its way through the documents one by one, which is naturally slow.
So skip should be avoided. But how? Recall the timestamp-based scheme from the SQL paging discussion: it exploits the ordered nature of a field and uses a range query to fetch the page, which avoids counting through a large number of records. In other words, if the query carries such a condition, its efficiency improves. Let's verify that.
Suppose we want to query starting from the 100,001st record, and the amount value at that boundary (the 100,000th record) is 2399927. The two statements are:
Db.test.sort ({"amount":1}). Skip (100000). Limit (ten) //183msdb.test.find ({amount:{$gt:2399927}}). sort ({"amount": 1}). Limit (ten) //53ms
The results have been attached to the comments, it is obvious that the latter performance is the former One-third, the gap is very large. The theory of skip efficiency is also proved.
C# Implementation
The MongoDB statements and their efficiency have been covered above, so now let's implement the same thing with the C# driver.
This article uses the official BSON/C# driver (see Reference 4). The driver offers two query styles: a native query API, somewhat like ADO.NET, and LINQ. We implement both below.
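All four scenarios below assume a `collection` variable pointing at the test collection. With the legacy driver it might be obtained roughly like this (server address, database and collection names are assumptions):

```csharp
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;   // Query, SortBy
using MongoDB.Driver.Linq;       // AsQueryable

var client = new MongoClient("mongodb://localhost:27017");
var collection = client.GetServer()
                       .GetDatabase("testdb")
                       .GetCollection<Test>("test");
```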
Scenario One: native query, conditional ($gt) version

```csharp
var query = Query.GT("amount", 2399927);
var result = collection.Find(query)
                       .SetLimit(10)
                       .SetSortOrder(SortBy.Ascending("amount"))
                       .ToList();
Console.WriteLine(result.First().ToJson());   // ToJson comes with the BSON library
```
Scenario Two: native query, skip version

```csharp
var result = collection.FindAll()
                       .SetSkip(100000)
                       .SetLimit(10)
                       .SetSortOrder(SortBy.Ascending("amount"));
Console.WriteLine(result.ToList().First().ToJson());
```
Scenario Three: LINQ, conditional version

```csharp
var result = collection.AsQueryable<Test>()
                       .OrderBy(item => item.Amount)
                       .Where(item => item.Amount > 2399927)
                       .Take(10);
Console.WriteLine(result.First().ToJson());
```
Scenario Four: LINQ, skip version

```csharp
var result = collection.AsQueryable<Test>()
                       .OrderBy(item => item.Amount)
                       .Skip(100000)
                       .Take(10);
Console.WriteLine(result.First().ToJson());
```
Performance Comparison
I'll upload the test code later. The measurements use Lao Zhao's (my idol~) CodeTimer to time performance, and I ran the code via the TestDriven.Net plugin.
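As a rough idea of what the measurement loop looks like (a sketch assuming the commonly published CodeTimer API, `CodeTimer.Initialize()` and `CodeTimer.Time(name, iterations, action)`; only two of the four scenarios are shown):

```csharp
CodeTimer.Initialize();   // raises process/thread priority and warms up the timer itself

CodeTimer.Time("pagination gt-limit", 1, () =>
{
    var query = Query.GT("amount", 2399927);
    var result = collection.Find(query)
                           .SetLimit(10)
                           .SetSortOrder(SortBy.Ascending("amount"))
                           .ToList();
    Console.WriteLine(result.First().ToJson());
});

CodeTimer.Time("pagination skip-limit", 1, () =>
{
    var result = collection.FindAll()
                           .SetSkip(100000)
                           .SetLimit(10)
                           .SetSortOrder(SortBy.Ascending("amount"));
    Console.WriteLine(result.ToList().First().ToJson());
});
```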
Scenario One: pagination gt-limit

```
{ "_id" : ObjectId("5472e383fc46de17c45d4682"), "cust_id" : "A12399997", "Amount" : 2399928, "Status" : "B" }
Time Elapsed:  1,322ms
CPU Cycles:    4,442,427,252
Gen 0:         0
Gen 1:         0
Gen 2:         0
```
Scenario Two: pagination skip-limit

```
{ "_id" : ObjectId("5472e383fc46de17c45d4682"), "cust_id" : "A12399997", "Amount" : 2399928, "Status" : "B" }
Time Elapsed:  95ms
CPU Cycles:    …,280,728
Gen 0:         0
Gen 1:         0
Gen 2:         0
```
Scenario Three: pagination LINQ where

```
{ "_id" : ObjectId("5472e383fc46de17c45d4682"), "cust_id" : "A12399997", "Amount" : 2399928, "Status" : "B" }
Time Elapsed:  76ms
CPU Cycles:    268,734,988
Gen 0:         0
Gen 1:         0
Gen 2:         0
```
Scenario Four: pagination LINQ skip

```
{ "_id" : ObjectId("5472e383fc46de17c45d4682"), "cust_id" : "A12399997", "Amount" : 2399928, "Status" : "B" }
Time Elapsed:  97ms
CPU Cycles:    …,834,648
Gen 0:         0
Gen 1:         0
Gen 2:         0
```
The results nearly made my glasses drop: they differ wildly from the theory. Why is the first scenario so far behind the others? At first I suspected a problem with the C# MongoDB driver, but switching drivers gave similar results. Then, while reading "MongoDB in Action" these days, I came across this point:
MongoDB loads a document's indexes and metadata into memory as queries demand them, and it is recommended to keep the size of that metadata below the machine's memory, otherwise performance degrades.
With that theory in mind, I changed my test plan: each query is executed once first as a warm-up and that first run is excluded, then the runs are compared. The results then looked normal.
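In practice the warm-up is nothing fancier than running the same query once and throwing the result away before the timed run; a sketch:

```csharp
// Warm-up: execute the query once so MongoDB pulls the relevant index and
// document metadata into memory. This first run is excluded from the timings.
collection.Find(Query.GT("amount", 2399927))
          .SetLimit(10)
          .SetSortOrder(SortBy.Ascending("amount"))
          .ToList();

// ...then measure the real runs with CodeTimer as above.
```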
Revised result for Scenario One:

Pagination gt-limit

```
{ "_id" : ObjectId("5472e383fc46de17c45d4682"), "cust_id" : "A12399997", "Amount" : 2399928, "Status" : "B" }
Time Elapsed:  18ms
CPU Cycles:    …,753,796
Gen 0:         0
Gen 1:         0
Gen 2:         0
```
Summary
This article compared two different paging approaches: skip-based paging and paging by querying an ordered field. The latter simply uses the query itself to locate the page, with no need to count through records one by one, which improves performance. Skip is less efficient but general-purpose; paging on an ordered field requires some design work on that field when building the page, at minimum so that the boundary value for the page can be obtained. One practical compromise is to combine the two: each time a page is displayed, remember the last record's value and use it together with a small skip to handle the paging, which works reasonably well. I won't give a full implementation, but a rough sketch of the idea follows. Comparisons and implementations of other approaches are very welcome; please share. And if there are flaws or omissions in this article, please leave a comment and point them out.
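A rough sketch of that combined idea (the method name, parameters, and the small-jump assumption are mine, purely for illustration, not a finished design):

```csharp
using System.Collections.Generic;
using System.Linq;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class PagingHelper
{
    /// <summary>
    /// Jump forward a few pages from the page currently on screen without a large skip:
    /// filter on the last amount value the user has already seen, then skip only
    /// within the small window between that page and the target page.
    /// </summary>
    public static List<Test> NextPages(MongoCollection<Test> collection,
                                       int lastAmountOnCurrentPage,
                                       int pagesToJump,
                                       int pageSize)
    {
        return collection.Find(Query.GT("amount", lastAmountOnCurrentPage))
                         .SetSortOrder(SortBy.Ascending("amount"))
                         .SetSkip((pagesToJump - 1) * pageSize)   // small, bounded skip
                         .SetLimit(pageSize)
                         .ToList();
    }
}
```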
Almost forgot the small advertisement: our company is hiring! Details in my blog's subtitle!
Resources
1. MongoDB skip: http://docs.mongodb.org/manual/reference/operator/aggregation/skip/
2. MongoDB limit: http://docs.mongodb.org/manual/reference/operator/aggregation/limit/
3. MongoVUE, a Windows GUI management tool for MongoDB (paid editions available): http://www.mongovue.com/
4. Official C# driver: http://docs.mongodb.org/manual/applications/drivers/