MongoDB Natural Sort
Terminology
Natural order
The order in which the database refers to documents on disk. This is the default sort order.
ObjectId
A special 12-byte BSON type that guarantees uniqueness within a collection. An ObjectId is generated from a timestamp, a machine ID, a process ID, and a process-local incrementing counter. MongoDB uses an ObjectId as the default value of the _id key.
_id and ObjectId
When a collection is created, MongoDB creates a unique index on the _id key. This index prevents the insertion of two documents with the same _id value. You cannot drop the index on the _id key.
Every document in a collection must have an _id key. Its value can be of any type; the default is an ObjectId. Within a collection, each document must have a unique "_id" value, which ensures that every document in the collection can be uniquely identified. With two collections, each may contain a document whose "_id" is 123, but neither collection may contain more than one document whose "_id" is 123.
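The per-collection uniqueness rule can be illustrated with a toy in-memory model. This is only a sketch: ToyCollection is a hypothetical stand-in written for this example, not a MongoDB or driver class, and a plain dictionary plays the role of the unique index on _id.

```python
class ToyCollection:
    """In-memory stand-in for a collection with a unique index on _id."""

    def __init__(self):
        self._docs = {}  # maps _id -> document, so each _id can appear only once

    def insert(self, doc):
        key = doc["_id"]
        if key in self._docs:
            raise KeyError(f"duplicate _id: {key!r}")
        self._docs[key] = doc

    def count(self):
        return len(self._docs)


users = ToyCollection()
orders = ToyCollection()

users.insert({"_id": 123, "name": "a"})
orders.insert({"_id": 123, "total": 10})  # fine: a different collection

try:
    users.insert({"_id": 123, "name": "b"})  # same collection, same _id
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```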
1. ObjectId
ObjectId is the default type for "_id". It is designed to be lightweight while still being easy to generate in a globally unique way across different machines. This is the main reason MongoDB uses ObjectId rather than something more traditional, such as an auto-incrementing primary key: synchronizing auto-incrementing primary keys across multiple servers is difficult and time-consuming. MongoDB was designed from the outset to be a distributed database, so handling many nodes is a core requirement. As described below, the ObjectId type is easy to generate in a sharded environment.
An ObjectId uses 12 bytes of storage, which gives a 24-character string: two hexadecimal digits per byte. Because it looks so long, many people find it awkward to work with. The key point is that this long string is twice as long as the data actually stored.
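The relationship between the 12 stored bytes and the 24-character string can be checked with a few lines of Python. The bytes here are random placeholders, not a real ObjectId; only the lengths matter.

```python
import os

raw = os.urandom(12)        # 12 arbitrary bytes standing in for an ObjectId
text = raw.hex()            # two hexadecimal digits per byte

print(len(raw), len(text))  # 12 24
assert bytes.fromhex(text) == raw  # the string form converts back to the same bytes
```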
If you create several ObjectIds in quick succession, you will find that only the last few digits change each time. A couple of digits in the middle also change if you pause for a few seconds between creating them. This follows from the way ObjectIds are created. The 12 bytes are generated as follows:
0|1|2|3   | 4|5|6   | 7|8 | 9|10|11
Timestamp | Machine | PID | Counter
The first 4 bytes are a timestamp in seconds since the Unix epoch. This gives ObjectIds some useful properties.
The timestamp, combined with the next 5 bytes, provides uniqueness at a granularity of one second.
Because the timestamp comes first, ObjectIds sort in roughly insertion order. This is useful for things such as using them as an efficient index, but it is not guaranteed, only "approximate".
These 4 bytes also record when the document was created. Most drivers expose a method for getting this information from an ObjectId.
Because the current time is used, many users worry that their servers need synchronized clocks. This is not necessary: the actual value of the timestamp does not matter, as long as it keeps increasing (about once per second).
The next 3 bytes are a unique identifier for the host on which the ObjectId is generated, usually a hash of the machine's hostname. This ensures that different hosts generate different ObjectIds and do not conflict.
To ensure that ObjectIds generated concurrently by multiple processes on the same machine are unique, the next 2 bytes are taken from the process identifier (PID) of the generating process.
The last 3 bytes are an automatically incremented counter, which ensures that ObjectIds generated by the same process within the same second differ. This allows up to 256^3 (16,777,216) different ObjectIds per process in a single second.
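The byte layout described above can be sketched in plain Python. This is an illustration under stated assumptions, not the generator any driver ships: make_object_id and split_object_id are hypothetical names, the machine bytes are assumed to be the first 3 bytes of a SHA-256 hash of the hostname, and the PID is truncated to 16 bits.

```python
import hashlib
import itertools
import os
import socket
import struct
import time

# Process-local counter, started at a random 3-byte value (an assumption).
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))


def make_object_id(now=None):
    """Build 12 bytes: timestamp (4) | machine (3) | PID (2) | counter (3)."""
    seconds = int(time.time() if now is None else now)
    timestamp = struct.pack(">I", seconds)                            # bytes 0-3
    hostname = socket.gethostname().encode()
    machine = hashlib.sha256(hostname).digest()[:3]                   # bytes 4-6
    pid = struct.pack(">H", os.getpid() & 0xFFFF)                     # bytes 7-8
    counter = (next(_counter) % 0x1000000).to_bytes(3, "big")         # bytes 9-11
    return timestamp + machine + pid + counter


def split_object_id(oid):
    """Return (timestamp_seconds, machine_bytes, pid, counter) from 12 bytes."""
    (seconds,) = struct.unpack(">I", oid[:4])
    (pid,) = struct.unpack(">H", oid[7:9])
    return seconds, oid[4:7], pid, int.from_bytes(oid[9:12], "big")


# Two ids created in the same second by the same process.
a = make_object_id(now=1_500_000_000)
b = make_object_id(now=1_500_000_000)

print(a.hex())
print(b.hex())
```

The two printed strings share their first 18 hexadecimal digits (timestamp, machine, and PID) and differ only in the counter at the end, which matches the observation that only the last few digits change when ObjectIds are created in quick succession.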
2. Automatically generate _id
As mentioned earlier, if a document has no "_id" key when it is inserted, one is automatically created for it. The MongoDB server can do this, but it is usually done by the driver on the client side, for the following reasons.
Although ObjectIds are designed to be lightweight and easy to generate, generating them still carries some overhead. Generating them on the client reflects MongoDB's design philosophy: wherever possible, work should be moved from the server to the driver. The reasoning is that even for a scalable database like MongoDB, scaling out at the application layer is much easier than scaling out at the database layer. Handing work to the client reduces the burden of scaling the database.
Generating ObjectIds on the client also lets drivers provide richer APIs. For example, a driver's insert method can return the generated ObjectId or add it directly to the inserted document. If the driver let the server generate ObjectIds, a separate query would be needed to find out the "_id" value of an inserted document.
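The benefit of client-side generation can be shown with a small mock. ToyDriver is a hypothetical class written for this example, not a real driver API, and the id it assigns is a random 12-byte placeholder rather than a true ObjectId.

```python
import os


class ToyDriver:
    """Hypothetical client that assigns _id before 'sending' a document."""

    def __init__(self):
        self.sent = []  # stands in for documents sent to the server

    def insert(self, doc):
        if "_id" not in doc:
            # Placeholder id: 12 random bytes as hex, not a real ObjectId.
            doc["_id"] = os.urandom(12).hex()
        self.sent.append(doc)
        return doc["_id"]  # known immediately, no follow-up query needed


driver = ToyDriver()
doc = {"name": "example"}
new_id = driver.insert(doc)

print(new_id == doc["_id"])  # True
```

Because the id exists before the document leaves the client, the caller can use it right away, for example to reference the new document from another one.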
Natural Sort
1. _id and $natural
Sorting by _id can therefore be understood as roughly equivalent to sorting by insertion time, because the value is generated by the client driver and embeds the current timestamp. Sorting by $natural is equivalent to sorting by the order in which the data is laid out on disk.
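The difference between the two orderings can be sketched without a database. In this toy example, fake_id is a hypothetical helper that builds a 24-character id with the timestamp first (the middle bytes are zeroed for brevity), and the on_disk list is an assumed storage layout chosen by hand.

```python
import struct


def fake_id(seconds, counter):
    """Hex id with a 4-byte timestamp first; machine and PID bytes are zeroed."""
    return (struct.pack(">I", seconds) + bytes(5) + counter.to_bytes(3, "big")).hex()


# Five documents created one second apart; "n" records the insertion order.
inserted = [{"_id": fake_id(1_600_000_000 + i, i), "n": i} for i in range(5)]

# Assumed layout on disk after some documents were moved around.
on_disk = [inserted[2], inserted[0], inserted[4], inserted[1], inserted[3]]

natural = list(on_disk)                          # like sorting by $natural
by_id = sorted(on_disk, key=lambda d: d["_id"])  # like sorting by _id

print([d["n"] for d in natural])  # [2, 0, 4, 1, 3]
print([d["n"] for d in by_id])    # [0, 1, 2, 3, 4]
```

Sorting the hex strings works here because they all have the same length and start with a big-endian timestamp, so string order follows creation time.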
2. Index usage
Queries that sort by $natural do not use indexes to satisfy the query predicate, with one exception: if the query predicate is an equality condition on the _id key, a query sorted by $natural can use the _id index.
3. Natural sort internals
By default, the order in which MongoDB returns query results is undefined. If there are no query criteria, natural order is used. Results are returned in the order in which they are found, which may match insertion order (but this is not guaranteed) or the order of the index used.
Some situations that affect the storage (natural) order:
- If a document is updated and no longer fits in the space currently allocated to it, it is moved
- New documents may be inserted into free space left by deleted or moved documents
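The second situation can be illustrated with a toy storage model. ToyStorage is a hypothetical class that reuses freed slots; it is a deliberately simplified picture and does not reflect how any MongoDB storage engine is implemented.

```python
class ToyStorage:
    """Slots on 'disk'; a freed slot is reused by the next insert."""

    def __init__(self):
        self.slots = []  # position in this list is position on disk
        self.free = []   # indexes of slots emptied by deletes

    def insert(self, doc):
        if self.free:
            self.slots[self.free.pop(0)] = doc  # fill the earliest freed slot
        else:
            self.slots.append(doc)              # otherwise append at the end

    def delete(self, name):
        for i, doc in enumerate(self.slots):
            if doc is not None and doc["name"] == name:
                self.slots[i] = None
                self.free.append(i)
                return

    def natural_order(self):
        return [doc["name"] for doc in self.slots if doc is not None]


store = ToyStorage()
for name in ["a", "b", "c"]:
    store.insert({"name": name})
store.delete("a")
store.insert({"name": "d"})   # lands in the slot that "a" vacated

print(store.natural_order())  # ['d', 'b', 'c']
```

Although "d" was inserted last, it comes first in natural order, which is why insertion order cannot be relied on.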
If an index is used, documents are returned in the order in which they are found in the index. If more than one index is used, the order depends on which index identifies each document first during de-duplication.
If you need a specific order, then you must include a sort in your query.
The notable exception is the natural order of capped collections: because documents cannot be moved, they are stored in the order in which they were inserted. This ordering is part of the capped collection feature, and it ensures that the oldest documents are removed first. In addition, documents in a capped collection cannot be deleted or moved.
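The behavior of a capped collection resembles a fixed-size queue, which can be sketched with Python's collections.deque. This is an analogy only: the limit of 3 documents is an assumption for illustration, and a deque is not how MongoDB stores capped collections.

```python
from collections import deque

capped = deque(maxlen=3)  # holds at most 3 documents

for n in range(1, 6):
    capped.append({"n": n})  # once full, each append evicts the oldest entry

print([doc["n"] for doc in capped])  # [3, 4, 5]
```

The remaining documents are still in insertion order, and the ones that disappeared are the oldest, which is the guarantee that makes natural order meaningful for capped collections.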
Situations that affect the natural order, in summary:
If documents are moved or deleted, you may get results in a different order. If no documents are inserted, updated, or deleted, you will get the same results each time. Adding an index does not affect where documents are located on disk.
Also note that with replication, natural order may differ between members of a replica set.
Resources
$natural
https://docs.mongodb.com/manual/reference/operator/meta/natural/
cursor.sort()
https://docs.mongodb.com/manual/reference/method/cursor.sort/#return-natural-order
Capped Collections
https://docs.mongodb.com/manual/core/capped-collections/
Default _id Index
https://docs.mongodb.com/manual/indexes/#default-id-index
Research on Data deduplication (de-duplication) technology
http://blog.csdn.net/liuaigui/article/details/5829083
This article is from the SQL Server Deep Dive blog. Please retain this source link: http://ultrasql.blog.51cto.com/9591438/1812708