The basic unit of data in MongoDB is called a document. The document is the core concept of MongoDB, where multiple keys with extremely associated values are placed together in an orderly manner as documents.
Within a particular collection, a unique identity document is required. Therefore, the documents stored in MongoDB are made up of a "_id" key to complete this function. The value of this key can be any type, the default test Objectid object. The idea of generating Objectid object is the subject of this article, and it is also a thought that many distributed system can draw lessons from.
In order to consider the distribution, "_id" requires different machines to be able to easily generate it with the same global and unique method. Therefore, it is not possible to use the self-increment primary key (which requires multiple servers for synchronization, which is time consuming and laborious), so the method of generating the Objectid object is chosen. ( similar to the GUID generation mechanism )
Objectid uses 12 bytes of storage space, which is generated in the following way:
|0|1|2|3|4|5|6 |7|8| 9|10|11|
| time Stamp | machine id| pid| Counters |
The first four bytes timestamp is the timestamp starting from the standard era, in seconds, with the following characteristics:
- The timestamp is 5 bytes behind, guaranteeing the uniqueness of the second level;
- Ensure that the insertion order is roughly sorted by time;
- The creation time of the document is implied;
The actual value of the timestamp is not important, and the time between the servers does not need to be synchronized (because the machine ID and process ID are guaranteed to be unique, and uniqueness is the final claim of Objectid).
The machine ID is the server host identity, which is usually the hash value of the machine hostname.
Multiple Mongod instances can be run on the same machine, so you also need to join the process identifier PID.
The first 9 bytes guarantee the uniqueness of the objectid generated by different processes of different machines in the same second. The last three bytes are an auto-incremented counter (a Mongod process requires a global counter), guaranteeing that the same second objectid is unique. The same second allows each process to have (256^3 = 16777216) a different objectid.
Summary: The time stamp guarantees the second level unique, the machine ID guarantees the design to consider the distribution, avoids the clock synchronization, the PID guarantees the same server to run multiple Mongod instances the uniqueness, the last counter guarantees the same second the uniqueness (chooses several bytes to consider the storage economy, Also consider the upper limit of concurrency performance).
"_id" can be generated either on the server side or on the client, which can reduce the pressure on the server side. If it is running on the server, it is recommended that the server script be generated, reduce the database pressure, if it is the C/s mode, it is generated by the client.
The _id in Mongdb