MongoDB Sharing at the 2015 China Database Conference
This session was given by a technical consultant from MongoDB Greater China. He opened by asking the audience what "Mongo" means, offering a prize for the right guess, and joked that he is the best MongoDB expert (in the kitesurfing circle, at least). Most people assume the name is related to "mango", but "mongo" actually comes from "humongous", meaning "huge".
This article focuses on MongoDB 3.0 and the reasons behind its performance improvements. First, a recap of MongoDB's main features:

(1) High availability via automatic replication. MongoDB supports three deployment modes: standalone, master-slave, and replica set. Replica sets allow read/write splitting and keep the cluster replicating automatically, providing automatic failover and high availability.

(2) Secondary indexes and dynamic queries. MongoDB's secondary indexes are implemented as B-trees, the default index structure of most relational databases. As for dynamic queries, MongoDB supports a rich set of query expressions; queries use JSON-style documents, which makes it easy to reach into embedded objects and multi-dimensional arrays.
(3) Aggregation framework. MongoDB's aggregation framework is a set of special operators acting on a collection; it can replace MapReduce for common aggregation tasks.

(4) Enterprise-grade security.

(5) Geospatial indexes. LBS-style projects typically store longitude/latitude coordinates for each location; to query nearby sites efficiently you need an index, and MongoDB provides geospatial indexes for exactly this kind of query.

(6) GridFS, MongoDB's file storage system. Its performance is lower than that of a traditional file system, so the recommendation is to store the files themselves on a file system and keep only the file metadata in MongoDB.

(7) Auto-sharding and horizontal scaling. Auto-sharding supports horizontally scaled database clusters and lets you add machines dynamically.

(8) Document model: object information is stored as documents. (In my view, this is also why MongoDB cannot do join queries.)
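The dynamic-query feature above (JSON-style expressions that reach into embedded objects) can be illustrated with a small Python sketch. The dot-notation matcher below is a toy stand-in for what the server does internally, not MongoDB's actual query engine:

```python
# A toy matcher illustrating MongoDB-style dot-notation queries over
# documents with embedded objects. Simplified illustration only, not
# MongoDB's real query engine.

def get_path(doc, path):
    """Walk a dotted path like 'address.city' into nested dicts."""
    for key in path.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc

def matches(doc, query):
    """True if every 'field.path': value pair in the query matches the doc."""
    return all(get_path(doc, path) == value for path, value in query.items())

user = {"name": "Ada", "address": {"city": "Beijing", "zip": "100000"}}

print(matches(user, {"address.city": "Beijing"}))   # True
print(matches(user, {"address.city": "Shanghai"}))  # False
```

In real MongoDB the same shape of query (`{"address.city": "Beijing"}`) is passed directly to `find()`, and a secondary index on `address.city` lets the server answer it without scanning every document.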
MongoDB 2.8 jumped straight to version 3.0. Why rename 2.8 to 3.0? Tang Jianfa explained that the release had been enhanced so substantially that calling it 2.8 would not do it justice, so MongoDB's marketing department named it 3.0. The release introduces the WiredTiger storage engine (alongside the existing MMAPv1), which supports latch-free, non-blocking algorithms; to use it, the storage-engine parameter must be specified at startup. MongoDB 3.0's headline improvements: (1) write performance: 7x-10x, i.e. write speed improved 7 to 10 times; (2) data compression: 30%-80%, i.e. data compressed to 30%-80% of its original size; (3) operations: 95%, i.e. operations cost reduced by 95%, mainly for clusters. Reportedly, with the WiredTiger storage engine, MongoDB 3.0 gains document-level concurrency control, so even under write-heavy workloads the database maintains stable, predictable performance.
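Since WiredTiger is opt-in in 3.0 (MMAPv1 remained the default), the storage engine has to be requested explicitly when starting `mongod`. A minimal invocation, assuming a 3.0 binary and an existing data directory at `/data/db`:

```shell
# Start mongod 3.0 with the WiredTiger storage engine.
# MMAPv1 was still the default in 3.0, so WiredTiger must be
# selected explicitly via --storageEngine.
mongod --storageEngine wiredTiger --dbpath /data/db
```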
Alongside the 3.0 release, MongoDB also published an official performance test report, said to be the first official benchmark report in MongoDB's history. Concurrency:
In tests with YCSB (a NoSQL benchmarking tool), MongoDB 3.0 improved roughly sevenfold over MongoDB 2.6 in the multi-threaded, batch-insert scenario. The second test compared a 95% read / 5% update workload on the two versions: WiredTiger delivers more than 4x the throughput. The improvement is less dramatic than in the pure-insert scenario because writes make up only 5% of all operations. Finally, in the balanced read/write scenario, MongoDB 3.0 shows a 6x throughput gain, better than the 4x of the 95%-read case because there are more write operations here.
Response time:
The report also compares 95th and 99th percentile update latencies under a read-intensive workload. In MongoDB 3.0, update latency improves significantly: both the 95th and 99th percentile latencies drop by almost 90%.
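A 95th (or 99th) percentile latency can be read as "the latency below which 95% (or 99%) of requests complete". As a reference point, here is a minimal Python sketch of computing percentiles from raw latency samples using the nearest-rank method (an assumption for illustration; YCSB's exact calculation may differ):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value that is >= p% of samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

# 100 synthetic latencies in milliseconds: 1, 2, ..., 100
latencies = list(range(1, 101))
print(percentile(latencies, 95))  # 95
print(percentile(latencies, 99))  # 99
```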
MongoDB 2.6 and 3.0 differ in their concurrency control. Version 2.6 uses pessimistic, database-level locking (strictly speaking, not a lock in the usual heavyweight sense; the speaker suggested 2.6's database-level lock is better understood as a latch, a lightweight lock taken while data is being modified). Version 3.0 uses optimistic locking (MVCC). What are pessimistic and optimistic locks? A pessimistic lock assumes a high probability that other users will try to access or modify the object you are working on. In a pessimistic scheme, you therefore lock the object before changing it and release the lock only after committing the change. Locks are held for a long time, which can block other users for extended periods; in other words, pessimistic locking has poor concurrent-access performance. This is a key reason 2.6 performs worse: with database-level locks, reads and writes are mutually exclusive, which drags down the overall performance of the database.
Version 3.0 adopts optimistic locking: an optimistic lock assumes the chance of other users changing the object you are modifying is small, so it does not lock the object until you are ready to commit the change. Optimistic locks are therefore held for much less time than pessimistic locks, and can deliver better concurrent-access performance even at a larger lock granularity. This is one of the reasons for 3.0's improved read/write performance and concurrency.
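The pessimistic/optimistic distinction can be sketched in a few lines of Python. The version-number retry loop below is a generic illustration of optimistic concurrency control, not MongoDB's internal MVCC implementation:

```python
import threading

# Pessimistic: take the lock before touching the data and hold it
# through the whole modification. Everyone else must wait.
pess_lock = threading.Lock()
pess_doc = {"counter": 0}

def pessimistic_increment():
    with pess_lock:                       # lock held for the full update
        pess_doc["counter"] += 1

# Optimistic: read freely, then validate at commit time that nobody
# else changed the document (via a version number); retry on conflict.
opt_doc = {"counter": 0, "version": 0}
commit_lock = threading.Lock()            # protects only the brief commit step

def optimistic_increment():
    while True:
        snapshot = dict(opt_doc)          # read without blocking writers
        new_value = snapshot["counter"] + 1
        with commit_lock:                 # short critical section
            if opt_doc["version"] == snapshot["version"]:
                opt_doc["counter"] = new_value
                opt_doc["version"] += 1
                return                    # commit succeeded
        # someone else committed first: loop and retry with fresh data

threads = [threading.Thread(target=optimistic_increment) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(opt_doc["counter"])  # 50
```

Note where the lock is held in each case: the pessimistic path locks for the entire read-modify-write, while the optimistic path locks only for the version check and commit, retrying when a conflict is detected.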
Of course, some caveats apply: (1) no 32-bit support; (2) data files are incompatible with 2.6; (3) by default the journal is not flushed to disk immediately, so if the system crashes, up to 100 MB of journal data may be lost; (4) stalls can still occur, especially when I/O resources are insufficient; (5) performance on Windows lags behind Linux.
Finally, some MongoDB community resources. Chinese community: http://www.mongoing.com; online courses: http://university.mongodb.com