MongoDB 3.0 new features at a glance


Pluggable storage engine API

MongoDB 3.0 introduces a pluggable storage engine API, which makes it easy for third-party storage engine vendors to plug into MongoDB; the change clearly borrows from MySQL's design philosophy. At present, besides the original MMAP storage engine, both WiredTiger and RocksDB have completed support for MongoDB; the former was brought directly into MongoDB 3.0 after MongoDB Inc. acquired the company behind it. The pluggable storage engine API makes it possible for MongoDB to enrich its arsenal and handle more kinds of workloads: in-memory storage engines, transactional storage engines, and perhaps even Hadoop-backed engines may arrive in the future.
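As a sketch of how an engine is selected at startup, here is a minimal mongod.conf excerpt (the data path is a placeholder; `wiredTiger` could be replaced by `mmapv1`, the 3.0 default):

```yaml
# mongod.conf excerpt: pick a storage engine (MongoDB 3.0+)
storage:
  engine: wiredTiger
  dbPath: /data/db
```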


WiredTiger storage engine

If the pluggable storage engine API builds MongoDB 3.0 an arsenal, then WiredTiger is surely the first and most important weapon in it. The natural flaws of the MMAP storage engine (it consumes disk and memory space that is hard to reclaim, and it locks at the database level) have caused database operators great pain; some had even begun migrating to TokuMX, despite the latter's instability at the time. Aware of this problem, MongoDB Inc. made a bold and decisive move: it simply acquired the storage engine vendor WiredTiger and integrated the WiredTiger engine into version 3.0 (available only in 64-bit builds). So what can we expect from this storage engine as it steps into the spotlight?

1. Document-level concurrency control

WiredTiger implements document-level concurrency control through MVCC, i.e., document-level locking. Multiple clients can concurrently update different documents in the same collection, and writes no longer queue on a database-level write lock. This improves read/write performance and greatly increases the system's capacity for concurrent processing. The effect shows up directly in the monitoring tool mongostat: the old version's output includes a "locked db" column (a high value there was one of MongoDB's great pain points), which is gone from the new version's mongostat output.

MongoDB 2.4.12:

$ /home/mongodb/mongodb-linux-x86_64-2.4.12/bin/mongostat --port 55060
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0 18g 18.3g 16.1g 0 ycsb:0.0% 0 0|0 0|0 62b 2k 1 13:04:01
*0 *0 *0 *0 0 1|0 0 18g 18.3g 16.1g 0 ycsb:0.0% 0 0|0 0|0 62b 2k 1 13:04:02
*0 *0 *0 *0 0 1|0 0 18g 18.3g 16.1g 0 ycsb:0.0% 0 0|0 0|0 62b 2k 1 13:04:03

MongoDB 3.0 RC8:

$ /home/mongodb/mongodb-linux-x86_64-3.0.0-rc8/bin/mongostat --port 55050
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0.0 42.2 0 30.6G 30.4G 0|0 0|0 79b 16k 1 13:02:38
*0 *0 *0 *0 0 1|0 0.0 42.2 0 30.6G 30.4G 0|0 0|0 79b 16k 1 13:02:39
*0 *0 *0 *0 0 1|0 0.0 42.2 0 30.6G 30.4G 0|0 0|0 79b 16k 1 13:02:40

2. Disk data compression

WiredTiger supports block compression for all collections and prefix compression for all indexes (and if journaling is enabled, journal files are compressed as well). The supported options are: no compression, snappy compression, and zlib compression, with snappy as the default. This is another boon for the many MongoDB users whose databases have outgrown their disks because the MMAP storage engine consumes so much space. Users can choose the compression method that suits their workload: roughly speaking, snappy compresses quickly with a decent ratio, while zlib achieves a higher ratio at the cost of more CPU and slightly lower speed. Of course, any compression costs some extra CPU, but since MongoDB itself is rarely CPU-bound, enabling compression is entirely worthwhile.
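The options above map onto mongod.conf roughly as follows (the values shown are examples, not recommendations; snappy is the default block compressor):

```yaml
# mongod.conf excerpt: WiredTiger compression settings (MongoDB 3.0+)
storage:
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      journalCompressor: snappy   # none | snappy | zlib
    collectionConfig:
      blockCompressor: zlib       # none | snappy | zlib
    indexConfig:
      prefixCompression: true
```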


In addition, WiredTiger's storage layout is a great improvement. The old MMAP engine allocates files at the database level: all of a database's collections and indexes live in the database's files, so even when a collection or index is deleted, the disk space it occupied is hard to reclaim automatically. WiredTiger allocates files at the collection and index level: each collection and each index is stored in its own file, and when the collection or index is dropped, the corresponding file is deleted. Of course, because the storage format differs, an older database cannot be upgraded to the WiredTiger storage engine in place; the migration can only be done by exporting and re-importing the data.
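A minimal sketch of such an export/import migration, assuming a second mongod instance for WiredTiger (all paths and ports below are placeholders):

```
# 1. Export from the existing mmapv1 instance
mongodump --port 27017 --out /backup/dump

# 2. Start a fresh instance on the WiredTiger engine
mongod --storageEngine wiredTiger --dbpath /data/wt --port 27018

# 3. Import the data into the WiredTiger instance
mongorestore --port 27018 /backup/dump
```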


3. Configurable memory usage limit

WiredTiger supports capping its memory usage: the storage.wiredTiger.engineConfig.cacheSizeGB parameter lets the user control the maximum cache memory MongoDB may use, and it defaults to half the size of physical memory. This is yet another boon for MongoDB users, since the MMAP storage engine is notorious for consuming memory: given a large enough data set, it simply takes whatever is available.
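For example, capping the cache at 8GB (an example value, not a recommendation) looks like this in mongod.conf:

```yaml
# mongod.conf excerpt: limit WiredTiger's cache
# (the default is roughly half of physical RAM)
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8
```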

MMAPv1 storage engine improvements

Besides introducing WiredTiger, MongoDB 3.0 also improves the original MMAP storage engine (now called MMAPv1), which remains the default storage engine in 3.0. Unfortunately, the improved MMAPv1 engine still allocates files at the database level, with all of a database's collections and indexes stored in the database's files, so disk space still cannot be reclaimed automatically in time.

1. Lock granularity raised from database-level to collection-level

This also improves the database's concurrency to a certain extent.

2. Changes to document space allocation

In the MMAPv1 storage engine, documents are stored in the order in which they are written. If an update makes a document longer and there is not enough free space after its original location to hold the extra data, the document is moved to another location in the file. Such update-induced document moves can severely degrade write performance, because once a document moves, every index on the collection must be synchronously updated with the document's new location.

To reduce document moves, the MMAPv1 storage engine provides two document space allocation strategies: adaptive allocation based on paddingFactor (the fill factor) and pre-allocation based on usePowerOf2Sizes; the former used to be the default. The first strategy tracks each collection's average document growth on update and pads newly inserted (or moved) documents with extra space accordingly. For example, if a collection's current paddingFactor is 1.5, a 200-byte document is automatically padded with 100 bytes of extra space when inserted. The second strategy ignores update history and simply allocates each document a storage slot whose size is a power of two; for example, a 200-byte document is allocated a 256-byte slot on insert.

MongoDB 3.0's MMAPv1 abandons paddingFactor-based adaptive allocation. It looks smart, but because the documents in a collection vary in size, so does the padding; on collections with many updates, the free space left behind by record moves is hard to reuse precisely because the sizes differ. Pre-allocation based on usePowerOf2Sizes is now the default allocation strategy: because allocated and reclaimed slots are powers of two (becoming multiples of 2MB once the size exceeds 2MB), free space is easier to track and reuse. If a collection only ever sees inserts or in-place updates, the user can disable space pre-allocation by setting the collection's noPadding flag.
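The power-of-two rule can be sketched as follows. This is a simplified model of the documented behaviour, not MongoDB's actual allocator; in particular, the minimum bucket size of 32 bytes is an assumption for illustration:

```javascript
// Simplified model of MMAPv1 usePowerOf2Sizes record allocation.
// Below 2MB, the record slot is rounded up to the next power of two;
// above 2MB, it is rounded up to the next multiple of 2MB.
function allocationSize(docBytes) {
  var TWO_MB = 2 * 1024 * 1024;
  if (docBytes > TWO_MB) {
    return Math.ceil(docBytes / TWO_MB) * TWO_MB;
  }
  var size = 32; // minimum bucket size: an assumption, for illustration only
  while (size < docBytes) size *= 2;
  return size;
}

console.log(allocationSize(200));             // 256: a 200-byte document gets a 256-byte slot
console.log(allocationSize(3 * 1024 * 1024)); // 4194304: a 3MB document gets a 4MB slot
```

Disabling pre-allocation for an insert-only collection is done with the collMod command's noPadding flag, as described above.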

Replica set improvements

1. More replica set members

MongoDB 3.0 raises the maximum number of replica set members from 12 to 50, but the maximum number of voting members is still 7; correspondingly, getLastError's w: "majority" still refers to a majority of the voting members.

2. Changes to primary step-down handling

The replSetStepDown command makes the current primary of a replica set step down and triggers election of a new primary. MongoDB 3.0 changes step-down handling as follows: 1) before the primary steps down, long-running user operations such as index builds, writes, and map-reduce jobs are interrupted first; 2) to prevent data rollback, the primary now waits until an electable secondary has caught up with the latest data before stepping down, whereas a primary in older versions would step down as long as some secondary had synced to within 10 seconds of it; 3) replSetStepDown gains a secondaryCatchUpPeriodSecs parameter that lets the user specify how many seconds the primary should wait for a secondary to catch up.
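In the mongo shell this looks roughly as follows (run against a primary; the durations are placeholder values):

```javascript
// Step down for 60 seconds, waiting up to 15 seconds for a
// secondary to catch up before abdicating.
db.adminCommand({ replSetStepDown: 60, secondaryCatchUpPeriodSecs: 15 })

// The rs.stepDown() shell helper accepts the same two values:
rs.stepDown(60, 15)
```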

Sharded cluster improvements

1. New helper function sh.removeTagRange()

Older versions only had sh.addTagRange(); to delete a tag range, one could only remove it manually from the config.tags collection.
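A mongo shell sketch of the new helper (the namespace, shard key bounds, and tag name below are placeholders):

```javascript
// Remove a previously defined tag range (MongoDB 3.0+):
// sh.removeTagRange(namespace, minimum, maximum, tag)
sh.removeTagRange("records.users",
                  { zipcode: "10001" },
                  { zipcode: "10281" },
                  "NYC")
```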

2. More predictable read preference handling

In the new version, mongos instances no longer pin connections to replica set members when performing reads; instead, the read preference is re-evaluated for every read operation. As a result, when the read preference is changed, the behavior is much easier to predict.

3. Write concern settings for chunk migration

The new version provides writeConcern parameters for the balancer as well as for the moveChunk and cleanupOrphaned commands, all of which are involved in chunk migration.

4. Balancer state display

In the new version, the balancer's state can be seen via sh.status().

Other changes

1. Improved explain

The new explain can display query plans for operations such as count, find, group, aggregate, update, and remove, producing more comprehensive and fine-grained results.
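A few mongo shell examples of the 3.0 explain interface (collection and query values are placeholders):

```javascript
// Classic cursor form, now with selectable verbosity:
db.orders.find({ status: "A" }).explain("executionStats")

// The new db.collection.explain() helper covers more operations:
db.orders.explain("queryPlanner").count({ status: "A" })
db.orders.explain().remove({ status: "D" })
```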

2. Rewritten tools

In the new version, all of MongoDB's bundled tools have been rewritten in Go. mongodump and mongorestore benefit in particular, greatly accelerating data export and import.

3. Log output control

The new version divides log output into components, including ACCESS, COMMAND, CONTROL, GEO, INDEX, NETWORK, QUERY, REPL, SHARDING, STORAGE, JOURNAL, and WRITE. Users can dynamically adjust the log verbosity of each component, which undoubtedly makes diagnosing system problems easier.
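For example, raising the QUERY component's verbosity at runtime from the mongo shell (the verbosity level 2 is just an example value):

```javascript
// Shell helper form:
db.setLogLevel(2, "query")

// Equivalent admin command form:
db.adminCommand({
  setParameter: 1,
  logComponentVerbosity: { query: { verbosity: 2 } }
})
```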

4. Index build improvements

While a background index build is in progress, drop operations against the database or collection can no longer delete the index being built, and the background build is no longer automatically interrupted. In addition, the createIndexes command can create multiple indexes at once while scanning the data only a single time, improving indexing efficiency.
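A mongo shell sketch of building two indexes with a single data scan (the collection and key names are placeholders):

```javascript
// createIndexes builds all listed indexes in one pass over the data.
db.runCommand({
  createIndexes: "users",
  indexes: [
    { key: { name: 1 }, name: "name_1" },
    { key: { age: -1 }, name: "age_-1" }
  ]
})
```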

Summary

The above lists only some of the major features and changes in MongoDB 3.0; for more details, see the MongoDB 3.0 release notes. Overall, MongoDB 3.0 offers many pleasantly surprising new features and gives good reason for optimism about its future development.

