The previous post covered MongoDB's MMAPv1 storage engine; this article introduces another MongoDB storage engine, WiredTiger. WiredTiger was introduced in MongoDB 3.0 and became MongoDB's default storage engine starting with version 3.2.
MMAPv1 compared with WiredTiger
Relative to MMAPv1, WiredTiger brings a series of improvements:
1. Improved file space allocation
The MMAPv1 storage engine allocates files at the database level: all collections and indexes in a database are mixed together in that database's files, so even after a collection or index is dropped, the disk space it occupied is hard to reclaim automatically and promptly. WiredTiger allocates files at the collection and index level: each collection and each index in a database is stored in its own file, and that file is deleted when the collection or index is dropped, which makes disk space easy to reclaim.
Some of the files WiredTiger keeps in the data directory:
mongod.lock: prevents multiple processes from connecting to the same WiredTiger database
.wt files: store the data of individual collections and indexes, one file each
WiredTiger.wt: stores the metadata for all collections
WiredTiger.turtle: stores the metadata for WiredTiger.wt
journal directory: stores the write-ahead log files, each limited to roughly 100 MB
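As a rough illustration of the file layout listed above, the following Python sketch labels the entries of a data directory by name. The function name classify_dbpath_entry and the sample file names are hypothetical; real collection and index file names vary by deployment.

```python
def classify_dbpath_entry(name: str) -> str:
    """Return a rough description of an entry found in a WiredTiger data directory."""
    lower = name.lower()
    if lower == "mongod.lock":
        return "lock file"
    if lower == "wiredtiger.wt":
        return "metadata for all collections"
    if lower == "wiredtiger.turtle":
        return "metadata for WiredTiger.wt"
    if lower == "journal":
        return "write-ahead log directory"
    if lower.endswith(".wt"):
        return "collection or index data"
    return "other"


# Hypothetical directory listing, for illustration only.
entries = [
    "mongod.lock",
    "WiredTiger.wt",
    "WiredTiger.turtle",
    "journal",
    "collection-0-123.wt",
    "index-1-123.wt",
]
for entry in entries:
    print(f"{entry:24} {classify_dbpath_entry(entry)}")
```

The order of the checks matters: WiredTiger.wt also ends in .wt, so the exact-name checks come before the generic suffix check.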
2. Document-level concurrency control
The WiredTiger storage engine uses document-level locking: concurrent write operations can modify different documents in the same collection at the same time, but they cannot modify the same document at the same time. This gives WiredTiger better concurrency than MMAPv1.
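To make the idea of document-level granularity concrete, here is a toy Python model that keeps one lock per document instead of one lock per collection. It is only a sketch of the concept: the class ToyCollection and its methods are hypothetical and are not meant to mirror WiredTiger's internal implementation.

```python
import threading


class ToyCollection:
    """Toy model: one lock per document _id rather than one per collection."""

    def __init__(self):
        self._entries = {}               # doc_id -> (lock, document dict)
        self._guard = threading.Lock()   # protects the entry table itself

    def _entry_for(self, doc_id):
        # Create the (lock, document) pair on first use, under the table guard.
        with self._guard:
            if doc_id not in self._entries:
                self._entries[doc_id] = (threading.Lock(), {})
            return self._entries[doc_id]

    def increment(self, doc_id, field, amount=1):
        lock, doc = self._entry_for(doc_id)
        # Writers to the same document serialize here; writers to
        # different documents hold different locks and do not block each other.
        with lock:
            doc[field] = doc.get(field, 0) + amount

    def get(self, doc_id):
        lock, doc = self._entry_for(doc_id)
        with lock:
            return dict(doc)


def worker(coll, doc_id, n):
    for _ in range(n):
        coll.increment(doc_id, "count")


coll = ToyCollection()
threads = [
    threading.Thread(target=worker, args=(coll, doc_id, 1000))
    for doc_id in ("a", "b", "a", "b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(coll.get("a"), coll.get("b"))  # {'count': 2000} {'count': 2000}
```

Two threads update document "a" and two update document "b"; only threads that touch the same document ever wait on each other, which is the property the section describes.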
3. Data persistence through checkpoints and the write-ahead log
With MongoDB's default configuration, a WiredTiger write goes to the cache first (as B-tree nodes) and is flushed to the write-ahead log file once the cached data reaches 128 KB. WiredTiger takes a checkpoint every 60 seconds or whenever the log file reaches 2 GB. A checkpoint produces a snapshot of the database at a specific point in time (a consistent view of the in-memory data) and persists all the data in that snapshot to the data files in a consistent way, so that the data files match the in-memory data. When a WiredTiger connection is initialized, the data is first restored to the latest snapshot and then brought up to date by replaying the write-ahead log, which ensures storage reliability.
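The recovery sequence described above (restore the latest checkpoint, then replay the log) can be sketched with a small in-memory model. This is a simplified illustration under stated assumptions, not WiredTiger code: ToyStore, its fields, and the use of plain Python lists and dicts in place of on-disk files are all hypothetical.

```python
class ToyStore:
    """Toy key-value store with a write-ahead log and checkpoints."""

    def __init__(self):
        self.cache = {}      # in-memory data
        self.log = []        # stand-in for the write-ahead log on disk
        self.snapshot = {}   # stand-in for the data files as of the last checkpoint

    def put(self, key, value):
        # Record the change in the log and apply it to the cache.
        self.log.append((key, value))
        self.cache[key] = value

    def checkpoint(self):
        # Persist a consistent view of the cache; log entries written
        # before the checkpoint are no longer needed for recovery.
        self.snapshot = dict(self.cache)
        self.log.clear()

    @classmethod
    def recover(cls, snapshot, log):
        # Start from the latest snapshot, then replay the log on top of it.
        store = cls()
        store.snapshot = dict(snapshot)
        store.cache = dict(snapshot)
        for key, value in log:
            store.cache[key] = value
        store.log = list(log)
        return store


store = ToyStore()
store.put("a", 1)
store.checkpoint()
store.put("b", 2)   # written after the checkpoint, so it exists only in the log

# Simulate a crash: the cache is lost, only the snapshot and the log survive.
recovered = ToyStore.recover(store.snapshot, store.log)
print(recovered.cache)  # {'a': 1, 'b': 2}
```

The write to "b" never reached a checkpoint, yet it survives the simulated crash because replaying the log reapplies it on top of the snapshot.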
4. Configurable maximum memory usage
With the WiredTiger storage engine, MongoDB's data cache is divided into two parts: the internal cache and the file system cache. The size of the internal cache can be set with the --wiredTigerCacheSizeGB parameter; its default value is the larger of 1 GB and 60% of RAM minus 1 GB. The size of the file system cache is not fixed: MongoDB automatically uses the system's free memory, and data in the file system cache is held in compressed form.
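The default described above can be worked through with a short Python calculation. The function name default_cache_size_gb is hypothetical, and the sketch assumes that "60% of RAM minus 1 GB" means 0.6 × RAM − 1 GB, as stated in this article; other MongoDB versions may use a different formula.

```python
def default_cache_size_gb(ram_gb: float) -> float:
    """Default internal cache size as described in this article:
    the larger of 1 GB and (60% of RAM minus 1 GB)."""
    return max(1.0, 0.6 * ram_gb - 1.0)


for ram_gb in (1, 4, 16, 64):
    size = default_cache_size_gb(ram_gb)
    print(f"RAM {ram_gb:>2} GB -> internal cache about {size:.1f} GB")
```

On small machines the 1 GB floor applies; on larger ones the 60% term dominates. Passing an explicit value, for example --wiredTigerCacheSizeGB 2, overrides the default.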
5. Data compression
With the WiredTiger storage engine, a database's collections, indexes, and log files are all stored compressed, which saves disk space. By default, WiredTiger applies block compression to collection data and prefix compression to index data. As a result, the data takes up less disk space, reads and writes are faster, and less time is spent on I/O.
When a collection holds relatively little data, write performance with and without compression is about the same, but read performance is better with compression. When a collection holds a lot of data, both read and write performance are better with compression than without it.
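The two kinds of compression mentioned in this section can be illustrated with a small Python sketch. It is a conceptual demo only: zlib from the Python standard library stands in for the block compressor (MongoDB's default block compressor for WiredTiger is snappy, which is not in the standard library), and the functions prefix_compress and prefix_decompress are hypothetical and do not reproduce WiredTiger's on-disk format.

```python
import zlib


def prefix_compress(sorted_keys):
    """Store each key as (shared_prefix_length, suffix) relative to the previous key."""
    entries = []
    prev = ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def prefix_decompress(entries):
    """Rebuild the original keys from (shared_prefix_length, suffix) pairs."""
    keys = []
    prev = ""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Prefix compression: adjacent index keys often share a long prefix.
keys = ["user:1001", "user:1002", "user:1010", "user:2000"]
packed = prefix_compress(keys)
print(packed)  # [(0, 'user:1001'), (8, '2'), (7, '10'), (5, '2000')]

# Block compression: a whole block of document bytes is compressed at once.
block = b'{"name": "alice", "city": "Shanghai"}' * 100
compressed = zlib.compress(block)
print(len(block), "bytes before,", len(compressed), "bytes after")
```

Prefix compression pays off for indexes because sorted keys tend to repeat their leading bytes, while block compression pays off for collection data because documents in the same block repeat field names and values.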
For an analysis of how WiredTiger organizes its data, see this article: http://mini.eastday.com/mobile/160630190233714.html
MongoDB Storage Engines (Part 2): WiredTiger