MongoDB Space Allocation

Source: Internet
Author: User

MongoDB takes up far more disk space than MySQL. Some overhead is understandable, since document formats such as JSON carry a lot of redundant data, but consuming three or four times the space of a traditional database is not normal and does not fit engineering practice, so there should be room for improvement. I consulted some material to work out how MongoDB allocates space.

1. Each MongoDB database logically contains many collections, which are physically stored across multiple data files. Data files are pre-allocated. Pre-allocation reduces fragmentation and makes requests for disk space more efficient, but MongoDB's pre-allocation policy can also waste space. By default, MongoDB keeps allocating new data files as the data grows, and each new file is twice the size of the previous one (64M, 128M, 256M, 512M, 1G, 2G, 2G, 2G) until the 2G upper limit on file size is reached. The 2G threshold can be adjusted, but in routine operations it usually is not, so space may be wasted. (Put another way: if 2M has been allocated to a collection and you add 100K of data that no longer fits, the allocation becomes 2M*2=4M. This strategy wastes disk space but avoids fragmentation.) I am skeptical of the claimed efficiency gain in disk allocation: if I/O is already a bottleneck, pre-allocating a 2G file may well cause serious performance problems in the service. Pre-allocating files does reduce fragmentation and speed up space requests, but whether a huge file needs to be initialized all at once is debatable. According to the documentation, pre-allocation can be turned off, but most people run NoSQL products with the default configuration, and that is the recommended choice, because the defaults have usually been tested for a long time and have fewer bugs.
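As a rough illustration of the doubling scheme described above, the following Python sketch lists the data files that would be allocated for a given amount of data. The function name, its parameters, and the simplification that a new file is created only once the existing files are full are assumptions made for this example. It is not MongoDB's actual implementation.

```python
def preallocated_files(data_mb, first_mb=64, cap_mb=2048):
    """Return the sizes (in MB) of the data files allocated to hold data_mb.

    Hypothetical model: each new file is twice the size of the previous one,
    capped at cap_mb, and a file is added only when the existing ones are full.
    """
    sizes = []
    next_size = first_mb
    while sum(sizes) < data_mb:
        sizes.append(next_size)
        next_size = min(next_size * 2, cap_mb)
    return sizes


data_mb = 1000
files = preallocated_files(data_mb)
allocated = sum(files)
print(files)                # [64, 128, 256, 512, 1024]
print(allocated - data_mb)  # 984 (MB allocated but not yet used)
```

Under this simplified model, 1000M of data already occupies 1984M on disk, which is in line with the article's observation that space use can be far larger than the data itself.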

2. MongoDB stores each document contiguously in the data file, unlike some relational databases, which split a long record into two parts and store the overflow part separately. If not enough space is reserved, an update may grow a document beyond what its original slot can hold. Fragmented storage can then cause unexpected delays when an update forces the engine to move documents within the BSON store. MongoDB's official explanation is as follows:

"If there is enough room to update the document in place, MongoDB updates the data in place. If the updated document is larger than the allocated space, the document is rewritten in a new location. MongoDB will eventually reuse the original space, but this may take time, and the space may be over-allocated.

In MongoDB 2.6, the default space allocation strategy will be powerOf2Sizes, which has been available since MongoDB 2.2. This setting rounds the amount of space MongoDB allocates up to a power of 2 (for example, 2, 4, 8, 16, 32, 64, and so on). This setting reduces the chance that documents need to be moved and allows space to be reused more efficiently, resulting in less over-allocation and more predictable performance. Users can still use the exact-fit allocation strategy, which saves more space if document sizes do not grow."

Clearly, this strategy also wastes space, especially for data that is imported once and then only read.
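To make the cost of power-of-2 rounding concrete, here is a minimal Python sketch. The function name is hypothetical, and the sketch ignores any minimum or maximum record size that the real storage engine may apply. It only shows the rounding rule described in the quoted passage.

```python
def power_of_2_allocation(doc_bytes):
    """Round a document size up to the next power of 2 (hypothetical helper)."""
    size = 1
    while size < doc_bytes:
        size *= 2
    return size


for doc in (1000, 1025):
    alloc = power_of_2_allocation(doc)
    print(f"{doc} -> {alloc} ({alloc - doc} bytes unused)")
# 1000 -> 1024 (24 bytes unused)
# 1025 -> 2048 (1023 bytes unused)
```

A document just over a power of 2 leaves almost half of its slot empty. For data that never grows after import, that padding is never used, which is why an exact-fit policy saves more space in that case.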

3. MongoDB does not support compression of data files, nor can it reclaim space. To deal with fragmentation, it may rewrite documents in a new location instead of compacting and merging the old fragments.

4. MongoDB does not validate data pages. Page validation is important for a database because it helps detect faults in the storage device. In this respect, the data stored by MongoDB is not safe and may become unreadable one day without warning.
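For readers unfamiliar with page validation, the sketch below shows the general idea using a CRC32 checksum stored in front of each page. The page size, the layout, and the function names are assumptions chosen for illustration and do not describe any particular database's on-disk format.

```python
import struct
import zlib

PAGE_SIZE = 4096  # hypothetical page size for this example


def seal_page(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 so corruption can be detected later."""
    return struct.pack("<I", zlib.crc32(payload)) + payload


def page_is_valid(page: bytes) -> bool:
    """Recompute the CRC32 of the payload and compare it with the stored value."""
    if len(page) < 4:
        return False
    (stored,) = struct.unpack("<I", page[:4])
    return stored == zlib.crc32(page[4:])


page = seal_page(b"x" * (PAGE_SIZE - 4))
# Simulate a storage fault by flipping a single bit in the page.
corrupted = page[:100] + bytes([page[100] ^ 0x01]) + page[101:]

print(page_is_valid(page))       # True
print(page_is_valid(corrupted))  # False
```

Without a check of this kind, a flipped bit on disk is read back as if it were valid data, which is the risk point 4 describes.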
