Internal Structure of MongoDB Data Files

Source: Internet
Author: User

Address: http://blog.nosqlfan.com/html/3515.html

 

Someone asked on Quora how MongoDB data files are organized internally, and 10gen engineer Jared Rosoff gave a brief answer, summarized below.

Each database has its own set of files. If you enable the directoryperdb option, the files of each database are placed in a separate directory.
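As a rough illustration of the layout described above, the following Python sketch builds the file paths for one database with and without directoryperdb. The function name data_file_paths, the example dbpath /data/db, and the naming pattern (a "<db>.ns" namespace file plus numbered data files "<db>.0", "<db>.1", ...) are assumptions made for the example; the original answer does not spell out the file names.

```python
from pathlib import PurePosixPath


def data_file_paths(dbpath, db_name, num_data_files, directoryperdb=False):
    """Return the assumed namespace file and numbered data files of one database."""
    base = PurePosixPath(dbpath)
    if directoryperdb:
        # With directoryperdb, each database gets its own subdirectory.
        base = base / db_name
    # Assumed naming: "<db>.ns" for the namespace file, "<db>.N" for data files.
    names = [f"{db_name}.ns"] + [f"{db_name}.{i}" for i in range(num_data_files)]
    return [str(base / name) for name in names]


print(data_file_paths("/data/db", "foo", 2))
# ['/data/db/foo.ns', '/data/db/foo.0', '/data/db/foo.1']
print(data_file_paths("/data/db", "foo", 2, directoryperdb=True))
# ['/data/db/foo/foo.ns', '/data/db/foo/foo.0', '/data/db/foo/foo.1']
```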

Internally, each database file is divided into blocks (MongoDB calls them extents), and each block stores data for only one namespace. MongoDB uses namespaces to distinguish different kinds of stored data: for example, each collection has its own namespace, and so does each index.

Each block holds multiple records. Each record is stored in BSON format, and the records are linked to one another in a doubly linked list.
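The record chain inside a block can be pictured with a small in-memory model. This is only a sketch: the class names Record and Block are hypothetical, Python object references stand in for on-disk positions, and raw bytes stand in for BSON documents.

```python
class Record:
    """One stored document; `data` stands in for its BSON bytes."""

    def __init__(self, data):
        self.data = data
        self.prev = None  # previous record in the block
        self.next = None  # next record in the block


class Block:
    """A block that belongs to exactly one namespace."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.first = None  # first record in the block
        self.last = None   # last record in the block

    def append(self, data):
        record = Record(data)
        record.prev = self.last
        if self.last is None:
            self.first = record      # the block was empty
        else:
            self.last.next = record  # link the old tail forward
        self.last = record
        return record

    def forward(self):
        node = self.first
        while node is not None:
            yield node.data
            node = node.next

    def backward(self):
        node = self.last
        while node is not None:
            yield node.data
            node = node.prev


block = Block("test.users")
for payload in (b"doc-1", b"doc-2", b"doc-3"):
    block.append(payload)

print(list(block.forward()))   # [b'doc-1', b'doc-2', b'doc-3']
print(list(block.backward()))  # [b'doc-3', b'doc-2', b'doc-1']
```

Because every record knows both neighbours, a scan can run in either direction from the first or last record of the block.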

Index data is also stored in the data files, but an index is organized as a B-Tree rather than as a doubly linked list.

Each database has a namespace file that stores the metadata for each of its namespaces. By querying this metadata, MongoDB can find the location of the blocks that belong to a given namespace.
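A minimal sketch of that lookup, assuming the metadata records block positions as (file number, offset) pairs in the spirit of the DiskLoc structure mentioned in this article. The NamespaceInfo type, the catalog contents, the offsets, and the "<db>.N" data file naming are all hypothetical values chosen for the example.

```python
from typing import Dict, NamedTuple, Tuple


class DiskLoc(NamedTuple):
    file_no: int  # which numbered data file holds the block
    offset: int   # byte offset of the block inside that file


class NamespaceInfo(NamedTuple):
    first_block: DiskLoc
    last_block: DiskLoc


# Hypothetical contents of the namespace file for database "test":
# one collection namespace and one index namespace.
catalog: Dict[str, NamespaceInfo] = {
    "test.users": NamespaceInfo(DiskLoc(0, 8192), DiskLoc(1, 4096)),
    "test.users.$_id_": NamespaceInfo(DiskLoc(0, 65536), DiskLoc(0, 65536)),
}


def first_block_location(catalog: Dict[str, NamespaceInfo],
                         namespace: str) -> Tuple[str, int]:
    """Return (data file name, offset) of the first block of a namespace."""
    info = catalog[namespace]
    db_name = namespace.split(".", 1)[0]
    loc = info.first_block
    return f"{db_name}.{loc.file_no}", loc.offset


print(first_block_location(catalog, "test.users"))  # ('test.0', 8192)
```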

If journaling is enabled, there are also journal files that record all of your write operations.

The following notes describe a hand-drawn diagram of the data file structure presented by Mathias Stearn, a 10gen engineer, at the sv2011 conference.

1. Each database has its own data files and namespace file.

2. The first data file is 16 MB. Each new data file is twice the size of the previous one, up to a maximum of 2 GB.

3. MongoDB uses mmap for memory mapping. All data files are mapped into virtual memory, but data is loaded into physical memory only when it is accessed.

4. The memory-mapping table maps MongoDB data files to locations in memory.

5. On a 32-bit machine, memory addresses can cover at most 4 GB.

6. On 32-bit machines, however, 1 GB of that 4 GB is taken by the kernel, part of the rest is used for the stack space of the mongod process, and only the remainder is available for mapping data files.

7. On 64-bit machines, terabytes of address space are available.

8. Each data file is divided into data blocks, and the blocks are linked in a doubly linked list.

9. The namespace file stores the storage metadata of each namespace, including its size, its number of blocks, the positions of its first and last blocks, the linked list of deleted blocks, and index information.

10. These positions are stored in the DiskLoc data structure, which holds the data file number and the position of the block within that file.

11. Each block starts with a header containing metadata about the block, such as its own position, the positions of the previous and next blocks, and pointers to the first and last records in the block. The rest of the block stores the actual data, and the records are likewise linked in a doubly linked list.

12. The last part of the diagram describes the storage structure and working principle of the B-Tree.
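The sizing rule in item 2 is simple arithmetic and can be sketched in a few lines of Python. The function name data_file_sizes_mb and its parameters are hypothetical; the starting size of 16 MB and the 2 GB cap are taken from the notes above.

```python
def data_file_sizes_mb(count, first_mb=16, max_mb=2048):
    """Sizes in MB of the first `count` data files under the doubling rule."""
    sizes = []
    size = first_mb
    for _ in range(count):
        sizes.append(size)
        # Each new file is twice the previous one, capped at max_mb (2 GB).
        size = min(size * 2, max_mb)
    return sizes


print(data_file_sizes_mb(10))
# [16, 32, 64, 128, 256, 512, 1024, 2048, 2048, 2048]
```

Under these figures the eighth data file is the first to reach the 2 GB cap, and every later file stays at that size.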

 
