Basic Features and Internal Structure of MongoDB

Source: Internet
Author: User
Tags mongodb server
MongoDB sits between relational and non-relational databases. Of the non-relational databases, it is the most feature-rich and the closest to a relational database. The data structures it supports are very loose, in a JSON-like format called BSON, so it can store fairly complex data types. Mongo's most notable feature is its powerful query language, whose syntax somewhat resembles an object-oriented query language. It can do most of what single-table queries do in a relational database, and it also supports indexes on the data.

For most users, MongoDB is something of a black box. Knowing a little about its internal structure, however, helps you understand and use it better.

BSON

In MongoDB, the document is the basic abstraction for data, and it is what the client and the server exchange. All clients (the drivers for the various languages) use this abstraction, which is represented as BSON (Binary JSON).

BSON is a lightweight binary data format. MongoDB works with BSON directly and stores documents on disk as BSON.

When a client wants to write or query, the document must be encoded as BSON before it is sent to the server. Likewise, the server's results are encoded as BSON before they are returned to the client.

BSON was chosen for the following reasons:

  1. Efficiency. BSON is designed to represent data compactly, without much extra space. In the worst case BSON is slightly less space-efficient than JSON; in the best case (for example, when storing binary data or large numbers) it is much more efficient.
  2. Traversability. In some cases, BSON gives up some space to make the format easier to traverse. For example, a string value is prefixed with its length rather than relying only on a terminator at the end of the string. This makes it easier for MongoDB to inspect and modify the data it receives.
  3. Performance. Finally, BSON is very fast to encode and decode. It uses C-style data representations, which can be handled efficiently in most languages.
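
To make the length prefix concrete, here is a minimal Python sketch that hand-encodes a one-field document using only the standard library. It handles string values only, and the helper names (encode_cstring, encode_string_element, encode_document) are made up for this illustration; a real driver ships a full BSON encoder.

```python
import struct


def encode_cstring(text):
    """Encode a field name: UTF-8 bytes followed by a single NUL byte."""
    return text.encode("utf-8") + b"\x00"


def encode_string_element(name, value):
    """Encode one string element: type byte 0x02, field name, length-prefixed value."""
    payload = value.encode("utf-8") + b"\x00"
    # The little-endian int32 prefix counts the payload bytes, trailing NUL included.
    return b"\x02" + encode_cstring(name) + struct.pack("<i", len(payload)) + payload


def encode_document(fields):
    """Encode a document whose values are all strings."""
    body = b"".join(encode_string_element(k, v) for k, v in fields.items())
    # Total size = 4-byte length field + elements + 1-byte document terminator.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


doc = encode_document({"hello": "world"})
print(doc)       # b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
print(len(doc))  # 22
```

Because both the document and the string carry their sizes up front, a reader can skip over a value without scanning it byte by byte.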

Wire Protocol

Clients talk to the server using a lightweight TCP/IP wire protocol. The protocol is documented in detail on the MongoDB wiki. It is essentially a thin wrapper around BSON data. For example, an insert message consists of 20 bytes of header data (which include the message length and a code identifying the insert operation), the name of the collection to insert into, and the data to be inserted.
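
As a rough sketch of such a message, the Python below assembles a legacy insert message around an already-encoded BSON document. The layout used here (a 16-byte standard header of four little-endian int32 fields, followed by a 4-byte flags field, the NUL-terminated collection name, and the documents) and the opcode 2002 describe the legacy OP_INSERT message as I understand it; recent server versions use a different message type, so treat this as an illustration rather than something to send to a live server. The function name build_insert_message and the collection name foo.bar are hypothetical.

```python
import struct

OP_INSERT = 2002  # opcode of the legacy insert message (assumed from the legacy protocol)


def build_insert_message(request_id, collection, bson_docs):
    """Assemble a legacy insert message around already-encoded BSON documents."""
    flags = 0
    body = collection.encode("utf-8") + b"\x00" + b"".join(bson_docs)
    # 16-byte standard header + 4-byte flags field = 20 bytes of header data.
    total_length = 20 + len(body)
    # Header fields: messageLength, requestID, responseTo, opCode.
    header = struct.pack("<iiii", total_length, request_id, 0, OP_INSERT)
    return header + struct.pack("<i", flags) + body


# BSON encoding of {"hello": "world"}.
bson_doc = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
message = build_insert_message(1, "foo.bar", [bson_doc])
print(len(message))  # 50 = 20 (header data) + 8 (collection name) + 22 (document)
```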

Data Files

The MongoDB data directory (/data/db by default) holds all the files that make up the databases. Each database consists of a single .ns file and several data files, and more data files are added as the volume of data grows. So a database named foo would be made up of the files foo.ns, foo.0, foo.1, foo.2, and so on.

Each new data file is twice the size of the previous one, up to a maximum of 2 GB per file. This design keeps databases with little data from wasting too much disk space, while ensuring that databases with a lot of data get correspondingly large files to work with.
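
The doubling rule can be sketched in a few lines of Python. The text does not say how large the first file is; the 64 MB starting size below is an assumption made for illustration, and data_file_sizes is a hypothetical helper, not part of MongoDB.

```python
MAX_FILE_MB = 2048  # the 2 GB cap described above


def data_file_sizes(count, first_file_mb=64):
    """Return the sizes in MB of the first `count` data files (foo.0, foo.1, ...)."""
    sizes = []
    size = first_file_mb  # assumed starting size, not stated in the text
    for _ in range(count):
        sizes.append(size)
        size = min(size * 2, MAX_FILE_MB)  # double each time, capped at 2 GB
    return sizes


for index, size in enumerate(data_file_sizes(8)):
    print(f"foo.{index}: {size} MB")
# foo.0 through foo.5 grow from 64 MB to 2048 MB; foo.6 and foo.7 stay at 2048 MB.
```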

MongoDB preallocates data files to keep write performance stable (this can be turned off with --noprealloc). Preallocation happens in the background, and each preallocated file is filled with zeros. As a result, MongoDB always tries to keep an extra, empty data file on hand, which avoids blocking on file allocation when the data grows quickly.

Namespaces and Extents

Each database is organized into namespaces, and each namespace stores one particular kind of data. Every collection in the database has its own namespace, and every index has a namespace as well. The metadata for all namespaces is stored in the .ns file.

The data for each namespace is grouped on disk into sections of the data files, called extents (disk areas). For example, suppose the database foo contains three data files, the third of which is an empty preallocated file. The first two data files are then divided into extents belonging to different namespaces.

A few characteristics of namespaces and extents are worth noting. Each namespace can have several extents, and they need not be contiguous on disk. Like data files, the extents of a namespace grow in size with each new allocation. This balances the space a namespace might waste against keeping the data in a namespace contiguous. One special namespace deserves mention: $freelist, which keeps track of extents that are no longer in use (for example, from dropped collections or indexes). Whenever a namespace needs a new extent, $freelist is checked first for a suitably sized one.
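
The freelist lookup can be sketched with a toy allocator. This is not MongoDB's implementation: the class ExtentAllocator, its methods, and the first-fit reuse policy are all assumptions chosen to keep the example short.

```python
class ExtentAllocator:
    """Toy model: hand out extents, reusing freed ones before taking new space."""

    def __init__(self):
        self.freelist = []    # (offset, size) of extents no longer in use
        self.next_offset = 0  # where brand-new space would start
        self.extents = {}     # namespace -> list of (offset, size)

    def allocate(self, namespace, size):
        # Check the freelist first for an extent that is large enough.
        for i, (_, free_size) in enumerate(self.freelist):
            if free_size >= size:
                extent = self.freelist.pop(i)
                break
        else:
            # Nothing suitable was free, so carve out new space.
            extent = (self.next_offset, size)
            self.next_offset += size
        self.extents.setdefault(namespace, []).append(extent)
        return extent

    def drop(self, namespace):
        # A dropped collection or index returns its extents to the freelist.
        self.freelist.extend(self.extents.pop(namespace, []))


alloc = ExtentAllocator()
alloc.allocate("foo.users", 4096)
alloc.allocate("foo.logs", 8192)
alloc.allocate("foo.users", 16384)  # a second, larger extent, not adjacent to the first
alloc.drop("foo.logs")
reused = alloc.allocate("foo.orders", 4096)
print(reused)  # (4096, 8192): the extent freed by foo.logs is reused as a whole
```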

Memory-Mapped Storage Engine

MongoDB currently uses a memory-mapped storage engine. When MongoDB starts, all data files are mapped into memory, and the operating system takes care of all disk operations. This storage engine has the following characteristics:

  • The memory management code in MongoDB is very lean, since that work is left to the operating system.
  • The virtual memory used by the MongoDB server process is huge, often larger than the total size of the data files. This is not a cause for concern; the operating system handles it. Note, however, that MongoDB does not manage memory itself and cannot be given a memory limit, so its memory use is sometimes hard to control. In production, memory usage must be monitored at the OS level.
  • MongoDB cannot control the order in which data is written to disk. As a result, it cannot implement write-ahead logging with this engine. To provide durability, MongoDB would need to implement another storage engine.
  • On a 32-bit server, each mongod instance can use only about 2 GB of data files, because all data must be addressable with 32-bit pointers.
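
The idea behind memory mapping can be tried with Python's standard mmap module. This is only an illustration of the operating-system facility, not MongoDB code; the file name foo.0 and the 4096-byte size are arbitrary choices for the example.

```python
import mmap
import os
import tempfile

# Create a small zero-filled file to stand in for a preallocated data file.
path = os.path.join(tempfile.mkdtemp(), "foo.0")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

with open(path, "r+b") as f:
    mapped = mmap.mmap(f.fileno(), 0)  # map the whole file into memory
    mapped[0:5] = b"hello"             # an ordinary memory write, no file write call
    mapped.flush()                     # ask the OS to write the dirty pages to disk
    mapped.close()

# Read the file back through the normal file API to confirm the change landed.
with open(path, "rb") as f:
    on_disk = f.read(5)
print(on_disk)  # b'hello'
```

The program only changes bytes in memory; when and in what order the pages reach the disk is up to the operating system, which is the property the list above describes.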

Features

MongoDB features high performance, ease of deployment, and ease of use, and it makes storing data very convenient. Its main features are:

  • Collection-oriented storage, well suited to storing object-type data.
  • Schema-free.
  • Supports dynamic queries.
  • Supports full indexing, including on embedded objects.
  • Supports queries.
  • Supports replication and failover.
  • Uses efficient binary data storage, including for large objects (such as videos).
  • Automatic sharding to support cloud-level scalability.
  • Supports Ruby, Python, Java, C++, PHP, and other languages.
  • The file storage format is BSON (an extension of JSON).
  • Accessible over the network.

"Collection-oriented" means that data is grouped and stored in data sets, each of which is called a collection. Each collection has a unique name within the database and can contain an unlimited number of documents. A collection is similar in concept to a table in a relational database (RDBMS), except that it does not require any schema to be defined.

Schema-free means that no schema definition is needed for the documents stored in a MongoDB database. If necessary, you can store documents with different structures in the same database.

Documents in a collection are stored as key-value pairs. The key, which is a string, uniquely identifies an entry within a document, and the value can be any of a variety of complex types. This storage format is called BSON (Binary Serialized Document Format).
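
To illustrate schema-free storage and dynamic queries without a running server, the sketch below models a collection as a plain Python list of dictionaries. The functions matches and find are hypothetical stand-ins written for this example; they are not the MongoDB driver API and only support exact-match conditions on top-level keys.

```python
def matches(document, query):
    """Return True if every key in `query` has an equal value in `document`."""
    return all(document.get(key) == value for key, value in query.items())


def find(collection, query):
    """Return the documents in `collection` that match `query`."""
    return [doc for doc in collection if matches(doc, query)]


# Documents with different structures stored side by side, with no schema declared.
users = [
    {"_id": 1, "name": "alice", "tags": ["admin", "dev"]},
    {"_id": 2, "name": "bob", "address": {"city": "Hangzhou"}},
    {"_id": 3, "name": "alice", "age": 30},
]

result = find(users, {"name": "alice"})
print([doc["_id"] for doc in result])  # [1, 3]
```

The query itself is just another key-value structure built at run time, which is what makes queries "dynamic" in this sense.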

Others

MongoDB: The Definitive Guide describes only this much of MongoDB's internal structure. Explaining it all properly would probably take another book, covering topics such as internal JavaScript parsing, query optimization, and index construction. If you are interested, you can go straight to the source code :)
