MongoDB and memory

Last Update:2018-12-05 Source: Internet

Author: User

Tags mongodb server


Anyone coming into contact with MongoDB for the first time is surprised by how greedy it is for memory. To explain why, let me first talk about how Linux manages memory and then about how MongoDB uses memory; after that, the answer becomes clear.

As they say, learning works best with a question in mind, so first look at the top output of a MongoDB server:

shell> top -p $(pidof mongod)
Mem:  32872124k total, 30065320k used,  2806804k free,   245020k buffers
Swap:  2097144k total,      100k used,  2097044k free, 26482048k cached

VIRT   RES  SHR  %MEM
1892g  21g  21g  69.6

Does this MongoDB server have a performance problem? Keep the question in mind as you read on.

First, let's talk about how Linux manages memory.

In Linux (as in other systems), memory is divided into physical memory and virtual memory. Physical memory needs no explanation; virtual memory is an abstraction over it. In most cases, for convenience, a program accesses virtual memory addresses, and the operating system translates them into physical memory addresses through the page table mechanism.

For background, see "Understanding Memory" and "Understanding Virtual Memory"; for how programs use virtual memory, see "Playing with Virtual Memory". I will not go into more detail here.
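
To make the page table translation a little more concrete, here is a minimal sketch of how a virtual address splits into a page number and an offset within that page. It assumes 4096-byte pages, which is common on x86 Linux but not universal (getconf PAGESIZE reports the real value). The function name split_vaddr is made up for this illustration.

```shell
# Hypothetical helper: split a virtual address into "page_number offset".
# Assumes 4 KiB pages unless a page size is passed as the second argument.
split_vaddr() {
    vaddr=$1
    page_size=${2:-4096}
    echo "$((vaddr / page_size)) $((vaddr % page_size))"
}
```

For example, split_vaddr 8200 prints "2 8": the address lies in page 2 at offset 8. The page table maps the page number to a physical frame, and the offset is carried over unchanged.
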

Many people confuse virtual memory with swap. In fact, swap is only a technique derived from virtual memory: once the operating system runs short of physical memory, it frees up space for new content by moving some of the content currently in physical memory to the swap partition, and reads it back later when it is needed. Note that using swap can cause performance problems. Occasional use is nothing to worry about; the bad case is when physical memory and the swap partition exchange data frequently, which is called swap thrashing. Once this happens, first determine the cause. If memory is insufficient, adding memory solves the problem. However, the problem can also occur when memory is sufficient; MySQL, for example, can run into it.

In that case, one option is to restrict the use of swap:

shell> sysctl -w vm.swappiness=0
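
Before restricting swap, it helps to confirm that the machine really is swapping. The sketch below reads the cumulative pswpin and pswpout counters that Linux exposes in /proc/vmstat; sampling them twice a few seconds apart and seeing both grow quickly would suggest thrashing. The function name swap_counters and the optional file argument are assumptions made for this example.

```shell
# Hypothetical helper: print "pswpin pswpout" (pages swapped in and out
# since boot). Reads /proc/vmstat by default; another file in the same
# "name value" format can be passed instead, which is handy for testing.
swap_counters() {
    awk '$1 == "pswpin"  { in_pages = $2 }
         $1 == "pswpout" { out_pages = $2 }
         END { print in_pages + 0, out_pages + 0 }' "${1:-/proc/vmstat}"
}
```

Run swap_counters, wait a few seconds, and run it again; identical numbers mean no swap traffic in between.
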
The free command is the most common way to check memory usage:

shell> free -m
             total       used       free     shared    buffers     cached
Mem:         32101      29377       2723          0        239      25880
-/+ buffers/cache:       3258      28842
Swap:         2047          0       2047

When newcomers see that used is large and free is small, they often conclude that memory is running out. In fact, that is not the case. Whenever we operate on files, Linux caches as much of their content in memory as it can, so that the next access can be served directly from memory. That is why the value in the cached column is so large. Do not worry: this memory is reclaimable, and the operating system's virtual memory manager evicts cold data according to an LRU algorithm. The buffers column is also reclaimable, but it is reserved for block devices.

After understanding the principle, we can calculate that the available memory of the system is free + buffers + cached:

shell> echo $((2723 + 239 + 25880))
28842

The memory actually used by the system is used - buffers - cached:

shell> echo $((29377 - 239 - 25880))
3258
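
The same arithmetic can be done straight from /proc/meminfo instead of copying numbers out of free by hand. This is a minimal sketch: the function name mem_summary is made up, and it relies only on the MemTotal, MemFree, Buffers and Cached lines of /proc/meminfo (reported in kB). Newer kernels also publish a MemAvailable line, which is the kernel's own estimate and may differ somewhat from this simple sum.

```shell
# Hypothetical helper: print "available used" in MB, where
#   available = MemFree + Buffers + Cached
#   used      = MemTotal - available
# Reads /proc/meminfo by default; another file in the same format
# can be passed as the first argument.
mem_summary() {
    awk '$1 == "MemTotal:" { total = $2 }
         $1 == "MemFree:"  { free = $2 }
         $1 == "Buffers:"  { buffers = $2 }
         $1 == "Cached:"   { cached = $2 }
         END {
             avail = free + buffers + cached
             printf "%d %d\n", avail / 1024, (total - avail) / 1024
         }' "${1:-/proc/meminfo}"
}
```

The first number corresponds to the free column of the "-/+ buffers/cache" line in the free output, the second to its used column.
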
In addition to the free command, you can also use the sar command:

shell> sar -r
kbmemfree kbmemused  %memused kbbuffers  kbcached
  3224392  29647732     90.19    246116  26070160

shell> sar -W
pswpin/s pswpout/s
    0.00      0.00

I hope the %memused value did not scare you. If it did, re-read the explanation of the free command.
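
To see why the %memused figure is less alarming than it looks, the buffers and cache can be subtracted from it, just as with free. The sketch below does this for one row of sar -r output; the function name real_memused_pct is an assumption for this example, and the four arguments are the kbmemfree, kbmemused, kbbuffers and kbcached columns.

```shell
# Hypothetical helper: percentage of memory in use once buffers and
# cache are excluded. All four arguments are in kB, as printed by sar -r:
#   $1 = kbmemfree, $2 = kbmemused, $3 = kbbuffers, $4 = kbcached
real_memused_pct() {
    awk -v free="$1" -v used="$2" -v buffers="$3" -v cached="$4" 'BEGIN {
        printf "%.2f\n", (used - buffers - cached) * 100 / (free + used)
    }'
}
```

With the row shown in the sar output, real_memused_pct 3224392 29647732 246116 26070160 prints 10.13, so roughly a tenth of the memory is in use by programs rather than the 90.19 reported as %memused.
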

Let's talk about how MongoDB uses memory.

Currently, MongoDB uses a memory-mapped storage engine, which maps the data files into memory. For read operations, the data in memory acts as a cache. For write operations, memory can turn random writes into sequential writes, which greatly improves performance. MongoDB does not manage this memory itself; it leaves the work to the operating system's virtual memory manager. The advantage is that it simplifies MongoDB's job; the disadvantage is that you have no easy way to control how much memory MongoDB occupies. Fortunately, thanks to the virtual memory manager, we do not need to care about this most of the time.

MongoDB's way of using memory gives it an advantage when the cache has to be rebuilt. In short: if only the mongod process is restarted, the cache is still valid, because it lives in the operating system's page cache; if the whole system is restarted, you can rebuild the cache by copying the data files to /dev/null. For more details, see "Cache Reheating: Not to Be Ignored".
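
The "copy the data files to /dev/null" trick can be sketched as a small function. It simply reads every regular file in a directory and throws the bytes away, which pulls the contents into the page cache as a side effect. The function name warm_cache is made up, and the directory you pass is whatever your dbpath is; /data/db in the usage note is only a placeholder.

```shell
# Hypothetical helper: read every regular file directly under a data
# directory and discard the bytes, so the kernel caches the contents.
# Prints the number of files read. Subdirectories are skipped.
warm_cache() {
    dir=$1
    count=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        cat "$f" > /dev/null
        count=$((count + 1))
    done
    echo "$count"
}
```

A call such as warm_cache /data/db only makes sense when the files fit in memory; otherwise later files evict earlier ones from the cache.
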

Sometimes MongoDB runs into an OOM problem even on a 64-bit operating system. In most cases this is caused by a limit on memory size. Check the current values:

shell> ulimit -a | grep memory

On most operating systems the default is unlimited. If yours is not, you can change it as follows:

shell> ulimit -m unlimited
shell> ulimit -v unlimited

Note: ulimit applies only to the current shell and the processes started from it, so it is best to put these commands in the MongoDB startup script.
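
Because ulimit only affects the shell it runs in, it is worth checking which limit the running mongod process actually ended up with. Linux exposes this in /proc/<pid>/limits. The sketch below extracts the soft limit on virtual address space from a file in that format; the function name address_space_limit is an assumption for this example.

```shell
# Hypothetical helper: print the soft "Max address space" limit from a
# file in /proc/<pid>/limits format. The line looks like
#   Max address space         unlimited            unlimited            bytes
# so the soft limit is the fourth whitespace-separated field.
address_space_limit() {
    awk '/^Max address space/ { print $4 }' "$1"
}
```

For the server in this article the call would be address_space_limit /proc/$(pidof mongod)/limits, and the hoped-for answer is "unlimited".
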

Sometimes too many MongoDB connections slow performance down. You can query the number of connections through serverStatus:

mongo> db.serverStatus().connections

Each connection is a thread and needs its own stack. The default stack size on Linux is generally large:

shell> ulimit -a | grep stack
stack size              (kbytes, -s) 10240

The stack size MongoDB is actually running with can be confirmed with the following command (unit: KB):

shell> cat /proc/$(pidof mongod)/limits | grep stack | awk -F 'size' '{print int($NF)/1024}'

A stack that large (for example, 10240 KB) serves no purpose, as a simple comparison of the Size and Rss values in the output of the following command shows:

shell> cat /proc/$(pidof mongod)/smaps | grep 10240 -A 10

Added up over all connections, the memory consumed is astonishing. It is advisable to set the stack to a smaller value, for example 1024:

shell> ulimit -s 1024

Note: newer versions of MongoDB set the stack size automatically at startup.
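
A quick back-of-the-envelope calculation shows how the stack size multiplies across connections. The sketch below computes an upper bound on the address space reserved for thread stacks; the function name stack_reservation_mb is made up, and the connection counts in the usage note are illustrative, not measurements from the server in this article. Only the part of each stack that a thread actually touches becomes resident, so this is a ceiling rather than real usage.

```shell
# Hypothetical helper: upper bound, in MB, on the space reserved for
# thread stacks.
#   $1 = number of connections (one thread each)
#   $2 = stack size per thread in KB (as reported by ulimit -s)
stack_reservation_mb() {
    echo $(( $1 * $2 / 1024 ))
}
```

With 1000 connections, stack_reservation_mb 1000 10240 prints 10000, while stack_reservation_mb 1000 1024 prints 1000.
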

Sometimes, for some reason, you may want to release the memory occupied by MongoDB, but as mentioned above, memory management is in the hands of the virtual memory manager. Fortunately, you can use MongoDB's built-in closeAllDatabases command for this purpose:

mongo> use admin
mongo> db.runCommand({closeAllDatabases: 1})

In addition, you can release the cache by adjusting the kernel parameter drop_caches:

shell> sysctl -w vm.drop_caches=1
You can monitor MongoDB's memory usage from the mongo command line as follows:

mongo> db.serverStatus().mem
{
    "resident" : 22346,
    "virtual" : 1938524,
    "mapped" : 962283
}

You can also use the mongostat command to monitor MongoDB's memory usage, as shown below:

shell> mongostat
mapped  vsize    res faults
  940g  1893g  21.9g      0

The memory-related fields have the following meanings:

mapped: the size of the data mapped into memory
vsize: the virtual memory used
res: the physical memory used

Note: if an operation cannot be completed in memory, the faults column will be non-zero, and depending on how large the value is, performance problems may follow.

In the results above, vsize is twice the size of mapped, and mapped equals the size of the data files, so vsize is twice the size of the data files. The reason is that this MongoDB instance has journaling enabled, which requires the data files to be mapped into memory a second time. If journaling is disabled, vsize and mapped are roughly the same.

To verify this, you can use the pmap command to observe the file mappings with journaling enabled and disabled:

shell> pmap $(pidof mongod)
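
The "vsize is about twice mapped" relationship can be checked numerically from the serverStatus figures. This is a minimal sketch; the function name vsize_ratio is made up, and its two arguments are the "virtual" and "mapped" values from db.serverStatus().mem, both in MB.

```shell
# Hypothetical helper: ratio of virtual to mapped memory.
#   $1 = "virtual" and $2 = "mapped" from db.serverStatus().mem (MB)
# A result near 2 is consistent with journaling being enabled,
# a result near 1 with journaling being disabled.
vsize_ratio() {
    awk -v virtual="$1" -v mapped="$2" 'BEGIN {
        printf "%.2f\n", virtual / mapped
    }'
}
```

For the values shown earlier, vsize_ratio 1938524 962283 prints 2.01.
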
How much memory is suitable for MongoDB? Broadly speaking, the more the better. To be specific, it depends on the size of your data and indexes. The best case is when memory can hold all of the data plus the indexes, but in many cases the data is larger than memory, as with the MongoDB instance discussed in this article:

mongo> db.stats()
{
    "dataSize" : 1004862191980,
    "indexSize" : 1335929664
}

In this example the indexes are only a little over 1 GB and fit entirely in memory, while the data files reach 1 TB, and a machine with that much memory is hard to come by. In that case, make sure memory can hold the hot data; how much data is hot depends on the specific application. The memory requirement is then clear: memory > indexes + hot data. It is best to leave some headroom, since the operating system itself also needs memory to run normally.
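
The rule "memory > indexes + hot data" is easy to turn into a rough estimate. The sketch below takes indexSize and dataSize from db.stats() (both in bytes) plus a guess at the fraction of the data that is hot. The function name min_memory_gb is made up, and the hot fraction is an assumption you have to supply from knowledge of your own application; nothing in the article says what it is for this server.

```shell
# Hypothetical helper: rough lower bound on RAM in GB.
#   $1 = indexSize in bytes (from db.stats())
#   $2 = dataSize in bytes  (from db.stats())
#   $3 = fraction of the data that is hot (application-specific guess)
min_memory_gb() {
    awk -v idx="$1" -v data="$2" -v hot="$3" 'BEGIN {
        printf "%.1f\n", (idx + data * hot) / (1024 * 1024 * 1024)
    }'
}
```

As a purely illustrative example, min_memory_gb 1335929664 1004862191980 0.02 prints 20.0; whatever the result, add some headroom for the operating system on top.
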

For more information about MongoDB and memory, see the official documentation.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
