Can the cache in Linux memory be reclaimed?


In Linux, we often use the free command to check how much memory the system is using. On an RHEL6 system, free displays something like the following:
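A sketch of the command and its RHEL6-era output layout; the numbers here are illustrative, not from the original machine:

```shell
free
#              total       used       free     shared    buffers     cached
# Mem:     132000000   74000000   58000000          0    1000000   56000000
# -/+ buffers/cache:   17000000  115000000
# Swap:      2088952          0    2088952
```

The second row counts buffers and cached as "used"; the -/+ buffers/cache row subtracts them out.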

The default display unit is KB, and this server has a large amount of memory, so the numbers look big. Almost everyone who has used Linux has run this command, but the more common a command is, the smaller the proportion of people who truly understand its output. Understanding of the output generally falls into the following levels:

  1. No idea. The first reaction of such a person is: my God, more than 70 GB of memory is used, but I'm hardly running any large programs. Why? Linux really eats memory!

  2. Thinks they understand it well. Such a person looks at the output and says: well, from my professional evaluation, only about 17 GB is really in use, and there is still plenty of available memory. Buffers/cache is large, which means some processes have been reading and writing files, but that doesn't matter; that part of memory is effectively free.

  3. Really understands it. This person's reaction tends to make others think they don't understand Linux at all: OK, free shows this, and I know what the numbers mean. What? You're asking whether the memory is enough? Of course I don't know! How would I know how your programs are written?

Judging by the technical articles currently on the web, most people with some Linux knowledge are at the second level. The common belief is that the memory counted in buffers and cached can be released as free space when memory pressure is high. But is that true? Before discussing the question, let's briefly introduce what buffers and cached mean.

What is buffer/cache?

Buffer and cache are two of the most overloaded terms in computing; they mean different things in different contexts. In Linux memory management, buffers refers to the Linux buffer cache, and cached refers to the Linux page cache. Historically, one served as a write buffer for IO devices and the other as a read cache for IO devices, where "devices" mainly means block device files and regular files on a file system. Today the distinction is different. In the current kernel, the page cache is, as the name implies, a cache for memory pages: whatever memory is allocated and managed in units of pages can be cached through the page cache. Not all memory is managed by pages, however; some is managed in units of blocks, and when that memory needs caching, the buffer cache is used. (From this perspective, wouldn't "block cache" be a better name for the buffer cache?) Block sizes are not fixed; they are determined mainly by the block device in use, whereas the page size on x86 is 4 KB for both 32-bit and 64-bit.

Understanding the difference between these two cache systems makes it clear what each can be used for.

What is page cache?

The page cache mainly caches data of files on a file system, especially when a process reads from or writes to files. If you think about it, a system call that can map a file into memory, mmap, naturally has to use the page cache as well. In the current implementation, the page cache is also used to cache other types of files, and it is in fact responsible for caching most block device files too.

What is buffer cache?

The buffer cache is mainly designed to cache data in units of blocks when the system reads from or writes to block devices; some block operations, such as formatting a file system, use it directly. The two cache systems generally work together: when we write to a file, the contents of the page cache change, and the buffer cache is used to divide the page into buffers and record which buffer has been modified. That way, when the kernel writes dirty data back (writeback), it does not have to write the whole page back; it only writes back the modified blocks.

How is the cache reclaimed?

The Linux kernel triggers memory reclaim when memory is about to run out, in order to free memory for processes that urgently need it. In general, the major source of memory freed by this operation is buffer/cache, especially the cache, which tends to be larger. Since it is mainly used for caching, and only speeds up processes' reads and writes when memory is plentiful, it makes sense to clear it under memory pressure and hand the freed space to the processes that need it. So the general belief that buffer/cache space can be released is correct.

However, clearing the cache is not free. Once you understand what the cache is for, you can see that the kernel must ensure that the data in the cache is consistent with the data in the corresponding file before the cache can be released. That is why clearing the cache usually causes a spike in system IO: the kernel has to check whether cached data differs from the data in the corresponding file on disk, and dirty data must be written back before the memory can be reclaimed.

Besides being cleared automatically when system memory is about to be exhausted, the cache can also be dropped manually by writing to the following file:

The method is:
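The file is /proc/sys/vm/drop_caches; a minimal sketch of a manual drop (writing to the file requires root, so the sketch guards for that):

```shell
# Flush dirty pages first so that as much cache as possible is clean and droppable.
sync
# Writing to drop_caches requires root; guard so the sketch is safe to run as a user.
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 1 > /proc/sys/vm/drop_caches
else
    echo "need root to write to /proc/sys/vm/drop_caches"
fi
```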

The file accepts the values 1, 2, and 3, with the following meanings:

echo 1 > /proc/sys/vm/drop_caches: drops the page cache.

echo 2 > /proc/sys/vm/drop_caches: drops reclaimable objects in the slab allocator, including the dentry cache and inode cache. The slab allocator is the kernel's mechanism for managing small-object memory, and many kernel caches are implemented on top of it.

echo 3 > /proc/sys/vm/drop_caches: drops both the page cache and the slab objects above.

Is the cache always reclaimable?

We have analyzed how the cache can be reclaimed. Is there cache that cannot be reclaimed? Of course. Let's look at the first case:

Tmpfs

We all know that Linux provides a "temporary" file system called tmpfs, which turns part of memory into a file system, so that memory space can be used as directories and files. Most Linux systems today have a tmpfs mounted at /dev/shm. You can also mount a tmpfs of your own, for example like this:
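A sketch of the mount; the mount point /tmp/tmpfs and the 20 GB size match the experiment below, but mounting requires root, so the sketch guards for that:

```shell
# Mount point and size are illustrative; mounting requires root.
mkdir -p /tmp/tmpfs
if [ "$(id -u)" -eq 0 ]; then
    mount -t tmpfs -o size=20G none /tmp/tmpfs || echo "mount failed (no privilege?)"
    df -h /tmp/tmpfs
else
    echo "need root to mount a tmpfs"
fi
```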

This creates a new tmpfs with 20 GB of space, and we can create files of up to 20 GB under /tmp/tmpfs. Since files created there actually occupy memory, which part of the memory accounting should that data fall under? Given what the page cache does, and since tmpfs is a file system, it should naturally be managed through page cache space. Let's verify that:
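A scaled-down sketch of the experiment, using /dev/shm (a tmpfs present on most systems) and a 16 MB file instead of the article's 13 GB; the mechanism is the same:

```shell
# Scaled-down sketch: /dev/shm is a tmpfs on most systems, and 16 MB stands
# in for the article's 13 GB; the accounting behaves the same way.
free -k | grep -i cache                    # note the cached figure before
dd if=/dev/zero of=/dev/shm/testfile bs=1M count=16 2>/dev/null
free -k | grep -i cache                    # cached grows by about the file size
rm -f /dev/shm/testfile                    # deleting the file is what frees the cache
```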

We created a 13 GB file in the tmpfs directory, and comparing the output of free before and after shows that cached grew by 13 GB: the file really is stored in memory, and the kernel accounts for it as cache. Now look at the metric we care about, the -/+ buffers/cache line. In this state free still tells us there is plenty of memory available, but is there? We can trigger reclaim manually and see how much can actually be freed:

As you can see, the space counted in cached is not fully released as one might expect: the 13 GB is still occupied by the file in /tmp/tmpfs (plus, on my system, about 16 GB of other non-reclaimable cache). So when is the cache space occupied by tmpfs released? Only when the file is deleted. If the file is not deleted, then no matter how low memory runs, the kernel will not automatically delete files in tmpfs to free that cache space.

This is the first case in which cache cannot be reclaimed. There are others, such as:

Shared Memory

Shared memory is a common inter-process communication (IPC) mechanism provided by the system. However, it cannot be exercised directly from the shell, so we need a small test program, roughly as follows:
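The original program is not reproduced here, so this is a sketch of it: the key and size are illustrative, and the fork/wait driver is shown in the trailing comment. The essential point it demonstrates is a shmget segment that is touched and then left behind without IPC_RMID:

```c
/* Sketch of the article's shmget test: the real program asked for ~2 GB
 * and never removed the segment; sizes here are illustrative. */
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

#define SHM_SIZE ((size_t)2 * 1024 * 1024 * 1024)  /* ~2 GB, as in the article */

/* Create a System V shared memory segment and return its id. */
int create_shm(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Attach the segment, touch every byte (this is what makes it show up
 * in "cached"), then detach. Returns 0 on success. */
int fill_shm(int shmid, size_t size)
{
    void *p = shmat(shmid, NULL, 0);
    if (p == (void *)-1)
        return -1;
    memset(p, 0, size);          /* the child process did this in the original test */
    return shmdt(p);
}

/* The original program did roughly:
 *   int shmid = create_shm(SHM_SIZE);
 *   if (fork() == 0) { fill_shm(shmid, SHM_SIZE); _exit(0); }
 *   wait(NULL);   // parent prints a message and exits...
 *   // ...WITHOUT shmctl(shmid, IPC_RMID, NULL), so the 2 GB stays in "cached".
 */
```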

The program is very simple: it requests slightly less than 2 GB of shared memory, forks a child process to initialize that shared memory, has the parent wait for the child and print a message once initialization is done, and then exits, without deleting the shared memory beforehand. Let's look at memory usage before and after running it:

The cached space grew from 16 GB to 18 GB. Can this cache be reclaimed? Let's test:

The result: still not reclaimable. This shared memory, even when nobody is using it, will sit in cached for a long time, until it is deleted. Deletion can be done in two ways: calling shmctl() with IPC_RMID from a program, or using the ipcrm command. Let's delete it:

After the shared memory is deleted, the cache is released normally. This behavior is analogous to tmpfs, and indeed the kernel uses tmpfs internally to store the memory behind the XSI (System V) IPC mechanisms: shared memory (shm), message queues (msg), and semaphore arrays (sem). That is why shared memory behaves like tmpfs here. Since shm generally accounts for the most memory of the three, we focused on it. Speaking of shared memory, Linux also gives us another way to share memory:

Mmap

mmap() is a very important system call, though you would hardly guess it from its description alone. Literally, mmap maps a file into the process's virtual address space, so that the file's contents can be manipulated by reading and writing memory. But this call is used far more broadly than that. When malloc requests memory, the kernel satisfies small requests via brk/sbrk and large ones via mmap. When the exec family of system calls runs a program, since that essentially means loading an executable file into memory to run, the kernel naturally uses mmap for that too. Here we only consider one case: when mmap is used to request shared memory, does it occupy cache the way shmget() does?

Again, we need a small test program:
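The original program is likewise not reproduced, so this is a sketch of it; the file name ./mmapfile and the 2 GB size come from the article, while the helper name is illustrative. It maps the file MAP_SHARED and touches every page, which is what makes the mapping show up in cached:

```c
/* Sketch of the article's mmap test: map a file MAP_SHARED, touch every
 * page, then (in the original) sleep 100 seconds so free can be checked. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `size` bytes of `path` with MAP_SHARED and memset the whole
 * mapping; the touched pages show up in "cached". Returns 0 on success. */
int mmap_and_touch(const char *path, size_t size)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping survives the close */
    if (p == MAP_FAILED)
        return -1;
    memset(p, 0, size);             /* the original program slept 100 s here */
    return munmap(p, size);         /* unmapping is what lets the cache go */
}

/* The original program mapped a pre-created 2 GB file:
 *   mmap_and_touch("./mmapfile", (size_t)2 * 1024 * 1024 * 1024);
 */
```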

This time we don't need parent and child processes: a single process mmaps a 2 GB shared region, initializes it, and then sleeps for 100 seconds, during which we check the system's memory usage to see what kind of space the mapping consumes. Before running it, we must create a 2 GB file, ./mmapfile. The state before the run looks like this:

Then run the test program:

While the program runs, cached stays at 18 GB, 2 GB higher than before, and that cache still cannot be reclaimed. We then wait 100 seconds for the program to exit.

After the program exits, the space counted in cached is freed. So memory requested via mmap with MAP_SHARED is also stored by the kernel as cache, and that cache cannot be released until the process releases the mapping. In fact, mmap's MAP_SHARED memory is also implemented with tmpfs inside the kernel. From this we can further infer that the read-only segments of shared libraries, being managed in memory through mmap's MAP_SHARED mode, also occupy cache and cannot be released while mapped.

Finally

Through three test cases, we have found that the cache in Linux memory is not always releasable as free space, and also that even when the cache can be released, releasing it is not free of cost for the system. To summarize, we should keep the following points in mind:

  1. Releasing cache that is being used as file cache raises IO load; this cost is the flip side of the cache speeding up file access.

  2. Files stored in tmpfs occupy cache space, and that cache is not released unless the files are deleted.

  3. Shared memory obtained with shmget occupies cache space; unless the segment is removed (with ipcrm or shmctl(IPC_RMID)), the corresponding cache is not released automatically.

  4. Memory mapped with mmap's MAP_SHARED flag occupies cache space; unless the process munmaps it, the corresponding cache is not released automatically.

  5. In fact, both shmget shared memory and mmap MAP_SHARED memory are implemented via tmpfs in the kernel, and tmpfs's storage is in turn implemented with cache.

With this understood, I hope everyone's reading of free can reach the third level. Memory usage is not a simple concept, and cache cannot simply be treated as free space. To really judge whether memory usage on a system is reasonable, you need much more detailed knowledge and a careful look at how the relevant services are implemented. Our experiments here were done on CentOS 6; the output of free may differ on other Linux versions, and finding out why is left to you.

Of course, this article does not cover every case in which cache cannot be released. In your application scenarios, what other cases have you seen where cache cannot be released?
