Linux Memory Architecture, from the Linux Performance and Tuning Guidelines (translation)


This document is a translation of section 1.2 of the IBM Redpaper Linux Performance and Tuning Guidelines.
Original address: http://www.redbooks.ibm.com/redpapers/pdfs/redp4285.pdf
Original authors: Eduardo Ciliendo, Takechika Kunimasa, Byron Braswell



1.2 Linux Memory Architecture

To run a process, the Linux kernel allocates a portion of memory and assigns it to the requesting process. The process uses that memory region as its workspace to carry out the requested work. It is similar to your having a desk assigned to you: you use the desk to lay out the papers, documents, and memos needed to do your work.

The difference is that the kernel must allocate memory in a much more dynamic way. The number of running processes can sometimes reach tens of thousands, while the amount of memory is limited.

Therefore, the Linux kernel must handle memory efficiently. In this section, we discuss the Linux memory structure, its address layout, and how Linux manages memory space efficiently.


1.2.1 Physical and virtual memory

Today we face the choice between 32-bit and 64-bit systems. One of the most important differences for enterprise customers is whether virtual memory addresses can exceed 4 GB.

From a performance perspective, it is important to understand how the Linux kernel maps physical memory into virtual memory on both 32-bit and 64-bit systems.


As you can see in Figure 1-10, there are obvious differences in how the Linux kernel handles memory on 32-bit and 64-bit systems.

The details of the mapping from physical memory to virtual memory are beyond the scope of this article, so here we highlight only some characteristics of the Linux memory architecture.


On a 32-bit architecture such as IA-32, the Linux kernel can directly access only the first gigabyte of physical memory (896 MB when the reserved range is taken into account). Memory above this so-called ZONE_NORMAL must be mapped into the lower 1 GB before the kernel can address it. This mapping is completely transparent to applications, but allocating memory pages in ZONE_HIGHMEM causes a small performance degradation.


On 64-bit architectures such as x86-64 (also called x64), on the other hand, ZONE_NORMAL extends up to 64 GB, or up to 128 GB on IA-64 systems. As you can see, with a 64-bit architecture the overhead of mapping memory pages from ZONE_HIGHMEM into ZONE_NORMAL can be eliminated.


Figure 1-10 Linux kernel memory layout for 32-bit and 64-bit systems


Virtual memory address layout
Figure 1-11 shows the Linux virtual address layout for 32-bit and 64-bit architectures.


On a 32-bit architecture, the maximum address space a single process can access is 4 GB; this is a limitation inherited from 32-bit virtual addressing. In the standard implementation, the virtual address space is divided into 3 GB of user space and 1 GB of kernel space.

There is also a variant of this scheme, the 4G/4G address layout implementation.
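As a quick sanity check on the 3G/1G numbers above, the split can be expressed as simple arithmetic (an illustration added by the translator, not from the original text):

```python
# 32-bit pointers can address 2**32 bytes = 4 GB in total.
GB = 1 << 30
total = 1 << 32           # 4 GB of 32-bit virtual address space
user_space = 3 * GB       # standard split: user portion
kernel_space = 1 * GB     # standard split: kernel portion

assert user_space + kernel_space == total
print(total // GB, user_space // GB, kernel_space // GB)  # 4 3 1
```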


On 64-bit architectures such as x86-64 and IA-64, on the other hand, no such limitation exists: each individual process can benefit from a vast address space.



Figure 1-11 Virtual memory address layout for 32-bit and 64-bit architectures


1.2.2 Virtual Memory Management
The physical memory architecture of an operating system is usually invisible to applications and users, because the operating system maps any physical memory into virtual memory.

If we want to understand the tuning possibilities within the Linux operating system, we have to understand how Linux handles virtual memory.

As described in 1.2.1, "Physical and virtual memory", applications do not request physical memory; rather, they request a memory map of a certain size from the Linux kernel and in exchange receive a mapping in virtual memory. As you can see in Figure 1-12, virtual memory is not necessarily mapped into physical memory. If your application requests a large amount of memory, some of it may be mapped to a swap file on disk.
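To make the distinction concrete, here is a small Python sketch (an illustration added in this translation, not from the original text): an anonymous mmap gives the process a virtual mapping immediately, while physical page frames are generally attached lazily, only when pages are first touched (demand paging).

```python
import mmap

length = 16 * 1024 * 1024     # request a 16 MB mapping
buf = mmap.mmap(-1, length)   # anonymous mapping: virtual memory only

# At this point the kernel has set up virtual addresses for the region,
# but on Linux physical frames are typically attached on the first
# write to each page, not at mmap() time.
buf[0] = 0x41                 # touching the first page faults a frame in
assert len(buf) == length
```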


Figure 1-12 also shows that applications usually do not write directly to disk; instead, they write to caches or buffers.

The pdflush kernel thread flushes the cached/buffered data out to disk when the thread has time to do so or when a file's size exceeds the buffer cache size.

See "Flushing dirty buffers" for details.


Figure 1-12 Linux virtual memory management


Closely connected to the way the Linux kernel handles writes to the physical disk is the way it manages the disk cache. Whereas other operating systems allocate only a portion of memory as disk cache, Linux handles memory resources more efficiently: the default virtual memory management configuration allocates all available free memory as disk cache. Hence, on Linux systems with large amounts of memory, it is common to see only 20 MB of free memory.
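This "low free memory" effect can be observed in /proc/meminfo. The sketch below (with sample values made up for illustration) computes how much memory is really reclaimable by adding buffers and cache back to MemFree:

```python
# Sample /proc/meminfo excerpt; the values are illustrative, not real.
sample = """\
MemTotal:       8167020 kB
MemFree:          20480 kB
Buffers:         301200 kB
Cached:         6712340 kB
"""

def parse_meminfo(text):
    """Return a dict mapping field name to its size in kB."""
    info = {}
    for line in text.splitlines():
        name, value = line.split(":")
        info[name] = int(value.split()[0])
    return info

mem = parse_meminfo(sample)
# "Free" looks tiny, but most of the buffer/cache memory can be
# dropped on demand when applications need it.
effectively_free = mem["MemFree"] + mem["Buffers"] + mem["Cached"]
print(effectively_free)  # 7034020 (kB)
```

On a real system you would read the text from /proc/meminfo instead of a sample string.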


In the same way, Linux manages swap space very efficiently. Swap space being used does not indicate a memory bottleneck; instead, it demonstrates how efficiently Linux manages system resources. See "Page frame reclaiming" for more details.


Page frame allocation
A page is a group of contiguous linear addresses in physical memory (a page frame) or in virtual memory. The Linux kernel manages memory in units of pages.

The size of a page is usually 4 KB. When a process requests a certain number of pages, if enough free pages are available, the Linux kernel assigns them to the process immediately; otherwise, pages must be taken from some other process or from the page cache.
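As a hedged illustration of page-granular allocation: the kernel backs any request with whole pages, so a request is effectively rounded up to a page multiple. On Unix-like systems, Python can query the page size through the standard library:

```python
import resource

page_size = resource.getpagesize()  # typically 4096 bytes on x86 Linux

def pages_needed(nbytes, page=page_size):
    """Number of page frames needed to back a request of nbytes."""
    return (nbytes + page - 1) // page  # round up to whole pages

# e.g. a 10000-byte request on a 4 KB-page system occupies 3 pages
assert pages_needed(10000, page=4096) == 3
assert pages_needed(10000, page=4096) * 4096 >= 10000
```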

The kernel keeps track of how many memory pages are available and where they are located.


Buddy system
The Linux kernel manages free pages through a mechanism called the buddy system. The buddy system maintains the free pages and tries to satisfy page allocation requests while keeping the memory area contiguous.

If scattered small pages were handed out without regard to their position, memory would become fragmented, and it would be harder to allocate a large block of pages in a contiguous area. That would lead to inefficient memory use and performance degradation.


Figure 1-13 illustrates how the buddy system allocates pages.


Figure 1-13 Buddy system
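The splitting idea behind the figure can be sketched in a few lines of Python (a toy model, not the kernel's implementation): free blocks are kept in per-order lists, and an allocation of order k repeatedly splits a larger block, with each split leaving one free "buddy" behind.

```python
class BuddyAllocator:
    """Toy buddy allocator over a region of 2**max_order pages."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds start offsets of free blocks of 2**k pages
        self.free_lists = {k: [] for k in range(max_order + 1)}
        self.free_lists[max_order].append(0)  # one big free block

    def alloc(self, order):
        """Allocate a block of 2**order contiguous pages; return its offset."""
        k = order
        while k <= self.max_order and not self.free_lists[k]:
            k += 1  # find the smallest free block that is big enough
        if k > self.max_order:
            raise MemoryError("no contiguous block available")
        block = self.free_lists[k].pop()
        while k > order:          # split down, keeping each buddy free
            k -= 1
            self.free_lists[k].append(block + (1 << k))  # the buddy half
        return block

alloc = BuddyAllocator(max_order=4)   # a 16-page region
a = alloc.alloc(0)                    # one page  -> offset 0
b = alloc.alloc(2)                    # four pages -> offset 4
```

Freeing, which merges a block with its buddy to rebuild larger blocks, is omitted here for brevity.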


Page reclaiming is activated when an attempt to allocate pages fails. Refer to "Page frame reclaiming".


You can find information about the buddy system in /proc/buddyinfo. See "Memory used in a zone".
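Each column of a /proc/buddyinfo row is the number of free blocks of a given order (2**order contiguous pages). The small sketch below (with a made-up sample line) converts one row into the total free pages it represents:

```python
# A sample /proc/buddyinfo line; the counts are illustrative.
sample = "Node 0, zone   Normal   5   3   2   1   0   1   0   0   1   0   1"

fields = sample.split()
zone = fields[3]                        # zone name, e.g. "Normal"
counts = [int(n) for n in fields[4:]]   # list index = block order

# Free pages contributed by each order: count * 2**order
free_pages = sum(c << order for order, c in enumerate(counts))
print(zone, free_pages)
```

A long tail of zeros in the higher orders indicates fragmentation: plenty of small blocks, but no large contiguous areas left.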


Page frame reclaiming
When a process requests a certain number of pages and they are not available, the Linux kernel tries to satisfy the request by freeing certain pages (pages that were used before but are no longer in use, yet are still marked as active based on certain criteria) and assigning that memory to the process. This process is called page frame reclaiming.

The kswapd kernel thread and the try_to_free_page() kernel function are responsible for page reclaiming.


The kswapd thread usually sleeps in an interruptible state and is called by the buddy system when free pages in a zone fall below a threshold. It tries to find candidate pages on the active list, based on the least-recently-used (LRU) principle: the least recently used pages are released first. An active list and an inactive list are used to maintain the candidate pages.

kswapd scans the active list, checks the recent usage of each page, and moves pages that have not been used recently to the inactive list. You can use the vmstat -a command to see how much memory is active and how much is inactive.
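The active/inactive bookkeeping can be sketched as a toy two-list model (an illustration of the idea added in this translation, not kernel code): pages referenced since the last scan stay on the active list; a scan pass demotes unreferenced pages to the inactive list, where they become reclaim candidates.

```python
from collections import OrderedDict

active = OrderedDict()    # page -> referenced-since-last-scan flag
inactive = OrderedDict()  # reclaim candidates, oldest first

def touch(page):
    """A page access: mark referenced, promoting from inactive if needed."""
    if page in inactive:
        del inactive[page]
    active[page] = True

def scan_active():
    """One scan pass: demote pages not referenced since the last scan."""
    for page, referenced in list(active.items()):
        if referenced:
            active[page] = False    # clear the flag, keep it active
        else:
            del active[page]
            inactive[page] = False  # now a reclaim candidate

for p in ("a", "b", "c"):
    touch(p)
scan_active()      # first pass: clears all reference flags
touch("a")         # only "a" is used again
scan_active()      # second pass: "b" and "c" are demoted
```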


kswapd also follows another principle. Pages are used mainly for two purposes: the page cache and the process address space. The page cache consists of pages mapped to a file on disk.

Pages that belong to a process address space (called anonymous memory, because they are not mapped to any file and have no name) are used for the heap and the stack. See 1.1.8, "Process memory segments".

When kswapd reclaims pages, it shrinks the page cache as much as possible rather than paging out (or swapping out) pages owned by processes.


Page out and swap out: the terms "page out" and "swap out" are often confused. "Page out" means moving individual pages (a part of an address space) into the swap area, while "swap out" means moving an entire address space into the swap area. They are, however, sometimes used interchangeably.


What proportion of the page cache versus the process address space gets reclaimed depends on the usage scenario, and it has an impact on performance. You can control this behavior through /proc/sys/vm/swappiness.


Swap (Swap area)
As mentioned earlier, when page reclaiming occurs, candidate pages on the inactive list that belong to a process address space are paged out. Swap activity in itself is nothing to be alarmed about.

While in other operating systems swap is merely a safeguard against over-allocation of main memory, Linux uses swap space far more efficiently.

As you can see in Figure 1-12, virtual memory is composed of physical memory plus the disk or swap partitions. If the Linux virtual memory manager finds that a memory page has been allocated but not used for a significant amount of time, it moves that page into swap space.


You can often observe this with daemons such as getty, which are launched when the system starts but hardly ever used afterwards. It is more efficient to free the precious main memory their pages occupy and move those pages into the swap area.

This is how Linux manages swap, so there is no need to panic when you find the swap area 50% used. The fact that swap space is being used does not indicate a memory bottleneck; on the contrary, it proves how efficiently Linux manages system resources.


