SUSE Linux overcommit memory and oom-killer

Source: Internet
Author: User

Problem description:

In SUSE Linux (SLES 11), once memory overcommit is in effect, the system may invoke the oom-killer, which kills seemingly random system processes. There is also a very large kcore file in /proc.


Problem Analysis:

Please refer to the document: http://www.novell.com/support/kb/doc.php?id=7002775

I have extracted several important paragraphs below and added my own notes:

Situation:

Overcommit memory under SLES is a subject which is often misunderstood. The purpose of this TID is to address some common misunderstandings and provide resources for obtaining more information on this subject.

Note - In low memory conditions, overcommitting memory can lead to the oom-killer killing apparently random tasks. Managing the oom-killer will be discussed in detail in an upcoming TID. (Author's note: if memory overcommit is active, the oom-killer may kill processes that request additional virtual memory, such as a Java virtual machine; the choice of victim appears random.)

Resolution:

The definitive source of documentation for the behavior of overcommit memory is the Linux kernel source code. In particular, /usr/src/linux/mm/mmap.c (available when the kernel-source package is installed) is a good place to start.

As the source code can be difficult to follow, there is also documentation provided with the kernel-source package that explains overcommit memory in detail. This documentation can be found in the following file:
  • /usr/src/linux/Documentation/vm/overcommit-accounting

This file details the following 3 modes available for overcommit memory in the Linux kernel:
  • 0 - Heuristic overcommit handling.

  • 1 - Always overcommit.

  • 2 - Don't overcommit.

Mode 0 is the default mode for SLES servers. This allows processes to overcommit "reasonable" amounts of memory. If a process attempts to allocate an "unreasonable" amount of memory (as determined by internal heuristics), the memory allocation attempt is denied. In this mode, if your applications perform many small overcommit allocations, it is possible for the server to run out of memory. In this situation, the Out of Memory killer (oom-killer) will be used to kill processes until enough memory is available for the server to continue operating.

Mode 1 allows processes to commit as much memory as requested. These allocations will never result in an "out of memory" error. This mode is usually appropriate only in specific scientific applications.

Mode 2 prevents memory overcommit and limits the amount of memory that is available for a process to allocate. This mode ensures that processes will not be randomly killed by the oom-killer, and that there will always be enough memory for the kernel to operate properly. The total amount of memory available for use by the system is determined through the following calculation, where overcommit_ratio is a percentage:
  • Total Commit Memory = swap size + (RAM size * overcommit_ratio / 100)

By default, overcommit_ratio is set to 50. With this setting, the total commit memory size will be equal to the total amount of swap space in the server, plus 50% of the RAM. In other words, if a server has 1 GB of RAM and 1 GB of swap space, the system would have a total commit limit of 1.5 GB.
  • Note - The Red Hat documentation, Understanding Virtual Memory, is a good source of information on overcommit memory. (Other topics in that documentation have evolved since 2004.) However, there is an error in the "overcommit_ratio" section of this document. In this section, the calculation used to determine the allocatable memory is correct. However, in the text accompanying the calculation, the total amount of allocatable memory is incorrectly calculated as 2.5 GB (on a server with 1 GB of RAM and 1 GB of swap space). 1.5 GB is the correct value.
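The arithmetic above can be checked with a small shell sketch. The function name commit_limit_kb is hypothetical and only mirrors the formula quoted from the TID; it takes sizes in kB and treats overcommit_ratio as a percentage. The kernel's own CommitLimit also excludes memory reserved for huge pages, which this sketch ignores.

```shell
#!/bin/sh
# commit_limit_kb SWAP_KB RAM_KB RATIO
# Prints swap + RAM * ratio / 100 in kB, using integer arithmetic.
# Hypothetical helper: it only reproduces the formula from the text.
commit_limit_kb() {
    swap_kb=$1
    ram_kb=$2
    ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# 1 GB of RAM and 1 GB of swap (both 1048576 kB) with the default ratio of 50.
commit_limit_kb 1048576 1048576 50   # prints 1572864, i.e. 1.5 GB
```

The result, 1572864 kB, matches the 1.5 GB figure given in the TID rather than the 2.5 GB stated in the Red Hat article.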

To determine or change which overcommit mode a server is operating in, the following proc files are used:
  • /proc/sys/vm/overcommit_memory

  • /proc/sys/vm/overcommit_ratio

Echoing the number of the desired mode into overcommit_memory will immediately change the overcommit mode being used. If mode 2 is in use, the ratio is determined using the value in the overcommit_ratio file.
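As a rough illustration of working with these two files, the sketch below prints the current mode together with a short description. The helper overcommit_mode_name is a hypothetical name introduced here; the descriptions are taken from the mode list above. The commands that change the mode are left as comments because they need root and take effect immediately.

```shell
#!/bin/sh
# overcommit_mode_name MODE
# Maps a mode number to the description used in the kernel documentation.
# Hypothetical helper, not part of any system tool.
overcommit_mode_name() {
    case "$1" in
        0) echo "heuristic overcommit" ;;
        1) echo "always overcommit" ;;
        2) echo "do not overcommit" ;;
        *) echo "unknown mode: $1"; return 1 ;;
    esac
}

mode_file=/proc/sys/vm/overcommit_memory
ratio_file=/proc/sys/vm/overcommit_ratio
if [ -r "$mode_file" ] && [ -r "$ratio_file" ]; then
    mode=$(cat "$mode_file")
    echo "current mode: $mode ($(overcommit_mode_name "$mode"))"
    echo "overcommit_ratio: $(cat "$ratio_file")"
fi

# To switch to mode 2 with a ratio of 50 (as root):
#   echo 2 > /proc/sys/vm/overcommit_memory
#   echo 50 > /proc/sys/vm/overcommit_ratio
```

Values written this way are lost on reboot; a persistent setting would normally go into /etc/sysctl.conf as vm.overcommit_memory and vm.overcommit_ratio.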

To view the current memory statistics, check the following fields in /proc/meminfo:
  • CommitLimit - Overcommit limit

  • Committed_AS - Current memory amount committed
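One way to read these two fields together is sketched below. The function commit_usage is a hypothetical helper that parses a meminfo-style file (by default /proc/meminfo) with awk and reports how much of the commit limit is currently committed.

```shell
#!/bin/sh
# commit_usage [FILE]
# Prints CommitLimit, Committed_AS and their ratio from a meminfo-style file.
# Hypothetical helper; FILE defaults to /proc/meminfo.
commit_usage() {
    awk '
        $1 == "CommitLimit:"  { limit = $2 }
        $1 == "Committed_AS:" { used = $2 }
        END {
            if (limit > 0)
                printf "CommitLimit=%d kB Committed_AS=%d kB used=%.1f%%\n", limit, used, used * 100 / limit
            else
                print "CommitLimit not found"
        }
    ' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    commit_usage
fi
```

In mode 2 an allocation fails once Committed_AS would exceed CommitLimit, so a percentage close to 100 suggests that new allocations are about to be refused. In modes 0 and 1 the limit is reported but not enforced.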

The above describes the memory overcommit modes, which can be switched on or off. The idea behind overcommitting is to let the system hand out more memory than it physically has so that more programs can run, on the assumption that not all programs use their full allocation at the same time; this is similar to thin provisioning. However, if too much of the committed memory is actually used, the oom-killer is triggered.



The following is a description of overcommit memory: http://www.redhat.com/magazine/001nov04/features/vm/

overcommit_memory is a value which sets the general kernel policy toward granting memory allocations. If the value is 0, then the kernel checks to determine if there is enough memory free to grant a memory request to a malloc call from an application. If there is enough memory, then the request is granted. Otherwise, it is denied and an error code is returned to the application. If the value is set to 1, then the kernel grants allocations above the amount of physical RAM and swap in the system as defined by the overcommit_ratio value. Enabling this feature can be somewhat helpful in environments which allocate large amounts of memory expecting worst case scenarios but do not use it all. If the setting in this file is 2, the kernel allows all memory allocations, regardless of the current memory allocation state.

Note that this excerpt describes modes 1 and 2 the other way around from the kernel documentation summarized above, where mode 1 always overcommits and mode 2 enforces the limit derived from overcommit_ratio. The kernel documentation is the one to follow.



Solution:

Checking with ps shows that the processes together occupy about 4 GB, while most of the remaining memory is taken up by the page cache. The Linux kernel caches file system data in memory to maximize I/O speed. Although the page cache is released automatically when a process needs more memory, it cannot be ruled out that the memory is not released in time, or that the freed memory is too fragmented to satisfy the request.

Therefore, we need a way to cap the size of the page cache.

Linux provides the min_free_kbytes parameter, which sets the threshold at which the system starts reclaiming memory and so controls how much memory is kept free. The higher the value, the earlier the kernel starts to reclaim memory, and the more free memory is kept available.

[root@zyite-app01 root]# cat /proc/sys/vm/min_free_kbytes
163840
[root@zyite-app01 root]# echo 963840 > /proc/sys/vm/min_free_kbytes
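Before raising min_free_kbytes it may help to see what share of total RAM the new value would reserve, since a value that is too high can itself push the machine into out-of-memory conditions. The sketch below is only a sanity check: min_free_percent is a hypothetical helper, and the 8 GB machine in the example is an assumption, as the post does not state how much RAM the server has.

```shell
#!/bin/sh
# min_free_percent PROPOSED_KB MEMTOTAL_KB
# Prints the proposed min_free_kbytes as a whole-number percentage of RAM.
# Hypothetical helper for a quick sanity check.
min_free_percent() {
    echo $(( $1 * 100 / $2 ))
}

# The value from the post on an assumed 8 GB (8388608 kB) machine.
min_free_percent 963840 8388608   # prints 11

# The same check against the running system, if the files are readable.
if [ -r /proc/sys/vm/min_free_kbytes ] && [ -r /proc/meminfo ]; then
    current=$(cat /proc/sys/vm/min_free_kbytes)
    total=$(awk '$1 == "MemTotal:" { print $2 }' /proc/meminfo)
    if [ -n "$total" ] && [ "$total" -gt 0 ]; then
        echo "current min_free_kbytes is $(min_free_percent "$current" "$total")% of MemTotal"
    fi
fi

# To set the value as root without echo:
#   sysctl -w vm.min_free_kbytes=963840
```

To keep the setting across reboots, the line vm.min_free_kbytes = 963840 can be added to /etc/sysctl.conf.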

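Other optional temporary workarounds:

1. Adjust oom-killer behavior

cat /proc/sys/vm/oom_kill_allocating_task

echo 0 > /proc/sys/vm/oom_kill_allocating_task

vi /etc/sysctl.conf

vm.oom_kill_allocating_task = 0

This setting does not switch the oom-killer off. With 0, which is the kernel default, the oom-killer scans the task list and picks a victim heuristically; with a nonzero value it kills the task that triggered the out-of-memory condition.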

2. Clear cache (optional)
echo 1 > /proc/sys/vm/drop_caches
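To see how much memory the cache drop actually releases, the Cached field of /proc/meminfo can be read before and after. The sketch below assumes a hypothetical helper cached_kb; the commands that drop the cache are left as comments because they require root.

```shell
#!/bin/sh
# cached_kb [FILE]
# Prints the "Cached:" value (in kB) from a meminfo-style file.
# Hypothetical helper; FILE defaults to /proc/meminfo.
cached_kb() {
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "page cache before: $(cached_kb) kB"
fi

# As root, write dirty pages to disk first, then drop the clean page cache:
#   sync
#   echo 1 > /proc/sys/vm/drop_caches
# Running cached_kb again afterwards shows how much was released.
```

Note the space in "echo 1 >": without it, the shell reads "1>" as a redirection of standard output and writes only an empty line to drop_caches.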


This article is from the "Garden" blog, please be sure to keep this source http://ku881.blog.51cto.com/1840178/1299993
