Linux memory subsystem and common tuning parameters

Last Update:2017-06-06 Source: Internet

Author: User

Tags message queue cpu usage


Memory subsystem and common tuning parameters

Memory Subsystem Components

Slab Allocator

Buddy system

kswapd

pdflush

MMU

Virtualized environments:

PA (process address) --> HA (virtual machine address) --> MA (physical machine address)

Translation inside the virtual machine: PA --> HA

GuestOS is the kernel of the virtual machine; OS is the kernel of the physical machine

Shadow PT (shadow page table)

Memory:

TLB: caches address translations to improve performance

Hugepages: large memory pages

# cat /proc/meminfo | grep -i Huge
AnonHugePages:         0 kB
HugePages_Total:       0        (0 means hugepages are not enabled)
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Enabling hugepages

Method 1: permanent (add the entry to the configuration file, then apply it with sysctl -p)

# vim /etc/sysctl.conf
vm.nr_hugepages = 10

Method 2: temporary (takes effect immediately, lost on reboot)

# sysctl -w vm.nr_hugepages=10
vm.nr_hugepages = 10
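
As a rough aid for choosing vm.nr_hugepages, the number of huge pages needed to back a region of a given size can be worked out from the Hugepagesize value shown in /proc/meminfo. The sketch below is only an illustration; the function name hugepages_needed and the 20 MB figure are assumptions, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: how many huge pages are needed to back a region.
# $1 = region size in MB, $2 = huge page size in KB (Hugepagesize).
hugepages_needed() {
    size_kb=$(( $1 * 1024 ))
    # Round up so that a partially used page still gets a whole huge page.
    echo $(( (size_kb + $2 - 1) / $2 ))
}

# 20 MB backed by 2048 KB pages
hugepages_needed 20 2048
```

With 2048 KB pages, 20 MB works out to 10 pages, which is the value used for vm.nr_hugepages above.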

Mount it as a file system:

# mkdir /hugepages
# mount -t hugetlbfs none /hugepages

Test

# dd if=/dev/zero of=/hugepages/a.test bs=100M count=1000000
# ll -h /hugepages/
total 0
-rw-r--r--. 1 root root 0 Jun  5 12:18 a.test

The size is 0 because this file system is backed by memory. Users cannot use it directly (they cannot copy files into it or write new files); only processes can use it.

The strace tracing command

1. strace -p PID: trace how an already running process uses system resources

-o  Write the trace output to the specified file.

-p  Specify the PID of the process to trace.

-c  Report a summary of the overall results.

2. strace COMMAND: trace a command as it runs

# strace cat /etc/fstab   # trace the execution of a command

Reduce the overhead of small memory objects:
Slab
Reduce the service time of slow subsystems:
Cache file metadata with the buffer cache
Cache disk I/O with the page cache
Use shared memory (SHM) for interprocess communication
Improve network I/O performance with the buffer cache, ARP cache, and connection tracking

Overcommit:

CPU overcommit: the total number of virtual CPUs across all virtual machines exceeds the number of CPUs on the physical machine.

Memory overcommit: more memory is committed than the physical memory available, with swap as a precondition.

Using swap:

# cat /proc/sys/vm/overcommit_memory
0  Heuristic overcommit: the kernel itself decides how much to overcommit.

1  Always overcommit. On database servers, avoid using swap as much as possible.

2  Do not commit beyond a fixed limit: all of swap plus a percentage of physical memory (set by overcommit_ratio).

# cat /proc/sys/vm/overcommit_ratio
50  The percentage of physical memory that counts toward the limit when overcommit_memory is 2, here 50%. Generally it should be no more than 50% (make sure that this 50% of physical memory does not exceed the swap space).
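
To make the mode 2 arithmetic concrete: the commit limit is swap plus overcommit_ratio percent of physical memory. The sketch below is a simplified estimate that ignores memory reserved for huge pages; the function name commit_limit_kb and the sample sizes (2 GB swap, 8 GB RAM) are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: approximate commit limit when overcommit_memory=2.
# $1 = swap in KB, $2 = physical RAM in KB, $3 = overcommit_ratio (percent).
commit_limit_kb() {
    echo $(( $1 + $2 * $3 / 100 ))
}

# 2 GB of swap, 8 GB of RAM, ratio 50
commit_limit_kb 2097152 8388608 50
```

On a live system the kernel's own figure is the CommitLimit line in /proc/meminfo, which is the value to trust.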

When memory is exhausted, the OOM killer kills processes.

# ls /proc/1

oom_score records the OOM score of each process; the process with the highest score is treated as the offender and killed first.

oom_adj adjusts the score: you can make a process a preferred target, or protect it so that it is among the last to be killed.

slabtop monitors the state of all slabs on the system in real time.

# cat /proc/slabinfo   # view the size of each slab cache
tw_sock_TCPv6 0 0 1 : tunables 54 27 8 : slabdata 0 0 0

limit = 54: the maximum number of objects that can be cached per CPU; adjustable.

batchcount = 27: how many objects are transferred to a CPU's cache at one time when that cache is empty; adjustable.

shared = 8: how many objects can be shared between CPUs (the shared cache); adjustable.

To adjust the amount of memory for the cache:

# echo 'tw_sock_TCPv6 108 54 8' > /proc/slabinfo

How to tune the ARP cache for network I/O:

Soft limit: may be exceeded temporarily; the default is 512.

Hard limit: must never be exceeded; the default is 1024.

GC: the garbage collector. By default it does not clean up automatically while the cache holds fewer than 128 entries.

# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.0.149    0x1         0x2         28:d2:44:8e:5c:16     *        eth0
192.168.0.163    0x1         0x2         08:ed:b9:12:c1:6d     *        eth0

# ip neighbor list   # display the cache entries
192.168.0.163 dev eth0 lladdr 08:ed:b9:12:c1:6d REACHABLE
192.168.0.146 dev eth0 lladdr b4:b5:2f:dc:aa:72 STALE

# ip neighbor flush dev eth0   # flush all cache entries on eth0

# ls -l /proc/sys/net/ipv4/neigh/default
total 0
-rw-r--r-- 1 root root 0 Jun  5 16:46 gc_interval
-rw-r--r-- 1 root root 0 Jun  5 16:46 gc_thresh1
-rw-r--r-- 1 root root 0 Jun  5 16:46 gc_thresh2
-rw-r--r-- 1 root root 0 Jun  5 16:46 gc_thresh3

gc_thresh1: the cleanup threshold, default 128. Once there are more than 128 entries, expired entries are allowed to remain for 5 minutes and are then cleaned up automatically by the GC.

gc_thresh2: the soft limit, default 512. Entries beyond the soft limit are only allowed to exist for 5 seconds, and the count can never exceed the hard limit.

gc_thresh3: the hard limit, default 1024.

gc_interval: how often, in seconds, the GC checks for expired entries.
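
The three thresholds can be read as a simple classification of the current number of ARP entries. The sketch below is an illustration of that reading only, not the kernel's actual logic; the function name neigh_gc_state, the exact boundary handling, and the output wording are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: classify an entry count against the GC thresholds.
# $1 = current number of entries, $2 $3 $4 = gc_thresh1, gc_thresh2, gc_thresh3.
neigh_gc_state() {
    if [ "$1" -lt "$2" ]; then
        echo "below gc_thresh1: no cleanup"
    elif [ "$1" -le "$3" ]; then
        echo "within soft limit: periodic cleanup"
    elif [ "$1" -lt "$4" ]; then
        echo "above soft limit: cleaned up quickly"
    else
        echo "at hard limit: no room for more entries"
    fi
}

# 600 entries against the default thresholds
neigh_gc_state 600 128 512 1024
```

On a real system the entry count could be taken from `ip neighbor list | wc -l` and the thresholds from the files under /proc/sys/net/ipv4/neigh/default.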

Page cache: reduces disk I/O by keeping the contents of files that have been read in memory.

lowmem_reserve_ratio: how much space is reserved when memory is low; on a 64-bit operating system it does not need to be adjusted.

vfs_cache_pressure: controls the kernel's tendency to reclaim the memory used for caching inodes and directory entries.

# cat /proc/sys/vm/lowmem_reserve_ratio
256 256 32

How much space is reserved when memory is low; on a 64-bit operating system it does not need to be adjusted.

# cat /proc/sys/vm/vfs_cache_pressure
100

Lowering this value makes the kernel less inclined to reclaim this memory, which is the usual optimization. At 0 nothing is reclaimed, which may lead to running out of memory; between 0 and 100 the kernel tends not to reclaim; above 100 it is more inclined to reclaim.

# cat /proc/sys/vm/page-cluster
3

page-cluster controls how many pages are transferred at a time between memory and the swap partition. The value is an exponent of 2; the default is 3.

=1 means 2 to the power of 1 (2) pages are swapped at a time

=2 means 2 to the power of 2 (4) pages are swapped at a time

...
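
Since page-cluster is an exponent of 2, the number of pages handled per swap operation for any setting can be computed with a shift. This is a minimal sketch; the function name pages_per_swap_io is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: pages handled per swap operation.
# $1 = value of /proc/sys/vm/page-cluster; the result is 2 to the power of $1.
pages_per_swap_io() {
    echo $(( 1 << $1 ))
}

# The default setting of 3
pages_per_swap_io 3
```

The default of 3 therefore corresponds to 8 pages at a time, and 0 to a single page.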

# cat /proc/sys/vm/zone_reclaim_mode
0

zone_reclaim_mode controls how aggressively memory is reclaimed within a memory zone. The values below can be added together:

1  zone reclaim is turned on

2  reclaim pages dirtied by write operations (dirty pages are written out)

4  reclaim pages by swapping them out
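
The values 1, 2, and 4 are bit flags, so a setting such as 3 means that bits 1 and 2 are both enabled. The sketch below decodes a value into its flags; the function name describe_zone_reclaim_mode and the label strings are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: decode a zone_reclaim_mode value into its bit flags.
# $1 = value read from /proc/sys/vm/zone_reclaim_mode.
describe_zone_reclaim_mode() {
    mode=$1
    out=""
    if [ $(( mode & 1 )) -ne 0 ]; then out="$out reclaim-on"; fi
    if [ $(( mode & 2 )) -ne 0 ]; then out="$out write-dirty-pages"; fi
    if [ $(( mode & 4 )) -ne 0 ]; then out="$out swap-pages"; fi
    if [ -z "$out" ]; then out=" off"; fi
    # Strip the leading space added while building the list.
    echo "${out# }"
}

describe_zone_reclaim_mode 3
```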

Anonymous pages

They hold the data generated by a program or process itself.

IPC: communication between processes is also done through anonymous pages.

Anonymous pages = RSS - shared

# grep Anon /proc/meminfo   # view the size of anonymous pages
AnonPages:         16104 kB
AnonHugePages:         0 kB

# cat /proc/PID/statm   # view the anonymous pages of one process
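
The formula "anonymous pages = RSS - shared" can be applied directly to a statm line, whose second and third fields are the resident and shared page counts. The sketch below assumes that field order (size, resident, shared, text, lib, data, dt); the function name anon_pages_from_statm and the sample line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: anonymous pages of a process from one statm line.
# $1 = a line in /proc/PID/statm format; all fields are counted in pages.
anon_pages_from_statm() {
    # Split the line into positional parameters: $2 = resident, $3 = shared.
    set -- $1
    echo $(( $2 - $3 ))
}

# Sample line: 800 resident pages, 300 of them shared
anon_pages_from_statm "2500 800 300 10 0 400 0"
```

For a live process the line can be supplied with `anon_pages_from_statm "$(cat /proc/PID/statm)"`; the result is in pages, so multiply by the page size (`getconf PAGESIZE`) to get bytes.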

Interprocess communication management commands

# ipcs -l   # view the current IPC limits

------ Shared Memory Limits --------
max number of segments = 4096                    maximum number of shared memory segments
max seg size (kbytes) = 67108864                 maximum size of a single segment
max total shared memory (kbytes) = 17179869184   maximum total shared memory allowed system-wide
min seg size (bytes) = 1                         the minimum segment size is one byte

------ Messages: Limits --------
max queues system wide = 1954                    maximum number of queues system-wide
max size of message (bytes) = 65536              maximum size of each message
default max size of queue (bytes) = 65536        default maximum size of the messages one queue can hold

ipcrm removes a message queue.

Shared memory parameters:

# cat /proc/sys/kernel/shmmni   # system-wide maximum number of shared memory segments
# cat /proc/sys/kernel/shmall   # system-wide maximum number of pages that can be allocated as shared memory
# cat /proc/sys/kernel/shmmax   # maximum size of a single shared memory segment
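
Because shmall is counted in pages while ipcs -l reports the total in kbytes, converting between the two needs the page size. The sketch below shows the conversion; the function name shmall_to_kbytes is an assumption, and the shmall value and 4096-byte page size are illustrative figures, not values read from the system above.

```shell
#!/bin/sh
# Hypothetical helper: convert shmall (in pages) to kilobytes.
# $1 = shmall in pages, $2 = page size in bytes (see: getconf PAGESIZE).
shmall_to_kbytes() {
    echo $(( $1 * $2 / 1024 ))
}

# Assumed values: 2097152 pages of 4096 bytes each
shmall_to_kbytes 2097152 4096
```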

Message parameters (ipcs -p views the queues):

# cat /proc/sys/kernel/msgmnb   # maximum size of a single message queue, in bytes
65536
# cat /proc/sys/kernel/msgmni   # system-wide maximum number of message queues
1954
# cat /proc/sys/kernel/msgmax   # maximum size of a single message, in bytes
65536

pdflush: frees up memory by writing the dirty pages in memory out to disk

# cat /proc/sys/vm/nr_pdflush_threads   # how many pdflush threads are currently running
0
# cat /proc/sys/vm/dirty_background_ratio   # percentage of total memory that dirty pages must reach before background writeback starts
# cat /proc/sys/vm/dirty_ratio   # percentage of total memory that a single process's dirty pages must reach before they are written out
# cat /proc/sys/vm/dirty_expire_centisecs   # how long a dirty page may stay in memory before it is considered expired and must be written out; the unit is 1/100 of a second
# cat /proc/sys/vm/dirty_writeback_centisecs   # interval at which pdflush wakes up periodically; 0 disables it; the unit is 1/100 of a second

Manually writing out dirty pages and releasing caches: sync first, then release

The sync command

echo s > /proc/sysrq-trigger

echo 3 > /proc/sys/vm/drop_caches

1 releases the page cache

2 releases dentries and inodes

3 releases the page cache, dentries, and inodes


This article is from the "Operation and maintenance Growth Road" blog. Reprinting is declined.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
