http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
In a NUMA architecture, each physical CPU (a socket with multiple cores) becomes a node.
When NUMA is in effect (that is, when node interleaving is turned off), a node uses its own local memory and tries not to access another node's memory unless its local memory is exhausted.
# How Linux handles NUMA architectures
1 Processors are divided into nodes; on modern processors this is typically one node per physical processor, with multiple cores per processor.
2 The local memory modules attached to each processor are associated with that processor's node.
3 The communication cost between nodes (the node distance) is calculated.
The numactl --hardware command shows how Linux sees the NUMA layout:
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
node 0 size: 65512 MB
node 0 free: 2146 MB
node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
node 1 size: 65536 MB
node 1 free: 96 MB
node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
node 2 size: 65536 MB
node 2 free: 32362 MB
node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
node 3 size: 65536 MB
node 3 free: 21805 MB
node distances:
node   0   1   2   3
  0:  10  11  11  11
  1:  11  10  11  11
  2:  11  11  10  11
  3:  11  11  11  10
Total memory is distributed evenly across the nodes (about 64 GB each), although the amount of free memory on each node varies widely.
# How Linux handles resource allocation
Each process and thread inherits its parent's NUMA policy. The policy can be modified on a per-thread basis.
The policy defines the nodes, and even the individual cores, on which a process is allowed to be scheduled.
Each thread is initially assigned to the most suitable node to run on. A thread can also run elsewhere, but the scheduler tries to keep it running on its optimal node.
By default, memory is allocated on a specific node: the node the thread is currently running on.
In a UMA/SMP architecture all memory is treated equally; under NUMA, allocating memory that belongs to another node means extra access latency and degraded performance.
Once memory has been allocated on one node it is not moved to another node; regardless of the system's needs, it stays on that node.
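To see where a running process's memory has actually landed on each node, newer versions of the numactl package also ship a numastat that accepts a process selector (availability depends on the package version; mysqld below is only an example):
numastat -p $(pidof mysqld)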
The NUMA policy of any process can be modified, either by launching the program through numactl as a wrapper, or from code via libnuma.
For example, using numactl as a wrapper for the program (a sample invocation follows the list below):
1 Choosing the memory allocation policy:
Use the current node: --localalloc (the default mode)
Prefer one node but allow others to be used: --preferred=node
Always use a node or a set of nodes: --membind=nodes
Interleave (round-robin) across all nodes or a set of nodes: --interleave=all or --interleave=nodes
2 Choosing where the program runs:
Bind to a node or set of nodes (--cpunodebind=nodes) or to a core or set of cores (--physcpubind=cpus)
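As an illustration (the command name here is a placeholder, not from the original post), binding a program to node 0 for both CPU scheduling and memory allocation would look like:
numactl --cpunodebind=0 --membind=0 some_command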
# The meaning of NUMA for MySQL and InnoDB
InnoDB, and most other database servers (Oracle, for example), run on Linux as one huge multi-threaded process.
In a NUMA architecture memory is allocated node by node, and when you give more than 50% of the system's memory to a single process, that allocation cannot possibly be satisfied by one node alone.
When different queries run at the same time on different nodes, the processors cannot all get local access to the memory their queries need.
It turns out that this is a very important issue. Through /proc/<pid>/numa_maps you can see how all of mysqld's memory has been allocated, and an interesting pattern emerges if you look at the anon=<pages> values.
Here is one of the rows:
7ecf14000000 default anon=3584 dirty=3584 active=1024 n1=3584
7ecf14000000: the virtual memory address of the mapping
default: the NUMA policy for this memory range
anon=<pages>: the number of anonymous pages
dirty=<pages>: dirty (modified) pages
Pages that a process has allocated and used are normally dirty; after a fork, however, a process can have many copy-on-write page mappings that are not dirty.
swapcache=<pages>: pages that have associated entries on a swap device
If an active=<pages> field appears, it shows how many pages are on the active list, which also implies that some pages are inactive and may soon be swapped out.
N0=<pages>, N1=<pages>, ...: the number of pages allocated on each node
With a small script you can summarize the memory totals across all of the mappings:
perl /data0/script/numa-maps-summary.pl < /proc/4417/numa_maps
N0        :  392133 ( 1.50 GB)
N1        :  792466 ( 3.02 GB)
N2        :  531028 ( 2.03 GB)
N3        : 3743392 (14.28 GB)
active    : 4314131 (16.46 GB)
anon      : 5457149 (20.82 GB)
dirty     : 5456665 (20.82 GB)
mapmax    :     268 ( 0.00 GB)
mapped    :    1930 ( 0.01 GB)
swapcache :     484 ( 0.00 GB)
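If the summary script is not at hand, a rough equivalent can be sketched with awk; it assumes the usual 4 KB page size and only sums the per-node N<id>= fields:
awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^N[0-9]+=/) { split($i, a, "="); sum[a[1]] += a[2] } }
     END { for (n in sum) printf("%s: %d pages (%.2f GB)\n", n, sum[n], sum[n] * 4096 / 1024 / 1024 / 1024) }' /proc/4417/numa_maps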
Note that reading /proc/<pid>/numa_maps can block the process being inspected:
http://blog.wl0.org/2012/09/checking-procnuma_maps-can-be-dangerous-for-mysql-client-connections/
This affects not just MySQL but MongoDB as well.
According to the official documentation, Linux, NUMA and MongoDB do not get along well; if the current hardware is NUMA, you can effectively turn it off by interleaving memory when starting mongod:
# numactl --interleave=all sudo -u mongodb mongod --port xxx --logappend --logpath yyy --dbpath zzz
(Also set vm.overcommit_memory = 2, with vm.overcommit_ratio adjusted to match.)
vm.zone_reclaim_mode should also be set to 0.
The kernel allocates memory from a NUMA node, and when that node fills up, zone reclaim makes it reclaim memory on the local node rather than allocate from a remote node. Often that gives better overall performance, but in some workloads allocating from a remote node is cheaper than reclaiming local memory, so zone_reclaim_mode needs to be turned off.
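A minimal way to check and change this through the standard sysctl interface (persisting the value in /etc/sysctl.conf is one common approach):
sysctl vm.zone_reclaim_mode                           # show the current value
sysctl -w vm.zone_reclaim_mode=0                      # disable zone reclaim at runtime
echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf   # keep it across reboots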
numactl --interleave=all
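For mysqld the same approach applies: start it through numactl so that the buffer pool is interleaved across all nodes. The binary path and option below are illustrative, not taken from the original post:
numactl --interleave=all /usr/sbin/mysqld --defaults-file=/etc/my.cnf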
# The performance impact of NUMA on MySQL and InnoDB