This article is mainly a source-code analysis of swappiness (based on kernel version v4.14-13151-g5a787756b809). It reflects only my personal understanding; if anything is missing or wrong, corrections and discussion are welcome.
About Swap and swappiness
Swap (the swap partition) is the operating system's way of mitigating low memory. When memory is tight, the kernel decides, based on some configuration values and current statistics, whether to swap some anon memory (anonymous memory, that is, allocated memory) out to the swap partition.
Swappiness is a system parameter that adjusts how readily swap is used. The Linux documentation describes it as follows:
swappiness
This control is used to define how aggressive the kernel will swap
memory pages. Higher values will increase aggressiveness, lower values
decrease the amount of swap. A value of 0 instructs the kernel not to
initiate swap until the amount of free and file-backed pages is less
than the high water mark in a zone.
The default value is 60.
In plain terms:
This parameter defines how aggressively the kernel swaps out memory pages. A larger value increases that aggressiveness, and a lower value reduces the amount of swapping. A value of 0 tells the kernel not to start swapping until free plus file-backed pages drop below the high watermark of a zone. The default value is 60.
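To check the value a running system actually uses, the parameter can be read from /proc/sys/vm/swappiness. The following is a minimal sketch, not part of the kernel source; parse_sysctl_int and read_swappiness are hypothetical helper names chosen for illustration.

```c
#include <stdio.h>

/*
 * Parse the leading integer of a sysctl-style text value such as "60\n".
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 * (Hypothetical helper for this sketch.)
 */
int parse_sysctl_int(const char *text, int *out)
{
    return sscanf(text, "%d", out) == 1 ? 0 : -1;
}

/*
 * Read the system-wide swappiness from procfs.
 * Returns 0 on success, -1 if the file is missing or unreadable.
 */
int read_swappiness(int *out)
{
    char buf[32];
    FILE *f = fopen("/proc/sys/vm/swappiness", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_sysctl_int(buf, out);
}
```

A caller would declare an int, pass its address to read_swappiness, and treat a return value of -1 as "not available on this system".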
The word "aggressive" here is vague; it gives only a rough idea of what the value means. In some environments, users keep complaining that swap usage is high even though plenty of memory is still available.
Linux Memory Allocation
Linux memory allocations generally carry flags that affect the allocation process; they are not covered in detail here. This section is mostly about ordinary allocations (user-space allocations and most kernel-space allocations, which are able to wait for memory to be released).
__alloc_pages generally first walks the memory zones to find the first one with a large enough free block. If one zone is full, the next zone is searched. In particular, if cpusets are configured, it will trigger memory reclaim.
Swappiness mainly takes effect during memory reclaim.
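The zone walk described above can be pictured with a very simplified user-space model. This is only a sketch: struct zone_free_model and first_zone_with_room are hypothetical names, and the real allocator checks watermarks and allocation flags rather than a bare free-page count.

```c
#include <stddef.h>

/* Simplified stand-in for a memory zone; not the kernel's struct zone. */
struct zone_free_model {
    unsigned long free_pages;
};

/*
 * Return the index of the first zone that can satisfy the request,
 * or -1 if every zone is too full. In the -1 case a real allocator
 * would have to reclaim memory before retrying.
 */
int first_zone_with_room(const struct zone_free_model *zones,
                         size_t nr_zones, unsigned long nr_pages)
{
    size_t i;

    for (i = 0; i < nr_zones; i++) {
        if (zones[i].free_pages >= nr_pages)
            return (int)i;
    }
    return -1;
}
```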
How Reclaim Works
Basically, there are two ways to reclaim memory: one is to reclaim file-backed memory (the file cache), and the other is to swap anon memory (that is, allocated memory) out to the swap partition.
One of the goals of Linux memory management is to use as much memory as possible. When files are read and written, their cache stays in system memory, and there is no logic that proactively releases it until memory runs short. This allows the next read of a cached file to be served directly from memory, without disk IO, so file reads are faster.
As a result, even when a lot of memory is in fact still available, free memory can run short, which triggers the reclaim logic and swaps a portion of memory out to the swap partition.
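The gap between "free" and "available" memory is visible in /proc/meminfo, where MemFree, MemAvailable and Cached are reported separately. The following sketch extracts one field from text in that format; meminfo_field_kb is a hypothetical helper, and the caller is assumed to have already read the file contents into a string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up a field such as "MemFree" in /proc/meminfo-style text and
 * return its value in kB, or -1 if the field is not present.
 * The key must match the whole field name up to the colon.
 */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return -1;
}
```

On a machine with a large file cache, MemFree is typically much smaller than MemAvailable, which is the situation the paragraph above describes.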
How Swappiness Takes Effect
Swappiness is used in the get_scan_count function.
As the following code shows, this parameter has no effect when no swap space is left (or when swapping is not allowed for this reclaim pass).
/* If we have no swap space, do not bother scanning anon pages. */
if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0) {
    scan_balance = SCAN_FILE;
    goto out;
}
When reclaim is triggered by a memory cgroup hitting its limit (that is, it is not global reclaim) and swappiness is 0, only the file cache is scanned. Swapping out is not considered.
/*
 * Global reclaim will swap to prevent OOM even with no
 * swappiness, but memcg users want to use this knob to
 * disable swapping for individual groups completely when
 * using the memory controller's swap limit feature would be
 * too expensive.
 */
if (!global_reclaim(sc) && !swappiness) {
    scan_balance = SCAN_FILE;
    goto out;
}
When the system is close to OOM and swappiness is not 0, anon and file memory are scanned equally.
/*
 * Do not apply any pressure balancing cleverness when the
 * system is close to OOM, scan both anon and file equally
 * (unless the swappiness setting disagrees with swapping).
 */
if (!sc->priority && swappiness) {
    scan_balance = SCAN_EQUAL;
    goto out;
}
During global reclaim, when free pages plus file pages are at or below the sum of the zones' high watermarks, only anon memory is scanned, that is, swapped out. Combined with the branches above, this shows that with swappiness at 0 only the file cache is released until that threshold is reached, and swapping memory out to swap is considered only after it is reached.
/*
 * Prevent the reclaimer from falling into the cache trap: as
 * cache pages start out inactive, every cache fault will tip
 * the scan balance towards the file LRU. And as the file LRU
 * shrinks, so does the window for rotation from references.
 * This means we have a runaway feedback loop where a tiny
 * thrashing file LRU becomes infinitely more attractive than
 * anon pages. Try to detect this based on file LRU size.
 */
if (global_reclaim(sc)) {
    unsigned long pgdatfile;
    unsigned long pgdatfree;
    int z;
    unsigned long total_high_wmark = 0;

    pgdatfree = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
    pgdatfile = node_page_state(pgdat, NR_ACTIVE_FILE) +
                node_page_state(pgdat, NR_INACTIVE_FILE);

    for (z = 0; z < MAX_NR_ZONES; z++) {
        struct zone *zone = &pgdat->node_zones[z];
        if (!managed_zone(zone))
            continue;

        total_high_wmark += high_wmark_pages(zone);
    }

    if (unlikely(pgdatfile + pgdatfree <= total_high_wmark)) {
        /*
         * Force SCAN_ANON if there are enough inactive
         * anonymous pages on the LRU in eligible zones.
         * Otherwise, the small LRU gets thrashed.
         */
        if (!inactive_list_is_low(lruvec, false, memcg, sc, false) &&
            lruvec_lru_size(lruvec, LRU_INACTIVE_ANON, sc->reclaim_idx)
                >> sc->priority) {
            scan_balance = SCAN_ANON;
            goto out;
        }
    }
}
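The watermark comparison in this branch can be modeled in user space as follows. This is a sketch under simplifying assumptions: struct zone_wmark_model and file_cache_is_nearly_gone are hypothetical names, and the per-zone values are passed in directly rather than read from kernel counters.

```c
#include <stddef.h>

/* Simplified stand-in for a zone; only the fields this sketch needs. */
struct zone_wmark_model {
    int managed;              /* nonzero if the zone has managed pages */
    unsigned long high_wmark; /* high watermark, in pages */
};

/*
 * Returns 1 when free + file pages on the node are at or below the sum
 * of the high watermarks of its managed zones, 0 otherwise. This mirrors
 * the "pgdatfile + pgdatfree <= total_high_wmark" test quoted above.
 */
int file_cache_is_nearly_gone(const struct zone_wmark_model *zones,
                              size_t nr_zones,
                              unsigned long free_pages,
                              unsigned long file_pages)
{
    unsigned long total_high_wmark = 0;
    size_t z;

    for (z = 0; z < nr_zones; z++) {
        if (!zones[z].managed)
            continue; /* unmanaged zones contribute no watermark */
        total_high_wmark += zones[z].high_wmark;
    }
    return file_pages + free_pages <= total_high_wmark;
}
```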
When there are enough inactive file cache pages, only the file cache is released.
/*
 * If there is enough inactive page cache, i.e. if the size of the
 * inactive list is greater than that of the active list *and* the
 * inactive list actually has some pages to scan on this priority, we
 * do not reclaim anything from the anonymous working set right now.
 * Without the second condition we could end up never scanning an
 * lruvec even if it has plenty of old anonymous pages unless the
 * system is under heavy pressure.
 */
if (!inactive_list_is_low(lruvec, true, memcg, sc, false) &&
    lruvec_lru_size(lruvec, LRU_INACTIVE_FILE, sc->reclaim_idx) >> sc->priority) {
    scan_balance = SCAN_FILE;
    goto out;
}
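The early-exit branches shown so far can be condensed into one decision function. The following is a user-space sketch only: struct reclaim_inputs and pick_scan_balance are hypothetical, and each kernel check (watermark comparison, inactive list sizes) is flattened into a precomputed flag. The enum names follow the scan_balance values used in the quoted code.

```c
enum scan_balance { SCAN_EQUAL, SCAN_FRACT, SCAN_ANON, SCAN_FILE };

/* Hypothetical flattened view of the state get_scan_count looks at. */
struct reclaim_inputs {
    int may_swap;              /* sc->may_swap */
    long nr_swap_pages;        /* swap pages still available */
    int global_reclaim;        /* 1 for global reclaim, 0 for memcg reclaim */
    int priority;              /* sc->priority; 0 means close to OOM */
    int swappiness;            /* assumed to be in 0..100 */
    int file_below_high_wmark; /* free + file <= total high watermark */
    int enough_inactive_anon;  /* inactive anon list is worth scanning */
    int enough_inactive_file;  /* inactive file list is worth scanning */
};

/* Walk the branches in the same order as the quoted kernel code. */
enum scan_balance pick_scan_balance(const struct reclaim_inputs *in)
{
    if (!in->may_swap || in->nr_swap_pages <= 0)
        return SCAN_FILE;   /* nowhere to swap to */
    if (!in->global_reclaim && !in->swappiness)
        return SCAN_FILE;   /* memcg reclaim with swappiness 0 */
    if (!in->priority && in->swappiness)
        return SCAN_EQUAL;  /* close to OOM */
    if (in->global_reclaim && in->file_below_high_wmark &&
        in->enough_inactive_anon)
        return SCAN_ANON;   /* file cache nearly exhausted */
    if (in->enough_inactive_file)
        return SCAN_FILE;   /* plenty of inactive file cache */
    return SCAN_FRACT;      /* balance by swappiness */
}
```

Note that global reclaim with swappiness 0 can still reach SCAN_ANON in this model, which matches the documented behaviour of the value 0.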
Note that the general effect of swappiness begins here: anon_prio is set to the swappiness value, and file_prio is set to 200 - anon_prio.
scan_balance = SCAN_FRACT;

/*
 * With swappiness at 100, anonymous and file have the same priority.
 * This scanning priority is essentially the inverse of IO cost.
 */
anon_prio = swappiness;
file_prio = 200 - anon_prio;
anon_prio and file_prio are then used to compute ap and fp:
/*
 * The amount of pressure on anon vs file pages is inversely
 * proportional to the fraction of recently scanned pages on
 * each list that were recently referenced and in active use.
 */
ap = anon_prio * (reclaim_stat->recent_scanned[0] + 1);
ap /= reclaim_stat->recent_rotated[0] + 1;

fp = file_prio * (reclaim_stat->recent_scanned[1] + 1);
fp /= reclaim_stat->recent_rotated[1] + 1;
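To get a feel for the numbers, the same arithmetic can be run in user space. This is a sketch, not the kernel implementation: struct pressure and scan_pressure are hypothetical names, the recent_scanned and recent_rotated counters are passed in as plain arguments, and swappiness is assumed to be in the range 0 to 100.

```c
#include <stdint.h>

/* Resulting pressure on the anon (ap) and file (fp) lists. */
struct pressure {
    uint64_t ap;
    uint64_t fp;
};

/*
 * Reproduce the ap/fp arithmetic quoted above.
 * A list whose scanned pages were mostly rotated (still in use)
 * ends up with a smaller value, so it is scanned less.
 */
struct pressure scan_pressure(unsigned int swappiness,
                              uint64_t anon_scanned, uint64_t anon_rotated,
                              uint64_t file_scanned, uint64_t file_rotated)
{
    struct pressure p;
    uint64_t anon_prio = swappiness;
    uint64_t file_prio = 200 - anon_prio;

    p.ap = anon_prio * (anon_scanned + 1);
    p.ap /= anon_rotated + 1;

    p.fp = file_prio * (file_scanned + 1);
    p.fp /= file_rotated + 1;

    return p;
}
```

With the default swappiness of 60 and all counters at zero, this gives ap = 60 and fp = 140; with swappiness 0, ap is always 0, so the balance falls entirely on the file side.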
Other details and the subsequent algorithm are left for a later analysis.
Summary
Swappiness mainly comes into play when memory is tight (that is, when free memory is low). Specifically:
- When swappiness is 0, only the file cache is released as long as available memory is adequate; once available memory is insufficient, some memory is swapped out to swap space.
- When swappiness is not 0, its value mainly controls, each time memory is tight, the ratio between swapping memory out to swap and releasing file cache.
Note: Many people mistakenly believe that swappiness is a threshold on the percentage of remaining memory, below which memory starts to be swapped out. That is wrong.