A brief introduction to the hot and cold page mechanism in Linux


What are hot and cold pages?

The concept of hot and cold pages is introduced in the buddy system of the Linux kernel's physical memory management. A cold page is a free page that is no longer in the CPU cache (generally the L2 cache); a hot page is a free page that is still in the cache. Hot and cold pages are managed per CPU: for every zone, the kernel initializes a per-cpu pageset for each CPU.
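For orientation, the sketch below (abridged from kernel headers of roughly this era; unrelated fields are omitted and may differ between versions) shows how the pieces hang together: every zone holds one per_cpu_pageset per CPU, and the hot and cold lists live in the per_cpu_pages structure embedded in it, which is shown in full in the data-structure section further down.

/* Abridged sketch: the per-zone, per-CPU hook for the hot/cold lists. */
struct per_cpu_pageset {
	struct per_cpu_pages pcp;	/* the hot/cold page lists (see below) */
	/* ... NUMA and vmstat bookkeeping fields omitted ... */
};

struct zone {
	/* ... */
	struct per_cpu_pageset __percpu *pageset;	/* one instance per CPU */
	/* ... */
};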

Why have hot and cold pages?

The mechanism serves three purposes:

First, the buddy allocator benefits directly. If a hot page is handed out for an order-0 allocation, that page is already in the L2 cache, so a CPU write does not have to read the memory contents into the cache before modifying them; a cold page, by contrast, is not in the L2 cache. Preferring hot pages is therefore easy to understand, but when should a cold page be used? The answer: "While allocating a physical page frame, there is a bit specifying whether we would like a hot or a cold page (that is, a page likely to be in the CPU cache, or a page not likely to be there). If the page will be used by the CPU, a hot page will be faster. If the page will be used for device DMA, the CPU cache would be invalidated anyway, and a cold page does not waste precious cache contents."

Put more plainly: when the kernel allocates a physical page frame, the request carries an indication of whether a hot or a cold page is wanted. When the page frame will be used by the CPU, a hot page is allocated; when it will be used by a DMA device, a cold page is allocated, because DMA does not go through the CPU cache, so a hot page would only waste valuable cache contents.
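As a purely illustrative sketch (the helper grab_rx_buffer() below is hypothetical, not kernel code), a caller on a kernel of this era could steer the allocator towards a hot or a cold page through the gfp flags; __GFP_COLD is the bit referred to above, and note that it has since been removed from newer kernels.

#include <linux/types.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical helper, for illustration only. */
static struct page *grab_rx_buffer(bool for_dma)
{
	gfp_t gfp = GFP_KERNEL;

	/*
	 * For device DMA the CPU cache would be invalidated anyway, so ask
	 * for a cold page with __GFP_COLD; otherwise keep the default and
	 * get a hot page.  (__GFP_COLD exists in kernels of this era; it
	 * was removed in later kernels.)
	 */
	if (for_dma)
		gfp |= __GFP_COLD;

	return alloc_page(gfp);	/* order 0, so it is served from the pcp lists */
}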
Second, when the buddy system hands a free page from a zone to a process, it must first take the zone's spin lock and only then perform the allocation, so processes on several CPUs allocating pages at the same time contend for that lock. With the per-cpu pageset, simultaneous order-0 allocations on different CPUs no longer compete, and efficiency improves (this is visible in buffered_rmqueue() below: the order-0 path only disables local interrupts, while the order > 0 path has to take zone->lock). In addition, when a single page is freed it is first put back onto the per-cpu pageset rather than into the zone, again avoiding the zone spin lock; only when the number of pages on the per-CPU list exceeds a threshold is a batch of them returned to the buddy system.

Third, keeping a hot and cold page list per CPU tends to keep a given page on one CPU, which helps raise the cache hit rate.

Data structure of hot and cold pages

 struct per_cpu_pages {
	int count;		/* number of pages in the lists */
	int high;		/* high watermark, emptying needed */
	int batch;		/* chunk size for buddy add/remove */

	/*
	 * Lists of pages, one per migrate type stored on the pcp-lists:
	 * each CPU keeps, for each zone, MIGRATE_PCPTYPES hot and cold
	 * page lists (one per migration type).
	 */
	struct list_head lists[MIGRATE_PCPTYPES];
 };

In Linux, on a UMA system, hot and cold pages are managed on the same linked list: hot pages at the front, cold pages at the rear. When a CPU frees an order-0 page and the number of pages in the per-cpu pageset is below the threshold, the freed page is inserted at the head of the hot and cold list. As more hot pages are inserted at the head, pages inserted earlier drift towards the tail, so the chance that a page ages from hot to cold steadily increases.
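The free path makes this ordering explicit. Below is an abridged sketch of free_hot_cold_page() from kernels of the same era (page preparation, statistics and migrate-type corner cases are trimmed): a hot page is inserted at the head of the per-CPU list, a cold page at the tail, and once the count reaches the high watermark a batch of pages is handed back to the buddy system. Inserting hot pages at the head is exactly what lets older entries age towards the tail.

/* Abridged sketch of free_hot_cold_page(); error paths and stats trimmed. */
void free_hot_cold_page(struct page *page, int cold)
{
	struct zone *zone = page_zone(page);
	struct per_cpu_pages *pcp;
	unsigned long flags;
	int migratetype = get_pageblock_migratetype(page);

	local_irq_save(flags);
	pcp = &this_cpu_ptr(zone->pageset)->pcp;
	if (cold)
		list_add_tail(&page->lru, &pcp->lists[migratetype]);	/* cold: tail */
	else
		list_add(&page->lru, &pcp->lists[migratetype]);		/* hot: head */
	pcp->count++;

	/* Too many cached free pages: return one batch to the buddy system. */
	if (pcp->count >= pcp->high) {
		free_pcppages_bulk(zone, pcp->batch, pcp);
		pcp->count -= pcp->batch;
	}
	local_irq_restore(flags);
}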

How hot and cold pages are allocated

The hot and cold page mechanism only handles single-page allocations. When an order-0 page is allocated, the kernel first finds the appropriate zone and then uses the requested migratetype to locate the corresponding hot and cold list (each zone keeps, for every CPU, three hot and cold lists, one per migrate type: MIGRATE_UNMOVABLE, MIGRATE_RECLAIMABLE and MIGRATE_MOVABLE). If a hot page is required, a page is removed from the head of the list (the "hottest" page); if a cold page is required, a page is removed from the tail (the "coldest" page).

The allocation function (with comments added at the key points):

 /*
  * Really, prep_compound_page() should be called from __rmqueue_bulk().  But
  * we cheat by calling it from here, in the order > 0 path.  Saves a branch or two.
  */
 static inline struct page *buffered_rmqueue(struct zone *preferred_zone,
		struct zone *zone, int order, gfp_t gfp_flags, int migratetype)
 {
	unsigned long flags;
	struct page *page;
	/* the __GFP_COLD allocation flag asks for a cold page */
	int cold = !!(gfp_flags & __GFP_COLD);

 again:
	if (likely(order == 0)) {
		struct per_cpu_pages *pcp;
		struct list_head *list;

		local_irq_save(flags);
		pcp = &this_cpu_ptr(zone->pageset)->pcp;
		list = &pcp->lists[migratetype];
		if (list_empty(list)) {
			/* the pcp list is empty: refill it from the buddy system */
			pcp->count += rmqueue_bulk(zone, 0, pcp->batch,
						   list, migratetype, cold);
			if (unlikely(list_empty(list)))
				goto failed;
		}
		if (cold)	/* cold page: take it from the list tail (list->prev) */
			page = list_entry(list->prev, struct page, lru);
		else		/* hot page: take it from the list head (list->next) */
			page = list_entry(list->next, struct page, lru);
		/* the page frame is allocated: remove it from the hot/cold list */
		list_del(&page->lru);
		pcp->count--;
	} else {
		/* order != 0 (more than one page): do not use the hot/cold lists */
		if (unlikely(gfp_flags & __GFP_NOFAIL)) {
			/*
			 * __GFP_NOFAIL is not to be used in new code; all its
			 * callers should be fixed to properly detect and handle
			 * allocation failures, and we most definitely don't want
			 * callers allocating order > 1 units with __GFP_NOFAIL.
			 */
			WARN_ON_ONCE(order > 1);
		}
		spin_lock_irqsave(&zone->lock, flags);
		page = __rmqueue(zone, order, migratetype);
		spin_unlock(&zone->lock);
		if (!page)
			goto failed;
		__mod_zone_page_state(zone, NR_FREE_PAGES, -(1 << order));
	}

	__count_zone_vm_events(PGALLOC, zone, 1 << order);
	zone_statistics(preferred_zone, zone, gfp_flags);
	local_irq_restore(flags);

	VM_BUG_ON(bad_range(zone, page));
	if (prep_new_page(page, order, gfp_flags))
		goto again;
	return page;

 failed:
	local_irq_restore(flags);
	return NULL;
 }

That is the entire content of this article; I hope it is helpful for your study.
