Per-CPU allocation and freeing of single page frames


The kernel frequently allocates and frees single page frames. To improve performance, each memory zone defines a per-CPU page frame cache; each per-CPU cache holds pre-allocated page frames that are used to satisfy single-page requests issued by the local CPU.

Each zone provides two caches per CPU: a hot cache, whose page frames most likely still have their contents in the CPU's hardware cache, and a cold cache.
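As an illustration (a minimal sketch, not from the original post; the helper function name is made up), a caller that will not touch the page data right away can ask for a cold page with __GFP_COLD:

#include <linux/gfp.h>
#include <linux/mm.h>

/* Sketch: request a page from the cold end of the per-CPU list, leaving the
 * cache-hot pages for callers that will touch their data immediately, e.g.
 * when this page is about to be overwritten by DMA anyway. */
static struct page *grab_cold_page(void)
{
	return alloc_page(GFP_KERNEL | __GFP_COLD);
}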


Within a memory zone, single pages are allocated through the per-CPU mechanism, while multi-page allocations use the buddy algorithm. The structure is shown in the following diagram:

[Figure: partitioned page frame allocator example]


1. Related structures

The pageset member of the zone structure refers to the zone's per-CPU management structures; NR_CPUS is the number of CPUs the system supports:

struct zone {
	...
#ifdef CONFIG_NUMA	/* with CONFIG_NUMA, pageset is an array of pointers; otherwise an array of structures */
	struct per_cpu_pageset	*pageset[NR_CPUS];
#else
	struct per_cpu_pageset	pageset[NR_CPUS];
#endif
	...
};

The per_cpu_pageset structure is defined as follows:

struct per_cpu_pageset {
	struct per_cpu_pages pcp;	/* per_cpu_pages holds the page frame count, the lists and related information */
#ifdef CONFIG_NUMA
	s8 expire;
#endif
#ifdef CONFIG_SMP
	s8 stat_threshold;
	s8 vm_stat_diff[NR_VM_ZONE_STAT_ITEMS];
#endif
} ____cacheline_aligned_in_smp;

This structure is used only inside struct zone and is accessed through the zone_pcp macro, which yields a struct per_cpu_pageset pointer in both configurations:

#ifdef CONFIG_NUMA
#define zone_pcp(__z, __cpu) ((__z)->pageset[(__cpu)])
#else
#define zone_pcp(__z, __cpu) (&(__z)->pageset[(__cpu)])
#endif


struct per_cpu_pages {
	int count;	/* total number of pages on the zone's per-CPU lists */
	int high;	/* per-CPU high watermark */
	int batch;	/* chunk size added to or removed from the buddy system */

	/* per-CPU array of lists, one per migrate type, which helps limit memory fragmentation */
	struct list_head lists[MIGRATE_PCPTYPES];
};

2. Per-CPU cache initialization


static __meminit void zone_pcp_init(struct zone *zone)
{
	int cpu;
	unsigned long batch = zone_batchsize(zone);	/* compute the batch size from the zone's page frame count */

	for (cpu = 0; cpu < NR_CPUS; cpu++) {		/* initialize every CPU */
#ifdef CONFIG_NUMA
		/* Early boot. Slab allocator not functional yet */
		zone_pcp(zone, cpu) = &boot_pageset[cpu];
		setup_pageset(&boot_pageset[cpu], 0);
#else
		setup_pageset(zone_pcp(zone, cpu), batch);	/* initialize the per_cpu_pages structure */
#endif
	}
	if (zone->present_pages)
		printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%lu\n",
			zone->name, zone->present_pages, batch);
}
static int zone_batchsize(struct zone *zone)
{
#ifdef CONFIG_MMU
	int batch;

	batch = zone->present_pages / 1024;
	if (batch * PAGE_SIZE > 512 * 1024)
		batch = (512 * 1024) / PAGE_SIZE;
	batch /= 4;		/* We effectively *= 4 below */
	if (batch < 1)
		batch = 1;

	batch = rounddown_pow_of_two(batch + batch/2) - 1;
	return batch;
#else
	return 0;
#endif
}
static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
{
	struct per_cpu_pages *pcp;
	int migratetype;

	memset(p, 0, sizeof(*p));	/* zero-fill the memory p points to */

	pcp = &p->pcp;
	pcp->count = 0;
	pcp->high = 6 * batch;
	pcp->batch = max(1UL, 1 * batch);
	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++)
		INIT_LIST_HEAD(&pcp->lists[migratetype]);
}
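A worked example: for a 256 MiB zone with 4 KiB pages (65536 page frames), zone_batchsize computes batch = 65536 / 1024 = 64; 64 * 4096 bytes is under the 512 KiB cap, so no clamping happens; batch /= 4 gives 16; and rounddown_pow_of_two(16 + 16/2) - 1 = 16 - 1 = 15. setup_pageset then sets pcp->batch = 15 and pcp->high = 6 * 15 = 90.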


The above is the per-CPU initialization process. Next, let us see how the system allocates and frees a single page frame.

The buffered_rmqueue function handles allocation, and the free_hot_cold_page function handles freeing. (In 2.6.32 the allocation path is roughly alloc_pages() -> __alloc_pages_nodemask() -> get_page_from_freelist() -> buffered_rmqueue().)

3. Allocating a single page frame


buffered_rmqueue()
{
	......
	int cold = !!(gfp_flags & __GFP_COLD);	/* the caller passed the __GFP_COLD flag, so set the cold flag */

	if (likely(order == 0)) {
		struct per_cpu_pages *pcp;
		struct list_head *list;

		pcp = &zone_pcp(zone, cpu)->pcp;	/* per-CPU page cache object of the current CPU */
		list = &pcp->lists[migratetype];	/* allocate from the per-CPU list that matches the requested migrate type */
		local_irq_save(flags);			/* disable interrupts */
		if (list_empty(list)) {
			pcp->count += rmqueue_bulk(zone, 0,
					pcp->batch, list,
					migratetype, cold);
			if (unlikely(list_empty(list)))
				goto failed;
		}

		if (cold)	/* cold request: take the last node of the list */
			page = list_entry(list->prev, struct page, lru);
		else		/* hot request: take the first node, the page most recently freed to this CPU and therefore the cache-hottest */
			page = list_entry(list->next, struct page, lru);

		list_del(&page->lru);	/* remove the page from the per-CPU cache list */
		pcp->count--;		/* and decrement the per-CPU cache count */
	}
	......
}
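Note the fast path here: when the per-CPU list already holds pages, an order-0 allocation costs only local_irq_save, a list_del and a counter decrement. The zone->lock spinlock is taken only inside rmqueue_bulk, when the list must be refilled from the buddy system.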


When the per-CPU cache list is empty, the rmqueue_bulk function is called to move batch page frames in bulk from the buddy system onto the per-CPU list:

static int rmqueue_bulk(struct zone *zone, unsigned int order,
			unsigned long count, struct list_head *list,
			int migratetype, int cold)
{
	int i;

	spin_lock(&zone->lock);
	for (i = 0; i < count; ++i) {
		struct page *page = __rmqueue(zone, order, migratetype);	/* take a page from the buddy system; see the previous post for the implementation */
		if (unlikely(page == NULL))
			break;
		if (likely(cold == 0))		/* no cold flag, so a hot page: add at the list head */
			list_add(&page->lru, list);
		else				/* cold page: add at the list tail */
			list_add_tail(&page->lru, list);
		set_page_private(page, migratetype);
		list = &page->lru;
	}
	__mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));	/* update the free page count */
	spin_unlock(&zone->lock);
	return i;
}
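Continuing the worked example above (batch = 15): the first order-0 allocation finds its list empty, so rmqueue_bulk pulls 15 page frames out of the buddy system; buffered_rmqueue then takes one of them, leaving pcp->count at 14.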


4. Freeing a page frame

Freeing mirrors allocation: a single page frame is returned to the per-CPU cache, while multiple pages are returned to the buddy system.

void __free_pages(struct page *page, unsigned int order)
{
	if (put_page_testzero(page)) {	/* the page can be freed only once its reference count drops to zero */
		trace_mm_page_free_direct(page, order);
		if (order == 0)		/* a single page frame: use the per-CPU mechanism */
			free_hot_page(page);
		else			/* multiple page frames: use the buddy mechanism */
			__free_pages_ok(page, order);
	}
}

The free_hot_page function is a thin wrapper around free_hot_cold_page, calling it with cold == 0.
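As a hedged usage sketch (again not from the original post; the demo function is hypothetical), the allocation order decides which free path runs:

#include <linux/gfp.h>
#include <linux/mm.h>

static void free_path_demo(void)
{
	struct page *one  = alloc_page(GFP_KERNEL);	/* order 0 */
	struct page *four = alloc_pages(GFP_KERNEL, 2);	/* order 2: four contiguous page frames */

	if (one)
		__free_pages(one, 0);	/* order == 0: free_hot_page() -> per-CPU hot list */
	if (four)
		__free_pages(four, 2);	/* order > 0: __free_pages_ok() -> buddy system */
}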

The approximate flow of the free path is as follows:

[Figure: freeing a single page frame]

static void free_hot_cold_page(struct page *page, int cold)
{
	struct zone *zone = page_zone(page);	/* from page->flags, get the descriptor of the zone containing the page */
	struct per_cpu_pages *pcp;
	unsigned long flags;
	int migratetype;
	int wasMlocked = __TestClearPageMlocked(page);

	kmemcheck_free_shadow(page, 0);

	if (PageAnon(page))		/* the page is anonymous */
		page->mapping = NULL;
	if (free_pages_check(page))	/* check the page for errors */
		return;

	if (!PageHighMem(page)) {
		debug_check_no_locks_freed(page_address(page), PAGE_SIZE);
		debug_check_no_obj_freed(page_address(page), PAGE_SIZE);
	}
	arch_free_page(page, 0);	/* non-empty only if the architecture defines HAVE_ARCH_FREE_PAGE and has its own release function (empty on x86) */
	kernel_map_pages(page, 1, 0);

	pcp = &zone_pcp(zone, get_cpu())->pcp;	/* get the PCP of this zone for the current CPU */
	migratetype = get_pageblock_migratetype(page);
	set_page_private(page, migratetype);
	local_irq_save(flags);	/* disable interrupts on the current CPU and save the interrupt state */
	if (unlikely(wasMlocked))
		free_page_mlock(page);
	__count_vm_event(PGFREE);

	/*
	 * Only unmovable, reclaimable and movable pages may be placed in the
	 * per-CPU page frame cache; pages of any other migrate type are
	 * released back to the buddy system.
	 */
	if (migratetype >= MIGRATE_PCPTYPES) {
		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
			free_one_page(zone, page, 0, migratetype);
			goto out;
		}
		migratetype = MIGRATE_MOVABLE;
	}

	if (cold)	/* cold page: add at the list tail; hot page: add at the head */
		list_add_tail(&page->lru, &pcp->lists[migratetype]);
	else
		list_add(&page->lru, &pcp->lists[migratetype]);
	pcp->count++;
	if (pcp->count >= pcp->high) {
		free_pcppages_bulk(zone, pcp->batch, pcp);
		pcp->count -= pcp->batch;
	}
out:
	local_irq_restore(flags);
	put_cpu();
}
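Note the locking model: since the pcp structure belongs to a single CPU, get_cpu() (which disables preemption) plus local_irq_save is enough to protect it, and no spinlock is needed on this path; zone->lock is taken only when pages actually spill back to the buddy system, in free_pcppages_bulk or free_one_page.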

When the per-CPU page frame count reaches the high watermark, the free_pcppages_bulk function is called to return batch page frames to the buddy system:

/*
 * Frees a number of pages from the pcp lists
 * Assumes all pages on list are in same zone, and of same order.
 * count is the number of pages to free.
 *
 * If the zone was previously in an "all pages pinned" state then look to
 * see if this freeing clears that state.
 *
 * And clear the zone's pages_scanned counter, to hold off the "all pages are
 * pinned" detection logic.
 */
static void free_pcppages_bulk(struct zone *zone, int count,
					struct per_cpu_pages *pcp)
{
	int migratetype = 0;
	int batch_free = 0;

	spin_lock(&zone->lock);
	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
	zone->pages_scanned = 0;

	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
	while (count) {
		struct page *page;
		struct list_head *list;

		/*
		 * Remove pages from lists in a round-robin fashion. A
		 * batch_free count is maintained that is incremented when an
		 * empty list is encountered.  This is so more pages are freed
		 * off fuller lists instead of spinning excessively around empty
		 * lists
		 */
		do {	/* find a non-empty list among the three PCP lists
			 * (MIGRATE_UNMOVABLE, MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE) */
			batch_free++;
			if (++migratetype == MIGRATE_PCPTYPES)
				migratetype = 0;
			list = &pcp->lists[migratetype];
		} while (list_empty(list));

		do {
			page = list_entry(list->prev, struct page, lru);
			/* must delete as __free_one_page list manipulates */
			list_del(&page->lru);
			__free_one_page(page, zone, 0, migratetype);
			trace_mm_page_pcpu_drain(page, 0, migratetype);
		} while (--count && --batch_free && !list_empty(list));
	}
	spin_unlock(&zone->lock);
}
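With the numbers from the earlier example (batch = 15, high = 90): when the 90th page is freed onto the per-CPU lists, free_pcppages_bulk returns 15 pages to the buddy system and pcp->count drops back to 75.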


That is the rough process of per-CPU allocation and freeing of a single page frame.

The code above is based on kernel 2.6.32.

This version has no dedicated cold page cache; hot and cold pages differ only in whether they sit at the head (hot) or the tail (cold) of the list. I am not sure this understanding is correct; if it is wrong, corrections are welcome.

This article is from the "Dance Fish" blog; please keep this source when reposting: http://ty1992.blog.51cto.com/7098269/1712801
