Linux Memory Management: Allocating Memory Pages (Fast Path) with get_page_from_freelist()



First, the parameters taken by the fast-path page allocation:

gfp_mask is the set of GFP flags passed into the fast path; the caller ORs in __GFP_HARDWALL, which enforces the cpuset hardwall so that the allocation cannot stray outside the current task's cpuset;

nodemask is the node mask, a bit array in which each set bit marks a node that memory may be allocated from;

order is the order of the allocation: a block of 2^order contiguous pages is requested;

zonelist is the list of fallback zones: when no suitable page can be allocated from preferred_zone, the zones in the zonelist are scanned in order and tried one by one;

high_zoneidx is the index of the highest zone that may be used. Going from HIGHMEM through NORMAL down to DMA, memory becomes more and more precious, so allocation normally starts at the higher zones and only falls back toward DMA;

alloc_flags carries the ALLOC_* flags that control the allocation (which watermark to check, whether to ignore watermarks, and so on);

preferred_zone is the first suitable zone found at or below high_zoneidx; allocation is normally satisfied from this zone, and only when that fails are the remaining zones in the zonelist tried;

migratetype is the migration type, used as the subscript into zone->free_area[].free_list[] during allocation. It exists for anti-fragmentation: the free_area structure was changed to hold an array of free lists indexed by migrate type, and each array element links the pages of the corresponding migrate type, as the sketch below shows.
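Here is a trimmed sketch of those structures; the real definitions live in include/linux/mmzone.h, and the field sets are cut down to just what this article touches:

struct free_area {
	struct list_head	free_list[MIGRATE_TYPES];	/* one page list per migrate type */
	unsigned long		nr_free;	/* number of free blocks of this order */
};

struct zone {
	/* ... many fields omitted ... */
	struct free_area	free_area[MAX_ORDER];	/* buddy lists, indexed by order */
	/* ... */
};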

    page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, nodemask, order,
                zonelist, high_zoneidx, alloc_flags, preferred_zone, migratetype);

static struct page *
get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
		struct zonelist *zonelist, int high_zoneidx, int alloc_flags,
		struct zone *preferred_zone, int migratetype)
{
	struct zoneref *z;
	struct page *page = NULL;
	int classzone_idx;
	struct zone *zone;
	nodemask_t *allowednodes = NULL;/* zonelist_cache approximation */
	int zlc_active = 0;		/* set if using zonelist_cache */
	int did_zlc_setup = 0;		/* just call zlc_setup() one time */

	classzone_idx = zone_idx(preferred_zone);	/* zone index of preferred_zone */
zonelist_scan:
	/*
	 * Scan zonelist, looking for a zone with enough free.
	 * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
	 */
	for_each_zone_zonelist_nodemask(zone, z, zonelist,
						high_zoneidx, nodemask) {
		/* the macro picks suitable zones out of zonelist->_zonerefs;
		 * it is explained later */
		if (IS_ENABLED(CONFIG_NUMA) && zlc_active &&
			!zlc_zone_worth_trying(zonelist, z, allowednodes))
			continue;	/* node not allowed or zone full: skip */
		if ((alloc_flags & ALLOC_CPUSET) &&
			!cpuset_zone_allowed_softwall(zone, gfp_mask))
			continue;	/* zone's node is outside the cpuset: skip */
		/*
		 * When allocating a page cache page for writing, we want it
		 * from a zone within its dirty limit, so that no single zone
		 * holds more than its proportional share of the globally
		 * allowed dirty pages.  The dirty limits take the zone's
		 * lowmem reserves and high watermark into account so that
		 * kswapd can balance it without writing pages from its LRU.
		 *
		 * This may look like it could increase pressure on lower
		 * zones by failing allocations in higher zones before they
		 * are full, but the pages that spill over are limited, as
		 * the lower zones are protected by this very same mechanism.
		 *
		 * XXX: for now, allocations may exceed the per-zone dirty
		 * limit in the slowpath (ALLOC_WMARK_LOW unset) before going
		 * into reclaim, which matters when on a NUMA setup the
		 * allowed zones together are not big enough to reach the
		 * global limit.
		 */
		if ((alloc_flags & ALLOC_WMARK_LOW) &&
		    (gfp_mask & __GFP_WRITE) && !zone_dirty_ok(zone))
			/* too many dirty pages in this zone: mark it full so
			 * dirty pages get balanced onto another zone */
			goto this_zone_full;

		BUILD_BUG_ON(ALLOC_NO_WATERMARKS < NR_WMARK);
		if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
			unsigned long mark;
			int ret;

			mark = zone->watermark[alloc_flags & ALLOC_WMARK_MASK];
			if (zone_watermark_ok(zone, order, mark,
				    classzone_idx, alloc_flags))
				/* the zone has enough free pages for this
				 * allocation; detailed analysis below */
				goto try_this_zone;

			if (IS_ENABLED(CONFIG_NUMA) &&
					!did_zlc_setup && nr_online_nodes > 1) {
				/* do zlc_setup once, if there are multiple
				 * nodes, before considering the first zone
				 * allowed by the cpuset */
				allowednodes = zlc_setup(zonelist, alloc_flags);
				zlc_active = 1;
				did_zlc_setup = 1;
			}

			/* zone_watermark_ok() refused this zone; if it cannot
			 * be reclaimed, or is outside preferred_zone's reclaim
			 * range, mark it full so the next scan skips it */
			if (zone_reclaim_mode == 0 ||
			    !zone_allows_reclaim(preferred_zone, zone))
				goto this_zone_full;

			/* as ZLC may have just been activated, check whether
			 * this first eligible zone failed zone_reclaim
			 * recently before trying to reclaim it */
			if (IS_ENABLED(CONFIG_NUMA) && zlc_active &&
				!zlc_zone_worth_trying(zonelist, z, allowednodes))
				continue;

			ret = zone_reclaim(zone, gfp_mask, order);
			/* reaching here means the zone may reclaim pages */
			switch (ret) {
			case ZONE_RECLAIM_NOSCAN:
				continue;	/* did not scan */
			case ZONE_RECLAIM_FULL:
				continue;	/* scanned but unreclaimable */
			default:
				/* the two cases above reclaimed nothing; here
				 * some pages were reclaimed, so re-check:
				 * did we reclaim enough? */
				if (zone_watermark_ok(zone, order, mark,
						classzone_idx, alloc_flags))
					goto try_this_zone;

				/* failed to reclaim enough to meet the
				 * watermark; only mark the zone full when
				 * checking the min watermark, or when we
				 * failed to reclaim just 1<<order pages, or
				 * else the fastpath would prematurely mark
				 * zones full when the watermark is between
				 * the low and min watermarks */
				if (((alloc_flags & ALLOC_WMARK_MASK) == ALLOC_WMARK_MIN) ||
				    ret == ZONE_RECLAIM_SOME)
					goto this_zone_full;

				continue;
			}
		}

try_this_zone:
		/* everything checks out: finally try to allocate a page */
		page = buffered_rmqueue(preferred_zone, zone, order,
						gfp_mask, migratetype);
		if (page)
			break;
this_zone_full:
		/* the zone is full: record that in the zonelist cache */
		if (IS_ENABLED(CONFIG_NUMA))
			zlc_mark_zone_full(zonelist, z);
	}
	/* the per-zone loop ends here */

	if (unlikely(IS_ENABLED(CONFIG_NUMA) && page == NULL && zlc_active)) {
		/* Disable zlc cache for second zonelist scan */
		zlc_active = 0;
		goto zonelist_scan;	/* scan the whole zonelist once more */
	}

	if (page)
		/* page->pfmemalloc is set when ALLOC_NO_WATERMARKS was needed
		 * to allocate the page: the zone has few usable pages left,
		 * the caller is expected to take steps that free more memory,
		 * and the page should not be used for !pfmemalloc purposes */
		page->pfmemalloc = !!(alloc_flags & ALLOC_NO_WATERMARKS);

	return page;
}
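For reference, the ALLOC_* flags tested throughout the function are defined in mm/internal.h; the values below are from kernels of roughly this generation and may differ slightly between versions:

#define ALLOC_WMARK_MIN		WMARK_MIN
#define ALLOC_WMARK_LOW		WMARK_LOW
#define ALLOC_WMARK_HIGH	WMARK_HIGH
#define ALLOC_NO_WATERMARKS	0x04	/* don't check watermarks at all */

/* Mask to get the watermark bits */
#define ALLOC_WMARK_MASK	(ALLOC_NO_WATERMARKS-1)

#define ALLOC_HARDER		0x10	/* try to alloc harder */
#define ALLOC_HIGH		0x20	/* __GFP_HIGH set */
#define ALLOC_CPUSET		0x40	/* check for correct cpuset */
#define ALLOC_CMA		0x80	/* allow allocations from CMA areas */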



This is the macro that picks the suitable zones out of the zonelist->_zonerefs array:

	for_each_zone_zonelist_nodemask(zone, z, zonelist,
						high_zoneidx, nodemask) {

zone is available in the loop body; the macro walks the elements of the zonelist->_zonerefs array:

#define for_each_zone_zonelist_nodemask(zone, z, zlist, highidx, nodemask) \
	for (z = first_zones_zonelist(zlist, highidx, nodemask, &zone);	\
		zone;							\
		z = next_zones_zonelist(++z, highidx, nodemask, &zone))


static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
					enum zone_type highest_zoneidx,
					nodemask_t *nodes,
					struct zone **zone)
{
	return next_zones_zonelist(zonelist->_zonerefs, highest_zoneidx, nodes,
								zone);
}

struct zoneref {
	struct zone *zone;	/* Pointer to actual zone */
	int zone_idx;		/* zone_idx(zoneref->zone) */
};


highest_zoneidx is the largest zone index that is acceptable: entries whose zone index is larger do not qualify, so z is advanced (z++) until an entry at or below highest_zoneidx (i.e. the first suitable one) is found and returned.

/* Returns the next zone at or below highest_zoneidx in a zonelist */
struct zoneref *next_zones_zonelist(struct zoneref *z,
					enum zone_type highest_zoneidx,
					nodemask_t *nodes,
					struct zone **zone)
{
	/*
	 * Find the next suitable zone to use for the allocation.
	 * Only filter based on nodemask if it's set.
	 */
	if (likely(nodes == NULL))
		while (zonelist_zone_idx(z) > highest_zoneidx)
			z++;
	else
		while (zonelist_zone_idx(z) > highest_zoneidx ||
				(z->zone && !zref_in_nodemask(z, nodes)))
			z++;

	*zone = zonelist_zone(z);
	return z;
}
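The helpers that next_zones_zonelist() leans on are simple accessors plus a nodemask test; paraphrased here from include/linux/mmzone.h and mm/mmzone.c of the same era:

static inline int zonelist_zone_idx(struct zoneref *zoneref)
{
	return zoneref->zone_idx;
}

static inline struct zone *zonelist_zone(struct zoneref *zoneref)
{
	return zoneref->zone;
}

static inline int zref_in_nodemask(struct zoneref *zref, nodemask_t *nodes)
{
#ifdef CONFIG_NUMA
	/* is the node that holds this zone set in the nodemask? */
	return node_isset(zonelist_node_idx(zref), *nodes);
#else
	return 1;
#endif
}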



To understand the zlc_zone_worth_trying() function, you first need to look at a few structures:

struct zonelist {
	struct zonelist_cache *zlcache_ptr;	/* NULL or &zlcache */
	struct zoneref _zonerefs[MAX_ZONES_PER_ZONELIST + 1];
#ifdef CONFIG_NUMA
	struct zonelist_cache zlcache;		/* optional ... */
#endif
};


struct zonelist_cache {
	unsigned short z_to_n[MAX_ZONES_PER_ZONELIST];		/* zone->nid */
	DECLARE_BITMAP(fullzones, MAX_ZONES_PER_ZONELIST);	/* zone full? */
	unsigned long last_full_zap;	/* when last zap'd (jiffies) */
};
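The last_full_zap field is what keeps this cache honest: the fullzones bitmap is wiped roughly once per second. That happens in zlc_setup(), which also builds the allowednodes mask used during the scan. A lightly commented version, paraphrased from mm/page_alloc.c of this kernel generation (exact node-state names vary between versions):

static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
{
	struct zonelist_cache *zlc;	/* cached zonelist speedup info */
	nodemask_t *allowednodes;	/* zonelist_cache approximation */

	zlc = zonelist->zlcache_ptr;
	if (!zlc)
		return NULL;

	/* forget the stale "full" marks roughly once per second */
	if (time_after(jiffies, zlc->last_full_zap + HZ)) {
		bitmap_zero(zlc->fullzones, MAX_ZONES_PER_ZONELIST);
		zlc->last_full_zap = jiffies;
	}

	allowednodes = !in_interrupt() && (alloc_flags & ALLOC_CPUSET) ?
					&cpuset_current_mems_allowed :
					&node_states[N_MEMORY];
	return allowednodes;
}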


static int zlc_zone_worth_trying(struct zonelist *zonelist, struct zoneref *z,
						nodemask_t *allowednodes)
{
	struct zonelist_cache *zlc;	/* cached zonelist speedup info */
	int i;				/* index of *z in zonelist zones */
	int n;				/* node that zone *z is on */

	zlc = zonelist->zlcache_ptr;	/* as the structures above make clear,
					 * zlcache_ptr is the address of zlcache */
	if (!zlc)			/* no zlcache: this is a UMA machine */
		return 1;

	i = z - zonelist->_zonerefs;	/* which element of _zonerefs z is */
	n = zlc->z_to_n[i];		/* from i we get the zone's node number (nid) */

	/* This zone is worth trying if it is allowed but not full:
	 * the nid must be permitted and the zone not yet marked full */
	return node_isset(n, *allowednodes) && !test_bit(i, zlc->fullzones);
}
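The counterpart called from this_zone_full in get_page_from_freelist() is zlc_mark_zone_full(), which computes the same index i and sets the matching bit, again paraphrased from mm/page_alloc.c:

static void zlc_mark_zone_full(struct zonelist *zonelist, struct zoneref *z)
{
	struct zonelist_cache *zlc;	/* cached zonelist speedup info */
	int i;				/* index of *z in zonelist zones */

	zlc = zonelist->zlcache_ptr;
	if (!zlc)			/* UMA build: nothing to record */
		return;

	i = z - zonelist->_zonerefs;
	set_bit(i, zlc->fullzones);	/* remember: this zone is full */
}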


/*
 * Return true if free pages are above 'mark'.  This takes into account
 * the order of the allocation.
 */
static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark,
		      int classzone_idx, int alloc_flags, long free_pages)
{
	/* free_pages may go negative - that's OK */
	long min = mark;
	long lowmem_reserve = z->lowmem_reserve[classzone_idx];
					/* pages held back for emergencies */
	int o;
	long free_cma = 0;

	free_pages -= (1 << order) - 1;	/* subtract the pages about to be allocated */
	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;
#ifdef CONFIG_CMA
	/* If allocation can't use CMA areas don't use free CMA pages */
	if (!(alloc_flags & ALLOC_CMA))
		free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
#endif

	if (free_pages - free_cma <= min + lowmem_reserve)
		/* min + lowmem_reserve is the floor: at or below it, this
		 * zone must not be used for the allocation */
		return false;
	for (o = 0; o < order; o++) {
		/* subtract the free pages held at orders smaller than the
		 * requested one: at the next order, these pages become
		 * unavailable */
		free_pages -= z->free_area[o].nr_free << o;

		/* Require fewer higher order pages to be free */
		min >>= 1;

		if (free_pages <= min)
			return false;
	}
	return true;
}
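get_page_from_freelist() actually calls the zone_watermark_ok() wrapper, which simply feeds the zone's current NR_FREE_PAGES count into __zone_watermark_ok() above. To make the order loop concrete, here is a small standalone userspace re-implementation with made-up numbers; everything in it is illustrative, not kernel code:

#include <stdbool.h>
#include <stdio.h>

#define MAX_ORDER 11

/* same logic as __zone_watermark_ok(), minus the CMA and flag handling */
static bool watermark_ok(long free_pages, const unsigned long nr_free[MAX_ORDER],
			 int order, long mark, long lowmem_reserve)
{
	long min = mark;
	int o;

	free_pages -= (1L << order) - 1;	/* pages about to be handed out */
	if (free_pages <= min + lowmem_reserve)
		return false;
	for (o = 0; o < order; o++) {
		/* blocks below the requested order cannot serve this request */
		free_pages -= nr_free[o] << o;
		min >>= 1;	/* demand fewer free pages at each higher order */
		if (free_pages <= min)
			return false;
	}
	return true;
}

int main(void)
{
	/* 40 free order-0 blocks (40 pages) and 15 order-1 blocks (30 pages) */
	unsigned long nr_free[MAX_ORDER] = { 40, 15 };

	/* order-2 request, 200 free pages, mark 128, no lowmem reserve:
	 * 200-3=197 > 128; o=0: 197-40=157 > 64; o=1: 157-30=127 > 32 -> ok */
	printf("%d\n", watermark_ok(200, nr_free, 2, 128, 0));	/* prints 1 */
	return 0;
}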



