Python source analysis: memory allocation (1)

Source: Internet
Author: User

An earlier article gave a rough introduction to Python's object structure. This article analyzes another very important part: memory allocation.

Whenever you read source code written in C, you find that a lot of work has gone into memory handling. Nginx, for example, implements its own pool, and Python is no exception.


Python's memory architecture is divided into 4 layers:

    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |


The diagram above is copied directly from a comment in the source code.

(0) The bottom layer is the C library's memory operations, that is, malloc and free.

(1) This layer is Python's thin wrapper around the C-level operations. For example, PyMem_MALLOC and PyMem_FREE wrap malloc and free very lightly:

/* Thin macro wrappers around malloc, realloc and free.
   A request for 0 bytes allocates 1 byte instead. */
#define PyMem_MALLOC(n)         ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)     ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE              free

#endif  /* PYMALLOC_DEBUG */

/*
 * Type-oriented memory interface
 * ==============================
 *
 * Allocate memory for n objects of the given type.  Returns a new pointer
 * or NULL if the request was too large or memory allocation failed.  Use
 * these macros rather than doing the multiplication yourself so that proper
 * overflow checking is always done.
 */

/* Type-aware variants: the size is computed from the element type. */
#define PyMem_New(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
#define PyMem_NEW(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )

There is also a New family of macros. These are the type-aware extension of the wrappers above: they compute the request size from the element type and the element count.
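The overflow check in these macros is easy to miss, so here is a minimal sketch of the same idea outside CPython. The names SKETCH_SSIZE_MAX, sketch_malloc and SKETCH_NEW are hypothetical stand-ins invented for this example, and the limit is assumed to be half the range of size_t; only the shape of the checks follows the macros quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PY_SSIZE_T_MAX: assumed here to be the
   largest value of a signed type as wide as size_t. */
#define SKETCH_SSIZE_MAX ((size_t)(SIZE_MAX >> 1))

/* Same shape as PyMem_MALLOC: reject oversized requests, and turn a
   zero-byte request into a one-byte request so malloc never sees 0. */
static void *sketch_malloc(size_t n)
{
    if (n > SKETCH_SSIZE_MAX)
        return NULL;
    return malloc(n ? n : 1);
}

/* Same shape as PyMem_NEW: compare n against MAX / sizeof(type) before
   multiplying, so that n * sizeof(type) cannot wrap around. */
#define SKETCH_NEW(type, n) \
    (((size_t)(n) > SKETCH_SSIZE_MAX / sizeof(type)) ? NULL : \
     (type *)sketch_malloc((size_t)(n) * sizeof(type)))
```

With these definitions, SKETCH_NEW(int, 4) returns room for four ints, while a request whose byte count would exceed the limit returns NULL instead of silently allocating a too-small buffer.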

(2) This layer is the most important one; PyObject_Malloc and PyObject_Free belong to it. This is where Python implements its memory pool for efficient memory allocation (the source code is in obmalloc.c).


Some basics of Python memory allocation:

(1) Python's memory allocation falls broadly into two classes: small requests of up to 512 bytes, and requests larger than 512 bytes.

(2) Small allocations are aligned to 8 bytes. For example, a request for 12 bytes actually occupies a 16-byte block.

 * Request in bytes     Size of allocated block      Size class idx
 * ----------------------------------------------------------------
 *        1-8                     8                       0
 *        9-16                   16                       1
 *       17-24                   24                       2
 *       25-32                   32                       3
 *       33-40                   40                       4
 *       41-48                   48                       5
 *       49-56                   56                       6
 *       57-64                   64                       7
 *       65-72                   72                       8
 *        ...                   ...                     ...
 *      497-504                 504                      62
 *      505-512                 512                      63

The table above is also copied directly from a comment in the source code, and it shows Python's alignment strategy for small allocations very clearly.
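The mapping in the table can be reproduced with a shift. The sketch below is an illustration rather than the allocator's own code: the helper names size_class_index, index_to_size and small_block_size are hypothetical, and the constants are the ones the table implies (8-byte alignment, 512-byte threshold).

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT_SHIFT         3      /* log2 of the 8-byte alignment */
#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD >> ALIGNMENT_SHIFT)

/* Size class index for a request of 1..512 bytes: 1-8 -> 0, 9-16 -> 1, ... */
static unsigned int size_class_index(size_t nbytes)
{
    return (unsigned int)((nbytes - 1) >> ALIGNMENT_SHIFT);
}

/* Block size served by a size class: 0 -> 8, 1 -> 16, ..., 63 -> 512. */
static size_t index_to_size(unsigned int idx)
{
    assert(idx < NB_SMALL_SIZE_CLASSES);
    return ((size_t)idx + 1) << ALIGNMENT_SHIFT;
}

/* Rounded-up block size, or 0 if the request is not a small request. */
static size_t small_block_size(size_t nbytes)
{
    if (nbytes == 0 || nbytes > SMALL_REQUEST_THRESHOLD)
        return 0;
    return index_to_size(size_class_index(nbytes));
}
```

For the 12-byte request mentioned earlier, size_class_index gives 1 and small_block_size gives 16, matching the second row of the table.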


Let's look at some very important macro definitions:

#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#define ALIGNMENT_MASK          (ALIGNMENT - 1)

/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)

#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)   /* == 64 */

/* Page size: 4KB */
#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)

#define ARENA_SIZE              (256 << 10)     /* 256KB */

#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif

/* The pool size is defined as 4KB here */
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N, 4KB */
#define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK   /* 4*1024 - 1 */

I won't go through these macros in detail here. The main things they establish are:

(1) An arena is 256KB in size (an arena is used to manage pools)

(2) A pool is 4KB in size


OK, next let's look at the more important pool header definition:

/* Memory pool header. The szidx field tells you which block size
   this pool is used to allocate. */
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
                                        /* blocks currently allocated from this pool */
    block *freeblock;                   /* pool's free list head         */
                                        /* points to the next available block; the free
                                           blocks form a scattered linked list */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
                                        /* index of the owning arena */
    uint szidx;                         /* block size class index        */
                                        /* which block size is served: 8 bytes, 16, ... */
    uint nextoffset;                    /* bytes to virgin block         */
                                        /* offset of the next never-used block */
    uint maxnextoffset;                 /* largest valid nextoffset      */
                                        /* offset of the last block from the pool start */
};

The comments above explain what each field is for. Note the szidx field: it corresponds to the size class index in the alignment table above. Each pool hands out blocks of one fixed size only: a pool whose szidx is 0 allocates 8-byte blocks, a pool whose szidx is 1 allocates 16-byte blocks, and so on. This will become clearer when we read the code later.

Having each pool allocate only one block size is convenient, in particular because it makes computing memory offsets simple.

When allocating small memory, Python allocates in units of blocks. In a pool whose szidx is 0, for example, each block is 8 bytes. The freeblock pointer threads the free blocks into a singly linked list whose nodes are scattered through the pool (this implementation is quite tricky, and it took me a long time to understand).

(In fact, this part of the memory pool implementation is full of tricks, and working through them really burns brain cells. I can only blame my own limited knowledge in this area for needing so long to understand it.)
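The trick behind the freeblock list is that a free block needs no separate list node: the pointer to the next free block is stored in the first bytes of the free block itself. The sketch below shows that idea in isolation. It is a simplification under stated assumptions: struct mini_pool and the mini_* functions are hypothetical names, every block is threaded onto the list up front (the real pool header also tracks never-used blocks through nextoffset), and memcpy is used in place of a pointer cast to stay clear of alignment concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char block;

/* Hypothetical miniature pool: a buffer cut into equal-sized blocks. */
struct mini_pool {
    block *freeblock;       /* head of the free list, or NULL when full */
    size_t block_size;
};

/* Push a block: its first bytes store the old head of the list. */
static void mini_free(struct mini_pool *pool, void *p)
{
    memcpy(p, &pool->freeblock, sizeof(block *));
    pool->freeblock = (block *)p;
}

/* Pop the head block: the next pointer is read from inside the block. */
static void *mini_alloc(struct mini_pool *pool)
{
    block *bp = pool->freeblock;
    if (bp != NULL)
        memcpy(&pool->freeblock, bp, sizeof(block *));
    return bp;
}

/* Thread every block of buf onto the free list, lowest address first. */
static void mini_init(struct mini_pool *pool, void *buf, size_t buf_size,
                      size_t block_size)
{
    size_t n = buf_size / block_size;

    assert(block_size >= sizeof(block *));
    pool->freeblock = NULL;
    pool->block_size = block_size;
    while (n-- > 0)
        mini_free(pool, (block *)buf + n * block_size);
}
```

Because a freed block is pushed on the head of the list, the most recently freed block is the next one handed out, and after a few allocations and frees the list order no longer follows address order. That is why the list is described as scattered.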


/* This can be understood as the structure used to manage pools. */
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uptr address;           /* start of the allocated 256KB; 0 means nothing is allocated yet */

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;        /* available pools */

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;       /* total number of pools in this arena */

    /* Singly-linked list of available pools. */
    struct pool_header* freepools;      /* head of the pool list */

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};

This is another very important structure. It can be understood as the manager of pools: its address field points to an allocated 256KB region of memory, and pools are carved out of that region.
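To make "carved out" concrete, here is a minimal sketch of how the two fields pool_address and nfreepools can cooperate when an arena hands out a never-used pool. The type struct mini_arena and the function carve_pool are hypothetical names for this illustration, and the sketch ignores the freepools list of returned pools entirely.

```c
#include <stddef.h>

#define POOL_SIZE (4 * 1024)

typedef unsigned char block;

/* Hypothetical cut-down arena: only the fields this sketch needs. */
struct mini_arena {
    block *pool_address;      /* next never-used pool to carve off */
    unsigned int nfreepools;  /* pools still available */
};

/* Hand out the next never-used pool, or NULL when none are left. */
static block *carve_pool(struct mini_arena *arena)
{
    block *pool;

    if (arena->nfreepools == 0)
        return NULL;
    pool = arena->pool_address;
    arena->pool_address += POOL_SIZE;   /* step to the following 4KB pool */
    --arena->nfreepools;
    return pool;
}
```

Each call simply advances pool_address by one pool and decrements the counter, so handing out a fresh pool costs a pointer addition.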


Now let's take a look at the creation of the arena structure:

/* Allocate an arena_object; these objects are in fact cached as well. */
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;      /* the arena_object to hand back */
    uint excess;        /* number of bytes above pool alignment */
    void *address;      /* points to the newly requested 256KB of memory */
    int err;

#ifdef PYMALLOC_DEBUG
    if (Py_GETENV("PYTHONMALLOCSTATS"))
        _PyObject_DebugMallocStats();
#endif
    if (unused_arena_objects == NULL) {
        /* No arena_object is available, so more must be created. */
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        /* maxarenas is 0 the first time, so 16 arena_objects are created
           (INITIAL_ARENA_OBJECTS == 16); after that the count doubles. */
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;
        if (numarenas <= maxarenas)     /* only possible on integer overflow */
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);
        /* Allocate the memory needed for the arenas array. */
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array.  Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        /* This amounts to initializing the new arena_objects. */
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;      /* 0 marks "no memory allocated for this arena yet" */
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        /* unused_arena_objects now points to the first available arena_object. */
        unused_arena_objects = &arenas[maxarenas];
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;    /* advance to the next arena_object */
    assert(arenaobj->address == 0);
    /* Next, allocate the memory for the data itself. */
#ifdef ARENAS_USE_MMAP
    address = mmap(NULL, ARENA_SIZE, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    err = (address == MAP_FAILED);
#else
    address = malloc(ARENA_SIZE);       /* this allocates 256KB of memory */
    err = (address == 0);
#endif
    if (err) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    /* Store the allocated address in the arena_object. */
    arenaobj->address = (uptr)address;

    /* Update the counters. */
    ++narenas_currently_allocated;
#ifdef PYMALLOC_DEBUG
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
#endif
    arenaobj->freepools = NULL;         /* the pool list starts out empty */
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    /* 256KB / 4KB = 64 pools, which happens to equal the number of
       small size classes. */
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
    /* Alignment: make sure pool_address is a multiple of 4KB, which
       simplifies later address calculations. */
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {
        /* The address is not a multiple of 4KB, so align the first pool. */
        --arenaobj->nfreepools;         /* one fewer pool is available */
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;   /* total number of usable pools */

    return arenaobj;
}

The comments in the code above should be clear enough. The function obtains an arena_object, allocates the 256KB of memory and stores its address in the address field, and then initializes the number of pools that can be allocated along with the starting address of the first pool.
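The alignment step at the end of new_arena is pure address arithmetic, so it can be tried on plain integers. In the sketch below, struct arena_layout, layout_arena and pool_base are hypothetical names introduced for illustration; the constants and the excess computation follow the code above. The pool_base helper shows one reason the alignment is convenient, under the assumption that every pool starts on a 4KB boundary.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE       (4 * 1024)
#define POOL_SIZE_MASK  (POOL_SIZE - 1)
#define ARENA_SIZE      (256 << 10)

/* Hypothetical result type for this sketch. */
struct arena_layout {
    uintptr_t first_pool;   /* first pool-aligned address in the arena */
    unsigned int npools;    /* whole pools that fit after alignment */
};

static struct arena_layout layout_arena(uintptr_t arena_start)
{
    struct arena_layout out;
    unsigned int excess = (unsigned int)(arena_start & POOL_SIZE_MASK);

    out.first_pool = arena_start;
    out.npools = ARENA_SIZE / POOL_SIZE;    /* 64 when already aligned */
    if (excess != 0) {
        /* Skip the partial page at the front; one pool is lost. */
        out.first_pool += POOL_SIZE - excess;
        --out.npools;
    }
    assert((out.first_pool & POOL_SIZE_MASK) == 0);
    return out;
}

/* With 4KB-aligned pools, clearing the low 12 bits of any address
   inside a pool yields the address where that pool begins. */
static uintptr_t pool_base(uintptr_t addr_in_pool)
{
    return addr_in_pool & ~(uintptr_t)POOL_SIZE_MASK;
}
```

For an arena starting at 0x100010, excess is 0x10, the first pool starts at 0x101000, and 63 whole pools fit. An arena that already starts on a 4KB boundary keeps all 64.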


All right, that's it for today. It's late, and work really is a bit tiring. School was so much easier.

Tomorrow I'll analyze the two most important functions, PyObject_Malloc and PyObject_Free.
