Python source code analysis: memory allocation (1)

Source: Internet
Author: User

The previous article gave a rough introduction to the Python object structure. This article analyzes another very important part: memory allocation.

It seems that every C code base I have read puts a lot of work into memory management. nginx, for example, has its own pool, and Python is no exception.

 

Python divides memory allocation into four layers...

    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

 

The diagram above is copied directly from a comment in the source code.

(0) This is the underlying C-level memory interface, that is, malloc and free.

(1) Python puts a thin wrapper around the C layer. For example, PyMem_MALLOC and PyMem_FREE are simple wrappers around malloc and free:

// Simple macro wrappers around malloc, realloc and free.
// A request for 0 bytes is turned into a request for 1 byte.
#define PyMem_MALLOC(n)         ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)     ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE              free

#endif  /* PYMALLOC_DEBUG */    // the matching #ifdef branch is not shown in this excerpt

/*
 * Type-oriented memory interface
 * ==============================
 *
 * Allocate memory for n objects of the given type.  Returns a new pointer
 * or NULL if the request was too large or memory allocation failed.  Use
 * these macros rather than doing the multiplication yourself so that proper
 * overflow checking is always done.
 */

// Allocation that is aware of the size of the type
#define PyMem_New(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
#define PyMem_NEW(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )

 

PyMem_New and PyMem_NEW above are the type-aware extension: they take an element type and a count, and work out the number of bytes themselves.
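The two ideas in these macros (never pass 0 to malloc, and check for overflow before multiplying the count by the element size) can be sketched in plain C. This is only an illustration: sketch_malloc, sketch_new_array and SKETCH_SSIZE_MAX are hypothetical names standing in for PyMem_MALLOC, PyMem_New and PY_SSIZE_T_MAX, not CPython code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PY_SSIZE_T_MAX: half of the size_t range. */
#define SKETCH_SSIZE_MAX ((size_t)(SIZE_MAX >> 1))

/* Like PyMem_MALLOC: refuse oversized requests, and turn a 0-byte
 * request into a 1-byte request before calling malloc. */
static void *sketch_malloc(size_t n)
{
    if (n > SKETCH_SSIZE_MAX)
        return NULL;
    return malloc(n ? n : 1);
}

/* Like PyMem_New: compare n against MAX / elem_size first, so that
 * n * elem_size is only computed when it cannot overflow. */
static void *sketch_new_array(size_t n, size_t elem_size)
{
    assert(elem_size > 0);   /* sizeof(type) is never 0 */
    if (n > SKETCH_SSIZE_MAX / elem_size)
        return NULL;
    return sketch_malloc(n * elem_size);
}
```

A call such as sketch_new_array(4, sizeof(int)) returns room for four ints, while a count large enough to overflow the multiplication returns NULL instead of a buffer that is too small.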

(2) This layer is the most important part. PyObject_Malloc and PyObject_Free belong to it. Python implements a memory pool at this layer for efficient memory allocation (the source code is concentrated in obmalloc.c).

 

Two facts about Python memory allocation:

(1) Requests fall into two groups: small allocations of up to 512 bytes, and allocations of more than 512 bytes.

(2) Small allocations are 8-byte aligned. For example, a 12-byte request ends up occupying 16 bytes.

 * Request in bytes     Size of allocated block      Size class idx
 * ----------------------------------------------------------------
 *        1-8                     8                       0
 *        9-16                   16                       1
 *       17-24                   24                       2
 *       25-32                   32                       3
 *       33-40                   40                       4
 *       41-48                   48                       5
 *       49-56                   56                       6
 *       57-64                   64                       7
 *       65-72                   72                       8
 *        ...                   ...                     ...
 *      497-504                 504                      62
 *      505-512                 512                      63

The table above is also copied directly from a comment in the source code. It clearly shows the alignment policy for Python memory allocation.
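The mapping in the table can be reproduced with two shifts. The sketch below is a minimal illustration that agrees with the table rows; size_class_index and index_to_size are names chosen here for the example, and the constants are the 8-byte alignment and 512-byte threshold described above.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT_SHIFT         3     /* 2^3 = 8-byte alignment */
#define SMALL_REQUEST_THRESHOLD 512

/* Size class index for a small request of nbytes (1..512). */
static unsigned int size_class_index(size_t nbytes)
{
    assert(nbytes >= 1 && nbytes <= SMALL_REQUEST_THRESHOLD);
    return (unsigned int)((nbytes - 1) >> ALIGNMENT_SHIFT);
}

/* Block size in bytes that a size class index stands for. */
static unsigned int index_to_size(unsigned int idx)
{
    return (idx + 1) << ALIGNMENT_SHIFT;
}
```

For the 12-byte request mentioned earlier, size_class_index(12) gives 1 and index_to_size(1) gives 16, matching the second row of the table.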

 

Next, let's look at several very important macro definitions:

 

#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#define ALIGNMENT_MASK          (ALIGNMENT - 1)

/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)

#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)   // = 64

// page size: 4 KB
#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)

#define ARENA_SIZE              (256 << 10)     /* 256KB */

#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif

// The pool size defined here is 4 KB
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N, 4 KB */
#define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK   // 4*1024 - 1

I won't go through the macros above in detail. They mainly establish the following:

(1) The size of an arena is 256 KB (an arena is used to manage pools).

(2) The size of a pool is 4 KB.
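A few numbers follow directly from these sizes. The sketch below derives them; the function names are made up for this example, and header_size (the bytes a pool reserves for its own bookkeeping) is a hypothetical parameter rather than a value taken from the source.

```c
#include <assert.h>

#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define POOL_SIZE               SYSTEM_PAGE_SIZE
#define ARENA_SIZE              (256 << 10)     /* 256 KB */
#define ALIGNMENT               8
#define SMALL_REQUEST_THRESHOLD 512

/* How many 4 KB pools fit into one 256 KB arena. */
static unsigned int pools_per_arena(void)
{
    return ARENA_SIZE / POOL_SIZE;
}

/* How many size classes cover requests of 1..512 bytes. */
static unsigned int small_size_classes(void)
{
    return SMALL_REQUEST_THRESHOLD / ALIGNMENT;
}

/* How many blocks of block_size fit into one pool once header_size
 * bytes (an assumed value supplied by the caller) are set aside. */
static unsigned int blocks_per_pool(unsigned int block_size,
                                    unsigned int header_size)
{
    assert(block_size > 0 && header_size < POOL_SIZE);
    return (POOL_SIZE - header_size) / block_size;
}
```

Both pools_per_arena() and small_size_classes() evaluate to 64. With an assumed 48-byte header, a pool of 8-byte blocks would hold 506 of them.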

 

Now let's take a look at the definition of the more important pool header:

// Memory pool header
// szidx tells which block size this pool serves
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
                                        // blocks currently allocated from this pool
    block *freeblock;                   /* pool's free list head         */
                                        // points to the next available block; the free
                                        // blocks form a linked list scattered through
                                        // the pool, which is quite interesting
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
                                        // these two pointers form a doubly linked list of pools
    uint arenaindex;                    /* index into arenas of base adr */
                                        // index of the owning arena
    uint szidx;                         /* block size class index        */
                                        // size class served: 8 bytes, 16 bytes, ...
    uint nextoffset;                    /* bytes to virgin block         */
                                        // offset of the next never-used block
    uint maxnextoffset;                 /* largest valid nextoffset      */
                                        // offset of the last block from the start of the pool
};

 

The comments above describe what each field is for. Note the szidx field: it corresponds to the "Size class idx" column of the size class table shown earlier. Each pool hands out blocks of one fixed size only. If szidx is 0, the pool allocates 8-byte blocks; if szidx is 1, it allocates 16-byte blocks, and so on. This will become clearer when we read the allocation code later.

Because every pool serves only one block size, things get much simpler, especially the calculation of memory offsets.

Python allocates small memory in units of blocks. For example, if szidx is 0, a block is 8 bytes. The freeblock pointer is the head of a singly linked list of free blocks that is threaded through the free blocks themselves, so the list is scattered across the pool. (This implementation is quite tricky. It took me a long time to understand it.)

(There are many more tricky implementations like this in the memory pool. Reading them really burns brain cells.)
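The idea of threading the free list through the free blocks themselves can be sketched as follows. This is a simplified illustration, not the pymalloc code: sketch_pool and its helpers are hypothetical, the pool is tiny, and all blocks are linked up front, whereas the real pool header also carries nextoffset so that never-used blocks can be handed out without being linked first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char block;

#define SKETCH_BLOCK_SIZE 16
#define SKETCH_NBLOCKS    4

/* Hypothetical miniature pool: a free-list head plus raw storage. */
struct sketch_pool {
    block *freeblock;                               /* NULL when no block is free */
    block mem[SKETCH_BLOCK_SIZE * SKETCH_NBLOCKS];
};

/* A free block stores the address of the next free block in its own
 * first bytes; memcpy avoids alignment and aliasing problems. */
static block *next_free(const block *bp)
{
    block *next;
    memcpy(&next, bp, sizeof next);
    return next;
}

static void set_next_free(block *bp, block *next)
{
    memcpy(bp, &next, sizeof next);
}

/* Link every block into the free list, lowest address first. */
static void sketch_pool_init(struct sketch_pool *pool)
{
    assert(SKETCH_BLOCK_SIZE >= sizeof(block *));
    pool->freeblock = NULL;
    for (int i = SKETCH_NBLOCKS - 1; i >= 0; --i) {
        block *bp = pool->mem + (size_t)i * SKETCH_BLOCK_SIZE;
        set_next_free(bp, pool->freeblock);
        pool->freeblock = bp;
    }
}

/* Pop the head of the free list; returns NULL when the pool is full. */
static void *sketch_pool_alloc(struct sketch_pool *pool)
{
    block *bp = pool->freeblock;
    if (bp != NULL)
        pool->freeblock = next_free(bp);
    return bp;
}

/* Push a block back onto the head of the free list. */
static void sketch_pool_free(struct sketch_pool *pool, void *p)
{
    set_next_free((block *)p, pool->freeblock);
    pool->freeblock = (block *)p;
}
```

No extra memory is needed for the list nodes, because a block only has to remember its successor while it is free. A freed block goes to the head of the list, so it is the first one to be reused.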

 

 

// This can be understood as the structure used to manage pools
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uptr address;       // start of the allocated 256 KB; 0 means nothing is allocated yet

    /* Pool-aligned pointer to the next pool to be carved off. */
    block *pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;    // available pools

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;   // total number of pools in this arena

    /* Singly-linked list of available pools. */
    struct pool_header *freepools;      // head of the pool list

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object *nextarena;
    struct arena_object *prevarena;
};

The above is another very important structure. It manages pools: its address field points to an allocated 256 KB chunk of memory, and the pools are carved out of that chunk.
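Carving a pool out of an arena amounts to bumping a pointer. The sketch below shows the idea with a cut-down, hypothetical sketch_arena that keeps only the two fields involved; it ignores the freepools list of pools that have been returned to the arena.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE (4 * 1024)

typedef unsigned char block;

/* Hypothetical, reduced version of arena_object. */
struct sketch_arena {
    block *pool_address;        /* next pool to be carved off */
    unsigned int nfreepools;    /* pools not handed out yet   */
};

/* Hand out the next never-used pool, or NULL if the arena is exhausted. */
static block *carve_pool(struct sketch_arena *arena)
{
    block *pool;

    assert(arena != NULL);
    if (arena->nfreepools == 0)
        return NULL;
    pool = arena->pool_address;
    arena->pool_address += POOL_SIZE;   /* step over the pool just taken */
    --arena->nfreepools;
    return pool;
}
```

Each call returns an address exactly POOL_SIZE bytes after the previous one, which is why a pool's position inside its arena can be computed with simple arithmetic.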

 

 

Next, let's take a look at the creation process of the Arena structure:

// Allocate a new arena_object
static struct arena_object *
new_arena(void)
{
    struct arena_object *arenaobj;      // the arena_object to return
    uint excess;        /* number of bytes above pool alignment */
    void *address;      // points to the newly requested 256 KB of memory
    int err;

#ifdef PYMALLOC_DEBUG
    if (Py_GETENV("PYTHONMALLOCSTATS"))
        _PyObject_DebugMallocStats();
#endif
    if (unused_arena_objects == NULL) {
        // no arena_object is available, so create some
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        // When maxarenas is 0, that is, the first time arena_objects are
        // created, 16 are created at once; after that the count doubles.
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;   // INITIAL_ARENA_OBJECTS = 16
        if (numarenas <= maxarenas)     // the doubling wrapped around: integer overflow
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);   // size of the arena_object array
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);      // grow the array
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array. Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        // This initializes the newly created arena_objects
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;      // 0 means no memory is attached to this arena yet
            arenas[i].nextarena = i < numarenas - 1 ?
                                  &arenas[i + 1] : NULL;
        }

        unused_arena_objects = &arenas[maxarenas];      // point at the first unused one
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;         // advance to the next arena_object
    assert(arenaobj->address == 0);
    // allocate the data memory
#ifdef ARENAS_USE_MMAP
    address = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    err = (address == MAP_FAILED);
#else
    address = malloc(ARENA_SIZE);       // the 256 KB of memory
    err = (address == 0);
#endif
    if (err) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    // store the allocated address in the arena_object
    arenaobj->address = (uptr)address;

    // update the counters
    ++narenas_currently_allocated;
#ifdef PYMALLOC_DEBUG
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
#endif
    arenaobj->freepools = NULL;         // the pool list starts out empty
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block *)arenaobj->address;
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;      // 64 pools for small allocations
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
    // Align so that pool_address ends up as a multiple of 4 KB,
    // which makes later address calculations easier.
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {
        // the address is not a multiple of 4 KB, so align the pool address
        --arenaobj->nfreepools;         // one pool fewer is available
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;       // total number of pools

    return arenaobj;
}

 

The comments in the code should make the process clear: take an arena_object, allocate the 256 KB of memory, and store its address in the address field.

Then initialize the number of pools that can be carved out and the starting address of the first pool.
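The alignment step at the end of new_arena can be tried out on plain integers. The following sketch repeats the same arithmetic on an address value; sketch_layout and arena_layout are names invented for this example, and the constants are the 4 KB pool size and 256 KB arena size from the macros.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE       (4 * 1024)
#define POOL_SIZE_MASK  (POOL_SIZE - 1)
#define ARENA_SIZE      (256 << 10)

struct sketch_layout {
    uintptr_t first_pool;   /* first pool-aligned address inside the arena */
    unsigned int npools;    /* whole pools that fit after alignment        */
};

/* Given the address returned for a new arena, compute where the first
 * pool starts and how many whole pools the arena can hold. */
static struct sketch_layout arena_layout(uintptr_t arena_base)
{
    struct sketch_layout out;
    unsigned int excess = (unsigned int)(arena_base & POOL_SIZE_MASK);

    out.first_pool = arena_base;
    out.npools = ARENA_SIZE / POOL_SIZE;
    if (excess != 0) {
        /* Not on a 4 KB boundary: skip ahead to the next boundary
         * and give up the pool that no longer fits completely. */
        --out.npools;
        out.first_pool += POOL_SIZE - excess;
    }
    return out;
}
```

An arena that already starts on a 4 KB boundary keeps all 64 pools. Any other starting address loses one pool: the bytes before the first boundary and the bytes after the last whole pool go unused.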

 

That's all for today. It's late, and I'm a little tired after work. Things were easier at school.

Tomorrow we will analyze the most important functions, PyObject_Malloc and PyObject_Free.
