Dissecting Python's memory management through its source code

Tags: garbage collection in Python

Python's memory management architecture (Objects/obmalloc.c):

    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

0. The C library functions (malloc and friends) provide the base interface.

1. The PyMem_* family is a thin wrapper around C's malloc, realloc, and free, providing the low-level control interface.

2. The PyObject_* family is the high-level memory control interface.

3. Object-type-specific management interfaces.

PyMem_*

The PyMem_* family: low-level memory allocation interfaces

Python provides thin wrappers around C's malloc, realloc, and free.

Why add this extra layer?

    • Different C implementations return different results for malloc(0), so PyMem_MALLOC(0) is converted to malloc(1).
    • Mixing malloc and free from different C runtime implementations can cause problems. Python's wrappers avoid this by keeping allocation and release within the same allocator.
    • Python provides both macros and functions, but the macros cannot avoid this problem, so extension code should use the functions rather than the macros.

Source:

  Include/pymem.h

#define PyMem_MALLOC(n)     ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                             : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n) ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                             : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE          free

  Objects/object.c

/* Python's malloc wrappers (see pymem.h) */

void *
PyMem_Malloc(size_t nbytes)
{
    return PyMem_MALLOC(nbytes);
}
...
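
The size handling of PyMem_MALLOC can be modeled in a few lines of Python. This is only a sketch of the arithmetic in the macro above, not of a real allocation: the helper name effective_malloc_request is hypothetical, and sys.maxsize is assumed to be a fair stand-in for PY_SSIZE_T_MAX.

```python
import sys

# Stand-in for PY_SSIZE_T_MAX (assumption: sys.maxsize has the same value in CPython).
PY_SSIZE_T_MAX = sys.maxsize


def effective_malloc_request(nbytes):
    """Return the byte count PyMem_MALLOC would pass to malloc, or None for NULL.

    Hypothetical helper that models only the macro's size checks.
    """
    if nbytes > PY_SSIZE_T_MAX:
        return None                    # request too large: the macro yields NULL
    return nbytes if nbytes else 1     # a 0-byte request becomes a 1-byte request
```

A request of 0 bytes therefore reaches malloc as a request for 1 byte, which sidesteps the implementation-defined behavior of malloc(0).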


Beyond these thin wrappers around the C functions, Python also offers four type-aware macros:

PyMem_New and PyMem_NEW

PyMem_Resize and PyMem_RESIZE

They take the size of the element type into account:

#define PyMem_New(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_Malloc((n) * sizeof(type)) ) )

#define PyMem_Resize(p, type, n) \
  ( (p) = ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        (type *) PyMem_Realloc((p), (n) * sizeof(type)) )
#define PyMem_Del           PyMem_Free
#define PyMem_DEL           PyMem_FREE
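
The overflow guard in PyMem_New can be sketched in Python as follows. The function name pymem_new_size is hypothetical, ctypes is used only to obtain element sizes, and sys.maxsize is assumed to stand in for PY_SSIZE_T_MAX.

```python
import ctypes
import sys

PY_SSIZE_T_MAX = sys.maxsize   # stand-in for the C constant (assumption)


def pymem_new_size(ctype, n):
    """Model the size check in PyMem_New(type, n).

    Returns the number of bytes that would be requested, or None where the
    macro would evaluate to NULL because n * sizeof(type) is too large.
    """
    elem_size = ctypes.sizeof(ctype)
    if n > PY_SSIZE_T_MAX // elem_size:
        return None
    return n * elem_size
```

Dividing the limit by the element size, rather than multiplying first, is what lets the C macro detect the problem before the multiplication can overflow.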


Several of the interfaces described below likewise exist as both a function and a macro. The name whose part after the underscore is all uppercase is the macro; this is not pointed out again below.

PyObject_*

The PyObject_* family provides the high-level object memory interfaces.

Note:

    • Do not mix it with the PyMem_* family!
    • Stick to PyObject_* unless you have special internal memory-management requirements.

Source:

  Include/objimpl.h

#define PyObject_New(type, typeobj) \
        ( (type *) _PyObject_New(typeobj) )
#define PyObject_NewVar(type, typeobj, n) \
        ( (type *) _PyObject_NewVar((typeobj), (n)) )

  Objects/object.c

PyObject *
_PyObject_New(PyTypeObject *tp)
{
    PyObject *op;
    op = (PyObject *) PyObject_MALLOC(_PyObject_SIZE(tp));
    if (op == NULL)
        return PyErr_NoMemory();
    return PyObject_INIT(op, tp);
}

PyVarObject *
_PyObject_NewVar(PyTypeObject *tp, Py_ssize_t nitems)
{
    PyVarObject *op;
    const size_t size = _PyObject_VAR_SIZE(tp, nitems);
    op = (PyVarObject *) PyObject_MALLOC(size);
    if (op == NULL)
        return (PyVarObject *)PyErr_NoMemory();
    return PyObject_INIT_VAR(op, tp, nitems);
}

They do two things:

    1. Allocate memory: PyObject_MALLOC
    2. Partially initialize the object: PyObject_INIT and PyObject_INIT_VAR

Initialization is straightforward, but the allocation is more involved.

PyObject_{Malloc, Free}

These three functions differ greatly from their PyMem_* counterparts: they are more complex and more powerful.

void * PyObject_Malloc(size_t nbytes)
void * PyObject_Realloc(void *p, size_t nbytes)
void PyObject_Free(void *p)

A running Python program frequently creates and destroys small objects, so Python uses memory pools to avoid a large number of malloc and free calls.

    • A series of arenas (each managing 256 KB) forms a linked list of memory regions.
    • Each arena is made up of a number of pools (4 KB each).
    • Every allocation and release is carried out within a pool.

Size of a single allocation request

When the requested size is between 1 and 256 bytes, the memory pool is used. A request for 0 bytes, or for 257 bytes or more, falls back to the PyMem_Malloc described earlier.

Each request is rounded up to a multiple of 8 bytes, as shown in the following table (for example, PyObject_Malloc(20) allocates 24 bytes).

Request in bytes     Size of allocated block      Size class idx
----------------------------------------------------------------
       1-8                     8                       0
       9-16                   16                       1
      17-24                   24                       2
      25-32                   32                       3
      33-40                   40                       4
       ...                   ...                     ...
     241-248                 248                      30
     249-256                 256                      31

0, 257 and up: routed to the underlying allocator.

These parameters are controlled by some macros:

#define ALIGNMENT               8               /* must be 2^N */
/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
#define SMALL_REQUEST_THRESHOLD 256
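
The mapping from request size to size class can be sketched in Python. This is a model of the arithmetic only: size_class_index and index2size are hypothetical names, and ALIGNMENT_SHIFT = 3 is an assumption (the base-2 logarithm of ALIGNMENT), since its definition is not part of the excerpt above.

```python
ALIGNMENT = 8
ALIGNMENT_SHIFT = 3            # assumption: log2(ALIGNMENT); not shown in the excerpt
SMALL_REQUEST_THRESHOLD = 256


def size_class_index(nbytes):
    """Return the size class idx for a request, or None if it bypasses the pools."""
    if nbytes == 0 or nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes - 1) >> ALIGNMENT_SHIFT


def index2size(i):
    """Python counterpart of INDEX2SIZE: block size in bytes for size class i."""
    return (i + 1) << ALIGNMENT_SHIFT
```

For a 20-byte request, size_class_index(20) gives 2 and index2size(2) gives 24, which matches the 17-24 row of the table.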

Pool

Every requested block is allocated from a pool, and each pool is 4 KB in size, as set by the following macros:

#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N */

The header of each pool is defined as follows:

struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};

Note the member szidx, which corresponds to the size class idx in the last column of the table above. It follows that each pool hands out blocks of one fixed size only (for example, only 16-byte blocks, or only 24-byte blocks, and so on).

To serve all the block sizes in that table, there must be more than one pool. When a pool for a given size class is fully allocated, a new pool is needed. Pools of the same size class are linked together in a list.
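
Because every pool serves a single block size, the number of blocks it can hold follows from the pool size and the header size. The sketch below estimates it with ctypes. It rests on several assumptions: the class PoolHeader and both helpers are hypothetical, every pointer is modeled as a void pointer, the union is taken to be pointer-sized, and the header size is rounded up to a multiple of the 8-byte alignment.

```python
import ctypes

POOL_SIZE = 4 * 1024
ALIGNMENT = 8


class PoolHeader(ctypes.Structure):
    """Layout stand-in for struct pool_header (pointers modeled as void *)."""
    _fields_ = [
        ("ref", ctypes.c_void_p),        # union { block *_padding; uint count; }
        ("freeblock", ctypes.c_void_p),
        ("nextpool", ctypes.c_void_p),
        ("prevpool", ctypes.c_void_p),
        ("arenaindex", ctypes.c_uint),
        ("szidx", ctypes.c_uint),
        ("nextoffset", ctypes.c_uint),
        ("maxnextoffset", ctypes.c_uint),
    ]


def pool_overhead():
    """Header size rounded up to a multiple of ALIGNMENT (assumed rounding)."""
    size = ctypes.sizeof(PoolHeader)
    return (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


def blocks_per_pool(szidx):
    """Number of fixed-size blocks that fit in one pool of the given size class."""
    block_size = (szidx + 1) * ALIGNMENT
    return (POOL_SIZE - pool_overhead()) // block_size
```

The exact counts depend on the platform's pointer size, so the numbers are best read as estimates rather than as values taken from the interpreter.
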

Arena

Multiple pools are managed by a structure called an arena.

struct arena_object {
    uptr address;
    block* pool_address;
    uint nfreepools;
    uint ntotalpools;
    struct pool_header* freepools;
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};


The amount of memory an arena manages is set by the following macro:

#define ARENA_SIZE              (256 << 10)     /* 256KB */

The arenas in turn are chained together in a linked list.
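
The relationship between the arena and pool sizes can be sketched in Python. The helper usable_pools is hypothetical. The adjustment for a misaligned start address reflects how arena setup in obmalloc.c is understood to work and should be treated as an assumption, since the article does not show that code.

```python
ARENA_SIZE = 256 << 10      # 256 KB
POOL_SIZE = 4 * 1024        # 4 KB
POOL_SIZE_MASK = POOL_SIZE - 1


def usable_pools(arena_address):
    """Number of whole, pool-aligned pools in an arena starting at arena_address."""
    pools = ARENA_SIZE // POOL_SIZE
    if arena_address & POOL_SIZE_MASK:
        pools -= 1          # a misaligned start wastes part of one pool
    return pools
```

With a start address aligned to 4 KB, an arena holds 256 KB / 4 KB = 64 pools; otherwise one pool's worth of space is lost to alignment.
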

Reference counting and garbage collection

The lifecycle of most objects in Python is controlled by reference counting, which is how dynamic memory management is achieved.

But reference counting has a fatal weakness: circular references.

To break circular references, Python introduced a garbage collection mechanism.
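
The effect of a circular reference can be observed from Python itself. The sketch below uses the standard gc and weakref modules; the class Node and the helper functions are hypothetical names chosen for illustration.

```python
import gc
import weakref


class Node:
    """Minimal object that can point at another Node."""
    def __init__(self):
        self.other = None


def make_cycle():
    """Create two nodes that reference each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)   # a weak reference does not keep the node alive


def observe_cycle():
    """Return (alive after the names are gone, alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()            # keep the automatic collector out of the experiment
    try:
        probe = make_cycle()
        alive_before = probe() is not None
        gc.collect()        # an explicit collection still runs while disabled
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()
```

On CPython, observe_cycle() is expected to return (True, False): the two nodes keep each other's reference counts above zero after make_cycle returns, and only the cycle collector reclaims them.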
