I have recently wanted to study the Python source code, so I plan to write a series of blog posts to record what I learn and to push myself to keep going.
Python Source Directory
Download the source tarball from python.org and extract it. I downloaded Python 2.7.12.
Here is an introduction to the main folders in the extracted tree:
Include: Contains all the header files provided by Python. If you write a custom extension module for Python in C or C++, you need the header files here;
Lib: Contains all the standard libraries that ship with Python, all written in Python;
Modules: Contains all the modules written in C;
Parser: The scanner and parser of the Python interpreter (lexical analysis and syntax analysis of Python code). It also includes tools that automatically generate Python's lexical and syntax analysis code from the grammar of the language;
Objects: All of Python's built-in objects;
Python: The compiler and execution engine of the Python interpreter, the core of the Python runtime.
Objects in Python
The object is arguably the most central concept in Python: in the Python world, everything is an object. Python is written in C, and C is not an object-oriented language, yet Python itself is object-oriented. So how is its object mechanism implemented?
To a person, an object is something that can be described, but to a computer an object is an abstract concept, because all the computer knows about is bytes. The usual definition is that an object is a collection of data together with the operations on that data. Inside a computer, an object is really a block of allocated memory that is treated as a single whole at a higher level, and that whole is the object.
In Python, an object is a C struct allocated on the heap.
The cornerstone of the object mechanism: PyObject
In Python, everything is an object, and all objects share some common content. That common content is defined by PyObject in object.h.
typedef struct _object {
    PyObject_HEAD
} PyObject;
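To make the shared header concrete, here is a minimal sketch. It assumes a release build of Python 2.7, where PyObject_HEAD expands to a reference count followed by a type pointer. The names prefixed with Sketch or sketch_ are made up for illustration and are not the real CPython definitions; ptrdiff_t stands in for Py_ssize_t.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_typeobject;

/* Stand-in for PyObject_HEAD: a reference count, then a type pointer. */
#define SKETCH_OBJECT_HEAD \
    ptrdiff_t ob_refcnt;   \
    struct sketch_typeobject *ob_type;

/* Stand-in for PyObject: nothing but the header. */
typedef struct sketch_object {
    SKETCH_OBJECT_HEAD
} SketchObject;

/* A heavily reduced type object: header plus a name. */
typedef struct sketch_typeobject {
    SKETCH_OBJECT_HEAD
    const char *tp_name;
} SketchTypeObject;

/* A concrete object: the same header first, then its own data. */
typedef struct {
    SKETCH_OBJECT_HEAD
    long ob_ival;
} SketchIntObject;

/* Because every object starts with the same header, any object pointer
   can be viewed as a SketchObject* to reach the type pointer. */
static SketchTypeObject *sketch_type(SketchObject *op)
{
    assert(op != NULL);
    return op->ob_type;
}
```

The point of the sketch is only the layout: the header fields sit at the same offsets in every object, which is what allows generic code to work with a pointer to the header alone.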
Fixed-length objects and variable-length objects
Besides PyObject, Python has another structure, PyVarObject, which represents variable-length objects. PyVarObject is really an extension of PyObject.
Looking at the source, PyVarObject adds one field to PyObject: ob_size, which records the number of elements the object holds. The difference between the two kinds is this: different instances of a fixed-length type all occupy the same amount of memory, while different instances of a variable-length type may not. For example, the integer objects 1 and 100 both occupy sizeof(PyIntObject) bytes, but the string objects "Me" and "you" occupy different amounts of memory.
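The size difference can be sketched in a few lines of C. This is a simplified illustration, not the real PyIntObject or PyStringObject: the type pointer is left out, the names FixedInt, VarString, var_string_bytes and var_string_new are hypothetical, and a C99 flexible array member holds the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Fixed-length: every instance occupies sizeof(FixedInt) bytes. */
typedef struct {
    ptrdiff_t ob_refcnt;
    long ob_ival;
} FixedInt;

/* Variable-length: ob_size records how many elements follow the header. */
typedef struct {
    ptrdiff_t ob_refcnt;
    ptrdiff_t ob_size;
    char ob_sval[];      /* payload, ob_size characters plus a trailing NUL */
} VarString;

/* Bytes needed for a string object holding `length` characters. */
static size_t var_string_bytes(size_t length)
{
    return offsetof(VarString, ob_sval) + length + 1;
}

/* Allocate a string object sized for its own content. */
static VarString *var_string_new(const char *text)
{
    size_t length = strlen(text);
    VarString *s = malloc(var_string_bytes(length));
    if (s == NULL)
        return NULL;
    s->ob_refcnt = 1;
    s->ob_size = (ptrdiff_t)length;
    memcpy(s->ob_sval, text, length + 1);
    return s;
}
```

With this layout, var_string_bytes(2) for "Me" is one byte smaller than var_string_bytes(3) for "you", while every FixedInt has the same size regardless of the value stored in it.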
The polymorphism of Python objects
Polymorphism is one of the key features of object-oriented programming, so how does Python implement it?
When Python creates an object, it allocates memory and initializes it, and from then on Python internally holds and maintains the object through a PyObject * variable. This is true for every object in Python. For example, after creating a PyIntObject (an integer object), Python does not keep it in a PyIntObject * variable but in a PyObject *. Because all objects are handled this way, the functions inside Python pass around a single pointer type, PyObject *. What kind of object the pointer actually refers to is not known in advance; it can only be determined dynamically from the ob_type field of the object it points to. It is through this field that Python implements polymorphism.
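The dispatch through ob_type can be sketched as follows. This is an illustration under simplifying assumptions: the Demo types are hypothetical, the type object holds a single function pointer named tp_repr, and the header is embedded as a first member rather than expanded from a macro as CPython does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct demo_type;

/* Common header shared by every object. */
typedef struct demo_object {
    ptrdiff_t ob_refcnt;
    struct demo_type *ob_type;
} DemoObject;

/* The type object carries the behavior of its instances. */
typedef struct demo_type {
    const char *tp_name;
    /* Writes a text form of the object into buf. */
    int (*tp_repr)(DemoObject *self, char *buf, size_t size);
} DemoType;

typedef struct { DemoObject head; long ob_ival; } DemoInt;
typedef struct { DemoObject head; const char *ob_sval; } DemoStr;

static int int_repr(DemoObject *self, char *buf, size_t size)
{
    return snprintf(buf, size, "%ld", ((DemoInt *)self)->ob_ival);
}

static int str_repr(DemoObject *self, char *buf, size_t size)
{
    return snprintf(buf, size, "'%s'", ((DemoStr *)self)->ob_sval);
}

static DemoType DemoInt_Type = {"int", int_repr};
static DemoType DemoStr_Type = {"str", str_repr};

/* Generic entry point: it sees only a DemoObject* and lets ob_type decide. */
static int demo_repr(DemoObject *op, char *buf, size_t size)
{
    return op->ob_type->tp_repr(op, buf, size);
}

/* Helper for checking the result of demo_repr against an expected string. */
static int demo_repr_equals(DemoObject *op, const char *expected)
{
    char buf[64];
    demo_repr(op, buf, sizeof buf);
    return strcmp(buf, expected) == 0;
}
```

The caller of demo_repr never names DemoInt or DemoStr. The same call produces different behavior depending on which type object the header points to, which is the idea the paragraph above describes.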
Reference count
Unlike C or C++, Python lets the language itself manage and maintain memory through a garbage collection mechanism, which relieves the programmer of heavy memory management work. Reference counting is one part of that mechanism.
Python decides whether an object stays in memory by counting the references to it. Everything in Python is an object, and every object has an ob_refcnt field that holds its reference count. This count determines when the object is destroyed.
Python uses two macros, Py_INCREF(op) and Py_DECREF(op), to increase and decrease the reference count of an object. When an object is created, the _Py_NewReference(op) macro initializes its reference count to 1.
When the reference count of an object drops to 0, the destructor for that object is called. The destructor does not necessarily call free to release the memory, though. To avoid requesting and freeing memory too often, Python makes heavy use of memory object pools: it keeps pools of a certain size, and when the destructor is called, the space occupied by the object is returned to the pool.
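The counting rule can be sketched with plain functions instead of macros. The rc_ names below are hypothetical stand-ins for _Py_NewReference, Py_INCREF and Py_DECREF, and the destructor here only counts its calls so that the moment of destruction is observable.

```c
#include <assert.h>
#include <stddef.h>

struct rc_type;

typedef struct rc_object {
    ptrdiff_t ob_refcnt;
    struct rc_type *ob_type;
} RcObject;

/* The type decides how its instances are destroyed. */
typedef struct rc_type {
    void (*tp_dealloc)(RcObject *op);
} RcType;

/* A freshly created object starts with one reference. */
static void rc_new_reference(RcObject *op) { op->ob_refcnt = 1; }

static void rc_incref(RcObject *op) { op->ob_refcnt++; }

/* Dropping the last reference hands the object to its destructor. */
static void rc_decref(RcObject *op)
{
    assert(op->ob_refcnt > 0);
    if (--op->ob_refcnt == 0)
        op->ob_type->tp_dealloc(op);
}

/* Demo destructor: records the call instead of freeing anything. A real
   one might return the memory to a pool rather than call free(). */
static int dealloc_calls = 0;

static void counting_dealloc(RcObject *op)
{
    (void)op;
    dealloc_calls++;
}

static RcType Counting_Type = {counting_dealloc};
```

In this sketch an object that is created, referenced once more, and then released twice reaches a count of 0 only on the second release, and that is the only point at which counting_dealloc runs.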
An integer object in Python
Of all Python objects, integer objects are the simplest and the most frequently used, so we start with them. The implementation is in Objects/intobject.c, and the struct is declared in Include/intobject.h. An integer is represented by a PyIntObject, and once a PyIntObject has been created, its value cannot be changed. It is defined as:
typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;
As you can see, an integer object in Python is a thin wrapper around a C long. The size of the data held by an integer object is therefore fixed when the type is defined: it is the size of a long in C.
Integers are used very heavily in Python, so they are created and released very often. How can the mechanism be made efficient enough that integer objects do not become a bottleneck? Python solves this with a buffer pool for integer objects. With the pool in place, the integer objects alive at run time are not independent of each other; they are all part of one shared integer object system.
Small Integer Object
In real programs, small integers such as 1 and 2 are used all the time. In Python, however, all objects live on the system heap. Without a special mechanism for small integers, Python would request heap space and free it again and again, which would greatly reduce efficiency.
So how is this solved? Python uses an object pool for small integer objects.
That raises another question: how does Python tell small integers from large ones? Python does let you move the cutoff between small and large integers, and so decide how many objects the small integer pool should hold, through the NSMALLNEGINTS and NSMALLPOSINTS macros. The only way to change them, though, is to edit the source code and recompile.
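A minimal sketch of the small integer pool follows. The macro values 257 and 5 are the defaults in the Python 2.7 source as far as I know, giving a pooled range of -5 through 256. Everything else (PoolInt, pool_init, pool_int_from_long) is hypothetical, and large integers are simply allocated with malloc here, which is not what CPython does for them.

```c
#include <assert.h>
#include <stdlib.h>

#define NSMALLPOSINTS 257   /* pool covers 0 .. 256 */
#define NSMALLNEGINTS 5     /* pool covers -5 .. -1 */

typedef struct {
    long ob_refcnt;
    long ob_ival;
} PoolInt;

/* One preallocated object per small value, indexed by value + NSMALLNEGINTS. */
static PoolInt *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

/* Create every small integer object once, up front. Returns 0 on success. */
static int pool_init(void)
{
    for (long v = -NSMALLNEGINTS; v < NSMALLPOSINTS; v++) {
        PoolInt *p = malloc(sizeof *p);
        if (p == NULL)
            return -1;
        p->ob_refcnt = 1;
        p->ob_ival = v;
        small_ints[v + NSMALLNEGINTS] = p;
    }
    return 0;
}

/* Small values share the pooled object; other values get a new object. */
static PoolInt *pool_int_from_long(long ival)
{
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        PoolInt *p = small_ints[ival + NSMALLNEGINTS];
        p->ob_refcnt++;
        return p;
    }
    PoolInt *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->ob_refcnt = 1;
    p->ob_ival = ival;
    return p;
}
```

Asking this sketch for 100 twice returns the same pointer both times, while asking for 1000 twice returns two distinct objects. Changing the two macros moves the cutoff, which mirrors the edit-and-recompile adjustment described above.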
Large Integer Object
Small integers are cached in full as PyIntObject objects in the small integer pool. For all other integers, Python provides a block of memory that large integers take turns using: whichever integer needs a slot takes one.
Concretely, Python has a PyIntBlock structure that manages one such block of memory. The block holds a number of PyIntObject objects, and that number can be adjusted. At any moment while Python is running, some of this memory is in use and some is idle. The idle memory has to be organized so that Python can get a slot quickly when it needs one, so Python uses a singly linked list (free_list) to manage all the free slots.
#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;
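How a block is carved into a free list can be sketched like this. It follows my understanding of the approach in Objects/intobject.c, where the ob_type field of an idle slot is reused as the link to the next free slot, but FlInt, FlBlock, fl_alloc and fl_release are hypothetical names and the object header is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct fl_int {
    ptrdiff_t ob_refcnt;
    void *ob_type;       /* while the slot is idle: link to the next free slot */
    long ob_ival;
} FlInt;

#define BLOCK_SIZE   1000
#define BHEAD_SIZE   8
#define N_INTOBJECTS ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(FlInt))

typedef struct fl_block {
    struct fl_block *next;
    FlInt objects[N_INTOBJECTS];
} FlBlock;

static FlBlock *block_list = NULL;   /* every block ever allocated */
static FlInt *free_list = NULL;      /* head of the idle-slot chain */

/* Allocate one block and chain its slots together. Each slot links to the
   one before it, so the returned head is the last slot of the block. */
static FlInt *fill_free_list(void)
{
    FlBlock *block = malloc(sizeof *block);
    if (block == NULL)
        return NULL;
    block->next = block_list;
    block_list = block;

    FlInt *first = &block->objects[0];
    FlInt *q = first + N_INTOBJECTS;
    while (--q > first)
        q->ob_type = q - 1;
    q->ob_type = NULL;               /* objects[0] ends the chain */
    return first + N_INTOBJECTS - 1;
}

/* Take a slot from the free list, refilling it from a new block if empty. */
static FlInt *fl_alloc(long ival)
{
    if (free_list == NULL && (free_list = fill_free_list()) == NULL)
        return NULL;
    FlInt *v = free_list;
    free_list = v->ob_type;
    v->ob_refcnt = 1;
    v->ob_type = NULL;               /* a real object would get its type here */
    v->ob_ival = ival;
    return v;
}

/* Give a slot back: it becomes the new head of the free list. */
static void fl_release(FlInt *v)
{
    v->ob_type = free_list;
    free_list = v;
}
```

A released slot is the first one handed out again, and a new block is requested from malloc only after every slot of the current block is in use.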
Create
We now have a general picture of how Python's integer object system is laid out in memory. Next, let's see how a PyIntObject comes into existence from nothing. There are two main steps:
First, if the small integer pool is enabled, Python tries to take the object from it. Second, if the small integer pool cannot be used, Python falls back to the general integer object pool.
PyInt_FromLong shows both steps:
PyObject *
PyInt_FromLong(long ival)
{
    register PyIntObject *v;
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        v = small_ints[ival + NSMALLNEGINTS];
        Py_INCREF(v);
#ifdef COUNT_ALLOCS
        if (ival >= 0)
            quick_int_allocs++;
        else
            quick_neg_int_allocs++;
#endif
        return (PyObject *) v;
    }
#endif
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
    /* Inline PyObject_New */
    v = free_list;
    free_list = (PyIntObject *)Py_TYPE(v);
    PyObject_INIT(v, &PyInt_Type);
    v->ob_ival = ival;
    return (PyObject *) v;
}
That is all for today's study. Tomorrow I will continue with the analysis.
Python Source Analysis (I)