How Python programs are executed
1. Process Overview
Python first compiles the source code (the .py file) into bytecode and hands it to the bytecode virtual machine. The virtual machine then executes the bytecode instructions one by one to run the program.
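The two stages can be observed from Python itself. The following is a minimal sketch, assuming the built-in compile() and exec() functions are used in place of the interpreter's internal pipeline; the source string and the name result are made up for illustration.

```python
# Stage 1: compile source text into a code object (the bytecode container).
source = "result = 6 * 7\n"
code = compile(source, "<example>", "exec")

# Stage 2: hand the code object to the virtual machine for execution.
namespace = {}
exec(code, namespace)

print(type(code).__name__)   # code
print(namespace["result"])   # 42
```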
2. Byte code
Inside the Python virtual machine, bytecode is represented by a PyCodeObject object. A .pyc file is the on-disk representation of that bytecode.
3. .pyc File
A PyCodeObject object is created when a module is loaded, that is, on import.
Running python test.py compiles test.py into bytecode and interprets it, but does not generate test.pyc.
If test.py loads other modules, for example with import util, Python compiles util.py into bytecode, generates util.pyc, and then interprets the bytecode.
If you want to generate test.pyc, you can compile it with Python's built-in py_compile module.
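A minimal sketch of using py_compile follows. It assumes Python 3, where compile() returns the path of the file it wrote; the temporary directory and the one-line module are made up so that no real files are touched.

```python
import os
import py_compile
import tempfile

# Write a throwaway module so the example does not touch real files.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "test.py")
with open(src_path, "w") as f:
    f.write("s = 'hello'\n")

# cfile pins the output location; without it Python 3 writes into __pycache__/.
pyc_path = py_compile.compile(src_path, cfile=os.path.join(workdir, "test.pyc"))

print(pyc_path)
print(os.path.exists(pyc_path))
```

From the command line, python -m py_compile test.py does the same job.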
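When a module is loaded and both the .py and the .pyc exist, Python tries to use the .pyc. The .pyc records the modification time of the .py it was compiled from; if that no longer matches the .py (that is, the .py has been edited since the .pyc was built), Python recompiles the .py and updates the .pyc.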
4. PyCodeObject
The result of compiling Python code is a PyCodeObject object.
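typedef struct {
    PyObject_HEAD
    int co_argcount;        /* number of positional arguments */
    int co_nlocals;         /* number of local variables */
    int co_stacksize;       /* stack size */
    int co_flags;
    PyObject *co_code;      /* bytecode instruction sequence */
    PyObject *co_consts;    /* all constants */
    PyObject *co_names;     /* all symbol names */
    PyObject *co_varnames;  /* local variable names */
    PyObject *co_freevars;  /* names of variables used by closures */
    PyObject *co_cellvars;  /* names of variables referenced by nested functions */
    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;  /* name of the file the code came from */
    PyObject *co_name;      /* module name | function name | class name */
    int co_firstlineno;     /* first line number of the code block in the file */
    PyObject *co_lnotab;    /* mapping between bytecode instructions and line numbers */
    void *co_zombieframe;   /* for optimization only (see frameobject.c) */
} PyCodeObject;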
5. .pyc File Format
When a module is loaded, the module's PyCodeObject object is written to the .pyc file in the following format:
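As a rough, version-specific illustration, a .pyc file can be taken apart by hand. The sketch below assumes CPython 3.7 or later, where the file starts with a 16-byte header (magic number, flags, source modification time, source size) followed by the marshalled code object; Python 2, which the rest of this article describes, used an 8-byte header holding only the magic number and the modification time. The temporary module is made up for illustration.

```python
import importlib.util
import marshal
import os
import py_compile
import struct
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "test.py")
with open(src_path, "w") as f:
    f.write("s = 'hello'\n")

# Force the timestamp-based format so the header fields are predictable.
pyc_path = py_compile.compile(
    src_path,
    cfile=os.path.join(workdir, "test.pyc"),
    invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
)

with open(pyc_path, "rb") as f:
    data = f.read()

magic = data[:4]                                  # identifies the bytecode version
flags, mtime, source_size = struct.unpack("<III", data[4:16])
code = marshal.loads(data[16:])                   # the serialized code object

print(magic == importlib.util.MAGIC_NUMBER)
print(flags, mtime, source_size)
print(code.co_names)
```

The mtime field is what the loader compares against the .py file to decide whether the .pyc is stale.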
6. Parsing byte code
6.1 Parsing PyCodeObject
Python's built-in compile function compiles Python code and lets you inspect the resulting PyCodeObject object, as follows:
Python code [test.py]
s = "hello"

def func():
    print s

func()
Compile the code in the Python interactive shell to get the PyCodeObject object:
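A session along these lines can be sketched as follows. It assumes Python 3, so test.py is rewritten with print(s) and is passed in as a string rather than read from disk; the field values therefore differ slightly from the Python 2 values quoted in this article.

```python
# Python 3 variant of test.py, given as a string (an assumption for this sketch).
source = 's = "hello"\n\ndef func():\n    print(s)\n\nfunc()\n'

co = compile(source, "test.py", "exec")

# dir(co) lists every field; the co_* ones mirror the C struct members.
print([name for name in dir(co) if name.startswith("co_")])
print(co.co_names)
print(co.co_filename)

# The code object compiled for func is stored among the module's constants.
func_code = next(c for c in co.co_consts if hasattr(c, "co_code"))
print(func_code.co_name, func_code.co_argcount, func_code.co_names)
```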
dir(co) lists the fields of co. To inspect a single field, print it directly in the terminal:
PyCodeObject of test.py
co.co_argcount 0
co.co_nlocals 0
co.co_names ('s', 'func')
co.co_varnames ('s', 'func')
co.co_consts ('hello', <code object func at 0x2aaeeec57110, file "test.py", line 3>, None)
co.co_code 'd\x00\x00Z\x00\x00d\x01\x00\x84\x00\x00Z\x01\x00e\x01\x00\x83\x00\x00\x01d\x02\x00S'
The Python interpreter also generates a PyCodeObject object for the function func; see co_consts[1] above.
PyCodeObject of func
func.co_argcount 0
func.co_nlocals 0
func.co_names ('s',)
func.co_varnames ()
func.co_consts (None,)
func.co_code 't\x00\x00GHd\x00\x00S'
co_code is the instruction sequence, stored as a binary byte string. Its format and how to parse it are described in 6.2.
6.2 Parsing Instruction sequence
Format of the instruction sequence co_code
Python's built-in dis module can parse co_code, for example:
Instruction sequence of test.py
Instruction sequence of the func function
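A minimal sketch of producing such listings with dis follows. It assumes Python 3.7 or later, where dis.dis also descends into nested code objects such as func; the exact opcodes printed depend on the interpreter version and will not match the Python 2 listings this article refers to.

```python
import dis

# Python 3 variant of test.py, given as a string (an assumption for this sketch).
source = 's = "hello"\n\ndef func():\n    print(s)\n\nfunc()\n'
co = compile(source, "test.py", "exec")

# Human-readable listing: line number, offset, opcode name, operand, description.
dis.dis(co)

# The same information is available programmatically.
opnames = [ins.opname for ins in dis.get_instructions(co)]
print(opnames)
```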
The first column is the line number in the .py file of the instructions that follow;
The second column is the offset of the instruction within the instruction sequence co_code;
The third column is the name of the opcode. Opcodes come in two kinds, with and without an operand, and each opcode is a one-byte integer in the instruction sequence;
The fourth column is the operand oparg, which occupies two bytes in the instruction sequence and is usually an index into co_consts or co_names;
The fifth column, in parentheses, is a description of the operand.
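The layout described above (a one-byte opcode, optionally followed by a two-byte operand) can be decoded by hand. The sketch below applies it to the co_code string of test.py quoted in 6.1. The opcode numbers are an assumption taken from the Python 2.7 opcode table and cover only the instructions that appear in this article; OPNAMES and decode are hypothetical names. Python 3.6 and later use a different two-byte instruction format.

```python
# Subset of the Python 2.7 opcode table (assumed values), enough for test.py.
OPNAMES = {
    1: "POP_TOP", 71: "PRINT_ITEM", 72: "PRINT_NEWLINE", 83: "RETURN_VALUE",
    90: "STORE_NAME", 100: "LOAD_CONST", 101: "LOAD_NAME", 116: "LOAD_GLOBAL",
    131: "CALL_FUNCTION", 132: "MAKE_FUNCTION",
}
HAVE_ARGUMENT = 90  # opcodes >= 90 carry a 2-byte little-endian operand


def decode(co_code):
    """Yield (offset, opname, oparg) for a Python 2 style co_code byte string."""
    data = bytearray(co_code)
    i = 0
    while i < len(data):
        op = data[i]
        name = OPNAMES.get(op, "<%d>" % op)
        if op >= HAVE_ARGUMENT:
            oparg = data[i + 1] | (data[i + 2] << 8)
            yield i, name, oparg
            i += 3
        else:
            yield i, name, None
            i += 1


module_code = (b"d\x00\x00Z\x00\x00d\x01\x00\x84\x00\x00Z\x01\x00"
               b"e\x01\x00\x83\x00\x00\x01d\x02\x00S")
for offset, name, oparg in decode(module_code):
    print(offset, name, "" if oparg is None else oparg)
```

Running decode on func's co_code, b't\x00\x00GHd\x00\x00S', yields LOAD_GLOBAL 0, PRINT_ITEM, PRINT_NEWLINE, LOAD_CONST 0 and RETURN_VALUE.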
7. Execute byte code
The Python virtual machine works by simulating how an executable program runs on an x86 machine. The x86 runtime stack frames look like this:
If test.py were implemented in C, it would look like this:
#include <stdio.h>

const char *s = "hello";

void func() {
    printf("%s\n", s);
}

int main() {
    func();
    return 0;
}
The Python virtual machine simulates this behavior. When a function call occurs, a new stack frame is created; in the Python implementation, a stack frame corresponds to a PyFrameObject object.
7.1 PyFrameObject
typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;    /* frame of the caller */
    PyCodeObject *f_code;     /* code object for this frame */
    PyObject *f_builtins;     /* builtin namespace */
    PyObject *f_globals;      /* global namespace */
    PyObject *f_locals;       /* local namespace */
    PyObject **f_valuestack;  /* bottom of the runtime stack */
    PyObject **f_stacktop;    /* top of the runtime stack */
    ...
} PyFrameObject;
So the runtime stack that corresponds to Python is like this:
7.2 Execution Instructions
Executing the bytecode of test.py creates a stack frame. Below, f denotes the current stack frame, and the execution process is annotated as follows:
Symbol names and constants of test.py
co.co_names ('s', 'func')
co.co_consts ('hello', <code object func at 0x2aaeeec57110, file "test.py", line 3>, None)
Instruction sequence of test.py
When the CALL_FUNCTION instruction above executes, a new stack frame is created and the bytecode instructions of func are executed. Below, f denotes the current stack frame, and the execution of func's bytecode proceeds as follows:
Symbol names and constants of the func function
func.co_names ('s',)
func.co_consts (None,)
Instruction sequence of the func function
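The execution process described in this section can be imitated with a toy stack machine. The sketch below is not CPython's evaluation loop: Frame, Code and run are hypothetical stand-ins, the instruction lists are written out by hand from the Python 2 bytecode of test.py, a single dictionary serves as both the local and global namespace of the module, and MAKE_FUNCTION is simplified so that the code object itself plays the role of the function.

```python
import sys


class Code(object):
    """Toy stand-in for PyCodeObject: decoded instructions plus name/constant tables."""
    def __init__(self, name, instructions, names, consts):
        self.co_name = name
        self.instructions = instructions
        self.co_names = names
        self.co_consts = consts


class Frame(object):
    """Toy stand-in for PyFrameObject: code, namespace, caller link, value stack."""
    def __init__(self, code, f_globals, f_back=None):
        self.f_code = code
        self.f_globals = f_globals
        self.f_back = f_back
        self.stack = []


def run(f, output):
    """Execute the instructions of frame f; return what RETURN_VALUE pops."""
    code = f.f_code
    for opname, oparg in code.instructions:
        if opname == "LOAD_CONST":
            f.stack.append(code.co_consts[oparg])
        elif opname in ("LOAD_NAME", "LOAD_GLOBAL"):
            f.stack.append(f.f_globals[code.co_names[oparg]])
        elif opname == "STORE_NAME":
            f.f_globals[code.co_names[oparg]] = f.stack.pop()
        elif opname == "MAKE_FUNCTION":
            pass  # simplification: leave the code object on the stack as the "function"
        elif opname == "CALL_FUNCTION":
            callee = f.stack.pop()
            new_frame = Frame(callee, f.f_globals, f_back=f)  # new stack frame
            f.stack.append(run(new_frame, output))
        elif opname == "PRINT_ITEM":
            output.append(str(f.stack.pop()))
        elif opname == "PRINT_NEWLINE":
            output.append("\n")
        elif opname == "POP_TOP":
            f.stack.pop()
        elif opname == "RETURN_VALUE":
            return f.stack.pop()


func_code = Code("func",
                 [("LOAD_GLOBAL", 0), ("PRINT_ITEM", None), ("PRINT_NEWLINE", None),
                  ("LOAD_CONST", 0), ("RETURN_VALUE", None)],
                 ("s",), (None,))
module_code = Code("<module>",
                   [("LOAD_CONST", 0), ("STORE_NAME", 0), ("LOAD_CONST", 1),
                    ("MAKE_FUNCTION", 0), ("STORE_NAME", 1), ("LOAD_NAME", 1),
                    ("CALL_FUNCTION", 0), ("POP_TOP", None), ("LOAD_CONST", 2),
                    ("RETURN_VALUE", None)],
                   ("s", "func"), ("hello", func_code, None))

output = []
result = run(Frame(module_code, {}), output)
sys.stdout.write("".join(output))  # prints: hello
```

Each operand is used exactly as section 6.2 describes, as an index into co_consts or co_names, and the nested call to run mirrors the creation of a new frame whose f_back points at the caller.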
7.3 Viewing Stack Frames
To view the current stack frame, Python provides sys._getframe(), which returns the frame that is currently executing. You only need to add code like the following:
def func():
    import sys
    frame = sys._getframe()
    print frame.f_locals
    print frame.f_globals
    print frame.f_back.f_locals
    # you can print any field of the frame
    print s