How to disassemble Python code into bytecode using the dis module

Source: Internet
Author: User
Python code can be evaluated for performance or quality by inspecting the bytecode instructions it compiles to, and the dis module makes those instructions available. This article walks through using dis, the disassembler for Python bytecode, to turn Python code into readable bytecode instructions.
Basic usage:

python -m dis xxx.py

Python code is first compiled into bytecode, which the Python virtual machine then executes. Python bytecode is an intermediate language similar to assembly instructions: one Python statement corresponds to several bytecode instructions, and the virtual machine executes them one by one to run the program.
The dis module disassembles Python code and lists those bytecode instructions.
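The same disassembly can also be done from inside a program. The following is a minimal sketch, assuming Python 3.4 or later (where `dis.get_instructions` is available); the one-line source string and the `<sketch>` filename are made up for illustration, and the exact instruction names vary between CPython versions.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = "total = price * 2\n"

# compile() produces the code object that the virtual machine would execute.
# Nothing is run here, so the undefined name `price` is not a problem.
code_obj = compile(source, "<sketch>", "exec")

# Each Instruction object describes one bytecode instruction.
opnames = [ins.opname for ins in dis.get_instructions(code_obj)]

for name in opnames:
    print(name)
```

A single assignment statement expands into several instructions, including a name lookup for `price` and a store into `total`.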
A claim often seen online is that while 1 is faster than while True. Why would there be a difference?
dis makes it possible to look into this in detail.
Suppose test_while.py contains the following code.

#coding=utf-8
while 1:
    pass

while True:
    pass

Running dis on the file produces output that begins as follows.

E:\>python -m dis test_while.py
  2           0 SETUP_LOOP               3 (to 6)

  3     >>    3 JUMP_ABSOLUTE            3

  5     >>    6 SETUP_LOOP              10 (to 19)
        >>    9 LOAD_NAME                0 (True)
             12 POP_JUMP_IF_FALSE       18

As you can see, the while 1 loop (source lines 2 and 3) needs only a JUMP_ABSOLUTE instruction,
whereas the while True loop (starting at source line 5) is made up of LOAD_NAME and POP_JUMP_IF_FALSE instructions.
The reason is that in Python 2, True is not a keyword but a built-in name bound to a bool whose numeric value is 1, so True + True evaluates to 2.
It can also be reassigned: for example, True = 2 or even True = False is allowed.
Therefore, in a while True loop the name True has to be looked up and tested on every iteration, which is what the LOAD_NAME instruction does.
This is why while True is slower than while 1 in Python 2.
In Python 3, however, True is a keyword, so while 1 and while True compile to the same instructions and there is no performance difference.
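The Python 3 behaviour can be checked directly. The sketch below assumes a recent CPython 3 release; the helper name `opnames` and the `<sketch>` filename are illustrative and not part of the original article. It compiles both loops without running them, compares their instruction names, and then confirms that assigning to True is rejected.

```python
import dis

def opnames(source):
    """Compile module-level source and return its opcode names in order."""
    code_obj = compile(source, "<sketch>", "exec")
    return [ins.opname for ins in dis.get_instructions(code_obj)]

# Compiling does not execute the loops, so nothing hangs here.
ops_one = opnames("while 1:\n    pass\n")
ops_true = opnames("while True:\n    pass\n")

same_bytecode = ops_one == ops_true
print("identical opcodes:", same_bytecode)

# In Python 3, True is a keyword, so rebinding it fails at compile time.
try:
    compile("True = 2\n", "<sketch>", "exec")
    true_is_assignable = True
except SyntaxError:
    true_is_assignable = False
print("True can be reassigned:", true_is_assignable)
```

Because True can no longer be rebound, the compiler can treat it as a constant, and no LOAD_NAME lookup is needed inside the loop.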

Now let's look at a small example, starting with a short function:

In[6]: def test():
...        x = 1
...        if x < 3:
...            return "yes"
...        else:
...            return "no"

Disassembling the function produces the following output:

In[7]: import dis
In[8]: dis.dis(test)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)
              9 LOAD_CONST               2 (3)
             12 COMPARE_OP               0 (<)
             15 POP_JUMP_IF_FALSE       22

  4          18 LOAD_CONST               3 ('yes')
             21 RETURN_VALUE

  6     >>   22 LOAD_CONST               4 ('no')
             25 RETURN_VALUE
             26 LOAD_CONST               0 (None)
             29 RETURN_VALUE

Take the first instruction as an example. The number in the first column (2) is the source line number that the instruction belongs to. The second column is the bytecode offset: the LOAD_CONST instruction sits at offset 0. The third column is the human-readable name of the instruction. The fourth column is the instruction's argument, and the fifth column, in parentheses, is the value that the argument resolves to. ">>" marks an instruction that is the target of a jump; the 22 in the fourth column of POP_JUMP_IF_FALSE means a jump to offset 22.

Python code is compiled into a code object. The code object is the virtual machine's representation of the compiled code and is defined as PyCodeObject in the CPython C source. A .pyc file is the on-disk form of the bytecode.
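The columns described above can also be read programmatically. This is a rough sketch, assuming Python 3.4 or later for `dis.get_instructions`; the `rows` list and the print layout are illustrative choices. Offsets and instruction names on a modern interpreter will differ from the Python 2 listing in this article.

```python
import dis

def test():
    x = 1
    if x < 3:
        return "yes"
    else:
        return "no"

# Collect (offset, opname, arg, argrepr) for every instruction.
rows = [(ins.offset, ins.opname, ins.arg, ins.argrepr)
        for ins in dis.get_instructions(test)]

for offset, opname, arg, argrepr in rows:
    arg_text = "" if arg is None else str(arg)          # raw argument, if any
    detail = "({})".format(argrepr) if argrepr else ""  # resolved value
    print("{:>4} {:<22} {:>4} {}".format(offset, opname, arg_text, detail))
```

Here `offset` corresponds to the second column of the dis output, `opname` to the third, `arg` to the fourth, and `argrepr` to the value shown in parentheses.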
In Python code, test.__code__.co_code holds the bytecode instruction sequence of the test function.
Print the sequence (Python 2 syntax):

# In Python 2, co_code is a str, so ord() converts each character to its byte value.
code = [ord(c) for c in test.__code__.co_code]
print code

Output:

[100, 1, 0, 125, 0, 0, 124, 0, 0, 100, 2, 0, 107, 0, 0, 114, 22, 0, 100, 3, 0, 83, 100, 4, 0, 83, 100, 0, 0, 83]

Take the first three bytes, [100, 1, 0], and compare them with the instructions printed by dis. 100 is the opcode number in Python's bytecode definition; in Python code,
dis.opname[100] shows that it is LOAD_CONST. The next two bytes hold the instruction's argument, low byte first, so 1, 0 is the argument 1.
The bytecode offset in the second column of the dis output is the position of the instruction within the co_code sequence.
Some instructions take no argument: in co_code, 83 (RETURN_VALUE) is followed immediately by the next instruction, 100 (LOAD_CONST).
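For readers on Python 3, a comparable inspection might look like the sketch below. It assumes CPython 3.6 or later, where every instruction occupies two bytes, so the raw numbers will not match the Python 2 list above and opcode numbers such as 100 should not be relied on across versions. The variable names are illustrative.

```python
import dis

def test():
    x = 1
    if x < 3:
        return "yes"
    else:
        return "no"

code_obj = test.__code__

# In Python 3, co_code is a bytes object, so iterating yields ints directly
# and no ord() call is needed.
raw = list(code_obj.co_code)
print(raw)

# Map the first opcode number back to its name.
first_name = dis.opname[raw[0]]
print(raw[0], "->", first_name)

# Tables stored on the code object that instruction arguments refer to.
print("constants:", code_obj.co_consts)
print("locals:   ", code_obj.co_varnames)
```

In the listing from this article, the argument of LOAD_CONST is an index into `co_consts` and the argument of STORE_FAST or LOAD_FAST is an index into `co_varnames`, which is how dis arrives at the values it prints in parentheses.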
