Original article: http://www.jianshu.com/p/14586ec50ab6
Python code uses modules all the time, through statements such as import xxx and from xxx import yyy, so the mechanism behind them is worth exploring. This note mostly probes the module mechanism from a black-box angle and only touches on the source code. For a detailed source analysis, see Chapter 14 of Chen Ru's "Python Source Code Anatomy".
1 How to import modules
First, consider an example of importing modules. Create a folder demo5 that contains the following files:

    ~/demo5 $ ls
    __init__.py  math.py  sys.py  test.py

By Python's rules, demo5 is a package because the folder contains an __init__.py file. The contents of each file are as follows:
    # __init__.py
    import sys
    import math

    # math.py
    print 'my math'

    # sys.py
    print 'my sys'

    # test.py
    import sys
    import math
Now the question: what is printed when we run python test.py inside the demo5 directory? Are the sys and math modules that get imported the ones under demo5, or the system ones? The result is that only my math is printed, so the sys module that gets imported is not the one under the demo5 directory. But what if we import the whole package instead of running test.py directly? The result is very different: when we run import demo5 from the parent directory of demo5, both my sys and my math are printed. In other words, the two modules under the demo5 directory are the ones imported. The two results differ because of how Python imports modules and packages, which the rest of this note analyzes.
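The print statements in the example are Python 2 syntax, and the results described above depend on Python 2 trying imports inside a package relative to that package first. As a hedged sketch for readers who want to rerun the experiment on Python 3, the script below recreates the layout in a temporary directory and runs both cases. The temporary directory, the run helper, and the print() calls are assumptions for illustration and are not in the original. On Python 3, imports are absolute by default, so importing the package should not pick up demo5/sys.py or demo5/math.py.

```python
import os
import subprocess
import sys
import tempfile

# Recreate the demo5 layout from the text in a throwaway directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo5")
os.mkdir(pkg)
files = {
    "__init__.py": "import sys\nimport math\n",
    "math.py": "print('my math')\n",
    "sys.py": "print('my sys')\n",
    "test.py": "import sys\nimport math\n",
}
for filename, body in files.items():
    with open(os.path.join(pkg, filename), "w") as f:
        f.write(body)


def run(args, cwd):
    """Run the current interpreter with args in cwd and return its stdout lines."""
    done = subprocess.run([sys.executable] + args, cwd=cwd,
                          capture_output=True, text=True)
    return done.stdout.splitlines()


# Case 1: run test.py directly from inside demo5.
script_out = run(["test.py"], cwd=pkg)
# Case 2: import the package from its parent directory.
package_out = run(["-c", "import demo5"], cwd=root)

print("python test.py ->", script_out)
print("import demo5   ->", package_out)
```

In the first case the local sys.py is never executed, on Python 2 or Python 3, because sys is already loaded when the script starts. Whether the local math.py is executed depends on whether math is compiled into the interpreter being used.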
2 Python module and package import principle
The call path for importing modules and packages in Python is builtin___import__ -> import_module_level -> load_next -> import_submodule -> find_module -> load_module. This note does not analyze every function; it only picks out a few key pieces of code.
The builtin___import__ function parses the import arguments; the parsed arguments differ between import xxx and from yyy import xxx. The import_module_level function then resolves the tree structure of modules and packages and calls load_next to import each module. load_next calls import_submodule to find and import the module. Note that when the import is issued from inside a package, load_next first calls import_submodule with the full module name, including the package name. Only if that lookup fails does it retry with the bare module name.

import_submodule first uses the full module name, fullname, to check whether the module is already in the sys.modules dictionary mentioned earlier, as the preloaded system modules sys and os are. If it is, that module is returned directly. Otherwise import_submodule calls find_module to search the module path and then calls load_module to import the module. Note that when the import does not come from inside a package, find_module first checks whether the module is built into the interpreter. Built-in here means system modules that are compiled in but not preloaded, such as imp, and math on builds where it is compiled into the interpreter. If it is, the built-in module is initialized directly and added to the backup module collection extensions mentioned earlier. Otherwise find_module searches the package path and the system default path for the module and reports an error if it cannot be found. Once the module is found, it is initialized and a reference to it is added to sys.modules.
The load_module function needs some extra explanation. It uses a different loading method depending on the module type; the basic types are PY_SOURCE, PY_COMPILED, C_BUILTIN, C_EXTENSION, PKG_DIRECTORY and so on. PY_SOURCE is an ordinary .py file, and PY_COMPILED is a compiled .pyc file. If both the .py file and the .pyc file exist, the type is PY_SOURCE. You may wonder whether that hurts efficiency. In fact it is done to make sure the latest module code is imported: load_source_module checks whether the .pyc file is out of date and, if it is not, still loads the .pyc file, so the performance impact is small. C_BUILTIN is a module built into the interpreter, such as imp. C_EXTENSION is an extension module, usually a dynamic link library, such as math.so. PKG_DIRECTORY means importing a package, such as demo5: the package demo5 itself is imported first and then its __init__.py module.
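The module types above come from the C code of Python 2. As a rough analogy only, the sketch below asks Python 3's importlib how a given module would be loaded and maps the answer onto the closest of those type names. The function classify_module and the mapping are assumptions made for illustration; Python 3 has no load_module switch of this form.

```python
import importlib.machinery
import importlib.util


def classify_module(name):
    """Map a module name to the closest Python 2 load type (an analogy only)."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("module not found: " + name)
    if spec.origin == "built-in":
        return "C_BUILTIN"        # compiled into the interpreter
    if spec.submodule_search_locations is not None:
        return "PKG_DIRECTORY"    # a package, loaded through its __init__
    loader = spec.loader
    if isinstance(loader, importlib.machinery.ExtensionFileLoader):
        return "C_EXTENSION"      # a shared library
    if isinstance(loader, importlib.machinery.SourcelessFileLoader):
        return "PY_COMPILED"      # only a .pyc file is available
    if isinstance(loader, importlib.machinery.SourceFileLoader):
        return "PY_SOURCE"        # a .py file; its cached .pyc is used when fresh
    return "OTHER"                # frozen modules, zip imports and so on


for name in ("sys", "json", "colorsys"):
    print(name, classify_module(name))
```

The package check comes before the source check because a package's __init__.py is itself loaded by the source loader.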
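    /* Part of load_next */
    static PyObject *load_next() {
        .......
        /* p is the module name; buf is the full module name including the package name */
        result = import_submodule(mod, p, buf);
        if (result == Py_None && altmod != mod) {
            result = import_submodule(altmod, p, p);
        }
        .......
    }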
    /* Part of import_submodule */
    static PyObject *
    import_submodule(PyObject *mod, char *subname, char *fullname)
    {
        PyObject *modules = PyImport_GetModuleDict();
        PyObject *m = NULL;

        if ((m = PyDict_GetItemString(modules, fullname)) != NULL) {
            Py_INCREF(m);
        }
        else {
            ......
            if (mod == Py_None)
                path = NULL;
            else {
                path = PyObject_GetAttrString(mod, "__path__");
                ......
            }
            .......
            fdp = find_module(fullname, subname, path, buf, MAXPATHLEN+1, &fp, &loader);
            .......
            m = load_module(fullname, fp, buf, fdp->type, loader);
            .......
            if (!add_submodule(mod, m, fullname, subname, modules)) {
                Py_XDECREF(m);
                m = NULL;
            }
        }
        return m;
    }
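The flow described for import_submodule (look in sys.modules first, otherwise find the module, load it, and register it) can be approximated in Python. This is a minimal sketch that uses Python 3's importlib in place of the C functions find_module and load_module. The name import_submodule_sketch is hypothetical, and the sketch leaves out what add_submodule does, namely binding the submodule as an attribute of its parent package.

```python
import importlib.util
import sys


def import_submodule_sketch(fullname):
    """Cache lookup, then find, then load, then register in sys.modules."""
    # Step 1: a module that is already in sys.modules is returned directly.
    cached = sys.modules.get(fullname)
    if cached is not None:
        return cached
    # Step 2: locate the module (the role find_module plays in the C code).
    spec = importlib.util.find_spec(fullname)
    if spec is None:
        raise ImportError("No module named " + fullname)
    # Step 3: create and execute the module (the role of load_module),
    # registering it in sys.modules before its code runs.
    module = importlib.util.module_from_spec(spec)
    sys.modules[fullname] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialized module in the cache.
        sys.modules.pop(fullname, None)
        raise
    return module


print(import_submodule_sketch("sys") is sys)
print(import_submodule_sketch("colorsys").__name__)
```

Because of step 1, asking for sys always yields the interpreter's own sys module, whatever files happen to be on the search path.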
Now we can explain the question raised in Section 1. In the first case we run python test.py directly, which imports the sys module and then the math module. Because these are imported directly rather than from inside a package, the full name is simply sys. The current directory does contain a sys module, but sys is a system module that is already in sys.modules, so import_submodule returns the system sys module directly. The math module is not preloaded, so it is found in and loaded from the current directory.

In the second case we use the package mechanism: import demo5 loads the demo5 package itself and then its __init__.py module, and __init__.py in turn imports sys and math. Because these imports come from inside a package, fullname becomes demo5.sys and demo5.math. demo5.sys is clearly not among the preloaded modules in sys.modules, so the sys module under the demo5 directory is the one that gets loaded. The same reasoning applies to math.
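The two cases can be traced with a small pure-Python simulation of the lookup order described for load_next and import_submodule in Python 2. It is a toy model, not interpreter code: the function resolve_py2_import and the sets standing in for sys.modules and for the files that find_module can see are hypothetical.

```python
def resolve_py2_import(name, package, loaded, on_disk):
    """Return (fullname, how) for an import under the Python 2 lookup order.

    name    -- module name as written in the import statement
    package -- name of the importing package, or None for a plain script
    loaded  -- set of full names already present in sys.modules
    on_disk -- set of full names that find_module would be able to locate
    """
    candidates = []
    if package is not None:
        candidates.append(package + "." + name)  # package-qualified name first
    candidates.append(name)                      # then the bare module name
    for fullname in candidates:
        if fullname in loaded:
            return fullname, "already in sys.modules"
        if fullname in on_disk:
            return fullname, "found on the module path"
    raise ImportError("No module named " + name)


preloaded = {"sys"}

# Case 1: python test.py inside demo5; the script directory is searched first.
script_view = {"sys", "math"}
print(resolve_py2_import("sys", None, preloaded, script_view))
print(resolve_py2_import("math", None, preloaded, script_view))

# Case 2: import demo5 from the parent directory; imports come from the package.
package_view = {"demo5", "demo5.sys", "demo5.math"}
print(resolve_py2_import("sys", "demo5", preloaded, package_view))
print(resolve_py2_import("math", "demo5", preloaded, package_view))
```

In case 1 the name sys hits the preloaded module before any file is considered, while in case 2 the first candidate is demo5.sys, which is not preloaded and is found on disk.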
3 Modules and namespaces
When a module is imported, a corresponding name is introduced into the namespace. Note that importing a module and binding a name in the namespace are not the same thing, and the two need to be distinguished carefully. Here is an example: a package foobar that contains a.py, b.py and __init__.py.
    In [1]: import sys

    In [2]: sys.modules['foobar']
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-2-9001cd5d540a> in <module>()
    ----> 1 sys.modules['foobar']

    KeyError: 'foobar'

    In [3]: import foobar
    import package foobar

    In [4]: sys.modules['foobar']
    Out[4]: <module 'foobar' from 'foobar/__init__.pyc'>

    In [5]: import foobar.a
    import module a

    In [6]: sys.modules['foobar.a']
    Out[6]: <module 'foobar.a' from 'foobar/a.pyc'>

    In [7]: locals()['foobar']
    Out[7]: <module 'foobar' from 'foobar/__init__.pyc'>

    In [8]: locals()['foobar.a']
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-8-059690e6961a> in <module>()
    ----> 1 locals()['foobar.a']

    KeyError: 'foobar.a'

    In [9]: from foobar import b
    import module b

    In [10]: locals()['b']
    Out[10]: <module 'foobar.b' from 'foobar/b.pyc'>

    In [11]: sys.modules['foobar.b']
    Out[11]: <module 'foobar.b' from 'foobar/b.pyc'>

    In [12]: sys.modules['b']
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-12-1df8d2911c99> in <module>()
    ----> 1 sys.modules['b']

    KeyError: 'b'
We know that imported modules are added to the sys.modules dictionary. Imports can be roughly divided into the following cases; the underlying details can be found in the source code:
- import foobar.a
  This imports module a directly. sys.modules then contains both foobar and foobar.a, but the local namespace contains only foobar and not foobar.a. The import mechanism determines this: in the code that imports the module you can see that, for foobar.a, only foobar is finally stored in the namespace.
- from foobar import b
  In this case sys.modules contains only foobar (already imported earlier, so it is not imported again) and foobar.b. The statement binds only b in the local namespace, not foobar and not foobar.b.
- import foobar.a as a
  In this case sys.modules still contains foobar and foobar.a, while the statement binds only a in the local namespace, not foobar and not foobar.a.
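The three cases can be checked with a short script. This sketch creates an empty foobar package in a temporary directory and executes each import statement in a fresh namespace, so that only the names bound by that statement show up. The temporary directory and the helper names_bound_by are assumptions for illustration. It is written for Python 3, where the name binding rules for these three statements match what is described above.

```python
import importlib
import os
import sys
import tempfile

# Build an empty package foobar with modules a and b in a temporary directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "foobar"))
for filename in ("__init__.py", "a.py", "b.py"):
    open(os.path.join(root, "foobar", filename), "w").close()
sys.path.insert(0, root)
importlib.invalidate_caches()


def names_bound_by(statement):
    """Execute one import statement in a fresh namespace and list the names it binds."""
    namespace = {}
    exec(statement, namespace)
    return sorted(name for name in namespace if name != "__builtins__")


statements = (
    "import foobar.a",
    "from foobar import b",
    "import foobar.a as a",
)
results = {stmt: names_bound_by(stmt) for stmt in statements}
for stmt, names in results.items():
    print(stmt, "->", names)

# Entries that the imports above added to sys.modules.
print(sorted(name for name in sys.modules if name.split(".")[0] == "foobar"))
```

Each statement binds a single name in its namespace, while sys.modules always records the modules under their full dotted names.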
Copyright notice: This article is the blogger's original work and may not be reproduced without the blogger's permission.
Python source code analysis notes 5: the module mechanism