Background knowledge:
Running a function in the Python VM requires three objects:
- PyCodeObject, which holds the compiled code of the function.
- PyFunctionObject, which represents the function object inside the virtual machine.
- PyFrameObject, which represents the call chain and the stack while the function runs.
Python simulates x86-style function calls with exactly these three objects.
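The three objects above are visible from Python code as well. The following is a minimal sketch, assuming Python 3 (the article itself discusses the Python 2 source); the function names add and caller are made up for illustration.

```python
import inspect
import types


def add(a, b):
    # The frame executing this call (the PyFrameObject of the text).
    frame = inspect.currentframe()
    info = {
        "frame_is_frame": isinstance(frame, types.FrameType),
        # The frame executes the code object wrapped by the function.
        "runs_own_code": frame.f_code is add.__code__,
        # f_back links to the caller's frame: this is the call chain.
        "called_from": frame.f_back.f_code.co_name,
    }
    return a + b, info


def caller():
    return add(1, 2)


result, info = caller()
is_function = isinstance(add, types.FunctionType)   # PyFunctionObject
is_code = isinstance(add.__code__, types.CodeType)  # PyCodeObject
```

The function object is only a thin wrapper: the code object holds the bytecode, and a new frame is created for every call.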
In Python, a coroutine is called a generator; the two are really the same thing. The name comes from the fact that it behaves like an iterator while consuming very little memory: it produces data on demand instead of storing it all, so "generator" describes it well.
A generator in Python is a wrapper around a PyCodeObject and a PyFrameObject, and it has its own independent value stack. It can also return halfway through the function's code and save the state of the PyFrameObject. This gives it one key property it shares with threads: it can be scheduled.
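The saved frame state can be inspected directly. This is a small sketch, assuming Python 3, where the generator object exposes its frame as gi_frame and its code object as gi_code; running_total is a made-up example.

```python
def running_total():
    total = 0
    while True:
        # Execution suspends here; the frame keeps `total` alive.
        value = yield total
        total += value


g = running_total()
next(g)       # run up to the first yield
g.send(5)     # total becomes 5
g.send(7)     # total becomes 12

# Local variables survive between resumptions inside the saved frame.
saved_total = g.gi_frame.f_locals["total"]
code_name = g.gi_code.co_name
```

Each send() resumes the same frame where it stopped, which is why total accumulates across calls.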
The operating system can schedule only threads, and this scheduling happens in kernel mode at moments the programmer cannot predict. Typically a thread is switched out when it waits for something (a lock, network data, disk data) or when its time slice runs out. With non-blocking calls the call returns immediately, but the current task still cannot continue because its data is not ready; a programmer who wants to get everything out of the CPU should not waste the remainder of the time slice and should switch to another task. If every task is a thread, each extra task adds one more thread object in the kernel, which increases the burden on memory and on the scheduler. If the programmer controls task scheduling in user mode, no additional kernel thread objects are needed. The thing that can be scheduled in user mode is the coroutine. Because it can be switched within a single thread, it needs its own stack and its own registers (state) when implemented in a language such as C++; when implemented in a VM, it is enough to save the state of the PyFrameObject that represents the current task (in effect, a function).
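User-mode scheduling of this kind can be sketched with plain generators. The round-robin scheduler below is an illustration only, assuming Python 3; task and run are hypothetical names, and a real scheduler would also handle I/O readiness.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the task until its next yield
        except StopIteration:
            continue               # task finished, drop it
        ready.append(current)      # still alive, queue it again


log = []
run([task("a", 2, log), task("b", 3, log)])
```

All switching happens inside one thread, at points chosen by the tasks themselves, so no kernel thread objects are created per task.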
Data structures and objects involved in the CPython generator
1. PyGen_Type
    PyTypeObject PyGen_Type = {
        PyVarObject_HEAD_INIT(&PyType_Type, 0)
        "generator",                                /* tp_name */
        sizeof(PyGenObject),                        /* tp_basicsize */
        /* ... omitted ... */
        PyObject_GenericGetAttr,                    /* tp_getattro */
        /* ... omitted ... */
        (traverseproc)gen_traverse,                 /* tp_traverse */
        0,                                          /* tp_clear */
        0,                                          /* tp_richcompare */
        offsetof(PyGenObject, gi_weakreflist),      /* tp_weaklistoffset */
        PyObject_SelfIter,                          /* tp_iter */
        (iternextfunc)gen_iternext,                 /* tp_iternext */
        gen_methods,                                /* tp_methods */
        gen_memberlist,                             /* tp_members */
        gen_getsetlist,                             /* tp_getset */
        /* ... omitted ... */
        gen_del,                                    /* tp_del */
    };
The tp_iter and tp_iternext slots of PyGen_Type show that the generator implements the iterator protocol, so it can be iterated over in a for statement.
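The effect of these two slots can be observed from Python. A minimal sketch, assuming Python 3, where tp_iternext is reached through next() and __next__ rather than the Python 2 next method; countdown is a made-up example.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1


g = countdown(3)

# tp_iter is PyObject_SelfIter: a generator is its own iterator.
is_own_iterator = iter(g) is g

# tp_iternext produces one value per call.
first = next(g)

# A for loop (or comprehension) keeps calling it until exhaustion.
rest = [value for value in g]
```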
2. PyCodeObject, PyFrameObject, PyFunctionObject
3. PyGenObject
    typedef struct {
        PyObject_HEAD
        /* The gi_ prefix is intended to remind of generator-iterator. */

        /* Note: gi_frame can be NULL if the generator is "finished" */
        struct _frame *gi_frame;    // PyFrameObject

        /* True if generator is being executed. */
        int gi_running;             // status

        /* The code object backing the generator */
        PyObject *gi_code;          // PyCodeObject

        /* List of weak reference. */
        PyObject *gi_weakreflist;
    } PyGenObject;
gi_running in PyGenObject indicates the state (0: not running, 1: running). frame.f_lasti == -1 indicates that the generator has not been started: no bytecode has run yet, so the frame's last instruction offset is -1. gi_code is the code object of the generator function, and gi_frame is the PyFrameObject that saves the generator's bytecode execution state. It follows that a generator corresponds to exactly one frame and cannot have nested frames: a function called by the generator cannot return to the send/next call site on the generator's behalf. This limits where generators can be applied, and complex business logic tends to make generator code bloated.
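These states can be watched from Python. The sketch below assumes Python 3 and uses inspect.getgeneratorstate rather than reading f_lasti directly, because the exact f_lasti value of an unstarted generator is an implementation detail of the Python 2 source discussed here; the box list is only a way to let the generator see itself.

```python
import inspect


def gen(box, observed):
    me = box[0]  # the generator object running this body
    observed["running_inside"] = me.gi_running
    observed["state_inside"] = inspect.getgeneratorstate(me)
    yield 1


box, observed = [], {}
g = gen(box, observed)
box.append(g)

observed["before_start"] = inspect.getgeneratorstate(g)  # nothing has run yet
next(g)
observed["after_yield"] = inspect.getgeneratorstate(g)   # suspended at yield
observed["running_outside"] = g.gi_running

for _ in g:   # exhaust the generator
    pass
observed["after_finish"] = inspect.getgeneratorstate(g)
frame_after_finish = g.gi_frame  # the frame is released once finished
```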
Analysis of the generator implementation in CPython:
The following Python code is used as the object of analysis:
    def gen():
        x = yield 1
        print x
        x = yield 2


    g = gen()

    g.next()
    print g.send("sender")
The corresponding Python bytecode is
| Source line number | Python code | Bytecode offset | Bytecode | Bytecode parameters | Comments |
|---|---|---|---|---|---|
| 1 | def gen(): | 0 | LOAD_CONST | 0 (<code object gen>) | Loads the code object from which the PyFunctionObject is built. The corresponding PyCodeObject has a flag (CO_GENERATOR) that marks it as a generator. |
| | | 3 | MAKE_FUNCTION | 0 | |
| | | 6 | STORE_NAME | 0 (gen) | gen = PyFunctionObject |
| 7 | g=gen() | 9 | LOAD_NAME | 0 (gen) | |
| | | 12 | CALL_FUNCTION | 0 | In PyEval_EvalCodeEx, because gen holds a PyFunctionObject whose PyCodeObject.co_flags has CO_GENERATOR set, the call returns a PyGenObject directly. |
| | | 15 | STORE_NAME | 1 (g) | |
| 9 | g.next() | 18 | LOAD_NAME | 1 (g) | |
| | | 21 | LOAD_ATTR | 2 (next) | PyObject_GetAttr(g, 'next') calls PyGen_Type.tp_getattro, which is PyObject_GenericGetAttr here; it returns a wrapper object that holds the generator. |
| | | 24 | CALL_FUNCTION | 0 | The call dispatches to the generator's next, which is gen_iternext, and from there to gen_send_ex. |
| | | 27 | POP_TOP | | |
| 10 | print g.send("sender") | 28 | LOAD_NAME | 1 (g) | |
| | | 31 | LOAD_ATTR | 3 (send) | |
| | | 34 | LOAD_CONST | 1 ('sender') | |
| | | 37 | CALL_FUNCTION | 1 | This goes to gen_send(PyGenObject *gen, PyObject *arg). |
| | | 40 | PRINT_ITEM | | |
| | | 41 | PRINT_NEWLINE | | |
| | | 42 | LOAD_CONST | 2 (None) | |
| | | 45 | RETURN_VALUE | | |
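The control flow traced in the bytecode can be reproduced at the Python level. This is a sketch of the same example ported to Python 3 (an assumption: next(g) replaces g.next(), and the print inside the generator is replaced by appending to a log list so the result can be checked).

```python
import inspect
import types


def gen(log):
    x = yield 1
    log.append(x)   # stands in for `print x` in the original
    x = yield 2


# The code object carries the CO_GENERATOR flag mentioned in the comments.
has_generator_flag = bool(gen.__code__.co_flags & inspect.CO_GENERATOR)

log = []
g = gen(log)        # the call returns a generator; the body has not run yet
is_generator = isinstance(g, types.GeneratorType)
body_ran_on_call = bool(log)

first = next(g)             # runs up to `yield 1`
second = g.send("sender")   # x = "sender", then runs up to `yield 2`
```

Calling gen() only builds the generator object; the function body starts executing at the first next() call.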
While analyzing the CPython source code you will meet many PyMethodDescrObject, PyMemberDescrObject, PyGetSetDescrObject and PyWrapperDescrObject instances. Because the Python language is designed to be flexible, different methods and attributes are obtained in different ways, and different methods take different parameters, so they are not all called the same way. The C code therefore needs a different strategy for each case, and something has to wrap that strategy. These descr objects are the outer wrappers that do so, which makes them easier to manage. When a class object is initialized, they are saved into the corresponding type.tp_dict.
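The descriptor objects stored in a type's tp_dict can be looked up from Python through the type's __dict__. A small sketch, assuming Python 3.7 or later (where the types module names these descriptor types and the iteration slot is exposed as __next__):

```python
import types


def gen():
    yield 1


g = gen()
gen_type = type(g)

# Entries placed in the type's dictionary when the type was initialized.
send_descr = gen_type.__dict__["send"]      # from gen_methods
next_descr = gen_type.__dict__["__next__"]  # wraps the tp_iternext slot

# Attribute lookup binds a descriptor to the instance via __get__.
bound_send = send_descr.__get__(g, gen_type)
first = bound_send(None)   # same as g.send(None): starts the generator
```

Here send is a method descriptor (PyMethodDescrObject), __next__ is a slot wrapper (PyWrapperDescrObject), and g.__next__ is the bound method-wrapper that carries the generator.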
Applications of coroutines:
Because a coroutine never asks the operating system to schedule it and the programmer controls when switches happen, user-mode scheduling is not suited to simulating real-time behavior, but it is well suited to modeling state changes that do not depend on time. Take express delivery in e-commerce as an example. On its way from the seller to the buyer, a product passes through the following states: for sale, sold, at the departure city, at an intermediate city, arrived at the destination city, out for delivery, received by the buyer.
Express Product Status Change chart
Pseudo-code for the e-commerce product state transitions:
    from collections import namedtuple

    State = namedtuple('State', 'statename action')


    def generate_route(id):
        # Hypothetical stub (not defined in the original): returns the cities
        # the parcel passes through, ending with the destination city.
        return ['city_a', 'city_b', 'destination_city']


    def commodity(id):
        # For-sale state
        action = yield State('forsale', 'online')
        # Sold state
        if action == 'sellout':
            action = yield State('sellout', 'postman1')
        else:  # 'offline' or any unexpected action
            return
        # At the express point in the departure city
        if action == 'store1':
            action = yield State('store1', 'store in garage')
        else:
            return
        # The intermediate route has been generated
        middlecities = generate_route(id)
        if action == 'route':
            action = yield State('store1_routed', 'calculate route')
        else:
            return
        for index, city in enumerate(middlecities):
            if action != 'next':
                return
            if index == len(middlecities) - 1:
                # Reached the destination city
                action = yield State('destination', city)
            else:
                # Passing through an intermediate city
                action = yield State('middle_city', city)
        # Delivery starts in the destination city
        if action == 'deliver':
            action = yield State('delivering', 'postman is delivering')
        else:
            return
        # Accepted by the buyer
        if action == 'accept':
            yield State('accepted', 'finish')
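A generator-based state machine of this kind is driven from outside with send(). The following is a reduced sketch, assuming Python 3; parcel is a shortened, hypothetical version with only three states, and drive is a made-up helper that feeds it actions and records the states it reports.

```python
from collections import namedtuple

State = namedtuple("State", "statename action")


def parcel():
    action = yield State("forsale", "online")
    if action != "sellout":
        return                      # e.g. taken offline: the machine stops
    action = yield State("sellout", "postman1")
    if action != "accept":
        return
    yield State("accepted", "finish")


def drive(machine, actions):
    # The first next() runs the machine to its initial state.
    history = [next(machine).statename]
    for action in actions:
        try:
            history.append(machine.send(action).statename)
        except StopIteration:
            history.append("stopped")   # the generator returned
            break
    return history


completed = drive(parcel(), ["sellout", "accept"])
aborted = drive(parcel(), ["offline"])
```

The current state is simply the position at which the generator is suspended, so no explicit state variable or transition table is needed.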
Generator (coroutine) in Python and its application