Python yield: usage and an analysis of its implementation



yield works much like return, except that a function containing yield returns a generator rather than a single value.

Generator

A generator is produced by a function that contains one or more yield expressions. Every generator is an iterator (but an iterator is not necessarily a generator).

Any function whose body contains the yield keyword is a generator function: calling it does not run the body, it returns a generator object.

A generator does not return all of its results at once. Each time it reaches a yield, it returns the corresponding value, keeps the function's current execution state, and waits for the next call.
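The suspend-and-resume behaviour can be sketched as follows. This is a minimal illustration written for Python 3 (where the article's f.next() is spelled next(f)); the names countdown and log are made up for the example.

```python
log = []  # records the order in which things happen

def countdown(n):
    """Yield n, n-1, ..., 1 and log each step."""
    log.append("started")
    while n > 0:
        log.append("about to yield %d" % n)
        yield n      # execution pauses here until the next call
        n -= 1       # runs only when the generator is resumed
    log.append("finished")

gen = countdown(2)   # nothing runs yet: the body has not started
log.append("created")
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the first yield, with n preserved

print(first, second)
print(log)
```

Note that "created" is logged before "started": the body does not begin until the first next() call, and the local variable n survives between calls.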

Because a generator is also an iterator, it supports the next method for fetching the next value (f.next() in Python 2, which this article uses; next(f) in Python 3).

Basic operations

    # Create a generator with yield
    def func():
        for i in xrange(10):
            yield i

    # Create a generator with a generator expression
    # (square brackets would build a complete list instead)
    (i for i in xrange(10))

    # Calling it
    >>> f = func()
    >>> f                # the generator has not started running yet
    <generator object func at 0x7fe01a853820>
    >>> f.next()         # i = 0: execution stops at yield and the value is returned
    0
    >>> f.next()         # resume where the last call stopped and enter the next iteration
    1
    ...
    >>> f.next()
    9
    >>> f.next()         # the loop has finished, no yield is left, so StopIteration is raised
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
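For readers on Python 3, the same operations look roughly like this. It is a sketch under the assumption of Python 3 (range instead of xrange, next(f) instead of f.next()); the variable names are illustrative only.

```python
def func():
    for i in range(3):
        yield i

f = func()
values = [next(f), next(f), next(f)]   # 0, 1, 2

# A fourth call finds no further yield and raises StopIteration.
try:
    next(f)
    exhausted = False
except StopIteration:
    exhausted = True

squares_list = [i * i for i in range(3)]   # square brackets: a list, built eagerly
squares_gen = (i * i for i in range(3))    # parentheses: a generator, evaluated lazily

# A for loop (or list()) calls next() repeatedly and handles StopIteration itself.
collected = list(func())
print(values, exhausted, collected)
```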

Besides next, a generator also supports the send method, which passes a value into the generator. The value becomes the result of the yield expression at which the generator is suspended.

    >>> def func():
    ...     n = 0
    ...     while 1:
    ...         n = yield n    # n receives the value passed to send()
    ...
    >>> f = func()
    >>> f.next()       # n is 0 initially
    0
    >>> f.send(1)      # n is assigned 1
    1
    >>> f.send(2)
    2
    >>>
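A slightly larger sketch of send, again assuming Python 3: a running total that receives numbers from the caller. The name accumulator is hypothetical and not part of the original article.

```python
def accumulator():
    """Keep a running total of the values passed in with send()."""
    total = 0
    while True:
        value = yield total    # value is whatever the caller passes to send()
        if value is not None:  # a plain next() sends None
            total += value

acc = accumulator()
start = next(acc)      # must start the generator first; runs to the first yield
after_5 = acc.send(5)
after_10 = acc.send(10)
unchanged = next(acc)  # next() behaves like send(None)
print(start, after_5, after_10, unchanged)
```

The first call has to be next() (or send(None)) because a generator that has not started is not yet waiting at a yield expression and so has nowhere to receive a value.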

Application

The classic example is generating an infinite sequence.

The usual approach is to build a large list of all the values that meet the requirement. The whole list has to be held in memory, so its size is limited by the memory available, and a function that uses return can hand back only one result before it exits.

    # magical_infinite_range and is_prime are assumed helpers, not defined here
    def get_primes(start):
        for element in magical_infinite_range(start):
            if is_prime(element):
                return element    # the function ends at the first prime it finds

With a generator there is no need to return the entire list. Only one value is produced at a time, which avoids the memory limit.

    def get_primes(number):
        while True:
            if is_prime(number):
                yield number
            number += 1
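To try this out, is_prime has to exist. The version below is a simple trial-division helper supplied as an assumption (the article does not show its own), and itertools.islice is used to take a finite slice of the infinite sequence.

```python
from itertools import islice

def is_prime(n):
    """Trial division; an assumed helper, not the article's implementation."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def get_primes(number):
    """Yield every prime >= number, one at a time, forever."""
    while True:
        if is_prime(number):
            yield number
        number += 1

# Only five values are ever computed; the infinite sequence is never materialised.
first_five = list(islice(get_primes(10), 5))
print(first_five)   # [11, 13, 17, 19, 23]
```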

Generator source code analysis

The source code of the generator is in Objects/genobject.c.

Call Stack

Before looking at the generator itself, we need to explain how calls work in the Python virtual machine.

The Python virtual machine maintains a call stack of stack frames. A stack frame is a PyFrameObject, defined in Include/frameobject.h.

    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;      /* previous frame, or NULL */
        PyCodeObject *f_code;       /* code segment */
        PyObject *f_builtins;       /* builtin symbol table (PyDictObject) */
        PyObject *f_globals;        /* global symbol table (PyDictObject) */
        PyObject *f_locals;         /* local symbol table (any mapping) */
        PyObject **f_valuestack;    /* points after the last local */
        /* Next free slot in f_valuestack.  Frame creation sets to f_valuestack.
           Frame evaluation usually NULLs it, but a frame that yields sets it
           to the current stack top. */
        PyObject **f_stacktop;
        PyObject *f_trace;          /* Trace function */

        /* If an exception is raised in this frame, the next three are used to
         * record the exception info (if any) originally in the thread state.  See
         * comments before set_exc_info() -- it's not obvious.
         * Invariant:  if _type is NULL, then so are _value and _traceback.
         * Desired invariant:  all three are NULL, or all three are non-NULL.  That
         * one isn't currently true, but "should be".
         */
        PyObject *f_exc_type, *f_exc_value, *f_exc_traceback;

        PyThreadState *f_tstate;
        int f_lasti;                /* Last instruction if called */
        /* Call PyFrame_GetLineNumber() instead of reading this field
           directly.  As of 2.3 f_lineno is only valid when tracing is
           active (i.e. when f_trace is set).  At other times we use
           PyCode_Addr2Line to calculate the line from the current
           bytecode index. */
        int f_lineno;               /* Current line number */
        int f_iblock;               /* index in f_blockstack */
        PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
        PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
    } PyFrameObject;

A stack frame stores the information and context of a piece of code, including the last executed instruction, the global and local namespaces, and the exception state. f_valuestack holds the data stack, and f_blockstack holds the blocks used for exception handling and loop control.

For example,

    def foo():
        x = 1
        def bar(y):
            z = y + 2
        return bar(x)

A .py file (module), a class, and a function are each a code block, and each has a corresponding frame that stores its context and bytecode instructions. While bar is executing, the call stack looks like this:

    c   ---------------------------
    a   | bar Frame               | -> block stack: []
    l   |     (newest)            | -> data stack: [1, 2]
    l   ---------------------------
        | foo Frame               | -> block stack: []
    s   |                         | -> data stack: [<function foo.<locals>.bar at 0x10d389680>, 1]
    t   ---------------------------
    a   | main (module) Frame     | -> block stack: []
    c   |     (oldest)            | -> data stack: []
    k   ---------------------------

Each stack frame has its own data stack and block stack. Because these stacks are independent, the interpreter can suspend a stack frame and resume it later (this is exactly what generators rely on).
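The fact that a generator keeps one frame alive between calls can be observed from Python itself. The sketch below assumes CPython 3 and uses the gi_frame attribute and inspect.getgeneratorstate; two_steps is a made-up example function, and the exact f_lasti values vary between Python versions, so only their ordering is relied on.

```python
import inspect

def two_steps():
    yield "a"
    yield "b"

g = two_steps()
states = [inspect.getgeneratorstate(g)]       # created, body not started
frame = g.gi_frame                            # the frame the generator owns

next(g)
states.append(inspect.getgeneratorstate(g))   # suspended at the first yield
offset_after_first = g.gi_frame.f_lasti       # where execution stopped
same_frame = g.gi_frame is frame              # still the very same frame object

next(g)
offset_after_second = g.gi_frame.f_lasti      # further along in the bytecode

for _ in g:                                   # run the generator to completion
    pass
states.append(inspect.getgeneratorstate(g))   # closed
frame_released = g.gi_frame is None           # the frame is dropped once exhausted

print(states, offset_after_first, offset_after_second, frame_released)
```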

Python code is first compiled into bytecode, which is then executed by the Python virtual machine. A single Python statement generally compiles to several bytecode instructions. Each bytecode instruction is implemented by C code rather than by a single machine instruction, so the performance of code cannot be judged from the number of bytecode instructions.

The dis module can be used to inspect the bytecode:

    >>> from dis import dis
    >>> dis(foo)
          0 LOAD_CONST           1 (1)        # load constant 1
          3 STORE_FAST           0 (x)        # assign 1 to x
          6 LOAD_CONST           2 (<code>)   # load constant 2
          9 MAKE_FUNCTION        0            # create the function
         12 STORE_FAST           1 (bar)
         15 LOAD_FAST            1 (bar)
         18 LOAD_FAST            0 (x)
         21 CALL_FUNCTION        1            # call the function
         24 RETURN_VALUE

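Each line of dis output can have up to five columns:

The first column is the source line number (not shown in the listing above);
The second column is the offset of the instruction;
The third column is the bytecode instruction;
The fourth column is the instruction's argument;
The fifth column is a description of the argument.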
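The same columns can be read programmatically. The sketch below assumes Python 3, where dis.get_instructions returns one object per instruction; opcode names and offsets differ between Python versions, so the output will not match the Python 2 listing exactly.

```python
import dis

def foo():
    x = 1
    def bar(y):
        z = y + 2
    return bar(x)

instructions = list(dis.get_instructions(foo))
opnames = [ins.opname for ins in instructions]

# offset, instruction name, raw argument, human-readable argument description
for ins in instructions:
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)
```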


Generator source code analysis

With this understanding of the call stack, the implementation of the generator is easy to follow.

The source code of the generator is located in Objects/genobject.c.

Generator Creation

    PyObject *
    PyGen_New(PyFrameObject *f)
    {
        PyGenObject *gen = PyObject_GC_New(PyGenObject, &PyGen_Type);  /* create the generator object */
        if (gen == NULL) {
            Py_DECREF(f);
            return NULL;
        }
        gen->gi_frame = f;                       /* attach the frame */
        Py_INCREF(f->f_code);                    /* reference count + 1 */
        gen->gi_code = (PyObject *)(f->f_code);
        gen->gi_running = 0;                     /* 0 means not executing: the generator's initial state */
        gen->gi_weakreflist = NULL;
        _PyObject_GC_TRACK(gen);                 /* GC tracking */
        return (PyObject *)gen;
    }

Send and next

The next and send functions are as follows:

    static PyObject *
    gen_iternext(PyGenObject *gen)
    {
        return gen_send_ex(gen, NULL, 0);
    }

    static PyObject *
    gen_send(PyGenObject *gen, PyObject *arg)
    {
        return gen_send_ex(gen, arg, 0);
    }

As the code shows, both send and next call the same function, gen_send_ex. The only difference is the argument: next passes NULL, while send passes the value supplied by the caller.

    static PyObject *
    gen_send_ex(PyGenObject *gen, PyObject *arg, int exc)
    {
        PyThreadState *tstate = PyThreadState_GET();
        PyFrameObject *f = gen->gi_frame;
        PyObject *result;

        if (gen->gi_running) {      /* is the generator already running? */
            PyErr_SetString(PyExc_ValueError,
                            "generator already executing");
            return NULL;
        }
        if (f == NULL || f->f_stacktop == NULL) {
            /* no frame, or the frame has finished: raise StopIteration */
            /* Only set exception if called from send() */
            if (arg && !exc)
                PyErr_SetNone(PyExc_StopIteration);
            return NULL;
        }

        if (f->f_lasti == -1) {     /* f_lasti == -1 means this is the first execution */
            if (arg && arg != Py_None) {
                /* a non-None argument is not allowed on the first execution */
                PyErr_SetString(PyExc_TypeError,
                                "can't send non-None value to a "
                                "just-started generator");
                return NULL;
            }
        } else {
            /* Push arg onto the frame's value stack */
            result = arg ? arg : Py_None;
            Py_INCREF(result);              /* reference count + 1 */
            *(f->f_stacktop++) = result;    /* push the argument onto the stack */
        }

        /* Generators always return to their most recent caller, not
         * necessarily their creator. */
        f->f_tstate = tstate;
        Py_XINCREF(tstate->frame);
        assert(f->f_back == NULL);
        f->f_back = tstate->frame;

        gen->gi_running = 1;                    /* mark the generator as executing */
        result = PyEval_EvalFrameEx(f, exc);    /* run the bytecode */
        gen->gi_running = 0;                    /* back to the not-executing state */

        /* Don't keep the reference to f_back any longer than necessary.  It
         * may keep a chain of frames alive or it could create a reference
         * cycle. */
        assert(f->f_back == tstate->frame);
        Py_CLEAR(f->f_back);
        /* Clear the borrowed reference to the thread state */
        f->f_tstate = NULL;

        /* If the generator just returned (as opposed to yielding), signal
         * that the generator is exhausted. */
        if (result == Py_None && f->f_stacktop == NULL) {
            Py_DECREF(result);
            result = NULL;
            /* Set exception if not called by gen_iternext() */
            if (arg)
                PyErr_SetNone(PyExc_StopIteration);
        }

        if (!result || f->f_stacktop == NULL) {
            /* generator can't be rerun, so release the frame */
            Py_DECREF(f);
            gen->gi_frame = NULL;
        }

        return result;
    }
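The checks at the top of gen_send_ex are visible from Python code. The sketch below assumes Python 3, where the same error messages are still used; echo and reenter are made-up functions that trigger each path.

```python
def echo():
    received = yield "ready"
    yield received

# Path 1: a non-None value sent to a generator that has not started yet.
fresh = echo()
try:
    fresh.send("too early")
    early_error = None
except TypeError as exc:
    early_error = str(exc)

# Normal path: start with next(), then send a value into the suspended frame.
started = echo()
first = next(started)            # "ready"
second = started.send("hello")   # the sent value comes back out

# Path 2: resuming after the body has returned raises StopIteration.
try:
    started.send("again")
    exhausted = False
except StopIteration:
    exhausted = True

# Path 3: resuming a generator from inside itself (the gi_running check).
def reenter():
    yield next(me)   # tries to resume itself while it is already running

me = reenter()
try:
    next(me)
    running_error = None
except ValueError as exc:
    running_error = str(exc)

print(early_error, first, second, exhausted, running_error)
```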

Execution of bytecode

PyEval_EvalFrameEx executes the bytecode of a frame and returns the result.

The main flow is as follows:

    for (;;) {
        switch (opcode) {           /* opcode is the operation code; each case handles one operation */
        case NOP:
            goto fast_next_opcode;
        ...
        case YIELD_VALUE:           /* the opcode is yield */
            retval = POP();
            f->f_stacktop = stack_pointer;
            why = WHY_YIELD;
            goto fast_yield;        /* use goto to jump out of the loop */
        }
    }

    fast_yield:
        ...
        return retval;              /* return the result */
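The idea of leaving the loop at YIELD_VALUE while the frame remembers its position can be modelled in a few lines of Python. This is a toy model only, not CPython's code: ToyFrame, eval_frame, and resume are hypothetical names, and only four opcodes are handled.

```python
class ToyFrame:
    """Holds everything needed to suspend and resume execution."""
    def __init__(self, code):
        self.code = code        # list of (opname, argument) pairs
        self.lasti = -1         # index of the last executed instruction
        self.stack = []         # value stack, kept between resumptions
        self.finished = False

def eval_frame(frame):
    """Execute instructions until the frame yields or returns."""
    while True:
        frame.lasti += 1
        opname, arg = frame.code[frame.lasti]
        if opname == "LOAD_CONST":
            frame.stack.append(arg)
        elif opname == "POP_TOP":
            frame.stack.pop()
        elif opname == "YIELD_VALUE":
            return frame.stack.pop()    # leave the loop; lasti keeps our place
        elif opname == "RETURN_VALUE":
            frame.finished = True
            return frame.stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)

def resume(frame, value=None):
    """Rough analogue of gen_send_ex: push the sent value, then run the frame."""
    if frame.finished:
        raise StopIteration
    if frame.lasti != -1:               # not the first run: yield evaluates to value
        frame.stack.append(value)
    return eval_frame(frame)

# Roughly the bytecode of:  yield 1; yield 2
code = [("LOAD_CONST", 1), ("YIELD_VALUE", None), ("POP_TOP", None),
        ("LOAD_CONST", 2), ("YIELD_VALUE", None), ("POP_TOP", None),
        ("LOAD_CONST", None), ("RETURN_VALUE", None)]

frame = ToyFrame(code)
first = resume(frame)
pos_after_first = frame.lasti
second = resume(frame)
pos_after_second = frame.lasti
last = resume(frame)
print(first, pos_after_first, second, pos_after_second, last, frame.finished)
```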

For example, the following code prints f_lasti, the offset of the last instruction executed in the frame, and f_back, the previous frame:

    import sys
    from dis import dis

    def func():
        f = sys._getframe(0)
        print f.f_lasti
        print f.f_back
        yield 1
        print f.f_lasti
        print f.f_back
        yield 2

    a = func()
    dis(func)
    a.next()
    a.next()

The result is as follows. The instruction names (the third column of dis output) correspond to the opcode tested in the loop above: each pass through the switch selects the case for a different opcode.

          0 LOAD_GLOBAL          0 (sys)
          3 LOAD_ATTR            1 (_getframe)
          6 LOAD_CONST           1 (0)
          9 CALL_FUNCTION        1
         12 STORE_FAST           0 (f)
         15 LOAD_FAST            0 (f)
         18 LOAD_ATTR            2 (f_lasti)
         21 PRINT_ITEM
         22 PRINT_NEWLINE
         23 LOAD_FAST            0 (f)
         26 LOAD_ATTR            3 (f_back)
         29 PRINT_ITEM
         30 PRINT_NEWLINE
         31 LOAD_CONST           2 (1)
         34 YIELD_VALUE          # this opcode jumps straight to the goto shown earlier; f_lasti holds the current instruction and f_back the frame that resumed the generator
         35 POP_TOP
         36 LOAD_FAST            0 (f)
         39 LOAD_ATTR            2 (f_lasti)
         42 PRINT_ITEM
         43 PRINT_NEWLINE
         44 LOAD_FAST            0 (f)
         47 LOAD_ATTR            3 (f_back)
         50 PRINT_ITEM
         51 PRINT_NEWLINE
         52 LOAD_CONST           3 (2)
         55 YIELD_VALUE
         56 POP_TOP
         57 LOAD_CONST           0 (None)
         60 RETURN_VALUE
    <frame object at 0x7fa75fcebc20>    # the same frame as the one printed below: both next() calls are made from the same function (namespace), so f_back is the same frame
    <frame object at 0x7fa75fcebc20>
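The comment in gen_send_ex says a generator always returns to its most recent caller, which is why f_back is set on every resumption. A small sketch of this, assuming CPython 3 (sys._getframe is a CPython implementation detail); show_caller, first_caller, and second_caller are made-up names.

```python
import sys

def show_caller():
    """Yield the name of whichever function resumed the generator."""
    frame = sys._getframe(0)    # the generator's own frame, reused on every resume
    while True:
        yield frame.f_back.f_code.co_name

def first_caller(gen):
    return next(gen)

def second_caller(gen):
    return next(gen)

g = show_caller()
names = [first_caller(g), second_caller(g)]
print(names)
```

When both resumptions come from the same function, as in the article's example, f_back is the same frame both times; here it differs because two different functions resume the generator.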

