Python Descriptor Tutorial
What is a descriptor? In short, a descriptor is a protocol for customizing how the members of a class or instance are accessed. That one sentence does not explain much, so let us start with how member variables are defined and used in Python.
Defining class members in Python works very differently from C/C++. Consider the following definition:
    class Cclass {
        int i;
        void func();
    };
    Cclass c;
In the preceding definition, C++ defines a type, and every object of that type contains an integer member i and a function func. A corresponding Python class statement (a class Pclass with a member i and a function f) instead creates an object named Pclass whose type (__class__) is type (see metaclasses for details). In Python everything is an object, and types are no exception. Instantiating Pclass then creates an object p whose type is Pclass, as follows:
    In [71]: type(Pclass)
    Out[71]: <type 'type'>

    In [72]: p = Pclass()

    In [73]: type(p)
    Out[73]: <class '__main__.Pclass'>
p and Pclass each contain the following members:
    p.__class__          p.__init__        p.__sizeof__
    p.__delattr__        p.__module__      p.__str__
    p.__dict__           p.__new__         p.__subclasshook__
    p.__doc__            p.__reduce__      p.__weakref__
    p.__format__         p.__reduce_ex__   p.f
    p.__getattribute__   p.__repr__        p.i
    p.__hash__           p.__setattr__
Members whose names begin and end with double underscores (such as __class__) are special members, which could also be called fixed members. Their values can be changed, but they cannot be deleted with del. __class__ is the type of the object, and __doc__ is its documentation string. One special member deserves particular attention: __dict__, which stores the object's custom variables. Newcomers to Python are often surprised that members can be added to or removed from an object at will. The secret behind this feature is the __dict__ member (note that the __dict__ of a type is of the dictproxy type):
    In [10]: p.x = 2

    In [11]: p.__dict__
    Out[11]: {'x': 2}
The demonstration above shows that Python stores an object's custom members as key-value pairs in its __dict__ dictionary. The class statement described earlier is just syntactic sugar for this; it is equivalent to the following definition:
    class Pclass(object):
        pass

    Pclass.i = 1
    Pclass.f = lambda x: x
When a member variable is read, Python likewise looks up the value for that name in a __dict__ dictionary. Before descriptors enter the picture, the following two forms are equivalent for a member stored on the instance, such as the x assigned to p above (a class member such as i is found in Pclass.__dict__ instead):
    p.x
    p.__dict__['x']
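The storage rules described so far can be checked with a short script. This is a minimal sketch written for Python 3, whereas the interactive sessions in this article come from Python 2; the class body of Pclass is an assumption that matches the members i and f listed above. In Python 3 the read-only class dictionary type is named mappingproxy rather than dictproxy.

```python
class Pclass:
    i = 1                      # class member, stored in Pclass.__dict__

    def f(self):
        return self


p = Pclass()
p.x = 2                        # instance member, stored in p.__dict__

print(p.__dict__)                        # {'x': 2}
print("i" in p.__dict__)                 # False: i lives on the class
print(Pclass.__dict__["i"])              # 1
print(type(Pclass.__dict__).__name__)    # mappingproxy

del p.x                        # deleting the member removes the dictionary entry
print(p.__dict__)                        # {}
```

Reading p.i still works even though "i" is not in p.__dict__, because the lookup falls back to the class.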
Descriptors change these rules, as the following sections explain.
Definition: Descriptor Protocol
How do descriptors change the access rules for object members? As the well-known saying in computing goes, "most software problems can be solved by adding an intermediate layer". Instead of accessing the desired object directly, we provide an intermediate layer for member access, and the descriptor protocol defines that layer. The definition is very simple: a class that defines at least one of the following three methods is a descriptor:
1. object.__get__(self, instance, owner)
Called when the member is accessed. instance is the object the member belongs to, and owner is the type of instance.
2. object.__set__(self, instance, value)
Called when the member is assigned a value.
3. object.__delete__(self, instance)
Called when the member is deleted.
To change how a member is accessed from the objects that hold it, define the member's class as a descriptor. Accessing the member then calls the corresponding method of the descriptor. The following example uses a descriptor to change the access rules:
    class MyDescriptor(object):
        def __init__(self, x):
            self.x = x

        def __get__(self, instance, owner):
            # Runs on every read of the member.
            print('get from descriptor')
            return self.x

        def __set__(self, instance, value):
            # Runs on every assignment through an instance.
            print('set from descriptor')
            self.x = value

        def __delete__(self, instance):
            # Runs on "del instance.member".
            print('del from descriptor, the val is %s' % self.x)

    class C(object):
        d = MyDescriptor('hello')

    >>> C.d
    get from descriptor
    'hello'
    >>> c = C()
    >>> c.d
    get from descriptor
    'hello'
    >>> c.d = 1
    set from descriptor
    >>> del c.d
    del from descriptor, the val is 1
The example shows that when we read, assign, or delete an object member that is a descriptor, the operation executes the corresponding method of the descriptor object. This convention is the descriptor protocol.
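One detail of the example above is that MyDescriptor keeps its value on the descriptor object itself, so every instance of C shares it. The instance argument makes per-instance storage possible. The sketch below is one way to do that in Python 3.6 or later; the names Positive and Order are hypothetical, and __set_name__ is a newer hook that the Python 2 sessions in this article do not have.

```python
class Positive:
    """Hypothetical data descriptor that keeps one value per instance."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; remembers the member name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:              # accessed through the class
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("%s must be positive" % self.name)
        instance.__dict__[self.name] = value


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity          # goes through Positive.__set__


a, b = Order(3), Order(7)
print(a.quantity, b.quantity)             # 3 7

try:
    a.quantity = -1
except ValueError as exc:
    print(exc)                            # quantity must be positive
```

Because Positive defines __set__, it still wins over the entry it writes into the instance dictionary, so every read continues to pass through __get__.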
Magic behind obj.name
With descriptors in the picture, what rules does Python follow when accessing object members? Before answering, we need to divide descriptors into two kinds:
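Overriding (or data) descriptor: the object provides both the __get__ and __set__ methods.
Non-overriding (or non-data) descriptor: the object provides only the __get__ method.
(The __delete__ method is left out of this classification. Strictly speaking, Python also treats a descriptor that defines __delete__ as a data descriptor, but the rules below rely only on __get__ and __set__.)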
The rules for accessing a member through a class object (such as C.name) are:
1. If "name" is found in C.__dict__, C.name reads C.__dict__['name']; call that value v. If v is a descriptor, the result of type(v).__get__(v, None, C) is returned. Otherwise, v is returned directly;
2. If "name" is not in C.__dict__, the parent classes of C are searched in MRO (Method Resolution Order), repeating the first step for each of them;
3. If "name" is still not found, an AttributeError exception is raised.
Accessing a member through an instance (for example x.name, where type(x) is C) is slightly more complex:
1. If "name" is found in C (or one of C's parent classes) and its value v is an overriding descriptor, the value of type(v).__get__(v, x, C) is returned;
2. Otherwise, if "name" is found in x.__dict__, the value of x.__dict__['name'] is returned;
3. Otherwise, Python falls back to the class lookup described above, except that a descriptor found this way is called as type(v).__get__(v, x, C);
4. If "name" is still not found and C defines the __getattr__ method, that method is called; otherwise an AttributeError is raised.
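The instance lookup order above can be observed directly. The following sketch (Python 3, with hypothetical class names DataDesc and NonDataDesc) plants entries with the same names in the instance dictionary and checks which source wins at each step.

```python
class DataDesc:
    """Overriding (data) descriptor: defines __get__ and __set__."""

    def __get__(self, instance, owner=None):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Non-overriding (non-data) descriptor: defines only __get__."""

    def __get__(self, instance, owner=None):
        return "from non-data descriptor"


class C:
    data = DataDesc()
    nondata = NonDataDesc()

    def __getattr__(self, name):
        # Only reached when every other lookup step has failed.
        return "from __getattr__"


x = C()
# Write to the instance dictionary directly, bypassing __set__.
x.__dict__["data"] = "from instance dict"
x.__dict__["nondata"] = "from instance dict"

print(x.data)       # from data descriptor   (step 1 beats the instance dict)
print(x.nondata)    # from instance dict     (step 2 beats a non-data descriptor)

del x.__dict__["nondata"]
print(x.nondata)    # from non-data descriptor (step 3)
print(x.missing)    # from __getattr__         (step 4)
```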
The lookup rules for assignment are similar to those for access, with one difference: when a value is assigned to a member through the class, the entry in C.__dict__ is simply replaced, and the descriptor's __set__ method is not called.
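The difference between assigning through an instance and assigning through the class can be seen in a few lines. This is a small Python 3 sketch; the Recorder class and its calls list are hypothetical names used only to log what __set__ receives.

```python
class Recorder:
    """Hypothetical data descriptor that logs every value passed to __set__."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        return "descriptor value"

    def __set__(self, instance, value):
        self.calls.append(value)


class C:
    d = Recorder()


recorder = C.__dict__["d"]    # keep a handle on the descriptor object itself
c = C()

c.d = 1                       # instance assignment: Recorder.__set__ runs
print(recorder.calls)         # [1]
print(c.__dict__)             # {}

C.d = 2                       # class assignment: the entry in C.__dict__ is replaced
print(recorder.calls)         # [1]  (unchanged, __set__ was not called)
print(C.__dict__["d"])        # 2
print(c.d)                    # 2
```

After the class assignment the descriptor is gone from C, so later reads through c return the plain value.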
Take the earlier MyDescriptor code as an example. When C.d is accessed, d is found in C.__dict__ and is a descriptor, so d.__get__(None, C) is called. When c.d is accessed, Python first looks in C, finds the definition of d there, and, because d is an overriding descriptor, executes d.__get__(c, C).
That covers the mechanics of descriptors. What are they used for? Inside Python, descriptors implement several language features, such as method calls, staticmethod, and property. The following section describes how descriptors implement method calls.
Bound & Unbound Method
In Python, a function is a first-class object; it is essentially the same as any other object. The difference is that a function object is callable: for a function object f, the syntax f() calls it. The member access rules described above apply to functions unchanged. When a method call such as obj.f() is made, Python performs the following two steps:
1. Obtain the function object according to the member access rules;
2. Perform the call using the object obtained.
To verify the above process, we can execute the following code:
    class C(object):
        def f(self):
            pass

    >>> fun = C.f      # unbound method
    >>> fun()          # raises TypeError
    >>> c = C()
    >>> fun = c.f      # bound method
    >>> fun()
C.f and c.f both return an object of type instancemethod. Both objects are callable, but neither is the plain function object we might have expected. So how is an instancemethod object related to the function object?
func type: the type of an ordinary function object in Python; def f(): pass defines an object f of this type;
instancemethod: a wrapper around a func. If the method is not bound to an object, the instancemethod is an unbound method, and calling it without an instance raises a TypeError. If the method is bound to an object, the instancemethod is a bound method, and the self argument is supplied automatically rather than passed by the caller.
Inspecting the members of an unbound method object and a bound method object shows that both contain three members: im_func, im_self, and im_class. im_func is the wrapped func object, im_self is the bound object, and im_class is the class that defines the function. So Python returns a different wrapper around the function depending on the situation: accessing the function through a class object returns a wrapper called an unbound method, and accessing it through an instance returns a bound method wrapper tied to that instance.
It's time for Descriptor to show its skills.
In Python, func is a non-overriding descriptor: it defines __get__ but not __set__, which is why an instance member of the same name can shadow a method. Its __get__ method constructs an instancemethod object and sets the im_func, im_self, and im_class members according to how the function was accessed. When the instancemethod object is called, the real function call is made from im_func and im_self. The following code sketches the process:
    class instancemethod(object):
        def __init__(self, im_self, im_func, im_class):
            self.im_self = im_self
            self.im_func = im_func
            self.im_class = im_class

        def __call__(self, *args):
            if self.im_self is None:
                # Unbound method: there is no instance to pass as self.
                raise TypeError('unbound method called without an instance')
            return self.im_func(self.im_self, *args)

    class func(object):
        ...
        def __get__(self, instance, owner):
            # instance is None for C.f and the object itself for c.f.
            return instancemethod(instance, self, owner)
        ...
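The same binding step can be observed on real function objects. The sketch below is written for Python 3, where things differ slightly from the Python 2 behaviour this article describes: there are no unbound methods (C.f is simply the function), and the bound method members are named __func__ and __self__ instead of im_func and im_self. It also checks that a plain function defines __get__ but not __set__.

```python
class C:
    def f(self):
        return "called on %s" % type(self).__name__


c = C()
raw = C.__dict__["f"]                    # the plain function stored in the class

# A function is a descriptor with __get__ only.
print(hasattr(raw, "__get__"), hasattr(raw, "__set__"))   # True False

bound = raw.__get__(c, C)                # what c.f does behind the scenes
print(bound.__self__ is c)               # True
print(bound.__func__ is raw)             # True
print(bound() == c.f())                  # True

print(C.f is raw)                        # True: Python 3 has no unbound method wrapper

# With no __set__ on the function, an instance member can shadow the method.
c.f = lambda: "shadowed"
print(c.f())                             # shadowed
```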
Solving a small problem
I had some Python code similar to this:
    # coding: utf-8


    class A(object):
        @property
        def _value(self):
            # raise AttributeError("test")
            return {"v": "This is a test."}

        def __getattr__(self, key):
            print("__getattr__: %s" % key)
            return self._value[key]

    if __name__ == '__main__':
        a = A()
        print(a.v)
Running it gives the correct result:
    __getattr__: v
    This is a test.
However, note the commented-out line:
# raise AttributeError("test")
If this line is uncommented, so that the _value method raises AttributeError, things get strange. The program does not report that exception when it runs; it goes into infinite recursion instead:
      File "attr_test.py", line 12, in __getattr__
        return self._value[key]
      File "attr_test.py", line 12, in __getattr__
        return self._value[key]
    RuntimeError: maximum recursion depth exceeded while calling a Python object
After some searching, the cause turned out to be the property decorator. property is in fact a descriptor. The Python documentation contains this passage:
object.__get__(self, instance, owner)
Called to get the attribute of the owner class (class attribute access) or of an instance of that class (instance attribute access). owner is always the owner class, while instance is the instance that the attribute was accessed through, or None when the attribute is accessed through the owner. This method should return the (computed) attribute value or raise an AttributeError exception.
So when the code accesses self._value and an AttributeError is raised, Python treats the attribute as missing and calls __getattr__ to try to obtain it. Since __getattr__ itself accesses self._value, the program recurses without end.
The problem does not look complicated here, but when your _value method raises AttributeError in a less obvious way, it becomes much harder to debug.
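The behaviour can be reproduced in Python 3 as well, where the error surfaces as RecursionError (a subclass of RuntimeError). The sketch below also shows one possible guard, which is an assumption of this example rather than part of the original code: __getattr__ refuses to handle the name _value, so the cycle is broken. The class names Recursive and Guarded and the helper outcome are hypothetical.

```python
class Recursive:
    @property
    def _value(self):
        raise AttributeError("test")

    def __getattr__(self, key):
        # Reached for "v", then again for "_value" because the property
        # raised AttributeError, then again for "_value", and so on.
        return self._value[key]


class Guarded:
    @property
    def _value(self):
        raise AttributeError("test")

    def __getattr__(self, key):
        if key == "_value":
            # Break the cycle: never try to resolve _value through __getattr__.
            raise AttributeError(key)
        return self._value[key]


def outcome(obj):
    """Return the name of the exception raised by reading obj.v."""
    try:
        return obj.v
    except (AttributeError, RecursionError) as exc:
        return type(exc).__name__


print(outcome(Recursive()))   # RecursionError
print(outcome(Guarded()))     # AttributeError
```

The guard trades the recursion for an AttributeError that names _value, so the message of the exception raised inside the property is lost. Raising a different exception type inside the property is another way to keep the real cause visible.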
Summary
A descriptor is an intermediate layer for accessing object members, and it provides a way to customize that access. With descriptors understood, several concepts that once seemed mysterious suddenly become clear:
Method calls: the language provides no special syntax rule for them. A descriptor returns an instancemethod that wraps the func, which is what makes calls of the form obj.func() work;
staticmethod: the decorator creates a staticmethod object and stores the func object in it. staticmethod is a descriptor whose __get__ method returns the stored func object;
property: creates a property object whose __get__, __set__, and __delete__ methods call the fget, fset, and fdel functions passed in at construction. Now you know why property takes exactly these three functions as parameters.
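As a rough illustration of the last two points, the built-ins can be approximated in pure Python. This is a simplified sketch for Python 3 and not the actual CPython implementation; the names MyStaticMethod, MyProperty, and Temperature are hypothetical.

```python
class MyStaticMethod:
    """Simplified stand-in for staticmethod."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        return self.func                  # no binding: hand back the stored function


class MyProperty:
    """Simplified stand-in for property."""

    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel

    def __get__(self, instance, owner=None):
        if instance is None:              # accessed through the class
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)


class Temperature:
    def __init__(self):
        self._celsius = 0.0

    def _get(self):
        return self._celsius

    def _set(self, value):
        self._celsius = float(value)

    celsius = MyProperty(_get, _set)      # no fdel: deletion is refused

    @MyStaticMethod
    def to_fahrenheit(celsius):
        return celsius * 9 / 5 + 32


t = Temperature()
t.celsius = 25                            # routed through MyProperty.__set__
print(t.celsius)                          # 25.0
print(Temperature.to_fahrenheit(100))     # 212.0
print(t.to_fahrenheit(0))                 # 32.0
```

The three descriptor methods map one-to-one onto fget, fset, and fdel, which is the correspondence the property item above points out.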
One last question: do descriptors affect Python's performance? They certainly do. Every member access has to follow the lookup rules and then call the descriptor's __get__ method, and for a method call the real function call only happens after that. Every access to an object member goes through this process, which adds overhead throughout a Python program. In the world of Python, however, being Pythonic seems to be what matters most.