The beauty of Python [from novice to master]: a full explanation of shallow copy and deep copy (copy source code analysis)

Source: Internet
Author: User

Embarrassingly, I had always assumed that the copy module was written in C. Occasionally I need a deeper understanding of deepcopy, but the documentation is too brief and the explanations I found elsewhere left me in a fog.

For example, I recently had some questions about AttribDict's __deepcopy__ in the sqlmap source code:

    def __deepcopy__(self, memo):
        retVal = self.__class__()
        memo[id(self)] = retVal

        for attr in dir(self):
            if not attr.startswith('_'):
                value = getattr(self, attr)
                if not isinstance(value, (types.BuiltinFunctionType, types.BuiltinFunctionType, types.FunctionType, types.MethodType)):
                    setattr(retVal, attr, copy.deepcopy(value, memo))

        for key, value in self.items():
            retVal.__setitem__(key, copy.deepcopy(value, memo))

        return retVal

What is memo? And why is the key id(self)? If this is just a convention, wouldn't it be simpler to pass self directly and let copy compute the id internally?

The second question can be answered with a little thought. The id serves as a unique identifier for the object. Handing over self itself is riskier: you have no idea what the receiving code will do with it, and it might even modify self. Passing only the id also follows the principle of passing the minimum that is needed. There is a practical reason as well: many objects (lists and dicts, for example) are unhashable and cannot be dictionary keys, whereas id(x) is always an integer.
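A minimal sketch of both points, using a hypothetical `Node` class (it is not taken from sqlmap): the memo maps the id of an original object to its copy, and registering the copy before recursing is what lets a self-reference terminate.

```python
import copy


class Node:
    """Hypothetical example class; not part of sqlmap or the copy module."""

    def __init__(self, name):
        self.name = name
        self.children = []   # a list: mutable and unhashable

    def __deepcopy__(self, memo):
        clone = self.__class__(self.name)
        # Register the copy under the original's id BEFORE recursing,
        # so a reference back to self resolves to clone instead of looping.
        memo[id(self)] = clone
        clone.children = copy.deepcopy(self.children, memo)
        return clone


root = Node("root")
root.children.append(root)          # cycle: root contains itself

clone = copy.deepcopy(root)
print(clone is root)                # False
print(clone.children[0] is clone)   # True: the cycle is preserved in the copy

# A list cannot be used as a dict key, but its id always can.
try:
    {root.children: "x"}
except TypeError as exc:
    print("unhashable:", exc)
```

If the `memo[id(self)] = clone` line were moved after the recursive call, copying `root` would recurse until Python raised a RecursionError.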

The docstring at the top of copy.py is very good:

Main interfaces:

    import copy

    x = copy.copy(y)        # make a shallow copy of y
    x = copy.deepcopy(y)    # make a deep copy of y

For errors specific to the copy module, a copy.Error exception is raised.

The difference between shallow copy and deep copy only matters for compound objects, that is, objects that contain other objects, such as lists and class instances.

-- A shallow copy constructs a new compound object and then inserts into it references to the same objects that the original contains.

-- A deep copy constructs a new compound object, recursively copies the objects the original contains, and inserts those copies into the new compound object.
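The difference is easiest to see with a nested list. This is a small illustrative sketch; the variable names are arbitrary.

```python
import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Both calls create a new outer list.
print(shallow is original)          # False
print(deep is original)             # False

# Shallow: the inner lists are shared. Deep: the inner lists are copies.
print(shallow[0] is original[0])    # True
print(deep[0] is original[0])       # False

# A change to an inner list is visible through the shallow copy only.
original[0].append(99)
print(shallow[0])                   # [1, 2, 99]
print(deep[0])                      # [1, 2]
```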


There are two problems with deep copy that do not arise with shallow copy:

A) Recursive objects (compound objects that directly or indirectly contain a reference to themselves) may cause a recursive loop.

B) Because deep copy copies everything, it may copy too much, for example data structures that are meant to be shared between copies.

Python solves these problems as follows:

A) It keeps a table (the memo) of objects already copied during the current copying pass.

B) It lets user-defined classes customize the copy operation.
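Both mechanisms can be observed directly. In the sketch below the lists show the table of already-copied objects at work, and `Session` is a hypothetical class (an assumption for illustration only) that overrides `__deepcopy__` so that one attribute stays shared.

```python
import copy

shared = [1, 2]
data = [shared, shared]          # the same list referenced twice
loop = []
loop.append(loop)                # a list that contains itself

data_copy = copy.deepcopy(data)
loop_copy = copy.deepcopy(loop)

print(data_copy[0] is data_copy[1])   # True: copied once, sharing is kept
print(data_copy[0] is shared)         # False
print(loop_copy[0] is loop_copy)      # True: the cycle is reproduced


class Session:
    """Hypothetical class whose `pool` is meant to be shared, not copied."""

    def __init__(self, options, pool):
        self.options = options
        self.pool = pool

    def __deepcopy__(self, memo):
        # Copy the options deeply, but keep pointing at the same pool.
        clone = Session(copy.deepcopy(self.options, memo), self.pool)
        memo[id(self)] = clone
        return clone


pool = object()
s1 = Session({"retries": [1, 2]}, pool)
s2 = copy.deepcopy(s1)
print(s2.pool is s1.pool)             # True
print(s2.options is s1.options)       # False
```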

The copy module does not copy the following types: module, class, function, method, stack trace, stack frame, file, socket, window, array, or similar.
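For functions and classes, "does not copy" means in practice that the module hands back the original object unchanged. A quick check, where `greet` and `Point` are throwaway names made up for this example:

```python
import copy


def greet():
    return "hi"


class Point:
    pass


print(copy.copy(greet) is greet)           # True
print(copy.deepcopy(greet) is greet)       # True
print(copy.deepcopy(Point) is Point)       # True
print(copy.deepcopy(len) is len)           # True

# The same holds when they sit inside a container that is deep-copied.
handlers = [greet, Point]
handlers_copy = copy.deepcopy(handlers)
print(handlers_copy is handlers)           # False: the list itself is new
print(handlers_copy[0] is greet)           # True
```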


Let's see how deepcopy avoids infinite recursion:

    def deepcopy(x, memo=None, _nil=[]):
        """Deep copy operation on arbitrary Python objects.

        See the module's __doc__ string for more info.
        """

        if memo is None:
            memo = {}

        d = id(x)
        # Check whether the object has already been copied,
        # so that copying cannot loop forever
        y = memo.get(d, _nil)
        if y is not _nil:
            return y
        # ... (the rest of the function is omitted here)

As we can see, memo is indeed a dict keyed by object id, so the code at the start of this article is no longer surprising.
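To make the role of the memo concrete, here is a toy re-implementation. It is a sketch under simplifying assumptions, not the code in copy.py: `tiny_deepcopy` is a made-up name, it handles only lists and dicts, and it treats everything else as atomic.

```python
def tiny_deepcopy(x, memo=None):
    """Toy deep copy for lists and dicts only (illustrative sketch)."""
    if memo is None:
        memo = {}
    key = id(x)
    if key in memo:                # already copied: reuse that copy
        return memo[key]
    if isinstance(x, list):
        y = []
        memo[key] = y              # register before recursing
        for item in x:
            y.append(tiny_deepcopy(item, memo))
        return y
    if isinstance(x, dict):
        y = {}
        memo[key] = y
        for k, v in x.items():
            y[tiny_deepcopy(k, memo)] = tiny_deepcopy(v, memo)
        return y
    return x                       # treated as atomic in this sketch


a = [1, {"k": []}]
a.append(a)                        # a now contains itself

b = tiny_deepcopy(a)
print(b is a)                      # False
print(b[1] is a[1])                # False: the dict was copied
print(b[2] is b)                   # True: the memo closed the cycle
```

Without the memo lookup at the top, the call on `a` would recurse until the interpreter raised a RecursionError.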

Let's take a look at the operation of the shallow copy:

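    # d is the dispatch table that maps a type to its copier function

    def _copy_immutable(x):
        # Immutable object: return the object itself
        return x
    for t in (type(None), int, long, float, bool, str, tuple,
              frozenset, type, xrange, types.ClassType,
              types.BuiltinFunctionType, type(Ellipsis),
              types.FunctionType, weakref.ref):
        d[t] = _copy_immutable

    def _copy_with_constructor(x):
        # Mutable object: create a new object with the type's constructor
        return type(x)(x)    # type(1) is int, so type(x)(x) is e.g. list(x)
    for t in (list, dict, set):
        d[t] = _copy_with_constructor

The shallow copy is relatively simple: it looks up the copier function that matches the type of the object being copied and calls it. The only thing worth noting is the use of type(x)(x), which calls the object's own type as a constructor. Names such as long, xrange and types.ClassType show that this excerpt comes from the Python 2 version of copy.py.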
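The `type(x)(x)` idiom works the same way outside the module. A small sketch, with `copy_with_constructor` as a stand-in name for the private helper:

```python
def copy_with_constructor(x):
    # type(x) is the class of x, so this is list(x), dict(x) or set(x)
    return type(x)(x)


inner = [1, 2]
lst = [inner, 3]
dct = {"k": inner}
st = {1, 2, 3}

lst2 = copy_with_constructor(lst)
dct2 = copy_with_constructor(dct)
st2 = copy_with_constructor(st)

print(type(lst2), type(dct2), type(st2))
print(lst2 is lst, dct2 is dct, st2 is st)   # False False False
print(lst2[0] is inner, dct2["k"] is inner)  # True True: contents are shared
```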


Now let's look at the deep copy operation. The tuple case deserves the most attention:

    def _deepcopy_atomic(x, memo):
        # Deep copy distinguishes objects that can be broken down further
        # (compound objects) from those that cannot.
        # A non-compound (atomic) object is returned as-is.
        return x
    d[type(None)] = _deepcopy_atomic
    d[type(Ellipsis)] = _deepcopy_atomic
    d[int] = _deepcopy_atomic
    d[long] = _deepcopy_atomic
    d[float] = _deepcopy_atomic
    d[bool] = _deepcopy_atomic

    def _deepcopy_list(x, memo):
        """
        A list is a mutable object, so a deep copy of a list deep-copies
        every element it contains. Compare this with the tuple case below
        for a better understanding of mutable and immutable.
        """
        y = []
        memo[id(x)] = y
        for a in x:
            y.append(deepcopy(a, memo))
        return y
    d[list] = _deepcopy_list

    def _deepcopy_tuple(x, memo):
        """
        Tuples are immutable, so a few special cases must be handled.
        First, look at how deepcopy behaves on tuples:

        1. The tuple contains only immutable objects
        In [48]: t1 = (1, (2, 3), 's')
        In [49]: t2 = copy.deepcopy(t1)
        In [50]: id(t1), id(t2)    # t1 is t2
        Out[50]: (43829008, 43829008)

        2. The tuple contains a mutable object
        In [51]: t1 = (1, [2, 3], 's')
        In [52]: t2 = copy.deepcopy(t1)
        In [53]: id(t1), id(t2)    # t1 and t2 are different objects
        Out[53]: (43828528, 43828968)
        """
        y = []
        for a in x:    # ? Why is this not done at position 1 instead
            y.append(deepcopy(a, memo))
        d = id(x)
        try:
            return memo[d]
        except KeyError:
            pass
        # position 1
        for i in range(len(x)):
            if x[i] is not y[i]:
                # An element's copy is a different object,
                # so x contains a mutable object
                y = tuple(y)    # rebuild with the tuple constructor
                break
        else:
            # No break: every element is immutable,
            # so there is no need to build a new tuple
            y = x
        memo[d] = y
        return y
    d[tuple] = _deepcopy_tuple

    def _deepcopy_dict(x, memo):
        y = {}
        memo[id(x)] = y
        for key, value in x.iteritems():
            # Both the key and the value are deep-copied
            y[deepcopy(key, memo)] = deepcopy(value, memo)
        return y
    d[dict] = _deepcopy_dict
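The two tuple cases from the docstring can be checked directly, and a third case suggests one plausible answer to the "position 1" question: a tuple can contain itself indirectly through a list, so copying the elements may already have produced a copy of the tuple and stored it in the memo. Checking the memo after the elements are copied lets that copy be reused. The identity results below are what CPython 3 gives; the names are arbitrary.

```python
import copy

t1 = (1, (2, 3), "s")            # only immutable contents
t2 = (1, [2, 3], "s")            # contains a mutable list

print(copy.deepcopy(t1) is t1)   # True: nothing needed copying
c2 = copy.deepcopy(t2)
print(c2 is t2)                  # False: rebuilt around the copied list
print(c2[1] is t2[1])            # False

# A tuple that contains itself indirectly, via a list
inner = []
t3 = (inner,)
inner.append(t3)

c3 = copy.deepcopy(t3)
print(c3 is t3)                  # False
print(c3[0] is inner)            # False
print(c3[0][0] is c3)            # True: only one copy of the tuple exists
```

If the memo were consulted only before the elements were copied, the outer call would build a second tuple, and `c3[0][0]` would no longer be `c3`.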

