To get a deeper understanding of Python's data structure, you need to understand some of the data storage in Python to understand the principles of the deep copy in Python, and then we'll go further.
One, Python data storage
In high-level languages (c, C + +, Java, Python), variables are abstractions of memory and their addresses. In Python, all variables are objects , and the storage of variables is referenced in a way that stores only the memory address where the value of the variable resides, not the variable itself. That is, the variable holds the address of the corresponding data, which we call a reference to the object . In this way, the variable holds only a reference, so the variable requires a consistent amount of storage space (similar to the C language pointer type).
Since the variables in Python are all cited, the data structure can contain the underlying data type, causing the address of the variable to be stored in each variable in Python, rather than the value itself; For complex data structures, the store is just the address of each element. The following gives the storage changes for the underlying type and the data structure type variable re-assignment. Let's take a look at an easy-to-understand diagram, understanding the semantics of Python and the storage of C-language values in memory, and about two graphs representing the difference between variable storage and C in Python, respectively (see: http://www.cnblogs.com/Eva-J/ p/5534037.html):
1. The effect of data type reinitialization on Python semantic references
Each initialization of a variable opens up a new space to assign the address of the new content to the variable. For the words, we repeat the assignment to the STR1, in fact the memory changes as shown in the right:
As we can see, str1 in the repetitive initialization process because the address of the element stored in STR1 is changed from the address of ' Hello World ' to ' new Hello World '.
2. Impact of changes in data structure internal elements on Python semantic references
For complex data types, the effect of changing its internal values on variables is:
When you make some additions and deletions to the elements in the list, it does not affect the Lst1 list itself for the entire list address, only changes the address references of its inner elements. But when we re-initialize (assign) a list, we give lst1 this variable an address, overwriting the address of the original list, and this time the memory ID of the Lst1 list has changed. The above principle is the same for all complex data types.
Second, the depth of the copy
1. Assigning values to variables
Figuring out the above, and then discussing the assignment of variables, becomes very simple.
Assignment of 2.str
We have just learned that str1 initialization (Assignment) will result in a change in memory address, from which we can see that the assigned str2 from the memory address to the value are not affected after the STR1 has been modified. Looking at the changes in memory, the starting assignment allows both the STR1 and STR2 variables to store the ' hello ' World ' address, re-initializes the str1, so that the address stored in STR1 has changed, pointing to the new value, the memory address of the STR2 variable storage is not changed, so it is not affected.
3
. Assigning values in complex data structures
We've just looked at the assignment of simple data types and now look at the impact of complex data structure changes on memory.
The added modification to the list does not change the memory address of the list, Lst1 and lst2 have changed. Comparing memory graphs It is not difficult to see that when a new value is added to the list, the address of a new element is stored in the list, and the address of the list itself does not change, so the IDs of both Lst1 and lst2 are unchanged and a new element is added. Simple analogy, we go out to eat, lst1 and lst2 like a table to eat two people, two people common a desk, as long as the table does not change, the dishes on the table changed two people are common feelings.
4. Initial knowledge Copy
We have learned more about the process of assigning variables. For complex data structures, the assignment is equal to the full sharing of the resource, and a change in value is completely shared by another value. Sometimes, however, we need to keep a copy of the original content of a piece of data, and then process the data, and it's not wise to use assignments at this time. Python provides a copy module for this requirement. Two main copy methods are available, one for normal copy and the other for deepcopy. We call the former a shallow copy, the latter a deep copy.
Deep copy has always been an important point of knowledge for all programming languages, so let's look at the differences from the memory point of view.
5. Shallow copy
First, let's take a look at the shallow copy. Shallow copy: No matter how complex the data structure, shallow copy will only copy one layer. Let's look at a picture below to see the concept of shallow copy.
Looking at the two graphs above, we added that the left image represents a list of sourcelist,sourcelist = [' str1 ', ' str2 ', ' str3 ', ' str4 ', ' str5 ', [' str1 ', ' str2 ', ' str3 ', ' STR4 ', ' s TR5 '];
The image on the right is based on the original of a shallow copy of the copylist,copylist = [' str1 ', ' str2 ', ' str3 ', ' str4 ', ' str5 ', [' str1 ', ' str2 ', ' str3 ', ' str4 ', ' STR5 '];
SourceList and Copylist look exactly the same on the surface, but in fact a new list has been generated in memory, copy the Sourcelst, get a new list, store 5 strings and a list of memory addresses.
Let's look at the following actions for two lists, the red box is the variable initialization, initialize the above two lists, we can work on the two lists separately, such as inserting a value, what will we find? As shown below:
From the above code we can see that for the Sourcelst and Copylst list to add an element, the two lists seem to be independent of the same has changed, but when I modify the LST, the two lists have changed, this is why? Let's look at an in-memory change graph:
We can know that the sourcelst and Copylst lists all store a lump of address, when we modify the Source_lst1 element, equivalent to the ' sourcechange ' address replaced the original ' str1 ' address, So the first element of Sourcelst has changed. The Copylst still stores the str1 address, so copylst will not change.
When the Sourcelst list changes, the LST memory address stored in COPYLST does not change, so when LST changes, the Sourcelst and copylst two lists change.
This happens in dictionaries, list sets, dictionary sets, list sets, and nested lists of complex data structures, so when our data types are complex, it's very careful to use copy to make shallow copies ...
5. Deep copy
Deep copy-Another deepcopy method provided by the Python copy module. A deep copy completely duplicates all the data associated with the original variable, generating exactly the same set of content in memory, and any modification to one of the two variables in this process will not affect the other variables. Let's try it out here.
Looking at the results of the above implementation, this time we will not be affected by the list of copies, regardless of whether we operate directly on the list or on other data structures nested inside the list. Let's look at the state of these variables in memory:
Looking at the above, we know the principle of deep copy. In fact, deep copy is to re-open a piece of space in memory, no matter how complex the data structure, as long as it encounters a possible change in the type of the information, it re-open a memory space to copy the content, until the last layer, no longer have complex data types, keep its original reference. This way, no matter how complex the data structures are, the changes between them do not affect each other. This is the deep copy.
This article refers to the blog: https://www.cnblogs.com/Liubit/p/7668476.html, thank you for the "Liubit" Share!
----data storage and dark copy of Python data structures