In the earlier introductions to Python objects (basic object-oriented concepts, further object-oriented topics), I mentioned Python's philosophy that "everything is an object": in Python, a variable and a function alike are objects. While Python runs, objects are stored in memory, ready to be used at any time. However, data in memory disappears when the computer shuts down. How can we save an object to a file on the hard disk?
The computer's memory holds data as a binary sequence (which, from the Linux point of view, is just a text stream). We can take the data stored at an object's location, convert it to a text stream (this process is called serialization), and then write that text stream to a file. Because Python refers to an object's class definition when it creates the object, we must have that class definition available when we read the object back from the text, so that Python knows how to reconstruct it. For Python's built-in objects (such as integers, dictionaries, lists, and so on), we do not need to define classes in the program, because their class definitions are already loaded into memory. For a user-defined object, however, the class must be defined before the object can be loaded from the file (for example, the object summer in the basic object-oriented concepts article).
The pickle package
For the process described above, the most commonly used tool is Python's pickle package.
1) Convert an in-memory object into a text stream:
    import pickle

    # define class
    class Bird(object):
        have_feather = True
        way_of_reproduction = 'egg'

    summer = Bird()                        # construct an object
    picklestring = pickle.dumps(summer)    # serialize object
The pickle.dumps() method converts the object summer to the string picklestring (that is, a text stream). We can then store the string in a file using the normal text storage methods (see text file input and output).
Of course, we can also use the pickle.dump() method to combine these two steps:
    import pickle

    # define class
    class Bird(object):
        have_feather = True
        way_of_reproduction = 'egg'

    summer = Bird()                  # construct an object
    fn = 'a.pkl'
    with open(fn, 'wb') as f:        # open file in binary write mode
        pickle.dump(summer, f)       # serialize and save object (dump returns None)
The object summer is now stored in the file a.pkl.
2) Rebuilding objects
First, we read the text from the file and store it in a string (see text file input and output). The string is then converted back to an object with the pickle.loads(str) method. Remember that at this point our program must already contain the class definition for that object.
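The two-step route (serialize to a string, store it, read it back, rebuild the object) can be sketched as follows. This is a minimal sketch that assumes Python 3, where pickle.dumps() returns bytes, so the file is opened in binary mode. The file name bird_state.pkl and the sample dictionary are made up for illustration, and a built-in type is used so that no class definition is needed.

```python
import os
import pickle
import tempfile

# Hypothetical sample data; a built-in dict needs no class definition.
record = {"name": "summer", "have_feather": True, "eggs": [1, 2, 3]}

# Hypothetical file location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bird_state.pkl")

# Step 1: serialize in memory, then store the result like any other data.
pickled = pickle.dumps(record)
with open(path, "wb") as f:
    f.write(pickled)

# Step 2: read the stored data back, then rebuild the object from it.
with open(path, "rb") as f:
    raw = f.read()
rebuilt = pickle.loads(raw)

print(rebuilt == record)   # same content?
print(rebuilt is record)   # same object?
```

The rebuilt object has the same content as the original but is a separate object in memory.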
In addition, we can use the pickle.load() method to merge the steps above:
    import pickle

    # define the class before unpickling
    class Bird(object):
        have_feather = True
        way_of_reproduction = 'egg'

    fn = 'a.pkl'
    with open(fn, 'rb') as f:        # open file in binary read mode
        summer = pickle.load(f)      # read file and build object
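Why the class definition has to be present can be seen by looking at what the serialized data contains. The following is a small sketch, assuming Python 3 and a script that is run directly; the instance attribute nickname is added only for illustration. The pickle records the name of the class and the instance's own attributes, but not the class body, so class attributes such as way_of_reproduction come from the definition that is available at load time.

```python
import pickle

class Bird(object):
    have_feather = True
    way_of_reproduction = 'egg'

summer = Bird()
summer.nickname = 'sunny'          # illustrative instance attribute

data = pickle.dumps(summer)

# The class is recorded by name; instance attributes are stored by value.
has_class_name = b'Bird' in data
has_instance_attr = b'nickname' in data
# Class attributes live in the class definition, not in the pickle.
has_class_attr = b'way_of_reproduction' in data

restored = pickle.loads(data)      # works because Bird is defined above
print(has_class_name, has_instance_attr, has_class_attr)
print(restored.way_of_reproduction)
```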
The cPickle package
The functionality and usage of the cPickle package are almost identical to those of the pickle package (the differences rarely matter in practice). The main difference is that cPickle is written in C and can be up to 1000 times faster than pickle. To use the cPickle package in the example above, we only need to change the import statement to:
    import cPickle as pickle
No other changes are needed.
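Note that cPickle exists only in Python 2. In Python 3 the separate module is gone, and the standard pickle module uses its C implementation automatically when it is available. For code that has to run on both versions, a commonly used fallback import looks like this (shown as a sketch; the sample list is arbitrary):

```python
try:
    import cPickle as pickle   # Python 2: C implementation
except ImportError:
    import pickle              # Python 3: no separate cPickle module

values = [1, 2, 3]
copy = pickle.loads(pickle.dumps(values))
print(copy == values)
```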
Summary
Converting objects to text streams and files, and back
pickle.dump(), pickle.load(), cPickle
The concept of serialization in Python is simple. You have a data structure in memory that you want to save, reuse, or send to someone else. What do you do? It depends on how you want to save it, how you want to reuse it, and who you want to send it to. Many games let you save your progress when you exit and pick up where you left off when you start again. (In fact, many non-game programs do the same.) In this case, a data structure that captures the current progress needs to be saved to the hard disk when you exit, and loaded from the hard disk when you restart.
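The save-game scenario can be sketched in a few lines. This is only an illustration, assuming Python 3; the function names save_progress and load_progress, the file name savegame.pkl, and the contents of the state dictionary are all hypothetical.

```python
import os
import pickle
import tempfile

def save_progress(state, path):
    """Write the current progress to disk (for example, on exit)."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_progress(path):
    """Return the saved progress, or None if nothing has been saved yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)

save_path = os.path.join(tempfile.mkdtemp(), "savegame.pkl")

# First start: no save file yet, so begin with a fresh state.
state = load_progress(save_path)
if state is None:
    state = {"level": 1, "score": 0, "inventory": []}

# Play for a while, then save on exit.
state["level"] = 3
state["inventory"].append("key")
save_progress(state, save_path)

# Next start: the previous progress comes back from disk.
resumed = load_progress(save_path)
print(resumed)
```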
The Python standard library provides the pickle and cPickle modules. cPickle is implemented in C and is more efficient than pickle, but the types defined in the cPickle module cannot be subclassed (most of the time we do not need to subclass them, so cPickle is the recommended choice). The serialization and deserialization rules of cPickle and pickle are the same: an object serialized with pickle can be deserialized with cPickle. Both modules also handle self-referencing types sensibly: they do not recurse endlessly when serializing a self-referencing object, and an object that is referenced several times is serialized only once.
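The handling of shared and self-referencing objects can be checked with a short experiment. This is a sketch assuming Python 3; the variable names are illustrative.

```python
import pickle

shared = [1, 2, 3]
container = {"first": shared, "second": shared}   # two references, one list

loop = []
loop.append(loop)                                  # a list that contains itself

container_copy = pickle.loads(pickle.dumps(container))
loop_copy = pickle.loads(pickle.dumps(loop))

# The shared list is serialized once, so the copy still shares a single list.
print(container_copy["first"] is container_copy["second"])
# The self-reference is preserved instead of recursing without end.
print(loop_copy[0] is loop_copy)
```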
The two main functions in the pickle module are dump() and load(). The dump() function takes a data object and a file handle as parameters and saves the data object to the given file in a specific format. When we use the load() function to retrieve a saved object from a file, pickle knows how to restore the object to its original form.
The dumps() function performs the same serialization as the dump() function. Instead of accepting a stream object and saving the serialized data to a disk file, it simply returns the serialized data.
The loads() function performs the same deserialization as the load() function. Instead of accepting a stream object and reading the serialized data from a file, it takes a str object that contains the serialized data and returns the object directly.
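The relationship between the two pairs of functions can be sketched with an in-memory stream. This assumes Python 3, where the serialized data is a bytes object rather than a str, and uses io.BytesIO as a stand-in for a file on disk; the sample dictionary is illustrative.

```python
import io
import pickle

settings = {"volume": 7, "tags": ["a", "b"]}   # illustrative data

# dump()/load() work on a stream object (here an in-memory one).
stream = io.BytesIO()
pickle.dump(settings, stream)
stream.seek(0)                                  # rewind before reading
from_stream = pickle.load(stream)

# dumps()/loads() skip the stream and work on the serialized data itself.
serialized = pickle.dumps(settings)
from_bytes = pickle.loads(serialized)

print(type(serialized).__name__)
print(from_stream == from_bytes == settings)
```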
pickle.dump(obj, file[, protocol])
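This is the method that persists an object. Its parameters are: obj: the object to be persisted; file: an object that has a write() method accepting a single string argument. This can be a file object opened for writing, a StringIO object, or any other custom object that meets this requirement. protocol: an optional parameter that defaults to 0. If it is set to 1 or True, the persisted object is saved in a more compact binary format; otherwise it is saved in ASCII format.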
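The effect of the protocol argument can be observed directly. The following is a sketch assuming Python 3, where more protocols exist than the 0 and 1 described here and pickle.HIGHEST_PROTOCOL names the newest one; the sample list is arbitrary.

```python
import pickle

numbers = list(range(1000))          # arbitrary sample data

text_form = pickle.dumps(numbers, protocol=0)
binary_form = pickle.dumps(numbers, protocol=pickle.HIGHEST_PROTOCOL)

# Protocol 0 output for this data is plain ASCII text.
ascii_text = text_form.decode("ascii")

# The binary protocol produces a shorter result for the same data.
print(len(text_form), len(binary_form))
print(pickle.loads(text_form) == pickle.loads(binary_form) == numbers)
```

The protocol is not needed when loading: the data records which protocol was used to write it.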
How do we restore an object after it has been persisted? The pickle module provides a corresponding method:
pickle.load(file)
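file corresponds to the file parameter of the dump method above. It must be an object that has a read() method accepting an integer argument and a readline() method taking no arguments, and both methods must return strings. This can be a file object opened for reading, a StringIO object, or any other object that meets these requirements.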
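Because dump only needs an object with a suitable write() method, a custom class works as well. The sketch below assumes Python 3, where write() receives bytes-like data rather than a string. The class name ChunkCollector is made up for illustration, and io.BytesIO supplies the read() and readline() methods that load expects.

```python
import io
import pickle

class ChunkCollector(object):
    """Hypothetical minimal 'file': only a write() method is provided."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))   # keep a copy of each piece written
        return len(data)

original = {"a": 1, "b": 2, "c": 3}

sink = ChunkCollector()
pickle.dump(original, sink)

# Rebuild from the collected pieces through a readable in-memory stream.
source = io.BytesIO(b"".join(sink.chunks))
restored = pickle.load(source)
print(restored == original)
```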
The following is a basic use case:
    # -*- coding: utf-8 -*-
    import pickle
    # can also do this:
    # import cPickle as pickle

    obj = {"a": 1, "b": 2, "c": 3}

    # persist obj to the file tmp.txt (binary mode works in Python 2 and 3)
    with open("tmp.txt", "wb") as f:
        pickle.dump(obj, f)

    # do something else ...

    # read and restore the obj object from tmp.txt
    with open("tmp.txt", "rb") as f:
        obj2 = pickle.load(f)

    print(obj2)
In practical applications, however, we may make some improvements. One is to replace pickle with cPickle, which is a C implementation of pickle and is faster. Another is to set the third parameter of dump to True, which produces a more compact file. Take a look at the following example:
    # -*- coding: utf-8 -*-
    import cPickle as pickle
    import random
    import os
    import time

    LENGTH = 1024 * 10240

    def main():
        d = {}
        a = []
        for i in range(LENGTH):
            a.append(random.randint(0, 255))
        d["a"] = a

        print("dumping...")
        t1 = time.time()
        pickle.dump(d, open("tmp1.dat", "wb"), True)    # binary protocol
        print("dump1: %.3fs" % (time.time() - t1))

        t1 = time.time()
        pickle.dump(d, open("tmp2.dat", "w"))           # default ASCII protocol
        print("dump2: %.3fs" % (time.time() - t1))

        # compare the sizes of the two files
        s1 = os.stat("tmp1.dat").st_size
        s2 = os.stat("tmp2.dat").st_size
        print("%d, %d, %.2f%%" % (s1, s2, 100.0 * s1 / s2))

        print("loading...")
        t1 = time.time()
        obj1 = pickle.load(open("tmp1.dat", "rb"))
        print("load1: %.3fs" % (time.time() - t1))

        t1 = time.time()
        obj2 = pickle.load(open("tmp2.dat", "r"))
        print("load2: %.3fs" % (time.time() - t1))

    if __name__ == "__main__":
        main()
As you can see, when protocol is set to True for dump, the resulting file is only about 30% of the size of the file written with the default protocol, and both dump and load take less time. It is therefore generally advisable to set this value to True.
In addition, the pickle module provides the dumps and loads methods. They are similar to the dump and load methods above, except that they take no file parameter: their input and output are string objects. In some scenarios these two methods are more convenient to use.
Python standard library (pickle package, cPickle package)