Examples of using pickle, the object persistence module, in Python
In Python, you can use the pickle module to serialize an object and save it to a file on disk. You can then read the file and restore the object when needed. The usage is as follows:
pickle is a common serialization tool in the Python standard library. It can export in-memory objects as strings in text or binary format, or write them to files, and can later restore an in-memory object from such a string or file. Python 2 also ships a reimplementation in C, called cPickle, with higher performance. The following code demonstrates the common interfaces of the pickle library, which are very simple:
import cPickle as pickle


# dumps and loads
# Dump an in-memory object to a string, or load a string back into an in-memory object.
def test_dumps_and_loads():
    t = {'name': ['v1', 'v2']}
    print t
    o = pickle.dumps(t)
    print o
    print 'len o: ', len(o)
    p = pickle.loads(o)
    print p


# About the HIGHEST_PROTOCOL parameter: pickle supports three protocols, 0, 1 and 2:
# http://stackoverflow.com/questions/23582489/python-pickle-protocol-choice
# 0: ASCII protocol, compatible with old versions of Python
# 1: binary format, compatible with old versions of Python
# 2: binary format, introduced in Python 2.3, with better support for new-style classes
def test_dumps_and_loads_HIGHEST_PROTOCOL():
    print 'HIGHEST_PROTOCOL: ', pickle.HIGHEST_PROTOCOL
    t = {'name': ['v1', 'v2']}
    print t
    o = pickle.dumps(t, pickle.HIGHEST_PROTOCOL)
    print 'len o: ', len(o)
    p = pickle.loads(o)
    print p


# New-style class
def test_new_style_class():
    class TT(object):
        def __init__(self, arg, **kwargs):
            super(TT, self).__init__()
            self.arg = arg
            self.kwargs = kwargs

        def test(self):
            print self.arg
            print self.kwargs

    # ASCII protocol
    t = TT('test', a=1, b=2)
    o1 = pickle.dumps(t)
    print o1
    print 'o1 len: ', len(o1)
    p = pickle.loads(o1)
    p.test()

    # HIGHEST_PROTOCOL has better support for new-style classes and higher performance.
    o2 = pickle.dumps(t, pickle.HIGHEST_PROTOCOL)
    print 'o2 len: ', len(o2)
    p = pickle.loads(o2)
    p.test()


# dump and load
# Serialize an in-memory object and write it directly to a file or to any object
# that supports the file interface.
# For dump, the object must provide write(), which accepts a string, e.g. StringIO.
# For load, the object must provide read(), which accepts an int, and readline(),
# which takes no arguments, e.g. StringIO.

# Using a file, ASCII encoding
def test_dump_and_load_with_file():
    t = {'name': ['v1', 'v2']}
    # ASCII format
    with open('test.txt', 'w') as fp:
        pickle.dump(t, fp)
    with open('test.txt', 'r') as fp:
        p = pickle.load(fp)
        print p


# Using a file, binary encoding
def test_dump_and_load_with_file_HIGHEST_PROTOCOL():
    t = {'name': ['v1', 'v2']}
    with open('test.bin', 'wb') as fp:
        pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)
    with open('test.bin', 'rb') as fp:
        p = pickle.load(fp)
        print p


# Using StringIO, binary encoding
def test_dump_and_load_with_StringIO():
    import StringIO
    t = {'name': ['v1', 'v2']}
    fp = StringIO.StringIO()
    pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)
    fp.seek(0)
    p = pickle.load(fp)
    print p
    fp.close()


# Using a custom class
# This demonstrates a user-defined class. As long as it implements the write, read
# and readline interfaces, it can be used as the file parameter of dump and load.
def test_dump_and_load_with_user_def_class():
    import StringIO

    class FF(object):
        def __init__(self):
            self.buf = StringIO.StringIO()

        def write(self, s):
            self.buf.write(s)
            print 'len: ', len(s)

        def read(self, n):
            return self.buf.read(n)

        def readline(self):
            return self.buf.readline()

        def seek(self, pos, mod=0):
            return self.buf.seek(pos, mod)

        def close(self):
            self.buf.close()

    fp = FF()
    t = {'name': ['v1', 'v2']}
    pickle.dump(t, fp, pickle.HIGHEST_PROTOCOL)
    fp.seek(0)
    p = pickle.load(fp)
    print p
    fp.close()


# Pickler/Unpickler
# Pickler(file, protocol).dump(obj) is equivalent to pickle.dump(obj, file[, protocol])
# Unpickler(file).load() is equivalent to pickle.load(file)
# Pickler/Unpickler offer better encapsulation and make it easy to swap the file.
def test_pickler_unpickler():
    t = {'name': ['v1', 'v2']}
    f = open('test.bin', 'wb')
    pick = pickle.Pickler(f, pickle.HIGHEST_PROTOCOL)
    pick.dump(t)
    f.close()
    f = open('test.bin', 'rb')
    unpick = pickle.Unpickler(f)
    p = unpick.load()
    print p
    f.close()
pickle.dump(obj, file[, protocol])
This function persists an object. Its parameters are as follows:
- obj: the object to be persisted;
- file: an object with a write() method that accepts a string as its argument. It can be a file object opened in write mode, a StringIO object, or any other custom object that meets this condition.
- protocol: an optional parameter. The default value is 0, which saves the object in ASCII format. If it is set to 1 or True, the object is saved in a more compact binary format.
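The three parameters can be tried out without touching the disk. The following is only a minimal sketch, and it assumes Python 3, where the stream is binary, so io.BytesIO stands in for StringIO. The helper name dump_to_buffer is made up for this illustration and is not part of pickle.

```python
import io
import pickle


def dump_to_buffer(obj, protocol):
    """Hypothetical helper: pickle obj into an in-memory buffer, return the bytes."""
    buf = io.BytesIO()               # any object with a write() method would do
    pickle.dump(obj, buf, protocol)  # obj, file, protocol: the three parameters
    return buf.getvalue()


data = {"name": ["v1", "v2"]}
ascii_bytes = dump_to_buffer(data, 0)                         # protocol 0, ASCII format
binary_bytes = dump_to_buffer(data, pickle.HIGHEST_PROTOCOL)  # newest binary protocol
print(len(ascii_bytes), len(binary_bytes))
```

For an object this small the two lengths are close; the difference only becomes significant with larger data.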
How can an object be restored after it has been persisted? The pickle module also provides the following function:
pickle.load(file)
It takes a single parameter, file, which corresponds to the file parameter of the dump function above. This object must have a read() method that accepts an integer argument and a readline() method that takes no arguments, and both methods should return strings. It can be a file object opened for reading, a StringIO object, or any other object that meets these conditions.
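To illustrate the read() and readline() requirement, here is a small sketch that assumes Python 3, where both methods return bytes rather than str. The class MinimalReader is a hypothetical name used only for this example.

```python
import io
import pickle


class MinimalReader:
    """Hypothetical reader exposing only the two methods pickle.load() needs."""

    def __init__(self, payload):
        self._buf = io.BytesIO(payload)

    def read(self, n=-1):
        # Called by pickle with an integer byte count.
        return self._buf.read(n)

    def readline(self):
        # Called by pickle without arguments.
        return self._buf.readline()


original = {"a": 1, "b": 2, "c": 3}
payload = pickle.dumps(original)
restored = pickle.load(MinimalReader(payload))
print(restored)
```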
The following is a basic example:
# -*- coding: utf-8 -*-
import pickle
# You can also use: import cPickle as pickle

obj = {"a": 1, "b": 2, "c": 3}

# Save obj persistently to the file tmp.txt
pickle.dump(obj, open("tmp.txt", "w"))

# do something else...

# Read the obj object back from tmp.txt and restore it
obj2 = pickle.load(open("tmp.txt", "r"))
print obj2
However, in practical applications we may make some improvements, such as replacing pickle with cPickle. The latter is a C implementation of the former and is faster. In addition, the third parameter is sometimes set to True during dump to select the more compact binary format. Let's take a look at the following example:
# -*- coding: utf-8 -*-
import cPickle as pickle
import random
import os
import time

LENGTH = 1024 * 10240


def main():
    d = {}
    a = []
    for i in range(LENGTH):
        a.append(random.randint(0, 255))
    d["a"] = a

    print "dumping..."
    t1 = time.time()
    pickle.dump(d, open("tmp1.dat", "wb"), True)
    print "dump1: %.3fs" % (time.time() - t1)

    t1 = time.time()
    pickle.dump(d, open("tmp2.dat", "w"))
    print "dump2: %.3fs" % (time.time() - t1)

    s1 = os.stat("tmp1.dat").st_size
    s2 = os.stat("tmp2.dat").st_size
    print "%d, %d, %.2f%%" % (s1, s2, 100.0 * s1 / s2)

    print "loading..."
    t1 = time.time()
    obj1 = pickle.load(open("tmp1.dat", "rb"))
    print "load1: %.3fs" % (time.time() - t1)

    t1 = time.time()
    obj2 = pickle.load(open("tmp2.dat", "r"))
    print "load2: %.3fs" % (time.time() - t1)


if __name__ == "__main__":
    main()
The execution result on my computer is:
dumping...
dump1: 1.297s
dump2: 4.750s
20992503, 68894198, 30.47%
loading...
load1: 2.797s
load2: 10.125s
We can see that when protocol is set to True during dump, the resulting file is only about 30% of the size of the file written in the default ASCII format, and both dump and load take less time. We therefore recommend setting this value to True.
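The size difference can also be checked on a much smaller scale. The sketch below is not the benchmark above: it assumes Python 3, uses dumps on a list of 10,000 values instead of writing files, and passes protocol 1 explicitly, which is the value True stands for. No timings are measured, and the exact ratio depends on the Python version.

```python
import pickle
import random

random.seed(0)  # fixed seed so repeated runs serialize the same data
values = [random.randint(0, 255) for _ in range(10000)]

text_size = len(pickle.dumps(values, 0))    # protocol 0: ASCII format
binary_size = len(pickle.dumps(values, 1))  # protocol 1: binary format
print("%d, %d, %.2f%%" % (binary_size, text_size, 100.0 * binary_size / text_size))
```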
In addition, the pickle module provides the dumps and loads functions. Their usage is similar to the dump and load functions above, except that no file parameter is needed: dumps returns the serialized data as a string, and loads takes such a string as input. These two functions may be more convenient in some scenarios.
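A round trip with dumps and loads might look like the following sketch. It assumes Python 3, where dumps returns a bytes object rather than a str.

```python
import pickle

record = {"a": 1, "b": 2, "c": 3}

blob = pickle.dumps(record)  # serialize to bytes, no file involved
clone = pickle.loads(blob)   # rebuild an equal but separate object

print(clone == record, clone is record)
```

The restored object compares equal to the original but is a new object, so changing one does not affect the other.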