A detailed explanation of the use of serialization and deserialization in Python

Source: Internet
Author: User
We have already seen that the marshal module can be used for serialization and deserialization, but its functionality is limited: it supports only some built-in data types, cannot handle user-defined types, and cannot serialize self-referencing (recursive) objects. Using marshal directly is therefore often inconvenient. Fortunately, the Python standard library provides the more powerful pickle and cPickle modules.

The cPickle module is implemented in C, so it runs faster than pickle. (cPickle exists only in Python 2; in Python 3 the pickle module uses its C implementation automatically.) However, the types defined in cPickle cannot be subclassed, although in practice we rarely need to subclass them. cPickle and pickle follow the same serialization rules, so an object serialized with pickle can be deserialized with cPickle. Both modules are also "smarter" when handling self-referencing types: they do not recurse without limit on self-referencing objects, and an object that is referenced several times is serialized only once. For example:

    import marshal, pickle

    lst = [1]
    lst.append(lst)            # the list now contains itself
    byt1 = marshal.dumps(lst)  # error: unbounded recursive serialization
    byt2 = pickle.dumps(lst)   # no problem
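The handling of self-references and shared references described above can be checked with a short round trip. This is a minimal sketch that assumes Python 3 (the article's own examples target Python 2); the variable names are chosen only for illustration.

```python
import pickle

# A list that contains itself.
nested = [1]
nested.append(nested)

restored = pickle.loads(pickle.dumps(nested))
print(restored[1] is restored)   # True: the self-reference survives the round trip

# Two references to the same object are serialized once and stay shared.
shared = [0]
pair = [shared, shared]
copy = pickle.loads(pickle.dumps(pair))
print(copy[0] is copy[1])        # True
```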
Serialization rules for pickle

The pickle data format is specific to Python, and its serialization rules are defined by Python itself, so there is no need to worry about serialization compatibility between different versions of Python. By default in Python 2 (protocol 0), pickle output is text-based and can be viewed directly in a text editor. Data can also be serialized in a binary format, which produces smaller output. For details, refer to the pickle module in the Python manual.
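To make the difference between the text and binary formats concrete, the sketch below pickles the same data with protocol 0 and protocol 2 and compares the sizes. It assumes Python 3, where protocol 0 must be requested explicitly because it is no longer the default; the sample data is invented for illustration, and exact byte counts depend on the Python version, so none are shown.

```python
import pickle

# Sample data, made up for this illustration.
data = {"numbers": list(range(100)), "name": "Jgood"}

text_form = pickle.dumps(data, protocol=0)    # text-based format
binary_form = pickle.dumps(data, protocol=2)  # binary format

print(text_form[:40])                    # the start of the text-based stream
print(len(text_form), len(binary_form))  # the binary form is the smaller one

# Both streams decode to the same object.
print(pickle.loads(text_form) == pickle.loads(binary_form) == data)
```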

Let's start using pickle.
pickle.dump(obj, file[, protocol])

Serializes the object and writes the resulting data stream to the file object. The protocol parameter selects the serialization format; its default value is 0, which produces a text representation. It can also be 1 or 2, which produce a binary representation.
pickle.load(file)

Deserializes an object: parses the data in the file into a Python object. Here is a simple example that illustrates these two functions:

    import pickle, StringIO

    class Person(object):
        """Custom type."""
        def __init__(self, name, address):
            self.name = name
            self.address = address

        def display(self):
            print 'Name:', self.name, 'Address:', self.address

    jj = Person("Jgood", "Hangzhou, China")
    jj.display()
    file = StringIO.StringIO()
    pickle.dump(jj, file, 0)  # serialize
    # print file.getvalue()   # print the serialized result
    # del Person  # when deserializing, the class definition must be found; otherwise deserialization fails
    file.seek(0)
    jj1 = pickle.load(file)   # deserialize
    jj1.display()
    file.close()

Note: When deserializing, pickle must be able to find the definition of the corresponding class; otherwise deserialization fails. In the example above, if you uncomment the "del Person" line, an AttributeError is raised at run time, indicating that the definition of Person cannot be found in the current module.
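The note above can be reproduced with the following sketch. It assumes Python 3, where pickled data is bytes and io.BytesIO takes the place of StringIO; the class mirrors the article's Person, and error_name is a variable introduced only to record the outcome.

```python
import io
import pickle

class Person:
    """Custom type used only for this illustration."""
    def __init__(self, name, address):
        self.name = name
        self.address = address

buf = io.BytesIO()   # pickles are bytes in Python 3, so a bytes buffer is needed
pickle.dump(Person("Jgood", "Hangzhou, China"), buf, protocol=0)

buf.seek(0)
restored = pickle.load(buf)   # works: the class definition can be found
print(restored.name, restored.address)

del Person           # remove the class definition from the module
buf.seek(0)
try:
    pickle.load(buf)
    error_name = None
except AttributeError as exc:   # the class can no longer be looked up
    error_name = type(exc).__name__
print(error_name)
```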
pickle.dumps(obj[, protocol])
pickle.loads(string)

We can also obtain the serialized data stream directly, or deserialize directly from a data stream. The dumps and loads functions do exactly this: dumps returns the serialized data stream, and loads returns the object reconstructed from it.
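A short sketch of how dumps and loads relate to dump and load, assuming Python 3 (where the data stream is a bytes object rather than a str). The record dictionary is invented for illustration. For the same object and protocol, dumps should return the same bytes that dump writes to a file-like object.

```python
import io
import pickle

# Sample data, made up for this illustration.
record = {"name": "Jgood", "tags": ["a", "b"], "count": 3}

data = pickle.dumps(record, protocol=2)   # serialized stream as bytes
restored = pickle.loads(data)             # object rebuilt from the stream

# Writing to an in-memory file with dump yields the same stream.
buf = io.BytesIO()
pickle.dump(record, buf, protocol=2)
same_stream = buf.getvalue() == data

print(type(data).__name__, restored == record, same_stream)
```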

The pickle module also defines two classes, used to serialize and deserialize objects respectively.
class pickle.Pickler(file[, protocol]):

This class is used to serialize objects. The file parameter is a file-like object that receives the serialized result. The optional protocol parameter selects the serialization format. The class defines two methods:
dump(obj):

Serializes the object and writes it to the file-like object. The obj parameter is the object to serialize.
clear_memo()

Clears the Pickler's "memo". When a Pickler instance serializes an object, it remembers the object references it has already serialized, so when dump(obj) is called several times for the same object, the Pickler does not serialize it in full again. The following is a simple example:

    import pickle, StringIO

    class Person(object):
        """Custom type."""
        def __init__(self, name, address):
            self.name = name
            self.address = address

        def display(self):
            print 'Name:', self.name, 'Address:', self.address

    fle = StringIO.StringIO()
    pick = pickle.Pickler(fle)
    person = Person('Jgood', 'Hangzhou China')

    pick.dump(person)
    val1 = fle.getvalue()
    print len(val1)

    pick.clear_memo()  # comment out this line and compare the results
    pick.dump(person)  # serialize the same object again
    val2 = fle.getvalue()
    print len(val2)

    # ---- Results ----
    # 148
    # 296
    #
    # With pick.clear_memo() commented out, the results are:
    # 148
    # 152

class pickle.Unpickler(file):

This class is used to deserialize objects. The file parameter is a file-like object from which the Unpickler reads the data to deserialize.
load():

Deserializes an object. The method automatically selects the appropriate deserialization mode based on the serialized data stream.

    # ... continuing from the code in the previous example
    fle.seek(0)
    unpick = pickle.Unpickler(fle)
    print unpick.load()
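The memo behaviour of Pickler and the use of Unpickler can also be sketched for Python 3. This is a minimal illustration, assuming io.BytesIO in place of StringIO; the helper dumped_sizes is hypothetical and exists only for this comparison. Exact byte counts vary with the Python version and protocol, so none are shown.

```python
import io
import pickle

class Person:
    """Custom type used only for this illustration."""
    def __init__(self, name, address):
        self.name = name
        self.address = address

def dumped_sizes(clear_memo):
    """Dump one object twice; return the stream size after each dump."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    person = Person("Jgood", "Hangzhou China")
    pickler.dump(person)
    first = len(buf.getvalue())
    if clear_memo:
        pickler.clear_memo()   # forget which objects were already written
    pickler.dump(person)
    second = len(buf.getvalue())
    return first, second

with_clear = dumped_sizes(clear_memo=True)    # second dump is a full copy
with_memo = dumped_sizes(clear_memo=False)    # second dump is a short back-reference
print(with_clear, with_memo)

# Reading an object back with Unpickler.
buf = io.BytesIO()
pickle.Pickler(buf, protocol=2).dump(Person("Jgood", "Hangzhou China"))
buf.seek(0)
restored = pickle.Unpickler(buf).load()
print(restored.name)
```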

The basic use of the pickle module is described above, but as with marshal, not every type can be serialized by pickle. For example, in the Python 2 versions this article targets, serializing an instance of a nested class fails:

    import pickle

    class A(object):
        class B(object):
            def __init__(self, name):
                self.name = name

        def __init__(self):
            print 'init A'

    b = A.B("my name")
    print b
    c = pickle.dumps(b, 0)  # fails
    print pickle.loads(c)
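Which objects pickle rejects differs between Python versions, so a program can probe an object instead of assuming. The sketch below assumes Python 3; can_pickle is a hypothetical helper, not part of the pickle module, and the exception types it catches are the ones commonly raised for unpicklable objects.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(can_pickle([1, 2, 3]))         # True: built-in containers are supported
print(can_pickle(lambda x: x))       # False: a lambda cannot be looked up by name
print(can_pickle(n for n in "ab"))   # False: generators are not picklable
```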

For the types that pickle can serialize, refer to the Python manual.
