Details about serialization and deserialization in Python

Source: Internet
Author: User

This article introduces serialization and deserialization in Python, focusing on the pickle and cPickle modules.

The marshal module can also be used for serialization and deserialization, but it is limited. It supports only some built-in data types and cannot handle custom data types. It also does not support self-referencing (recursive) objects. Using marshal directly for serialization and deserialization is therefore inconvenient. Fortunately, the Python standard library provides the more powerful pickle and cPickle modules.

The cPickle module is implemented in C, so it runs faster than pickle. However, the types defined in cPickle cannot be subclassed (in most cases there is no need to subclass them). cPickle and pickle follow the same serialization rules, so an object serialized with pickle can be deserialized with cPickle. Both modules also handle self-referencing types sensibly: they do not recurse endlessly into a self-referencing object, and an object that is referenced several times is serialized only once. For example:

    import marshal, pickle

    lst = [1]
    lst.append(lst)              # the list now contains itself
    byt1 = marshal.dumps(lst)    # Error: unbounded recursive serialization
    byt2 = pickle.dumps(lst)     # No problem
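The point about shared and self-referencing objects can be checked with a short round trip. The following is a minimal sketch, assuming Python 3 (where the C implementation is used automatically by pickle); the variable names are illustrative and not from the original article.

```python
import pickle

# One list referenced twice from the same container.
shared = [1, 2, 3]
container = [shared, shared]

# A list that contains itself.
loop = [1]
loop.append(loop)

restored = pickle.loads(pickle.dumps(container))
restored_loop = pickle.loads(pickle.dumps(loop))

# The shared list is stored once, so both slots point to the same object again.
print(restored[0] is restored[1])
# The self-reference survives the round trip as well.
print(restored_loop[1] is restored_loop)
```

Both checks print True, because pickle records each object once and writes a back-reference for every later occurrence.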

Pickle serialization rules

The pickle serialization format is specific to Python, and it is designed to stay backward compatible, so there is little need to worry about serialization compatibility between Python versions. By default, pickle serializes to a text-based format, and the serialized result can be viewed directly in a text editor. Data can also be serialized to a binary format, which produces a smaller result. For more details, refer to the pickle module in the Python manual.

Let's start using pickle.

pickle.dump(obj, file[, protocol])

Serializes the object and writes the resulting data stream to the file object. The protocol parameter selects the serialization mode. The default value is 0 (in Python 2), which means serialization in text format. The value of protocol can also be 1 or 2, which means serialization in binary format.
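The difference between the text and binary protocols can be seen by dumping the same object twice. This is a minimal sketch, assuming Python 3, where io.BytesIO takes the place of StringIO and the default protocol is a binary one, so protocol 0 is requested explicitly. The sample data and the dump_with_protocol helper are hypothetical.

```python
import io
import pickle

# Hypothetical sample data, not from the original article.
record = {"name": "JGood", "scores": list(range(50))}

def dump_with_protocol(obj, protocol):
    """Serialize obj into an in-memory file and return the raw bytes."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol)
    return buf.getvalue()

text_form = dump_with_protocol(record, 0)    # text-based protocol
binary_form = dump_with_protocol(record, 2)  # binary protocol

# Protocol 0 output for this data is plain ASCII; protocol 2 output is shorter.
print(len(text_form), len(binary_form))
```

For this data the protocol 2 result is noticeably shorter, while the protocol 0 result can be opened in a text editor.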

pickle.load(file)

Deserializes an object: parses the data in the file into a Python object. The following simple example demonstrates both functions:

    # coding=gbk

    import pickle, StringIO

    class Person(object):
        '''Custom type.
        '''
        def __init__(self, name, address):
            self.name = name
            self.address = address

        def display(self):
            print 'name:', self.name, 'address:', self.address

    jj = Person("JGood", "Hangzhou, China")
    jj.display()
    file = StringIO.StringIO()

    pickle.dump(jj, file, 0)    # Serialization
    # print file.getvalue()     # Print the serialized result

    # del Person    # Deserialization must be able to find the class definition; otherwise it fails.
    file.seek(0)
    jj1 = pickle.load(file)     # Deserialization
    jj1.display()
    file.close()

Note: During deserialization, the definition of the corresponding class must be available; otherwise, deserialization fails. In the preceding example, if you uncomment the del Person line, an AttributeError is raised at runtime, indicating that the Person definition cannot be found in the current module.
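The reason is that pickle stores a reference to the class (its module and name) rather than the class code itself. The following is a minimal sketch of the failure, assuming Python 3 and a script run as the main module; the Person class here is a simplified stand-in for the one in the example.

```python
import pickle

class Person:
    """Simplified stand-in for the article's custom type."""
    def __init__(self, name, address):
        self.name = name
        self.address = address

data = pickle.dumps(Person("JGood", "Hangzhou, China"))

# While the class is still defined, loading works.
restored = pickle.loads(data)

# Remove the definition to simulate loading in a program that lacks the class.
del Person

try:
    pickle.loads(data)
    load_failed = False
except AttributeError as exc:
    load_failed = True
    print("Deserialization failed:", exc)
```

In practice this means the module that defines the class must be importable in whatever program loads the data.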

pickle.dumps(obj[, protocol])

pickle.loads(string)

We can also obtain the serialized data stream directly, or deserialize directly from a data stream, using dumps and loads. dumps returns the serialized data stream, and loads returns the deserialized object.
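A round trip through dumps and loads might look like the following. This is a small sketch, assuming Python 3, where dumps returns a bytes object; the settings dictionary is made up for illustration.

```python
import pickle

# Hypothetical data to serialize.
settings = {"host": "localhost", "ports": [8000, 8001], "debug": True}

payload = pickle.dumps(settings)   # serialized data stream, held in memory
restored = pickle.loads(payload)   # a new object rebuilt from that stream

print(type(payload).__name__)
# The copy is equal to the original but is a separate object.
print(restored == settings, restored is settings)
```

Because no file object is involved, this form is convenient when the data is sent over a network or stored in a database field.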

The pickle module also defines two classes for serializing and deserializing objects.

class pickle.Pickler(file[, protocol])

This class is used to serialize objects. The file parameter is a file-like object that stores the serialized result. The optional protocol parameter specifies the serialization mode. The class defines two methods:

dump(obj)

Serializes the object and writes it to the file-like object. The obj parameter is the object to serialize.

clear_memo()

Clears the pickler's "memo". While a Pickler instance serializes objects, it remembers which objects it has already serialized, so when dump(obj) is called several times on the same object, the pickler does not serialize it in full again. The following is a simple example:

    # coding=gbk
    import pickle, StringIO

    class Person(object):
        '''Custom type.
        '''
        def __init__(self, name, address):
            self.name = name
            self.address = address

        def display(self):
            print 'name:', self.name, 'address:', self.address

    fle = StringIO.StringIO()
    pick = pickle.Pickler(fle)
    person = Person("JGood", "Hangzhou China")

    pick.dump(person)
    val1 = fle.getvalue()
    print len(val1)

    pick.clear_memo()    # Comment out this line and compare the results.

    pick.dump(person)    # Serialize the same object again
    val2 = fle.getvalue()
    print len(val2)

    # ---- Result ----
    # 148
    # 296
    #
    # With pick.clear_memo() commented out:
    # 148
    # 152
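The same memo behavior can be reproduced with built-in types. The sketch below assumes Python 3 with io.BytesIO and protocol 2; the dump_twice helper is hypothetical, and the exact byte counts will differ from the figures above.

```python
import io
import pickle

data = list(range(100))

def dump_twice(clear_between):
    """Dump the same list twice with one Pickler; return the size of each dump."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    pickler.dump(data)
    first = len(buf.getvalue())
    if clear_between:
        pickler.clear_memo()
    pickler.dump(data)
    second = len(buf.getvalue()) - first
    return first, second

# With the memo intact, the second dump is only a short back-reference.
print(dump_twice(clear_between=False))
# After clear_memo(), the list is written out in full again.
print(dump_twice(clear_between=True))
```

A stream written without clearing the memo should be read back with a single Unpickler, since the back-reference only makes sense together with the first dump.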

class pickle.Unpickler(file)

This class is used to deserialize objects. The file parameter is a file-like object from which Unpickler reads the data to deserialize.

load()

Deserializes an object. This method automatically selects the appropriate deserialization mode based on the serialized data stream.

    # ... continues the code from the previous example

    fle.seek(0)
    unpick = pickle.Unpickler(fle)
    print unpick.load()
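An Unpickler can also read several objects from one stream by calling load() repeatedly. The following is a minimal sketch, assuming Python 3 and an in-memory io.BytesIO stream; the two sample objects are made up for illustration.

```python
import io
import pickle

buf = io.BytesIO()
pickle.dump({"id": 1}, buf)      # first object in the stream
pickle.dump(["a", "b"], buf)     # second object in the same stream

buf.seek(0)
unpickler = pickle.Unpickler(buf)

items = []
while True:
    try:
        items.append(unpickler.load())
    except EOFError:
        # Raised when no more pickled data is left in the stream.
        break

print(items)
```

Each call to load() consumes exactly one pickled object, so the loop ends once the stream is exhausted.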

The sections above introduce the basic usage of the pickle module. However, as with marshal, not every type can be serialized by pickle. In the Python 2 pickle described here, for example, serializing an instance of a nested class fails:

    import pickle

    class A(object):
        class B(object):    # class nested inside A
            def __init__(self, name):
                self.name = name

        def __init__(self):
            print 'init'

    b = A.B("my name")
    print b
    c = pickle.dumps(b, 0)    # Fails
    print pickle.loads(c)

For details about the serialization types supported by pickle, refer to the Python manual.
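A quick way to find out whether a particular object is supported is simply to try serializing it. The sketch below assumes Python 3; can_pickle is a hypothetical helper, and the samples (a generator and a lambda) are types that pickle is known to reject.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

samples = {
    "list": [1, 2, 3],
    "generator": (n for n in range(3)),
    "lambda": lambda x: x + 1,
}

results = {name: can_pickle(obj) for name, obj in samples.items()}
print(results)
```

Only the list is reported as picklable here; the generator and the lambda are rejected.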
