Python Serialization-Review

Source: Internet
Author: User
Tags: serialization

Serialization

From https://www.liaoxuefeng.com/

While a program is running, all of its variables live in memory. For example, define a dict:
d = dict(name='Bob', age=20, score=88)

A variable can be modified at any time, for example by changing name to 'Bill', but once the program ends, all the memory used by its variables is reclaimed by the operating system. If the modified 'Bill' is not saved to disk, the next run of the program will initialize the variable to 'Bob' again.

The process of turning a variable in memory into a form that can be stored or transmitted is called serialization. In Python it is called pickling; other languages call it serialization, marshalling, flattening, and so on.

After serialization, the serialized content can be written to disk or transferred over the network to another machine.

Conversely, reading the variable contents from the serialized object back into memory is called deserialization, i.e. unpickling.

Python 2 provides two modules for serialization: cPickle and pickle. The two modules offer the same functionality; the difference is that cPickle is written in C and is fast, while pickle is written in pure Python and is slower. This is the same relationship as between cStringIO and StringIO. When using them, try to import cPickle first and fall back to pickle if that fails:

try:
    import cPickle as pickle
except ImportError:
    import pickle

First, let's serialize an object and write it to a file:

>>> d = dict(name='Bob', age=20, score=88)
>>> pickle.dumps(d)
"(dp0\nS'age'\np1\nI20\nsS'score'\np2\nI88\nsS'name'\np3\nS'Bob'\np4\ns."

The pickle.dumps() method serializes an object into a str, which can then be written to a file. Alternatively, pickle.dump() serializes the object and writes it directly to a file-like object:

>>> f = open('dump.txt', 'wb')
>>> pickle.dump(d, f)
>>> f.close()

Look at the dump.txt file that was written: it is a jumble of characters, which is the object's internal information as saved by Python.

To read the object from disk back into memory, we can first read the content into a str and deserialize it with pickle.loads(), or call pickle.load() to deserialize the object directly from a file-like object. Let's open another Python command line and deserialize the object we just saved:

>>> f = open('dump.txt', 'rb')
>>> d = pickle.load(f)
>>> f.close()
>>> d
{'age': 20, 'score': 88, 'name': 'Bob'}

The contents of the variable are back!

Of course, this variable and the original variable are completely unrelated objects; they merely have the same content.
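The same round trip can be done entirely in memory. The following is a minimal sketch, assuming Python 3 (where cPickle no longer exists as a separate module and pickle.dumps() returns bytes rather than str); the variable names are illustrative only. It shows that the restored object equals the original but is not the same object:

```python
import pickle

d = dict(name='Bob', age=20, score=88)

# Serialize to an in-memory byte string, then rebuild a new object from it.
data = pickle.dumps(d)
restored = pickle.loads(data)

same_content = restored == d      # equal content
same_object = restored is d       # but a distinct object in memory
```

Here same_content is True and same_object is False, so changing restored afterwards does not affect d.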

Pickle has the same problem as every other language-specific serialization mechanism: it can only be used in Python, and different versions of Python may even be incompatible with each other. So pickle should only be used for unimportant data, where failing to deserialize it does not matter.
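One way to limit incompatibility between Python versions is to choose the pickle protocol explicitly instead of relying on the default. The sketch below assumes Python 3 and uses only pickle.dumps(), pickle.loads(), and pickle.HIGHEST_PROTOCOL. Which protocol numbers exist depends on the interpreter that runs it, so no specific output is shown:

```python
import pickle

d = dict(name='Bob', age=20, score=88)

# Try every protocol this interpreter supports, from 0 (the oldest,
# ASCII-based format) up to pickle.HIGHEST_PROTOCOL, and record whether
# the round trip succeeds.
results = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(d, protocol=protocol)
    results[protocol] = pickle.loads(data) == d

# Protocol 2 is the newest protocol that Python 2 can also read, so it is
# a common choice when both interpreters must share a file.
portable = pickle.dumps(d, protocol=2)
```

Pinning the protocol only addresses the file format. Classes that were renamed or changed between the writer and the reader can still fail to unpickle.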

JSON

If we need to pass objects between different programming languages, we have to serialize them into a standard format such as XML. A better choice is JSON, because JSON is represented as a string that can be read by all languages and is easily stored to disk or transmitted over a network. JSON is not only a standard format but also faster than XML, and it can be read directly in a web page, which is very convenient.

The objects that JSON represents are standard JavaScript objects. JSON types correspond to Python's built-in data types as follows:

JSON type     Python type
{}            dict
[]            list
"string"      'str' or u'unicode'
1234.56       int or float
true/false    True/False
null          None
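The table can be checked directly. A minimal sketch, assuming Python 3 (where every JSON string becomes str and there is no separate unicode type); the key names in the sample document are made up for illustration:

```python
import json

# One JSON document that exercises every row of the table.
text = ('{"obj": {}, "arr": [], "s": "string", '
        '"i": 1234, "f": 1234.56, "t": true, "n": null}')

data = json.loads(text)

# Map each key to the name of the Python type it was converted to.
types = {key: type(value).__name__ for key, value in data.items()}
```

Under these assumptions, types maps obj to dict, arr to list, s to str, i to int, f to float, t to bool, and n to NoneType.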

Python's built-in json module provides very complete conversion from Python objects to JSON. Let's look at how to turn a Python object into JSON:

>>> import json
>>> d = dict(name='Bob', age=20, score=88)
>>> json.dumps(d)
'{"age": 20, "score": 88, "name": "Bob"}'

The dumps() method returns a str whose content is standard JSON. Similarly, the dump() method writes JSON directly to a file-like object.
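As a small illustration of the file-like variant, the sketch below writes to an in-memory io.StringIO buffer instead of a real file. It assumes Python 3, and the buffer simply stands in for an object returned by open() in text mode:

```python
import io
import json

d = dict(name='Bob', age=20, score=88)

# io.StringIO is a file-like object that keeps its contents in memory.
buffer = io.StringIO()
json.dump(d, buffer)

# The text written by dump() is the same text that dumps() returns.
written = buffer.getvalue()
same = written == json.dumps(d)
```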

To deserialize JSON into a Python object, use loads() or the corresponding load() method: the former deserializes a JSON string, while the latter reads the string from a file-like object and deserializes it:

>>> json_str = '{"age": 20, "score": 88, "name": "Bob"}'
>>> json.loads(json_str)
{u'age': 20, u'score': 88, u'name': u'Bob'}

Note that all deserialized string objects are unicode by default, not str. Because the JSON standard specifies UTF-8 as the JSON encoding, we can always convert correctly between Python str or unicode and JSON strings.
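A short sketch of how non-ASCII text survives the round trip, assuming Python 3, where the str/unicode distinction no longer exists and every deserialized string is str. The sample name is illustrative; ensure_ascii is a documented parameter of json.dumps():

```python
import json

original = {'name': '小明'}

# By default, non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(original)
# escaped == '{"name": "\\u5c0f\\u660e"}'

# With ensure_ascii=False the characters are kept as they are; encode the
# result as UTF-8 when writing it to a file or socket.
readable = json.dumps(original, ensure_ascii=False)
utf8_bytes = readable.encode('utf-8')

# Both forms deserialize back to the same Python object.
from_escaped = json.loads(escaped)
from_utf8 = json.loads(utf8_bytes.decode('utf-8'))
```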

JSON advanced

A Python dict can be serialized directly into a JSON {}. Often, however, we prefer to represent objects with a class, for example by defining a Student class and then serializing an instance:

import json

class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

s = Student('Bob', 20, 88)
print(json.dumps(s))

Run the code and you mercilessly get a TypeError:

Traceback (most recent call last):
  ...
TypeError: <__main__.Student object at 0x10aabef50> is not JSON serializable

The reason for the error is that the Student object is not an object that can be serialized as JSON.

If even class instances could not be serialized as JSON, that would certainly be unreasonable!

Don't worry. Take a closer look at the parameter list of the dumps() method: besides the first required parameter obj, dumps() provides a whole set of optional parameters:

https://docs.python.org/2/library/json.html#json.dumps

These optional parameters let us customize JSON serialization. The previous code could not serialize the Student instance to JSON because, by default, dumps() does not know how to turn a Student instance into a JSON {} object.

The optional parameter default converts any object into one that can be serialized as JSON. We just need to write a conversion function for Student and pass it in:

def student2dict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }

print(json.dumps(s, default=student2dict))

This way, the Student instance is first converted into a dict by the student2dict() function and then serialized into JSON.
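Besides passing default, the json module also lets you subclass json.JSONEncoder, override its default() method, and pass the subclass through the cls parameter. This is a minimal sketch of that alternative, assuming Python 3; StudentEncoder is a hypothetical name, and sort_keys=True is used only to make the key order predictable:

```python
import json

class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

class StudentEncoder(json.JSONEncoder):
    """Hypothetical encoder that knows how to convert Student instances."""

    def default(self, obj):
        if isinstance(obj, Student):
            return {'name': obj.name, 'age': obj.age, 'score': obj.score}
        # For any other type, let the base class raise TypeError.
        return super().default(obj)

s = Student('Bob', 20, 88)
text = json.dumps(s, cls=StudentEncoder, sort_keys=True)
# text == '{"age": 20, "name": "Bob", "score": 88}'
```

An encoder class keeps the conversion rules in one reusable place, while a plain default function is shorter for a one-off call.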

However, if we next encounter an instance of a Teacher class, it still cannot be serialized to JSON. We can take a shortcut and turn any class instance into a dict:

print(json.dumps(s, default=lambda obj: obj.__dict__))

This works because an instance of a class usually has a __dict__ attribute, which is the dict that stores the instance variables. There are a few exceptions, such as classes that define __slots__.
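The __slots__ exception can be handled with a slightly longer conversion function. The following is only a sketch under stated assumptions: it targets Python 3, to_serializable and Point are hypothetical names, and it reads only the __slots__ declared on the instance's own class, not those of base classes:

```python
import json

class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

class Point(object):
    # __slots__ suppresses the per-instance __dict__.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

def to_serializable(obj):
    """Hypothetical helper for the default parameter of json.dumps()."""
    if hasattr(obj, '__dict__'):
        return obj.__dict__
    slots = getattr(type(obj), '__slots__', None)
    if slots is not None:
        if isinstance(slots, str):   # __slots__ may be a single string
            slots = (slots,)
        return {name: getattr(obj, name)
                for name in slots if hasattr(obj, name)}
    # Follow the json convention: signal unsupported types with TypeError.
    raise TypeError('%r is not JSON serializable' % (obj,))

student_json = json.dumps(Student('Bob', 20, 88),
                          default=to_serializable, sort_keys=True)
point_json = json.dumps(Point(1, 2),
                        default=to_serializable, sort_keys=True)
# student_json == '{"age": 20, "name": "Bob", "score": 88}'
# point_json == '{"x": 1, "y": 2}'
```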

Similarly, to deserialize JSON into a Student instance, the loads() method first converts the JSON into a dict, and then the object_hook function we pass in is responsible for converting the dict into a Student instance:

def dict2student(d):
    return Student(d['name'], d['age'], d['score'])

json_str = '{"age": 20, "score": 88, "name": "Bob"}'
print(json.loads(json_str, object_hook=dict2student))

The results of the operation are as follows:

<__main__.Student object at 0x10cd3c190>

What is printed is the deserialized Student instance.
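One detail worth knowing is that object_hook is called for every JSON object in the input, including nested ones, so a hook that assumes every dict is a Student fails on any other object. The sketch below assumes Python 3; the key check and the sample document (a class with a list of students) are illustrative choices, not part of the original example:

```python
import json

class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

def dict2student(d):
    # Convert only dicts that have exactly the Student fields;
    # return every other dict unchanged.
    if set(d) == {'name', 'age', 'score'}:
        return Student(d['name'], d['age'], d['score'])
    return d

json_str = ('{"class": "A", "students": '
            '[{"age": 20, "score": 88, "name": "Bob"}]}')
result = json.loads(json_str, object_hook=dict2student)

first = result['students'][0]   # a Student instance
outer_type = type(result)       # the outer object stays a plain dict
```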

Summary

Python's language-specific serialization module is pickle, but if you want serialization to be more generic and more web-friendly, use the json module.

The dumps() and loads() functions of the json module are examples of very well-designed interfaces. In normal use we only need to pass in one required parameter. When the default serialization or deserialization mechanism does not meet our requirements, we can pass in more parameters to customize the rules. The interface is therefore both simple to use and fully extensible and flexible.
