Serialization and deserialization of data in Python (JSON processing)

Source: Internet
Author: User

Concept:

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent, but it uses conventions familiar to programmers of the C family of languages (C, C++, C#, Java, JavaScript, Perl, Python, and so on). These properties make JSON an ideal data-interchange language.

JSON is built on two structures:

    • A collection of name/value pairs. In different languages this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
    • An ordered list of values. In most languages this is realized as an array.

These are common data structures. Virtually all modern programming languages support them in one form or another, which makes it possible to exchange data in this format between languages that are built on the same structures.
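As a small illustration of the two structures, the sketch below parses a JSON text containing one object and one array using Python's standard json module. The sample text and the variable names are made up for this example.

```python
import json

# A JSON object (name/value pairs) that contains a JSON array (ordered values).
text = '{"name": "Peter", "scores": [90, 85, 77]}'

value = json.loads(text)

# The object is realized as a dict and the array as a list.
print(type(value).__name__)
print(type(value["scores"]).__name__)
```

In Python the object arrives as a dict and the array as a list; other languages would map them to their own equivalents.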

Serialization: the process of converting an object's state into a form that can be stored or transmitted over a network, such as JSON or XML. Deserialization is the reverse: the serialized state is read back from storage (JSON, XML) and the object is re-created from it.

The json module was added to the standard library in Python 2.6, so no additional download is needed. In the json module, serialization and deserialization are called encoding and decoding:

encoding: converts a Python object into a JSON string
decoding: converts a JSON string into a Python object
Simple data types (str, and unicode in Python 2, int, float, list, tuple, dict) can be processed directly.
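A minimal sketch of encoding and decoding with the two functions used throughout this article, json.dumps and json.loads. The record contents are invented for illustration.

```python
import json

record = {"name": "Peter", "age": 22, "tags": ["a", "b"]}

encoded = json.dumps(record)   # encoding: Python object -> JSON string
decoded = json.loads(encoded)  # decoding: JSON string -> Python object

print(type(encoded).__name__, encoded)
print(decoded == record)
```

Because this record uses only strings, integers, a list, and string keys, the decoded value compares equal to the original.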

For a detailed explanation of JSON, see: http://json.org/json-zh.html

Python standard library reference for the json module: https://docs.python.org/3/library/json.html

1. Encoding and decoding simple data types:

For simple types, the encoded output is very similar to the original repr() output, but some types differ: a tuple is converted to a list, and an integer dictionary key such as 123 is converted to a string. During encoding, each Python type is converted to a JSON type as follows:

    Python              JSON
    dict                object
    list, tuple         array
    str                 string
    int, float          number
    True                true
    False               false
    None                null
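The sketch below compares repr() with the encoded output for a small list. The sample values are assumptions chosen to hit the cases mentioned above (a tuple, an integer key, None, and True).

```python
import json

obj = [[1, 2, 3], 123, 123.123, "abc", {"key1": (1, 2, 3), 123: "value"}, None, True]

encoded = json.dumps(obj)

print("repr :", repr(obj))
print("json :", encoded)
```

In the JSON line the tuple appears as an array, the integer key 123 appears as the string "123", and None and True appear as null and true.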

The json.dumps() function returns a str object, encodedjson; the json.loads() function decodes encodedjson to recover the original data.

Note that the key in decodedjson[4]['123'] is still a string and is not converted back to its original type (int), so it is advisable to use plain strings as dictionary keys.
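A hedged sketch of that round trip. The list below is an assumed stand-in for the article's data, arranged so that element 4 is a dict with the integer key 123.

```python
import json

original = [[1, 2, 3], 123, 123.123, "abc", {"key1": (1, 2, 3), 123: "value"}]

encodedjson = json.dumps(original)
decodedjson = json.loads(encodedjson)

print(decodedjson[4]["123"])    # the key is now the string "123"
print(123 in decodedjson[4])    # the integer key no longer exists
print(decodedjson == original)  # the round trip is not lossless here
```

The decoded value differs from the original in two ways: the integer key came back as a string, and the tuple came back as a list.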

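The type conversions from JSON to Python are as follows:

    JSON                Python
    object              dict
    array               list
    string              str
    number (int)        int
    number (real)       float
    true                True
    false               False
    null                None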
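To see the decoding conversions in practice, the following sketch decodes one value of each JSON type and prints the resulting Python type. The JSON text is invented for the example.

```python
import json

text = (
    '{"obj": {"k": 1}, "arr": [1, 2], "s": "x", '
    '"i": 7, "r": 1.5, "t": true, "f": false, "n": null}'
)

decoded = json.loads(text)

# Print each name together with the Python type json.loads produced for it.
for key, value in decoded.items():
    print(key, type(value).__name__, repr(value))
```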

json.dumps offers many useful parameters. The most commonly used are sort_keys (sorts dict keys in the output; a dict does not keep its keys in sorted order, and before Python 3.7 its iteration order was not guaranteed at all), separators, and indent.

Sorting makes the stored data easier to inspect and makes JSON output comparable.

Two dicts such as data1 and data2 that hold the same items should produce the same output, but json.dumps writes keys in each dict's own iteration order, so the resulting strings can differ and cannot be compared directly. Sorting the keys before storing avoids this inconsistency. Sorting is extra work, however, and costs some performance, so it should be used only where it is needed.
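A minimal sketch of the comparison problem, assuming Python 3.7 or later, where a dict keeps insertion order. The contents of data1 and data2 are illustrative.

```python
import json

# Same items, inserted in a different order.
data1 = {"b": 789, "c": 456, "a": 123}
data2 = {"a": 123, "b": 789, "c": 456}

d1 = json.dumps(data1)
d2 = json.dumps(data2)
print(d1 == d2)  # the strings differ because the key order differs

s1 = json.dumps(data1, sort_keys=True)
s2 = json.dumps(data2, sort_keys=True)
print(s1 == s2)  # with sorted keys the strings are identical
```

Note that the dicts themselves compare equal (data1 == data2); it is only their serialized text that needs sort_keys to be comparable.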

The indent parameter sets the indentation width, which makes the stored data easier to read.

>>> import json
>>> data_1 = {'b': 789, 'c': 456, 'a': 123}
>>> d1 = json.dumps(data_1, sort_keys=True, indent=4)
>>> print(d1)
{
    "a": 123,
    "b": 789,
    "c": 456
}
>>>

Formatted output is more readable, but it is padded with extra whitespace. JSON is often used for network communication, where message size matters and unnecessary whitespace wastes bandwidth, so it is sensible to compact the data. The separators parameter does this: it takes a tuple (item_separator, key_separator) containing the string that separates items and the string that separates keys from values.

>>> data = {'b': 789, 'c': 456, 'a': 123}
>>> print('DATA:', repr(data))
DATA: {'c': 456, 'a': 123, 'b': 789}
>>> print('repr(data)             :', len(repr(data)))
repr(data)             : 30
>>> print('dumps(data)            :', len(json.dumps(data)))
dumps(data)            : 30
>>> print('dumps(data, indent=4)  :', len(json.dumps(data, indent=4)))
dumps(data, indent=4)
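The effect of separators can be sketched as follows. The tuple (",", ":") is the usual way to drop the default spaces after commas and colons; the variable names are assumptions made for this example.

```python
import json

data = {"b": 789, "c": 456, "a": 123}

default_len = len(json.dumps(data))
indented_len = len(json.dumps(data, indent=4))

# (item_separator, key_separator) without any padding spaces.
compact = json.dumps(data, separators=(",", ":"))

print("default :", default_len)
print("indent=4:", indented_len)
print("compact :", len(compact), compact)
```

The compact form is the shortest of the three and the indented form the longest, which is the trade-off between readability and size described above.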

Another useful dumps parameter is skipkeys, which defaults to False. When dumps encodes a dict, every key must be of a basic type (str, int, float, bool, or None); a key of any other type, such as a tuple, raises a TypeError. If skipkeys is set to True, such keys are silently skipped instead.

>>> data = {'b': 789, 'c': 456, (1, 2): 123, 456: 678}
>>> print(json.dumps(data))
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    print(json.dumps(data))
  ...
  File "D:\Python\lib\json\encoder.py", line 192, in encode
    chunks = self.iterencode(o, _one_shot=True)
  ...
TypeError: keys must be a string
>>> print(json.dumps(data, skipkeys=True))
{"456": 678, "b": 789, "c": 456}
>>>

2. Handling custom data types:

The json module handles not only Python's built-in types but also custom data types, which is most often needed for instances of user-defined classes.

First, we define a class Person in a module named Person.

#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
# author: antclonies
import json


class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return 'Person Object name: %s, age: %d' % (self.name, self.age)


if __name__ == '__main__':
    p = Person('Peter', 22)
    print(p)
    dmp = json.dumps(p)  # raises TypeError: Person is not JSON serializable
    print(dmp)

Run the above module:

# ./py_2.py
Person Object name: Peter, age: 22
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.5/json/encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: Person Object name: Peter, age: 22 is not JSON serializable

Passing an instance of Person directly to json.dumps raises an error, because json does not know how to convert the instance p of the Person type. The JSON and Python type conversion tables mentioned above show that a JSON object corresponds to a dict, so the custom type must be converted to a dict before encoding. There are two ways to do this.

Method one: write your own conversion functions

#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
import json
import Person

p = Person.Person('Peter', 22)


def obj2dict(obj):
    """Convert object to dict."""
    d = {}
    d['__class__'] = obj.__class__.__name__
    d['__module__'] = obj.__module__
    d.update(obj.__dict__)
    return d


def dict2obj(d):
    """Convert dict to object."""
    if '__class__' in d:
        class_name = d.pop('__class__')
        module_name = d.pop('__module__')
        module = __import__(module_name)
        class_ = getattr(module, class_name)
        args = dict((key, value) for key, value in d.items())  # get args
        inst = class_(**args)  # create new instance
    else:
        inst = d
    return inst


# Output shown in the comments; key order may vary between Python versions.
d = obj2dict(p)
print(d)
# {'__module__': 'Person', '__class__': 'Person', 'name': 'Peter', 'age': 22}

o = dict2obj(d)
print(type(o), o)
# <class 'Person.Person'> Person Object name: Peter, age: 22

dump = json.dumps(p, default=obj2dict)
print(dump)
# {"__module__": "Person", "__class__": "Person", "name": "Peter", "age": 22}

load = json.loads(dump, object_hook=dict2obj)
print(load)
# Person Object name: Peter, age: 22

The code above simply converts between the custom object type and dict. The obj2dict function stores the object's module name, class name, and __dict__ in a dict and returns it. The dict2obj function does the reverse: it reads the module name, class name, and arguments, creates a new object, and returns it. Passing default to json.dumps specifies the function that is called during encoding for objects json cannot serialize by itself; passing object_hook to json.loads specifies the function that is called during decoding with each decoded JSON object (a dict).
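For readers who want to try the default and object_hook hooks without a separate Person module, here is a self-contained variant. It is a sketch rather than the article's method: instead of __import__ it looks the class up in CLASS_REGISTRY, a hypothetical dict introduced only for this example.

```python
import json


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return "Person Object name: %s, age: %d" % (self.name, self.age)


# Hypothetical registry: maps a stored class name to the class to rebuild.
CLASS_REGISTRY = {"Person": Person}


def obj2dict(obj):
    """Called by json.dumps for objects it cannot encode by itself."""
    d = {"__class__": obj.__class__.__name__}
    d.update(obj.__dict__)
    return d


def dict2obj(d):
    """Called by json.loads with every decoded JSON object."""
    class_name = d.pop("__class__", None)
    if class_name is None:
        return d  # an ordinary dict, leave it unchanged
    return CLASS_REGISTRY[class_name](**d)


dump = json.dumps(Person("Peter", 22), default=obj2dict)
load = json.loads(dump, object_hook=dict2obj)

print(dump)
print(type(load).__name__, load)
```

A registry restricts decoding to classes that were listed explicitly, whereas importing a module named in the input trusts whatever the JSON text says; which trade-off is appropriate depends on where the data comes from.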

Method two: subclass JSONEncoder and JSONDecoder and override the relevant methods

The JSONEncoder class is responsible for encoding and converts objects it does not support through its default method, which we can override. JSONDecoder can be customized in a similar way.

#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
import json
import Person

p = Person.Person('Peter', 22)


class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        """Convert object to dict."""
        d = {}
        d['__class__'] = obj.__class__.__name__
        d['__module__'] = obj.__module__
        d.update(obj.__dict__)
        return d


class MyDecoder(json.JSONDecoder):
    def __init__(self):
        json.JSONDecoder.__init__(self, object_hook=self.dict2obj)

    def dict2obj(self, d):
        """Convert dict to object."""
        if '__class__' in d:
            class_name = d.pop('__class__')
            module_name = d.pop('__module__')
            module = __import__(module_name)
            class_ = getattr(module, class_name)
            args = dict((key, value) for key, value in d.items())  # get args
            inst = class_(**args)  # create new instance
        else:
            inst = d
        return inst


d = MyEncoder().encode(p)
o = MyDecoder().decode(d)

print(d)
print(type(o), o)
# Output (key order may vary between Python versions):
# {"name": "Peter", "age": 22, "__module__": "Person", "__class__": "Person"}
# <class 'Person.Person'> Person Object name: Peter, age: 22
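Encoder and decoder subclasses can also be handed to json.dumps and json.loads through their cls parameter instead of being instantiated by hand. The sketch below is self-contained; PersonEncoder and PersonDecoder are names made up for this example, and the encoder handles only Person, deferring everything else to the base class.

```python
import json


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Person):
            return {"__class__": "Person", "name": obj.name, "age": obj.age}
        # Let the base class raise TypeError for any other unsupported type.
        return super().default(obj)


class PersonDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        super().__init__(object_hook=self.dict2obj, **kwargs)

    @staticmethod
    def dict2obj(d):
        if d.get("__class__") == "Person":
            return Person(d["name"], d["age"])
        return d


text = json.dumps(Person("Peter", 22), cls=PersonEncoder)
person = json.loads(text, cls=PersonDecoder)

print(text)
print(type(person).__name__, person.name, person.age)
```

Falling back to super().default(obj) keeps the normal TypeError for types the encoder does not know about, rather than serializing arbitrary objects by accident.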

References:

http://www.cnblogs.com/coser/archive/2011/12/14/2287739.html
https://docs.python.org/3/library/json.html
http://www.liaoxuefeng.com/wiki/0014316089557264a6b348958f449949df42a6d3a2e542c000/00143192607210600a668b5112e4a979dd20e4661cc9c97000
