Working with JSON in Python: a brief look at the json module

Source: Internet
Author: User

JSON (JavaScript Object Notation) is a lightweight data interchange format. It is easy for people to read and write, and easy for machines to parse and generate.

JSON is built on two structures:

The first is a collection of name/value pairs. In Python this corresponds to a dictionary; other languages realize it as an object, record, struct, dictionary, hash table, keyed list, or associative array.

The second is an ordered list of values. In most languages this is realized as an array (a list in Python).

Both are common data structures that most languages support in some form. This is what allows one data format to be exchanged between programming languages built on these constructs.
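As a minimal sketch of how the two structures map onto Python, the snippet below parses one JSON object and one JSON array with the standard json module. It is written in Python 3 syntax, and the sample strings are illustrative rather than taken from the article.

```python
import json

# A JSON object (name/value pairs) and a JSON array (ordered values).
# Both sample strings are made up for illustration.
object_text = '{"name": "Peter", "age": 22}'
array_text = '[1, 2, 3]'

parsed_object = json.loads(object_text)
parsed_array = json.loads(array_text)

# The object becomes a dict and the array becomes a list.
print(type(parsed_object).__name__)
print(type(parsed_array).__name__)
print(parsed_object["name"], parsed_array[0])
```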


For the official description of JSON, see: http://json.org/
Standard library reference for Python's json module: http://docs.python.org/library/json.html


Encoding and decoding simple data types:

Use the json.dumps method to encode a simple data type, for example:

import json

obj = [[1, 2, 3], 123, 123.123, 'abc', {'key1': (1, 2, 3), 'key2': (4, 5, 6)}]
encodedjson = json.dumps(obj)
print repr(obj)
print encodedjson

Output:
[[1, 2, 3], 123, 123.123, 'abc', {'key2': (4, 5, 6), 'key1': (1, 2, 3)}]
[[1, 2, 3], 123, 123.123, "abc", {"key2": [4, 5, 6], "key1": [1, 2, 3]}]

The output shows that an encoded simple type looks very similar to its original repr() output, but some data types change: the tuples in the example above are converted to lists. During encoding, each Python type is converted to a JSON type as follows:

Python                  JSON
dict                    object
list, tuple             array
str, unicode            string
int, long, float        number
True                    true
False                   false
None                    null
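The Python-to-JSON conversions can be checked directly. The following is a small sketch in Python 3 syntax (so there is no separate unicode or long type); the dictionary keys are illustrative labels only.

```python
import json

# One sample value per Python type; the labels are illustrative only.
samples = {
    "a_dict": {"k": "v"},
    "a_list": [1, 2],
    "a_tuple": (1, 2),
    "a_str": "abc",
    "an_int": 123,
    "a_float": 123.5,
    "a_true": True,
    "a_false": False,
    "a_none": None,
}

# Encode each value on its own so the JSON form of every type is visible.
encoded = {name: json.dumps(value) for name, value in samples.items()}
for name, text in encoded.items():
    print(name, "->", text)
```

Note that the list and the tuple produce the same JSON array, which is why a tuple cannot be recovered as a tuple when decoding.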

The json.dumps() method returns a str object, encodedjson. To get the original data back, we decode encodedjson with the json.loads() function:


decodejson = json.loads(encodedjson)
print type(decodejson)
print decodejson[4]['key1']
print decodejson

Output:
<type 'list'>
[1, 2, 3]
[[1, 2, 3], 123, 123.123, u'abc', {u'key2': [4, 5, 6], u'key1': [1, 2, 3]}]

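The loads method returns an object equivalent to the original, but some type conversions still occur. In the example above, 'abc' was converted to the unicode type, and the tuples came back as lists. The type conversions from JSON to Python are as follows:

JSON                    Python
object                  dict
array                   list
string                  unicode
number (int)            int, long
number (real)           float
true                    True
false                   False
null                    None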
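A quick way to see the JSON-to-Python conversions is to decode one value of each JSON type and inspect the resulting Python types. This sketch uses Python 3, where a JSON string decodes to str rather than to the Python 2 unicode type; the sample document is made up for illustration.

```python
import json

# One value of each JSON type; the text is illustrative only.
text = '{"obj": {"k": "v"}, "arr": [1, 2], "s": "abc", "i": 123, "f": 1.5, "t": true, "n": null}'

decoded = json.loads(text)
decoded_types = {key: type(value).__name__ for key, value in decoded.items()}
print(decoded_types)

# A tuple does not survive a round trip: it comes back as a list.
round_tripped = json.loads(json.dumps((1, 2, 3)))
print(type(round_tripped).__name__)
```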

The json.dumps method provides many useful parameters. The more commonly used ones are sort_keys (sorts the keys of dict objects; a plain dict in the Python 2 used by these examples is unordered), separators, and indent.

Sorting makes stored data easier to read and also makes it possible to compare JSON output, for example:

data1 = {'b': 789, 'c': 456, 'a': 123}
data2 = {'a': 123, 'b': 789, 'c': 456}
d1 = json.dumps(data1, sort_keys=True)
d2 = json.dumps(data2)
d3 = json.dumps(data2, sort_keys=True)
print d1
print d2
print d3
print d1 == d2
print d1 == d3

Output:
{"a": 123, "b": 789, "c": 456}
{"a": 123, "c": 456, "b": 789}
{"a": 123, "b": 789, "c": 456}
False
True

In the above example, data1 and data2 hold the same data, but because dict storage is unordered, their serialized forms cannot be compared directly. Sorting the keys before storing avoids this inconsistency. Sorting is extra work, however, and costs some performance, so use it only where it is actually needed.
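The same comparison problem exists in current Python, for a slightly different reason. The sketch below assumes Python 3.7 or later, where a dict preserves insertion order, so two equal dicts built in a different order still serialize differently unless sort_keys is used.

```python
import json

# Equal dicts, built in a different key order.
data1 = {"b": 789, "c": 456, "a": 123}
data2 = {"a": 123, "b": 789, "c": 456}

# Without sorting, the output follows each dict's insertion order.
plain_equal = json.dumps(data1) == json.dumps(data2)

# With sort_keys=True, both serialize to the same string.
sorted_equal = json.dumps(data1, sort_keys=True) == json.dumps(data2, sort_keys=True)

print(data1 == data2, plain_equal, sorted_equal)
```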

The indent parameter sets the indentation width, which makes the stored data easier to read.


data1 = {'b': 789, 'c': 456, 'a': 123}
d1 = json.dumps(data1, sort_keys=True, indent=4)
print d1

Output:
{
    "a": 123,
    "b": 789,
    "c": 456
}

Formatted output is more readable, but it is padded with redundant whitespace. JSON exists mainly as a data interchange format, and network communication is sensitive to message size: useless whitespace wastes bandwidth, so it is sometimes appropriate to compact the data. The separators parameter serves this purpose. It takes a tuple of two strings, (item_separator, key_separator), which separate the items from each other and each key from its value.

data = {'b': 789, 'c': 456, 'a': 123}
print 'DATA:', repr(data)
print 'repr(data)             :', len(repr(data))
print 'dumps(data)            :', len(json.dumps(data))
print 'dumps(data, indent=4)  :', len(json.dumps(data, indent=4))
print 'dumps(data, separators):', len(json.dumps(data, separators=(',', ':')))

Output:
DATA: {'a': 123, 'c': 456, 'b': 789}
repr(data)             : 30
dumps(data)            : 30
dumps(data, indent=4)  : 46
dumps(data, separators): 25

Removing the extra whitespace compacts the data, and the effect is noticeable: 25 characters instead of 30 for the default output.

Another useful dumps parameter is skipkeys, which defaults to False. When dumps serializes a dict, each key must be of a basic type (str, unicode, int, long, float, bool, or None); a key of any other type raises a TypeError. If skipkeys is set to True, such keys are silently skipped instead.

data = {'b': 789, 'c': 456, (): 123}
print json.dumps(data, skipkeys=True)

Output:
{"c": 456, "b": 789}
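To make the failure case concrete, here is a short sketch in Python 3 syntax that shows the TypeError raised without skipkeys, and what skipkeys=True keeps. The keys and values are illustrative; the int key is included to show that basic non-string keys are converted to strings rather than skipped.

```python
import json

# A str key, an int key, and a tuple key (illustrative data).
data = {"b": 789, 1: "int key", (1, 2): "tuple key"}

# Without skipkeys, the tuple key makes dumps raise TypeError.
try:
    json.dumps(data)
    raised = False
except TypeError:
    raised = True

# With skipkeys=True, only the tuple key is dropped.
skipped = json.dumps(data, skipkeys=True)

print(raised)
print(skipped)
```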

Working with custom data types

The json module handles not only the common Python built-in types but also our own custom data types, and serializing custom objects is a frequent need.

First, we define a class Person (saved as a module named Person, which the later examples import).

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return 'Person Object name: %s, age: %d' % (self.name, self.age)

if __name__ == '__main__':
    p = Person('Peter', 22)
    print p

If an instance of Person is passed directly to json.dumps, an error is raised, because the json module cannot convert such an object automatically. From the JSON and Python type conversion tables mentioned above, we can see that the JSON object type corresponds to dict, so we need to convert our custom type to a dict before processing. There are two ways to do this.
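The error can be reproduced in a few lines. This is a sketch in Python 3 syntax with a pared-down copy of the Person class; the exact error message depends on the Python version, so it is printed rather than assumed.

```python
import json

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

p = Person("Peter", 22)

# Passing the instance directly fails with TypeError.
try:
    json.dumps(p)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting to a dict first works, here via the instance's __dict__.
as_dict = json.dumps(p.__dict__, sort_keys=True)
print(as_dict)
```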

Method one: write your own conversion functions

'''
Created on 2011-12-14
@author: Peter
'''
import Person
import json

p = Person.Person('Peter', 22)

def object2dict(obj):
    # convert object to a dict
    d = {}
    d['__class__'] = obj.__class__.__name__
    d['__module__'] = obj.__module__
    d.update(obj.__dict__)
    return d

def dict2object(d):
    # convert dict to object
    if '__class__' in d:
        class_name = d.pop('__class__')
        module_name = d.pop('__module__')
        module = __import__(module_name)
        class_ = getattr(module, class_name)
        args = dict((key.encode('ascii'), value) for key, value in d.items())  # get args
        inst = class_(**args)  # create new instance
    else:
        inst = d
    return inst

d = object2dict(p)
print d
# {'age': 22, '__module__': 'Person', '__class__': 'Person', 'name': 'Peter'}

o = dict2object(d)
print type(o), o
# <class 'Person.Person'> Person Object name: Peter, age: 22

dump = json.dumps(p, default=object2dict)
print dump
# {"age": 22, "__module__": "Person", "__class__": "Person", "name": "Peter"}

load = json.loads(dump, object_hook=dict2object)
print load
# Person Object name: Peter, age: 22

The code above is straightforward: in essence it converts between the custom object type and dict. The object2dict function stores the object's module name, class name, and __dict__ in a dict and returns it. The dict2object function does the reverse: it reads the module name and class name, uses the remaining entries as constructor arguments, creates a new object, and returns it. Passing default to json.dumps specifies the function to call for objects that cannot otherwise be serialized, and passing object_hook to json.loads specifies the function to call on each decoded dict.
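For comparison, here is a self-contained variant of the same round trip in Python 3 syntax. It is a sketch rather than the author's method: instead of importing whatever module name appears in the JSON text, it looks the class up in a fixed dictionary, KNOWN_CLASSES, which is a hypothetical name introduced here. That swap avoids importing arbitrary modules when the input is untrusted.

```python
import json

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return "Person Object name: %s, age: %d" % (self.name, self.age)

# Hypothetical whitelist of classes that may be rebuilt from JSON.
KNOWN_CLASSES = {"Person": Person}

def object2dict(obj):
    # Store the class name next to the instance attributes.
    d = {"__class__": obj.__class__.__name__}
    d.update(obj.__dict__)
    return d

def dict2object(d):
    # Rebuild only whitelisted classes; return other dicts unchanged.
    class_name = d.get("__class__")
    if not isinstance(class_name, str) or class_name not in KNOWN_CLASSES:
        return d
    args = {key: value for key, value in d.items() if key != "__class__"}
    return KNOWN_CLASSES[class_name](**args)

dump = json.dumps(Person("Peter", 22), default=object2dict, sort_keys=True)
load = json.loads(dump, object_hook=dict2object)
print(dump)
print(load)
```

Unlike the listing above, this dict2object does not modify the dict it receives, so the same dict can be passed to it more than once.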

Method two: subclass JSONEncoder and JSONDecoder and override the relevant methods

The JSONEncoder class is responsible for encoding, and it converts unsupported objects through its default method, which we can override. The same approach applies to JSONDecoder.

'''
Created on 2011-12-14
@author: Peter
'''
import Person
import json

p = Person.Person('Peter', 22)

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        # convert object to a dict
        d = {}
        d['__class__'] = obj.__class__.__name__
        d['__module__'] = obj.__module__
        d.update(obj.__dict__)
        return d

class MyDecoder(json.JSONDecoder):
    def __init__(self):
        json.JSONDecoder.__init__(self, object_hook=self.dict2object)

    def dict2object(self, d):
        # convert dict to object
        if '__class__' in d:
            class_name = d.pop('__class__')
            module_name = d.pop('__module__')
            module = __import__(module_name)
            class_ = getattr(module, class_name)
            args = dict((key.encode('ascii'), value) for key, value in d.items())  # get args
            inst = class_(**args)  # create new instance
        else:
            inst = d
        return inst

d = MyEncoder().encode(p)
o = MyDecoder().decode(d)

print d
print type(o), o
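One refinement worth sketching, in Python 3 syntax: an overridden default method can handle only the types it knows and defer everything else to the base class, which raises TypeError for unsupported objects. The class names PersonEncoder and PersonDecoder are hypothetical, and the Person class is repeated so the snippet runs on its own.

```python
import json

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Person):
            return {"__class__": "Person", "name": obj.name, "age": obj.age}
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)

class PersonDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        # Install the hook unless the caller supplied one.
        kwargs.setdefault("object_hook", self.dict2object)
        super().__init__(*args, **kwargs)

    @staticmethod
    def dict2object(d):
        if d.get("__class__") == "Person":
            return Person(d["name"], d["age"])
        return d

# The cls argument lets dumps and loads use the subclasses directly.
encoded = json.dumps([Person("Peter", 22), "abc"], cls=PersonEncoder, sort_keys=True)
decoded = json.loads(encoded, cls=PersonDecoder)
print(encoded)
print(type(decoded[0]).__name__, decoded[0].name, decoded[0].age)
```

Accepting *args and **kwargs in the decoder's __init__ keeps it compatible with json.loads(..., cls=PersonDecoder), which constructs the decoder itself.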

From the blog of Sugar Mixed Salted Fish.

