Python processing JSON

Source: Internet
Author: User
Tags python list

Concept

Serialization: the process of converting an object's state into a form that can be stored or transmitted over a network, such as JSON or XML. Deserialization is the reverse process: the serialized state is read back from storage (JSON, XML) and the object is re-created.

JSON (JavaScript Object Notation): a lightweight data interchange format that is easier to read and write than XML and easy for machines to parse and generate. Its syntax is based on a subset of JavaScript.

Python 2.6 added the json module to the standard library, so no extra download is needed. In the json module, serialization and deserialization are called encoding and decoding, respectively.

Encoding: converts a Python object into a JSON string.
Decoding: converts a JSON-formatted string into a Python object.

Simple data types (str, unicode, int, float, list, tuple, dict) can be processed directly.

The json.dumps method encodes simple data types:

import json

data = [{'a': "A", 'b': (2, 4), 'c': 3.0}]  # a list object
print "DATA:", repr(data)
data_string = json.dumps(data)
print "JSON:", data_string

Output:

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]  # Python dicts are stored unordered
JSON: [{"a": "A", "c": 3.0, "b": [2, 4]}]

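The JSON output looks much like the repr of data, with some subtle changes; for example, Python's tuple becomes a JSON array. The conversion rules from Python to JSON are:

Python             JSON
dict               object
list, tuple        array
str, unicode       string
int, long, float   number
True               true
False              false
None               null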
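The Python-to-JSON conversion rules can be checked directly with json.dumps. This is a minimal sketch written for Python 3 (the article's own examples use Python 2 print statements); the sample values are illustrative and not from the original:

```python
import json

# Illustrative sample values, one per Python type that json can encode.
samples = {
    "dict": {"k": 1},
    "list": [1, 2],
    "tuple": (1, 2),
    "str": "text",
    "int": 7,
    "float": 3.0,
    "True": True,
    "False": False,
    "None": None,
}

# Encode each value on its own to see its JSON form.
encoded = {name: json.dumps(value) for name, value in samples.items()}

for name, text in encoded.items():
    print(name, "->", text)
```

Note that list and tuple produce the same JSON text, which is why the original type cannot be recovered when decoding.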

The json.loads method decodes simple data types:

import json

data = [{'a': "A", 'b': (2, 4), 'c': 3.0}]  # a list object
data_string = json.dumps(data)
print "ENCODED:", data_string
decoded = json.loads(data_string)
print "DECODED:", decoded
print "ORIGINAL:", type(data[0]['b'])
print "DECODED:", type(decoded[0]['b'])

Output:

ENCODED: [{"a": "A", "c": 3.0, "b": [2, 4]}]
DECODED: [{u'a': u'A', u'c': 3.0, u'b': [2, 4]}]
ORIGINAL: <type 'tuple'>
DECODED: <type 'list'>

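During decoding, a JSON array is converted to a Python list rather than the original tuple type. The decoding rules from JSON to Python are:

JSON            Python
object          dict
array           list
string          unicode
number (int)    int, long
number (real)   float
true            True
false           False
null            None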
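The JSON-to-Python decoding rules can be observed the same way. This is a small Python 3 sketch with an illustrative JSON document (not from the original); in Python 3 JSON strings decode to str, whereas Python 2 produces unicode:

```python
import json

# An illustrative JSON document covering each kind of JSON value.
text = '{"obj": {"k": 1}, "arr": [1, 2], "s": "x", "i": 7, "f": 3.0, "t": true, "n": null}'

decoded = json.loads(text)

# Report the Python type chosen for each JSON value.
type_names = {key: type(value).__name__ for key, value in decoded.items()}
for key, name in type_names.items():
    print(key, "->", name)
```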

Human-friendly JSON output

By default the encoded JSON string is compact and its keys come out in no particular order, so dumps provides optional parameters to make the output more readable. For example, sort_keys tells the encoder to output dictionary keys in sorted order (A to Z).

import json

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]
print 'DATA:', repr(data)

unsorted = json.dumps(data)
print 'JSON:', unsorted
print 'SORT:', json.dumps(data, sort_keys=True)

Output:

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]
JSON: [{"a": "A", "c": 3.0, "b": [2, 4]}]
SORT: [{"a": "A", "b": [2, 4], "c": 3.0}]

The indent parameter pretty-prints the output, indenting each nesting level of the data structure, which is easier to read:

import json

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]
print 'DATA:', repr(data)

print 'NORMAL:', json.dumps(data, sort_keys=True)
print 'INDENT:', json.dumps(data, sort_keys=True, indent=2)

Output:

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]
NORMAL: [{"a": "A", "b": [2, 4], "c": 3.0}]
INDENT: [
  {
    "a": "A",
    "b": [
      2,
      4
    ],
    "c": 3.0
  }
]

The separators parameter sets the item separator and the key separator. As the output above shows, the defaults put a space after "," and ":" to make the output look nicer. When transmitting data, however, the more compact the better, so the redundant spaces can be removed by passing separators=(',', ':'):

import json

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]
print 'DATA:', repr(data)
print 'repr(data)             :', len(repr(data))
print 'dumps(data)            :', len(json.dumps(data))
print 'dumps(data, indent=2)  :', len(json.dumps(data, indent=2))
print 'dumps(data, separators):', len(json.dumps(data, separators=(',', ':')))

Output:

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]
repr(data)             : 35
dumps(data)            : 35
dumps(data, indent=2)  : 76
dumps(data, separators): 29
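The formatting parameters change only the text, not the data it represents. This is a minimal Python 3 sketch (the article's examples use Python 2) that combines sort_keys, indent and separators on the same data and checks that all three strings decode to the same value; the variable names are illustrative:

```python
import json

data = [{"a": "A", "b": (2, 4), "c": 3.0}]

default_text = json.dumps(data, sort_keys=True)
pretty_text = json.dumps(data, sort_keys=True, indent=2)
compact_text = json.dumps(data, sort_keys=True, separators=(",", ":"))

print(compact_text)
print(len(default_text), len(pretty_text), len(compact_text))

# All three strings decode to the same Python value.
same = json.loads(default_text) == json.loads(pretty_text) == json.loads(compact_text)
print(same)
```

The compact form is a reasonable choice for data sent over the wire, and the indented form for logs or files that people will read.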

The skipkeys parameter: during encoding, a dict's keys must be strings (or other basic types such as int, float, bool and None, which are converted to strings). A key of any other type, such as a tuple, raises a TypeError during encoding. Setting skipkeys=True skips such keys instead of raising.

import json

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0, ('d',): 'D tuple'}]

try:
    print json.dumps(data)
except (TypeError, ValueError) as err:
    print 'ERROR:', err

print
print json.dumps(data, skipkeys=True)

Output:

ERROR: keys must be a string

[{"a": "A", "c": 3.0, "b": [2, 4]}]
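The difference between keys that are converted and keys that are rejected can be seen in a short Python 3 sketch. The data is illustrative, and the exact error message differs between Python versions, so only the exception type is relied on here:

```python
import json

# Illustrative data: one tuple key (unsupported) and one int key (converted).
data = {"a": "A", 1: "one", ("d",): "D tuple"}

try:
    json.dumps(data)
    error = None
except TypeError as err:
    error = err
print("ERROR:", error)

# skipkeys=True drops the tuple key; the int key 1 becomes the string "1".
skipped = json.dumps(data, skipkeys=True)
print(skipped)
```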

Making JSON support custom data types

The examples above all use Python's built-in types. The json module does not handle custom data structures by default and raises TypeError: xx is not JSON serializable. In that case you need to supply a conversion function:

import json

class MyObj(object):
    def __init__(self, s):
        self.s = s
    def __repr__(self):
        return '<MyObj(%s)>' % self.s

obj = MyObj('helloworld')

try:
    print json.dumps(obj)
except TypeError as err:
    print 'ERROR:', err

# Conversion function
def convert_to_builtin_type(obj):
    print 'default(', repr(obj), ')'
    # Convert the MyObj instance into a dict
    d = {'__class__': obj.__class__.__name__,
         '__module__': obj.__module__,
         }
    d.update(obj.__dict__)
    return d

print json.dumps(obj, default=convert_to_builtin_type)

Output:

ERROR: <MyObj(helloworld)> is not JSON serializable
default( <MyObj(helloworld)> )
{"s": "helloworld", "__module__": "__main__", "__class__": "MyObj"}

Note: the __class__ and __module__ values depend on where your code is located.

Conversely, to decode JSON into a Python object you also need a custom conversion function, which is passed to json.loads through the object_hook parameter:

# jsontest.py
import json

class MyObj(object):
    def __init__(self, s):
        self.s = s
    def __repr__(self):
        return "<MyObj(%s)>" % self.s

def dict_to_object(d):
    if '__class__' in d:
        class_name = d.pop('__class__')
        module_name = d.pop('__module__')
        module = __import__(module_name)
        print "MODULE:", module
        class_ = getattr(module, class_name)
        print "CLASS", class_
        # Convert the unicode keys to str so they can be used as keyword arguments
        args = dict((key.encode('ascii'), value) for key, value in d.items())
        print 'INSTANCE ARGS:', args
        inst = class_(**args)
    else:
        inst = d
    return inst

encoded_object = '[{"s":"helloworld","__module__":"jsontest","__class__":"MyObj"}]'
myobj_instance = json.loads(encoded_object, object_hook=dict_to_object)
print myobj_instance

Output:

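MODULE: <module 'jsontest' from 'E:\Users\liuzhijun\workspace\python\jsontest.py'>
CLASS <class 'jsontest.MyObj'>
INSTANCE ARGS: {'s': u'helloworld'}
[<MyObj(helloworld)>]
MODULE: <module 'jsontest' from 'E:\Users\liuzhijun\workspace\python\jsontest.py'>
CLASS <class 'jsontest.MyObj'>
INSTANCE ARGS: {'s': u'helloworld'}
[<MyObj(helloworld)>]

The output most likely appears twice because jsontest.py is run as a script and then imported again by name inside dict_to_object, which executes the module's top-level code a second time.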
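The dict_to_object function above imports whatever module the JSON text names, which can be risky with untrusted input. One alternative, which is not from the original article, is to look the class up in an explicit registry. This is a minimal Python 3 sketch of a full round trip; KNOWN_CLASSES, to_builtin and from_builtin are hypothetical names chosen for illustration:

```python
import json


class MyObj:
    def __init__(self, s):
        self.s = s

    def __repr__(self):
        return "<MyObj(%s)>" % self.s


# Hypothetical registry: only classes listed here may be re-created.
KNOWN_CLASSES = {"MyObj": MyObj}


def to_builtin(obj):
    """default= hook: describe a registered object as a plain dict."""
    name = type(obj).__name__
    if name not in KNOWN_CLASSES:
        raise TypeError("%r is not JSON serializable" % (obj,))
    d = {"__class__": name}
    d.update(vars(obj))
    return d


def from_builtin(d):
    """object_hook: rebuild an instance when the dict carries a known tag."""
    class_name = d.get("__class__")
    if class_name in KNOWN_CLASSES:
        args = {k: v for k, v in d.items() if k != "__class__"}
        return KNOWN_CLASSES[class_name](**args)
    return d


encoded = json.dumps([MyObj("helloworld")], default=to_builtin)
restored = json.loads(encoded, object_hook=from_builtin)
print(encoded)
print(restored)
```

The trade-off is that every class to be decoded must be registered up front, in exchange for never importing a module chosen by the input.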

Using the JSONEncoder and JSONDecoder classes

JSONEncoder has an iterative interface, iterencode(data), which yields the encoded data as a series of chunks. The advantage is that the chunks are easy to write to a file or network stream without first building the whole encoded string in memory.

import json

encoder = json.JSONEncoder()
data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]

for part in encoder.iterencode(data):
    print 'PART:', part

Output:

PART: [
PART: {
PART: "a"
PART: :
PART: "A"
PART: ,
PART: "c"
PART: :
PART: 3.0
PART: ,
PART: "b"
PART: :
PART: [2
PART: , 4
PART: ]
PART: }
PART: ]
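Writing the chunks to a stream as they are produced might look like the following Python 3 sketch. Here io.StringIO stands in for a real file or network stream, and the chunk counter is only for illustration; the number of chunks is an implementation detail and may vary:

```python
import io
import json

data = [{"a": "A", "b": (2, 4), "c": 3.0}]
encoder = json.JSONEncoder(sort_keys=True)

# io.StringIO stands in for a real file or socket wrapper here.
stream = io.StringIO()
chunk_count = 0
for part in encoder.iterencode(data):
    stream.write(part)  # each chunk is written as soon as it is produced
    chunk_count += 1

written = stream.getvalue()
print(chunk_count, "chunks ->", written)
```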

The encode method is essentially equivalent to ''.join(encoder.iterencode(data)), with some extra error checking up front (such as for non-string dict keys). For custom objects, we only need to override JSONEncoder's default() method; its implementation is similar to the convert_to_builtin_type() function shown above.

import json

class MyObj(object):
    def __init__(self, s):
        self.s = s
    def __repr__(self):
        return "<MyObj(%s)>" % self.s

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        print 'default(', repr(obj), ')'
        # Convert objects to a dictionary of their representation
        d = {'__class__': obj.__class__.__name__,
             '__module__': obj.__module__,
             }
        d.update(obj.__dict__)
        return d

obj = MyObj('helloworld')
print obj
print MyEncoder().encode(obj)

Output:

<MyObj(helloworld)>
default( <MyObj(helloworld)> )
{"s": "helloworld", "__module__": "__main__", "__class__": "MyObj"}
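An encoder subclass does not have to be instantiated by hand: json.dumps accepts it through its cls parameter. The Python 3 sketch below also defers to the base class for types it does not recognise, so that unsupported objects still raise TypeError. The simplified MyObj and the dict layout are assumptions made for this illustration:

```python
import json


class MyObj:
    def __init__(self, s):
        self.s = s


class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, MyObj):
            return {"__class__": "MyObj", "s": obj.s}
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


text = json.dumps({"item": MyObj("helloworld")}, cls=MyEncoder, sort_keys=True)
print(text)

try:
    json.dumps({"item": object()}, cls=MyEncoder)
    failed = False
except TypeError:
    failed = True
print("unsupported type rejected:", failed)
```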

To decode JSON back into a Python object, subclass JSONDecoder:

import json

# Requires a module named jsontest that defines MyObj (see above)
class MyDecoder(json.JSONDecoder):
    def __init__(self):
        json.JSONDecoder.__init__(self, object_hook=self.dict_to_object)

    def dict_to_object(self, d):
        if '__class__' in d:
            class_name = d.pop('__class__')
            module_name = d.pop('__module__')
            module = __import__(module_name)
            print 'MODULE:', module
            class_ = getattr(module, class_name)
            print 'CLASS:', class_
            args = dict((key.encode('ascii'), value) for key, value in d.items())
            print 'INSTANCE ARGS:', args
            inst = class_(**args)
        else:
            inst = d
        return inst

encoded_object = '[{"s": "helloworld", "__module__": "jsontest", "__class__": "MyObj"}]'
myobj_instance = MyDecoder().decode(encoded_object)
print myobj_instance

Output:

MODULE: <module 'jsontest' from 'E:\Users\liuzhijun\workspace\python\jsontest.py'>
CLASS: <class 'jsontest.MyObj'>
INSTANCE ARGS: {'s': u'helloworld'}
[<MyObj(helloworld)>]
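A decoder subclass can likewise be handed to json.loads through its cls parameter. This is a minimal Python 3 sketch; the simplified MyObj and the tag check are assumptions made for this illustration, and the class is referenced directly rather than imported by name:

```python
import json


class MyObj:
    def __init__(self, s):
        self.s = s


class MyDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        # Install our hook; any other keyword arguments pass through unchanged.
        kwargs["object_hook"] = self.dict_to_object
        super().__init__(**kwargs)

    def dict_to_object(self, d):
        if d.get("__class__") == "MyObj":
            return MyObj(d["s"])
        return d


decoded = json.loads(
    '[{"__class__": "MyObj", "s": "helloworld"}, {"x": 1}]',
    cls=MyDecoder,
)
print(type(decoded[0]).__name__, decoded[0].s, decoded[1])
```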

Writing JSON to and reading it from file streams

The examples above all operate in memory. For large amounts of data it is more appropriate to encode directly to a file-like object, which is what the dump() and load() functions do.

import json
import tempfile

data = [{'a': 'A', 'b': (2, 4), 'c': 3.0}]

f = tempfile.NamedTemporaryFile(mode='w+')
json.dump(data, f)
f.flush()

print open(f.name, 'r').read()

Output:

[{"a": "A", "c": 3.0, "b": [2, 4]}]

load() works similarly, reading from a file-like object:

import json
import tempfile

f = tempfile.NamedTemporaryFile(mode='w+')
f.write('[{"a": "A", "c": 3.0, "b": [2, 4]}]')
f.flush()
f.seek(0)

print json.load(f)

Output:

[{u'a': u'A', u'c': 3.0, u'b': [2, 4]}]
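A round trip through a real file might look like the following Python 3 sketch. It closes the temporary file before reopening it by name, since reopening a NamedTemporaryFile that is still open does not work on every platform; the .json suffix and variable names are illustrative:

```python
import json
import os
import tempfile

data = [{"a": "A", "b": (2, 4), "c": 3.0}]

# Write to a temporary file; delete=False lets us reopen it by name afterwards.
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(data, f)
    path = f.name

try:
    with open(path) as f:
        loaded = json.load(f)
finally:
    os.remove(path)  # clean up the temporary file ourselves

print(loaded)
```

As with dumps and loads, the tuple in the original data comes back as a list.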

Reference:
http://docs.python.org/2/library/json.html
http://www.cnblogs.com/coser/archive/2011/12/14/2287739.html
http://pymotw.com/2/json/
