Source file format: the source file stores one JSON object per row, for example {"coi_id": "Dhfjhjhfjdhfjdhjs", "City_name": "Shenzhen", "Coi_name": "Coastal City", "img": "http://......jpg", "list": [{ ...}], "Hot": 3.5}.
Goal: delete the "list" property and its value from every record, group the shopping malls that share the same city name, and save one row per city in the form {"City_name": "Shenzhen", "list": [{"coi_id": "DFGFFGF", "Coi_name": "Coastal City", "img": "http://...jpg", "Hot": 3.5}, {...}, ...]}, with the malls under each city sorted by their "Hot" value from highest to lowest.
Implementation:
    # Python 2
    import json
    import codecs

    fileread = open("Test.data")
    filenew = open("Aois.txt", 'w+')

    city_aois = {}
    city_put = {}

    for line in fileread:
        # print line
        json_obj = json.loads(line)
        del json_obj["list"]
        # print json_obj
        # Group the records by city name.
        if json_obj['City_name'] not in city_aois:
            city_aois[json_obj['City_name']] = []
        city_aois[json_obj['City_name']].append(json_obj)

    for poi in city_aois:
        # Sort the malls of each city by "Hot", highest first.
        city_aois[poi].sort(key=lambda x: x["Hot"], reverse=True)
        # item = json.dumps(city_aois[poi], ensure_ascii=False)
        # strs = '{"City_name": "' + poi.encode("UTF-8") + '", "list": ' + item.encode("UTF-8") + '}'
        # print strs
        tmp = {}
        tmp['City_name'] = poi
        tmp['list'] = city_aois[poi]
        strs = json.dumps(tmp, ensure_ascii=False).encode("UTF-8")
        filenew.write(strs + '\r\n')

    fileread.close()
    filenew.close()
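For comparison, here is a minimal sketch of the same grouping and sorting logic, assuming Python 3 (where str is already Unicode, so no encode call is needed before writing). The function name group_by_city and the sample rows are made up for illustration; they are not part of the original script. Like the original script, it leaves "City_name" inside each mall record.

```python
import json
from collections import defaultdict


def group_by_city(lines):
    """Group mall records by City_name, drop "list", sort by Hot (descending)."""
    city_aois = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank rows
        record = json.loads(line)
        record.pop("list", None)  # tolerate rows that have no "list" key
        city_aois[record["City_name"]].append(record)

    rows = []
    for city, malls in city_aois.items():
        malls.sort(key=lambda m: m["Hot"], reverse=True)
        # Serialize once, with the list still made of dicts.
        rows.append(json.dumps({"City_name": city, "list": malls},
                               ensure_ascii=False))
    return rows


# Hypothetical sample rows standing in for the lines of Test.data.
sample = [
    '{"coi_id": "a1", "City_name": "Shenzhen", "Coi_name": "Mall A", "list": [], "Hot": 3.5}',
    '{"coi_id": "b2", "City_name": "Shenzhen", "Coi_name": "Mall B", "list": [], "Hot": 4.2}',
    '{"coi_id": "c3", "City_name": "Beijing", "Coi_name": "Mall C", "list": [], "Hot": 2.0}',
]
rows = group_by_city(sample)
for row in rows:
    print(row)
```

To write the rows to a file in Python 3, opening it with open("Aois.txt", "w", encoding="utf-8") should be enough to get UTF-8 output.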
Pitfalls encountered:
1. I was not familiar with Python. Unlike in Java, where "JSON" usually refers to an object, the data here is just a JSON-formatted string. json.loads() parses that string into a dictionary (type "dict"), which can then be manipulated like any other dictionary, for example by looking up a value by its key.
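A small sketch of that point, assuming Python 3; the sample row is shortened from the format described at the top and the variable names are illustrative only:

```python
import json

line = '{"City_name": "Shenzhen", "Coi_name": "Coastal City", "Hot": 3.5}'

text_type = type(line)          # the raw row is just a str
record = json.loads(line)       # parsing turns it into a dict
record_type = type(record)

city = record["City_name"]      # look up a value by its key
hot = record.get("Hot", 0.0)    # .get() supplies a default for missing keys

print(text_type.__name__, record_type.__name__, city, hot)
```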
2. When Python parses a JSON-formatted string into a dictionary, the strings inside it are Unicode. After a series of operations, the dictionary has to be converted back to a string with json.dumps(), and this is where various encoding problems appear. For example, a plain dumps writes Chinese characters as \uXXXX escape sequences. If the dictionary contains Chinese, you need to add ensure_ascii=False:
json.dumps(tmp, ensure_ascii=False)
When printing or writing the result, you also need to call encode to convert it to UTF-8.
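The effect of ensure_ascii can be seen in a short sketch, assuming Python 3 (where the explicit encode step is normally replaced by opening the output file with encoding="utf-8"). The sample record is made up for illustration:

```python
import json

record = {"City_name": "深圳", "Hot": 3.5}

escaped = json.dumps(record)                       # non-ASCII written as \uXXXX
readable = json.dumps(record, ensure_ascii=False)  # characters kept as they are

print(escaped)
print(readable)

# Both forms decode back to the same dictionary.
same = json.loads(escaped) == json.loads(readable) == record
```

Either form is valid JSON; ensure_ascii=False only changes how readable the stored text is.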
3. The error "UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 1: ordinal not in range(128)" occurs because keys and values cannot be a mixture of ordinary (byte) strings and Unicode strings.
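Python 2 raises this error when it implicitly decodes a byte string as ASCII in order to combine it with a Unicode string. Python 3 does not do that implicit conversion, so the sketch below, which assumes Python 3, triggers the same decoding step explicitly just to show where the message comes from:

```python
raw = "深圳".encode("utf-8")   # UTF-8 bytes; the first byte is 0xe6

try:
    raw.decode("ascii")        # the step Python 2 attempted implicitly
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

text = raw.decode("utf-8")     # decoding with the matching codec succeeds
```

Keeping every string as text and encoding only once, at output time, avoids the mix.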
In addition, consider the following:
    item = json.dumps(city_aois[poi], ensure_ascii=False)
    tmp['list'] = item.encode("UTF-8")
    strs = json.dumps(tmp, ensure_ascii=False).encode("UTF-8")
    print strs
This approach converts the list of dicts to a string and stores that string in tmp['list']. The second dumps then no longer sees dictionaries inside the list. It treats the whole value as one plain string, so the list is written out as a quoted, escaped string instead of a JSON array.
So before calling dumps, the contents of the list must still be dictionaries, and everything is serialized by a single dumps at the end.
Make sure the same encoding is used consistently throughout the process.
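The double-serialization problem can be reproduced in a few lines. This is a sketch assuming Python 3, with a made-up one-element list:

```python
import json

malls = [{"Coi_name": "Coastal City", "Hot": 3.5}]

# Mistake: serialize the list first, then serialize the wrapper again.
wrong = json.dumps({"City_name": "Shenzhen", "list": json.dumps(malls)})

# Fix: keep the list as Python objects and serialize once.
right = json.dumps({"City_name": "Shenzhen", "list": malls})

wrong_list = json.loads(wrong)["list"]   # comes back as a str
right_list = json.loads(right)["list"]   # comes back as a list of dicts

print(wrong)
print(right)
```

A reader of the first form would have to call json.loads a second time on the "list" value to get the malls back.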
References:
Pitfalls of json.dumps and character encoding:
http://www.cnblogs.com/stubborn412/p/3818423.html
View Chinese characters in different encodings online:
http://tool.oschina.net/encode?type=3
Reading and writing Python dictionaries to files:
http://blog.csdn.net/frankchen0130/article/details/53136681
Saving json.dumps output as UTF-8 rather than \u escape sequences:
https://stackoverflow.com/questions/18337407/saving-utf-8-texts-in-json-dumps-as-utf8-not-as-u-escape-sequence