I. Prepare the JSON data
First, prepare a JSON data set. It contains 3,560 records in total, each with the following structure:
This example analyzes the distribution of time zones in the data, using the value of each record's TZ (time zone) field.
II. Convert the JSON data to Python dictionaries
The code is as follows:
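As a minimal sketch of this step, assuming the data is stored with one JSON object per line, each line can be parsed with the standard json module. The two sample records and the key names "tz" and "agent" are made up for illustration; the real data may use different keys.

```python
import json

# Hypothetical sample: two records, one JSON object per line.
# The key names "tz" and "agent" are assumptions, not taken from the real data.
raw_lines = [
    '{"tz": "America/New_York", "agent": "Mozilla/5.0 (Windows NT 6.1)"}',
    '{"tz": "", "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X)"}',
]

# json.loads turns each JSON string into a Python dict.
records = [json.loads(line) for line in raw_lines]

print(type(records[0]).__name__, len(records))
print(records[0]["tz"])
```

With a real file, the same list comprehension can iterate over the lines of the opened file instead of raw_lines.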
III. Count the TZ value distribution and produce results in the form "time zone: total"
To do this, first convert records into a DataFrame. The DataFrame is the most important data structure in pandas and represents data in tabular form. Then summarize the column with the value_counts() method:
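The conversion and count might look like the sketch below. The sample records are hypothetical, and the time zone key is assumed to be the lowercase "tz".

```python
import pandas as pd

# Hypothetical sample records; the key name "tz" is an assumption.
records = [
    {"tz": "America/New_York"},
    {"tz": "America/New_York"},
    {"tz": "Europe/London"},
    {"tz": ""},                      # time zone present but empty
    {"agent": "record without tz"},  # time zone field missing entirely
]

# Each dict becomes a row; keys become columns, absent keys become NaN.
frame = pd.DataFrame(records)

# value_counts() counts each distinct value and, by default, skips NaN.
tz_counts = frame["tz"].value_counts()
print(tz_counts)
```

Note that the record without a time zone does not appear in the result at all, and the empty string is counted as its own value.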
IV. Generate a bar chart from the statistical results
Before generating the bar chart, you can make the data complete: give records that have no time zone at all a placeholder value (shown here as "missing"), and give records whose time zone is empty a placeholder as well (shown here as "unknown"):
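One way to fill both kinds of gaps is sketched here, again with hypothetical sample records and an assumed "tz" key. fillna() handles the absent values, and a boolean mask handles the empty strings.

```python
import pandas as pd

# Hypothetical sample records; the key name "tz" is an assumption.
records = [
    {"tz": "America/New_York"},
    {"tz": "America/New_York"},
    {"tz": ""},                      # empty time zone
    {"agent": "record without tz"},  # no time zone field
]
frame = pd.DataFrame(records)

# Replace absent values (NaN) with the label "missing".
clean_tz = frame["tz"].fillna("missing")

# Replace empty strings with the label "unknown".
clean_tz[clean_tz == ""] = "unknown"

tz_counts = clean_tz.value_counts()
print(tz_counts)
```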
You can then use the plot() method to generate the bar chart:
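A short sketch of the plotting call, assuming matplotlib is installed. The counts are made-up numbers, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen only for this sketch
import pandas as pd

# Hypothetical counts standing in for the real statistics.
tz_counts = pd.Series(
    {"America/New_York": 3, "unknown": 2, "Europe/London": 1}
)

# kind="bar" draws one vertical bar per time zone; rot=45 tilts the labels.
ax = tz_counts.plot(kind="bar", rot=45)
ax.set_ylabel("records")
```

In an interactive session the chart appears directly; in a script it can be written to a file with ax.figure.savefig(). Passing kind="barh" instead gives horizontal bars, which are often easier to read when the labels are long.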
The steps above form a complete example of processing JSON data to produce statistics and a bar chart, but the results can be processed further to obtain more detail.
Each record also has an agent value, which is the browser's User-Agent information. This value reveals the operating system in use, so the statistics from the previous step can also be broken down by operating system.
Agent value:
V. Generate a bar chart broken down by operating system (Windows/non-Windows)
Not all records have this field. First filter out the records without an agent value; then group the data by time zone and operating system, and count the grouped results:
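The filter, group, and count steps could be sketched as follows. The sample records, the key names "tz" and "agent", and the rule "the agent string contains Windows" are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample records; the key names are assumptions.
records = [
    {"tz": "America/New_York", "agent": "Mozilla/5.0 (Windows NT 6.1)"},
    {"tz": "America/New_York", "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X)"},
    {"tz": "Europe/London", "agent": "Mozilla/5.0 (Windows NT 5.1)"},
    {"tz": "Europe/London"},  # no agent value, filtered out below
]
frame = pd.DataFrame(records)

# Keep only the rows that have an agent value.
with_agent = frame[frame["agent"].notnull()]

# Label each row by whether its agent string mentions Windows.
is_windows = with_agent["agent"].str.contains("Windows")
with_agent = with_agent.assign(os=np.where(is_windows, "Windows", "Not Windows"))

# Count rows per (time zone, operating system) pair, then pivot the
# operating system into columns; absent combinations become 0.
agg_counts = with_agent.groupby(["tz", "os"]).size().unstack().fillna(0)
print(agg_counts)
```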
Finally, select the 10 time zones that occur most often and generate a bar chart from them:
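Selecting the top 10 and plotting them might look like this sketch. The table of counts is generated with placeholder zone names and numbers, and the Agg backend is used only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen only for this sketch
import pandas as pd

# Hypothetical per-zone counts: 12 placeholder zones, two OS columns.
zones = [f"Zone/{i:02d}" for i in range(12)]
agg_counts = pd.DataFrame(
    {"Not Windows": [1] * 12, "Windows": list(range(12))},
    index=zones,
)

# Total records per time zone across both operating systems.
totals = agg_counts.sum(axis=1)

# Keep the rows for the 10 zones with the largest totals.
top10 = agg_counts.loc[totals.nlargest(10).index]

# stacked=True puts the two operating systems into one bar per zone.
ax = top10.plot(kind="bar", stacked=True)
```

Dividing each row by its total before plotting would show relative shares per time zone rather than absolute counts.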
The result is a bar chart whose statistics are broken down by operating system:
The next article is: Data analysis using Python (iii) using IPython to improve development efficiency. Interested readers are welcome to follow this blog and to leave comments for discussion.
Data analysis using Python (ii): processing a JSON data set and generating a bar chart