Continuing from the previous section, in this section you will download population data in JSON format and process it with the json module. Pygal provides a map-making tool suitable for beginners, which you will use to visualize the population data and explore how the world's population is distributed.
Making a world population map
1 Downloading the world population data and extracting the relevant data
You can download population_data.json from http://data.okfn.org/. Open the file to see how the data in it should be handled:
[{"Country Name": "Arab World", "Country Code": "ARB", "Year": "1960", "Value": "96388069"},
 {"Country Name": "Arab World", "Country Code": "ARB", "Year": "1961", "Value": "98882541.4"},
 ... (remaining entries omitted)]
This file is effectively one very long Python list in which each element is a dictionary with four keys: country name, country code, year, and a value giving the population. We only care about each country's population in 2010, so we first write a program that prints that information:
import json

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Print the 2010 population of each country
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = pop_dict['Value']
        print(country_name + ": " + population)
The results are as follows:
2 Converting a string to a numeric value
Every key and value in population_data.json is a string. To work with the population data, we need to convert the strings that represent population figures to numeric values, so we use the function int():
population = int(pop_dict['Value'])
print(country_name + ": " + str(population))
However, for some values, this conversion causes an error, as follows:
===== RESTART: D:/study/python/code/world_population/world_population.py =====
Arab World: 357868000
Caribbean Small States: 6880000
East Asia & Pacific (all income levels): 2201536674
East Asia & Pacific (developing only): 1961558757
Euro Area: 331766000
Europe & Central Asia (all income levels): 890424544
Europe & Central Asia (developing only): 405204000
European Union: 502125000
Heavily Indebted Poor Countries (HIPC): 635663000
Traceback (most recent call last):
  File "D:/study/python/code/world_population/world_population.py", line A, in <module>
    population = int(pop_dict['Value'])
ValueError: invalid literal for int() with base 10: '1127437398.85751'
>>>
The error occurs because Python cannot directly convert a string containing a decimal point, such as '1127437398.85751', to an integer (the fractional value is probably the result of interpolation where population data was missing). To eliminate the error, we first convert the string to a floating-point number and then convert that number to an integer:
population = int(float(pop_dict['Value']))
The function float() converts the string to a floating-point number, and the function int() discards the fractional part and returns an integer. Every string now converts successfully, first to a float and then to an integer. With the population values stored in numeric form, you can use them to create a world population map.
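The two-step conversion can be tried on its own. The following is a minimal sketch; the helper name to_int is an assumption made for illustration and does not appear in the original program. The two input strings are taken from the data excerpt and the traceback above.

```python
def to_int(value):
    """Convert a numeric string to an int, discarding any fractional part."""
    return int(float(value))


# int() alone rejects a string that contains a decimal point.
try:
    int('1127437398.85751')
except ValueError as err:
    print('int() alone fails:', err)

# Going through float() first works for both kinds of string.
print(to_int('96388069'))
print(to_int('1127437398.85751'))
```

Note that int() truncates rather than rounds, so the fractional part is simply dropped.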
3 Getting two-letter country codes
Before making the map, there is one last problem with the data to solve. The map-making tool in Pygal requires data in a specific format: countries identified by country codes, and populations given as numbers. Several standardized sets of country codes are in common use for geopolitical data. population_data.json contains three-letter country codes, but Pygal uses two-letter codes, so we need a way to get the two-letter code from a country name. The country codes used by Pygal are stored in the module i18n (short for internationalization). Its dictionary COUNTRIES has two-letter country codes as keys and country names as values. To view these country codes, import the dictionary from the module i18n and print its keys and values:
from pygal.i18n import COUNTRIES

for country_code in sorted(COUNTRIES.keys()):
    print(country_code, COUNTRIES[country_code])
Error:
======== RESTART: D:/study/python/code/world_population/countries.py ========
Traceback (most recent call last):
  File "D:/study/python/code/world_population/countries.py", line 1, in <module>
Cause and solution:
The i18n module was removed in pygal 2.0.0; however, it can now be found in the pygal_maps_world plugin. You can install it with pip install pygal_maps_world. You can then access COUNTRIES as pygal.maps.world.COUNTRIES: from pygal.maps.world import COUNTRIES
Following the steps above:
from pygal.maps.world import COUNTRIES

for country_code in sorted(COUNTRIES.keys()):
    print(country_code, COUNTRIES[country_code])
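If you are not sure whether the plugin is installed, a guarded import makes the failure easier to diagnose. This is only a sketch: the printed messages and the choice to fall back to None are assumptions made for illustration, not Pygal behavior.

```python
try:
    # Import path given in the solution above; requires pygal_maps_world.
    from pygal.maps.world import COUNTRIES
except ImportError:
    # Illustrative fallback: report the problem instead of crashing.
    COUNTRIES = None
    print("Could not import COUNTRIES; try: pip install pygal_maps_world")

if COUNTRIES is not None:
    print(len(COUNTRIES), "country codes available")
```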
The results are as follows:
To get the country codes, we write a function that looks up and returns a country code from COUNTRIES. We put this function in a module named country_codes so that it can be imported into the visualization program:
from pygal.maps.world import COUNTRIES

def get_country_code(country_name):
    """Return the two-letter Pygal country code for the given country name."""
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    # Return None if the specified country is not found
    return None

print(get_country_code('Andorra'))
print(get_country_code('United Arab Emirates'))
print(get_country_code('Afghanistan'))
The results are as follows:
Next, import get_country_code in world_population.py:
import json

from country_codes import get_country_code

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Print the 2010 population of each country
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            print(code + ": " + str(population))
        else:
            print('ERROR - ' + country_name)
The results are as follows:
There are two reasons error messages appear. First, not every entry corresponds to a country; some correspond to regions (Arab World) or economic groups (all income levels). Second, some entries use a different form of the full country name (such as Yemen, Rep. rather than Yemen). For now, we ignore the entries that cause errors and see what the map looks like based on the data that was successfully converted.
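The program here simply skips the failing entries. If you later want to recover countries whose names differ, one possible approach is a small table of aliases that is consulted before the lookup. The sketch below is an assumption-laden illustration: COUNTRIES is a three-entry stand-in for Pygal's dictionary so that the example runs without Pygal, and NAME_ALIASES is a hypothetical table you would fill in by hand.

```python
# Stand-in for Pygal's COUNTRIES dictionary (tiny subset, illustration only).
COUNTRIES = {'ad': 'Andorra', 'af': 'Afghanistan', 'ye': 'Yemen'}

# Hypothetical, hand-maintained map from names in the data file
# to the names Pygal uses.
NAME_ALIASES = {'Yemen, Rep.': 'Yemen'}

# Reverse lookup built once, so each query is a dictionary access
# instead of a loop over all countries.
NAME_TO_CODE = {name: code for code, name in COUNTRIES.items()}


def get_country_code(country_name):
    """Return the two-letter code for a country name, or None if unknown."""
    name = NAME_ALIASES.get(country_name, country_name)
    return NAME_TO_CODE.get(name)


print(get_country_code('Yemen, Rep.'))
print(get_country_code('Arab World'))
```

Regions and economic groups such as Arab World still return None, which is the desired result because they are not countries.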
3 Drawing the world map
With the country codes in hand, making a world map is a breeze. Pygal provides the chart type WorldMap (available as pygal.maps.world.World in the plugin version used here), which helps you create a world map that presents data for each country. To demonstrate how to use it, let's create a simple map that highlights North, Central, and South America:
import pygal

# The code in the book is obsolete; use the current API instead.
wm = pygal.maps.world.World()
wm.title = 'North, Central, and South America'

wm.add('North America', ['ca', 'mx', 'us'])
wm.add('Central America', ['bz', 'cr', 'gt', 'hn', 'ni', 'pa', 'sv'])
wm.add('South America', ['ar', 'bo', 'br', 'cl', 'co', 'ec', 'gf',
                         'gy', 'pe', 'py', 'sr', 'uy', 've'])

wm.render_to_file('americas.svg')
(1) We create a World instance (WorldMap in the book) and set the map's title attribute.
(2) We call the method add(), which accepts a label and a list containing the country codes of the countries we want to highlight. Each call to add() picks a new color for the specified countries and shows that color and the label in the legend on the left side of the chart. We want all of North America in the same color, so the list passed to the first add() call contains 'ca', 'mx', and 'us', highlighting Canada, Mexico, and the United States. We then do the same for the countries of Central and South America.
(3) The method render_to_file() creates an .svg file containing the chart, which you can open in a browser. The output is a map that highlights North, Central, and South America in different colors.
4 Drawing a complete world population map
To show population figures for the other countries, the data processed earlier must be converted to the dictionary format Pygal requires: the keys are two-letter country codes and the values are populations. To do this, add the following code to world_population.py:
import json

import pygal

from country_codes import get_country_code

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Build a dictionary of populations keyed by country code
cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            cc_populations[code] = population

wm = pygal.maps.world.World()
wm.title = 'World Population in 2010, by Country'
wm.add('2010', cc_populations)

wm.render_to_file('world_population.svg')
The result is as follows:
5 Grouping countries by population
India and China have far larger populations than other countries, but in the current map their colors differ little from those of other countries. China and India each have more than 1 billion people, while the next most populous country, the United States, has only about 300 million. Rather than treating all countries as a single group, we divide them into three groups by population: fewer than 10 million, between 10 million and 1 billion, and more than 1 billion.
import json

import pygal

from country_codes import get_country_code

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Build a dictionary of populations keyed by country code
cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            cc_populations[code] = population

# Divide the countries into three groups by population
cc_pops_1, cc_pops_2, cc_pops_3 = {}, {}, {}
for cc, pop in cc_populations.items():
    if pop < 10000000:
        cc_pops_1[cc] = pop
    elif pop < 1000000000:
        cc_pops_2[cc] = pop
    else:
        cc_pops_3[cc] = pop

# See how many countries each group contains
print(len(cc_pops_1), len(cc_pops_2), len(cc_pops_3))

wm = pygal.maps.world.World()
wm.title = 'World Population in 2010, by Country'
# wm.add('2010', cc_populations)
wm.add('0-10m', cc_pops_1)
wm.add('10m-1bn', cc_pops_2)
wm.add('>1bn', cc_pops_3)

wm.render_to_file('world_population.svg')
The results are as follows:
The map now uses three different colors, which makes the differences in population visible. Within each group, countries are shaded from light to dark as their population increases.
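The grouping step can also be written as a reusable function so that the thresholds are easy to change. The following is a sketch rather than part of the original program: the function name group_by_population and its parameters are assumptions, and the sample populations are rough illustrative figures, not values from population_data.json.

```python
def group_by_population(cc_populations, small=10_000_000, large=1_000_000_000):
    """Split a {country_code: population} dict into three dicts by size."""
    low, mid, high = {}, {}, {}
    for cc, pop in cc_populations.items():
        if pop < small:
            low[cc] = pop
        elif pop < large:
            mid[cc] = pop
        else:
            high[cc] = pop
    return low, mid, high


# Rough illustrative figures only.
sample = {'ad': 80_000, 'us': 300_000_000, 'cn': 1_300_000_000}
low, mid, high = group_by_population(sample)
print(len(low), len(mid), len(high))
```

Each returned dictionary can be passed to wm.add() with its own label, exactly as in the program above.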
6 Styling the world map with Pygal
In this map, grouping countries by population works well, but the default color settings are hard on the eyes. Here, for example, Pygal chose a vivid pink and green color scheme. Below we use Pygal's style settings to adjust the colors. We again let Pygal work from a base color, but this time we specify the base color ourselves and make the colors of the three groups more distinct:
import json

import pygal
from pygal.style import RotateStyle

from country_codes import get_country_code

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Build a dictionary of populations keyed by country code
cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            cc_populations[code] = population

# Divide the countries into three groups by population
cc_pops_1, cc_pops_2, cc_pops_3 = {}, {}, {}
for cc, pop in cc_populations.items():
    if pop < 10000000:
        cc_pops_1[cc] = pop
    elif pop < 1000000000:
        cc_pops_2[cc] = pop
    else:
        cc_pops_3[cc] = pop

# See how many countries each group contains
print(len(cc_pops_1), len(cc_pops_2), len(cc_pops_3))

wm_style = RotateStyle('#336699')
wm = pygal.maps.world.World(style=wm_style)
wm.title = 'World Population in 2010, by Country'
# wm.add('2010', cc_populations)
wm.add('0-10m', cc_pops_1)
wm.add('10m-1bn', cc_pops_2)
wm.add('>1bn', cc_pops_3)

wm.render_to_file('world_population.svg')
Pygal styles are stored in the module style, and we import the style RotateStyle from that module. When you create an instance of this class, you provide one argument: an RGB color in hexadecimal format. Pygal chooses the colors for each group based on that color. A hexadecimal RGB color is a string that starts with a pound sign (#) followed by six characters: the first two represent the red component, the next two the green component, and the last two the blue component. Each component ranges from 00 (none of that color) to FF (the maximum amount of that color). If you search online for "hex color chooser" (or "hex colour picker"), you can find tools that let you try out different colors and display their RGB values. The color value used here (#336699) mixes a small amount of red (33), a little more green (66), and more blue still (99), which gives RotateStyle a light blue base color.
The effect is as follows:
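To see how the three components are encoded, the string can be split and each pair of characters read as a base-16 number. This is a small standalone sketch; the helper name hex_to_rgb is an assumption made for illustration and is not part of Pygal.

```python
def hex_to_rgb(color):
    """Split a '#RRGGBB' string into integer (red, green, blue) components."""
    digits = color.lstrip('#')
    if len(digits) != 6:
        raise ValueError("expected a color such as '#336699'")
    # Read each two-character pair as a hexadecimal number (0 to 255).
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb('#336699'))  # (51, 102, 153)
```

The blue component (153) is the largest of the three, which is why the base color appears blue.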
7 Lightening the color theme
Pygal uses a fairly dark color theme by default. To make printing easier, I used LightColorizedStyle to lighten the map's colors. This class changes the theme of the entire chart, including the background color, the labels, and the colors of the individual countries. To use this style, first import it:
from pygal.style import LightColorizedStyle, RotateStyle
Then use RotateStyle to create a style, passing in an additional argument, base_style:
wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
The final code is as follows:
import json

import pygal
from pygal.style import RotateStyle, LightColorizedStyle

from country_codes import get_country_code

# Load the data into a list
filename = 'population_data.json'
with open(filename) as f:
    pop_data = json.load(f)

# Build a dictionary of populations keyed by country code
cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2010':
        country_name = pop_dict['Country Name']
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            cc_populations[code] = population

# Divide the countries into three groups by population
cc_pops_1, cc_pops_2, cc_pops_3 = {}, {}, {}
for cc, pop in cc_populations.items():
    if pop < 10000000:
        cc_pops_1[cc] = pop
    elif pop < 1000000000:
        cc_pops_2[cc] = pop
    else:
        cc_pops_3[cc] = pop

# See how many countries each group contains
print(len(cc_pops_1), len(cc_pops_2), len(cc_pops_3))

wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
wm = pygal.maps.world.World(style=wm_style)
wm.title = 'World Population in 2010, by Country'
# wm.add('2010', cc_populations)
wm.add('0-10m', cc_pops_1)
wm.add('10m-1bn', cc_pops_2)
wm.add('>1bn', cc_pops_3)

wm.render_to_file('world_population.svg')
To be continued. Today is the second day of the New Year holiday; keep going!
Python Project Practice II (Downloading Data), Part 4