Modified by SunJackson

Creating a Choropleth Map of the World in Python using GeoPandas

There are different ways of creating choropleth maps in Python. In a previous notebook, I showed how you can use the Basemap library to accomplish this. More than 2 years have passed since its publication, and the available tools have evolved a lot. In this notebook I use the GeoPandas library to create a choropleth map. As you'll see, the code is more concise and easier to follow.

To allow for a better comparison of the 2 approaches, I'll again use a World Bank dataset. To make it a bit more interesting, the map will show the share of individuals using the Internet (% of population), based on the International Telecommunication Union, World Telecommunication/ICT Development Report and database.

Setup

Load the necessary modules, specify the input and output files, and set the number of colors to use, the size of the figure in inches (width, height), and the meta information about what is displayed.

%load_ext signature
%matplotlib inline
import os

import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd

from helpers import slug

datafile = os.path.expanduser('~/data/worldbank/API_IT.NET.USER.ZS_DS2_en_csv_v2.csv')
shapefile = os.path.expanduser('~/data/geo/naturalearthdata.com/ne_10m_admin_0_countries_lakes/ne_10m_admin_0_countries_lakes.shp')

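colors = 9  # number of color classes (k) in the choropleth
cmap = 'Blues'  # matplotlib colormap name
figsize = (16, 10)  # figure size in inches (width, height)
year = '2016'  # column of the World Bank CSV to plot
cols = ['Country Name', 'Country Code', year]  # CSV columns to load
title = 'Individuals using the Internet (% of population) in {}'.format(year)
imgfile = 'img/{}.png'.format(slug(title))  # output path derived from the title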

description = '''
Individuals who have used the Internet from any location in the last 3 months via any device based on the International Telecommunication Union,
World Telecommunication/ICT Development Report and database. Data: World Bank - worldbank.org • Author: Ramiro Gómez - ramiro.org'''.strip()
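The slug function is imported from a local helpers module that is not shown in this notebook. As a rough idea of what such a helper does, here is a minimal sketch; it is a hypothetical stand-in, not the original implementation, and the real helper may treat punctuation or non-ASCII characters differently.

```python
import re


def slug(text):
    """Hypothetical stand-in for helpers.slug: lowercase the text and
    join its runs of ASCII letters and digits with hyphens."""
    return '-'.join(re.findall(r'[a-z0-9]+', text.lower()))


title = 'Individuals using the Internet (% of population) in 2016'
imgfile = 'img/{}.png'.format(slug(title))
print(imgfile)
```

With a helper like this, the image filename is derived from the title, so changing the year changes the output file as well.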

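Create a GeoDataFrame from the Admin 0 - Countries shapefile available from Natural Earth Data and show a sample of 5 records. We only keep the ADM0_A3 and geometry columns, which contain the 3-letter country codes defined in ISO 3166-1 alpha-3 and the country shapes as polygons respectively. The call to to_crs('+proj=robin') reprojects the shapes to the Robinson projection, which is why the coordinates below are in meters rather than degrees.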

gdf = gpd.read_file(shapefile)[['ADM0_A3', 'geometry']].to_crs('+proj=robin')
gdf.sample(5)
    ADM0_A3  geometry
52  CPV      (POLYGON ((-2309170.347710123 1587430.58249633...
100 HTI      (POLYGON ((-6860253.524149654 1936562.75358100...
253 ZMB      POLYGON ((2934183.839812623 -918610.8758938238...
241 VAT      POLYGON ((1073492.780621549 4478198.522616305,...
219 SYR      POLYGON ((3738892.582840674 3968612.091458879,...

Next, read the datafile downloaded from the World Bank Open Data site and create a pandas DataFrame that contains the values for Country Name, Country Code and the percentage of Internet users in the year 2016.

df = pd.read_csv(datafile, skiprows=4, usecols=cols)
df.sample(5)
     Country Name      Country Code  2016
144  Macao SAR, China  MAC           81.642985
157  Malta             MLT           77.289395
204  Sudan             SDN           28.000000
165  Mauritius         MUS           53.226178
11   Australia         AUS           88.238658
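The skiprows=4 argument is there because the World Bank export puts a few lines of metadata ahead of the actual header row. The sketch below mimics that layout with an in-memory file. The four metadata lines are placeholders I made up, and only the two values that appear in the sample above are used.

```python
import io

import pandas as pd

# Stand-in for the downloaded CSV: 4 placeholder lines, then the header row.
raw = (
    'placeholder metadata line 1\n'
    'placeholder metadata line 2\n'
    'placeholder metadata line 3\n'
    'placeholder metadata line 4\n'
    'Country Name,Country Code,Indicator Name,2016\n'
    'Malta,MLT,Individuals using the Internet (% of population),77.289395\n'
    'Sudan,SDN,Individuals using the Internet (% of population),28.0\n'
)

demo_cols = ['Country Name', 'Country Code', '2016']
# skiprows drops the leading lines; usecols keeps only the needed columns.
df_demo = pd.read_csv(io.StringIO(raw), skiprows=4, usecols=demo_cols)
print(df_demo)
```

Restricting the columns with usecols keeps the DataFrame small, since the full export carries one column per year.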

Next we merge the data frames on the columns containing the 3-letter country codes and show summary statistics as returned from the describe method.

merged = gdf.merge(df, left_on='ADM0_A3', right_on='Country Code')
merged.describe()
       2016
count  201.000000
mean   51.417892
std    28.569949
min    1.177119
25%    25.366301
50%    54.000000
75%    76.409085
max    98.240016
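One thing to keep in mind is that merge performs an inner join by default, so shapes whose code has no counterpart in the World Bank table are dropped without any warning. A quick way to check for such cases is a left join with indicator=True. The sketch below uses small plain DataFrames as stand-ins for the real tables, and the code 'XXX' is made up to force a mismatch.

```python
import pandas as pd

# Toy stand-ins for the shapefile attributes and the World Bank table.
shapes = pd.DataFrame({'ADM0_A3': ['AUS', 'MLT', 'XXX']})  # 'XXX' is a made-up code
values = pd.DataFrame({
    'Country Code': ['AUS', 'MLT'],
    '2016': [88.238658, 77.289395],
})

# The default inner join silently drops the unmatched shape.
inner = shapes.merge(values, left_on='ADM0_A3', right_on='Country Code')

# A left join with indicator=True adds a _merge column that flags it.
check = shapes.merge(values, left_on='ADM0_A3', right_on='Country Code',
                     how='left', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'ADM0_A3'].tolist()
print(len(inner), unmatched)
```

Running the same check on the real data would list the territories in the shapefile that the World Bank dataset does not cover under the same code.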

Mapping the data

The merge operation above returned a GeoDataFrame. From this data structure it is very easy to create a choropleth map by invoking the plot method. We need to specify the column to plot, and since we don't want a continuous color scale, we set scheme to equal_interval and the number of classes k to 9. We also set the size of the figure and show a legend in the plot.

ax = merged.dropna().plot(column=year, cmap=cmap, figsize=figsize, scheme='equal_interval', k=colors, legend=True)
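To make the equal_interval scheme more concrete: it splits the range between the smallest and the largest value into k classes of equal width. The actual classification is delegated to a separate library, so the following is only a sketch of the idea in plain Python, using the min and max from the summary statistics above. The function name is my own.

```python
def equal_interval_edges(lo, hi, k):
    """Return the k upper class bounds that split [lo, hi] into
    k bins of equal width (illustrative helper, not a library call)."""
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


# min and max of the 2016 column as reported by describe() above
edges = equal_interval_edges(1.177119, 98.240016, 9)
print([round(edge, 2) for edge in edges])
```

Each class spans roughly 10.8 percentage points here, which is why the legend labels are not round numbers.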

This is pretty nice already, but before publishing this map, there remains some work to be done. As is often the case, some data is missing. You may or may not have noticed it, but the corresponding countries are not shown at all. Look for North Korea, for example. The call to dropna() right before the plot() call removed these records from the plotted GeoDataFrame.

We could just leave it like that, because we simply don't know the values, but I'm sure that would put off some people. So let's draw these countries and fill them with a light gray and a striped pattern as in this D3.js based map.

Moreover, the image taken by itself provides no clue about what is shown, so we'll add a title and an annotation. We also turn off the axes, cut off some space in the far west and east, and move the legend to the lower left of the figure, where there is more empty space.

merged[merged.isna().any(axis=1)].plot(ax=ax, color='#fafafa', hatch='///')

ax.set_title(title, fontdict={'fontsize': 20}, loc='left')
ax.annotate(description, xy=(0.1, 0.1), size=12, xycoords='figure fraction')

ax.set_axis_off()
ax.set_xlim([-1.5e7, 1.7e7])
ax.get_legend().set_bbox_to_anchor((.12, .4))
ax.get_figure()
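The two row selections used for plotting complement each other: dropna() keeps the rows without missing values, while isna().any(axis=1) selects exactly the rows that dropna() discards. A small sketch with a plain DataFrame as stand-in for the merged GeoDataFrame, where the NaN for PRK (North Korea) represents the missing value:

```python
import pandas as pd

demo = pd.DataFrame({
    'Country Code': ['AUS', 'PRK', 'MLT'],
    '2016': [88.238658, float('nan'), 77.289395],
})

present = demo.dropna()                  # rows drawn in the choropleth
missing = demo[demo.isna().any(axis=1)]  # rows drawn hatched in light gray
print(missing['Country Code'].tolist())
```

Since every row ends up in exactly one of the two selections, plotting both on the same axes covers all countries in the merged data.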

I think this map is fine for publication and the code is pretty easy to follow, but there is some room for improvement as far as I'm concerned. For example, it would be nice to reduce the left and right margins, turn off the axis display, and specify a color and/or pattern for missing data via parameters to the plot method. In any case, I think the GeoPandas project is headed in a good direction and hope it will continue to evolve as a library for analyzing and mapping geographic data in Python.

%signature

Author: Ramiro Gómez • Last edited: March 23, 2018 • Linux 4.13.0-37-generic - CPython 3.6.4 - IPython 6.2.1 - matplotlib 2.2.2 - numpy 1.14.2 - pandas 0.22.0

Published: March 20, 2018 by Ramiro Gómez.

© Ramiro Gómez. Berlin, Germany.
