Python and Folium

Creating Interactive Maps with Python


Introduction

I recently took a course on GIS programming with Python.  Like most of the courses in my program, it leaned heavily on the ESRI platform, so it was mostly about ArcPy with a short Python introduction.  In it we created a process to determine the optimal path for a pipeline.  That project can be viewed here.

In my day-to-day work life I leverage a lot of GIS, but we do not have ArcGIS Pro.  I do much of my GIS-type work in Global Mapper and Spotfire.  This has led me to want to learn more about creating maps with Python and the various packages that are available.

The following map was created using Python, Folium and Pandas within the Google Colab environment, saved as an HTML file and hosted on GitHub.  It demonstrates a simple use case: displaying a basic interactive choropleth map using simple, open source tools.

Calgary Population

Percentage of Population Contained within a Ward (2019)

 Data: City of Calgary.  Keith Johnson, January 13, 2023

 Let's Build!

I am going to assume you have at least a little experience running Python in a notebook or something similar.  I used Google Colab.  To make this work we are going to need a few things:
  • Population data for each ward
  • Data to define the ward boundaries
  • The packages loaded into Google Colab

The Data

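Like many cities, Calgary has a wealth of data available online through Open Calgary.  Both of the required data sets were downloaded from this website: the 2019 census by ward, saved as Census_by_Ward_2019.csv, and the ward boundaries, saved as Ward Boundaries.geojson.

The first thing we need to do is import the packages we will be using.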

  
	import folium
	from folium import plugins
	import pandas as pd
  
 

If the Google Colab notebook fails to load folium, you can install it for the notebook by executing:


  
	!pip install folium
  
 

Setting up the Population Data

The next thing we need to do is set up our population data.  We can do this by simply reading the .csv file we downloaded into a Pandas dataframe.  I then sliced off the two columns I wanted, ward number and population, for ease of use.

  
	# Read the census table, then copy the two columns of interest into their own dataframe
	df=pd.read_csv("Census_by_Ward_2019.csv")
	popdf=df[["WARD_NUM","RES_CNT"]].copy()
  
 

Next I created a percentage column so I could display each ward as a percentage of the total population.

  
	popdf["pctpop"]=popdf["RES_CNT"]/popdf["RES_CNT"].sum()*100
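
As a quick sanity check on the percentage step, here is a minimal sketch on a made-up three-ward table.  The numbers are invented for illustration and are not Calgary's census figures.  It takes an explicit .copy() of the sliced columns, which, as far as I understand, is the usual way to keep pandas from warning about writing to a slice, and then confirms that the shares add up to 100.

```python
import pandas as pd

# Made-up stand-in for the census table; not real Calgary data
df = pd.DataFrame({
    "WARD_NUM": [1, 2, 3],
    "RES_CNT": [100, 250, 150],
    "OTHER": ["a", "b", "c"],
})

# Copy the slice so the new column is added to an independent dataframe
popdf = df[["WARD_NUM", "RES_CNT"]].copy()

# Each ward's residents as a percentage of the total
popdf["pctpop"] = popdf["RES_CNT"] / popdf["RES_CNT"].sum() * 100

print(popdf)
print("total:", popdf["pctpop"].sum())
```

If the printed total is not (very nearly) 100, something went wrong in the slice or the sum.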
  
 

Prepping the Population Data for Association with the GeoJSON

It is also important to note that we will need to be able to relate a column from our data to a property in the GeoJSON file.  If we look at our current dataframe we will see that our ward number is currently stored as an integer:

  
	popdf.info()
  
 
	<class 'pandas.core.frame.DataFrame'>
	RangeIndex: 14 entries, 0 to 13
	Data columns (total 3 columns):
	 #   Column    Non-Null Count  Dtype
	---  ------    --------------  -----
	 0   WARD_NUM  14 non-null     int64
	 1   RES_CNT   14 non-null     int64
	 2   pctpop    14 non-null     float64
	dtypes: float64(1), int64(2)
	memory usage: 464.0 bytes

If we look at the ward number in the GeoJSON file we will see it is a string, which pandas reports as the object dtype.  We can check this with GeoPandas:

  
	import geopandas as gpd
	gpd.read_file("Ward Boundaries.geojson").info()
  
 
	<class 'geopandas.geodataframe.GeoDataFrame'>
	RangeIndex: 14 entries, 0 to 13
	Data columns (total 4 columns):
	 #   Column      Non-Null Count  Dtype
	---  ------      --------------  -----
	 0   councillor  14 non-null     object
	 1   ward_num    14 non-null     object
	 2   label       14 non-null     object
	 3   geometry    14 non-null     geometry
	dtypes: geometry(1), object(3)
	memory usage: 576.0+ bytes

Because of this, our two columns are not going to match.  We will need to change the data type of the ward number in our dataframe to match the GeoJSON file so the choropleth can be drawn properly.  Our full set of code for reading in the population data, slicing off the two columns of interest, changing the ward number data type and calculating the percentage is these four lines.

  
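		df=pd.read_csv("Census_by_Ward_2019.csv")
		popdf=df[["WARD_NUM","RES_CNT"]].copy()  # copy so new columns go on an independent dataframe
		popdf["ward_num"]=popdf["WARD_NUM"].astype(str)  # match the string ward_num in the GeoJSON
		popdf["pctpop"]=popdf["RES_CNT"]/popdf["RES_CNT"].sum()*100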
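
To see why the cast matters, here is a minimal sketch that compares the keys on both sides before and after it.  The dataframe and the GeoJSON dictionary are made-up miniature stand-ins (three wards, no geometry), so only the ward_num property name comes from the real file.

```python
import pandas as pd

# Made-up stand-ins: integer ward numbers in the table, string ward numbers in the GeoJSON
popdf = pd.DataFrame({"WARD_NUM": [1, 2, 3], "RES_CNT": [100, 250, 150]})
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ward_num": str(n)}, "geometry": None}
        for n in (1, 2, 3)
    ],
}

# Keys as the GeoJSON stores them (strings)
geo_keys = {feature["properties"]["ward_num"] for feature in geojson["features"]}

# An integer never equals a string, so nothing lines up yet
unmatched_before = geo_keys - set(popdf["WARD_NUM"])

# After the cast, every GeoJSON key has a partner in the dataframe
popdf["ward_num"] = popdf["WARD_NUM"].astype(str)
unmatched_after = geo_keys - set(popdf["ward_num"])

print("unmatched before cast:", sorted(unmatched_before))
print("unmatched after cast:", sorted(unmatched_after))
```

Running the same comparison against the real files before building the map is a cheap way to catch this kind of mismatch early.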
  
 

Creating the Map with Folium

I am not going to pretend I am an expert in any of this, but some quick reading tells us that folium is a Python wrapper for the Leaflet.js library that allows us to create an interactive map for use on the web.  How that all works is beyond this post and, frankly, beyond my knowledge.

To create a choropleth we will need to define a set of bins that control how the data is grouped.  There are many methods that can be used to create these, but they also lie beyond the scope of this post.  For this example I am using a simple defined interval of 1%, with bin edges running from 5 to 9.

  
		bins=[5,6,7,8,9]
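
With hand-picked bins it is worth confirming that every value actually falls between the lowest and highest edge; as far as I can tell, folium raises an error when a value lies outside the supplied bins.  The sketch below does that check in plain Python.  The percentages are made up, bin_label is a hypothetical helper that is not part of folium, and the way folium itself treats a value sitting exactly on an edge is not something this sketch claims to reproduce.

```python
from bisect import bisect_right

bins = [5, 6, 7, 8, 9]

# Made-up percentages keyed by ward number; not the real Calgary values
pctpop = {"1": 7.4, "2": 5.2, "3": 8.9, "4": 9.6}


def bin_label(value, edges):
    """Return the 'low-high' interval of edges containing value, or None if it lies outside."""
    if value < edges[0] or value > edges[-1]:
        return None
    # Clamp so a value equal to the top edge lands in the last interval
    upper = min(bisect_right(edges, value), len(edges) - 1)
    return f"{edges[upper - 1]}-{edges[upper]}"


labels = {ward: bin_label(value, bins) for ward, value in pctpop.items()}
outside = sorted(ward for ward, label in labels.items() if label is None)

print(labels)
print("wards outside the bins:", outside)
```

Any ward reported as outside means the bin edges need to be widened before the choropleth is drawn.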
  
 

The first line of code creates the folium map object called "m" and centers it over Calgary at an appropriate zoom level.  The location is given as [latitude, longitude].

  
		m=folium.Map(location=[51.0447,-114.0710],zoom_start=10)
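
I typed the center coordinates in by hand, but they could also be derived from the boundary file.  The sketch below is one hedged way to do it: bbox_center is a hypothetical helper (not part of folium) that takes the middle of the bounding box of all the coordinates, and the sample polygon is a made-up square rather than the real ward boundaries.  Note that GeoJSON stores positions as [longitude, latitude], while folium's location wants [latitude, longitude].

```python
def bbox_center(geojson):
    """Return [lat, lon] for the middle of the bounding box around every feature."""
    lons, lats = [], []

    def collect(coords):
        # A position is a flat [lon, lat] pair; anything else is a nested list
        if isinstance(coords[0], (int, float)):
            lons.append(coords[0])
            lats.append(coords[1])
        else:
            for part in coords:
                collect(part)

    for feature in geojson["features"]:
        collect(feature["geometry"]["coordinates"])

    return [(min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2]


# Made-up square roughly around Calgary, standing in for the ward polygons
sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-114.2, 51.0], [-114.0, 51.0], [-114.0, 51.1],
                [-114.2, 51.1], [-114.2, 51.0],
            ]],
        },
    }],
}

center = bbox_center(sample)
print(center)
```

The resulting list could be passed as the location argument in place of the hard-coded pair.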
  
 

Our next line of code is a long one and is used to create the choropleth map.

  
		folium.Choropleth(geo_data="Ward Boundaries.geojson",
				key_on="feature.properties.ward_num",
				data=popdf,
				columns=["ward_num","pctpop"],
				threshold_scale=bins,
				fill_color='YlOrRd',
				fill_opacity=0.7,
				line_opacity=0.5).add_to(m)
  
 

Let's break this down.
  • geo_data: The path to the GeoJSON that will draw our boundaries.
  • key_on: The property in the GeoJSON to associate with the population dataframe.  Opening the file in a text editor will reveal the name to use, but as far as I can tell it will generally be in the form of "feature.properties.xxxxxxx"
  • data: The dataframe that holds the data to control the shading.
  • columns: Two columns from the dataframe.  The first is matched to the GeoJSON and the second holds the data to plot the choropleth colors.
  • threshold_scale: The list of bin edges used to shade the choropleth.  As far as I can tell, recent folium releases prefer the name bins for this parameter and treat threshold_scale as deprecated.
  • fill_color: The color palette to use for the fill.  'YlOrRd' is the ColorBrewer yellow-orange-red ramp, and you can find the other palette names with a quick Google search.
  • fill_opacity: Opacity control for the fill.
  • line_opacity: Opacity control for the lines.
  • .add_to(m): DO NOT FORGET THIS FINAL PIECE.  It tells folium to add the layer to our map object, "m".
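
Instead of opening the file in a text editor, the candidate key_on values can also be listed with the standard json module.  This is only a sketch: the GeoJSON text below is a hypothetical, heavily trimmed stand-in for "Ward Boundaries.geojson" with invented values, and only the three property names are taken from the real file.

```python
import json

# Hypothetical, heavily trimmed stand-in for "Ward Boundaries.geojson"
raw = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "properties": {"councillor": "Example Name", "ward_num": "1", "label": "WARD 1"},
    "geometry": null}
 ]}
"""

geojson = json.loads(raw)
first = geojson["features"][0]

# Each property name becomes a candidate value for key_on,
# paired with the Python type of its value
candidates = {
    f"feature.properties.{name}": type(value).__name__
    for name, value in first["properties"].items()
}

for key_on, type_name in candidates.items():
    print(key_on, "->", type_name)
```

Seeing the type next to each name also flags the string-versus-integer mismatch described earlier.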

I added a few other things to this map to allow for some basic pop-ups.

  
		folium.GeoJson("Ward Boundaries.geojson",
				popup=folium.GeoJsonPopup(fields=['councillor',
					'label'])).add_to(m)
  
 

I also added a mini map.
  
		minimap=plugins.MiniMap()
		m.add_child(minimap)
  
 

Finally, I saved it to an HTML file and displayed it within the notebook.

  
		m.save("calgary.html")
		m
  
 

Conclusion

And that is all there is to creating a simple interactive choropleth map.  The two largest issues I had in getting this to work were the type mismatch between the dataframe and the GeoJSON file, and forgetting the ".add_to(m)" on my choropleth line.  Once these two errors were identified and fixed it was smooth sailing.

There are a multitude of things that could be done to improve this, but it shows how simply, and with how little code, a useful map can be created.

References

Folium and Choropleth Map: From Zero to Pro: 
https://towardsdatascience.com/folium-and-choropleth-map-from-zero-to-pro-6127f9e68564

Create and Visualize Choropleth map with Folium:
https://medium.com/analytics-vidhya/create-and-visualize-choropleth-map-with-folium-269d3fd12fa0

Lab: GeoJSON and choropleth maps:
http://comet.lehman.cuny.edu/owen/teaching/datasci/choroplethLab.html
