Graphical Mapping with R

Mapping with R is a real joy. To see some examples, we’ll rely on the maps package.

Mapping the World

Getting the longitude and latitude

First of all, we need to get the long and lat data in order to plot with ggplot2 which provides graphs with more aesthetic. (for an introduction to the ggplot2 package, just click here). To do that, we’ll use the map_data() function which belongs to ggplot2 and put as a parameter the world map provided by the maps package.

library(maps)
library(tidyverse)
world <- map_data("world")

head(world)
##        long      lat group order region subregion
## 1 -69.89912 12.45200     1     1  Aruba      <NA>
## 2 -69.89571 12.42300     1     2  Aruba      <NA>
## 3 -69.94219 12.43853     1     3  Aruba      <NA>
## 4 -70.00415 12.50049     1     4  Aruba      <NA>
## 5 -70.06612 12.54697     1     5  Aruba      <NA>
## 6 -70.05088 12.59707     1     6  Aruba      <NA>

To render our world geographical plot, we set long to the x-axis and lat to the y-axis. Finally, we set group = group (group is a variable that groups each observation according to the corresponding polygon).

ggplot(world, aes(x=long, y=lat, group=group)) +
  geom_polygon(fill="lightgreen", colour="black")

Mapping Regions

Now we’ll use the region parameter which extracts regional data from the world map. Let’us have a look to the region where I belong to, North Africa:

north_africa <- map_data("world", region = c(
  "Algeria",
  "Morocco",
  "Tunisia",
  "Egypt",
  "Western Sahara",
  "Libya"))

head(north_africa)
##       long      lat group order  region subregion
## 1 8.576563 36.93721     1     1 Algeria      <NA>
## 2 8.597656 36.88388     1     2 Algeria      <NA>
## 3 8.601269 36.83393     1     3 Algeria      <NA>
## 4 8.506739 36.78750     1     4 Algeria      <NA>
## 5 8.444238 36.76074     1     5 Algeria      <NA>
## 6 8.369629 36.63252     1     6 Algeria      <NA>

In order to differentiate among countries using colors, we set the parameter fill = region.

ggplot(north_africa, aes(x = long, y = lat, group = group, fill = region)) + 
  geom_polygon(colour = "black") + # colour = "black" is needed to emphasize borders
  theme_minimal()

Geographical Filling

Let us go deeper in our geographical plotting. Suppose we want to plot the French median revenue according to each borough (in french : “arrondissement”). The french median revenue data for 2015 is freely available at the INSEE website.

library(readxl)

fr_median <- read_excel("base-cc-filosofi-2015.xls", skip = 5) # keep only the ARR tab in the Excel File. 

Furthermore, the SHP file for France is available in the European Environment Agency Data Base. In order to read our SHP file, we’ll rely on the sf package.

library(sf)

fr_map <- sf::st_read("C:/Users/Administrateur/Desktop/SHP_France/FRA_adm3.shp", stringsAsFactors = FALSE)
## Reading layer `FRA_adm3' from data source `C:\Users\Administrateur\Desktop\SHP_France\FRA_adm3.shp' using driver `ESRI Shapefile'
## Simple feature collection with 350 features and 13 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -5.143751 ymin: 41.33375 xmax: 9.560416 ymax: 51.0894
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
head(fr_map, n = 2)
## Simple feature collection with 2 features and 13 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 7.065385 ymin: 48.33202 xmax: 8.107299 ymax: 49.04169
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##   ID_0 ISO NAME_0 ID_1 NAME_1 ID_2   NAME_2 ID_3   NAME_3         TYPE_3
## 1   79 FRA France    1 Alsace    1 Bas-Rhin    1 Haguenau Arrondissement
## 2   79 FRA France    1 Alsace    1 Bas-Rhin    2 Molsheim Arrondissement
##   ENGTYPE_3 NL_NAME_3 VARNAME_3                       geometry
## 1 Districts      <NA>      <NA> MULTIPOLYGON (((7.70342 48....
## 2 Districts      <NA>      <NA> MULTIPOLYGON (((7.317291 48...

Now we’ll combine the fr_map and the fr_median databases. To do so, we need to provide a common key.The column NAME_3 in fr_map and the column LIBGEO in fr_median list our French boroughs. As mentioned previously, we must provide a unique key, therefore we’ll rename the NAME_3 as LIBGEO:

fr_map <- fr_map %>% rename(LIBGEO = NAME_3)

Now we’re ready to combine our databases and drop all values that do not correspond using the inner_join function provided by the tidyverse package.

final <- inner_join(fr_map, fr_median, by = "LIBGEO")

It’s possible to use the ggplot2 package however I’d rather suggest the tmap package:

library(tmap)

tm_shape(final)+
  tm_polygons("MED15", id  = "LIBGEO")

Avatar
Mohamed El Fodil Ihaddaden
Ph.D candidate in Economics.

My research interests include Performance Management, Data Envelopment Analysis and Artificial Intelligence applied to Economics.