Working with the French Open Data Portal using BARIS

The official French open data portal (data.gouv.fr) hosts a large volume of public data and exposes a well-structured API. The BARIS package lets you query this API from R to retrieve the data you need from the portal.

Within the portal, a dataset contains one or several resources (data frames). So whenever I use the term resource, read it as a data frame inside a dataset.

You can install the package from CRAN:

install.packages("BARIS")
library(BARIS)

Enough talk; let’s dive into a reproducible example.

BARIS_explain()

The BARIS_explain() function provides a description of a dataset. The function takes one argument which is the ID of the dataset:

BARIS_explain(datasetId = "5cebfa8306e3e77ffdb31ef5")
## [1] "Monuments historiques situés sur le territoire de Marseille, avec adresse, numéro de base Mérimée (base de données du Ministère de la Culture recensant les monuments historiques de toute la France) et points de géolocalisation"

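Don’t panic if you don’t speak French. This description reads roughly: "Historic monuments located in the territory of Marseille, with address, Mérimée database number (the Ministry of Culture database listing historic monuments across France) and geolocation points". For other datasets, you can always translate the text with the great googleLanguageR package.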

Now it’s time to list the resources contained in this dataset.

BARIS_resources()

The BARIS_resources() function lists the resources (data frames) available within a dataset. It takes the ID of the dataset as its argument:

BARIS_resources(datasetId = "5cebfa8306e3e77ffdb31ef5")
## Warning: `...` is not empty.
## 
## We detected these problematic arguments:
## * `needs_dots`
## 
## These dots only exist to allow future extensions and should be empty.
## Did you misspecify an argument?
## # A tibble: 2 x 6
##   id         title       format published   url             description         
##   <chr>      <chr>       <chr>  <chr>       <chr>           <chr>               
## 1 59ea7bba-~ MARSEILLE_~ csv    2019-05-27~ https://trouve~ Monuments historiqu~
## 2 6328f8b3-~ Plan des M~ pdf    2019-05-27~ https://trouve~ Edition Janvier 2013

As you can see above, the dataset has two resources: a csv and a pdf. Now we’ve reached the interesting part: extracting the data frame that you’ll work on!
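Since the resource listing is an ordinary table with id and format columns, picking out the resource you want can be scripted. The following is only a sketch: resource_ids_by_format() is a hypothetical helper, not a BARIS function, and the toy table uses placeholder IDs rather than real portal identifiers.

```r
# Hypothetical helper (not part of BARIS): return the IDs of the resources
# that have a given format, from a table with `id` and `format` columns.
resource_ids_by_format <- function(resources, format) {
  resources$id[tolower(resources$format) == tolower(format)]
}

# Toy stand-in for a BARIS_resources() result; the IDs are placeholders,
# not real portal identifiers.
toy_resources <- data.frame(
  id = c("placeholder-id-1", "placeholder-id-2"),
  format = c("csv", "pdf"),
  stringsAsFactors = FALSE
)

resource_ids_by_format(toy_resources, "csv")
```

On a real listing, the same call should work on the tibble returned by BARIS_resources(), assuming it keeps the id and format columns shown above.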

BARIS_extract()

Using BARIS_extract(), you can load the resource you need directly into your R session. Currently, “only” these formats are supported: json, csv, xls, xlsx, xml, geojson and shp. For any other format, you can always rely on the url of the resource to download it manually.
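To decide up front whether a resource can go through BARIS_extract() or needs a manual download, a small check against the list of formats above is enough. This is a minimal sketch: can_extract() and supported_formats are hypothetical names introduced here for illustration, not part of BARIS.

```r
# Formats listed in this post as supported by BARIS_extract().
supported_formats <- c("json", "csv", "xls", "xlsx", "xml", "geojson", "shp")

# Hypothetical helper (not part of BARIS): TRUE when a resource format
# can be passed to BARIS_extract(), FALSE when a manual download is needed.
can_extract <- function(format) {
  tolower(format) %in% supported_formats
}

# The two formats listed by BARIS_resources() for the Marseille dataset.
can_extract(c("csv", "pdf"))
## [1]  TRUE FALSE
```

For a resource that fails this check, such as the pdf, base R's download.file() can fetch the file from the url column of the resource listing.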

To use the function, you have to specify two arguments: the ID of the resource and its format.

Note that the two kinds of ID look different: the dataset ID used earlier is a 24-character hexadecimal string, while the resource ID is a hyphenated, UUID-style string.
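Mixing up the two kinds of ID is an easy mistake, so the difference in shape can be checked in code. The sketch below relies only on the two IDs shown in this post; guess_id_type() is a hypothetical helper, not a BARIS function, and the portal may well use other ID shapes that it does not recognise.

```r
# Hypothetical helper (not part of BARIS): guess whether a single ID refers
# to a dataset or a resource, based only on the two shapes seen in this post.
guess_id_type <- function(id) {
  uuid_pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
  if (grepl("^[0-9a-f]{24}$", id)) {
    "dataset"    # 24 hexadecimal characters, no hyphens
  } else if (grepl(uuid_pattern, id)) {
    "resource"   # hyphenated 8-4-4-4-12 hexadecimal groups
  } else {
    "unknown"
  }
}

guess_id_type("5cebfa8306e3e77ffdb31ef5")
## [1] "dataset"
guess_id_type("59ea7bba-f38a-4d75-b85f-2d1955050e53")
## [1] "resource"
```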

data <- BARIS_extract(resourceId = "59ea7bba-f38a-4d75-b85f-2d1955050e53", format = "csv")

head(data)
## Warning: `...` is not empty.
## 
## We detected these problematic arguments:
## * `needs_dots`
## 
## These dots only exist to allow future extensions and should be empty.
## Did you misspecify an argument?
## # A tibble: 6 x 10
##   n_base_merimee date_de_protect~ denomination adresse code_postal
##   <chr>          <chr>            <chr>        <chr>         <int>
## 1 PA00081336     Classement : li~ Ancienne ég~ "/"           13002
## 2 PA00081340     Classement: 13/~ Eglise Sain~ "Espla~       13002
## 3 PA00081331     Classement: 29/~ Chapelle et~ "2, Ru~       13002
## 4 PA00081344     Classement: 16/~ Fort Saint-~ ""            13002
## 5 PA00081325     Inscription : 2~ Les deux bâ~ "Quai ~       13002
## 6 PA00081334     Inscription : 0~ Clocher des~ "Monté~       13002
## # ... with 5 more variables: proprietaire_du_monument <chr>,
## #   epoque_de_construction <chr>, date_de_construction <chr>, longitude <dbl>,
## #   latitude <dbl>

Mohamed El Fodil Ihaddaden
Ph.D candidate in Economics.

My research interests include Performance Management, Efficiency Analysis and Experimental Economics.