FAOSTAT-package 3.0

An overhaul of the R wrapper for the FAOSTAT API

Sebastian Campbell

March 18, 2024

The core mission

To update the FAOSTAT package to allow R users to consume FAO’s API data easily

What is FAO?

Food and Agriculture Organization

FAO

Goal is to:

…achieve food security for all and make sure that people have regular access to enough high-quality food to lead active, healthy lives.

ESS in FAO

Produce up-to-date statistics

Develop and promote international food and agricultural statistical standards, methods and tools

Work directly with countries to develop national statistical capacity

What is FAOSTAT?

  • FAOSTAT is the Food and Agriculture Organization’s (FAO) tool for disseminating the statistical data it produces

  • Not all data

    • Fisheries and Forestry are separate

How do we get data from FAOSTAT?

  • Web interface
    • Data explorer
    • CSV exporter
    • Bulk download
    • Web interface API
  • FAOSTAT3 API

The core problem

What is an API?

  • Application Programming Interface
  • Allows different pieces of software to communicate with each other
  • Similar to how a steering wheel lets you drive any car without knowing how it works inside

REST APIs

  • HTTP-based
  • Use mainly GET and POST

REST example

GET - Request data from the server

  • Request: GET https://example.com/johnsmith/info
  • Payload: None
  • Response: {"username": "johnsmith", "full_name": "John Smith"}
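
As a minimal sketch of what a client does with such a response, the snippet below parses the response body into an R object. It assumes the jsonlite package is installed, and the body is the illustrative one from this slide written as valid JSON, not output from a real server.

```r
# Assumption: jsonlite is installed. The response body is illustrative only.
library(jsonlite)

response_body <- '{"username": "johnsmith", "full_name": "John Smith"}'

# Parse the JSON text into a named R list
user <- fromJSON(response_body)

# Fields are then available by name
user$username
user$full_name
```

In a real client the body would come from the HTTP response rather than a string literal.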

What is the FAOSTAT package?

  • API wrapper that lets R users call the FAOSTAT API through R functions
  • Allows users to pull in FAOSTAT data

Why do R users need a package?

  • Easily accessible documentation
  • No need to convert JSON to R objects (tables)
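
To illustrate the conversion step a wrapper saves, here is a minimal base R sketch. It assumes a parsed JSON response arrives as a list of records; the field names and record layout are hypothetical and are not the actual FAOSTAT response format.

```r
# Hypothetical records, as a JSON parser might return them (one list per row)
records <- list(
  list(area = "Afghanistan", year = 1961L, value = 65286),
  list(area = "Afghanistan", year = 1962L, value = 65286)
)

# Turn each record into a one-row data frame, then stack the rows
rows <- lapply(records, as.data.frame)
tbl <- do.call(rbind, rows)

str(tbl)
```

Doing this by hand for every endpoint is the kind of repetitive work a package can hide behind a single function call.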

Quick demo of pre-existing functionality

library(FAOSTAT)
# Download the full Land Use ("RL") domain as a data frame
landuse <- get_faostat_bulk("RL")
head(landuse)
  area_code area_code__m49_        area item_code         item element_code
1         2            '004 Afghanistan      6600 Country area         5110
2         2            '004 Afghanistan      6600 Country area         5110
3         2            '004 Afghanistan      6600 Country area         5110
4         2            '004 Afghanistan      6600 Country area         5110
5         2            '004 Afghanistan      6600 Country area         5110
6         2            '004 Afghanistan      6600 Country area         5110
  element year_code year    unit value flag note
1    area      1961 1961 1000 ha 65286    A     
2    area      1962 1962 1000 ha 65286    A     
3    area      1963 1963 1000 ha 65286    A     
4    area      1964 1964 1000 ha 65286    A     
5    area      1965 1965 1000 ha 65286    A     
6    area      1966 1966 1000 ha 65286    A     

Custodianship

  • Developed by FAO employees
  • Currently maintained by Paul Rougieux at the European Commission

Existing documentation

  • A single JSON file
  • A Word document
    • Only covers a subset of functionality

Challenges

  • Old API doesn’t work
  • New API is largely undocumented
  • Outdated functions

We need to get from here

To here

FAOSTAT 2.3.0

  • Completed in March 2023 with 3 goals:
    • Fix core functions
    • Triage existing functions
    • Describe all API endpoints

Triage existing functions

Describe all API endpoints

  • Used the JSON documentation
  • Manually tested all the endpoints
  • Wrote everything up on an issue page

It was a lot of work

API structure

flowchart TD
  Group --> Domain
  Dimension --> Codes
  Dimension --> Subdimensions
  Subdimensions --> Codes
  Domain --> Dimension
  Domain --> Data
  Domain --> BD[Bulk downloads]
  Domain --> Metadata
  Metadata --> Document
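
The same structure can be explored in R. The sketch below simply encodes the flowchart edges as a data frame and walks them; the helper names `children()` and `descendants()` are hypothetical and are not part of the FAOSTAT package.

```r
# Edges copied from the flowchart: each row is "from --> to"
edges <- data.frame(
  from = c("Group", "Domain", "Domain", "Domain", "Domain",
           "Dimension", "Dimension", "Subdimensions", "Metadata"),
  to   = c("Domain", "Dimension", "Data", "Bulk downloads", "Metadata",
           "Codes", "Subdimensions", "Codes", "Document"),
  stringsAsFactors = FALSE
)

# Direct children of a node (hypothetical helper)
children <- function(node) edges$to[edges$from == node]

# Everything reachable from a node, breadth first (hypothetical helper)
descendants <- function(node) {
  found <- character(0)
  queue <- children(node)
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (!current %in% found) {
      found <- c(found, current)
      queue <- c(queue, children(current))
    }
  }
  found
}

children("Domain")
descendants("Dimension")
```

Reading it this way makes the central role of the domain visible: data, bulk downloads, metadata and dimensions all hang off a domain.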

FAOSTAT 3.0.0

Implementing all of the changes we identified in 2.3.0

  • Deprecating old functions
  • Creating new structures

Now it works

read_fao(domain = "RL", 
         area_codes = "8", 
         element_codes = "5110", 
         item_codes = "6620",
         year_codes = 2000:2020)
   Domain.Code   Domain Area.Code..M49.                Area Element.Code
1           RL Land Use              28 Antigua and Barbuda         5110
2           RL Land Use              28 Antigua and Barbuda         5110
3           RL Land Use              28 Antigua and Barbuda         5110
4           RL Land Use              28 Antigua and Barbuda         5110
5           RL Land Use              28 Antigua and Barbuda         5110
6           RL Land Use              28 Antigua and Barbuda         5110
7           RL Land Use              28 Antigua and Barbuda         5110
8           RL Land Use              28 Antigua and Barbuda         5110
9           RL Land Use              28 Antigua and Barbuda         5110
10          RL Land Use              28 Antigua and Barbuda         5110
11          RL Land Use              28 Antigua and Barbuda         5110
12          RL Land Use              28 Antigua and Barbuda         5110
13          RL Land Use              28 Antigua and Barbuda         5110
14          RL Land Use              28 Antigua and Barbuda         5110
15          RL Land Use              28 Antigua and Barbuda         5110
16          RL Land Use              28 Antigua and Barbuda         5110
17          RL Land Use              28 Antigua and Barbuda         5110
18          RL Land Use              28 Antigua and Barbuda         5110
19          RL Land Use              28 Antigua and Barbuda         5110
20          RL Land Use              28 Antigua and Barbuda         5110
21          RL Land Use              28 Antigua and Barbuda         5110
   Element Item.Code..CPC.     Item Year.Code Year    Unit Value Flag
1     Area           F6620 Cropland      2000 2000 1000 ha     5    I
2     Area           F6620 Cropland      2001 2001 1000 ha     5    I
3     Area           F6620 Cropland      2002 2002 1000 ha     5    I
4     Area           F6620 Cropland      2003 2003 1000 ha     5    I
5     Area           F6620 Cropland      2004 2004 1000 ha     5    I
6     Area           F6620 Cropland      2005 2005 1000 ha     5    I
7     Area           F6620 Cropland      2006 2006 1000 ha     5    I
8     Area           F6620 Cropland      2007 2007 1000 ha     5    I
9     Area           F6620 Cropland      2008 2008 1000 ha     5    I
10    Area           F6620 Cropland      2009 2009 1000 ha     5    I
11    Area           F6620 Cropland      2010 2010 1000 ha     5    I
12    Area           F6620 Cropland      2011 2011 1000 ha     5    I
13    Area           F6620 Cropland      2012 2012 1000 ha     5    I
14    Area           F6620 Cropland      2013 2013 1000 ha     5    I
15    Area           F6620 Cropland      2014 2014 1000 ha     5    I
16    Area           F6620 Cropland      2015 2015 1000 ha     5    I
17    Area           F6620 Cropland      2016 2016 1000 ha     5    I
18    Area           F6620 Cropland      2017 2017 1000 ha     5    I
19    Area           F6620 Cropland      2018 2018 1000 ha     5    I
20    Area           F6620 Cropland      2019 2019 1000 ha     5    I
21    Area           F6620 Cropland      2020 2020 1000 ha     5    I
   Flag.Description Note
1     Imputed value   NA
2     Imputed value   NA
3     Imputed value   NA
4     Imputed value   NA
5     Imputed value   NA
6     Imputed value   NA
7     Imputed value   NA
8     Imputed value   NA
9     Imputed value   NA
10    Imputed value   NA
11    Imputed value   NA
12    Imputed value   NA
13    Imputed value   NA
14    Imputed value   NA
15    Imputed value   NA
16    Imputed value   NA
17    Imputed value   NA
18    Imputed value   NA
19    Imputed value   NA
20    Imputed value   NA
21    Imputed value   NA

Defunct functions

getWDI()
Error: 'getWDI' is defunct.
See help("Defunct")
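
The error text above has the format produced by base R's `.Defunct()`. How the package raises it internally is an assumption here, so the sketch below uses hypothetical function names to show the two lifecycle stages base R provides: deprecated (still works, with a warning) and defunct (an error).

```r
# Hypothetical function names, used only to show the lifecycle stages
read_new_data <- function() "new data"

# Stage 1: deprecated. Still works, but warns and points to the replacement.
get_old_data <- function() {
  .Deprecated(new = "read_new_data")
  read_new_data()
}

# Stage 2: defunct. Calling it is an error.
get_ancient_data <- function() {
  .Defunct(new = "read_new_data")
}

# Capture the messages instead of stopping the script
deprecated_msg <- tryCatch(get_old_data(),
                           warning = function(w) conditionMessage(w))
defunct_msg <- tryCatch(get_ancient_data(),
                        error = function(e) conditionMessage(e))

cat(deprecated_msg, "\n")
cat(defunct_msg, "\n")
```

Keeping a defunct stub rather than deleting the function means users of old scripts get a pointer to the replacement instead of a "could not find function" error.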

Future work

  • Does it make sense to make a package for every API?
    • Web-connectors are another approach
  • Collecting AQUASTAT, Fisheries and other FAO data into one place would allow a single package to work for more data

Further reading