⇦ Back

“JSON” stands for JavaScript Object Notation. It’s a type of plaintext file that contains data, like a CSV except more complex:

More importantly, a JSON file stores information in a format that can be easily read by both computer programmes and humans. If a programme starts at the beginning of a file and moves through it character by character it will know exactly where within the overall data structure it is at all times, similar to how you are able to move from cell to cell in a spreadsheet and always know where you are from the row and column numbers.

Here’s an example of a JSON file, taken from the Wikipedia page on JSON files:

{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [],
  "spouse": null
}

As you can see, we have information about a fictional person. As promised, the data is human-readable and the syntax has a number of standard elements:

Importing and Exporting JSON

The rjson library is a useful one when working with JSON files:

library(rjson)

It contains a function fromJSON() that allows JSON files to be imported as data frames:

  • Import a single JSON file:
file_location <- "json/json object.json"
json_data <- rjson::fromJSON(file = file_location)
  • Import a single JSON file that has been saved as a txt:
file_location <- "txt/json object.txt"
json_data <- rjson::fromJSON(paste(readLines(file_location), collapse=""))
  • Import all JSON files in a folder:
dir <- "json_files"
# List all JSON files in a folder and its sub-folders
files <- list.files(dir, pattern = "*.json", recursive = TRUE)
# Construct the relative paths to the JSON files
filepaths <- file.path(dir, files)
# Initialise list of JSON data
all_json_data <- vector(mode = "list")
# Import all the files
for (filepath in filepaths) {
    # Import files
    json_data <- rjson::fromJSON(file = filepath)
    # Append to list
    all_json_data[[length(all_json_data) + 1]] <- json_data
}
  • Import JSON data from a web address:
json_file <- "http://api.worldbank.org/country?per_page=10&region=OED&lendingtype=LNX&format=json"
json_data <- rjson::fromJSON(file = json_file)

Similarly, the rjson library contains a toJSON() function that allows data frames to be exported as JSON files:

  • Export a data frame to a JSON file:
json <- rjson::toJSON(df)
file <- file("json/json object.json")
writeLines(json, file)
close(file)
  • Export a data frame to a text file:
json <- rjson::toJSON(df)
file <- file("txt/json object.txt")
writeLines(json, file)
close(file)

Converting JSON Files to CSV

Generally, in R, you want to be working with traditional row-and-column data frames, the type that can be exported to CSV. So it’s often useful to be able to import a JSON, convert it to a single row of a data frame and then export it as a spreadsheet. Even more so, it’s useful to be able to do this on a whole folder of JSON files at a time, importing each one and adding it as a single row to a data frame which can then be exported as a CSV:

library(rjson)

# Walk through a directory to find JSON files
dir <- "json_files"
files <- list.files(dir, pattern = "*.json", recursive = TRUE)
filepaths <- file.path(dir, files)

# Initialise output data frame
df <- data.frame()
# Iterate over all JSON files
for (filepath in filepaths) {
    # Import files
    json_data <- rjson::fromJSON(file = filepath)
    # Construct new row for output data frame
    new_row <- data.frame(
        firstName = json_data[["firstName"]],
        lastName = json_data[["lastName"]],
        isAlive = json_data[["isAlive"]],
        age = json_data[["age"]],
        streetAddress = json_data[["address"]][["streetAddress"]],
        city = json_data[["address"]][["city"]],
        state = json_data[["address"]][["state"]],
        postalCode = json_data[["address"]][["postalCode"]]
    )
    # Add new row to output data frame
    df <- rbind(df, new_row)
}

# Export
utils::write.csv(df, "json_files/json_data.csv", row.names = FALSE)

Here’s what the data frame that is created looks like:

print(df)
##   firstName lastName isAlive age streetAddress     city     state postalCode
## 1      John    Smith    TRUE  27 21 2nd Street New York        NY 10021-3100
## 2      Jane      Doe    TRUE  28 21 3rd Street     York Yorkshire    YO1 8QG

⇦ Back