“JSON” stands for JavaScript Object Notation. It’s a type of plaintext file that contains data, like a CSV except more complex:
More importantly, a JSON file stores information in a format that can be easily read by both computer programmes and humans. If a programme starts at the beginning of a file and moves through it character by character it will know exactly where within the overall data structure it is at all times, similar to how you are able to move from cell to cell in a spreadsheet and always know where you are from the row and column numbers.
Here’s an example of a JSON file, taken from the Wikipedia page on JSON files:
{
"firstName": "John",
"lastName": "Smith",
"isAlive": true,
"age": 27,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
},
"phoneNumbers": [
{
"type": "home",
"number": "212 555-1234"
},
{
"type": "office",
"number": "646 555-4567"
}
],
"children": [],
"spouse": null
}
As you can see, we have information about a fictional person. As promised, the data is human-readable and the syntax has a number of standard elements:
"firstName": "John"
. In this case the attribute is "firstName"
and the value is "John"
{ }
[ ]
"address"
attribute in the example) or dictionaries within lists within dictionaries (eg the "phoneNumbers"
attribute in the example) and so on.The rjson
library is a useful one when working with JSON files:
library(rjson)
It contains a function fromJSON()
that allows JSON files to be imported as data frames:
file_location <- "json/json object.json"
json_data <- rjson::fromJSON(file = file_location)
file_location <- "txt/json object.txt"
json_data <- rjson::fromJSON(paste(readLines(file_location), collapse=""))
dir <- "json_files"
# List all JSON files in a folder and its sub-folders
files <- list.files(dir, pattern = "*.json", recursive = TRUE)
# Construct the relative paths to the JSON files
filepaths <- file.path(dir, files)
# Initialise list of JSON data
all_json_data <- vector(mode = "list")
# Import all the files
for (filepath in filepaths) {
# Import files
json_data <- rjson::fromJSON(file = filepath)
# Append to list
all_json_data[[length(all_json_data) + 1]] <- json_data
}
json_file <- "http://api.worldbank.org/country?per_page=10®ion=OED&lendingtype=LNX&format=json"
json_data <- rjson::fromJSON(file = json_file)
Similarly, the rjson
library contains a toJSON()
function that allows data frames to be exported as JSON files:
json <- rjson::toJSON(df)
file <- file("json/json object.json")
writeLines(json, file)
close(file)
json <- rjson::toJSON(df)
file <- file("txt/json object.txt")
writeLines(json, file)
close(file)
Generally, in R, you want to be working with traditional row-and-column data frames, the type that can be exported to CSV. So it’s often useful to be able to import a JSON, convert it to a single row of a data frame and then export it as a spreadsheet. Even more so, it’s useful to be able to do this on a whole folder of JSON files at a time, importing each one and adding it as a single row to a data frame which can then be exported as a CSV:
library(rjson)
# Walk through a directory to find JSON files
dir <- "json_files"
files <- list.files(dir, pattern = "*.json", recursive = TRUE)
filepaths <- file.path(dir, files)
# Initialise output data frame
df <- data.frame()
# Iterate over all JSON files
for (filepath in filepaths) {
# Import files
json_data <- rjson::fromJSON(file = filepath)
# Construct new row for output data frame
new_row <- data.frame(
firstName = json_data[["firstName"]],
lastName = json_data[["lastName"]],
isAlive = json_data[["isAlive"]],
age = json_data[["age"]],
streetAddress = json_data[["address"]][["streetAddress"]],
city = json_data[["address"]][["city"]],
state = json_data[["address"]][["state"]],
postalCode = json_data[["address"]][["postalCode"]]
)
# Add new row to output data frame
df <- rbind(df, new_row)
}
# Export
utils::write.csv(df, "json_files/json_data.csv", row.names = FALSE)
Here’s what the data frame that is created looks like:
print(df)
## firstName lastName isAlive age streetAddress city state postalCode
## 1 John Smith TRUE 27 21 2nd Street New York NY 10021-3100
## 2 Jane Doe TRUE 28 21 3rd Street York Yorkshire YO1 8QG