1 The Problem

Here we have a data set from a fictitious research study that is incomplete. It only specifies when each participant visited the study centre for the first time, marking those visits as ‘Visit 1’ in the ‘visit_no’ column and leaving every other visit as NA:

df <- data.frame(
    participant_id = c(102, 105, 105, 111, 111, 111),
    visit_date = c(
        "2018-02-20", "2018-04-24", "2019-02-15",
        "2018-01-27", "2013-11-19", "2018-11-28"
    ),
    visit_no = c("Visit 1", "Visit 1", NA, "Visit 1", NA, NA)
)

print(df)

##   participant_id visit_date visit_no
## 1            102 2018-02-20  Visit 1
## 2            105 2018-04-24  Visit 1
## 3            105 2019-02-15     <NA>
## 4            111 2018-01-27  Visit 1
## 5            111 2013-11-19     <NA>
## 6            111 2018-11-28     <NA>

What we want to do is fill in the gaps: work out from the dates which visits were ‘Visit 0’s and which were ‘Visit 2’s. More specifically, we can see that:

Participant 105 had their ‘Visit 1’ in April 2018, so their visit in February 2019 was their ‘Visit 2’.
Participant 111 had their ‘Visit 1’ in January 2018, so their visit in November 2013 was their ‘Visit 0’ and their visit in November 2018 was their ‘Visit 2’.

Let’s see how to calculate that automatically:
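
As a minimal sketch of the idea in base R (not necessarily the approach taken in the solution itself), each visit date can be compared with that participant’s ‘Visit 1’ date. The sketch assumes every participant has exactly one ‘Visit 1’ row, that any earlier visit counts as ‘Visit 0’ and any later visit as ‘Visit 2’. The column name visit_label is a hypothetical name chosen for illustration.

```r
# Recreate the example data set
df <- data.frame(
    participant_id = c(102, 105, 105, 111, 111, 111),
    visit_date = c(
        "2018-02-20", "2018-04-24", "2019-02-15",
        "2018-01-27", "2013-11-19", "2018-11-28"
    ),
    visit_no = c("Visit 1", "Visit 1", NA, "Visit 1", NA, NA)
)

# Convert the text dates to Date objects so they can be compared
df$visit_date <- as.Date(df$visit_date)

# Pick out the rows already labelled 'Visit 1'
# (assumption: exactly one such row per participant)
is_first <- !is.na(df$visit_no) & df$visit_no == "Visit 1"
first <- df[is_first, ]

# For every row, look up the 'Visit 1' date of the same participant
visit_1_date <- first$visit_date[match(df$participant_id, first$participant_id)]

# Label each visit by comparing its date with the 'Visit 1' date
# (visit_label is a hypothetical column name, not from the original data)
df$visit_label <- ifelse(
    df$visit_date < visit_1_date, "Visit 0",
    ifelse(df$visit_date > visit_1_date, "Visit 2", "Visit 1")
)

print(df)
```

Writing the result to a new column leaves the original ‘visit_no’ values untouched, which makes it easy to check the calculated labels against the ones that were already known. A study with more than one follow-up visit would need a ranking step instead of a simple before/after comparison.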

Data Handling in R:
Data Classification Using Date or Time Information
