Re-shape status columns based on value in another column

Question

I have a table that looks like this:

ID    Start_year    Status_2005    Status_2006    Status_2007
 1          2005            GBR            GBR            FRA
 2          2006             NA            FRA            FRA
 3          2007             NA             NA            GBR
 4          2006             NA            UKR            RUS

I would like to re-shape the data so that it gives the status in the years subsequent to the start year. So the above, would look like this:

ID     Year_0    Year_1    Year_2
 1        GBR       GBR       GBR
 2        FRA       FRA        NA
 3        GBR        NA        NA
 4        UKR       RUS        NA

I have been trying to use tidyverse in R, using gather in combination with "starts_with", and mutate to make the new columns. However, I keep ending up with a single column of "years_since_start_year", and can't work out how to spread this column to make my final table.

Any help much appreciated

talat talat · Accepted Answer · 2018-12-11T12:06:57

Here's a data.table approach:

library(data.table)
setDT(df)
df <- melt(df, id.vars = c("ID", "Start_year"))   # melt to long format
df <- df[!is.na(value)]                           # remove NA entries
df[, year := seq_len(.N) -1L, by = ID]            # add year-number
dcast(df, ID ~ year, value.var = "value")         # reshape to wide format
#      ID      0      1      2
#1:     1    GBR    GBR    FRA
#2:     2    FRA    FRA   <NA>
#3:     3    GBR   <NA>   <NA>
#4:     4    UKR    RUS   <NA>

Re-shape status columns based on value in another column

3 Answers