1
votes

I would like to expand my dataset so that there is one row per ID and each value in the "step" column has its own row.

I've tried group_by and spread in dplyr but am not achieving the result I'm looking for.

My dataset looks like this

df = data.frame(id = c(1,1,2,2,3,3,3,4,4,4,4,4), 
       points = c(0,1,0,1,0,1,2,0,1,2,3,4), 
       step = c(0, 0, 0, 0, 0.0000, -1.9701, -1.6758, 0.0000, -2.5414,-2.5397,1.1516,  3.9296))

   id points    step
1   1      0  0.0000
2   1      1  0.0000
3   2      0  0.0000
4   2      1  0.0000
5   3      0  0.0000
6   3      1 -1.9701
7   3      2 -1.6758
8   4      0  0.0000
9   4      1 -2.5414
10  4      2 -2.5397
11  4      3  1.1516
12  4      4  3.9296

I would like the end result to look like this where the "points" column from the original dataset indicates the column names in the final dataset:

 id step0   step1   step2  step3  step4
1  1     0  0.0000      NA     NA     NA
2  2     0  0.0000      NA     NA     NA
3  3     0 -1.9701 -1.6758     NA     NA
4  4     0 -2.5414 -2.5397 1.1516 3.9296
1

1 Answers

0
votes

We can use spread

library(tidyverse)
df %>% 
    mutate(points = str_c("step", points)) %>%
    spread(points, step)
#  id step0   step1   step2  step3  step4
#1  1     0  0.0000      NA     NA     NA
#2  2     0  0.0000      NA     NA     NA
#3  3     0 -1.9701 -1.6758     NA     NA
#4  4     0 -2.5414 -2.5397 1.1516 3.9296