
I am working with the mlogit package. The package has some unforgiving data requirements: for each key in a data set, there must be an identical number of rows.

Here is a reprex with an example:

library(tibble)

## Have this
df <- tibble(
  key = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  y   = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
  z   = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)
df

## Want this via tidyverse
df2 <- tibble(
  key = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  y   = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 2, 2, 0, 0),
  z   = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
)
df2
Created on 2020-05-02 by the reprex package (v0.3.0)

df has three keys: 1, 2 and 3. Key 1 has five rows of observations, key 2 has four and key 3 has three. I need each key to have five rows and would like to achieve this with the tidyverse. I thought add_row() might be the solution, but I couldn't get it to work. Is this possible?

In my example, I have df as the before and df2 as the desired after.
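To make the imbalance concrete, here is a minimal sketch that counts the rows per key and how many rows each key is short of the largest group. It assumes dplyr is installed (it re-exports tibble()); the names counts and short are only illustrative.

```r
library(dplyr)

# Same data as df above, rebuilt here so the snippet runs on its own
df <- tibble(
  key = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  y   = rep(2, 12),
  z   = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

# One row per key with its number of observations in column n
counts <- count(df, key)

# How many rows each key would need to reach the largest group size
counts <- mutate(counts, short = max(n) - n)
counts
```

Any key with a non-zero short value is one that needs padding rows.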


1 Answer


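We can number the rows within each 'key' and then use complete() to expand every key to the same number of rows (the size of the largest group). The added rows come in as NA, so z is filled down from the previous row and y is set to 0.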

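library(dplyr)
library(tidyr)
library(data.table)  # for rowid()
df %>%
     mutate(ind = rowid(key)) %>%   # number the rows within each key
     complete(key, ind) %>%         # add the missing key/ind combinations as NA rows
     select(-ind) %>%
     fill(z) %>%                    # carry the previous z (FALSE here) into the new rows
     mutate(y = replace_na(y, 0))   # new rows get y = 0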
# A tibble: 15 x 3
#     key     y z    
#   <dbl> <dbl> <lgl>
# 1     1     2 TRUE 
# 2     1     2 FALSE
# 3     1     2 FALSE
# 4     1     2 FALSE
# 5     1     2 FALSE
# 6     2     2 TRUE 
# 7     2     2 FALSE
# 8     2     2 FALSE
# 9     2     2 FALSE
#10     2     0 FALSE
#11     3     2 TRUE 
#12     3     2 FALSE
#13     3     2 FALSE
#14     3     0 FALSE
#15     3     0 FALSE
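
As a possible variant, the same idea can be sketched without data.table by numbering rows with row_number() inside group_by(), and by passing the padding values directly to complete() through its fill argument. This is an assumption-laden sketch rather than the approach above: it sets z to FALSE explicitly instead of filling it down, which matters if the last z of a key could be TRUE. The name balanced is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data as df in the question, rebuilt so the snippet runs on its own
df <- tibble(
  key = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  y   = rep(2, 12),
  z   = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

balanced <- df %>%
  group_by(key) %>%
  mutate(ind = row_number()) %>%   # position of each row within its key
  ungroup() %>%
  # add every missing key/ind pair, padding y with 0 and z with FALSE
  complete(key, ind, fill = list(y = 0, z = FALSE)) %>%
  select(-ind)

# Every key should now have the same number of rows
count(balanced, key)
```

Checking the result with count() before handing it to mlogit is a cheap way to confirm the data is balanced.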