I'm trying to reshape some data from long format to a single-row wide format. Below is how my data currently looks:

id var1 var2 var3
1     a    b    c
2     d    e    f
3     g    h    i

Below is how I'd like my data to look at the end:

id.1 var1.1 var2.1 var3.1 id.2 var1.2 var2.2 var3.2 id.3 var1.3 var2.3 var3.3
1         a      b      c    2      d      e      f    3      g      h      i

I've tried using the tidyr package and reshape, but neither seems to be capable of what I want to do. Any help would be greatly appreciated.

2
I think just unlist(df) does what you want (assuming you have no factors there and you don't care about the order). Or c(t(df)) will give the values in the desired order but without the names. – David Arenburg
Or, I suppose do.call(cbind, split(mydf, 1:nrow(mydf))), but why? – A5C1D2H2I1M1N2O1R2T1

2 Answers

If maintaining column types is of interest, you can try do.call(cbind, split(mydf, 1:nrow(mydf))).

Example:

mydf <- structure(list(id = 1:3, var1 = structure(1:3, .Label = c("a",
    "d", "g"), class = "factor"), var2 = c("b", "e", "h"), var3 = c("c",
    "f", "i")), .Names = c("id", "var1", "var2", "var3"), row.names = c(NA,
    3L), class = "data.frame")

^^ This is like your data, but the "var1" column is a factor.

do.call(cbind, split(mydf, 1:nrow(mydf)))
#   1.id 1.var1 1.var2 1.var3 2.id 2.var1 2.var2 2.var3 3.id 3.var1 3.var2 3.var3
# 1    1      a      b      c    2      d      e      f    3      g      h      i

str(.Last.value)
# 'data.frame': 1 obs. of  12 variables:
#  $ 1.id  : int 1
#  $ 1.var1: Factor w/ 3 levels "a","d","g": 1
#  $ 1.var2: chr "b"
#  $ 1.var3: chr "c"
#  $ 2.id  : int 2
#  $ 2.var1: Factor w/ 3 levels "a","d","g": 2
#  $ 2.var2: chr "e"
#  $ 2.var3: chr "f"
#  $ 3.id  : int 3
#  $ 3.var1: Factor w/ 3 levels "a","d","g": 3
#  $ 3.var2: chr "h"
#  $ 3.var3: chr "i"

^^ Note that column classes are retained.

A little gsub can get the column names to be what you were expecting.
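
As a rough sketch of that renaming step (the regular expression is just one way to do it, and the object name "wide" is made up for illustration), the numeric prefix can be moved to the end of each name like this:

```r
# Same data as in the example above: "var1" is a factor, the rest are not.
mydf <- data.frame(
  id = 1:3,
  var1 = factor(c("a", "d", "g")),
  var2 = c("b", "e", "h"),
  var3 = c("c", "f", "i"),
  stringsAsFactors = FALSE
)

# One data.frame per row, bound side by side; names come out as "1.id", "1.var1", ...
wide <- do.call(cbind, split(mydf, 1:nrow(mydf)))

# Swap "<row>.<name>" into "<name>.<row>", e.g. "1.var1" becomes "var1.1".
names(wide) <- gsub("^([0-9]+)\\.(.*)$", "\\2.\\1", names(wide))
names(wide)
```

This only touches the names, so the column classes shown by str() stay as they were.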


Or, you can add two additional columns: one filled with the value "1", the other with the sequence from 1 to the number of rows in your dataset. Use the first as your "id" (LHS) variable and the second as your "time" (RHS) variable with the reshape function, or with something like the data.table version of dcast, which accepts multiple variables for value.var.
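
A minimal sketch of the reshape route, assuming the two helper columns are called "grp" and "time" (both names are made up here, as are the objects "tmp" and "wide"):

```r
mydf <- data.frame(
  id = 1:3,
  var1 = factor(c("a", "d", "g")),
  var2 = c("b", "e", "h"),
  var3 = c("c", "f", "i"),
  stringsAsFactors = FALSE
)

# Work on a copy so the original data is left alone.
tmp <- mydf
tmp$grp <- 1                        # constant: every row belongs to one "subject"
tmp$time <- seq_len(nrow(tmp))      # 1, 2, 3, ...: becomes the column suffix

# Every column other than idvar and timevar is spread across the "time" values.
wide <- reshape(tmp, direction = "wide", idvar = "grp", timevar = "time")

# Drop the helper column once it has done its job.
wide$grp <- NULL
names(wide)
```

With base reshape the names should already come out in the "var1.1" style, so no renaming is needed afterwards.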