A common task with the data I work with is reshaping client data from long to wide. I have a process for doing this with base R's reshape(), outlined below, which creates new (but otherwise unmodified) columns with a numeric index appended. In my case I do not want to modify or aggregate the data in any way. Because I often use reshape2 for other operations, my question is how this can be accomplished with dcast. It does not seem that the example data need to be melted by id, for example, but I'm not sure how to go about making it wide. Would anyone be able to provide reshape2 code that produces a frame comparable to "wide" in the example below?

Thanks.

Example

date_up   <- as.numeric(as.Date("1990/01/01"))
date_down <- as.numeric(as.Date("1960/01/01"))
ids <- data.frame(id=rep(1:1000, 3),site=rep(c("NMA", "NMB","NMC"), 1000))
ids <- ids[order(ids$id), ]
dates <-  data.frame(datelast=runif(3000, date_down, date_up),
          datestart=runif(3000, date_down, date_up),
          dateend=runif(3000, date_down, date_up),
          datemiddle=runif(3000, date_down, date_up))
# Convert all four numeric columns to Date (the list must match the number of columns in dates)
dates[] <- lapply(dates, as.Date, origin = "1970-01-01")
df <- cbind(ids, dates)

# Make a within group index and reshape df
df$gid <- with(df, ave(rep(1, nrow(df)), df[,"id"], FUN = seq_along))
wide <- reshape(df, idvar = "id", timevar = "gid", direction = "wide")

At the moment one needs to run this twice, with an initial error that most R newcomers would find puzzling: it mentions a "closure" because, until the data object exists, df refers to the F-density function in R. The second time around there is a df data object, so no error occurs. (I only made a 30-row matrix to work with.) - IRTFM
You are correct, thanks for pointing that out. I updated the code to fix the error. - Derek Darves

1 Answer

We can use dcast from data.table, which, unlike the reshape2 version, accepts multiple value.var columns. Convert the data.frame to a data.table with setDT(df), then call dcast with the formula and value.var specified.

library(data.table)
dcast(setDT(df), id~gid, value.var=names(df)[2:6])
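
To see what the multi-column dcast produces, here is a minimal sketch on a small toy table. The names toy, start and wide_toy are made up for illustration and are not part of the original df. It also assumes a data.table version that provides rowid(), which numbers rows within each id so that a separate gid column is not needed.

```r
library(data.table)

# Hypothetical toy data: two ids with three rows each (not the original df)
toy <- data.table(
  id    = rep(1:2, each = 3),
  site  = rep(c("NMA", "NMB", "NMC"), 2),
  start = as.Date("1970-01-01") + 0:5
)

# rowid(id) gives the within-id row number, playing the role of gid above
wide_toy <- dcast(toy, id ~ rowid(id), value.var = c("site", "start"))

# One row per id; each value column is repeated once per within-id index
names(wide_toy)
```

The resulting columns are named by joining the value column and the index with an underscore (site_1, site_2, ..., start_3), whereas reshape() in the question joins them with a dot (site.1, ...).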

NOTE: The data.table dcast is likely to be faster than the reshape2 version.