1
votes

I would like to tabulate a long data.frame. The following example code creates a long data.frame, with columns "ID", "unit" and "mat". Using dcast I would like to create a wide format data.frame with new columns for each "mat", and the values should be either 1 (this combination exists) or 0 (nope, not there...).

df.long <- data.frame(ID = c(3, 4, 8, 9, 3, 4, 10, 3, 3, 4),
    unit = c("cm", "cm", "m", "K", "cm", "cm", "m", "cm", "m", "cm"),
    mat = c(1,1,1,1,2,2,2,3,3,3))
dcast(df.long, ID + unit ~ mat)

gives me a data frame looking the following way:

ID unit  1  2  3
1  3   cm  1  2  3
2  3    m NA NA  3
3  4   cm  1  2  3
4  8    m  1 NA NA
5  9    K  1 NA NA
6 10    m NA  2 NA

To get the table filles with 1 and 0 (as planned), the only way I find right now is adding an additional column to the long data.frame and use "fun.aggregate = tabulate" as additional dcast parameter, i.e.:

df.long <- data.frame(ID = c(3, 4, 8, 9, 3, 4, 10, 3, 3, 4),
    unit = c("cm", "cm", "m", "K", "cm", "cm", "m", "cm", "m", "cm"),
    mat = c(1,1,1,1,2,2,2,3,3,3),
    value = c(1,1,1,1,1,1,1,1,1,1))
dcast(df.long, ID + unit ~ mat, fun.aggregate = tabulate)

ID unit 1 2 3
1  3   cm 1 1 1
2  3    m 0 0 1
3  4   cm 1 1 1
4  8    m 1 0 0
5  9    K 1 0 0
6 10    m 0 1 0

Despite it now works as intended, I wonder if anyone has a better approach using only dcast functionality, without altering the initial data.frame.

2

2 Answers

3
votes

Roland's approach definitely works, but it might make more sense to simply use as.logical for the aggregation function:

dcast(df.long, ID + unit ~ mat, 
      fill = 0, 
      fun.aggregate = as.logical, 
      value.var =  "mat")
#   ID unit 1 2 3
# 1  3   cm 1 1 1
# 2  3    m 0 0 1
# 3  4   cm 1 1 1
# 4  8    m 1 0 0
# 5  9    K 1 0 0
# 6 10    m 0 1 0

Try as.logical(-1:3) to get a sense of what gets converted to zero and what gets converted to one.

2
votes

I think this does the job:

dcast(df.long, ID + unit ~ mat, 
      fill = 0, 
      fun.aggregate = function(x) 1)
#  ID unit 1 2 3
#1  3   cm 1 1 1
#2  3    m 0 0 1
#3  4   cm 1 1 1
#4  8    m 1 0 0
#5  9    K 1 0 0
#6 10    m 0 1 0