5
votes

Why can't one have multiple variables passed to value.var in dcast? From ?dcast:

value.var name of column which stores values, see guess_value for default strategies to figure this out.

The documentation doesn't explicitly say that only a single variable can be passed as value.var. However, if I try passing more than one, I get an error:

> library("reshape2")
> library("MASS")
> 
> dcast(Cars93, AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))
Error in .subset2(x, i, exact = exact) : subscript out of bounds
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
  the condition has length > 1 and only the first element will be used

So is there a good reason for imposing this limitation? And is it possible to work around this (perhaps using reshape, etc.)?

3
Why don't you just melt the data further and then you can use dcast? - A5C1D2H2I1M1N2O1R2T1
@AnandaMahto, what if the measure.vars are of different types? Or just that the data is too big..? - Arun
@Arun, Would it be highly likely that you are trying to use the same function on different data types (other than, maybe, length)? - A5C1D2H2I1M1N2O1R2T1
@Arun, if the data is too big, we know the answer :-) - A5C1D2H2I1M1N2O1R2T1
@AnandaMahto, on the first point, the data I'm thinking of needs no aggregation function, just plain long-to-wide reshape, with columns of different types. On the 2nd point, :) memory is still an issue.. as melting would take space as well... - Arun

3 Answers

7
votes

This question is very much related to your other question from earlier today.

@beginneR wrote in the comments that "As long as the existing data is already in long-format, I don't see any general need to melt it before casting." In my answer posted at your other question, I gave an example of when melt would be required, or rather, how to decide whether your data are long enough.

This question here is another example of when further melting would be required since point 3 in my answer is not satisfied.

To get the behavior you want, try the following:

C93L <- melt(Cars93, measure.vars = c("Price", "Weight"))
dcast(C93L, AirBags ~ DriveTrain + variable, mean, value.var = "value")
#              AirBags 4WD_Price 4WD_Weight Front_Price Front_Weight
# 1 Driver & Passenger       NaN        NaN    26.17273     3393.636
# 2        Driver only     21.38       3623    18.69286     2996.250
# 3               None     13.88       2987    12.98571     2703.036
#   Rear_Price Rear_Weight
# 1      33.20      3515.0
# 2      28.23      3463.5
# 3      14.90      3610.0

An alternative is to use aggregate to calculate the means, and then use reshape or dcast to go from "long" to "wide". Both are required since reshape does not perform any aggregation:

temp <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain, 
                  Cars93, mean)
temp
#              AirBags DriveTrain    Price   Weight
# 1        Driver only        4WD 21.38000 3623.000
# 2               None        4WD 13.88000 2987.000
# 3 Driver & Passenger      Front 26.17273 3393.636
# 4        Driver only      Front 18.69286 2996.250
# 5               None      Front 12.98571 2703.036
# 6 Driver & Passenger       Rear 33.20000 3515.000
# 7        Driver only       Rear 28.23000 3463.500
# 8               None       Rear 14.90000 3610.000

reshape(temp, direction = "wide", 
        idvar = "AirBags", timevar = "DriveTrain")
#              AirBags Price.4WD Weight.4WD Price.Front Weight.Front
# 1        Driver only     21.38       3623    18.69286     2996.250
# 2               None     13.88       2987    12.98571     2703.036
# 3 Driver & Passenger        NA         NA    26.17273     3393.636
#   Price.Rear Weight.Rear
# 1      28.23      3463.5
# 2      14.90      3610.0
# 3      33.20      3515.0
7
votes

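I had the same issue and found this answer, "Error using dcast with multiple value.var", which suggests "forcing" the data.table version of dcast by calling it with its namespace. data.table's dcast method is written for data.table objects, so the data.frame is converted first:

# multiple value.var, using data.table's dcast on a data.table
data.table::dcast(data.table::as.data.table(Cars93), AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))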

I was able to cast multiple variables without error.

5
votes

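Try coercing your data.frame into a data.table using setDT or as.data.table, and then call dcast with data.table loaded (library(data.table)); its dcast accepts several columns in value.var.

# Use Cars93 here: the melted C93L no longer has Price and Weight columns.
dcast(as.data.table(Cars93), AirBags ~ DriveTrain, mean, value.var=c("Price","Weight"))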
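
The two routes discussed in this thread can be put side by side. The following is a minimal sketch, not a definitive recipe: it assumes reshape2, MASS, and a data.table version that supports multiple value.var columns (1.9.6 or later, as far as I know) are installed. The object names long, wide_melt, dt, and wide_dt are made up for illustration, and functions are called with explicit namespaces so that the result does not depend on which package was attached last.

```r
# Sketch: melt-then-cast (reshape2) versus multi-column value.var (data.table).
# Assumes reshape2, data.table (>= 1.9.6) and MASS are installed.
data(Cars93, package = "MASS")

# Route 1: stack Price and Weight into one "value" column, then cast.
# The "variable" column keeps track of which measure each value came from.
long <- reshape2::melt(Cars93, measure.vars = c("Price", "Weight"))
wide_melt <- reshape2::dcast(long, AirBags ~ DriveTrain + variable,
                             mean, value.var = "value")

# Route 2: convert to a data.table (a copy, so Cars93 itself is untouched)
# and pass both measures directly to value.var.
dt <- data.table::as.data.table(Cars93)
wide_dt <- data.table::dcast(dt, AirBags ~ DriveTrain,
                             fun.aggregate = mean,
                             value.var = c("Price", "Weight"))

print(wide_melt)
print(wide_dt)
```

Both results should have one row per AirBags level and one column per DriveTrain and measure combination, although the two packages name and order those columns differently. Route 2 avoids building the longer melted table, which is relevant to the memory concern raised in the comments.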