
Python guy new to R, so forgive the naive question.

I have an R dataframe named metrics with four columns:

I want to pass the level of aggregation (day or week) as a variable to dcast for aggregation.

agg_level <- c("week")

If I hard-code week in the function call, it aggregates data for each week correctly:

  • met <- dcast(metrics, week ~ city, value.var = "count", fun.aggregate = sum)
  • Output:

week        NYC  CHI  SF
2015-10-18    1    2   3
2015-10-25    4    5   6

If I replace week with the variable, it fails. (It aggregates data for all weeks.)

  • met <- dcast(metrics, agg_level ~ city, value.var = "count", fun.aggregate = sum)

  • Output:

agg_level  NYC  CHI  SF
week         5    7   9

Based on this, metrics[[agg_level]] extracts the column whose name is stored in a variable, but using the same syntax inside the formula fails:

  • met <- dcast(m, [[agg_level]] ~ city, value.var = metric, fun.aggregate = sum)

  • Error in (function ... unexpected '[['

What is the correct way to do this?

Can you show a reproducible example - akrun
Added example correct output, and the output I get when passing variable. - lmart999

1 Answer


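The formula argument of dcast expects the names passed to it to be column names of the data frame given as data. It does not substitute the value of agg_level, so the string "week" stored in it is never interpreted as a column name. That appears to be why your second call collapses everything into a single row labelled week: agg_level is treated as one constant value rather than as the week column. As such, you have two options: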

# Option 1
# Build the formula as a string from the variable, then convert it with as.formula().
agg_level <- "week"  # or "day", set by whatever condition applies
myFormula <- sprintf("%s ~ city", agg_level)
met <- dcast(metrics, as.formula(myFormula), fun.aggregate = sum, value.var = "count")

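# Option 2 - Untested
# Use dcast's alternative to the formula notation and pass a list of quoted variables instead.
# .() comes from plyr. No idea if this will work: .() quotes its argument, so
# agg_level may still be taken as a literal name here.
met <- dcast(metrics, list(.(agg_level), .(city)), fun.aggregate = sum, value.var = "count")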
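
For anyone who wants to try Option 1 end to end, here is a minimal sketch. The sample data and the helper name cast_by are made up for illustration (only the column names week, city and count come from the question), and it uses base R's reformulate() as an alternative to sprintf() plus as.formula(); both should produce the same formula.

```r
library(reshape2)

# Hypothetical sample data; only the column names follow the question.
metrics <- data.frame(
  day   = c("2015-10-18", "2015-10-19", "2015-10-25", "2015-10-26"),
  week  = c("2015-10-18", "2015-10-18", "2015-10-25", "2015-10-25"),
  city  = c("NYC", "CHI", "NYC", "CHI"),
  count = c(1, 2, 4, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cast the data at the aggregation level named by a string.
cast_by <- function(data, agg_level) {
  # reformulate() builds the formula `<agg_level> ~ city` from character input.
  f <- reformulate("city", response = agg_level)
  dcast(data, f, fun.aggregate = sum, value.var = "count")
}

met <- cast_by(metrics, "week")      # one row per week
met_day <- cast_by(metrics, "day")   # one row per day
print(met)
```

Wrapping the call in a function keeps the formula construction in one place, so switching between "day" and "week" is just a change of argument.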