Using tapply and sapply, I am trying to sum counts based on multiple (two) indices that I give to tapply, looping over the metric columns with sapply. The problem is that the returned matrix loses the names from the indices I give to tapply. I end up turning the matrix into a data.frame using melt() for input into ggplot, and I would have to add the variable names back manually, but I want them to be retained through the two apply functions. The metric/variable names are retained when I use only one index in tapply(), so I am hung up on why they are lost with two indices.
library(reshape2)  # provides melt()

#Build example data
Fc_desc. <- rep(c(rep("Local",10),rep("Collector",10),rep("Arterial",10)),2)
Year. <- rep(seq(2000,2008,2),12)
df.. <- data.frame(Fc_desc = Fc_desc., Year = Year., Tot_ped_fatal_cnt = sample(length(Year.)), Tot_ped_inj_lvl_a_cnt = sample(length(Year.)))
#Define metrics (columns) of interest
Metrics. <- c("Tot_ped_fatal_cnt", "Tot_ped_inj_lvl_a_cnt")
#Summarize into long data frame
Ped_FcSv.. <- melt(
  sapply(Metrics., function(x){tapply(df..[,x], list(df..$Year, df..$Fc_desc), sum, na.rm=TRUE)}),
  varnames = c("Fc_desc","Year","Injury_Severity"), value.name = "Count")
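The names go missing because sapply(), with its default simplify = TRUE, flattens each Year-by-Fc_desc matrix returned by tapply() into a single column, so only the metric names survive as column names. With one index, tapply() returns a named 1-d result, which is why the names are kept in that case. The following is a minimal base-R sketch of one way around this, assuming a 3-d array is an acceptable intermediate: simplify = "array" stacks the matrices and keeps their dimnames. The set.seed() call and the object name arr are additions for illustration, not part of the original code.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

# Example data from the question
Fc_desc. <- rep(c(rep("Local", 10), rep("Collector", 10), rep("Arterial", 10)), 2)
Year. <- rep(seq(2000, 2008, 2), 12)
df.. <- data.frame(Fc_desc = Fc_desc., Year = Year.,
                   Tot_ped_fatal_cnt = sample(length(Year.)),
                   Tot_ped_inj_lvl_a_cnt = sample(length(Year.)))
Metrics. <- c("Tot_ped_fatal_cnt", "Tot_ped_inj_lvl_a_cnt")

# simplify = "array" stacks the per-metric matrices into a
# Year x Fc_desc x metric array instead of flattening them
arr <- sapply(Metrics., function(x) {
  tapply(df..[, x], list(Year = df..$Year, Fc_desc = df..$Fc_desc),
         sum, na.rm = TRUE)
}, simplify = "array")

# Label all three dimensions explicitly
names(dimnames(arr)) <- c("Year", "Fc_desc", "Injury_Severity")

# Long data frame: one row per Year / Fc_desc / metric combination
Ped_FcSv.. <- as.data.frame(as.table(arr), responseName = "Count")
head(Ped_FcSv..)
```

Passing the same array to reshape2::melt() with varnames and value.name should give an equivalent long data frame; as.table() is used here only to keep the sketch free of add-on packages.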
If you aren't set on `tapply` or `sapply`, try this: `aggregate(. ~ Fc_desc + Year, data = df.., FUN = sum)` – bouncyball
First reshape to long format: `df_long = reshape2::melt(df.., measure.vars = Metrics.)`. Then you can aggregate the single value column you care about over the three grouping variables: `aggregate(value ~ Fc_desc + Year + variable, data = df_long, FUN = sum)`. – aosmith
If the `melt()`/`aggregate()` approach doesn't work, I would probably switch to your favorite add-on package for this sort of data manipulation problem (I usually use dplyr). – aosmith
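For comparison, the add-on package route mentioned in the last comment might look something like this. It is only a sketch, assuming dplyr and tidyr are installed; the output column names Injury_Severity and Count follow the question, while set.seed() and the object name Ped_long are additions for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible

# Example data from the question
Fc_desc. <- rep(c(rep("Local", 10), rep("Collector", 10), rep("Arterial", 10)), 2)
Year. <- rep(seq(2000, 2008, 2), 12)
df.. <- data.frame(Fc_desc = Fc_desc., Year = Year.,
                   Tot_ped_fatal_cnt = sample(length(Year.)),
                   Tot_ped_inj_lvl_a_cnt = sample(length(Year.)))
Metrics. <- c("Tot_ped_fatal_cnt", "Tot_ped_inj_lvl_a_cnt")

Ped_long <- df.. %>%
  # Stack the metric columns into name/value pairs
  pivot_longer(all_of(Metrics.),
               names_to = "Injury_Severity", values_to = "Count") %>%
  # Sum the counts within each class / year / metric group
  group_by(Fc_desc, Year, Injury_Severity) %>%
  summarise(Count = sum(Count, na.rm = TRUE), .groups = "drop")
```

Because the grouping variables stay as ordinary columns throughout, there are no dimnames to lose, and the result can go straight into ggplot.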