
The post "calculation of anomalies on time-series" was very helpful, but in my situation the data are grouped. I have a data frame with year, group, and value columns. Each group has a value for each year. What I want to calculate is the yearly anomaly within each group, i.e. this year's value minus the mean value over all years for that group. It would be nice to append this anomaly as a column in the data frame too. Thanks! Here is sample data:

year <- c(2000, 2000, 2000, 2000, 2000,2001, 2001, 2001, 2001, 2001,2002, 2002, 2002, 2002, 2002,2003, 2003, 2003, 2003, 2003)
group <- c("A", "B", "C", "D", "A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D")
value <- runif(20, 0, 1)
df <- data.frame(year, group, value, stringsAsFactors = FALSE)
Please show your problem with an example dataset along with the desired result. - mtoto

1 Answer


This is another case where the ave function is useful. Since mean is the default for FUN, that argument is not actually needed here, but it is important to remember that FUN comes after the ellipsis in the argument list and therefore must be passed as a named argument if it is used:

df$grp.means <- with(df, ave(value, group, FUN = mean))
df$yr.anomaly <- df$value - df$grp.means
df
 year group      value grp.means   yr.anomaly
 2000     A 0.40778676 0.4135109 -0.005724164
 2000     B 0.02709893 0.2660400 -0.238941031
 2000     C 0.30375035 0.6461923 -0.342441950
 2000     D 0.46330590 0.4901705 -0.026864586
 2000     A 0.98482498 0.4135109  0.571314056
 2001     B 0.02279144 0.2660400 -0.243248519
 2001     C 0.64370031 0.6461923 -0.002491994
 2001     D 0.28803650 0.4901705 -0.202133986
 2001     A 0.40769648 0.4135109 -0.005814443
 2001     B 0.21896143 0.2660400 -0.047078526
 2002     C 0.83771796 0.6461923  0.191525655
 2002     D 0.61869987 0.4901705  0.128529384
 2002     A 0.06946549 0.4135109 -0.344045431
 2002     B 0.14443442 0.2660400 -0.121605537
 2002     C 0.95324165 0.6461923  0.307049349
 2003     D 0.60165466 0.4901705  0.111484174
 2003     A 0.19778091 0.4135109 -0.215730018
 2003     B 0.91691357 0.2660400  0.650873612
 2003     C 0.49255124 0.6461923 -0.153641061
 2003     D 0.47915550 0.4901705 -0.011014985

It is also possible to do this in one step:

df$yr.anomaly <- with(df, ave(value, group, FUN = function(x) x - mean(x)))
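
Because the sample data above comes from runif, its numbers change on every run. As a rough check that the two approaches agree, here is a minimal sketch on a small made-up data frame whose anomalies can be worked out by hand. The name toy and the columns grp.means, yr.anomaly and yr.anomaly2 are illustrative choices only and are not part of the original question.

```r
# Hypothetical toy data with easy-to-check values (not from the question)
toy <- data.frame(
  year  = c(2000, 2001, 2002, 2000, 2001, 2002),
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(1, 2, 3, 10, 20, 60),
  stringsAsFactors = FALSE
)

# Two-step version: per-group mean first (FUN defaults to mean),
# then subtract it from each value
toy$grp.means  <- with(toy, ave(value, group))
toy$yr.anomaly <- toy$value - toy$grp.means

# One-step version: FUN must be named because it follows the ellipsis
toy$yr.anomaly2 <- with(toy, ave(value, group, FUN = function(x) x - mean(x)))

# Group A has mean 2 and group B has mean 30, so both anomaly columns
# should hold -1, 0, 1 for A and -20, -10, 30 for B
print(toy)

# Anomalies within each group should sum to zero
print(tapply(toy$yr.anomaly, toy$group, sum))
```

If the grp.means column is not needed afterwards, the one-step form avoids creating it at all.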