I have a dataframe containing different groups, years and their values, for example:
data <- data.frame(
  group = c(rep('A', 120), rep('B', 120)),
  year = rep(c(rep('2013-2014', 40), rep('2014-2015', 40), rep('2015-2016', 40)), 2),
  value = rnorm(240)
)
For each year within each group, I want to run a t-test to see whether the values differ significantly from those of the previous year. (So far I have been using t.test(x, y, var.equal = TRUE) to do this as a one-off.)
I would like to return a dataframe with the p-values or, preferably, significance stars generated using gtools::stars.pval(), so something like the following:
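To make the one-off comparison concrete, here is a minimal sketch of a single test, comparing 2014-2015 against 2013-2014 for group 'A'. The seed and the variable names x, y, res and p_value are only assumptions for illustration, and the data are rebuilt so the snippet runs on its own.

```r
# Rebuild the example data; the fixed seed is an assumption added so the
# result is reproducible.
set.seed(42)
data <- data.frame(
  group = c(rep('A', 120), rep('B', 120)),
  year = rep(c(rep('2013-2014', 40), rep('2014-2015', 40), rep('2015-2016', 40)), 2),
  value = rnorm(240)
)

# Values for group A in one year (x) and in the year before it (y)
x <- data$value[data$group == 'A' & data$year == '2014-2015']
y <- data$value[data$group == 'A' & data$year == '2013-2014']

# Two-sample t-test assuming equal variances
res <- t.test(x, y, var.equal = TRUE)

# The p-value is what I would like to collect for every group/year pair
p_value <- res$p.value
p_value
```

Repeating this by hand for every consecutive pair of years in every group is the part I would like to automate.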
group year significance
A 2013-2014 NA
A 2014-2015 **
A 2015-2016 ***
B 2013-2014 NA
B 2014-2015
B 2015-2016
In the above, the p-value for the difference between 2014-2015 and 2013-2014 for 'A' is between 0.001 and 0.01, and the p-value for the difference between 2015-2016 and 2014-2015 for 'A' is below 0.001. There is no evidence of a significant difference between any consecutive years for 'B'.
There is no guarantee that every group has the same number of years.
What is the best and quickest way of doing this? I was hoping I could do it using dplyr, with group_by on group and year.
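To show the mapping from p-values to stars that I have in mind, here is a rough base R sketch. The helper name p_to_stars is hypothetical, and the cut-offs (0.001, 0.01, 0.05, 0.1) are my assumption of what gtools::stars.pval() uses; it is only meant to illustrate the desired output, not to replace that function.

```r
# Hypothetical helper: map p-values to significance symbols using the
# conventional cut-offs 0.001, 0.01, 0.05 and 0.1 (assumed, not taken
# from the gtools source).
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
    labels = c('***', '**', '*', '.', ' '),
    include.lowest = TRUE
  ))
}

# A missing p-value (e.g. the first year of a group) stays NA
stars <- p_to_stars(c(0.0005, 0.005, 0.5, NA))
stars
```

With this mapping, a p-value between 0.001 and 0.01 gives '**' and a p-value below 0.001 gives '***', which matches the rows for 'A' in the table above.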