How to find the average, max, and min based on multiple sets of criteria

Question

I have found a few options from other posts but I am having trouble reproducing the code for my specific needs.

I have climate data that I would like to find the average based on season from 1910-2015 for specific locations.

Here is an example of what I need: the average PPT for CENTROID_ID c1763_1 in 1911 for Winter (Month 12 of the preceding year [1910], month 1 and 2 of year in question [1911]), Spring (months 3,4,5 in 1911), Summer (months 6,7,8 in 1911) and Fall (months 9,10,11 in 1911). This would then need to be done for all individual centroid IDs for every year. I have over 400 unique CENTROID_IDs covering years 1910-2015.

I envision the new dataframe to have the columns CENTROID_ID, YEAR, WINT_PPT, SPR_PPT, SUM_PPT, FALL_PPT.

 CENTROID_ID    YEAR MONTH  PPT
1       c1763_1 1910     1  52.639
2       c1763_1 1910     2  20.870
3       c1763_1 1910     3  21.706
4       c1763_1 1910     4   9.347
5       c1763_1 1910     5   1.201
6       c1763_1 1910     6  11.267
7       c1763_1 1910     7  41.870
8       c1763_1 1910     8  61.260
9       c1763_1 1910     9  27.815
10      c1763_1 1910    10  67.377
11      c1763_1 1910    11  24.719
12      c1763_1 1910    12  30.212
13      c1763_1 1911     1  88.728
14      c1763_1 1911     2  50.035
15      c1763_1 1911     3  37.720
16      c1763_1 1911     4  12.831
17      c1763_1 1911     5   0.739
18      c1763_1 1911     6  18.198
19      c1763_1 1911     7  74.731
20      c1763_1 1911     8  40.873
21      c1763_1 1911     9  86.340
22      c1763_1 1911    10  36.423
23      c1763_1 1911    11  12.491
24      c1763_1 1911    12  19.428
25      c1763_1 1912     1  11.010
26      c1763_1 1912     2  16.339
27      c1763_1 1912     3  72.017
28      c1763_1 1912     4  25.887
29      c1763_1 1912     5   5.314
30      c1763_1 1912     6   8.595
31      c1763_1 1912     7  47.781
32      c1763_1 1912     8  51.188
33      c1763_1 1912     9  10.931
34      c1763_1 1912    10 119.725
35      c1763_1 1912    11  10.420
36      c1763_1 1912    12   8.777
37      c1763_1 1913     1  27.771
38      c1763_1 1913     2  62.622
39      c1763_1 1913     3  17.533
40      c1763_1 1913     4   8.008
41      c1763_1 1913     5   1.423
42      c1763_1 1913     6   3.773
43      c1763_1 1913     7  42.982
44      c1763_1 1913     8  40.541
45      c1763_1 1913     9  58.495
46      c1763_1 1913    10  22.729
47      c1763_1 1913    11  48.130
48      c1763_1 1913    12  32.049
49      c1763_1 1914     1 104.197
50      c1763_1 1914     2  31.707

The title of your post asks about MAX and MIN, but your problem statement doesn't mention them. — David T
Sorry, I forgot to edit. I realized that maybe that was too much for one question and to start with means first then try to figure out a max and min on my own if I could with the help of this question. — TaraBBB

GKi GKi · Accepted Answer · 2020-06-03T08:26:19

You can create an auxiliary column of x$YEAR + x$MONTH/12 which helps bringing December for last year to current year. And then simply use aggregate of PPT over CENTROID_ID, floor(aux) and floor(aux %% 1 * 4). Then reshape to bring the seasons in the same row.

x$aux <- x$YEAR + x$MONTH/12
y <- aggregate(PPT ~ CENTROID_ID + cbind(YEAR=floor(aux)) + cbind(SEASON=c("WINT",
 "SPR", "SUM", "FALL")[1+floor(aux %% 1 * 4)]), x, mean)
reshape(do.call(data.frame, y), v.names = "PPT", timevar = "SEASON", idvar = 
 c("CENTROID_ID", "YEAR"), direction = "wide")
#   CENTROID_ID YEAR PPT.FALL  PPT.SPR  PPT.SUM PPT.WINT
#1      c1763_1 1910 39.97033 10.75133 38.13233 36.75450
#2      c1763_1 1911 45.08467 17.09667 44.60067 56.32500
#3      c1763_1 1912 47.02533 34.40600 35.85467 15.59233
#4      c1763_1 1913 43.11800  8.98800 29.09867 33.05667
#17     c1763_1 1914       NA       NA       NA 55.98433

How to find the average, max, and min based on multiple sets of criteria

3 Answers