
I'm new to R, so this may be simple, but I haven't found how to do it yet. I'm trying to aggregate my temperature data by day, so that I have a mean temperature for every day of the year.

Here's an example of my data and the code I wrote:

            Date  Qobs  Ptot Fsol  Temp    PE  X
1     1956-11-01 0.001  14.0  -99  12.0   1.4 NA
2     1956-11-02 0.001   0.0  -99  13.5   1.5 NA
3     1956-11-03 0.001   0.0  -99  13.5   1.5 NA
4     1956-11-04 0.001   0.0  -99  13.0   1.4 NA
5     1956-11-05 0.001   0.0  -99  11.5   1.3 NA
6     1956-11-06 0.001   0.0  -99  11.0   1.2 NA
7     1956-11-07 0.001   2.0  -99  12.5   1.3 NA
8     1956-11-08 0.000   0.0  -99   5.0   0.7 NA
9     1956-11-09 0.000   0.5  -99   0.0   0.4 NA
10    1956-11-10 0.000   0.0  -99  -2.5   0.2 NA
11    1956-11-11 0.000   2.5  -99   5.5   0.8 NA
12    1956-11-12 0.000   0.0  -99   7.5   0.9 NA

reg_T=aggregate(x=tmp_data$Temp, by=list(j=format(tmp_data$Date, "%j")), mean)

But as you can see, my data doesn't start on 1 January: the first day of my data is 01/11, which complicates things later once it's aggregated. How can I aggregate so that the result starts at 01/01 and ignores the beginning and end of my data, which are not complete years?

Thanks!

dput() of the data:

df <- structure(list(Date = structure(c(-4809, -4808, -4807, -4806, -4805, -4804,
                                        -4803, -4802, -4801, -4800, -4799, -4798, -4797,
                                        -4796, -4795, -4794, -4793, -4792, -4791, -4790,
                                        -4789, -4788, -4787, -4786, -4785, -4784, -4783,
                                        -4782, -4781, -4780), class = "Date"),
                     Temp = c(12, 13.5, 13.5, 13, 11.5, 11, 12.5, 5, 0, -2.5, 5.5, 7.5,
                              1.5, 6, 14, 6, 0.5, 0.5, 4, 2, 9, -4.5, -11.5, -10, -4.5,
                              -2.5, -3.5, -1, -1.5, -7.5)),
                .Names = c("Date", "Temp"), row.names = c(NA, 30L), class = "data.frame")
In other words, all that you're interested in is the month? And what should the expected output be? – DJV
I'm interested in the mean temperature for every day of the year, and I need an output like this:

     j            x
1  001 -1.015094340
2  002 -1.700000000
3  003 -0.883018868
4  004 -1.445283019
5  005 -2.356603774
6  006 -1.360377358
7  007 -1.941509434
8  008 -3.473584906
9  009 -3.394339623
10 010 -3.224528302
11 011 -5.158490566
12 012 -4.088679245

But here the 1st day corresponds to 01/11 (the 1st line of my data) and not 01/01. – Jude
Why not use group_by() on the date and summarise() instead of aggregate()? – DJV

1 Answer


What about something like this:

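library(tidyverse)

df %>%
  # Key each row by calendar day: characters 6 onward of "YYYY-MM-DD" give "MM-DD"
  mutate(MonthDay = str_sub(as.character(Date), 6)) %>%
  group_by(MonthDay) %>%
  # Average the temperature over all years for each calendar day
  summarise(MeanDay = mean(Temp, na.rm = TRUE))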

# A tibble: 30 x 2
   MonthDay MeanDay
   <chr>      <dbl>
 1 11-01      12.0 
 2 11-02      13.5 
 3 11-03      13.5 
 4 11-04      13.0 
 5 11-05      11.5 
 6 11-06      11.0 
 7 11-07      12.5 
 8 11-08       5.00
 9 11-09       0.  
10 11-10      -2.50
# ... with 20 more rows
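
The 30-row sample only covers November, so the output above cannot show the "ignore incomplete years" part of the question. Here is a minimal base R sketch of one way to do it, assuming a year counts as complete when it has one row for every calendar day. The toy data and the names `complete_years` and `full` are hypothetical, made up for illustration; only `Date`, `Temp` and `reg_T` come from the question.

```r
# Hypothetical toy data: daily values from 1956-11-01 to 1959-03-31,
# so 1956 and 1959 are partial years and 1957-1958 are complete.
dates <- seq(as.Date("1956-11-01"), as.Date("1959-03-31"), by = "day")
df <- data.frame(
  Date = dates,
  Temp = 10 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365)
)

# Count the rows per calendar year and compare with the length of that year.
year      <- as.integer(format(df$Date, "%Y"))
days_seen <- table(year)
yrs       <- as.integer(names(days_seen))
is_leap   <- (yrs %% 4 == 0 & yrs %% 100 != 0) | yrs %% 400 == 0
complete_years <- yrs[as.integer(days_seen) == ifelse(is_leap, 366L, 365L)]

# Keep only the complete years (assumption: a year with gaps is dropped too).
full <- df[year %in% complete_years, ]

# Group by month-day rather than "%j": in leap years "%j" shifts by one
# after 28 February, while "MM-DD" keeps the calendar dates aligned.
full$MonthDay <- format(full$Date, "%m-%d")
reg_T <- aggregate(Temp ~ MonthDay, data = full, FUN = mean)

head(reg_T)  # rows are ordered by MonthDay, so the first one is "01-01"
nrow(reg_T)  # 365 here, because neither toy year is a leap year
```

With real data that includes leap years, "02-29" appears as its own row and is averaged over the leap years only. The same filter can be placed in front of the `group_by()` pipeline above if you prefer to stay in the tidyverse.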