I have data from several years and each record has a date value (YYYY-MM-DD). I want to label each record with the season that it fell into. For example, I want to take all the records from December 15 to March 15, across all years, and put "Winter" in a season column. Is there a way in R to specify a sequence of dates using just the month and date, regardless of year?

Lubridate quarter command doesn't work because I have custom dates to define the seasons and the seasons are not all of equal length, and I can't just do month(datevalue) %in% c(12,1,2,3) because I need to split the months in half (i.e. March 15 is winter and March 16 is spring).

I could manually enter in the date range for each year in my dataset (e.g. Dec 15 2015 to March 15 2015 or Dec 15 2016 to Mar 15 2016, etc...), but is there a better way?

2 Answers

1
votes

You can extract the month and the day of the month from the date column and use case_when to assign Season based on those two values.

library(dplyr)
library(lubridate)

df %>%
  mutate(day = day(Date),
         month = month(Date),
         Season = case_when(
           # 15 December to 15 March as Winter
           (month == 12 & day >= 15) |
             month %in% 1:2 |
             (month == 3 & day <= 15) ~ "Winter"
           # Add conditions for the other seasons here
         ))
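As an alternative sketch of the same idea in base R, the month and day can be folded into a single integer (March 15 becomes 315, December 15 becomes 1215) and looked up with findInterval. Only the winter boundaries come from the question; the other season start dates, the sample data and the column name Date are assumptions made for illustration.

```r
# Hypothetical sample data; the column name `Date` follows the answer above.
df <- data.frame(Date = as.Date(c("2015-12-14", "2015-12-15", "2016-03-15",
                                  "2016-03-16", "2016-07-04", "2016-10-31")))

# Encode month and day as one integer, ignoring the year (e.g. "0315" -> 315).
md <- as.integer(format(df$Date, "%m%d"))

# Assumed season start dates as month-day integers.
# Only the winter boundaries (Dec 15 and Mar 15) are given in the question.
starts <- c(316, 616, 916, 1215)
labels <- c("Spring", "Summer", "Fall", "Winter")

idx <- findInterval(md, starts)  # 0 for dates before March 16
idx[idx == 0] <- 4               # January 1 to March 15 also belongs to winter
df$Season <- labels[idx]
df
```

Changing a season boundary then only means editing the starts vector, and no per-year date ranges are needed.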
0
votes

We assume that when the question says that winter is "Dec 15 2015 to March 15 2015 or Dec 15 2016 to Mar 15 2016", what is really meant is that winter is Dec 16, 2015 to Mar 15, 2016 or Dec 16, 2016 to Mar 15, 2017.

Also it is not clear what the precise output is supposed to be but in each case below we provide a second argument which takes a vector giving the season names or numbers. The default is that winter is reported as 1, spring is 2, summer is 3 and fall is 4 but you could pass a second argument of c("Winter", "Spring", "Summer", "Fall") instead or use other names if you wish.

1) yearmon/yearqtr Convert to Date class and subtract 15 days. Then convert that to yearmon class, which represents dates internally as year + fraction where fraction = 0 for January, 1/12 for February, ..., 11/12 for December. Add 1/12 to get to the next month. Convert that to yearqtr class, which represents dates as year + fraction where fraction is 0, 1/4, 2/4 or 3/4 for the 4 quarters, and take cycle of that, which gives the quarter number (1, 2, 3 or 4). If we knew that the input x was a Date vector as opposed to a character vector then we could simplify this by replacing as.Date(x) with x in season.

library(zoo)

season <- function(x, s = 1:4) 
  s[cycle(as.yearqtr(as.yearmon(as.Date(x) - 15) + 1/12))]

# test
d <- c(as.Date("2020-12-15") + 0:1, as.Date("2021-03-15") + 0:1)
season(d)
## [1] 4 1 1 2

season(d, c("Winter", "Spring", "Summer", "Fall"))
## [1] "Fall"   "Winter" "Winter" "Spring"

2) base The above could be translated to base R using POSIXlt. Subtract 15 days as before and then add 1 to the zero-based month to get to the next month. Finally, integer-divide that month by 3 and add 1 to get the quarter number.

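season.lt <- function(x, s = 1:4) {
  lt <- as.POSIXlt(as.Date(x) - 15)  # use the argument x, not the global d
  # lt$mon is zero-based; add 1 for the next month and wrap December to January.
  # Working on the month number alone avoids day overflow (e.g. May 31 + 1 month).
  mon <- (lt$mon + 1) %% 12
  s[mon %/% 3 + 1]
}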

# test - d defined in (1)
season.lt(d)
## [1] 4 1 1 2

3) lubridate We can follow the same logic in lubridate like this:

library(lubridate)

season.lub <- function(x, s = 1:4) 
  s[(month((as.Date(x) - 15) %m+% months(1)) - 1) %/% 3 + 1]

# test - d defined in (1)
season.lub(d)
## [1] 4 1 1 2
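
To tie this back to the question, here is a minimal sketch of attaching the labels as a Season column on multi-year data. It restates the shifted-quarter logic in base R under the hypothetical name season_base so that it runs on its own; the sample data and the column name Date are assumptions.

```r
# Base R restatement of the shifted-quarter idea (hypothetical helper name).
season_base <- function(x, s = 1:4) {
  lt <- as.POSIXlt(as.Date(x) - 15)   # shift every date back 15 days
  s[((lt$mon + 1) %% 12) %/% 3 + 1]   # next month (zero-based), then its quarter
}

# Hypothetical multi-year data; the column name `Date` is an assumption.
df <- data.frame(Date = as.Date(c("2015-12-16", "2016-03-15", "2016-06-15",
                                  "2016-06-16", "2017-01-10", "2017-09-16")))

# Pass season names as the second argument to get labels instead of numbers.
df$Season <- season_base(df$Date, c("Winter", "Spring", "Summer", "Fall"))
df
```

Because the calculation uses only the shifted month, the same call handles every year in the data, including leap years.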