
I have a data frame with one column for the year and another for the Julian day (1-365, or 1-366 in leap years). How can I convert the DOY_start column to a Date based on the year, so that leap years are accounted for?

I tried as.Date(), as.POSIXct(), and lubridate::as_date(), but all my attempts failed. Below is a reproducible example with generated data that closely resembles my original data.

Thank you so much for any advice.

library(tibble)

Year <- 1980:2020
DOY_start <- as.integer(rnorm(length(Year), mean=91.1, sd=9.65))
var <- cbind(Year, DOY_start)

var <- as_tibble(var) 
head(var)
#> # A tibble: 6 x 2
#>    Year DOY_start
#>   <int>     <int>
#> 1  1980        98
#> 2  1981        89
#> 3  1982        79
#> 4  1983        97
#> 5  1984        81
#> 6  1985        80

var$DOY_start_date <- as.POSIXct(strptime(var$DOY_start, "%j"))

head(var)
#> # A tibble: 6 x 3
#>    Year DOY_start DOY_start_date     
#>   <int>     <int> <dttm>             
#> 1  1980        98 2020-04-07 00:00:00
#> 2  1981        89 2020-03-29 00:00:00
#> 3  1982        79 2020-03-19 00:00:00
#> 4  1983        97 2020-04-06 00:00:00
#> 5  1984        81 2020-03-21 00:00:00
#> 6  1985        80 2020-03-20 00:00:00

Created on 2020-09-18 by the reprex package (v0.3.0)

That is not a Julian day; that is the 'day-of-year' number. - Dirk Eddelbuettel

1 Answer


That is an interesting puzzle. We know that a POSIXlt object contains the day-of-year number and that some date libraries convert to it, but I could not immediately find a parser that dealt with it.
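To illustrate the point about POSIXlt, here is a minimal sketch of reading the day-of-year number back out of a date. The variable names are made up for this illustration; note that the yday component is zero-based, so 1 is added to get the usual 1-366 numbering.

```r
# Hypothetical example date: 7 April 1980 (a leap year)
d <- as.Date("1980-04-07")

# POSIXlt stores the day of the year in `yday`, counting from 0
doy_from_lt <- as.POSIXlt(d)$yday + 1L

# format() with "%j" gives the same number as a zero-padded string
doy_from_format <- as.integer(format(d, "%j"))

print(doy_from_lt)
print(doy_from_format)
```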

Then again, date arithmetic is all we need. We can always get the date of January 1, and the desired date is then simply January 1 plus the 'day-of-year' number minus 1.

yearyearday <- function(yr, yd) {
    base <- as.Date(paste0(yr, "-01-01"))  # Jan 1 of the given year
    base + yd - 1                          # day-of-year 1 maps to Jan 1
}

set.seed(42)  # make it reproducible
sample <- data.frame(year=1980:2020, doy=as.integer(rnorm(41,mean=91.1,sd=9.65)))

sample$date <- yearyearday(sample$year, sample$doy)

head(sample)
R> yearyearday <- function(yr, yd) {
+     base <- as.Date(paste0(yr, "-01-01"))  # Jan 1 of the given year
+     base + yd - 1                          # day-of-year 1 maps to Jan 1
+ }
R> 
R> set.seed(42)  # make it reproducible
R> sample <- data.frame(year=1980:2020, 
+                       doy=as.integer(rnorm(41, mean=91.1, sd=9.65)))
R> 
R> sample$date <- yearyearday(sample$year, sample$doy)
R> 
R> head(sample)
  year doy       date
1 1980 104 1980-04-13
2 1981  85 1981-03-26
3 1982  94 1982-04-04
4 1983  97 1983-04-07
5 1984  95 1984-04-04
6 1985  90 1985-03-31
R> 
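Since the question was specifically about leap years, a quick check may help. The sketch below restates the same Jan-1-plus-offset arithmetic under a hypothetical name, doy_to_date (not part of the answer above), and compares day 60 and the last day of the year in a leap year (1980) and a non-leap year (1981).

```r
# Same arithmetic as above, under a hypothetical name for this illustration
doy_to_date <- function(yr, yd) {
    base <- as.Date(paste0(yr, "-01-01"))  # Jan 1 of the given year
    base + yd - 1                          # day-of-year 1 maps to Jan 1
}

# Day 60 falls on Feb 29 in a leap year and on Mar 1 otherwise
day60 <- doy_to_date(c(1980, 1981), c(60, 60))

# The last day of the year is day 366 in 1980 and day 365 in 1981
last_day <- doy_to_date(c(1980, 1981), c(366, 365))

print(day60)
print(last_day)
```

Because the offset is added to a real Date, the calendar handles February 29 on its own and no explicit leap-year test is needed.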

As so often with date calculations, nothing besides base R is needed.
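As a possible alternative, still in base R, the year and the day-of-year number can be pasted into one string and parsed with the "%Y" and "%j" format codes. This is only a sketch and not part of the answer above; the names years, doys and parsed are made up for the illustration, and the day number is zero-padded to three digits on the assumption that this is the safest input form for "%j".

```r
# Hypothetical input: year and day-of-year as separate integer vectors
years <- c(1980L, 1981L)
doys  <- c(60L, 60L)

# Build strings such as "1980-060" and parse them with %Y (year) and %j (day of year)
parsed <- as.Date(sprintf("%d-%03d", years, doys), format = "%Y-%j")

print(parsed)
```

The attempt in the question parsed "%j" alone, so no year was supplied and the current year was filled in, which is why every row came back as 2020.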