I need to parse dates and have cases like "31/02/2018":

> library(lubridate)
> dmy("31/02/2018", quiet = TRUE)
[1] NA

This makes sense, as 31 February does not exist. Is there a way to parse the string "31/02/2018" to, e.g., 2018-02-28, so that I get an actual date instead of an NA?

Thanks.

No, there isn't (because it is not an actual date). You can subset and replace it. - Roland
Is this an XY problem? - Rui Barradas
@RuiBarradas So what is the X in your opinion? - c0bra
To have those values recorded or somehow show up. - Rui Barradas
@RuiBarradas Blame the one who filled in the survey... or the one who did not set up a format filter. Your choice. ;) - c0bra

1 Answer

We can write a function for this, assuming that the only problem is a day that exceeds the number of days in its month, and that the input always has the same format (day/month/year).

library(lubridate)

get_correct_date <- function(example_date) {
  #Split string on "/" and get 3 components (day, month, year)
  vecs <- as.numeric(strsplit(example_date, "\\/")[[1]])

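  #Check number of days in that month; pass a real date (the 1st of the
  #month) rather than the bare month number so leap years are respected
  last_day_of_month <- days_in_month(make_date(vecs[3], vecs[2], 1))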

  #If the input day is higher than the actual number of days in that month
  #replace it with the last day of that month
  if (vecs[1] > last_day_of_month)
    vecs[1] <- last_day_of_month

  #Paste the date components together to get new modified date
  dmy(paste0(vecs, collapse = "/"))
}


get_correct_date("31/02/2018")
#[1] "2018-02-28"

get_correct_date("31/04/2018")
#[1] "2018-04-30"

get_correct_date("31/05/2018")
#[1] "2018-05-31"

With small modifications you can adapt the function to dates in a different format, or to day values that are smaller than the first day of the month (such as 0).
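As one possible modification along these lines, here is a minimal sketch of a vectorised variant that clamps the day at both ends. The name `clamp_dmy` is hypothetical (it is not part of lubridate), and the sketch assumes every input is a "day/month/year" string with a valid month and year.

```r
library(lubridate)

# Hypothetical helper, not part of lubridate: clamp the day component of
# "day/month/year" strings into the valid range for that month.
clamp_dmy <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  day   <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- as.integer(vapply(parts, `[`, character(1), 2))
  year  <- as.integer(vapply(parts, `[`, character(1), 3))

  # Number of days in each month, taken from the 1st of that month
  # so that February in a leap year counts as 29 days
  last_day <- unname(days_in_month(make_date(year, month, 1L)))

  # Days below 1 become 1, days past the end become the last day
  make_date(year, month, pmin(pmax(day, 1L), last_day))
}

clamp_dmy(c("31/02/2018", "31/02/2020", "15/06/2018", "00/03/2019"))
#[1] "2018-02-28" "2020-02-29" "2018-06-15" "2019-03-01"
```

Building the result with `make_date()` rather than pasting the pieces back together and re-parsing avoids a second pass through `dmy()`, but either approach should work for this format.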