I have a dataset of 21,840 observations on 6 variables. One of them is a simple "date" column with many missing values. For my project I need to impute the NAs based on their position.

For example I might have:

  • 25/01/1990
  • NA
  • 27/01/1990

Given that the dates are ordered, the NA must be 25/01/1990, 26/01/1990 or 27/01/1990 (I can have multiple observations per day, so any of these is fine). Is there an easy, automatic way to replicate this reasoning?

I tried "mice", treating the dates as a factor, but it doesn't work.

Thanks!

Code attached:

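library(mice)

# Dry run (maxit = 0) to get the default methods and predictor matrix
init = mice(dat, maxit = 0)
meth = init$method
predM = init$predictorMatrix

# Impute "date" with proportional odds logistic regression (ordered factor)
meth["date"] = "polr"

set.seed(103)
imputed = mice(dat, method = meth, predictorMatrix = predM, m = 5)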
What code have you tried so far? - Zach
If the dates are ordered, how would you get 27/01/1990 between 25/01/1990 and 26/01/1990? - Matt Hogan-Jones
Corrected, sorry for the typo - The_Car_a_Carn

1 Answer

Try na.approx from the zoo package, which fills each NA by linear interpolation between its neighbours:

library(zoo)

x <- as.Date(c("25/01/1990", NA, "27/01/1990"), format = "%d/%m/%Y")
as.Date(na.approx(x))

giving:

[1] "1990-01-25" "1990-01-26" "1990-01-27"
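
For a longer column you may also hit runs of several NAs, or NAs at the very start or end, where there is no neighbour on one side. Below is a minimal sketch of the same position-based idea using only base R's approx(). The helper name impute_dates_by_position and the sample dates are made up for illustration, and carrying the nearest known date outward (rule = 2) and rounding down to whole days are assumptions that may not suit your data:

```r
# Hypothetical helper: fill NA dates by linear interpolation on row position.
# Assumes `d` is a Date vector, ordered, with at least two non-NA values.
impute_dates_by_position <- function(d) {
  stopifnot(inherits(d, "Date"))
  idx   <- seq_along(d)
  known <- !is.na(d)
  # Interpolate the numeric day counts over row positions.
  # rule = 2 repeats the nearest known date for leading/trailing NAs.
  filled <- approx(x = idx[known], y = as.numeric(d[known]),
                   xout = idx, rule = 2)$y
  # Round down so every result is a whole calendar day.
  as.Date(floor(filled), origin = "1970-01-01")
}

# Made-up example: NAs at both ends and a run of two NAs in the middle.
dates <- as.Date(c(NA, "25/01/1990", NA, NA, "28/01/1990", NA),
                 format = "%d/%m/%Y")
imputed <- impute_dates_by_position(dates)
print(imputed)
```

Each imputed value stays between the known dates on either side of it, so the ordering of the column is preserved. For a data frame you would assign the result back, e.g. dat$date <- impute_dates_by_position(dat$date), assuming dat$date has already been converted with as.Date.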