1
votes

I have a data frame with a date column titled 'End_Date'. I want to add a new column 'Start_Date' that calculates the day after the previous 'End_Date'. For example:

End_Date <- as.Date(c("2010-11-01", "2010-11-18", "2010-11-26"))
dates <- data.frame(End_Date)

In this example, I want the new column 'Start_Date' to look like this: dates$Start_Date <- as.Date(c(NA, "2010-11-02", "2010-11-19"))

I have tried using sapply, but get an error stating that the new column has one too few rows:

dates$Start_Date <-sapply(2:nrow(dates), 
     function(i) (dates$End_Date[i]-dates$End_Date[i-1]))

Here, I created a data frame with only 3 rows just as an example, but I need a solution that I can apply to data frames with large numbers of rows.

1
as.Date(c(NA, (end+1)[-length(end)]), origin = "1970-01-01") - Rich Scriven
I created a data frame with only 3 rows for this example. But I will need to do this for a data frame with ~700 rows, so I am looking for some sort of loop or function for this. - user3791234
Why would you choose to loop it when you don't have to? It doesn't matter how many rows there are. This is a vectorized operation. - Rich Scriven
Thanks for your speedy replies. Your first comment works for the example data frame, but when applied to a data frame with more rows as-is, I get an error like "replacement has 3 rows, data has 600". Perhaps it should be obvious how your commented code should be edited to apply to a larger data frame, but I can't figure it out. Also, if you have an answer, please provide it as an answer rather than a comment. - user3791234

1 Answer

0
votes

Try:

dates$Start_Date <- rep(as.Date(NA), nrow(dates))
# seq_len(...)[-1] is empty for a one-row data frame, whereas 2:nrow(dates) would count down from 2 to 1
for (i in seq_len(nrow(dates))[-1]) {
  dates$Start_Date[i] <- dates$End_Date[i - 1] + 1
}

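Or, as Richard suggested in the comments, a much better method is the vectorized one. Refer to the column as dates$End_Date rather than the standalone End_Date vector; the "replacement has 3 rows, data has 600" error is what you get when the 3-element example vector is assigned into a larger data frame:

dates$Start_Date <- as.Date(c(NA, (dates$End_Date + 1)[-nrow(dates)]), origin = "1970-01-01")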
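To show that the vectorized version does not depend on the number of rows, here is a minimal sketch that wraps the same idea in a function. The helper name add_start_date and the 700-row data frame big are made up for illustration; only dates and End_Date come from the question. It assumes the data frame has at least one row and is already sorted by End_Date.

```r
# Hypothetical helper (not from the question): shift End_Date down one row
# and add one day. Assumes df has at least one row and a Date column End_Date.
add_start_date <- function(df) {
  # head(x, -1) drops the last element; the first row has no previous
  # End_Date, so its Start_Date is NA. Starting c() with a Date keeps the
  # result a Date, so no origin argument is needed.
  df$Start_Date <- c(as.Date(NA), head(df$End_Date, -1) + 1)
  df
}

# The 3-row example from the question
dates <- data.frame(End_Date = as.Date(c("2010-11-01", "2010-11-18", "2010-11-26")))
dates <- add_start_date(dates)
print(dates)

# A made-up 700-row data frame of weekly end dates; the same call works unchanged
big <- data.frame(End_Date = seq(as.Date("2010-01-01"), by = "week", length.out = 700))
big <- add_start_date(big)
print(head(big))
```

Because the function always reads df$End_Date from the data frame it is given, the length of the new column matches the number of rows by construction.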