0
votes

To generate the output below, I am using the following code:

safe.ifelse <- function(cond, yes, no) structure(ifelse(cond, yes, no), class = class(yes))

library(lubridate)

df <- data.frame(i_date=mdy("9/1/2011") + months(seq(0,31)), t_date=mdy("2/1/2012")) r <- seq(1:nrow(df))

r <- (r - which(df$i_date == df$t_date)) %/% 12

df$r_date <- as.Date(safe.ifelse(r<0, df$i_date, df$t_date + years(r)), origin = "1970-01-01")

For good reason, I get an error if I set the t_date to be beyond the biggest i_date. Does anyone know a way to avoid this error? So instead of finding where the i_date and t_date match, replicating the t_date 12 times and adding a year, replicating again 12 times etc, I would just cascade the i_date the entire way to the end of the r_date where all three columns of the data frame have the same length. So in the case I am referring to, the i_date would match the t_date if the t_date is > max(i_date) otherwise we would do what we see below. Thanks!

i_date       t_date      r_date
9/1/2011    2/1/2012    9/1/2011
10/1/2011   2/1/2012    10/1/2011
11/1/2011   2/1/2012    11/1/2011
12/1/2011   2/1/2012    12/1/2011
1/1/2012    2/1/2012    1/1/2012
2/1/2012    2/1/2012    2/1/2012
3/1/2012    2/1/2012    2/1/2012
4/1/2012    2/1/2012    2/1/2012
5/1/2012    2/1/2012    2/1/2012
6/1/2012    2/1/2012    2/1/2012
7/1/2012    2/1/2012    2/1/2012
8/1/2012    2/1/2012    2/1/2012
9/1/2012    2/1/2012    2/1/2012
10/1/2012   2/1/2012    2/1/2012
11/1/2012   2/1/2012    2/1/2012
12/1/2012   2/1/2012    2/1/2012
1/1/2013    2/1/2012    2/1/2012
2/1/2013    2/1/2012    2/1/2013
3/1/2013    2/1/2012    2/1/2013
4/1/2013    2/1/2012    2/1/2013
5/1/2013    2/1/2012    2/1/2013
6/1/2013    2/1/2012    2/1/2013
7/1/2013    2/1/2012    2/1/2013
8/1/2013    2/1/2012    2/1/2013
9/1/2013    2/1/2012    2/1/2013
10/1/2013   2/1/2012    2/1/2013
11/1/2013   2/1/2012    2/1/2013
12/1/2013   2/1/2012    2/1/2013
1/1/2014    2/1/2012    2/1/2013
2/1/2014    2/1/2012    2/1/2014
3/1/2014    2/1/2012    2/1/2014
4/1/2014    2/1/2012    2/1/2014
2
It's nice to have the desired output, but what was the input?MrFlick
The input is the first two columns.Matthew Lundberg
Ideally I would use the first 2 columns in my data frame to generate the 3rd column.user3743201

2 Answers

0
votes

This should work for ordered data where the i_date increments in steps of one month as in data above. I'll use lubridate package to make it easier to manipulate the dates.

I replicate your data above as a dataframe df.

library(lubridate)
td <- mdy("2/1/2012")
df <- data.frame(i_date=mdy("9/1/2011") + months(seq(0,31)),
                 t_date=td)

I create temporary index in variable r to indicate how many years (in essence 12 months) to add. Then simply add r number of years to t_date from the point where r is not negative (i.e. i_date is no longer smaller than t_date). Set r to negative if t_date is not in the range of i_date.

if (td %in% df$i_date) {
  r <- (seq(1:nrow(df)) - which(df$i_date == df$t_date)) %/% 12
} else { r <- rep(-1, nrow(df)) }
df$r_date <- as.POSIXct(ifelse(r<0, df$i_date,
                                    df$t_date + years(r)), origin = "1970-01-01")

My result below.

       i_date     t_date              r_date
1  2011-09-01 2012-02-01 2011-09-01 08:00:00
2  2011-10-01 2012-02-01 2011-10-01 08:00:00
3  2011-11-01 2012-02-01 2011-11-01 08:00:00
4  2011-12-01 2012-02-01 2011-12-01 08:00:00
5  2012-01-01 2012-02-01 2012-01-01 08:00:00
6  2012-02-01 2012-02-01 2012-02-01 08:00:00
7  2012-03-01 2012-02-01 2012-02-01 08:00:00
8  2012-04-01 2012-02-01 2012-02-01 08:00:00
9  2012-05-01 2012-02-01 2012-02-01 08:00:00
10 2012-06-01 2012-02-01 2012-02-01 08:00:00
11 2012-07-01 2012-02-01 2012-02-01 08:00:00
12 2012-08-01 2012-02-01 2012-02-01 08:00:00
13 2012-09-01 2012-02-01 2012-02-01 08:00:00
14 2012-10-01 2012-02-01 2012-02-01 08:00:00
15 2012-11-01 2012-02-01 2012-02-01 08:00:00
16 2012-12-01 2012-02-01 2012-02-01 08:00:00
17 2013-01-01 2012-02-01 2012-02-01 08:00:00
18 2013-02-01 2012-02-01 2013-02-01 08:00:00
19 2013-03-01 2012-02-01 2013-02-01 08:00:00
20 2013-04-01 2012-02-01 2013-02-01 08:00:00
21 2013-05-01 2012-02-01 2013-02-01 08:00:00
22 2013-06-01 2012-02-01 2013-02-01 08:00:00
23 2013-07-01 2012-02-01 2013-02-01 08:00:00
24 2013-08-01 2012-02-01 2013-02-01 08:00:00
25 2013-09-01 2012-02-01 2013-02-01 08:00:00
26 2013-10-01 2012-02-01 2013-02-01 08:00:00
27 2013-11-01 2012-02-01 2013-02-01 08:00:00
28 2013-12-01 2012-02-01 2013-02-01 08:00:00
29 2014-01-01 2012-02-01 2013-02-01 08:00:00
30 2014-02-01 2012-02-01 2014-02-01 08:00:00
31 2014-03-01 2012-02-01 2014-02-01 08:00:00
32 2014-04-01 2012-02-01 2014-02-01 08:00:00
0
votes

I don't think using rep with ifelse makes a lot of sense because ifelse operates on each row. I assume ones you start replacing, you will continue to do so for the rest of the data.frame. Assuming your data.frame above is called x and that the first two columns are proper date classes, then i might do

ww <- seq_along(x$i_date)-which(x$i_date == x$t_date)

to identify each row by it's offset from where the values are equal. Then we can add years to the pivot date to calculate the values for the rest of the rows

pvdate <- as.Date(
    paste(as.numeric(strftime(x$t_date[ww==0], "%Y"))+0:max(floor(ww/12)), 
    strftime(x$t_date[ww==0], "%m-%d"), sep="-")
)

That's a bit of messy date arithmetic, but it gets the job done. Now i just combine the unreplaced rows with the replaced ones

x$r_date<-c(x$i_date[ww<=0], rep(pvdate, table(floor(ww[ww>0]/12))))

It's not exactly elegant, but perhaps someone will have a better solution.