0
votes

this is my first post so I do apologize if I am not specific enough.

I have a sequence of months and a data frame with approximately 100 rows, each with a unique identifier. Each identifier is associated with a start up date. I am trying to calculate the number of months since start up for each of these unique identifiers at each month in the sequence. I have tried unsuccessfully to write a for loop to accomplish this.

Example Below:

# Build Example Data Frame #
x_example <- c("A","B","C","D","E")
y_example <- c("2013-10","2013-10","2014-04","2015-06","2014-01")
x_name <- "ID"
y_name <- "StartUp"
df_example <- data.frame(x_example,y_example)
names(df_example) <- c(x_name,y_name)

# Create Sequence of Months, Format to match Data Frame, Reverse for the For Loop #
base.date <- as.Date(c("2015-11-1"))
Months <- seq.Date(from = base.date , to = Sys.Date(), by = "month")
Months.1 <- format(Months, "%Y-%m")
Months.2 <- rev(Months.1)

# Create For Loop #
require(zoo)
for(i in seq_along(Months.2))
{
  for(j in 1:length(summary(as.factor(df_example$ID), maxsum = 100000)))
  {
   Active.Months <- 12 * as.numeric((as.yearmon(Months.2 - i) - as.yearmon(df_example$StartUp)))
  }
}

The idea behind the for loop was that for every record in the Months.2 sequence, there would be a calculation of the number of months to that record (month date) from the Start Up month for each of the unique identifiers. However, this has been kicking back the error:

Error in Months.2 - i : non-numeric argument to binary operator

I am not sure what the solution is, or if I am using the for loop properly for this.

Thanks in advance for any help with solving this problem!

Edit: This is what I am hoping my expected outcome would be (this is just a sample as there are more months in the sequence):

 ID Start Up Month 2015-11 2015-12 2015-12 2016-02 2016-03
1  A        2013-10      25      26      27      28      29
2  B        2013-10      25      26      27      28      29
3  C        2014-04      19      20      21      22      23
4  D        2015-06       5       6       7       8       9
5  E        2014-01      22      23      24      25      26
2
Can you post an example of what output you are expecting so that it is easier for people to answer your question correctly?Adam Warner
I added the expected output, sorry about that!DW1

2 Answers

1
votes

One way to do it is to first use as.yearmon from zoo package to convert the dates. Then simply we iterate over months and subtract from the ones in the df_example,

library(zoo)

df_example$StartUp <- as.Date(as.yearmon(df_example$StartUp))
Months.2 <- as.Date(as.yearmon(Months.2))

df <- as.data.frame(sapply(Months.2, function(i) 
                    round(abs(difftime(df_example$StartUp, i, units = 'days')/30))))
names(df) <- Months.2
cbind(df_example, df)

#  ID StartUp 2016-07 2016-06 2016-05 2016-04 2016-03 2016-02 2016-01 2015-12 2015-11
#1  A 2013-10      33      32      31      30      29      28      27      26      25
#2  B 2013-10      33      32      31      30      29      28      27      26      25
#3  C 2014-04      27      26      25      24      23      22      21      20      19
#4  D 2015-06      13      12      11      10       9       8       7       6       5
#5  E 2014-01      30      29      28      27      26      25      24      23      22
0
votes
x_example <- c("A","B","C","D","E")
y_example <- c("2013-10","2013-10","2014-04","2015-06","2014-01")
y_example <- paste(y_example,"-01",sep = "")

# past on the "-01" because I want the later function to work. 

x_name <- "ID"
y_name <- "StartUp"
df_example <- data.frame(x_example,y_example)
names(df_example) <- c(x_name,y_name)


base.date <- as.Date(c("2015-11-01"))
Months <- seq.Date(from = base.date , to = Sys.Date(), by = "month")
Months.1 <- format(Months, "%Y-%m-%d")
Months.2 <- rev(Months.1)

monnb <- function(d) { lt <- as.POSIXlt(as.Date(d, origin="1900-01-01")); lt$year*12 + lt$mon } 
mondf <- function(d1, d2) {monnb(d2) - monnb(d1)}

NumofMonths <- abs(mondf(df_example[,2],Sys.Date()))

n = max(NumofMonths)

# sequence along the number of months and get the month count. 

monthcount <- (t(sapply(NumofMonths, function(x) pmax(seq((x-n+1),x, +1), 0) )))
monthcount <- data.frame(monthcount[,-(1:24)])
names(monthcount) <- Months.1

finalDataFrame <- cbind.data.frame(df_example,monthcount)

Here is your final data frame which is the desired output you indicated:

  ID    StartUp 2015-11-01 2015-12-01 2016-01-01 2016-02-01 2016-03-01 2016-04-01 2016-05-01 2016-06-01 2016-07-01
1  A 2013-10-01         25         26         27         28         29         30         31         32         33
2  B 2013-10-01         25         26         27         28         29         30         31         32         33
3  C 2014-04-01         19         20         21         22         23         24         25         26         27
4  D 2015-06-01          5          6          7          8          9         10         11         12         13
5  E 2014-01-01         22         23         24         25         26         27         28         29         30

The overall idea is that we calculate the number of months and use the sequence function to create a counter of the number of months until we get the current month.