I really need your help with this. I have a panel data frame that looks something like this:
  Name    A           B
1 Marco   01/09/2014  NA
2 Marco   NA          01/01/2015
3 Marco   02/01/2015  NA
4 Luca    01/01/2015  NA
5 Luca    NA          31/01/2015
6 Silvia  NA          15/01/2015
and I want to create a dummy variable that takes the value 1 if either (condition 1) the date in column A is not a 2014 date, OR (condition 2) the date in column B is a 2015 date AND, at the same time, the individual has at least one other observation and none of that individual's observations has a 2014 date in column A. In other words, I do not know how to impose a condition on the dummy that checks all the other observations related to the same individual (identified by the column "Name"). The result I want is something like this:
  Name    A           B           dummy
1 Marco   01/09/2014  NA          0
2 Marco   NA          01/01/2015  0
3 Marco   02/01/2015  NA          1
4 Luca    01/01/2015  NA          1
5 Luca    NA          31/01/2015  1
6 Silvia  NA          15/01/2015  0
In the example above, the dummy is 0 at the first observation because of the 2014 date in column A (condition 1 is not satisfied). At the second observation the dummy is 0 because, despite the 2015 date in column B, the same individual (Marco) has a 2014 date in column A in at least one of his other observations (observation 1 in this case). Observation 4 instead has the dummy equal to 1, since the date in column A is in 2015. Observation 5 has the dummy equal to 1 because it has a 2015 date in column B and the same individual (Luca) has no other observation with a 2014 date in column A (he has a 2015 date in observation 4). Finally, the dummy for Silvia must be 0 because, despite the 2015 date in column B, there is no other observation for Silvia in the data frame.
I hope this is not too convoluted and that I have explained my idea. Let me know if anything is unclear. Even leaving the specific conditions aside, it would already help a lot if you could just show me how to impose conditions across different observations related to the same individual.
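To make the request concrete, here is a minimal base R sketch of the kind of per-individual check described above. It uses ave(), which applies a function within each group and returns a vector as long as its input. It rebuilds the sample data by hand and assumes that condition 1 requires a non-missing date in column A (this is what the expected output for observation 2 suggests). The helper names (year_a, year_b, has_2014, n_obs, cond1, cond2) are only illustrative.

```r
# Sample data rebuilt by hand from the table above (dates in UTC)
df <- data.frame(
  Name = c("Marco", "Marco", "Marco", "Luca", "Luca", "Silvia"),
  A = as.POSIXct(c("2014-09-01", NA, "2015-01-02", "2015-01-01", NA, NA), tz = "UTC"),
  B = as.POSIXct(c(NA, "2015-01-01", NA, NA, "2015-01-31", "2015-01-15"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Year of each date; NA where the date is missing
year_a <- as.integer(format(df$A, "%Y"))
year_b <- as.integer(format(df$B, "%Y"))

# Row-level flag: does this row have a 2014 date in column A?
a_is_2014 <- !is.na(year_a) & year_a == 2014

# Group-level information, repeated on every row of the same individual:
# does the individual have any 2014 date in A, and how many rows do they have?
has_2014 <- as.logical(ave(a_is_2014, df$Name, FUN = any))
n_obs <- ave(seq_along(df$Name), df$Name, FUN = length)

# Condition 1 (assumption: A must be present and not in 2014)
cond1 <- !is.na(year_a) & year_a != 2014
# Condition 2: 2015 date in B, more than one row, no 2014 date in A for the individual
cond2 <- !is.na(year_b) & year_b == 2015 & n_obs > 1 & !has_2014

df$dummy <- as.integer(cond1 | cond2)
print(df)
```

The same idea carries over to grouped operations in other packages: compute the row-level flag first, then summarise it within each Name and combine the result with the row-level conditions.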
Thank you all! Marco
structure(list(
  Name = c("Marco", "Marco", "Marco", "Luca", "Luca", "Silvia"),
  A = structure(c(1409529600, NA, 1420156800, 1420070400, NA, NA),
                class = c("POSIXct", "POSIXt"), tzone = "UTC"),
  B = structure(c(NA, 1420070400, NA, NA, 1422662400, 1421280000),
                class = c("POSIXct", "POSIXt"), tzone = "UTC")),
  row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
dput() your sample data.frame so that we can start with the same data types you have. – vaettchen