I want to code a new variable in a dataframe based on a set of rules. I have a dataframe df1 with a subject variable, a time variable, and variables A, B and C, like this:
subject <- c(1,1,1,1,1,1,2,2,2,2,2,2)
time <- c(1,2,3,4,5,6,1,2,3,4,5,6)
A <- c(1,7,7,6,6,5,1,2,3,NA,NA,NA)
B <- c(2,1,1,1,1,1,6,5,4,NA,NA,NA)
C <- c(7,1,6,1,6,1,6,2,4,NA,NA,NA)
df1 <- data.frame(subject,time,A,B,C)
Values in A, B, and C range from 1 (lowest) to 7 (highest), and there are also some NAs. Now I want to code a new dichotomous variable, newvar. The first row for every subject should always be coded 0. newvar should be coded 1 whenever the set of variables holding the highest score within a row (A, B, or C, or several of them in case of a tie) differs from that set in the previous row. It doesn't matter whether a value changes from one row to the next within one variable; what matters is only whether the variable(s) with the highest score in a row differ from those in the previous row.
The examples from df1 should make this clearer:
Row 1 is coded 0 because it is the first row for subject 1. C has the highest score among the three variables A, B, and C.
In row 2, A has the highest score. Therefore, newvar = 1.
In row 3, A still has the highest score, therefore, newvar = 0.
In row 4, A still has the highest score, therefore, newvar = 0.
In row 5, now A and C both have the highest score, therefore, newvar = 1.
In row 6, only A has the highest score again, therefore, newvar = 1.
Row 7 is the first row for subject 2, therefore newvar is coded 0.
In row 8, newvar should be coded 1, because in the previous row, B and C equally had the highest score, now it is only B.
In row 9, newvar should again be coded 1, because now B and C have the highest score again within the row.
Rows 10 to 12 should be coded NA, because A, B, and C are all NA there.
This is what it should look like:
newvar <- c(0,1,0,0,1,1,0,1,1,NA,NA,NA)
df2 <- data.frame(subject,time,A,B,C,newvar)
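To make the rule concrete, here is a minimal base-R sketch of how I imagine it could work; it is not meant as a definitive solution. The helper names top_set and code_change are hypothetical, made up for illustration. It assumes that rows are already sorted by time within subject, that a row with any NA in A, B, or C gets NA, and that a row directly following an NA row would also get NA (my data does not contain that case, so this is a guess).

```r
# Example data from above
df1 <- data.frame(
  subject = c(1,1,1,1,1,1,2,2,2,2,2,2),
  time    = c(1,2,3,4,5,6,1,2,3,4,5,6),
  A = c(1,7,7,6,6,5,1,2,3,NA,NA,NA),
  B = c(2,1,1,1,1,1,6,5,4,NA,NA,NA),
  C = c(7,1,6,1,6,1,6,2,4,NA,NA,NA)
)

# Hypothetical helper: label each row by the set of variables that hold the
# row maximum, e.g. "A", "AC" or "BC". Rows containing any NA get NA
# (assumption).
top_set <- function(df, vars = c("A", "B", "C")) {
  m <- as.matrix(df[, vars])
  apply(m, 1, function(x) {
    if (anyNA(x)) return(NA_character_)
    paste(vars[x == max(x)], collapse = "")
  })
}

# Hypothetical helper: within one subject, compare each label with the label
# of the previous row. The first row is always 0; a comparison involving NA
# yields NA.
code_change <- function(key) {
  prev <- c(NA, head(key, -1))
  out <- as.integer(key != prev)
  out[1] <- 0L
  out
}

# Apply the comparison separately per subject and put the pieces back in the
# original row order.
keys <- split(top_set(df1), df1$subject)
df1$newvar <- unsplit(lapply(keys, code_change), df1$subject)
```

Comparing the pasted labels rather than the values themselves means a change such as 7 to 6 within A is ignored as long as A remains the only maximum, while a change from "A" to "AC" counts as a switch.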
I would greatly appreciate any input on how to go about this!