My data frame contains several series of columns with similar names, grouped in sets of three, like this:
> df <- data.frame(c(0,1,4,5), c(0,1,6,5), c(0,1,1,1), c(0,1,1,1), c(0,1,1,1), c(0,1,1,1), c(0,8,1,9), c(3,1,1,1), c(3,1,3,4))
> names(df) <- c("AA1","AA2","AA3","BB1","BB2","BB3","CC1","CC2","CC3")
> df
  AA1 AA2 AA3 BB1 BB2 BB3 CC1 CC2 CC3
1   0   0   0   0   0   0   0   3   3
2   1   1   1   1   1   1   8   1   1
3   4   6   1   1   1   1   1   1   3
4   5   5   1   1   1   1   9   1   4
This represents 3 measurements (1, 2, 3) per examination type (AA, BB, CC) for 4 patients. In reality I have a much larger dataset, with 3 measurements for each of 10 different examinations on 2,000 patients. I would like to add a new column that classifies disease as follows: if at least one measurement in any examination (XX1, XX2, XX3, where XX = AA, BB or CC) is > 4, then the patient has the disease (DISEASE = 1); otherwise DISEASE = 0. So the new dataset would look like this:
>
  AA1 AA2 AA3 BB1 BB2 BB3 CC1 CC2 CC3 DISEASE
1   0   0   0   0   0   0   0   3   3       0
2   1   1   1   1   1   1   8   1   1       1
3   4   6   1   1   1   1   1   1   3       1
4   5   5   1   1   1   1   9   1   4       1
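
To make the rule concrete, here is a minimal base R sketch of the classification I have in mind. It is only an illustration, not necessarily the best approach: it rebuilds the example data from the printed table above, and it assumes that every measurement column is numeric and is named with two capital letters followed by 1, 2 or 3 (the `meas_cols` helper and that naming pattern are my own assumptions and may not hold for the real dataset).

```r
# Example data, using the values from the printed table above
df <- data.frame(
  AA1 = c(0, 1, 4, 5), AA2 = c(0, 1, 6, 5), AA3 = c(0, 1, 1, 1),
  BB1 = c(0, 1, 1, 1), BB2 = c(0, 1, 1, 1), BB3 = c(0, 1, 1, 1),
  CC1 = c(0, 8, 1, 9), CC2 = c(3, 1, 1, 1), CC3 = c(3, 1, 3, 4)
)

# Assumption: measurement columns are named two capital letters plus 1, 2 or 3
meas_cols <- grep("^[A-Z]{2}[1-3]$", names(df), value = TRUE)

# Flag a patient with 1 if any measurement exceeds 4, otherwise 0
df$DISEASE <- as.integer(rowSums(df[meas_cols] > 4) > 0)

df
```

Because the comparison and `rowSums()` are vectorised, this should scale to 30 measurement columns and 2,000 patients without a loop. If the real data contain missing values, `rowSums(..., na.rm = TRUE)` would be needed, which treats a missing measurement as not exceeding the threshold.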