I have a dataset of like the following one. With subjects that join groups over time
df <- data.frame("group" = c("A", "A", "A+1","A+1", "A+1", "B","B+1","B+1"),
"id" = c("id1", "id2", "id1", "id2", "id3", "id5","id5","id1"),
"time" = c(1,1,3,3,3,2,5,5),
"Val" = c(10,10,10,10,10,12,12,12),
"groupid" = c("A", "A", "A","A", "A", "B","B","B"))
My final aim is to add all of the values grouping by id.
My problem is that as a new subject join groups, for instance, group "A" becomes Group "A+1" when subject "id3" joins at time 3, all of my observations are duplicated ("A+1" include "id1" and "id2". Hence I cannot do my sum.
In sum, I would like to remove duplicated "ids" based on "Group id" identifiers.
the final df should look like this
final <- data.frame("group" = c("A", "A", "A+1", "B","B+1"),
"id" = c("id1", "id2", "id3", "id5","id1"),
"time" = c(1,1,3,2,5),
"Val" = c(10,10,10,12,12),
"groupid" = c("A", "A", "A", "B","B"))
I thank you a lot in advance for your help
Follow up question
I also have the opposite case whereby some subject leave the group over time. The dataset looks something like this:
df2 <- data.frame("group" = c("A", "A", "A","A_1", "A_1", "B","B","B_1"),
"id" = c("id1", "id2", "id3", "id2", "id3", "id5","id1","id1"),
"time" = c(1,1,1,3,3,2,2,5),
"Val" = c(10,10,10,10,10,12,12,12),
"groupid" = c("A", "A", "A","A", "A", "B","B","B"))
In this case, one subject "id1" leaves the "group A" that becomes group "A_1", where only "id2" and "id3" are members. Similarly "id5" leaves group B that becomes "B_1" with only id1 as a member.
What I would like to have in the final dataset is an opposite type of groups identification that should look something like this
final2 <- data.frame("group" = c("A", "A", "A","A_1", "B","B","B_1"),
"id" = c("id1", "id2", "id3", "id1", "id5","id1","id5"),
"time" = c(1,1,1,3,2,2,5),
"Val" = c(10,10,10,10,12,12,12),
"groupid" = c("A", "A", "A","A", "B","B","B"))
Whereby "A_1" and "B_1" only indicate the subjects, "id1" and "id5" respectively, that has left the group, rather than the remaining ones.
Any suggestions on this approach?
I thank you in advance for your help