I want to draw a sample per group while making sure that no participant appears more than once across the samples (I need this for a between-subjects ANOVA). I have a data frame in which some participants (not all) appear twice, each time in a different group, e.g. Peter can appear in group v1=A and v2=1 but theoretically also in group v1=B and v2=3. A group is defined by the two variables v1 and v2, so the code below allows for up to 8 groups (2 levels of v1 times 4 levels of v2).
Now I want to get rid of the double appearances by taking samples per group and randomly eliminating one observation of any participant who appears twice, while keeping the samples similarly sized. I put together the following ugly code to illustrate my problem.
How do I get the last step done, so that no participant appears twice across the samples and I only have unique cases across all samples?
df1 <- data.frame(ID=c("peter","peter","chris","john","george","george","norman","josef","jan","jan","richard","richard","paul","christian","felix","felix","nick","julius","julius","moritz"),
v1=rep(c("A","B"),10),
v2=rep(c(1:4),5))
library(dplyr)
df2 <- df1 %>% group_by(v1,v2) %>% sample_n(2)
df1[sample(1:nrow(df1)), ] %>% filter(!duplicated(ID)) %>% group...- gfgm
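One way to read the idea in the comment above is: shuffle the rows, keep the first (and therefore random) row per ID, and only then sample within each group. The following is a minimal sketch under that reading, not a definitive answer. The names `unique_df` and `n_per_group` and the fixed seed are my own assumptions for illustration, and cutting every group down to the size of the smallest one is just one possible way to keep the samples similarly sized.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

df1 <- data.frame(
  ID = c("peter", "peter", "chris", "john", "george", "george", "norman",
         "josef", "jan", "jan", "richard", "richard", "paul", "christian",
         "felix", "felix", "nick", "julius", "julius", "moritz"),
  v1 = rep(c("A", "B"), 10),
  v2 = rep(1:4, 5)
)

# Step 1: shuffle the rows, then keep only the first row per participant.
# Because the order is random, which of the two rows survives is random too.
unique_df <- df1[sample(nrow(df1)), ] %>%
  filter(!duplicated(ID))

# Step 2: find the size of the smallest remaining group
# (hypothetical helper variable, used to balance the samples).
n_per_group <- unique_df %>%
  count(v1, v2) %>%
  pull(n) %>%
  min()

# Step 3: draw the same number of participants from every group.
df2 <- unique_df %>%
  group_by(v1, v2) %>%
  sample_n(n_per_group) %>%
  ungroup()
```

Since the duplicates are dropped before sampling, no ID can end up in two groups. The price is that the random deduplication can leave some groups much smaller than others, so equalizing to the smallest group may throw away a fair amount of data; repeating the procedure with different seeds and keeping the most balanced draw would be one possible workaround.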