data1=data.frame("group1"=c(1,1,1,1,2,2,2,2,3,3,3,3,1,1,1,1,2,2,2,2,3,3,3,3),
"group2"=c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2),
"var1"=c(1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,0,0,1,0,1,1,0,0,0),
"var2"=c(1,0,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,1),
"var3"=c(1,1,4,3,3,1,1,2,4,1,4,4,4,2,1,2,1,2,2,2,3,1,2,4))
data2=data.frame("group1"=rep(c(rep(1:3,2)),2),
"group2"=rep(c(rep(1:2,3))),
"var1"=sort(rep(0:1,6)),
"svar1" = c(2,2,0,3,3,3,2,2,4,1,1,1),
"var2"=sort(rep(0:1,6)),
"svar2" = c(rep(NA,12)))
I have 'data1' and want to produce 'data2', which collapses 'data1' into counts: 'svar1' and 'svar2' hold the number of times each value of 'var1' and 'var2' occurs.
To create 'svar1', we go through every combination of 'group1' and 'group2' in 'data1' and store the number of occurrences of '0' and of '1', the two response options for 'var1'. I would like to do the same for 'var2' to generate 'svar2' (currently all NA in 'data2').
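To make the intended counting concrete, here is a minimal base R sketch on a small made-up data frame. The names `toy` and `counts` and the toy values are assumptions for illustration only, not part of 'data1'. It cross-tabulates the grouping columns against 'var1' and flattens the result, so combinations that never occur are kept with a count of 0.

```r
# Toy data (made up for illustration; not the data1 from the question)
toy <- data.frame(group1 = c(1, 1, 1, 2, 2, 2),
                  group2 = c(1, 1, 1, 1, 1, 1),
                  var1   = c(0, 1, 1, 1, 1, 1))

# Cross-tabulate group1 x group2 x var1, then flatten the table to
# one row per combination; the count column is named svar1
counts <- as.data.frame(table(group1 = toy$group1,
                              group2 = toy$group2,
                              var1   = toy$var1),
                        responseName = "svar1")

# The (group1 = 2, var1 = 0) combination never occurs in toy,
# so it appears in counts with svar1 equal to 0
print(counts)
```

Note that `as.data.frame()` on a table returns the grouping columns as factors, so they may need converting back to numbers before further use.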
I would prefer a data.table solution, since the real data is large. 'var3' can be ignored for now.
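One possible data.table approach is sketched below; it is an illustration under stated assumptions, not a definitive answer. The first lines rebuild the 'group1', 'group2', 'var1' and 'var2' columns of 'data1' so the block runs on its own ('var3' is left out). The names `dt`, `grid`, `value`, `n1`, `n2` and `out` are hypothetical. The sketch counts each response per group pair, then merges the counts onto a full grid built with `CJ()` so that zero-count combinations are not dropped.

```r
library(data.table)

# Same values as group1, group2, var1 and var2 in data1 (var3 omitted)
dt <- data.table(
  group1 = rep(rep(1:3, each = 4), 2),
  group2 = rep(1:2, each = 12),
  var1   = c(1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,0,0,1,0,1,1,0,0,0),
  var2   = c(1,0,1,1,0,0,1,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,1)
)

# Every group1 x group2 x response combination, including ones
# that never occur in the data
grid <- CJ(group1 = unique(dt$group1),
           group2 = unique(dt$group2),
           value  = c(0, 1))

# Number of rows per group pair and response value, for each variable
n1 <- dt[, .(svar1 = .N), by = .(group1, group2, value = var1)]
n2 <- dt[, .(svar2 = .N), by = .(group1, group2, value = var2)]

# Left-join both sets of counts onto the full grid
keys <- c("group1", "group2", "value")
out <- merge(grid, n1, by = keys, all.x = TRUE)
out <- merge(out,  n2, by = keys, all.x = TRUE)

# Combinations absent from the data come back as NA; treat them as 0
out[is.na(svar1), svar1 := 0L]
out[is.na(svar2), svar2 := 0L]

print(out)
```

The result has one row per 'group1', 'group2' and response value, with 'svar1' and 'svar2' side by side. The row order differs from the hand-built 'data2', since `merge()` sorts by the key columns.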