I have a dataset that looks something like this:
Col1 Col2 Col3 Col4 Col5
400 322 345 1 1
131 345 809 1 1
565 676 311 2 1
121 645 777 2 1
322 534 263 3 1
545 222 111 3 1
I want to perform a group-wise calculation: for each unique value in Col5, I want to calculate a statistic for each of Col1 to Col3, grouping by Col4:
(X(i,j)-X'(i,j))/S(i)
where X(i,j) is the mean of the variable for group (i,j), i.e. (Col5, Col4); X'(i,j) is the mean of the same variable over the other groups j within the same i; and S(i) is the standard deviation of the variable over the entire group i. For example, in the data above, the statistic for Col1 for group 1 in Col4 is:
(mean(400,131)-mean(565,121,322,545))/stddev(Col1)
(265.5-388.25)/193.85 = -0.633
I want to use the summarise function with ddply to calculate this for each of the variables and for each of the groups in Col4 and Col5.
PS: I hope I've been able to explain the problem clearly.
Thanks!
(mean(c(400, 131))-mean(c(565,121,322,545)))/sd(df1$Col1) #[1] -0.6332145
The sd for Col1 is sd(df1$Col1) #[1] 193.8522
– akrun
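One possible way to compute the statistic for every variable and every (Col5, Col4) group is sketched below. It passes a custom function to ddply rather than using summarise, because each group's value also depends on the rows outside that group. The data frame name df1 is taken from the comment above; the helper name group_stat and the vector vars are made up for illustration, and the plyr package is assumed to be installed.

```r
library(plyr)

# Example data copied from the question
df1 <- data.frame(
  Col1 = c(400, 131, 565, 121, 322, 545),
  Col2 = c(322, 345, 676, 645, 534, 222),
  Col3 = c(345, 809, 311, 777, 263, 111),
  Col4 = c(1, 1, 2, 2, 3, 3),
  Col5 = c(1, 1, 1, 1, 1, 1)
)

# Columns the statistic is computed for (assumed name)
vars <- c("Col1", "Col2", "Col3")

# Hypothetical helper: d holds all rows of one Col5 group.
# Returns one row per Col4 group with (X - X') / S for each variable.
group_stat <- function(d) {
  s <- sapply(d[vars], sd)  # S(i): sd over the whole Col5 group
  rows <- lapply(sort(unique(d$Col4)), function(g) {
    inside  <- colMeans(d[d$Col4 == g, vars, drop = FALSE])  # X(i,j)
    outside <- colMeans(d[d$Col4 != g, vars, drop = FALSE])  # X'(i,j)
    data.frame(Col4 = g, t((inside - outside) / s))
  })
  do.call(rbind, rows)
}

# Apply the helper to each Col5 group; ddply adds the Col5 column back
res <- ddply(df1, .(Col5), group_stat)
res
```

For Col1 and group 1 in Col4 this reproduces the hand calculation above (about -0.633). Note that if a Col5 group contains only a single Col4 value, there are no "other" rows and the result for that group will be NaN.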