I'm struggling with chisq.test in Tableau via R. I have a model in Excel which I have to replicate, but my results differ. I think the problem is in correct massaging of the data in R code...
These are the Excel values:
p-value formula in Excel is: CHISQ.TEST(F4:G4,H4:I4).
This is how Tableau looks:
p-value formula in Tableau is:
SCRIPT_REAL("
obs_yes = .arg1;
obs_no = .arg2;
exp_yes = .arg3;
exp_no = .arg4;
df1 <- data.frame(obs_yes,obs_no);
df2 <- data.frame(exp_yes,exp_no);
print('df1 is ');
print(df1);
print('df2 is ');
print(df2);
if ( !is.na(obs_yes) && !is.na(obs_no) && !is.na(exp_yes) && !is.na(exp_no)) {
#ChiSq = chisq.test(c(df1[8,1],df1[8,2]), p=c(df2[8,1],df2[8,2]), rescale.p=TRUE);
ChiSq = chisq.test(df1, p=df2, rescale.p=TRUE);
ChiSq$p.value;
} else {
{NaN};
}
"
,[obs_yes],[obs_no],[exp_yes],[exp_no])
What I see in R debug for df1 and df2 looks OK to me:
but the p-value is not correct - looks like it is calculated over all values?
If I pass in R script only one row, then the p-value is correct:
ChiSq = chisq.test(c(df1[1,1],df1[1,2]), p=c(df2[2,1],df2[2,2]), rescale.p=TRUE);
but this doesn't help, as it is a static calculation.
Any ideas?
Here is a link to the .twbx file.
Thank you,
Adrian
P.S. A working code found on a forum (for a different setup) is below, but I don't know how to adapt it for my situation.
SCRIPT_REAL(
'mm <- data.frame(commodity = .arg1);
d <- split(mm,rep(1:.arg3[1],each=.arg2[1]/.arg3[1]));
zt <-do.call?do.c(cbind, d);chisq.test(zt)$p.value'
,SUM([Number of Records]), SIZE(),[Priorities])