0
votes

I'm trying to run diagnostics for normality in a 2^4 factorial problem with two replicates. Here is my code:

n = 2

A <- factor(c(rep("-", 1*n), rep("+", 1*n)))
B <- factor(c(rep("-", 2*n), rep("+", 2*n)))
C <- factor(c(rep("-", 4*n), rep("+", 4*n)))
D <- factor(c(rep("-", 8*n), rep("+", 8*n)))

obs <- c(90, 93,
         74, 78,
         81, 85,
         83, 80,
         77, 78,
         81, 80,
         88, 82,
         73, 70,
         98, 95,
         72, 76,
         87, 83,
         85, 86,
         99, 90,
         79, 75,
         87, 84,
         80, 80)

df <- data.frame(A, B, C, D, obs)

model <- aov(obs ~ A*B*C*D, data = df)

summary(model)


par(mfrow=c(1,2))

qqnorm(resid(model), ylab = "Residuals", xlab = "Quantiles", pch = 16)
qqline(resid(model))

plot(resid(model) ~ fitted(model), ylab = "Residual", xlab = "Predicted",            pch = 16)
abline(0,0)

The ANOVA table is giving me the correct values, but when I analyze the normality conditions using a Normal Q-Q plot, it incorrectly gives me symmetric residuals. I have noticed that I only run into this issue when I am analyzing four or more interactions. All the residual plots for three interaction or less has the correct expected output with the same code.

Any help would be greatly appreciated

1
Why do you think the residuals should not be symmetric?IRTFM

1 Answers

0
votes

I'm unclear why you think the residuals in the saturated model should not be "symmetric" you can look at them directly with:

> print( sort(model$residuals),  digits=4)
        26         14          5         19         22         28 
-4.500e+00 -3.000e+00 -2.000e+00 -2.000e+00 -2.000e+00 -2.000e+00 
         3          1         16         18         30          8 
-2.000e+00 -1.500e+00 -1.500e+00 -1.500e+00 -1.500e+00 -1.500e+00 
        23         12          9         31         32         10 
-5.000e-01 -5.000e-01 -5.000e-01  6.947e-17  6.947e-17  5.000e-01 
        11         24          7         29         15         17 
 5.000e-01  5.000e-01  1.500e+00  1.500e+00  1.500e+00  1.500e+00 
         2         20         21         27          4          6 
 1.500e+00  2.000e+00  2.000e+00  2.000e+00  2.000e+00  2.000e+00 
        13         25 
 3.000e+00  4.500e+00 

They look pretty symmetric in the sense of having paired values on either side of the median. Another way to display numeric values might be:

table( abs(sort(model$residuals)) )

6.9468775251052e-17   0.499999999999995   0.499999999999998 
                  2                   2                   1 
  0.499999999999999                 0.5   0.500000000000002 
                  1                   1                   1 
   1.49999999999999                 1.5    1.50000000000006 
                  2                   6                   2 
                  2                   3                 4.5 
                 10                   2                   2