I'm looking to create a user-defined contrast on my data. In brief, the data is organized in a dataframe, with each row having 1 of 4 possible conditions, a proportion of correct answers on a test, and 2 variables called "Schedule" and "Cluster." The head of my data looks like this:
  Subjects Condition        PC    Schedule Cluster
1        1         1 0.5555556 Interleaved Similar
2        2         1 0.3425926 Interleaved Similar
3        3         1 0.7129630 Interleaved Similar
4        4         1 0.5000000 Interleaved Similar
5        5         1 0.6296296 Interleaved Similar
6        6         1 0.6851852 Interleaved Similar
There are two main contrasts I want to run. The first compares condition 1 to the mean of conditions 2, 3, and 4. The second compares condition 4 to the mean of conditions 2 and 3. I coded my two contrasts like this:
contrast1 = c(1, -1/3, -1/3, -1/3)
contrast2 = c(0, -1/2, -1/2, 1)
I then put them into a matrix:
cond.contrasts = matrix(c(contrast1, contrast2), ncol = 2)
Per advice I saw elsewhere, I got the generalized inverse of this matrix with ginv(), a function from the MASS package:
cond.contrasts = t(ginv(cond.contrasts))
show(cond.contrasts)
      [,1]       [,2]
[1,]  0.75  0.0000000
[2,] -0.25 -0.3333333
[3,] -0.25 -0.3333333
[4,] -0.25  0.6666667
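As a sanity check on this step, here is a minimal sketch of what the transposed generalized inverse does. The names `hypotheses` and `coding` are illustrative and not part of my actual script. If the coding matrix is built correctly, multiplying the transposed hypothesis matrix by it should give back the 2 x 2 identity:

```r
library(MASS)  # provides ginv()

# The two hypothesis contrasts, one per column (4 levels x 2 contrasts)
contrast1 <- c(1, -1/3, -1/3, -1/3)
contrast2 <- c(0, -1/2, -1/2, 1)
hypotheses <- cbind(contrast1, contrast2)

# Coding matrix: transpose of the generalized inverse
coding <- t(ginv(hypotheses))

# Each hypothesis should pick out exactly one coding column,
# so this product should be (numerically) the identity matrix
check <- t(hypotheses) %*% coding
print(round(check, 10))
```

Because the two contrasts here are orthogonal to each other, each coding column is just the corresponding contrast rescaled, which matches the printed values above.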
Note there are only two contrasts here. However, my output looks like this:
lm.experiment = lm(PC ~ Condition, PC)
summary(lm.experiment)
Call:
lm(formula = PC ~ Condition, data = PC)
Residuals:
     Min       1Q   Median       3Q      Max
-0.22099 -0.12069 -0.00926  0.11443  0.35117
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.5438470  0.0136786  39.759   <2e-16 ***
Condition1   0.0263110  0.0312175   0.843    0.401
Condition2   0.0279084  0.0335882   0.831    0.408
Condition3  -0.0007032  0.0276090  -0.025    0.980
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1472 on 112 degrees of freedom
Multiple R-squared: 0.01234, Adjusted R-squared: -0.01412
F-statistic: 0.4663 on 3 and 112 DF, p-value: 0.7064
If I'm understanding this right, my contrasts should be represented by the "Condition1" and "Condition2" coefficients. However, I have no idea what "Condition3" refers to. If I ask R to show me the contrasts directly, it gives me this:
> show(contrasts(PC$Condition))
   [,1]       [,2]          [,3]
1  0.75  0.0000000  8.326673e-17
2 -0.25 -0.3333333 -7.071068e-01
3 -0.25 -0.3333333  7.071068e-01
4 -0.25  0.6666667 -2.498002e-16
Where does the third column come from? Have I done something wrong?
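One possible explanation, offered as a sketch rather than a confirmed diagnosis: a factor with 4 levels carries 3 degrees of freedom, so when fewer than 3 contrast columns are assigned, the `contrasts<-` replacement function appears to pad the matrix with extra columns orthogonal to the ones supplied. Its `how.many` argument can limit the number of columns kept. The factor `cond` below is a hypothetical stand-in for PC$Condition, not my real data:

```r
# Hypothetical 4-level factor standing in for PC$Condition
cond <- factor(rep(1:4, each = 3))

# The two-column coding matrix printed earlier in the post
cond.contrasts <- cbind(c(0.75, -0.25, -0.25, -0.25),
                        c(0, -1/3, -1/3, 2/3))

# Plain assignment: the stored matrix is padded to nlevels - 1 = 3 columns
contrasts(cond) <- cond.contrasts
padded <- contrasts(cond)
print(ncol(padded))

# The extra column is orthogonal to the intercept and to both supplied columns
print(round(crossprod(cbind(1, cond.contrasts), padded[, 3]), 10))

# Asking for only two columns keeps just the supplied contrasts
cond2 <- factor(rep(1:4, each = 3))
contrasts(cond2, how.many = 2) <- cond.contrasts
restricted <- contrasts(cond2)
print(ncol(restricted))
```

Under this reading, "Condition3" in the summary would be the coefficient for the padded column, and a model fit with `how.many = 2` would report only the two intended contrasts, leaving the remaining degree of freedom in the residual.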
Condition? – Sven Hohenstein
contrasts(PC$Condition) = cond.contrasts – Brian