I used the matchit() function from the MatchIt package to build a 1:4 treated:untreated matched dataset, aiming for a similar mean age and sex distribution in the two groups.
I have a small treated group (n = 44) and a much larger control group (n = 980). To shrink the control group and rule out age and sex as confounders, I tried to use matchit() to select 176 controls whose mean age and sex balance are similar to those of the treated group.
library(MatchIt)
m.out <- matchit(Treated ~ AGE + SEX, data = d,
                 method = "optimal",
                 ratio = 4)
The summary of the output is:
Summary of balance for matched data:
         Means Treated Means Control SD Control Mean Diff eQQ Med
distance        0.0602        0.0603     0.0250   -0.0001       0
AGE            57.5227       58.4034     7.9385   -0.8807       1
SEXF            0.4318        0.1477     0.3558    0.2841       0
SEXM            0.5682        0.8523     0.3558   -0.2841       0
Age matched well: the group means differ by less than a year. Sex, however, looked off (85% male in the matched controls vs 57% in the treated group), so I ran a chi-square test of SEX against Treated on the matched data. It showed a highly significant difference:
m.data <- match.data(m.out)  # matched sample: 44 treated + 176 controls
chisq <- with(m.data, chisq.test(SEX, Treated))
data: SEX and Treated
X-squared = 15.758, df = 1, p-value = 7.199e-05
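As a sanity check, the same statistic can be reproduced from the balance summary alone. This is only a minimal sketch: the cell counts are back-calculated from the reported proportions (43.18% of 44 treated gives 19 women, 14.77% of 176 controls gives 26 women) rather than read from the raw data, and the "1"/"0" labels for Treated are an assumed coding.

```r
# Counts back-calculated from the balance summary (assumed, not taken from the raw data):
#   treated:  44 rows, 43.18% female -> 19 F,  25 M
#   control: 176 rows, 14.77% female -> 26 F, 150 M
# matrix() fills column by column, so each column is one treatment group.
tab <- matrix(c(19, 25, 26, 150), nrow = 2,
              dimnames = list(SEX = c("F", "M"), Treated = c("1", "0")))

# chisq.test() applies Yates' continuity correction to 2x2 tables by default
res <- chisq.test(tab)
res$statistic   # about 15.758, matching the output above
res$p.value

# Proportion of each sex within each treatment group
prop.table(tab, margin = 2)
```

So the test output is consistent with the summary table: the imbalance in sex is really present in the matched sample and is not an artifact of how I ran the test.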
How do I account for the difference here? Is the problem in how I called matchit() (for example, the wrong method), or did the matching work as intended and I have applied the chi-square test to the wrong question?