2
votes

I am trying to match two samples on several covariates using MatchIt, but I am having difficulty creating samples that are similar enough. Both my samples are plenty large (~1000 in the control group, ~5000 in the comparison group).

I want to get a matched sample with participants as closely matched as possible and I am alright with losing sample size in the control group. Right now, MatchIt only returns two groups of 1000, whereas I want two groups that are very closely matched and would be fine with smaller groups (e.g., 500 instead of 1000).

Is there a way to do this through either MatchIt or another package? I would rather avoid randomly subsampling first and then matching, because I want as close a match between groups as possible.

Apologies for not having a reproducible example, I am still pretty new to using R and couldn't figure out how to make a sample of this issue...

Below is the code I have for matching the two groups.

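library(MatchIt)
library(car)  # assumed source of recode(); the '1 = 1; 2 = 0' string syntax is car's

# Drop incomplete rows, then recode Group to 1/0 (matchit() needs a binary treatment)
data <- na.omit(data)
data$Group <- as.numeric(data$Group)
data$Group <- recode(data$Group, '1 = 1; 2 = 0')

# 1:1 matching with the default method (nearest neighbor)
m.out <- matchit(Group ~ Age + YearsEdu + Income + Gender, data = data, ratio = 1)
s.out <- summary(m.out, standardize = TRUE)
plot(s.out)

# Extract the matched sample
matched.data <- match.data(m.out)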
Can you add the code of what you have tried? It will help others looking at your question. – Juan Carlos Ramirez
I added my code, sorry it is a little messy. – Zach

2 Answers

0
votes

MatchIt, like other similar packages, offers several matching routines that let you play around with the settings. Check out the method argument, which is set to method = 'nearest' by default. Unless you specify otherwise, it looks for the best available match for each treatment observation, so in your case you will always get 1000 matched pairs with this setting.

You can choose to set it to method = 'exact', which is much more restrictive. In the documentation you will find:

This technique matches each treated unit to all possible control units with exactly the same values on all the covariates, forming subclasses such that within each subclass all units (treatment and control) have the same covariate values.

On the lalonde dataset, you can run:

m.out <- matchit(treat ~ educ + black + hispan, data = lalonde, method = 'exact')
summary(m.out)

As a consequence, it discards the treatment observations that could not be matched. Have a look at the other options for method; maybe you will find something that suits you better.

That being said, be mindful not to discard too many treatment observations. If you do, you will make the treatment group look like the control group (instead of the opposite), which might lead to unwanted results.
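To check how many treatment observations a given specification discards, one option is to tabulate the matching weights. This is only a minimal sketch on the lalonde data: it assumes that unmatched units get a weight of 0 in m.exact$weights, and it uses educ, married and nodegree simply because those columns should be present in lalonde regardless of the MatchIt version.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Exact matching on a few discrete covariates (illustrative choice of covariates)
m.exact <- matchit(treat ~ educ + married + nodegree,
                   data = lalonde, method = "exact")

# Assumption: units that were not matched carry a weight of 0
matched <- m.exact$weights > 0

# Rows: treatment status; columns: whether the unit was kept in the matched sample
table(treat = lalonde$treat, matched = matched)
```

The treat = 1 row shows how many treated units were kept and how many were dropped, which is the number to keep an eye on when tightening the match.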

0
votes

You should look into the package designmatch, which implements a form of matching called cardinality matching that does what you want (i.e., it finds the largest matched set that achieves the desired balance). Unlike MatchIt, designmatch doesn't use a distance variable; instead, it uses optimization to solve the matching problem. You specify exactly how balanced you want each covariate to be, and it will do its best to solve the problem while retaining as many matches as possible. The methodology is described in Zubizarreta, Paredes, & Rosenbaum (2014).
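As a rough sketch of what that could look like, the code below follows the pattern of the cardmatch() example in the designmatch documentation, using the lalonde data shipped with that package. Treat the details as assumptions to check against your installed version: the column names (treatment, age, education, married, nodegree), the 0.05 standardized-difference tolerance, the solver settings, and the availability of the GLPK solver (via the Rglpk package) are illustrative choices, not recommendations.

```r
library(designmatch)
data("lalonde", package = "designmatch")

# designmatch expects treated units first, so sort by the treatment indicator
lalonde <- lalonde[order(lalonde$treatment, decreasing = TRUE), ]
t_ind <- lalonde$treatment

# Covariates to balance, and the largest acceptable difference in means
# (here 0.05 standard deviations for every covariate; an illustrative value)
mom_covs <- as.matrix(lalonde[, c("age", "education", "married", "nodegree")])
mom_tols <- absstddif(mom_covs, t_ind, 0.05)
mom <- list(covs = mom_covs, tols = mom_tols)

# Solver settings (assumed values; t_max is the time limit in seconds)
solver <- list(name = "glpk", t_max = 60, approximate = 1,
               round_cplex = 0, trace = 0)

# Find the largest sample that satisfies the balance constraints
out <- cardmatch(t_ind, mom = mom, solver = solver)

# Row indices of the retained treated and control units
matched <- lalonde[c(out$t_id, out$c_id), ]
table(matched$treatment)
```

Tightening the tolerance passed to absstddif() should trade sample size for closer balance, which is the trade-off described in the question.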