0
votes

I am using the MatchIt package to implement nearest neighbor matching with the Mahalanobis distance. After the matching stage, how do I get it to report which control observation was matched with each treatment observation?

The following code does not work and throws the warning "No subclassification with pure Mahalanobis distance."

library("MatchIt")

data("lalonde")

lalonde_matchit_nn <-
  matchit(
    treat ~ age + educ + black + hispan + nodegree + married + re74 + re75,
    baseline.group = 1,
    data = lalonde,
    method = "nearest",
    distance = "mahalanobis",
    subclass = T
  )

Again, what I am looking for is an ID in the output for each treatment-control pair, just like the subclass reported by other matching methods (e.g., "exact" or "cem").

1
The warning is because you have distance = "mahalanobis" and subclass = T, but you are not allowed to have those together. You can choose either subclassification (on the propensity score) or Mahalanobis distance matching. - Noah
@Noah Thanks, yes, I realize that. It seems like a strange design choice not to keep track of matched groups (which is what subclass is doing, among other things) for nearest neighbor matching. I realize that, unlike other matching methods, it's a 1:1 match, so there's no need for, e.g., weighting within groups. But having a column with the subclass ID is still useful for other purposes, and is necessary in my application. - dzeltzer
Did you see my answer here, which addresses that very issue? - Noah

1 Answer

2
votes

You are looking for two components of the matchit output object. Here the output is lalonde_matchit_nn, and the components are nn and match.matrix:

# A basic summary table of the matched data (e.g., the number of matched units)
smry <- lalonde_matchit_nn$nn

# The row names of match.matrix are the names of the treatment units, which
# come from the data frame specified in data. Each column stores the name(s)
# of the control unit(s) matched to the treatment unit of that row.
matchedPool <- lalonde_matchit_nn$match.matrix

Now look at smry and matchedPool from the code above:

smry
          Control Treated
All           429     185
Matched       185     185
Unmatched     244       0
Discarded       0       0

head(matchedPool)

     1        
NSW1 "PSID375"
NSW2 "PSID341"
NSW3 "PSID361"
NSW4 "PSID345"
NSW5 "PSID172"
NSW6 "PSID237"

smry gives the number of units of each type, and matchedPool gives the ID of the control unit matched to each treatment unit under your matching criterion, in this case the Mahalanobis distance. The warning message "No subclassification with pure Mahalanobis distance" only tells you that subclass = T cannot be combined with Mahalanobis distance matching; the match.matrix above is still produced.
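If you need one ID column per matched pair, similar to subclass, you can derive it from match.matrix. The following is a minimal sketch in base R, not part of MatchIt: pair_ids_from_match_matrix is a hypothetical helper name, and the toy matrix simply copies the first three rows printed above. It assumes matching without replacement, so that each control belongs to at most one treatment unit.

# Hypothetical helper (not a MatchIt function): assign one ID per matched set.
# mm         : a match.matrix (row names = treatment units, cells = control names)
# unit_names : names of all units, e.g. rownames() of the original data frame
pair_ids_from_match_matrix <- function(mm, unit_names) {
  ids <- setNames(rep(NA_integer_, length(unit_names)), unit_names)
  for (i in seq_len(nrow(mm))) {
    controls <- mm[i, ]
    controls <- controls[!is.na(controls)]
    if (length(controls) == 0) next  # treatment unit with no match stays NA
    ids[rownames(mm)[i]] <- i        # the treatment unit
    ids[controls] <- i               # its matched control(s)
  }
  ids
}

# Toy match.matrix built from the first three rows shown above
mm <- matrix(c("PSID375", "PSID341", "PSID361"), ncol = 1,
             dimnames = list(c("NSW1", "NSW2", "NSW3"), "1"))
# "PSID1" stands in for an unmatched control (an assumption for illustration)
units <- c("NSW1", "NSW2", "NSW3", "PSID341", "PSID361", "PSID375", "PSID1")

pair_id <- pair_ids_from_match_matrix(mm, units)
pair_id

With the real object, the same helper could be applied as lalonde$pair_id <- pair_ids_from_match_matrix(lalonde_matchit_nn$match.matrix, rownames(lalonde)); unmatched units then get NA.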

For more details, it's always good practice to refer to the package documentation: https://cran.r-project.org/web/packages/MatchIt/MatchIt.pdf