I figured out the answer to this... refer to the answer I posted below.
A) My understanding of what a caliper does (please correct me if I'm wrong):
Adding a caliper (e.g. caliper = 0.2) when matching with matchit() from R's MatchIt package means that a control group point and a treatment group point are matched only if they are within 0.2 standard deviations (of the propensity score, in this case) of each other. Treatment group points with no control group point within 0.2 s.d. are therefore left unmatched and discarded. This is supposed to improve balance (reduce bias), because matched pairs are formed only when the control and treatment points are "similar" enough to each other.
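To make that understanding concrete, here is a minimal base R sketch. The propensity scores are made-up toy values (not from the ECLS data), and I am assuming the standard deviation is taken over the propensity scores of all units pooled together; MatchIt may compute it differently.

```r
# Toy propensity scores (hypothetical values, for illustration only)
ps_treated <- c(0.30, 0.55, 0.90)
ps_control <- c(0.10, 0.28, 0.33, 0.52, 0.60)

caliper <- 0.2
# Caliper width on the propensity score scale: 0.2 standard deviations
# (assumption: SD computed over treated and control units pooled)
width <- caliper * sd(c(ps_treated, ps_control))

# Distance from each treated unit to its closest control
nearest_dist <- sapply(ps_treated, function(p) min(abs(ps_control - p)))

# Treated units with no control inside the caliper would be discarded
within_caliper <- nearest_dist <= width
print(data.frame(ps_treated, nearest_dist, within_caliper))
```

With these toy numbers the first two treated units have a control inside the caliper and the third does not, so under my understanding only the third would be discarded.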
B) My main Question:
So, when matching with replacement (replace = TRUE) and the nearest neighbour method, doesn't this mean that including a caliper can only change the matching by reducing the number of treatment group points matched, and potentially the number of control group points used for matching?
I.e., treatment group points WITH a control group point within 0.2 s.d. will be matched the same way as without a caliper (assuming the seed is constant), namely with their nearest neighbour. Treatment group points WITHOUT a control group point within 0.2 s.d. will be discarded.
In the following example, this is not what happened, so I'm very confused. Any explanations / correction of my understanding of calipers would be greatly appreciated!
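The behaviour I expected can be sketched in base R as follows. This is only my mental model of deterministic nearest neighbour matching with replacement, not MatchIt's actual implementation; nn_match is a hypothetical helper and the scores and caliper width are toy values.

```r
# Deterministic nearest-neighbour matching with replacement on a 1-D score.
# Returns, for each treated unit, the index of its matched control,
# or NA if the closest control lies outside the caliper width.
nn_match <- function(ps_treated, ps_control, width = Inf) {
  sapply(ps_treated, function(p) {
    d <- abs(ps_control - p)
    j <- which.min(d)                 # closest control (first one on ties)
    if (d[j] <= width) j else NA_integer_
  })
}

ps_treated <- c(0.30, 0.55, 0.90)
ps_control <- c(0.10, 0.28, 0.33, 0.52, 0.60)

no_cal   <- nn_match(ps_treated, ps_control)                # no caliper
with_cal <- nn_match(ps_treated, ps_control, width = 0.05)  # toy caliper width

# Under this model, every match that survives the caliper is the same
# match that was made without a caliper
kept <- !is.na(with_cal)
same_matches <- all(with_cal[kept] == no_cal[kept])
print(same_matches)
```

Under this model a caliper can only turn some matches into NA; it can never change which control a retained treated unit is paired with.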
C) Example to my question above:
In the following example (code taken from https://sejdemyr.github.io/r-tutorials/statistics/tutorial8.html), I ran PSM once with a caliper and once without. Both times, all of my treatment group points were matched (1352). So I'd expect the caliper to have had no effect (as it didn't prevent any treatment group points from getting matched), and therefore the control group points matched should be the same.
BUT that was not the case. Without a caliper, the number of control group points matched was 1164; with a caliper, it increased to 1185, which in turn changed my estimate of the effect of treatment. It also seems to have made balance worse (as seen from the linked images). Could someone please explain to me how this could have happened?
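To show what such a change would look like at the level of individual pairs, here is a small base R sketch with made-up match assignments (the unit IDs are hypothetical). With a real fit, I assume the pairs could be read from the match.matrix component of the matchit object.

```r
# Hypothetical matched-control IDs for five treated units under each fit
match_nocal <- c(t1 = "c7", t2 = "c7", t3 = "c2", t4 = "c9", t5 = "c2")
match_cal   <- c(t1 = "c7", t2 = "c4", t3 = "c2", t4 = "c9", t5 = "c5")

# Number of distinct controls used, which is what the "Matched Control"
# count reports when matching with replacement
n_nocal <- length(unique(match_nocal))
n_cal   <- length(unique(match_cal))

# Treated units whose partner changed even though none were dropped
changed <- names(match_nocal)[match_nocal != match_cal]
print(c(n_nocal = n_nocal, n_cal = n_cal))
print(changed)
```

In this toy case all five treated units stay matched, yet the number of distinct controls rises from 3 to 5 because two treated units switched partners. That is the pattern in my output that I cannot reconcile with my understanding of calipers.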
Without a caliper, I got the following results (Matched Control = 1164, Matched Treated = 1352):
Call:
matchit(formula = catholic ~ race_white + w3income + p5hmage +
p5numpla + w3momed_hsb, data = ecls_nomiss, method = "nearest",
distance = "logit", replace = TRUE)
Sample sizes:
          Control Treated
All          7915    1352
Matched      1164    1352
Unmatched    6751       0
Discarded       0       0
Effect of treatment (the catholic indicator), estimated by linear regression on the matched data = -0.177:
Call:
lm(formula = c5r2mtsc_std ~ catholic, data = dta_m)
Residuals:
    Min      1Q  Median      3Q     Max
-3.4783 -0.5803  0.0647  0.5997  3.0473
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.38637    0.02677  14.432  < 2e-16 ***
catholic    -0.17670    0.03652  -4.838 1.39e-06 ***
Balance (judged by comparing each covariate between control and treatment across the propensity score) seems to be very good: [1] https://i.stack.imgur.com/pyU6s.png
With a caliper, I got the following results (Matched Control = 1185, Matched Treated = 1352):
Call:
matchit(formula = catholic ~ race_white + w3income + p5hmage +
p5numpla + w3momed_hsb, data = ecls_nomiss, method = "nearest",
distance = "logit", replace = TRUE, caliper = 0.2)
Sample sizes:
          Control Treated
All          7915    1352
Matched      1185    1352
Unmatched    6730       0
Discarded       0       0
Estimate of effect of treatment = -0.1151, i.e. its magnitude decreased:
Call:
lm(formula = c5r2mtsc_std ~ catholic, data = dta_m)
Residuals:
    Min      1Q  Median      3Q     Max
-3.4167 -0.5649  0.0608  0.5947  3.1089
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.32477    0.02635  12.326  < 2e-16 ***
catholic    -0.11510    0.03609  -3.189  0.00144 **
Balance (judged by comparing each covariate between control and treatment across the propensity score) seems to have gotten worse, and the agreement between matched control and treatment points also got worse: [2] https://i.stack.imgur.com/Z9uLK.png