Matching is a statistical technique that evaluates the effect of a treatment by comparing treated and non-treated units in an observational study or quasi-experiment (i.e., when the treatment is not randomly assigned). The goal of matching is to reduce bias in the estimated treatment effect by finding, for every treated unit, one or more non-treated units with similar observable characteristics, so that the covariates are balanced between the two groups. Comparing outcomes between treated units and their matched non-treated units then yields an estimate of the treatment effect with less bias due to confounding.[1][2][3] Propensity score matching, an early matching technique, was developed as part of the Rubin causal model,[4] but has been shown to increase imbalance, model dependence, bias, and inefficiency, and is no longer recommended compared to other matching methods.[5] One alternative is Coarsened Exact Matching (CEM), a simple, easy-to-understand, and statistically powerful matching method.[6]
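The idea of grouping units with similar observable characteristics can be illustrated with a minimal sketch of coarsened exact matching. The sketch assumes that each covariate is coarsened into bins at user-chosen cutpoints and that only strata containing both treated and non-treated units are kept. The function names (`coarsen`, `cem_strata`), the covariates, the cutpoints, and the four units are hypothetical and chosen only for illustration; this is not a reference implementation of CEM.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)

def cem_strata(units, cutpoints):
    """Group units by their coarsened covariates.

    Only strata that contain at least one treated and one
    non-treated unit are returned; all other units are dropped.
    """
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for unit in units:
        key = tuple(coarsen(unit["x"][name], cuts)
                    for name, cuts in cutpoints.items())
        group = "treated" if unit["treated"] else "control"
        strata[key][group].append(unit["id"])
    return {k: v for k, v in strata.items() if v["treated"] and v["control"]}

# Hypothetical units and cutpoints, invented for illustration.
units = [
    {"id": "a", "treated": True,  "x": {"age": 23, "income": 31_000}},
    {"id": "b", "treated": False, "x": {"age": 27, "income": 35_000}},
    {"id": "c", "treated": False, "x": {"age": 52, "income": 33_000}},
    {"id": "d", "treated": True,  "x": {"age": 61, "income": 90_000}},
]
cutpoints = {"age": [30, 50], "income": [40_000, 80_000]}

matched = cem_strata(units, cutpoints)
print(matched)
```

In this toy input, units "a" and "b" fall into the same coarsened stratum and are retained, while "c" and "d" have no counterpart in the other group and are discarded.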
Matching has been promoted by Donald Rubin.[4] It was prominently criticized in economics by LaLonde (1986),[7] who compared estimates of treatment effects from an experiment with comparable estimates produced by matching methods and showed that the matching estimates were biased. Dehejia and Wahba (1999) reevaluated LaLonde's critique and argued that matching can produce estimates close to the experimental results.[8] Critiques similar to LaLonde's have been raised in political science[9] and sociology[10] journals.
Analysis
When the outcome of interest is binary, the most general tool for the analysis of matched data is conditional logistic regression, as it handles strata of arbitrary size and continuous or binary treatments (predictors) and can control for covariates. In particular cases, simpler tests such as the paired difference test, the McNemar test, and the Cochran–Mantel–Haenszel test are available.
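As an illustration of the simplest of these cases, the following sketch applies the McNemar test to 1:1 matched pairs with a binary outcome. It assumes each pair is given as a tuple of treated and non-treated outcomes, uses the chi-square statistic without continuity correction, and computes an exact two-sided binomial p-value from the discordant pairs. The function name `mcnemar` and the example counts are hypothetical.

```python
from math import comb

def mcnemar(pairs):
    """McNemar test for matched pairs with binary outcomes.

    `pairs` is a list of (treated_outcome, control_outcome) tuples
    with values 0 or 1. Returns the chi-square statistic (no
    continuity correction) and an exact two-sided binomial p-value
    based on the discordant pairs only.
    """
    b = sum(1 for t, u in pairs if t == 1 and u == 0)  # treated 1, control 0
    c = sum(1 for t, u in pairs if t == 0 and u == 1)  # treated 0, control 1
    n = b + c
    if n == 0:
        return 0.0, 1.0  # no discordant pairs, no evidence either way
    statistic = (b - c) ** 2 / n
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return statistic, p_value

# Hypothetical data: 8 + 2 discordant pairs and 10 concordant pairs.
pairs = [(1, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 5 + [(0, 0)] * 5
statistic, p_value = mcnemar(pairs)
print(statistic, p_value)
```

Concordant pairs carry no information about the treatment effect in this test, which is why only the counts of discordant pairs enter the calculation.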
When the outcome of interest is continuous, the analysis estimates the average treatment effect.
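For 1:1 matched pairs with a continuous outcome, one simple estimator is the mean of the within-pair outcome differences, which corresponds to a paired difference analysis. The sketch below assumes every treated unit has exactly one matched non-treated unit, so the quantity estimated is the average effect among the treated units that were matched. The function name `paired_effect` and the outcome values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_effect(pairs):
    """Estimate the treatment effect from 1:1 matched pairs.

    `pairs` is a list of (treated_outcome, control_outcome) tuples.
    Returns the mean within-pair difference and its standard error.
    """
    diffs = [treated - control for treated, control in pairs]
    estimate = mean(diffs)
    std_error = stdev(diffs) / sqrt(len(diffs))  # sample SD over sqrt(n)
    return estimate, std_error

# Hypothetical outcomes for four matched pairs.
pairs = [(12.0, 10.0), (15.0, 11.0), (9.0, 8.0), (14.0, 11.0)]
estimate, std_error = paired_effect(pairs)
print(estimate, std_error)
```

The ratio of the estimate to its standard error can be compared with a t distribution with n − 1 degrees of freedom, as in a paired t-test.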
Matching can also be used to "pre-process" a sample before analysis via another technique, such as regression analysis.[11]
Overmatching
Overmatching, or post-treatment bias, is matching on an apparent mediator that is actually a result of the exposure.[12] If the mediator itself is stratified, an obscured relation of the exposure to the disease is highly likely to be induced.[13] Overmatching thus causes statistical bias.[13]
For example, matching the control group by gestation length and/or the number of multiple births when estimating perinatal mortality and birthweight after in vitro fertilization (IVF) is overmatching, since IVF itself increases the risk of premature birth and multiple birth.[14]
Overmatching may also be regarded as a sampling bias that decreases the external validity of a study, because the controls become more similar to the cases with regard to exposure than the general population is.
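The way overmatching can obscure an effect can be shown with a small arithmetic sketch. All counts below are hypothetical and invented purely for illustration; they are not real perinatal data. The sketch assumes an extreme case in which the exposure raises mortality only by making preterm birth more likely, so that mortality within each gestation stratum is identical for exposed and unexposed births.

```python
# Hypothetical counts, invented for illustration only (not real data).
# counts[(exposed, preterm)] = (births, deaths)
counts = {
    (1, 1): (400, 40),  # exposed, preterm
    (1, 0): (600, 6),   # exposed, term
    (0, 1): (100, 10),  # unexposed, preterm
    (0, 0): (900, 9),   # unexposed, term
}

def risk(cells):
    """Pooled mortality risk over a list of (births, deaths) cells."""
    births = sum(b for b, _ in cells)
    deaths = sum(d for _, d in cells)
    return deaths / births

# Crude comparison: exposed versus unexposed, ignoring the mediator.
crude_rd = (risk([counts[1, 1], counts[1, 0]])
            - risk([counts[0, 1], counts[0, 0]]))

# Comparison within strata of the mediator (preterm versus term),
# which is what matching on gestation length effectively does.
stratum_rd = {
    preterm: risk([counts[1, preterm]]) - risk([counts[0, preterm]])
    for preterm in (1, 0)
}

print(crude_rd, stratum_rd)
```

Under these assumed counts the crude risk difference is about 0.027, while the risk difference within each gestation stratum is zero: conditioning on the mediator removes the part of the effect that runs through it.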
References
Rubin, Donald B. (1973). "Matching to Remove Bias in Observational Studies". Biometrics. 29 (1): 159–183. doi:10.2307/2529684. JSTOR 2529684.
Anderson, Dallas W.; Kish, Leslie; Cornell, Richard G. (1980). "On Stratification, Grouping and Matching". Scandinavian Journal of Statistics. 7 (2): 61–66. JSTOR 4615774.
Kupper, Lawrence L.; Karon, John M.; Kleinbaum, David G.; Morgenstern, Hal; Lewis, Donald K. (1981). "Matching in Epidemiologic Studies: Validity and Efficiency Considerations". Biometrics. 37 (2): 271–291. CiteSeerX 10.1.1.154.1197. doi:10.2307/2530417. JSTOR 2530417. PMID 7272415.
LaLonde, Robert J. (1986). "Evaluating the Econometric Evaluations of Training Programs with Experimental Data". American Economic Review. 76 (4): 604–620. JSTOR 1806062.
Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2006). "Comparing Experimental and Matching Methods Using a Large-Scale Field Experiment on Voter Mobilization". Political Analysis. 14 (1): 37–62. doi:10.1093/pan/mpj001.
Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2010). "A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark". Sociological Methods & Research. 39 (2): 256–282. doi:10.1177/0049124110378098. S2CID 37012563.
Gissler, M.; Hemminki, E. (1996). "The danger of overmatching in studies of the perinatal mortality and birthweight of infants born after assisted conception". Eur J Obstet Gynecol Reprod Biol. 69 (2): 73–75. doi:10.1016/0301-2115(95)02517-0. PMID 8902436.
Further reading
Angrist, Joshua D.; Pischke, Jörn-Steffen (2009). "Regression Meets Matching". Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. pp. 69–80. ISBN 978-0-691-12034-8.