Types of Matching in Stata
Matching methods are used to reduce selection bias in observational studies by pairing treated and control units based on their propensity scores. The most common matching techniques in Stata include:
- Nearest Neighbor (NN) Matching
- Caliper Matching
- Kernel Matching
- Radius Matching
- Stratification/Interval Matching
Each method has its own advantages and trade-offs.
1. Nearest Neighbor (NN) Matching
Concept
- Each treated unit is matched with the control unit that has the closest propensity score.
- Can be done with or without replacement.
- Can specify 1-to-1 or 1-to-many matching.
Implementation in Stata (psmatch2)
psmatch2 treatment covariates, outcome(outcome_var) neighbor(1)
- neighbor(1): Matches each treated unit to the single closest control unit.
- neighbor(3): Matches each treated unit to the three closest control units (1-to-3 matching).
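The options above can be tried on a bundled example dataset. This is a minimal sketch, not a recommended specification: it assumes psmatch2 has been installed with `ssc install psmatch2`, and the dataset (cattaneo2) and the choice of covariates are illustrative assumptions that do not come from the text above.

```stata
* Assumes: ssc install psmatch2 has been run; cattaneo2 is used only as an example.
webuse cattaneo2, clear

* Nearest neighbor results can depend on sort order when scores are tied,
* so put the data in a reproducible random order first.
set seed 12345
generate double u = runiform()
sort u

* 1-to-1 nearest neighbor matching with replacement (psmatch2's default)
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) neighbor(1)

* 1-to-1 matching without replacement: each control is used at most once
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) neighbor(1) noreplacement

* 1-to-3 matching, restricted to the region of common support
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) neighbor(3) common
```

Each call overwrites the variables psmatch2 creates (such as _pscore and _weight), so inspect them before running the next command.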
2. Caliper Matching
Concept
- Similar to NN matching, but imposes a maximum allowed difference in propensity scores (the "caliper").
- Helps avoid poor matches by ensuring that matched units are sufficiently similar.
psmatch2 treatment covariates, outcome(outcome_var) neighbor(1) caliper(0.05)
- caliper(0.05): Only accepts a control unit whose propensity score is within 0.05 of the treated unit's score; treated units with no control inside the caliper are left unmatched.
3. Kernel Matching
Concept
- Instead of picking a single nearest neighbor, kernel matching builds the comparison for each treated unit from many control units, weighting them by how close their propensity scores are.
- Each treated unit's outcome is compared to this weighted average of control outcomes.
psmatch2 treatment covariates, outcome(outcome_var) kernel
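Kernel estimates depend on the kernel type and bandwidth, so it can help to state them explicitly and vary them. The sketch below is hedged accordingly: the dataset and covariates are illustrative assumptions, the bandwidth values are arbitrary, and the bootstrap at the end is one commonly used way to get standard errors rather than something the text above prescribes. Check `help psmatch2` for the defaults in your installed version.

```stata
* Assumes: ssc install psmatch2 has been run; cattaneo2 is used only as an example.
webuse cattaneo2, clear

* Epanechnikov kernel with the bandwidth written out explicitly
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) ///
    kernel kerneltype(epan) bwidth(0.06) common

* Wider bandwidth: smoother weights, more controls contribute to each match
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) ///
    kernel kerneltype(epan) bwidth(0.10) common

* Gaussian kernel for comparison
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) ///
    kernel kerneltype(normal) bwidth(0.06) common

* Bootstrap the ATT stored in r(att); 50 replications keeps the run short
bootstrap r(att), reps(50) seed(12345): ///
    psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) kernel common
```

If the ATT moves substantially across bandwidths, that sensitivity is worth reporting alongside the estimate.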
4. Radius Matching
Concept
- Each treated unit is matched with all control units within a certain distance (radius) in propensity score space.
psmatch2 treatment covariates, outcome(outcome_var) radius caliper(0.05)
- caliper(0.05): Uses every control unit whose propensity score is within 0.05 of the treated unit's score.
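Because the radius determines how many treated units find any match at all, it is worth checking how the matched sample changes with it. The following sketch loops over two caliper values chosen arbitrarily for illustration; the dataset and covariates are also assumptions. The _treated and _support variables are created by psmatch2, and treated units with no control inside the radius should show up as off support in the tabulation.

```stata
* Assumes: ssc install psmatch2 has been run; cattaneo2 is used only as an example.
webuse cattaneo2, clear

foreach c in 0.01 0.05 {
    display as text "Radius matching with caliper = `c'"
    psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) radius caliper(`c')
    * Cross-tabulate treatment status against the on-support indicator
    tabulate _treated _support
}
```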
5. Stratification (Interval) Matching
Concept
- The propensity score is divided into intervals (strata), and treated/control units are compared within each stratum.
- Similar in spirit to coarsened exact matching (CEM), except that CEM bins the covariates themselves rather than the propensity score.
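The within-stratum comparison can be written out by hand with built-in Stata commands. This is a minimal sketch under stated assumptions: the dataset, the covariates, and the use of five equal-sized quantile strata are illustrative choices, and the estimate is an ATT formed by weighting each stratum's mean difference by its number of treated units. It does not test balance within strata and it assumes every stratum contains both treated and control units.

```stata
* Uses only built-in commands; cattaneo2 is an example dataset.
webuse cattaneo2, clear

* Estimate the propensity score
logit mbsmoke mage medu mmarried fbaby
predict double ps, pr

* Five strata at the quintiles of the estimated score
xtile stratum = ps, nquantiles(5)

* Weighted average of within-stratum treated-minus-control mean differences
local att = 0
local nt  = 0
forvalues s = 1/5 {
    quietly summarize bweight if stratum == `s' & mbsmoke == 1
    local y1 = r(mean)
    local n1 = r(N)
    quietly summarize bweight if stratum == `s' & mbsmoke == 0
    local att = `att' + `n1' * (`y1' - r(mean))
    local nt  = `nt' + `n1'
}
display "Stratified ATT estimate: " `att' / `nt'
```

Running `tabulate stratum mbsmoke` first is a quick way to confirm that no stratum is empty on either side.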
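psmatch2 itself does not provide a strata() option. Stratification on the propensity score is usually run with other user-written commands, for example pscore and atts (Becker and Ichino):
pscore treatment covariates, pscore(ps) blockid(block) numblo(5) comsup
atts outcome_var treatment, pscore(ps) blockid(block) comsup
- numblo(5): Starts from 5 propensity score blocks; pscore splits a block further when the score is not balanced within it.
Comparison Table of Matching Methods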
| Method | Matching Type | Strengths | Limitations |
| --- | --- | --- | --- |
| Nearest Neighbor (NN) | 1-to-1 or 1-to-many | Easy to interpret, real units used | Bad matches possible, may drop many controls |
| Caliper Matching | NN with a threshold | Prevents poor matches | May drop treated units |
| Kernel Matching | Weighted average of controls | Uses all data, reduces variance | Computationally intensive |
| Radius Matching | Multiple matches within a range | More control units per treated | Sample size varies |
| Stratification Matching | Groups by propensity score strata | Retains most data, simple to apply | Assumes similarity within strata |
Which Matching Method Should You Use?
- If you want simple matching → Nearest Neighbor Matching
- If you want to avoid poor matches → Caliper Matching
- If you have a large control group → Kernel Matching
- If you want a balance between NN and Kernel → Radius Matching
- If you prefer a stratified approach → Stratification Matching
Summary & Conclusion
Propensity Score Matching (PSM) methods in Stata help reduce selection bias in observational studies by balancing treated and control groups on their propensity scores. Nearest Neighbor (NN) Matching is the simplest method, pairing each treated unit with the closest control, but it may produce poor matches when the closest control is still far away. Caliper Matching improves on NN by restricting matches to a specified range, which rules out such distant pairs. Kernel Matching and Radius Matching use multiple control units per treated unit, reducing variance but requiring careful selection of the bandwidth or caliper. Stratification Matching divides the sample into propensity score bins and compares units within each bin.
Choosing the right method depends on the dataset and research goals: NN is intuitive but risky, Caliper reduces bias at the cost of sample size, Kernel and Radius improve precision but are more computationally demanding, and Stratification offers a structured approach. Regardless of the method, researchers should check balance and common support to validate results. 🚀
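The balance and common support checks can be run with pstest and psgraph, which are distributed together with psmatch2. The sketch below is one possible workflow, not the only one; the dataset, covariates, and caliper value are illustrative assumptions.

```stata
* Assumes: ssc install psmatch2 has been run; cattaneo2 is used only as an example.
webuse cattaneo2, clear

* Match first: 1-to-1 nearest neighbor within a caliper, on common support
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) ///
    neighbor(1) caliper(0.05) common

* Covariate balance before and after matching
pstest mage medu mmarried fbaby, both

* Propensity score distribution by treatment status and support
psgraph
```

Large standardized differences remaining after matching suggest revisiting the propensity score model or the matching options.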