MMM 603: Identification and Assumptions for Synthetic Control

Once we know how to construct a synthetic control (MMM 602), the natural next question is: when does it work? This post formalises the identification logic, articulating exactly which assumptions are needed, what they imply, and what happens when they fail.

1. The Factor Model Framework

Identification in Synthetic Control rests on a latent factor model for potential outcomes. In the absence of treatment, each unit’s outcome evolves as:

$$Y_{it}(0) = \alpha_i + \lambda_t + f_t' \lambda_i + \varepsilon_{it}$$

where:

$\alpha_i$ is a unit fixed effect (time-invariant level),
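$\lambda_t$ is a time fixed effect (common time trend); it is a scalar indexed by time and is distinct from the loading vector $\lambda_i$ below, which is indexed by unit,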
$f_t \in \mathbb{R}^R$ is a vector of latent common factors,
$\lambda_i \in \mathbb{R}^R$ is the unit-specific factor loading vector,
$\varepsilon_{it}$ is idiosyncratic noise with $\mathbb{E}[\varepsilon_{it}] = 0$.

This is a strict generalisation of the parallel trends assumption used in Difference-in-Differences. DiD adjusts for $\alpha_i$ and $\lambda_t$ but ignores the interactive $f_t' \lambda_i$ term—implicitly assuming $\lambda_i$ is the same for all units. Synthetic Control relaxes this by allowing each unit to have its own factor loadings, and by seeking weights that match those loadings for the treated unit.
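To see the nesting explicitly, suppose (as DiD implicitly does) that every unit shares one loading vector, written here as $\bar{\lambda}$ (a symbol introduced only for this illustration). The interactive term then folds into the time effect:

$$Y_{it}(0) = \alpha_i + \underbrace{\left(\lambda_t + f_t' \bar{\lambda}\right)}_{\text{common time effect}} + \varepsilon_{it}$$

This is an ordinary two-way fixed effects model. Treating the factors as fixed, the expected gap between any two units is constant over time, which is parallel trends:

$$\mathbb{E}[Y_{it}(0) - Y_{jt}(0)] = \alpha_i - \alpha_j$$

With heterogeneous loadings the gap instead moves with the factors:

$$\mathbb{E}[Y_{it}(0) - Y_{jt}(0)] = \alpha_i - \alpha_j + f_t' (\lambda_i - \lambda_j)$$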

2. Core Identification Assumptions

Four key assumptions underpin the identification result:

Assumption	Content
No interference	Potential outcomes for unit $i$ do not depend on treatment assigned to other units (SUTVA).
No anticipation	The treated unit’s pre-treatment outcomes are unaffected by knowledge of future treatment.
Factor model	The equation above describes the untreated potential outcome $Y_{it}(0)$ of every unit, treated and donor alike, in both pre- and post-treatment periods.
Convex hull condition	The treated unit’s factor loadings $\lambda_1$ lie within the convex hull of the donor pool’s loadings: $\lambda_1 = \sum_{j \in J} w_j^* \lambda_j$ for some $w_j^* \geq 0$ summing to one.

The convex hull condition is the binding constraint. If the treated unit is a structural outlier—the largest market, a unique geography—no convex combination of smaller donors can replicate its factor loadings, and the estimator is biased regardless of how well it fits pre-treatment data.
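Convex hull membership can be checked as a linear feasibility problem. The sketch below is illustrative only: the true loadings are latent, so in practice the inputs would be observable characteristics or estimated loadings, and the function name `in_convex_hull` and the toy numbers are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(target, donors):
    """Check whether `target` (length R) is a convex combination of the
    rows of `donors` (J x R). Returns (is_inside, weights_or_None)."""
    donors = np.asarray(donors, dtype=float)
    target = np.asarray(target, dtype=float)
    n_donors = donors.shape[0]
    # Equality constraints: match every coordinate, and weights sum to one.
    A_eq = np.vstack([donors.T, np.ones((1, n_donors))])
    b_eq = np.concatenate([target, [1.0]])
    # Zero objective: we only care whether a feasible w >= 0 exists.
    res = linprog(c=np.zeros(n_donors), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n_donors, method="highs")
    feasible = res.status == 0
    return feasible, (res.x if feasible else None)

# Hypothetical two-dimensional characteristics for three donors (a triangle).
donors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

inside, w = in_convex_hull([0.25, 0.25], donors)   # interior point
outside, _ = in_convex_hull([1.0, 1.0], donors)    # beyond the triangle

print(inside, w)
print(outside)
```

A treated unit like the second point, larger than every donor on both dimensions, has no admissible weights at all, which is the "structural outlier" case described above.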

3. The Main Identification Theorem

Theorem 6.1 (Identification of Treatment Effect). Under the four assumptions above, suppose there exist weights $w^* = (w_2^*, \ldots, w_N^*)'$ satisfying:

(1) Convexity: $w_j^* \geq 0$ for all $j \in J$ and $\sum_{j \in J} w_j^* = 1$
(2) Factor loading match: $\sum_{j \in J} w_j^* \lambda_j = \lambda_1$
(3) Unit fixed effect match: $\sum_{j \in J} w_j^* \alpha_j = \alpha_1$

Then the synthetic control estimator $\hat{\tau}_{1t} = Y_{1t} - \sum_{j \in J} w_j^* Y_{jt}$ identifies the treatment effect in expectation:

$$\mathbb{E}[\hat{\tau}_{1t}] = \tau_{1t} \quad \text{for all } t > T_0$$

The proof is direct: plug in the factor model, expand the counterfactual error, and observe that all three systematic terms cancel—the time fixed effect by the adding-up constraint, the factor-loading term by condition (2), and the unit fixed effect by condition (3). What remains is $\varepsilon_{1t} - \sum_j w_j^* \varepsilon_{jt}$, which has expectation zero.
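Written out, the cancellation looks as follows, using $Y_{1t} = Y_{1t}(0) + \tau_{1t}$ for $t > T_0$ and the factor model for every untreated outcome:

$$
\begin{aligned}
\hat{\tau}_{1t} &= Y_{1t} - \sum_{j \in J} w_j^* Y_{jt} \\
&= \tau_{1t} + \Big(\alpha_1 - \sum_{j \in J} w_j^* \alpha_j\Big) + \lambda_t \Big(1 - \sum_{j \in J} w_j^*\Big) + f_t' \Big(\lambda_1 - \sum_{j \in J} w_j^* \lambda_j\Big) + \Big(\varepsilon_{1t} - \sum_{j \in J} w_j^* \varepsilon_{jt}\Big) \\
&= \tau_{1t} + \varepsilon_{1t} - \sum_{j \in J} w_j^* \varepsilon_{jt}
\end{aligned}
$$

Each bracket in the second line is zero under one of the three conditions on $w^*$, and taking expectations of the last line gives $\mathbb{E}[\hat{\tau}_{1t}] = \tau_{1t}$.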

4. Consistency as Pre-Treatment Grows

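The theorem above assumes $w^*$ is known. In practice, weights are estimated from pre-treatment data. A consistency result shows that as $T_0$ (the number of pre-treatment periods) grows, estimated weights $\hat{w}$ converge to $w^*$, so the bias of the SC estimator for $\tau_{1t}$ vanishes at each fixed post-treatment period. The single-period noise $\varepsilon_{1t} - \sum_j w_j^* \varepsilon_{jt}$ does not shrink with $T_0$, so the result concerns the systematic part of the error.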

The intuition: more pre-treatment periods more precisely identify the factor structure, allowing better recovery of the latent loadings $\lambda_i$ and thus better weight estimation. More donors in the pool also help by spanning a larger region of loading space.
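The role of $T_0$ can be explored with a small simulation. This is a sketch under stated assumptions, not a reference implementation: the weights are fitted by simplex-constrained least squares on pre-treatment outcomes only (no covariates), the factors are deterministic sinusoids chosen for reproducibility, and the names `fit_sc_weights` and `simulate_pre` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def fit_sc_weights(y1_pre, YJ_pre):
    """Minimise ||y1_pre - YJ_pre @ w||^2 subject to w >= 0, sum(w) = 1."""
    n_donors = YJ_pre.shape[1]

    def objective(w):
        return np.sum((y1_pre - YJ_pre @ w) ** 2)

    def gradient(w):
        return -2.0 * YJ_pre.T @ (y1_pre - YJ_pre @ w)

    res = minimize(
        objective,
        x0=np.full(n_donors, 1.0 / n_donors),   # start from uniform weights
        jac=gradient,
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        method="SLSQP",
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return res.x

def simulate_pre(T0, w_star, noise_sd=0.0, seed=0):
    """Pre-treatment outcomes from the factor model with R = 2 factors and
    three donors; the treated unit is the w_star mixture of the donors."""
    rng = np.random.default_rng(seed)
    t = np.arange(T0)
    F = np.column_stack([np.sin(t / 3.0), np.cos(t / 5.0)])  # latent factors
    time_fe = 0.05 * t                                       # time fixed effect
    alpha_J = np.array([1.0, 2.0, 0.5])                      # donor fixed effects
    Lam_J = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # donor loadings
    YJ = alpha_J + time_fe[:, None] + F @ Lam_J.T
    YJ = YJ + noise_sd * rng.standard_normal((T0, 3))
    y1 = w_star @ alpha_J + time_fe + F @ (Lam_J.T @ w_star)
    y1 = y1 + noise_sd * rng.standard_normal(T0)
    return y1, YJ

w_star = np.array([0.5, 0.3, 0.2])
for T0 in (10, 40, 160):
    y1_pre, YJ_pre = simulate_pre(T0, w_star, noise_sd=0.5)
    w_hat = fit_sc_weights(y1_pre, YJ_pre)
    print(T0, np.round(w_hat, 3), float(np.abs(w_hat - w_star).max()))
```

Setting `noise_sd=0.0` recovers `w_star` essentially exactly, which isolates the effect of idiosyncratic noise on weight estimation from the effect of the factor structure itself.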

5. Bias When Identification Fails

When the convex hull condition fails, or when the weights match the loadings only approximately, a bias decomposition characterises the damage. Let $\Delta\lambda = \lambda_1 - \sum_j w_j^* \lambda_j$ be the factor loading mismatch. If the weights still sum to one and match the unit fixed effect, then:

$$\text{Bias}(\hat{\tau}_{1t}) = f_t' \Delta\lambda$$

If $\|\Delta\lambda\| \leq \delta$ and $\|f_t\| \leq L$ for all post-treatment periods:

$$|\text{Bias}(\hat{\tau}_{1t})| \leq L \cdot \delta$$

This has two practical implications:

Pre-treatment fit (e.g., root mean squared prediction error, RMSPE) proxies for $\delta$: poor fit signals factor loading mismatch.
Factor volatility $L$ governs how much that mismatch matters post-treatment. Large macro shocks amplify any pre-existing mismatch into large bias.

Good pre-treatment fit is a necessary diagnostic—but not sufficient, since $L$ is not controlled by fit statistics.
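A small numerical illustration of the bias formula and its bound, which follows from the Cauchy-Schwarz inequality. All loadings, weights, and factor values below are made up for the example, and `mismatch_bias` is a hypothetical helper name.

```python
import numpy as np

def mismatch_bias(F_post, lam_treated, Lam_donors, w):
    """Return per-period bias f_t' * delta_lambda and the bound L * delta."""
    delta_lam = lam_treated - Lam_donors.T @ w      # loading mismatch
    bias = F_post @ delta_lam                       # one value per period
    delta = np.linalg.norm(delta_lam)               # size of the mismatch
    L = np.linalg.norm(F_post, axis=1).max()        # largest factor norm
    return bias, L * delta

lam_1 = np.array([1.2, 0.8])                             # treated loadings
Lam_J = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # donor loadings
w = np.array([0.4, 0.2, 0.4])                            # imperfect weights

# Same mismatch, two different post-treatment factor regimes.
F_calm = np.array([[0.5, -0.5], [1.0, 0.0]])
F_shock = np.array([[3.0, 4.0]])

bias_calm, bound_calm = mismatch_bias(F_calm, lam_1, Lam_J, w)
bias_shock, bound_shock = mismatch_bias(F_shock, lam_1, Lam_J, w)

print(bias_calm, bound_calm)     # bias about [0.1, 0.4]
print(bias_shock, bound_shock)   # bias about [2.0]
```

The mismatch $\Delta\lambda$ is identical in both regimes; only the size of the factors changes, and the bias grows with it.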

6. Synthetic Parallel Trends and Partial Identification

The Synthetic Parallel Trends framework [Liu, 2025] provides a unifying perspective. When multiple weighting schemes achieve similar pre-treatment fit, point identification is fragile: different weights selected by different estimators (DiD, SC, SDID) may all be “admissible” yet yield different treatment effect estimates.

Define the admissible weight set for tolerance $\varepsilon \geq 0$:

$$\mathcal{W} = \left\{ w \in \mathbb{R}^{N-1} : w_j \geq 0,\ \sum_{j \in J} w_j = 1,\ \|Y_1^{\text{pre}} - Y_J^{\text{pre}} w\| \leq \varepsilon \right\}$$

Theorem 6.3 (Bounds on Treatment Effect). If $\mathcal{W} \neq \emptyset$, the identified set is the closed interval $\mathcal{I}_t = [\underline{\tau}_t, \bar{\tau}_t]$, with endpoints:

$$\underline{\tau}_t = Y_{1t} - \max_{w \in \mathcal{W}} Y_{J,t}' w, \qquad \bar{\tau}_t = Y_{1t} - \min_{w \in \mathcal{W}} Y_{J,t}' w$$

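Each bound optimises a linear objective over the convex set $\mathcal{W}$, so it is a convex programme: a linear programme when the fit tolerance is measured in the $\ell_1$ or $\ell_\infty$ norm, and a second-order cone programme under the Euclidean norm. When $\varepsilon = 0$ and the admissible weight vector is unique, the interval collapses to a point and we recover point identification under factor model assumptions.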
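A minimal sketch of the bound computation, assuming the fit tolerance is measured in the max-norm so that every constraint is linear and `scipy.optimize.linprog` applies. The function name `sc_bounds` and the toy data are assumptions for illustration; a Euclidean tolerance would need a conic solver instead.

```python
import numpy as np
from scipy.optimize import linprog

def sc_bounds(y1_pre, YJ_pre, y1_t, YJ_t, eps):
    """Bounds on tau_t over all simplex weights w whose pre-treatment fit
    satisfies max_s |y1_pre[s] - YJ_pre[s] @ w| <= eps.
    Returns (lower, upper), or None when no admissible weights exist."""
    n_donors = YJ_pre.shape[1]
    # |y1_pre - YJ_pre w| <= eps written as two stacked linear inequalities.
    A_ub = np.vstack([YJ_pre, -YJ_pre])
    b_ub = np.concatenate([y1_pre + eps, -y1_pre + eps])
    common = dict(A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, n_donors)), b_eq=[1.0],
                  bounds=[(0, None)] * n_donors, method="highs")
    smallest = linprog(c=YJ_t, **common)     # min of the synthetic outcome
    largest = linprog(c=-YJ_t, **common)     # max, via minimising the negative
    if smallest.status != 0 or largest.status != 0:
        return None
    lower = y1_t - (-largest.fun)
    upper = y1_t - smallest.fun
    return lower, upper

# Toy data: two pre-treatment periods, three donors, one post period.
YJ_pre = np.array([[1.0, 3.0, 2.0],
                   [0.0, 2.0, 1.0]])
y1_pre = np.array([2.0, 1.0])
YJ_t = np.array([1.0, 5.0, 2.0])
y1_t = 4.0

exact = sc_bounds(y1_pre, YJ_pre, y1_t, YJ_t, eps=0.0)
loose = sc_bounds(y1_pre, YJ_pre, y1_t, YJ_t, eps=0.5)
empty = sc_bounds(np.array([10.0, 10.0]), YJ_pre, y1_t, YJ_t, eps=0.0)

print(exact)   # about (1.0, 2.0)
print(loose)
print(empty)   # None
```

In this toy case every weight vector of the form $(a, a, 1 - 2a)$ with $a \in [0, 0.5]$ fits the pre-period exactly, so even a zero tolerance leaves an interval rather than a point.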

Practical takeaway: When DiD and SC give different estimates, they are selecting different weight vectors within $\mathcal{W}$. Reporting the bounds over $\mathcal{W}$ alongside both point estimates, rather than insisting on one definitive number, honestly reflects the partial identification inherent in observational panel data.

7. Implications for Practice

The theory delivers three actionable principles:

Convex hull check: Before trusting any SC estimate, assess whether the treated unit is structurally different from all donors on key observable dimensions. If it is an extreme outlier, bias is likely regardless of fit.
Factor volatility awareness: In periods of macro shocks (recessions, pandemics), pre-treatment fit statistics systematically understate post-treatment bias. Apply wider uncertainty bands.
Report bounds, not just points: When pre-treatment RMSPE tolerances allow multiple weighting schemes, compute and report the partial identification bounds alongside the point estimate.

These diagnostics are formalised in Sections 6.5 and 6.6, covered in subsequent posts.

Summary

Synthetic Control identification is grounded in a factor model where matching pre-treatment outcomes acts as a proxy for matching latent factor loadings. The key condition is that the treated unit lies within the convex hull of the donor pool in factor loading space. When this holds and weights are well-estimated, the SC estimator is unbiased; when it fails, the bias is $f_t' \Delta\lambda$, which is at most the factor loading mismatch scaled by factor volatility. The Synthetic Parallel Trends framework reframes near-identical fit from multiple weighting schemes as partial identification: bounding rather than pinpointing the treatment effect.