MMM 302-2: Staggered Adoption Designs

What is staggered adoption?

Staggered adoption means units adopt treatment at different times. Let $G_i$ be the first treated period (or $\infty$ if never treated), and assume absorbing treatment:

$$ D_{it}=\mathbf{1}\{t\ge G_i\}. $$

This pattern is common in phased rollouts, platform entry sequences, and campaign launches. The absorbing assumption carries a caveat: if treatment can switch off (a campaign ends, a feature is withdrawn), $D_{it}$ is no longer determined by $G_i$ alone, and the framework needs extensions.
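
As a minimal sketch of the definition above, the indicator $D_{it}$ can be built from $G_i$, and an observed treatment path can be checked for the absorbing property. The unit names, adoption periods, and helper functions here are hypothetical and chosen only for illustration.

```python
import math

NEVER = math.inf  # G_i = infinity marks a never-treated unit


def treatment_indicator(first_treated, period):
    """D_it = 1{t >= G_i} under absorbing treatment."""
    return int(period >= first_treated)


def is_absorbing(path):
    """True if an observed 0/1 treatment path never switches back to 0."""
    return all(later >= earlier for earlier, later in zip(path, path[1:]))


# Hypothetical rollout: unit -> first treated period G_i
first_treated = {"store_a": 2, "store_b": 4, "store_c": NEVER}
periods = range(1, 6)

design = {
    unit: [treatment_indicator(g, t) for t in periods]
    for unit, g in first_treated.items()
}

print(design["store_a"])              # [0, 1, 1, 1, 1]
print(design["store_c"])              # [0, 0, 0, 0, 0]
print(is_absorbing([0, 1, 1, 0, 1]))  # False: treatment switched off
```

Running a check like `is_absorbing` on the raw treatment paths before any estimation is a cheap way to find out whether the design actually fits the staggered-adoption setup.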

The core estimand: cohort–time effects

The natural target is the cohort–time effect:

$$ \tau(g,t)=\mathbb{E}[Y_{it}(g)-Y_{it}(\infty)\mid G_i=g],\quad t\ge g. $$

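Here $Y_{it}(g)$ is the potential outcome of unit $i$ in period $t$ if first treated in period $g$, and $Y_{it}(\infty)$ is the outcome if never treated. So $\tau(g,t)$ is the average effect for the cohort that adopts at time $g$, evaluated at calendar time $t$. Aggregating $\tau(g,t)$ over cohorts and periods gives an overall average treatment effect on the treated (ATT) or event-time effects $\theta_k$.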

Why this matters: different estimators apply different weights to $\tau(g,t)$, so understanding the estimand is essential for interpreting results.
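
To make the role of weights concrete, the sketch below aggregates one set of cohort–time effects in three ways. The values in `tau`, the cohort sizes, and the three weighting schemes are hypothetical illustrations, not the weights of any particular published estimator.

```python
from collections import defaultdict

# Hypothetical cohort-time effects tau[(g, t)] and cohort sizes
tau = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (4, 4): 1.0}
cohort_size = {2: 100, 4: 300}


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


cells = list(tau)
effects = [tau[cell] for cell in cells]

# Scheme A: each (g, t) cell weighted by cohort size (treated unit-periods)
att_unit_period = weighted_mean(effects, [cohort_size[g] for g, _ in cells])

# Scheme B: each (g, t) cell weighted equally
att_equal_cells = weighted_mean(effects, [1] * len(cells))

# Scheme C: average within cohort first, then weight cohorts by size
by_cohort = defaultdict(list)
for (g, _), effect in tau.items():
    by_cohort[g].append(effect)
cohort_means = {g: sum(v) / len(v) for g, v in by_cohort.items()}
att_by_cohort = weighted_mean(
    [cohort_means[g] for g in cohort_means],
    [cohort_size[g] for g in cohort_means],
)

print(att_unit_period)  # 1.5
print(att_equal_cells)  # 1.75
print(att_by_cohort)    # 1.25
```

All three numbers are legitimate summaries of the same underlying effects. They differ only in how much weight the early cohort's longer exposure receives.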

Event-time effects and dynamics

Event time is $k=t-G_i$, the number of periods since adoption. Event-time effects average $\tau(g, g+k)$ across the cohorts that are observed at that $k$:

$$ \theta_k=\mathbb{E}[Y_{i,G_i+k}(G_i)-Y_{i,G_i+k}(\infty)\mid G_i<\infty]. $$

These profiles show whether effects grow, fade, or reverse after adoption. Estimates at pre-treatment event times ($k<0$) should be near zero under parallel trends and no anticipation, which makes them the most direct diagnostic available for parallel trends.
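
The averaging step can be sketched as follows, assuming cohort-size weights and a hypothetical table of cohort–time effects. The function name and all numbers are illustrative. The sketch also records which cohorts contribute at each $k$, because later event times are typically identified by early adopters only.

```python
def event_time_effects(tau, cohort_size):
    """Cohort-size-weighted average of tau(g, g + k) for each event time k."""
    numerator, denominator, contributors = {}, {}, {}
    for (g, t), effect in tau.items():
        k = t - g
        numerator[k] = numerator.get(k, 0.0) + cohort_size[g] * effect
        denominator[k] = denominator.get(k, 0) + cohort_size[g]
        contributors.setdefault(k, []).append(g)
    theta = {k: numerator[k] / denominator[k] for k in sorted(numerator)}
    return theta, contributors


# Hypothetical effects over periods 1..5 for cohorts adopting at 2 and 4
tau = {
    (2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (2, 5): 4.0,
    (4, 4): 0.5, (4, 5): 1.0,
}
cohort_size = {2: 100, 4: 300}

theta, contributors = event_time_effects(tau, cohort_size)
print(theta)         # {0: 0.625, 1: 1.25, 2: 3.0, 3: 4.0}
print(contributors)  # {0: [2, 4], 1: [2, 4], 2: [2], 3: [2]}
```

In this example the jump between $k=1$ and $k=2$ partly reflects a change in which cohorts are averaged, not only dynamics within a cohort. This is one reason to inspect cohort composition alongside the event-time profile.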

Identification: parallel trends across cohorts

Staggered adoption relies on parallel trends across adoption cohorts. In words: absent treatment, early- and late-adopting units would have evolved similarly.
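
One common way to formalize this pairs no anticipation with parallel trends in untreated potential outcomes. This is a sketch: the exact form of the assumption varies by estimator, and some versions only require it relative to never-treated or not-yet-treated units.

$$ Y_{it}(g)=Y_{it}(\infty)\quad\text{for } t<g, $$

$$ \mathbb{E}[Y_{it}(\infty)-Y_{i,t-1}(\infty)\mid G_i=g]=\mathbb{E}[Y_{it}(\infty)-Y_{i,t-1}(\infty)\mid G_i=g']\quad\text{for all } g,g',t. $$

Under these two conditions, and when never-treated units are available, the cohort–time effect equals a difference-in-differences against the last pre-treatment period $g-1$:

$$ \tau(g,t)=\mathbb{E}[Y_{it}-Y_{i,g-1}\mid G_i=g]-\mathbb{E}[Y_{it}-Y_{i,g-1}\mid G_i=\infty],\quad t\ge g. $$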

Because this assumption concerns counterfactual outcomes that are never observed, it cannot be tested directly. We rely on diagnostics instead:

Cohort-specific event-study plots.
Placebo tests in pre-treatment windows.
Sensitivity analysis when pre-trends diverge.
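
The placebo idea in the list above can be sketched with group means: for a cohort first treated at $g$, compute difference-in-differences contrasts against a never-treated group at periods before $g-1$, where no effect should appear. The group labels, outcome means, and function names below are hypothetical. In practice these contrasts would be judged against their standard errors rather than against exact zero.

```python
def did(mean_y, treated, control, t, base):
    """Change from base to t for the treated group minus the same change for controls."""
    treated_change = mean_y[treated][t] - mean_y[treated][base]
    control_change = mean_y[control][t] - mean_y[control][base]
    return treated_change - control_change


def placebo_estimates(mean_y, cohort, control, first_treated):
    """Pre-period contrasts keyed by event time k = t - g, using base period g - 1."""
    base = first_treated - 1
    return {
        t - first_treated: did(mean_y, cohort, control, t, base)
        for t in sorted(mean_y[cohort])
        if t < base
    }


# Hypothetical mean outcomes by group and period; both cohorts adopt at t = 4
mean_y = {
    "parallel": {1: 10.0, 2: 11.0, 3: 12.0, 4: 15.0},
    "drifting": {1: 10.0, 2: 11.5, 3: 13.0, 4: 16.0},
    "never":    {1: 8.0,  2: 9.0,  3: 10.0, 4: 11.0},
}

print(placebo_estimates(mean_y, "parallel", "never", first_treated=4))
# {-3: 0.0, -2: 0.0}
print(placebo_estimates(mean_y, "drifting", "never", first_treated=4))
# {-3: -1.0, -2: -0.5}
```

The second cohort's nonzero, trending placebo contrasts are the kind of pattern that would prompt the sensitivity analysis mentioned above.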

If parallel trends is implausible, alternatives include factor models or synthetic control.

Why TWFE can fail

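The traditional two-way fixed effects (TWFE) regression, with unit and period fixed effects and a single treatment coefficient, mixes comparisons across cohorts and times, and some of those comparisons use already-treated units as controls. When effects differ across cohorts or change with time since adoption, some $\tau(g,t)$ can enter the TWFE coefficient with negative weights, so the estimate can be misleading and can even have the wrong sign.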
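
A small deterministic example illustrates the problem with already-treated controls. The data-generating process is an assumption made for illustration: untreated outcomes follow a common trend, and the effect grows with time since adoption. The sketch isolates a single two-group, two-period comparison; it does not run a TWFE regression.

```python
import math

NEVER = math.inf


def effect(k):
    """Hypothetical dynamic effect: 1 at adoption, growing by 2 per period."""
    return 1.0 + 2.0 * k if k >= 0 else 0.0


def outcome(t, g):
    """Untreated outcome follows a common trend (equal to t); add the effect if treated."""
    return t + effect(t - g)


def did(treated_g, control_g, t, base):
    treated_change = outcome(t, treated_g) - outcome(base, treated_g)
    control_change = outcome(t, control_g) - outcome(base, control_g)
    return treated_change - control_change


# Target: effect on the late cohort (adopts at t = 4) in period 4
true_effect = effect(0)

# Comparison using the early cohort (adopted at t = 2) as the control group
forbidden = did(treated_g=4, control_g=2, t=4, base=3)

# Comparison using never-treated units as the control group
clean = did(treated_g=4, control_g=NEVER, t=4, base=3)

print(true_effect)  # 1.0
print(forbidden)    # -1.0
print(clean)        # 1.0
```

The early cohort's outcome is still rising because its own effect is growing, so subtracting its change removes more than the common trend. Here that flips the sign of a positive effect.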

Modern estimators (e.g., Callaway–Sant’Anna, Sun–Abraham) construct cleaner comparisons between treated units and not-yet-treated or never-treated controls.
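
The sketch below shows what a clean comparison can look like for a single cohort–time effect: the change in outcomes from period $g-1$ to $t$ for cohort $g$, minus the same change for never-treated or not-yet-treated units. It is a simplified, unconditional illustration in the spirit of these estimators, with no covariates or inference, and is not the implementation from either paper. The panel, unit names, and function are hypothetical.

```python
import math
from statistics import mean

NEVER = math.inf


def cohort_time_att(panel, first_treated, g, t, controls="never"):
    """DiD estimate of tau(g, t) with base period g - 1 and clean controls."""
    base = g - 1
    treated = [u for u, gi in first_treated.items() if gi == g]
    if controls == "never":
        comparison = [u for u, gi in first_treated.items() if math.isinf(gi)]
    else:
        # Not yet treated by period t (this also includes never-treated units)
        comparison = [u for u, gi in first_treated.items() if gi > t]
    if not treated or not comparison:
        raise ValueError("no treated or comparison units for this (g, t)")

    def change(unit):
        return panel[unit][t] - panel[unit][base]

    return mean(change(u) for u in treated) - mean(change(u) for u in comparison)


# Hypothetical panel: unit level + common trend + effect for treated periods
first_treated = {"a": 2, "b": 2, "c": 4, "d": NEVER, "e": NEVER}
panel = {
    "a": {1: 1.0, 2: 3.0, 3: 6.0, 4: 9.0},   # effects 1, 3, 5 from t = 2
    "b": {1: 2.0, 2: 4.0, 3: 7.0, 4: 10.0},  # effects 1, 3, 5 from t = 2
    "c": {1: 3.0, 2: 4.0, 3: 5.0, 4: 7.0},   # effect 1 at t = 4
    "d": {1: 4.0, 2: 5.0, 3: 6.0, 4: 7.0},
    "e": {1: 5.0, 2: 6.0, 3: 7.0, 4: 8.0},
}

estimates = {
    (g, t): cohort_time_att(panel, first_treated, g, t)
    for g, t in [(2, 2), (2, 3), (2, 4), (4, 4)]
}
print(estimates)
# {(2, 2): 1.0, (2, 3): 3.0, (2, 4): 5.0, (4, 4): 1.0}
print(cohort_time_att(panel, first_treated, 2, 3, controls="not_yet"))
# 3.0
```

Because each estimate compares one cohort against units that are untreated in both periods, no already-treated unit serves as a control, and the cohort–time effects can then be aggregated with explicit weights.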

Takeaway

Staggered adoption is rich but delicate. The design creates natural variation for identification, but only if parallel trends is plausible and estimators respect heterogeneity. Always identify the cohort–time estimand first, then choose an estimator that targets it transparently.

References

Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 3.2.2.
Callaway, B., and Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics.
Sun, L., and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics.
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics.