Why a notation guide
Chapter 4 introduces difference-in-differences (DiD) for staggered adoption and dynamic effects. The notation below keeps the objects straight: who is treated when, what effect is being estimated, and how effects are aggregated.
Core adoption and treatment notation
- Adoption time: $G_i \in \{1, 2, \ldots, T, \infty\}$, the first period in which unit $i$ is treated; $G_i = \infty$ means never treated.
- Treatment indicator: $D_{it} = 1\{t \ge G_i\}$.
- Event-time indicator: $D_{it}^k = 1\{t - G_i = k\}$.
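The indicators above can be built directly from adoption times. The following is a minimal sketch, assuming a toy panel; the unit labels, the adoption periods, and the helper names `treated` and `event_time_indicator` are hypothetical and not part of the chapter.

```python
import math

# Hypothetical toy panel: unit -> adoption period G_i (math.inf = never treated).
adoption = {"a": 2, "b": 3, "c": math.inf}
periods = [1, 2, 3, 4]  # t = 1, ..., T with T = 4


def treated(g, t):
    """D_it = 1{t >= G_i}; this is 0 in every period when G_i is infinite."""
    return int(t >= g)


def event_time_indicator(g, t, k):
    """D_it^k = 1{t - G_i = k}; this is 0 in every period when G_i is infinite."""
    return int(g != math.inf and t - g == k)


# Treatment indicator for every unit-period pair.
D = {(i, t): treated(g, t) for i, g in adoption.items() for t in periods}

# Event-time indicator for k = 0 (the adoption period itself).
D_k0 = {
    (i, t): event_time_indicator(g, t, 0)
    for i, g in adoption.items()
    for t in periods
}
```

Representing $\infty$ with `math.inf` lets the comparison `t >= g` handle never-treated units without a special case.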
Cohort-time effects
- Cohort-time effect: $\tau(g,t) = E[Y_{it}(g) - Y_{it}(\infty) \mid G_i = g]$ for $t \ge g$.
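This is the basic building block in staggered adoption settings: the average effect in period $t$ for cohort $g$ (the units first treated in period $g$), measured against that cohort's own potential outcome under never being treated, $Y_{it}(\infty)$.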
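To make the definition of $\tau(g,t)$ concrete, the sketch below computes it in a simulated setting where both potential outcomes are known for every unit. That is an assumption made purely for illustration, since real data never reveal both. The records and the function name `tau` are hypothetical.

```python
from statistics import mean

# Hypothetical simulated records in which BOTH potential outcomes are known.
# Each record is (cohort g, period t, Y_it(g), Y_it(inf)).
records = [
    (2, 2, 5.0, 4.0),
    (2, 2, 6.0, 4.0),
    (2, 3, 7.0, 4.5),
    (2, 3, 8.0, 5.5),
    (3, 3, 4.0, 3.5),
    (3, 3, 5.0, 4.5),
]


def tau(g, t):
    """Mean of Y_it(g) - Y_it(inf) among units with G_i = g, for t >= g."""
    if t < g:
        raise ValueError("tau(g, t) is defined here only for t >= g")
    diffs = [y_g - y_inf for (gg, tt, y_g, y_inf) in records if gg == g and tt == t]
    return mean(diffs)
```

In this toy data, `tau(2, 2)` averages the unit-level differences 1.0 and 2.0, so conditioning on $G_i = g$ amounts to filtering the records to cohort $g$ before averaging.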
Event-time effects
- Event-time effect: $\theta_k = E[Y_{i,G_i+k}(G_i) - Y_{i,G_i+k}(\infty) \mid G_i < \infty]$.
Event time $k = t - G_i$ counts periods relative to adoption. Leads have $k < 0$, lags have $k \ge 0$.
Aggregating cohort-time effects
Event-time effects are weighted averages of cohort-time effects:
$$ \theta_k = \sum_{g: g+k \le T} w_{gk} \tau(g, g+k). $$
Common choices for weights $w_{gk}$:
- Cohort-size weights: $w_{gk} \propto n_g$.
- Equal weights across observed cohorts at event time $k$.
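The two weighting choices can be compared on a small example. This is a sketch under stated assumptions: the cohort-time effects in `tau`, the cohort sizes in `n`, and the function name `theta` are hypothetical, and the weights are normalized to sum to one within each event time $k$.

```python
# Hypothetical cohort-time effects tau[(g, t)] and cohort sizes n[g], with T = 4.
T = 4
tau = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (3, 3): 0.5, (3, 4): 1.5}
n = {2: 30, 3: 10}


def theta(k, scheme="cohort_size"):
    """Weighted average of tau(g, g + k) over cohorts with g + k <= T."""
    cohorts = [g for g in n if g + k <= T and (g, g + k) in tau]
    if not cohorts:
        raise ValueError(f"no cohort is observed at event time {k}")
    if scheme == "cohort_size":
        raw = {g: float(n[g]) for g in cohorts}  # w_gk proportional to n_g
    elif scheme == "equal":
        raw = {g: 1.0 for g in cohorts}  # same weight for each observed cohort
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(raw.values())
    # Normalize so the weights sum to one within event time k.
    return sum(raw[g] / total * tau[(g, g + k)] for g in cohorts)
```

At $k = 0$ both cohorts contribute, so the two schemes give different answers (0.875 with cohort-size weights, 0.75 with equal weights). At $k = 2$ only cohort 2 is observed within $T$, so both schemes return $\tau(2, 4)$.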
Overall ATT aggregation
The aggregate ATT over treated units and post-treatment periods can be written as:
$$ ATT_{agg} = \sum_{g<\infty} \sum_{t \ge g} w_{gt} \tau(g,t). $$
Different weighting schemes answer different questions (cohort-size, calendar time, or equal weights).
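A short numerical sketch shows how the choice of $w_{gt}$ moves the aggregate. The values in `tau` and `n` and the function name `att_agg` are hypothetical. Weighting each $(g,t)$ cell by $n_g$ is one reading of cohort-size weights, assumed here for illustration, and all weights are normalized to sum to one.

```python
# Hypothetical cohort-time effects tau[(g, t)] for t >= g, and cohort sizes n[g].
tau = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (3, 3): 0.5, (3, 4): 1.5}
n = {2: 30, 3: 10}


def att_agg(raw_weights):
    """Weighted average of tau(g, t), with the weights normalized to sum to one."""
    total = sum(raw_weights.values())
    return sum(w / total * tau[cell] for cell, w in raw_weights.items())


# Cohort-size weights: each (g, t) cell is weighted by n_g (an assumed reading).
w_size = {(g, t): float(n[g]) for (g, t) in tau}

# Equal weights: every post-treatment (g, t) cell counts the same.
w_equal = {cell: 1.0 for cell in tau}

att_size = att_agg(w_size)
att_equal = att_agg(w_equal)
```

Here the cohort-size version puts more weight on the larger, earlier cohort, whose effects are bigger in this toy data, so `att_size` exceeds `att_equal`.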
Long-run multiplier
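For dynamic profiles, a common summary is the long-run multiplier, the cumulative effect through horizon $K$ relative to the impact effect $\theta_0$ (which must be nonzero):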
$$ LRM = \frac{\sum_{k=0}^K \theta_k}{\theta_0}. $$
Control groups and diagnostics
- Never-treated controls: units with $G_i = \infty$.
- Not-yet-treated controls: units with $G_i > t$.
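The two control-group definitions translate into simple filters on adoption times. The sketch below assumes a hypothetical toy panel; the unit labels and the function names `never_treated` and `not_yet_treated` are illustrative only.

```python
import math

# Hypothetical toy panel: unit -> adoption period G_i (math.inf = never treated).
adoption = {"a": 2, "b": 3, "c": 4, "d": math.inf}


def never_treated(adoption):
    """Units with G_i = infinity."""
    return sorted(i for i, g in adoption.items() if g == math.inf)


def not_yet_treated(adoption, t):
    """Units with G_i > t, i.e. still untreated in period t."""
    return sorted(i for i, g in adoption.items() if g > t)
```

Because $\infty > t$ for every period, the condition $G_i > t$ as written also admits never-treated units, so `not_yet_treated(adoption, 2)` returns units `b`, `c`, and `d`. The not-yet-treated pool shrinks as $t$ grows, while the never-treated pool stays fixed.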
Pre-trend diagnostics use leads ($k < 0$). A common normalization is $\theta_{-1}=0$, so the remaining leads are read relative to the period just before adoption; systematic deviations from zero in $\theta_k$ for $k < -1$ signal potential violations of parallel trends or anticipation.
Takeaway
DiD for staggered adoption is built from cohort-time effects and transparent aggregation. Keep track of $G_i$, $\tau(g,t)$, and $\theta_k$, and the rest of the machinery follows.
References
- Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Chapter 4 notation guide.
- Callaway, B., and Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods.
- Sun, L., and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous effects.