Why a notation guide

Chapter 4 introduces difference-in-differences (DiD) for staggered adoption and dynamic effects. The notation below keeps the objects straight: who is treated when, what effect is being estimated, and how effects are aggregated.

Core adoption and treatment notation

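  • Adoption time: $G_i \in \{1, 2, \ldots, T, \infty\}$ is the first period in which unit $i$ is treated, where $G_i = \infty$ means never treated.
  • Treatment indicator: $D_{it} = 1\{t \ge G_i\}$, so treatment is absorbing: once on, it stays on, and $D_{it} = 0$ in every period for never-treated units.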
  • Event-time indicator: $D_{it}^k = 1\{t - G_i = k\}$.
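
As a minimal sketch of how these indicators could be built for a small panel, the Python below represents never-treated units with `math.inf`. The function names and the toy adoption times are illustrative assumptions, not part of the chapter.

```python
import math

def treatment_indicator(g, t):
    """D_it = 1{t >= G_i}; for never-treated units (g = inf) this is always 0."""
    return int(t >= g)

def event_time_indicator(g, t, k):
    """D_it^k = 1{t - G_i = k}; always 0 for never-treated units."""
    if math.isinf(g):
        return 0
    return int(t - g == k)

# Hypothetical adoption times for three units over T = 5 periods.
adoption = {"a": 2, "b": 4, "c": math.inf}
T = 5

for unit, g in adoption.items():
    d_path = [treatment_indicator(g, t) for t in range(1, T + 1)]
    print(unit, d_path)
# a [0, 1, 1, 1, 1]
# b [0, 0, 0, 1, 1]
# c [0, 0, 0, 0, 0]
```

Encoding "never treated" as infinity means the same comparison `t >= g` works for every unit without a special case.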

Cohort-time effects

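  • Cohort-time effect: $\tau(g,t) = E[Y_{it}(g) - Y_{it}(\infty) \mid G_i = g]$ for $t \ge g$, where $Y_{it}(g)$ is the potential outcome of unit $i$ in period $t$ if first treated in period $g$, and $Y_{it}(\infty)$ is its outcome if never treated.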

This is the basic building block in staggered adoption settings: the average effect for cohort $g$ in period $t$, measured against the outcome that same cohort would have had in period $t$ had it never been treated.
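
One way to make $\tau(g,t)$ concrete is the two-group, two-period comparison that underlies estimators such as Callaway and Sant’Anna (2021). Under parallel trends and no anticipation, the outcome change from period $g-1$ to period $t$ for cohort $g$ is compared with the same change among never-treated units. The sketch below assumes a balanced panel stored as plain dictionaries; the function name and the toy numbers are hypothetical.

```python
import math
from statistics import mean

def cohort_time_did(y, cohort, g, t):
    """Sketch of a 2x2 DiD estimate of tau(g, t) with never-treated controls.

    y[i][s] is the outcome of unit i in period s; cohort[i] is G_i.
    The base period is g - 1, the last period before cohort g adopts.
    """
    base = g - 1
    treated = [y[i][t] - y[i][base] for i in y if cohort[i] == g]
    control = [y[i][t] - y[i][base] for i in y if math.isinf(cohort[i])]
    return mean(treated) - mean(control)

# Toy data: periods 1..3, cohort g = 2, a common trend of +1 per period,
# and a built-in effect of 2.0 in period 2 and 3.0 in period 3.
cohort = {"a": 2, "b": 2, "c": math.inf, "d": math.inf}
y = {
    "a": {1: 10.0, 2: 13.0, 3: 15.0},
    "b": {1: 12.0, 2: 15.0, 3: 17.0},
    "c": {1: 8.0, 2: 9.0, 3: 10.0},
    "d": {1: 9.0, 2: 10.0, 3: 11.0},
}

print(cohort_time_did(y, cohort, g=2, t=2))  # 2.0
print(cohort_time_did(y, cohort, g=2, t=3))  # 3.0
```

Because the toy data share a common trend, the comparison recovers the built-in effects exactly; with real data the estimate is only as credible as the parallel-trends assumption.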

Event-time effects

  • Event-time effect: $\theta_k = E[Y_{i,G_i+k}(G_i) - Y_{i,G_i+k}(\infty) \mid G_i < \infty,\ 1 \le G_i + k \le T]$, the average over treated units that are observed at event time $k$.

Event time $k = t - G_i$ is measured relative to adoption. Leads have $k < 0$, lags have $k \ge 0$, and $k = 0$ is the adoption period itself.

Aggregating cohort-time effects

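Event-time effects are weighted averages of cohort-time effects, taken over the treated cohorts that are observed at event time $k$:

$$ \theta_k = \sum_{g < \infty:\, 1 \le g+k \le T} w_{gk} \tau(g, g+k), \qquad w_{gk} \ge 0, \quad \sum_{g} w_{gk} = 1. $$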

Common choices for weights $w_{gk}$:

  • Cohort-size weights: $w_{gk} \propto n_g$, where $n_g$ is the number of units in cohort $g$.
  • Equal weights: $w_{gk} = 1/C_k$, where $C_k$ is the number of cohorts observed at event time $k$.
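
The two weighting choices can be sketched in a few lines of Python. The cohort-time effects and cohort sizes below are made-up numbers, and `event_time_effect` is a hypothetical helper written for this illustration.

```python
def event_time_effect(tau, n, k, T, weighting="cohort_size"):
    """Aggregate tau[(g, t)] into theta_k over cohorts observed at event time k."""
    cohorts = sorted(g for g in n if 1 <= g + k <= T and (g, g + k) in tau)
    if weighting == "cohort_size":
        raw = {g: float(n[g]) for g in cohorts}
    elif weighting == "equal":
        raw = {g: 1.0 for g in cohorts}
    else:
        raise ValueError(f"unknown weighting: {weighting}")
    total = sum(raw.values())  # normalize so the weights sum to one
    return sum(raw[g] / total * tau[(g, g + k)] for g in cohorts)

# Hypothetical cohort-time effects for cohorts 2 and 3 in a T = 4 panel.
tau = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (3, 3): 2.0, (3, 4): 4.0}
n = {2: 30, 3: 10}  # cohort sizes n_g

theta_1_size = event_time_effect(tau, n, k=1, T=4)
theta_1_equal = event_time_effect(tau, n, k=1, T=4, weighting="equal")
print(theta_1_size, theta_1_equal)  # 2.5 3.0
```

At $k = 1$ both cohorts contribute, so the two schemes disagree (2.5 versus 3.0) because the larger cohort has the smaller effect. At $k = 2$ only cohort 2 is observed, which shows how the set of contributing cohorts changes with event time.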

Overall ATT aggregation

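The aggregate ATT over treated units and post-treatment periods can be written as a weighted average of all post-treatment cohort-time effects:

$$ ATT_{agg} = \sum_{g<\infty} \sum_{t = g}^{T} w_{gt} \tau(g,t), \qquad \sum_{g<\infty} \sum_{t = g}^{T} w_{gt} = 1. $$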

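Different weighting schemes answer different questions. Cohort-size weights average over treated unit-period observations, calendar-time weights average within each period before averaging across periods, and equal weights treat every $(g,t)$ cell alike.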
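
To see how much the choice of weights can matter, the sketch below aggregates the same set of cohort-time effects two ways. All numbers are invented for illustration, and `aggregate_att` is a hypothetical helper.

```python
def aggregate_att(tau, weights):
    """ATT_agg = sum of w_gt * tau(g, t), with weights normalized to sum to one."""
    total = sum(weights[cell] for cell in tau)
    return sum(weights[cell] / total * tau[cell] for cell in tau)

# Hypothetical post-treatment cohort-time effects, keyed by (g, t), with T = 4.
tau = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (3, 3): 2.0, (3, 4): 4.0}
n = {2: 30, 3: 10}  # cohort sizes n_g

# Cohort-size weights: each (g, t) cell counts in proportion to n_g.
size_weights = {(g, t): float(n[g]) for (g, t) in tau}
# Equal weights: every post-treatment (g, t) cell counts the same.
equal_weights = {cell: 1.0 for cell in tau}

att_size = aggregate_att(tau, size_weights)
att_equal = aggregate_att(tau, equal_weights)
print(round(att_size, 4), round(att_equal, 4))  # 2.1818 2.4
```

Normalizing inside the function keeps the weights summing to one whatever scale the raw weights are given in.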

Long-run multiplier

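For dynamic profiles, a common summary is the long-run multiplier: the cumulative effect over event times $0$ through $K$ relative to the impact effect $\theta_0$, defined when $\theta_0 \ne 0$: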

$$ LRM = \frac{\sum_{k=0}^K \theta_k}{\theta_0}. $$
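
A short sketch of the calculation, using an invented event-study profile. The function name and the guard against $\theta_0 = 0$ are assumptions added for this illustration.

```python
def long_run_multiplier(theta, K):
    """LRM = (theta_0 + ... + theta_K) / theta_0; undefined when theta_0 is zero."""
    if theta[0] == 0:
        raise ZeroDivisionError("theta_0 is zero, so the ratio is undefined")
    return sum(theta[k] for k in range(K + 1)) / theta[0]

# Hypothetical event-time effects keyed by k; leads are ignored by the sum.
theta = {-2: 0.1, -1: 0.0, 0: 2.0, 1: 3.0, 2: 1.0}

lrm = long_run_multiplier(theta, K=2)
print(lrm)  # 3.0
```

Here the cumulative effect over three periods is three times the impact effect. The ratio becomes unstable when $\theta_0$ is close to zero, so it is worth reporting the numerator alongside it.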

Control groups and diagnostics

  • Never-treated controls: units with $G_i = \infty$.
  • Not-yet-treated controls: units with $G_i > t$, which includes never-treated units as well as cohorts that adopt after period $t$.
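
The difference between the two control groups is easy to see on a toy panel. The helper `control_units` and the adoption times below are hypothetical.

```python
import math

def control_units(cohort, t, kind="never_treated"):
    """Return the units eligible as controls in period t."""
    if kind == "never_treated":
        return sorted(i for i, g in cohort.items() if math.isinf(g))
    if kind == "not_yet_treated":
        # G_i > t also holds for never-treated units, since inf > t.
        return sorted(i for i, g in cohort.items() if g > t)
    raise ValueError(f"unknown control group: {kind}")

# Hypothetical adoption times G_i.
cohort = {"a": 2, "b": 4, "c": math.inf}

print(control_units(cohort, t=3))                          # ['c']
print(control_units(cohort, t=3, kind="not_yet_treated"))  # ['b', 'c']
```

The not-yet-treated set shrinks as $t$ grows, while the never-treated set is fixed across periods.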

Pre-trend diagnostics use leads ($k < 0$). A common normalization sets $\theta_{-1}=0$, so every other coefficient is measured relative to the period just before adoption. Systematic nonzero estimates for $k < -1$ then signal potential violations of parallel trends or no anticipation.
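
A rough way to screen leads is to flag those whose estimate is large relative to its standard error. The sketch below does only a pointwise check with a conventional 1.96 cutoff, which is an assumption of this example; a joint test of all leads is usually more appropriate. The estimates, standard errors, and function name are invented.

```python
def flag_leads(estimates, std_errors, critical=1.96):
    """Return leads k < -1 whose pointwise |estimate / se| exceeds the cutoff."""
    flagged = []
    for k in sorted(estimates):
        if k >= -1:
            continue  # k = -1 is the normalized reference; k >= 0 are lags
        if abs(estimates[k] / std_errors[k]) > critical:
            flagged.append(k)
    return flagged

# Hypothetical event-study estimates with theta_{-1} normalized to zero.
estimates = {-3: 0.50, -2: 0.05, -1: 0.0, 0: 1.2, 1: 1.8}
std_errors = {-3: 0.20, -2: 0.10, 0: 0.3, 1: 0.3}

print(flag_leads(estimates, std_errors))  # [-3]
```

A flagged lead is a prompt to investigate, not proof of a violation, and an unflagged profile does not establish parallel trends.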

Takeaway

DiD for staggered adoption is built from cohort-time effects and transparent aggregation. Keep track of $G_i$, $\tau(g,t)$, and $\theta_k$, and the rest of the machinery follows.

References

  • Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Chapter 4 notation guide.
  • Callaway, B., and Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2).
  • Sun, L., and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2).