Why estimands matter in staggered adoption
Section 4.2 moves beyond the canonical 2x2 difference-in-differences setup and defines the estimands that are well-posed when units adopt treatment at different times and effects are heterogeneous. The key idea: define cohort-time effects first, then aggregate them transparently to answer specific business questions.
Cohort-time effects $\tau(g,t)$
The building block is the cohort-time effect:
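$$ \tau(g,t) = E[Y_{it}(g) - Y_{it}(\infty) \mid G_i = g], \quad t \ge g. $$
Here $G_i$ is the period in which unit $i$ first adopts treatment, $Y_{it}(g)$ is the potential outcome in period $t$ under adoption at $g$, and $Y_{it}(\infty)$ is the potential outcome under never adopting. This definition allows effects to vary across cohorts and calendar time. Every aggregation is just a weighted average of these $\tau(g,t)$ values.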
Table 4.1: Hypothetical Cohort-Time Treatment Effects $\tau(g, g+k)$ by event time $k$ (a dash marks a cell that is not observed)
| Cohort $g$ | Stores | $k=0$ | $k=1$ | $k=2$ | $k=3$ |
|---|---|---|---|---|---|
| $g=2$ | 100 | 8 | 10 | 12 | 14 |
| $g=4$ | 200 | 6 | 9 | 11 | – |
| $g=6$ | 150 | 5 | – | – | – |
Event-time effects $\theta_k$
Event-time effects pool cohort-time effects by time since adoption:
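$$ \theta_k = \sum_{g:g+k \le T} w_{gk}^{\text{event}} \tau(g, g+k). $$
Here $T$ is the last observed period, so the sum runs over the cohorts that are observed $k$ periods after adoption, and the weights $w_{gk}^{\text{event}}$ sum to one across those cohorts. Event-time effects answer questions about dynamics: how fast effects build, decay, or persist.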
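As a minimal sketch of this aggregation, the snippet below computes $\theta_k$ from the hypothetical values in Table 4.1. The weights are an assumption: each contributing cohort is weighted by its store count, which is only one of several defensible choices. The names `TAU`, `STORES`, and `event_time_effects` are illustrative.

```python
# Cohort-time effects tau(g, g+k) from Table 4.1, listed by event time k = 0, 1, 2, ...
TAU = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}
STORES = {2: 100, 4: 200, 6: 150}  # cohort sizes from Table 4.1


def event_time_effects(tau, sizes):
    """Return {k: theta_k}, weighting each observed cohort by its size (an assumption)."""
    max_k = max(len(path) for path in tau.values())
    theta = {}
    for k in range(max_k):
        # Only cohorts with an observed effect k periods after adoption contribute.
        cohorts = [g for g, path in tau.items() if k < len(path)]
        total = sum(sizes[g] for g in cohorts)
        theta[k] = sum(sizes[g] * tau[g][k] for g in cohorts) / total
    return theta


theta = event_time_effects(TAU, STORES)
for k, value in theta.items():
    print(f"theta_{k} = {value:.2f}")
```

With these inputs the sketch prints 6.11, 9.33, 11.33, and 14.00 for $k = 0$ to $3$. A different weighting scheme would give different numbers, which is why the weights belong in the report alongside the estimates.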
Composition bias in event-time profiles
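Different event times include different cohorts. In Table 4.1, for example, $\theta_0$ pools all three cohorts, while $\theta_3$ reflects only cohort $g=2$. If early cohorts have larger effects, later $\theta_k$ may look larger simply because late cohorts drop out and the composition shifts toward the high-effect early cohorts. Diagnose this by plotting cohort-specific event-time profiles, not just the aggregate.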
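One way to make the composition shift visible without a plot is to tabulate each cohort's weight share at every event time and compare a pooled step with the step inside a single cohort. The sketch below does this for the hypothetical values in Table 4.1, again assuming store-count weights; `composition_by_event_time` is an illustrative helper, not a library function.

```python
# Cohort-time effects tau(g, g+k) from Table 4.1, listed by event time k = 0, 1, 2, ...
TAU = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}
STORES = {2: 100, 4: 200, 6: 150}  # cohort sizes from Table 4.1


def composition_by_event_time(tau, sizes):
    """Return {k: {g: share}}: each observed cohort's weight share at event time k."""
    max_k = max(len(path) for path in tau.values())
    shares = {}
    for k in range(max_k):
        cohorts = [g for g, path in tau.items() if k < len(path)]
        total = sum(sizes[g] for g in cohorts)
        shares[k] = {g: sizes[g] / total for g in cohorts}
    return shares


shares = composition_by_event_time(TAU, STORES)
for k, by_cohort in shares.items():
    print(k, {g: round(share, 2) for g, share in by_cohort.items()})

# Compare the pooled step from k=2 to k=3 with the step inside cohort g=2,
# the only cohort still observed at k=3.
pooled_k2 = sum(shares[2][g] * TAU[g][2] for g in shares[2])
pooled_k3 = sum(shares[3][g] * TAU[g][3] for g in shares[3])
pooled_step = pooled_k3 - pooled_k2
within_step = TAU[2][3] - TAU[2][2]
print(f"pooled step: {pooled_step:.2f}, within-cohort step: {within_step}")
```

Here the pooled profile rises by about 2.67 between $k=2$ and $k=3$, while cohort $g=2$ itself rises by only 2. The gap comes from cohort $g=4$ dropping out of the average, not from any change in the effect.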
Balanced vs unbalanced event-time windows
Some estimators restrict to a balanced window so the same cohorts contribute to each $k$. This removes composition shifts but changes the estimand by excluding late cohorts or long horizons. If you use unbalanced windows, document which cohorts contribute to each $k$.
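To illustrate how a balanced window changes the estimand, the sketch below recomputes the event-time effects for $k = 0, 1, 2$ using only the cohorts in Table 4.1 that are observed at all three horizons. Store-count weights are assumed, and `balanced_event_time_effects` is a hypothetical helper written for this example.

```python
# Cohort-time effects tau(g, g+k) from Table 4.1, listed by event time k = 0, 1, 2, ...
TAU = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}
STORES = {2: 100, 4: 200, 6: 150}  # cohort sizes from Table 4.1


def balanced_event_time_effects(tau, sizes, max_k):
    """Return (kept cohorts, {k: theta_k}) using only cohorts observed at every k <= max_k."""
    kept = [g for g, path in tau.items() if len(path) > max_k]
    if not kept:
        raise ValueError("no cohort is observed at every horizon in the window")
    total = sum(sizes[g] for g in kept)
    theta = {
        k: sum(sizes[g] * tau[g][k] for g in kept) / total
        for k in range(max_k + 1)
    }
    return kept, theta


kept, theta_balanced = balanced_event_time_effects(TAU, STORES, max_k=2)
print("cohorts kept:", kept)
for k, value in theta_balanced.items():
    print(f"balanced theta_{k} = {value:.2f}")
```

The window keeps cohorts 2 and 4 and drops cohort 6, so $\theta_0$ becomes about 6.67 instead of the 6.11 obtained when all three cohorts are pooled under the same weights. The profile is now free of composition shifts, but it no longer describes the latest adopters.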
Calendar-time and cohort-specific aggregations
Calendar-time effects average across cohorts within each period:
$$ \tau_t = \sum_{g \le t} w_{g\mid t}^{\text{cal}} \tau(g,t). $$
Cohort-specific effects average across time within each cohort:
$$ \tau_g = \sum_{t \ge g} w_{t\mid g}^{\text{cohort}} \tau(g,t). $$
Calendar-time effects are useful for period-specific impact; cohort-specific effects are useful for targeting decisions.
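Both aggregations can be sketched from the hypothetical values in Table 4.1. The weights here are assumptions: $\tau_g$ uses equal weights over a cohort's observed periods, and $\tau_t$ weights the cohorts observed in period $t$ by store count. The function names are illustrative.

```python
# Cohort-time effects tau(g, g+k) from Table 4.1, listed by event time k = 0, 1, 2, ...
TAU = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}
STORES = {2: 100, 4: 200, 6: 150}  # cohort sizes from Table 4.1


def cohort_effects(tau):
    """tau_g: equal-weight average over each cohort's observed post-adoption periods."""
    return {g: sum(path) / len(path) for g, path in tau.items()}


def calendar_effects(tau, sizes):
    """tau_t: size-weighted average over cohorts with an observed effect in period t."""
    cells = {}  # calendar period t -> list of (cohort, effect)
    for g, path in tau.items():
        for k, effect in enumerate(path):
            cells.setdefault(g + k, []).append((g, effect))
    result = {}
    for t in sorted(cells):
        total = sum(sizes[g] for g, _ in cells[t])
        result[t] = sum(sizes[g] * effect for g, effect in cells[t]) / total
    return result


tau_g = cohort_effects(TAU)
tau_t = calendar_effects(TAU, STORES)
print({g: round(v, 2) for g, v in tau_g.items()})
print({t: round(v, 2) for t, v in tau_t.items()})
```

Because Table 4.1 stops at $k=3$, cohort $g=2$ has no entry for period 6 and is left out of $\tau_6$ in this sketch. That is the kind of detail worth documenting next to the weights.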
Long-run effects and the multiplier
The cumulative effect over $K$ periods is:
$$ \sum_{k=0}^K \theta_k. $$
The long-run multiplier is:
$$ LRM = \frac{\sum_{k=0}^K \theta_k}{\theta_0}. $$
Use $LRM$ only when $\theta_0$ is meaningfully different from zero, since the ratio is unstable when its denominator is near zero, and only when the outcome is a flow (e.g., sales). For stock outcomes, report the final level $\theta_K$ instead of a sum.
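The two formulas translate directly into code. The sketch below uses an illustrative event-time path and a hypothetical `min_abs_theta0` guard; the guard is only a numerical safeguard, and whether $\theta_0$ is meaningfully different from zero should still be judged from its estimated uncertainty.

```python
def cumulative_effect(theta):
    """Sum of event-time effects theta_0..theta_K (appropriate for flow outcomes)."""
    return sum(theta)


def long_run_multiplier(theta, min_abs_theta0=1e-8):
    """Cumulative effect divided by the immediate effect theta_0."""
    if abs(theta[0]) < min_abs_theta0:
        raise ValueError("theta_0 is too close to zero for a stable ratio")
    return cumulative_effect(theta) / theta[0]


theta_path = [20, 35, 45]  # illustrative event-time effects for k = 0, 1, 2
print(cumulative_effect(theta_path))    # 100
print(long_run_multiplier(theta_path))  # 5.0
```

With this path the cumulative effect is five times the immediate effect. A path with $\theta_0$ at zero raises an error instead of returning an unbounded ratio.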
Map business questions to estimands
Section 4.2 emphasizes choosing estimands that match the decision:
- Overall ROI: $ATT_{agg}$.
- Dynamics and payback: event-time $\theta_k$.
- Targeting: cohort-specific $\tau_g$.
- Period impact: calendar-time $\tau_t$.
- Carryover: cumulative effect or $LRM$.
Linking estimands to KPIs keeps the analysis decision-ready. Overall $ATT_{agg}$ maps to incremental revenue or profit, the numerator in ROI. Event-time $\theta_k$ informs payback and customer lifetime value (CLV) by providing the lift path used for discounting and break-even analysis. Cohort-specific $\tau_g$ guides targeting and rollout prioritization. The long-run multiplier $LRM$ quantifies carryover, helping allocate budget toward channels with persistent effects.
Example: if a programme generates an average lift of \$50 per store per quarter across 500 stores for 4 quarters, incremental revenue is \$50 × 500 × 4 = \$100,000. If the programme costs \$100,000, ROI breaks even; larger lifts or longer horizons push ROI positive. Event-time effects like $\theta_0=20$, $\theta_1=35$, $\theta_2=45$ determine when cumulative lift surpasses per-store cost, which is the payback period. The same $\theta_k$ path feeds CLV by discounting future lift and combining it with margins and retention assumptions. For carryover, with a positive $\theta_0$, $LRM>1$ indicates that later effects add to the immediate effect, so cumulative lift exceeds it; $LRM=1$ means no carryover; and $LRM<1$ means later effects are negative on net and offset part of the initial lift. An effect that decays but stays positive still yields $LRM>1$.
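The ROI and payback arithmetic in this example can be sketched as follows. ROI is taken here as net return over cost, and the per-store cost of 60 passed to `payback_period` is a hypothetical figure chosen for illustration; neither function name comes from the source.

```python
def roi(lift_per_store_period, n_stores, n_periods, cost):
    """Net return over cost, with revenue = lift x stores x periods."""
    incremental_revenue = lift_per_store_period * n_stores * n_periods
    return (incremental_revenue - cost) / cost


def payback_period(theta, cost_per_store):
    """First event time k at which cumulative lift reaches the per-store cost, else None."""
    cumulative = 0.0
    for k, lift in enumerate(theta):
        cumulative += lift
        if cumulative >= cost_per_store:
            return k
    return None


print(roi(50, 500, 4, 100_000))                         # 0.0, i.e. break-even
print(payback_period([20, 35, 45], cost_per_store=60))  # 2
```

With the event-time path 20, 35, 45, cumulative lift is 20, 55, and 100, so a per-store cost of 60 is recovered at $k=2$. If the cost exceeded the total lift over the observed horizon, the function would return `None` to signal that payback is not reached within the window.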
Table 4.2: Estimand Selection Guide
| Business Question | Estimand | Why This Estimand? |
|---|---|---|
| What is the overall programme effect? | $ATT_{agg}$ | Single summary statistic for ROI calculation |
| How does the effect evolve over time? | $\theta_k$ | Traces dynamics, habit formation, carryover |
| Do early adopters benefit more than late adopters? | $\tau_g$ | Reveals heterogeneity for targeting and prioritisation |
| What was the impact in a specific period? | $\tau_t$ | Measures contemporaneous effect for that period |
| What is the cumulative vs immediate effect? | $LRM$ | Quantifies carryover for budget allocation |
Takeaway
Staggered adoption is not a single estimand problem. Define $\tau(g,t)$ first, then aggregate with weights that align with the business question. Report the weights and the estimand, not just the estimator.
References
- Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 4.2.
- Callaway, B., and Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
- Sun, L., and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2), 175-199.