MMM 402: Estimands for Staggered Adoption

Why estimands matter in staggered adoption

Section 4.2 moves beyond the canonical 2×2 difference-in-differences design and defines the estimands that remain well-posed when units adopt treatment at different times and effects are heterogeneous. The key idea: define cohort-time effects first, then aggregate them transparently to answer specific business questions.

Cohort-time effects $\tau(g,t)$

The building block is the cohort-time effect:

$$ \tau(g,t) = E[Y_{it}(g) - Y_{it}(\infty) \mid G_i = g], \quad t \ge g. $$

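Here $Y_{it}(g)$ is unit $i$'s potential outcome in period $t$ if first treated in period $g$, $Y_{it}(\infty)$ is its potential outcome if never treated, and $G_i$ is the unit's adoption cohort. The definition allows effects to vary across cohorts and calendar time. Every aggregation below is a weighted average (or, for cumulative effects, a sum of weighted averages) of these $\tau(g,t)$ values.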

Table 4.1: Hypothetical Cohort-Time Treatment Effects

Cohort $g$	Stores	$k=0$	$k=1$	$k=2$	$k=3$
$g=2$	100	8	10	12	14
$g=4$	200	6	9	11	–
$g=6$	150	5	–	–	–
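
To make the aggregations concrete, Table 4.1 can be stored as a lookup from (cohort, calendar period) to effect. This is a minimal sketch, assuming event time maps to calendar time via $t = g + k$; the names `STORES`, `PROFILES`, and `cohort_time_effects` are illustrative and do not come from the source.

```python
# Table 4.1 as data: stores per cohort and effects by event time k = 0, 1, ...
STORES = {2: 100, 4: 200, 6: 150}
PROFILES = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}


def cohort_time_effects(profiles):
    """Return {(g, t): tau(g, t)}, assuming t = g + k for each tabulated k."""
    return {
        (g, g + k): effect
        for g, effects in profiles.items()
        for k, effect in enumerate(effects)
    }


TAU = cohort_time_effects(PROFILES)
print(TAU[(2, 5)])  # cohort g=2 at event time k=3
```

Cells shown as "–" in the table are simply absent from the lookup, so downstream aggregations must decide explicitly how to handle them.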

Event-time effects $\theta_k$

Event-time effects pool cohort-time effects by time since adoption, $k = t - g$, with weights that sum to one over the cohorts observed at event time $k$:

$$ \theta_k = \sum_{g:g+k \le T} w_{gk}^{\text{event}} \tau(g, g+k). $$

They answer questions about dynamics: how fast effects build, decay, or persist.
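
As an illustration, the event-time profile for Table 4.1 can be computed directly. The source does not fix the weights $w_{gk}^{\text{event}}$; this sketch assumes store-count weights, $w_{gk} = N_g / \sum_{g'} N_{g'}$ over the cohorts observed at $k$, and the function name is hypothetical.

```python
# Table 4.1: stores per cohort and effects by event time k = 0, 1, ...
STORES = {2: 100, 4: 200, 6: 150}
PROFILES = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}


def event_time_effects(profiles, sizes):
    """theta_k as a store-weighted mean over cohorts observed at event time k.

    Store-count weights are an assumption; other weights define other estimands.
    """
    max_k = max(len(p) for p in profiles.values())
    theta = {}
    for k in range(max_k):
        cohorts = [g for g, p in profiles.items() if len(p) > k]
        total = sum(sizes[g] for g in cohorts)
        theta[k] = sum(sizes[g] * profiles[g][k] for g in cohorts) / total
    return theta


THETA = event_time_effects(PROFILES, STORES)
for k, value in THETA.items():
    print(f"theta_{k} = {value:.2f}")
```

Under these weights the profile rises from roughly 6.1 at $k=0$ to 14 at $k=3$, but note that $\theta_3$ rests on cohort $g=2$ alone.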

Composition bias in event-time profiles

Different event times include different cohorts: long horizons are observed only for early adopters. If early cohorts have larger effects, $\theta_k$ at large $k$ may look larger simply because the composition shifts toward high-effect cohorts. Diagnose this by plotting cohort-specific event-time profiles, not just the aggregate.
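
One numerical companion to the plot is to split each step $\theta_k - \theta_{k-1}$ into a within-cohort part (holding fixed the cohorts still observed at $k$) and a composition part (the remainder). This decomposition is a sketch of one possible diagnostic, not a method from the source; it assumes store-count weights and uses Table 4.1 as data.

```python
# Table 4.1: stores per cohort and effects by event time k = 0, 1, ...
STORES = {2: 100, 4: 200, 6: 150}
PROFILES = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}


def weighted_mean(cohorts, k, profiles, sizes):
    """Store-weighted mean effect at event time k over the given cohorts."""
    total = sum(sizes[g] for g in cohorts)
    return sum(sizes[g] * profiles[g][k] for g in cohorts) / total


def decompose_step(k, profiles, sizes):
    """Split theta_k - theta_{k-1} into (total, within-cohort, composition)."""
    before = [g for g, p in profiles.items() if len(p) > k - 1]
    after = [g for g, p in profiles.items() if len(p) > k]
    theta_k = weighted_mean(after, k, profiles, sizes)
    total = theta_k - weighted_mean(before, k - 1, profiles, sizes)
    # Within-cohort part: the same cohorts (those observed at k) at both times.
    within = theta_k - weighted_mean(after, k - 1, profiles, sizes)
    return total, within, total - within


for k in range(1, 4):
    total, within, composition = decompose_step(k, PROFILES, STORES)
    print(f"k={k}: change={total:.2f} within={within:.2f} "
          f"composition={composition:.2f}")
```

On Table 4.1, the step from $k=2$ to $k=3$ is about 2.67, of which 2.00 is growth within cohort $g=2$ and about 0.67 arises only because cohort $g=4$ drops out of the average.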

Balanced vs unbalanced event-time windows

Some estimators restrict to a balanced window so the same cohorts contribute to each $k$. This removes composition shifts but changes the estimand by excluding late cohorts or long horizons. If you use unbalanced windows, document which cohorts contribute to each $k$.
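
The two reporting choices above can be sketched for Table 4.1: a map of which cohorts contribute to each $k$, and a balanced profile that keeps only cohorts observed through a chosen horizon. Store-count weights, the function names, and the horizon of 2 are assumptions made for illustration.

```python
# Table 4.1: stores per cohort and effects by event time k = 0, 1, ...
STORES = {2: 100, 4: 200, 6: 150}
PROFILES = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}


def contributors(profiles):
    """Map each event time k to the sorted cohorts observed at k."""
    max_k = max(len(p) for p in profiles.values())
    return {
        k: sorted(g for g, p in profiles.items() if len(p) > k)
        for k in range(max_k)
    }


def balanced_theta(profiles, sizes, horizon):
    """theta_k for k = 0..horizon using only cohorts observed at every such k."""
    kept = sorted(g for g, p in profiles.items() if len(p) > horizon)
    if not kept:
        raise ValueError("no cohort is observed through the requested horizon")
    total = sum(sizes[g] for g in kept)
    theta = [
        sum(sizes[g] * profiles[g][k] for g in kept) / total
        for k in range(horizon + 1)
    ]
    return kept, theta


print(contributors(PROFILES))
kept, theta = balanced_theta(PROFILES, STORES, horizon=2)
print(kept, [round(v, 2) for v in theta])
```

With a horizon of 2 the balanced window keeps cohorts $g=2$ and $g=4$ and drops $g=6$, so $\theta_0$ moves from about 6.11 (unbalanced) to about 6.67: the estimand has changed, not just the estimate.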

Calendar-time and cohort-specific aggregations

Calendar-time effects average across cohorts within each period:

$$ \tau_t = \sum_{g \le t} w_{g\mid t}^{\text{cal}} \tau(g,t). $$

Cohort-specific effects average across time within each cohort:

$$ \tau_g = \sum_{t \ge g} w_{t\mid g}^{\text{cohort}} \tau(g,t). $$

These are useful for period-specific impact and targeting decisions.
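
Both aggregations can be illustrated on Table 4.1. This sketch assumes $t = g + k$, store-count weights for $w_{g\mid t}^{\text{cal}}$, and equal weights over tabulated periods for $w_{t\mid g}^{\text{cohort}}$; the source does not prescribe these weights, and the function names are hypothetical.

```python
# Table 4.1: stores per cohort and effects by event time k = 0, 1, ...
STORES = {2: 100, 4: 200, 6: 150}
PROFILES = {2: [8, 10, 12, 14], 4: [6, 9, 11], 6: [5]}
# Cohort-time lookup {(g, t): tau(g, t)}, assuming t = g + k.
TAU = {(g, g + k): e for g, p in PROFILES.items() for k, e in enumerate(p)}


def calendar_time_effect(t, tau, sizes):
    """tau_t: store-weighted mean over cohorts with a tabulated effect in t."""
    cells = {g: e for (g, period), e in tau.items() if period == t}
    if not cells:
        raise ValueError(f"no tabulated cohort-time effect for period {t}")
    total = sum(sizes[g] for g in cells)
    return sum(sizes[g] * e for g, e in cells.items()) / total


def cohort_effect(g, tau):
    """tau_g: equal-weighted mean over the cohort's tabulated periods."""
    effects = [e for (cohort, _), e in tau.items() if cohort == g]
    if not effects:
        raise ValueError(f"no tabulated effects for cohort {g}")
    return sum(effects) / len(effects)


for t in (4, 5):
    print(f"tau_t at t={t}: {calendar_time_effect(t, TAU, STORES):.2f}")
for g in STORES:
    print(f"tau_g for g={g}: {cohort_effect(g, TAU):.2f}")
```

Only cells present in the table enter each average, so a period in which an already-treated cohort has no tabulated effect would be averaged over a partial set of cohorts and should be flagged rather than reported as $\tau_t$.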

Long-run effects and the multiplier

The cumulative effect over $K$ periods is:

$$ \sum_{k=0}^K \theta_k. $$

The long-run multiplier is:

$$ LRM = \frac{\sum_{k=0}^K \theta_k}{\theta_0}. $$

Use $LRM$ only when $\theta_0$ is meaningfully different from zero and the outcome is a flow (e.g., sales). For stock outcomes, report the final level $\theta_K$ instead of a sum.
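
A minimal sketch of both quantities, using the illustrative path $\theta_0=20$, $\theta_1=35$, $\theta_2=45$. The function names and the `min_abs_theta0` guard are assumptions; the guard only protects against numerical blow-up and is no substitute for checking that $\theta_0$ is statistically distinguishable from zero.

```python
def cumulative_effect(theta):
    """Sum of event-time effects theta_0..theta_K (meaningful for flow outcomes)."""
    return sum(theta)


def long_run_multiplier(theta, min_abs_theta0=1e-8):
    """Cumulative effect divided by the immediate effect theta_0."""
    if abs(theta[0]) < min_abs_theta0:
        raise ValueError("theta_0 is too close to zero for a stable ratio")
    return cumulative_effect(theta) / theta[0]


THETA = [20, 35, 45]  # illustrative event-time path
print(cumulative_effect(THETA))
print(long_run_multiplier(THETA))
# For a stock outcome, report the final level THETA[-1] rather than the sum.
```

For this path the cumulative effect is 100 and the multiplier is 5: total lift over three periods is five times the immediate lift.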

Map business questions to estimands

Section 4.2 emphasizes choosing estimands that match the decision:

Overall ROI: $ATT_{agg}$, the weighted average of $\tau(g,t)$ over all post-adoption cohort-period cells.
Dynamics and payback: event-time $\theta_k$.
Targeting: cohort-specific $\tau_g$.
Period impact: calendar-time $\tau_t$.
Carryover: cumulative effect or $LRM$.

Linking estimands to KPIs keeps the analysis decision-ready. Overall $ATT_{agg}$ maps to incremental revenue or profit, the numerator in ROI. Event-time $\theta_k$ informs payback and customer lifetime value (CLV) by providing the lift path used for discounting and break-even analysis. Cohort-specific $\tau_g$ guides targeting and rollout prioritisation. The long-run multiplier $LRM$ quantifies carryover, helping allocate budget toward channels with persistent effects.

Example: if a programme generates an average lift of \$50 per store per quarter across 500 stores for 4 quarters, incremental revenue is \$50 × 500 × 4 = \$100,000. If the programme costs \$100,000, ROI breaks even; larger lifts or longer horizons push ROI positive. Event-time effects like $\theta_0=20$, $\theta_1=35$, $\theta_2=45$ determine when cumulative lift (20, 55, 100) surpasses per-store cost, which defines the payback period. The same $\theta_k$ path feeds CLV by discounting future lift and combining it with margins and retention assumptions. For carryover, note that $LRM = 1 + \sum_{k=1}^{K} \theta_k / \theta_0$. With $\theta_0 > 0$, $LRM>1$ indicates persistence (cumulative lift exceeds the immediate effect), $LRM \approx 1$ indicates an effect that fades right after adoption, and $LRM<1$ signals that later effects are negative on net, for example when the programme pulls purchases forward.
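
The arithmetic in this example can be sketched as follows. The function names are illustrative, ROI is taken here as (revenue − cost) / cost, payback is taken as the first event time at which cumulative lift reaches the cost, and the per-store cost of 50 is a hypothetical value chosen only to exercise the function.

```python
def roi(lift_per_store_per_period, stores, periods, cost):
    """Return on investment: (incremental revenue - cost) / cost."""
    revenue = lift_per_store_per_period * stores * periods
    return (revenue - cost) / cost


def payback_period(theta, cost_per_store):
    """First event time k at which cumulative lift reaches cost, else None."""
    cumulative = 0
    for k, lift in enumerate(theta):
        cumulative += lift
        if cumulative >= cost_per_store:
            return k
    return None


# Figures from the example: 50 per store per quarter, 500 stores, 4 quarters.
print(roi(50, 500, 4, cost=100_000))
# Event-time path from the example; cost_per_store=50 is hypothetical.
print(payback_period([20, 35, 45], cost_per_store=50))
```

With these inputs ROI is zero (break-even), and cumulative lift of 20 then 55 crosses the hypothetical cost of 50 at $k=1$. No discounting is applied; a CLV calculation would discount each $\theta_k$ before summing.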

Table 4.2: Estimand Selection Guide

Business Question	Estimand	Why This Estimand?
What is the overall programme effect?	$ATT_{agg}$	Single summary statistic for ROI calculation
How does the effect evolve over time?	$\theta_k$	Traces dynamics, habit formation, carryover
Do early adopters benefit more than late adopters?	$\tau_g$	Reveals heterogeneity for targeting and prioritisation
What was the impact in a specific period?	$\tau_t$	Measures contemporaneous effect for that period
What is the cumulative vs immediate effect?	$LRM$	Quantifies carryover for budget allocation

Takeaway

Staggered adoption is not a single-estimand problem. Define $\tau(g,t)$ first, then aggregate with weights that align with the business question. Report the weights and the estimand, not just the estimator.

References

Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 4.2.
Callaway, B., and Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
Sun, L., and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2), 175-199.