Why this reference exists
Section 2.8 collects the notation and assumptions that recur across the rest of the book. In practice, it is a quick lookup: when a method or diagnostic is introduced later, you can map it back to a small set of symbols and assumptions.
Core notation, in plain language
- Units are indexed by $i=1,\ldots,N$ and time by $t=1,\ldots,T$.
- Observed outcomes are $Y_{it}$.
- Potential outcomes can be contemporaneous $Y_{it}(d)$ or path dependent $Y_{it}(d_{ti})$.
- Treatment is $D_{it}$, usually binary but sometimes continuous.
- Under staggered adoption, $G_i$ is the first treated period (or $\infty$ for never treated), and event time is $k=t-G_i$.
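The staggered-adoption notation above can be made concrete with a small sketch. The toy panel, the array names, and the choice to treat adoption as absorbing (once treated, always treated) are assumptions made for illustration, not taken from the book.

```python
import numpy as np

# Hypothetical toy panel: 4 units observed over T = 6 periods.
T = 6
periods = np.arange(1, T + 1)              # t = 1, ..., T
G = np.array([3.0, 5.0, np.inf, 3.0])      # G_i; np.inf marks never-treated units

# Assumed absorbing binary treatment: D_it = 1 once t >= G_i.
D = (periods[None, :] >= G[:, None]).astype(int)

# Event time k = t - G_i. It is -inf for never-treated units, so mask it as NaN.
k = periods[None, :] - G[:, None]
k = np.where(np.isfinite(k), k, np.nan)

print(D)
print(k)
```

Negative values of `k` index pre-treatment periods and `k = 0` is the first treated period, which is the convention the event-time estimands below rely on.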
Key estimands:
$$ \mathrm{ATE}=\mathbb{E}_{i,t}[Y_{it}(1)-Y_{it}(0)],\quad \mathrm{ATT}=\mathbb{E}_{i,t}[Y_{it}(1)-Y_{it}(0)\mid D_{it}=1]. $$
Staggered adoption adds:
$$ \tau(g,t)=\mathbb{E}[Y_{it}(g)-Y_{it}(\infty)\mid G_i=g],\quad t\ge g, $$
$$ \theta_k=\mathbb{E}[Y_{i,G_i+k}(G_i)-Y_{i,G_i+k}(\infty)\mid G_i<\infty]. $$
Core assumptions (short form)
These are the building blocks that drive which estimators are valid.
- No anticipation (Assumption 1): outcomes at $t$ do not depend on treatments assigned in future periods.
- SUTVA (Assumption 3): no interference and a single version of treatment.
- No dynamic effects (Assumption 4): outcomes depend only on current treatment, not past treatments.
- Parallel trends (Assumption 6): in the absence of treatment, expected changes in untreated outcomes would be the same across cohorts.
- Unconfoundedness (Assumption 7): after conditioning on covariates and fixed effects, treatment is independent of potential outcomes.
- Homogeneous effects (Assumption 8): treatment effects are constant across units and time.
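To see how these assumptions decide whether an estimator is valid, the sketch below simulates a staggered panel in which parallel trends and no anticipation hold by construction, then recovers $\tau(g,t)$ with a simple difference-in-differences against never-treated units. The data-generating process, the effect path, and the helper `did_tau` are hypothetical choices for illustration, not the book's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 6000, 8
periods = np.arange(1, T + 1)

# Hypothetical design: cohorts first treated at t = 4 or t = 6, plus never-treated units.
G = rng.choice([4.0, 6.0, np.inf], size=N)

# Untreated potential outcomes Y_it(inf): unit effect + common time effect + noise.
# Unit effects are shifted for ever-treated cohorts, so levels differ but trends are parallel.
alpha = rng.normal(size=N) + np.where(np.isfinite(G), 1.0, 0.0)
lam = 0.3 * periods
Y0 = alpha[:, None] + lam[None, :] + rng.normal(scale=0.5, size=(N, T))

# Assumed true effect: tau(g, t) = 1 + 0.5 * (t - g) for t >= g, zero before (no anticipation).
k = periods[None, :] - G[:, None]
effect = np.where(k >= 0, 1.0 + 0.5 * k, 0.0)
Y = Y0 + effect


def did_tau(g, t):
    """DiD estimate of tau(g, t): never-treated comparison, base period g - 1."""
    treated = G == g
    never = np.isinf(G)
    base, cur = int(g) - 2, t - 1  # zero-based columns for periods g - 1 and t
    change_treated = (Y[treated, cur] - Y[treated, base]).mean()
    change_never = (Y[never, cur] - Y[never, base]).mean()
    return change_treated - change_never


for g in (4.0, 6.0):
    for t in range(int(g), T + 1):
        truth = 1.0 + 0.5 * (t - g)
        print(f"g={int(g)} t={t} estimate={did_tau(g, t):.2f} truth={truth:.2f}")
```

The estimates track the true effects only because the simulated untreated trend `lam` is common to all cohorts. Giving one cohort its own trend would break parallel trends and bias the same calculation.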
Factor structure: a key relaxation
When parallel trends is too strong, a low-rank factor structure provides an alternative:
$$ Y_{it}(\infty)=\alpha_i+\lambda_t+\sum_{r=1}^R \lambda_{ir} f_{tr}+\varepsilon_{it}, $$
with $R \ll \min(N,T)$. Identification then depends on residual treatment variation being as-good-as-random after conditioning on the latent factors and observed covariates.
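A minimal simulation can show what "low rank" means here. The sketch below draws untreated outcomes from the factor model with $R=2$, removes the additive unit and time effects by double demeaning, and inspects the singular values of what remains. All dimensions, noise scales, and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, R = 200, 30, 2

alpha = rng.normal(size=(N, 1))       # unit effects alpha_i
lam_t = rng.normal(size=(1, T))       # time effects lambda_t
loadings = rng.normal(size=(N, R))    # unit loadings lambda_{ir}
factors = rng.normal(size=(T, R))     # time factors f_{tr}
eps = 0.1 * rng.normal(size=(N, T))   # idiosyncratic noise

# Untreated outcomes under the factor structure.
Y0 = alpha + lam_t + loadings @ factors.T + eps

# Double demeaning removes the additive unit and time effects.
resid = (
    Y0
    - Y0.mean(axis=1, keepdims=True)
    - Y0.mean(axis=0, keepdims=True)
    + Y0.mean()
)

# If the interactive part is low rank, the first R singular values dominate the rest.
s = np.linalg.svd(resid, compute_uv=False)
print(np.round(s[:5], 2))
```

A sharp drop after the first `R` singular values is the pattern that factor-based methods exploit. Because loadings differ across units, two-way fixed effects alone cannot absorb the interactive term.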
Why the assumptions matter
Different methods rely on different subsets:
- DiD and event studies hinge on parallel trends and no anticipation.
- Synthetic control leans on pre-treatment fit under a factor structure.
- DML and high-dimensional controls rely on unconfoundedness.
- TWFE additionally needs homogeneous effects to yield a single interpretable parameter.
- Spillover models explicitly relax SUTVA.
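The mapping above lends itself to a small lookup table. The sketch below encodes it as a dictionary and filters methods by the assumptions you are prepared to defend. The labels, the helper `candidate_methods`, and the reading of "additionally" for TWFE as "on top of the DiD assumptions" are assumptions for illustration.

```python
# Method -> assumptions it relies on, following the list above.
# Spillover models are left out because they relax SUTVA rather than require it.
REQUIRES = {
    "DiD / event study": {"parallel trends", "no anticipation"},
    "Synthetic control": {"factor structure"},
    "DML / high-dimensional controls": {"unconfoundedness"},
    # Assumed reading: TWFE needs the DiD assumptions plus homogeneous effects.
    "TWFE": {"parallel trends", "no anticipation", "homogeneous effects"},
}


def candidate_methods(defensible):
    """Return the methods whose listed assumptions are all in `defensible`."""
    defensible = set(defensible)
    return sorted(m for m, needs in REQUIRES.items() if needs <= defensible)


print(candidate_methods({"parallel trends", "no anticipation"}))
# prints ['DiD / event study']
```

A table like this only screens candidates. Whether an assumption is actually defensible in a given dataset still has to be argued from design and diagnostics.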
Takeaway
This reference section is the dictionary for the rest of MMM: estimands define the question, assumptions define credibility, and notation keeps the mapping precise. When you are unsure whether a method applies, check which assumptions it needs and whether your data can defend them.
References
- Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 2.8.