Why this reference exists

Section 2.8 collects the notation and assumptions that recur across the rest of the book. In practice, it is a quick lookup: when a method or diagnostic is introduced later, you can map it back to a small set of symbols and assumptions.

Core notation, in plain language

  • Units are indexed by $i=1,\ldots,N$ and time by $t=1,\ldots,T$.
  • Observed outcomes are $Y_{it}$.
  • Potential outcomes can be contemporaneous, $Y_{it}(d)$, or path-dependent, $Y_{it}(d_{ti})$.
  • Treatment is $D_{it}$, usually binary but sometimes continuous.
  • Under staggered adoption, $G_i$ is the first treated period (or $\infty$ for never treated), and event time is $k=t-G_i$.
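
As a minimal sketch of the staggered-adoption notation above, the snippet below derives $G_i$ and event time $k=t-G_i$ from binary treatment paths. The toy panel and the helper names (`first_treated_period`, `event_times`) are hypothetical and chosen only for illustration; periods are 1-indexed to match $t=1,\ldots,T$.

```python
import math

def first_treated_period(d_path):
    """Return G_i: the first 1-indexed period with D_it = 1, or math.inf if never treated."""
    for t, d in enumerate(d_path, start=1):
        if d == 1:
            return t
    return math.inf

def event_times(d_path):
    """Return k = t - G_i for every period t, or None entries for never-treated units."""
    g = first_treated_period(d_path)
    if math.isinf(g):
        return [None] * len(d_path)
    return [t - g for t in range(1, len(d_path) + 1)]

# Hypothetical toy panel: unit id -> binary treatment path D_i1, ..., D_iT with T = 5
panel = {
    "a": [0, 0, 1, 1, 1],  # adopts in period 3
    "b": [0, 0, 0, 0, 1],  # adopts in period 5
    "c": [0, 0, 0, 0, 0],  # never treated
}

cohorts = {i: first_treated_period(d) for i, d in panel.items()}
k_by_unit = {i: event_times(d) for i, d in panel.items()}
```

Negative values of $k$ mark pre-treatment periods and $k=0$ is the adoption period, which is the convention implied by $k=t-G_i$.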

Key estimands:

$$ \mathrm{ATE}=\mathbb{E}_{i,t}[Y_{it}(1)-Y_{it}(0)],\quad \mathrm{ATT}=\mathbb{E}_{i,t}[Y_{it}(1)-Y_{it}(0)\mid D_{it}=1]. $$
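
To make the difference between the two estimands concrete, here is a small sketch with made-up numbers in which both potential outcomes are known for every unit-period cell. That is only possible in a simulation; in real data one of the two is always missing. The `cells` list and all variable names are hypothetical.

```python
# Hypothetical unit-period cells: (Y(0), Y(1), D). Both potential outcomes are
# visible here only because the data are invented for illustration.
cells = [
    (10.0, 12.0, 1),
    (8.0, 11.0, 1),
    (9.0, 9.5, 0),
    (7.0, 7.5, 0),
]

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

# ATE averages Y(1) - Y(0) over all cells; ATT averages over treated cells only.
ate = mean(y1 - y0 for y0, y1, _ in cells)
att = mean(y1 - y0 for y0, y1, d in cells if d == 1)

# What an analyst actually sees: Y = Y(1) if D = 1, else Y(0).
observed = [(y1 if d == 1 else y0, d) for y0, y1, d in cells]
naive = mean(y for y, d in observed if d == 1) - mean(y for y, d in observed if d == 0)
```

In this toy example the treated cells have larger effects and higher untreated outcomes, so the ATE, the ATT, and the naive treated-minus-control difference all disagree.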

Staggered adoption adds:

$$ \tau(g,t)=\mathbb{E}[Y_{it}(g)-Y_{it}(\infty)\mid G_i=g],\quad t\ge g, $$

$$ \theta_k=\mathbb{E}[Y_{i,G_i+k}(G_i)-Y_{i,G_i+k}(\infty)\mid G_i<\infty]. $$
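
One way to see how $\tau(g,t)$ and $\theta_k$ relate is a noise-free simulation. The sketch below is not taken from the source: it assumes additive unit and time effects for $Y_{it}(\infty)$, an invented effect that grows with time since adoption, and a simple difference-in-differences comparison of each cohort against never-treated units with $g-1$ as the base period. All names and numbers are hypothetical.

```python
import math

T = 6
# Hypothetical units: (alpha_i, G_i); G_i = math.inf marks a never-treated unit
units = [(1.0, 3), (2.0, 3), (0.5, 5), (1.5, math.inf), (3.0, math.inf)]
time_effect = {t: 0.3 * t for t in range(1, T + 1)}  # common time effect

def true_effect(g, t):
    """Assumed effect for illustration: 1.0 at adoption, +0.5 per period after."""
    return 1.0 + 0.5 * (t - g) if t >= g else 0.0

def outcome(alpha, g, t):
    # Observed outcome = untreated outcome + effect (no noise, so estimates are exact)
    return alpha + time_effect[t] + true_effect(g, t)

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

def tau_hat(g, t):
    """DiD estimate of tau(g, t): cohort g vs never-treated, base period g - 1."""
    def change(group):
        return mean(outcome(a, gi, t) - outcome(a, gi, g - 1)
                    for a, gi in units if gi == group)
    return change(g) - change(math.inf)

def theta_hat(k):
    """Average of tau_hat(G_i, G_i + k) over treated units (requires G_i + k <= T)."""
    treated = [gi for _, gi in units if not math.isinf(gi)]
    return mean(tau_hat(gi, gi + k) for gi in treated)
```

Because the simulated untreated outcomes share the same time path across cohorts, `tau_hat(3, 4)` recovers `true_effect(3, 4)`, and `theta_hat(k)` averages those cohort-time effects at a fixed event time, weighting cohorts by their number of units.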

Core assumptions (short form)

These assumptions determine which estimators are valid.

  • No anticipation (Assumption 1): outcomes at $t$ do not depend on treatments assigned in future periods.
  • SUTVA (Assumption 3): no interference and a single version of treatment.
  • No dynamic effects (Assumption 4): outcomes depend only on current treatment, not past treatments.
  • Parallel trends (Assumption 6): in the absence of treatment, expected changes in the untreated outcome $Y_{it}(\infty)$ are the same across cohorts.
  • Unconfoundedness (Assumption 7): after conditioning on covariates and fixed effects, treatment is independent of potential outcomes.
  • Homogeneous effects (Assumption 8): treatment effects are constant across units and time.

Factor structure: a key relaxation

When parallel trends is too strong, a low-rank factor structure provides an alternative:

$$ Y_{it}(\infty)=\alpha_i+\lambda_t+\sum_{r=1}^R \lambda_{ir} f_{tr}+\varepsilon_{it}, $$

with $R \ll \min(N,T)$, where $\lambda_{ir}$ are unit-specific loadings and $f_{tr}$ are common time-varying factors. Identification then depends on residual treatment variation being as-good-as-random after conditioning on the latent factors and observed covariates.
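
The low-rank claim can be checked numerically. The following sketch simulates the factor model without the noise term $\varepsilon_{it}$ and uses NumPy to inspect the rank of the untreated outcome matrix before and after removing additive unit and time effects by double-demeaning. The dimensions, the seed, and the standard normal draws are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, R = 50, 30, 2  # hypothetical panel size and number of factors

alpha = rng.normal(size=(N, 1))      # unit effects alpha_i
lam_t = rng.normal(size=(1, T))      # time effects lambda_t
loadings = rng.normal(size=(N, R))   # loadings lambda_ir
factors = rng.normal(size=(T, R))    # factors f_tr

# Noise-free untreated outcomes: additive effects plus the interactive term
Y0 = alpha + lam_t + loadings @ factors.T

def double_demean(X):
    """Remove additive unit and time effects from a balanced N x T matrix."""
    return X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + X.mean()

resid = double_demean(Y0)

rank_full = int(np.linalg.matrix_rank(Y0))      # at most R + 2
rank_resid = int(np.linalg.matrix_rank(resid))  # at most R
```

Double-demeaning wipes out $\alpha_i+\lambda_t$ but leaves a nonzero residual of rank at most $R$. That residual is the part of the untreated outcome that two-way fixed effects alone cannot absorb.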

Why the assumptions matter

Different methods rely on different subsets:

  • DiD and event studies hinge on parallel trends and no anticipation.
  • Synthetic control leans on pre-treatment fit under a factor structure.
  • DML and high-dimensional controls rely on unconfoundedness.
  • TWFE needs homogeneous effects, on top of parallel trends and no anticipation, to yield a single interpretable parameter.
  • Spillover models explicitly relax SUTVA.
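
The TWFE point in the list above can be illustrated with a tiny noise-free panel. The sketch below computes the TWFE coefficient on $D_{it}$ by double-demeaning, which is equivalent to including unit and time dummies in a balanced panel, and compares it with the ATT under a homogeneous effect and under an effect that grows after adoption. The 3-by-3 design, the effect sizes, and the function name `twfe_coefficient` are hypothetical.

```python
import numpy as np

def twfe_coefficient(Y, D):
    """Slope on D from a two-way fixed effects regression in a balanced panel,
    computed by double-demeaning Y and D (Frisch-Waugh-Lovell)."""
    def double_demean(X):
        return X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + X.mean()
    D_t, Y_t = double_demean(D), double_demean(Y)
    return float((D_t * Y_t).sum() / (D_t ** 2).sum())

# Hypothetical staggered design: rows are units, columns are periods
D = np.array([[0, 1, 1],    # adopts in period 2
              [0, 0, 1],    # adopts in period 3
              [0, 0, 0]],   # never treated
             dtype=float)
alpha = np.array([[1.0], [2.0], [3.0]])  # unit effects
lam = np.array([[0.0, 0.5, 1.0]])        # time effects

# Case 1: homogeneous effect of 2.0 in every treated cell
Y_hom = alpha + lam + 2.0 * D

# Case 2: effect is 1.0 in the adoption period and 2.0 one period later
tau = np.array([[0, 1, 2],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
Y_het = alpha + lam + tau * D

beta_hom = twfe_coefficient(Y_hom, D)   # equals the common effect
beta_het = twfe_coefficient(Y_het, D)   # a weighted average of cell effects
att = float(tau[D == 1].mean())         # simple average over treated cells
```

With a homogeneous effect, the TWFE coefficient equals the common effect. With the growing effect, this design puts zero weight on the early adopter's second treated period, so the coefficient is 1.0 while the ATT over treated cells is 4/3.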

Takeaway

This reference section is the dictionary for the rest of MMM: estimands define the question, assumptions define credibility, and notation keeps the mapping precise. When you are unsure whether a method applies, check which assumptions it needs and whether your data can defend them.

References

  • Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 2.8.