Why assignment mechanisms come first
Before choosing an estimator, ask a simpler question: How did units get treated? The answer governs which identification assumptions are plausible and, therefore, which methods are credible. Section 2.4 classifies common assignment mechanisms in marketing panels and ties each to the assumptions that make causal interpretation possible.
Randomized assignment
When assignment is randomized, identification is the cleanest. Randomization ensures treatment is independent of potential outcomes (under SUTVA), so post-treatment differences can be read causally without adding parallel trends or factor-structure assumptions.
Assumption 5 (Randomization-based unconfoundedness). Under SUTVA for panels, treatment assignment is independent of potential outcomes:
$$ \Pr\bigl(D \mid \{Y_{it}(d_{it}) : d_{it} \in \mathcal{D}_t\}_{i,t}, X\bigr) = \Pr(D). $$

In stratified designs, independence holds conditional on the stratification covariates. In practice, even randomized designs require diagnostics for non-compliance, attrition, and protocol deviations.
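A small simulation illustrates why randomization is the cleanest case. All numbers here are illustrative assumptions, not estimates from any real campaign: under Assumption 5, a plain difference in means recovers the true effect, and covariate balance across arms can be checked directly as a diagnostic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau = 10_000, 2.0                     # sample size and true effect (illustrative)
x = rng.normal(size=n)                   # pre-treatment covariate
d = rng.binomial(1, 0.5, size=n)         # randomized, independent of potential outcomes
y = 1.0 + 0.5 * x + tau * d + rng.normal(size=n)

# Under randomization-based unconfoundedness, a difference in means identifies the ATE.
ate_hat = y[d == 1].mean() - y[d == 0].mean()

# Balance diagnostic: randomized arms should match on pre-treatment covariates.
balance_gap = abs(x[d == 1].mean() - x[d == 0].mean())
```

The balance check is the simulated analogue of the protocol diagnostics mentioned above: a large gap would suggest the assignment did not behave as designed.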
Staggered adoption and parallel trends
In most marketing panels, treatment timing is not randomized. Adoption is staggered across units. Identification then relies on parallel trends across cohorts before treatment begins.
Assumption 6 (Parallel trends). For cohorts $g$ and $g'$, and periods $t < \min(g, g')$:
$$ \mathbb{E}[Y_{it}(\infty) - Y_{i,t-1}(\infty) \mid G_i = g] = \mathbb{E}[Y_{it}(\infty) - Y_{i,t-1}(\infty) \mid G_i = g']. $$

Key implications:
- Cohorts may differ in baseline levels; the restriction is on changes before treatment.
- Post-treatment divergence is allowed because treatment can affect outcomes.
- Event-study pre-trends ($k<0$) provide indirect evidence on plausibility.
When timing is driven by operational constraints rather than expected gains, this assumption is more defensible.
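The logic above can be sketched with a two-cohort simulated panel (cohort sizes, trend, and effect are all assumed for illustration): cohorts differ in levels but share a pre-treatment trend, the pre-trend diagnostic comes back near zero, and a difference-in-differences using the not-yet-treated cohort as controls recovers the effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, tau = 400, 10, 1.5
g = rng.choice([5, 7], size=n)                 # staggered adoption period by cohort
alpha = rng.normal(size=n) + 0.5 * (g == 5)    # cohorts may differ in baseline levels
t = np.arange(T)
# Untreated outcomes share a common trend, so parallel trends holds by design.
y = alpha[:, None] + 0.3 * t[None, :] + tau * (t[None, :] >= g[:, None]) \
    + rng.normal(scale=0.5, size=(n, T))

early, late = y[g == 5], y[g == 7]

# Pre-trend diagnostic: cohort-mean changes should match before anyone is treated.
pre_gap = np.diff(early[:, :5].mean(axis=0)) - np.diff(late[:, :5].mean(axis=0))

# DiD using the not-yet-treated late cohort as controls in periods 5-6.
did = (early[:, 5:7].mean() - early[:, 3:5].mean()) \
    - (late[:, 5:7].mean() - late[:, 3:5].mean())
```

Note the comparison deliberately stops before the late cohort adopts; using already-treated units as controls is exactly the pitfall the modern staggered-DiD literature warns against.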
Single treated unit and synthetic control
When only one unit is treated, timing variation disappears. The synthetic control approach constructs a weighted average of controls that matches pre-treatment outcomes, then uses it as the counterfactual.
The identification logic is:
- If pre-treatment fit is tight, post-treatment divergence can be attributed to treatment.
- This rests on a factor-structure view of untreated outcomes, where the treated unit can be approximated by a combination of control units’ factor loadings.
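A minimal sketch of this logic, with a simulated factor structure (factor dimensions, noise scale, and the Frank-Wolfe solver are all illustrative choices, not the canonical implementation): the treated unit is built to lie in the span of the controls, simplex weights are fit on pre-treatment outcomes, and post-treatment divergence recovers the effect.

```python
import numpy as np

rng = np.random.default_rng(2)
T0, T1, J = 20, 5, 8                        # pre/post periods, number of controls
f = rng.normal(size=(T0 + T1, 2))           # common latent factors
load_c = rng.normal(size=(J, 2))            # control units' factor loadings
load_t = 0.5 * load_c[0] + 0.5 * load_c[1]  # treated unit lies in the controls' span
tau = 3.0

y_controls = f @ load_c.T + 0.1 * rng.normal(size=(T0 + T1, J))
y_treated = f @ load_t + 0.1 * rng.normal(size=T0 + T1)
y_treated[T0:] += tau                       # treatment effect in post periods

# Frank-Wolfe on the simplex: min ||y1 - Y0 w||^2  s.t.  w >= 0, sum(w) = 1.
Y0, y1 = y_controls[:T0], y_treated[:T0]
w = np.full(J, 1.0 / J)
for it in range(1, 2001):
    grad = Y0.T @ (Y0 @ w - y1)
    s = np.zeros(J)
    s[np.argmin(grad)] = 1.0                # best vertex of the simplex
    w += (2.0 / (it + 2)) * (s - w)         # convex update keeps w on the simplex

synth = y_controls @ w                      # synthetic control path
effect = (y_treated[T0:] - synth[T0:]).mean()
```

The simplex constraint is what ties the estimator to the factor-structure logic: non-negative weights summing to one keep the counterfactual inside the convex hull of the controls' loadings.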
Common shocks with differential exposure
Sometimes all units experience a common event, but with different intensities (e.g., national ad campaigns with varying GRPs). Identification relies on either:
- Parallel trends across exposure levels, or
- A factor structure that separates common shocks from unit-specific exposure.
This setting motivates interactive fixed effects models that allow heterogeneous responses to common time-varying shocks.
Continuous treatment intensity
Treatments are often continuous: GRPs, discount depth, or loyalty reward generosity. The potential outcomes framework extends naturally to $Y_{it}(d)$.
Assumption 7 (Unconfoundedness for continuous treatment). Conditional on covariates and fixed effects, treatment intensity is independent of potential outcomes:
$$ D_{it} \perp Y_{it}(d) \mid X_{it}, \alpha_i, \lambda_t, \quad \forall d \in \mathcal{D}. $$

This is strong in marketing contexts because spend often reacts to recent outcomes. It becomes more plausible with:
- Rich covariates,
- Unit and time fixed effects,
- High-dimensional controls or double machine learning,
- Instruments or calibration when feedback from outcomes to spend is strong.
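Under Assumption 7, a linear dose-response slope can be sketched with a two-way within transformation plus Frisch-Waugh-Lovell partialling out. The data-generating process below is assumed for illustration; crucially, spend responds to the covariate and fixed effects but not to the outcome shock, which is exactly what the assumption rules in.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, beta = 300, 12, 0.8                 # beta: causal effect per unit of spend
alpha = rng.normal(size=(n, 1))           # unit fixed effects
lam = rng.normal(size=(1, T))             # time fixed effects
x = rng.normal(size=(n, T))               # observed covariate driving spend

# Spend reacts to FEs and x, but not to the outcome shock: Assumption 7 holds.
d = 0.7 * alpha + 0.5 * lam + 0.6 * x + rng.normal(size=(n, T))
y = alpha + lam + beta * d + 0.4 * x + rng.normal(size=(n, T))

def demean(m):
    # Two-way within transformation: sweeps out any additive unit and time effects.
    return m - m.mean(axis=1, keepdims=True) - m.mean(axis=0, keepdims=True) + m.mean()

# FWL: partial the covariate out of both treatment and outcome, then regress.
dd, dy, dx = demean(d), demean(y), demean(x)
rd = dd - dx * (dx * dd).sum() / (dx * dx).sum()
ry = dy - dx * (dx * dy).sum() / (dx * dx).sum()
beta_hat = (rd * ry).sum() / (rd * rd).sum()
```

If the noise in `y` also fed back into `d` (spend chasing recent sales), the same code would return a biased slope; that is the feedback problem that motivates instruments or calibration in the last bullet above.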
Factor structure as an alternative to parallel trends
When strict parallel trends is implausible, factor models allow heterogeneous responses to common shocks:
$$ Y_{it}(0) = \alpha_i + \lambda_t + \sum_{r=1}^{R} \gamma_{ir} f_{tr} + \varepsilon_{it}. $$

Here $\gamma_{ir}$ denotes unit $i$'s loading on factor $r$ (written $\gamma$ to avoid clashing with the time effect $\lambda_t$). This lets units load differently on shared shocks (seasonality, macro changes, national campaigns). Identification requires that, after conditioning on observed covariates and latent factors, treatment variation is as-good-as-random with respect to $Y_{it}(0)$.
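On untreated outcomes, the factor component can be estimated by sweeping out the additive fixed effects and applying an SVD (principal components). The simulation below is an illustrative sketch with assumed dimensions, not a full interactive-fixed-effects estimator: a rank-$R$ approximation captures most of the systematic variation left after the additive effects are removed.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T, R = 200, 40, 2
alpha = rng.normal(size=(n, 1))           # additive unit effects
lam = rng.normal(size=(1, T))             # additive time effects
gamma = rng.normal(size=(n, R))           # unit loadings on latent factors
f = rng.normal(size=(T, R))               # common time-varying shocks
y0 = alpha + lam + gamma @ f.T + 0.3 * rng.normal(size=(n, T))

# Sweep out additive fixed effects, then estimate the factor space by SVD.
z = y0 - y0.mean(axis=1, keepdims=True) - y0.mean(axis=0, keepdims=True) + y0.mean()
u, s, vt = np.linalg.svd(z, full_matrices=False)
low_rank = (u[:, :R] * s[:R]) @ vt[:R]    # rank-R approximation of untreated outcomes

share = (s[:R] ** 2).sum() / (s ** 2).sum()   # variance explained by R factors
```

The residual after the rank-$R$ fit should look like noise; large structured residuals would suggest the assumed number of factors is too small.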
Interference-aware designs
SUTVA can fail in marketing settings when spillovers matter (e.g., geographic or network effects). Identification then requires:
- Cluster randomization, or
- Explicit spillover models that allow outcomes to depend on neighbors’ treatments.
The estimand shifts from $Y_{it}(d)$ to $Y_{it}(d, h_i(D_{-i,t}))$, where $h_i(\cdot)$ summarizes exposure to others’ treatment.
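One concrete design that makes $h_i(\cdot)$ estimable is two-stage (saturation) randomization: clusters draw a treatment saturation, then units are randomized within clusters. The sketch below is a single-period illustration with assumed cluster sizes and effects (so the time index in $h_i(D_{-i,t})$ is dropped); because treatment and exposure are jointly randomized, a regression on both recovers direct and spillover effects.

```python
import numpy as np

rng = np.random.default_rng(6)
n_clusters, m = 100, 10
n = n_clusters * m
cluster = np.repeat(np.arange(n_clusters), m)

# Two-stage randomization: each cluster draws a saturation, then units are
# randomized within cluster, creating design-based variation in exposure.
sat = rng.choice([0.2, 0.8], size=n_clusters)
d = rng.binomial(1, sat[cluster]).astype(float)

# Exposure mapping h_i(D_-i): share of i's cluster neighbors that are treated.
cluster_treated = np.bincount(cluster, weights=d)
h = (cluster_treated[cluster] - d) / (m - 1)

tau_direct, tau_spill = 2.0, 1.0
y = 1.0 + tau_direct * d + tau_spill * h + rng.normal(size=n)

# d and h are jointly randomized by design, so OLS on (1, d, h) recovers both.
X = np.column_stack([np.ones(n), d, h])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
direct_hat, spill_hat = coef[1], coef[2]
```

With ordinary full-cluster randomization, `d` and `h` would be perfectly collinear within clusters; the two saturation arms are what separate the direct effect from the spillover.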
Choosing a method: a compact map
| Assignment setting | Core assumption | Typical method |
|---|---|---|
| Randomized assignment | Randomization-based unconfoundedness | Difference in means, regression adjustment |
| Staggered adoption | Parallel trends | Modern DiD, event studies |
| Single treated unit | Factor-structure fit | Synthetic control |
| Common shock, varying exposure | Parallel trends or factor structure | Exposure-response DiD, interactive FE |
| Continuous treatment | Conditional unconfoundedness | Dose-response, DML, IV |
| Spillovers | Interference-aware identification | Cluster designs, spillover models |
Takeaway
Assignment mechanisms are the backbone of identification. The estimator is secondary: it should follow, not lead. In MMM and panel settings, the most credible analyses are those that make the assignment mechanism explicit, articulate the assumptions it implies, and use diagnostics that probe those assumptions.
References
- Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 2.4.
- Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics.
- Rambachan, A., and Roth, J. (2023). A more credible approach to parallel trends. Review of Economic Studies.