MMM 712: Workflow Checklist

This section distils the chapter into a compact protocol for using hybrid methods in marketing panels. The workflow is deliberately high level. It tells you what to decide and in what order, and points back to earlier sections where you can find detail on design, tuning, diagnostics, inference and reporting. The aim is not to turn hybrid methods into a mechanical checklist, but to provide a disciplined way to organise an analysis.

Step 1: Define the Estimand and Cohorts

Start by fixing the substantive question and the estimand. For a single treated unit, this is usually an average treatment effect on that unit over a chosen post-treatment window. With several treated units that adopt at the same time, you must decide whether you care about unit-specific effects, an overall ATT, or both. With staggered adoption, define cohorts by adoption time and decide whether you will focus on cohort-specific effects, event-time profiles or calendar-time effects. In staggered designs, the building blocks are cohort–time effects $\tau(g, t)$, which can be aggregated into event-time effects $\theta_k$ as described in Section 7.7. Make this explicit before you look at estimates. The aggregation schemes in Section 7.7 show how unit-level or cohort–time effects map into different summaries. Pre-specifying the target helps prevent you from drifting toward whichever summary looks most flattering ex post.

Step 2: Curate the Donor Pool

Next, assemble a donor pool that can plausibly stand in for the treated units in the absence of treatment. Exclude units that receive treatment during the estimation window, units that are clearly incomparable on business grounds, and units that are likely to be heavily contaminated by spillovers, drawing on the guidance in Section 7.8 and Chapter 11. Document these choices. Summarise how donors differ from treated units on key characteristics such as region, size, format or demographics. If donors are systematically smaller, poorer or otherwise different, that will influence your choice of method: designs such as ASCM that can correct covariate imbalances become more attractive. In staggered designs, remember that the donor pool for a given cohort and period consists of not-yet-treated and never-treated units. Track how this pool shrinks as later cohorts adopt. Late adopters may have far fewer valid donors than early ones.
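As a rough illustration of tracking how the donor pool shrinks under staggered adoption, the sketch below counts valid donors for each cohort and period. The adoption dates are invented, and the convention that a unit adopting in period g counts as treated from period g onward is an assumption made for this example, not something fixed by the text.

```python
import numpy as np

# Hypothetical adoption periods for eight markets; np.inf marks never-treated units.
adoption = np.array([3, 3, 5, 5, 7, np.inf, np.inf, 7])

def valid_donors(adoption, t):
    """Indices of units still untreated at period t (not-yet-treated or never-treated).

    Assumes a unit adopting in period g is treated from period g onward.
    """
    return np.flatnonzero(adoption > t)

def donor_counts_by_cohort(adoption, last_period):
    """For each adoption cohort g, count valid donors in every period t >= g."""
    cohorts = np.unique(adoption[np.isfinite(adoption)]).astype(int)
    return {
        int(g): {
            t: len(valid_donors(adoption, t))
            for t in range(int(g), last_period + 1)
        }
        for g in cohorts
    }

counts = donor_counts_by_cohort(adoption, last_period=8)
for g, by_period in counts.items():
    print(f"cohort {g}: {by_period}")
```

A table like this makes it easy to flag cohort and period combinations where the comparison rests on only a handful of donors.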

Step 3: Choose Candidate Hybrid Methods

Choose one or more hybrid estimators that match the structure of your problem. Section 7.3 lays out the main trade-offs. ASCM is a natural candidate when treated units differ systematically from donors on observables that predict outcomes and when pure SC cannot achieve good pre-treatment fit. Ridge SC is helpful when the donor pool is large relative to the pre-period and you worry about unstable weights. SDID fits staggered adoption and settings where treated and control units have different pre-trends but can be aligned after reweighting. More complex factor-based hybrids such as TROP make sense only when interactive fixed effects are plausible, the panel is reasonably large and you have the technical capacity to implement and scrutinise them. If more than one method seems appropriate, which is common, plan from the outset to estimate several and compare their diagnostics and estimates.

Step 4: Select Predictors and Tuning Rules

Specify the predictors you will use for weighting and, for ASCM, for augmentation, drawing on Section 7.8. Pre-treatment outcomes over a reasonably long window are usually central; a small number of well-chosen covariates that capture scale, demographics and competition often add value. Avoid building predictor sets by trial and error on the post-treatment data. Decide how you will tune regularisation parameters using only pre-treatment information. For ridge-type hybrids, this typically means splitting the pre-period into a training block and a validation block and searching over a broad grid of penalties. For SDID, most implementations solve the underlying optimisation problem once given the design; tuning is implicit rather than explicit. For TROP, staged cross-validation is one option, but given its research status it should be treated as experimental rather than routine. Write down these rules before inspecting post-treatment outcomes to reduce the temptation to chase specifications that produce the most appealing effects.
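The pre-period split for ridge-type hybrids can be sketched as follows. This is a minimal illustration on simulated data, not the chapter's estimator: it uses plain ridge regression of the treated series on the donor series, which shrinks weights toward zero and imposes no simplex constraint. The data, the 16/8 split and the penalty grid are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated pre-period data (hypothetical): 24 periods, 10 donors.
T_pre, n_donors = 24, 10
donors = rng.normal(size=(T_pre, n_donors)).cumsum(axis=0)
true_w = np.zeros(n_donors)
true_w[:3] = [0.5, 0.3, 0.2]
treated = donors @ true_w + rng.normal(scale=0.5, size=T_pre)

def ridge_weights(X, y, penalty):
    """Closed-form ridge solution: (X'X + penalty * I)^{-1} X'y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(k), X.T @ y)

def tune_penalty(X, y, n_train, grid):
    """Fit on the first n_train pre-periods, score on the remaining pre-periods."""
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_val, y_val = X[n_train:], y[n_train:]
    scores = []
    for penalty in grid:
        w = ridge_weights(X_tr, y_tr, penalty)
        scores.append(np.sqrt(np.mean((y_val - X_val @ w) ** 2)))
    best = int(np.argmin(scores))
    return grid[best], np.array(scores)

# A broad, log-spaced grid; only pre-treatment data enter the search.
grid = np.logspace(-3, 3, 25)
best_penalty, val_rmspe = tune_penalty(donors, treated, n_train=16, grid=grid)
print(f"selected penalty: {best_penalty:.4g}, validation RMSPE: {val_rmspe.min():.3f}")
```

Keeping the validation block at the end of the pre-period mimics the forecasting task the estimator faces after treatment. If the selected penalty sits at an edge of the grid, widening the grid is usually the first thing to try.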

Step 5: Assess Pre-Treatment Fit and Balance

Estimate your chosen hybrids and compute pre-treatment RMSPE for each, as defined in Section 7.9. Express RMSPE relative to the standard deviation of pre-period outcomes and compare across methods. Designs that cannot track the treated unit’s pre-treatment path at all are not credible; designs that match the path much better than simpler alternatives, without implausible weights, are promising, conditional on donor validity (no spillovers) and stable pre/post relationships.

At the same time, examine covariate balance for variables you believe matter for both treatment and outcomes. Standardised mean differences provide a convenient summary; values close to zero indicate good balance, and large absolute values flag imbalances that may threaten identification. You do not need to enforce a rigid cutoff, but imbalances on central covariates should be rare in a design you trust.

Finally, look at weight dispersion. Effective numbers of donors or periods that are either extremely low (all mass on one donor or one period) or extremely high (near-uniform weights) deserve scrutiny. Visualising weight distributions over donors and periods alongside pre-treatment fit helps you see whether a design is relying too heavily on a few units or times.

If pre-treatment fit is poor, key covariates are badly imbalanced, or weights look extreme, regard this as feedback, not as failure. Revisit the donor pool, predictors or tuning and iterate until you either achieve acceptable diagnostics or conclude that the design cannot support a reliable hybrid.
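The three diagnostics in this step can be computed with a few lines of code. The sketch below is illustrative only: the numbers are invented, the standardised mean difference is scaled by the unweighted donor standard deviation, and the effective number of donors is taken to be the inverse sum of squared normalised weights. Those conventions are assumptions; the definitions in Section 7.9 take precedence where they differ.

```python
import numpy as np

def relative_rmspe(y_pre, y_hat_pre):
    """Pre-treatment RMSPE divided by the standard deviation of pre-period outcomes."""
    rmspe = np.sqrt(np.mean((y_pre - y_hat_pre) ** 2))
    return rmspe / np.std(y_pre, ddof=1)

def standardised_mean_difference(x_treated, x_donors, weights):
    """(treated value - weighted donor mean) / unweighted donor SD, per covariate."""
    weighted_mean = weights @ x_donors
    return (x_treated - weighted_mean) / np.std(x_donors, axis=0, ddof=1)

def effective_number(weights):
    """Inverse sum of squared normalised weights: 1 for one donor, n for uniform weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

# Hypothetical example: 4 donors, 2 covariates, 6 pre-periods.
weights = np.array([0.7, 0.2, 0.1, 0.0])
x_donors = np.array([[10.0, 1.0], [12.0, 3.0], [8.0, 2.0], [20.0, 9.0]])
x_treated = np.array([10.5, 1.8])
y_pre = np.array([100.0, 104.0, 103.0, 108.0, 110.0, 115.0])
y_hat_pre = np.array([101.0, 103.0, 104.0, 107.0, 111.0, 114.0])

smd = standardised_mean_difference(x_treated, x_donors, weights)
print("relative RMSPE:", round(relative_rmspe(y_pre, y_hat_pre), 3))
print("SMD by covariate:", smd.round(3))
print("effective donors:", round(effective_number(weights), 2))
```

The same `effective_number` function applies to time weights, so one helper covers both the donor and the period dimension.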

Step 6: Run Placebos and Sensitivity Checks

Once pre-period diagnostics look reasonable, stress-test the design. Apply in-space placebos by treating donors as if they were exposed and comparing their placebo gaps to the treated unit’s gap. Apply in-time placebos by shifting the intervention date into the pre-period and checking whether the method spuriously produces “effects” where none should exist. Conduct leave-one-out analyses by re-estimating after removing influential donors or pre-periods. Large swings in estimates when a single donor or period is omitted indicate fragility and call for explanation. Residual plots and weight-distribution plots help you see whether such donors and periods are truly special or artefacts of the optimisation. The goal of these checks is not to pass a checklist, but to understand how your design behaves under small perturbations.
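The three stress tests share one pattern: re-run the same estimator on a perturbed design. The sketch below shows that pattern on a simulated panel. The function `estimate_gap` is a hypothetical stand-in (a ridge fit on the pre-period followed by a mean post-period gap), not one of the chapter's hybrids; in practice you would substitute your own estimator. The panel, the treatment date and the effect size are invented.

```python
import numpy as np

def estimate_gap(y, donors, T0, penalty=1.0):
    """Mean post-period gap from a ridge fit on the first T0 periods (stand-in estimator)."""
    X_pre, y_pre = donors[:T0], y[:T0]
    k = donors.shape[1]
    w = np.linalg.solve(X_pre.T @ X_pre + penalty * np.eye(k), X_pre.T @ y_pre)
    return float(np.mean(y[T0:] - donors[T0:] @ w))

def in_space_placebos(donors, T0, **kw):
    """Treat each donor as if exposed, using the other donors as its pool."""
    n = donors.shape[1]
    return np.array([
        estimate_gap(donors[:, j], np.delete(donors, j, axis=1), T0, **kw)
        for j in range(n)
    ])

def in_time_placebo(y, donors, T0, fake_T0, **kw):
    """Shift the intervention to fake_T0 < T0, using only true pre-period data."""
    return estimate_gap(y[:T0], donors[:T0], fake_T0, **kw)

def leave_one_out(y, donors, T0, **kw):
    """Re-estimate the gap after dropping each donor in turn."""
    n = donors.shape[1]
    return np.array([
        estimate_gap(y, np.delete(donors, j, axis=1), T0, **kw)
        for j in range(n)
    ])

# Simulated panel (hypothetical): 30 periods, 8 donors, effect of 2.0 added after T0 = 20.
rng = np.random.default_rng(1)
T, T0, n = 30, 20, 8
factor = rng.normal(size=T).cumsum()
donors = np.outer(factor, rng.uniform(0.5, 1.5, n)) + rng.normal(scale=0.3, size=(T, n))
y = factor + rng.normal(scale=0.3, size=T)
y[T0:] += 2.0

gap = estimate_gap(y, donors, T0)
placebo_gaps = in_space_placebos(donors, T0)
loo_gaps = leave_one_out(y, donors, T0)
print("treated gap:", round(gap, 2))
print("in-space placebo gaps:", placebo_gaps.round(2))
print("in-time placebo gap:", round(in_time_placebo(y, donors, T0, fake_T0=14), 2))
print("leave-one-out gaps:", loo_gaps.round(2))
```

Note that the in-space placebos exclude the treated unit from every placebo donor pool, so a genuine effect does not leak into the placebo fits.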

Step 7: Compute and Interpret Inference

Choose inference tools that respect your sample size and design, drawing on Chapter 16 and Section 7.10. In small marketing panels, unit-level or cohort-level block bootstraps, wild bootstraps and placebo-based distributions are often more reliable than asymptotic formulas. When there is only one treated unit, placebo-based and time-permutation tools often carry more interpretive weight than a naive unit bootstrap. Be explicit about what each inferential device assumes. Rank-based placebo statistics rely on symmetry or exchangeability arguments; bootstrap procedures treat your sample as a stand-in for a wider population; conformal-style prediction intervals lean on residual exchangeability. In observational work, randomisation-style p-values are best viewed as sensitivity tools, not as definitive causal tests. Report interval estimates and uncertainty summaries alongside point estimates, and interpret them in the context of effect sizes that matter for decisions.
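A rank-based placebo p-value is simple to compute once you have a statistic for the treated unit and for each placebo unit. The sketch below is one common construction, offered as an assumption rather than as the procedure of Section 7.10: the p-value is the share of units, treated included, whose statistic is at least as extreme as the treated one. The statistics are invented; the post/pre RMSPE ratio helper shows one statistic often used for this purpose.

```python
import numpy as np

def placebo_p_value(treated_stat, placebo_stats):
    """Share of units (treated included) with a statistic at least as extreme as the treated one."""
    stats = np.abs(np.append(placebo_stats, treated_stat))
    return float(np.mean(stats >= abs(treated_stat)))

def rmspe_ratio(gaps, T0):
    """Post-period RMSPE divided by pre-period RMSPE for one unit's gap series."""
    gaps = np.asarray(gaps, dtype=float)
    pre = np.sqrt(np.mean(gaps[:T0] ** 2))
    post = np.sqrt(np.mean(gaps[T0:] ** 2))
    return post / pre

# Hypothetical statistics: one treated unit and nine placebo donors.
treated_stat = 4.1
placebo_stats = np.array([0.8, 1.2, 0.5, 2.0, 1.1, 0.9, 4.5, 1.6, 0.7])
p = placebo_p_value(treated_stat, placebo_stats)
print(f"placebo p-value: {p:.2f}")
```

With J placebo units the smallest attainable p-value is 1/(J + 1), so a small donor pool limits how strong the evidence can appear. This is one reason to read such p-values as sensitivity summaries.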

Step 8: Aggregate and Explore Heterogeneity

If your design involves multiple treated units or cohorts, aggregate unit-level or group–time estimates into the estimand you specified in Step 1. Use weights that reflect the policy question (equal weights for “average treated unit” questions, population or revenue weights for questions about aggregate impact) and report these weights so readers can see which units drive the results. For staggered adoption, present event-time profiles with confidence bands, mark the omitted event time (the reference period that defines the baseline for $\theta_k$) and indicate which cohorts contribute at each horizon. For common-adoption designs, show unit-specific estimates alongside the aggregate to reveal heterogeneity. In both cases, sensitivity analyses that vary aggregation weights or restrict attention to better-matched units help assess how conclusions depend on these choices.
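To make the aggregation concrete, the sketch below averages cohort–time effects into an event-time profile and records which cohorts contribute at each horizon. It assumes the event-time effect at horizon k is a weighted average of $\tau(g, g + k)$ over the cohorts observed at that horizon; the exact scheme in Section 7.7 may differ. The effects and cohort weights are invented.

```python
import numpy as np

# Hypothetical cohort-time effects tau[(g, t)] for cohorts adopting in periods 3 and 5,
# observed through period 7. Values are invented for illustration.
tau = {
    (3, 3): 1.0, (3, 4): 1.5, (3, 5): 2.0, (3, 6): 2.0, (3, 7): 2.5,
    (5, 5): 0.5, (5, 6): 1.0, (5, 7): 1.0,
}
# Hypothetical cohort weights, e.g. number of treated units or revenue per cohort.
cohort_weight = {3: 10.0, 5: 30.0}

def event_time_profile(tau, cohort_weight):
    """Weighted average of tau(g, g + k) over the cohorts observed at horizon k."""
    by_horizon = {}
    for (g, t), effect in tau.items():
        by_horizon.setdefault(t - g, []).append((g, effect))
    profile = {}
    for k, items in sorted(by_horizon.items()):
        w = np.array([cohort_weight[g] for g, _ in items])
        e = np.array([effect for _, effect in items])
        profile[k] = {
            "theta": float(np.sum(w * e) / np.sum(w)),
            "cohorts": [g for g, _ in items],
        }
    return profile

profile = event_time_profile(tau, cohort_weight)
for k, row in profile.items():
    print(f"k={k}: theta={row['theta']:.3f}, contributing cohorts={row['cohorts']}")

# Sensitivity check: equal cohort weights instead of size-based weights.
equal_profile = event_time_profile(tau, {3: 1.0, 5: 1.0})
```

In this toy example only the early cohort contributes at horizons 3 and 4, so changes in the profile at long horizons mix dynamic effects with changes in cohort composition. Reporting the contributing cohorts next to each estimate makes that visible.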

Step 9: Document Assumptions and Make the Analysis Auditable

Finally, write down the identification assumptions you are relying on: no anticipation, no interference (or an explicit exposure mapping if spillovers are modelled, using $h_i(D_{-i,t})$ as in Chapter 11), stability of the predictor–outcome relationship across pre- and post-periods, and sufficient overlap between treated units and donors. Connect these to the economic context of your application. Explain why you believe, for example, that donors were not affected by the campaign or that there were no structural breaks at the time of treatment. Document any deviations from your pre-specified plan and justify them. Where possible, provide replication materials (cleaned data or suitably anonymised proxies, code, and details of software versions) so that others can reproduce your results or apply alternative specifications. Transparency about design choices, diagnostics and inference strengthens the credibility of your conclusions more than any single “significant” estimate. Used together with the workflow above, the diagnostics and examples in this chapter provide a practical blueprint for deploying hybrid methods in marketing panels in a way that is both rigorous and transparent.

Table 7.1 Hybrid Methods at a Glance (see [Abadie et al., 2010, Ben-Michael et al., 2021, Arkhangelsky et al., 2021, Athey et al., 2025b] for methodological details)

| Method | Key Assumptions | Tuning Focus | Typical Use-Cases |
|---|---|---|---|
| Standard SC | Convex-hull approximation; no anticipation; stability of the imputation mapping learned in the pre-period | Predictor set; choice of pre-period | Single treated unit; long pre-period; treated unit well represented in donor pool |
| ASCM | Same as SC plus outcome-model stability; bias is small when weighting imbalance and augmentation error are both small under the maintained model | Choice of covariates; regularisation strength for augmentation and, where used, weights | Treated units near but not clearly inside donor hull; important covariate imbalances; short pre-periods |
| Ridge SC | Approximate convex-hull match with explicit shrinkage towards diffuse weights | Ridge penalty on weights; pre-period split for validation | Large donor pools; concern about unstable or overly concentrated weights |
| SDID | Weighted parallel trends after reweighting units and periods; overlap between treated and not-yet-treated units | Implementation details for weight estimation; choice of pre-period window | Staggered adoption; differential pre-trends across units; rich donor pool of not-yet-treated markets |
| TROP | Low-rank factor structure for untreated outcomes; shared factor space between treated and donor units | Penalties on unit weights, time weights and factor rank (typically tuned by cross-validation) | Panels with strong interactive fixed effects; moderate sample sizes; teams comfortable with factor models |

Part IV

Factor Models and Matrix Methods

Chapter 8

Interactive Fixed Effects and Matrix Completion

This chapter develops factor models and matrix completion methods for counterfactual imputation and treatment-effect estimation in marketing panels. We treat factor models as low-rank statistical approximations to panel outcomes, capturing shared shocks and heterogeneous exposure across units, and we show how this structure motivates interactive fixed effects (IFE) estimators. We formalise IFE models, including identification up to rotation (that is, identification of the factor space and fitted values, not the factors themselves) and selection of the number of factors. We then show how to apply matrix completion with nuclear-norm regularisation to impute counterfactual outcomes for treated or otherwise missing cells. We develop diagnostics, tuning rules, and inference procedures that work in marketing panels with serial dependence and a limited number of clusters. By the end of the chapter, you should be able to (i) specify an IFE model for a marketing panel, (ii) select the factor rank, (iii) implement matrix-completion estimators, and (iv) diagnose when a low-rank approximation is not credible. Factor-based designs sit alongside parallel-trends and synthetic control approaches introduced earlier in the book. As throughout the book, credible use requires a clear estimand, identification assumptions about the assignment mechanism and stability of $Y_{it}(0)$, an estimator for imputing counterfactual outcomes, and diagnostics and inference that assess sensitivity to these choices. We map connections to synthetic control and SDID (Chapters 6 and 7), clarifying when factor designs are preferable to parallel-trends designs and when you should instead rely on difference-in-differences (Chapter 4). Later sections refer back to event-study diagnostics and inference procedures developed in Chapters 5, 17, and 16.

References

Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 7.12.