Why inference is hard in geo-experiments

Geo-experiments often have only a modest number of clusters (DMAs, regions, custom markets) and outcomes measured over many periods within each cluster. This creates two problems:

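  • Few clusters: cluster-robust standard errors (SEs) rely on asymptotics in the number of clusters $G$. With small $G$ they tend to be biased downward, so tests reject too often.
  • Serial correlation: outcomes within a cluster are correlated across time, violating the independence across observations that conventional SEs assume.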

As a result, the effective sample size is much closer to the number of clusters $G$ than to the number of cluster-period observations.
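
To make the effective-sample-size point concrete, here is a minimal sketch using the standard design-effect formula $1 + (T - 1)\rho$. It assumes a simplification that is not in the text above: every pair of periods within a cluster shares the same correlation $\rho$ (real serial correlation usually decays with lag). The function name and the example numbers are illustrative only.

```python
def effective_sample_size(n_clusters: int, n_periods: int, rho: float) -> float:
    """Approximate effective sample size for G clusters observed over T periods.

    Assumes equal correlation rho between any two periods of the same cluster
    (an illustrative simplification, not a model of real serial correlation).
    """
    design_effect = 1.0 + (n_periods - 1) * rho
    return n_clusters * n_periods / design_effect


# Hypothetical example: 20 markets observed for 52 weeks.
for rho in (0.0, 0.5, 1.0):
    print(rho, round(effective_sample_size(20, 52, rho), 1))
```

With no within-cluster correlation the 1,040 observations all count. At $\rho = 0.5$ the effective sample size is already down to about 39, and at $\rho = 1$ it equals the number of clusters, 20.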

Randomization inference

Randomization inference (RI) gives exact finite-sample p-values under the sharp null that treatment has no effect on any unit in any period. The logic:

  1. Treat the observed assignment as one draw from the randomization protocol.
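  2. Holding the observed outcomes fixed (under the sharp null they do not depend on assignment), recompute the test statistic under all (or a large random sample of) admissible randomizations.
  3. Compare the observed statistic to this null distribution: the p-value is the share of randomizations whose statistic is at least as extreme as the observed one.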

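RI automatically respects clustering and serial correlation because the only source of randomness it uses is the assignment mechanism itself: whole clusters are re-randomized while each cluster's outcome series is held fixed, so no model of the within-cluster error structure is required.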
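
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes complete randomization (a fixed number of treated clusters, all such assignments equally likely), collapses each cluster to a single summary outcome, and uses the difference in means as the test statistic. The function name `randomization_inference` and the simulated data are hypothetical.

```python
import numpy as np


def randomization_inference(y_bar, treated, n_draws=5000, seed=0):
    """Monte Carlo randomization-inference p-value for a difference in means.

    y_bar   : (G,) one summary outcome per cluster (e.g., post-period mean)
    treated : (G,) boolean assignment actually used
    Assumes complete randomization with a fixed number of treated clusters.
    """
    rng = np.random.default_rng(seed)
    y_bar = np.asarray(y_bar, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    def diff_in_means(assign):
        return y_bar[assign].mean() - y_bar[~assign].mean()

    observed = diff_in_means(treated)

    # Re-draw the assignment at the cluster level; outcomes stay fixed.
    draws = np.empty(n_draws)
    for b in range(n_draws):
        draws[b] = diff_in_means(rng.permutation(treated))

    # Two-sided p-value; the +1 terms count the observed assignment itself.
    # The small tolerance guards against floating-point ties.
    n_extreme = np.sum(np.abs(draws) >= abs(observed) - 1e-12)
    p_value = (1 + n_extreme) / (1 + n_draws)
    return observed, p_value


# Hypothetical data: 20 markets, 10 treated, true lift of 5 units.
rng = np.random.default_rng(42)
G = 20
treated = np.zeros(G, dtype=bool)
treated[:10] = True
y_bar = rng.normal(size=G) + 5.0 * treated

observed, p_value = randomization_inference(y_bar, treated, n_draws=2000)
print(f"difference in means = {observed:.2f}, RI p-value = {p_value:.4f}")
```

If the real protocol used stratification or matched pairs, the `rng.permutation` step would need to be replaced by draws from that protocol, since the validity of the p-value rests on re-randomizing exactly as the experiment did.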

Wild cluster bootstrap

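The wild cluster bootstrap is a practical alternative when $G$ is small. Rather than resampling clusters, it keeps the regressors fixed and builds each bootstrap sample by multiplying all residuals in a cluster by a single random weight (typically a sign flip of $+1$ or $-1$), which preserves within-cluster correlation. It tolerates heteroskedasticity and within-cluster serial correlation, although it can become unreliable when very few clusters are treated.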
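
A minimal sketch of one common variant follows: a bootstrap-t test with residuals from the model restricted under the null and Rademacher (sign-flip) weights, in the spirit of Cameron, Gelbach, and Miller (2008). It assumes the simplest possible regression (an intercept and a treatment dummy), omits small-sample corrections to the cluster-robust variance, and uses hypothetical function names and simulated data; it is meant to show the mechanics, not to replace a tested library.

```python
import numpy as np


def cluster_t_stat(y, X, cluster_ids, coef_index):
    """OLS coefficient and cluster-robust t-statistic (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]  # sum of x_i * e_i within cluster g
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta[coef_index], beta[coef_index] / np.sqrt(V[coef_index, coef_index])


def wild_cluster_bootstrap_p(y, d, cluster_ids, n_boot=999, seed=0):
    """Two-sided bootstrap p-value for H0: the coefficient on d is zero."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    X = np.column_stack([np.ones_like(d), d])
    _, t_obs = cluster_t_stat(y, X, cluster_ids, 1)

    # Restricted fit: imposing the null leaves an intercept-only model.
    fitted_null = np.full_like(y, y.mean())
    resid_null = y - fitted_null

    clusters, inverse = np.unique(cluster_ids, return_inverse=True)
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        # One sign per cluster, applied to every residual in that cluster.
        signs = rng.choice([-1.0, 1.0], size=clusters.size)
        y_star = fitted_null + signs[inverse] * resid_null
        _, t_boot[b] = cluster_t_stat(y_star, X, cluster_ids, 1)

    return t_obs, float(np.mean(np.abs(t_boot) >= abs(t_obs)))


# Hypothetical panel: 12 markets, 20 periods, 6 treated, true lift of 5 units.
rng = np.random.default_rng(1)
G, T = 12, 20
cluster_ids = np.repeat(np.arange(G), T)
d = np.repeat((np.arange(G) < G // 2).astype(float), T)
cluster_shock = np.repeat(rng.normal(size=G), T)  # shared within each market
y = 5.0 * d + cluster_shock + rng.normal(size=G * T)

t_obs, p_value = wild_cluster_bootstrap_p(y, d, cluster_ids)
print(f"t = {t_obs:.2f}, wild cluster bootstrap p-value = {p_value:.4f}")
```

Note that regressors and cluster membership never change across bootstrap samples; only the signs of the cluster-level residual blocks do. With $G$ clusters there are just $2^G$ distinct sign patterns, which is one reason the procedure becomes coarse when $G$ is very small.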

Practical guidance

  • Use RI when the assignment protocol is well-defined and the sharp null is a meaningful benchmark.
  • Use wild cluster bootstrap when you need inference for average effects and the number of clusters is small.
  • Treat conventional cluster-robust SEs as a baseline only when $G$ is reasonably large.

Takeaway

In geo-experiments, credible inference depends on accounting for both the small number of clusters and the serial correlation within them. Randomization inference and the wild cluster bootstrap are often the safest defaults.

References

  • Shaw, C. (2025). Causal Inference in Marketing: Panel Data and Machine Learning Methods (Community Review Edition), Section 3.3.4.
  • Rubin, D. B. (1980). Randomization analysis of experimental data.
  • Cameron, A. C., Gelbach, J. B., and Miller, D. L. (2008). Bootstrap-based improvements for inference with clustered errors.
  • MacKinnon, J. G., and Webb, M. D. (2017). Wild bootstrap inference for clustered errors.