Practical Steps for Reliable Carryover Experiments
Section 5.7 distills the main practical lessons for designing and analyzing carryover (switchback) experiments:
Choose the Right Outcome Metric:
- Select an outcome that truly reflects the effect of the intervention, using both domain knowledge and an understanding of the likely impact.
Set Experimental Granularity Carefully:
- Make each experimental period shorter than the expected duration of the carryover effect. Finer granularity (e.g., minutes instead of hours) yields more periods from the same experiment and tends to improve precision.
Estimate the Lag Order ($m$) Thoughtfully:
- Use prior knowledge or stepwise testing (see Section 4.4) to select $m$. If $m$ is large relative to the experiment length $T$, statistical power drops and inference may be inconclusive.
- Each hypothesis test to identify $m$ requires substantial data ($T/m > 100$ recommended).
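The arithmetic behind the granularity and lag-order advice can be sketched as follows. This is a minimal illustration, not part of the original method: the function name `design_summary`, its parameters, and the choice to express everything in minutes are assumptions made here, and the lag order is approximated as the carryover duration divided by the period length, rounded up.

```python
import math

def design_summary(total_minutes, period_minutes, carryover_minutes, min_ratio=100):
    """Translate wall-clock design choices into T, m, and the ratio T/m.

    Hypothetical helper: m is approximated as the number of periods the
    carryover effect is expected to span (rounded up).
    """
    if total_minutes <= 0 or period_minutes <= 0 or carryover_minutes < 0:
        raise ValueError("durations must be positive (carryover may be zero)")
    # T: number of complete experimental periods in the experiment.
    T = total_minutes // period_minutes
    # m: lag order in periods; floored at 1 here so the ratio is defined
    # (an assumption of this sketch, not of the source).
    m = max(1, math.ceil(carryover_minutes / period_minutes))
    ratio = T / m
    return {"T": T, "m": m, "ratio": ratio, "enough_data": ratio > min_ratio}

if __name__ == "__main__":
    # One week of 10-minute periods with roughly one hour of carryover.
    print(design_summary(7 * 24 * 60, 10, 60))
    # Same design, but carryover is expected to last two hours.
    print(design_summary(7 * 24 * 60, 10, 120))
```

Under these assumed inputs, the first design gives $T/m = 168$ and passes the $T/m > 100$ guideline, while the second gives $T/m = 84$ and does not.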
Optimize Experiment Duration:
- If possible, choose the experiment length $T$ (equivalently, the ratio $n = T/m$ for a given lag order) based on power analysis and the expected signal-to-noise ratio.
- Use the rejection rate curve (see Section 5.3) to guide duration and power.
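As a rough illustration of how a rejection rate curve can guide duration, the sketch below estimates rejection rates by Monte Carlo for several values of $n$. It is a simplified stand-in and not the test described in the source: it assumes $n$ independent, normally distributed block-level effect estimates and a two-sided z-test with a normal critical value of 1.96. The names `rejection_rate` and `smallest_n_for_power`, and the parameters `effect` and `noise_sd`, are hypothetical.

```python
import numpy as np

def rejection_rate(n_blocks, effect, noise_sd, n_sims=2000, seed=0):
    """Monte Carlo rejection rate of a two-sided z-test at the 5% level.

    Assumption of this sketch: each experiment yields n_blocks independent
    block-level estimates drawn from Normal(effect, noise_sd).
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated experiment.
    est = rng.normal(effect, noise_sd, size=(n_sims, n_blocks))
    mean = est.mean(axis=1)
    se = est.std(axis=1, ddof=1) / np.sqrt(n_blocks)
    z = mean / se
    # Fraction of simulated experiments in which the null of zero effect is rejected.
    return float(np.mean(np.abs(z) > 1.96))

def smallest_n_for_power(grid, effect, noise_sd, target=0.8):
    """Return the first n in grid whose simulated rejection rate reaches target."""
    for n in grid:
        if rejection_rate(n, effect, noise_sd) >= target:
            return n
    return None

if __name__ == "__main__":
    for n in [25, 50, 100, 200]:
        print(n, rejection_rate(n, effect=0.3, noise_sd=1.0))
    print(smallest_n_for_power([25, 50, 100, 200], effect=0.3, noise_sd=1.0))
```

Reading the printed curve from small to large $n$ shows where the rejection rate crosses a chosen power target; with $n = T/m$, that value of $n$ translates back into an experiment length $T$ for a given lag order.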
Leverage Multiple Units:
- If you can run experiments on multiple units, do so and combine results to boost precision and power.
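The source does not say how results from multiple units should be combined. One common option, shown here purely as an assumption, is inverse-variance weighting, which is appropriate when the units are independent and target the same effect. The function name `pool_estimates` and its inputs are hypothetical.

```python
import math

def pool_estimates(estimates, std_errors):
    """Inverse-variance weighted average of per-unit effect estimates.

    Assumes independent units that share a common effect; this is a
    standard meta-analytic choice, not necessarily the source's method.
    """
    if len(estimates) != len(std_errors) or len(estimates) == 0:
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    # More precise units (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

if __name__ == "__main__":
    # Three units with per-unit estimates and standard errors (made-up numbers).
    print(pool_estimates([0.8, 1.2, 1.0], [0.5, 0.4, 0.6]))
```

The pooled standard error is never larger than the smallest per-unit standard error, which is the sense in which combining units boosts precision.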
Limitations and Open Questions
- High Lag Order: When $m$ is large relative to $T$, variance increases and inference becomes unreliable unless you have strong domain knowledge to inform the model.
- No Adaptive Randomization: The method assumes fixed randomization; adaptive designs require further modeling assumptions.
- Estimand Scope: The results focus on a specific estimand (permanent adoption effect). Other causal questions may require new methods.
Summary
Careful choices of outcome metric, experimental granularity, lag order, and experiment duration are essential for valid carryover experiments. Be aware of the limitations, especially with high lag order or limited data, and use domain knowledge to guide design choices.