In the previous post, we saw that SPRUCE achieves the optimal growth rate for wealth under the alternative. But practitioners often care about a different quantity: how long until the test rejects? This post covers Section 4 of the paper, which derives matching lower and upper bounds on the expected time to rejection.
The stopping time
For a given e-process $(W_n)_{n \in \mathbb{N}}$ and type-I error level $\alpha \in (0, 1)$, define the stopping time:
$$\tau_\alpha := \inf\{n \in \mathbb{N} : W_n \geq 1/\alpha\}.$$

This is the first sample size at which the wealth crosses the rejection threshold. We want $E_Q[\tau_\alpha]$ to be as small as possible under the alternative $Q$.
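As a quick sketch (not from the paper), here is how the stopping time of a simulated wealth trajectory can be computed. The lognormal increments are a stand-in for whatever e-process produces the wealth:

```python
import numpy as np

def stopping_time(wealth, alpha):
    """First sample size n (1-based) at which W_n >= 1/alpha, or None if never hit."""
    hits = np.flatnonzero(np.asarray(wealth) >= 1.0 / alpha)
    return int(hits[0]) + 1 if hits.size else None

# Illustrative wealth trajectory: i.i.d. multiplicative increments with
# positive expected log-increment (so the test rejects eventually under Q).
rng = np.random.default_rng(0)
increments = np.exp(rng.normal(loc=0.05, scale=0.1, size=2000))
wealth = np.cumprod(increments)
print(stopping_time(wealth, alpha=0.001))
```

The 1-based convention matches the definition of $\tau_\alpha$ as a sample size rather than an array index.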
The lower bound
Proposition 4.1 establishes a fundamental limit. For any process $\widetilde{W}$ of the form (4) with predictable portfolios and arm selections, if the log-increment $\log(\lambda_Q(a)^\top E_1(a))$ is Q-almost surely bounded for each arm:
$$\frac{E_Q[\tilde{\tau}_\alpha]}{\log(1/\alpha)} \geq \left(\max_{(a,\lambda) \in \mathcal{A} \times \Delta^d} E_Q\!\left[\log\big(\lambda^\top E_1(a)\big)\right]\right)^{-1}.$$

In words: no test can reject faster than $\log(1/\alpha)$ divided by the optimal expected log-increment. The bound is a per-arm limit: it holds even for a process that always pulls the same arm $a$.
The proof uses Wald’s identity and compares $\widetilde{W}$ to a process with oracle access to the optimal portfolios. The key insight is that the expected log-increment acts as a “speed limit” for evidence accumulation.
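A toy Monte Carlo check of this speed limit (not the paper's setup: the log-increments are drawn i.i.d. Gaussian with mean $r$, standing in for $E_Q[\log(\lambda^\top E_1(a))]$). By Wald's identity, the expected rejection time should be close to $\log(1/\alpha)/r$ up to a small overshoot term:

```python
import numpy as np

rng = np.random.default_rng(1)
r, sigma, alpha = 0.05, 0.1, 1e-4      # assumed log-increment mean/sd and error level
threshold = np.log(1.0 / alpha)        # reject when cumulative log-wealth crosses this

def rejection_time():
    """Samples until the cumulative log-increments first exceed log(1/alpha)."""
    total, n = 0.0, 0
    while total < threshold:
        total += rng.normal(r, sigma)  # one log-increment log(lambda^T E_i(a))
        n += 1
    return n

times = [rejection_time() for _ in range(2000)]
ratio = np.mean(times) * r / threshold
print(ratio)  # close to 1: E[tau] is roughly log(1/alpha) / r
```

The ratio exceeds 1 slightly because the wealth overshoots the threshold at the stopping time; the overshoot becomes negligible relative to $\log(1/\alpha)$ as $\alpha \to 0^+$, which is exactly the high-confidence regime of the theorem below.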
The upper bound: SPRUCE matches the lower bound
Theorem 4.2 shows that SPRUCE achieves this lower bound in the high-confidence regime ($\alpha \to 0^+$):
$$\lim_{\alpha \to 0^+} \frac{E_Q[\tau^{\text{SPRUCE}}_\alpha]}{\log(1/\alpha)} = \lim_{\alpha \to 0^+} \frac{E_Q[\tau^\star_{\alpha,Q}]}{\log(1/\alpha)} = \left(\max_{(a,\lambda) \in \mathcal{A} \times \Delta^d} E_Q\!\left[\log\big(\lambda^\top E_1(a)\big)\right]\right)^{-1}.$$

Here $\tau^\star_{\alpha,Q}$ is the stopping time of the oracle process $W^\star_{n,Q} = \prod_{i=1}^n \lambda_Q(a_Q)^\top E_i(a_Q)$, which knows both the optimal arm and the optimal portfolio from the start.
This is a strong result: SPRUCE’s expected rejection time matches the oracle’s, with exact constants, as $\alpha \to 0^+$. The same quantity — the optimal expected log-increment — governs both growth rate and rejection time.
Empirical evidence
Figure 2 in the paper shows the distribution of stopping times at $\alpha = 0.001$ under “easy” and “hard” data generating processes. SPRUCE’s distribution closely tracks the oracle arm’s, while Round Robin and Random Selection are noticeably slower with heavier tails. The vertical line marks the theoretical lower bound $\log(1/\alpha) / E_Q[\ell_{1,Q}(a_Q)]$, where $\ell_{1,Q}(a_Q) = \log(\lambda_Q(a_Q)^\top E_1(a_Q))$ is the log-increment of the oracle arm under its optimal portfolio.
Why the same quantity appears twice
It’s not a coincidence that the optimal growth rate and the optimal rejection time depend on the same quantity $\max_{(a,\lambda)} E_Q[\log(\lambda^\top E_1(a))]$. This is the Kelly-optimal expected log-wealth — the fundamental rate at which evidence accumulates when you’re betting optimally.
- Growth rate: how fast your wealth grows per sample
- Rejection time: how many samples you need to reach a target wealth level
These are two sides of the same coin. If log-wealth grows at rate $r$ per sample, you need approximately $\log(1/\alpha) / r$ samples to reach wealth $1/\alpha$.
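A back-of-envelope instance with assumed numbers (an optimal expected log-increment of $r = 0.05$ nats per sample and $\alpha = 0.001$; neither value comes from the paper):

```python
import math

r = 0.05        # assumed optimal expected log-increment per sample (nats)
alpha = 0.001   # target type-I error level

samples_needed = math.log(1 / alpha) / r
print(round(samples_needed))  # about 138 samples to reach wealth 1/alpha
```

Halving the growth rate doubles the expected rejection time, while tightening $\alpha$ by a factor of 10 adds only $\log(10)/r \approx 46$ samples, reflecting the logarithmic dependence on the error level.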
Key takeaways
- Lower bound: no test can have expected rejection time smaller than $\log(1/\alpha)$ divided by the optimal expected log-increment.
- SPRUCE matches it: in the high-confidence regime, SPRUCE’s expected rejection time equals the oracle’s, with exact constants.
- Same quantity, two roles: the Kelly-optimal expected log-increment governs both asymptotic growth rate and expected rejection time.
- The boundedness assumption: the proof requires $\log(\lambda_Q(a)^\top E_1(a))$ to be Q-almost surely bounded, which holds in all the paper’s examples (bounded mean testing, treatment effect testing).
- Nonasymptotic bounds exist: Theorem 4.2 follows from nonasymptotic upper bounds that hold for any $\alpha \in (0, 1)$, not just in the limit.
References
- Sandoval, Waudby-Smith, and Jordan (2026). Multi-Armed Sequential Hypothesis Testing by Betting.
- Wald (1945, 1947). Sequential analysis and sequential probability ratio tests.
- Agrawal and Ramdas (2025). Tight lower and upper bounds on stopping times of power-one sequential tests.
- Kelly (1956). A new interpretation of information rate.