Tune Mode

Tune mode (--stochastic-tune) empirically profiles your stochastic tests to discover distributional parameters (primarily variance), then persists them to .stochastic.toml. On subsequent runs, the framework uses these discovered parameters to select tighter bounds and require fewer samples.

Quick Start

# Run tuning
pytest --stochastic-tune

# Normal test run now uses discovered parameters
pytest

How It Works

  1. Profiling: Each @stochastic_test function is called 50,000 times (configurable) to collect samples
  2. Variance UCB: A rigorous upper confidence bound on the true variance is computed using the chi-squared distribution
  3. Persistence: Results are written to .stochastic.toml in the project root
  4. Automatic loading: On subsequent runs, the decorator loads tuned parameters and adds variance_tuned to the declared properties, enabling the bernstein_tuned bound
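The profiling step can be sketched roughly as follows. This is illustrative only — the function name `profile_test` and the direct call loop are assumptions; the real plugin hooks into pytest's collection rather than invoking test functions by hand:

```python
import random
import statistics

def profile_test(test_fn, n_samples=50_000):
    """Call a stochastic test repeatedly and summarize its samples.

    Returns the raw ingredients recorded in .stochastic.toml; the
    variance here is the plain sample variance, before the UCB step.
    """
    samples = [test_fn() for _ in range(n_samples)]
    return {
        "n_tune_samples": n_samples,
        "observed_range": [min(samples), max(samples)],
        "sample_variance": statistics.variance(samples),
    }

# Example: profile a stand-in "test" that draws from Uniform(0, 1)
random.seed(0)
profile = profile_test(random.random, n_samples=1_000)
```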

The Variance UCB

The upper confidence bound ensures the discovered variance is conservative (larger than the true variance with high probability):

\[\hat{\sigma}^2_{\text{upper}} = \frac{(n-1) \cdot s^2}{\chi^2_{\alpha}(n-1)}\]

where \(s^2\) is the sample variance and \(\chi^2_{\alpha}(n-1)\) is the \(\alpha\)-quantile of the chi-squared distribution with \(n-1\) degrees of freedom. The default is \(\alpha = 10^{-4}\), so the UCB holds with probability at least \(1 - 10^{-4}\).
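The formula can be computed with only the standard library by approximating the chi-squared quantile via the Wilson-Hilferty transform (a sketch — the framework presumably uses an exact quantile, e.g. from SciPy):

```python
from statistics import NormalDist

def chi2_quantile(p, k):
    """Wilson-Hilferty approximation to the chi-squared p-quantile, k dof."""
    z = NormalDist().inv_cdf(p)
    return k * (1 - 2 / (9 * k) + z * (2 / (9 * k)) ** 0.5) ** 3

def variance_ucb(sample_variance, n, alpha=1e-4):
    """Upper confidence bound: (n-1) * s^2 / chi2_alpha(n-1)."""
    return (n - 1) * sample_variance / chi2_quantile(alpha, n - 1)

# With 50,000 samples the UCB sits only slightly above s^2
ucb = variance_ucb(sample_variance=0.0833, n=50_000)
```

Because \(\alpha\) is small, \(\chi^2_{\alpha}(n-1) < n-1\), so the UCB always inflates the sample variance.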

CLI Options

--stochastic-tune

Enable tune mode. Stochastic tests are profiled instead of being run normally.

pytest --stochastic-tune

--stochastic-tune-samples

Number of samples per test during tuning. Default: 50,000.

pytest --stochastic-tune --stochastic-tune-samples 100000

More samples produce a tighter variance UCB but take longer.
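The tightening effect can be quantified by the inflation factor \((n-1)/\chi^2_{\alpha}(n-1)\), the ratio of the UCB to the sample variance. A small illustration using the Wilson-Hilferty approximation to the chi-squared quantile (standard library only; not the framework's internal code):

```python
from statistics import NormalDist

def ucb_inflation(n, alpha=1e-4):
    """Factor by which the sample variance is inflated to form the UCB."""
    k = n - 1
    z = NormalDist().inv_cdf(alpha)
    chi2_q = k * (1 - 2 / (9 * k) + z * (2 / (9 * k)) ** 0.5) ** 3
    return k / chi2_q

# Inflation shrinks toward 1 as the tuning sample count grows
factors = {n: ucb_inflation(n) for n in (1_000, 50_000, 100_000)}
```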

The .stochastic.toml File

After tuning, the file looks like:

# Auto-generated by pytest-stochastic --stochastic-tune

[tests."tests.test_example.test_uniform_mean"]
n_tune_samples = 50000
observed_range = [0.000012, 0.999987]
tuned_at = "2026-02-22T15:30:00+00:00"
variance = 0.08345

Each entry records:

| Field          | Description                                 |
|----------------|---------------------------------------------|
| variance       | Upper confidence bound on the true variance |
| observed_range | [min, max] of observed samples              |
| n_tune_samples | Number of samples used for tuning           |
| tuned_at       | ISO 8601 timestamp of when tuning was run   |

Merging Behavior

Re-running --stochastic-tune merges results with existing data. Tests not included in the current run keep their previous tuned parameters.
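This merge amounts to a dictionary union in which the current run wins on conflicts. A minimal sketch (the real plugin also serializes the result back to .stochastic.toml):

```python
def merge_tuned(existing: dict, new_results: dict) -> dict:
    """Keep previously tuned tests; overwrite entries re-tuned this run."""
    merged = dict(existing)     # tests absent from this run keep old parameters
    merged.update(new_results)  # freshly tuned tests replace their old entries
    return merged

old = {"tests.a.test_x": {"variance": 0.08},
       "tests.a.test_y": {"variance": 0.02}}
new = {"tests.a.test_x": {"variance": 0.07}}  # only test_x was re-tuned
merged = merge_tuned(old, new)
```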

How Tuned Parameters Improve Tests

Consider a test with bounds=(0, 1) and no declared variance:

  • Without tuning: Hoeffding's bound is used. For atol=0.05 and failure_prob=1e-8, this requires \(n = 7{,}378\).
  • With tuning: If the true variance is \(\approx 0.083\), the tuned Bernstein bound might require only \(n \approx 2{,}500\) — a 66% reduction.

The key mechanism: tuning adds variance_tuned to the property set, enabling the bernstein_tuned bound, which combines the declared bounds with the empirically discovered variance.
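The sample-size gap can be illustrated with textbook forms of the two bounds. These use standard constants, which may differ from the framework's exact implementation, so the numbers need not match those quoted above; the qualitative gap is the point:

```python
import math

def hoeffding_n(atol, failure_prob, lo=0.0, hi=1.0):
    """Textbook two-sided Hoeffding sample size for a bounded mean."""
    return math.ceil((hi - lo) ** 2 * math.log(2 / failure_prob)
                     / (2 * atol ** 2))

def bernstein_n(atol, failure_prob, variance, lo=0.0, hi=1.0):
    """Textbook Bernstein-style sample size using a known variance bound."""
    log_term = math.log(2 / failure_prob)
    return math.ceil(2 * variance * log_term / atol ** 2
                     + 2 * (hi - lo) * log_term / (3 * atol))

n_hoeffding = hoeffding_n(atol=0.05, failure_prob=1e-8)
n_bernstein = bernstein_n(atol=0.05, failure_prob=1e-8, variance=0.083)
```

With variance well below the worst case of \((b-a)^2/4 = 0.25\), the Bernstein count is a fraction of the Hoeffding count.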

Workflow

Initial Setup

# 1. Write tests with bounds
# 2. Run tuning to discover variance
pytest --stochastic-tune

# 3. Commit .stochastic.toml
git add .stochastic.toml
git commit -m "Add tuned stochastic test parameters"

Periodic Re-tuning

Re-tune when:

  • You change the test function's implementation
  • You update the distribution being tested
  • You want tighter bounds after a code change

pytest --stochastic-tune
git diff .stochastic.toml  # Review changes
git add .stochastic.toml && git commit -m "Re-tune stochastic tests"

CI Integration

Run tuning locally or in a dedicated CI job, then commit the results. Normal CI runs use the committed .stochastic.toml automatically.

Test Key Matching

Tuned parameters are matched to tests by key. The tune mode stores keys derived from the pytest node ID (e.g., tests.test_example.test_uniform_mean). At load time, the framework tries several candidate key formats based on the function's module and qualified name to find a match.
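Candidate key generation might look like the following. This is illustrative only — the plugin's actual key formats are internal, and the helper names here are hypothetical:

```python
def candidate_keys(module: str, qualname: str):
    """Yield possible lookup keys, most specific first.

    `module` is the dotted module path (e.g. "tests.test_example") and
    `qualname` the function's qualified name (e.g. "test_uniform_mean",
    or "TestSuite.test_uniform_mean" for a method).
    """
    yield f"{module}.{qualname}"                  # full dotted path
    yield f"{module}.{qualname.split('.')[-1]}"   # drop any enclosing class
    yield qualname                                # bare qualified name

def lookup(tuned: dict, module: str, qualname: str):
    """Return the first tuned entry matching a candidate key, else None."""
    for key in candidate_keys(module, qualname):
        if key in tuned:
            return tuned[key]
    return None

tuned = {"tests.test_example.test_uniform_mean": {"variance": 0.083}}
entry = lookup(tuned, "tests.test_example", "test_uniform_mean")
```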