Performance
bext ships with a built-in performance measurement and regression prevention system. Every PR is benchmarked, every metric has a budget, and regressions are caught before they reach production.
Budget Framework
Sites declare a [perf.budget] table in bext.config.toml to set per-metric
thresholds. When a workload run exceeds a budget, the regression gate fails
the build.
[perf.budget]
p50_ms = 8
p99_ms = 25
rss_mb = 128
[perf.budget.features]
waf = { p50_add_ms = 1.5 }
compression = { p50_add_ms = 0.5 }
Budgets are enforced per-workload. If a workload does not declare a budget for a metric, the global budget applies.
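The fallback rule can be sketched in a few lines of Python. This is only an illustration of the lookup order described above, not bext's implementation: the dictionary shapes, the function names effective_budget and over_budget, and the per-workload override are all hypothetical.

```python
from typing import Dict, Optional

# Global budget, mirroring the [perf.budget] table above.
GLOBAL_BUDGET: Dict[str, float] = {"p50_ms": 8, "p99_ms": 25, "rss_mb": 128}


def effective_budget(metric: str,
                     workload_budget: Dict[str, float],
                     global_budget: Dict[str, float] = GLOBAL_BUDGET) -> Optional[float]:
    """Return the workload's own budget for a metric, else the global one."""
    if metric in workload_budget:
        return workload_budget[metric]
    return global_budget.get(metric)  # None when no budget exists at all


def over_budget(results: Dict[str, float],
                workload_budget: Dict[str, float]) -> Dict[str, float]:
    """Map each metric that exceeds its effective budget to that budget."""
    violations = {}
    for metric, value in results.items():
        limit = effective_budget(metric, workload_budget)
        if limit is not None and value > limit:
            violations[metric] = limit
    return violations


# Hypothetical workload that overrides only p99_ms (values are invented).
ssr_budget = {"p99_ms": 20}
print(over_budget({"p50_ms": 7.8, "p99_ms": 22.1, "rss_mb": 96}, ssr_budget))
```

Here p50_ms and rss_mb are checked against the global values, while p99_ms is checked against the workload's own limit of 20.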
History Store
bext bench history <workload> shows the timeline of a metric across
commits. The history store persists results to .bext/bench-history/ by
default, with optional S3 upload for team-wide visibility.
$ bext bench history blog_read_heavy --metric p99_ms --last 20
commit   date        p99_ms
a1b2c3d  2026-04-01  12.3
d4e5f6g  2026-04-02  12.1
h7i8j9k  2026-04-03  14.8  << regression
...
Use bext bench blame <workload> --metric p99_ms to identify the commit
that introduced a regression. The command runs a binary search over the
history.
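For intuition, a binary search over an ordered history might look like the sketch below. It assumes that every commit before the regression is good and every commit from the regression onward is bad. The helper first_bad_index and the 13.0 ms cut-off are hypothetical and are not taken from bext.

```python
from typing import Callable


def first_bad_index(n: int, is_bad: Callable[[int], bool]) -> int:
    """Return the smallest index i in [0, n) where is_bad(i) is true.

    Assumes entries are good up to the regression and bad from it onward.
    Returns n if no entry is bad.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid        # the regression is at mid or earlier
        else:
            lo = mid + 1    # the regression is after mid
    return lo


# History rows from the sample output above: (commit, p99_ms).
history = [("a1b2c3d", 12.3), ("d4e5f6g", 12.1), ("h7i8j9k", 14.8)]
THRESHOLD_MS = 13.0  # illustrative cut-off, not a bext default

idx = first_bad_index(len(history), lambda i: history[i][1] > THRESHOLD_MS)
print(history[idx][0])  # h7i8j9k
```

The search needs about log2(n) probes, which matters when each probe means re-running a workload rather than reading a stored value.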
Regression Gate
CI runs perf-gate on every PR. The gate compares the PR's workload results
against the baseline from the main branch and fails if any metric regresses
beyond the MAD (median absolute deviation) band.
The gate uses a statistical model rather than a fixed threshold so that
natural variance does not cause false positives. A regression must exceed
2x the MAD of the last 10 baseline runs to trigger a failure
(MAD_MULTIPLIER = 2.0 in bext-bench-history/src/band.rs).
$ perf-gate --baseline main --compare HEAD
PASS  blog_read_heavy  p50_ms  7.8 vs 8.1    (+3.8%, within 2x MAD)
FAIL  ssr_dynamic      p99_ms  18.4 vs 22.1  (+20.1%, exceeds 2x MAD of 1.2)
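The band check can be sketched as follows. This is a minimal Python illustration, not the code in band.rs. It assumes the candidate is compared against the median of the baseline window and that only increases count as regressions (lower-is-better metrics). The function names and the sample baseline values are invented.

```python
from statistics import median
from typing import Sequence

MAD_MULTIPLIER = 2.0  # multiplier quoted above
WINDOW = 10           # number of most recent baseline runs considered


def mad(values: Sequence[float]) -> float:
    """Median absolute deviation: median of |x - median(values)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def is_regression(baseline_runs: Sequence[float], candidate: float) -> bool:
    """True if candidate exceeds the baseline median by more than the band."""
    window = list(baseline_runs)[-WINDOW:]
    band = MAD_MULTIPLIER * mad(window)
    return candidate - median(window) > band


# Hypothetical p99_ms values from ten baseline runs: median 10, MAD 1.
baseline = [10, 10, 11, 9, 10, 12, 8, 10, 11, 9]

print(is_regression(baseline, 11.5))  # False: +1.5 is inside the 2.0 band
print(is_regression(baseline, 12.5))  # True: +2.5 is outside the 2.0 band
```

Because both the center and the spread are medians, one unusually slow baseline run moves the band far less than it would move a mean and standard deviation.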
Per-Feature Cost
bext bench cost shows the cost table: how much each compiled feature adds
to key metrics. This is measured by building with and without each feature
and differencing the workload results.
$ bext bench cost
feature         p50_add_ms  rss_add_mb  binary_add_kb
waf             +1.2        +3.4        +820
tls             +0.1        +1.8        +1200
react-compiler  +0.0        +0.5        +340
v8              +0.3        +12.1       +4800
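The differencing step itself is simple arithmetic. The Python sketch below assumes each build produces a flat mapping of metric name to value. The feature_cost helper and the raw numbers are hypothetical, chosen only so that the differences match the waf row above.

```python
from typing import Dict

Metrics = Dict[str, float]


def feature_cost(with_feature: Metrics, without_feature: Metrics) -> Metrics:
    """Difference two workload results metric by metric (with minus without)."""
    return {
        metric: round(with_feature[metric] - without_feature[metric], 3)
        for metric in with_feature
        if metric in without_feature  # skip metrics missing from either run
    }


# Invented raw results for a build without and with the waf feature.
without_waf = {"p50_ms": 6.6, "rss_mb": 92.0, "binary_kb": 14200}
with_waf = {"p50_ms": 7.8, "rss_mb": 95.4, "binary_kb": 15020}

print(feature_cost(with_waf, without_waf))
# {'p50_ms': 1.2, 'rss_mb': 3.4, 'binary_kb': 820}
```

A single pair of runs is noisy, so in practice the inputs would more plausibly be medians over several runs of each build.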
Interpreting Failures
When the gate fails, the error shows:
1. Which workload regressed (e.g. ssr_dynamic).
2. Which metric exceeded the band (e.g. p99_ms).
3. The delta between the PR and the baseline, expressed as both an
absolute value and a percentage.
4. The MAD band so you can see how far outside normal variance the
result falls.
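To make the four fields concrete, here is a small Python sketch that derives them from the FAIL line in the sample gate output (baseline 18.4, PR result 22.1, MAD 1.2). The GateFailure class and its message format are hypothetical and do not reproduce bext's actual error text.

```python
from dataclasses import dataclass


@dataclass
class GateFailure:
    workload: str
    metric: str
    baseline: float
    candidate: float
    mad: float
    multiplier: float = 2.0  # MAD multiplier quoted in the gate description

    @property
    def delta(self) -> float:
        """Absolute change of the PR result relative to the baseline."""
        return self.candidate - self.baseline

    @property
    def delta_pct(self) -> float:
        """The same change as a percentage of the baseline."""
        return 100.0 * self.delta / self.baseline

    @property
    def band(self) -> float:
        """Width of the allowed band: multiplier times MAD."""
        return self.multiplier * self.mad

    def describe(self) -> str:
        return (f"{self.workload} {self.metric}: {self.delta:+.1f} "
                f"({self.delta_pct:+.1f}%) vs baseline {self.baseline}; "
                f"band is {self.band:.1f} ({self.multiplier:g}x MAD of {self.mad})")


failure = GateFailure("ssr_dynamic", "p99_ms", baseline=18.4, candidate=22.1, mad=1.2)
print(failure.describe())
```

Comparing delta with band shows how far outside normal variance the result falls: here the change of about 3.7 ms is roughly one and a half times the 2.4 ms band.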
Common causes of regressions:
- Adding a synchronous operation to the request hot path.
- A dependency update that increases binary size or startup time.
- A cache configuration change that lowers hit rates.
- An accidental clone() or allocation in a per-request path.
Run bext bench compare <workload> --a <base> --b <head> locally to
reproduce the regression and iterate on fixes before pushing again.
The gate uses a statistical MAD band, not a fixed threshold, so minor noise does not trigger failures. A result must exceed 2x the MAD of the last 10 baseline runs; single-run outliers are ignored.
The feature table from bext bench cost shows per-feature overhead, measured by building with and without each feature. If you want to understand why enabling v8 adds RSS, or whether the WAF is the bottleneck, start there before tuning budgets.
Related
- Monitoring — Prometheus metrics for live P50/P95/P99 latency
- Troubleshooting — high CPU, transform pipeline, and image cache issues
- Architecture: V8 Render Pool — render worker pool topology that drives SSR latency
- Guides: Caching — TTL and stale-while-revalidate mechanics that affect cache hit rates
- Configuration: Build Flags — compile-time feature flags reflected in the cost table