Phase 9: Advanced Caching

Goal

Bring bext's caching to parity with nginx's proxy_cache — per-status-code TTLs, Vary-aware cache keys, negative caching, cache purge API, and background revalidation. Build on top of the existing ISR cache and stampede guard.

Current State

  • ISR cache: in-memory LRU, max_entries 10K, default TTL 60s, SWR 60m
  • Stampede guard: coalescing lock prevents thundering herd
  • Redis L2: optional distributed cache with Pub/Sub invalidation
  • Tag-based invalidation: bulk purge by tag
  • Fragment cache: per-island HTML caching
  • Tenant cache: per-tenant scoped keys
  • No per-status-code TTL
  • No Vary header awareness in cache key
  • No negative caching (404s, 500s not cached)
  • No background revalidation pool
  • No cache purge HTTP API (only CLI bext cache purge)
  • No conditional request support (If-Modified-Since, If-None-Match for upstream)

Design

1. Per-Status-Code TTL

Different TTLs for different response statuses:

[cache]
default_ttl_ms = 60000           # 2xx responses (and 3xx without an override)

[cache.ttl_by_status]
301 = 86400000                   # Permanent redirects: 24h
302 = 60000                      # Temporary redirects: 1m
404 = 30000                      # Not found: 30s (negative cache)
500 = 0                          # Never cache server errors (default)

For proxy responses, cache based on upstream status code:

fn cache_ttl(status: StatusCode, config: &CacheConfig) -> Option<Duration> {
    // Check specific status override
    if let Some(ttl) = config.ttl_by_status.get(&status.as_u16()) {
        if *ttl == 0 { return None; }  // Explicit no-cache
        return Some(Duration::from_millis(*ttl));
    }

    // Default: only cache 2xx and 3xx
    if status.is_success() || status.is_redirection() {
        Some(Duration::from_millis(config.default_ttl_ms))
    } else {
        None
    }
}

2. Vary-Aware Cache Keys

Respect the Vary header to cache different variants of the same URL:

GET /products HTTP/2
Accept-Encoding: gzip
Accept-Language: fr

→ Cache key: GET:/products:accept-encoding=gzip:accept-language=fr
fn build_cache_key(req: &BextRequest, vary_headers: &[String]) -> String {
    let mut key = format!("{}:{}", req.method, req.uri.path());

    // Include query string
    if let Some(query) = req.uri.query() {
        key.push('?');
        key.push_str(query);
    }

    // Include Vary headers (sorted for consistency)
    let mut vary_parts: Vec<String> = vary_headers.iter()
        .filter_map(|h| req.headers.get(h).map(|v| format!("{}={}", h, v.to_str().unwrap_or(""))))
        .collect();
    vary_parts.sort();

    for part in vary_parts {
        key.push(':');
        key.push_str(&part);
    }

    key
}

The Vary header is stored alongside the cached response. On cache lookup, if the stored Vary headers don't match the current request, it's a cache miss.
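
A minimal sketch of that lookup check, with a hypothetical `CachedResponse` shape and plain `HashMap` headers standing in for the real entry and request types:

```rust
use std::collections::HashMap;

/// Hypothetical cached entry: body plus the Vary header names and the
/// request-header values that were in effect when the response was stored.
struct CachedResponse {
    body: Vec<u8>,
    vary: Vec<String>,                    // e.g. ["accept-encoding"]
    vary_values: HashMap<String, String>, // header name -> value at store time
}

/// A stored entry only matches if every Vary'd header has the same value
/// on the current request as it had when the response was cached.
fn vary_matches(entry: &CachedResponse, req_headers: &HashMap<String, String>) -> bool {
    entry.vary.iter().all(|h| {
        let stored = entry.vary_values.get(h).map(String::as_str).unwrap_or("");
        let current = req_headers.get(h).map(String::as_str).unwrap_or("");
        stored == current
    })
}

fn main() {
    let entry = CachedResponse {
        body: b"<html>".to_vec(),
        vary: vec!["accept-encoding".into()],
        vary_values: HashMap::from([("accept-encoding".into(), "gzip".into())]),
    };
    // Same URL, different encoding -> treated as a cache miss.
    let req = HashMap::from([("accept-encoding".to_string(), "br".to_string())]);
    assert!(!vary_matches(&entry, &req));
}
```

Missing headers on both sides compare as empty strings, so a request without the header still matches an entry stored without it.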

3. Negative Caching

Cache error responses to prevent thundering herd on broken pages:

[cache.negative]
enabled = true
404_ttl_ms = 30000               # Cache 404s for 30 seconds
429_ttl_ms = 5000                # Cache rate-limited for 5 seconds
502_ttl_ms = 10000               # Cache bad gateway for 10 seconds
503_ttl_ms = 5000                # Cache unavailable for 5 seconds

Negative cache entries are stored separately and have:

  • Short TTLs (to quickly recover when the issue is fixed)
  • No SWR (stale error responses are never served)
  • Tag __negative for bulk purging
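
A sketch of how a negative entry might be stored and looked up, assuming a hypothetical `NegativeEntry` shape (the real entry type is not specified above). Expired entries are treated as misses, never served stale:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical negative-cache entry: short TTL, no SWR window, tagged
/// `__negative` so a tag purge can clear all error entries at once.
struct NegativeEntry {
    status: u16,
    expires_at: Instant,
    tags: Vec<String>,
}

fn store_negative(cache: &mut HashMap<String, NegativeEntry>, key: &str, status: u16, ttl: Duration) {
    cache.insert(key.to_string(), NegativeEntry {
        status,
        expires_at: Instant::now() + ttl,
        tags: vec!["__negative".to_string()],
    });
}

/// Expired negative entries are misses: stale errors are never served.
fn lookup_negative(cache: &HashMap<String, NegativeEntry>, key: &str) -> Option<u16> {
    cache.get(key)
        .filter(|e| e.expires_at > Instant::now())
        .map(|e| e.status)
}

fn main() {
    let mut cache = HashMap::new();
    store_negative(&mut cache, "GET:/missing", 404, Duration::from_millis(30_000));
    assert_eq!(lookup_negative(&cache, "GET:/missing"), Some(404));
}
```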

4. Background Revalidation

Decouple stale-while-revalidate from the request path:

Request → Cache HIT (stale) → serve stale immediately
                             → queue background revalidation
                             → revalidation pool fetches fresh
                             → update cache asynchronously
[cache.revalidation]
pool_size = 4                    # Background revalidation workers
max_queue_size = 1000            # Pending revalidation queue
dedup = true                     # Don't revalidate same URL twice concurrently
struct RevalidationPool {
    queue: mpsc::Sender<RevalidationRequest>,
    in_flight: DashSet<String>,  // Dedup by cache key
}

impl RevalidationPool {
    async fn enqueue(&self, key: String, url: String) {
        // DashSet::insert returns false if the key was already present, so
        // check-and-insert is a single atomic step (a separate contains()
        // check would race with concurrent enqueues).
        if !self.in_flight.insert(key.clone()) {
            return;  // Already being revalidated
        }
        // send() only fails if the worker side has shut down; drop the request.
        let _ = self.queue.send(RevalidationRequest { key, url }).await;
    }
    // Workers remove the key from in_flight once revalidation completes.
}
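
The same dedup logic can be sketched with std-only types (`std::sync::mpsc` and a `Mutex<HashSet>` standing in for the Tokio channel and `DashSet`), including the worker side that clears the in-flight marker:

```rust
use std::collections::HashSet;
use std::sync::{mpsc, Mutex};

/// Hypothetical queued unit of work (mirrors the struct above).
struct RevalidationRequest { key: String, url: String }

/// Dedup'd enqueue: returns true if queued, false if the key is already in
/// flight. HashSet::insert returns false for duplicates, so check-and-insert
/// is one step under the lock.
fn enqueue(
    tx: &mpsc::Sender<RevalidationRequest>,
    in_flight: &Mutex<HashSet<String>>,
    key: &str,
    url: &str,
) -> bool {
    if !in_flight.lock().unwrap().insert(key.to_string()) {
        return false; // already being revalidated
    }
    tx.send(RevalidationRequest { key: key.into(), url: url.into() }).is_ok()
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let in_flight = Mutex::new(HashSet::new());
    assert!(enqueue(&tx, &in_flight, "GET:/products", "http://upstream/products"));
    assert!(!enqueue(&tx, &in_flight, "GET:/products", "http://upstream/products")); // deduped
    drop(tx); // close the channel so the drain loop below terminates

    // Worker loop: fetch fresh content, update the cache, then clear the
    // in-flight marker so the URL can be revalidated again later.
    let mut drained = 0;
    for req in rx {
        let _ = &req.url; // a real worker would fetch this URL here
        in_flight.lock().unwrap().remove(&req.key);
        drained += 1;
    }
    assert_eq!(drained, 1);
}
```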

5. Cache Purge API

HTTP endpoints for cache management (authenticated):

POST /__bext/cache/purge
  { "pattern": "/products/*" }              # Glob pattern

POST /__bext/cache/purge
  { "tags": ["product", "category:shoes"] } # Tag-based

POST /__bext/cache/purge
  { "url": "/products/123" }                # Exact URL

POST /__bext/cache/purge-all                # Nuclear option

GET  /__bext/cache/stats                    # Hit rate, entry count, memory usage
GET  /__bext/cache/inspect?url=/products/123  # Show cache entry details

All purge endpoints require authentication (Phase 1 of SC-3: bearer token) or are restricted to localhost.
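
Pattern purge could be backed by a small glob matcher; this sketch treats `*` as "any run of characters" (a real implementation would likely use a glob crate and distinguish `*` from `**`):

```rust
use std::collections::HashMap;

/// Minimal glob matcher for purge patterns: `*` matches any run of
/// characters. Sketch only; no escaping or `**` handling.
fn glob_match(pattern: &str, path: &str) -> bool {
    fn go(p: &[char], s: &[char]) -> bool {
        match (p.first().copied(), s.first().copied()) {
            (None, None) => true,
            // `*` either matches nothing (advance pattern) or consumes one
            // character and tries again.
            (Some('*'), _) => go(&p[1..], s) || (!s.is_empty() && go(p, &s[1..])),
            (Some(pc), Some(sc)) if pc == sc => go(&p[1..], &s[1..]),
            _ => false,
        }
    }
    let (p, s): (Vec<char>, Vec<char>) = (pattern.chars().collect(), path.chars().collect());
    go(&p, &s)
}

/// Remove all entries whose key matches the pattern; returns purge count.
fn purge_by_pattern(cache: &mut HashMap<String, Vec<u8>>, pattern: &str) -> usize {
    let keys: Vec<String> = cache.keys()
        .filter(|k| glob_match(pattern, k.as_str()))
        .cloned()
        .collect();
    for k in &keys { cache.remove(k); }
    keys.len()
}

fn main() {
    let mut cache = HashMap::new();
    cache.insert("/products/123".to_string(), vec![]);
    cache.insert("/products/456".to_string(), vec![]);
    cache.insert("/about".to_string(), vec![]);
    assert_eq!(purge_by_pattern(&mut cache, "/products/*"), 2);
    assert_eq!(cache.len(), 1);
}
```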

6. Conditional Requests to Upstream

When revalidating cached proxy responses, use conditional requests:

If cache entry has ETag "abc123":
  GET /api/products HTTP/1.1
  If-None-Match: "abc123"

  → 304 Not Modified (upstream says it hasn't changed)
  → Refresh cache TTL without downloading body

If cache entry has Last-Modified:
  GET /api/products HTTP/1.1
  If-Modified-Since: Wed, 29 Mar 2026 10:00:00 GMT

  → 304 Not Modified

For responses that haven't changed, this cuts revalidation bandwidth by 90%+ since only headers cross the wire.
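
A sketch of building those conditional headers from hypothetical cached validators (`Validators` is an assumed shape, not an existing type):

```rust
/// Hypothetical validator metadata kept alongside a cached proxy response.
struct Validators {
    etag: Option<String>,          // e.g. "\"abc123\""
    last_modified: Option<String>, // e.g. "Wed, 29 Mar 2026 10:00:00 GMT"
}

/// Build conditional request headers for revalidation. If the upstream
/// answers 304 Not Modified, the entry's TTL is refreshed without
/// re-downloading the body; any other status replaces the cached response.
fn conditional_headers(v: &Validators) -> Vec<(&'static str, String)> {
    let mut headers = Vec::new();
    if let Some(etag) = &v.etag {
        headers.push(("If-None-Match", etag.clone()));
    }
    if let Some(lm) = &v.last_modified {
        headers.push(("If-Modified-Since", lm.clone()));
    }
    headers
}

fn main() {
    let v = Validators { etag: Some("\"abc123\"".into()), last_modified: None };
    assert_eq!(
        conditional_headers(&v),
        vec![("If-None-Match", "\"abc123\"".to_string())]
    );
}
```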

7. Cache Key Customization

Per-route cache key configuration:

[[route_rules]]
pattern = "/api/products/**"
cache = true

[route_rules.cache_key]
include_query = true              # Include query string (default: true)
include_headers = ["Accept-Language"]  # Additional headers in key
ignore_query_params = ["utm_source", "utm_medium", "fbclid"]  # Strip tracking params
include_cookies = ["locale"]      # Include specific cookies in key
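
The `ignore_query_params` handling might look like this sketch (sorting the remaining params is an extra normalization assumption, not stated above, so `?a=1&b=2` and `?b=2&a=1` share one entry):

```rust
/// Strip ignored tracking parameters from a query string before it goes
/// into the cache key, and sort the remainder so parameter order doesn't
/// create duplicate cache entries.
fn normalize_query(query: &str, ignore: &[&str]) -> String {
    let mut parts: Vec<&str> = query
        .split('&')
        .filter(|p| {
            // Compare on the parameter name (everything before '=').
            let name = p.split('=').next().unwrap_or("");
            !ignore.contains(&name)
        })
        .collect();
    parts.sort();
    parts.join("&")
}

fn main() {
    let q = "utm_source=x&page=2&fbclid=abc&sort=price";
    assert_eq!(normalize_query(q, &["utm_source", "fbclid"]), "page=2&sort=price");
}
```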

8. Tiered Caching

When Redis L2 is enabled, implement a proper tiered cache:

L1 (in-memory) → L2 (Redis) → Origin (render/proxy)

Read:
  Check L1 → HIT? return
  Check L2 → HIT? promote to L1, return
  Render/proxy → store in L1 + L2

Write-through:
  On L1 store → async write to L2
  On L2 invalidation → broadcast to all instances → evict L1

Config Reference

[cache]
enabled = true
default_ttl_ms = 60000
max_entries = 10000
swr_ms = 3600000                  # Stale-while-revalidate window

[cache.ttl_by_status]
301 = 86400000
404 = 30000
502 = 10000

[cache.negative]
enabled = true

[cache.revalidation]
pool_size = 4
max_queue_size = 1000
dedup = true

[cache.redis]
enabled = true
url = "redis://localhost:6379"
key_prefix = "bext:"
write_through = true

# Per-route cache config (in route_rules)
[[route_rules]]
pattern = "/api/**"
cache = true
cache_ttl_ms = 30000

[route_rules.cache_key]
include_query = true
ignore_query_params = ["utm_source", "fbclid"]
include_headers = ["Accept-Language"]

Testing Plan

| Test | Type | What it validates |
|---|---|---|
| Per-status TTL | Unit | 200, 301, 404 cached with different TTLs |
| No cache for 500 | Unit | 500 not cached (default) |
| Vary-aware keys | Unit | Different Accept-Encoding → different cache entries |
| Vary header storage | Unit | Stored Vary checked on lookup |
| Negative cache 404 | Unit | 404 cached, expires after short TTL |
| Background revalidation | Integration | Stale response served, fresh fetched async |
| Revalidation dedup | Unit | Same URL not revalidated twice |
| Conditional 304 | Integration | ETag/Last-Modified → 304, cache refreshed |
| Purge by pattern | Unit | Glob pattern removes matching entries |
| Purge by tag | Unit | Tag purge removes tagged entries |
| Purge by URL | Unit | Exact URL removed |
| Cache key customization | Unit | Query params, headers, cookies in key |
| Ignore tracking params | Unit | utm_source stripped from cache key |
| Tiered L1→L2 promotion | Integration | L2 hit promoted to L1 |
| Write-through to L2 | Integration | L1 store triggers async L2 write |
| Cache stats endpoint | Unit | Hit/miss/stale counts returned |
| Cache inspect endpoint | Unit | Entry details (TTL, tags, headers) returned |
| Purge API auth | Unit | Unauthenticated purge rejected |

Done Criteria

  • Per-status-code TTL configuration
  • Vary-aware cache keys (store and check Vary header)
  • Negative caching (404, 502, 503 with short TTLs)
  • Background revalidation pool with dedup
  • Conditional requests (If-None-Match, If-Modified-Since) to upstream
  • Cache purge HTTP API (by pattern, tag, URL, all)
  • Cache inspect endpoint (show entry details)
  • Cache key customization per route (query params, headers, cookies)
  • Tracking param stripping (utm_*, fbclid)
  • Tiered L1→L2 with write-through and promotion
  • All tests passing