Runtime tuning
- Reuse `tts_config` hashes: `VoiceManager` caches rendered audio per text + config hash, so reusing configs dramatically reduces provider round-trips.
- Disable audio when needed: Set `audio=false` in the `config` message to use LiveKit data relaying without spinning up STT/TTS providers.
- Balance CPU vs. quality: `noise-filter` improves transcripts but costs CPU; toggle the feature flag when ultra-low latency is more important than denoising.
- Persist caches: Mount `CACHE_PATH` to a durable volume if you want previously synthesized clips and turn-detect assets to survive restarts.
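The caching behavior described in the first bullet can be pictured with a small sketch. This is not the actual `VoiceManager` code: the `TtsConfig` fields, `cache_key`, and `AudioCache` are hypothetical names chosen for illustration, and the sketch only assumes that the cache key is derived from the text plus a hash of the config.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a TTS configuration; the fields are illustrative.
#[derive(Hash, Clone, PartialEq, Eq)]
pub struct TtsConfig {
    pub provider: String,
    pub voice_id: String,
    pub sample_rate: u32,
}

/// Derive a cache key from the config and the text to synthesize.
/// A production cache would also guard against hash collisions.
pub fn cache_key(text: &str, config: &TtsConfig) -> u64 {
    let mut hasher = DefaultHasher::new();
    config.hash(&mut hasher);
    text.hash(&mut hasher);
    hasher.finish()
}

/// Minimal in-memory cache that counts how often the provider is called.
#[derive(Default)]
pub struct AudioCache {
    clips: HashMap<u64, Vec<u8>>,
    pub provider_calls: usize,
}

impl AudioCache {
    /// Return cached audio, or call `synthesize` once and store the result.
    pub fn get_or_synthesize<F>(&mut self, text: &str, config: &TtsConfig, synthesize: F) -> &[u8]
    where
        F: FnOnce(&str) -> Vec<u8>,
    {
        let key = cache_key(text, config);
        if !self.clips.contains_key(&key) {
            // Cache miss: this is the only path that reaches the provider.
            self.provider_calls += 1;
            let audio = synthesize(text);
            self.clips.insert(key, audio);
        }
        &self.clips[&key]
    }
}
```

Under these assumptions, two requests with the same text and an identical config hit the provider once, while changing any config field (for example the sample rate) produces a new key and a new round-trip.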
Observability & testing
- Use the integration tests in `tests/` (e.g., `tests/ws_tests.rs`) as references when extending message formats or LiveKit behavior.
- Emit application metrics around provider errors and queue depth to quickly detect throttling.
- Enable structured logging inside WebSocket handlers so you can correlate STT/TTS lifecycle events with LiveKit callbacks.
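As a rough illustration of the metrics bullet, the sketch below tracks provider errors and queue depth with in-process atomic counters. `ProviderMetrics` and its methods are hypothetical and not part of Sayna; a real deployment would export the same numbers through whatever metrics backend it already uses.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Hypothetical in-process counters for provider health.
#[derive(Default)]
pub struct ProviderMetrics {
    errors: AtomicU64,
    queue_depth: AtomicI64,
}

impl ProviderMetrics {
    /// Count one failed provider call.
    pub fn record_error(&self) {
        self.errors.fetch_add(1, Ordering::Relaxed);
    }

    /// A request entered the outbound queue.
    pub fn enqueue(&self) {
        self.queue_depth.fetch_add(1, Ordering::Relaxed);
    }

    /// A request left the outbound queue.
    pub fn dequeue(&self) {
        self.queue_depth.fetch_sub(1, Ordering::Relaxed);
    }

    /// Current (error count, queue depth) pair for export or logging.
    pub fn snapshot(&self) -> (u64, i64) {
        (
            self.errors.load(Ordering::Relaxed),
            self.queue_depth.load(Ordering::Relaxed),
        )
    }

    /// Simple heuristic: flag throttling when either value crosses its limit.
    pub fn looks_throttled(&self, max_errors: u64, max_depth: i64) -> bool {
        let (errors, depth) = self.snapshot();
        errors >= max_errors || depth >= max_depth
    }
}
```

The thresholds passed to `looks_throttled` are placeholders; suitable values depend on the provider's limits and your traffic.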
Release hygiene
- Regenerate the OpenAPI file with `cargo run --features openapi -- openapi -o docs/openapi.yaml` whenever you tweak request or response schemas.
- Keep the documentation and server version in lockstep; expose the git SHA or semver string via `/` so probes can verify deployments.
- Rotate provider credentials and authentication keys as part of your deployment pipelines; the Authentication guide documents the required env vars.
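One way to produce the version string mentioned above is to combine the crate version with a commit hash captured at build time. The sketch below is an assumption, not Sayna's implementation: `version_string` and `build_version` are hypothetical helpers, and `GIT_SHA` is a hypothetical environment variable that your CI or build script would need to set.

```rust
/// Combine a semver string with an optional git SHA (shortened to 7 characters).
pub fn version_string(semver: &str, git_sha: Option<&str>) -> String {
    match git_sha {
        Some(sha) if !sha.is_empty() => {
            let short: String = sha.chars().take(7).collect();
            format!("{}+{}", semver, short)
        }
        _ => semver.to_string(),
    }
}

/// Read both values at compile time. `CARGO_PKG_VERSION` is set by Cargo;
/// `GIT_SHA` is a hypothetical variable supplied by the build pipeline.
pub fn build_version() -> String {
    version_string(
        option_env!("CARGO_PKG_VERSION").unwrap_or("0.0.0"),
        option_env!("GIT_SHA"),
    )
}
```

A handler for the status route could return `build_version()` in its response body so that a deployment probe can compare it with the expected release.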
Operational playbook
Many outages boil down to LiveKit connectivity or provider rate limits. Monitor those dependencies explicitly and surface their status in your dashboards.
- Circuit breakers: Treat provider 5xx responses as signals to back off and fall back to cached content when possible.
- Traffic shaping: Consider per-tenant rate limits enforced inside your JWT auth service so Sayna sees predictable load.
- WebSocket health: Alert on connection churn, repeated `error` frames, or clients that never send `config`.
- LiveKit drift: Keep Sayna and LiveKit clocks in sync; recording requests rely on accurate timestamps.
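The circuit-breaker item in the playbook can be sketched as a small state machine. Everything here is illustrative: `CircuitBreaker`, `audio_or_cached`, the failure threshold, and the cooldown are hypothetical, and the provider call is modeled as a closure returning an HTTP status and a body.

```rust
use std::time::{Duration, Instant};

/// Opens after N consecutive 5xx responses and stays open for a cooldown.
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self { failure_threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    /// Requests are allowed while closed, or once the cooldown has elapsed.
    pub fn allow_request(&self, now: Instant) -> bool {
        match self.opened_at {
            Some(opened) => now.saturating_duration_since(opened) >= self.cooldown,
            None => true,
        }
    }

    /// Record a provider response; 5xx counts as a failure, anything else resets.
    pub fn record(&mut self, status: u16, now: Instant) {
        if (500..600).contains(&status) {
            self.consecutive_failures += 1;
            if self.consecutive_failures >= self.failure_threshold {
                self.opened_at = Some(now);
            }
        } else {
            self.consecutive_failures = 0;
            self.opened_at = None;
        }
    }
}

/// Call the provider if the breaker allows it; otherwise, or on a 5xx
/// response, fall back to the cached clip (if there is one).
pub fn audio_or_cached(
    breaker: &mut CircuitBreaker,
    now: Instant,
    cached: Option<Vec<u8>>,
    call_provider: impl FnOnce() -> (u16, Vec<u8>),
) -> Option<Vec<u8>> {
    if !breaker.allow_request(now) {
        return cached;
    }
    let (status, body) = call_provider();
    breaker.record(status, now);
    if (500..600).contains(&status) {
        cached
    } else {
        Some(body)
    }
}
```

Passing `now` explicitly keeps the breaker easy to test. After the cooldown, the first request acts as a probe: a success closes the breaker, while another 5xx reopens it immediately.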