Sayna exposes a compact set of HTTP endpoints that reuse the same providers and caches that power the streaming stack. Use the interactive playground inside the API reference for request-by-request schemas and try-it-out support.
Authentication is optional. Enable it through the Authentication guide if you need shared secrets or delegated JWT validation.
Want idiomatic clients? Install one of the Sayna SDKs to reuse the snippets shown below.
SDK examples assume you have already instantiated a client (see the SDK guides for setup) and are showing only the method call relevant to each endpoint.
## GET / – health check

- Purpose: returns `{"status":"ok"}` when the server and its dependencies are alive.
- Status codes: always `200 OK` when the Axum router is reachable.
- Usage: liveness/readiness probes and smoke tests after deployments.
```shell
curl https://api.sayna.ai/
# => {"status":"ok"}
```
```typescript
const health = await client.health();
console.log(health.status);
```
```python
from sayna_client import SaynaClient, STTConfig, TTSConfig

client = SaynaClient(
    url="https://api.sayna.ai",
    stt_config=STTConfig(provider="deepgram", model="nova-2"),
    tts_config=TTSConfig(provider="elevenlabs", voice_id="21m00Tcm4TlvDq8ikWAM"),
)

health = await client.health()
print(health.status)
```
## GET /voices – provider catalog

| Detail | Description |
| --- | --- |
| Request body | none |
| Authentication | Required when `AUTH_REQUIRED=true` |
| Success | `200 OK` with provider metadata and voice options |
| Failure | `401 Unauthorized` when auth fails, `500` for upstream provider errors |
Response schema mirrors each provider’s capabilities, including languages, sample rates, and optional tags. Use the payload to drive voice pickers inside your product.
```shell
curl https://api.sayna.ai/voices | jq '.voices[0]'
```
```typescript
const voices = await client.getVoices();
console.log(voices.deepgram?.[0]);
```
```python
voices = await client.get_voices()
print(next(iter(voices.values()))[0])
```
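The catalog payload can back a voice picker directly. A minimal sketch, assuming each voice entry carries `name` and `languages` fields (illustrative keys, not a guaranteed schema — check the `/voices` response for your providers):

```python
def voices_for_language(catalog: dict, language: str) -> list[str]:
    """Flatten a /voices-style payload into "provider/voice" labels
    that support the requested language.

    `catalog` maps provider slug -> list of voice dicts; the `name` and
    `languages` keys are assumptions for this sketch.
    """
    matches = []
    for provider, voices in catalog.items():
        for voice in voices:
            if language in voice.get("languages", []):
                matches.append(f"{provider}/{voice['name']}")
    return matches

# Mocked catalog shaped like the response described above
catalog = {
    "deepgram": [{"name": "aura-asteria-en", "languages": ["en"]}],
    "elevenlabs": [{"name": "Rachel", "languages": ["en", "de"]}],
}
print(voices_for_language(catalog, "de"))  # → ['elevenlabs/Rachel']
```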
## POST /speak – one-shot synthesis

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `text` | string | Yes | Text to convert into audio |
| `voice_id` | string | Conditional | Voice from `/voices`; provider default when omitted |
| `provider` | string | Conditional | Provider slug (e.g., `deepgram`, `elevenlabs`) |
| `voice_settings` | object | No | Provider-specific overrides |
- Request: JSON body plus optional auth header.
- Response: `200 OK` with an object that contains metadata and the rendered audio bytes (base64), plus `x-voice-id`, `x-sample-rate`, and caching headers.
- Errors: `400` when `text` is empty, `500` for provider failures or missing credentials.
```shell
curl -X POST https://api.sayna.ai/speak \
  -H "Content-Type: application/json" \
  -d '{"text": "Welcome to Sayna", "provider": "deepgram"}' \
  | jq '.metadata'
```
```typescript
const audioBuffer = await client.speakRest("Welcome to Sayna", {
  provider: "deepgram",
  model: "aura-asteria-en",
  voice_id: "aura-asteria-en",
});
console.log(`Received ${audioBuffer.byteLength} bytes`);
```
```python
audio_bytes, headers = await client.speak_rest(
    "Welcome to Sayna",
    TTSConfig(provider="deepgram", voice_id="aura-asteria-en"),
)
print(len(audio_bytes), "bytes of", headers.get("Content-Type"))
```
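If you call the endpoint directly instead of through an SDK, the base64 audio field has to be decoded before playback. A minimal sketch, assuming the JSON body exposes `metadata` and `audio` fields (field names are illustrative; see the API reference for the exact schema):

```python
import base64
import json

def decode_speak_response(body: str) -> tuple[dict, bytes]:
    """Split a /speak-style JSON body into metadata and raw audio bytes.

    The `metadata` and `audio` field names are assumptions for this sketch.
    """
    payload = json.loads(body)
    audio = base64.b64decode(payload["audio"])
    return payload.get("metadata", {}), audio

# Simulated response body with a tiny fake audio payload
body = json.dumps({
    "metadata": {"voice_id": "aura-asteria-en", "sample_rate": 24000},
    "audio": base64.b64encode(b"\x00\x01\x02").decode(),
})
metadata, audio = decode_speak_response(body)
print(metadata["sample_rate"], len(audio))  # → 24000 3
```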
## POST /livekit/token – participant tokens

| Field | Type | Description |
| --- | --- | --- |
| `room_name` | string | Room to join or create |
| `participant_name` | string | Display name shown inside LiveKit |
| `participant_identity` | string | Stable identity string for permissions |
Returns:

| Field | Type | Description |
| --- | --- | --- |
| `token` | string | Signed LiveKit JWT for the participant |
| `room_name` | string | Echo of the requested room |
| `participant_identity` | string | Echo of the requested identity |
| `livekit_url` | string | Client-facing LiveKit URL from config |
- Errors: `400` when any field is empty, `500` when LiveKit credentials are misconfigured.
- Typical flow: once a WebSocket session advertises LiveKit settings in the `config` message, call this REST endpoint from your control plane to mint attendee tokens.
```shell
curl -X POST https://api.sayna.ai/livekit/token \
  -H "Content-Type: application/json" \
  -d '{
    "room_name": "support-42",
    "participant_name": "alex",
    "participant_identity": "alex-support"
  }'
```
```typescript
const tokenInfo = await client.getLiveKitToken(
  "support-42",
  "alex",
  "alex-support"
);
console.log(tokenInfo.token, tokenInfo.livekit_url);
```
```python
token = await client.get_livekit_token("support-42", "alex", "alex-support")
print(token.token, token.livekit_url)
```
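For debugging, the returned token can be inspected without verification: a LiveKit token is a standard JWT, so its payload segment is base64url-encoded JSON. A minimal sketch (inspection only — never skip signature verification when authorizing anything):

```python
import base64
import json

def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # Restore the padding that JWTs strip from base64url segments
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hand-built example token (header.payload.signature); claim names are
# illustrative rather than the exact grants Sayna emits
claims = {"sub": "alex-support", "video": {"room": "support-42"}}
fake = ".".join([
    base64.urlsafe_b64encode(b'{"alg":"HS256"}').rstrip(b"=").decode(),
    base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode(),
    "sig",
])
print(jwt_payload(fake)["sub"])  # → alex-support
```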
## GET /recording/{stream_id} – download session audio

| Detail | Description |
| --- | --- |
| Path param | `stream_id` from the WebSocket `ready` message (or the value you provided). Empty values, `..`, or `/` are rejected. |
| Authentication | Required when `AUTH_REQUIRED=true` |
| Success | `200 OK` with an `audio/ogg` body and `Content-Disposition: attachment; filename="{stream_id}.ogg"` |
| Failure | `400` invalid `stream_id`, `404` recording missing, `503` storage not configured or unavailable |
Recording files live at `{recording_s3_prefix}/{stream_id}/audio.ogg` when `livekit.enable_recording=true` and storage credentials are set.
```shell
# Download and save as <stream_id>.ogg
curl -OJ https://api.sayna.ai/recording/support-call-789
```
```typescript
import { promises as fs } from "fs";

const streamId = client.streamId; // from ready()
const audio = await client.getRecording(streamId!);
await fs.writeFile(`${streamId}.ogg`, Buffer.from(audio));
```
```python
stream_id = client.received_stream_id
audio_bytes, headers = await client.get_recording(stream_id)
with open(f"{stream_id}.ogg", "wb") as f:
    f.write(audio_bytes)
```
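Because recordings are finalized when the session ends, a download attempted immediately after hangup can still 404. A small retry sketch; the `fetch` callable stands in for whichever client call you use, and `FileNotFoundError` stands in for a 404 response:

```python
import time

def fetch_with_retry(fetch, attempts: int = 5, delay: float = 2.0):
    """Call `fetch()` until it succeeds, retrying while the recording
    is still being written (signalled here by FileNotFoundError)."""
    for attempt in range(attempts):
        try:
            return fetch()
        except FileNotFoundError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# Stubbed fetch that fails twice before the recording "appears"
calls = {"n": 0}
def stub():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FileNotFoundError
    return b"OggS..."

print(fetch_with_retry(stub, delay=0))  # → b'OggS...'
```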
## Cache-aware behavior

- `/speak` reuses cached synthesis when both the text and the `tts_config` hash match a previous request and caching is enabled.
- Cache assets live under `CACHE_PATH`; mount a persistent volume in production if you want to avoid cold starts.
- Clear server caches when rotating provider credentials or when you change voice defaults that affect TTS hashes.
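To build intuition for why changed voice defaults invalidate cache entries: a stable hash over the text plus the serialized config means any config change produces a new key. A hypothetical sketch — Sayna's actual hashing scheme may differ:

```python
import hashlib
import json

def cache_key(text: str, tts_config: dict) -> str:
    """Derive a deterministic cache key from text + TTS config.

    Illustrative only: sorting config keys makes equivalent configs
    hash identically regardless of insertion order.
    """
    blob = json.dumps({"text": text, "config": tts_config}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

a = cache_key("Welcome to Sayna", {"provider": "deepgram", "voice_id": "aura-asteria-en"})
b = cache_key("Welcome to Sayna", {"voice_id": "aura-asteria-en", "provider": "deepgram"})
print(a == b)  # → True: key order doesn't matter, but any value change would
```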