Sayna exposes a compact set of HTTP endpoints that reuse the same providers and caches that power the streaming stack. Use the interactive playground inside the API reference for request-by-request schemas and try-it-out support. If you need tenant-scoped room management (listing rooms, inspecting participants, or moderating LiveKit sessions), see the Livekit Room Management guide.
Authentication is optional. Enable it through the Authentication guide if you need shared secrets or delegated JWT validation.
Want idiomatic clients? Install one of the Sayna SDKs to reuse the snippets shown below.
SDK examples assume you have already instantiated a client (see the SDK guides for setup) and are showing only the method call relevant to each endpoint.

GET / – health check

  • Purpose: returns { "status": "ok" } when the server and dependencies are alive.
  • Status codes: always 200 OK when the Axum router is reachable.
  • Usage: liveness/readiness probes and smoke tests after deployments.
curl https://api.sayna.ai/
# => {"status":"ok"}
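A probe script can apply the same check as the curl call above. The following is a minimal sketch using only the Python standard library; the helper names `is_healthy` and `probe` and the 2-second timeout are illustrative assumptions, not part of Sayna.

```python
import json
import urllib.request

BASE_URL = "https://api.sayna.ai"  # host taken from the curl examples


def is_healthy(status_code: int, body: bytes) -> bool:
    """Return True only for a 200 response whose JSON body has status "ok"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def probe(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Call GET / and treat network errors or timeouts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/", timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read())
    except OSError:  # URLError and socket timeouts both derive from OSError
        return False
```

Keeping the response check separate from the network call makes the rule easy to unit test without a running server.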

GET /voices – provider catalog

| Detail | Description |
| --- | --- |
| Request body | none |
| Authentication | Required when AUTH_REQUIRED=true |
| Success | 200 OK with provider metadata and voice options |
| Failure | 401 Unauthorized when auth fails, 500 for upstream provider errors |
Response schema mirrors each provider’s capabilities, including languages, sample rates, and optional tags. Use the payload to drive voice pickers inside your product.
curl https://api.sayna.ai/voices | jq '.voices[0]'

POST /speak – one-shot synthesis

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to convert into audio |
| voice_id | string | Conditional | Voice from /voices; provider default when omitted |
| provider | string | Conditional | Provider slug (e.g., deepgram, elevenlabs) |
| voice_settings | object | No | Provider-specific overrides |
  • Request: JSON body plus optional auth header.
  • Response: 200 OK with a JSON object that contains metadata and the rendered audio bytes (base64-encoded). The response also carries x-voice-id, x-sample-rate, and caching headers.
  • Errors: 400 when text is empty, 500 for provider failures or missing credentials.
curl -X POST https://api.sayna.ai/speak \
  -H "Content-Type: application/json" \
  -d '{"text": "Welcome to Sayna", "provider": "deepgram"}' \
  | jq '.metadata'
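The same request can be assembled from Python. This sketch builds the JSON body from the fields in the table above and rejects empty text locally, mirroring the documented 400. The function name `build_speak_request` is hypothetical, and the optional auth header is left out because its exact format depends on your Authentication setup.

```python
import json
import urllib.request

BASE_URL = "https://api.sayna.ai"  # host taken from the curl examples


def build_speak_request(text, provider=None, voice_id=None, voice_settings=None):
    """Build a POST /speak request, omitting optional fields that are not set."""
    if not text:
        # The server answers 400 for empty text; fail early instead.
        raise ValueError("text must not be empty")
    body = {"text": text}
    optional = {
        "provider": provider,
        "voice_id": voice_id,
        "voice_settings": voice_settings,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        f"{BASE_URL}/speak",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen` to send it, then parse the JSON response; consult the API reference for the name of the field that holds the base64 audio.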

POST /livekit/token – participant tokens

| Field | Type | Description |
| --- | --- | --- |
| room_name | string | Room to join or create |
| participant_name | string | Display name shown inside LiveKit |
| participant_identity | string | Stable identity string for permissions |
Returns:
| Field | Type | Description |
| --- | --- | --- |
| token | string | Signed LiveKit JWT for the participant |
| room_name | string | Echoes the requested room |
| participant_identity | string | Echoes the requested identity |
| livekit_url | string | Client-facing LiveKit URL from config |
When authentication is enabled, this endpoint creates the room if it doesn’t exist and sets metadata.auth_id for tenant isolation.
  • Errors: 400 when any field is empty, 403 when the room exists with a different tenant’s auth_id, 500 when LiveKit credentials are misconfigured.
  • Typical flow: once a WebSocket session advertises LiveKit settings in the config message, call this REST endpoint from your control plane to mint attendee tokens.
curl -X POST https://api.sayna.ai/livekit/token \
  -H "Content-Type: application/json" \
  -d '{
        "room_name": "support-42",
        "participant_name": "alex",
        "participant_identity": "alex-support"
      }'
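A control plane that mints tokens might wrap the request and response handling as follows. This is a sketch under the assumption that the field names match the two tables above exactly; the helper names `token_request_body` and `parse_token_response` are hypothetical.

```python
import json

REQUEST_FIELDS = ("room_name", "participant_name", "participant_identity")
RESPONSE_FIELDS = ("token", "room_name", "participant_identity", "livekit_url")


def token_request_body(room_name: str, participant_name: str, participant_identity: str) -> bytes:
    """Serialize the request body, rejecting empty fields that would yield a 400."""
    fields = dict(zip(REQUEST_FIELDS, (room_name, participant_name, participant_identity)))
    empty = [name for name, value in fields.items() if not value]
    if empty:
        raise ValueError("empty field(s): " + ", ".join(empty))
    return json.dumps(fields).encode("utf-8")


def parse_token_response(raw: bytes) -> dict:
    """Decode the response and check that the four documented fields are present."""
    payload = json.loads(raw)
    missing = [name for name in RESPONSE_FIELDS if name not in payload]
    if missing:
        raise KeyError("missing field(s): " + ", ".join(missing))
    return payload
```

The bytes returned by `token_request_body` can be sent as the body of a POST to /livekit/token with a `Content-Type: application/json` header.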

GET /recording/{stream_id} – download session audio

| Detail | Description |
| --- | --- |
| Path param | stream_id from the WebSocket ready message (or the value you provided). Empty values, .., or / are rejected. |
| Authentication | Required when AUTH_REQUIRED=true |
| Success | 200 OK with audio/ogg body and Content-Disposition: attachment; filename="{stream_id}.ogg" |
| Failure | 400 invalid stream_id, 404 recording missing, 503 storage not configured or unavailable |
Recording files live at {recording_s3_prefix}/{stream_id}/audio.ogg when livekit.enable_recording=true and storage credentials are set.
# Download and save as <stream_id>.ogg
curl -OJ https://api.sayna.ai/recording/support-call-789
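Clients can reject obviously invalid identifiers before making the request. The sketch below mirrors the documented rule by refusing empty values and any value containing `..` or `/`; whether the server applies exactly this test is an assumption, and `recording_url` is a hypothetical helper name.

```python
from urllib.parse import quote

BASE_URL = "https://api.sayna.ai"  # host taken from the curl examples


def recording_url(stream_id: str) -> str:
    """Return the download URL, raising ValueError for IDs the server would reject."""
    if not stream_id or ".." in stream_id or "/" in stream_id:
        raise ValueError(f"invalid stream_id: {stream_id!r}")
    # Percent-encode any remaining reserved characters in the path segment.
    return f"{BASE_URL}/recording/{quote(stream_id, safe='')}"
```

A successful download returns an audio/ogg body, so write the response bytes to a file named `<stream_id>.ogg` to match the Content-Disposition header.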

POST /sip/call – initiate SIP call

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| room_name | string | Yes | The LiveKit room to connect the call to |
| participant_name | string | Yes | Display name for the SIP participant in the room |
| participant_identity | string | Yes | Unique identity for the SIP participant |
| from_phone_number | string | Yes | Caller ID phone number (must be configured in your SIP provider) |
| to_phone_number | string | Yes | Destination phone number to dial |
| sip | object | No | Per-request SIP configuration overrides (see below) |
The optional sip object allows overriding global SIP configuration on a per-request basis. This is useful when you need to use different SIP providers or credentials for specific calls.
| Field | Type | Description |
| --- | --- | --- |
| sip.outbound_address | string or null | SIP server address override. Format: hostname or hostname:port |
| sip.auth_username | string or null | SIP authentication username override |
| sip.auth_password | string or null | SIP authentication password override |
  • Request: JSON body plus optional auth header.
  • Response: 200 OK with call status, room name, participant identity, participant ID, and SIP call ID.
  • Errors: 400 when a phone number is invalid or a required field is empty, 404 when the room exists with a different tenant’s auth_id, 500 when LiveKit is not configured, the outbound address is missing, or the call fails.
Phone numbers support international format (+1234567890), national format (07123456789), or internal extensions (1234). See the API reference for detailed schema information.
# Basic call (uses global SIP configuration)
curl -X POST https://api.sayna.ai/sip/call \
  -H "Content-Type: application/json" \
  -d '{
        "room_name": "call-room-123",
        "participant_name": "John Doe",
        "participant_identity": "caller-456",
        "from_phone_number": "+15105550123",
        "to_phone_number": "+15551234567"
      }'
# With per-request SIP configuration overrides
curl -X POST https://api.sayna.ai/sip/call \
  -H "Content-Type: application/json" \
  -d '{
        "room_name": "call-room-123",
        "participant_name": "John Doe",
        "participant_identity": "caller-456",
        "from_phone_number": "+15105550123",
        "to_phone_number": "+15551234567",
        "sip": {
          "outbound_address": "sip.provider.com:5060",
          "auth_username": "user123",
          "auth_password": "secret456"
        }
      }'
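Both curl variants above can be produced by one payload builder that attaches the sip object only when an override is supplied. This is a sketch: `sip_call_body` is a hypothetical helper, and it assumes that omitting an override key is treated the same as sending null.

```python
import json


def sip_call_body(room_name, participant_name, participant_identity,
                  from_phone_number, to_phone_number,
                  outbound_address=None, auth_username=None, auth_password=None):
    """Build the POST /sip/call body; the sip object appears only if overrides exist."""
    body = {
        "room_name": room_name,
        "participant_name": participant_name,
        "participant_identity": participant_identity,
        "from_phone_number": from_phone_number,
        "to_phone_number": to_phone_number,
    }
    empty = [name for name, value in body.items() if not value]
    if empty:
        # Empty required fields produce a 400 on the server; fail early instead.
        raise ValueError("empty field(s): " + ", ".join(empty))
    overrides = {
        "outbound_address": outbound_address,
        "auth_username": auth_username,
        "auth_password": auth_password,
    }
    overrides = {key: value for key, value in overrides.items() if value is not None}
    if overrides:
        body["sip"] = overrides
    return body
```

Serialize the result with `json.dumps(...)` and send it with a `Content-Type: application/json` header. Phone number format validation is left to the server here.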

Cache-aware behavior

  • /speak reuses cached synthesis when both the text and tts_config hash match a previous request and caching is enabled.
  • Cache assets live under CACHE_PATH; mount a persistent volume in production if you want to avoid cold starts.
  • Clear server caches when rotating provider credentials or when you change voice defaults that affect TTS hashes.
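To illustrate why changing voice defaults invalidates cached audio, the sketch below derives a cache key from the text and a hash of the TTS configuration. It is not Sayna's actual hashing scheme: the `cache_key` function, the use of SHA-256, and the example config fields are all assumptions made for illustration.

```python
import hashlib
import json


def cache_key(text: str, tts_config: dict) -> str:
    """Derive a deterministic key from the text and a hash of the TTS config."""
    # Canonical JSON (sorted keys, fixed separators) so key order cannot change the hash.
    canonical = json.dumps(tts_config, sort_keys=True, separators=(",", ":"))
    config_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return hashlib.sha256(f"{config_hash}:{text}".encode("utf-8")).hexdigest()


# Hypothetical config fields, for illustration only.
first = cache_key("Welcome to Sayna", {"provider": "deepgram", "voice_id": "a"})
same = cache_key("Welcome to Sayna", {"voice_id": "a", "provider": "deepgram"})
changed = cache_key("Welcome to Sayna", {"provider": "deepgram", "voice_id": "b"})
```

Under this scheme `first` and `same` are identical, while `changed` differs, so any edit to the configuration leaves previously cached entries unreachable until they are cleared.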