Sayna exposes a compact set of HTTP endpoints that reuse the same providers and caches that power the streaming stack. Use the interactive playground inside the API reference for request-by-request schemas and try-it-out support.
If you need tenant-scoped room management (listing rooms, inspecting participants, or moderating LiveKit sessions), see the LiveKit Room Management guide.
Authentication is optional. Enable it through the Authentication guide if you need shared secrets or delegated JWT validation.
Want idiomatic clients? Install one of the Sayna SDKs to reuse the snippets shown below.
SDK examples assume you have already instantiated a client (see the SDK guides for setup) and show only the method call relevant to each endpoint.
GET / – health check
- Purpose: returns { "status": "ok" } when the server and dependencies are alive.
- Status codes: always 200 OK when the Axum router is reachable.
- Usage: liveness/readiness probes and smoke tests after deployments.
curl https://api.sayna.ai/
# => {"status":"ok"}
const health = await client.health();
console.log(health.status);
from sayna_client import SaynaClient, STTConfig, TTSConfig
client = SaynaClient(
    url="https://api.sayna.ai",
    stt_config=STTConfig(provider="deepgram", model="nova-2"),
    tts_config=TTSConfig(provider="elevenlabs", voice_id="21m00Tcm4TlvDq8ikWAM"),
)
health = await client.health()
print(health.status)
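For deployment smoke tests, a small polling loop around the health check can be handy. The following is a minimal sketch using only the Python standard library; the helper names `fetch_health` and `wait_until_healthy`, the retry count, and the delay are illustrative assumptions and not part of the Sayna SDK.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str, timeout: float = 2.0) -> dict:
    """GET the root endpoint and decode its JSON body."""
    url = base_url.rstrip("/") + "/"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(
    fetch: Callable[[], dict],
    attempts: int = 5,
    delay_seconds: float = 1.0,
) -> bool:
    """Poll `fetch` until it reports {"status": "ok"} or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch().get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Connection errors and malformed JSON both mean "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

A probe would then call `wait_until_healthy(lambda: fetch_health("https://api.sayna.ai"))`. Passing the fetch function in as an argument keeps the retry logic testable without a network connection.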
GET /voices – provider catalog
| Detail | Description |
|---|---|
| Request body | none |
| Authentication | Required when AUTH_REQUIRED=true |
| Success | 200 OK with provider metadata and voice options |
| Failure | 401 Unauthorized when auth fails, 500 for upstream provider errors |
Response schema mirrors each provider’s capabilities, including languages, sample rates, and optional tags. Use the payload to drive voice pickers inside your product.
curl https://api.sayna.ai/voices | jq '.deepgram[0]'
const voices = await client.getVoices();
console.log(voices.deepgram?.[0]);
voices = await client.get_voices()
print(next(iter(voices.values()))[0])
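To illustrate driving a voice picker from this payload, here is a rough sketch that flattens the catalog into dropdown options. It assumes the response is a mapping from provider slug to a list of voice objects, as the SDK snippets above suggest. The per-voice field names `id`, `name`, and `language` are hypothetical, so check the API reference for the real schema before relying on them.

```python
from typing import Any, Dict, List


def build_picker_options(catalog: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, str]]:
    """Flatten a provider -> voices mapping into sorted picker options."""
    options: List[Dict[str, str]] = []
    for provider in sorted(catalog):
        for voice in catalog[provider]:
            # "id", "name", and "language" are assumed field names.
            voice_id = str(voice.get("id", ""))
            if not voice_id:
                continue  # skip entries that cannot be selected
            label = str(voice.get("name") or voice_id)
            language = voice.get("language")
            if language:
                label = f"{label} ({language})"
            options.append({"provider": provider, "voice_id": voice_id, "label": label})
    return options
```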
POST /speak – one-shot synthesis
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to convert into audio |
| voice_id | string | Conditional | Voice from /voices; provider default when omitted |
| provider | string | Conditional | Provider slug (e.g., deepgram, elevenlabs) |
| voice_settings | object | No | Provider-specific overrides |
- Request: JSON body plus optional auth header.
- Response: 200 OK with an object that contains metadata and the rendered audio bytes (base64) plus x-voice-id, x-sample-rate, and caching headers.
- Errors: 400 when text is empty, 500 for provider failures or missing credentials.
curl -X POST https://api.sayna.ai/speak \
  -H "Content-Type: application/json" \
  -d '{"text": "Welcome to Sayna", "provider": "deepgram"}' \
  | jq '.metadata'
const audioBuffer = await client.speakRest("Welcome to Sayna", {
  provider: "deepgram",
  model: "aura-asteria-en",
  voice_id: "aura-asteria-en",
});
console.log(`Received ${audioBuffer.byteLength} bytes`);
audio_bytes, headers = await client.speak_rest(
    "Welcome to Sayna",
    TTSConfig(provider="deepgram", voice_id="aura-asteria-en"),
)
print(len(audio_bytes), "bytes of", headers.get("Content-Type"))
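If you call the endpoint without an SDK, it can help to assemble the request body in one place and reject empty text before it reaches the server. The sketch below follows the field table above; `build_speak_payload` is a hypothetical helper, and treating whitespace-only text as empty is an assumption (the endpoint is only documented to return 400 for empty text).

```python
from typing import Any, Dict, Optional


def build_speak_payload(
    text: str,
    provider: Optional[str] = None,
    voice_id: Optional[str] = None,
    voice_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a /speak request body, omitting optional fields left unset."""
    # Assumption: whitespace-only text is treated like empty text.
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    payload: Dict[str, Any] = {"text": text}
    if provider is not None:
        payload["provider"] = provider
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if voice_settings:
        payload["voice_settings"] = voice_settings
    return payload
```

The resulting dictionary can be serialized with `json.dumps` and sent as the POST body.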
POST /livekit/token – participant tokens
| Field | Type | Description |
|---|---|---|
| room_name | string | Room to join or create |
| participant_name | string | Display name shown inside LiveKit |
| participant_identity | string | Stable identity string for permissions |
Returns:
| Field | Type | Description |
|---|---|---|
| token | string | Signed LiveKit JWT for the participant |
| room_name | string | Echoes the requested room |
| participant_identity | string | Echoes the requested identity |
| livekit_url | string | Client-facing LiveKit URL from config |
When authentication is enabled, this endpoint creates the room if it doesn’t exist and sets metadata.auth_id for tenant isolation.
- Errors: 400 when any field is empty, 403 when the room exists with a different tenant’s auth_id, 500 when LiveKit credentials are misconfigured.
- Typical flow: once a WebSocket session advertises LiveKit settings in the config message, call this REST endpoint from your control plane to mint attendee tokens.
curl -X POST https://api.sayna.ai/livekit/token \
  -H "Content-Type: application/json" \
  -d '{
    "room_name": "support-42",
    "participant_name": "alex",
    "participant_identity": "alex-support"
  }'
const tokenInfo = await client.getLiveKitToken(
  "support-42",
  "alex",
  "alex-support"
);
console.log(tokenInfo.token, tokenInfo.livekit_url);
token = await client.get_livekit_token("support-42", "alex", "alex-support")
print(token.token, token.livekit_url)
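A control plane that hands out tokens may want to know when one is about to expire so it can mint a replacement. Because the token is a JWT, its payload can be inspected with the standard library. This is only a sketch: it does not verify the signature, and it assumes the token carries the standard `exp` claim. The helper names are illustrative.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    segment = parts[1]
    # base64url segments omit padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> Optional[float]:
    """Return seconds left before `exp`, or None when the claim is absent."""
    exp = decode_jwt_payload(token).get("exp")
    if exp is None:
        return None
    current = time.time() if now is None else now
    return float(exp) - current
```

Use this for scheduling refreshes only. Authorization decisions still belong to LiveKit, which validates the signature.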
GET /recording/{stream_id} – download session audio
| Detail | Description |
|---|---|
| Path param | stream_id from the WebSocket ready message (or the value you provided). Empty values, .., or / are rejected. |
| Authentication | Required when AUTH_REQUIRED=true |
| Success | 200 OK with audio/ogg body and Content-Disposition: attachment; filename="{stream_id}.ogg" |
| Failure | 400 invalid stream_id, 404 recording missing, 503 storage not configured or unavailable |
Recording files live at {recording_s3_prefix}/{stream_id}/audio.ogg when livekit.enable_recording=true and storage credentials are set.
# Download and save as <stream_id>.ogg
curl -OJ https://api.sayna.ai/recording/support-call-789
import { promises as fs } from "fs";
const streamId = client.streamId; // from ready()
const audio = await client.getRecording(streamId!);
await fs.writeFile(`${streamId}.ogg`, Buffer.from(audio));
stream_id = client.received_stream_id
audio_bytes, headers = await client.get_recording(stream_id)
with open(f"{stream_id}.ogg", "wb") as f:
    f.write(audio_bytes)
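The stream_id rules and the storage layout described above can be mirrored on the client side, for example to fail fast before issuing a request or to locate the object directly in your bucket. The following is a sketch of those documented rules only; the server may apply additional checks, and both function names are hypothetical.

```python
def is_valid_stream_id(stream_id: str) -> bool:
    """Reject empty values and anything containing '..' or '/'."""
    if not stream_id:
        return False
    return ".." not in stream_id and "/" not in stream_id


def recording_object_key(recording_s3_prefix: str, stream_id: str) -> str:
    """Build the {recording_s3_prefix}/{stream_id}/audio.ogg object key."""
    if not is_valid_stream_id(stream_id):
        raise ValueError(f"invalid stream_id: {stream_id!r}")
    # Trim a trailing slash so the prefix and id are joined by exactly one.
    return f"{recording_s3_prefix.rstrip('/')}/{stream_id}/audio.ogg"
```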
POST /sip/call – initiate SIP call
| Field | Type | Required | Description |
|---|---|---|---|
| room_name | string | Yes | The LiveKit room to connect the call to |
| participant_name | string | Yes | Display name for the SIP participant in the room |
| participant_identity | string | Yes | Unique identity for the SIP participant |
| from_phone_number | string | Yes | Caller ID phone number (must be configured in your SIP provider) |
| to_phone_number | string | Yes | Destination phone number to dial |
| sip | object | No | Per-request SIP configuration overrides (see below) |
The optional sip object allows overriding global SIP configuration on a per-request basis. This is useful when you need to use different SIP providers or credentials for specific calls.
| Field | Type | Description |
|---|---|---|
| sip.outbound_address | string \| null | SIP server address override. Format: hostname or hostname:port |
| sip.auth_username | string \| null | SIP authentication username override |
| sip.auth_password | string \| null | SIP authentication password override |
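As a rough illustration of the hostname or hostname:port format, the sketch below splits an override value into its parts before it is sent. `parse_outbound_address` is a hypothetical helper, not an SDK function. It does not handle IPv6 literals, and the port range check is an assumption about what the server accepts.

```python
from typing import Optional, Tuple


def parse_outbound_address(address: str) -> Tuple[str, Optional[int]]:
    """Split 'hostname' or 'hostname:port' into (host, port or None)."""
    host, separator, port_text = address.strip().partition(":")
    if not host:
        raise ValueError("outbound address needs a hostname")
    if not separator:
        return host, None
    # Accept only plain ASCII digits within the valid TCP/UDP port range.
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"invalid port: {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port
```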
- Request: JSON body plus optional auth header.
- Response: 200 OK with call status, room name, participant identity, participant ID, and SIP call ID.
- Errors: 400 when a phone number is invalid or a required field is empty, 404 when the room exists with a different tenant’s auth_id, 500 when LiveKit is not configured, the outbound address is missing, or the call fails.
Phone numbers support international format (+1234567890), national format (07123456789), or internal extensions (1234).
See the API reference for detailed schema information.
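A light client-side check can catch obviously malformed numbers before the request is sent. The sketch below accepts the three example shapes listed above (an optional leading + followed by digits). It is a heuristic under that assumption, not the server's actual validation, and `looks_like_dialable_number` is a hypothetical name.

```python
import re

# Optional leading "+" followed by one or more ASCII digits.
_DIALABLE = re.compile(r"\+?[0-9]+")


def looks_like_dialable_number(number: str) -> bool:
    """Return True for +1234567890, 07123456789, or 1234 style values."""
    return _DIALABLE.fullmatch(number) is not None
```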
# Basic call (uses global SIP configuration)
curl -X POST https://api.sayna.ai/sip/call \
  -H "Content-Type: application/json" \
  -d '{
    "room_name": "call-room-123",
    "participant_name": "John Doe",
    "participant_identity": "caller-456",
    "from_phone_number": "+15105550123",
    "to_phone_number": "+15551234567"
  }'
# With per-request SIP configuration overrides
curl -X POST https://api.sayna.ai/sip/call \
  -H "Content-Type: application/json" \
  -d '{
    "room_name": "call-room-123",
    "participant_name": "John Doe",
    "participant_identity": "caller-456",
    "from_phone_number": "+15105550123",
    "to_phone_number": "+15551234567",
    "sip": {
      "outbound_address": "sip.provider.com:5060",
      "auth_username": "user123",
      "auth_password": "secret456"
    }
  }'
The sipCall method is available in SDK version 0.3.0 and later.
const result = await client.sipCall({
  roomName: "call-room-123",
  participantName: "John Doe",
  participantIdentity: "caller-456",
  fromPhoneNumber: "+15105550123",
  toPhoneNumber: "+15551234567",
});
console.log(result.sipCallId, result.participantId);
// With per-request SIP configuration overrides
// Named differently so both examples can share one scope without redeclaring `result`
const overrideResult = await client.sipCall({
  roomName: "call-room-123",
  participantName: "John Doe",
  participantIdentity: "caller-456",
  fromPhoneNumber: "+15105550123",
  toPhoneNumber: "+15551234567",
  sip: {
    outboundAddress: "sip.provider.com:5060",
    authUsername: "user123",
    authPassword: "secret456",
  },
});
The sip_call method is available in SDK version 0.3.0 and later.
result = await client.sip_call(
    room_name="call-room-123",
    participant_name="John Doe",
    participant_identity="caller-456",
    from_phone_number="+15105550123",
    to_phone_number="+15551234567",
)
print(result.sip_call_id, result.participant_id)
# With per-request SIP configuration overrides
result = await client.sip_call(
    room_name="call-room-123",
    participant_name="John Doe",
    participant_identity="caller-456",
    from_phone_number="+15105550123",
    to_phone_number="+15551234567",
    sip={
        "outbound_address": "sip.provider.com:5060",
        "auth_username": "user123",
        "auth_password": "secret456",
    },
)
Cache-aware behavior
- /speak reuses cached synthesis when both the text and tts_config hash match a previous request and caching is enabled.
- Cache assets live under CACHE_PATH; mount a persistent volume in production if you want to avoid cold starts.
- Clear server caches when rotating provider credentials or when you change voice defaults that affect TTS hashes.
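To see why changing a voice default leads to cache misses, consider a cache key derived from the text together with the TTS configuration. The sketch below is purely illustrative: Sayna's actual hashing scheme is not described here, and the SHA-256 over canonical JSON used in `synthesis_cache_key` is an assumption chosen for clarity.

```python
import hashlib
import json
from typing import Any, Dict


def synthesis_cache_key(text: str, tts_config: Dict[str, Any]) -> str:
    """Derive a deterministic key from the text and the TTS configuration."""
    # Sorting keys makes the key independent of dictionary ordering.
    canonical = json.dumps(
        {"text": text, "tts_config": tts_config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under a scheme like this, the same text with a different voice_id produces a different key, so earlier cached audio is no longer matched.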