Python SDK

The sayna-client package on PyPI is the official, fully typed async SDK. It streams audio over WebSocket, emits transcripts and TTS audio, calls REST endpoints (health, voices, /speak, LiveKit tokens, SIP hooks), and verifies signed SIP webhooks with HMAC.

Install

pip install sayna-client

Quick start (realtime)

import asyncio
from sayna_client import SaynaClient, STTConfig, TTSConfig

async def main():
    client = SaynaClient(
        url="https://api.sayna.ai",
        stt_config=STTConfig(provider="deepgram", model="nova-2"),
        tts_config=TTSConfig(provider="cartesia", voice_id="example-voice"),
        api_key="your-api-key",
    )

    # Register callbacks first, then connect.
    client.register_on_stt_result(lambda r: print("Transcription:", r.transcript))
    client.register_on_tts_audio(lambda audio: print("Received", len(audio), "bytes"))

    await client.connect()  # resolves after the server's ready message
    await client.speak("Hello, world!")
    await client.disconnect()

asyncio.run(main())

Pass an optional livekit_config to join a room server-side, and set without_audio=True if you only need data-channel control.

REST API (no WebSocket needed)

await client.health() → HealthResponse with status.
await client.get_voices() → dict[str, list[VoiceDescriptor]] grouped by provider.
await client.speak_rest(text, tts_config) → (bytes, headers) audio synthesized over HTTP; headers include x-audio-format, x-sample-rate, etc.
await client.get_livekit_token(room_name, participant_name, participant_identity) → LiveKitTokenResponse with token, LiveKit URL, room name, and participant identity.
await client.get_recording(stream_id) → (bytes, headers) binary OGG for a completed session (use client.received_stream_id after ready).
await client.get_sip_hooks() → SipHooksResponse from the runtime cache.
await client.set_sip_hooks(hooks) → SipHooksResponse; a hook whose host matches an existing entry (case-insensitive) replaces that entry, and any other hook is added.

# Run inside an async function, with `client` being an existing SaynaClient instance.
from sayna_client import TTSConfig, SipHook

audio, headers = await client.speak_rest(
    "Hello, world!",
    TTSConfig(
        provider="elevenlabs",
        voice_id="21m00Tcm4TlvDq8ikWAM",
        model="eleven_turbo_v2",
        speaking_rate=1.0,
        audio_format="mp3",
        sample_rate=24000,
        connection_timeout=30,
        request_timeout=60,
        pronunciations=[],
    ),
)
print("Audio bytes:", len(audio), "Format:", headers.get("x-audio-format"))

updated = await client.set_sip_hooks([
    SipHook(host="example.com", url="https://webhook.example.com/events"),
    SipHook(host="another.com", url="https://webhook.another.com/events"),
])
print("Total hooks configured:", len(updated.hooks))
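
The replace-or-add behaviour of set_sip_hooks can be pictured with a plain dictionary keyed by lowercased host. The sketch below only illustrates the documented semantics and is not the server's code: merge_hooks is a hypothetical helper, and hooks are modelled as (host, url) tuples instead of SipHook objects.

```python
def merge_hooks(existing, incoming):
    """Return existing hooks updated by incoming ones.

    Both arguments are lists of (host, url) tuples. Hosts are compared
    case-insensitively. This models the documented behaviour only; the
    real merge happens on the Sayna server.
    """
    merged = {host.lower(): (host, url) for host, url in existing}
    for host, url in incoming:
        # The same host in any letter case replaces the old entry;
        # an unknown host is appended as a new entry.
        merged[host.lower()] = (host, url)
    return list(merged.values())


current = [("Example.com", "https://old.example.com/events")]
update = [
    ("example.com", "https://webhook.example.com/events"),
    ("another.com", "https://webhook.another.com/events"),
]
result = merge_hooks(current, update)
```

Here result holds two hooks: the example.com entry with its new URL, followed by the newly added another.com entry.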

WebSocket client

Constructing the client

SaynaClient(
    url: str,
    stt_config: STTConfig,
    tts_config: TTSConfig,
    livekit_config: LiveKitConfig | None = None,
    without_audio: bool = False,
    api_key: str | None = None,
    stream_id: str | None = None,
)

url accepts http(s) or ws(s); the client will open a WebSocket to Sayna and send the initial config when you call connect().
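
Because url accepts either scheme family, the client presumably maps http to ws and https to wss before opening the socket. The following is a minimal sketch of such a mapping using only the standard library. to_websocket_url is a hypothetical name, and the SDK's actual normalization (including any path it appends) may differ.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping; ws/wss URLs pass through unchanged.
_SCHEME_MAP = {"http": "ws", "https": "wss", "ws": "ws", "wss": "wss"}


def to_websocket_url(url: str) -> str:
    """Rewrite an http(s) or ws(s) URL so that it uses a WebSocket scheme."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in _SCHEME_MAP:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    # Only the scheme changes; host, port, path and query are preserved.
    return urlunsplit(parts._replace(scheme=_SCHEME_MAP[scheme]))
```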

Lifecycle and state

await client.connect() establishes the socket and resolves after the ready message.
await client.disconnect() closes the socket and cleans up.
Connection state is exposed via client.ready and client.connected.
LiveKit metadata after ready (when enabled): client.livekit_room_name, client.livekit_url, client.sayna_participant_identity, client.sayna_participant_name.
Session identifier after ready: client.received_stream_id (server-generated UUID unless you provided stream_id).
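
Since disconnect() is what closes the socket and cleans up, it helps to guarantee that it runs even when something fails mid-session. One possible approach is an async context manager around connect() and disconnect(). sayna_session is a hypothetical helper, not part of the SDK; it relies only on the two methods listed above.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def sayna_session(client):
    """Connect on entry and always disconnect on exit (hypothetical helper)."""
    await client.connect()  # resolves after the ready message
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises.
        await client.disconnect()
```

Inside an async function this would be used as `async with sayna_session(client) as session:` followed by calls such as `await session.speak("Hello, world!")`.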

Sending data

await client.speak(text, flush=True, allow_interruption=True) to enqueue speech; by default the call clears the queue first (flush=True) and the speech can be interrupted (allow_interruption=True).
await client.on_audio_input(audio_data: bytes) to stream raw audio for STT.
await client.send_message(message, role, topic="messages", debug=None) to send data-channel messages.
await client.clear() to clear the TTS queue.
await client.tts_flush(allow_interruption=True) to flush the queue with an empty speak.
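
on_audio_input takes raw bytes, so a longer recording is typically sent as a series of small frames. The sketch below assumes 16 kHz, 16-bit mono PCM and 20 ms frames; these values are assumptions for illustration, not documented requirements, and should match whatever your STTConfig expects. iter_frames and stream_pcm are hypothetical helpers.

```python
import asyncio

SAMPLE_RATE = 16_000   # assumed; not specified in this page
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM, also an assumption
FRAME_MS = 20

# 16000 samples/s * 2 bytes * 0.020 s = 640 bytes per frame
FRAME_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS // 1000


def iter_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES):
    """Yield consecutive slices of pcm; the last slice may be shorter."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


async def stream_pcm(client, pcm: bytes, realtime: bool = True) -> int:
    """Send pcm through client.on_audio_input frame by frame; return frames sent."""
    sent = 0
    for frame in iter_frames(pcm):
        await client.on_audio_input(frame)
        sent += 1
        if realtime:
            # Pace the upload roughly the way a live microphone would.
            await asyncio.sleep(FRAME_MS / 1000)
    return sent
```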

Receiving events

register_on_stt_result(cb) for transcription results.
register_on_tts_audio(cb) for TTS audio buffers.
register_on_error(cb) for error messages.
register_on_message(cb) for participant messages.
register_on_participant_disconnected(cb) when a participant leaves.
register_on_tts_playback_complete(cb) when queued speech finishes.
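
The register_on_* methods can feed a single object that accumulates session output. This is a sketch under stated assumptions: SessionLog is a hypothetical class, and it assumes only what the quick start shows, namely that STT results expose .transcript, that TTS audio arrives as bytes, and that plain (non-async) callables are accepted.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Collects transcripts, TTS audio and errors from a client (hypothetical)."""

    transcripts: list = field(default_factory=list)
    audio: bytearray = field(default_factory=bytearray)
    errors: list = field(default_factory=list)

    def attach(self, client) -> "SessionLog":
        # Each callback receives one argument and stores it.
        client.register_on_stt_result(lambda r: self.transcripts.append(r.transcript))
        client.register_on_tts_audio(self.audio.extend)
        client.register_on_error(self.errors.append)
        return self
```

Calling `log = SessionLog().attach(client)` before `await client.connect()` leaves the transcripts in log.transcripts and the concatenated audio in log.audio once the session ends.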

SIP webhook receiver

WebhookReceiver verifies SIP webhooks with HMAC-SHA256, constant-time comparison, a 5-minute replay window, and strict validation.

from flask import Flask, request
from sayna_client import WebhookReceiver, SaynaValidationError

app = Flask(__name__)
receiver = WebhookReceiver("your-secret-key-min-16-chars")  # or rely on SAYNA_WEBHOOK_SECRET

@app.post("/sip-webhook")  # Flask example; the route path is illustrative
def sip_webhook():
    body = request.get_data(as_text=True)  # pass the raw body text to the receiver
    try:
        webhook = receiver.receive(request.headers, body)
    except SaynaValidationError as err:
        # Verification failed: reject the request.
        return {"error": str(err)}, 401
    print("From:", webhook.from_phone_number)
    print("To:", webhook.to_phone_number)
    print("Room:", webhook.room.name)
    print("Participant:", webhook.participant.identity)
    return {"status": "ok"}, 200

Constructor: WebhookReceiver(secret: str | None = None) uses the provided secret or SAYNA_WEBHOOK_SECRET.
receive(headers, body) returns a typed WebhookSIPOutput (participant identity/sid/name, room name/sid, from/to phone numbers, room prefix, sip_host) or raises SaynaValidationError on failed verification.
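
For readers curious about what these checks involve, here is a generic sketch of HMAC-SHA256 verification with a constant-time comparison and a 5-minute replay window, using only the standard library. It is not WebhookReceiver's implementation: the signed string layout (timestamp, a dot, then the body), the hex encoding, and the function names are all assumptions made for illustration. In application code, call WebhookReceiver.receive() instead.

```python
import hashlib
import hmac
import time
from typing import Optional

REPLAY_WINDOW_SECONDS = 5 * 60


def sign(secret: str, timestamp: int, body: str) -> str:
    """Hex HMAC-SHA256 over an assumed 'timestamp.body' payload."""
    message = f"{timestamp}.{body}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(secret: str, timestamp: int, body: str, signature: str,
           now: Optional[float] = None) -> bool:
    """Return True only if the signature matches and the timestamp is fresh."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > REPLAY_WINDOW_SECONDS:
        return False  # outside the window: treat as a possible replay
    expected = sign(secret, timestamp, body)
    # compare_digest takes time independent of where the strings differ.
    return hmac.compare_digest(expected, signature)
```

Including the timestamp in the signed payload is what makes the replay window meaningful: an attacker cannot refresh the timestamp without invalidating the signature.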
