The sayna-client package on PyPI is the official, fully typed async SDK. It streams audio over WebSocket, emits transcripts and TTS audio, calls REST endpoints (health, voices, /speak, LiveKit tokens, SIP hooks), and verifies signed SIP webhooks with HMAC.

Install

pip install sayna-client

Quick start (realtime)

import asyncio
from sayna_client import SaynaClient, STTConfig, TTSConfig

async def main():
    client = SaynaClient(
        url="https://api.sayna.ai",
        stt_config=STTConfig(provider="deepgram", model="nova-2"),
        tts_config=TTSConfig(provider="cartesia", voice_id="example-voice"),
        api_key="your-api-key",
    )

    client.register_on_stt_result(lambda r: print("Transcription:", r.transcript))
    client.register_on_tts_audio(lambda audio: print("Received", len(audio), "bytes"))

    await client.connect()
    await client.speak("Hello, world!")
    await client.disconnect()

asyncio.run(main())
Pass an optional livekit_config to join a room server-side, and set without_audio=True if you only need data-channel control.

REST API (no WebSocket needed)

  • await client.health() returns HealthResponse with status.
  • await client.get_voices() returns dict[str, list[VoiceDescriptor]] grouped by provider.
  • await client.speak_rest(text, tts_config) returns (bytes, headers): audio synthesized over HTTP; headers include x-audio-format, x-sample-rate, etc.
  • await client.get_livekit_token(room_name, participant_name, participant_identity) returns LiveKitTokenResponse with token, LiveKit URL, room name, and participant identity.
  • await client.get_recording(stream_id) returns (bytes, headers): binary OGG for a completed session (use client.received_stream_id after ready).
  • await client.get_sip_hooks() returns SipHooksResponse from the runtime cache.
  • await client.set_sip_hooks(hooks) returns SipHooksResponse after replacing hooks with matching hosts (case-insensitive) and adding new ones.
from sayna_client import TTSConfig, SipHook

# Run inside an async function; `client` is an existing SaynaClient instance.
audio, headers = await client.speak_rest(
    "Hello, world!",
    TTSConfig(
        provider="elevenlabs",
        voice_id="21m00Tcm4TlvDq8ikWAM",
        model="eleven_turbo_v2",
        speaking_rate=1.0,
        audio_format="mp3",
        sample_rate=24000,
        connection_timeout=30,
        request_timeout=60,
        pronunciations=[],
    ),
)
print("Audio bytes:", len(audio), "Format:", headers.get("x-audio-format"))

updated = await client.set_sip_hooks([
    SipHook(host="example.com", url="https://webhook.example.com/events"),
    SipHook(host="another.com", url="https://webhook.another.com/events"),
])
print("Total hooks configured:", len(updated.hooks))
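
The replace-or-add behavior described for set_sip_hooks can be pictured with a small standalone sketch. It models hooks as plain dicts rather than the SDK's SipHook type, and merge_hooks is a hypothetical helper; the real merge happens on the server and may differ in details such as ordering.

```python
def merge_hooks(existing, incoming):
    """Replace hooks whose host matches (case-insensitive) and append new ones."""
    # Key by lowercased host so "Example.com" and "example.com" collide.
    merged = {hook["host"].lower(): hook for hook in existing}
    for hook in incoming:
        merged[hook["host"].lower()] = hook
    return list(merged.values())


existing = [{"host": "Example.com", "url": "https://old.example.com/events"}]
incoming = [
    {"host": "example.com", "url": "https://webhook.example.com/events"},
    {"host": "another.com", "url": "https://webhook.another.com/events"},
]
result = merge_hooks(existing, incoming)
print("Total hooks configured:", len(result))
```

Here the existing Example.com entry is overwritten by the incoming example.com hook, and another.com is added as a new entry.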

WebSocket client

Constructing the client

SaynaClient(
    url: str,
    stt_config: STTConfig,
    tts_config: TTSConfig,
    livekit_config: LiveKitConfig | None = None,
    without_audio: bool = False,
    api_key: str | None = None,
    stream_id: str | None = None,
)
url accepts http(s) or ws(s); the client will open a WebSocket to Sayna and send the initial config when you call connect().
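
As an illustration of what accepting both URL families could involve, the sketch below maps http(s) schemes onto their WebSocket equivalents using only the standard library. to_ws_url is a hypothetical helper, not part of sayna-client, and the SDK may also append a path or apply other rules that are not shown here.

```python
from urllib.parse import urlsplit, urlunsplit

# http -> ws, https -> wss; WebSocket schemes pass through unchanged.
_SCHEME_MAP = {"http": "ws", "https": "wss", "ws": "ws", "wss": "wss"}


def to_ws_url(url: str) -> str:
    """Return the URL with its scheme converted to the WebSocket equivalent."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in _SCHEME_MAP:
        raise ValueError(f"Unsupported URL scheme: {parts.scheme!r}")
    return urlunsplit(parts._replace(scheme=_SCHEME_MAP[scheme]))


print(to_ws_url("https://api.sayna.ai"))
```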

Lifecycle and state

  • await client.connect() establishes the socket and resolves after the ready message.
  • await client.disconnect() closes the socket and cleans up.
  • Connection state is exposed via client.ready and client.connected.
  • LiveKit metadata after ready (when enabled): client.livekit_room_name, client.livekit_url, client.sayna_participant_identity, client.sayna_participant_name.
  • Session identifier after ready: client.received_stream_id (server-generated UUID unless you provided stream_id).
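
One way to guarantee that disconnect() always runs is to wrap the connect/disconnect pair in an async context manager. The sketch below is an assumption-laden illustration: connected is a hypothetical helper, and FakeClient is a stand-in so the code runs without a Sayna server. Whether the SDK offers its own context-manager support is not stated here.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def connected(client):
    """Connect on entry and always disconnect on exit, even if the body raises."""
    await client.connect()
    try:
        yield client
    finally:
        await client.disconnect()


class FakeClient:
    """Stand-in exposing only connect(), disconnect(), and a connected flag."""

    def __init__(self):
        self.connected = False
        self.calls = []

    async def connect(self):
        self.connected = True
        self.calls.append("connect")

    async def disconnect(self):
        self.connected = False
        self.calls.append("disconnect")


async def demo():
    client = FakeClient()
    try:
        async with connected(client):
            raise RuntimeError("agent logic failed")
    except RuntimeError:
        pass  # the socket is still cleaned up by the context manager
    return client


demo_client = asyncio.run(demo())
print(demo_client.calls)
```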

Sending data

  • await client.speak(text, flush=True, allow_interruption=True) to enqueue speech; by default it clears the queue first and the speech can be interrupted.
  • await client.on_audio_input(audio_data: bytes) to stream raw audio for STT.
  • await client.send_message(message, role, topic="messages", debug=None) to send data-channel messages.
  • await client.clear() to clear the TTS queue.
  • await client.tts_flush(allow_interruption=True) to flush the queue with an empty speak.
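
When streaming microphone or file audio for STT, it is common to send small fixed-duration frames rather than one large buffer. The sketch below splits raw PCM into frames; pcm_frames is a hypothetical helper, and the 16 kHz, 16-bit mono, 20 ms defaults are assumptions rather than documented SDK requirements.

```python
from typing import Iterator


def pcm_frames(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed sample rate
    frame_ms: int = 20,         # assumed frame duration
    bytes_per_sample: int = 2,  # 16-bit mono
) -> Iterator[bytes]:
    """Yield consecutive frames of raw PCM; the last frame may be shorter."""
    frame_size = sample_rate * frame_ms // 1000 * bytes_per_sample
    if frame_size <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(pcm), frame_size):
        yield pcm[start:start + frame_size]


one_second_of_silence = bytes(32000)  # 16000 samples * 2 bytes
frames = list(pcm_frames(one_second_of_silence))
print(len(frames), "frames of", len(frames[0]), "bytes")
```

With a connected client, each frame would then be passed to await client.on_audio_input(frame).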

Receiving events

Register callbacks to wire in your agent logic:
  • register_on_stt_result(cb) for transcription results.
  • register_on_tts_audio(cb) for TTS audio buffers.
  • register_on_error(cb) for error messages.
  • register_on_message(cb) for participant messages.
  • register_on_participant_disconnected(cb) when a participant leaves.
  • register_on_tts_playback_complete(cb) when queued speech finishes.
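
A callback can hand results to the rest of an async agent through an asyncio.Queue, which keeps the callback itself short. The sketch below simulates two STT results with SimpleNamespace objects carrying a transcript attribute, so it runs without the SDK. It assumes callbacks are invoked on the event loop thread; with a real client the handler would be registered via client.register_on_stt_result(on_stt_result).

```python
import asyncio
from types import SimpleNamespace


async def demo():
    transcripts: asyncio.Queue = asyncio.Queue()

    def on_stt_result(result):
        # Keep the callback cheap: enqueue and return immediately.
        transcripts.put_nowait(result.transcript)

    # Simulated events standing in for results delivered by the SDK.
    on_stt_result(SimpleNamespace(transcript="hello"))
    on_stt_result(SimpleNamespace(transcript="world"))

    received = []
    while not transcripts.empty():
        received.append(await transcripts.get())
    return received


received = asyncio.run(demo())
print(received)
```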

SIP webhook receiver

WebhookReceiver verifies SIP webhooks with HMAC-SHA256, constant-time comparison, a 5-minute replay window, and strict validation.
from flask import Flask, request
from sayna_client import WebhookReceiver, SaynaValidationError

app = Flask(__name__)
receiver = WebhookReceiver("your-secret-key-min-16-chars")  # or rely on SAYNA_WEBHOOK_SECRET

@app.route("/sip-webhook", methods=["POST"])  # Flask example; the route path is illustrative
def sip_webhook():
    body = request.get_data(as_text=True)  # raw body, as received
    try:
        webhook = receiver.receive(request.headers, body)
    except SaynaValidationError as err:
        # Verification failed: reject the request.
        return {"error": str(err)}, 401
    print("From:", webhook.from_phone_number)
    print("To:", webhook.to_phone_number)
    print("Room:", webhook.room.name)
    print("Participant:", webhook.participant.identity)
    return {"status": "ok"}, 200
  • Constructor: WebhookReceiver(secret: str | None = None) uses the provided secret, or falls back to SAYNA_WEBHOOK_SECRET when none is passed.
  • receive(headers, body) returns a typed WebhookSIPOutput (participant identity/sid/name, room name/sid, from/to phone numbers, room prefix, sip_host) or raises SaynaValidationError on failed verification.
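
To make the verification steps concrete, here is a minimal standard-library sketch of HMAC-SHA256 signing, constant-time comparison, and a 5-minute replay window. The signed string ("timestamp.body"), the hex encoding, and the function names are assumptions for illustration only; they are not Sayna's documented wire format, and WebhookReceiver should be used for real traffic.

```python
import hashlib
import hmac
import time

REPLAY_WINDOW_SECONDS = 300  # 5 minutes


def sign(secret: str, timestamp: str, body: str) -> str:
    """Hex HMAC-SHA256 over an assumed 'timestamp.body' string."""
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(secret: str, timestamp: str, body: str, signature: str, now=None) -> bool:
    """Return True only for a fresh timestamp and a matching signature."""
    now = time.time() if now is None else now
    try:
        ts = int(timestamp)
    except (TypeError, ValueError):
        return False
    if abs(now - ts) > REPLAY_WINDOW_SECONDS:
        return False  # too old or too far in the future
    expected = sign(secret, timestamp, body)
    # compare_digest runs in constant time to avoid leaking match length.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


secret = "your-secret-key-min-16-chars"
ts = "1700000000"
payload = '{"room": "demo"}'
sig = sign(secret, ts, payload)
print(verify(secret, ts, payload, sig, now=1700000100))
```

The timestamp is part of the signed message, so an attacker cannot refresh an old request by changing only the timestamp.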