The sayna-client package on PyPI is the official, fully typed async SDK. It streams audio over WebSocket, emits transcripts and TTS audio, calls REST endpoints (health, voices, /speak, LiveKit tokens, SIP hooks), and verifies signed SIP webhooks with HMAC.
Install
Quick start (realtime)
Pass livekit_config to join a room server-side, and set without_audio=True if you only need data-channel control.
REST API (no WebSocket needed)
- await client.health() → HealthResponse with status.
- await client.get_voices() → dict[str, list[VoiceDescriptor]] grouped by provider.
- await client.speak_rest(text, tts_config) → (bytes, headers) audio synthesized over HTTP; headers include x-audio-format, x-sample-rate, etc.
- await client.get_livekit_token(room_name, participant_name, participant_identity) → LiveKitTokenResponse with token, LiveKit URL, room name, and participant identity.
- await client.get_recording(stream_id) → (bytes, headers) binary OGG for a completed session (use client.received_stream_id after ready).
- await client.get_sip_hooks() → SipHooksResponse from the runtime cache.
- await client.set_sip_hooks(hooks) → SipHooksResponse after replacing hooks with matching hosts (case-insensitive) and adding new ones.
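As a rough illustration of the REST helpers listed above, the sketch below awaits health() and get_voices() and counts the voices per provider. FakeClient is a hypothetical stand-in so the snippet runs without a server, and its return values (including the "OK" status string) are made up for the demo. In real use you would pass an actual client instance.

```python
import asyncio
from types import SimpleNamespace


class FakeClient:
    """Hypothetical stand-in for the SDK client; mimics only the two calls used below."""

    async def health(self):
        # The real call returns a HealthResponse with a status field.
        # The value "OK" here is an assumption for the demo.
        return SimpleNamespace(status="OK")

    async def get_voices(self):
        # The real call returns dict[str, list[VoiceDescriptor]] grouped by provider.
        # Plain strings stand in for VoiceDescriptor objects.
        return {"provider_a": ["voice-1", "voice-2"], "provider_b": ["voice-3"]}


async def summarize_service(client):
    """Return the health status and the number of voices per provider."""
    health = await client.health()
    voices = await client.get_voices()
    counts = {provider: len(items) for provider, items in voices.items()}
    return health.status, counts
```

Running asyncio.run(summarize_service(FakeClient())) exercises both calls without opening a WebSocket.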
WebSocket client
Constructing the client
The loading_audio argument supplies a loading-indicator clip (base64 WAV or raw 16-bit PCM) inside the initial config frame. It is only effective when without_audio=False and a livekit_config is provided. See Loading indicator below.
url accepts http(s) or ws(s); the client will open a WebSocket to Sayna and send the initial config when you call connect().
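To make the scheme handling concrete, here is a minimal sketch of how an http(s) or ws(s) URL could be mapped to a WebSocket URL. The helper to_websocket_url is hypothetical and is not the SDK's actual code; it only illustrates the idea that both URL families end up on a ws(s) connection.

```python
from urllib.parse import urlsplit, urlunsplit

# Map every accepted scheme to the WebSocket scheme that would be used.
_SCHEME_MAP = {"http": "ws", "https": "wss", "ws": "ws", "wss": "wss"}


def to_websocket_url(url: str) -> str:
    """Return url with its scheme rewritten to ws or wss (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in _SCHEME_MAP:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    # Keep host, path, query, and fragment untouched; only the scheme changes.
    return urlunsplit(parts._replace(scheme=_SCHEME_MAP[scheme]))
```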
Both STTConfig and TTSConfig accept an optional auth field for per-session provider credential overrides. See the WebSocket guide for supported shapes per provider.
Lifecycle and state
- await client.connect() establishes the socket and resolves after the ready message.
- await client.disconnect() closes the socket and cleans up.
- Connection state is exposed via client.ready and client.connected.
- LiveKit metadata after ready (when enabled): client.livekit_room_name, client.livekit_url, client.sayna_participant_identity, client.sayna_participant_name.
- Session identifier after ready: client.received_stream_id (server-generated UUID unless you provided stream_id).
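One way to make sure disconnect() always runs is to wrap the lifecycle calls in an async context manager. The sketch below is a suggestion, not part of the SDK: sayna_session is a hypothetical helper, and FakeClient is a stand-in whose attribute behavior (for example resetting ready on disconnect) is assumed for the demo.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def sayna_session(client):
    """Connect on entry and always disconnect on exit, even if the body raises."""
    await client.connect()  # per the docs, resolves after the ready message
    try:
        yield client
    finally:
        await client.disconnect()


class FakeClient:
    """Hypothetical stand-in mimicking the lifecycle attributes named above."""

    def __init__(self):
        self.ready = False
        self.connected = False
        self.received_stream_id = None

    async def connect(self):
        self.connected = True
        self.ready = True
        # Placeholder for the server-generated UUID.
        self.received_stream_id = "00000000-0000-0000-0000-000000000000"

    async def disconnect(self):
        self.connected = False
        self.ready = False


async def demo():
    client = FakeClient()
    async with sayna_session(client) as c:
        stream_id = c.received_stream_id
        was_ready = c.ready
    # After the block exits, the client has been disconnected.
    return stream_id, was_ready, client.connected
```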
Sending data
- await client.speak(text, flush=True, allow_interruption=True) to enqueue speech (the defaults clear the queue and allow interruption).
- await client.on_audio_input(audio_data: bytes) to stream raw audio for STT.
- await client.send_message(message, role, topic="messages", debug=None) to send data-channel messages.
- await client.clear() to clear the TTS queue.
- await client.tts_flush(allow_interruption=True) to flush the queue with an empty speak.
- await client.loading_start() to begin the loading-indicator loop on the dedicated loading-audio LiveKit track. Fire-and-forget and idempotent.
- await client.loading_stop() to stop the loop with a short fade-out. Fire-and-forget, idempotent, and always silent server-side.
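For streaming audio, a common pattern is to slice a PCM buffer into fixed-size chunks and pass each one to on_audio_input. The sketch below assumes a chunk size of 3200 bytes (100 ms of 16 kHz, 16-bit mono audio); that figure is an illustrative choice, not an SDK requirement. stream_pcm and RecordingClient are hypothetical names used only for this example.

```python
import asyncio


async def stream_pcm(client, pcm: bytes, chunk_bytes: int = 3200) -> int:
    """Send pcm to client.on_audio_input in fixed-size chunks; return the chunk count."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    sent = 0
    for start in range(0, len(pcm), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        await client.on_audio_input(pcm[start:start + chunk_bytes])
        sent += 1
    return sent


class RecordingClient:
    """Hypothetical stand-in that records what would be sent to the server."""

    def __init__(self):
        self.chunks = []

    async def on_audio_input(self, audio_data: bytes):
        self.chunks.append(audio_data)
```

With a live microphone you would typically pace the sends to real time; this sketch sends the chunks back to back.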
Receiving events
Register callbacks to wire in your agent logic:
- register_on_stt_result(cb) for transcription results.
- register_on_tts_audio(cb) for TTS audio buffers.
- register_on_error(cb) for error messages.
- register_on_message(cb) for participant messages.
- register_on_participant_disconnected(cb) when a participant leaves.
- register_on_tts_playback_complete(cb) when queued speech finishes.
Loading indicator
Construct a LoadingAudioConfig with a base64-encoded WAV (or raw 16-bit PCM) clip and pass it as the loading_audio keyword argument. The clip plays on its own dedicated loading-audio LiveKit track, independent of TTS. See the WebSocket guide for the full protocol contract and audio-content rules.
The client only checks that loading_audio is a LoadingAudioConfig instance and that data is non-empty. All audio-content rules (duration, sample rate, byte cap) are enforced by the server; failures arrive asynchronously via the register_on_error(cb) callback and the session stays alive.
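If you do not have a clip at hand, the standard library can produce a base64-encoded WAV for experimentation. The sketch below generates a short sine tone; the duration, sample rate, and frequency are arbitrary assumptions and may not satisfy the server's audio-content rules, so check the WebSocket guide before relying on them. make_loading_clip_b64 is a hypothetical helper; the resulting string is what you would supply as the data of a LoadingAudioConfig.

```python
import base64
import io
import math
import struct
import wave


def make_loading_clip_b64(duration_s: float = 0.5, sample_rate: int = 16000,
                          freq_hz: float = 440.0) -> str:
    """Return a base64-encoded mono 16-bit WAV containing a quiet sine tone."""
    n_frames = int(duration_s * sample_rate)
    # Build little-endian signed 16-bit samples at 20% of full scale.
    frames = b"".join(
        struct.pack("<h", int(0.2 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return base64.b64encode(buf.getvalue()).decode("ascii")
```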
SIP webhook receiver
WebhookReceiver verifies SIP webhooks with HMAC-SHA256, constant-time comparison, a 5-minute replay window, and strict validation.
- Constructor: WebhookReceiver(secret: str | None = None) uses the provided secret or SAYNA_WEBHOOK_SECRET.
- receive(headers, body) returns a typed WebhookSIPOutput (participant identity/sid/name, room name/sid, from/to phone numbers, room prefix, sip_host) or raises SaynaValidationError on failed verification.