@sayna-ai/node-sdk is the server-side toolkit for Sayna. It streams audio over WebSocket, sends and receives transcripts and TTS audio, calls REST endpoints, issues LiveKit tokens, manages SIP hooks, and verifies signed SIP webhooks. Everything is fully typed for TypeScript.

Install

npm install @sayna-ai/node-sdk

Quick start (realtime)

import { SaynaClient } from "@sayna-ai/node-sdk";

const client = new SaynaClient(
  "https://api.sayna.ai",
  { provider: "deepgram", model: "nova-2" },
  { provider: "cartesia", voice_id: "example-voice" }
);

client.registerOnSttResult(({ transcript }) => {
  console.log("Transcription:", transcript);
});

client.registerOnTtsAudio((audio) => {
  // audio is an ArrayBuffer chunk of synthesized TTS audio
});

await client.connect();
await client.speak("Hello, world!");
Pass an optional livekitConfig as the fourth constructor argument to have the server join a LiveKit room. Set the fifth argument, withoutAudio, to true when you want data-channel control without audio streaming.

REST API (no WebSocket required)

  • await client.health() → { status: string } health check.
  • await client.getVoices() → Record<string, Voice[]> voice catalog grouped by provider.
  • await client.speakRest(text, ttsConfig) → ArrayBuffer audio synthesized over HTTP. Useful when you do not need an open socket.
  • await client.getLiveKitToken(roomName, participantName, participantIdentity) → { token, livekit_url, room_name, participant_identity } for joining LiveKit from the browser or another client.
  • await client.getRecording(streamId) → ArrayBuffer OGG recording for a completed session (use client.streamId after ready).
  • await client.getSipHooks() → current SIP webhook hooks from the runtime cache.
  • await client.setSipHooks(hooks) → upserts SIP webhook hooks (matching hosts are replaced, new hosts are added).
const audioBuffer = await client.speakRest("Hello, world!", {
  provider: "elevenlabs",
  voice_id: "21m00Tcm4TlvDq8ikWAM",
  model: "eleven_turbo_v2",
  speaking_rate: 1.0,
  audio_format: "mp3",
  sample_rate: 24000,
});
// With per-request provider credentials
const audioBufferWithAuth = await client.speakRest("Hello, world!", {
  provider: "elevenlabs",
  voice_id: "21m00Tcm4TlvDq8ikWAM",
  model: "eleven_turbo_v2",
  auth: { api_key: "your-elevenlabs-key" },
});
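Both speakRest and getRecording resolve to an ArrayBuffer, so a common next step is writing the bytes to disk. The following is a minimal sketch using only Node's standard library; saveAudio is a hypothetical helper name, not part of the SDK.

```typescript
import { writeFile } from "node:fs/promises";

// Hypothetical helper: persist an ArrayBuffer (for example, the value returned
// by speakRest or getRecording) and report how many bytes were written.
async function saveAudio(path: string, audio: ArrayBuffer): Promise<number> {
  // Buffer.from(ArrayBuffer) creates a view over the same memory, not a copy.
  const bytes = Buffer.from(audio);
  await writeFile(path, bytes);
  return bytes.byteLength;
}
```

With the SDK this might be used as `await saveAudio("hello.mp3", audioBuffer)`. Choose the file extension to match the audio_format you requested.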

SIP hooks (REST)

Manage SIP webhook forwarding via REST:
  • getSipHooks() returns { hooks: SipHook[] }.
  • setSipHooks(hooks: SipHook[]) replaces hooks with matching host (case-insensitive) and adds new hosts, then returns the merged list.
SipHook shape:
field | type | description
host | string | SIP domain pattern to match.
url | string | HTTPS destination for webhook events.
const current = await client.getSipHooks();
console.log(current.hooks);

const updated = await client.setSipHooks([
  { host: "example.com", url: "https://webhook.example.com/events" },
  { host: "another.com", url: "https://webhook.another.com/events" },
]);

console.log("Total hooks configured:", updated.hooks.length);
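The upsert rule described above (replace on a case-insensitive host match, add otherwise) can be modelled locally, which is handy for predicting the merged list in tests. This is a sketch of the documented behaviour only. mergeSipHooks is a hypothetical name, the SipHook interface is redeclared here to keep the block self-contained, and the ordering of the server's merged list is an assumption.

```typescript
interface SipHook {
  host: string;
  url: string;
}

// Local model of the documented upsert: an incoming hook replaces an existing
// one when hosts match case-insensitively; unmatched hosts are appended.
function mergeSipHooks(existing: SipHook[], incoming: SipHook[]): SipHook[] {
  const byHost = new Map<string, SipHook>();
  for (const hook of existing) byHost.set(hook.host.toLowerCase(), hook);
  for (const hook of incoming) byHost.set(hook.host.toLowerCase(), hook);
  // A Map keeps first-insertion order, so replaced hosts keep their position.
  return Array.from(byHost.values());
}
```

The server remains the source of truth. Treat the list returned by setSipHooks as authoritative.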

WebSocket client

Constructing the client

new SaynaClient(url, sttConfig?, ttsConfig?, livekitConfig?, withoutAudio?, apiKey?, streamId?, loadingAudio?)
argument | purpose
url | Sayna server URL (http(s) or ws(s)).
sttConfig (STTConfig) | Default speech-to-text provider config sent on connect.
ttsConfig (TTSConfig) | Default text-to-speech provider config.
livekitConfig (LiveKitConfig?) | Optional LiveKit room configuration.
withoutAudio (boolean) | Disable audio streaming when you only need data-channel control (default false).
apiKey (string?) | Optional API key for HTTP + WebSocket auth (falls back to SAYNA_API_KEY).
streamId (string?) | Optional session identifier for recording paths; server generates one when omitted.
loadingAudio (LoadingAudioConfig?) | Optional loading-indicator clip (base64 WAV or raw PCM) sent inside the initial config frame. Only effective when withoutAudio=false and a LiveKit room is configured. See Loading indicator.
Both STTConfig and TTSConfig accept an optional auth field for per-session provider credential overrides. See the WebSocket guide for supported shapes per provider.
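With eight positional arguments it is easy to misplace a trailing undefined. One option is a small adapter that maps a named options object onto the positional order listed above. This is a sketch: SaynaOptions and toSaynaArgs are hypothetical names, and the config types are loose placeholders standing in for the SDK's own STTConfig, TTSConfig, LiveKitConfig, and LoadingAudioConfig.

```typescript
// Placeholder for the SDK's config types (assumption for this sketch).
type Config = Record<string, unknown>;

interface SaynaOptions {
  url: string;
  sttConfig?: Config;
  ttsConfig?: Config;
  livekitConfig?: Config;
  withoutAudio?: boolean;
  apiKey?: string;
  streamId?: string;
  loadingAudio?: Config;
}

type SaynaArgs = [string, Config?, Config?, Config?, boolean?, string?, string?, Config?];

// Maps named options onto the documented positional constructor order.
function toSaynaArgs(o: SaynaOptions): SaynaArgs {
  return [
    o.url,
    o.sttConfig,
    o.ttsConfig,
    o.livekitConfig,
    o.withoutAudio ?? false, // documented default
    o.apiKey,
    o.streamId,
    o.loadingAudio,
  ];
}
```

In application code the tuple could be spread into the constructor, for example `new SaynaClient(...toSaynaArgs(options))`, after swapping the placeholder types for the SDK's real ones so the call type-checks.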

Lifecycle and state

  • await client.connect() to establish the socket and send the initial config. Resolves after the server sends a ready message.
  • client.ready and client.connected expose readiness and connection state.
  • await client.disconnect() to close the socket and clean up.
  • LiveKit metadata becomes available when enabled: client.livekitRoomName, client.livekitUrl, client.saynaParticipantIdentity, client.saynaParticipantName.
  • Session identifier: client.streamId mirrors the stream_id returned in the ready message (or the value you provided).
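Because connect() and disconnect() bracket the session, wrapping them in try/finally ensures the socket is closed even when your own logic throws. The sketch below is typed against a minimal structural interface rather than the SDK class, so it runs on its own. withSession and SessionClient are hypothetical names.

```typescript
// Minimal structural view of the client; only the two lifecycle methods
// documented above are assumed.
interface SessionClient {
  connect(): Promise<unknown>;
  disconnect(): Promise<unknown>;
}

// Connects, runs the callback, and always disconnects afterwards.
async function withSession<C extends SessionClient, T>(
  client: C,
  work: (client: C) => Promise<T>
): Promise<T> {
  await client.connect();
  try {
    return await work(client);
  } finally {
    await client.disconnect();
  }
}
```

A call such as `await withSession(client, (c) => c.speak("Hello, world!"))` would then clean up the socket on both success and failure.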

Sending data

  • await client.speak(text, flush?, allowInterruption?) to enqueue speech (defaults flush and interruption to true).
  • await client.onAudioInput(audioData) to stream raw audio for STT.
  • await client.sendMessage(message, role, topic?, debug?) to send data-channel messages.
  • await client.clear() to clear the TTS queue.
  • await client.ttsFlush(allowInterruption?) to flush the queue with an empty speak.
  • client.loadingStart() to begin the loading-indicator audio loop on the dedicated loading-audio LiveKit track. Fire-and-forget and idempotent.
  • client.loadingStop() to stop the loop with a short fade-out. Fire-and-forget, idempotent, and always silent server-side.
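When feeding onAudioInput from a larger buffer, it usually helps to split the audio into small fixed-size frames first. The sketch below shows only the chunking step. chunkAudio is a hypothetical helper, and the 640-byte frame in the usage comment (20 ms of 16 kHz, 16-bit mono PCM) is an assumed example rather than an SDK requirement.

```typescript
// Hypothetical helper: split raw audio bytes into fixed-size frames.
// The final frame may be shorter than chunkBytes.
function chunkAudio(audio: Uint8Array, chunkBytes: number): Uint8Array[] {
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.byteLength; offset += chunkBytes) {
    // subarray returns a view (no copy) and clamps the end to the buffer length.
    chunks.push(audio.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}

// Assumed usage with the SDK (check the input type onAudioInput expects):
//   for (const frame of chunkAudio(pcm, 640)) await client.onAudioInput(frame);
```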

Receiving events

Register callbacks to integrate with your agent logic:
  • registerOnSttResult(cb) for transcription results.
  • registerOnTtsAudio(cb) for streaming TTS audio buffers.
  • registerOnError(cb) for error messages.
  • registerOnMessage(cb) for participant messages.
  • registerOnParticipantDisconnected(cb) when a participant leaves.
  • registerOnTtsPlaybackComplete(cb) when queued speech finishes.
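As one illustration of wiring a callback into agent logic, the sketch below accumulates transcripts so they can be handed to an LLM later. It relies only on the transcript field shown in the quick start. SttSource and collectTranscripts are hypothetical names, and any other fields on the result object are deliberately ignored.

```typescript
// Minimal structural view of the client for this sketch.
interface SttSource {
  registerOnSttResult(cb: (result: { transcript: string }) => void): void;
}

// Registers a callback that stores non-empty transcripts and returns a
// function that yields a snapshot of everything collected so far.
function collectTranscripts(source: SttSource): () => string[] {
  const transcripts: string[] = [];
  source.registerOnSttResult(({ transcript }) => {
    const text = transcript.trim();
    if (text.length > 0) transcripts.push(text);
  });
  return () => transcripts.slice();
}
```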

Loading indicator

Supply a base64-encoded WAV (or raw 16-bit PCM) clip at construction time, then start and stop the loop around your application’s background work. The clip plays on its own dedicated loading-audio LiveKit track, independent of TTS. See the WebSocket guide for the full protocol contract and audio-content rules.
import { readFile } from "node:fs/promises";
import { SaynaClient } from "@sayna-ai/node-sdk";

const data = (await readFile("./loading.wav")).toString("base64");

const client = new SaynaClient(
  "https://api.sayna.ai",
  { provider: "deepgram", model: "nova-2" },
  { provider: "cartesia", voice_id: "example-voice" },
  { room_name: "my-room" },
  false,        // withoutAudio
  undefined,    // apiKey (falls back to SAYNA_API_KEY)
  undefined,    // streamId
  { data, format: "wav", volume: 0.3 }
);

await client.connect();

// User finishes speaking; start the loop while the app thinks.
client.loadingStart();
// …call your LLM / tools…
client.loadingStop();      // IMPORTANT: stop the loop before speak()
await client.speak("Here is what I found.");
The SDK validates only that loadingAudio.data is a non-empty base64 string and that format (if supplied) is "wav" or "pcm". All audio-content rules (duration, sample rate, byte cap) are enforced by the server; failures arrive asynchronously via the existing registerOnError(cb) callback and the session stays alive.
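Since the loop should be stopped before speak(), and background work can fail, a try/finally wrapper keeps the start and stop calls paired. This is a sketch against a minimal structural interface. withLoading and LoadingControls are hypothetical names, and it relies on the documented behaviour that both calls are fire-and-forget and idempotent.

```typescript
// Minimal structural view of the client's loading-indicator controls.
interface LoadingControls {
  loadingStart(): void;
  loadingStop(): void;
}

// Plays the loading loop while `work` runs and stops it on success or failure.
async function withLoading<T>(
  client: LoadingControls,
  work: () => Promise<T>
): Promise<T> {
  client.loadingStart();
  try {
    return await work();
  } finally {
    client.loadingStop();
  }
}
```

A typical call might look like `const answer = await withLoading(client, () => callLlm(prompt))` followed by `await client.speak(answer)`, where callLlm stands in for your own application code.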

SIP webhook receiver

Use WebhookReceiver to validate SIP webhooks cryptographically (HMAC-SHA256, constant-time comparison, 5-minute replay window, and strict payload validation).
import express from "express";
import { WebhookReceiver } from "@sayna-ai/node-sdk";

const app = express();
const receiver = new WebhookReceiver("your-secret-key-min-16-chars"); // or rely on SAYNA_WEBHOOK_SECRET

app.post(
  "/webhook",
  express.json({
    // Capture the raw body before JSON parsing; receive() verifies it as sent.
    verify: (req, _res, buf) => ((req as any).rawBody = buf.toString("utf8")),
  }),
  (req, res) => {
    try {
      // Throws if validation fails.
      const webhook = receiver.receive(req.headers, (req as any).rawBody);
      // Hand the verified webhook to your application logic here.
      res.status(200).json({ received: true });
    } catch (err: any) {
      res.status(401).json({ error: err.message });
    }
  }
);
  • Constructor: new WebhookReceiver(secret?) uses the provided secret or SAYNA_WEBHOOK_SECRET.
  • receiver.receive(headers, body) verifies the signature and returns a typed WebhookSIPOutput containing participant, room, phone numbers, room prefix, and SIP host. It throws SaynaValidationError if validation fails.
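To make the checks named above concrete, here is a standalone sketch of HMAC-SHA256 verification with a constant-time comparison and a 5-minute replay window, using Node's crypto module. It is not the SDK's implementation. The signed string (`timestamp.rawBody`), the hex encoding, and the function names are all assumptions for illustration; in practice, use WebhookReceiver.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const REPLAY_WINDOW_SECONDS = 5 * 60;

// Assumed scheme for this sketch: hex HMAC-SHA256 over `${timestamp}.${rawBody}`.
function signPayload(secret: string, timestamp: number, rawBody: string): string {
  return createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest("hex");
}

function verifySignature(
  secret: string,
  timestamp: number, // Unix seconds claimed by the sender
  rawBody: string,
  signatureHex: string,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): boolean {
  // Reject requests outside the replay window, in either direction.
  if (!Number.isFinite(timestamp)) return false;
  if (Math.abs(nowSeconds - timestamp) > REPLAY_WINDOW_SECONDS) return false;

  const expected = Buffer.from(signPayload(secret, timestamp, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The comparison runs on decoded bytes rather than strings so that timingSafeEqual can be used, which avoids leaking how many leading characters of a forged signature were correct.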