@sayna-ai/node-sdk is the server-side toolkit for Sayna. It streams audio over WebSocket, sends and receives transcripts and TTS audio, calls REST endpoints, issues LiveKit tokens, manages SIP hooks, and verifies signed SIP webhooks. Everything is fully typed for TypeScript.

Install

npm install @sayna-ai/node-sdk

Quick start (realtime)

import { SaynaClient } from "@sayna-ai/node-sdk";

const client = new SaynaClient(
  "https://api.sayna.ai",
  { provider: "deepgram", model: "nova-2" },
  { provider: "cartesia", voice_id: "example-voice" }
);

client.registerOnSttResult(({ transcript }) => {
  console.log("Transcription:", transcript);
});

client.registerOnTtsAudio((audio) => {
  // audio is an ArrayBuffer chunk of synthesized TTS audio
});

await client.connect();
await client.speak("Hello, world!");
Pass an optional livekitConfig as the fourth constructor argument to have the server join a LiveKit room. Set the fifth argument, withoutAudio, to true when you want data-channel control without audio streaming.

REST API (no WebSocket required)

  • await client.health() → { status: string }: health check.
  • await client.getVoices() → Record<string, Voice[]>: voice catalog grouped by provider.
  • await client.speakRest(text, ttsConfig) → ArrayBuffer: audio synthesized over HTTP. Useful when you do not need an open socket.
  • await client.getLiveKitToken(roomName, participantName, participantIdentity) → { token, livekit_url, room_name, participant_identity }: for joining LiveKit from the browser or another client.
  • await client.getRecording(streamId) → ArrayBuffer: OGG recording for a completed session (use client.streamId once the client is ready).
  • await client.getSipHooks() → current SIP webhook hooks from the runtime cache.
  • await client.setSipHooks(hooks) → upserts SIP webhook hooks (matching hosts are replaced, new hosts are added).
const audioBuffer = await client.speakRest("Hello, world!", {
  provider: "elevenlabs",
  voice_id: "21m00Tcm4TlvDq8ikWAM",
  model: "eleven_turbo_v2",
  speaking_rate: 1.0,
  audio_format: "mp3",
  sample_rate: 24000,
});

SIP hooks (REST)

Manage SIP webhook forwarding via REST:
  • getSipHooks() returns { hooks: SipHook[] }.
  • setSipHooks(hooks: SipHook[]) replaces hooks with matching host (case-insensitive) and adds new hosts, then returns the merged list.
SipHook shape:
field | type | description
host | string | SIP domain pattern to match.
url | string | HTTPS destination for webhook events.
const current = await client.getSipHooks();
console.log(current.hooks);

const updated = await client.setSipHooks([
  { host: "example.com", url: "https://webhook.example.com/events" },
  { host: "another.com", url: "https://webhook.another.com/events" },
]);

console.log("Total hooks configured:", updated.hooks.length);
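
The upsert behaviour described for setSipHooks (matching hosts replaced case-insensitively, new hosts added) can be pictured with a small local function. This is only a sketch of the merge rule as documented here: the real merge happens on the server, mergeSipHooks is a hypothetical name, and the ordering of the returned list is an assumption.

```typescript
// Shape taken from the SipHook fields listed above.
interface SipHook {
  host: string;
  url: string;
}

// Hypothetical local model of the server-side upsert; not part of the SDK.
function mergeSipHooks(existing: SipHook[], incoming: SipHook[]): SipHook[] {
  const merged = new Map<string, SipHook>();
  // Key by lower-cased host so "Example.com" and "example.com" collide.
  for (const hook of existing) merged.set(hook.host.toLowerCase(), hook);
  // Incoming hooks overwrite existing entries with the same host; new hosts are appended.
  for (const hook of incoming) merged.set(hook.host.toLowerCase(), hook);
  return Array.from(merged.values());
}
```

Because the merge is an upsert, hooks for hosts that are absent from the incoming list are left in place rather than deleted.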

WebSocket client

Constructing the client

new SaynaClient(url, sttConfig?, ttsConfig?, livekitConfig?, withoutAudio?, apiKey?, streamId?)
argument | purpose
url | Sayna server URL (http(s) or ws(s)).
sttConfig (STTConfig) | Default speech-to-text provider config sent on connect.
ttsConfig (TTSConfig) | Default text-to-speech provider config.
livekitConfig (LiveKitConfig?) | Optional LiveKit room configuration.
withoutAudio (boolean) | Disable audio streaming when you only need data-channel control (default false).
apiKey (string?) | Optional API key for HTTP + WebSocket auth (falls back to SAYNA_API_KEY).
streamId (string?) | Optional session identifier for recording paths; server generates one when omitted.

Lifecycle and state

  • await client.connect() to establish the socket and send the initial config. Resolves after the server sends a ready message.
  • client.ready and client.connected expose readiness and connection state.
  • await client.disconnect() to close the socket and clean up.
  • LiveKit metadata becomes available when enabled: client.livekitRoomName, client.livekitUrl, client.saynaParticipantIdentity, client.saynaParticipantName.
  • Session identifier: client.streamId mirrors the stream_id returned in the ready message (or the value you provided).
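
Since connect() and disconnect() bracket every session, it can help to pair them so the socket is closed even when agent logic throws. The following is a minimal sketch, not part of the SDK: withSession and SessionLike are hypothetical names, and the interface covers only the two lifecycle methods listed above.

```typescript
// Structural subset of the lifecycle methods described above. Any object with
// these two methods fits, which keeps the sketch independent of the SDK types.
interface SessionLike {
  connect(): Promise<unknown>;
  disconnect(): Promise<unknown>;
}

// Hypothetical helper: run `work` between connect() and disconnect(),
// closing the socket even if `work` throws.
async function withSession<C extends SessionLike, R>(
  client: C,
  work: (client: C) => Promise<R>,
): Promise<R> {
  await client.connect(); // per the docs, resolves after the server's ready message
  try {
    return await work(client);
  } finally {
    await client.disconnect();
  }
}
```

With a real client this would be called as withSession(client, async (c) => { await c.speak("Hello, world!"); }).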

Sending data

  • await client.speak(text, flush?, allowInterruption?) to enqueue speech (flush and allowInterruption both default to true).
  • await client.onAudioInput(audioData) to stream raw audio for STT.
  • await client.sendMessage(message, role, topic?, debug?) to send data-channel messages.
  • await client.clear() to clear the TTS queue.
  • await client.ttsFlush(allowInterruption?) to flush the queue with an empty speak.
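
When the audio to transcribe is already in memory, one way to feed onAudioInput is to split it into fixed-size frames and send them in order. The sketch below makes several assumptions: that onAudioInput accepts an ArrayBuffer, that 640-byte frames are suitable (20 ms of 16 kHz 16-bit mono PCM, which may not match your STT configuration), and the helper names splitFrames, streamAudio, and AudioSink are hypothetical.

```typescript
// Structural stand-in for the client; assumes onAudioInput accepts an ArrayBuffer.
interface AudioSink {
  onAudioInput(audioData: ArrayBuffer): Promise<unknown>;
}

// Split raw audio into frames of `frameBytes` bytes; the last frame may be shorter.
function splitFrames(audio: ArrayBuffer, frameBytes: number): ArrayBuffer[] {
  if (!Number.isInteger(frameBytes) || frameBytes <= 0) {
    throw new RangeError("frameBytes must be a positive integer");
  }
  const frames: ArrayBuffer[] = [];
  for (let offset = 0; offset < audio.byteLength; offset += frameBytes) {
    frames.push(audio.slice(offset, offset + frameBytes)); // slice clamps at the end
  }
  return frames;
}

// Send frames sequentially so they arrive in order; returns the number of frames sent.
async function streamAudio(sink: AudioSink, audio: ArrayBuffer, frameBytes = 640): Promise<number> {
  const frames = splitFrames(audio, frameBytes);
  for (const frame of frames) {
    await sink.onAudioInput(frame);
  }
  return frames.length;
}
```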

Receiving events

Register callbacks to integrate with your agent logic:
  • registerOnSttResult(cb) for transcription results.
  • registerOnTtsAudio(cb) for streaming TTS audio buffers.
  • registerOnError(cb) for error messages.
  • registerOnMessage(cb) for participant messages.
  • registerOnParticipantDisconnected(cb) when a participant leaves.
  • registerOnTtsPlaybackComplete(cb) when queued speech finishes.
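
TTS audio arrives through the callback as a series of buffers, so code that needs a whole utterance has to collect and join them. A minimal sketch, assuming each callback invocation receives an ArrayBuffer as in the quick start; concatBuffers is a hypothetical helper, and the commented wiring shows one possible way to combine it with the callbacks listed above.

```typescript
// Join several ArrayBuffers into one contiguous ArrayBuffer, preserving order.
function concatBuffers(chunks: ArrayBuffer[]): ArrayBuffer {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(new Uint8Array(chunk), offset);
    offset += chunk.byteLength;
  }
  return out.buffer;
}

// Possible wiring (commented out because it needs a connected client):
//
//   const chunks: ArrayBuffer[] = [];
//   client.registerOnTtsAudio((audio) => chunks.push(audio));
//   client.registerOnTtsPlaybackComplete(() => {
//     const utterance = concatBuffers(chunks.splice(0)); // take and reset the queue
//     // hand `utterance` to your playback or storage code
//   });
```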

SIP webhook receiver

Use WebhookReceiver to validate SIP webhooks cryptographically (HMAC-SHA256, constant-time comparison, 5-minute replay window, and strict payload validation).
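import express from "express";
import { WebhookReceiver } from "@sayna-ai/node-sdk";

const app = express();
const receiver = new WebhookReceiver("your-secret-key-min-16-chars"); // or rely on SAYNA_WEBHOOK_SECRET

app.post(
  "/webhook",
  express.json({
    // Keep the raw body alongside the parsed JSON; receive() is given the raw string.
    verify: (req, _res, buf) => ((req as any).rawBody = buf.toString("utf8")),
  }),
  (req, res) => {
    try {
      // Throws if validation fails.
      const webhook = receiver.receive(req.headers, (req as any).rawBody);
      // webhook is a typed WebhookSIPOutput; process it here before acknowledging.
      res.status(200).json({ received: true });
    } catch (err: any) {
      res.status(401).json({ error: err.message });
    }
  }
);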
  • Constructor: new WebhookReceiver(secret?) uses the provided secret or SAYNA_WEBHOOK_SECRET.
  • receiver.receive(headers, body) verifies the signature and returns a typed WebhookSIPOutput containing participant, room, phone numbers, room prefix, and SIP host. It throws SaynaValidationError if validation fails.
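
The checks attributed to WebhookReceiver above (HMAC-SHA256, constant-time comparison, a 5-minute replay window) can be illustrated with Node's crypto module. This is a generic sketch of the technique, not the SDK's implementation: the signed string format (timestamp, a dot, then the body), the hex encoding, and the function names are all assumptions, and the actual header names and wire format are not covered here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const REPLAY_WINDOW_MS = 5 * 60 * 1000; // 5-minute window, as described above

// Assumed scheme (illustrative only): hex(HMAC-SHA256(secret, `${timestampMs}.${body}`)).
function signPayload(secret: string, timestampMs: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestampMs}.${body}`).digest("hex");
}

function verifyPayload(
  secret: string,
  timestampMs: number,
  body: string,
  signatureHex: string,
  nowMs: number = Date.now(),
): boolean {
  // Reject timestamps outside the replay window (too old or too far in the future).
  if (!Number.isFinite(timestampMs) || Math.abs(nowMs - timestampMs) > REPLAY_WINDOW_MS) {
    return false;
  }
  const expected = Buffer.from(signPayload(secret, timestampMs, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(expected, received);
}
```

Comparing with timingSafeEqual rather than === avoids leaking, through response timing, how many leading bytes of a forged signature were correct.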