REST endpoints
Synthesize speech
POST
Handler for the /speak endpoint
Kick off one-shot synthesis jobs for short responses. This endpoint reuses the same provider layer and cache that power the streaming WebSocket experience.
Reuse identical tts_config inputs to hit the cache and avoid extra provider round-trips.
Authorizations
Authentication token for protected endpoints. Can be provided as Authorization: Bearer <token> or ?api_key=<token>. Required when AUTH_REQUIRED is enabled.
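The two ways of passing the token described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names and the `localhost:3001` host in the usage note are hypothetical and not part of the documented API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_bearer_header(headers: dict, token: str) -> dict:
    """Return a copy of `headers` carrying `Authorization: Bearer <token>`."""
    out = dict(headers)
    out["Authorization"] = f"Bearer {token}"
    return out


def with_api_key_query(url: str, token: str) -> str:
    """Return `url` with `api_key=<token>` appended to its query string."""
    parts = urlsplit(url)
    # Preserve any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_key", token))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `with_api_key_query("http://localhost:3001/speak", "abc")` yields `http://localhost:3001/speak?api_key=abc`. The header form is usually preferable where possible, since query strings tend to end up in access logs.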
Body
application/json
Request body for the speak endpoint
The text to synthesize
Example:
"Hello, world!"
TTS configuration, including an optional provider auth override.
Response
Audio generated successfully
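A request to this endpoint might look like the sketch below. Several details are assumptions, since this page does not spell them out: the JSON field names `text` and `tts_config`, the contents of `tts_config`, the base URL, and the idea that a successful response body is the raw audio. Only the `/speak` path, the POST method, the `application/json` body, and the Bearer token form come from the text above.

```python
import json
import urllib.request
from typing import Optional


def build_speak_request(base_url: str, text: str, tts_config: dict,
                        token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /speak endpoint."""
    # Assumed field names: "text" and "tts_config".
    body = json.dumps({"text": text, "tts_config": tts_config}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + "/speak"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def synthesize(base_url: str, text: str, tts_config: dict,
               token: Optional[str] = None, timeout: float = 30.0) -> bytes:
    """Send the request and return the response body (assumed to be audio bytes)."""
    request = build_speak_request(base_url, text, tts_config, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping a single `tts_config` dictionary and passing it unchanged on every call is one simple way to send the identical inputs that the cache relies on.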
LiveKit token
Generates a LiveKit JWT token for a participant to join a specific room.
When authentication is enabled (`auth.id` is present), this handler:
1. Creates the room if it doesn't exist
2. Sets `room.metadata.auth_id` to the authenticated tenant's ID
3. Issues the token only after metadata is verified/set
# Arguments
* `state` - Shared application state containing LiveKit configuration
* `request` - Token request with room name and participant details
# Returns
* `Response` - JSON response with token or error status
# Errors
* 400 Bad Request - Invalid request data (empty fields)
* 403 Forbidden - Room exists with a different tenant's `auth_id`
* 500 Internal Server Error - LiveKit service not configured, room creation failed, or token generation failed
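The tenant check described above can be modeled in a few lines. This is a simplified in-memory sketch, not the handler's actual code: the `Room` class, the `rooms` dictionary, and the function name are hypothetical stand-ins for the LiveKit room service. It also assumes that an existing room without an `auth_id` is claimed by the caller, which the text does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Room:
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


def check_room_access(rooms: Dict[str, Room], room_name: str,
                      auth_id: Optional[str]) -> int:
    """Return the HTTP status the tenant check would produce before a token is issued."""
    if not room_name:
        return 400  # empty fields are rejected
    if auth_id is None:
        return 200  # authentication disabled: no tenant check is described
    room = rooms.get(room_name)
    if room is None:
        # Step 1: create the room if it doesn't exist.
        room = Room(room_name)
        rooms[room_name] = room
    owner = room.metadata.get("auth_id")
    if owner is not None and owner != auth_id:
        return 403  # room belongs to a different tenant
    # Step 2: record the tenant; step 3 (issuing the token) would follow.
    room.metadata["auth_id"] = auth_id
    return 200
```

In this model, a second tenant asking for a token to a room that another tenant already owns receives 403, while the owner can request further tokens for the same room.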