Client and server SDKs for real-time voice streaming in AI agents
This repository contains the official Sayna SDKs for adding voice capabilities to existing AI agents. Sayna provides a unified voice streaming layer that handles speech-to-text (STT) and text-to-speech (TTS), so agents do not have to manage real-time voice pipelines themselves.
Sayna enables AI agents to support natural voice conversations by providing:
- Real-time audio streaming over WebSockets
- Multi-provider support for STT (Deepgram, Google) and TTS (ElevenLabs, Google, Deepgram)
- Optional LiveKit integration for multi-participant voice rooms
- Low-latency voice pipeline management
- Type-safe client libraries
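As a rough illustration of what real-time audio streaming involves, the sketch below splits raw PCM audio into short fixed-duration frames, which is the usual unit sent as binary WebSocket messages. It does not use the Sayna SDKs: the function name `pcm_frames`, the 16 kHz 16-bit mono format, and the 20 ms frame size are all assumptions made for the example. See each SDK's README for the actual client API.

```python
from typing import Iterator


def pcm_frames(
    pcm: bytes,
    sample_rate: int = 16000,  # assumed sample rate in Hz
    frame_ms: int = 20,        # assumed frame duration in milliseconds
    sample_width: int = 2,     # bytes per sample (16-bit PCM)
    channels: int = 1,         # mono
) -> Iterator[bytes]:
    """Yield fixed-duration frames from raw PCM; the last frame may be shorter."""
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for offset in range(0, len(pcm), frame_bytes):
        yield pcm[offset:offset + frame_bytes]


# One second of silence at 16 kHz, 16-bit mono is 32000 bytes.
one_second = bytes(32000)
frames = list(pcm_frames(one_second))
```

Smaller frames reduce the delay before the first audio reaches the server at the cost of more messages per second; 20 ms is a common compromise in voice applications, but the right value depends on the provider and transport in use.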
This monorepo contains three SDKs:
- JavaScript SDK - Browser-based client for connecting to Sayna voice rooms
- Node.js SDK - Server-side SDK for Node.js applications
- Python SDK - Async Python SDK for server-side voice streaming
Each SDK has its own README with detailed installation instructions and API documentation.
Repository layout:
saysdk/
├── js-sdk/ # Browser JavaScript SDK
├── node-sdk/ # Node.js server SDK
├── python-sdk/ # Python server SDK
├── api-reference.md # Complete API documentation
└── README.md # This file
Documentation:
- API Reference - Complete REST and WebSocket API documentation
- JavaScript SDK - Browser client documentation
- Node.js SDK - Node.js SDK documentation
- Python SDK - Python SDK documentation
Typical use cases:
- Voice-enabled chatbots and conversational AI
- AI-powered phone systems and telephony applications
- Multi-participant voice rooms with AI agents
- Real-time transcription services
- Voice synthesis and speech generation
Licensed under the Apache License 2.0; see LICENSE for details.