
Welcome to AIVOCO’s Speech-to-Speech API

Meet ROSE, AIVOCO’s real-time speech-to-speech model that brings voice AI agents to life.
Whether you’re building telecalling agents, AI receptionists, or human-like sales agents, ROSE lets your applications listen, understand, and talk back, all through a natural, continuous speech pipeline.
Our APIs and WebSocket infrastructure make it simple to:
  • Create and manage voice agents
  • Connect them to telephony systems or web apps
  • Enable real-time bi-directional audio streaming
  • Transcribe and analyze your calls instantly

Quick Start

Get started in just a few minutes with our three-step process.

Start here

Follow our Quickstart Guide to create your first Speech-to-Speech agent.

What You Can Build

With AIVOCO’s Speech-to-Speech platform, you can bring real-time conversational AI to your business or app.

Voice AI Sales Agents

Build voice-based agents that can make outbound calls, qualify leads, and close deals autonomously.

Telecalling Agents

Automate customer support or follow-up calls with natural-sounding speech-to-speech interactions.

In-App Voice Assistants

Add human-like conversational capabilities inside your product or website with our WebSocket API.

Real-Time Transcription

Get instant transcripts and conversation analytics from live or recorded calls.

How It Works

The AIVOCO Speech-to-Speech flow is designed to be modular and real-time:
  1. Create an Agent — Define your agent’s name, voice, and system behavior via the /agents API.
  2. Connect via WebSocket — Use our secure endpoint to establish a live speech stream between your app or telephony system and ROSE.
  3. Integrate Telephony — Connect with Twilio, Exotel, or any SIP trunking provider using our wss://call.aivoco.on.cloud.vispark.in/ws/{api_key}/{agent_id} WebSocket.
  4. Transcribe Conversations — Retrieve full transcripts and insights from the Transcription endpoint using your call ID.
These steps are covered in detail in the Quickstart Guide.
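
As a minimal sketch of steps 2 and 3, the snippet below fills in the {api_key} and {agent_id} segments of the WebSocket URL quoted in the list above. The host and path pattern come from this page; the helper name build_stream_url and the percent-encoding of each segment are illustrative assumptions, not part of the AIVOCO API.

```python
from urllib.parse import quote

# Host and path pattern are taken from the "How It Works" list on this page.
WS_BASE = "wss://call.aivoco.on.cloud.vispark.in/ws"


def build_stream_url(api_key: str, agent_id: str) -> str:
    """Fill the {api_key} and {agent_id} path segments of the streaming URL.

    Hypothetical helper for illustration only.
    """
    if not api_key or not agent_id:
        raise ValueError("api_key and agent_id must both be non-empty")
    # Assumption: percent-encode each segment so that a stray character
    # such as "/" cannot change the shape of the path.
    return f"{WS_BASE}/{quote(api_key, safe='')}/{quote(agent_id, safe='')}"


if __name__ == "__main__":
    print(build_stream_url("YOUR_API_KEY", "YOUR_AGENT_ID"))
```

The resulting string is what a WebSocket client or a telephony provider's media-stream configuration would be pointed at. Because the API key is part of the URL, it is worth treating the full URL as a secret and keeping it out of logs.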

Authentication

Every API request requires authentication via an API key. To generate your key:
  • Log in to your AIVOCO Playground
  • Go to your dashboard → API Keys
  • Copy your key and include it in the X-API-Key header, or in the {api_key} segment of the WebSocket URL path
Example:
X-API-Key: YOUR_API_KEY
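
The header above could be attached to a request roughly as follows. This is a sketch using only the Python standard library: the X-API-Key header name and the /agents path come from this page, while API_BASE is a placeholder (the REST base URL is not given here) and the helper names are hypothetical. The request is built but not sent.

```python
import urllib.request

# Placeholder only: this page does not state the REST base URL.
API_BASE = "https://api.example.invalid"


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the authentication header described on this page."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return {"X-API-Key": api_key}


def build_agents_request(api_key: str) -> urllib.request.Request:
    """Build (without sending) an authenticated request to the /agents path.

    The HTTP method and body for /agents are not specified on this page,
    so none are set here.
    """
    return urllib.request.Request(
        f"{API_BASE}/agents", headers=auth_headers(api_key)
    )


if __name__ == "__main__":
    req = build_agents_request("YOUR_API_KEY")
    print(req.full_url)
    # urllib stores header names capitalized ("X-api-key");
    # HTTP header names are case-insensitive, so this is equivalent.
    print(req.header_items())
```

Reading the key from an environment variable or a secrets manager, rather than hard-coding it, keeps it out of source control.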

Supported Integrations

AIVOCO supports multiple channels for connecting your agents:

Telephony (Twilio, Exotel, Telnyx)

Connect your ROSE agent to real phone calls using standard telephony providers.

Web Applications

Stream voice interactions directly in your browser or app using WebSocket APIs.

Custom Integrations

Extend your agents with function calling and real-world data (e.g., weather, CRM updates).

Analytics and Transcription

Convert any voice interaction into structured data using our transcription service.

Need Help?

If your telephony provider or integration type isn’t available, we’re happy to help. 📧 Contact us: vansh@aivoco.com
© 2025 AIVOCO | ROSE – Real-Time Speech-to-Speech Intelligence