Outbound Telephony Integration (Twilio & Custom Providers)
Our speech-to-speech model can connect directly to telephony systems to make outbound calls. Using Twilio or your own telephony infrastructure, you can stream audio bi-directionally between a phone call and our ROSE WebSocket, enabling real-time, human-like voice AI interactions.
How It Works

Prerequisites
- A telephony provider account (Twilio, Telnyx, Exotel, or others)
- Your ROSE API Key and Agent ID from the AIVOCO Playground
Example: Outbound Call with Twilio
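The following is a minimal sketch of an outbound call with Twilio, not the official ROSE sample. It builds TwiML that opens a bidirectional Media Stream with `<Connect><Stream>` and then starts the call through Twilio's Python helper library. The stream URL is a placeholder, and passing `api_key` and `agent_id` as `<Parameter>` elements is an assumption about how ROSE authenticates; check the AIVOCO Playground for the actual endpoint and handshake.

```python
import xml.etree.ElementTree as ET

# Placeholder endpoint: the real ROSE WebSocket URL is not shown here.
ROSE_STREAM_URL = "wss://example.invalid/rose-stream"


def build_stream_twiml(stream_url: str, api_key: str, agent_id: str) -> str:
    """Return TwiML that opens a bidirectional Media Stream to stream_url."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", url=stream_url)
    # Twilio's <Stream> url does not accept query strings, so extra values
    # travel as <Parameter> elements. Whether ROSE reads them is an assumption.
    ET.SubElement(stream, "Parameter", name="api_key", value=api_key)
    ET.SubElement(stream, "Parameter", name="agent_id", value=agent_id)
    return ET.tostring(response, encoding="unicode")


def place_outbound_call(account_sid, auth_token, from_number, to_number, twiml):
    """Start the call with Twilio's helper library (pip install twilio)."""
    # Imported lazily so the TwiML helper above works without the package.
    from twilio.rest import Client

    client = Client(account_sid, auth_token)
    call = client.calls.create(to=to_number, from_=from_number, twiml=twiml)
    return call.sid
```

A typical use would be `place_outbound_call(sid, token, "+1...", "+1...", build_stream_twiml(ROSE_STREAM_URL, api_key, agent_id))`, with every argument replaced by your own values.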
Using Other Telephony Providers
Our ROSE WebSocket API works with any telephony provider that supports live media streaming via WebSockets. To connect your telephony platform:
- Generate your API Key and Agent ID from the AIVOCO Playground.
- Connect your telephony system to the WebSocket endpoint below:
- Send and receive live audio streams to and from ROSE using your provider’s streaming API.
- Once connected, the ROSE model will handle real-time speech understanding and generation.
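The send-and-receive step above amounts to running two relays at once: caller audio flowing to ROSE and ROSE audio flowing back to the caller. The sketch below illustrates that shape with `asyncio`. The queues stand in for the provider's media stream and the ROSE WebSocket, and the `None` end-of-stream sentinel is a convention chosen for this example. Real frame formats and message framing depend on your provider and on the ROSE protocol, neither of which is modeled here.

```python
import asyncio

END_OF_STREAM = None  # sentinel chosen for this sketch; tells a relay to stop


async def relay(source: asyncio.Queue, sink: asyncio.Queue) -> int:
    """Forward audio frames from source to sink until the sentinel arrives."""
    forwarded = 0
    while True:
        frame = await source.get()
        if frame is END_OF_STREAM:
            # Pass the sentinel on so the receiving side can stop too.
            await sink.put(END_OF_STREAM)
            return forwarded
        await sink.put(frame)
        forwarded += 1


async def bridge(caller_in, rose_out, rose_in, caller_out):
    """Run both directions concurrently so neither side blocks the other."""
    return await asyncio.gather(
        relay(caller_in, rose_out),  # caller speech -> ROSE
        relay(rose_in, caller_out),  # ROSE speech -> caller
    )
```

Running both relays under `asyncio.gather` keeps the call full duplex: the agent can keep speaking while new caller audio is still arriving.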
Key Parameters
| Parameter | Description |
|---|---|
| `api_key` | Your ROSE API key for authentication |
| `agent_id` | The agent you want to connect to |
| `websocket_url` | The live bidirectional audio stream endpoint |
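One way to keep these three values out of source code is to load them from the environment and validate them before dialing. The sketch below assumes hypothetical variable names (`ROSE_API_KEY`, `ROSE_AGENT_ID`, `ROSE_WEBSOCKET_URL`); they are not defined by ROSE, and the `RoseConnection` class is an illustration only.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class RoseConnection:
    api_key: str
    agent_id: str
    websocket_url: str

    def __post_init__(self):
        # Fail early on missing credentials or a non-WebSocket URL.
        if not self.api_key or not self.agent_id:
            raise ValueError("api_key and agent_id must be non-empty")
        if urlparse(self.websocket_url).scheme not in ("ws", "wss"):
            raise ValueError("websocket_url must start with ws:// or wss://")


def load_from_env(env=os.environ) -> RoseConnection:
    """Build the connection settings from environment variables."""
    # The variable names below are hypothetical, chosen for this sketch.
    return RoseConnection(
        api_key=env.get("ROSE_API_KEY", ""),
        agent_id=env.get("ROSE_AGENT_ID", ""),
        websocket_url=env.get("ROSE_WEBSOCKET_URL", ""),
    )
```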
Helpful Telephony Resources
- Twilio Media Streams Overview
- Twilio Outbound Calls API
- Exotel API Reference
- Plivo Voice API
- Telnyx Call Control API
Troubleshooting & Debugging
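Before working through the checks in this section, a quick TCP probe can rule out basic network problems with a WebSocket endpoint. The sketch below is an illustration using only the standard library. It confirms that the host and port accept connections; it does not perform a WebSocket handshake or verify that audio flows in both directions.

```python
import socket
from urllib.parse import urlparse


def endpoint_host_port(url: str) -> tuple[str, int]:
    """Extract host and port from a ws:// or wss:// URL."""
    parts = urlparse(url)
    if parts.scheme not in ("ws", "wss") or not parts.hostname:
        raise ValueError(f"not a WebSocket URL: {url!r}")
    # Default ports follow the scheme: 443 for wss, 80 for ws.
    default_port = 443 if parts.scheme == "wss" else 80
    return parts.hostname, parts.port or default_port


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint succeeds."""
    host, port = endpoint_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```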
1. Invalid credentials: Double-check your Twilio/telephony credentials and ROSE API key.
2. Call connects but no audio: Ensure your WebSocket endpoint is reachable and supports bi-directional audio.
3. Call not initiating: Your telephony number must be verified (for trial or restricted accounts).
4. Test locally: Use ngrok to tunnel your WebSocket or HTTP endpoints for local testing.
Next Steps
- Replace placeholders (`api_key`, `agent_id`, and telephony credentials) in your code.
- Run the script to initiate a live outbound call.
- Experience a real-time voice conversation with your ROSE Voice Agent.
© 2025 AIVOCO | ROSE — Real-Time Speech-to-Speech Intelligence