Goal: Create a voice AI agent, connect it to live audio (web or telephony), and capture transcriptions.

Step 1 — Get your API key
- Sign in to the AIVOCO Playground: https://playground.aivoco.com
- Open Dashboard → API Keys and generate (or copy) your API key.
- Keep the key secure — do not publish it.
You will use this API key to authenticate both HTTP requests and WebSocket connections.
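One way to keep the key out of source code is to read it from an environment variable. The sketch below is a minimal illustration, not an AIVOCO convention: the variable name `AIVOCO_API_KEY` and both helper names are assumptions made for this example.

```python
import os


def load_api_key(env_var: str = "AIVOCO_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name fits your deployment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Mask a key for log output, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask_key(load_api_key())` instead of the raw key lets you confirm which key is loaded without exposing it.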
Step 2 — Create your Agent
Create a new Speech-to-Speech agent using the Create Agent docs.
- Follow: Create Agent
- Provide a `name`, a `system_message` that defines the agent persona, and a `voice` (`boy` or `girl`).
- Optionally add `functions` to enable function-calling behavior from the agent.
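The agent fields listed above can be assembled and checked before they are sent. This is a minimal sketch: the field names come from this step, but the helper `build_agent_payload` is hypothetical, the shape of each `functions` entry is left opaque, and the endpoint and HTTP method are whatever the Create Agent docs specify.

```python
import json
from typing import Any, Optional

# The two voice values named in this step.
ALLOWED_VOICES = {"boy", "girl"}


def build_agent_payload(
    name: str,
    system_message: str,
    voice: str,
    functions: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Validate the agent fields and return them as a JSON-ready dict."""
    if not name.strip():
        raise ValueError("name must not be empty")
    if not system_message.strip():
        raise ValueError("system_message must not be empty")
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"voice must be one of {sorted(ALLOWED_VOICES)}")

    payload: dict[str, Any] = {
        "name": name,
        "system_message": system_message,
        "voice": voice,
    }
    # functions is optional; its entry schema is not defined in this guide,
    # so entries are passed through unchanged.
    if functions:
        payload["functions"] = functions
    return payload


if __name__ == "__main__":
    body = build_agent_payload(
        "Support Bot", "You are a friendly support agent.", "girl"
    )
    print(json.dumps(body, indent=2))
```

Validating locally catches an empty name or an unsupported voice before any request is made.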
Step 3 — Connect via WebSocket (Web or Telephony)
Use our WebSocket endpoint to stream audio bidirectionally with ROSE. The mechanism is the same whether you connect from a browser app or bridge an external telephony provider. In the WebSocket URL, replace {API_KEY} and {AGENT_ID} with your values from Step 1 and Step 2.
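Filling in the two placeholders can be sketched as below. The URL template itself is not reproduced in this guide, so the hypothetical helper `fill_ws_url` takes it as an argument; the `example.invalid` URL in the demo is a stand-in, not the real endpoint.

```python
from urllib.parse import quote


def fill_ws_url(template: str, api_key: str, agent_id: str) -> str:
    """Substitute {API_KEY} and {AGENT_ID} in a WebSocket URL template.

    Values are percent-encoded so that characters such as '/' or spaces
    cannot break the URL.
    """
    for placeholder in ("{API_KEY}", "{AGENT_ID}"):
        if placeholder not in template:
            raise ValueError(f"template is missing {placeholder}")
    return (
        template
        .replace("{API_KEY}", quote(api_key, safe=""))
        .replace("{AGENT_ID}", quote(agent_id, safe=""))
    )


if __name__ == "__main__":
    # Placeholder template for illustration only; copy the real one from the docs.
    demo_template = "wss://example.invalid/ws?key={API_KEY}&agent={AGENT_ID}"
    print(fill_ws_url(demo_template, "my-key", "my-agent"))
```

Because the finished URL contains the API key, avoid writing it to logs.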
Once connected you can:
- Send audio frames (client → ROSE)
- Receive synthesized audio responses (ROSE → client)
- Maintain session context for multi-turn conversations
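Sending audio frames usually means cutting a captured buffer into fixed-size chunks first. The sketch below shows only that chunking step. This guide does not state the audio format ROSE expects, so the 16 kHz, 16-bit mono PCM, 20 ms values are assumptions for illustration; check the WebSocket docs for the real ones.

```python
from typing import Iterator

# Assumed audio format (not taken from the AIVOCO docs).
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM
FRAME_MS = 20


def frame_size_bytes(
    sample_rate_hz: int = SAMPLE_RATE_HZ,
    bytes_per_sample: int = BYTES_PER_SAMPLE,
    frame_ms: int = FRAME_MS,
) -> int:
    """Number of bytes in one frame of the given duration."""
    return sample_rate_hz * bytes_per_sample * frame_ms // 1000


def split_into_frames(pcm: bytes, frame_bytes: int) -> Iterator[bytes]:
    """Yield fixed-size frames, zero-padding the final one with silence."""
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    for start in range(0, len(pcm), frame_bytes):
        chunk = pcm[start:start + frame_bytes]
        yield chunk.ljust(frame_bytes, b"\x00")
```

Each yielded frame would then be passed to whatever WebSocket client you use, one message per frame.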
Transcription (After the Call)
After a call ends, you can obtain a transcription using the Transcription endpoint (see the Transcription docs). The typical flow is:
- Make your call (web or telephony) to the agent connected to ROSE.
- Note the `call_id` returned or available in your call logs.
- Use the Transcription API to fetch the text transcript for that `call_id`.
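The last step of this flow can be sketched as building an HTTP request for a given `call_id`. Everything specific here is an assumption: the endpoint URL is passed in by the caller (take it from the Transcription docs), and the `call_id` query parameter, the GET method, and the `Bearer` authorization header are placeholders for whatever those docs define.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_transcript_request(endpoint_url: str, call_id: str, api_key: str) -> Request:
    """Build (but do not send) a request for one call's transcript."""
    if not call_id:
        raise ValueError("call_id must not be empty")
    query = urlencode({"call_id": call_id})  # assumed parameter name
    separator = "&" if "?" in endpoint_url else "?"
    return Request(
        f"{endpoint_url}{separator}{query}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",  # assumed HTTP method
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; keeping construction separate makes the request easy to inspect before any network call.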
Tips & Notes
- Use the Playground to test agents interactively before wiring production telephony.
- If your telephony provider isn’t supported out of the box, contact us at [email protected].
- Protect your API key — treat it like a password.
© 2025 AIVOCO | ROSE – Real-Time Speech-to-Speech Intelligence