Transcription API Documentation
Endpoint: /transcribe
Method: POST
Authentication: API Key (required)
Features
- Transcribe audio files through the /transcribe endpoint
- Supports multiple input methods (base64, URL, or call log ID)
- Automatic speaker identification (AI vs User)
- Handles credit and unit-based billing
- Admin mode for enterprise use
Request Format
Base Request
Note: Exactly one of audio_data, audio_url, or call_id must be provided.
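The "exactly one input" rule can be checked on the client before a request is sent. The following is a minimal sketch: the field names come from the note above, while the helper name `build_payload` is hypothetical and the server's own validation behavior is not documented here.

```python
# Field names taken from the note above; the helper itself is illustrative.
INPUT_FIELDS = ("audio_data", "audio_url", "call_id")


def build_payload(**fields):
    """Return a payload dict, enforcing that exactly one input field is set."""
    provided = [name for name in INPUT_FIELDS if fields.get(name) is not None]
    if len(provided) != 1:
        raise ValueError(
            "exactly one of audio_data, audio_url, or call_id must be provided; "
            f"got {provided or 'none'}"
        )
    name = provided[0]
    return {name: fields[name]}
```

Calling `build_payload(audio_url="https://example.com/call.wav")` returns a one-key dict, while passing two inputs or none raises `ValueError`.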
Input Methods
Method 1: Base64 Audio Data
Upload audio directly as base64-encoded data.
Supported audio types:
- audio/wav
- audio/mp3
- audio/mp4 (m4a)
- audio/ogg
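Preparing the audio_data value can be sketched as below. This assumes the field carries a plain base64 string of the file bytes; whether the API expects a data URI prefix or a separate content-type field is not stated in this document, so treat that as an open assumption.

```python
import base64


def encode_audio(raw_bytes: bytes) -> str:
    """Base64-encode raw audio bytes into an ASCII string."""
    return base64.b64encode(raw_bytes).decode("ascii")


# In practice raw_bytes would come from a file, e.g. open(path, "rb").read().
# Assumption: the encoded string goes directly into the audio_data field.
payload = {"audio_data": encode_audio(b"example audio bytes")}
```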
Method 2: Audio URL
Provide a direct URL to the audio file.
Method 3: Log ID
Use an existing call log ID to transcribe a previous call recording.
- The log must belong to the authenticated user
- The log must have a valid recording URL
Response Format
Success Response (200 OK)
| Field | Description |
|---|---|
| transcription | Full conversation text with speaker tags |
| user_credits_deducted | Credits used for this transcription |
| timestamp | Time of transcription completion (ISO format) |
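Reading the three documented fields from a success response might look like the sketch below. It assumes the response body is a JSON object and that the timestamp is in a form `datetime.fromisoformat` accepts; the sample values are invented for illustration only.

```python
import json
from datetime import datetime


def parse_success(body: str) -> dict:
    """Extract the documented fields from a 200 OK response body."""
    data = json.loads(body)
    return {
        "transcription": data["transcription"],
        "user_credits_deducted": data["user_credits_deducted"],
        # Note: a trailing "Z" is only accepted by fromisoformat on Python 3.11+.
        "timestamp": datetime.fromisoformat(data["timestamp"]),
    }


# Hypothetical response body, not real API output.
sample_body = json.dumps({
    "transcription": "AI: Hello.\nUser: Hi.",
    "user_credits_deducted": 1,
    "timestamp": "2025-01-01T12:00:00+00:00",
})
result = parse_success(sample_body)
```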
Error Responses
400 Bad Request
401 Unauthorized
402 Payment Required
404 Not Found
429 Too Many Requests
500 Internal Server Error
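A client can branch on these status codes as in the following sketch. Treating 429 and 500 as retryable and the remaining codes as permanent failures is a common client convention and an assumption here; this document does not specify retry behavior or the shape of error bodies.

```python
# Status codes listed above, with their standard HTTP reason phrases.
ERROR_STATUSES = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Payment Required",
    404: "Not Found",
    429: "Too Many Requests",
    500: "Internal Server Error",
}

# Assumption: only rate-limit and server errors are worth retrying.
RETRYABLE = {429, 500}


def classify(status: int) -> str:
    """Return 'ok', 'retry', 'fail', or 'unknown' for a response status."""
    if status == 200:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    if status in ERROR_STATUSES:
        return "fail"
    return "unknown"
```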
Constraints
| Constraint | Limit |
|---|---|
| Minimum duration | 1 second |
| Maximum duration | 30 minutes |
| Rate limit | 60 requests/min per IP |
| Authentication | Required via API key |
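The duration limits can be checked locally before uploading. The sketch below measures a WAV file with Python's standard `wave` module; it assumes the limits are inclusive, which the table does not state, and it covers WAV only.

```python
import io
import wave

MIN_SECONDS = 1          # minimum duration from the table above
MAX_SECONDS = 30 * 60    # maximum duration: 30 minutes


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Compute duration of in-memory WAV data as frames / sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def within_duration_limits(seconds: float) -> bool:
    """Check a duration against the limits (inclusive bounds assumed)."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

Other formats would need a decoder or metadata parser to measure duration, which is outside this sketch.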
Example Usage
cURL Example
Python Example
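The following is a minimal sketch of building a request with the standard library, not an official client. The base URL `https://api.example.com`, the `X-API-Key` header name, and the JSON request body are all assumptions, since this document does not specify them; only the /transcribe path, the POST method, and the audio_url field come from the text above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one
API_KEY_HEADER = "X-API-Key"          # assumed header name, not confirmed by this document


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to /transcribe."""
    return urllib.request.Request(
        url=f"{BASE_URL}/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


request = build_request("YOUR_API_KEY", {"audio_url": "https://example.com/call.wav"})

# To send it:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```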
JavaScript Example
Transcription Format
Example output with speaker labeling:
- Clear AI: and User: labels
- Chronological conversation flow
- Marks unclear sections with [unclear]
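Splitting a labeled transcript into speaker turns could be sketched as below. It assumes each turn starts on its own line with an AI: or User: prefix; the exact layout of the transcription string is not shown in this document, and the sample text is invented.

```python
SPEAKERS = ("AI", "User")


def split_turns(transcription: str) -> list[tuple[str, str]]:
    """Split a transcript into (speaker, text) pairs in chronological order."""
    turns = []
    for line in transcription.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() in SPEAKERS:
            turns.append((speaker.strip(), text.strip()))
        elif turns and line.strip():
            # Unlabeled line: treat it as a continuation of the previous turn.
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, f"{prev_text} {line.strip()}")
    return turns


# Hypothetical transcript, not real API output.
sample = "AI: Hello, how can I help?\nUser: I need [unclear] my order."
turns = split_turns(sample)
```

The [unclear] marker is left in the text untouched, so callers can decide how to handle low-confidence sections.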
Notes
- Audio files are processed in-memory and not stored
- Duration estimation is accurate for WAV and approximate for other formats
- Uses the Vispark Vision (small) model for transcription
- Supports multilingual input
© 2025 AIvoco | Transcription | Speech-to-Text Intelligence