ElevenLabs Text to Speech (TTS)

The ElevenLabs TTS API converts text into natural-sounding speech using ElevenLabs' text-to-speech models. It offers two endpoints: one returns base64-encoded audio in a JSON response, the other streams binary audio. Both accept the same options for voice, model, speech speed, and output format.

Base URL: https://api.openmind.org

Authentication: An OpenMind API key is required. Send it in either the x-api-key header or the Authorization header.
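As a minimal sketch of the authentication step, the helper below builds the request headers in Python. The function name `build_headers` is hypothetical, and the `Bearer` scheme used for the Authorization variant is an assumption, since this page does not say how the key should be formatted in that header.

```python
def build_headers(api_key: str, use_authorization: bool = False) -> dict[str, str]:
    """Return JSON request headers that carry the OpenMind API key."""
    if not api_key:
        raise ValueError("an OpenMind API key is required")
    headers = {"Content-Type": "application/json"}
    if use_authorization:
        # Assumption: the Authorization header uses the Bearer scheme.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Same header as the curl example in this document.
        headers["x-api-key"] = api_key
    return headers
```

The x-api-key form matches the curl request shown on this page, so it is the safer default if the Authorization format is unclear.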

Endpoints Overview

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /elevenlabs/tts | Generate speech from text using ElevenLabs TTS |
| POST | /elevenlabs/tts/audio/speech | Stream speech from text using ElevenLabs TTS |

Generate Speech

Convert text to speech using the ElevenLabs TTS engine with customizable voice and output options.

Endpoint: POST /elevenlabs/tts

Request

curl -X POST https://api.openmind.org/elevenlabs/tts \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "input": "Hello, this is a test of the ElevenLabs text to speech API."
  }'

Request Body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| input | string | Yes | - | The text to convert to speech |
| voice | string or object | No | JBFqnCBsd6RMkjVDRZzb | ElevenLabs voice ID (string) or voice object |
| model | string | No | eleven_flash_v2_5 | ElevenLabs model ID to use for synthesis |
| response_format | string | No | mp3_44100_128 | Audio output format specification |
| speed | float | No | 1.0 | Speech speed multiplier (0.5 to 2.0) |
| elevenlabs_api_key | string | No | - | Optional ElevenLabs API key override |
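The request body fields can be assembled as in the following sketch. `build_tts_body` is a hypothetical helper, not part of any OpenMind SDK. It leaves out unset optional fields so the documented server-side defaults apply, and it checks `speed` against the documented 0.5 to 2.0 range before anything is sent.

```python
import json


def build_tts_body(text, voice=None, model=None, response_format=None, speed=None):
    """Build the JSON body for /elevenlabs/tts, omitting unset optional fields."""
    if not text:
        raise ValueError("'input' is required")
    body = {"input": text}
    if speed is not None:
        if not 0.5 <= speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        body["speed"] = float(speed)
    optional = {"voice": voice, "model": model, "response_format": response_format}
    # Only send the fields the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


# Serialized body for a request with a specific voice and a slightly faster rate.
payload = json.dumps(build_tts_body("Hello", voice="JBFqnCBsd6RMkjVDRZzb", speed=1.2))
```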

Response

Success (200 OK):

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| text | string | The original input text |
| response | string | Base64-encoded audio data ready for decoding and playback |
| format | string | Audio format of the returned data (e.g., "mp3_44100_128") |

Error Responses: See the Error Handling section below for status codes and common error scenarios.

Note: The returned audio is base64-encoded. You must decode it before playback or saving to a file.
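The decoding step might look like the sketch below. It assumes the parsed JSON response is a dictionary with the `response` field described in the table above. The helper names `decode_audio` and `save_audio` are illustrative only.

```python
import base64


def decode_audio(result: dict) -> bytes:
    """Decode the base64 `response` field of a /elevenlabs/tts reply into raw audio bytes."""
    # validate=True rejects non-base64 characters instead of silently skipping them.
    return base64.b64decode(result["response"], validate=True)


def save_audio(result: dict, path: str) -> int:
    """Write the decoded audio to `path` and return the number of bytes written."""
    audio = decode_audio(result)
    with open(path, "wb") as handle:
        return handle.write(audio)
```

Opening the file in binary mode (`"wb"`) matters here: text mode would try to encode the bytes and corrupt the audio.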

Stream Speech

Convert text to speech and stream the audio directly. This endpoint is ideal for real-time applications where low latency is critical.

Endpoint: POST /elevenlabs/tts/audio/speech

Request

The request body parameters are identical to those of the /elevenlabs/tts endpoint.

Response

Success (200 OK):

The response is a binary stream of the audio file.

Headers:

  • Content-Type: audio/mpeg for MP3 output; the value depends on the requested format
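Because the streaming endpoint returns raw audio rather than JSON, the body can be written to disk chunk by chunk with no base64 step. The sketch below uses only the Python standard library. `stream_tts_to_file`, the `opener` parameter (there so the network call can be swapped out in tests), and the 64 KiB chunk size are all illustrative choices, not part of the API.

```python
import json
import urllib.request

STREAM_URL = "https://api.openmind.org/elevenlabs/tts/audio/speech"


def stream_tts_to_file(api_key, text, path, opener=urllib.request.urlopen, chunk_size=64 * 1024):
    """POST `text` to the streaming endpoint and write the audio to `path` as it arrives."""
    request = urllib.request.Request(
        STREAM_URL,
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    written = 0
    with opener(request) as response, open(path, "wb") as out:
        # Read fixed-size chunks so the whole file never has to sit in memory.
        while chunk := response.read(chunk_size):
            out.write(chunk)
            written += len(chunk)
    return written
```

A call such as `stream_tts_to_file("YOUR_API_KEY", "Hello", "speech.mp3")` would write the stream to `speech.mp3`. With the default `urlopen`, HTTP error statuses raise `urllib.error.HTTPError`.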

Error Responses:

See Error Responses for /elevenlabs/tts.

Usage Examples

Basic Text-to-Speech

Convert simple text to speech using default settings:
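A Python counterpart to the curl request shown earlier could look like the following sketch. It sends only the required `input` field, so the documented defaults for voice, model, format, and speed apply. The function name `synthesize` and its `opener` parameter are hypothetical.

```python
import json
import urllib.request

TTS_URL = "https://api.openmind.org/elevenlabs/tts"


def synthesize(api_key, text, opener=urllib.request.urlopen):
    """POST `text` to /elevenlabs/tts and return the parsed JSON response."""
    request = urllib.request.Request(
        TTS_URL,
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    with opener(request) as response:
        # Per the Response Fields table, the result carries `text`, `response`, and `format`.
        return json.load(response)
```

The returned dictionary still holds base64 text in `response`, so it needs decoding before playback.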

Custom Voice and Speed

Use a specific voice with faster speech rate:

Full Configuration

Customize all available parameters:

Save Audio to File

Generate speech and save directly to an MP3 file:

With Environment Variables

Store your configuration in environment variables for easier management:
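One way to read that configuration in Python is sketched below. The variable names `OPENMIND_API_KEY`, `OPENMIND_BASE_URL`, and `ELEVENLABS_VOICE_ID` are assumptions chosen for illustration; this page does not prescribe any names. The fallback values are the base URL and default voice ID documented above.

```python
import os
from dataclasses import dataclass

DEFAULT_BASE_URL = "https://api.openmind.org"
DEFAULT_VOICE = "JBFqnCBsd6RMkjVDRZzb"


@dataclass(frozen=True)
class TTSConfig:
    api_key: str
    base_url: str = DEFAULT_BASE_URL
    voice: str = DEFAULT_VOICE


def load_config(env=None) -> TTSConfig:
    """Build a TTSConfig from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENMIND_API_KEY")
    if not api_key:
        raise RuntimeError("OPENMIND_API_KEY is not set")
    return TTSConfig(
        api_key=api_key,
        # Strip a trailing slash so endpoint paths can be appended directly.
        base_url=env.get("OPENMIND_BASE_URL", DEFAULT_BASE_URL).rstrip("/"),
        voice=env.get("ELEVENLABS_VOICE_ID", DEFAULT_VOICE),
    )
```

Keeping the key out of source code this way also keeps it out of version control.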

Stream to File

Stream the audio directly to a file using the streaming endpoint:

Voice Configuration

Default Voice

The default voice ID is JBFqnCBsd6RMkjVDRZzb. This voice provides clear, natural-sounding English speech suitable for most applications.

Custom Voices

You can use any ElevenLabs voice ID by specifying it in the voice parameter. Visit the ElevenLabs Voice Library to explore available voices.

Speed Control

The speed parameter accepts values between 0.5 (half speed) and 2.0 (double speed):

  • 0.5 - 50% slower (more deliberate)

  • 1.0 - Normal speed (default)

  • 1.5 - 50% faster

  • 2.0 - Double speed (maximum)

Output Formats

The default output format is mp3_44100_128. The response_format parameter allows you to specify other formats if needed.
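The default value appears to follow ElevenLabs' `codec_samplerate_bitrate` naming, so mp3_44100_128 would mean MP3 at 44.1 kHz and 128 kbps. The sketch below splits such a string into its parts, for example to pick a file extension. That reading of the format string is an assumption, as is the handling of two-part values with no bitrate. This page does not list which formats the endpoint accepts, and `parse_response_format` is a hypothetical helper.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]


def parse_response_format(value: str) -> AudioFormat:
    """Split a format string such as 'mp3_44100_128' into codec, sample rate, and bitrate."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unrecognized response_format: {value!r}")
    codec = parts[0]
    sample_rate = int(parts[1])
    # Assumption: two-part values carry no bitrate component.
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec, sample_rate, bitrate)
```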

Error Handling

All endpoints follow consistent error response patterns:

HTTP Status Codes

| Code | Description |
| --- | --- |
| 200 | Success - Audio generated successfully |
| 400 | Bad Request - Missing required fields, invalid JSON, or unsupported format |
| 503 | Service Unavailable - ElevenLabs API unavailable or not configured |
| 500 | Internal Server Error - Server-side processing error |
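A client might map these status codes to actions as in the sketch below. The descriptions come from the table above. Treating only 503 as retryable is an assumption of this example, not documented behavior. A retry helps only when the outage is temporary, not when the ElevenLabs key is simply not configured.

```python
STATUS_MEANINGS = {
    200: "Success: audio generated",
    400: "Bad Request: missing required fields, invalid JSON, or unsupported format",
    500: "Internal Server Error: server-side processing error",
    503: "Service Unavailable: ElevenLabs API unavailable or not configured",
}

# Assumption: only 503 is treated as possibly transient and worth retrying.
RETRYABLE = {503}


def classify_status(code: int) -> tuple[str, bool]:
    """Return (description, retryable) for an HTTP status code from the TTS endpoints."""
    description = STATUS_MEANINGS.get(code, "Unexpected status")
    return description, code in RETRYABLE
```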

Error Response Format

Common Error Scenarios

Missing Input Field:

API Key Not Configured: If the server-side ElevenLabs API key is not configured and you don't provide one in the request, you'll receive:

Connection Issues: If the service cannot reach the ElevenLabs API:

Best Practices

Audio Decoding

The API returns base64-encoded audio data. Always decode it before use:

Note: Keep the following in mind when using the ElevenLabs TTS API:

  • Audio responses are base64-encoded and must be decoded before playback

  • The ElevenLabs API key can be configured server-side or provided per-request

  • Default voice and model settings are optimized for English speech

  • Large text inputs may take longer to process
