A production text-to-speech REST API that converts any text into natural-sounding audio in 30+ languages. Describe the voice you want in plain English, with no voice library to browse and no reference file needed, or upload a short audio clip to clone a speaker. Voice design costs $0.05 per request (up to 500 characters), half ElevenLabs' Starter rate of $0.10 per 500 characters. New accounts get 500 free credits, no credit card required.
Sign up, copy your key from the dashboard, and POST your text. Include a voice_description to design the voice on the fly, or attach a voice_ref audio file to clone an existing speaker. Poll GET /v1/tts/status/{id} until status=completed, then download from output_url.
# Voice Design — describe the voice you want
curl -X POST https://api.pixelapi.dev/v1/tts/generate \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "text=Hello, welcome to our platform." \
-F "language=en" \
-F "voice_description=young woman, warm and friendly"
# Response: {"id": "uuid", "status": "queued", "credits_used": 50}
# Poll until completed
curl https://api.pixelapi.dev/v1/tts/status/UUID \
-H "Authorization: Bearer YOUR_API_KEY"
# Response: {"status": "completed", "output_url": "https://..."}
# Voice Cloning — upload a reference audio file
curl -X POST https://api.pixelapi.dev/v1/tts/generate \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "text=Hello, welcome to our platform." \
-F "language=en" \
-F "voice_ref=@reference.wav"
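For readers not using an SDK, the poll-until-done step shown in the cURL commands can be sketched in Python with only the standard library. This is a minimal sketch, not the official client: the function names, the 2-second poll interval, and the 120-second timeout are assumptions chosen for illustration. The endpoint, the completed and failed status values, and the output_url and error fields are the ones described on this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.pixelapi.dev/v1/tts"


def fetch_status(job_id, api_key):
    """GET /v1/tts/status/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/status/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_audio(job_id, api_key, fetch=fetch_status,
                   interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the job completes and return its output_url.

    Raises RuntimeError if the job fails and TimeoutError if it does not
    finish within `timeout` seconds (interval and timeout are assumed values).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id, api_key)
        if job["status"] == "completed":
            return job["output_url"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(interval)
```

The fetch and sleep parameters are injectable so the loop can be exercised without network access; in normal use, wait_for_audio(job_id, api_key) is enough.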
pip install pixelapi
---
from pixelapi import PixelAPI
client = PixelAPI(api_key="YOUR_API_KEY")
# Voice Design — describe the voice
result = client.text_to_speech(
text="Hello, welcome to our platform.",
language="en",
voice_description="young woman, warm and friendly",
)
result.save("output.mp3")
# Voice Cloning — upload a reference audio clip
result = client.text_to_speech(
text="Hello, welcome to our platform.",
language="en",
voice_ref="reference.wav", # WAV or MP3, min 5s, max 10MB
)
result.save("cloned_output.mp3")
npm install pixelapi
---
import { PixelAPI } from "pixelapi";
const client = new PixelAPI({ apiKey: process.env.PIXELAPI_KEY });
// Voice Design
const result = await client.textToSpeech({
text: "Hello, welcome to our platform.",
language: "en",
voiceDescription: "young woman, warm and friendly",
});
await result.save("output.mp3");
// Voice Cloning
const cloned = await client.textToSpeech({
text: "Hello, welcome to our platform.",
language: "en",
voiceRef: "./reference.wav",
});
await cloned.save("cloned_output.mp3");
composer require pixelapi/pixelapi
---
<?php
require __DIR__ . "/vendor/autoload.php"; // Composer autoloader
use PixelAPI\Client;
$client = new Client(getenv("PIXELAPI_KEY"));
// Voice Design
$result = $client->textToSpeech([
"text" => "Hello, welcome to our platform.",
"language" => "en",
"voice_description"=> "young woman, warm and friendly",
]);
file_put_contents("output.mp3", $result->getBody());
// Voice Cloning
$cloned = $client->textToSpeech([
"text" => "Hello, welcome to our platform.",
"language" => "en",
"voice_ref" => new \CURLFile("reference.wav"),
]);
file_put_contents("cloned_output.mp3", $cloned->getBody());
gem install pixelapi
---
require "pixelapi"
client = PixelAPI::Client.new(api_key: ENV["PIXELAPI_KEY"])
# Voice Design
result = client.text_to_speech(
text: "Hello, welcome to our platform.",
language: "en",
voice_description: "young woman, warm and friendly"
)
File.binwrite("output.mp3", result.body)
# Voice Cloning
cloned = client.text_to_speech(
text: "Hello, welcome to our platform.",
language: "en",
voice_ref: File.open("reference.wav")
)
File.binwrite("cloned_output.mp3", cloned.body)
go get github.com/pixelapi/pixelapi-go
---
package main

import "github.com/pixelapi/pixelapi-go"

func main() {
	client := pixelapi.New("YOUR_API_KEY")

	// Voice Design
	result, err := client.TextToSpeech(pixelapi.TTSParams{
		Text:             "Hello, welcome to our platform.",
		Language:         "en",
		VoiceDescription: "young woman, warm and friendly",
	})
	if err != nil {
		panic(err)
	}
	result.Save("output.mp3")

	// Voice Cloning
	cloned, err := client.TextToSpeechClone(pixelapi.TTSCloneParams{
		Text:     "Hello, welcome to our platform.",
		Language: "en",
		VoiceRef: "reference.wav",
	})
	if err != nil {
		panic(err)
	}
	cloned.Save("cloned_output.mp3")
}
| Provider | Free tier | Voice design | Voice cloning | Languages | No subscription |
|---|---|---|---|---|---|
| PixelAPI | 500 credits, no card | $0.05 / request | $0.10 / request | 30+ | ✓ |
| ElevenLabs | 10k chars/month | $0.10 / 500 chars (Starter at $6/mo) | $0.10 / 500 chars (Starter) | 30+ | — (subscription required) |
| OpenAI TTS-1 | Limited (API free tier) | $0.0075 / 500 chars ($15/1M chars) | — (preset voices only) | 57 | ✓ |
| OpenAI TTS-1-HD | Limited | $0.015 / 500 chars ($30/1M chars) | — (preset voices only) | 57 | ✓ |
| Play.ht | Limited trial | See play.ht/pricing (subscription required; contact for API) | See play.ht/pricing | 900+ | — |
| Murf.ai | 10 min/month | See murf.ai/pricing (subscription-based) | See murf.ai/pricing | 20+ | — |
Pricing verified against each provider's public pricing page in May 2026. ElevenLabs Starter: $6/month for 30,000 characters ($0.20/1k chars, $0.10 per 500 chars). OpenAI TTS-1: $15/1M chars ($0.0075 per 500 chars), preset voices only, with no voice design or cloning. PixelAPI's Voice Design price is set at exactly half ElevenLabs' Starter rate, per our pricing principle.
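The per-500-character figures in the table can be reproduced with a short calculation. The sketch below only normalizes the plan prices quoted on this page; the function and variable names are illustrative.

```python
def price_per_500_chars(total_usd, total_chars):
    """Normalize a plan or list price to the cost of 500 characters."""
    return total_usd / total_chars * 500


# Inputs are the prices quoted in the comparison above.
elevenlabs_starter = price_per_500_chars(6.00, 30_000)    # $6/month for 30,000 chars
openai_tts1 = price_per_500_chars(15.00, 1_000_000)       # $15 per 1M chars
openai_tts1_hd = price_per_500_chars(30.00, 1_000_000)    # $30 per 1M chars
pixelapi_design = 0.05                                    # flat, per request of up to 500 chars

for name, usd in [
    ("ElevenLabs Starter", elevenlabs_starter),
    ("OpenAI TTS-1", openai_tts1),
    ("OpenAI TTS-1-HD", openai_tts1_hd),
    ("PixelAPI voice design", pixelapi_design),
]:
    print(f"{name}: ${usd:.4f} per 500 chars")
```

Note that the PixelAPI figure is a flat per-request price, so the comparison is exact only for requests that use the full 500 characters.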
Describe the speaker in natural language — age, gender, tone, accent, pace. No voice library needed. Pass voice_description="calm narrator, British accent" and get a unique voice generated on the fly.
Upload a WAV or MP3 reference clip (5 seconds minimum, 10 MB max, 16 kHz or higher). The API replicates the speaker's timbre, cadence, and accent in the generated audio. No separate training step required.
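Checking a clip locally before uploading avoids a rejected request. The following is a hedged sketch using Python's standard wave module: check_reference_clip is a hypothetical helper, 10 MB is read as 10,000,000 bytes (the stricter interpretation, which is an assumption), and duration and sample rate are only inspected for WAV files because the standard library cannot parse MP3.

```python
import os
import wave

MAX_BYTES = 10 * 1000 * 1000   # assumption: "10 MB" read as 10,000,000 bytes
MIN_SECONDS = 5.0
MIN_SAMPLE_RATE = 16_000       # 16 kHz


def check_reference_clip(path):
    """Return a list of problems; an empty list means the clip looks acceptable."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    if path.lower().endswith(".wav"):
        with wave.open(path, "rb") as clip:
            rate = clip.getframerate()
            seconds = clip.getnframes() / rate
        if seconds < MIN_SECONDS:
            problems.append(f"clip is {seconds:.1f}s, shorter than 5s")
        if rate < MIN_SAMPLE_RATE:
            problems.append(f"sample rate is {rate} Hz, below 16 kHz")
    return problems
```

For example, check_reference_clip("reference.wav") returns an empty list when the file passes all three checks.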
English, Chinese, Hindi, Spanish, French, German, Japanese, Korean, Arabic, Portuguese, Russian, and 20+ more. Set language=auto to detect the language automatically from the input text.
The completed job returns an output_url to download the generated audio. The file is stored for 24 hours. Fetch it with any HTTP client or the SDK's .save() helper.
PixelAPI TTS is production-ready across these industries. Each link leads to a complete setup guide:
Generate chapter narration in consistent voices. Clone the author's voice or design a dedicated narrator persona.
Ad voiceovers, video narration, and social content at scale — without a studio booking.
Product description audio for accessibility and voice-commerce channels.
Dynamic hold messages, menu prompts, and notifications read in a consistent brand voice.
Native-sounding pronunciation samples in 30+ languages, generated on demand from lesson text.
Branded voice assistants and product spotlight narration.
Multi-language device prompts and on-product assistants.
Tutorial voiceovers, ingredient explanations, and brand stories.
In-car voice alerts, manuals, and dealership IVR prompts.
No-code TTS trigger — convert any new document, email, or CMS entry to audio automatically.
Drag-and-drop voice generation in Make scenarios.
Server-side audio generation for React apps. Stream or pre-generate on deploy.
Auto-generate product audio descriptions on publish for accessibility compliance.
CMS-triggered narration for blog posts and landing pages.
Bulk audio generation for product catalog accessibility.
Same voice design and cloning features. No monthly subscription. $0.05 vs $0.10 per 500-char request at Starter.
Fully self-serve pay-as-you-go vs Play.ht's subscription model and manual API approval.
REST API with voice cloning vs Murf's subscription studio. Instant signup, no sales call.
OpenAI TTS is cheaper per character but offers preset voices only. PixelAPI adds voice design and voice cloning.
No GCP billing account needed. Voice cloning unavailable on Google Cloud TTS.
Simpler auth (one API key vs Azure AD). Voice design without the Custom Neural Voice training cost.
Free tier: 60 requests/minute. Paid tiers: 600 requests/minute. Need higher throughput for batch audiobook generation or IVR? Contact PixelAPI by email with your expected volume.
Exceeding the rate limit returns HTTP 429 with a Retry-After header specifying how long to wait. Use exponential backoff — start at 2 seconds, double on each retry, cap at 30 seconds. The Python and Node SDKs apply this automatically.
# Python SDK — auto-retries on 429 with exponential backoff
from pixelapi import PixelAPI
client = PixelAPI(api_key="...", max_retries=4)
result = client.text_to_speech(
text="Chapter one of the audiobook.",
language="en",
voice_description="calm, measured narrator, British male",
)
result.save("chapter_01.mp3") # SDK handles 429 retries automatically
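Without the SDK, the same policy can be approximated by hand. The sketch below is an illustration under stated assumptions rather than the SDK's actual implementation: call_with_backoff and its send callback are hypothetical names, and Retry-After is assumed to carry a number of seconds rather than an HTTP date.

```python
import time


def call_with_backoff(send, max_retries=4, base=2.0, cap=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out.

    send() must return a (status_code, headers, body) tuple.
    Waits 2s, 4s, 8s, ... between attempts, capped at 30s, unless the
    server supplies Retry-After, which takes precedence.
    """
    delay = base
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_retries:
            return status, headers, body
        retry_after = headers.get("Retry-After")
        # Assumption: Retry-After is expressed in seconds.
        wait = float(retry_after) if retry_after is not None else delay
        sleep(wait)
        delay = min(delay * 2, cap)
```

Passing the request as a callback keeps the retry logic independent of the HTTP library; the caller wraps whatever client it already uses in a function that returns the status, headers, and body.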
If a TTS job fails during processing, status becomes failed and the error field explains why. Credits are automatically refunded for failed jobs — you never pay for a broken result.
Common error codes:
400: text is empty, or the reference audio is too large (>10 MB) or too short (<5 s)
402: insufficient credits; top up at /pricing
429: rate limit exceeded; respect the Retry-After header
503: queue full; credits are refunded, retry after a few seconds
PixelAPI is one of the most capable free text-to-speech APIs available: 500 credits on signup with no credit card, $0.05 per request after that, voice design in plain English, voice cloning from a short audio clip, and 30+ languages, all pay-as-you-go with no subscription lock-in.
POST your text to https://api.pixelapi.dev/v1/tts/generate with your API key, the text you want spoken, and an optional voice_description. You get a job id back; poll GET /v1/tts/status/{id} until status=completed, then download the audio from output_url. The Quick Start section above has copy-paste code in cURL, Python, Node, PHP, Ruby, and Go.
Yes. Upload a WAV or MP3 reference clip via the voice_ref field — minimum 5 seconds, maximum 10 MB, 16 kHz or higher sample rate. The API replicates the speaker's timbre, cadence, and accent. Voice cloning costs $0.10 per request (100 credits), compared to ElevenLabs' ~$0.10 per 500 chars on their Starter subscription.
30+ languages: English, Chinese (Mandarin), Hindi, Spanish, French, German, Japanese, Korean, Russian, Arabic, Portuguese, Italian, Dutch, Polish, Turkish, Vietnamese, Thai, Indonesian, Malay, Bengali, Tamil, Telugu, Marathi, Ukrainian, Swedish, Norwegian, Danish, Finnish, Greek, Hebrew, Swahili. Set language=auto to detect automatically.
Voice design lets you describe the voice you want in plain natural language — for example "middle-aged man, calm and authoritative" or "young woman, warm and gentle, slight Southern US accent". Pass this string as the voice_description field. No voice library to browse, no reference file needed. This is particularly useful for maintaining a consistent branded voice across all your content.
Voice design (no reference file): $0.05 per request for up to 500 characters. Voice cloning (with reference audio): $0.10 per request. New accounts receive 500 free credits — 10 voice design requests — with no credit card. There is no monthly minimum or subscription. Top up credits via the dashboard at any time.
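The credit arithmetic above can be checked with a few lines of Python. The value of $0.001 per credit is inferred from the figures on this page ($0.05 for 50 credits, $0.10 for 100 credits) rather than stated outright, and the function names are illustrative.

```python
CREDITS_PER_REQUEST = {"design": 50, "clone": 100}
CREDITS_PER_USD = 1000   # inferred: 50 credits = $0.05, so 1 credit = $0.001
FREE_CREDITS = 500


def requests_covered(credits, kind):
    """How many whole requests of the given kind a credit balance pays for."""
    return credits // CREDITS_PER_REQUEST[kind]


def cost_usd(design_requests=0, clone_requests=0):
    """Total price in USD for a mix of voice design and voice cloning requests."""
    credits = (design_requests * CREDITS_PER_REQUEST["design"]
               + clone_requests * CREDITS_PER_REQUEST["clone"])
    return credits / CREDITS_PER_USD


print(requests_covered(FREE_CREDITS, "design"))   # free voice design requests
print(requests_covered(FREE_CREDITS, "clone"))    # free voice cloning requests
print(cost_usd(design_requests=200, clone_requests=50))
```

Working in whole credits and converting to dollars only at the end avoids accumulating floating-point error across many requests.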
The API returns a signed URL to the generated audio file. See the full API docs at pixelapi.dev/docs for format options, file size expectations, and the URL expiry window.
Yes: pip install pixelapi. The SDK handles authentication, job polling, retry-on-429 with exponential backoff, and file download automatically. Official SDKs are also available for Node.js (npm install pixelapi), PHP (Composer), Ruby (Gem), and Go (go get github.com/pixelapi/pixelapi-go).
The default is 60 requests/minute on the free tier and 600 requests/minute on paid tiers. Exceeding the limit returns HTTP 429 with a Retry-After header. If you need higher throughput, for example for batch IVR generation or large audiobook pipelines, contact PixelAPI by email with your expected volume.
Failed jobs set status=failed and expose an error field with a description. Credits are automatically refunded for every failed job — you never pay for a result that didn't complete. HTTP 503 (queue full) also triggers an automatic refund; just retry after a few seconds.
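One way to act on the documented status codes is a small lookup that tells the caller what to do next. This is only a sketch: the action names and the next_action helper are invented for illustration, and the mapping covers just the four codes listed on this page.

```python
# Suggested client reaction per documented HTTP status code.
# The action names are illustrative, not part of the API.
ACTIONS = {
    400: "fix_request",  # empty text, or reference audio too large / too short
    402: "top_up",       # insufficient credits
    429: "retry",        # rate limited; wait as instructed by Retry-After
    503: "retry",        # queue full; credits refunded, retry after a few seconds
}


def next_action(status_code):
    """Map an HTTP status code to 'ok', a documented action, or 'raise'."""
    if 200 <= status_code < 300:
        return "ok"
    return ACTIONS.get(status_code, "raise")
```

Codes outside the documented set fall through to "raise" so that unexpected responses are surfaced instead of being retried silently.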