REST API · Live

Best TTS API for Free — Voice Design & Voice Cloning

A production text-to-speech REST API that converts any text into natural-sounding audio in 30+ languages. Describe the voice you want in plain English — no voice library to browse, no reference file needed — or upload a short audio clip to clone any speaker. $0.05 per request, half the price of ElevenLabs. 500 free credits, no credit card.

$0.05 / request 30+ languages Voice design Voice cloning 500 free credits No subscription

Quick start — convert text to speech in one API call

Sign up, copy your key from the dashboard, and POST your text. Include a voice_description to design the voice on the fly, or attach a voice_ref audio file to clone an existing speaker. Poll GET /v1/tts/status/{id} until status=completed, then download from output_url.

cURL

# Voice Design — describe the voice you want
curl -X POST https://api.pixelapi.dev/v1/tts/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "text=Hello, welcome to our platform." \
  -F "language=en" \
  -F "voice_description=young woman, warm and friendly"
# Response: {"id": "uuid", "status": "queued", "credits_used": 50}

# Poll until completed
curl https://api.pixelapi.dev/v1/tts/status/UUID \
  -H "Authorization: Bearer YOUR_API_KEY"
# Response: {"status": "completed", "output_url": "https://..."}

# Voice Cloning — upload a reference audio file
curl -X POST https://api.pixelapi.dev/v1/tts/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "text=Hello, welcome to our platform." \
  -F "language=en" \
  -F "voice_ref=@reference.wav"

Python

pip install pixelapi

from pixelapi import PixelAPI

client = PixelAPI(api_key="YOUR_API_KEY")

# Voice Design — describe the voice
result = client.text_to_speech(
    text="Hello, welcome to our platform.",
    language="en",
    voice_description="young woman, warm and friendly",
)
result.save("output.mp3")

# Voice Cloning — upload a reference audio clip
result = client.text_to_speech(
    text="Hello, welcome to our platform.",
    language="en",
    voice_ref="reference.wav",  # WAV or MP3, min 5s, max 10MB
)
result.save("cloned_output.mp3")

Node.js

npm install pixelapi

import { PixelAPI } from "pixelapi";

const client = new PixelAPI({ apiKey: process.env.PIXELAPI_KEY });

// Voice Design
const result = await client.textToSpeech({
  text: "Hello, welcome to our platform.",
  language: "en",
  voiceDescription: "young woman, warm and friendly",
});
await result.save("output.mp3");

// Voice Cloning
const cloned = await client.textToSpeech({
  text: "Hello, welcome to our platform.",
  language: "en",
  voiceRef: "./reference.wav",
});
await cloned.save("cloned_output.mp3");
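
PHP

composer require pixelapi/pixelapi

<?php
// Load Composer's autoloader so the PixelAPI classes resolve
require __DIR__ . "/vendor/autoload.php";

use PixelAPI\Client;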

$client = new Client(getenv("PIXELAPI_KEY"));

// Voice Design
$result = $client->textToSpeech([
    "text"              => "Hello, welcome to our platform.",
    "language"          => "en",
    "voice_description" => "young woman, warm and friendly",
]);
file_put_contents("output.mp3", $result->getBody());

// Voice Cloning
$cloned = $client->textToSpeech([
    "text"      => "Hello, welcome to our platform.",
    "language"  => "en",
    "voice_ref" => new \CURLFile("reference.wav"),
]);
file_put_contents("cloned_output.mp3", $cloned->getBody());

Ruby

gem install pixelapi

require "pixelapi"

client = PixelAPI::Client.new(api_key: ENV["PIXELAPI_KEY"])

# Voice Design
result = client.text_to_speech(
  text: "Hello, welcome to our platform.",
  language: "en",
  voice_description: "young woman, warm and friendly"
)
File.binwrite("output.mp3", result.body)

# Voice Cloning
cloned = client.text_to_speech(
  text: "Hello, welcome to our platform.",
  language: "en",
  voice_ref: File.open("reference.wav")
)
File.binwrite("cloned_output.mp3", cloned.body)

Go

go get github.com/pixelapi/pixelapi-go

package main

import "github.com/pixelapi/pixelapi-go"

func main() {
    client := pixelapi.New("YOUR_API_KEY")

    // Voice Design: describe the voice in plain English
    result, err := client.TextToSpeech(pixelapi.TTSParams{
        Text:             "Hello, welcome to our platform.",
        Language:         "en",
        VoiceDescription: "young woman, warm and friendly",
    })
    if err != nil {
        panic(err)
    }
    result.Save("output.mp3")

    // Voice Cloning: pass the path to a reference clip
    cloned, err := client.TextToSpeechClone(pixelapi.TTSCloneParams{
        Text:     "Hello, welcome to our platform.",
        Language: "en",
        VoiceRef: "reference.wav",
    })
    if err != nil {
        panic(err)
    }
    cloned.Save("cloned_output.mp3")
}
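
The SDK helpers above wrap the queue-and-poll flow described at the top of this section. For readers calling the REST endpoints directly, the polling step might look like the following standard-library Python sketch. The endpoint path and the status, output_url, and error fields come from this page; the function names, the 2-second poll interval, and the 5-minute timeout are illustrative assumptions rather than documented values.

```python
import json
import time
import urllib.request

API_BASE = "https://api.pixelapi.dev/v1/tts"


def fetch_status(job_id, api_key):
    """GET /v1/tts/status/{id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/status/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_audio(job_id, api_key, fetch=fetch_status,
                   interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the job completes, then return its output_url.

    interval and timeout are assumed defaults, not values from the API docs.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id, api_key)
        if job["status"] == "completed":
            return job["output_url"]
        if job["status"] == "failed":
            # Failed jobs expose an error field describing the cause.
            raise RuntimeError(f"TTS job {job_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"TTS job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

Any status other than completed or failed is treated as still pending, so the loop keeps working if the service reports intermediate states beyond queued.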

Pricing — voice design at half ElevenLabs' Starter rate

| Provider | Free tier | Voice design | Voice cloning | Languages | No subscription |
| --- | --- | --- | --- | --- | --- |
| PixelAPI | 500 credits, no card | $0.05 / request | $0.10 / request | 30+ | Yes |
| ElevenLabs | 10k chars/month | $0.10 / 500 chars (Starter at $6/mo) | $0.10 / 500 chars (Starter) | 30+ | No (subscription required) |
| OpenAI TTS-1 | Limited (API free tier) | $0.0075 / 500 chars ($15/1M chars) | No (preset voices only) | 57 | Yes |
| OpenAI TTS-1-HD | Limited | $0.015 / 500 chars ($30/1M chars) | No (preset voices only) | 57 | Yes |
| Play.ht | Limited trial | See play.ht/pricing | See play.ht/pricing | 900+ | No (subscription required; contact for API) |
| Murf.ai | 10 min/month | See murf.ai/pricing | See murf.ai/pricing | 20+ | No (subscription-based) |

Pricing verified against each competitor's public pricing page in May 2026. ElevenLabs Starter: $6/month for 30,000 characters ($0.20 per 1k chars, or $0.10 per 500 chars). OpenAI TTS-1: $15 per 1M chars ($0.0075 per 500 chars), with preset voices only and no voice design or cloning. PixelAPI's Voice Design price is set at exactly half of ElevenLabs' Starter rate, per our pricing principle.
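
The per-500-character figures above follow from simple proportional scaling. As a quick check, this short Python sketch recomputes them from the plan prices quoted on this page; the helper name is made up for illustration.

```python
def price_per_500_chars(price_usd, chars):
    """Scale a price quoted for `chars` characters to one 500-character request."""
    return price_usd / chars * 500


elevenlabs_starter = price_per_500_chars(6.00, 30_000)   # $6 for 30,000 chars
openai_tts1 = price_per_500_chars(15.00, 1_000_000)      # $15 per 1M chars
openai_tts1_hd = price_per_500_chars(30.00, 1_000_000)   # $30 per 1M chars
pixelapi_voice_design = 0.05                              # flat price per request

print(f"ElevenLabs Starter: ${elevenlabs_starter:.4f}")   # $0.1000
print(f"OpenAI TTS-1:       ${openai_tts1:.4f}")          # $0.0075
print(f"OpenAI TTS-1-HD:    ${openai_tts1_hd:.4f}")       # $0.0150
print(f"PixelAPI vs ElevenLabs: {pixelapi_voice_design / elevenlabs_starter:.0%}")  # 50%
```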

What you get back

Voice Design output

Describe the speaker in natural language — age, gender, tone, accent, pace. No voice library needed. Pass voice_description="calm narrator, British accent" and get a unique voice generated on the fly.

Voice Cloning output

Upload a WAV or MP3 reference clip (5 seconds minimum, 10 MB max, 16 kHz or higher). The API replicates the speaker's timbre, cadence, and accent in the generated audio. No separate training step required.
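
Checking a reference clip against these limits before uploading can save a failed request. The sketch below does this for WAV files with Python's standard wave module. The function name is hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and MP3 clips would need a third-party decoder that is not shown here.

```python
import os
import wave

MIN_SECONDS = 5.0
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
MIN_SAMPLE_RATE = 16_000      # 16 kHz or higher


def check_reference_wav(path):
    """Return a list of problems; an empty list means the clip meets the stated limits."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    with wave.open(path, "rb") as clip:
        rate = clip.getframerate()
        seconds = clip.getnframes() / rate
    if seconds < MIN_SECONDS:
        problems.append(f"clip is {seconds:.1f}s, shorter than 5s")
    if rate < MIN_SAMPLE_RATE:
        problems.append(f"sample rate is {rate} Hz, below 16 kHz")
    return problems
```

Calling check_reference_wav("reference.wav") and uploading only when the result is empty keeps obviously unsuitable clips out of the queue.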

30+ languages

English, Chinese, Hindi, Spanish, French, German, Japanese, Korean, Arabic, Portuguese, Russian, and 20+ more. Set language=auto to detect the language automatically from the input text.

Audio file download

The completed job returns an output_url to download the generated audio. The file is stored for 24 hours. Fetch it with any HTTP client or the SDK's .save() helper.

Best TTS API for these common workflows

PixelAPI TTS is production-ready across these industries. Each link leads to a complete setup guide:

Audiobook creators

Generate chapter narration in consistent voices. Clone the author's voice or design a dedicated narrator persona.

Marketing agencies

Ad voiceovers, video narration, and social content at scale — without a studio booking.

E-commerce

Product description audio for accessibility and voice-commerce channels.

IVR systems

Dynamic hold messages, menu prompts, and notifications read in a consistent brand voice.

Language learning

Native-sounding pronunciation samples in 30+ languages, generated on demand from lesson text.

Fashion & retail

Branded voice assistants and product spotlight narration.

Electronics

Multi-language device prompts and on-product assistants.

Cosmetics & beauty

Tutorial voiceovers, ingredient explanations, and brand stories.

Automotive

In-car voice alerts, manuals, and dealership IVR prompts.

Integrations & SDKs

Zapier

No-code TTS trigger — convert any new document, email, or CMS entry to audio automatically.

Make.com

Drag-and-drop voice generation in Make scenarios.

Next.js

Server-side audio generation for React apps. Stream or pre-generate on deploy.

Shopify

Auto-generate product audio descriptions on publish for accessibility compliance.

Webflow

CMS-triggered narration for blog posts and landing pages.

BigCommerce

Bulk audio generation for product catalog accessibility.

Comparison vs alternatives

vs ElevenLabs

Same voice design and cloning features, with no monthly subscription. Voice design costs $0.05 per request vs $0.10 per 500 characters on ElevenLabs' Starter plan.

vs Play.ht

Fully self-serve pay-as-you-go vs Play.ht's subscription model and manual API approval.

vs Murf.ai

REST API with voice cloning vs Murf's subscription studio. Instant signup, no sales call.

vs OpenAI TTS

OpenAI TTS is cheaper per character but offers preset voices only. PixelAPI adds voice design and voice cloning.

vs Google TTS

No GCP billing account needed. Voice cloning unavailable on Google Cloud TTS.

vs Azure TTS

Simpler auth (one API key vs Azure AD). Voice design without the Custom Neural Voice training cost.

Rate limits & error handling

Free tier: 60 requests/minute. Paid tiers: 600 requests/minute. Need higher throughput for batch audiobook generation or IVR? Contact PixelAPI support with your expected volume.

Exceeding the rate limit returns HTTP 429 with a Retry-After header specifying how long to wait. Use exponential backoff — start at 2 seconds, double on each retry, cap at 30 seconds. The Python and Node SDKs apply this automatically.

# Python SDK — auto-retries on 429 with exponential backoff
from pixelapi import PixelAPI
client = PixelAPI(api_key="...", max_retries=4)
result = client.text_to_speech(
    text="Chapter one of the audiobook.",
    language="en",
    voice_description="calm, measured narrator, British male",
)
result.save("chapter_01.mp3")  # SDK handles 429 retries automatically
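
For clients that call the REST API without an SDK, the backoff schedule described above (start at 2 seconds, double on each retry, cap at 30 seconds) could be implemented roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and Retry-After is assumed to carry a whole number of seconds, with the computed schedule used as a fallback otherwise.

```python
import time
import urllib.error
import urllib.request


def backoff_delays(max_retries=4, base=2.0, cap=30.0):
    """Waits in seconds: start at `base`, double on each retry, never exceed `cap`."""
    return [min(base * 2 ** attempt, cap) for attempt in range(max_retries)]


def open_with_retry(request, max_retries=4, sleep=time.sleep,
                    opener=urllib.request.urlopen):
    """Open `request`, retrying on HTTP 429 and honouring Retry-After when present."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return opener(request)
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not a rate limit, or the final failed attempt.
            if error.code != 429 or attempt == max_retries:
                raise
            retry_after = error.headers.get("Retry-After")
            # Assumption: Retry-After is given in seconds, not as an HTTP date.
            if retry_after and retry_after.isdigit():
                wait = float(retry_after)
            else:
                wait = delays[attempt]
            sleep(wait)


print(backoff_delays(6))  # [2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Preferring the server's Retry-After value over the local schedule avoids retrying earlier than the service asked for.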

If a TTS job fails during processing, status becomes failed and the error field explains why. Credits are automatically refunded for failed jobs — you never pay for a broken result.

Common error codes:

Frequently asked questions

What is the best TTS API for free?

PixelAPI is one of the most capable free text-to-speech APIs available: 500 credits on signup with no credit card, $0.05 per request after that, voice design in plain English, voice cloning from a short audio clip, and 30+ languages — all pay-as-you-go with no subscription lock-in.

How do I convert text to speech via API?

POST your text to https://api.pixelapi.dev/v1/tts/generate with your API key, the text you want spoken, and an optional voice_description. You get a job id back; poll GET /v1/tts/status/{id} until status=completed, then download the audio from output_url. The Quick Start section above has copy-paste code in cURL, Python, Node, PHP, Ruby, and Go.

Does PixelAPI support voice cloning?

Yes. Upload a WAV or MP3 reference clip via the voice_ref field — minimum 5 seconds, maximum 10 MB, 16 kHz or higher sample rate. The API replicates the speaker's timbre, cadence, and accent. Voice cloning costs $0.10 per request (100 credits), compared to ElevenLabs' ~$0.10 per 500 chars on their Starter subscription.

What languages does the TTS API support?

30+ languages: English, Chinese (Mandarin), Hindi, Spanish, French, German, Japanese, Korean, Russian, Arabic, Portuguese, Italian, Dutch, Polish, Turkish, Vietnamese, Thai, Indonesian, Malay, Bengali, Tamil, Telugu, Marathi, Ukrainian, Swedish, Norwegian, Danish, Finnish, Greek, Hebrew, Swahili. Set language=auto to detect automatically.

What is voice design and how does it work?

Voice design lets you describe the voice you want in plain natural language — for example "middle-aged man, calm and authoritative" or "young woman, warm and gentle, slight Southern US accent". Pass this string as the voice_description field. No voice library to browse, no reference file needed. This is particularly useful for maintaining a consistent branded voice across all your content.

How much does the TTS API cost?

Voice design (no reference file): $0.05 per request for up to 500 characters. Voice cloning (with reference audio): $0.10 per request. New accounts receive 500 free credits — 10 voice design requests — with no credit card. There is no monthly minimum or subscription. Top up credits via the dashboard at any time.
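
The credit figures on this page imply 1,000 credits per US dollar ($0.05 = 50 credits, $0.10 = 100 credits). A small Python sketch of the resulting budget arithmetic follows; the helper names are made up for illustration, and the conversion rate is inferred from those two prices rather than stated outright.

```python
CREDITS_PER_USD = 1000        # inferred: $0.05 = 50 credits
VOICE_DESIGN_CREDITS = 50     # per request
VOICE_CLONING_CREDITS = 100   # per request
FREE_CREDITS = 500            # granted on signup


def requests_covered(credits, cost_per_request):
    """How many whole requests a credit balance pays for."""
    return credits // cost_per_request


def top_up_usd(design_requests, cloning_requests, free_credits=FREE_CREDITS):
    """Dollars to add after the free credits are used up."""
    needed = (design_requests * VOICE_DESIGN_CREDITS
              + cloning_requests * VOICE_CLONING_CREDITS)
    return max(needed - free_credits, 0) / CREDITS_PER_USD


print(requests_covered(FREE_CREDITS, VOICE_DESIGN_CREDITS))   # 10
print(requests_covered(FREE_CREDITS, VOICE_CLONING_CREDITS))  # 5
print(top_up_usd(design_requests=200, cloning_requests=20))   # 11.5
```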

What audio format does the TTS API return?

The API returns a signed URL to the generated audio file. See the full API docs at pixelapi.dev/docs for format options, file size expectations, and the URL expiry window.

Is there a Python SDK?

Yes: pip install pixelapi. The SDK handles authentication, job polling, retry-on-429 with exponential backoff, and file download automatically. Official SDKs are also available for Node.js (npm install pixelapi), PHP (Composer), Ruby (Gem), and Go (go get github.com/pixelapi/pixelapi-go).

What are the rate limits?

The default is 60 requests/minute on the free tier and 600 requests/minute on paid tiers. Exceeding the limit returns HTTP 429 with a Retry-After header. If you need higher throughput, for example batch IVR generation or large audiobook pipelines, contact PixelAPI support with your expected volume.

What happens if a TTS generation fails?

Failed jobs set status=failed and expose an error field with a description. Credits are automatically refunded for every failed job — you never pay for a result that didn't complete. HTTP 503 (queue full) also triggers an automatic refund; just retry after a few seconds.
