TextToSpeechSkills

Expressive text-to-speech

Great text-to-speech for LLM apps, videos, games, and products

Create polished voice output in minutes. Write normal scripts, add natural-language delivery cues in brackets, reuse the same voice across every prompt, and let your LLM app generate audio through MCP.

Natural expression · Save voices · Connect LLM apps

USER

Write a warm dashboard update.

ASSISTANT

[warm] Welcome back. [excited] Report ready.

Audio preview: audio ready
Voice: Product Guide v2
Expression: [calm and bright], [urgent but friendly]
Use for: dashboard updates
Output: audio ready

For: agent platforms · game creators · YouTube narration · support products · education apps

What is TextToSpeechSkills?

TextToSpeechSkills is a paid text-to-speech platform for people who want good voice output without setting up a complicated audio stack. Write a normal script, add natural-language expression directions such as [trying not to wake someone] or [confident but playful], save a voice template, and generate audio from the studio, API, MCP server, or installed skills.

The product is built for LLM workflows: a non-technical user can ask an agent to prepare narration while the account owner keeps control of billing, keys, templates, and permissions. Developers get async speech jobs, webhooks, scoped API keys, workspace billing, and predictable credits. One credit is one full minute of audio, so test and production usage are easy to estimate before a team connects speech to apps, videos, games, courses, support, onboarding, or internal tools, and the content model does not need to be rebuilt later.

Easy workflow

From text to polished audio without fiddly voice prompts

The workflow is simple enough for a creator and structured enough for a product team. Start in the studio, then reuse the same natural expression markup and saved voices from your app or LLM workflow.

Write or paste text

Use a script, app message, or LLM draft.

Direct the performance

Use plain-language cues like [quiet], [trying not to wake someone], or [loud and angry].

Pick a saved voice

Templates keep every prompt consistent.

Generate audio

Preview in the UI or hand it to your app.

Why teams choose it

Easy to start, good enough to keep using

Make great speech quickly, keep voices consistent, and let humans or LLM apps use the same simple setup.

Sounds intentional

Natural-language expression cues make emotion, pacing, and emphasis clear, so speech feels directed instead of guessed.

Stays consistent

Voice templates keep your narrator, character, support voice, or course instructor recognizable across many prompts.

Easy for LLM users

Connect once, choose the voices your app may use, and ask your LLM workflow to create audio from approved text.

Ready for teams

Workspaces, scoped keys, and clear credit plans help you move from one test to shared production use.

Popular workflows

Voice workflows for games, videos, agents, and learning

Start in the UI, connect an LLM app through MCP, or move the same workflow into the API when it becomes part of your product.

LLM setup

Follow a short setup guide, choose a saved voice, and ask your LLM app to create audio.

Read setup guide

Easy LLM setup

Give your LLM app a voice tool without building an integration first

TextToSpeechSkills is made for the moment when you want an LLM to help with speech, not just write scripts. Install the MCP tool, choose which voices are allowed, and let the same workflow create narration, character lines, support replies, or product audio.

Read the guide

01

MCP install

Paste one command into your LLM app settings.

02

Saved voices

Pick the narrator, character, or support voice it can use.

03

Natural expression

Ask for natural directions like [quiet], [excited but restrained], or [loud and angry].

04

Audio ready

Review the result in your dashboard or send it back to your app.

Credit pricing

Start with a small paid test, then upgrade when usage is real

Every plan includes the UI, API, MCP setup, natural expression markup, and saved voice templates. One credit is one full minute of audio.

Test

$2.99 / month

Yearly: $29.99 / year

30 credits = 30 full minutes included

  • Try UI, API, and MCP
  • Create saved voice templates
  • Upgrade when you need more credits

Starter

$12 / month

Yearly: $120 / year

300 credits = 300 full minutes / month

  • 300 full minutes of audio
  • API, UI, and MCP included
  • Good for small projects

Scale

$99 / month

Yearly: $999 / year

3,200 credits = 3,200 full minutes / month

  • 3,200 full minutes of audio
  • Workspace add-on available
  • Built for production usage

Workspaces are available on Pro and higher for $2 per user per month with central billing.
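
The credit arithmetic above can be checked with a short Python sketch. The prices and credit counts come from the plan list; the `Plan` class and the helper names are illustrative only, and the sketch assumes the Test plan's 30 credits renew monthly like the other plans, which the page does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_usd: float
    yearly_usd: float
    credits_per_month: int  # one credit = one full minute of audio


# Figures copied from the pricing list. Assumption: Test credits renew monthly.
PLANS = [
    Plan("Test", 2.99, 29.99, 30),
    Plan("Starter", 12.00, 120.00, 300),
    Plan("Scale", 99.00, 999.00, 3200),
]


def cost_per_minute(plan: Plan) -> float:
    """Monthly price divided by included minutes."""
    return plan.monthly_usd / plan.credits_per_month


def yearly_saving(plan: Plan) -> float:
    """Difference between twelve monthly payments and the yearly price."""
    return plan.monthly_usd * 12 - plan.yearly_usd


def cheapest_plan_for(minutes_per_month: int) -> Optional[Plan]:
    """Lowest-priced plan whose credits cover the requested minutes, if any."""
    candidates = [p for p in PLANS if p.credits_per_month >= minutes_per_month]
    return min(candidates, key=lambda p: p.monthly_usd) if candidates else None


if __name__ == "__main__":
    for plan in PLANS:
        print(
            f"{plan.name}: ${cost_per_minute(plan):.3f} per minute, "
            f"${yearly_saving(plan):.2f} saved on yearly billing"
        )
```

Under these figures the per-minute cost falls as the plan size grows, which is the usual reason to upgrade once monthly usage is known.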

Expression markup

Natural-language voice direction that stays readable

Write expressive delivery notes directly in brackets. Examples like [quiet] are starters; you can use full phrases such as [trying not to wake someone] or [excited but professional].

| Starter | Category | Example | How to extend it |
| --- | --- | --- | --- |
| quiet | volume | [quiet] hello there. | Speak softly or at a low volume. You can combine it with natural language when the scene needs more detail. |
| whisper | volume | [whisper] I have a secret. | Use a very soft, intimate delivery. You can combine it with natural language when the scene needs more detail. |
| loud | volume | [loud] Listen up! | Increase emphasis and volume. You can combine it with natural language when the scene needs more detail. |
| excited | emotion | [excited] that's amazing! | Speak with high energy and enthusiasm. You can combine it with natural language when the scene needs more detail. |
| angry | emotion | [angry] how could you do that? | Speak with anger or frustration. You can combine it with natural language when the scene needs more detail. |
| warm | tone | [warm] welcome back. | Use a friendly and pleasant tone. You can combine it with natural language when the scene needs more detail. |
| serious | tone | [serious] this is important. | Use a focused, earnest tone. You can combine it with natural language when the scene needs more detail. |
| fast | pace | [fast] let's go! | Increase speaking speed. You can combine it with natural language when the scene needs more detail. |
| slow | pace | [slow] take your time. | Decrease speaking speed. You can combine it with natural language when the scene needs more detail. |
| pause | timing | [pause] wait here. | Insert a short pause. You can combine it with natural language when the scene needs more detail. |
| laugh | interjection | [laugh] that's funny. | Add a brief natural laugh. You can combine it with natural language when the scene needs more detail. |
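
For developers who want to inspect a marked-up script before sending it for synthesis, here is a minimal Python sketch that splits text into cue and text pairs. It is not the platform's parser: `split_cues` is a hypothetical helper, and it assumes a cue applies to the text that follows it until the next cue, which the page does not specify.

```python
import re
from typing import List, Tuple

# A cue is any non-empty run of characters between square brackets,
# so full phrases like [trying not to wake someone] match as one cue.
CUE_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_cues(script: str) -> List[Tuple[str, str]]:
    """Split a script into (cue, text) pairs.

    Text before the first cue is paired with an empty cue. A cue with no
    text after it (for example [pause] directly before another cue) is
    kept with empty text so it is not silently dropped.
    """
    segments: List[Tuple[str, str]] = []
    cue = ""
    cue_unused = False  # True while the current cue has not been emitted
    pos = 0
    for match in CUE_PATTERN.finditer(script):
        text = script[pos:match.start()].strip()
        if text or cue_unused:
            segments.append((cue, text))
        cue = match.group(1).strip()
        cue_unused = True
        pos = match.end()
    tail = script[pos:].strip()
    if tail or cue_unused:
        segments.append((cue, tail))
    return segments


if __name__ == "__main__":
    for cue, text in split_cues("[warm] Welcome back. [excited] Report ready."):
        print(f"{cue or '(no cue)'}: {text}")
```

Because the pattern accepts any bracketed phrase, the same helper handles both the starter cues and longer directions without a fixed vocabulary.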

Reusable voices

Templates keep one voice across many prompts

Save persona, tone, pacing, accent, style rules, and sample prompts as versioned assets your whole team can reuse.

| Template | Persona | Baseline voice | Direction |
| --- | --- | --- | --- |
| Calm Narrator | Clear narrator for explainers and product walkthroughs | Charon | Calm, reassuring, precise. Keep emotional changes controlled unless inline tags request otherwise. |
| Energetic Coach | High-energy coach for launches and motivating clips | Puck | Energetic and encouraging without sounding frantic. |
| Warm Teacher | Friendly teacher for lessons and onboarding | Sulafat | Friendly, patient, and clear. The listener should feel safe asking the next question. |
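
One way to picture a versioned template is as a small record with numbered versions and an approval step. The Python sketch below is an illustration only, not the platform's data model: the class names, the "draft" and "approved" states, and the rule that only approved versions can become active are all assumptions. The sample values come from the Calm Narrator row above.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class TemplateVersion:
    version: int
    persona: str
    baseline_voice: str
    direction: str
    status: str = "draft"  # hypothetical states: "draft" or "approved"


@dataclass
class VoiceTemplate:
    name: str
    versions: Dict[int, TemplateVersion] = field(default_factory=dict)
    active_version: Optional[int] = None

    def add_version(self, persona: str, baseline_voice: str, direction: str) -> TemplateVersion:
        """Store a new draft under the next version number."""
        number = max(self.versions, default=0) + 1
        version = TemplateVersion(number, persona, baseline_voice, direction)
        self.versions[number] = version
        return version

    def approve(self, number: int) -> None:
        """Mark a stored version as approved."""
        self.versions[number] = replace(self.versions[number], status="approved")

    def activate(self, number: int) -> None:
        """Make a version active; assumed to require approval first."""
        if self.versions[number].status != "approved":
            raise ValueError(f"version {number} is not approved")
        self.active_version = number

    def label(self) -> str:
        """Display name such as 'Calm Narrator v1' once a version is active."""
        if self.active_version is None:
            return self.name
        return f"{self.name} v{self.active_version}"


if __name__ == "__main__":
    template = VoiceTemplate("Calm Narrator")
    template.add_version(
        persona="Clear narrator for explainers and product walkthroughs",
        baseline_voice="Charon",
        direction="Calm, reassuring, precise.",
    )
    template.approve(1)
    template.activate(1)
    print(template.label())
```

Keeping each version immutable means an older version stays available for comparison after a newer one is approved.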

Template preview

Calm Narrator v1

approved

Compare saved versions across multiple prompts before making a template active.