AethonVoice is a text-to-speech service built on open-source AI. We deliver studio-grade voice synthesis across 21 languages at a price point that makes high-quality TTS accessible to developers, creators, and businesses of any size.
We operate a TTS API and MCP Server powered by OmniVoice, the most capable open-source speech model available. Our service adds production-grade features on top of this foundation: multilingual text mixing, paralinguistic expression, voice cloning, batch processing, and long-form audio generation.
Our approach is straightforward: use the best available open-source model, add the engineering needed for production use, and charge based on actual compute costs rather than inflated per-character fees.
Babbly AI Company Limited
559/67 Thanapat Haus, Nonsi Road, Yannawa, Bangkok, 10120, Thailand
AethonVoice is built on top of OmniVoice, an open-source text-to-speech model developed and published by the k2-fsa research group. The model weights, training code, and inference pipeline are released under permissive licensing.
AethonVoice adds production-grade features on top of the base model (multilingual text splitting, paralinguistic expression, long-form generation, batch processing, and a managed API), but the core speech synthesis capability comes from the OmniVoice project. We gratefully acknowledge the k2-fsa group's contribution to open-source AI.