Audio Models | Otoroshi LLM Extension

📄️ Audio Models

Otoroshi LLM Extension supports audio models for Text-to-Speech (TTS), Speech-to-Text (STT), and Audio Translation through a unified OpenAI-compatible API.

📄️ Text-to-Speech (TTS)

Text-to-Speech converts text input into audio output. Otoroshi exposes TTS through an OpenAI-compatible API endpoint.
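
As a rough illustration of what an OpenAI-compatible TTS call looks like, the sketch below builds the HTTP request without sending it. The base URL, bearer token, model name, and voice name are placeholders, and the `/v1/audio/speech` path is the one OpenAI uses; the path actually exposed by your Otoroshi route may differ.

```python
import json
import urllib.request

# Assumptions (not taken from the Otoroshi docs): the base URL, token,
# model and voice are placeholders; the path mirrors OpenAI's API.
BASE_URL = "https://otoroshi.example.com"
SPEECH_PATH = "/v1/audio/speech"


def build_tts_request(text, model="tts-1", voice="alloy", token="YOUR_TOKEN"):
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def synthesize(text, out_path):
    """Send the request and write the binary audio response to out_path."""
    req = build_tts_request(text)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
```

Calling `synthesize("Hello from Otoroshi", "speech.mp3")` against a real route would save the returned audio bytes to disk; the audio format depends on the provider and its defaults.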

📄️ Speech-to-Text (STT)

Speech-to-Text transcribes audio files into text. Otoroshi exposes STT through an OpenAI-compatible API endpoint.
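
OpenAI-style transcription endpoints take the audio file as a `multipart/form-data` upload. The sketch below shows one way to build such a request with only the standard library. The base URL, token, and model name are placeholders, the `/v1/audio/transcriptions` path is OpenAI's, and `encode_multipart` is a hypothetical helper written for this example.

```python
import uuid
import urllib.request

# Assumptions (not taken from the Otoroshi docs): the base URL, token and
# model are placeholders; the path mirrors OpenAI's API.
BASE_URL = "https://otoroshi.example.com"
TRANSCRIPTIONS_PATH = "/v1/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_stt_request(file_name, file_bytes, model="whisper-1", token="YOUR_TOKEN"):
    """Build (but do not send) an OpenAI-style transcription request."""
    body, content_type = encode_multipart({"model": model}, file_name, file_bytes)
    return urllib.request.Request(
        BASE_URL + TRANSCRIPTIONS_PATH,
        data=body,
        headers={"Content-Type": content_type, "Authorization": f"Bearer {token}"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send the upload; in OpenAI's API the default response is a JSON object with a `text` field holding the transcript.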

📄️ Audio Translation

Audio Translation transcribes audio files and translates the result into English. This is distinct from Speech-to-Text, which transcribes in the original language.
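
In OpenAI's API, transcription and translation accept the same kind of audio upload and differ only in the endpoint called. A minimal sketch of that choice, assuming your Otoroshi route keeps OpenAI's paths (the base URL and the `audio_url` helper are hypothetical):

```python
# Assumptions (not taken from the Otoroshi docs): the base URL is a
# placeholder and the paths mirror OpenAI's API.
BASE_URL = "https://otoroshi.example.com"
AUDIO_PATHS = {
    "transcribe": "/v1/audio/transcriptions",  # text in the spoken language
    "translate": "/v1/audio/translations",     # text in English
}


def audio_url(task):
    """Return the endpoint URL for an audio task name."""
    try:
        return BASE_URL + AUDIO_PATHS[task]
    except KeyError:
        raise ValueError(f"unknown audio task: {task!r}") from None
```

For a French recording, for instance, the `transcribe` endpoint would be expected to return French text and the `translate` endpoint English text.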