📄️ Audio Models
The Otoroshi LLM Extension supports audio models, providing Text-to-Speech (TTS), Speech-to-Text (STT), and Audio Translation through a unified OpenAI-compatible API.
📄️ Text-to-Speech (TTS)
Text-to-Speech converts text input into audio output. Otoroshi exposes TTS through an OpenAI-compatible API endpoint.
📄️ Speech-to-Text (STT)
Speech-to-Text transcribes audio files into text. Otoroshi exposes STT through an OpenAI-compatible API endpoint.
📄️ Audio Translation
Audio Translation transcribes audio files and translates the result into English. This is distinct from Speech-to-Text, which transcribes in the original language.
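As a rough illustration of what "OpenAI-compatible" means for the three capabilities above, the following sketch builds the corresponding HTTP requests with the Python standard library. It assumes the Otoroshi route mirrors the OpenAI audio paths (`/audio/speech`, `/audio/transcriptions`, `/audio/translations`). The base URL, API key, model names, and voice name are hypothetical placeholders, not values taken from the Otoroshi documentation, so check the individual pages for the actual configuration.

```python
import json
import uuid
import urllib.request

# Hypothetical values: replace with the route and credentials of your own setup.
BASE_URL = "https://otoroshi.example.com/v1"
API_KEY = "changeme"


def tts_request(text, model="tts-model", voice="alloy"):
    """Build a Text-to-Speech request: JSON in, audio bytes expected back."""
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


def audio_upload_request(kind, filename, audio_bytes, model="stt-model"):
    """Build a multipart upload for 'transcriptions' (STT) or 'translations'."""
    if kind not in ("transcriptions", "translations"):
        raise ValueError("kind must be 'transcriptions' or 'translations'")
    boundary = uuid.uuid4().hex
    # Hand-rolled multipart/form-data body with a 'model' field and a 'file' field.
    model_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/audio/{kind}",
        data=model_part + file_header + audio_bytes + closing,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Each function only constructs the request. Under the same assumptions, passing the result to `urllib.request.urlopen` would send it: the TTS response body would be audio bytes, and the transcription and translation responses would carry the resulting text.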