OCR Models | Otoroshi LLM Extension

📄️ OCR Models

Otoroshi LLM Extension provides support for OCR (Optical Character Recognition) models, enabling text extraction from images and PDF documents through a unified, Mistral-inspired API.

Each OCR Model wraps a single provider. The provider is selected with the provider field, and configured through config.connection (endpoint and credentials) and config.options (model and provider-specific options).

📄️ OCR API

OCR is exposed as an HTTP API through the Cloud APIM - OCR backend plugin. The same handler is also available through the unified OpenAI Compatible API plugin on the POST /ocr path.

📄️ OCR through text models

Besides the dedicated OCR Model entity, OCR can also be performed through a regular LLM (text) provider. AlphaEdge 🇫🇷 🇪🇺 can be configured as a standard LLM provider: you call the OpenAI-compatible /chat/completions endpoint with a message that contains an image or PDF content part, and the assistant response is the extracted text.

📄️ OCR Models

📄️ OCR Providers

📄️ OCR API

📄️ OCR through text models