📄️ OCR Models
Otoroshi LLM Extension provides support for OCR (Optical Character Recognition) models, enabling text extraction from images and PDF documents through a unified, Mistral-inspired API.
📄️ OCR Providers
Each OCR Model wraps a single provider. The provider is selected with the provider field, and configured through config.connection (endpoint and credentials) and config.options (model and provider-specific options).
📄️ OCR API
OCR is exposed as an HTTP API through the Cloud APIM - OCR backend plugin. The same handler is also available through the unified OpenAI Compatible API plugin on the POST /ocr path.
📄️ OCR through text models
Besides the dedicated OCR Model entity, OCR can also be performed through a regular LLM (text) provider. AlphaEdge 🇫🇷 🇪🇺 can be configured as a standard LLM provider: you call the OpenAI-compatible /chat/completions endpoint with a message that contains an image or PDF content part, and the assistant response is the extracted text.