Search Engines

Otoroshi LLM Extension provides support for search engines, enabling real-time web search — and semantic search over your own data with RAG knowledge bases — through a unified, normalized API.

A search engine is exposed as a dedicated, first-class entity type — the Search Engine — just like Audio Models, Image Models, or OCR Models. Each Search Engine wraps a provider and its configuration, and exposes a single search(query) operation. It can be used in three ways: as an LLM tool (passed to a provider or an agent), through a dedicated HTTP plugin, or as a workflow function.

Every web provider is implemented in pure HTTP (no extra dependency) and every response is normalized to a common shape, so the same code/prompt works regardless of the underlying engine.

Two kinds of Search Engine

| Kind | Providers | What it searches |
| --- | --- | --- |
| Web search | Staan.ai, Tavily, Brave, SearXNG, Google, SearchApi, DuckDuckGo, Exa | The public web, through a third-party search API |
| RAG knowledge base | rag | Your own documents, by semantic similarity over an embedding store using an embedding model |

Both kinds share the same entity, the same search(query) operation, and the same normalized result shape — so a RAG knowledge base can be wired into a provider, agent, plugin, or workflow exactly like a web engine. The only difference is what sits behind the query: a web API, or a vector similarity search over your embeddings.
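To make the RAG side concrete, here is a minimal sketch of a vector similarity search that returns results in the normalized shape. It assumes cosine similarity, a tiny in-memory store, and that max_results and min_score act as a top-k limit and a score floor. The function names and the store layout are hypothetical and are not the extension's actual implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rag_search(
    query_vec: Sequence[float],
    store: List[Tuple[Dict[str, str], Sequence[float]]],
    max_results: int = 5,
    min_score: float = 0.5,
) -> List[Dict[str, object]]:
    """Score every stored document, drop those below min_score, keep the best max_results."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in store]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    # Map each hit onto the normalized result fields (title, url, snippet, score).
    return [
        {"title": doc["title"], "url": doc["url"], "snippet": doc["text"], "score": score}
        for score, doc in kept[:max_results]
    ]


# Hypothetical two-dimensional embeddings, for illustration only.
store = [
    ({"title": "A", "url": "doc://a", "text": "exact match"}, [1.0, 0.0]),
    ({"title": "B", "url": "doc://b", "text": "unrelated"}, [0.0, 1.0]),
    ({"title": "C", "url": "doc://c", "text": "partial match"}, [1.0, 1.0]),
]
results = rag_search([1.0, 0.0], store)
print([r["title"] for r in results])
```

In a real deployment the query vector would come from the configured embedding model and the scoring would be delegated to the embedding store.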

Supported web providers

| Provider | Auth | Notes |
| --- | --- | --- |
| Staan.ai (Qwant) 🇫🇷 🇪🇺 | Bearer | Sovereign European web search API. Default provider. |
| Tavily | Bearer | Search API designed for LLM/RAG. Can return a synthetic answer. |
| Brave Search | X-Subscription-Token | Independent web index. |
| SearXNG | none (self-hosted) | Open-source metasearch engine. Point base_url at your instance; ideal for on-prem / sovereignty. |
| Google Custom Search | key + cx | Google Programmable Search Engine. |
| SearchApi | Bearer | SERP aggregator (google by default, other engines available). |
| DuckDuckGo | none | Instant Answer API only (best effort, no full web result list). |
| Exa | x-api-key | AI / neural search. Returns relevance highlights, which suits research and RAG. |

In addition to these web providers, the rag provider turns a Search Engine into a retriever over your own data — see RAG knowledge base.

Features

  • Unified, normalized results: every provider returns the same shape — { provider, query, answer?, results: [{ title, url, snippet, score?, published_date? }] }. The raw provider payload is kept internally for advanced use.
  • Pure HTTP: no new dependency, only standard HTTP calls.
  • Vault integration: API tokens support Otoroshi vault references (e.g. ${vault://local/my-token}) and comma-separated rotation.
  • Search controls: max_results, market/locale, and domain include/exclude filters are passed through to providers that support them.
  • LLM tool integration: reference one or more Search Engines on a provider or an agent and the model gets a search tool it can call autonomously.
  • RAG knowledge bases: back a Search Engine with an embedding store and an embedding model (rag provider) to retrieve from your own data through the very same API — see RAG knowledge base.
  • Workflow integration: search is available as a workflow function (search_engine_search) for use in agentic pipelines.
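As an illustration of the normalized shape listed above, the following sketch models it with Python dataclasses and parses a payload into them. The field names come from the shape itself; the class names, the parse_response helper, and the sample payload are hypothetical and only meant to show how client code can stay provider-agnostic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str
    score: Optional[float] = None           # optional in the normalized shape
    published_date: Optional[str] = None    # optional in the normalized shape


@dataclass
class SearchResponse:
    provider: str
    query: str
    results: List[SearchResult] = field(default_factory=list)
    answer: Optional[str] = None            # only set when the provider returns one


def parse_response(payload: Dict[str, Any]) -> SearchResponse:
    """Turn a normalized JSON payload into typed objects, tolerating missing optional fields."""
    results = [
        SearchResult(
            title=item["title"],
            url=item["url"],
            snippet=item["snippet"],
            score=item.get("score"),
            published_date=item.get("published_date"),
        )
        for item in payload.get("results", [])
    ]
    return SearchResponse(
        provider=payload["provider"],
        query=payload["query"],
        results=results,
        answer=payload.get("answer"),
    )


# Made-up payload for illustration; real values depend on the provider.
sample = {
    "provider": "staan",
    "query": "otoroshi",
    "results": [
        {"title": "Example", "url": "https://example.com", "snippet": "An example hit."}
    ],
}
response = parse_response(sample)
print(response.provider, len(response.results), response.answer)
```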

How to use a Search Engine

There are three ways to use a Search Engine:

| Method | Description |
| --- | --- |
| LLM tool | Reference Search Engines on an LLM provider or an AI Agent; the model is given a search tool and calls it when it needs fresh web information. See Search as an LLM tool. |
| Dedicated plugin | The Cloud APIM - Search engine backend plugin exposes a single POST search route. See Plugins. |
| Workflow function | The search_engine_search function runs a search from within a workflow. |
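For the dedicated plugin, a client call could look like the sketch below. It only builds the POST request and does not send it. The route URL and the "query" body field are assumptions made for illustration; check the Plugins page for the actual request format and for any authentication configured on your route.

```python
import json
import urllib.request


def build_search_request(route_url: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a search route."""
    # Assumption: the route accepts a JSON body with a "query" field.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        route_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical route URL; replace it with the route your plugin is attached to.
request = build_search_request("https://search.example.com/search", "otoroshi llm extension")
print(request.get_method(), request.full_url)
```

Sending it is then a matter of passing the request to urllib.request.urlopen and decoding the JSON response body.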

Search Engine entity

A Search Engine entity wraps a provider connection and its options:

{
  "id": "search-engine_xxxxxxxxx",
  "name": "My Search Engine",
  "description": "Sovereign web search backed by Staan.ai",
  "provider": "staan",
  "config": {
    "connection": {
      "base_url": "https://api.staan.ai",
      "token": "${vault://local/STAAN_API_KEY}",
      "timeout": 30000
    },
    "options": {
      "market": "fr-FR",
      "count": 10
    }
  },
  "kind": "ai-gateway.extensions.cloud-apim.com/SearchEngine"
}

A RAG knowledge base uses the same entity, but instead of a connection block it references an embedding store and an embedding model:

{
  "id": "search-engine_xxxxxxxxx",
  "name": "Product docs",
  "description": "Search the internal product documentation",
  "provider": "rag",
  "config": {
    "embedding_store": "embedding-store_xxxxxxxxx",
    "embedding_model": "embedding-model_xxxxxxxxx",
    "options": {
      "max_results": 5,
      "min_score": 0.5
    }
  },
  "kind": "ai-gateway.extensions.cloud-apim.com/SearchEngine"
}
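The difference between the two entity layouts can be expressed as a small pre-flight check, sketched below. It assumes that every web provider carries a connection block and that the rag provider requires both embedding references, which is how the two examples above read. The missing_config_keys helper is hypothetical and is not part of the extension.

```python
from typing import Any, Dict, List


def missing_config_keys(entity: Dict[str, Any]) -> List[str]:
    """Return the config keys a Search Engine entity lacks for its kind of provider."""
    config = entity.get("config", {})
    if entity.get("provider") == "rag":
        # RAG knowledge base: references an embedding store and an embedding model.
        required = ["embedding_store", "embedding_model"]
    else:
        # Web provider (assumption: all of them use a connection block).
        required = ["connection"]
    return [key for key in required if key not in config]


web_engine = {
    "provider": "staan",
    "config": {"connection": {"base_url": "https://api.staan.ai"}, "options": {}},
}
rag_engine = {
    "provider": "rag",
    "config": {"embedding_store": "embedding-store_xxxxxxxxx", "options": {}},
}

print(missing_config_keys(web_engine))
print(missing_config_keys(rag_engine))
```

Running a check like this before creating the entity catches a RAG engine that was copied from a web engine template without swapping the connection block for the embedding references.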

See Providers for the per-provider configuration, Plugins for the HTTP API, and Search as an LLM tool to wire search into your providers and agents.