MCP Virtual Server
An MCP Virtual Server is a reusable, persisted definition of an exposed MCP server. Instead of repeating the same settings on every route, you define them once as a virtual server and reference it from the exposition plugins (and from the preset) through server_ref.
A virtual server bundles everything an exposition needs:
- name, version: advertised in the initialize response (serverInfo).
- refs (tool functions) and mcp_refs (MCP connectors): the tools/resources/prompts to aggregate.
- Filtering (include_*/exclude_*, allow_rules, disallow_rules).
- OAuth 2.0 (enforce_oauth, auth_module_ref, auth_prm_url, validate_audience, opaque_token, use_introspection).
- Scope-based tool authorization (tool_scopes).
- Meta mode (expose_as_meta, meta_semantic_search).
- Rate limiting & caching (tool_rate_limits, tool_cache_ttls).
- Managed resources (resources, resource_fetch_allowed_hosts), managed prompts (prompts), and item overlays (overlays).
- Zero-Trust controls (zero_trust): anti-rug-pull pinning, tool-poisoning / prompt-injection scanning, and PII/secrets redaction.
- Registry / publication (registry): publish this server to the standard MCP registry so registry-aware clients discover it.
- emit_audit_events: see audit events.
Virtual servers are managed from the MCP Virtual Servers page (or the admin API resource ai-gateway.extensions.cloud-apim.com/v1/mcp-virtual-servers).
The same configuration fields also exist inline on the exposition plugins. Defining a virtual server once and referencing it with server_ref keeps your routes DRY and lets you reuse one definition across many endpoints.
Referencing and overriding
The exposition plugins accept a server_ref. When set, the plugin starts from the virtual server's config and lets you override individual fields inline on the plugin. The merge rules are hybrid:
- Option fields: an inline Some value wins;
- array fields: a non-empty inline array wins;
- the enable-flags (enforce_oauth, emit_audit_events, expose_as_meta, …) are OR'd;
- allow_rules/disallow_rules, managed resources/prompts, overlays, and zero_trust are merged additively.
This lets you keep a broad definition on the virtual server and tighten it per route, or vice-versa.
{
  "server_ref": "mcp-virtual-server_xxxxx",
  // inline overrides (optional):
  "emit_audit_events": true
}
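The hybrid merge rules above can be sketched in a few lines of Python. This is an illustration only, not the plugin's implementation: the function name merge_config, the two field sets, and the exact meaning of "merged additively" (list concatenation and a shallow union of maps here) are assumptions made for the example.

```python
from typing import Any

# Hypothetical field classification; the real plugin decides this internally.
FLAG_FIELDS = {"enforce_oauth", "emit_audit_events", "expose_as_meta"}
ADDITIVE_FIELDS = {"allow_rules", "disallow_rules", "resources", "prompts",
                   "overlays", "zero_trust"}


def merge_config(server: dict[str, Any], inline: dict[str, Any]) -> dict[str, Any]:
    """Illustrative hybrid merge of a virtual server config with inline overrides."""
    merged = dict(server)
    for key, value in inline.items():
        if key == "server_ref":
            continue  # the reference itself is not a config field
        base = server.get(key)
        if key in FLAG_FIELDS:
            merged[key] = bool(base) or bool(value)  # enable-flags are OR'd
        elif key in ADDITIVE_FIELDS:
            if isinstance(value, list):
                merged[key] = list(base or []) + value  # assumed: concatenate lists
            elif isinstance(value, dict):
                merged[key] = {**(base or {}), **value}  # assumed: shallow union
        elif isinstance(value, list):
            if value:  # only a non-empty inline array wins
                merged[key] = value
        elif value is not None:  # an inline "Some" value wins
            merged[key] = value
    return merged
```

One consequence worth noting in this sketch: because the enable-flags are OR'd, an inline false cannot switch off a flag that the virtual server turns on.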
Filtering
You can filter which tools, resources, resource templates, and prompts are exposed to clients. These filters are applied on top of any filters already configured on the MCP connectors themselves, so you can have broad access on a connector and restrict it further here, or vice-versa.
| Parameter | Type | Description |
|---|---|---|
| include_functions | array of strings | Only expose functions matching these regex patterns |
| exclude_functions | array of strings | Hide functions matching these regex patterns |
| include_resources | array of strings | Only expose resources matching these patterns |
| exclude_resources | array of strings | Hide resources matching these patterns |
| include_resource_templates | array of strings | Only expose resource templates matching these patterns |
| exclude_resource_templates | array of strings | Hide resource templates matching these patterns |
| include_resource_template_uris | array of strings | Only expose resource template URIs matching these patterns |
| exclude_resource_template_uris | array of strings | Hide resource template URIs matching these patterns |
| include_prompts | array of strings | Only expose prompts matching these patterns |
| exclude_prompts | array of strings | Hide prompts matching these patterns |
| allow_rules | object | Advanced JsonPath-based allow rules (see MCP Connectors → Advanced rules) |
| disallow_rules | object | Advanced JsonPath-based disallow rules |
{
  "name": "filtered-mcp-server",
  "version": "1.0.0",
  "refs": ["tool-function_xxxxx"],
  "mcp_refs": ["mcp-connector_xxxxx"],
  "include_functions": ["get_.*", "list_.*"],
  "exclude_functions": ["delete_.*"],
  "include_prompts": ["summarize"],
  "exclude_prompts": [],
  "include_resources": [],
  "exclude_resources": ["admin_.*"]
}
allow_rules/disallow_rules use the same JsonPath shape as on connectors: keyed by category (tool_rules, prompt_rules, resource_rules, resource_templates_rules), then by item name, each mapping to an array of { path, value } validators. See MCP Connectors → Advanced rules for the full format.
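To make the include/exclude behavior concrete, here is a minimal Python sketch of how such patterns could be evaluated. It rests on three assumptions that this page does not state: an empty include list means "no restriction", a pattern must match the whole name, and exclude wins over include. The helper name is_exposed is hypothetical.

```python
import re


def is_exposed(name: str, include: list[str], exclude: list[str]) -> bool:
    """Return True when `name` passes the include/exclude patterns.

    Assumptions: an empty include list means "no restriction", patterns must
    match the whole name, and exclude takes precedence over include.
    """
    if include and not any(re.fullmatch(p, name) for p in include):
        return False
    return not any(re.fullmatch(p, name) for p in exclude)


# Patterns taken from the filtered-mcp-server example above.
tools = ["get_user", "list_users", "delete_user", "create_user"]
exposed = [t for t in tools if is_exposed(t, ["get_.*", "list_.*"], ["delete_.*"])]
```

Under these assumptions only get_user and list_users remain exposed: create_user fails the include list and delete_user fails both checks.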
OAuth 2.0 security
The OAuth tutorial walks through a full setup; the options are:
| Parameter | Type | Default | Description |
|---|---|---|---|
| enforce_oauth | boolean | false | Require a valid OAuth 2.0 Bearer token on every MCP request. A missing/invalid token returns 401 with a WWW-Authenticate: Bearer ... resource_metadata="..." header (RFC 9728 discovery). |
| auth_module_ref | string | - | The OAuth2/OIDC auth module used to validate tokens. |
| auth_prm_url | string | - | Override the resource_metadata URL advertised in the 401 (default: <proto>://<host>/.well-known/oauth-protected-resource). |
| validate_audience | boolean | false | Audience binding (RFC 8707): require the token aud claim to match this MCP server's URL. Prevents token-passthrough / confused-deputy. The aud claim may be a single string or an array. |
| opaque_token | boolean | false | Accept opaque (non-JWT) access tokens, validated remotely (userinfo endpoint by default) instead of local JWT signature verification. |
| use_introspection | boolean | false | For opaque tokens, validate via the auth module's RFC 7662 introspection endpoint instead of userinfo. |
The Protected MCP Streamable HTTP preset wires a protected MCP endpoint and the RFC 9728 discovery document from a single virtual server reference.
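The audience check behind validate_audience can be pictured with a short Python sketch. It only illustrates the "single string or array" handling of the aud claim; the function name audience_matches and the exact comparison rule (exact match after stripping a trailing slash) are assumptions, not the gateway's documented behavior.

```python
def audience_matches(claims: dict, server_url: str) -> bool:
    """Check that the token's `aud` claim names this MCP server.

    Assumption: comparison is an exact match after stripping a trailing slash.
    """
    aud = claims.get("aud")
    if aud is None:
        return False  # no audience at all cannot be bound to this server
    audiences = [aud] if isinstance(aud, str) else list(aud)
    expected = server_url.rstrip("/")
    return any(isinstance(a, str) and a.rstrip("/") == expected for a in audiences)
```

A token minted for another resource then fails the check even if its signature is valid, which is the property that blocks token passthrough.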
Scope-based tool authorization
Beyond all-or-nothing OAuth, you can require specific OAuth scopes per tool with tool_scopes. The caller's granted scopes are read from the token (the space-delimited scope claim and/or the scp array, or the introspection response for opaque tokens), and a tool is allowed only when the caller has all the scopes it requires. Tools with no entry are open. The check filters tools/list (hidden) and denies tools/call.
| Parameter | Type | Description |
|---|---|---|
| tool_scopes | object | Map of tool name (or "*" as a default for every tool) → array of required scopes. |
{
  "name": "scoped-mcp-server",
  "enforce_oauth": true,
  "auth_module_ref": "auth_mod_xxxxx",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_scopes": {
    "create_repository": ["mcp:write"],
    "delete_repository": ["mcp:write", "mcp:admin"],
    "*": ["mcp:tools"]
  }
}
In this example every tool requires mcp:tools, create_repository additionally requires mcp:write, and delete_repository requires both mcp:write and mcp:admin.
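The check described above can be sketched as follows in Python. The helper names are hypothetical, and the way "*" combines with a tool's own entry (added on top of it, as the example's explanation suggests) is an assumption drawn from that explanation rather than a documented rule.

```python
def granted_scopes(claims: dict) -> set[str]:
    """Collect scopes from the space-delimited `scope` claim and/or the `scp` array."""
    scopes = set(str(claims.get("scope") or "").split())
    scp = claims.get("scp") or []
    scopes.update([scp] if isinstance(scp, str) else scp)
    return scopes


def required_scopes(tool: str, tool_scopes: dict[str, list[str]]) -> set[str]:
    """Assumed reading of the example: the "*" entry adds to the tool's own entry."""
    return set(tool_scopes.get("*", [])) | set(tool_scopes.get(tool, []))


def is_tool_allowed(tool: str, tool_scopes: dict[str, list[str]], claims: dict) -> bool:
    # A tool is allowed only when the caller holds every required scope.
    return required_scopes(tool, tool_scopes) <= granted_scopes(claims)
```

With the configuration above, a caller holding mcp:tools and mcp:write could call create_repository but not delete_repository, which would also need mcp:admin.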
Meta mode (tool virtualization)
When a server aggregates many connectors, exposing the full tool list bloats the model's context (every tool schema is injected into the prompt on each call). With expose_as_meta, the server exposes 5 virtualization tools instead of the full list (the same surface as the meta connector, but on the server side):
list_servers · list_tools · get_tool_schema · search_tools · execute
The model discovers and calls tools dynamically through these, keeping the context small. Local tool functions (refs) stay listed directly; resources and prompts are unaffected.
| Parameter | Type | Default | Description |
|---|---|---|---|
| expose_as_meta | boolean | false | Expose the referenced connectors through the 5 meta tools instead of the full tool list. |
| meta_semantic_search | boolean | false | Also fuse BM25 with embedding-based similarity (MiniLM-L6-v2) in search_tools. |
{
  "name": "meta-mcp-server",
  "mcp_refs": ["mcp-connector_a", "mcp-connector_b", "mcp-connector_c"],
  "expose_as_meta": true,
  "meta_semantic_search": true
}
Per-tool rate limiting and result caching
Both are enforced at the tools/call chokepoint and backed by the shared datastore (cluster-wide).
| Parameter | Type | Description |
|---|---|---|
| tool_rate_limits | object | Map of tool name (or "*") → max calls per minute per consumer (fixed 60s window). The consumer is resolved as apikey > authenticated user > bearer token. 0/absent = no limit. |
| tool_cache_ttls | object | Map of tool name (or "*") → cache TTL in seconds for the tool result. Opt-in, for idempotent tools only. Only successful results are cached, keyed by tool + arguments. 0/absent = no cache. |
{
  "name": "throttled-mcp-server",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_rate_limits": {
    "expensive_search": 10,
    "*": 600
  },
  "tool_cache_ttls": {
    "get_weather": 60,
    "list_countries": 3600
  }
}
The result cache key is the tool name + arguments (plus the server identity), not the caller. Only cache tools whose result depends solely on the arguments. Do not cache tools whose result depends on the caller's identity (e.g. forward_auth tools that return per-user data), otherwise one consumer's cached result could be served to another.
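As a rough illustration of both mechanisms, the Python sketch below models a fixed 60-second window counter and a caller-independent cache key. It uses an in-memory dict as a stand-in for the shared datastore, assumes a tool-specific limit takes precedence over "*", and assumes a sha256 over canonical JSON for the key; none of these details are confirmed by this page, and the class and function names are hypothetical.

```python
import hashlib
import json
import time
from typing import Optional


class FixedWindowLimiter:
    """In-memory stand-in for the shared datastore; one counter per 60s window."""

    def __init__(self, limits: dict[str, int], window: int = 60):
        self.limits = limits
        self.window = window
        self.counters: dict[tuple[str, str, int], int] = {}

    def allow(self, tool: str, consumer: str, now: Optional[float] = None) -> bool:
        # Assumed precedence: the tool's own entry, then "*", then no limit.
        limit = self.limits.get(tool, self.limits.get("*", 0))
        if limit <= 0:
            return True  # 0 or absent means no limit
        bucket = int((time.time() if now is None else now) // self.window)
        key = (tool, consumer, bucket)
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= limit


def cache_key(server_id: str, tool: str, arguments: dict) -> str:
    """Key on server + tool + arguments only; the caller is deliberately absent."""
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{server_id}|{tool}|{canonical}".encode()).hexdigest()
```

The absence of the consumer in cache_key is exactly why per-user tools should not be cached: two different callers with the same arguments produce the same key. The sketch never evicts old window counters, which a real store would handle with expiry.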
Managed resources
A virtual server can serve its own resources (in addition to those of the referenced connectors), defined inline via resources. Each entry has a uri, a name, optional metadata, and exactly one content source, which is text (inline), blob (inline base64), or url (fetched on the fly):
| Field | Description |
|---|---|
| uri, name | Required identifier and display name. |
| title, description, mime_type, annotations, meta | Optional metadata (meta is the MCP _meta object). |
| text / blob / url | Content source (priority url > blob > text). |
| url_as | "text" or "blob": how to return the fetched bytes (for url). |
| headers, timeout | Outgoing request settings (for url). |
| forward_auth | Inject the caller's token as {input_token} into url/headers. |
url, headers and text support the expression language. The {input_token} placeholder is substituted with the caller's bearer token when forward_auth is true.
resource_fetch_allowed_hosts is an optional allow-list (glob) of hosts the server may fetch resource URLs from. Leave it empty only if you trust the configured URLs (SSRF risk otherwise).
{
  "resources": [
    { "uri": "doc://readme", "name": "Readme", "mime_type": "text/markdown", "text": "# Hello" },
    { "uri": "api://profile", "name": "Profile", "url": "https://api.example.com/me", "url_as": "text", "forward_auth": true }
  ],
  "resource_fetch_allowed_hosts": ["api.example.com"]
}
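The host allow-list can be illustrated with a small Python check built on the standard library's glob matcher. This is a sketch under assumptions: the helper name is_fetch_allowed is hypothetical, matching is done on the lowercased hostname only, and an empty list is treated as "no restriction" as the text above implies.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_fetch_allowed(url: str, allowed_hosts: list[str]) -> bool:
    """Return True when the URL's host matches one of the glob patterns.

    Assumption: an empty allow-list means "no restriction", as the text implies.
    """
    if not allowed_hosts:
        return True
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed_hosts)
```

Matching on the parsed hostname rather than on the raw URL string keeps a host such as api.example.com.evil.test from slipping past a pattern for api.example.com.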
Outbound resource fetches (resources that use a url) emit a McpResourceFetchAudit audit event when emit_audit_events is enabled, and always emit mcp.resource.fetch.calls / mcp.resource.fetch.errors / mcp.resource.fetch.duration metrics.
Managed prompts
Similarly, a virtual server can serve its own prompts via prompts. Each prompt declares its arguments and a list of messages; message text supports {{argName}} substitution (from the prompts/get arguments) and the expression language.
| Field | Description |
|---|---|
| name | Required prompt name. |
| title, description, meta | Optional metadata. |
| arguments | Array of { name, description?, required? }. |
| messages | Array of { role, text }, where role is "user", "assistant", or "system". |
{
  "prompts": [
    {
      "name": "summarize",
      "description": "Summarize a text in a given language",
      "arguments": [{ "name": "lang", "required": true }],
      "messages": [{ "role": "user", "text": "Summarize the following text in {{lang}}." }]
    }
  ]
}
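The {{argName}} substitution can be sketched in Python as below. The function name render_prompt is hypothetical, and two behaviors are assumptions: a missing required argument raises an error, and a placeholder with no matching argument is left untouched. Expression-language evaluation is not modeled.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(prompt: dict, arguments: dict[str, str]) -> list[dict]:
    """Render a managed prompt's messages for a prompts/get call.

    Assumptions: a missing required argument is an error, and an unknown
    placeholder is left untouched. Expression-language evaluation is not modeled.
    """
    for arg in prompt.get("arguments", []):
        if arg.get("required") and arg["name"] not in arguments:
            raise ValueError(f"missing required argument: {arg['name']}")

    def substitute(match: re.Match) -> str:
        # Fall back to the original "{{name}}" text when no argument is given.
        return str(arguments.get(match.group(1), match.group(0)))

    return [
        {"role": m["role"], "text": PLACEHOLDER.sub(substitute, m["text"])}
        for m in prompt["messages"]
    ]
```

Called with the summarize prompt above and {"lang": "French"}, this sketch yields a single user message reading "Summarize the following text in French."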
Item overlays
overlays lets you apply per-item JSON patches, deep-merged onto tools, prompts, resources and resource templates at list time (managed items included). This is handy to inject _meta/annotations, tweak a description, or add an outputSchema without touching the upstream server.
The shape is keyed by category, then by item key (tool/prompt name, resource name-or-uri, template uriTemplate). The special key "*" patches every item in a category. Deep-merge: nested objects are merged, scalars and arrays are replaced.
{
  "overlays": {
    "tools": {
      "*": { "_meta": { "team": "platform" } },
      "delete_repository": { "annotations": { "destructiveHint": true } }
    },
    "resources": {
      "doc://readme": { "mimeType": "text/markdown" }
    }
  }
}
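The deep-merge rule (nested objects merged, scalars and arrays replaced) is easy to express in Python. The sketch below is illustrative only: the function names are hypothetical, items are keyed by name for simplicity, and the order in which the "*" patch and the item-specific patch are applied is an assumption.

```python
import copy


def deep_merge(base: dict, patch: dict) -> dict:
    """Nested objects are merged; scalars and arrays are replaced."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def apply_overlays(items: list[dict], overlays: dict[str, dict]) -> list[dict]:
    """Apply the "*" patch first, then the item-specific patch (assumed order)."""
    patched = []
    for item in items:
        merged = deep_merge(item, overlays.get("*", {}))
        merged = deep_merge(merged, overlays.get(item["name"], {}))
        patched.append(merged)
    return patched
```

Applied to the tools category above, every tool would gain _meta.team, and delete_repository would additionally get destructiveHint merged into whatever annotations it already carries.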
Zero-Trust controls
The zero_trust block adds three independent, opt-in security controls on top of an exposed server. Each defaults to OFF, and blocking is itself opt-in: by default a control runs in monitor mode (it emits a correlated McpZeroTrustAlert audit event + metrics, but lets traffic through). Flip the matching *_enforce flag to switch to block mode. Alerts route through the usual data exporters (Kafka/ES/S3/SIEM…), correlated with McpAudit/McpClientAudit by request_id.
| Field | Type | Default | Description |
|---|---|---|---|
| pinning_enabled | boolean | false | A. Anti-rug-pull. Fingerprint (sha256 over name+description+inputSchema+annotations) and pin each tool the first time it is seen (Trust-On-First-Use). A later mutation of an already-pinned tool is detected. |
| pinning_enforce | boolean | false | When a mutation is detected: false = alert only (tool still served), true = drop the tool from tools/list and deny tools/call. |
| pinned_hashes | object | {} | Optional explicit map of tool name → expected fingerprint. An entry is authoritative (overrides TOFU), which is useful to pin a known-good hash declaratively. |
| pinning_epoch | number | 0 | Bump this to re-pin everything after a legitimate description change (it namespaces the stored pins). |
| description_guardrails | array | [] | B. Tool-poisoning / prompt-injection scanning of tool descriptions at tools/list. Reuses the guardrails engine: each item is { "enabled": true, "id": "<guardrail>", "config": {...} } (e.g. prompt_injection, pif, secrets_leakage, regex, contains, wasm). |
| result_guardrails | array | [] | Same, applied to tool results at tools/call. |
| guardrails_enforce | boolean | false | When a guardrail denies: false = alert only, true = drop the offending tool (description scan) / block the result (result scan). |
| redact_arguments | boolean | false | C. Redaction. Mask PII/secrets in tool arguments before forwarding them upstream. |
| redact_results | boolean | false | Mask PII/secrets in tool results before returning them to the model. |
| redaction_builtins | array | [] | Built-in patterns to enable: email, credit_card, ssn, ipv4, jwt, aws_key, private_key, generic_api_key. Deterministic (no LLM), so they run on every call with no added latency/cost. |
| redaction_rules | array | [] | Custom rules { "name": "...", "regex": "...", "replacement": "«redacted»" }, applied after the built-ins. |
The two enforce controls are also re-checked at call time: a tool that pinning or a description guardrail removed from the secured tools/list cannot be called directly, even by a client that never issued tools/list.
{
  "zero_trust": {
    // A. anti-rug-pull: alert on mutation, then block once you trust the baseline
    "pinning_enabled": true,
    "pinning_enforce": false,
    "pinning_epoch": 0,
    // B. scan descriptions and results for prompt-injection / tool-poisoning
    "description_guardrails": [
      { "enabled": true, "id": "prompt_injection", "config": { "provider": "provider_xxx", "threshold": 90 } }
    ],
    "result_guardrails": [
      { "enabled": true, "id": "contains", "config": { "contains_none": ["BEGIN PRIVATE KEY"] } }
    ],
    "guardrails_enforce": true,
    // C. redact secrets/PII on the way in and out
    "redact_results": true,
    "redaction_builtins": ["email", "jwt", "aws_key", "private_key"],
    "redaction_rules": [
      { "name": "internal-ticket", "regex": "TICKET-\\d+", "replacement": "«ticket»" }
    ]
  }
}
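The redaction step (control C) is deterministic pattern replacement, which the Python sketch below imitates. The two regular expressions and the «name» replacement for built-ins are stand-ins invented for this example; the gateway's actual built-in patterns are not shown on this page. Only the ordering (built-ins first, then custom rules) comes from the table above.

```python
import re

# Illustrative patterns only; the gateway's built-in expressions are not documented here.
BUILTINS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str, builtins: list[str], rules: list[dict]) -> str:
    """Apply the enabled built-ins first, then the custom rules, in order."""
    for name in builtins:
        pattern = BUILTINS.get(name)
        if pattern is not None:  # built-ins unknown to this sketch are skipped
            text = pattern.sub(f"«{name}»", text)
    for rule in rules:
        text = re.sub(rule["regex"], rule["replacement"], text)
    return text
```

With the internal-ticket rule from the example, a string such as "see TICKET-42" would come back as "see «ticket»" in this sketch.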
Start in monitor mode (*_enforce: false) and watch the McpZeroTrustAlert events / mcp.zerotrust.* metrics to size the impact, then enable enforcement once the baseline is clean. After a legitimate upstream change, bump pinning_epoch to re-pin.
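To show how pinning, pinned_hashes, and pinning_epoch fit together, here is a minimal Python sketch of Trust-On-First-Use. The hashed fields come from the table above, but the canonical JSON serialization, the in-memory store, and the function names are assumptions made for the example.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """sha256 over name + description + inputSchema + annotations.

    Assumption: the fields are serialized as canonical JSON before hashing.
    """
    material = {k: tool.get(k) for k in ("name", "description", "inputSchema", "annotations")}
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_pin(tool: dict, store: dict, pinned_hashes: dict, epoch: int = 0) -> bool:
    """Return True when the tool matches its pin; pin it on first sight (TOFU)."""
    current = fingerprint(tool)
    explicit = pinned_hashes.get(tool["name"])
    if explicit is not None:
        return current == explicit  # an explicit entry overrides TOFU
    key = (epoch, tool["name"])  # the epoch namespaces the stored pins
    return store.setdefault(key, current) == current
```

Because the epoch is part of the storage key, bumping it makes every tool look new again, so the next sighting re-pins it without deleting the earlier pins.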
Registry / publication
A virtual server carries a registry block (entity-level metadata) that lets you publish it to the standard MCP registry so registry-aware clients discover it. The fields (published, name, version, title, url, deprecated) and the two exposition plugins are documented on the dedicated MCP Registry page.
In short: set registry.published = true (your approval gate, gated by RBAC) and the server is projected to the official server.json and served over GET /v0/servers.