MCP Virtual Server

An MCP Virtual Server is a reusable, persisted definition of an exposed MCP server. Instead of repeating the same settings on every route, you define them once as a virtual server and reference it from the exposition plugins (and from the preset) through server_ref.

A virtual server bundles everything an exposition needs:

  • name, version: advertised in the initialize response (serverInfo).
  • refs (tool functions) and mcp_refs (MCP connectors): the tools/resources/prompts to aggregate.
  • Filtering (include_*/exclude_*, allow_rules, disallow_rules).
  • OAuth 2.0 (enforce_oauth, auth_module_ref, auth_prm_url, validate_audience, opaque_token, use_introspection).
  • Scope-based tool authorization (tool_scopes).
  • Meta mode (expose_as_meta, meta_semantic_search).
  • Rate limiting & caching (tool_rate_limits, tool_cache_ttls).
  • Managed resources (resources, resource_fetch_allowed_hosts), managed prompts (prompts), and item overlays (overlays).
  • Zero-Trust controls (zero_trust): anti-rug-pull pinning, tool-poisoning / prompt-injection scanning, and PII/secrets redaction.
  • Registry / publication (registry): publish this server to the standard MCP registry so registry-aware clients discover it.
  • emit_audit_events: see audit events.

Virtual servers are managed from the MCP Virtual Servers page (or the admin API resource ai-gateway.extensions.cloud-apim.com/v1/mcp-virtual-servers).

tip

The same configuration fields also exist inline on the exposition plugins. Defining a virtual server once and referencing it with server_ref keeps your routes DRY and lets you reuse one definition across many endpoints.

Referencing and overriding

The exposition plugins accept a server_ref. When set, the plugin starts from the virtual server's config and lets you override individual fields inline on the plugin. The merge rules are hybrid:

  • optional (Option) fields: an inline value that is set (Some) wins;
  • array fields: a non-empty inline array wins;
  • the enable-flags (enforce_oauth, emit_audit_events, expose_as_meta, …) are OR'd;
  • allow_rules/disallow_rules, managed resources/prompts, overlays, and zero_trust are merged additively.

This lets you keep a broad definition on the virtual server and tighten it per route, or vice-versa.

{
  "server_ref": "mcp-virtual-server_xxxxx",
  // inline overrides (optional):
  "emit_audit_events": true
}
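
To make the hybrid merge rules concrete, here is a minimal Python sketch of how an inline plugin config could be layered over a virtual server. It is not the gateway's implementation: the function name merge_config, the three field sets, and the shallow "additive" strategy (lists concatenated, objects updated) are assumptions made for illustration.

```python
from typing import Any

# Hypothetical field classification, for illustration only.
FLAG_FIELDS = {"enforce_oauth", "emit_audit_events", "expose_as_meta"}
ARRAY_FIELDS = {"refs", "mcp_refs", "include_functions", "exclude_functions"}
ADDITIVE_FIELDS = {
    "allow_rules", "disallow_rules", "resources", "prompts", "overlays", "zero_trust",
}


def merge_config(server: dict[str, Any], inline: dict[str, Any]) -> dict[str, Any]:
    """Layer inline plugin overrides on top of a virtual server definition."""
    merged = dict(server)
    for key, value in inline.items():
        base = server.get(key)
        if key in FLAG_FIELDS:
            # Enable-flags are OR'd: either side can turn a flag on.
            merged[key] = bool(base) or bool(value)
        elif key in ARRAY_FIELDS:
            # A non-empty inline array wins; an empty one keeps the server's.
            if value:
                merged[key] = list(value)
        elif key in ADDITIVE_FIELDS:
            # Assumed shallow union: lists are concatenated, objects are updated.
            if isinstance(value, list):
                merged[key] = list(base or []) + value
            elif isinstance(value, dict):
                merged[key] = {**(base or {}), **value}
        elif value is not None:
            # Optional fields: an inline value that is set wins.
            merged[key] = value
    return merged
```

One consequence of OR'ing the enable-flags is that a flag turned on in the virtual server cannot be turned off inline: an inline false simply leaves the server's true in place.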

Filtering

You can filter which tools, resources, resource templates, and prompts are exposed to clients. These filters are applied on top of any filters already configured on the MCP connectors themselves, so you can have broad access on a connector and restrict it further here, or vice-versa.

| Parameter | Type | Description |
| --- | --- | --- |
| include_functions | array of strings | Only expose functions matching these regex patterns |
| exclude_functions | array of strings | Hide functions matching these regex patterns |
| include_resources | array of strings | Only expose resources matching these patterns |
| exclude_resources | array of strings | Hide resources matching these patterns |
| include_resource_templates | array of strings | Only expose resource templates matching these patterns |
| exclude_resource_templates | array of strings | Hide resource templates matching these patterns |
| include_resource_template_uris | array of strings | Only expose resource template URIs matching these patterns |
| exclude_resource_template_uris | array of strings | Hide resource template URIs matching these patterns |
| include_prompts | array of strings | Only expose prompts matching these patterns |
| exclude_prompts | array of strings | Hide prompts matching these patterns |
| allow_rules | object | Advanced JsonPath-based allow rules (see MCP Connectors: Advanced rules) |
| disallow_rules | object | Advanced JsonPath-based disallow rules |

{
  "name": "filtered-mcp-server",
  "version": "1.0.0",
  "refs": ["tool-function_xxxxx"],
  "mcp_refs": ["mcp-connector_xxxxx"],
  "include_functions": ["get_.*", "list_.*"],
  "exclude_functions": ["delete_.*"],
  "include_prompts": ["summarize"],
  "exclude_prompts": [],
  "include_resources": [],
  "exclude_resources": ["admin_.*"]
}

allow_rules/disallow_rules use the same JsonPath shape as on connectors: keyed by category (tool_rules, prompt_rules, resource_rules, resource_templates_rules), then by item name, each mapping to an array of { path, value } validators. See MCP Connectors: Advanced rules for the full format.
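
As an illustration of the include_*/exclude_* semantics, the sketch below filters a list of tool names in Python. Several details are assumptions rather than documented behaviour: patterns are matched against the whole name (re.fullmatch), an empty include list means "no restriction", and an exclude match always wins. The helper name is_exposed and the tool names are hypothetical.

```python
import re


def is_exposed(name: str, include: list[str], exclude: list[str]) -> bool:
    """Return True when `name` passes the include/exclude patterns."""
    # Assumption: an empty include list means "no restriction".
    if include and not any(re.fullmatch(p, name) for p in include):
        return False
    # Assumption: exclusion is checked after inclusion and always wins.
    return not any(re.fullmatch(p, name) for p in exclude)


tools = ["get_user", "list_repos", "delete_repo", "create_repo"]
visible = [
    t for t in tools
    if is_exposed(t, include=["get_.*", "list_.*"], exclude=["delete_.*"])
]
```

With the patterns from the example above, only get_user and list_repos remain visible; create_repo is dropped because it matches no include pattern, not because it is excluded.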

OAuth 2.0 security

The OAuth tutorial walks through a full setup; the options are:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enforce_oauth | boolean | false | Require a valid OAuth 2.0 Bearer token on every MCP request. A missing/invalid token returns 401 with a WWW-Authenticate: Bearer ... resource_metadata="..." header (RFC 9728 discovery). |
| auth_module_ref | string | - | The OAuth2/OIDC auth module used to validate tokens. |
| auth_prm_url | string | - | Override the resource_metadata URL advertised in the 401 (default: <proto>://<host>/.well-known/oauth-protected-resource). |
| validate_audience | boolean | false | Audience binding (RFC 8707): require the token aud claim to match this MCP server's URL. Prevents token-passthrough / confused-deputy attacks. The aud claim may be a single string or an array. |
| opaque_token | boolean | false | Accept opaque (non-JWT) access tokens, validated remotely (userinfo endpoint by default) instead of local JWT signature verification. |
| use_introspection | boolean | false | For opaque tokens, validate via the auth module's RFC 7662 introspection endpoint instead of userinfo. |

One-click setup

The Protected MCP Streamable HTTP preset wires a protected MCP endpoint and the RFC 9728 discovery document from a single virtual server reference.
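
The audience check and the discovery URL described in the options table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gateway's code: the function names are hypothetical, the comparison is an exact string match (the real check may normalize URLs), and the header shows only the resource_metadata parameter, while the actual response may carry more.

```python
def default_prm_url(proto: str, host: str) -> str:
    """Default resource_metadata URL when auth_prm_url is not set."""
    return f"{proto}://{host}/.well-known/oauth-protected-resource"


def unauthorized_header(prm_url: str) -> str:
    """Minimal WWW-Authenticate value pointing clients at the discovery document."""
    return f'Bearer resource_metadata="{prm_url}"'


def audience_matches(claims: dict, server_url: str) -> bool:
    """Return True when the token's aud claim names this MCP server."""
    aud = claims.get("aud")
    if aud is None:
        return False
    # The aud claim may be a single string or an array of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud)
    # Assumption: exact string comparison against the server URL.
    return server_url in audiences
```

A token minted for another resource fails audience_matches even if its signature is valid, which is what stops a token obtained for one server from being replayed against this one.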

Scope-based tool authorization

Beyond all-or-nothing OAuth, you can require specific OAuth scopes per tool with tool_scopes. The caller's granted scopes are read from the token (the space-delimited scope claim and/or the scp array, or the introspection response for opaque tokens), and a tool is allowed only when the caller has all the scopes it requires. Tools with no entry are open. The check hides unauthorized tools from tools/list and denies them on tools/call.

| Parameter | Type | Description |
| --- | --- | --- |
| tool_scopes | object | Map of tool name (or "*" as a default for every tool) → array of required scopes. |

{
  "name": "scoped-mcp-server",
  "enforce_oauth": true,
  "auth_module_ref": "auth_mod_xxxxx",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_scopes": {
    "create_repository": ["mcp:write"],
    "delete_repository": ["mcp:write", "mcp:admin"],
    "*": ["mcp:tools"]
  }
}

In this example every tool requires mcp:tools, create_repository additionally requires mcp:write, and delete_repository requires both mcp:write and mcp:admin.
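
The scope check can be sketched as a set comparison. The Python below is an illustration, not the gateway's implementation: the function names are hypothetical, and it assumes (following the example's explanation) that the "*" scopes are added to a tool's own scopes rather than replaced by them.

```python
def granted_scopes(claims: dict) -> set[str]:
    """Collect scopes from the space-delimited `scope` claim and the `scp` array."""
    scopes = set((claims.get("scope") or "").split())
    scp = claims.get("scp") or []
    # `scp` is described as an array; a plain string is tolerated as a convenience.
    scopes.update(scp.split() if isinstance(scp, str) else scp)
    return scopes


def required_scopes(tool: str, tool_scopes: dict[str, list[str]]) -> set[str]:
    """Assumption: the "*" scopes apply to every tool, plus the tool's own entry."""
    return set(tool_scopes.get("*", [])) | set(tool_scopes.get(tool, []))


def is_allowed(tool: str, tool_scopes: dict[str, list[str]], claims: dict) -> bool:
    """A tool is allowed only when the caller holds all the scopes it requires."""
    return required_scopes(tool, tool_scopes) <= granted_scopes(claims)


tool_scopes = {
    "create_repository": ["mcp:write"],
    "delete_repository": ["mcp:write", "mcp:admin"],
    "*": ["mcp:tools"],
}
claims = {"scope": "mcp:tools mcp:write"}
```

With these claims, is_allowed returns True for create_repository and for any tool that has no entry of its own, and False for delete_repository because mcp:admin is missing. The same predicate would serve both to filter tools/list and to deny tools/call.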

Meta mode (tool virtualization)

When a server aggregates many connectors, exposing the full tool list bloats the model's context (every tool schema is injected into the prompt on each call). With expose_as_meta, the server exposes 5 virtualization tools instead of the full list. This is the same surface as the meta connector, but on the server side:

list_servers · list_tools · get_tool_schema · search_tools · execute

The model discovers and calls tools dynamically through these, keeping the context small. Local tool functions (refs) stay listed directly; resources and prompts are unaffected.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| expose_as_meta | boolean | false | Expose the referenced connectors through the 5 meta tools instead of the full tool list. |
| meta_semantic_search | boolean | false | Also fuse BM25 with embedding-based similarity (MiniLM-L6-v2) in search_tools. |

{
  "name": "meta-mcp-server",
  "mcp_refs": ["mcp-connector_a", "mcp-connector_b", "mcp-connector_c"],
  "expose_as_meta": true,
  "meta_semantic_search": true
}

Per-tool rate limiting and result caching

Both are enforced at the tools/call chokepoint and backed by the shared datastore (cluster-wide).

| Parameter | Type | Description |
| --- | --- | --- |
| tool_rate_limits | object | Map of tool name (or "*") → max calls per minute per consumer (fixed 60s window). The consumer is resolved as apikey > authenticated user > bearer token. 0/absent = no limit. |
| tool_cache_ttls | object | Map of tool name (or "*") → cache TTL in seconds for the tool result. Opt-in, for idempotent tools only. Only successful results are cached, keyed by tool + arguments. 0/absent = no cache. |

{
  "name": "throttled-mcp-server",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_rate_limits": {
    "expensive_search": 10,
    "*": 600
  },
  "tool_cache_ttls": {
    "get_weather": 60,
    "list_countries": 3600
  }
}

The cache is keyed by tool + arguments, not per consumer

The result cache key is the tool name + arguments (plus the server identity), not the caller. Only cache tools whose result depends solely on the arguments. Do not cache tools whose result depends on the caller's identity (e.g. forward_auth tools that return per-user data), otherwise one consumer's cached result could be served to another.

Managed resources

A virtual server can serve its own resources (in addition to those of the referenced connectors), defined inline via resources. Each entry has a uri, a name, optional metadata, and exactly one content source, which is text (inline), blob (inline base64), or url (fetched on the fly):

| Field | Description |
| --- | --- |
| uri, name | Required identifier and display name. |
| title, description, mime_type, annotations, meta | Optional metadata (meta is the MCP _meta object). |
| text / blob / url | Content source (priority url > blob > text). |
| url_as | "text" or "blob": how to return the fetched bytes (for url). |
| headers, timeout | Outgoing request settings (for url). |
| forward_auth | Inject the caller's token as {input_token} into url/headers. |

url, headers and text support the expression language. The {input_token} placeholder is substituted with the caller's bearer token when forward_auth is true.

resource_fetch_allowed_hosts is an optional allow-list (glob) of hosts the server may fetch resource URLs from. Leave it empty only if you trust the configured URLs; otherwise unrestricted fetches are an SSRF risk.

{
  "resources": [
    { "uri": "doc://readme", "name": "Readme", "mime_type": "text/markdown", "text": "# Hello" },
    { "uri": "api://profile", "name": "Profile", "url": "https://api.example.com/me", "url_as": "text", "forward_auth": true }
  ],
  "resource_fetch_allowed_hosts": ["api.example.com"]
}

Outbound resource fetches (resources that use a url) emit a McpResourceFetchAudit audit event when emit_audit_events is enabled, and always emit mcp.resource.fetch.calls / mcp.resource.fetch.errors / mcp.resource.fetch.duration metrics.

Managed prompts

Similarly, a virtual server can serve its own prompts via prompts. Each prompt declares its arguments and a list of messages; message text supports {{argName}} substitution (from the prompts/get arguments) and the expression language.

| Field | Description |
| --- | --- |
| name | Required prompt name. |
| title, description, meta | Optional metadata. |
| arguments | Array of { name, description?, required? }. |
| messages | Array of { role, text }, where role is "user", "assistant", or "system". |

{
  "prompts": [
    {
      "name": "summarize",
      "description": "Summarize a text in a given language",
      "arguments": [{ "name": "lang", "required": true }],
      "messages": [{ "role": "user", "text": "Summarize the following text in {{lang}}." }]
    }
  ]
}
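
The {{argName}} substitution can be illustrated with a short Python sketch. It covers only the placeholder step (not the expression language), and two behaviours are assumptions: a missing required argument raises an error, and a placeholder with no matching argument is left untouched. The function name render_prompt is hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(prompt: dict, arguments: dict) -> list[dict]:
    """Substitute {{argName}} placeholders in every message of a managed prompt."""
    missing = [
        arg["name"]
        for arg in prompt.get("arguments", [])
        if arg.get("required") and arg["name"] not in arguments
    ]
    if missing:
        # Assumption: missing required arguments are rejected up front.
        raise ValueError(f"missing required arguments: {', '.join(missing)}")

    def fill(match: re.Match) -> str:
        # Assumption: unknown placeholders are left as-is.
        return str(arguments.get(match.group(1), match.group(0)))

    return [
        {"role": message["role"], "text": PLACEHOLDER.sub(fill, message["text"])}
        for message in prompt["messages"]
    ]


summarize = {
    "name": "summarize",
    "arguments": [{"name": "lang", "required": True}],
    "messages": [{"role": "user", "text": "Summarize the following text in {{lang}}."}],
}
rendered = render_prompt(summarize, {"lang": "French"})
```

Here rendered holds a single user message whose text reads "Summarize the following text in French."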

Item overlays

overlays lets you apply per-item JSON patches, deep-merged onto tools, prompts, resources and resource templates at list time (managed items included). This is handy to inject _meta/annotations, tweak a description, or add an outputSchema without touching the upstream server.

The shape is keyed by category, then by item key (tool/prompt name, resource name-or-uri, template uriTemplate). The special key "*" patches every item in a category. Deep-merge: nested objects are merged, scalars and arrays are replaced.

{
  "overlays": {
    "tools": {
      "*": { "_meta": { "team": "platform" } },
      "delete_repository": { "annotations": { "destructiveHint": true } }
    },
    "resources": {
      "doc://readme": { "mimeType": "text/markdown" }
    }
  }
}
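
The deep-merge rule (nested objects merged, scalars and arrays replaced) can be sketched in Python as below. The order in which the "*" patch and the item-specific patch are applied is not stated above; this sketch assumes "*" first, then the specific key. Function names are hypothetical.

```python
import copy


def deep_merge(base: dict, patch: dict) -> dict:
    """Merge nested objects recursively; scalars and arrays are replaced."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_overlays(items: list[dict], overlays: dict, key: str = "name") -> list[dict]:
    """Patch every item of one category at list time."""
    wildcard = overlays.get("*", {})
    patched_items = []
    for item in items:
        # Assumption: the "*" patch is applied first, then the item-specific one.
        patched = deep_merge(item, wildcard)
        patched = deep_merge(patched, overlays.get(item[key], {}))
        patched_items.append(patched)
    return patched_items


tools = [
    {"name": "delete_repository", "annotations": {"readOnlyHint": False}},
    {"name": "get_repository"},
]
tool_overlays = {
    "*": {"_meta": {"team": "platform"}},
    "delete_repository": {"annotations": {"destructiveHint": True}},
}
patched_tools = apply_overlays(tools, tool_overlays)
```

After patching, delete_repository keeps its upstream readOnlyHint annotation, gains destructiveHint, and both tools carry the _meta.team value; the upstream items themselves are not mutated.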

Zero-Trust controls

The zero_trust block adds three independent, opt-in security controls on top of an exposed server. Each defaults to OFF, and blocking is itself opt-in: by default a control runs in monitor mode (it emits a correlated McpZeroTrustAlert audit event + metrics, but lets traffic through). Flip the matching *_enforce flag to switch to block. Alerts route through the usual data exporters (Kafka/ES/S3/SIEM…), correlated with McpAudit/McpClientAudit by request_id.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| pinning_enabled | boolean | false | A. Anti-rug-pull. Fingerprint (sha256 over name+description+inputSchema+annotations) and pin each tool the first time it is seen (Trust-On-First-Use). A later mutation of an already-pinned tool is detected. |
| pinning_enforce | boolean | false | When a mutation is detected: false = alert only (tool still served), true = drop the tool from tools/list and deny tools/call. |
| pinned_hashes | object | {} | Optional explicit map tool name → expected fingerprint. An entry is authoritative (overrides TOFU), which is useful to pin a known-good hash declaratively. |
| pinning_epoch | number | 0 | Bump this to re-pin everything after a legitimate description change (it namespaces the stored pins). |
| description_guardrails | array | [] | B. Tool-poisoning / prompt-injection scanning of tool descriptions at tools/list. Reuses the guardrails engine: each item is { "enabled": true, "id": "<guardrail>", "config": {...} } (e.g. prompt_injection, pif, secrets_leakage, regex, contains, wasm). |
| result_guardrails | array | [] | Same, applied to tool results at tools/call. |
| guardrails_enforce | boolean | false | When a guardrail denies: false = alert only, true = drop the offending tool (description scan) / block the result (result scan). |
| redact_arguments | boolean | false | C. Redaction. Mask PII/secrets in tool arguments before forwarding them upstream. |
| redact_results | boolean | false | Mask PII/secrets in tool results before returning them to the model. |
| redaction_builtins | array | [] | Built-in patterns to enable: email, credit_card, ssn, ipv4, jwt, aws_key, private_key, generic_api_key. Deterministic (no LLM), so they run on every call with no added latency/cost. |
| redaction_rules | array | [] | Custom rules { "name": "...", "regex": "...", "replacement": "«redacted»" }, applied after the built-ins. |

The two enforce controls are also re-checked at call time: a tool that pinning or a description guardrail removed from the secured tools/list cannot be called directly, even by a client that never issued tools/list.

{
  "zero_trust": {
    // A. anti-rug-pull: alert on mutation, then block once you trust the baseline
    "pinning_enabled": true,
    "pinning_enforce": false,
    "pinning_epoch": 0,

    // B. scan descriptions and results for prompt-injection / tool-poisoning
    "description_guardrails": [
      { "enabled": true, "id": "prompt_injection", "config": { "provider": "provider_xxx", "threshold": 90 } }
    ],
    "result_guardrails": [
      { "enabled": true, "id": "contains", "config": { "contains_none": ["BEGIN PRIVATE KEY"] } }
    ],
    "guardrails_enforce": true,

    // C. redact secrets/PII on the way in and out
    "redact_results": true,
    "redaction_builtins": ["email", "jwt", "aws_key", "private_key"],
    "redaction_rules": [
      { "name": "internal-ticket", "regex": "TICKET-\\d+", "replacement": "«ticket»" }
    ]
  }
}

tip

Start in monitor mode (*_enforce: false) and watch the McpZeroTrustAlert events / mcp.zerotrust.* metrics to size the impact, then enable enforcement once the baseline is clean. After a legitimate upstream change, bump pinning_epoch to re-pin.
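
Redaction is deterministic pattern replacement, with built-ins applied before custom rules. The Python sketch below illustrates that ordering only: the regular expressions are simplified stand-ins, not the gateway's actual built-in patterns, and the «redacted» mask used for built-ins is an assumption.

```python
import re

# Illustrative stand-ins for a few built-ins; the real patterns are not shown here.
BUILTIN_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str, builtins: list[str], rules: list[dict]) -> str:
    """Mask built-in patterns first, then apply custom rules in order."""
    for name in builtins:
        pattern = BUILTIN_PATTERNS.get(name)
        if pattern is not None:
            # Assumption: built-ins use a fixed mask.
            text = pattern.sub("«redacted»", text)
    for rule in rules:
        text = re.sub(rule["regex"], rule["replacement"], text)
    return text


masked = redact(
    "mail bob@example.com about TICKET-42 key AKIAABCDEFGHIJKLMNOP",
    builtins=["email", "aws_key"],
    rules=[{"name": "internal-ticket", "regex": r"TICKET-\d+", "replacement": "«ticket»"}],
)
```

In this run the e-mail address and the AWS-style key are masked by the built-ins and the ticket number by the custom rule, so none of the three sensitive values reaches the other side.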

Registry / publication

A virtual server carries a registry block (entity-level metadata) that lets you publish it to the standard MCP registry so registry-aware clients discover it. The fields (published, name, version, title, url, deprecated) and the two exposition plugins are documented on the dedicated MCP Registry page.

In short: set registry.published = true (your approval gate, gated by RBAC) and the server is projected to the official server.json and served over GET /v0/servers.