MCP Virtual Server

An MCP Virtual Server is a reusable, persisted definition of an exposed MCP server. Instead of repeating the same settings on every route, you define them once as a virtual server and reference it from the exposition plugins (and from the preset) through server_ref.

A virtual server bundles everything an exposition needs:

name, version — advertised in the initialize response (serverInfo).
refs (tool functions) and mcp_refs (MCP connectors) — the tools/resources/prompts to aggregate.
Filtering (include_*/exclude_*, allow_rules, disallow_rules).
OAuth 2.0 (enforce_oauth, auth_module_ref, auth_prm_url, validate_audience, opaque_token, use_introspection).
Scope-based tool authorization (tool_scopes).
Meta mode (expose_as_meta, meta_semantic_search).
Rate limiting & caching (tool_rate_limits, tool_cache_ttls).
Managed resources (resources, resource_fetch_allowed_hosts), managed prompts (prompts), and item overlays (overlays).
Zero-Trust controls (zero_trust): anti-rug-pull pinning, tool-poisoning / prompt-injection scanning, and PII/secrets redaction.
Registry / publication (registry): publish this server to the standard MCP registry so registry-aware clients discover it.
emit_audit_events — see audit events.

Virtual servers are managed from the MCP Virtual Servers page (or the admin API resource ai-gateway.extensions.cloud-apim.com/v1/mcp-virtual-servers).

tip

The same configuration fields also exist inline on the exposition plugins. Defining a virtual server once and referencing it with server_ref keeps your routes DRY and lets you reuse one definition across many endpoints.

Referencing and overriding

The exposition plugins accept a server_ref. When set, the plugin starts from the virtual server's config and lets you override individual fields inline on the plugin. The merge rules are hybrid:

Optional fields → an inline value, when set, wins;
array fields → a non-empty inline array wins;
the enable-flags (enforce_oauth, emit_audit_events, expose_as_meta, …) are OR'd;
allow_rules/disallow_rules, managed resources/prompts, overlays, and zero_trust are merged additively.

This lets you keep a broad definition on the virtual server and tighten it per route, or vice-versa.

{
  "server_ref": "mcp-virtual-server_xxxxx",
  // inline overrides (optional):
  "emit_audit_events": true
}
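
As a rough worked example of the merge rules above, assume a virtual server and a route that references it. The wrapper keys (virtual_server, plugin_config, effective_config) and all values are hypothetical and only put the three documents side by side. The effective result is derived from the rules as stated on this page, not captured from the product.

```json
{
  // Hypothetical wrapper keys: they only group the three documents for comparison.
  "virtual_server": {
    "mcp_refs": ["mcp-connector_xxxxx"],
    "include_functions": ["get_.*", "list_.*"],
    "enforce_oauth": true,
    "emit_audit_events": false
  },
  "plugin_config": {
    "server_ref": "mcp-virtual-server_xxxxx",
    // non-empty inline array: replaces the server's include_functions
    "include_functions": ["get_.*"],
    // enable flags are OR'd with the server's values
    "enforce_oauth": false,
    "emit_audit_events": true
  },
  "effective_config": {
    "mcp_refs": ["mcp-connector_xxxxx"],
    "include_functions": ["get_.*"],
    "enforce_oauth": true,
    "emit_audit_events": true
  }
}
```

Under the OR rule, an inline false does not switch off a flag that the virtual server enables, so enforce_oauth stays true here.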

Filtering

You can filter which tools, resources, resource templates, and prompts are exposed to clients. These filters are applied on top of any filters already configured on the MCP connectors themselves — so you can have broad access on a connector and restrict it further here, or vice-versa.

Parameter	Type	Description
`include_functions`	array of strings	Only expose functions matching these regex patterns
`exclude_functions`	array of strings	Hide functions matching these regex patterns
`include_resources`	array of strings	Only expose resources matching these patterns
`exclude_resources`	array of strings	Hide resources matching these patterns
`include_resource_templates`	array of strings	Only expose resource templates matching these patterns
`exclude_resource_templates`	array of strings	Hide resource templates matching these patterns
`include_resource_template_uris`	array of strings	Only expose resource template URIs matching these patterns
`exclude_resource_template_uris`	array of strings	Hide resource template URIs matching these patterns
`include_prompts`	array of strings	Only expose prompts matching these patterns
`exclude_prompts`	array of strings	Hide prompts matching these patterns
`allow_rules`	object	Advanced JsonPath-based allow rules (see MCP Connectors — Advanced rules)
`disallow_rules`	object	Advanced JsonPath-based disallow rules

{
  "name": "filtered-mcp-server",
  "version": "1.0.0",
  "refs": ["tool-function_xxxxx"],
  "mcp_refs": ["mcp-connector_xxxxx"],
  "include_functions": ["get_.*", "list_.*"],
  "exclude_functions": ["delete_.*"],
  "include_prompts": ["summarize"],
  "exclude_prompts": [],
  "include_resources": [],
  "exclude_resources": ["admin_.*"]
}

allow_rules/disallow_rules use the same JsonPath shape as on connectors — keyed by category (tool_rules, prompt_rules, resource_rules, resource_templates_rules), then by item name → an array of { path, value } validators. See MCP Connectors — Advanced rules for the full format.
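
A sketch of that shape, limited to what this page states: category, then item name, then an array of { path, value } validators. The tool name, the JsonPath and the value are hypothetical. What the path is evaluated against, and how value is compared, is defined on the MCP Connectors page rather than here.

```json
{
  "allow_rules": {
    "tool_rules": {
      // hypothetical tool name and validator
      "create_repository": [
        { "path": "$.visibility", "value": "private" }
      ]
    }
  }
}
```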

OAuth 2.0 security

The OAuth tutorial walks through a full setup; the options are:

Parameter	Type	Default	Description
`enforce_oauth`	boolean	`false`	Require a valid OAuth 2.0 Bearer token on every MCP request. A missing/invalid token returns `401` with a `WWW-Authenticate: Bearer ... resource_metadata="..."` header (RFC 9728 discovery).
`auth_module_ref`	string	—	The OAuth2/OIDC auth module used to validate tokens.
`auth_prm_url`	string	—	Override the `resource_metadata` URL advertised in the `401` (default: `<proto>://<host>/.well-known/oauth-protected-resource`).
`validate_audience`	boolean	`false`	Audience binding (RFC 8707): require the token `aud` claim to match this MCP server's URL. Prevents token-passthrough and confused-deputy attacks. The `aud` claim may be a single string or an array.
`opaque_token`	boolean	`false`	Accept opaque (non-JWT) access tokens, validated remotely (userinfo endpoint by default) instead of local JWT signature verification.
`use_introspection`	boolean	`false`	For opaque tokens, validate via the auth module's RFC 7662 introspection endpoint instead of userinfo.
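
A minimal sketch combining these options for JWT access tokens. The server name and the auth module id are placeholders, and only fields from the table above are used.

```json
{
  "name": "protected-mcp-server",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "enforce_oauth": true,
  "auth_module_ref": "auth_mod_xxxxx",
  // require the token aud claim to match this server's URL
  "validate_audience": true,
  // JWTs are verified locally, so both flags below stay at their defaults
  "opaque_token": false,
  "use_introspection": false
}
```

For opaque tokens, the table suggests setting opaque_token to true, and use_introspection to true as well when the auth module exposes an RFC 7662 introspection endpoint.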

One-click setup

The Protected MCP Streamable HTTP preset wires a protected MCP endpoint and the RFC 9728 discovery document from a single virtual server reference.

Scope-based tool authorization

Beyond all-or-nothing OAuth, you can require specific OAuth scopes per tool with tool_scopes. The caller's granted scopes are read from the token (the space-delimited scope claim and/or the scp array, or the introspection response for opaque tokens), and a tool is allowed only when the caller holds every scope it requires. Tools with no entry are open. The check hides unauthorized tools from tools/list and denies tools/call for them.

Parameter	Type	Description
`tool_scopes`	object	Map of `tool name` (or `"*"` as a default for every tool) → array of required scopes.

{
  "name": "scoped-mcp-server",
  "enforce_oauth": true,
  "auth_module_ref": "auth_mod_xxxxx",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_scopes": {
    "create_repository": ["mcp:write"],
    "delete_repository": ["mcp:write", "mcp:admin"],
    "*": ["mcp:tools"]
  }
}

In this example every tool requires mcp:tools, create_repository additionally requires mcp:write, and delete_repository requires both mcp:write and mcp:admin.
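
To make the outcome concrete, consider a caller whose decoded access token carries the claims below. The claim values are hypothetical, and the result assumes the reading of the example given above.

```json
{
  // hypothetical decoded access-token claims
  "sub": "user-123",
  "scope": "mcp:tools mcp:write"
}
```

With these scopes, create_repository is listed and callable (mcp:tools and mcp:write are both granted), delete_repository is hidden from tools/list and denied on tools/call (mcp:admin is missing), and any other tool only needs mcp:tools. A token carrying "scp": ["mcp:tools", "mcp:write"] would be expected to behave the same way.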

Meta mode (tool virtualization)

When a server aggregates many connectors, exposing the full tool list bloats the model's context (every tool schema is injected into the prompt on each call). With expose_as_meta, the server exposes 5 virtualization tools instead of the full list — the same surface as the meta connector, but on the server side:

list_servers · list_tools · get_tool_schema · search_tools · execute

The model discovers and calls tools dynamically through these, keeping the context small. Local tool functions (refs) stay listed directly; resources and prompts are unaffected.

Parameter	Type	Default	Description
`expose_as_meta`	boolean	`false`	Expose the referenced connectors through the 5 meta tools instead of the full tool list.
`meta_semantic_search`	boolean	`false`	Also fuse BM25 with embedding-based similarity (MiniLM-L6-v2) in `search_tools`.

{
  "name": "meta-mcp-server",
  "mcp_refs": ["mcp-connector_a", "mcp-connector_b", "mcp-connector_c"],
  "expose_as_meta": true,
  "meta_semantic_search": true
}

Per-tool rate limiting and result caching

Both are enforced at the tools/call chokepoint and backed by the shared datastore (cluster-wide).

Parameter	Type	Description
`tool_rate_limits`	object	Map of `tool name` (or `"*"`) → max calls per minute per consumer (fixed 60s window). The consumer is resolved as apikey > authenticated user > bearer token. `0`/absent = no limit.
`tool_cache_ttls`	object	Map of `tool name` (or `"*"`) → cache TTL in seconds for the tool result. Opt-in, for idempotent tools only. Only successful results are cached, keyed by tool + arguments. `0`/absent = no cache.

{
  "name": "throttled-mcp-server",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "tool_rate_limits": {
    "expensive_search": 10,
    "*": 600
  },
  "tool_cache_ttls": {
    "get_weather": 60,
    "list_countries": 3600
  }
}

The cache is keyed by tool + arguments — not per consumer

The result cache key is the tool name + arguments (plus the server identity), not the caller. Only cache tools whose result depends solely on the arguments. Do not cache tools whose result depends on the caller's identity (e.g. forward_auth tools that return per-user data), otherwise one consumer's cached result could be served to another.

Managed resources

A virtual server can serve its own resources (in addition to those of the referenced connectors), defined inline via resources. Each entry has a uri, a name, optional metadata, and exactly one content source — text (inline), blob (inline base64), or url (fetched on the fly):

Field	Description
`uri`, `name`	Required identifier and display name.
`title`, `description`, `mime_type`, `annotations`, `meta`	Optional metadata (`meta` is the MCP `_meta` object).
`text` / `blob` / `url`	Content source (priority `url` > `blob` > `text`).
`url_as`	`"text"` or `"blob"` — how to return the fetched bytes (for `url`).
`headers`, `timeout`	Outgoing request settings (for `url`).
`forward_auth`	Inject the caller's token as `{input_token}` into `url`/`headers`.

url, headers and text support the expression language. The {input_token} placeholder is substituted with the caller's bearer token when forward_auth is true.

resource_fetch_allowed_hosts is an optional allow-list (glob) of hosts the server may fetch resource URLs from — leave empty only if you trust the configured URLs (SSRF risk otherwise).

{
  "resources": [
    { "uri": "doc://readme", "name": "Readme", "mime_type": "text/markdown", "text": "# Hello" },
    { "uri": "api://profile", "name": "Profile", "url": "https://api.example.com/me", "url_as": "text", "forward_auth": true }
  ],
  "resource_fetch_allowed_hosts": ["api.example.com"]
}

Outbound resource fetches (resources that use a url) emit a McpResourceFetchAudit audit event when emit_audit_events is enabled, and always emit mcp.resource.fetch.calls / mcp.resource.fetch.errors / mcp.resource.fetch.duration metrics.

Managed prompts

Similarly, a virtual server can serve its own prompts via prompts. Each prompt declares its arguments and a list of messages; message text supports {{argName}} substitution (from the prompts/get arguments) and the expression language.

Field	Description
`name`	Required prompt name.
`title`, `description`, `meta`	Optional metadata.
`arguments`	Array of `{ name, description?, required? }`.
`messages`	Array of `{ role: "user" | "assistant" | "system", text }`.

{
  "prompts": [
    {
      "name": "summarize",
      "description": "Summarize a text in a given language",
      "arguments": [{ "name": "lang", "required": true }],
      "messages": [{ "role": "user", "text": "Summarize the following text in {{lang}}." }]
    }
  ]
}
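
To illustrate the {{argName}} substitution, here is a sketch of a prompts/get exchange for the summarize prompt above. The JSON-RPC envelope follows the MCP specification. The id, the argument value and the exact response returned by the gateway are assumptions.

```json
{
  // Hypothetical wrapper keys: "request" and "response" are two separate JSON-RPC messages.
  "request": {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "prompts/get",
    "params": { "name": "summarize", "arguments": { "lang": "French" } }
  },
  "response": {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
      "description": "Summarize a text in a given language",
      "messages": [
        // {{lang}} has been replaced by the value passed in the request
        { "role": "user", "content": { "type": "text", "text": "Summarize the following text in French." } }
      ]
    }
  }
}
```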

Item overlays

overlays lets you apply per-item JSON patches, deep-merged onto tools, prompts, resources and resource templates at list time (managed items included). This is handy to inject _meta/annotations, tweak a description, or add an outputSchema without touching the upstream server.

The shape is keyed by category, then by item key (tool/prompt name, resource name-or-uri, template uriTemplate). The special key "*" patches every item in a category. Deep-merge: nested objects are merged, scalars and arrays are replaced.

{
  "overlays": {
    "tools": {
      "*": { "_meta": { "team": "platform" } },
      "delete_repository": { "annotations": { "destructiveHint": true } }
    },
    "resources": {
      "doc://readme": { "mimeType": "text/markdown" }
    }
  }
}
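
As a worked example of the deep-merge, suppose the upstream server lists delete_repository as shown under upstream_tool. The wrapper keys and the upstream fields are hypothetical. The listed_tool result is what the rules above imply once both the "*" patch and the delete_repository patch from the previous example are applied.

```json
{
  // Hypothetical wrapper keys: the tool as the upstream lists it, then as the client sees it.
  "upstream_tool": {
    "name": "delete_repository",
    "description": "Delete a repository",
    "inputSchema": { "type": "object" },
    "annotations": { "title": "Delete repository" },
    "_meta": { "owner": "scm" }
  },
  "listed_tool": {
    "name": "delete_repository",
    "description": "Delete a repository",
    "inputSchema": { "type": "object" },
    // nested objects are merged: existing keys are kept, patched keys are added
    "annotations": { "title": "Delete repository", "destructiveHint": true },
    "_meta": { "owner": "scm", "team": "platform" }
  }
}
```

The two patches touch different keys here, so the result does not depend on the order in which they are applied.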

Zero-Trust controls

The zero_trust block adds three independent, opt-in security controls on top of an exposed server. Each defaults to OFF, and blocking is itself opt-in: by default a control runs in monitor mode (it emits a correlated McpZeroTrustAlert audit event and metrics, but lets traffic through). Set the matching *_enforce flag to true to switch to blocking. Alerts route through the usual data exporters (Kafka/ES/S3/SIEM…), correlated with McpAudit/McpClientAudit by request_id.

Field	Type	Default	Description
`pinning_enabled`	boolean	`false`	A. Anti-rug-pull. Fingerprint (sha256 over `name`+`description`+`inputSchema`+`annotations`) and pin each tool the first time it is seen (Trust-On-First-Use). A later mutation of an already-pinned tool is detected.
`pinning_enforce`	boolean	`false`	When a mutation is detected: `false` = alert only (tool still served), `true` = drop the tool from `tools/list` and deny `tools/call`.
`pinned_hashes`	object	`{}`	Optional explicit map `tool name → expected fingerprint`. An entry is authoritative (overrides TOFU) — useful to pin a known-good hash declaratively.
`pinning_epoch`	number	`0`	Bump this to re-pin everything after a legitimate description change (it namespaces the stored pins).
`description_guardrails`	array	`[]`	B. Tool-poisoning / prompt-injection scanning of tool descriptions at `tools/list`. Reuses the guardrails engine — each item is `{ "enabled": true, "id": "<guardrail>", "config": {...} }` (e.g. `prompt_injection`, `pif`, `secrets_leakage`, `regex`, `contains`, `wasm`).
`result_guardrails`	array	`[]`	Same, applied to tool results at `tools/call`.
`guardrails_enforce`	boolean	`false`	When a guardrail denies: `false` = alert only, `true` = drop the offending tool (description scan) / block the result (result scan).
`redact_arguments`	boolean	`false`	C. Redaction. Mask PII/secrets in tool arguments before forwarding them upstream.
`redact_results`	boolean	`false`	Mask PII/secrets in tool results before returning them to the model.
`redaction_builtins`	array	`[]`	Built-in patterns to enable: `email`, `credit_card`, `ssn`, `ipv4`, `jwt`, `aws_key`, `private_key`, `generic_api_key`. Deterministic (no LLM), so they run on every call without the latency or cost of a model call.
`redaction_rules`	array	`[]`	Custom rules `{ "name": "...", "regex": "...", "replacement": "«redacted»" }`, applied after the built-ins.

The two enforce controls are also re-checked at call time: a tool that pinning or a description guardrail removed from the secured tools/list cannot be called directly, even by a client that never issued tools/list.

{
  "zero_trust": {
    // A. anti-rug-pull — alert on mutation, then block once you trust the baseline
    "pinning_enabled": true,
    "pinning_enforce": false,
    "pinning_epoch": 0,

    // B. scan descriptions and results for prompt-injection / tool-poisoning
    "description_guardrails": [
      { "enabled": true, "id": "prompt_injection", "config": { "provider": "provider_xxx", "threshold": 90 } }
    ],
    "result_guardrails": [
      { "enabled": true, "id": "contains", "config": { "contains_none": ["BEGIN PRIVATE KEY"] } }
    ],
    "guardrails_enforce": true,

    // C. redact secrets/PII in tool results (set redact_arguments to true to also mask arguments sent upstream)
    "redact_results": true,
    "redaction_builtins": ["email", "jwt", "aws_key", "private_key"],
    "redaction_rules": [
      { "name": "internal-ticket", "regex": "TICKET-\\d+", "replacement": "«ticket»" }
    ]
  }
}

tip

Start in monitor mode (*_enforce: false) and watch the McpZeroTrustAlert events / mcp.zerotrust.* metrics to size the impact, then enable enforcement once the baseline is clean. After a legitimate upstream change, bump pinning_epoch to re-pin.

Registry / publication

A virtual server carries a registry block (entity-level metadata) that lets you publish it to the standard MCP registry so registry-aware clients discover it. The fields — published, name, version, title, url, deprecated — and the two exposition plugins are documented on the dedicated MCP Registry page.

In short: set registry.published = true (your approval gate, gated by RBAC) and the server is projected to the official server.json and served over GET /v0/servers.
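
A sketch of a virtual server carrying a registry block, using only the field names listed above. All values are placeholders, and the accepted formats (for example how name and url must be written) are defined on the MCP Registry page, not here.

```json
{
  "name": "published-mcp-server",
  "version": "1.0.0",
  "mcp_refs": ["mcp-connector_xxxxx"],
  "registry": {
    // placeholder values; check the MCP Registry page for the accepted formats
    "published": true,
    "name": "com.example/published-mcp-server",
    "version": "1.0.0",
    "title": "Published MCP Server",
    "url": "https://mcp.example.com/mcp",
    "deprecated": false
  }
}
```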
