LLM Bots Blocker (Proof of Work)

The LLM Bots Blocker plugin protects your routes against automated LLM bot scraping by requiring visitors to solve a cryptographic Proof of Work (PoW) challenge before accessing your content.

Why block LLM bots?

LLM companies deploy bots (crawlers, scrapers) that crawl websites at massive scale to collect training data. Unlike traditional search engine bots, which index pages, these bots download the content of entire sites. This causes several problems:

Bandwidth and infrastructure costs: massive automated traffic increases server load and bandwidth bills
Intellectual property: your content is used to train commercial AI models without consent or compensation
Degraded user experience: bot traffic can slow down your site for real users
Rate limit exhaustion: bots can consume API quotas intended for legitimate users

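Traditional methods like robots.txt or user-agent blocking are easily bypassed: robots.txt is purely advisory, and the User-Agent header can be spoofed. A Proof of Work challenge provides a much stronger defense: it requires executing JavaScript and spending CPU time on every challenge, which simple HTTP clients cannot do at all and headless browsers can only do at a cost that adds up quickly at crawl scale.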

How it works

The plugin implements a challenge-response protocol based on SHA-256 hash computation:

1. Challenge phase

When a visitor arrives without a valid PoW cookie, the plugin intercepts the request and returns an HTML page containing a JavaScript worker. The worker receives a random challenge string and a difficulty level.
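As a rough illustration of the challenge phase, here is a minimal Python sketch of how a server could issue a challenge. The function name `issue_challenge`, the in-memory `CHALLENGES` store, and the 16-byte challenge length are assumptions made for this example; they are not taken from the plugin, which keeps challenges in the Otoroshi datastore.

```python
import secrets
import time

# Hypothetical in-memory store: challenge string -> expiry timestamp (seconds).
CHALLENGES = {}

def issue_challenge(difficulty=8, ttl_seconds=120, now=None):
    """Create a random challenge and remember when it expires."""
    now = time.time() if now is None else now
    # 16 random bytes rendered as 32 hex characters (assumed length).
    challenge = secrets.token_hex(16)
    CHALLENGES[challenge] = now + ttl_seconds
    # These two values are what the JavaScript worker would receive.
    return {"challenge": challenge, "difficulty": difficulty}
```

The `ttl_seconds` argument plays the role of `challenge_ttl_seconds` from the configuration.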

2. Solving phase

The JavaScript worker iterates through nonce values, computing SHA-256(challenge + ":" + nonce) until it finds a hash whose binary representation starts with at least N leading zero bits (where N is the configured difficulty). A progress bar is displayed to the user during this process, which typically takes less than 2 seconds.
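The search loop can be sketched as follows. The plugin runs it in a JavaScript worker; Python is used here only for illustration. Encoding the nonce as a decimal integer, starting the search at 0, and hashing the UTF-8 bytes of the string are assumptions of this sketch.

```python
import hashlib

def leading_zero_bits(digest):
    """Count the zero bits at the start of a digest."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        # For a non-zero byte, 8 - bit_length() is its number of leading zeros.
        count += 8 - byte.bit_length()
        break
    return count

def solve(challenge, difficulty):
    """Try nonces 0, 1, 2, ... until the hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce, digest.hex()
        nonce += 1
```

Calling `solve("some-challenge", 8)` returns a nonce together with a hex digest whose first byte is zero.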

3. Verification phase

The browser sends the solution (challenge, nonce, hash) back to the server via a POST request with the Oto-LLm-Pow header. The server verifies:

The challenge exists and has not expired
The hash matches SHA-256(challenge + ":" + nonce)
The hash has enough leading zero bits
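The three checks above could look like this in a minimal Python sketch. The `pending` dictionary (challenge to expiry timestamp) is a hypothetical stand-in for the challenge store, and deleting the challenge after a successful check is an assumption about single-use behaviour rather than something the text above states.

```python
import hashlib
import time

def verify_solution(challenge, nonce, claimed_hash, difficulty, pending, now=None):
    """Return True only if all three verification checks pass."""
    now = time.time() if now is None else now
    # 1. The challenge exists and has not expired.
    expiry = pending.get(challenge)
    if expiry is None or expiry < now:
        return False
    # 2. The hash matches SHA-256(challenge + ":" + nonce).
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode("utf-8")).digest()
    if digest.hex() != claimed_hash.lower():
        return False
    # 3. The hash has enough leading zero bits.
    zero_bits = len(digest) * 8 - int.from_bytes(digest, "big").bit_length()
    if zero_bits < difficulty:
        return False
    del pending[challenge]  # assumption: a challenge can be redeemed only once
    return True
```

Recomputing the hash on the server costs a single SHA-256 call, so verification stays cheap no matter how much work the client had to do.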

4. Token phase

Upon successful verification, the server issues a cookie signed with HMAC-SHA256. Subsequent requests carrying this cookie are allowed through without re-solving until the cookie expires. The cookie can be bound to the visitor's IP and/or User-Agent for additional security.
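To make the idea of a signed, bound token concrete, here is a minimal Python sketch. The `<expiry>.<signature>` layout, the `|` separator, and the function names are assumptions for illustration only; the plugin's actual cookie format is not described here. When a binding is disabled, the corresponding value is simply left empty.

```python
import base64
import hashlib
import hmac
import time

def _sign(secret, expiry, ip, user_agent):
    # The signed payload covers the expiry and the (optional) bindings.
    payload = "|".join([expiry, ip, user_agent]).encode("utf-8")
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(sig)

def issue_token(secret, ttl_seconds, ip="", user_agent="", now=None):
    """Build '<expiry>.<signature>' (hypothetical layout)."""
    now = time.time() if now is None else now
    expiry = str(int(now) + ttl_seconds)
    return expiry + "." + _sign(secret, expiry, ip, user_agent).decode("ascii")

def check_token(token, secret, ip="", user_agent="", now=None):
    """Accept the token only if it is unexpired and the signature matches."""
    now = time.time() if now is None else now
    expiry, _, sig = token.partition(".")
    if not expiry.isdigit() or int(expiry) < now:
        return False
    expected = _sign(secret, expiry, ip, user_agent)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, sig.encode("utf-8"))
```

Because the IP and User-Agent are part of the signed payload, a token replayed from another address or browser fails the signature check.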

Configuration

{
  "difficulty": 8,
  "token_ttl_seconds": 300,
  "challenge_ttl_seconds": 120,
  "cookie_name": "oto_llm_pow",
  "cookie_domain": null,
  "cookie_path": "/",
  "cookie_same_site": "Lax",
  "bind_ip": true,
  "bind_ua": true,
  "secret": null,
  "allowed_ips": [],
  "blocked_ips": [],
  "allowed_user_agents": [],
  "blocked_user_agents": [],
  "allowed_headers": {},
  "blocked_headers": {}
}

Parameters

Parameter	Type	Default	Description
`difficulty`	number	`8`	Number of leading zero bits required in the hash. Higher values make the challenge harder (each +1 doubles the expected number of hash attempts, and so roughly doubles the computation time)
`token_ttl_seconds`	number	`300`	How long (in seconds) the PoW cookie remains valid after solving
`challenge_ttl_seconds`	number	`120`	How long (in seconds) a challenge remains valid before expiring
`cookie_name`	string	`oto_llm_pow`	Name of the cookie used to store the PoW token
`cookie_domain`	string	`null`	Domain for the cookie. If null, uses the current domain
`cookie_path`	string	`/`	Path for the cookie
`cookie_same_site`	string	`Lax`	SameSite attribute for the cookie (`Lax`, `Strict`, or `None`)
`bind_ip`	boolean	`true`	Bind the token to the client's IP address. Prevents a stolen cookie from being reused from another IP
`bind_ua`	boolean	`true`	Bind the token to the client's User-Agent. Prevents the cookie from being reused by a different browser or client
`secret`	string	`null`	Secret key for HMAC signing the cookie. If null, uses the Otoroshi secret

Difficulty tuning

The difficulty parameter controls the computational cost of the challenge:

Difficulty	Approximate time	Use case
4-6	< 100ms	Light protection, minimal user friction
8	~ 200ms - 1s	Good default, blocks most bots
12-16	1s - 10s	Strong protection, noticeable delay
20+	10s+	Very aggressive, may impact user experience

A difficulty of 8 (default) provides a good balance: fast enough for real browsers but expensive at scale for bots.
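The doubling behaviour follows from simple probability. If SHA-256 output is modelled as uniformly random, each attempt succeeds with probability 1/2^N, so the expected number of attempts is 2^N. The short Python sketch below computes these figures; the function names are made up for this example, and wall-clock time still depends on the hash rate of the visitor's device.

```python
def expected_attempts(difficulty):
    """Mean number of hashes before one has `difficulty` leading zero bits."""
    return 2 ** difficulty

def success_probability(difficulty, attempts):
    """Chance that at least one of `attempts` hashes meets the target."""
    p = 1.0 / (2 ** difficulty)
    return 1.0 - (1.0 - p) ** attempts

for d in (4, 8, 12, 16, 20):
    print(d, expected_attempts(d))
```

Because the number of attempts is geometrically distributed, individual solves vary widely around the mean: some visitors finish almost immediately, others need several times the expected count.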

Bypass rules

You can configure bypass rules to let certain traffic through without requiring PoW:

IP-based bypass

{
  "allowed_ips": ["192.168.1.0/24", "10\\.0\\..*"],
  "blocked_ips": ["203.0.113.50"]
}

allowed_ips: requests from these IPs skip the PoW challenge entirely. Supports CIDR notation and regex patterns.
blocked_ips: requests from these IPs are always blocked (even if they solve the PoW).
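One plausible way to evaluate a list that mixes CIDR blocks and regex patterns is sketched below in Python. Trying each entry as a CIDR block or IP literal first, falling back to a regex, and requiring the regex to match the whole address are all assumptions of this sketch; the plugin's exact matching rules may differ.

```python
import ipaddress
import re

def ip_matches(ip, patterns):
    """True if `ip` is inside any CIDR block or fully matches any regex."""
    addr = ipaddress.ip_address(ip)
    for pattern in patterns:
        try:
            # Plain addresses such as "203.0.113.50" parse as a /32 network.
            network = ipaddress.ip_network(pattern, strict=False)
        except ValueError:
            # Not a CIDR block or IP literal: treat the entry as a regex.
            if re.fullmatch(pattern, ip):
                return True
            continue
        if addr in network:
            return True
    return False
```

With the example configuration above, `ip_matches("192.168.1.42", ["192.168.1.0/24"])` is true through the CIDR branch, and `ip_matches("10.0.5.9", [r"10\.0\..*"])` is true through the regex branch.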

User-Agent-based bypass

{
  "allowed_user_agents": ["Googlebot.*", "Bingbot.*"],
  "blocked_user_agents": ["BadBot.*"]
}

allowed_user_agents: requests with matching User-Agent skip the PoW challenge. Supports regex patterns.
blocked_user_agents: requests with matching User-Agent are always blocked.

Header-based bypass

{
  "allowed_headers": {
    "X-Internal-Token": "my-secret-value"
  },
  "blocked_headers": {
    "X-Known-Bad": ".*"
  }
}

allowed_headers: requests with matching header key/value pairs skip the PoW challenge. Values support regex.
blocked_headers: requests with matching header key/value pairs are always blocked.
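Header rules can be pictured as a map from header name to a regex on its value. The Python sketch below is a guess at the semantics, not the plugin's code: it assumes that every configured header must be present, that the regex must match the whole value, and that an empty rule set matches nothing. Header names are compared case-insensitively, as HTTP requires.

```python
import re

def headers_match(request_headers, rules):
    """True if every rule header is present and its value matches the regex."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    for name, pattern in rules.items():
        value = lowered.get(name.lower())
        if value is None or not re.fullmatch(pattern, value):
            return False
    # Assumption: with no rules configured, nothing is considered a match.
    return bool(rules)
```

With the configuration above, a request carrying `X-Internal-Token: my-secret-value` matches `allowed_headers`, and any request carrying an `X-Known-Bad` header matches `blocked_headers` because `.*` accepts every value.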

A request is bypassed only when all three conditions (IP, User-Agent, and headers) match their respective allow rules.

Plugin setup

Add the plugin to your route configuration:

{
  "enabled": true,
  "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.ProofOfWorkPlugin",
  "config": {
    "difficulty": 8,
    "token_ttl_seconds": 300,
    "challenge_ttl_seconds": 120,
    "bind_ip": true,
    "bind_ua": true
  }
}

Full route example

{
  "id": "route_pow_protected",
  "name": "PoW Protected Site",
  "frontend": {
    "domains": ["my-site.example.com"],
    "strip_path": true,
    "exact": false
  },
  "backend": {
    "targets": [
      {
        "hostname": "my-backend.internal",
        "port": 8080,
        "tls": false
      }
    ]
  },
  "plugins": [
    {
      "enabled": true,
      "plugin": "cp:otoroshi.next.plugins.OverrideHost",
      "config": {}
    },
    {
      "enabled": true,
      "plugin": "cp:otoroshi_plugins.com.cloud.apim.otoroshi.extensions.aigateway.plugins.ProofOfWorkPlugin",
      "config": {
        "difficulty": 8,
        "token_ttl_seconds": 600,
        "challenge_ttl_seconds": 120,
        "bind_ip": true,
        "bind_ua": true,
        "allowed_ips": ["10.0.0.0/8"],
        "allowed_user_agents": ["Googlebot.*"]
      }
    }
  ]
}

Cluster support

The plugin fully supports Otoroshi cluster mode. Challenges are stored in the Otoroshi datastore and synchronized between leader and worker nodes. When running in worker mode, challenge operations (create, read, delete) are forwarded to the leader node to ensure consistency across the cluster.

How it differs from CAPTCHAs

Criterion	Proof of Work	CAPTCHA
User interaction	None - fully automatic	Requires manual solving
Accessibility	No accessibility barriers	Can be difficult for some users
Privacy	No third-party service involved	Often relies on external services (e.g. Google reCAPTCHA)
Effectiveness	Strong against headless bots	Increasingly solved by AI
User experience	Transparent, ~1-2 second wait	Interrupts user flow
