# RFC-0004 — `llms.txt` v0.1 Agent-Discovery File

- **Status:** DRAFT 2026-04-21.
- **Author:** 🧠 Agentic Architect
- **Sprint:** 1 (post-pivot, pre-Sprint-2 cutover).
- **Related:** RFC-0001 v1.2 (`mcp.json` and `CapabilityManifest`), RFC-0002 v2.2 (WSS envelope), RFC-0003 v1.2.1 (RBAC and tokens).
- **Audience:** Cloud Agents (LLMs) — primary readers of the served file. `agentic-architect` — owner. `cloudflare-native-edge` — implements the route. `cto-orchestrator` — approval.
- **Scope:** Defines the content contract, hosting decision, honesty discipline, governance, and update cadence for the `llms.txt` file served at the dev gateway root.

---

## 1. Motivation

Cloud Agents are the **exclusive** consumers of the aethermesh platform. The [llmstxt.org](https://llmstxt.org/) convention designates a plain-Markdown file at `/llms.txt` as the canonical entrypoint by which an agent ingests the entirety of a service's surface in one shot. We adopt it for two reasons:

1. It compresses the platform's RFC tree, schema registry URLs, and protocol invariants into a single token-budget-efficient document an agent can read on cold start.
2. It forces editorial honesty: every claim in `llms.txt` is read by autonomous agents that will act on it. Mis-statements become exploits.

This RFC governs what goes in the file, where it is served, who maintains it, and how it changes over time.

---

## 2. Hosting Decision

**Decision: route handler in the existing Hono app, served as `text/markdown; charset=utf-8`.**

Rationale:

- The gateway is a Cloudflare Worker, not a Pages site. Workers Sites is deprecated; Workers Static Assets adds a build-time asset binding to `wrangler.toml` and a separate manifest, which is disproportionate cost for a single ~200-line text file in v0.1.
- A Hono route at `GET /llms.txt` mirrors the existing pattern used for `/.well-known/mcp.json` and `/.well-known/jwks.json` (see `workers/mcp-gateway/src/routes/well_known.ts`), giving operational consistency.
- Source-of-truth for the body lives at `workers/mcp-gateway/public/llms.txt` and is imported into the Worker as a string at build time (e.g. `import LLMS_TXT from "../../public/llms.txt"` with a Wrangler text loader, or inlined via a small build step). The file is NOT served from KV in v0.1.
- Response headers MUST be: `Content-Type: text/markdown; charset=utf-8`, `Cache-Control: public, max-age=300`, `Cache-Tag: llms-txt`. The cache tag enables targeted purge on update.

The route handler itself is OUT OF SCOPE for this RFC and is the deliverable of `cloudflare-native-edge` as the next handoff.
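
Although the handler is out of scope here, the header contract above can be captured as data so that the eventual route and its test share one definition. The following is a minimal sketch, not the gateway's implementation: `LLMS_TXT_HEADERS` and `checkLlmsTxtResponse` are hypothetical names, and the check takes plain values rather than a framework response object so it stays independent of Hono.

```typescript
// Header contract from §2, expressed as data (names are illustrative only).
const LLMS_TXT_HEADERS: Record<string, string> = {
  "Content-Type": "text/markdown; charset=utf-8",
  "Cache-Control": "public, max-age=300",
  "Cache-Tag": "llms-txt",
};

// Hypothetical helper: returns a list of contract violations (empty means OK).
// Header names are compared case-insensitively, as HTTP header names are.
function checkLlmsTxtResponse(
  status: number,
  headers: Record<string, string>,
  body: string,
): string[] {
  const problems: string[] = [];
  const lower = new Map<string, string>();
  for (const [name, value] of Object.entries(headers)) {
    lower.set(name.toLowerCase(), value);
  }
  if (status !== 200) problems.push(`status ${status}, expected 200`);
  for (const [name, expected] of Object.entries(LLMS_TXT_HEADERS)) {
    const actual = lower.get(name.toLowerCase());
    if (actual !== expected) {
      problems.push(`${name}: got ${actual ?? "(missing)"}, expected ${expected}`);
    }
  }
  if (body.trim().length === 0) problems.push("empty body");
  return problems;
}
```

A route test could collect the status, headers, and body from the handler's response and assert that the returned list is empty.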

---

## 3. Content Sections (normative)

The served file MUST contain exactly the following H2 sections, in this order:

1. **Quickstart** — token acquisition outline, MCP discovery URL, WSS endpoint URL, expected first call.
2. **MCP Surface** — link to `mcp.json`, link to per-kind schemas, current and aspirational tool projections.
3. **Authentication** — token-class table, scope vocabulary, JWKS URL, algorithm note (HS256 placeholder → EdDSA Sprint 2), claim contract link.
4. **Wire Protocol** — links to RFC-0001/0002/0003, subprotocol name, envelope shape, `msg_id` and `node_id` formats, frame cap, idle/heartbeat timers.
5. **Device Lifecycle** — enroll → acquire-runtime-token → connect → announce → heartbeat → command → acknowledge, with shipped/aspirational markers.
6. **Rate Limits and Blast Radius** — `rate_limit_rps`, two-phase commit gate, `safety_class` provenance.
7. **Status and Versioning** — maturity, versioning policy, deprecation contract, update cadence.
8. **Operating Notes for Agents** — protocol requirements expressed declaratively (see §5).
9. **Out of Scope for v0.1** — explicit non-promises.

The file MUST begin with an H1 (`# aethermesh — Headless AIoT PaaS for Cloud Agents`) and a single blockquote summary of ≤ 3 sentences. The file is UTF-8 encoded and contains no YAML/TOML front-matter and no HTML.

Length budget: **150–250 lines of Markdown**. Long enough to enumerate the RFC tree; short enough to fit a single LLM context window with room for follow-up reasoning.
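
The structural rules in this section are mechanical enough to check in CI. The sketch below is one possible linter, assuming the H2 titles match the bolded names in the list above exactly; `checkStructure` is a hypothetical helper, it does not count sentences in the blockquote, and it does not skip headings that appear inside fenced code blocks.

```typescript
// The nine required H2 titles, in the order given in §3.
const REQUIRED_H2: readonly string[] = [
  "Quickstart",
  "MCP Surface",
  "Authentication",
  "Wire Protocol",
  "Device Lifecycle",
  "Rate Limits and Blast Radius",
  "Status and Versioning",
  "Operating Notes for Agents",
  "Out of Scope for v0.1",
];
const REQUIRED_H1 = "# aethermesh — Headless AIoT PaaS for Cloud Agents";

// Hypothetical linter: returns a list of violations (empty means OK).
function checkStructure(markdown: string): string[] {
  const problems: string[] = [];
  const lines = markdown.split("\n");
  // A trailing newline yields one empty final element; do not count it.
  if (lines.length > 0 && lines[lines.length - 1] === "") lines.pop();

  if (lines[0] !== REQUIRED_H1) problems.push("line 1 is not the required H1");

  // The first non-empty line after the H1 must open the blockquote summary.
  const firstAfterH1 = lines.slice(1).find((l) => l.trim() !== "");
  if (firstAfterH1 === undefined || !firstAfterH1.startsWith(">")) {
    problems.push("H1 is not followed by a blockquote summary");
  }

  // H2 titles must match the required list exactly, in order.
  const h2 = lines.filter((l) => l.startsWith("## ")).map((l) => l.slice(3).trim());
  const sameOrder =
    h2.length === REQUIRED_H2.length && h2.every((t, i) => t === REQUIRED_H2[i]);
  if (!sameOrder) problems.push("H2 sections differ from the required list or order");

  if (lines.length < 150 || lines.length > 250) {
    problems.push(`line count ${lines.length} is outside the 150-250 budget`);
  }
  return problems;
}
```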

---

## 4. Honesty Contract (normative)

Every section MUST mark each capability with exactly one tag from a closed set of three maturity tags:

| Tag              | Meaning                                                                 |
|------------------|-------------------------------------------------------------------------|
| `[shipped]`      | Implemented and reachable on `gateway-dev.aethermesh.app` today.        |
| `[aspirational]` | Specified in an approved RFC; not yet exposed at runtime.               |
| `[placeholder]`  | Implemented as a temporary stand-in; contract will change in a named future sprint. |

Honesty markers required in v0.1:

- Public self-service enrollment endpoint → `[aspirational]`.
- Externally callable MCP tool projection → `[aspirational]` (no user-facing tools shipped).
- Production hostname → `[aspirational]` (`gateway.aethermesh.app` reserved but not serving).
- `device-runtime` token signing algorithm → `[placeholder]` (HS256 today, EdDSA + JWKS in Sprint 2).
- JWKS document content → `[placeholder]` (endpoint shipped, returns `{"keys":[]}` until issuance is wired).
- `commit_token` / two-phase commit for `physical_actuation` → `[aspirational]`.
- RFC URLs under `/rfcs/*` → `[aspirational]` (RFCs in repo today, not served).

Mis-marking a capability is a P0 documentation defect. The file MUST NOT contain any unmarked aspirational claim.
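
Part of this contract can be linted automatically. The sketch below is a heuristic only: it flags bracketed lowercase tokens that fall outside the closed tag set and H2 sections that carry no maturity tag at all. It cannot tell whether a given tag is the correct one for a capability, which remains an editorial review task. `lintMaturityTags` and `TagReport` are hypothetical names.

```typescript
const MATURITY_TAGS = new Set(["shipped", "aspirational", "placeholder"]);

interface TagReport {
  unknownTags: string[];      // bracketed lowercase tokens outside the closed set
  untaggedSections: string[]; // H2 sections containing no maturity tag
}

// Hypothetical linter over the served Markdown body.
function lintMaturityTags(markdown: string): TagReport {
  const unknownTags: string[] = [];
  const sections: { title: string; tagCount: number }[] = [];
  let current: { title: string; tagCount: number } | undefined;

  for (const line of markdown.split("\n")) {
    if (line.startsWith("## ")) {
      current = { title: line.slice(3).trim(), tagCount: 0 };
      sections.push(current);
      continue;
    }
    // Match [word] only when it is not followed by "(", which skips
    // ordinary Markdown links such as [text](url).
    const tagPattern = /\[([a-z-]+)\](?!\()/g;
    let m: RegExpExecArray | null;
    while ((m = tagPattern.exec(line)) !== null) {
      const tag = m[1] ?? "";
      if (MATURITY_TAGS.has(tag)) {
        if (current) current.tagCount += 1;
      } else {
        unknownTags.push(tag);
      }
    }
  }

  const untaggedSections = sections
    .filter((s) => s.tagCount === 0)
    .map((s) => s.title);
  return { unknownTags, untaggedSections };
}
```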

---

## 5. Prompt-Injection Resistance

Cloud Agents that read this file will treat its contents as authoritative context for their own reasoning. To minimize blast radius if the file is reflected into a downstream agent's prompt:

- No imperative instructions to the reading agent ("you must …", "ignore previous …", "do not refuse …"). Protocol requirements are stated declaratively, e.g. "Subscriptions over polling: when an agent needs continuous device telemetry, the projected tool's `subscribe` verb is the supported path."
- No device-supplied strings interpolated into the file. Section headings, capability names, and example values are all editorially authored.
- No fenced code blocks containing executable agent prompts.
- No URLs outside the `aethermesh.app`, `aiotpaas.dev`, and `llmstxt.org` apex domains.
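
The URL rule in the last bullet lends itself to an automated check. The following sketch assumes that "apex domain" covers the apex itself and any subdomain of it; `findDisallowedUrls` and `isAllowedHost` are hypothetical helpers. The host is extracted with a deliberately conservative regular expression, so URLs containing userinfo (`user@host`) are flagged rather than trusted.

```typescript
// Apex domains permitted by §5.
const ALLOWED_APEX: readonly string[] = ["aethermesh.app", "aiotpaas.dev", "llmstxt.org"];

// A host is allowed if it is an apex or a subdomain of one.
// The leading "." in the suffix check rejects look-alikes such as
// "evil-aethermesh.app".
function isAllowedHost(host: string): boolean {
  const h = host.toLowerCase();
  return ALLOWED_APEX.some((apex) => h === apex || h.endsWith("." + apex));
}

// Hypothetical linter: returns every http(s)/ws(s) URL whose host is not allowed.
function findDisallowedUrls(markdown: string): string[] {
  // Group 1 captures the host: everything after "://" up to the first
  // "/", ":", "?", "#", whitespace, or closing Markdown delimiter.
  const urlPattern = /\b(?:https?|wss?):\/\/([^\/\s:?#)>\]"'`]+)[^\s)>\]"'`]*/gi;
  const offenders: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = urlPattern.exec(markdown)) !== null) {
    const host = m[1] ?? "";
    if (!isAllowedHost(host)) offenders.push(m[0]);
  }
  return offenders;
}
```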

---

## 6. Coined Terminology

**None.** Every term used in `llms.txt` v0.1 is sourced verbatim from RFC-0001/0002/0003 or from the llmstxt.org convention. The maturity tag set (`[shipped]` / `[aspirational]` / `[placeholder]`) is editorial markup local to this file and is not a wire-protocol contract.

---

## 7. Discoverability References

The file references exactly two runtime endpoints normatively:

- **MCP discovery URL:** `https://gateway-dev.aethermesh.app/.well-known/mcp.json` — gateway-level capability descriptor. RFC-0001 §1 makes per-device manifest discovery WSS-only via the `announce` frame (RFC-0002 §5.4); the `/.well-known/mcp.json` document is the *gateway*'s own descriptor (server name, base URL, capability namespaces, MCP version), not a per-device manifest. This distinction is preserved in `llms.txt`.
- **WSS endpoint:** `wss://gateway-dev.aethermesh.app/devices/connect` — RFC-0002 §2.1. Cloud Agents are NOT the consumers of this URL in v0.1; it is reachable by enrolled devices only.

No other hostnames, internal IPs, or non-public node IDs appear in the file.

---

## 8. Update Cadence and Ownership

- **Owner:** `agentic-architect` is the single owner of `workers/mcp-gateway/public/llms.txt`. PRs touching the file require `agentic-architect` review.
- **Cadence:** mandatory refresh **once per sprint**, plus on any of the following triggers:
  - A new RFC is approved or an existing referenced RFC version bumps.
  - A `[placeholder]` becomes `[shipped]` or an `[aspirational]` becomes `[shipped]`.
  - A new `kind`, `scope`, `verb`, `safety_class`, or `event_kind` enters the closed-enum set.
  - The production hostname is created.
- **Cache invalidation:** on update, `cloudflare-native-edge` purges by `Cache-Tag: llms-txt`.
- **Versioning:** the file's version line is the H1 plus the parenthetical sprint marker in the served file's seventh section ("Status and Versioning", see §3 of this RFC). No semver is exposed in the file body; pinning is by RFC version.
- **Audit:** changes are tracked in git; no separate changelog is maintained inside the file in v0.1 (kept short for token budget). Reintroduce a `## Changelog` section at v0.2 if the file grows past 250 lines.
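
The cache-invalidation step could be scripted against Cloudflare's zone-level `purge_cache` endpoint, which accepts a `tags` array in its JSON body. The sketch below only builds the request description and sends nothing. `buildPurgeByTagRequest` is a hypothetical helper, the zone ID and API token are caller-supplied values that must never appear in `llms.txt` itself (§9), and whether purge-by-tag is available on the account's plan is an assumption to verify.

```typescript
interface PurgeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes a purge-by-tag call for the given zone.
// zoneId and apiToken come from deployment secrets, not from this repo.
function buildPurgeByTagRequest(
  zoneId: string,
  apiToken: string,
  tag: string = "llms-txt",
): PurgeRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${encodeURIComponent(zoneId)}/purge_cache`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    // Purge every cached response that was served with this Cache-Tag value.
    body: JSON.stringify({ tags: [tag] }),
  };
}
```

A deploy script could pass the result to `fetch(req.url, req)` after the Worker is published, so agents do not see a stale body for the remainder of the 300-second `max-age`.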

---

## 9. Forbidden Content

- Secrets of any kind (HMAC values, signing keys, Cloudflare API tokens, `.dev.vars` contents).
- Real `node_id`, `tenant_id`, `jti`, or `kid` values from any device on the dev fleet.
- Internal hostnames, account IDs, zone IDs, KV namespace IDs, D1 database IDs, or DO class internals.
- Imperative instructions to the reading agent (see §5).
- Free-form natural-language descriptions sourced from device manifests (USB-descriptor-injection class).
- Any `kind`, `verb`, `scope`, or tool name not normatively defined by RFC-0001/0002/0003.

---

## 10. Open Questions

1. **`/rfcs/*` route:** the file links to `https://gateway-dev.aethermesh.app/rfcs/rfc-000{1,2,3}` as `[aspirational]`. Should the gateway serve the RFC Markdown directly (bundled from the repo at build time), or should the links point to a public git host? Defer to CTO; the placeholder URLs are harmless because they carry the maturity tag.
2. **`robots.txt` interaction:** llmstxt.org does not specify whether `llms.txt` should be allowed in `robots.txt`. Recommendation: `Allow: /llms.txt` explicitly, even if the rest of the host is `Disallow:`. Out of scope for this RFC.
3. **Multi-environment serving:** v0.1 ships dev only. When prod stands up, decide whether `gateway.aethermesh.app/llms.txt` carries the same content with `[shipped]` markers flipped, or whether prod serves a separate, narrower file.

---

## 11. Handoff

- **Status:** DRAFT pending CTO approval.
- **Next persona — `☁️ cloudflare-native-edge`:** add a Hono route `GET /llms.txt` to the `mcp-gateway` Worker that serves the contents of `workers/mcp-gateway/public/llms.txt` with the headers specified in §2. No KV read, no D1 read; pure inline-or-import string response. Add a route test under `workers/mcp-gateway/tests/` asserting 200, content-type, cache headers, and a non-empty body.
- **Next persona — `🛡️ devex-protocol-sec`:** confirm that `/llms.txt` does not require authentication (it does not — the file is a public contract document) and add `/llms.txt` to any future allowlist used by gateway-level access controls.
- **No edge-kubelet-engineer involvement:** the file does not touch device firmware or the transport crate.
