# aethermesh — Headless AIoT PaaS for Cloud Agents

> MCP-native control plane for fleets of physical edge devices. Cloud Agents enroll, observe, and command devices through a single gateway. Wire protocol is WebSocket Secure with subprotocol `aethermesh.v1`.

This document is the agent-discovery entrypoint per the llms.txt convention. The audience is Large Language Model agents acting as autonomous operators of the platform; the schema is the user interface. Every contract referenced below is normatively defined in a versioned RFC and published as a content-addressable JSON Schema.

Maturity marker legend used throughout this file:

- `[shipped]` — in production as of 2026-04-22 (dev: gateway-dev.aethermesh.app, prod: aethermesh.app).
- `[draft]` — specified in a DRAFT RFC pending CTO ratification; endpoint may be live but contract not yet ratified.
- `[aspirational]` — specified in an approved RFC but not yet exposed at runtime.
- `[placeholder]` — implemented as a temporary stand-in; contract will change in a named future sprint.

## Quickstart

- [MCP discovery document](https://gateway-dev.aethermesh.app/.well-known/mcp.json): gateway-level capability descriptor; lists `capability_namespaces`, `base_url`, and `mcp_version`. `[shipped]`
- [WSS data-plane endpoint](https://gateway-dev.aethermesh.app/devices/connect): the wss:// URL for device sessions; not directly callable by Cloud Agents in v0.1. `[shipped]`
- [Per-device runtime token endpoint](https://gateway-dev.aethermesh.app/v1/devices/%7Bnode_id%7D/runtime-token): mints a 1-hour `device-runtime` JWT against a leaf-cert-signed client assertion. Reachable by enrolled devices only. `[shipped]`
- [Agent runtime token endpoint](https://gateway-dev.aethermesh.app/v1/agents/runtime-token): mints a 1-hour `agent_runtime` JWT for Cloud Agents. `[aspirational]` — agent-side issuance flow is not exposed in v0.1; pre-provisioned tokens are issued out-of-band by the operator.
- First call after token acquisition: `tools/list` over MCP returns the projection of all reachable device capabilities. In v0.1 the projection set is empty for end-user agents (see "MCP Surface").
- [Install script](https://aethermesh.app/install.sh): one-line POSIX sh installer; verifies minisign binary signature before exec; enrolls device and starts WSS session. `[shipped]` — aarch64 musl in v0.1; invoke with `curl -sSL https://aethermesh.app/install.sh | sh -s -- --enroll-token=<TOKEN>`.
- [Binary releases](https://aethermesh.app/releases/:filename): signed aiot-edge binaries (e.g. `aiot-edge-aarch64-musl`, `aiot-edge-aarch64-musl.minisig`). `[shipped]`
- [Operator enroll-token mint](https://gateway-dev.aethermesh.app/v1/operator/enroll-token): operator-only endpoint to mint a 24 h single-use enroll-token for device provisioning. Requires `device:admin` scope. `[shipped]`
- [Device enrollment endpoint](https://gateway-dev.aethermesh.app/v1/devices/enroll): device-facing enrollment API per RFC-0007; accepts enroll-token + Ed25519 public key; returns `(node_id, leaf_cert_pem, device-runtime_token)`. `[shipped]`
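
A minimal sketch of reading the discovery document, assuming it is a JSON object carrying the three fields named above. The helper names `parse_discovery` and `fetch_discovery` are hypothetical, and the value types of the fields are not specified in this file, so only their presence is checked.

```python
import json
from urllib.request import urlopen

DISCOVERY_URL = "https://gateway-dev.aethermesh.app/.well-known/mcp.json"
# Field names come from this file; their value types are not specified here.
REQUIRED_FIELDS = ("capability_namespaces", "base_url", "mcp_version")


def parse_discovery(raw: str) -> dict:
    """Parse a discovery document and check that the documented fields exist."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("discovery document must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"discovery document is missing fields: {missing}")
    return doc


def fetch_discovery(url: str = DISCOVERY_URL, timeout: float = 10.0) -> dict:
    """Fetch the descriptor over HTTPS and validate it (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_discovery(resp.read().decode("utf-8"))
```

An agent would call `fetch_discovery()` once at startup and treat a `ValueError` as a signal that the descriptor no longer matches the contract it was pinned against.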

## MCP Surface

- [mcp.json discovery](https://gateway-dev.aethermesh.app/.well-known/mcp.json): canonical machine-readable descriptor of the gateway. `[shipped]`
- [Capability manifest schema](https://gateway-dev.aethermesh.app/schemas/manifest@1.1.0): per-device `CapabilityManifest` published over the WS `announce` frame; defines `kind`, `verbs`, `safety_class`, and `schema_ref`. `[shipped]` (registry served; schemas read-only and immutable per `(kind, semver)`).
- [system.metrics schema](https://gateway-dev.aethermesh.app/schemas/system.metrics@1.0.0): the only `kind` enabled in Sprint 1; `safety_class = read_only`; verbs `snapshot` and `subscribe`. `[shipped]` on the device side; not yet projected to externally callable MCP tools `[aspirational]`.
- [RFC-0005 — `system.echo`](https://gateway-dev.aethermesh.app/rfcs/rfc-0005): first ratified user-facing tool spec (RATIFIED 2026-04-21); projected name `sysecho.<node_id>.echo.invoke`; `safety_class = read_only`; `verb = invoke`; depends on RFC-0001 v1.3 (G1/G2/G3). `[shipped]` as a contract.
- Tools currently callable by Cloud Agents over MCP: **none**. `[aspirational]` — the MCP tool projection layer (RFC-0001 §3) is implemented at the schema level but not exposed as an MCP server endpoint to external agents in v0.1. The internal `system.*` namespace exists only as in-band telemetry over the WS data-plane.
- Tool naming, when projected, follows `{kind_short}.{node_id}.{cap_id}.{verb}` (≤ 64 chars, snake/dot only); see RFC-0001 §3 and §3.1 for the budget arithmetic. Agents must not assume any current tool names.
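
The naming rule can be sketched as a small builder. This is an illustration only: the function name `project_tool_name` is hypothetical, and the character class used for "snake/dot only" (lowercase letters, digits, underscore, dot) is an assumption; RFC-0001 §3 and §3.1 remain the normative source.

```python
import re

MAX_TOOL_NAME_LEN = 64
# node_id is a 26-char lowercase Crockford ULID (see "Wire Protocol").
NODE_ID_RE = re.compile(r"[0-9a-hjkmnp-tv-z]{26}")
# Assumption: "snake/dot only" means lowercase letters, digits, "_" and ".".
SEGMENT_RE = re.compile(r"[a-z0-9_]+")


def project_tool_name(kind_short: str, node_id: str, cap_id: str, verb: str) -> str:
    """Build {kind_short}.{node_id}.{cap_id}.{verb} and enforce the 64-char budget."""
    if not NODE_ID_RE.fullmatch(node_id):
        raise ValueError("node_id must be a 26-char lowercase Crockford ULID")
    for label, segment in (("kind_short", kind_short), ("cap_id", cap_id), ("verb", verb)):
        if not SEGMENT_RE.fullmatch(segment):
            raise ValueError(f"{label} must be non-empty snake_case without dots")
    name = ".".join((kind_short, node_id, cap_id, verb))
    if len(name) > MAX_TOOL_NAME_LEN:
        # node_id (26) plus three dots uses 29 chars; 35 remain for the other segments.
        raise ValueError(f"tool name is {len(name)} chars, limit is {MAX_TOOL_NAME_LEN}")
    return name
```

With the RFC-0005 segments (`sysecho`, `echo`, `invoke`) the result is 46 characters long, well inside the budget.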

## Authentication

- Token vocabulary (closed enum): `provisioning`, `device-runtime`, `agent_runtime`, `admin`. Cloud Agents hold `agent_runtime` only.
- Scope vocabulary (closed enum, normative source RFC-0003 §4): `tools:list`, `tools:call:read_only`, `tools:call:reversible`, `tools:call:physical_actuation`, `device:enroll`, `device:connect`, `device:admin`, `audit:read`. Implication chain: `physical_actuation ⇒ reversible ⇒ read_only ⇒ list`. No wildcards.
- Token lifetime: `agent_runtime` and `device-runtime` are capped at 1 hour (`exp − iat ≤ 3600`); `provisioning` at 24 hours and single-use; `admin` at 1 hour with MFA.
- Issuer (`iss`): `https://auth.aiotpaas.dev/`. Audience (`aud`) is `broker.aiotpaas.dev` for tool invocation and WS sessions; `control.aiotpaas.dev` for the enroll and audit APIs.
- [JWKS endpoint](https://gateway-dev.aethermesh.app/.well-known/jwks.json): public, cacheable, served by the gateway. `[shipped]` — returns live Ed25519 public key `{kty:"OKP", crv:"Ed25519", x, kid, use:"sig", alg:"EdDSA"}`.
- **Algorithm note (read carefully).** RFC-0003 v1.3 §V13.1 specifies **EdDSA (Ed25519)** as the normative JWT signing algorithm. As of 2026-04-22 the gateway is in the **dual-accept cutover window** (RFC-0003 §V13.4 Phase 2): both HS256 and EdDSA tokens are accepted. **Cloud Agents holding HS256 tokens MUST migrate before 2026-04-29; thereafter only EdDSA tokens are accepted.** New tokens issued by the gateway are signed with EdDSA; `kid` is published at `/.well-known/jwks.json`. `[shipped]`
- [Token claim schema](https://gateway-dev.aethermesh.app/schemas/auth.jwt.claims@1.0.0): claims contract (`iss`, `sub`, `aud`, `iat`, `exp`, `jti`, `scope`, `tenant_id`, `key_thumbprint`, `token_class`, optional `mfa`). `[shipped]` as a contract; `[aspirational]` as an enforcement target until EdDSA cutover.
- Token presentation rules: HTTP control-plane requests use `Authorization: Bearer <jwt>`; WS sessions present the token only in the first frame's `token` field per RFC-0002 §2.3 — never in URL, query string, or any other header.
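
The implication chain and the lifetime caps above can be checked locally before a token is used. The sketch below assumes the shorthand chain maps onto the four `tools:*` scopes and that the `token_class` claim carries one of the four token vocabulary values. The function names are hypothetical, and signature verification against the JWKS is out of scope here.

```python
from __future__ import annotations

# Weakest to strongest. Assumption: the shorthand chain
# physical_actuation -> reversible -> read_only -> list refers to these scopes.
TOOL_CHAIN = (
    "tools:list",
    "tools:call:read_only",
    "tools:call:reversible",
    "tools:call:physical_actuation",
)
OTHER_SCOPES = frozenset({"device:enroll", "device:connect", "device:admin", "audit:read"})
ALL_SCOPES = frozenset(TOOL_CHAIN) | OTHER_SCOPES

# Lifetime caps in seconds, per token class.
MAX_LIFETIME_S = {
    "agent_runtime": 3600,
    "device-runtime": 3600,
    "provisioning": 24 * 3600,
    "admin": 3600,
}


def effective_scopes(granted: set[str]) -> set[str]:
    """Expand granted scopes with every scope they imply; reject unknown values."""
    unknown = set(granted) - ALL_SCOPES
    if unknown:
        raise ValueError(f"scopes outside the closed enum: {sorted(unknown)}")
    result = set(granted)
    for rank, scope in enumerate(TOOL_CHAIN):
        if scope in granted:
            result.update(TOOL_CHAIN[:rank])  # everything weaker is implied
    return result


def lifetime_within_cap(claims: dict) -> bool:
    """Check exp - iat against the cap for the token's class."""
    lifetime = claims["exp"] - claims["iat"]
    return 0 < lifetime <= MAX_LIFETIME_S[claims["token_class"]]
```

Because the enum has no wildcards, a value such as `tools:*` raises instead of being expanded.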

## Wire Protocol

- [RFC-0001 — `mcp.json` v1.3](https://gateway-dev.aethermesh.app/rfcs/rfc-0001): per-device `CapabilityManifest`, projection rule, error envelope; v1.3 opens the `kind` registry to admit `system.echo`. `[shipped]`
- [RFC-0002 — WSS Envelope v2.2](https://gateway-dev.aethermesh.app/rfcs/rfc-0002): WS upgrade, subprotocol routing hint, first-frame auth, per-type envelopes, resume protocol. `[shipped]`
- [RFC-0003 — RBAC and Token Format v1.4](https://gateway-dev.aethermesh.app/rfcs/rfc-0003): token classes, scope vocabulary, scope-to-`safety_class` matrix; v1.3 ratified 2026-04-22 (EdDSA + JWKS upgrade, dual-accept cutover); v1.4 ratified 2026-04-23 (§V13.10 audit IP-hashing). `[shipped]`
- [RFC-0006 — safety_class Enforcement v1.0](https://aethermesh.app/rfcs/rfc-0006): dispatch-time `(safety_class × verb) → required_scope` table and phased rollout; Phase A sig-verify LIVE. `[draft]` — pending CTO ratification.
- [RFC-0007 — Public Device Enrollment v1.1](https://aethermesh.app/rfcs/rfc-0007): wire contract for `POST /v1/devices/enroll`; v1.1 adds `enroll_pubkey_pem` persistence and `E_INVALID_CLIENT_ASSERTION` (RATIFIED 2026-04-23). `[shipped]`
- [RFC-0009 — system.metrics.subscribe SSE v1.0](https://aethermesh.app/rfcs/rfc-0009): frame schema, cadence, heartbeat, backpressure for streaming metrics SSE. `[draft]` — self-approvable per architect.
- [RFC-0011 — Security Threat Model v1.0](https://aethermesh.app/rfcs/rfc-0011): operational threat model — trust boundaries, attack trees (§5.1 impersonation, §5.2 cost-amp, §5.3 supply-chain), mitigation map, residual-risk register (RATIFIED 2026-04-23). `[shipped]`
- [RFC-0012 — Tenant Model & Isolation v1.0](https://aethermesh.app/rfcs/rfc-0012): DID-rooted tenant primitive (`did:pkh` / `did:key`); UUIDv5 `tenant_id` (namespace `56bdc01c-052e-5f60-abfb-7fa367b284e3`); strict data isolation, best-effort compute/fault; v2 RBAC scope grammar `t:<tid>:<resource>:<verb>` (RATIFIED 2026-05-03). `[draft]` — ratified spec; implementation lands W2–W4 of Sprint-3-Extended.
- [RFC-0013 — Post-Quantum Cryptography (Hybrid) v1.0](https://aethermesh.app/rfcs/rfc-0013): hybrid Ed25519+ML-DSA-65 signatures; X25519+ML-KEM-768 KEM; SLH-DSA-128s release signing; pure-TS `@noble/post-quantum` on Worker (WASM swap deferred to S4); 6 MB single-binary edge with PQC default-on (RATIFIED 2026-05-03). `[draft]` — ratified spec; implementation lands W1–W4 of Sprint-3-Extended.
- [RFC-0014 — Protocol v2 Wire Format v1.0](https://aethermesh.app/rfcs/rfc-0014): v2 WSS subprotocol `aethermesh.v2`; envelope `tid` field; locked JWS `alg=Ed25519+ML-DSA-65`; new error codes `E_TENANT_DENIED`/`E_ALG_NOT_SUPPORTED`/`E_DID_INVALID`/`E_TENANT_INTERNAL`; tenant-lifecycle endpoints (`POST /v1/tenants/init`, `GET/DELETE /v1/tenants/me`); v1 `410 Gone` at 2026-05-31 (RATIFIED 2026-05-03). `[draft]` — ratified spec; implementation lands W2–W4 of Sprint-3-Extended.
- WS subprotocol: `aethermesh.v1` (case-sensitive). At upgrade, devices send `Sec-WebSocket-Protocol: aethermesh.v1, node-<ULID26>`; the second token is a routing hint only and confers no authority.
- Envelope shape: `{ type, msg_id, in_reply_to?, token?, last_acked_msg_id?, missing_cmds?, payload? }` with `additionalProperties: false`. Body fields for `cmd_ack` and `error` are nested under `payload` (RFC-0002 §4.1, normative).
- `type` enum (closed): `auth`, `heartbeat`, `announce`, `telemetry`, `event`, `cmd_ack`, `resume`, `auth_ack`, `ack`, `cmd`, `resume_ack`, `error`. Unknown `type` → `error` (`E_PROTOCOL_UNKNOWN_FRAME`) and close 4400.
- `msg_id` format: 26-char Crockford ULID, regex `^[0-9A-HJKMNP-TV-Z]{26}$` (uppercase). Per-direction namespace; client and server msg_id spaces never overlap.
- `node_id` format: 26-char Crockford ULID, regex `^[0-9a-hjkmnp-tv-z]{26}$` (lowercase). The DO routing key is derived from the validated JWT `sub`, not from any client-asserted field.
- Frame size cap: 64 KiB at gateway and DO ingress; oversize → close 4413.
- Idle and heartbeat: 90 s idle close (4408); device-side heartbeat cadence is 30 s.
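
The envelope rules above can be sketched as a pre-send validator. Assumptions are labeled in the comments: `FrameError` and `validate_frame` are hypothetical names, `type` and `msg_id` are treated as required because the envelope shape lists them without `?`, and a close code is attached only where this file names one (4413 and 4400).

```python
import json
import re

MAX_FRAME_BYTES = 64 * 1024
ENVELOPE_KEYS = frozenset({
    "type", "msg_id", "in_reply_to", "token",
    "last_acked_msg_id", "missing_cmds", "payload",
})
FRAME_TYPES = frozenset({
    "auth", "heartbeat", "announce", "telemetry", "event", "cmd_ack",
    "resume", "auth_ack", "ack", "cmd", "resume_ack", "error",
})
MSG_ID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")  # uppercase Crockford ULID


class FrameError(Exception):
    """Hypothetical error type; close_code is None where this file names no code."""

    def __init__(self, reason: str, close_code=None):
        super().__init__(reason)
        self.reason = reason
        self.close_code = close_code


def validate_frame(raw: bytes) -> dict:
    """Apply the size cap, key whitelist, type enum, and msg_id format checks."""
    if len(raw) > MAX_FRAME_BYTES:
        raise FrameError("frame exceeds 64 KiB", close_code=4413)
    try:
        env = json.loads(raw)
    except ValueError as exc:
        raise FrameError("frame is not valid JSON") from exc
    if not isinstance(env, dict):
        raise FrameError("envelope must be a JSON object")
    extra = set(env) - ENVELOPE_KEYS
    if extra:
        raise FrameError(f"additional properties not allowed: {sorted(extra)}")
    frame_type = env.get("type")
    if not isinstance(frame_type, str):
        raise FrameError("type is missing or not a string")
    if frame_type not in FRAME_TYPES:
        raise FrameError("E_PROTOCOL_UNKNOWN_FRAME", close_code=4400)
    msg_id = env.get("msg_id")
    if not isinstance(msg_id, str) or not MSG_ID_RE.fullmatch(msg_id):
        raise FrameError("msg_id must be an uppercase 26-char Crockford ULID")
    return env
```

Running the same checks client-side before sending avoids a round trip that would end in a closed socket.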

## Device Lifecycle

- **Enroll** `[shipped]`. `POST /v1/devices/enroll` (RFC-0007) — the device presents an enroll-token plus its Ed25519 public key; the gateway returns `(node_id, leaf_cert_pem, device-runtime_token)`. Operator pre-mints enroll-tokens at `POST /v1/operator/enroll-token`. Cloud Agents do not participate in this step.
- **Acquire device-runtime token** `[shipped]`. The device signs a ≤ 60 s client-assertion JWT with its leaf private key and presents it as `Authorization: Bearer` to `POST /v1/devices/{node_id}/runtime-token`. The gateway returns a 1-hour `device-runtime` JWT.
- **Connect** `[shipped]`. WSS upgrade to `wss://gateway-dev.aethermesh.app/devices/connect` with subprotocol `aethermesh.v1, node-<ULID26>`; first frame is `auth` carrying the device-runtime token; server replies `auth_ack` within 5 s.
- **Announce** `[shipped]`. Device sends one `announce` frame whose `payload` is the signed `CapabilityManifest`. The DO verifies the Ed25519 manifest signature against the JWT-bound `node_id` before persisting; mismatch → close 4401.
- **Heartbeat** `[shipped]`. Device emits `heartbeat` at 30 s cadence; refreshes the manifest `expires_at_ms` (24 h cap) and the idle timer (90 s).
- **Command** `[aspirational]`. Server-originated `cmd` frame addressed by projected tool name; `payload.arguments` validated against the tool's `schema_ref` before delivery. No `cmd` types are projected for external agent invocation in v0.1.
- **Acknowledge** `[shipped]`. Device returns `cmd_ack` with `payload.ok` (boolean); `payload.error` required when `ok=false` and forbidden when `ok=true`. Used today only by internal/system flows.
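
Two of the lifecycle rules lend themselves to a short sketch: the first frame after upgrade carries the token in its `token` field, and a `cmd_ack` payload must carry `error` exactly when `ok` is false. The helper names are hypothetical, and the inner shape of `payload.error` is not checked because it is defined elsewhere (RFC-0001 §4).

```python
def build_auth_frame(msg_id: str, device_runtime_token: str) -> dict:
    """First frame after the WSS upgrade: the token travels only in `token`."""
    return {"type": "auth", "msg_id": msg_id, "token": device_runtime_token}


def check_cmd_ack(frame: dict) -> bool:
    """Validate the ok/error pairing of a cmd_ack frame and return payload.ok."""
    if frame.get("type") != "cmd_ack":
        raise ValueError("not a cmd_ack frame")
    payload = frame.get("payload")
    if not isinstance(payload, dict):
        raise ValueError("cmd_ack body must be nested under payload")
    ok = payload.get("ok")
    if not isinstance(ok, bool):
        raise ValueError("payload.ok must be a boolean")
    if ok and "error" in payload:
        raise ValueError("payload.error is forbidden when ok=true")
    if not ok and "error" not in payload:
        raise ValueError("payload.error is required when ok=false")
    return ok
```

A frame with `ok` or `error` at the top level fails the `payload` check, which matches the body placement rule in RFC-0002 §4.1.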

## Rate Limits and Blast Radius

- All capabilities declare `rate_limit_rps` (exclusive minimum 0, max 1000) and optional `max_concurrency` and `deadline_ms_default` in the manifest. Agents MUST NOT exceed these.
- Two-phase commit gate `[aspirational]`: any tool with `safety_class = physical_actuation` requires a `commit_token` on the `cmd` frame; the gate is specified in RFC-0001 §3 and reserved in the envelope schema (RFC-0002 §5.7) but not exercised in Sprint 1 because no `physical_actuation` capabilities are projected.
- `safety_class` is read from the projected tool's `x-safety-class` annotation server-side; it is NEVER trusted from request bodies.
- A future RFC will document the `physical_actuation` safety doctrine (per-tenant fanout budget, dwell timers, abort semantics). Until then, agents holding `tools:call:physical_actuation` should expect that grant to be rare, audited, and conditional.
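
One way for an agent to stay within a declared `rate_limit_rps` is to pace its own calls. The sketch below is a simple fixed-interval pacer, not the gateway's enforcement algorithm, which this file does not describe. The class name `Pacer` is hypothetical, and the injectable `clock` and `sleep` parameters exist only to make the behavior testable.

```python
import time


class Pacer:
    """Client-side pacing: at most one call per 1/rate_limit_rps seconds."""

    def __init__(self, rate_limit_rps: float, clock=time.monotonic, sleep=time.sleep):
        # The manifest declares rate_limit_rps with exclusive minimum 0 and max 1000.
        if not 0 < rate_limit_rps <= 1000:
            raise ValueError("rate_limit_rps must be in (0, 1000]")
        self._interval = 1.0 / rate_limit_rps
        self._clock = clock
        self._sleep = sleep
        self._next_at = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_at is not None and now < self._next_at:
            delay = self._next_at - now
            self._sleep(delay)
            now = self._next_at
        self._next_at = now + self._interval
        return delay
```

Calling `pacer.wait()` before each tool invocation keeps the request rate at or below the declared budget; `max_concurrency` would need a separate semaphore.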

## Status and Versioning

- API maturity: **dev + prod dual-env deployment.** `gateway-dev.aethermesh.app` (dev, isolated keys/KV/D1) and `aethermesh.app` / `www.aethermesh.app` (prod) are both live. `[shipped]`
- Release line: this `llms.txt` is **v0.1**. Content may change without prior deprecation while the `[shipped]` set is still small. RFCs that are referenced (rfc-0001 v1.3, rfc-0002 v2.2, rfc-0003 v1.4 RATIFIED 2026-04-23, rfc-0005 RATIFIED 2026-04-21, rfc-0006 v1.0 DRAFT, rfc-0007 v1.1 RATIFIED 2026-04-23, rfc-0009 v1.0 DRAFT, rfc-0011 v1.0 RATIFIED 2026-04-23, rfc-0012 v1.0 RATIFIED 2026-05-03, rfc-0013 v1.0 RATIFIED 2026-05-03, rfc-0014 v1.0 RATIFIED 2026-05-03) carry their own versioning and changelogs; agents pin to the RFC version, not to this file.
- Schema versioning: every JSON Schema is content-addressable as `mcp://schemas/{kind}@{semver}` and immutable per `(kind, semver)`. Resolution: `GET https://gateway-dev.aethermesh.app/schemas/{kind}@{semver}`.
- Deprecation contract: once an RFC reaches v1.0+ status, a removal or breaking change requires a **one-RFC-version overlap window** plus an explicit `Deprecation` and `Sunset` HTTP header on the affected endpoint. Pre-1.0 RFCs (none in force today) carry no such guarantee.
- Update cadence for this file: refreshed at minimum once per sprint by the schema owner (`agentic-architect`); see RFC-0004 for governance.
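
The schema resolution rule is a direct string mapping, sketched below. The function name `resolve_schema_url` is hypothetical, and the accepted character set for `kind` and the three-part numeric `semver` pattern are assumptions made for the example.

```python
import re

SCHEMA_BASE = "https://gateway-dev.aethermesh.app/schemas/"
# Assumption: kind uses letters, digits, ".", "_" and "-"; semver is MAJOR.MINOR.PATCH.
SCHEMA_URI_RE = re.compile(
    r"mcp://schemas/(?P<kind>[A-Za-z0-9_.\-]+)@(?P<semver>\d+\.\d+\.\d+)"
)


def resolve_schema_url(uri: str) -> str:
    """Map mcp://schemas/{kind}@{semver} to the HTTPS URL used for GET."""
    match = SCHEMA_URI_RE.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a schema URI: {uri!r}")
    return f"{SCHEMA_BASE}{match['kind']}@{match['semver']}"
```

Since schemas are immutable per `(kind, semver)`, the resolved URL is safe to use as a permanent cache key.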

## Operating Notes for Agents

These are protocol requirements, not directives.

- Subscriptions over polling: when an agent needs continuous device telemetry, the projected tool's `subscribe` verb is the supported path; polling with repeated `snapshot` calls counts against the capability's `rate_limit_rps` budget and is rejected with `E_RATE_LIMITED` once that budget is exceeded.
- Token freshness: `device-runtime` and `agent_runtime` JWTs cannot be used past `exp`. Mid-session token rotation is not supported in Sprint 1; reconnecting with a fresh token is the only refresh path.
- Closed enums: any field documented as a closed enum (e.g. `type`, `kind`, `verb`, `safety_class`, `event_kind`, `token_class`, `scope`) will be rejected by the gateway if a value outside the enum is presented. Agents should not invent values.
- Body placement: do not place `code`, `message`, `result`, `ok`, or `error` at the top level of any envelope; these belong inside `payload` (RFC-0002 §4.1).
- Safety annotations: once tool projection is exposed, agents that issue a `cmd` whose target tool's `x-safety-class` is `physical_actuation` without first acquiring the matching scope and `commit_token` will receive `E_SAFETY_DENIED`.
- Error envelope (RFC-0001 §4): `{ code, message, suggested_fix, retry_after_ms?, correlation_id? }`. `code` is from a closed enum; `message` and `suggested_fix` are ASCII-only and rendered server-side from fixed templates.
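
A sketch of consuming the error envelope, assuming the body has already been taken from the frame's `payload`. `GatewayError`, `parse_error`, and `retry_delay_s` are hypothetical names, and the fallback delay is an arbitrary illustration rather than a documented default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayError:
    code: str
    message: str
    suggested_fix: str
    retry_after_ms: Optional[int] = None
    correlation_id: Optional[str] = None


def parse_error(payload: dict) -> GatewayError:
    """Build a GatewayError from an error body, checking the documented fields."""
    for name in ("code", "message", "suggested_fix"):
        if not isinstance(payload.get(name), str):
            raise ValueError(f"{name} is required and must be a string")
    for name in ("message", "suggested_fix"):
        if not payload[name].isascii():
            raise ValueError(f"{name} must be ASCII-only")
    retry = payload.get("retry_after_ms")
    if retry is not None and (isinstance(retry, bool) or not isinstance(retry, int) or retry < 0):
        raise ValueError("retry_after_ms must be a non-negative integer")
    return GatewayError(
        code=payload["code"],
        message=payload["message"],
        suggested_fix=payload["suggested_fix"],
        retry_after_ms=retry,
        correlation_id=payload.get("correlation_id"),
    )


def retry_delay_s(err: GatewayError, fallback_s: float = 1.0) -> float:
    """Seconds to wait before retrying; fallback_s is an arbitrary example value."""
    if err.retry_after_ms is None:
        return fallback_s
    return err.retry_after_ms / 1000.0
```

Branching on `code` rather than on `message` keeps agent logic stable, since `code` is the closed enum and the text is rendered from templates.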

## Verification & Trust

- **Minisign release pubkey (trust anchor):** `RWS7TatrmwpCgr+chZpBn7gLyBwoYvQqG7rodsXrOjiehSGJcBFnRtV4`
- **v0.1.1 aiot-edge-aarch64-musl SHA256:** `881a8208ec2fe61e1126cc8ebb69351c482dff1d8c3122c50e4b3130679a2e03`
- **Verify command:** `minisign -Vm aiot-edge-aarch64-musl -P RWS7TatrmwpCgr+chZpBn7gLyBwoYvQqG7rodsXrOjiehSGJcBFnRtV4`
- **Threat model:** [RFC-0011 v1.0](https://aethermesh.app/rfcs/rfc-0011) — trust boundaries, attack trees (impersonation, cost-amp, supply-chain), mitigation map. `[shipped]`
- Anchor distribution: homepage HTML + llms.txt (channels (a)); GitHub Releases (channel (c)) per RFC-0011 §2.1 ratified pick. DNS TXT `_minisign.aethermesh.app` is Sprint-3 SHOULD.
- Ratified RFC bundle (2026-04-23): rfc-0011 v1.0 + rfc-0007 v1.1 + rfc-0003 §V13.10 — 6 ratified RFCs total, 3 DRAFT.
- Ratified v2 protocol-reset bundle (2026-05-03): rfc-0012 v1.0 + rfc-0013 v1.0 + rfc-0014 v1.0 — 9 ratified RFCs total. Implementation window 2026-05-04 → 2026-05-30; v1 endpoints respond `410 Gone` at 2026-05-31.
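
The published SHA256 can be checked with the Python standard library, as sketched below. This complements the minisign signature check and does not replace it. The function names are hypothetical, and the pinned digest applies only to the v0.1.1 `aiot-edge-aarch64-musl` binary listed above.

```python
import hashlib
import hmac

# Digest published above for the v0.1.1 aiot-edge-aarch64-musl binary.
PINNED_SHA256 = "881a8208ec2fe61e1126cc8ebb69351c482dff1d8c3122c50e4b3130679a2e03"


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-megabyte binary is not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path: str, expected_hex: str = PINNED_SHA256) -> bool:
    """Compare the file's digest with the expected hex digest."""
    return hmac.compare_digest(sha256_file(path), expected_hex.strip().lower())
```

A caller would run `matches_pinned("aiot-edge-aarch64-musl")` after download and refuse to execute the binary on `False`.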

## Out of Scope for v0.1

- Externally callable MCP tool projection.
- Two-phase commit for `physical_actuation` (deferred to RFC-0002 v2.3 per CEO Q4).
- Multi-tenant operator role for `admin` tokens.
