# RFC-0015 — Gateway Issuer DID, Token TTL Caps, and Hybrid JWS Signature Wire Format

- **Status:** **Ratified v1.1 — 2026-05-05 (security erratum).** CTO sign-off recorded; normative for W3 implementation. v1.1 folds 3 CRITICAL amendments from devex-protocol-sec review (see Changelog).
- **Date:** 2026-05-05
- **Author:** 🧠 Agentic Architect
- **Sprint:** Sprint 3 Extended — W2-3 follow-up (specification-gap closure prior to W3 implementation).
- **Audience:** Cloud Agents (LLMs); 🛡️ devex-protocol-sec; 🦀 edge-kubelet-engineer; ☁️ cloudflare-native-edge; CTO chair.
- **Depends on:** RFC-0012 (Tenant Model), RFC-0013 (PQC Hybrid), RFC-0014 (Protocol v2 Wire). Does **not** modify any ratified RFC body; this RFC is additive.
- **Out of scope:** TLS-layer identity (CF-managed), device leaf DIDs (RFC-0007 governs), MCP tool authorization claims (RFC-0003/0006).

---

## §1. Motivation — Three Spec Gaps Surfaced in W2-3

W2-3 implementation work surfaced three under-specified seams in the v2 protocol stack. Each is currently filled by an unratified placeholder in code; W3 cannot proceed without lock-down.

1. **Gap-A — Gateway issuer DID format.** All gateway-minted JWS tokens currently carry `iss=https://api.aethermesh.dev` (bare URL placeholder). The v2 protocol stack has standardized on DID-based identity for tenants and devices (RFC-0012 §3, RFC-0007), but the **gateway itself has no canonical DID**. The `iss` claim is therefore neither cryptographically resolvable nor self-describing.
2. **Gap-B — Token TTL caps.** During W2-3, a 24h (86 400 s) TTL was adopted on architect authority for the tenant-init token (see openapi/v1-tenants-init.yaml `x-design-notes` q1, q5). RFC-0012 §5.2 describes the token's *purpose* but never specifies a TTL ceiling, nor caps for the downstream enroll-token and runtime-token tiers. The 24h figure is currently un-ratified custom.
3. **Gap-C — Hybrid JWS signature wire byte layout.** RFC-0013 §4 specifies the **construction** of `hybrid_sig = ed25519_sig (64B) || mldsa65_sig (3309B)` and the algorithm identifier `Ed25519+ML-DSA-65`, but does **not** specify how this concatenation is encoded inside a JWS signature segment per RFC 7515, nor the public-key bundle shape for `jwk` headers / DID-document `verificationMethod` entries. W2-3 stubs ML-DSA verification with a placeholder; W3 must lock the wire format before unstubbing.

---

## §2. Gateway Issuer DID (closes Gap-A)

### §2.1 DID Method — `did:web` (NORMATIVE recommendation)

The gateway SHALL identify itself with a `did:web` DID rooted at its public hostname:

```
did:web:api.aethermesh.dev
```

Resolution follows the did:web method specification (§8):

```
GET https://api.aethermesh.dev/.well-known/did.json
```

The document is served by the mcp-gateway Worker over TLS terminated at Cloudflare.
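The mapping from DID to resolution URL can be sketched as follows. This is a minimal illustration of the did:web method rules (host-only DIDs resolve under `/.well-known/`, colon-separated path segments resolve to `<path>/did.json`, and a port is written as `%3A`); the function name `did_web_to_url` is hypothetical, and the gateway DID above uses neither a port nor a path.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    # Colons separate the host from optional path segments.
    parts = did[len(prefix):].split(":")
    if not all(parts):
        raise ValueError("empty segment in did:web identifier")
    # A port, if any, is percent-encoded inside the host segment (%3A).
    host = unquote(parts[0])
    path = parts[1:]
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    return f"https://{host}/.well-known/did.json"


print(did_web_to_url("did:web:api.aethermesh.dev"))
```

For the gateway DID this yields the `/.well-known/did.json` URL shown above.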

**Why `did:web` (and not `did:key` alone):** `did:key` encodes the public key into the identifier itself, which is excellent for HSM-backed roots but creates a **rotation problem**: any key rotation changes the DID, which invalidates every previously-issued `iss` claim. The gateway is a long-lived service identity that MUST support key rotation without identity churn (see §6.2). `did:web` decouples identifier from key material via a resolvable document, paying the cost of TLS-trusted resolution (acceptable: the gateway already lives behind CF TLS).

`did:key` remains in scope for the **device leaf** (RFC-0007) and for any **HSM-backed offline gateway root** used to sign DID-document updates (see §6.1, §7).

### §2.2 DID Document Content

Outside a key-rotation overlap window (§6.2), the gateway DID document MUST contain exactly one `verificationMethod` entry. Its public key is the hybrid Ed25519+ML-DSA-65 bundle defined in §4.3:

```json
{
  "@context": ["https://www.w3.org/ns/did/v1"],
  "id": "did:web:api.aethermesh.dev",
  "verificationMethod": [{
    "id": "did:web:api.aethermesh.dev#gw-sig-1",
    "type": "HybridEd25519MLDSA65VerificationKey2026",
    "controller": "did:web:api.aethermesh.dev",
    "publicKeyJwk": {
      "kty": "OKP",
      "crv": "Ed25519+ML-DSA-65",
      "ed25519_pk": "<base64url>",
      "mldsa65_pk":  "<base64url>",
      "kid": "gw-sig-1"
    }
  }],
  "assertionMethod": ["did:web:api.aethermesh.dev#gw-sig-1"]
}
```

### §2.3 `iss` Claim Format

All gateway-issued JWS (tenant-init token, enroll-token, runtime-token, any future gateway-signed envelope) MUST set the `iss` claim and the `kid` header parameter:

```
iss = "did:web:api.aethermesh.dev"
kid = "gw-sig-1"      ; header parameter; matches verificationMethod fragment
```

`kid` MUST be set so that verifiers can select the correct key from a cached copy of the DID document, using `kid` as the cache key, instead of re-fetching the document on every verification.

### §2.4 Backward Compatibility (placeholder cutover)

Existing W2-3 tokens carrying `iss=https://api.aethermesh.dev` SHALL be accepted by verifiers until the **cutover date defined in §5**. After cutover, only `iss=did:web:api.aethermesh.dev` is valid.

---

## §3. Token TTL Caps (closes Gap-B)

Three token tiers exist in the v2 stack. Each gets a hard TTL cap that MUST be enforced **at issuance** (`exp - iat ≤ cap`) **and at verification** (reject any token whose `exp - iat` exceeds the cap, regardless of signature validity).

| Tier              | Cap        | Seconds | Rationale                                                                                       |
|-------------------|------------|---------|-------------------------------------------------------------------------------------------------|
| Tenant-init token | **24 h**   | 86 400  | Tenant-bound, single-use bootstrap (RFC-0012 §5.2). Long enough to survive operator coffee/timezone gap; short enough that a leaked token's blast radius expires within one operational day. |
| Enroll-token      | **1 h**    | 3 600   | Device-class-bound, used during one enrollment ceremony (RFC-0007). Operator presence is implied; 1 h covers slow networks and retries. |
| Runtime-token     | **15 min** | 900     | Per-device session token. Frequent rotation bounds the impact of edge-side key compromise; 15 min matches typical cloud session-token defaults and aligns with WSS reconnect cadence. |

### §3.1 Enforcement (NORMATIVE)

- **Claim completeness (v1.1 erratum, A-1):** Verifiers MUST reject any token where `iat` or `exp` is absent or non-numeric, or where `exp ≤ iat`; a non-positive TTL is structurally invalid.
- **Issuance:** the gateway MUST reject any internal mint request that asks for `exp - iat > cap`. The gateway MUST NOT silently clamp the TTL; it MUST return HTTP 400 or 422 to the caller so that the misconfiguration is visible.
- **Verification:** verifiers (gateway, edge, MCP clients) MUST compute `exp - iat` from the token's own claims and reject if it exceeds the cap for the token's tier. The tier is determined by the token's `aud` / `scope` claim (already specified in RFC-0003 §3 and RFC-0012 §5).
- **Clock skew:** a verifier MAY allow ±60 s skew on `nbf`/`exp` evaluation, but the **TTL ceiling check is computed from the token's own `iat`/`exp` and is not subject to skew.**
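
The verification-side rules above can be sketched as a single predicate. This is an illustrative sketch only: the tier labels used as dictionary keys and the name `ttl_is_valid` are hypothetical, and mapping a token to its tier from `aud` / `scope` (RFC-0003 §3, RFC-0012 §5) is assumed to happen before this check.

```python
import math

# Caps from the §3 table, in seconds. The tier labels are hypothetical.
TTL_CAPS_SECONDS = {"tenant-init": 86_400, "enroll": 3_600, "runtime": 900}


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    # NaN and infinities would defeat the comparisons below.
    return isinstance(value, float) and math.isfinite(value)


def ttl_is_valid(claims: dict, tier: str) -> bool:
    """Apply the claim-completeness and TTL-ceiling checks to a token's claims."""
    cap = TTL_CAPS_SECONDS.get(tier)
    if cap is None:
        return False  # unknown tier: fail closed
    iat, exp = claims.get("iat"), claims.get("exp")
    if not (_is_number(iat) and _is_number(exp)):
        return False  # A-1: absent or non-numeric
    if exp <= iat:
        return False  # A-1: non-positive TTL
    # The ceiling uses the token's own claims, so no clock skew applies here.
    return exp - iat <= cap
```

Expiry against the verifier's clock (`nbf` / `exp`, with the optional ±60 s skew) is a separate check and is deliberately left out of this sketch.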

### §3.2 Non-Goals

This RFC does NOT specify refresh-token mechanics, sliding-window renewal, or revocation lists. Those remain open (see §7).

---

## §4. Hybrid JWS Signature Wire Format (closes Gap-C)

### §4.1 `alg` Identifier (re-statement, no change)

Per RFC-0013 §4, the JWS `alg` value is the literal string:

```
Ed25519+ML-DSA-65
```

Verifiers MUST reject any gateway-issued token whose `alg` is not exactly this string (see §6.3 — algorithm-confusion defense).

### §4.2 Signature Bytes Layout (NORMATIVE)

The JWS signature segment is the base64url-encoding (RFC 7515 §2, no padding) of the following **3 373-byte** concatenation:

```
sig_bytes := ed25519_sig (64 B) || mldsa65_sig (3309 B)
             ^                     ^
             offset 0              offset 64
```

- **No separator.** Both component sizes are fixed by their respective specifications (Ed25519 = 64 B per RFC 8032 / FIPS 186-5; ML-DSA-65 = 3 309 B per FIPS 204).
- **No length prefix.** Lengths are constants of the algorithm identifier; varying them would change the algorithm.
- **Total length:** 3 373 bytes raw → **4 498 base64url characters** (unpadded).
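
The length figures above can be checked with a few lines of arithmetic. The sketch below uses only the Python standard library; the constant and function names are illustrative, and the all-zero byte string stands in for a real signature.

```python
import base64

ED25519_SIG_LEN = 64
MLDSA65_SIG_LEN = 3309
HYBRID_SIG_LEN = ED25519_SIG_LEN + MLDSA65_SIG_LEN  # 3373


def unpadded_b64url_len(n: int) -> int:
    """Characters needed to base64url-encode n bytes without '=' padding."""
    full, rem = divmod(n, 3)
    # Each full 3-byte group takes 4 characters; 1 or 2 leftover bytes
    # take 2 or 3 characters respectively.
    return 4 * full + (rem + 1 if rem else 0)


def b64url_nopad(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


segment = b64url_nopad(bytes(HYBRID_SIG_LEN))  # placeholder signature bytes
print(HYBRID_SIG_LEN, unpadded_b64url_len(HYBRID_SIG_LEN), len(segment))
```

Here 3 373 = 3 × 1 124 + 1, so the encoding is 4 × 1 124 + 2 = 4 498 characters, matching the figure stated above.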

### §4.3 Public-Key Bundle Format (NORMATIVE)

For JWS `jwk` headers, DID-document `publicKeyJwk` entries (§2.2), and any wire-side public-key transport, the hybrid public key is a JSON object:

```json
{
  "kty": "OKP",
  "crv": "Ed25519+ML-DSA-65",
  "ed25519_pk": "<base64url, 32 raw bytes>",
  "mldsa65_pk":  "<base64url, 1952 raw bytes>",
  "kid": "<opaque key id, REQUIRED for gateway keys>"
}
```

Producers MUST NOT emit members beyond those listed above (`additionalProperties: false` by convention). Verifiers MUST silently ignore unknown members (per JOSE) and MUST NOT base any trust decision on them.
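
A consumer of this bundle would typically decode both members and check their raw lengths before use. The sketch below is one possible shape, assuming the standard library only; `parse_hybrid_jwk` and `b64url_decode_strict` are hypothetical helper names, not part of any ratified API.

```python
import base64
import re

ED25519_PK_LEN = 32
MLDSA65_PK_LEN = 1952
_B64URL = re.compile(r"[A-Za-z0-9_-]+")


def b64url_decode_strict(text) -> bytes:
    """Decode unpadded base64url, rejecting characters outside the alphabet."""
    if not isinstance(text, str) or not _B64URL.fullmatch(text):
        raise ValueError("not unpadded base64url")
    # Restore the padding that the unpadded form strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def parse_hybrid_jwk(jwk: dict) -> tuple[bytes, bytes]:
    """Return (ed25519_pk, mldsa65_pk) as raw bytes, or raise ValueError."""
    if jwk.get("kty") != "OKP" or jwk.get("crv") != "Ed25519+ML-DSA-65":
        raise ValueError("unexpected kty/crv")
    ed_pk = b64url_decode_strict(jwk.get("ed25519_pk"))
    pq_pk = b64url_decode_strict(jwk.get("mldsa65_pk"))
    if len(ed_pk) != ED25519_PK_LEN or len(pq_pk) != MLDSA65_PK_LEN:
        raise ValueError("unexpected public-key length")
    # Unknown members are ignored and never consulted.
    return ed_pk, pq_pk
```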

**Note on coexistence with RFC-0013 §4 binary blob.** RFC-0013 §4 defines a length-tagged binary `hybrid_pubkey_v1` blob for **D1 storage and WSS auth frames** (device enrollment path). This RFC defines a JSON shape for **JWS / DID-document** contexts. Both encode the same logical key pair; converters MUST be byte-exact.

### §4.4 Verification Algorithm (NORMATIVE)

```
fn verify_jws_hybrid(jws_compact, pubkey_jwk) -> bool:
    # Split the compact serialization; reject unless there are exactly 3 segments.
    (header_b64, payload_b64, sig_b64) = split(jws_compact, ".")
    header = json_parse(b64url_decode(header_b64))
    if header.alg != "Ed25519+ML-DSA-65": return false   # §4.1, §6.3
    sig_bytes = b64url_decode(sig_b64)
    if len(sig_bytes) != 3373: return false              # hard length check, §6.4
    ed_sig = sig_bytes[0..64]
    pq_sig = sig_bytes[64..3373]
    # Signing input is built from the segments as received, never re-encoded.
    msg    = ascii(header_b64 || "." || payload_b64)     # RFC 7515 §5.1
    ed_pub = b64url_decode(pubkey_jwk.ed25519_pk)
    pq_pub = b64url_decode(pubkey_jwk.mldsa65_pk)
    ok_ed  = ed25519_verify(ed_pub, msg, ed_sig)     # do not short-circuit
    ok_pq  = mldsa65_verify(pq_pub, msg, pq_sig)
    return ok_ed AND ok_pq                           # constant-time AND

Both verifications MUST execute regardless of the first result (constant-time AND, per RFC-0013 §4 verification pseudocode). Length-mismatch on `sig_bytes` is a hard reject **before** any cryptographic operation.
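
For readers who prefer executable code, the following Python sketch mirrors the pseudocode. It is illustrative, not a reference implementation: the two primitive verifiers are passed in as callables because the Python standard library ships neither Ed25519 nor ML-DSA-65, and the parameter names are assumptions. Python also offers no real constant-time guarantees; the sketch only ensures that both verifiers are always invoked.

```python
import base64
import json
from typing import Callable

# (public_key, message, signature) -> bool; supplied by the caller.
Verifier = Callable[[bytes, bytes, bytes], bool]

HYBRID_ALG = "Ed25519+ML-DSA-65"
ED_SIG_LEN, HYBRID_SIG_LEN = 64, 3373


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def verify_jws_hybrid(jws_compact: str, ed_pub: bytes, pq_pub: bytes,
                      ed25519_verify: Verifier, mldsa65_verify: Verifier) -> bool:
    parts = jws_compact.split(".")
    if len(parts) != 3:
        return False
    header_b64, payload_b64, sig_b64 = parts
    try:
        header = json.loads(_b64url_decode(header_b64))
        sig = _b64url_decode(sig_b64)
        # Signing input uses the segments exactly as received (RFC 7515 §5.1).
        msg = f"{header_b64}.{payload_b64}".encode("ascii")
    except ValueError:  # covers base64, JSON, and encoding errors
        return False
    if not isinstance(header, dict) or header.get("alg") != HYBRID_ALG:
        return False
    if len(sig) != HYBRID_SIG_LEN:
        return False  # hard reject before any cryptographic operation
    ok_ed = bool(ed25519_verify(ed_pub, msg, sig[:ED_SIG_LEN]))
    ok_pq = bool(mldsa65_verify(pq_pub, msg, sig[ED_SIG_LEN:]))
    return ok_ed & ok_pq  # both calls have already run; '&' does not short-circuit
```

A caller would bind `ed25519_verify` and `mldsa65_verify` to whatever audited library the platform provides and obtain `ed_pub` / `pq_pub` from the §4.3 bundle selected by `kid`.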

### §4.5 JWKS Mirror Endpoint (NORMATIVE)

The gateway MUST serve a JWKS mirror of the DID-document `verificationMethod` set at:

```
GET https://api.aethermesh.dev/.well-known/jwks.json
```

**Response body** (`Content-Type: application/jwk-set+json`):

```json
{
  "keys": [
    {
      "kty": "OKP",
      "crv": "Ed25519+ML-DSA-65",
      "ed25519_pk": "<base64url, 32 raw bytes>",
      "mldsa65_pk":  "<base64url, 1952 raw bytes>",
      "kid": "gw-sig-1"
    }
  ]
}
```

**Normative requirements:**

- Each entry in `keys[]` MUST be the §4.3 hybrid `publicKeyJwk` shape, byte-identical to the corresponding `verificationMethod[*].publicKeyJwk` in `/.well-known/did.json` (§2.2). The two endpoints MUST be derived from the same source of truth; divergence is a defect.
- During key rotation (§6.2), `keys[]` MUST contain every `kid` currently published in the DID document (i.e. both old and new during the overlap window).
- The response MUST include `Cache-Control: public, max-age=300, stale-while-revalidate=600`.
- The endpoint MUST be served over the same Cloudflare-terminated TLS as `/.well-known/did.json`; integrity model is identical (§6.1).
- The endpoint MUST NOT include private-key material, signing-key proofs, or any field beyond the §4.3 bundle (which already includes `kid`).

**Rationale.** Verifiers that do not implement W3C DID resolution (generic JOSE / JWS libraries, third-party MCP clients, debugging tooling) can fetch the JWKS directly and select by `kid`. The DID document remains the authoritative identity record (§2); JWKS is a cache-friendly projection of its key material only.
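
One way to keep the two endpoints derived from a single source of truth is to generate the JWKS body from the DID document and to check deployed responses against that projection. The sketch below assumes both documents are already parsed JSON; the function names are hypothetical, and the comparison is on decoded members matched by `kid`, which is weaker than a literal byte-for-byte comparison of the serialized output.

```python
JWKS_CACHE_CONTROL = "public, max-age=300, stale-while-revalidate=600"


def jwks_from_did_document(did_doc: dict) -> dict:
    """Project every verificationMethod key of the DID document into a JWKS body."""
    return {"keys": [vm["publicKeyJwk"] for vm in did_doc["verificationMethod"]]}


def jwks_mirrors_did_document(jwks: dict, did_doc: dict) -> bool:
    """True if the JWKS publishes exactly the keys of the DID document, by kid."""
    def by_kid(keys):
        return {key.get("kid"): key for key in keys}

    published = jwks.get("keys", [])
    expected = jwks_from_did_document(did_doc)["keys"]
    # The length check catches duplicate kids that a dict would collapse.
    return len(published) == len(expected) and by_kid(published) == by_kid(expected)
```

A deployment check could run `jwks_mirrors_did_document` against both live endpoints and treat a `False` result as the defect described above.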

---

## §5. Migration & Compatibility

### §5.1 Cutover Plan

- **W3-D0 (W3 start):** mcp-gateway Worker publishes `/.well-known/did.json` per §2.2. Gateway begins minting all new tokens with `iss=did:web:api.aethermesh.dev`, `kid=gw-sig-1`.
- **(v1.1 erratum, A-3)** The gateway MUST NOT mint tokens with `iss=https://api.aethermesh.dev` on or after W3-D0. The 30-day dual-accept window is VERIFY-ONLY; the §6.3 alg check applies to legacy-iss tokens without exception.
- **W3-D0 → W3-D0 + 30 days:** verifiers (edge `aiot-edge`, MCP clients) accept **both** legacy `iss=https://api.aethermesh.dev` and new `iss=did:web:api.aethermesh.dev`.
- **W3-D0 + 30 days (cutover):** verifiers reject legacy `https://` issuer. All tokens minted before W3-D0 will have expired by then under §3 TTL caps (max 24h tenant-init), so no live token is invalidated by the cutover.
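
The verifier-side issuer rule during and after the dual-accept window can be sketched as below. The calendar date of W3-D0 is not fixed by this RFC, so it is a parameter here; `issuer_accepted` is a hypothetical name, and the §6.3 `alg` check still applies to every token regardless of issuer.

```python
from datetime import datetime, timedelta, timezone

NEW_ISS = "did:web:api.aethermesh.dev"
LEGACY_ISS = "https://api.aethermesh.dev"
DUAL_ACCEPT_WINDOW = timedelta(days=30)


def issuer_accepted(iss: str, now: datetime, w3_d0: datetime) -> bool:
    """Decide whether a verifier accepts the token's iss at time `now`."""
    if iss == NEW_ISS:
        return True
    if iss == LEGACY_ISS:
        # Legacy issuer is verify-only and rejected from the cutover instant on.
        return now < w3_d0 + DUAL_ACCEPT_WINDOW
    return False


# Example with a made-up W3-D0; the real date comes from the sprint plan.
d0 = datetime(2026, 6, 1, tzinfo=timezone.utc)
print(issuer_accepted(LEGACY_ISS, d0 + timedelta(days=29), d0))
print(issuer_accepted(LEGACY_ISS, d0 + timedelta(days=30), d0))
```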

### §5.2 D1 / Storage Migration

**None required.** This RFC changes token claim values and signature wire encoding; no D1 schema column changes, no KV-key reshape. The existing `enroll_pubkey_pem` PEM blob (RFC-0013 §4) is unaffected — it lives on the device-enrollment path, not the gateway-identity path.

### §5.3 Edge Binary Impact

The `aiot-edge` binary already carries an ML-DSA-65 verifier (RFC-0013 §4, G-V2-B; W1 spike measured 3.62 MB / 6 MB cap). Adding the §4.4 JWS-shaped verification path is **logic only**, no new dependency, no new symbol; size impact estimated < 5 KB.

---

## §6. Security Considerations

*Hardened in v1.1 erratum; see Changelog.*

### §6.1 DID Document Tamper Protection

The `/.well-known/did.json` document is served over Cloudflare-terminated TLS; in this revision, integrity rests on TLS alone. **Future hardening (out of scope for this RFC):** the DID document SHOULD itself be signed by an HSM-backed offline root key (a `did:key` controller of the `did:web` DID), giving a verifiable trust chain that does not depend on TLS for integrity. Tracked in §7 Q1.

### §6.2 Key Rotation

- Each verification key in the DID document carries a `kid`. To rotate, the gateway:
  1. Generates new hybrid keypair `gw-sig-{N+1}`.
  2. Publishes an updated `did.json` containing **both** `gw-sig-N` and `gw-sig-{N+1}` in `verificationMethod`.
  3. Begins minting tokens with `kid=gw-sig-{N+1}`.
  4. Once the longest token TTL (24 h per §3) has elapsed since the last token was minted under `gw-sig-N`, removes `gw-sig-N` from the DID document.
- Verifiers MUST select the verification key by `kid` matching against the resolved DID document.
- **(v1.1 erratum, A-2)** If `kid` is absent from the token header, or matches no verificationMethod id fragment in the resolved DID document, the verifier MUST reject immediately and fail closed. No fallback to any default or first-listed key is permitted.
- The JWKS mirror endpoint (§4.5) MUST list every currently-published `kid` for the duration of the overlap window.
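
The fail-closed `kid` selection rule above can be sketched as follows. The function name and the choice to signal rejection with `None` are assumptions made for illustration; what matters is that there is no path that falls back to a default or first-listed key.

```python
from typing import Optional


def select_verification_method(did_doc: dict, kid) -> Optional[dict]:
    """Return the single method whose id fragment equals kid, else None.

    A None result means the caller MUST reject the token (A-2).
    """
    if not isinstance(kid, str) or not kid:
        return None  # kid absent or malformed
    matches = [
        vm for vm in did_doc.get("verificationMethod", [])
        # partition() yields an empty fragment when the id has no '#'.
        if isinstance(vm.get("id"), str) and vm["id"].partition("#")[2] == kid
    ]
    # Zero matches or an ambiguous document both fail closed.
    return matches[0] if len(matches) == 1 else None
```

During a rotation overlap the document lists both `gw-sig-N` and `gw-sig-{N+1}`, and this lookup returns whichever one the token header names.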

### §6.3 Algorithm-Confusion Defense

Verifiers MUST reject any gateway-issued JWS whose `alg` header is **not** the exact string `Ed25519+ML-DSA-65`. In particular:

- `alg=none` MUST be rejected unconditionally.
- `alg=Ed25519` (classical-only) MUST be rejected — even if the embedded signature happens to be a valid Ed25519 signature, this RFC requires hybrid.
- `alg=ML-DSA-65` (PQ-only) MUST be rejected for the same reason.
- Verifiers MUST NOT select the verification algorithm from the key's `crv` field; the `alg` header is authoritative and is checked against a hardcoded allow-list.
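
A hardcoded allow-list check on the protected header might look like the sketch below. It inspects only the `alg` header and never consults the key's `crv`; the name `alg_is_allowed` is hypothetical, and the input is assumed to be the first segment of the compact serialization as received.

```python
import base64
import json

# Hardcoded allow-list: exactly one algorithm for gateway-issued tokens.
ALLOWED_ALGS = frozenset({"Ed25519+ML-DSA-65"})


def alg_is_allowed(protected_header_b64: str) -> bool:
    """Check the JWS protected header's alg against the allow-list."""
    try:
        padded = protected_header_b64 + "=" * (-len(protected_header_b64) % 4)
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # malformed base64url or JSON
        return False
    if not isinstance(header, dict):
        return False
    alg = header.get("alg")
    # Exact, case-sensitive string match; 'none' and single-algorithm values fail.
    return isinstance(alg, str) and alg in ALLOWED_ALGS
```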

### §6.4 Length-Confusion Defense

Per §4.4, verifiers MUST hard-reject any signature segment whose decoded byte length is not exactly 3 373. This prevents a class of attacks where a truncated or padded signature might cause undefined behaviour in a naïve splitter.

### §6.5 TTL-Cap Enforcement Justification

Enforcing TTL caps at **verification** time (not just issuance) closes the trust gap where a compromised or buggy minter issues a long-lived token: every verifier becomes a second line of defense.

---

## §7. Open Questions

- **Q1 — HSM vs software-only gateway key.** Test/staging environments will run software-only (Cloudflare Workers Secrets binding). Production target is an HSM-backed root that signs DID-document updates (see §6.1). Decision deferred: which HSM (CF's upcoming Keyless SSL-style API, or external KMS)?
- **Q2 — `did:web` vs `did:key` long-term.** This RFC commits to `did:web` for the gateway identity. If/when CF offers a primitive for stable resolvable identifiers without TLS dependence, re-evaluate.
- **Q3 — Cross-region key replication.** Cloudflare Workers run globally; the signing key must be available in every PoP. Options: (a) Workers Secret binding (replicated by CF); (b) KV with at-rest encryption; (c) per-region HSM with regional `kid`s. Decision deferred to W4.

*Resolved during v1.0 ratification:*

- **(was Q4) JWKS mirror endpoint** — **RESOLVED**: promoted to normative §4.5. The gateway MUST publish `/.well-known/jwks.json` for non-DID-aware verifiers.
- **(was Q5) Refresh / rotation of runtime tokens** — **DEFERRED to RFC-0016** (Runtime-Token Refresh & Rotation). RFC-0016 MUST be ratified before W3 implementation of the 15-min runtime-token TTL goes live, otherwise edge devices would disconnect every 15 minutes.

**Open question count: 3.**

---

## §8. References

- **RFC-0001** — MCP JSON v1 (envelope baseline).
- **RFC-0003** — RBAC token format (claim shape, `aud`/`scope`).
- **RFC-0007** — Public enrollment (device-leaf DIDs, enroll-token tier).
- **RFC-0012** — Tenant Model (tenant-init token purpose, §5.2).
- **RFC-0013** — Post-Quantum Cryptography (hybrid construction, §4; algorithm identifier; sizes).
- **RFC-0014** — Protocol v2 Wire (envelope, headers).
- **RFC-0016** — Runtime-Token Refresh & Rotation (forward reference; specifies the in-band refresh mechanism for the 15-min runtime-token tier defined in §3 of this RFC).
- **W3C DID Core** — <https://www.w3.org/TR/did-core/>.
- **did:web Method Specification** — <https://w3c-ccg.github.io/did-method-web/>.
- **RFC 7515 (JWS)** — JSON Web Signature, in particular §2 (base64url encoding without padding), §5.1 (signing input), §7.1 (compact serialization).
- **FIPS 204** — ML-DSA (signature size 3 309 B for ML-DSA-65).
- **NIST IR 8547 (draft)** — PQC migration guidance.

---

## Changelog

- **v1.1 — 2026-05-05 (security erratum, fast-tracked per CTO)** — Applied 3 CRITICAL amendments from devex-protocol-sec review (A-1 iat/exp absence rejection, A-2 strict kid match no-fallback, A-3 legacy-iss VERIFY-ONLY). No behavior changes for compliant clients; closes downgrade vectors. Source: `tracking/work/devex-protocol-sec/rfc-0015-security-review.md`. HIGH/MEDIUM/LOW items from the same review deferred to follow-up (no in-scope pure-text fixes beyond what A-1/A-2/A-3 already cover; H-1 alg-policy clarification is folded into A-3).
- **v1.0 Ratified (2026-05-05):** CTO sign-off. Q4 (JWKS mirror endpoint) promoted from Open Questions to normative §4.5 (`GET /.well-known/jwks.json`, hybrid `publicKeyJwk` shape, `Cache-Control: public, max-age=300, stale-while-revalidate=600`, byte-consistent with `did.json`). Q5 (runtime-token refresh) deferred to RFC-0016, which MUST ratify before W3 lights up the 15-min runtime-token TTL. Forward reference to RFC-0016 added to §8. Open-question count 5 → 3. No changes to §2/§3/§4.1–§4.4/§5/§6 normative content.
- **v0.1 DRAFT (2026-05-05):** Initial draft. Closes W2-3 gaps A/B/C: gateway `did:web` identity, three-tier TTL caps (24h/1h/15min), hybrid JWS signature wire layout (64B || 3309B = 3373B) and `publicKeyJwk` bundle shape. Awaiting CTO review gate. *(Archived at `tracking/work/agentic-architect/archive/rfc-0015-v0.1-draft.md`.)*
