# RFC-0012 — Tenant Model & Isolation (v2 Protocol Reset)

- **Status:** **Ratified v1.0 — 2026-05-03.**
- **Date:** 2026-05-03
- **Ratified-By:** CTO via `tracking/sprint-3-extended-v2-plan.md` §8 (G-V2-A through G-V2-G).
- **Author:** 🧠 Agentic Architect.
- **Sprint:** Sprint 3 Extended — v2 protocol reset bundle (RFC-0012/0013/0014).
- **Audience:** Cloud Agents (LLMs); ☁️ cloudflare-native-edge; 🦀 edge-kubelet-engineer; 🛡️ devex-protocol-sec; CTO chair.
- **Depends on:** RFC-0013 v1.0 (hybrid signature spec used in DID proof-of-control & in JWT signatures referenced from §6).
- **Will require v3.0 amendments to:** RFC-0001 (MCP envelope: `tid` field), RFC-0002 (WSS subprotocol header), RFC-0003 (RBAC scope grammar). Amendments deferred; this RFC only specifies the surface those amendments must carry. v3.0 amendment window: end-of-sprint-3-extended (2026-05-30).
- **Breaking changes:** YES, allowed per CTO-locked policy. v1 protocol surface deprecates 2026-05-31; mock tenant re-enrolls under v2.

---

## Change Log

- **v1.0 (2026-05-03):** Ratified. Folded G-V2-A..G per Sprint-3-Extended plan §8. Drop+recreate D1 migration. UUIDv5 tenant_id (ns=`56bdc01c-052e-5f60-abfb-7fa367b284e3`). Pure-TS PQC on Worker, WASM swap deferred to S4. 6MB edge binary cap. Self-sovereign tenant bootstrap (managed-tenant deferred to v3.0). Locked alg=`Ed25519+ML-DSA-65` with internal `alg_id` enum. Tenant DELETE OPTIONAL.
- **v0.1 DRAFT (2026-04-29):** Initial draft.

---

## §1. Abstract

This RFC defines the **tenant** as the top-level resource-isolation primitive of the AIoT PaaS. Every device, interface, audit row, KV key, R2 object, and Durable Object instance is owned by exactly one tenant. Tenant identity is rooted in a W3C **Decentralized Identifier (DID)**, allowing chain-agnostic Web3 wallet identities (`did:pkh`, CAIP-10) and self-generated keypair identities (`did:key`) to coexist in a single codebase serving both the hosted cloud and OSS self-hosters. Data isolation is **strict** at every storage and API boundary; compute and fault isolation are **best-effort** (shared Worker / D1 / KV; per-tenant rate limits and exception fences bound noisy-neighbor and crash blast-radius).

## §2. Motivation

The current single-tenant operation (mock tenant only) is operationally untenable for any future onboarding. Without a tenant primitive, every D1 query, KV lookup, and DO route silently joins all customers' data into one keyspace; any bug becomes a cross-customer disclosure. We must close this hole **before** the first non-mock device enrolls.

A **DID-based** identity model is preferred over an opaque server-issued account ID because (a) it gives tenants self-sovereign recovery (the wallet/keypair is the root, not a vendor-controlled DB row), (b) it is chain-agnostic (CAIP-10 covers Ethereum, Solana, and dozens of other chains under a single grammar), and (c) `did:key` provides an identical code path for OSS self-hosters with no wallet dependency, satisfying the **one-codebase** constraint.

Hybrid post-quantum signatures (RFC-0013) cover device identity and runtime tokens; this RFC is concerned with **resource scoping and access control**, not with the cryptography of the proofs themselves.

---

## §3. Tenant Identity Model

### §3.1 W3C DID Core

Tenant identity is a **W3C DID** ([W3C DID Core 1.0]). Two DID methods are normatively supported in v2:

| Method   | Spec                                                              | Use case                                  |
|----------|-------------------------------------------------------------------|-------------------------------------------|
| `did:pkh`| [CAIP-10] (chain-agnostic public-key-hash)                        | Web3 wallet (Ethereum, Solana, …)         |
| `did:key`| [W3C did:key]                                                     | Self-generated keypair (OSS default)      |

Other DID methods (`did:web`, `did:ion`, …) are **out of scope for v2**; they may be added additively in v3 without breaking existing tenants.
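
The two-method allowlist can be illustrated with a minimal sketch. The function name `parseDidMethod`, the loose shape regex, and both error codes are assumptions made for illustration; none of them are defined by this RFC.

```typescript
// Sketch only: method allowlist for tenant DIDs (v2 accepts did:pkh and did:key).
type DidMethod = "pkh" | "key";

// Assumption: a loose `did:<method>:<id>` shape check. Full CAIP-10 and
// multibase validation would live in the real implementation.
const DID_SHAPE = /^did:([a-z0-9]+):(.+)$/;

function parseDidMethod(did: string): DidMethod {
  const match = DID_SHAPE.exec(did);
  if (match === null) {
    throw new Error("E_DID_MALFORMED"); // hypothetical error code
  }
  const method = match[1];
  if (method === "pkh" || method === "key") {
    return method;
  }
  throw new Error("E_DID_METHOD_UNSUPPORTED"); // hypothetical error code
}
```

The returned value lines up with the `did_method` column in §5.1, so the same check could feed the insert path.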

### §3.2 `tenant_id` derivation (NORMATIVE — v1.0, supersedes draft)

**Algorithm:** RFC-4122 §4.3 **UUIDv5** (SHA-1 namespace hash) over a fixed DID namespace.

```
TENANT_NAMESPACE_UUID = "56bdc01c-052e-5f60-abfb-7fa367b284e3"
                      = uuidv5(DNS_NAMESPACE, "tenant.aethermesh.app")
tenant_id             = uuidv5(TENANT_NAMESPACE_UUID, did_string_utf8)
```

The UUIDv5 construction sets the version (5) and variant (RFC 4122) bits per spec, so the resulting `tenant_id` is a **conformant** UUID string that standard UUID libraries and database UUID validators accept. This supersedes the v0.1 DRAFT `tenant_id = uuid_from_bytes(sha256(did)[0..16])` raw-hash construction (rejected by G-V2-D in favor of conformance).

**`TENANT_NAMESPACE_UUID` is a canonical compile-time constant.** Implementations MUST hardcode `56bdc01c-052e-5f60-abfb-7fa367b284e3`; they MUST NOT recompute it at runtime, because any divergence between language runtimes' UUIDv5 implementations would silently mint a different `tenant_id` for the same DID on different platforms. Worker reference: `src/domain/tenant_id.ts`. Edge reference: `crates/manifest::tenant::TENANT_NAMESPACE_UUID`. The D1 work by ☁️ cloudflare-native-edge already embeds this constant.
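
The derivation above can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the contents of `src/domain/tenant_id.ts`: it uses Node's `node:crypto` for SHA-1 (a Worker build might use Web Crypto instead), and the helper names `uuidToBytes`, `uuidv5`, and `deriveTenantId` are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hardcoded per §3.2; never recomputed at runtime.
const TENANT_NAMESPACE_UUID = "56bdc01c-052e-5f60-abfb-7fa367b284e3";

// Parse a canonical UUID string into its 16 raw bytes.
function uuidToBytes(uuid: string): Uint8Array {
  const hex = uuid.replace(/-/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error("not a UUID: " + uuid);
  }
  const out = new Uint8Array(16);
  for (let i = 0; i < 16; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// RFC 4122 §4.3: SHA-1 over namespace bytes followed by the name,
// truncated to 16 bytes, with version and variant bits overwritten.
function uuidv5(namespace: string, name: string): string {
  const digest = createHash("sha1")
    .update(uuidToBytes(namespace))
    .update(name, "utf8")
    .digest();
  const b = Uint8Array.from(digest.subarray(0, 16));
  b[6] = (b[6] & 0x0f) | 0x50; // version 5
  b[8] = (b[8] & 0x3f) | 0x80; // RFC 4122 variant
  const hex = Array.from(b, (x) => x.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Hypothetical wrapper name: tenant_id = uuidv5(TENANT_NAMESPACE_UUID, did).
function deriveTenantId(did: string): string {
  return uuidv5(TENANT_NAMESPACE_UUID, did);
}
```

The DID string is hashed exactly as given, with no case folding or trimming, so callers would need to agree on one canonical spelling of a DID before deriving the `tenant_id`.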

**Worked example A — `did:pkh:eip155`:**
```
did       = "did:pkh:eip155:1:0xab5801a7d398351b8be11c439e05c5b3259aec9b"
ns        = 56bdc01c-052e-5f60-abfb-7fa367b284e3
tenant_id = uuidv5(ns, did)
          ; deterministic per RFC-4122 §4.3; conformance fixtures land with the implementation dispatch.
```

**Worked example B — `did:key`:**
```
did       = "did:key:z6MkpTHR8VNsBxYAAWHut2Geadd9jSwuBV8xRoAnwWsdvktH"  # gitleaks:allow
ns        = 56bdc01c-052e-5f60-abfb-7fa367b284e3
tenant_id = uuidv5(ns, did)
```

**Worked example C — mock tenant (sentinel DID, see §8):**
```
did       = "did:key:z__MOCK_TENANT__"
ns        = 56bdc01c-052e-5f60-abfb-7fa367b284e3
mock_tenant_id = uuidv5(ns, "did:key:z__MOCK_TENANT__")
```

(Hex digest values are deterministic but elided here; canonical fixture vectors live in the implementation test suite.)

### §3.3 Tenant lifecycle

| State        | Trigger                                                              | Storage effect                                              |
|--------------|----------------------------------------------------------------------|-------------------------------------------------------------|
| `pending`    | DID seen at `POST /v1/tenants/init`, signature challenge in flight   | none persisted                                              |
| `active`     | challenge verified; tenant row inserted (lazy-create, idempotent)    | `tenants` row with `status='active'`                        |
| `soft_deleted` | `DELETE /v1/tenants/me` by tenant-admin scope                       | `status='soft_deleted'`, `deleted_at` set; reads denied; data retained for grace period (§12 Q1) |
| `purged`     | grace period elapses; daily sweeper runs                             | all tenant-scoped rows/KV/R2/DO state hard-deleted          |

Re-enrollment after `purged` is allowed and creates a fresh row (same DID → same `tenant_id`). Because purge is total, no orphaned references can survive it.
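
The lifecycle table can be read as a small state machine. The sketch below is one possible encoding: the event names are hypothetical, and modelling re-enrollment as `purged → pending` (a fresh challenge) is an assumption drawn from the paragraph above.

```typescript
type TenantState = "pending" | "active" | "soft_deleted" | "purged";

// Hypothetical event names; the RFC describes triggers in prose only.
type TenantEvent =
  | "challenge_verified"
  | "delete_requested"
  | "grace_elapsed"
  | "reenroll";

const TRANSITIONS: Record<TenantState, Partial<Record<TenantEvent, TenantState>>> = {
  pending: { challenge_verified: "active" },
  active: { delete_requested: "soft_deleted" },
  soft_deleted: { grace_elapsed: "purged" },
  purged: { reenroll: "pending" }, // assumption: re-enrollment restarts the challenge
};

// Returns the next state, or throws when the table has no such edge.
function nextTenantState(state: TenantState, event: TenantEvent): TenantState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error("illegal tenant transition: " + state + " + " + event);
  }
  return next;
}
```

Keeping the edges in one table makes any transition the RFC does not list, such as `soft_deleted` back to `active`, fail loudly instead of being silently accepted.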

---

## §4. Isolation Boundaries

| Boundary | Strength      | Mechanism                                                                                          |
|----------|---------------|----------------------------------------------------------------------------------------------------|
| Data     | **Strict**    | Every row/key/object/DO carries `tenant_id`; access goes through scoping helpers (§10)              |
| Compute  | Best-effort   | Shared Worker isolate + shared D1; per-tenant token-bucket rate limits enforce fair-share          |
| Fault    | Best-effort   | Tenant-scoped query paths wrapped in try/catch returning `E_TENANT_INTERNAL`; one tenant's bad data MUST NOT crash the request handler for another tenant on the same isolate |

Tail-latency spillover from a noisy tenant is **acceptable** (no per-tenant infra). Cross-tenant data exposure is **never acceptable** — any test that demonstrates one is a release blocker.
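
For the compute row, a per-tenant token bucket might look like the sketch below. It is an in-memory illustration only: the factory name, the capacity and refill parameters, and the injected clock are assumptions, and a real gateway would need shared state (for example KV or a Durable Object) rather than a per-isolate `Map`.

```typescript
interface Bucket {
  tokens: number;
  updatedMs: number;
}

// Hypothetical factory: one bucket per tenant_id, refilled continuously.
function createTenantRateLimiter(capacity: number, refillPerSec: number) {
  const buckets = new Map<string, Bucket>();

  // nowMs is passed in so the behaviour is deterministic and testable.
  return function tryAcquire(tenantId: string, nowMs: number): boolean {
    const bucket = buckets.get(tenantId) ?? { tokens: capacity, updatedMs: nowMs };
    const elapsedSec = Math.max(0, nowMs - bucket.updatedMs) / 1000;
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec);
    bucket.updatedMs = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) {
      bucket.tokens -= 1;
    }
    buckets.set(tenantId, bucket);
    return allowed;
  };
}
```

Because buckets are keyed by `tenant_id`, a tenant that exhausts its own bucket does not consume another tenant's tokens, which is the fair-share property the table asks for.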

---

## §5. Data Model Changes

### §5.1 New table

```sql
CREATE TABLE tenants (
  tenant_id     TEXT    PRIMARY KEY,           -- UUID-shape per §3.2
  did           TEXT    NOT NULL UNIQUE,       -- full DID string
  did_method    TEXT    NOT NULL,              -- 'pkh' | 'key'
  created_at    INTEGER NOT NULL,              -- unix seconds
  status        TEXT    NOT NULL,              -- 'active' | 'soft_deleted' | 'purged'
  deleted_at    INTEGER,                       -- nullable; set on soft-delete
  metadata_json TEXT    NOT NULL DEFAULT '{}'
);
CREATE INDEX idx_tenants_did ON tenants(did);
```

### §5.2 Existing tables — `tenant_id` propagation (NORMATIVE)

Every existing D1 table gains `tenant_id TEXT NOT NULL` and an index whose leading column is `tenant_id`. Enumeration:

| Table                       | Index added                               |
|-----------------------------|-------------------------------------------|
| `devices`                   | `idx_devices_tenant (tenant_id, node_id)` |
| `enroll_tokens`             | `idx_enroll_tokens_tenant (tenant_id)`    |
| `audit_log`                 | `idx_audit_tenant (tenant_id, ts)`        |
| `tool_invocations`          | `idx_tool_inv_tenant (tenant_id, ts)`     |
| `safety_decisions`          | `idx_safety_tenant (tenant_id, ts)`       |
| `device_shadow_snapshots`   | `idx_shadow_tenant (tenant_id, node_id)`  |
| `interfaces` (if present)   | `idx_interfaces_tenant (tenant_id)`       |

(Implementation dispatch will reconcile against the live schema; any table not in this list is a finding for 🛡️ devex-protocol-sec audit.)

### §5.3 KV key prefix scheme

All Workers KV keys gain a `t:{tenant_id}:` prefix. Enumeration of existing key shapes and their v2 form:

| v1 key shape                  | v2 key shape                                  |
|-------------------------------|-----------------------------------------------|
| `shadow:{node_id}`            | `t:{tenant_id}:shadow:{node_id}`              |
| `enroll-rl:{ip}`              | `t:{tenant_id}:enroll-rl:{ip}` (per-tenant)   |
| `runtime-rl:{node_id}`        | `t:{tenant_id}:runtime-rl:{node_id}`          |
| `presence:{node_id}`          | `t:{tenant_id}:presence:{node_id}`            |
| `idempotency:{key}`           | `t:{tenant_id}:idempotency:{key}`             |

Cross-tenant prefixes (gateway-internal): `sys:` reserved. End-user code paths MUST NOT read or write `sys:*`.
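
A key builder for the scheme above could be sketched like this. The function names, the `KvKind` union, and the UUIDv5 shape regex are assumptions for illustration; the RFC itself only fixes the key shapes.

```typescript
// Lowercase UUIDv5 shape (version nibble 5, RFC 4122 variant).
const TENANT_ID_SHAPE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-5[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

// The five key families enumerated in the table above.
type KvKind = "shadow" | "enroll-rl" | "runtime-rl" | "presence" | "idempotency";

// Builds `t:{tenant_id}:{kind}:{id}`.
function tenantKvKey(tenantId: string, kind: KvKind, id: string): string {
  if (!TENANT_ID_SHAPE.test(tenantId)) {
    throw new Error("tenant_id must be a lowercase UUIDv5 string");
  }
  if (id.length === 0) {
    throw new Error("key suffix must not be empty");
  }
  return "t:" + tenantId + ":" + kind + ":" + id;
}

// Rejects `sys:*` and any key outside the caller's own tenant prefix.
function assertTenantOwnsKey(tenantId: string, key: string): void {
  if (key.startsWith("sys:")) {
    throw new Error("sys:* is reserved for gateway-internal use");
  }
  if (!key.startsWith("t:" + tenantId + ":")) {
    throw new Error("key is outside the tenant prefix");
  }
}
```

Since the `tenant_id` segment has a fixed length and shape, suffixes that themselves contain `:` (an IPv6 address in `enroll-rl`, for instance) do not make the prefix check ambiguous.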

### §5.4 R2 prefix scheme

Tenant-uploaded artifacts: `t/{tenant_id}/...`.
**Shared (NOT tenant-scoped):** release binaries (`gateway/`, `edge/aiot-edge-*`), public homepage assets, `/.well-known/` resources. Rationale: these are not tenant data; they are vendor-supplied software.

### §5.5 Durable Object routing

```
do_id = NAMESPACE.idFromName(`${tenant_id}:${node_id}`)
```

Replaces v1 `idFromName(node_id)`. **All v1 DO instances become orphaned** at v2 cutover; their persisted state is unreachable. Acceptable per breaking-changes policy; mock-tenant DOs will be re-created at first re-enroll.
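
A sketch of the routing helper follows. `DoNamespaceLike` is a hypothetical minimal interface standing in for the Durable Object namespace binding, and this version takes the namespace directly instead of `env`, which is a simplification made for the example.

```typescript
// Hypothetical stand-in for a Durable Object namespace binding.
interface DoNamespaceLike<Id> {
  idFromName(name: string): Id;
}

// Builds the `${tenant_id}:${node_id}` name used for routing.
function tenantScopedDoName(tenantId: string, nodeId: string): string {
  if (tenantId.length === 0 || nodeId.length === 0) {
    throw new Error("tenant_id and node_id are both required");
  }
  if (tenantId.includes(":")) {
    // Keeps the first ':' an unambiguous separator.
    throw new Error("tenant_id must not contain ':'");
  }
  return tenantId + ":" + nodeId;
}

function tenantScopedDoId<Id>(
  ns: DoNamespaceLike<Id>,
  tenantId: string,
  nodeId: string,
): Id {
  return ns.idFromName(tenantScopedDoName(tenantId, nodeId));
}
```

Two tenants that happen to enroll the same `node_id` resolve to different names and therefore to different DO instances.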

---

## §6. JWT Claim Changes & Tenant Bootstrap (NORMATIVE — v1.0)

All v2-issued tokens (enroll-token, runtime-token, operator-token) MUST carry:

| Claim | Type   | Notes                                                                    |
|-------|--------|--------------------------------------------------------------------------|
| `tid` | string | UUIDv5 `tenant_id` per §3.2. **Required.** Used by every authz check.    |
| `did` | string | Full DID string. **Required.** Audit/recovery; never used for authz directly. |
| `sub` | string | Subject within the tenant (e.g., `node_id` for runtime tokens). Unchanged semantics. |

A token whose `tid` does not match the tenant that owns the resource being accessed is rejected with `E_TENANT_DENIED` (HTTP 403; new code, see §10). The response does not reveal which tenant owns the resource.

**Tenant bootstrap (G-V2-F, ratified):** v2.0 ships **self-sovereign** only. The DID-holder who completes the `POST /v1/tenants/init` challenge (RFC-0014 §6.1) becomes the implicit owner of `t:<tid>:*` automatically. **No vendor-side `system:*` override path is minted in v2.0.** Managed-tenant flows (vendor-issued tenant-owner tokens for support workflows) are explicitly **deferred to v3.0**; reopening this gate requires a new RFC, not a config change.
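
The claim requirements and the mismatch rule can be sketched as below. The sketch assumes the token signature has already been verified (RFC-0013) and that the payload arrives as a decoded JSON object; `parseV2Claims`, `checkTenant`, and the `E_TOKEN_MALFORMED` code are hypothetical names.

```typescript
interface V2Claims {
  tid: string;
  did: string;
  sub: string;
}

// Validates presence and type of the three required claims.
function parseV2Claims(payload: Record<string, unknown>): V2Claims {
  const { tid, did, sub } = payload;
  if (typeof tid !== "string" || typeof did !== "string" || typeof sub !== "string") {
    throw new Error("E_TOKEN_MALFORMED"); // hypothetical error code
  }
  return { tid, did, sub };
}

type AuthzResult =
  | { ok: true }
  | { ok: false; status: 403; code: "E_TENANT_DENIED" };

// Authorisation compares `tid` only; `did` is carried for audit and recovery.
function checkTenant(claims: V2Claims, resourceTenantId: string): AuthzResult {
  if (claims.tid === resourceTenantId) {
    return { ok: true };
  }
  // The denial carries no detail about the resource or its owner.
  return { ok: false, status: 403, code: "E_TENANT_DENIED" };
}
```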

---

## §7. RBAC Scope Grammar v2

Hierarchical, tenant-prefixed:

```
scope          := tenant_scope | system_scope
tenant_scope   := "t:" tenant_id ":" resource ":" verb_or_star
resource       := "devices" | "tools" | "audit" | "shadow" | "tenants" | "*"
verb_or_star   := "read" | "write" | "invoke" | "admin" | "*"
system_scope   := "system:" verb_or_star          ; gateway-internal only
```

Examples:
- `t:4fa3715e-…:devices:read` — read devices in this tenant
- `t:4fa3715e-…:tools:invoke` — invoke tools in this tenant
- `t:4fa3715e-…:*` — tenant-admin (all resources, all verbs, this tenant only)
- `system:*` — gateway maintenance scripts; **never minted into a user-issued token**

**Forbidden constructions:**
- `t:*:*` — no super-admin across tenants. The grammar explicitly disallows `*` in the `tenant_id` slot.
- Scopes without a tenant prefix (e.g., `devices:read`) are rejected as malformed; legacy v1 scopes never satisfy any v2 check.
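
A parser for this grammar might be sketched as follows. Two points are assumptions and not statements of the RFC: the three-segment form `t:{tenant_id}:*` used in the examples is treated as shorthand for all resources and all verbs, and `system:*` scopes never satisfy a tenant resource check. All function and type names are hypothetical.

```typescript
interface TenantScope {
  kind: "tenant";
  tenantId: string;
  resource: string;
  verb: string;
}
interface SystemScope {
  kind: "system";
  verb: string;
}
type Scope = TenantScope | SystemScope;

const RESOURCES = new Set(["devices", "tools", "audit", "shadow", "tenants", "*"]);
const VERBS = new Set(["read", "write", "invoke", "admin", "*"]);
const UUID_SHAPE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;

function parseScope(raw: string): Scope {
  const parts = raw.split(":");
  if (parts[0] === "system" && parts.length === 2 && VERBS.has(parts[1])) {
    return { kind: "system", verb: parts[1] };
  }
  if (parts[0] === "t" && (parts.length === 3 || parts.length === 4)) {
    const tenantId = parts[1];
    // A wildcard in the tenant_id slot fails here: `t:*:*` is never valid.
    if (UUID_SHAPE.test(tenantId)) {
      const resource = parts[2];
      const verb = parts.length === 4 ? parts[3] : "*";
      const shorthandOk = parts.length === 4 || resource === "*";
      if (shorthandOk && RESOURCES.has(resource) && VERBS.has(verb)) {
        return { kind: "tenant", tenantId, resource, verb };
      }
    }
  }
  throw new Error("malformed scope: " + raw);
}

// True when `granted` covers (tenantId, resource, verb).
function scopeAllows(granted: Scope, tenantId: string, resource: string, verb: string): boolean {
  if (granted.kind !== "tenant") {
    return false; // assumption: system scopes are checked on a separate path
  }
  return (
    granted.tenantId === tenantId &&
    (granted.resource === "*" || granted.resource === resource) &&
    (granted.verb === "*" || granted.verb === verb)
  );
}
```

Legacy scopes such as `devices:read` fall through to the final `throw`, so they cannot be parsed into anything that a v2 check would accept.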

---

## §8. Mock Tenant

A reserved tenant for internal smoke/CI/demo use. Deterministic identity (v1.0, UUIDv5):

```
mock_did       = "did:key:z__MOCK_TENANT__"   ; well-known sentinel string
mock_tenant_id = uuidv5("56bdc01c-052e-5f60-abfb-7fa367b284e3", mock_did)
```

The mock tenant hosts the five standing mock devices (`01kpq9rv…`). It is created idempotently at gateway startup if absent. Production guard: in cloud deployment, the mock tenant MUST be flagged in `tenants.metadata_json` (`{"reserved":"mock"}`); enrollment via the public `POST /v1/tenants/init` MUST reject the mock DID.
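
The production guard could be sketched as two small checks. Function names and the `E_TENANT_RESERVED` code are hypothetical. One property worth noting: `_` is not part of the base58btc alphabet, so the sentinel string can never be a validly encoded `did:key`.

```typescript
const MOCK_DID = "did:key:z__MOCK_TENANT__";

// True when a tenants.metadata_json value carries {"reserved":"mock"}.
function isReservedMock(metadataJson: string): boolean {
  const meta: unknown = JSON.parse(metadataJson);
  if (typeof meta !== "object" || meta === null) {
    return false;
  }
  return (meta as { reserved?: unknown }).reserved === "mock";
}

// Guard for the public POST /v1/tenants/init path.
function assertPublicInitAllowed(did: string): void {
  if (did === MOCK_DID) {
    throw new Error("E_TENANT_RESERVED"); // hypothetical error code
  }
}
```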

---

## §9. OSS Bring-Your-Own-DID

Self-hosters generate an Ed25519 keypair via a new edge subcommand:

```
aiot-edge tenant init --out ~/.aiot/tenant.key
# emits: did:key:z<base58btc(0xed01 || pub)>
# stores private key at the --out path (mode 0600)
```

The resulting `did:key` is the OSS deployment's tenant DID. `tenant_id` is deterministic from the DID, so re-importing the keypair on a fresh OSS install reproduces the same tenant — operators can migrate by carrying the keypair, no vendor coordination needed. The cloud and OSS code paths through the gateway are **identical**; only the DID method differs.
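
The `did:key:z<base58btc(0xed01 || pub)>` encoding can be sketched as below. It shows only the identifier encoding for an already-generated 32-byte Ed25519 public key; key generation and storage are out of scope, and the function names are hypothetical.

```typescript
const BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Plain base58 (Bitcoin alphabet) via repeated multiply-and-carry.
function base58btcEncode(bytes: Uint8Array): string {
  const digits: number[] = []; // little-endian base-58 digits
  for (let i = 0; i < bytes.length; i++) {
    let carry = bytes[i];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  let out = "";
  // Each leading zero byte is encoded as a literal '1'.
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) {
    out += "1";
  }
  for (let j = digits.length - 1; j >= 0; j--) {
    out += BASE58_ALPHABET.charAt(digits[j]);
  }
  return out;
}

// did:key for Ed25519: multicodec prefix 0xed 0x01, then the raw public key,
// base58btc-encoded with the multibase 'z' prefix.
function didKeyFromEd25519(publicKey: Uint8Array): string {
  if (publicKey.length !== 32) {
    throw new Error("Ed25519 public keys are 32 bytes");
  }
  const prefixed = new Uint8Array(34);
  prefixed[0] = 0xed;
  prefixed[1] = 0x01;
  prefixed.set(publicKey, 2);
  return "did:key:z" + base58btcEncode(prefixed);
}
```

Identifiers produced this way begin with `did:key:z6Mk`, matching the shape of worked example B in §3.2.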

---

## §10. Cross-Tenant Enforcement

NORMATIVE patterns the implementation dispatch must follow:

1. **D1 query helper:** every parameterised query goes through `db.scoped(tenant_id).run(sql, params)`. The helper appends `AND tenant_id = ?` to every `WHERE` clause and adds the `tenant_id` column and value to every `INSERT`. Bare `db.prepare(...)` calls outside the helper are a lint failure (eslint custom rule + Rust clippy lint).
2. **KV access:** `kv.scoped(tenant_id).get/put/delete` enforces the `t:{tenant_id}:` prefix; raw `env.KV.get` is a lint failure.
3. **R2 access:** `r2.scoped(tenant_id).put/get/delete` enforces the `t/{tenant_id}/` prefix.
4. **DO routing:** `tenantScopedDoId(env, tenant_id, node_id)` is the only sanctioned constructor.
5. **Mismatch handling:** when an authenticated subject's `tid` claim does not match the resource owner, return `E_TENANT_DENIED` (HTTP 403, audit-logged, no resource detail in the response body). `E_TENANT_DENIED` is a new error code introduced by this RFC; it is **additive** to existing error surfaces (RFC-0014 §4).
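
Pattern 1 can be illustrated with a statement builder. This is a sketch with a deliberately different shape from `db.scoped(tenant_id).run(sql, params)`: it composes statements from structured input instead of rewriting arbitrary SQL, and the names `scopedSelect`, `scopedInsert`, and `ScopedStatement` are hypothetical.

```typescript
interface ScopedStatement {
  sql: string;
  params: unknown[];
}

// Table and column names are interpolated, so they are restricted to identifiers.
const IDENT = /^[a-z_][a-z0-9_]*$/;

function scopedSelect(
  tenantId: string,
  table: string,
  where: string,
  params: unknown[],
): ScopedStatement {
  if (!IDENT.test(table)) {
    throw new Error("invalid table name");
  }
  // Parenthesise the caller's predicate so an OR cannot escape the tenant filter.
  const predicate =
    where.trim() === "" ? "tenant_id = ?" : "(" + where + ") AND tenant_id = ?";
  return {
    sql: "SELECT * FROM " + table + " WHERE " + predicate,
    params: [...params, tenantId],
  };
}

function scopedInsert(
  tenantId: string,
  table: string,
  row: Record<string, unknown>,
): ScopedStatement {
  if (!IDENT.test(table)) {
    throw new Error("invalid table name");
  }
  if ("tenant_id" in row) {
    throw new Error("tenant_id is supplied by the helper, not the caller");
  }
  const cols = Object.keys(row);
  for (const col of cols) {
    if (!IDENT.test(col)) {
      throw new Error("invalid column name");
    }
  }
  const allCols = ["tenant_id", ...cols];
  const placeholders = allCols.map(() => "?").join(", ");
  return {
    sql: "INSERT INTO " + table + " (" + allCols.join(", ") + ") VALUES (" + placeholders + ")",
    params: [tenantId, ...cols.map((c) => row[c])],
  };
}
```

The parentheses around the caller's predicate matter: without them, `a = ? OR b = ?` followed by `AND tenant_id = ?` would bind the tenant filter to the second term only.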

---

## §11. Migration from v1 (NORMATIVE — v1.0, supersedes draft)

**Strategy: drop+recreate (G-V2-C, ratified).** No backfill, no `NULL`-permissive interim, no in-place `ALTER`.

- The v1 D1 schema is dropped at v2 cutover and recreated under §5. Mock-data loss (5 standing mock devices, ~30 audit rows) is acceptable per CTO ratification of G-V2-C; mock tenant re-enrolls cleanly via the W4 re-enrollment script.
- ☁️ cloudflare-native-edge has already executed the drop+recreate on the dev D1 (`manifests-dev`) as part of W1 D1 schema work. Prod D1 (`manifests-prod`) will be cut over at the implementation-window close per the deprecation timeline (RFC-0014 §8).
- v1 endpoints (RFC-0001/0002/0007 surfaces): deprecated 2026-05-31; return HTTP 410 Gone with `Link: <https://aethermesh.app/migrate-v2>; rel="help"` per RFC-0014 §5.
- No live customer data exists today; no customer-side migration required.

---

## §12. Open Questions — Resolved at v1.0 Ratification

1. **Soft-delete grace period:** Tenant `DELETE` endpoint is **OPTIONAL** at v1.0 per descope ladder (G-V2-G change-log note). Soft-delete + grace-period sweeper deferred to Sprint 4. Implementations MAY omit the `DELETE /v1/tenants/me` route entirely; if implemented, it MUST follow the §3.3 `soft_deleted → purged` lifecycle with a grace period selected by the operator (recommendation: 30d for hosted, 0d for OSS).
2. **Per-tenant quotas:** Counters ship in KV at v1.0; enforcement deferred to v3.0.
3. **`tenant_id` UUID-version bits:** **RESOLVED — UUIDv5** per §3.2 (G-V2-D, ratified). The raw-hash construction is rejected.
4. **DID-method allowlist enforcement point:** Worker-side enforcement is normative; edge-side `aiot-edge tenant init` CLI lint is RECOMMENDED but not blocking for v1.0.
5. **Tenant-admin bootstrap:** **RESOLVED — self-sovereign** per §6 (G-V2-F, ratified). Managed-tenant deferred to v3.0.
6. **Cross-tenant OSS contributor model:** v1.0 stance is one DID = one tenant. Multi-tenant ownership by a single human DID is a v3.0 concern; out of scope here.
7. **Reserved tenant namespace beyond mock:** v1.0 reserves only `mock`. Additional reserved DIDs assigned on demand by amendment.

---

## §13. Dependency Graph

```
RFC-0012 (this) ──depends on──▶ RFC-0013 (hybrid sig spec for DID proof-of-control & JWT alg)
RFC-0014        ──depends on──▶ RFC-0012 + RFC-0013
RFC-0001/0002/0003 ──require v3.0 amendments after this bundle ratifies (see References)
```

## §14. References

- W3C DID Core 1.0 — https://www.w3.org/TR/did-core/
- CAIP-10 (Account ID Specification) — https://chainagnostic.org/CAIPs/caip-10
- W3C did:key Method — https://w3c-ccg.github.io/did-method-key/
- RFC 4122 (UUID) §4.3 (UUIDv5 namespace-name SHA-1 hash) — https://www.rfc-editor.org/rfc/rfc4122
- RFC-0013 v1.0 RATIFIED 2026-05-03 (hybrid PQC spec) — same dispatch.
- RFC-0014 v1.0 RATIFIED 2026-05-03 (v2 wire format) — same dispatch.
- `tracking/sprint-3-extended-v2-plan.md` §2 (G-V2-A..G), §8 (CTO sign-off) — ratification source of record.
- RFC-0001 v1.3, RFC-0002 v2.2, RFC-0003 v1.3 — **will require v3.0 amendments** after this bundle ratifies (envelope `tid` field; subprotocol `aethermesh.v2`; scope grammar v2 respectively). Amendments are out of scope for this RFC.
