# RFC-0017 — D1 Schema & Migration Contract

- **Status:** **Ratified v1.0 — 2026-05-05 (CEO fast-track, per RFC-0015/0016 erratum-bundle precedent).** Normative for all `workers/mcp-gateway/migrations/*.sql` from migration 0007 onward and for any back-fill touching ≤ 0006.
- **Date:** 2026-05-05
- **Author:** 🧠 Agentic Architect
- **Sprint:** Sprint 3 Extended — Wave A unblocker for W3-6 (`runtime_token_audit`, RFC-0016 v1.1 §lifecycle).
- **Audience:** Cloud Agents (LLMs); ☁️ cloudflare-native-edge (primary consumer); 🦀 edge-kubelet-engineer; 🛡️ devex-protocol-sec; CTO chair.
- **Supersedes:** None.
- **References:** RFC-0001 (MCP-JSON v1, append-only audit clause), RFC-0012 (Tenant Model — tenant_id is the partition key of the v2 stack), RFC-0016 v1.1 (Runtime-Token Refresh — §lifecycle defines the audit table this contract governs).
- **Forward-references:** RFC-0018 (Cross-Region Signing-Key Replication & per-PoP `kid` Strategy) — placeholder, see §7.
- **Out of scope:** Application-layer Zod validation, KV/R2/DO storage conventions, observability schemas, RBAC claim shapes (RFC-0003 governs).

---

## §1. Motivation

The v2 multi-tenant schema (migration 0006) was hand-rolled under W1 time pressure. It got the big rocks right (composite PK with `tenant_id` first on `devices`, UUIDv5-from-DID tenant IDs) but left several conventions implicit. W3-6 is about to write migration 0007 introducing `runtime_token_audit` (RFC-0016 v1.1 §lifecycle), and without a normative contract the cloud-native team will hand-roll DDL conventions that diverge from 0006. We will then pay for that drift forever, because D1 migrations are append-only by RFC-0001.

This RFC locks in **six normative rules + one architectural ruling** covering migration ordering, tenant-scoping, indexing, audit-table shape, type discipline, PR process, and `kid` strategy. It introduces **no new tools, schemas, or wire formats** — DDL and process only.

---

## §2. Migration File Naming & Ordering (NORMATIVE)

### §2.1 Filename grammar

```
NNNN_snake_case_summary.sql
```

- `NNNN` is a **zero-padded 4-digit decimal** sequence number, monotonically increasing, **no gaps**.
- `snake_case_summary` is `[a-z0-9_]+`, ≤ 48 chars, describes the change in present tense (e.g. `runtime_token_audit`, not `add_runtime_token_audit`).
- Total filename ≤ 64 chars; tokenizer-stable for LLM agents.

### §2.2 Ordering invariant

Migrations apply in lexicographic order, which under §2.1 equals numeric order. Tooling (CI guard, see §6.4) MUST reject any PR that introduces:

- a non-monotonic `NNNN` (e.g. `0008` when `0007` does not yet exist),
- a duplicate `NNNN`,
- a filename not matching the grammar.

### §2.3 Append-only invariant

Per RFC-0001, migrations are append-only. **A merged migration file is immutable.** Errata are expressed as a new migration with a higher number (e.g. `0009_fix_runtime_token_audit_index.sql`), never as an in-place edit to a prior file. The only exception is comment-only typo fixes that do not alter parsed SQL — even those require CTO sign-off in the PR body.

### §2.4 The `_migrations.lock` artifact

Every PR that adds a migration MUST update `workers/mcp-gateway/migrations/_migrations.lock`, a plaintext file containing one line per migration:

```
NNNN  <sha256-of-file>  <filename>
```

sorted by `NNNN` ascending. The PR body MUST quote the new file's SHA-256 (see §6 checklist). CI verifies the lockfile matches the on-disk migrations.

---

## §3. Tenant-Scoping Invariant (NORMATIVE)

### §3.1 The rule

Every **business table** (defined in §3.3) MUST carry `tenant_id TEXT NOT NULL` and that column MUST be the **leading column of the primary key**, either as the sole PK or as the first column of a composite PK.

Canonical example (migration 0006, `devices` table):

```sql
PRIMARY KEY (tenant_id, node_id)
```

### §3.2 Rationale

D1 is SQLite-on-the-edge; range scans on the leading PK column are O(log N) seek + sequential read. Putting `tenant_id` first means **every per-tenant query — without exception — uses the leading edge of the primary index**. Any other ordering re-introduces the per-tenant-table-scan failure mode that v2 was designed to eliminate.

Foreign-key edges to `tenants(tenant_id)` are RECOMMENDED but D1 does not enforce them by default; the invariant is the PK shape, not the FK declaration.

### §3.3 Exemption list (CLOSED enum)

The following tables are exempt because they are **system-scope** (not tenant-owned data):

| Table | Owner | Justification |
|---|---|---|
| `_migrations` | D1 runtime | Migration tracker; D1-managed. |
| `d1_migrations` | D1 runtime | Same as above; legacy name. |
| `sqlite_sequence` | SQLite runtime | AUTOINCREMENT counter. |
| `_cf_KV` | Cloudflare runtime | KV shadow; CF-managed. |
| `tenants` | Gateway control plane | The tenant registry itself; `tenant_id` IS the PK. |

Any future system table additions to this list require an RFC-0017 amendment.

### §3.4 Application-layer NOT NULL enforcement

Where a transitional column was declared nullable (e.g. `enroll_audit.tenant_id` in migration 0006 pending the W3 tenant-init endpoint), the application layer MUST reject writes with NULL `tenant_id` once the upstream resolver lands. The next migration touching that table MUST tighten the column to `NOT NULL` and back-fill any orphan rows under a `tenant_id = '00000000-0000-5000-8000-000000000000'` reserved system-orphan UUID. (Reserved UUIDv5 sentinel; never issued to a real tenant.)

---

## §4. Indexing Convention (NORMATIVE)

### §4.1 The rule

Every secondary index on a business table MUST include `tenant_id` as its **leading column**. Bare single-column indexes on tenant-scoped fields (e.g. `node_id`, `last_seen_ms`, `ts_ms`) are **forbidden**.

✅ Conformant:

```sql
CREATE INDEX idx_devices_last_seen ON devices(tenant_id, last_seen_ms);
```

❌ Non-conformant:

```sql
CREATE INDEX idx_devices_node ON devices(node_id);              -- bare node_id
CREATE INDEX idx_audit_ts    ON audit_log(ts_ms);               -- bare ts_ms
```

### §4.2 Rationale

A bare `node_id` index lets a malicious or buggy query path reach across tenants in O(log N). The PK invariant (§3) is necessary but not sufficient: secondary indexes are the other half of the partitioning story.

### §4.3 Allowed exceptions

- **Hash-lookup indexes** on globally-unique cryptographic digests (e.g. `enroll_token_sha256`, `public_key_sha256`, `prev_audit_hash`) MAY omit the leading `tenant_id` column **only if** the indexed column is documented as cryptographically unique across tenants (≥ 256 bits of entropy from a CSPRNG or hash function). Such indexes MUST carry an inline SQL comment citing this clause, e.g. `-- RFC-0017 §4.3 exception: SHA-256 is globally unique`.
- **System tables** (§3.3) are exempt from this section in their entirety.

---

## §5. Audit-Table Contract (NORMATIVE)

All tables matching the suffix `_audit` MUST conform to §5.1–§5.3. RFC-0016 v1.1's `runtime_token_audit` (migration 0007) is the first table written under this contract.

### §5.1 Mandatory columns (CLOSED set, in declared order)

```sql
CREATE TABLE <name>_audit (
  audit_id          TEXT    NOT NULL PRIMARY KEY,  -- ULID, 26-char Crockford lowercase
  tenant_id         TEXT    NOT NULL,              -- §3 invariant
  event_at          INTEGER NOT NULL,              -- unix seconds (NOT ms; see §6)
  actor_did         TEXT    NOT NULL,              -- DID of acting principal; 'did:internal:gateway' for gw-initiated
  event_type        TEXT    NOT NULL,              -- closed enum, table-specific
  payload_json      TEXT    NOT NULL,              -- CHECK(json_valid(payload_json)); see §6.3
  prev_audit_hash   TEXT    NOT NULL,              -- sha256-hex of previous row's canonical bytes; '0'*64 for genesis row per (tenant_id)
  -- table-specific columns MAY follow; none of the above MAY be omitted or renamed.
  CHECK (json_valid(payload_json))
);

-- Mandatory composite PK / scan index per §3 + §4:
CREATE INDEX idx_<name>_audit_tenant_time ON <name>_audit(tenant_id, event_at);
```

Notes:

- `audit_id` is a **TEXT ULID**, never `INTEGER PRIMARY KEY AUTOINCREMENT` (see §6.1). The ULID's time prefix gives natural ordering; the random suffix prevents enumerability.
- The PK is `audit_id` (globally unique under §4.3 exception); the per-tenant scan path is the `idx_<name>_audit_tenant_time` index.
- `prev_audit_hash` is computed **per `tenant_id` chain**, not globally. The genesis row of each tenant's chain uses `'0' * 64`.
- Hash canonicalization: `sha256_hex( audit_id || '\x1f' || tenant_id || '\x1f' || event_at_ascii || '\x1f' || actor_did || '\x1f' || event_type || '\x1f' || payload_json_canonical )`. `payload_json_canonical` is JCS (RFC 8785). Records the gateway commit, not network bytes.

### §5.2 Forbidden columns

- `source_ip` (plaintext) — replaced by `payload_json.source_ip_hash` per Sprint 3 P2-7 precedent.
- `INTEGER PRIMARY KEY AUTOINCREMENT` — see §6.1.
- Any column whose name shadows a §5.1 column with different semantics.

### §5.3 Retention policy hooks

Every audit table MUST have a documented retention class declared in the migration's leading comment block, drawn from this CLOSED enum:

| Class | TTL | Eviction owner |
|---|---|---|
| `forensic_long` | 365 d | gateway nightly cron, deletes WHERE `event_at < unixepoch() - 31536000` |
| `forensic_short` | 90 d | gateway nightly cron, 7776000 s |
| `operational` | 30 d | gateway nightly cron, 2592000 s |
| `transient` | 7 d | gateway nightly cron, 604800 s |

The eviction job itself is out of scope for this RFC; the contract is that each audit table picks exactly one class. RFC-0016 v1.1 `runtime_token_audit` is `forensic_short` (90 d) per its §lifecycle.

---

## §6. Forward-Compatible Type Rules (NORMATIVE)

### §6.1 Identifiers

- All **tenant-visible identifiers** (anything that may appear in an MCP tool argument, JWT claim, log line, or external API response) MUST be `TEXT`. Closed enum of allowed shapes:
  - **UUIDv5** (lowercase hex with dashes) — for derived IDs (e.g. `tenant_id` from DID).
  - **ULID** (26-char Crockford lowercase) — for time-ordered IDs (e.g. `node_id`, `audit_id`).
  - **DID string** — for principal identifiers.
  - **SHA-256 hex** (64 lowercase hex chars) — for content-addressed digests.
- `INTEGER PRIMARY KEY AUTOINCREMENT` is **forbidden** for tenant-visible IDs. It leaks row counts, collides across regions if D1 ever sharded, and is non-portable.
- `INTEGER PRIMARY KEY AUTOINCREMENT` remains acceptable **only** for purely-internal book-keeping rows that never escape the gateway boundary (none currently exist; the invariant is "if in doubt, use TEXT ULID").

### §6.2 Timestamps

- **New columns** MUST use `INTEGER` storing **unix seconds** (UTC), column name suffix `_at`. Example: `created_at`, `event_at`, `expires_at`.
- The **legacy `_ms` suffix** (epoch-milliseconds) is grandfathered for migrations 0001–0006. New migrations MUST NOT introduce new `_ms` columns; if sub-second precision is genuinely required, declare a separate `<name>_us` column (microseconds) with explicit justification in the migration comment.
- Default literal: `unixepoch()` (D1 SQLite ≥ 3.38). Avoid `strftime('%s','now')` in new code; both are equivalent but `unixepoch()` is type-stable INTEGER.

### §6.3 Structured columns

- Any column holding JSON MUST be declared `TEXT` and carry a table-level `CHECK(json_valid(<col>))` constraint. The constraint MUST be present in the same `CREATE TABLE` statement (no later `ADD CONSTRAINT` — D1 does not support it).
- Column name MUST end in `_json` (e.g. `payload_json`, `metadata_json`).
- Schemas for the JSON content live in the application layer (Zod) and MUST be referenced by name in the migration's leading comment block.

### §6.4 Booleans

D1/SQLite has no native BOOLEAN. Use `INTEGER` with a `CHECK(<col> IN (0,1))`, column name prefix `is_` or `has_`. Forbidden: `TEXT` columns storing `'true'`/`'false'`.

---

## §7. Architectural Ruling — `kid` Strategy for Sprint 3 (NORMATIVE)

**Question (carried over from §phase report):** With the gateway about to deploy across multiple Cloudflare PoPs, MUST the JWS `kid` be issuer-global (one `kid` for all PoPs) or MAY each PoP mint its own (`kid` = `gw-sig-1-iad`, `gw-sig-1-fra`, …)?

**Ruling:** For Sprint 3, `kid` MUST be **issuer-global**. The current production value `kid = "gw-sig-1"` (RFC-0015 §2.2 `verificationMethod.id` fragment) is locked in as the single signing-key identifier across all PoPs. The signing private key is, for Sprint 3, one logical key replicated into each PoP's signer (key-replication mechanics — KMS, secret-binding, rotation atomicity — are deferred to RFC-0018). Per-PoP `kid`s are explicitly **forbidden** until RFC-0018 ratifies the cross-region kid taxonomy and updates RFC-0016 §6.5.1 sub/kid binding rules to tolerate kid-set semantics. Rationale: with per-PoP `kid`s today, a runtime-token issued at PoP-A and refreshed at PoP-B would carry a different `kid` than the device's pinned value, causing RFC-0016 §6.5.1 to reject the refresh mid-connection — a P0 footgun. One-kid-many-keys is *not* a long-term answer (it complicates rotation), but it is the correct Sprint-3 answer because it preserves the binding contract while RFC-0018 designs the proper namespace.

**Reserved column convention (forward-compat):** any new table whose rows are signed or carry signed material MUST reserve a `region_key_id TEXT` column (NULL-allowed for Sprint 3, default NULL). RFC-0018 will define its values; until then, writers leave it NULL and readers ignore it. This avoids a schema migration churn when RFC-0018 lands.

---

## §8. Migration-PR Checklist (NORMATIVE)

Every PR adding or modifying migrations MUST include, in the PR body, the following (verbatim section headings, parseable by CI):

```
## Migration PR Checklist (RFC-0017 §8)
- [ ] SQL file: `migrations/NNNN_<summary>.sql` (matches §2.1 grammar)
- [ ] _migrations.lock updated; new SHA-256: <64-hex>
- [ ] Tenant-scoping (§3): <conformant | exempt-with-citation>
- [ ] Indexing (§4): <conformant | §4.3-exception-cited>
- [ ] Audit-table (§5): <N/A | conformant, retention class = ...>
- [ ] Type rules (§6): conformant
- [ ] Test fixture updated: <path/to/fixture> OR <"none affected">
- [ ] Rollback plan: <inverse-DDL summary> OR `DESTRUCTIVE — no rollback`
      (if DESTRUCTIVE: cite CEO break-change authorization, e.g.
       "CEO 2026-05-05: 'drop all d1 data' is fine")
- [ ] Affected RFCs cross-linked
```

A missing or unchecked item is a CI-blocking failure. The CTO orchestrator MAY waive at-most-one item per PR with explicit comment; §2.3 (append-only) is **non-waivable**.

---

## §9. Conformance Sweep — Migrations 0001–0006

Audited 2026-05-05 against §§2–6 of this RFC.

| Migration | Status | Notes |
|---|---|---|
| `0001_devices.sql` | ✅ superseded | Pre-tenant v1; entirely dropped + recreated by 0006. No live drift. |
| `0002_audit.sql` | ⚠️ superseded-with-drift | `audit_log` lacks `tenant_id` (PK violation §3) and has bare `idx_audit_node ON audit_log(node_id)` (§4 violation). 0006 recreates the table with the same shape and an explicit `TODO(W2)` comment — drift carried forward. **Follow-up: migration 0008 MUST add `tenant_id` to `audit_log` and replace bare-`node_id` index with `(tenant_id, node_id)`. Tracking item created.** |
| `0003_enroll_audit.sql` | ✅ superseded | Replaced by 0006's `enroll_audit`. No live drift. |
| `0004_enroll_pubkey.sql` | ✅ conformant | Pure `ALTER ADD COLUMN`; column was rolled into 0006's `devices`. |
| `0005_enroll_audit_hash.sql` | ✅ conformant | Pure `ALTER ADD COLUMN` + index; index `idx_enroll_audit_ip_hash` is a globally-unique-digest index covered by §4.3 exception. **Follow-up: 0008 MUST add the inline §4.3-citation comment to the existing CREATE INDEX in 0006's recreation.** |
| `0006_v2_tenant_schema.sql` | ⚠️ partial drift | `tenants` & `devices` fully conformant. `enroll_audit`: `tenant_id` is nullable (§3.4 transitional — acceptable until W3 tenant-init lands, then MUST tighten); indexes `idx_enroll_audit_ts`, `idx_enroll_audit_token_h`, `idx_enroll_audit_ip`, `idx_enroll_audit_ip_hash` do not lead with `tenant_id` — first two are §4.3 exceptions (SHA-256 globally unique), latter two are drift. `audit_log`: see 0002 row above — same drift, carried forward intentionally with TODO. `enroll_audit` does **not** yet conform to §5 audit-table contract; it predates this RFC and is grandfathered as a legacy `_audit` table — **the §5 contract applies only to audit tables created from migration 0007 onward.** |

**Required follow-up migration 0008** (ownership: ☁️ cloudflare-native-edge, due W4):

- Add `tenant_id` to `audit_log`, recreate PK as `(tenant_id, id)`, drop bare `idx_audit_node`, add `idx_audit_log_tenant_time(tenant_id, ts_ms)` and `idx_audit_log_tenant_node(tenant_id, node_id)`.
- Drop bare `idx_enroll_audit_ts` and `idx_enroll_audit_ip`; replace with `(tenant_id, ts_ms)` and `(tenant_id, source_ip_hash)`.
- Add §4.3-citation comments to surviving SHA-256 indexes.

Migration 0008 is a normal (non-fast-track) PR; it does not block 0007.

---

## §10. Handoff

- **Next persona:** ☁️ cloudflare-native-edge.
- **Next artifact:** `workers/mcp-gateway/migrations/0007_runtime_token_audit.sql` strictly conforming to §§2–6, with the `runtime_token_audit` table satisfying §5 (retention class = `forensic_short`, 90 d) and the §8 PR checklist.
- **Then:** 🦀 edge-kubelet-engineer is unblocked to wire RFC-0016 v1.1 §lifecycle writes against the new table.
- **Deferred:** RFC-0018 (cross-region kid + signing-key namespace) — owned by 🧠 Agentic Architect, scheduled Sprint 3 Wave B.

---

## Why this shape (justification table)

| Decision | Why |
|---|---|
| `tenant_id` first in PK, not just present | D1 is SQLite; only the leading PK column gets the seek-then-scan path. Any other position re-introduces table scans. |
| ULID `audit_id`, not AUTOINCREMENT | AUTOINCREMENT leaks row counts cross-tenant and is non-portable across any future D1 sharding. ULID time-prefix preserves order without leakage. |
| Hash chain per `tenant_id`, not global | A global chain forces cross-tenant write serialization at the gateway — a tail-latency footgun and a tenancy bleed. Per-tenant chains are independent. |
| `unixepoch()` seconds, not `_ms` | RFC-0016 lifecycle works in seconds; ms columns in 0001-0006 are a legacy mistake (mixed units in joins). New code unifies on seconds. |
| §4.3 SHA-256 exception | Refusing it would mean prefixing 32-byte hashes with `tenant_id` for no lookup benefit (the hash is already globally unique) and would bloat the index. Citation requirement keeps the exception auditable. |
| Issuer-global `kid` for Sprint 3 | RFC-0016 §6.5.1 binds device-pinned `kid` to refresh acceptance. Per-PoP `kid` would cause cross-PoP refresh failures mid-connection. RFC-0018 will lift this without a schema migration thanks to reserved `region_key_id`. |
| `_migrations.lock` SHA in PR body | Defends against silent post-merge edits to "immutable" migration files and gives reviewers a single byte-level checkpoint. |

---

## Changelog

- **v1.0 — 2026-05-05.** Initial draft → fast-track ratification in same PR (CEO-authorized; precedent: RFC-0015/0016 erratum bundle). §7 `kid` ruling locked in pending RFC-0018.
