3 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
a17107f3df chore(vault-cli): update d1 schema + codegen hashes for schema v1.0
- d1-schema.sql: regenerated to match compiler.py changes. Adds
  competency_area, bloom_level, phase, human_review_* columns to
  questions table. Adds idx_questions_human_review index.
  chain_questions PK changes from (chain_id, position) to
  (chain_id, question_id) for multi-chain + non-contiguous support.
  Drops deep_dive_title/deep_dive_url.
- codegen-hashes.txt: new baseline covering the v1.0 models.py,
  d1-schema.sql, and @staffml/vault-types/index.ts.

Fixes the vault codegen --check drift test that was failing CI.
2026-04-21 18:24:21 -04:00
Vijay Janapa Reddi
42f4d1ca8b fix(vault): Round-3 correctness + vault ship + authoring contract
Round-3 review (4 reviewers on v2.1) surfaced two code-correctness
Criticals that this commit fixes, plus the contracted-but-missing
`vault ship` coordinator and David's authoring-UX gaps.

Critical fixes (real bugs in landed code):

worker/src/index.ts
- SCHEMA_FINGERPRINT placeholder now fails closed (Chip R3-C1 / Dean
  R3-NH-3). Was: placeholder auto-passed and silently disabled the
  fingerprint check. Now: placeholder forces degraded mode until the
  operator sets a real fingerprint.
- DDL hash now includes triggers (FTS5-aware).
- release_id change invalidates schema-fingerprint memoization
  (Dean R3-NH-4).
- wrangler.toml now pins the real fingerprint.
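
A minimal sketch of the fail-closed check described above, written in Python for consistency with vault-cli (the worker itself is TypeScript). It assumes the fingerprint is a SHA-256 over the table, index, and trigger DDL in sqlite_master; the PLACEHOLDER value and the names ddl_fingerprint and check_mode are hypothetical, not taken from the worker.

```python
import hashlib
import sqlite3

# Hypothetical placeholder value; the real one in wrangler.toml is not shown here.
PLACEHOLDER = "REPLACE_ME"


def ddl_fingerprint(conn: sqlite3.Connection) -> str:
    """Hash table, index, and trigger DDL from sqlite_master in a stable order."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL AND type IN ('table', 'index', 'trigger') "
        "ORDER BY type, name"
    ).fetchall()
    digest = hashlib.sha256()
    for kind, name, sql in rows:
        digest.update(f"{kind}\x00{name}\x00{sql}\n".encode("utf-8"))
    return digest.hexdigest()


def check_mode(expected: str, actual: str) -> str:
    """Return 'ok' only on an exact match; an unset or placeholder value never passes."""
    if not expected or expected == PLACEHOLDER:
        return "degraded"
    return "ok" if expected == actual else "degraded"
```

Because triggers are part of the hashed rows, adding or changing a trigger changes the fingerprint, which is the property the "DDL hash now includes triggers" bullet relies on.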

staffml/public/sw.js
- /manifest polling TTL-throttled to 5min (Chip R3-C2). Was:
  per-request fetch nullified the §10.4 cost model.
- API origin persisted to IndexedDB; rehydrated on activate so cold
  offline wake-ups serve cached content (Chip R3-H3).
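
The TTL throttle on /manifest polling can be illustrated with a small sketch. This is Python for illustration only (sw.js is JavaScript), and ThrottledFetcher, its fetch callable, and the injectable clock are hypothetical names introduced here.

```python
import time
from typing import Any, Callable, Optional


class ThrottledFetcher:
    """Call fetch() at most once per TTL window; otherwise reuse the last value."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 300.0,  # 5 minutes, matching the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._fetched_at: Optional[float] = None
        self._value: Any = None

    def get(self) -> Any:
        now = self._clock()
        # Refetch only on the first call or once the TTL has fully elapsed.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Injecting the clock keeps the throttle testable without sleeping.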

vault-cli/src/vault_cli/release.py
- emit_migrations diffs all 4 tables via PRAGMA-driven column
  introspection (Dean R3-NC-1 + R3-NH-2). Was: only questions table,
  silently missing chains/chain_questions/tags. Rollback-symmetry
  test extended to populate + verify all tables.
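
A rough sketch of PRAGMA-driven column introspection across the four tables, assuming plain SQLite databases for the old and new releases. The function names are hypothetical and the real emit_migrations logic is not shown in this commit message; this covers only detection of added columns.

```python
import sqlite3
from typing import Dict

# The four tables named in the commit message.
TABLES = ("questions", "chains", "chain_questions", "tags")


def columns(conn: sqlite3.Connection, table: str) -> Dict[str, str]:
    """Map column name to declared type using PRAGMA table_info."""
    # PRAGMA arguments cannot be bound as parameters; callers pass names
    # from the fixed TABLES tuple only.
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def added_columns(
    old: sqlite3.Connection, new: sqlite3.Connection
) -> Dict[str, Dict[str, str]]:
    """Return, per table, the columns present in `new` but missing from `old`."""
    diff: Dict[str, Dict[str, str]] = {}
    for table in TABLES:
        before, after = columns(old, table), columns(new, table)
        added = {name: kind for name, kind in after.items() if name not in before}
        if added:
            diff[table] = added
    return diff
```

Iterating over every table in TABLES, rather than only questions, is the point of the fix: a column added to tags or chains shows up in the diff as well.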

vault-cli/src/vault_cli/commands/release.py
- vault verify --git-ref reconstructs release from 'git archive <ref>'
  into a tempdir (Dean R3-NC-2). Was: always rebuilt from HEAD, so
  verifying a historical release always failed post-authoring.
  Academic-citability contract (C-3) now actually holds.

vault-cli/src/vault_cli/ship.py (NEW)
- vault ship: composed verb with journaling (Dean R3-NH-1):
  * Legs run D1 → Next.js → paper-tag-last (§6.1.1 ordering).
  * Journal at releases/<v>/.ship-journal.json records per-leg state;
    --resume continues interrupted ships idempotently.
  * Pre-paper failure auto-rolls back in reverse order.
  * Paper-leg failure pages operator; does NOT auto-rollback earlier
    legs (git tag is remote-durable per §6.1.1).
- 4 unit tests cover happy path, pre-paper failure auto-rollback,
  paper-leg needs-manual, --resume across interruptions.
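
The journaling behavior above can be sketched as follows. This is an illustration under stated assumptions, not the ship.py implementation: the leg names, the journal layout, and the run/rollback callables are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict

# Leg order from the commit message: D1, then Next.js, then the paper tag last.
LEGS = ("d1", "nextjs", "paper")


def ship(
    journal_path: Path,
    run: Dict[str, Callable[[], None]],
    rollback: Dict[str, Callable[[], None]],
) -> str:
    """Run legs in order, journaling per-leg state so a rerun can resume."""
    journal = json.loads(journal_path.read_text()) if journal_path.exists() else {}
    for index, leg in enumerate(LEGS):
        if journal.get(leg) == "done":
            continue  # resume: skip legs that already completed
        try:
            run[leg]()
        except Exception:
            if leg == "paper":
                # Paper-leg failure: flag for the operator, keep earlier legs.
                journal[leg] = "needs-manual"
                journal_path.write_text(json.dumps(journal, indent=2))
                return "needs-manual"
            # Pre-paper failure: undo completed legs in reverse order.
            journal[leg] = "failed"
            for done in reversed(LEGS[:index]):
                rollback[done]()
                journal[done] = "rolled-back"
            journal_path.write_text(json.dumps(journal, indent=2))
            return "rolled-back"
        journal[leg] = "done"
        journal_path.write_text(json.dumps(journal, indent=2))
    return "shipped"
```

Writing the journal after every leg means an interrupted process leaves an accurate record on disk, so calling ship again with the same journal path skips finished legs.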

vault-cli/src/vault_cli/commands/authoring.py
- vault new appends to id-registry.yaml (David R3-H3 + C-5
  enforcement); `git pull --rebase` before allocation.
- authors: auto-populated from git config user.email (David R3-H4 /
  M-15). Was: field never set.
- vault edit injects validation-error comment block at top of YAML
  and re-opens up to --retries=3 times (David R3-H1). Was: terminal
  traceback mid-authoring session.
- vault move refuses a dirty tree, a chained question, or an excluded
  cell per the applicability matrix (David R3-H2). Was: unchecked git mv.
- vault renumber command (NEW): post-rebase seq-collision recovery.
  Bumps seq, renames file, updates id field, appends registry
  (David R3-N-2, was spec-only).
- vault mark-exemplar command (NEW): promotes to vault/exemplars/
  with provenance + human_reviewed_at gate (David R3-N-9).

vault-cli/src/vault_cli/compiler.py
- FTS5 virtual table + sync triggers added to DDL (B.5). Triggers
  keep questions_fts in sync via AFTER INSERT/UPDATE/DELETE.
  schema_fingerprint accounts for triggers now.

tests/test_hashing.py
- Nested-dict hash-stability fixture (Soumith R3-F-4). Was: test
  only reordered top-level keys + collapsed details to one key.

All 28 tests pass (22 → 28: +4 ship journaling, +1 multi-table
migration symmetry, +1 nested-dict hash stability). release_hash
unchanged at 1b304282... — FTS5 addition doesn't affect content
Merkle per §3.5 input-only design.
2026-04-16 13:10:16 -04:00
Vijay Janapa Reddi
d8f6abae4b feat(worker): Phase 3 D1 worker scaffold + shared types package
Phase 3 is CODE-COMPLETE; actual D1 creation + Worker deployment
require authenticated Cloudflare credentials (user action gate per
kickoff stop-conditions).

staffml-vault-worker/
  wrangler.toml            — DB binding, CORS allowlist, TTL env vars,
                             SCHEMA_FINGERPRINT placeholder,
                             GRACE_WINDOW_SECONDS for cross-release
                             serving.
  src/index.ts             — 6 endpoints (manifest, questions, questions/:id,
                             search, stats) with ETag + cursor pagination +
                             SWR Cache-Control + CORS.
  src/types.ts             — Env binding + row shapes.
  README.md                — deploy-day runbook.

Key v2.1 behaviors wired:
- X-Vault-Release is INFORMATIONAL (not hard-reject) — worker serves
  from release_metadata.release_id; header is SLI signal only. Fixes
  Soumith H-NEW-2 local-dev + SWR revalidation brownout.
- schema_fingerprint cold-start check hashes actual sqlite_master DDL
  (not metadata-vs-metadata, closes Dean N-4). On mismatch: Cache-API
  read-only mode with X-Vault-Degraded header, never 5xx (closes
  Chip N-H1 total-outage risk).
- Cache keys include release_id → deploy-time atomic POP
  invalidation (H-14).
- ETag format: '<release_id>:q:<content_hash>' — 304 support
  (Soumith H-NEW-2).
- Cursor pagination via opaque base64 {offset, filter_hash} tokens
  (H-20). Clients never construct cursors.
- CORS allowlist from wrangler var; no wildcard in prod.
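
A possible shape for the opaque cursor tokens, sketched in Python for illustration (the worker is TypeScript). The JSON payload layout, the truncated SHA-256 filter hash, and the function names are assumptions; only the {offset, filter_hash} fields and the base64 encoding come from the message above.

```python
import base64
import hashlib
import json


def filter_hash(filters: dict) -> str:
    """Stable short hash of the active filters (hypothetical scheme)."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def encode_cursor(offset: int, filters: dict) -> str:
    """Server-side only: pack offset and filter hash into an opaque token."""
    payload = json.dumps({"offset": offset, "filter_hash": filter_hash(filters)})
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_cursor(token: str, filters: dict) -> int:
    """Return the offset, or raise ValueError if the token is malformed or
    was issued for a different filter set."""
    # Base64, UTF-8, and JSON decoding errors are all ValueError subclasses.
    payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    if not isinstance(payload, dict):
        raise ValueError("malformed cursor")
    offset = payload.get("offset")
    if not isinstance(offset, int) or isinstance(offset, bool) or offset < 0:
        raise ValueError("malformed cursor offset")
    if payload.get("filter_hash") != filter_hash(filters):
        raise ValueError("cursor does not match the current filters")
    return offset
```

Binding the cursor to a hash of the filters means a token issued for one query cannot be replayed against another, which is one reason clients are told never to construct cursors themselves.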

staffml-vault-types/
  index.ts                 — shared TS contract types; pnpm workspace
                             protocol between worker + site (Soumith
                             M-NEW-1 resolution).
  package.json             — @staffml/vault-types, workspace-private.

vault-cli/scripts/
  emit_d1_schema.py        — generates d1-schema.sql from compiler DDL;
                             reports SHA-256 fingerprint to paste into
                             wrangler.toml SCHEMA_FINGERPRINT var.
  d1-schema.sql            — committed schema; applied to fresh D1 via
                             'wrangler d1 execute <db> --file d1-schema.sql'.

Deploy-day gates (per CUTOVER_QA.md §0 and TESTING.md phase-entry):
  1. License decision resolved (L-10 still OPEN).
  2. wrangler d1 create staffml-vault (prod + staging) — user action.
  3. Apply d1-schema.sql + seed via d1-migration.sql.
  4. FTS5 load-test gate: p99 warm ≤100ms, p99 cold ≤500ms,
     ≤500 D1 row-reads/query (Dean N-5 cost gate).
  5. Data-plane SLI crons emitting to Grafana.
2026-04-16 12:42:13 -04:00