2 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
42f4d1ca8b fix(vault): Round-3 correctness + vault ship + authoring contract
Round-3 review (4 reviewers on v2.1) surfaced two code-correctness
Criticals, the contracted-but-missing `vault ship` coordinator, and
David's authoring-UX gaps. This commit fixes the Criticals, adds the
coordinator, and closes the gaps.

Critical fixes (real bugs in landed code):

worker/src/index.ts
- SCHEMA_FINGERPRINT placeholder now fails closed (Chip R3-C1 /
  Dean R3-NH-3). Was: placeholder auto-passed and silently disabled
  the fingerprint check. Now: placeholder forces degraded mode until
  the operator sets a real fingerprint.
- DDL hash now includes triggers (FTS5-aware).
- release_id change invalidates schema-fingerprint memoization
  (Dean R3-NH-4).
- wrangler.toml now pins the real fingerprint.
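
Illustrative sketch of the fail-closed rule and the trigger-aware DDL
hash. The worker itself is TypeScript; this is Python against the
stdlib sqlite3 module, and the names PLACEHOLDER, ddl_fingerprint and
fingerprint_ok are hypothetical, not the worker's identifiers. The
real hash inputs and canonicalization may differ.

```python
import hashlib
import sqlite3

PLACEHOLDER = "PLACEHOLDER"  # hypothetical sentinel for an unset fingerprint


def ddl_fingerprint(conn: sqlite3.Connection) -> str:
    """Hash every stored DDL statement: tables, indexes, views, triggers."""
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name"
    ).fetchall()
    digest = hashlib.sha256()
    for kind, name, sql in rows:
        # NUL separators keep adjacent fields from running together.
        digest.update(f"{kind}\x00{name}\x00{sql}\x00".encode("utf-8"))
    return digest.hexdigest()


def fingerprint_ok(expected: str, actual: str) -> bool:
    """Fail closed: an empty or placeholder expectation never passes."""
    if not expected or expected == PLACEHOLDER:
        return False
    return expected == actual
```

Because triggers are rows in sqlite_master, adding or changing one
changes the fingerprint without any special casing.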

staffml/public/sw.js
- /manifest polling TTL-throttled to 5min (Chip R3-C2). Was:
  per-request fetch nullified the §10.4 cost model.
- API origin persisted to IndexedDB; rehydrated on activate so cold
  offline wake-ups serve cached content (Chip R3-H3).
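
Illustrative sketch of the TTL throttle. The service worker is
JavaScript; this Python version only shows the logic. ThrottledFetch
and its parameters are hypothetical names, and the injected clock
exists only to make the behaviour testable.

```python
import time
from typing import Any, Callable, Optional


class ThrottledFetch:
    """Call fetch() at most once per ttl_seconds; otherwise reuse the last value."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 300.0,  # 5 minutes, as in the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Fetch on first use, or once the cached value is older than the TTL.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```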

vault-cli/src/vault_cli/release.py
- emit_migrations diffs all 4 tables via PRAGMA-driven column
  introspection (Dean R3-NC-1 + R3-NH-2). Was: only questions table,
  silently missing chains/chain_questions/tags. Rollback-symmetry
  test extended to populate + verify all tables.
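
Illustrative sketch of PRAGMA-driven column introspection across the
four tables. table_columns and diff_columns are hypothetical helpers,
not the functions in release.py, and the sketch only reports added
and removed columns rather than emitting migration SQL.

```python
import sqlite3
from typing import Dict, List, Sequence

TABLES = ("questions", "chains", "chain_questions", "tags")


def table_columns(conn: sqlite3.Connection, table: str) -> Dict[str, str]:
    """Map column name to declared type using PRAGMA table_info."""
    if not table.isidentifier():
        # PRAGMA arguments cannot be bound as parameters, so validate first.
        raise ValueError(f"unsafe table name: {table!r}")
    # Row layout: (cid, name, type, notnull, dflt_value, pk)
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def diff_columns(
    old: sqlite3.Connection,
    new: sqlite3.Connection,
    tables: Sequence[str] = TABLES,
) -> Dict[str, Dict[str, List[str]]]:
    """Report added and removed columns for every table that changed."""
    changes: Dict[str, Dict[str, List[str]]] = {}
    for table in tables:
        before = table_columns(old, table)
        after = table_columns(new, table)
        added = sorted(set(after) - set(before))
        removed = sorted(set(before) - set(after))
        if added or removed:
            changes[table] = {"added": added, "removed": removed}
    return changes
```

Swapping the two connections yields the reverse diff, which is the
property a rollback-symmetry test can assert on.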

vault-cli/src/vault_cli/commands/release.py
- vault verify --git-ref reconstructs release from `git archive <ref>`
  into a tempdir (Dean R3-NC-2). Was: always rebuilt from HEAD, so
  verifying a historical release always failed post-authoring.
  Academic-citability contract (C-3) now actually holds.
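
Illustrative sketch of materializing a ref into a tempdir with
`git archive`. checkout_ref is a hypothetical helper, not the code in
commands/release.py. It assumes git is on PATH and that the tree is
small enough to buffer in memory.

```python
import contextlib
import io
import pathlib
import subprocess
import tarfile
import tempfile
from typing import Iterator


@contextlib.contextmanager
def checkout_ref(repo: str, ref: str) -> Iterator[pathlib.Path]:
    """Yield a temporary directory holding the tree at `ref`, not the working tree."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = subprocess.run(
            ["git", "-C", repo, "archive", "--format=tar", ref],
            check=True,
            capture_output=True,
        ).stdout
        # Use the safer "data" extraction filter where this Python supports it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
            tar.extractall(tmp, **extra)
        yield pathlib.Path(tmp)
```

Rebuilding from the extracted tree rather than from HEAD means later
authoring commits cannot change the result for an older ref.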

vault-cli/src/vault_cli/ship.py (NEW)
- vault ship composed verb with journaling (Dean R3-NH-1):
  * Legs run D1 → Next.js → paper-tag-last (§6.1.1 ordering).
  * Journal at releases/<v>/.ship-journal.json records per-leg state;
    --resume continues interrupted ships idempotently.
  * Pre-paper failure auto-rolls back in reverse order.
  * Paper-leg failure pages operator; does NOT auto-rollback earlier
    legs (git tag is remote-durable per §6.1.1).
- 4 unit tests cover happy path, pre-paper failure auto-rollback,
  paper-leg needs-manual, --resume across interruptions.
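
Illustrative sketch of the journaled ship loop described above. The
leg names, state strings and the ship signature are hypothetical, not
the contents of ship.py. Paging the operator is reduced to recording
a needs-manual state.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Leg = Tuple[str, Callable[[], None], Callable[[], None]]  # (name, run, rollback)
PAPER_LEG = "paper"  # hypothetical name of the final leg


def ship(legs: List[Leg], journal: Path, resume: bool = False) -> Dict[str, str]:
    """Run legs in order, journaling per-leg state after every step."""
    state: Dict[str, str] = {}
    if resume and journal.exists():
        state = json.loads(journal.read_text())
    completed: List[Leg] = []
    for leg in legs:
        name, run, _ = leg
        if state.get(name) == "done":
            completed.append(leg)  # finished in an earlier attempt, skip it
            continue
        try:
            run()
        except Exception:
            if name == PAPER_LEG:
                # Earlier legs stay in place; a human has to take over.
                state[name] = "needs-manual"
            else:
                state[name] = "failed"
                # Undo completed legs in reverse order.
                for prev_name, _, rollback in reversed(completed):
                    rollback()
                    state[prev_name] = "rolled-back"
            journal.write_text(json.dumps(state, indent=2))
            return state
        state[name] = "done"
        journal.write_text(json.dumps(state, indent=2))
        completed.append(leg)
    return state
```

Writing the journal after every leg is what lets a resumed run skip
completed legs instead of repeating them.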

vault-cli/src/vault_cli/commands/authoring.py
- vault new appends to id-registry.yaml (David R3-H3 + C-5
  enforcement); `git pull --rebase` before allocation.
- authors: auto-populated from git config user.email (David R3-H4 /
  M-15). Was: field never set.
- vault edit injects validation-error comment block at top of YAML
  and re-opens up to --retries=3 times (David R3-H1). Was: terminal
  traceback mid-authoring session.
- vault move refuses a dirty tree, a chained question, or an excluded
  cell per the applicability matrix (David R3-H2). Was: unchecked git mv.
- vault renumber command (NEW): post-rebase seq-collision recovery.
  Bumps seq, renames file, updates id field, appends registry
  (David R3-N-2, was spec-only).
- vault mark-exemplar command (NEW): promotes to vault/exemplars/
  with provenance + human_reviewed_at gate (David R3-N-9).

vault-cli/src/vault_cli/compiler.py
- FTS5 virtual table + sync triggers added to DDL (B.5). Triggers
  keep questions_fts in sync via AFTER INSERT/UPDATE/DELETE.
  schema_fingerprint accounts for triggers now.
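
Illustrative sketch of an FTS5 table kept in sync by triggers,
following the external-content pattern from the SQLite FTS5
documentation. The questions columns (id, title, body) and the
trigger names are assumptions; the compiler's real DDL may differ.
Requires an SQLite build with FTS5 enabled.

```python
import sqlite3
from typing import List

SCHEMA_DDL = """
CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT, body TEXT);

CREATE VIRTUAL TABLE questions_fts USING fts5(
    title, body, content='questions', content_rowid='id'
);

CREATE TRIGGER questions_ai AFTER INSERT ON questions BEGIN
    INSERT INTO questions_fts(rowid, title, body)
    VALUES (new.id, new.title, new.body);
END;

CREATE TRIGGER questions_ad AFTER DELETE ON questions BEGIN
    INSERT INTO questions_fts(questions_fts, rowid, title, body)
    VALUES ('delete', old.id, old.title, old.body);
END;

CREATE TRIGGER questions_au AFTER UPDATE ON questions BEGIN
    INSERT INTO questions_fts(questions_fts, rowid, title, body)
    VALUES ('delete', old.id, old.title, old.body);
    INSERT INTO questions_fts(rowid, title, body)
    VALUES (new.id, new.title, new.body);
END;
"""


def search(conn: sqlite3.Connection, term: str) -> List[int]:
    """Return ids of questions whose indexed text matches `term`."""
    rows = conn.execute(
        "SELECT rowid FROM questions_fts WHERE questions_fts MATCH ? ORDER BY rowid",
        (term,),
    )
    return [row[0] for row in rows]
```

The update trigger deletes the old index entry before inserting the
new one, so stale terms stop matching.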

tests/test_hashing.py
- Nested-dict hash-stability fixture (Soumith R3-F-4). Was: test
  only reordered top-level keys + collapsed details to one key.
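
Illustrative sketch of a nested-dict stability check. content_hash
here is a stand-in built on canonical JSON with sorted keys. The real
vault_cli.hashing canonicalization and field whitelist are not shown
in this commit, and the field names in the usage note are invented
for the fixture.

```python
import hashlib
import json
from typing import Any, Dict


def content_hash(record: Dict[str, Any]) -> str:
    """SHA-256 over canonical JSON; sort_keys applies at every nesting level."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_key_order_stable(a: Dict[str, Any], b: Dict[str, Any]) -> bool:
    """True when two records that differ only in key order hash identically."""
    return content_hash(a) == content_hash(b)
```

A fixture in this style reorders keys inside a nested mapping as well
as at the top level:

```python
a = {"title": "t", "details": {"hint": "h", "answer": {"text": "x", "unit": "ms"}}}
b = {"details": {"answer": {"unit": "ms", "text": "x"}, "hint": "h"}, "title": "t"}
assert is_key_order_stable(a, b)
```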

All 28 tests pass (22 → 28: +4 ship journaling, +1 multi-table
migration symmetry, +1 nested-dict hash stability). release_hash
unchanged at 1b304282... — FTS5 addition doesn't affect content
Merkle per §3.5 input-only design.
2026-04-16 13:10:16 -04:00
Vijay Janapa Reddi
812ba408d0 feat(vault): Phase 1 core — schema, hashing, policy, loader, validator
LinkML schema at vault/schema/question_schema.yaml is the sole schema
source of truth. Pydantic models in vault_cli.models are currently
hand-authored to match; full LinkML codegen wires in Phase 2 with the
drift-check in CI.

Core modules:
  vault_cli/models.py     — Pydantic question model (closed enums, content-
                            format per field, schema_version=1 gate).
  vault_cli/hashing.py    — canonical content_hash over whitelisted fields;
                            release_hash Merkle with __policy__ and
                            __canon_version__ leaves (Chip N-H5).
  vault_cli/yaml_io.py    — hardened SafeLoader: 256KB cap, depth 10 cap,
                            aliases rejected, timeout (H-7).
  vault_cli/paths.py      — path-as-classification parser with lowercase +
                            enum enforcement (H-9).
  vault_cli/loader.py     — walks vault/questions/, returns loaded + errors
                            (never raises — aggregate reporting).
  vault_cli/validator.py  — tiered invariant engine; fast + structural tiers
                            implemented per ARCHITECTURE.md §5.
  vault_cli/compiler.py   — YAML → SQLite with release_metadata rows
                            (release_id, release_hash, policy_version,
                            schema_version, published_count).
  vault_cli/policy.py     — single filter predicate. No consumer
                            re-implements (H-21).
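
Illustrative sketch of a release_hash Merkle root that includes
__policy__ and __canon_version__ leaves. The leaf encoding, the
pairing rule (an odd node is promoted unchanged) and the function
signatures are assumptions, not the construction in
vault_cli/hashing.py.

```python
import hashlib
from typing import Dict, List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Reduce leaves pairwise until a single root remains."""
    if not leaves:
        return _sha256(b"")
    level = list(leaves)
    while len(level) > 1:
        nxt: List[bytes] = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            # Hash full pairs; promote a trailing odd node as-is.
            nxt.append(_sha256(pair[0] + pair[1]) if len(pair) == 2 else pair[0])
        level = nxt
    return level[0]


def release_hash(
    content_hashes: Dict[str, str], policy_hash: str, canon_version: str
) -> str:
    """Root over per-question hashes plus policy and canonicalization leaves."""
    entries = dict(content_hashes)
    entries["__policy__"] = policy_hash
    entries["__canon_version__"] = canon_version
    # Sorting by key makes the root independent of input order.
    leaves = [_sha256(f"{k}\x00{v}".encode("utf-8")) for k, v in sorted(entries.items())]
    return merkle_root(leaves).hex()
```

With the policy and canonicalization version as leaves, changing
either one changes the root even when no question content changes.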

release-policy.yaml v1: status=published. Dropped require_validated in
the wake of 9199/8053 resolution — validation is implicit in the
maintainer-approval → status=published transition, not a separate flag.

Tests (19 pass): key-order hash invariance (Soumith M-NEW-4), policy
filter correctness (H-21 runtime check), YAML hardening (H-7).
2026-04-16 12:37:06 -04:00