cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-07 10:08:50 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	d2a8e4d28b	fix(vault-cli): spoof check picked wrong base ref, swept 132-commit diff The reviewer-identity spoof check tried base refs in the order (origin/main, origin/dev, HEAD~1) and returned the first that resolved. On dev, where origin/main is 132 commits behind origin/dev, this picked main and diffed every vault YAML changed since that point — sweeping up 100+ files unrelated to the current push and reporting each as a spoof-check failure. Fix: respect GITHUB_BASE_REF when set (PR mode), otherwise diff against HEAD~1 (push mode). This produces exactly the file set the check is meant to validate — what this PR or push is proposing — not the entire branch divergence from main. Verified locally on the codespell+codegen-hashes commit: now reports "no vault/questions/ changes in this PR" instead of 100+ spurious failures.	2026-04-25 14:06:18 -04:00
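The base-ref choice described in d2a8e4d28b can be sketched roughly as follows. This is a minimal illustration, not the project's actual check: `pick_base_ref` and `diff_command` are hypothetical names, and the `origin/` prefix and the `vault/questions/` path filter are assumptions.

```python
import os
from typing import Mapping, Optional


def pick_base_ref(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the git ref to diff against for the spoof check.

    PR mode: GITHUB_BASE_REF holds the target branch name.
    Push mode: fall back to the previous commit.
    """
    env = os.environ if env is None else env
    base = env.get("GITHUB_BASE_REF", "").strip()
    if base:
        # Assumption: the base branch has been fetched as origin/<name>.
        return f"origin/{base}"
    return "HEAD~1"


def diff_command(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Build a `git diff` argv listing changed question files (path assumed)."""
    base = pick_base_ref(env)
    return ["git", "diff", "--name-only", f"{base}...HEAD", "--", "vault/questions/"]
```

Taking the environment as a parameter keeps the selection logic testable without a git checkout or a CI runner.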
Vijay Janapa Reddi	3f9b044b31	chore(ci): rename vault-ci.yml → staffml-validate-vault.yml Brings the last outlier workflow file into the repo-wide <cluster>-<verb>-<scope>.yml naming convention. Every other cluster (book, tinytorch, kits, labs, instructors, mlsysim, slides, site, staffml) uses this pattern; vault-ci.yml was the only one that didn't. vault-ci.yml → staffml-validate-vault.yml name: '🎯 StaffML · 🔎 Vault CI' → '🎯 StaffML · ✅ Validate (Vault)' Now staffml-validate-vault.yml is a direct sibling of staffml-validate-dev.yml — the former validates the vault data + CLI + worker, the latter validates the site build. Same verb, different scope, easy to reason about. Updated references: .github/workflows/staffml-validate-vault.yml — self-reference in the paths trigger (so the workflow still fires when it's edited) interviews/vault/ARCHITECTURE.md §19.3 and §51 — both path refs interviews/vault/TESTING.md §4.1 — workflow name + display name interviews/vault-cli/scripts/check_registry_append_only.py — docstring No branch-protection settings change needed — GitHub matches required checks on the workflow's 'name:' field, not the filename. Anyone with a bookmark to the old Actions-tab URL will get a 404 (harmless). Other workflow naming I surveyed but deliberately LEFT alone (all consistent with existing conventions): staffml-update-paper.yml matches tinytorch-update-pdfs pattern staffml-auto-pr.yml matches bot-workflow convention staffml-welcome.yml single-word verb, standard auto-label / update-contributors / infra-* / publish-all-live are cross-cutting (no cluster prefix) by design	2026-04-22 11:27:37 -04:00
Vijay Janapa Reddi	8d385b0c1a	feat(d1): cutover production D1 to schema v1.0 + live worker serving Four deployment-level fixes landed on the live Cloudflare worker + D1 instance: 1. compiler.py — populate chains table from chains.json. Pre-v1.0 the table was never filled, which only mattered once D1 (which enforces FKs by default, unlike SQLite) tried to insert chain_questions. The cutover failed with FOREIGN KEY constraint failed until chains(id) was populated. 2. types.ts (worker) — add competency_area, bloom_level, phase, and human_review_* fields. Worker SQL was already SELECT *, so the new columns flow through without code changes, but the TypeScript row interface needed updating for downstream consumers. 3. rate_limit.ts — Math.max(60, …) floor on expirationTtl. Old calc could emit values as low as 11s, which D1's KV backend rejects (minimum 60s). Was throwing 1101 on every request after the deployment. Tail logs showed 'Invalid expiration_ttl of 14'. 4. wrangler.toml — bump SCHEMA_FINGERPRINT to match the v1.0 vault.db (b97218dae6354b1b…). Without this, /manifest reports schema_fingerprint_ok: false and clients degrade. New script: scripts/ship_d1.py — end-to-end reload of D1 from the current YAMLs. 'vault build' → SQL dump → 'wrangler d1 execute --file'. Handles FK ordering (chains first, then questions, then chain_questions). Used for this cutover; repeatable for future schema bumps. Deployment state (2026-04-22): Worker URL: https://staffml-vault.mlsysbook-ai-account.workers.dev D1 database: staffml-vault (254f630f-…) — 9,199 questions loaded Release hash: 997747a8f43bbd89e03c6bb0e67865f8de35ac8316fbb0457ee0b8f955afb32f Manifest: curl …/manifest returns 9,199 / schema_fingerprint_ok=true GET question: /questions/cloud-0185 returns the post-Phase-2 v1.0 record (zone=mastery, level=L6+, competency_area=latency, …) Filtered list: /questions?track=cloud&level=L6%2B works with pagination Site cutover is NOT in this commit. 
The existing hybrid path (bundled corpus.json primary + worker /search secondary) keeps working unchanged. To flip the site entirely to the worker: export NEXT_PUBLIC_VAULT_API=https://staffml-vault.mlsysbook-ai-account.workers.dev unset NEXT_PUBLIC_VAULT_FALLBACK # then: next build && next deploy That flip converts every caller from sync 'getQuestions()' to async via corpus-source.ts — deferred because callers need an audit pass to handle async correctly.	2026-04-22 10:29:35 -04:00
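The foreign-key ordering issue from item 1 of 8d385b0c1a can be reproduced locally with Python's `sqlite3` module. The schema below is a hypothetical, heavily simplified stand-in for vault.db; only the table names and the `(chain_id, question_id)` key come from the log.

```python
import sqlite3

# Hypothetical minimal schema; the real tables carry many more columns.
DDL = """
CREATE TABLE chains (id TEXT PRIMARY KEY);
CREATE TABLE questions (id TEXT PRIMARY KEY);
CREATE TABLE chain_questions (
    chain_id    TEXT NOT NULL REFERENCES chains(id),
    question_id TEXT NOT NULL REFERENCES questions(id),
    position    INTEGER NOT NULL,
    PRIMARY KEY (chain_id, question_id)
);
"""


def open_db(enforce_fks: bool) -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is ON for the connection.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_fks else 'OFF'}")
    conn.executescript(DDL)
    return conn


def load_link_only(conn: sqlite3.Connection) -> bool:
    """Insert a chain_questions row without its parents; True if accepted."""
    try:
        conn.execute("INSERT INTO chain_questions VALUES ('chain-1', 'cloud-0185', 1)")
        return True
    except sqlite3.IntegrityError:
        return False


def load_in_fk_order(conn: sqlite3.Connection) -> int:
    """Insert parents first (chains, questions), then the link row."""
    conn.execute("INSERT INTO chains VALUES ('chain-1')")
    conn.execute("INSERT INTO questions VALUES ('cloud-0185')")
    conn.execute("INSERT INTO chain_questions VALUES ('chain-1', 'cloud-0185', 1)")
    return conn.execute("SELECT COUNT(*) FROM chain_questions").fetchone()[0]
```

With enforcement off the orphan row is silently accepted, which is why the empty chains table went unnoticed until the load ran against a database that enforces foreign keys.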
Vijay Janapa Reddi	a17107f3df	chore(vault-cli): update d1 schema + codegen hashes for schema v1.0 - d1-schema.sql: regenerated to match compiler.py changes. Adds competency_area, bloom_level, phase, human_review_* columns to questions table. Adds idx_questions_human_review index. chain_questions PK changes from (chain_id, position) to (chain_id, question_id) for multi-chain + non-contiguous support. Drops deep_dive_title/deep_dive_url. - codegen-hashes.txt: new baseline covering the v1.0 models.py, d1-schema.sql, and @staffml/vault-types/index.ts. Fixes the vault codegen --check drift test that was failing CI.	2026-04-21 18:24:21 -04:00
Vijay Janapa Reddi	6ccee10a9d	feat(vault-cli): add vault lint + schema drift check Two new tools. vault lint <path> Author-facing linter. Accepts a single YAML file or a directory. Severity levels: ERROR schema violation; question cannot be loaded WARN likely misclassification (zone-level affinity mismatch, chain position duplication, etc.) INFO hygiene suggestions (human-review-pending on published Qs) Zone-level affinity warning implements paper §3.3 Table 2 (line 397): 'An L1 question tagged as evaluation is flagged for review, since evaluation is cognitively inconsistent with Bloom's Remember level.' The warning is soft — marking an outlier does not reject it; it surfaces for reviewer judgement. Quickly identifies the ~943 L6+ questions currently carrying zone=design that should probably be zone=mastery. scripts/check_schema_sync.py CI drift check. Compares enum values in schema/enums.py against schema/question_schema.yaml (the authoritative LinkML schema) and exits non-zero if they disagree. Prevents the three-schema drift that caused the v0.1 migration defects from recurring. Enums cross-checked: Track, Level, Zone, BloomLevel, Phase, Status, Provenance, HumanReviewStatus. Output on success: 'OK: 8 enums in sync.' Wire into CI in a follow-up PR.	2026-04-21 18:04:10 -04:00
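An enum drift check of the kind 6ccee10a9d describes might look like the sketch below. It is illustrative only: the `Zone` enum is a hypothetical three-member subset, the schema side is passed in as a plain dict rather than parsed from the LinkML YAML, and `enum_drift` and `main` are invented names.

```python
from enum import Enum


class Zone(str, Enum):  # hypothetical subset; the real enum has more members
    DESIGN = "design"
    MASTERY = "mastery"
    EVALUATION = "evaluation"


def enum_drift(py_enums: dict, schema_enums: dict) -> dict:
    """Return {enum_name: (only_in_python, only_in_schema)} for mismatches."""
    drift = {}
    for name in sorted(set(py_enums) | set(schema_enums)):
        py_vals = {member.value for member in py_enums.get(name, [])}
        schema_vals = set(schema_enums.get(name, []))
        if py_vals != schema_vals:
            drift[name] = (sorted(py_vals - schema_vals), sorted(schema_vals - py_vals))
    return drift


def main(py_enums: dict, schema_enums: dict) -> int:
    """Print a report and return a process exit code (non-zero on drift)."""
    drift = enum_drift(py_enums, schema_enums)
    if not drift:
        print(f"OK: {len(py_enums)} enums in sync.")
        return 0
    for name, (py_only, schema_only) in drift.items():
        print(f"DRIFT {name}: python-only={py_only} schema-only={schema_only}")
    return 1
```

Reporting both directions of the difference tells the author which side needs the edit, not only that the two disagree.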
Vijay Janapa Reddi	ed58b56cf4	docs(vault): archive obsolete scripts + post-mortem the v1.0 migration Archives pre-v1.0 scripts under scripts/archive/ in both interviews/vault/ and interviews/vault-cli/. ARCHITECTURE.md §3.3 rewritten with a post-mortem on why path-as-classification could not represent the paper's full 11-zone × 6-level taxonomy. CHANGELOG.md added documenting the full v1.0 migration.	2026-04-21 18:02:05 -04:00
Vijay Janapa Reddi	63dc15977c	chore(paper): regenerate macros for 87-topic taxonomy (0.10.0)	2026-04-18 08:07:22 -04:00
Vijay Janapa Reddi	4013aa422e	feat(vault): seed exemplar pool with 86 human-reviewed questions Adds suggest_exemplars.py script for identifying high-quality candidates. Moves 86 top-scoring questions (1 per topic) from vault/questions/ to vault/exemplars/ with provenance upgraded to human. Scored by presence of napkin_math, common_mistake, solution length, and scenario length. vault generate now finds exemplars for topic-specific generation. Published count: 9,113 (86 moved to exemplar pool).	2026-04-18 08:06:24 -04:00
Vijay Janapa Reddi	0eecfe1108	fix(vault-cli): pre-commit corpus guard resolves COMMIT_EDITMSG via git rev-parse The hook read '.git/COMMIT_EDITMSG' via a literal relative path. That works in a regular clone where .git is a directory, but fails silently in a git worktree where .git is a file pointing at a per-worktree gitdir under the main repo. In a worktree, Path('.git/COMMIT_EDITMSG') never exists, so commit_message_has_override() always returned False and legitimate Vault-Override trailers were rejected. Resolve via 'git rev-parse --git-path COMMIT_EDITMSG' which returns the correct path in both regular clones and worktrees. This matches the pattern already used by the Makefile's HOOKS_DIR resolution. No behaviour change in a regular clone; worktrees can now commit with the Vault-Override trailer as documented.	2026-04-16 18:22:03 -04:00
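The worktree-safe lookup from 0eecfe1108 could be sketched as below. The function names and the injectable `git` callable are assumptions made for testability; only `git rev-parse --git-path COMMIT_EDITMSG` and the `Vault-Override` trailer come from the log.

```python
import subprocess
from pathlib import Path
from typing import Callable


def _git_output(args: list[str]) -> str:
    """Run git and return stripped stdout (raises on non-zero exit)."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout.strip()


def commit_editmsg_path(git: Callable[[list[str]], str] = _git_output) -> Path:
    """Locate COMMIT_EDITMSG in both regular clones and linked worktrees."""
    return Path(git(["rev-parse", "--git-path", "COMMIT_EDITMSG"]))


def has_override(message: str, trailer: str = "Vault-Override:") -> bool:
    """True if any line of the commit message starts with the trailer key."""
    return any(line.strip().startswith(trailer) for line in message.splitlines())
```

In a linked worktree git answers with a path under the main repository's per-worktree gitdir, which the literal `.git/COMMIT_EDITMSG` path can never reach because `.git` is a file there.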
Vijay Janapa Reddi	3ce0595035	fix: Round-7 (Chip) + Round-8 (Dean) — 2H + 1MH + 5M + 4L closed Chip R7 findings: R7-H-1 (HIGH): sw.js ReferenceError on offline fetch failure `cached` was const-scoped inside `if (!manifestStale)` block but referenced in the outer catch's "if (cached) return cached" offline fallback. Offline users hit ReferenceError instead of cache. Fix: hoist to `let cached = null` above the gate. R7-H-2 (MEDIUM-HIGH): schema_fingerprint portability across SQLite versions Previous compiler hashed all sqlite_master including FTS5 shadow tables (questions_fts_data/idx/docsize/content/config) whose DDL varies across SQLite versions. Host Python SQLite ≠ Cloudflare D1 SQLite → fingerprint permanent mismatch → worker pinned to degraded mode forever. Fix: filter shadow tables out on both sides (compiler.py + worker/index.ts); fingerprint covers only user-authored DDL. R7-M-3 (MEDIUM): schemaOk sticky on transient D1 failure Previously any probe exception pinned schemaOk=false until release rollover. Now: 5-minute retry window via schemaCheckedAt tracking. R7-M-4 (MEDIUM): vault dup --vault-dir + pass-through ACKS_PATH was CWD-relative; invoking CLI from non-default cwd silently missed the ack file, legitimate templates reddied nightly CI forever. Fix: vault dup --vault-dir flag + pass through to ack_pairs(vault_dir); validator._scenario_dedup_lsh takes vault_dir; slow_tier threads it. R7-M-5 (MEDIUM): FTS5 probe memoization Previously probed sqlite_master on every /search request — directly undid part of R5-H-1's manifest memo cost fix. Now: module-level ftsProbed memo, reset on release_id change (FTS5 presence can only change across releases). R7-L-6 (LOW): reviewer-identity name clarity Var was `committer_emails` but git log %ae is AUTHOR email. Behavior was correct (intentional, so rebase-by-maintainer preserves chain); renamed to commit_author_emails and updated comments.
R7-L-7 (LOW): manifest memo race on release rollover maybeInvalidateSchemaCache nulled manifestMemo mid-write causing microsecond stampede. Now: don't null memo — 60s TTL is forgiving enough staleness bound for release rollover. Dean R8 findings: R8-H-1 (HIGH): SLI cron was structurally broken Previous SLI reconstructed canonical content_hash from worker JSON response — but reconstruction omitted tags, chain, generation_meta (WHITELIST_TOP includes tags + chain). Every hourly run false-positive'd on any question with a non-empty tag list, effectively a pager-DoS. Fix: compare worker's stored content_hash directly against release vault.db's stored content_hash. Same compilation source → mismatch means real corruption. R8-M-2 (MEDIUM): SLI 404 handling urlopen crashed on deprecated IDs during release rollover. Fix: classify by response code. 404 → id_missing_in_worker (expected); 5xx → transport_errors (separate tally); only real hash mismatch pages the operator. R8-M-3 (MEDIUM): vault deploy + rollback primitives spec-only ARCHITECTURE §6.2 said 'default rollback = snapshot restore — always works' but no vault deploy or vault rollback command existed, so the R2 snapshot substrate that makes §6.2 true was never built. Fix: implemented `vault deploy` with synchronous R2 snapshot (wrangler d1 export + wrangler r2 object put before migration) and `vault rollback --method snapshot|sql`. CLI now has 26 subcommands. Deploy requires authenticated wrangler; code path exists. R8-LM-4 (LOW-MED): D1 bootstrap migration wrangler.toml referenced migrations_dir='migrations' but that dir didn't exist. First-deploy-from-scratch relied on manual operator steps. Fix: generated interviews/staffml-vault-worker/migrations/0001_bootstrap.sql from compiler.DDL so `wrangler d1 migrations apply` works on a fresh D1.
Test matrix (post-R7+R8 integration): pytest: 38 green in 0.15s vitest: 7 green in 131ms ruff: All checks passed vault build: release_hash fe69d4c4... stable (unchanged — fingerprint filter change affects release_metadata content, not the release Merkle per §3.5) vault --help: 26 subcommands (added deploy + rollback) Convergence tracking: R1-R5 closed 90+ findings R6 (Gemini queued, not yet invoked) — will launch with R9/R10 R7-R8 produced 12 new findings (2H + 1MH + 5M + 4L), all closed here Pattern: each round still finds 8-12 issues. Not yet stable. Expect 2-3 more rounds to hit 'no new findings' signal.	2026-04-16 16:31:17 -04:00
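The shadow-table filter from R7-H-2 can be illustrated with a small sketch. It assumes the fingerprint is a SHA-256 over sorted `(name, sql)` rows from `sqlite_master`; the real hashing recipe in compiler.py may differ, and `schema_fingerprint` and `is_fts5_shadow` are hypothetical names.

```python
import hashlib

# Suffixes of the FTS5 shadow tables named in the commit message.
FTS5_SHADOW_SUFFIXES = ("_data", "_idx", "_docsize", "_content", "_config")


def is_fts5_shadow(name: str, fts_tables) -> bool:
    """True if `name` is a shadow table belonging to one of `fts_tables`."""
    return any(
        name == f"{fts}{suffix}"
        for fts in fts_tables
        for suffix in FTS5_SHADOW_SUFFIXES
    )


def schema_fingerprint(rows, fts_tables=frozenset({"questions_fts"})) -> str:
    """Hash only user-authored DDL from (name, sql) rows of sqlite_master."""
    kept = sorted(
        (name, sql)
        for name, sql in rows
        if sql is not None and not is_fts5_shadow(name, fts_tables)
    )
    digest = hashlib.sha256()
    for name, sql in kept:
        digest.update(f"{name}\0{sql}\n".encode("utf-8"))
    return digest.hexdigest()
```

Because the shadow rows never enter the hash, two SQLite builds that spell the shadow-table DDL differently still produce the same fingerprint for the same user-authored schema.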
Vijay Janapa Reddi	6b7b3f1b70	fix: integrate Gemini Round-5 holistic review — 3C + 4H + 1M + 1L fixed Gemini 3.1 Pro reviewed the full branch (371KB / 43K words) with 1M context. Caught 9 cross-file issues none of the 4 prior per-file rounds saw because they required seeing multiple systems at once. CRITICAL fixes: R5-C-1: _MIGRATION_TABLES omitted release_metadata (release.py:118). Result: after `vault ship`, release_metadata never propagated to D1 → worker kept serving old release_id forever → cache never invalidated → new release functionally invisible. Fix: added 'release_metadata' to migration-participating tables. R5-C-2: SW offline wake-up deleted the real cache (sw.js). Result: when SW woke offline, currentRelease=null, cacheName defaulted to '...-unknown'. Activate pruned all caches not matching, i.e. it deleted the real cache. Offline users: no cache. Fix: persist currentRelease to IDB on fetch success; restore on activate; move cache pruning from activate to updateReleaseFromManifest so it only runs AFTER a successful online manifest fetch. R5-C-3: schema_fingerprint hand-edited in wrangler.toml (compiler.py + worker/src/index.ts). Every DDL change required manually recomputing + pasting a hash + Worker redeploy; forgetting any step put the site in degraded mode. Fix: compiler.py now computes fingerprint from sqlite_master at build time and stores in release_metadata. Worker reads it from the DB via getManifest; env.SCHEMA_FINGERPRINT path removed. HIGH fixes: R5-H-1: getManifest hit D1 on every request before Cache API check (worker/src/index.ts:130). Destroyed the §10.4 cost target. Fix: module-level manifest memo with 60s TTL. Invalidated on release_id change (natural cadence from Cloudflare's eventual propagation). R5-H-2: _insert_stmt emitted NULL for columns absent in row (release.py). Result: rolling back past a new NOT NULL column would crash on SQLite constraint violation.
Fix: emit only columns actually in row dict; let SQLite apply defaults. R5-H-3: ARCHITECTURE.md §13 promised CI rejects --reviewed-by spoofing, but no check existed. Fix: new scripts/check_reviewer_identity.py + CI step. Verifies for every changed question with provenance=llm-then-human-edited that at least one `authors` entry matches a commit email from the PR. R5-H-4: LSH dedup told operator to run `vault dup --ack` but that command didn't exist — legitimate templates would red nightly CI forever. Fix: implemented `vault dup --ack`/--unack/--show. Writes to vault/dedup-acks.yaml. Validator reads the ack list and skips flagged pairs. MEDIUM / LOW fixes: R5-M-1: `vault tag` swallowed git failures with check=False. Result: 'tag already exists', 'nothing to commit', merge conflicts all printed '[green]tagged[/green]' and exited 0. Fix: explicit error check on every subprocess call; pre-existence check on tag like ship.paper_forward does. R5-L-1: applicability-matrix invariant case-sensitive; 'Cloud' vs 'cloud' silently failed enforcement. Fix: lowercase-normalize both sides of the comparison. State: pytest: 38/38 green in 0.15s vitest: 7/7 green (fingerprint test updated to mock via release_metadata) ruff: All checks passed CLI: 23 subcommands (added vault dup) release_hash: fe69d4c4... (unchanged — schema_fingerprint addition affects release_metadata table, not content Merkle per §3.5)	2026-04-16 16:13:35 -04:00
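The R5-H-2 fix (emit only the columns present in the row and let SQLite apply defaults) can be demonstrated with a short sketch. `insert_stmt` here is a hypothetical parameterized variant, not the project's `_insert_stmt`, and the `phase` column with its default is an invented example of a later NOT NULL addition.

```python
import sqlite3


def insert_stmt(table: str, row: dict) -> tuple[str, list]:
    """Build a parameterized INSERT naming only the keys present in `row`."""
    cols = sorted(row)  # stable column order keeps the emitted SQL deterministic
    col_list = ", ".join(f'"{c}"' for c in cols)
    placeholders = ", ".join("?" for _ in cols)
    sql = f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'
    return sql, [row[c] for c in cols]


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    # `phase` stands in for a NOT NULL column added by a later schema version.
    conn.execute(
        "CREATE TABLE questions (id TEXT PRIMARY KEY, title TEXT, "
        "phase TEXT NOT NULL DEFAULT 'unknown')"
    )
    old_row = {"id": "cloud-0185", "title": "example"}  # predates `phase`
    sql, params = insert_stmt("questions", old_row)
    conn.execute(sql, params)
    return conn.execute("SELECT id, title, phase FROM questions").fetchone()
```

Had the statement listed `phase` with an explicit NULL, the insert would violate the NOT NULL constraint; omitting the column lets the declared default fill it in.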
Vijay Janapa Reddi	cbdb566381	feat(vault): Phase-1 migration contract fully closed in-repo v2.3 → v2.4. ARCHITECTURE.md header + Appendix reflect the completed migration. WHAT CLOSED (§11.1 contract): 1. `vault build --legacy-json` regenerates the site's interviews/staffml/src/data/corpus.json from YAML. 9,199 published questions, site-compatible shape (chain_positions back to 0-indexed dict form, bloom_level derived from zone, competency_area aliased from topic, scope aliased from track). Deterministic via sort_keys + id-sort. 2. Pre-commit hook INSTALLED via worktree-aware Makefile target (`make -C interviews/vault-cli hooks`). Symlink points at pre_commit_corpus_guard.py. Tested end-to-end: direct edit to vault/corpus.json triggers exit-1 with §11.1 reference. 3. CI equivalence check added to .github/workflows/vault-ci.yml: regenerates corpus.json from YAML, diffs against committed. Fails PR on drift with actionable error message. 4. Legacy generators demoted with DEPRECATED headers: - interviews/paper/scripts/analyze_corpus.py → vault export-paper - interviews/staffml/scripts/sync-vault.py → vault build --legacy-json - interviews/staffml/scripts/generate-manifest.py → vault publish - interviews/vault/scripts/export_to_staffml.py → vault build --legacy-json 5. New DEPRECATED.md files at interviews/vault/scripts/ and interviews/staffml/scripts/ map every legacy script to its replacement. Both directories keep the old scripts for git-history legibility and archaeology; new contributors see the vault CLI first. 6. ARCHITECTURE.md §Appendix rewritten as current-state table instead of aspirational "gone. replaced by..." entries. NEW TESTS (interviews/vault-cli/tests/test_legacy_export.py — +4): - test_legacy_shape_matches_site_interface: every field corpus.ts declares is present in regenerated JSON. - test_chain_positions_legacy_shape: 1-indexed new schema → 0-indexed legacy dict form.
- test_emitter_deterministic: byte-stable across reversed input order (required for CI diff-check). - test_competency_area_aliases_topic: legacy alias fields populated correctly. FULL MATRIX GREEN: pytest: 38/38 passed in 0.19s (34 + 4 legacy-export) ruff: All checks passed hook: exit 0 on clean diff / exit 1 on corpus.json direct edit e2e: vault build --legacy-json regenerates a bit-identical corpus.json vs the committed one; CI check wired to catch drift WHAT'S LEFT (deploy-gated, §20.5 #1, #5, #6 partial, #8, #9): - Production serves from D1: requires Phase-3 wrangler d1 create + deploy - Manual QA per CUTOVER_QA.md: requires live staging - Zero data loss D1-side verification: requires live D1 - 48h monitoring: requires production traffic These are intrinsically user-action; the YAML-side migration is done.	2026-04-16 14:57:24 -04:00
Vijay Janapa Reddi	8e1e47f9f8	fix(vault): normalize chain positions + tighten provenance invariant vault-cli/scripts/normalize_chain_positions.py (NEW) Phase-1 split kept only chain_ids[0] per question when legacy corpus had multi-chain membership (up to 4 chains/question). Chains whose members chose a different chain_ids[0] were left with position gaps. Script walks vault/questions/, groups by chain_id, renumbers each chain's members to contiguous [1..N] sorted by current position. Idempotent. Rewrote 87 questions across 977 chains. validator.py #18 (provenance-meta) Tightened from 'any non-human provenance requires generation_meta' to 'only llm-draft / llm-then-human-edited require it'. Imported content legitimately has no LLM attribution and shouldn't carry stub meta. Was incorrectly flagging 9,199 imported questions. Re-ran vault build → new release_hash (input changed, which is correct): fe69d4c4d3c2884efeab6189a67e929e4e970dc0f4de42ab9493531a4cabeda1. Republished 0.9.0 release artifact. corpus-equivalence-hash.txt updated. paper/macros.tex + corpus_stats.json regenerated (same counts: 9199/87/964 chains/31.9% coverage). State: vault check --strict 100% clean on full 9,657-question corpus; zero load errors; zero invariant failures. 28/28 pytest green. vault verify 0.9.0 round-trips from YAML source. Citation property holds on the new hash.	2026-04-16 14:10:14 -04:00
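The renumbering pass described for normalize_chain_positions.py can be sketched in memory as follows. The record shape (`chain` as an `{id, position}` mapping) follows the structured form mentioned elsewhere in this log, but the function itself and its tie-break on question id are assumptions; the real script walks YAML files on disk.

```python
from collections import defaultdict


def normalize_chain_positions(questions: list[dict]) -> int:
    """Renumber each chain's members to 1..N in place; return rows changed."""
    by_chain = defaultdict(list)
    for q in questions:
        chain = q.get("chain")
        if chain:
            by_chain[chain["id"]].append(q)

    changed = 0
    for members in by_chain.values():
        # Sort by current position; break ties on question id for determinism.
        members.sort(key=lambda q: (q["chain"]["position"], q["id"]))
        for new_pos, q in enumerate(members, start=1):
            if q["chain"]["position"] != new_pos:
                q["chain"]["position"] = new_pos
                changed += 1
    return changed
```

A second run over already-normalized data changes nothing, which is the idempotence property the commit message calls out.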
Vijay Janapa Reddi	1bc93374e1	feat(vault): Phase-1/2 polish + LICENSEs + corpus cutover branch vault-cli/src/vault_cli/commands/stats.py (NEW, B.8) vault stats — live scorecard over vault.db with --format-prometheus scrape mode + --exemplar-coverage audit shim. Reports total / topics / chains / by_status / by_track / by_provenance. Resolves R3 gap about missing stats subcommand. vault-cli/src/vault_cli/commands/codegen.py (NEW, B.7) vault codegen --check — Phase-1 presence-and-non-empty verification of the 3 shared-artifact files (models.py, d1-schema.sql, @staffml/vault-types/index.ts). Full LinkML-driven generation is Phase-2 follow-up. vault-cli/Makefile (NEW, B.2) make install / test / lint / hooks / hooks-uninstall. Hooks target symlinks pre_commit_corpus_guard.py into .git/hooks/pre-commit. vault-cli/scripts/check_registry_append_only.py (NEW, B.3) CI script verifying id-registry.yaml is append-only vs base branch. Rejects removed or reordered lines — C-5 enforcement at merge time. vault/questions/LICENSE (NEW) CC-BY-4.0 for corpus content. BibTeX template with release_hash placeholder. Scope note clarifies vault-cli is MIT separately. vault-cli/LICENSE (NEW) MIT for vault-cli Python package + scripts + docs. Scope note clarifies corpus is CC-BY-4.0 separately. staffml/src/lib/corpus-vault.ts (NEW, B.11) Vault-API-backed data source mirroring corpus.ts public surface. Adapts @staffml/vault-types Question → legacy Question shape so callers don't need to change. Not wired into any component yet — the swap happens via corpus-source.ts. staffml/src/lib/corpus-source.ts (NEW, B.11) Cutover router: getCorpusSource() returns 'static' or 'vault-api' based on NEXT_PUBLIC_VAULT_FALLBACK. Components that opt into the cutover import from here; others continue using corpus.ts directly (unchanged behavior). Phase-4 cutover flips components one-by-one rather than big-bang-replacing corpus.ts. 
Phase-1/2 now has the full CLI surface (19 subcommands), LICENSEs for legal Phase-3 deploy, and the site-side cutover pathway ready for Phase-4 canary.	2026-04-16 13:10:16 -04:00
Vijay Janapa Reddi	42f4d1ca8b	fix(vault): Round-3 correctness + `vault ship` + authoring contract Round-3 review (4 reviewers on v2.1) surfaced two code-correctness Criticals that this commit fixes, plus the contracted-but-missing `vault ship` coordinator and David's authoring-UX gaps. Critical fixes (real bugs in landed code): worker/src/index.ts - SCHEMA_FINGERPRINT placeholder fail-closed (Chip R3-C1 / Dean R3-NH-3). Was: placeholder auto-passed and silently disabled the fingerprint check. Now: placeholder forces degraded mode until operator sets real fingerprint. - DDL hash now includes triggers (FTS5-aware). - release_id change invalidates schema-fingerprint memoization (Dean R3-NH-4). - wrangler.toml now pins the real fingerprint. staffml/public/sw.js - /manifest polling TTL-throttled to 5min (Chip R3-C2). Was: per-request fetch nullified the §10.4 cost model. - API origin persisted to IndexedDB; rehydrated on activate so cold offline wake-ups serve cached content (Chip R3-H3). vault-cli/src/vault_cli/release.py - emit_migrations diffs all 4 tables via PRAGMA-driven column introspection (Dean R3-NC-1 + R3-NH-2). Was: only questions table, silently missing chains/chain_questions/tags. Rollback-symmetry test extended to populate + verify all tables. vault-cli/src/vault_cli/commands/release.py - vault verify --git-ref reconstructs release from 'git archive <ref>' into a tempdir (Dean R3-NC-2). Was: always rebuilt from HEAD, so verifying a historical release always failed post-authoring. Academic-citability contract (C-3) now actually holds. vault-cli/src/vault_cli/ship.py (NEW) - vault ship composed verb with journaling (Dean R3-NH-1): * Legs run D1 → Next.js → paper-tag-last (§6.1.1 ordering). * Journal at releases/<v>/.ship-journal.json records per-leg state; --resume continues interrupted ships idempotently. * Pre-paper failure auto-rolls back in reverse order. 
* Paper-leg failure pages operator; does NOT auto-rollback earlier legs (git tag is remote-durable per §6.1.1). - 4 unit tests cover happy path, pre-paper failure auto-rollback, paper-leg needs-manual, --resume across interruptions. vault-cli/src/vault_cli/commands/authoring.py - vault new appends to id-registry.yaml (David R3-H3 + C-5 enforcement); `git pull --rebase` before allocation. - authors: auto-populated from git config user.email (David R3-H4 / M-15). Was: field never set. - vault edit injects validation-error comment block at top of YAML and re-opens up to --retries=3 times (David R3-H1). Was: terminal traceback mid-authoring session. - vault move refuses dirty tree, chained question, excluded-cell per applicability matrix (David R3-H2). Was: unchecked git mv. - vault renumber command (NEW): post-rebase seq-collision recovery. Bumps seq, renames file, updates id field, appends registry (David R3-N-2, was spec-only). - vault mark-exemplar command (NEW): promotes to vault/exemplars/ with provenance + human_reviewed_at gate (David R3-N-9). vault-cli/src/vault_cli/compiler.py - FTS5 virtual table + sync triggers added to DDL (B.5). Triggers keep questions_fts in sync via AFTER INSERT/UPDATE/DELETE. schema_fingerprint accounts for triggers now. tests/test_hashing.py - Nested-dict hash-stability fixture (Soumith R3-F-4). Was: test only reordered top-level keys + collapsed details to one key. All 28 tests pass (22 → 28: +4 ship journaling, +1 multi-table migration symmetry, +1 nested-dict hash stability). release_hash unchanged at 1b304282... — FTS5 addition doesn't affect content Merkle per §3.5 input-only design.	2026-04-16 13:10:16 -04:00
Vijay Janapa Reddi	d8f6abae4b	feat(worker): Phase 3 D1 worker scaffold + shared types package Phase 3 is CODE-COMPLETE; actual D1 creation + Worker deployment require authenticated Cloudflare credentials (user action gate per kickoff stop-conditions). staffml-vault-worker/ wrangler.toml — DB binding, CORS allowlist, TTL env vars, SCHEMA_FINGERPRINT placeholder, GRACE_WINDOW_SECONDS for cross-release serving. src/index.ts — 6 endpoints (manifest, questions, questions/:id, search, stats) with ETag + cursor pagination + SWR Cache-Control + CORS. src/types.ts — Env binding + row shapes. README.md — deploy-day runbook. Key v2.1 behaviors wired: - X-Vault-Release is INFORMATIONAL (not hard-reject) — worker serves from release_metadata.release_id; header is SLI signal only. Fixes Soumith H-NEW-2 local-dev + SWR revalidation brownout. - schema_fingerprint cold-start check hashes actual sqlite_master DDL (not metadata-vs-metadata, closes Dean N-4). On mismatch: Cache-API read-only mode with X-Vault-Degraded header, never 5xx (closes Chip N-H1 total-outage risk). - Cache keys keyed by release_id → deploy-time atomic POP invalidation (H-14). - ETag format: '<release_id>:q:<content_hash>' — 304 support (Soumith H-NEW-2). - Cursor pagination via opaque base64 {offset, filter_hash} tokens (H-20). Clients never construct cursors. - CORS allowlist from wrangler var; no wildcard in prod. staffml-vault-types/ index.ts — shared TS contract types; pnpm workspace protocol between worker + site (Soumith M-NEW-1 resolution). package.json — @staffml/vault-types, workspace-private. vault-cli/scripts/ emit_d1_schema.py — generates d1-schema.sql from compiler DDL; reports SHA-256 fingerprint to paste into wrangler.toml SCHEMA_FINGERPRINT var. d1-schema.sql — committed schema; applied to fresh D1 via 'wrangler d1 execute <db> --file d1-schema.sql'. Deploy-day gates (per CUTOVER_QA.md §0 and TESTING.md phase-entry): 1. License decision resolved (L-10 still OPEN). 2. 
wrangler d1 create staffml-vault (prod + staging) — user action. 3. Apply d1-schema.sql + seed via d1-migration.sql. 4. FTS5 load-test gate: p99 warm ≤100ms, p99 cold ≤500ms, ≤500 D1 row-reads/query (Dean N-5 cost gate). 5. Data-plane SLI crons emitting to Grafana.	2026-04-16 12:42:13 -04:00
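The opaque cursor and ETag formats listed in d8f6abae4b can be illustrated in Python, although the worker itself is TypeScript. Only the `{offset, filter_hash}` payload and the `<release_id>:q:<content_hash>` layout come from the log; the hashing recipe, the 16-character truncation, and every function name below are assumptions.

```python
import base64
import hashlib
import json
from typing import Optional


def filter_hash(filters: dict) -> str:
    """Short stable digest of the active filters (key order does not matter)."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def encode_cursor(offset: int, filters: dict) -> str:
    """Pack {offset, filter_hash} into an opaque URL-safe base64 token."""
    payload = {"offset": offset, "filter_hash": filter_hash(filters)}
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str, filters: dict) -> Optional[int]:
    """Return the offset, or None if the token is malformed or was issued
    for a different filter set."""
    try:
        payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
        offset = payload["offset"]
        if payload["filter_hash"] != filter_hash(filters):
            return None
    except (ValueError, KeyError, TypeError):
        return None
    return offset if isinstance(offset, int) and offset >= 0 else None


def etag(release_id: str, content_hash: str) -> str:
    """Format quoted in the commit message: '<release_id>:q:<content_hash>'."""
    return f"{release_id}:q:{content_hash}"
```

Binding the cursor to a hash of the filters means a token copied onto a different query is rejected instead of silently paging through the wrong result set.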
Vijay Janapa Reddi	f633cc9174	feat(vault-cli): Phase 1 commands — build, check, new/edit/rm/restore/move, api, serve vault build — compile YAML → vault.db with release_hash stamped. vault check — invariant check engine with --tier (fast|structural|all) and --json (LSP-diagnostic shape per JSON_OUTPUT.md). vault new — allocate content-addressed ID (topic + 6-hex + seq); refuses to overwrite existing files. vault edit — open in $EDITOR; re-validates on save; exit 1 on validation failure per exit-code taxonomy. vault rm — default soft-delete (status=deprecated); --hard requires typed title confirmation + --force for chained questions (Chip N-H11). vault restore — inverse of soft-delete. vault move — reclassify via git mv (falls back to shutil.move if not in a git repo); --dry-run safe. vault api — localhost Worker-surface shim serving the production endpoint contract from local vault.db (closes H-17). Binds 127.0.0.1; prints divergence notice. vault serve — Datasette wrapper, 127.0.0.1 only. scripts/split_corpus.py — one-shot converter corpus.json → per-question YAML under vault/questions/<track>/<level>/<zone>/. Preserves legacy IDs for bookmark stability. Appends to id-registry.yaml (append-only log). Legacy chain_positions dict → structured {id, position} form with 1-indexed positions. scripts/pre_commit_corpus_guard.py — refuses commits that edit vault/corpus.json directly (enforces C-2 single-authoring-surface). Override via 'Vault-Override: corpus-json-hand-edit' trailer.	2026-04-16 12:37:06 -04:00
Vijay Janapa Reddi	8af0948a35	ci(vault): Phase 0 CI workflow + exemplar-coverage audit .github/workflows/vault-ci.yml Matches repo workflow style (emoji-prefixed name, concurrency group, path-scoped triggers). Phase-0 scope: pip install, vault --version, ruff, pytest, exemplar-audit staleness check. Python 3.12 pinned for hash stability per ARCHITECTURE.md §3.5. mypy --strict included but non-blocking at Phase 0; enforces in Phase 1. Placeholder for vault check --strict, vault build, vault codegen --check as those commands land. interviews/vault-cli/scripts/exemplar_coverage_audit.py Reads corpus.json, groups by (track, level, zone), counts total questions vs exemplar-eligible per cell (requires provenance ∈ {human, llm-then-human-edited}). Phase-0 honest output: provenance field doesn't exist in corpus.json yet, so eligible=0 for every cell until Phase-1 YAML split + provenance backfill. Audit shape is stable so Phase-1 re-runs slot in without refactoring. interviews/vault/exemplar-gaps.yaml First audit snapshot: 190 cells catalogued, all gap=3 pending Phase-1. Filling gaps unblocks vault generate in Phase 7, not a Phase 0 blocker (Chip N-H3 resolution). Phase 0 milestone: complete.	2026-04-15 21:25:52 -04:00

68 Commits