cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-08 09:57:21 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	bc26a0bf37	feat(vault): Phase 6 schema tightening — markers + Details forbid + invariant Three coordinated edits to lift the marker convention from a soft draft-validation gate to a published-corpus invariant: 1. interviews/vault/schema/question_schema.yaml (LinkML, source of truth): common_mistake and napkin_math gain regex patterns matching the AUTHORING.md Pitfall/Rationale/Consequence and Assumptions/ Calculations/Conclusion conventions. Documents the spec; enforced in the validator below. 2. interviews/vault-cli/src/vault_cli/models.py (Pydantic, derived): Details flips from extra='allow' to extra='forbid'. A pre-flight survey on 2026-05-04 across all 10,711 YAMLs found 0 unknown keys on Details, so the historical 'imported legacy fields' risk no longer applies. 3. interviews/vault-cli/src/vault_cli/validator.py: structural_tier gains _check_format_markers (invariant #19), which flags published YAMLs whose non-empty cm/nm doesn't match the AUTHORING.md markers. Drafts are exempt — author-in-progress drafts may still have malformed markers. Lifts gate_format from validate_drafts.py / _judges.py from a CI-time gate to a vault-check-strict invariant. Tests: 4 new cases in test_models covering Details forbid, marker- compliant pass, malformed cm fail, and draft-exempt skip. Total 88 passing (was 84). codegen-hashes.txt updated for the models.py edit; vault codegen --check passes. The on-disk corpus is fully clean post-Phase-5+drain: vault check --strict reports 10,711 loaded, 0 invariant failures, 0 format- marker violations on published YAMLs.	2026-05-04 08:41:08 -04:00
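The format-marker invariant described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in validator.py: the commit names the three labels expected in each field but not the exact regex, so the `Label:` syntax, the `status` values, and the function name `check_format_markers` are all assumptions.

```python
import re

# Hypothetical patterns: the commit lists the labels for each field, but the
# "Label:" spelling and the ordering enforced here are assumptions.
COMMON_MISTAKE_RE = re.compile(r"Pitfall:.+?Rationale:.+?Consequence:.+", re.DOTALL)
NAPKIN_MATH_RE = re.compile(r"Assumptions:.+?Calculations:.+?Conclusion:.+", re.DOTALL)


def check_format_markers(status, details):
    """Return a list of violation strings for one question record."""
    if status != "published":
        return []  # drafts are exempt, as the commit describes
    violations = []
    for field, pattern in (("common_mistake", COMMON_MISTAKE_RE),
                           ("napkin_math", NAPKIN_MATH_RE)):
        text = (details.get(field) or "").strip()
        # Empty fields are skipped; only non-empty text must carry the markers.
        if text and not pattern.search(text):
            violations.append(f"{field}: missing or misordered markers")
    return violations
```

Keeping the draft exemption inside the check, rather than filtering drafts beforehand, means a single call site can run over the whole corpus.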
Vijay Janapa Reddi	542aaf95d2	cleanup(vault): release-ready Phase A — schema hardening + lint calibration + chain repair Closes the cleanup arc (A.1–A.10 in RESUME_PLAN_RELEASE.md). Every gate is now green: vault check --strict, vault lint, vault doctor, vault codegen --check, staffml validate-vault, Playwright (9/9), tsc. A.1 mobile-1962.svg: renamed `Edge` → `RegEdge` in graphviz source (`Edge` is a reserved keyword); SVG renders cleanly. Also fixed tinyml-1570.py (missing `import numpy as np`) which the new failure log surfaced. A.2 render_visuals.py: structured per-ID failure log written to `_validation_results/render_failures.json` on every run; non-zero exit on any per-item crash; new `--fail-fast` and `--failure-log` CLI options. Replaces the prior silent-failure mode. A.3 LinkML visual schema: typed as a structured sub-schema. New `VisualKind` enum (svg only — `mermaid` was reserved but never shipped, dropped to keep the enum honest). Path regex tightened to `^[a-z0-9-]+\.svg$`. Alt minimum length 10, caption required minimum length 5. TypeScript Visual interface + Question.visual field added to staffml-vault-types/index.ts. A.4 Pydantic Visual + Question validators: - Visual.kind hard-rejects anything but `svg` - Visual.path enforces the new regex - Visual.alt min 10 chars, caption required min 5 chars - Question.model_validator: visual.path MUST resolve to a real file under interviews/vault/visuals/<track>/. Skipped in production deploys where the working tree is absent. A.5 Registry repair + doctor split: - tools: repair_registry.py appended 5,269 missing IDs (the rename refactor at `8a5c3ff3c` left the append-only registry unsynced; this brings disk-coverage to 100%). Header block in id-registry.yaml documents the rebuild rationale. 
- doctor.py: split symmetric `registry-integrity` check into `disk-coverage` (HARD FAIL if any disk YAML id is unregistered) and `registry-history` (INFO ONLY for retired ids — the registry is by design an audit log, retired ids are normal). Pre-existing `_check_schema_version` bug (`versions == {1}` vs string `"1.0"`) fixed. A.6 Lint calibration via 4-expert consensus + bloom-canonical reclassification: - Spawned 4 experts (Vijay Reddi, Chip Huyen, Jeff Dean, education-reviewer) on 42 disputed (zone, level) pairs; consensus-builder aggregated to 15 valid / 19 invalid / 8 borderline. - User arbitrated 8 borderlines: 7 widen / 1 reclassify. - Built ZONE_BLOOM_AFFINITY matrix (Education-Reviewer's idea): every zone admits its dominant Bloom verb + adjacent verbs, rejects clear hierarchy violations. - reclassify_zone_bloom_mismatch.py applied 576 deterministic zone fixes via BLOOM_CANONICAL_ZONE mapping (e.g. fluency+analyze → analyze, recall+analyze → analyze, evaluation+apply → implement). - Question.model_validator(_zone_bloom_compatible): hard-rejects future zone-bloom mismatches at write time. Generated drafts can no longer ship a self-contradicting classification. - ZONE_LEVEL_AFFINITY widened per consensus + arbitration + post-reclassification adjustments. Lint warnings: 1,308 → 0. A.7 Chain integrity: - repair_chains.py: drops chain refs when a chain has <2 published members (chain ceases to exist), renumbers all members of any chain whose positions are non-sequential / duplicated / non-monotonic-by-level. Sort key: level ascending, then old position, then qid (deterministic). - validate-vault.py: relaxed sequential check to unique-positions check. Position gaps from mid-chain deletions are normal; what matters is uniqueness + bloom-monotonicity (vault check --strict enforces both from YAML source-of-truth). A.8 Practice page visual + zoom modal: - QuestionVisual.tsx: wraps the `<img>` in `<Zoom>` from react-medium-image-zoom (4 KB). 
Click image → fullscreen `<dialog data-rmiz-modal>`; ESC closes. Added test-id `question-visual-img` for stable selector. - New Playwright test: 9th in the suite, deep-links cloud-4492, asserts the dialog opens on click and closes on ESC. - TypeScript: removed `mermaid` from local Visual types in corpus.ts and corpus-vault.ts; tsc clean. A.9 All gates green: - vault check --strict: 0 errors / 0 invariant failures - vault lint: 0 errors / 0 warnings (was 1,308 warnings) - vault codegen --check: artifacts in sync (hash baseline updated) - vault doctor: 0 fails (registry-history info, git-state warn on uncommitted state-pre-this-commit) - staffml validate-vault: 0 errors / 0 warnings, deployment-ready - Playwright: 9/9 pass (was 8; +zoom modal test) - render_visuals: 0 errors (was 2 silent failures pre-A.2) - tsc: clean Distribution after reclassification: 9,544 published unchanged; 576 items moved zone via bloom-canonical mapping (full per-item report at /tmp/reclassify_changes.csv). Chain count 879 → 850 after orphan-singleton drops. release_hash updated. Carry-forward to next session (Phase B): - Priority gap closure for parallelism cells + global L4-L6+ (the run that produced this corpus did not close the targeted cells; B.3 needs specialized prompts per cell-class) - 120 NEEDS_FIX items from coverage_loop/20260425_150712/ still carry judge fix_suggestions; spawn fix-agent in Phase C	2026-04-25 15:12:51 -04:00
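The chain repair in A.7 (drop chains with fewer than two published members, then renumber by level, old position, and qid) might look something like the sketch below. It is not the contents of repair_chains.py: the record shape, integer levels, and 1-based positions are assumptions, the input is assumed to be already filtered to published members, and for brevity it renumbers every surviving chain rather than only those with broken positions.

```python
from collections import defaultdict


def repair_chains(members):
    """Renumber chain members deterministically.

    members: list of dicts with keys chain_id, qid, level, position
    (hypothetical shape). Returns {chain_id: [(qid, new_position), ...]}.
    """
    by_chain = defaultdict(list)
    for m in members:
        by_chain[m["chain_id"]].append(m)

    repaired = {}
    for chain_id, group in by_chain.items():
        if len(group) < 2:
            continue  # a chain with fewer than two members ceases to exist
        # Sort key from the commit: level ascending, then old position, then qid.
        ordered = sorted(group, key=lambda m: (m["level"], m["position"], m["qid"]))
        repaired[chain_id] = [(m["qid"], i) for i, m in enumerate(ordered, start=1)]
    return repaired
```

Using qid as the final tiebreaker makes the result independent of input order, which is what keeps repeated runs deterministic.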
Vijay Janapa Reddi	a17107f3df	chore(vault-cli): update d1 schema + codegen hashes for schema v1.0 - d1-schema.sql: regenerated to match compiler.py changes. Adds competency_area, bloom_level, phase, human_review_* columns to questions table. Adds idx_questions_human_review index. chain_questions PK changes from (chain_id, position) to (chain_id, question_id) for multi-chain + non-contiguous support. Drops deep_dive_title/deep_dive_url. - codegen-hashes.txt: new baseline covering the v1.0 models.py, d1-schema.sql, and @staffml/vault-types/index.ts. Fixes the vault codegen --check drift test that was failing CI.	2026-04-21 18:24:21 -04:00
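The drift check that this baseline feeds is described in the B.13 entry of this log as SHA-256 over the shared artifacts, compared against the committed codegen-hashes.txt, with the first run recording the baseline. A rough sketch of that idea follows; the `digest  name` file layout and the function names are assumptions, not the actual vault-cli format.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def check_codegen(artifacts, baseline_file):
    """Return the sorted list of drifted artifact paths (empty means in sync)."""
    current = {str(p): sha256_of(p) for p in artifacts}
    baseline_path = Path(baseline_file)
    if not baseline_path.exists():
        # First run: record the baseline instead of failing.
        lines = [f"{digest}  {name}" for name, digest in sorted(current.items())]
        baseline_path.write_text("\n".join(lines) + "\n")
        return []
    recorded = {}
    for line in baseline_path.read_text().splitlines():
        if line.strip():
            # Assumed layout: "<digest>  <path>", similar to sha256sum output.
            digest, name = line.split(maxsplit=1)
            recorded[name] = digest
    return sorted(n for n, d in current.items() if recorded.get(n) != d)
```

A check like this only detects that an artifact changed without the baseline being updated; it does not verify that the artifacts were regenerated from the schema.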
Vijay Janapa Reddi	f25f9e8184	feat(vault): B.1-B.7 + B.13 + B.15 + B.17 — finish bucket B Worker hardening (interviews/staffml-vault-worker/src/index.ts rewritten): - B.1 Cloudflare Cache API wired via caches.default; cache key is /__vault__/<release_id>/<path> so each release is a disjoint namespace. Deploy changes release_id → all old entries miss atomically. Degraded responses are NEVER cached (would poison the namespace). - B.3 Keyset pagination: cursor is {after_id, filter_hash}. Server computes filter_hash per-request and rejects cross-filter cursor reuse with 400. Pagination cost drops from O(offset + N) to O(N) per page. - B.4 Rate limiting via RATE_LIMIT_KV (src/rate_limit.ts): token bucket per (IP, class) windowed at 60s. 'default' 60 rpm, 'search' 10 rpm. Returns 429 with Retry-After header. Open-allows if KV not bound so the local vault-api shim still works. - /search uses FTS5 MATCH when questions_fts exists; fallback to LIKE for pre-FTS5 D1 instances. Escapes FTS5 special chars to prevent MATCH injection. vault-api.ts circuit breaker (B.2 — Soumith R3-F-2 fix): - Proper closed → open → half-open state machine. Half-open admits exactly one probe; failure → re-open immediately, success → close. - AbortSignal.timeout(10_000) per-attempt; AbortSignal.any() combines with caller's signal so React unmounts don't count as failures. - Retry only on retryable statuses (408/425/429/5xx/network), not on 4xx user errors or caller-aborted fetches. - Module-level _singleton so multiple makeClientFromEnv() share breaker state. __resetSingleton() exposed for tests. Worker vitest suite (B.6 — staffml-vault-worker/tests/worker.test.ts): 6 tests: rate-limit under/over cap with Retry-After; schema-fingerprint placeholder forces degraded mode; real fingerprint clears flag; cursor filter_hash mismatch returns 400; CORS echoes allowed origin; 405 on POST/PUT/DELETE; /admin/release returns 404 (no auth footgun). vault ship real hooks (B.15 — commands/release.py): - d1_forward: pnpm exec wrangler d1 execute <env-db> --file <migration.sql> - d1_rollback: applies d1-rollback.sql (SQL path); snapshot path remains primary per §6.2. - nextjs_forward: pnpm run deploy:<env> from site_dir. - nextjs_rollback: pnpm exec wrangler pages deployment list (lets operator pick rollback target). - paper_forward: git tag -a v<version> && git push origin v<version>. - --skip-legs allows shipping subset (e.g., skip=paper for pre-tag validation). Content-hash SLI workflow (B.5 — .github/workflows/vault-content-hash-sli.yml): Hourly GitHub Action samples 20 IDs from latest release's vault.db, fetches same IDs from production worker, recomputes canonical content_hash in Python, asserts parity. Files a priority-high issue on mismatch. Avoids porting hashing.py canonicalization to TypeScript (Chip R3-H5's invariant-bomb risk). JSON schemas (B.7 — vault-cli/docs/JSON_OUTPUT.md): Full stable shapes for build, publish, ship, new, rm, move, renumber, restore, promote, mark-exemplar, snapshot, migrations-emit, export-paper, tag, deploy, rollback, generate. Plus notes for serve/api (not JSON-emitting — long-running servers). Codegen hash baseline (B.13 hash-check variant): vault codegen --check now computes SHA-256 over 3 shared artifacts and compares to committed interviews/vault-cli/codegen-hashes.txt. First run auto-records baseline; subsequent runs enforce no drift. Full LinkML-driven regeneration remains a Phase-2 follow-up. Baseline recorded this commit. Component migration hook (B.17 — staffml/src/lib/hooks/useVaultQuestion.ts): Minimal React hook that routes through corpus-source.ts. Components opt into the cutover by importing from here; existing corpus.ts callers remain untouched. Cutover-day swap is one import per component, not a big-bang replacement. 28/28 pytest still green. release_hash 1b304282... unchanged (no content-affecting mutations).	2026-04-16 14:04:03 -04:00
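The closed, open, half-open state machine described for the circuit breaker in B.2 can be illustrated with the sketch below. The commit's implementation lives in TypeScript (vault-api.ts); this is a language-neutral rendering in Python, and the class name, the `failure_threshold` and `cooldown_s` parameters, and their default values are assumptions rather than anything stated in the commit.

```python
import time


class CircuitBreaker:
    """closed -> open -> half-open; half-open admits exactly one probe."""

    def __init__(self, failure_threshold=3, cooldown_s=30.0, clock=time.monotonic):
        # Threshold and cooldown are illustrative values, not from the commit.
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        self.probe_in_flight = False

    def allow(self):
        """Return True if a request may be attempted right now."""
        if self.state == "open" and self.clock() - self.opened_at >= self.cooldown_s:
            self.state = "half-open"
            self.probe_in_flight = False
        if self.state == "closed":
            return True
        if self.state == "half-open" and not self.probe_in_flight:
            self.probe_in_flight = True  # the single admitted probe
            return True
        return False

    def record_success(self):
        """A success (including a successful probe) closes the breaker."""
        self.state = "closed"
        self.failures = 0
        self.probe_in_flight = False

    def record_failure(self):
        """Count a failure; a failed probe re-opens immediately."""
        if self.state == "half-open":
            self._open()
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self):
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
        self.probe_in_flight = False
```

In line with the commit's note about React unmounts, a caller would call `record_failure()` only for retryable failures and skip it when the caller itself aborted the request. Injecting `clock` keeps the cooldown testable without sleeping.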

4 Commits