cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-07-17 16:34:48 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	2b381bb949	refactor(vault-cli): rename --legacy-json to --local-json	2026-04-30 09:30:28 -04:00
    The flag is the StaffML frontend's local-dev fallback (read corpus.json from disk via NEXT_PUBLIC_VAULT_FALLBACK=static), not a deprecated path. "Legacy" implied "soon to be removed"; "local-json" describes its actual role and reads correctly in scripts and docs.

    - vault-cli: rename CLI flag, parameter, result key, and help text.
    - CI workflows + pre-commit config: invoke the new flag name.
    - All scripts that print the command (suggest_exemplars, pre_commit_corpus_guard, promote_validated, rename_legacy_ids, export_to_staffml, the paper analyze_corpus/generate_*) updated.
    - Comments and docs (ARCHITECTURE, CHANGELOG, REVIEWS, TESTING, MASSIVE_BUILD_RUNBOOK, DEPRECATED, AUTHORING, plus frontend comments and .env.example / .gitignore) updated.

    The "legacy_json" sentinel string in corpus_stats.json._meta.source is intentionally NOT renamed; it is a stable artifact format read by downstream paper-generation tooling.
Vijay Janapa Reddi	a610deec21	feat: verify and fix BibTeX for interviews and MLSysIM papers	2026-04-26 14:55:52 -04:00
    - interviews/paper: SWE-bench/ETS/VLDB/NeurIPS/MMLU metadata, figure rebuild and corpus script updates
    - mlsysim: Eisenman author list, ISCA x-verified-source DOIs, snell2025scaling, Narayanan/Pope/PaLM/MLPerf; docs/references.bib aligned with paper
Vijay Janapa Reddi	c133967094	paper: v0.1.0 audit + fresh-reader pass - apparatus framing, validator coverage, practice-UI mockup	2026-04-26 10:01:11 -04:00
    Combined revision pass against (a) a 10-item correctness audit of paper.tex versus the consolidated 0.1.0 release at `55fec89898` and (b) three fresh-reader persona reads (ML systems engineer, NeurIPS D&B reviewer, working practitioner). The two passes converged on the same five high-leverage issues; this commit addresses all of them plus the audit's must-fix list.

    Structural rewrites (§1–§3):
    - Abstract leads with the artifact (~9,757 questions, real hardware) and acknowledges the construct-validity gap on page 1.
    - "Ikigai" demoted from competency-model brand to a one-line mention of where the four-circle Venn visual is borrowed from; body uses "four-skill model" and "cognitive zones" throughout.
    - §1 lead-restructured: TinyML duty-cycling example moved up to follow the "constraints drive architecture" thesis (so reader sees a concrete question before the framework apparatus). Punchline corrected: at 2% duty cycle the active term dominates sleep current; constraint is duty cycle.
    - §3.2 adds the upfront 11→6 zones honesty so the §13 admission isn't a surprise reveal.
    - §3.3 reframes ZONE_LEVEL_AFFINITY as a soft authoring prior; flags ZONE_BLOOM_AFFINITY as the hard validate-at-write rule.
    - "Five Laws" softened from operative axis to organising pillars (paper doesn't actually tag questions with laws).

    Audit correctness fixes:
    - Quantify→implement naming: kept "quantify" in prose, added a clarifying footnote about the schema identifier and macros mapping.
    - L867 zone counts now use \zoneDiagnosisCount / \zoneFluencyCount / \zoneEvaluationCount / \zonesBelowFloor macros (already auto-emitted by the consolidated 0.1.0 release).
    - L892 "27%" arithmetic + 79-vs-87 topic discrepancy: documented the pre-v1 origin of the matrix in a footnote; reported counts characterise the original 316-cell matrix; v1-topic per-track applicability now an explicit limitation in §13.
    - Table 3 (areas) regenerated for 87 topics across 13 areas.
    - Table 4 (full taxonomy) extended with the 8 v1 topics in their appropriate area columns.
    - \numedges semantics clarified: 57 means prerequisite edges only; raw counts for broader/narrower (14) and related (54) given inline.
    - 31→32 root topics (matches corpus_stats.json.taxonomy_graph.root_topics).
    - L815 math-verification: replaced bi-model story with what actually shipped: three-stage Gemini-3.1-pro-preview pipeline (generator → judge → math reviewer) with cross-stage agreement; bi-model framed as future.
    - Bloom→industry-ladder mapping (Table 5) softened to "illustrative, not normative" with explicit deferral to ongoing psychometric calibration.
    - "Physics-grounded" exclusion-matrix language softened to "deployment-feasibility" where appropriate; "physics-grounded" preserved for napkin math (where it is genuinely physics).
    - §6→§7 LinkML claim made honest: schema is canonical, derived artefacts kept in sync via tools/check_schema_sync.py drift check.
    - §7 schema/infrastructure: documents the SQLite vault.db build and Cloudflare D1 worker production path (corpus.json relegated to fallback).

    Insertion paragraphs (audit Rewrites A/B/C/D):
    - Rewrite A: §QA Schema Validation bullets now document the four v0.1.0 model-validators (Visual.kind enum, path regex, alt/caption length, zone-bloom compatibility, visual-path-resolves) plus the 15 unit tests.
    - Rewrite B: §LLM-Assisted Generation gains validate-at-write contract paragraph + PARALLELISM_RULES variant + cumulative yield numbers (462 area fixes, 576 zone-bloom fixes, 1308→0 lint warnings).
    - Rewrite C: §QA Structural and Semantic Invariants gains repair-script paragraph naming all five scripts and the bounded fix discipline.
    - Rewrite D: §LLM Failure Modes replaced with empirical 5-mode taxonomy observed during release-readiness audits.

    Reader-flagged fixes:
    - MFU=40% diagnosis question lowered to a more plausible 8% with batch-size context; arithmetic-intensity ceiling math made explicit.
    - MCQ format constraint context added to §7 opening so the L768 "MCQ must have 4 options" rule isn't a surprise.
    - §13 Limitations expanded from six to seven; new entry covers the v1-topic applicability-matrix lag.

    New figure:
    - figures/fig-practice-ui.svg: a single-page mockup of the practice interface (filter sidebar, active question card, chain progression rail). Inserted as Figure 11 in §11 Practical Applications. Addresses the convergent reader complaint that the paper never showed what a study session looks like.

    Build verification:
    - 35 pages, three pdflatex passes, zero overfull hboxes from new content.
    - All numbers consistent with macros.tex emitted by vault export-paper at the consolidated 0.1.0 release (793c06f414f2bf83). Pre-existing undefined-citation warnings (williams2009roofline, nvidia2022h100, etc.) are not from this revision; they were already present in the bibliography.
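The corrected duty-cycling punchline in the commit above (at 2% duty cycle the active term dominates sleep current) can be checked with a short calculation. This is a minimal sketch, not the paper's own example: the 10 mA active current and 5 µA sleep current are hypothetical values chosen for illustration, and only the 2% duty cycle comes from the commit message.

```python
def average_current_ma(duty_cycle, active_ma, sleep_ma):
    """Split the duty-cycled average current (mA) into its two terms.

    I_avg = d * I_active + (1 - d) * I_sleep
    Returns (active_term, sleep_term, total).
    """
    active_term = duty_cycle * active_ma
    sleep_term = (1.0 - duty_cycle) * sleep_ma
    return active_term, sleep_term, active_term + sleep_term


# Hypothetical device: 10 mA while active, 5 uA (0.005 mA) while asleep.
# Only the 2% duty cycle is taken from the commit message.
active, sleep, total = average_current_ma(0.02, active_ma=10.0, sleep_ma=0.005)
print(f"active term: {active:.4f} mA")
print(f"sleep term:  {sleep:.4f} mA")
print(f"average:     {total:.4f} mA")
```

Under these assumed currents the active term is roughly forty times the sleep term, so lowering the sleep current further barely moves the average; the duty cycle is the lever that matters.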
Vijay Janapa Reddi	9b313c17d9	feat(paper/scripts): add validate_refs.py - CrossRef spot-check for paper.bbl	2026-04-24 11:26:07 -04:00
    Small bbl-validation helper for the interviews paper bibliography. Reads paper.bbl, extracts each bibitem's rough title, queries CrossRef, and prints [OK] / [WARN] / [ERR] per citation key. Useful as a spot-check after large bibliography edits to catch typos, wrong years, or silently-renamed works.

    Placed alongside the other paper-tooling (analyze_corpus.py, generate_figures.py, generate_macros.py). Path resolution uses Path(__file__).parent so it works from any CWD.
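The flow described in the commit above (read the .bbl, pull a rough title per bibitem, compare against CrossRef, print a status per key) might look roughly like the following. This is a sketch under stated assumptions, not the contents of validate_refs.py: the function names, the regular expression, the similarity thresholds, and the sample entry are all hypothetical. The CrossRef lookup uses the public `https://api.crossref.org/works` endpoint and is defined but not called here, so the example runs offline.

```python
import difflib
import json
import re
import urllib.parse
import urllib.request

# Matches \bibitem[optional label]{key} and captures the entry body up to the
# next \bibitem or the end of the bibliography (assumed natbib-style layout).
BIBITEM_RE = re.compile(
    r"\\bibitem(?:\[[^\]]*\])?\{([^}]+)\}(.*?)"
    r"(?=\\bibitem|\\end\{thebibliography\}|\Z)",
    re.S,
)


def extract_bibitems(bbl_text):
    """Return (key, rough_title) pairs; the title is assumed to be the
    first \\newblock segment of each entry."""
    items = []
    for key, body in BIBITEM_RE.findall(bbl_text):
        blocks = [b.strip() for b in body.split(r"\newblock")]
        title = blocks[1] if len(blocks) > 1 else blocks[0]
        title = re.sub(r"[{}]", "", title)          # drop protective braces
        title = " ".join(title.split()).rstrip(".")  # normalise whitespace
        items.append((key, title))
    return items


def crossref_top_title(title, timeout=10):
    """Query CrossRef for the best bibliographic match (needs network access)."""
    query = urllib.parse.urlencode({"query.bibliographic": title, "rows": 1})
    url = f"https://api.crossref.org/works?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        results = json.load(resp)["message"]["items"]
    titles = results[0].get("title") if results else None
    return titles[0] if titles else None


def classify(expected, found, ok_at=0.9, warn_at=0.6):
    """Map title similarity to OK / WARN / ERR (thresholds are assumptions)."""
    if not found:
        return "ERR"
    ratio = difflib.SequenceMatcher(None, expected.lower(), found.lower()).ratio()
    if ratio >= ok_at:
        return "OK"
    return "WARN" if ratio >= warn_at else "ERR"


# Illustrative entry only; the real paper.bbl is not shown in this log.
SAMPLE_BBL = r"""
\begin{thebibliography}{1}
\bibitem[Williams et~al.(2009)]{williams2009roofline}
Samuel Williams, Andrew Waterman, and David Patterson.
\newblock Roofline: an insightful visual performance model for multicore architectures.
\newblock \emph{Communications of the ACM}, 2009.
\end{thebibliography}
"""

for key, title in extract_bibitems(SAMPLE_BBL):
    # Offline demo: compare the title against itself instead of CrossRef.
    print(f"[{classify(title, title)}] {key}: {title}")
```

To run the check against CrossRef, replace the second argument of `classify` with `crossref_top_title(title)`; a fuzzy ratio is used rather than exact equality because .bbl titles typically differ from registry titles in casing and punctuation.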
Vijay Janapa Reddi	0ad41c693d	docs(vault): architecture v2.2 + Round-3 ledger + paper-agree-by-SQL	2026-04-16 13:10:16 -04:00
    ARCHITECTURE.md header bumped to v2.2. Full changelog block added (v2.1 → v2.2) keyed to Round-3 finding IDs. §7.1 + §10.2 edited to align X-Vault-Release soft-signal semantics with §6.1.1 (Soumith F-1).

    REVIEWS.md §Round-3 added: per-reviewer verdicts (Chip YELLOW, Dean YELLOW→GREEN, Soumith GREEN-conditional, David YELLOW→GREEN), convergence map of 11 integrated items, explicitly-deferred list (Cache API, breaker half-open, rate-limit KV, cross-lang hash path, worker vitest, LSH dedup; all documented as Phase-3-entry gates).

    CONTRIBUTING.md quickstart corrected (David R3-H5): step 3 dropped the Phase-1+ 'doctor'/'stats' references; step 4 shows 'vault build' before 'vault api' so the shim has something to serve.

    paper/scripts/generate_macros.py rewritten as thin wrapper over 'vault export-paper' (B.1; closes §20.5 #2 + #7). Uses sys.executable -m vault_cli.main so PATH isn't required.

    paper/macros.tex (regenerated): 66-line emission with both \staffml* and legacy \num* namespaces. paper.tex needs no edits during transition. Paper and site now agree by construction: the structural fix for H-21 (9,199 vs 8,053) bug class.

    paper/corpus_stats.json (regenerated): full superset of the v1 analyze_corpus.py output, driven by SQL over vault.db with 'by_zone', 'by_level', 'by_track', chain 'by_length' distribution, 'bloom_distribution' (zone→bloom derived mapping), applicability.
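The "thin wrapper" pattern mentioned in the commit above, invoking the CLI through `sys.executable -m vault_cli.main` so that no `vault` entry point has to be on PATH, can be sketched as follows. This is an illustration, not the actual generate_macros.py: the function names are hypothetical, and because the options accepted by `vault export-paper` are not shown in this log, none are passed.

```python
import subprocess
import sys


def build_export_command():
    """Build the argv for `vault export-paper` via the current interpreter.

    Using sys.executable with -m runs the module inside the same Python
    environment as this script, so a `vault` executable on PATH is not needed.
    """
    return [sys.executable, "-m", "vault_cli.main", "export-paper"]


def run_export():
    """Run the export and raise CalledProcessError on a non-zero exit.

    Assumes the vault_cli package is importable in this environment.
    """
    subprocess.run(build_export_command(), check=True)


# Show the command without executing it.
print(" ".join(build_export_command()))
```

Calling `run_export()` only works where the vault CLI package is installed; keeping the command construction in its own function makes the wrapper easy to inspect or test without running the export.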
Vijay Janapa Reddi	4c31251b39	refactor: standardize paper directory structure across all three papers	2026-04-05 15:57:55 -04:00
    Consistent layout for StaffML, mlsysim, and TinyTorch papers:
    - figures/ for all visual assets (SVGs, PDFs, PNGs)
    - scripts/ for utility scripts (analysis, validation, benchmarks)
    - tables/ for standalone table .tex files (StaffML only)
    - Makefile at root for building (created one for mlsysim)

    Removed redundant build scripts (compile_paper.sh, build.sh) in favor of Makefiles. Deleted sort_app_matrix.py (no longer needed). Merged mlsysim images/ into figures/. Updated all references in paper.tex, Makefiles, and CI workflows.

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

6 Commits