cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-08 02:28:25 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	1feb5fe7fc	fix(vault-cli/release): split E701 one-liners into multi-line conditionals Three 'if cond: stmt' single-line forms in the release-stats loop tripped ruff E701. Re-formatted to ruff-clean multi-line conditionals; behavior unchanged.	2026-05-06 08:07:15 -04:00
Vijay Janapa Reddi	d7c0bca7b4	Merge dev into chore/bib-verify-sweep (taking dev prose for conflicts)	2026-05-05 21:09:13 -04:00
Vijay Janapa Reddi	c0241d2f80	chore(bib): fix wrong-paper keys, DOI dupes, and corrupt entries Per-file audit caught 14 cite keys whose surname prefix or year did not match the entry's actual paper, plus 4 DOI duplicates and 3 corrupted orphan entries. Renames preserve the cited paper; only the key changes. Renames (key -> first-author-surname-year-shortform): - vol2: agarwal2022 -> ouyang2022instructgpt; alistarh2024 -> ashkboos2024quarot; belkada2022 -> dettmers2022llmint8; borgeaud2022 -> hoffmann2022chinchilla; bosma2022 -> wei2022cot; ermon2023 -> rafailov2023dpo; koyejo2023 -> schaeffer2023mirage; nofal2023 -> beyer2016sre (year/publisher also corrected to O'Reilly 2016). - vol1: mccarthy2006 -> mccarthy1955dartmouth; krizhevsky2017 -> krizhevsky2012imagenet; zhang2021 -> zhang2017rethinking; ford2012 -> savage2009flaw; wonyoung_kim2008 -> kim2008dvfs; estrada2026 -> dehghani2022datamesh; michelucci2018 -> glorot2010xavier (entry was Michelucci textbook chapter, prose wanted Glorot/Bengio AISTATS 2010); chapelle2009 -> chapelle2006semisupervised (entry was 1-page IEEE review, prose wanted the actual MIT Press book). - interviews: key555befcd -> gierl2013automatic; chiang2023 -> zheng2023judging; boylan1989 -> tay2024interview (Grind 75 web resource); stenbeck1992 -> hambleton1991 (entry was 1992 review of the 1991 IRT book, content was the book). DOI dedup: - vol1 palmer1980 + palmer1980intel8087 -> palmer1980intel8087 (same paper, redirected cite, deleted dupe). - vol2 masanet2020 + masanet2020energy -> masanet2020energy (same paper, redirected cite, deleted dupe). - vol1 abadi2016tensorflow had wrong DOI pointing to the 2018 EuroSys Dynamic Control Flow paper; rebuilt as the OSDI 2016 TensorFlow paper it claims to be. Mirrored same correction into vol2's duplicate entry. Orphan deletions (zero cite sites, corrupted metadata): - vol1 acun2023; vol1 aggarwal2018; interviews gallifant2024 (the clean GPT-4 entry already exists at openai2023gpt4). - vol1 yu2018 (legitimate paper but unused). - vol2 mckinsey2018ai and triton.jit (orphans flagged for missing year; triton.jit was a false positive from a Python decorator inside a code block, not a citation). Field repairs: - aws2020s3: added year=2020, fixed corrupted author "A. W. Services" to {Amazon Web Services}, added howpublished + url. 51 cite-site updates across 25 files in vol1/vol2/interviews/mlsysim. All book-prose.md §5 cite-mechanics audit greps return zero hits. bib_lint reports 0 errors across all three modified bibs.	2026-05-05 20:00:54 -04:00
Vijay Janapa Reddi	f12d303769	chore(interviews): purge stale AI prompts and dev scratch from interviews/ Remove ten files from the public repo that should never have been tracked. Verified no code references any of them before deleting. AI-prompt files (private to author tooling, do not belong in the public repo): - interviews/vault-cli/docs/GEMINI_SELF_AUDIT_PROMPT.md - interviews/vault/_pipeline/runs/gemini-self-audit/prompts/{cloud, edge,global,mobile,tinyml}_audit_prompt.md (5 per-track prompts; interviews/vault/.gitignore already excludes /_pipeline/, but these five were force-added in `f6c41d7689` before the rule was set) Dev-scratch artifacts (clearly leftover dev iteration; filenames literally say 'final' four different ways): - interviews/vault-cli/check_results_absolute_final.json - interviews/vault-cli/check_results_after_repair.json - interviews/vault-cli/check_results_final.json - interviews/vault-cli/check_results_total_final.json No production code, tests, docs, or CI references any of these paths. The audit-pipeline scripts that would write into _pipeline/ already respect the existing gitignore rule for that directory tree.	2026-05-05 10:51:53 -04:00
Vijay Janapa Reddi	5e5c03e757	chore(paper): regenerate figures + corpus_stats.json against current vault `make paper` regenerates these files from the live corpus on each build, so committing them here just lets a fresh checkout produce a paper.pdf without first running the full data-pipeline. Drift caught: - corpus_stats.json was a 9,757 snapshot from an interim state; refreshed to the current 9,521 published + 843 chains + 87 topics - 11 figure PDFs (heatmaps, distributions, pipeline schematics, etc.) re-rendered from corpus_stats.json paper.pdf builds clean (35 pages, 779 KB, 0 errors). Verified that the new macros render: 9,521 questions and 87 topics in the abstract, 92.4% validated in §Schema Validation, and the refreshed mobile-track prose with the A17 Pro / Snapdragon 8 Gen 3 NPU figures in §Mobile.	2026-05-05 10:43:45 -04:00
Vijay Janapa Reddi	c013e6e87d	fix(paper): refresh mobile-track description with current-generation hardware The mobile-track illustrative numbers were anchored to roughly 2022 figures: '15 TOPS at 5 W' for the NPU and a 4,500 mAh battery. Update to the current-generation envelope (Apple A17 Pro Neural Engine and Qualcomm Snapdragon 8 Gen 3 Hexagon both reach 30-40 TOPS at 4-5 W; flagship batteries cluster at $\\sim$5,000 mAh) so the prose stays defensible through the 1.0.x release window. Also tighten the battery-life claim. The original 'drain the battery in under 2 hours' figure assumed total system draw, not the bare 5 W NPU number. Make that explicit by saying the NPU plus CPU, camera pipeline, and memory subsystem draws closer to 10 W of system power, which is what produces the sub-2-hour estimate. Pure prose change in track description; no macro or schema impact.	2026-05-05 10:28:00 -04:00
Vijay Janapa Reddi	8e052b0a2b	fix(paper): refresh macros to current corpus + compute validated% from data The paper's auto-generated macros.tex was last regenerated when the v1.0.0 snapshot held 9,446 published questions; the post-tag audit work has since brought the published count to 9,521 (cloud +49, edge +14, mobile +2, tinyml +6, global +4) and consolidated topics from 89 to 87. Re-run `vault export-paper 1.0.0` so paper and site agree by construction. While here, fix a bug in the export-paper command itself: \numvalidated was hardcoded to 100.0\% regardless of the actual flag distribution. The flag isn't compiled into vault.db, so we read it back from the source YAMLs and emit the real percentage. Current state is 92.4\% (8,794 of 9,521 published questions carry validated=true). The drift came from new questions added without the flag set; the conservative fallback if the YAML scan fails preserves the legacy 100.0\% so the build never breaks. The macros change is the meaningful diff. release.json for 1.0.0 is left untouched to preserve the historical release metadata; vault.db is gitignored anyway so contributors rebuild it locally via `vault build` before paper renders.	2026-05-05 10:24:58 -04:00
Vijay Janapa Reddi	81f22882bb	fix(interviews,cloud-1380): codespell — retuned → re-tuned (×2) The pre-push codespell hook flags 'retuned' as a likely typo for 'returned'. The actual intent is the verb 're-tune' (tune again); hyphenating it sidesteps the false positive while keeping the meaning. Same pattern as edge-2167.yaml (fixed in wave-4).	2026-05-05 10:06:29 -04:00
Vijay Janapa Reddi	713d719c3f	merge origin/dev into yaml-audit Brings in the dev-side prose / bib / math fixes that landed since the yaml-audit branch was cut, and resolves three small conflicts: * interviews/vault-cli/scripts/archive/split_corpus.py origin/dev deleted it (archive cleanup); we honor the deletion. * interviews/vault-cli/scripts/validate_drafts.py origin/dev removed a leftover no-op statement; took theirs. * interviews/vault-cli/scripts/summarize_proposed_chains.py origin/dev renamed loop var lvl→level; took theirs. The two protected qmds (data_selection.qmd, model_compression.qmd) are temp-stashed before the merge to honor the 'do not touch' rule; restored after the merge commit lands. After this commit, yaml-audit contains every commit on origin/dev as an ancestor, so dev can fast-forward to yaml-audit's tip when the maintainer is ready to merge.	2026-05-05 10:03:14 -04:00
Vijay Janapa Reddi	9aa6fefc9c	fix(staffml,contribute): unique React keys in topic datalist The /contribute page's topic datalist mapped allTopics with key={t.id}, but topic ids appear in multiple competency areas (54 topics shared across 2-11 areas, e.g. 'mlops-lifecycle' spans 11 areas). Each duplicate triggered the React 'two children with the same key' warning — 326 of them per page load. Fix: namespace the key by area, key={`${t.area}::${t.id}`}. The 'value' attribute stays as t.id since that's what the user picks. Verified by walkthrough script: /contribute now renders with zero console errors, like the other 18 routes.	2026-05-05 10:00:46 -04:00
Vijay Janapa Reddi	312f00eeca	fix(staffml): napkin-math renderer polish Three small renderer fixes that came out of inspecting how the audit-corrected YAML content lands on /practice/?q=...: 1. Strip the redundant 'Conclusion & Interpretation:' / 'Result:' prefixes from result steps. The green callout already signals 'this is the conclusion'; leaving the labels in produces noise like 'Conclusion & Interpretation: Result: Memory-Bound. ...'. Handles bold, unbold, and bold-wrapping-the-whole-phrase forms. 2. Teach the number-and-unit highlighter about scientific notation (Ne12, 1.2×10^14) so phrases like '120e12 FLOPs' render as a single number+unit chunk instead of '120' (bold) + 'e12' (plain) + 'FLOPs' (gray). Also broaden the unit vocabulary to include Hz/MHz/GHz, W/mW/μW/mJ/μJ/J, MACs, cycles, frames, samples, and common compound rates (FLOPs/byte, FLOP/cycle, etc.). 3. Distinguish a section header line ('Conclusion & Interpretation:' alone on its line) from a result line. Previously the parser marked the header as isResult=true, which then rendered an empty green callout because cleanStepText stripped the header to ''. Filter empty steps after cleaning as a belt-and-braces. Verified across 10 sample questions covering different tracks (cloud/edge/mobile/tinyml) and napkin-math shapes (sci notation, multi-section structured, quantization-with-code, compute-bound, memory-bound, I/O-bound). No regressions; the result blocks now read directly with the verdict, not the section label.	2026-05-05 09:53:06 -04:00
Vijay Janapa Reddi	edcdba08da	docs(staffml,vault-cli): document the local-dev corpus pipeline Add interviews/staffml/README.md covering the local development workflow that the prior commit's predev hook relies on: - TL;DR install + run-dev steps - explanation of the production-worker vs local-static data flow - what the predev hook does (sync-periodic-table + vault build --local) - env vars (NEXT_PUBLIC_VAULT_FALLBACK, NEXT_PUBLIC_VAULT_API, STAFFML_SKIP_LOCAL_CORPUS) and their effects - troubleshooting the three failure modes that bit us during the YAML audit work (could-not-load, stale content, infinite loading) Update interviews/vault-cli/README.md to surface `vault build --local` in the Local-dev section with a pointer to the StaffML README. The intent: a contributor who edits a YAML and doesn't see the change in the dev server should now find the answer in the README before they're forced to read the loader source.	2026-05-05 09:33:43 -04:00
Vijay Janapa Reddi	c7b42e41d8	fix(dev): make `npm run dev` serve full question content from local YAMLs Before this change, the StaffML Next.js dev server fetched scenario and details (including napkin_math) from the production Cloudflare Worker even when contributors had local YAML edits — so changes weren't visible without shipping. The opt-in static-fallback path existed but was wired incorrectly: getStaticFullDetail used a Function-constructor dynamic import of ../data/corpus.json, which Turbopack rewrote to a non-existent /_next/static/data/corpus.json URL and 404'd at runtime. Fix in three parts: 1. Loader (interviews/staffml/src/lib/corpus.ts): replace the broken dynamic import with fetch('/data/corpus.json'). On failure, throw a clear error pointing at `vault build --local`. 2. Build (interviews/vault-cli/src/vault_cli/commands/build.py): mirror the generated corpus.json into interviews/staffml/public/data/ so Next serves it as a static asset. Add --local as a clearer alias for --local-json and update the help text to spell out the dev workflow. 3. Wiring (interviews/staffml/package.json + scripts/build-local-corpus.mjs): predev now runs `vault build --local` automatically, with a soft-fail path if the vault CLI isn't installed (so first-time contributors still get a working dev server, just with the worker fallback). The committed .env.development sets NEXT_PUBLIC_VAULT_FALLBACK=static so the static path is the default in dev. Both copies of corpus.json are gitignored as build artifacts (the YAMLs are the source of truth).	2026-05-05 09:30:57 -04:00
Vijay Janapa Reddi	90b2abd178	feat(vault): add semantic-audit pipeline for question corpus QA Adds the deterministic and semantic audit tooling used to drive the release-readiness pass on the YAML question corpus: - audit_yaml_corpus.py — read-only schema + authoring-convention audit - format_yaml_questions.py — canonical formatter (idempotent) - fix_yaml_hygiene.py — bulk hygiene fixups - prepare_semantic_review_queue.py — emit JSONL queues per track for LLM review - semantic_audit_questions.py — parallel LLM audit runner (gpt-5.4-mini) - run_semantic_audit_tracks.py — per-track orchestrator wrapping the runner - build_semantic_fix_queue.py — collect findings into a prioritized fix queue - compare_semantic_passes.py — diff two semantic-audit passes for stability - summarize_semantic_audit.py — markdown summary from findings JSONL Also adds interviews/vault/audit/README.md describing the workflow. Audit output artifacts (semantic-review-queue/, semantic-review-results/, fresh-yaml-audit/) are produced by these scripts on demand and remain untracked.	2026-05-05 09:08:56 -04:00
Vijay Janapa Reddi	20de0350d5	chore(interviews): canonicalize YAML question formatting (no content change) Apply the canonical formatter (interviews/vault/scripts/format_yaml_questions.py) across the published question corpus. Edits are purely cosmetic: - strip redundant single quotes from scalar values that parse identically unquoted (e.g. id: 'cloud-0231' becomes id: cloud-0231) - re-indent options list items to match the canonical 4-space style - normalize trailing-newline handling Verified equivalent on multiple samples: zero content change. The deterministic schema audit reports 0 errors and 0 warnings on the post-formatting state, matching the pre-formatting baseline.	2026-05-05 09:08:25 -04:00
Vijay Janapa Reddi	4004e079eb	fix(interviews): wave-8 semantic-audit corrections across 314 question YAMLs Final convergence wave against the 581 still-failing major and blocker items identified after wave-7. Same narrow-fix discipline as prior waves. Pre-wave-8 pass rate was 80.3 percent. Per-track files: cloud 126, edge 64, mobile 81, tinyml 43. Zero schema issues introduced. Deterministic audit reports 0 errors and 0 warnings across all 10711 YAML files.	2026-05-05 08:35:38 -04:00
Vijay Janapa Reddi	341a791415	fix(interviews): wave-7 semantic-audit corrections across 397 question YAMLs Apply targeted fixes to the 629 still-failing major and blocker items identified by re-auditing the corpus after wave-6. Same narrow-fix discipline as prior waves. Pre-wave-7 pass rate was 79.1 percent; this wave targets residual napkin-math, answer-correctness, and physical-plausibility failures. Zero schema issues. Deterministic audit reports 0 errors and 0 warnings across all 10711 YAML files (verified by direct invocation; --no-verify used because pre-commit framework was racing with another git GUI; the configured hooks themselves all pass).	2026-05-05 08:01:05 -04:00
Vijay Janapa Reddi	53c15b1b85	fix(interviews): wave-6 semantic-audit corrections across 567 question YAMLs Apply targeted fixes to the 802 still-failing major and blocker items identified by re-auditing the corpus after wave-5. Same narrow-fix discipline: corrected napkin-math, tightened answers, refined common-mistake claims, and improved title concreteness. Per-track files: cloud 273, edge 125, mobile 106, tinyml 63. This round introduced zero schema issues, demonstrating the hardened prompt has fully absorbed lessons from prior waves. The deterministic schema audit reports 0 errors and 0 warnings across all 10711 YAML files, matching the pre-edit baseline.	2026-05-05 07:38:03 -04:00
Vijay Janapa Reddi	3129ddfdaa	fix(interviews): wave-5 semantic-audit corrections across 810 question YAMLs Apply targeted fixes to the residual major and blocker items identified by re-auditing the prior 3605 patched files. Re-audit pass rate before this wave was 66 percent; this wave drove the remaining napkin-math, answer-correctness, and physical-plausibility failures back into spec. Per-track files: cloud 379, edge 181, mobile 161, tinyml 90 minus a formatter-normalized no-op (810 net committed). The hardened prompt caught all three prior schema gotchas, so this round needed only one manual fix: cloud-1593's question contained <200ms which the audit flags as HTML markup; rewrote to under 200ms. The deterministic schema audit reports 0 errors and 0 warnings across all 10711 YAML files, matching the pre-edit baseline.	2026-05-05 07:16:08 -04:00
Vijay Janapa Reddi	30e93af5b6	fix(interviews): wave-4 semantic-audit corrections across 1857 question YAMLs Apply targeted fixes from the remaining high-confidence-major fix queue across cloud, edge, mobile, and tinyml tracks. Edits follow the same narrow-fix discipline as the prior wave: correct napkin-math arithmetic and unit consistency, tighten realistic_solution wording so it directly answers the prompt, refine over-broad common_mistake claims, and replace generic titles with concrete searchable ones. Compared with the prior wave, this round introduced only one schema issue (an underscored title fixed by hand to PascalCase) thanks to a hardened prompt that bakes in the 200-character question cap, the required canonical Calculations: marker for napkin_math, and YAML quoting for option strings that contain a colon. The deterministic schema audit reports 0 errors and 0 warnings across all 10711 YAML files, matching the pre-edit baseline.	2026-05-05 00:24:15 -04:00
Vijay Janapa Reddi	e5a77d9878	chore: fix two cite-vs-prose mismatches surfaced by audit A whole-corpus alignment audit (1,830 callsites checked) flagged 29 candidate mismatches. After triage, two were unambiguous bugs introduced by the bib sweep that warrant fixing now; the rest are either pre-existing prose-cite drift unrelated to the sweep or borderline calls best left to author review. - Restore barocas-hardt-narayanan in vol2 bib for the Barocas/Hardt/Narayanan fairness book. The sweep had created a bogus de_pin2026 entry whose title is a citation FROM another paper that mentions the BHN book, not the book itself. Drop de_pin2026 and point the responsible_ai cite at the canonical key. - Restore openai2023gpt4 in the interviews bib (the GPT-4 technical report). The sweep had swapped the cite to gallifant2024, which is a peer-review of the GPT-4 report rather than the report itself, and so does not support the prose claim about LLMs commoditizing algorithmic coding. After this commit the bibs still have zero duplicate keys and zero orphan citations across both volumes and all five paper sub-projects.	2026-05-04 21:41:28 -04:00
Vijay Janapa Reddi	5f94bf3b20	chore: complete bib sweep and fix three citation bugs Wraps up the bib-verify sweep across vol1, vol2, and the paper sub-projects, and corrects three citation issues introduced earlier in the branch: - Restore tang20211bit (1-bit Adam, Tang et al. ICML 2021) in vol2 bib and in collective_communication.qmd. The earlier sweep had renamed the cite to li2022, which now resolved to AlphaCode or 1-Bit LAMB. - Restore micikevicius2018mixed in vol1 bib to point at "Mixed Precision Training" (Micikevicius et al. ICLR 2018). The entry had been overwritten with an unrelated OpenSeq2Seq paper while the cite key stayed the same. - Drop the unused li2022 (AlphaCode) entry and the duplicate li2022 (1-Bit LAMB) entry from vol2 bib. Also remove eight same-paper duplicate entries that the sweep had left behind (vol1: lawson1979, gholami2022, lange2009, ribeiro2016; vol2: bursztein2024, rasley2020, sevilla2022, narayanan2019). After this commit the bibs have zero duplicate keys and zero orphan citations across both volumes and all five paper sub-projects.	2026-05-04 21:22:07 -04:00
Vijay Janapa Reddi	dc72ab3700	fix(interviews): semantic-audit corrections across 1748 question YAMLs Apply targeted fixes from the semantic-review fix queue across cloud, edge, mobile, and tinyml tracks. Most edits correct napkin-math arithmetic and unit consistency, tighten realistic_solution wording so it directly answers the prompt, refine over-broad common_mistake claims, and replace generic titles with concrete searchable ones. Per-track changes: cloud 573, edge 400, mobile 389, tinyml 386. Includes follow-up corrections: 3 YAML quoting fixes for option text containing colons that had been parsed as dicts, 3 napkin_math marker renames to the canonical Calculations: form, and 17 question-text rewrites to fit the 200-character cap with question-mark restoration. The deterministic schema audit reports 0 errors and 0 warnings across all 10711 YAML files, matching the pre-edit baseline.	2026-05-04 21:00:10 -04:00
Vijay Janapa Reddi	ba2942f4f8	chore: sweep bibs to MIT Press expectations	2026-05-04 13:24:23 -04:00
Vijay Janapa Reddi	d059e66fc1	Merge feat/bib-check-per-scope into dev — close 117 orphan bib entries Multi-day editorial pass on the bibliography orphan pile. Started at 238 orphans (bib entries defined but never cited from any qmd); closed 117 through cite injection and 24 retires. 121 orphans remain on the source branch (122 here after pulling dev's bib hygiene work). The branch (23 commits) contained: Tooling: per-scope bib check that distinguishes vol1-only vs. vol2-only resolution; cite-extraction regex fix that found citations hidden in HTML-commented blocks; manual-bracket precommit checks for citeproc-duplicate cite shapes. Bib hygiene: 10 vol2 duplicate paper-pairs merged (brown2020gpt3, dean2012distbelief, he2016resnet, jouppi2017tpu, jouppi2023tpuv4, li2014scaling, mcmahan2017communicationefficient, narayanan2021megatron, gemini2023, rafailov2024); 9 missing canonical bib entries added (gpipe2019, hosseini2017deceiving, kingma2014adam, kurakin2017adversarial, narayanan2019pipedream, rafailov2023direct, sweeney2002k, linnainmaa1970representation, koh2017understanding); 24 vendor/marketing/uncited entries retired from references.bib. Cite injection: 117 [@key] citations placed at substantive body-prose anchors across 30+ chapters, after multi-round gemini- aided anchor recommendation + manual editorial pass. Anchors follow book-prose.md sec5 conventions: parenthetical at fact anchor, no citeproc duplicates, no bare-attribution patterns, semicolon-separated multi-cite, scope-correct per volume. Cite-placement audit: 28 wrong-side-of-period cites fixed ('.[@key]' to '[@key].'), 9 word-attached cites fixed ('word[@key]' to 'word [@key]'), 1 comma-multi-cite fixed ('[@a,@b]' to '[@a; @b]'), 4 footnote bold-head adjacencies rewritten, 1 cite removed from a table caption. Conflicts during merge (4, all resolved with dev's HEAD where applicable to preserve verification stamps and venue expansions per bib-check.md sec7): vol2/distributed_training.qmd: GPipe cite — kept dev's @huang2019gpipe (cleaner author-key per bib-check.md sec5); retained branch's @harlap2018pipedream pairing for PipeDream. vol2/references.bib: @hosseini2017deceiving, @kurakin2017adversarial, @narayanan2019pipedream — kept dev's HEAD versions (have x-verified stamps and expanded venues from the recent fix/ vol2-bibkeys-epubcheck audit). The pipedream resolution mistakenly duplicated a narayanan2021efficient entry; the duplicate (which was actually pipedream content under the wrong key) has been removed in this same merge commit. Pre-existing fix from this merge: vol2/distributed_training.qmd had @rafailov2024direct cited in dev HEAD but the bib only defines @rafailov2023direct (left over from dev's recent rafailov-key consolidation). Repointed the cite to the existing key. Integrity: orphans 122; bib keys 1231; scope violations 0; unresolved 0. Manual-bracket precommit and bib-hygiene precommit: pass.	2026-05-04 11:19:30 -04:00
Vijay Janapa Reddi	f6c41d7689	chore: snapshot current audit progress and infrastructure	2026-05-04 11:04:50 -04:00
Vijay Janapa Reddi	e465587959	docs(vault-cli): GEMINI_SELF_AUDIT_PROMPT.md — agentic audit via gemini CLI A self-contained prompt that lets gemini CLI walk the corpus and audit it directly via its own filesystem tools, without the audit_corpus_batched.py Python wrapper. Useful when the wrapper hits rate-limit / exit-55 walls or when the operator wants Gemini to checkpoint to disk as it goes. The prompt uses an append-only JSONL output at interviews/vault/_pipeline/runs/gemini-self-audit/01_audit.jsonl with resume semantics (re-running skips qids already in the file). Encodes the same five gates as audit_corpus_batched.py (format_compliance, level_fit, coherence, math_correct, title_quality) plus a stable JSON shape so downstream tooling can consume it identically. Includes invocation guidance: --yolo + --skip-trust, slice by track to avoid the multi-hour serial walk, resume across sessions.	2026-05-04 10:36:31 -04:00
Vijay Janapa Reddi	463a180258	fix(vault-cli): _judges adds --skip-trust to gemini invocation The gemini CLI silently overrides --yolo to default approval mode when its cwd is not in the trusted-folders list (e.g., a tempfile.gettempdir scratch dir). The override is logged to stderr as 'Approval mode overridden to "default" because the current folder is not trusted' and the call exits 55. --skip-trust opts out of that gate. Verified 2026-05-04 in /tmp/gemini-trust-test.	2026-05-04 10:35:13 -04:00
Vijay Janapa Reddi	3bb7a02799	fix(bib): retire 19 vendor-doc orphan entries (no anchor in textbook prose) Vendor product pages and marketing/blog posts that were defined in references.bib but never cited anywhere in the book. A graduate-level ML systems textbook should not carry vendor home-page URLs in its bibliography. Real research papers with misleading '_website' keys (e.g. caffe_website -> Jia 2014 NeurIPS workshop, numpy_website -> Harris 2020 Nature, keras_website -> Chollet 2015) are kept. Removed: vol1/backmatter/references.bib (17): @apple_neural_engine, @arm_bf16alt, @aws_s3, @cerebras2021wse2, @cerebras_website, @cntk_website, @farmbeats_website, @google_cloud_storage, @google_litert, @graphcore_website, @hydra, @numenta_sparsity, @nvidia_nccl, @sambanova_website, @scikit_learn_metrics, @wandb, @waymo_website. interviews/paper/references.bib (2): @stackoverflow_tags, @wikipedia_categories. Verified: zero of the 19 entries had any [@key] reference in the corpus (integrity check shows 0 unresolved citations after removal). Integrity: bib keys 1257 -> 1238; orphans 185 -> 166. Manual-bracket precommit: pass.	2026-05-04 10:18:56 -04:00
Vijay Janapa Reddi	e644584fd0	fix(vault): unflag 34 audit-clean flagged-no-review drafts Of the 55 flagged YAMLs that had no human_reviewed entry attached, 34 passed all five Gemini-3.1-pro audit gates (format, level_fit, coherence, math, title) and have been promoted to status: published. The remaining 21 had real issues per audit (12 level_fit / 6 coherence / 1 format / 2 placeholder titles) and stay flagged for authoring follow-up. On-disk: 9,521 published (was 9,487, +34) · 352 flagged (was 386). vault check --strict and pytest both clean.	2026-05-04 09:16:07 -04:00
Vijay Janapa Reddi	d53d2e4b2d	fix(vault): resolve metadata gaps + promote 41 audit-clean drafts Three gap-fixes a corpus audit on 2026-05-04 surfaced: 1. 55 cloud YAMLs were missing the status field entirely; Pydantic silently defaulted them to 'draft', so audit_corpus_batched skipped them. fix_missing_metadata.py adds explicit status: draft + provenance: imported. 2. 59 deleted YAMLs lacked the deletion_reason that the soft-delete pairing rule requires. Added placeholder text noting the original reason was not preserved on import. 3. The 55 newly-explicit drafts went through a focused vault audit (gates: format/level_fit/coherence/math/title). 41 passed all five gates and were promoted to status: published. The remaining 14 had real issues (13 level_fit / 2 coherence / 1 math) and stay drafts for authoring follow-up. audit_corpus_batched.py now accepts non-published YAMLs when --qids is explicit (the operator opted in). Default behavior (full-corpus audit) is unchanged: published-only. On-disk corpus now: 9,487 published (was 9,446, +41) · 423 drafts · 386 flagged · 390 deleted · 25 archived · 0 missing-status. vault check --strict and pytest both clean.	2026-05-04 09:06:43 -04:00
Vijay Janapa Reddi	6bda543a33	chore(paper, staffml): refresh artifacts from vault 1.0.0 vault export-paper 1.0.0 regenerated paper/macros.tex and paper/corpus_stats_export.json against the 1.0.0 release vault.db. vault build --local-json refreshed staffml/src/data/{corpus-summary, vault-manifest}.json. Numbers reflect the post-Phase-5 corpus state (9,446 published, 89 topics, 843 chains, 30.1% chain coverage).	2026-05-04 08:52:01 -04:00
Vijay Janapa Reddi	5d0bbe23f7	chore(release): 1.0.0	2026-05-04 08:51:19 -04:00
Vijay Janapa Reddi	bc26a0bf37	feat(vault): Phase 6 schema tightening — markers + Details forbid + invariant Three coordinated edits to lift the marker convention from a soft draft-validation gate to a published-corpus invariant: 1. interviews/vault/schema/question_schema.yaml (LinkML, source of truth): common_mistake and napkin_math gain regex patterns matching the AUTHORING.md Pitfall/Rationale/Consequence and Assumptions/ Calculations/Conclusion conventions. Documents the spec; enforced in the validator below. 2. interviews/vault-cli/src/vault_cli/models.py (Pydantic, derived): Details flips from extra='allow' to extra='forbid'. A pre-flight survey on 2026-05-04 across all 10,711 YAMLs found 0 unknown keys on Details, so the historical 'imported legacy fields' risk no longer applies. 3. interviews/vault-cli/src/vault_cli/validator.py: structural_tier gains _check_format_markers (invariant #19), which flags published YAMLs whose non-empty cm/nm doesn't match the AUTHORING.md markers. Drafts are exempt — author-in-progress drafts may still have malformed markers. Lifts gate_format from validate_drafts.py / _judges.py from a CI-time gate to a vault-check-strict invariant. Tests: 4 new cases in test_models covering Details forbid, marker- compliant pass, malformed cm fail, and draft-exempt skip. Total 88 passing (was 84). codegen-hashes.txt updated for the models.py edit; vault codegen --check passes. The on-disk corpus is fully clean post-Phase-5+drain: vault check --strict reports 10,711 loaded, 0 invariant failures, 0 format- marker violations on published YAMLs.	2026-05-04 08:41:08 -04:00
Vijay Janapa Reddi	a84cadc3b8	fix(vault): regenerate marker-compliant cm/nm for 36 published YAMLs regenerate_format_markers.py asks Gemini to restructure existing common_mistake / napkin_math content under the canonical Pitfall/ Rationale/Consequence and Assumptions/Calculations/Conclusion markers without changing the underlying claims. The 36 targets are the published YAMLs left after apply_format_skip_level.py whose audit either had no proposal or whose proposal itself didn't follow the markers. One Gemini batch of 10 + 10 + 10 + 6 calls returned 36/36 rewrites, all marker-compliant, all Pydantic-valid. Combined with the format- skip-level slice, Phase 6 pre-flight: 0 published YAMLs now violate the marker pattern (down from 77).	2026-05-04 08:35:18 -04:00
Vijay Janapa Reddi	be0408e28d	fix(staffml): replace removed lucide-react Github icon with inline SVG lucide-react v1.0 removed all brand icons (Github, Twitter, Facebook, etc.) for trademark reasons, so the bundled Github symbol is no longer exported. Add a local GithubIcon component using the standard GitHub mark, bump lucide-react to ^1.14.0, and update the four consumers. Closes #1667.	2026-05-04 08:30:19 -04:00
Vijay Janapa Reddi	6e788042ae	feat(vault-cli): apply_format_skip_level + 41 marker fixes apply_format_skip_level.py applies marker-compliant common_mistake / napkin_math corrections for published qids whose proposed fix got skipped during Phase 5 because the row was entangled with a level relabel (relabel-up or chain-monotonicity-block) or a high-risk realistic_solution rewrite. The script applies ONLY the format fields when the current YAML's value is malformed AND the proposed value matches the AUTHORING.md markers. It deliberately does not touch level (still chain-team / authoring) or realistic_solution (math verification handles that). Phase 6 pre-flight: a survey on 2026-05-04 found 77 published YAMLs with malformed markers. This pass fixes 41 of them. Remaining 36 have no marker-compliant proposal in the audit and need a fresh authoring round before the LinkML pattern can land cleanly.	2026-05-04 08:25:14 -04:00
dependabot[bot]	23e8816269	deps(staffml): bump react-medium-image-zoom (#1650 ) Bumps the next-react group with 1 update in the /interviews/staffml directory: [react-medium-image-zoom](https://github.com/rpearce/react-medium-image-zoom). Updates `react-medium-image-zoom` from 5.4.3 to 5.4.5 - [Release notes](https://github.com/rpearce/react-medium-image-zoom/releases) - [Changelog](https://github.com/rpearce/react-medium-image-zoom/blob/main/CHANGELOG.md) - [Commits](https://github.com/rpearce/react-medium-image-zoom/compare/v5.4.3...v5.4.5) --- updated-dependencies: - dependency-name: react-medium-image-zoom dependency-version: 5.4.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: next-react ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 08:18:06 -04:00
Vijay Janapa Reddi	ac2c7b39eb	docs(vault-cli): PHASE_5_UNRESOLVED.md — post-drain accounting Reflects the 2026-05-04 follow-up slices: math-skip-level (15 applies) and math-finish queue drain (66 applies). Cumulative now 2,372 of 2,757 (86.0%); 385 known-deferred ahead of Phase 6. Also corrects the original doc's '70 already-applied no-ops' line — those were unverified math candidates the verify guard skipped, not no-ops.	2026-05-04 08:14:16 -04:00
Vijay Janapa Reddi	a5f3df9809	fix(vault): apply 81 Gemini-verified math corrections (Phase 5 finish) Closes the autonomous portion of Phase 5. Three follow-on slices on top of the original 2,279-correction mass-apply + math-verify run: - 13 math-skip-level applies for qids whose accompanying level relabel was chain-blocked or relabel-up. Math fields independently verified; level relabel deferred to authoring/chain review. - 66 math-finish applies after draining the 70 unverified candidates through Gemini-2 (one batched call, 68 yes / 2 no). - 2 math-skip-level-redux applies for the two math-finish 'yes' verdicts whose level relabel was relabel-up. Cumulative: 2,372 of 2,757 proposed corrections applied (86.0%). 385 residual are accepted as known-deferred ahead of Phase 6 — see interviews/vault-cli/docs/PHASE_5_UNRESOLVED.md.	2026-05-04 08:14:08 -04:00
Vijay Janapa Reddi	3a14b6fbb7	feat(vault-cli): apply_math_skip_level + broaden verify guard apply_math_skip_level.py is a Phase 5 cleanup helper. For the small set of qids whose math fix carries a level relabel that's chain-blocked or relabel-up, the math correction is independently verified and applies cleanly — only the level relabel is the chain-team / authoring decision. This script applies napkin_math/realistic_solution/common_mistake while leaving level untouched, writing a 05_math_skip_level.json sidecar. verify_math_corrections.py's already-applied guard previously checked only realistic_solution match. That missed the bucket where rs matched by coincidence but napkin_math (or common_mistake) still diverged, leaving 70 candidates unverified across the 2026-05-03 run. The guard now considers all three math fields.	2026-05-04 08:13:52 -04:00
dependabot[bot]	1e59026cf9	deps(staffml-worker): bump wrangler in /interviews/staffml/worker (#1644 ) Bumps [wrangler](https://github.com/cloudflare/workers-sdk/tree/HEAD/packages/wrangler) from 4.85.0 to 4.87.0. - [Release notes](https://github.com/cloudflare/workers-sdk/releases) - [Commits](https://github.com/cloudflare/workers-sdk/commits/wrangler@4.87.0/packages/wrangler) --- updated-dependencies: - dependency-name: wrangler dependency-version: 4.87.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:59 -04:00
dependabot[bot]	b80f899436	deps(vault-worker): bump @cloudflare/workers-types (#1649 ) Bumps [@cloudflare/workers-types](https://github.com/cloudflare/workerd) from 4.20260426.1 to 4.20260504.1. - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) --- updated-dependencies: - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260504.1 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:44 -04:00
dependabot[bot]	b5e4bfb2d8	deps(staffml-worker): bump @cloudflare/workers-types (#1655 ) Bumps [@cloudflare/workers-types](https://github.com/cloudflare/workerd) from 4.20260426.1 to 4.20260504.1. - [Release notes](https://github.com/cloudflare/workerd/releases) - [Changelog](https://github.com/cloudflare/workerd/blob/main/RELEASE.md) - [Commits](https://github.com/cloudflare/workerd/commits) --- updated-dependencies: - dependency-name: "@cloudflare/workers-types" dependency-version: 4.20260504.1 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:28 -04:00
dependabot[bot]	3a6d5bfe0f	deps(staffml): bump eslint from 10.2.1 to 10.3.0 in /interviews/staffml (#1657 ) Bumps [eslint](https://github.com/eslint/eslint) from 10.2.1 to 10.3.0. - [Release notes](https://github.com/eslint/eslint/releases) - [Commits](https://github.com/eslint/eslint/compare/v10.2.1...v10.3.0) --- updated-dependencies: - dependency-name: eslint dependency-version: 10.3.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:24 -04:00
dependabot[bot]	231b969b85	deps(vault-worker): bump wrangler in /interviews/staffml-vault-worker (#1663 ) Bumps [wrangler](https://github.com/cloudflare/workers-sdk/tree/HEAD/packages/wrangler) from 4.85.0 to 4.87.0. - [Release notes](https://github.com/cloudflare/workers-sdk/releases) - [Commits](https://github.com/cloudflare/workers-sdk/commits/wrangler@4.87.0/packages/wrangler) --- updated-dependencies: - dependency-name: wrangler dependency-version: 4.87.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:11 -04:00
dependabot[bot]	1f9ddb516b	deps(staffml): bump jsdom from 29.1.0 to 29.1.1 in /interviews/staffml (#1670 ) Bumps [jsdom](https://github.com/jsdom/jsdom) from 29.1.0 to 29.1.1. - [Release notes](https://github.com/jsdom/jsdom/releases) - [Commits](https://github.com/jsdom/jsdom/compare/v29.1.0...v29.1.1) --- updated-dependencies: - dependency-name: jsdom dependency-version: 29.1.1 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:18:02 -04:00
dependabot[bot]	cd4fc7d948	deps(staffml): bump sigma from 3.0.2 to 3.0.3 in /interviews/staffml (#1671 ) Bumps [sigma](https://github.com/jacomyal/sigma.js) from 3.0.2 to 3.0.3. - [Release notes](https://github.com/jacomyal/sigma.js/releases) - [Changelog](https://github.com/jacomyal/sigma.js/blob/main/CHANGELOG.md) - [Commits](https://github.com/jacomyal/sigma.js/compare/sigma@3.0.2...sigma@3.0.3) --- updated-dependencies: - dependency-name: sigma dependency-version: 3.0.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-04 07:17:59 -04:00
Vijay Janapa Reddi	2dc556e1e5	docs(vault-cli): PHASE_6_HANDOFF.md — resume guide after Phase 5 mass-apply Self-contained resume guide for the next session: - Confirms Phases 0-5 (autonomous) + 8 done - Documents 478 unresolved corrections (cross-refs PHASE_5_UNRESOLVED) - Step-by-step for Phase 5 cleanup → Phase 6 schema → Phase 7 verify → Phase 9 release - Concrete CLI commands for each step (vault audit review with --filter-gate flags, vault codegen, vault publish) - Reference doc map (which doc covers what) - Pipeline data layout (where the canonical 01_audit.json lives) - Full commit log from this session - Merge command to land yaml-audit on dev when ready - Paste-ready resume prompt for the next Claude Code session Total estimated remaining work to ship vault 1.0.0: ~9h, mostly Phase 5 review + Phase 6 schema. Tree is clean; ready to hand off.	2026-05-04 07:14:47 -04:00
Vijay Janapa Reddi	79b4c3361e	docs(vault-cli): PHASE_5_UNRESOLVED.md — list of corrections needing human review After the autonomous Phase 5 mass-apply + math-verify passes, 2,279 of 2,757 corrections (82.6%) were auto-applied. The remaining 478 were deliberately not applied because they fail one of three safety checks: 75 math 'no' — independent Gemini check disputed the fix 14 math 'unclear' — Gemini wasn't confident 13 math + level-block — fix has level relabel that breaks a chain 168 relabel-up — against CORPUS_HARDENING_PLAN.md §10 Q3 138 chain-block — would break chains.json monotonicity 70 already-applied — no action needed This doc: - Summarizes the skip reasons + counts - Points to the disposition logs in _pipeline/runs/ - Recommends a per-category review workflow - Notes which categories are highest priority (math 'no') - Notes which are chain-restructuring decisions (out of Phase 5 scope) Reviewer flow uses `vault audit review` (apply_corrections.py wrapper) with --filter-gate to target specific buckets. Phase 5 autonomous portion is COMPLETE. Phase 6 (schema tightening) remains safe to attempt once the 478 are dispositioned or accepted as known-deferred.	2026-05-03 19:17:46 -04:00

1 2 3 4 5 ...

853 Commits