14103 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
ccf01ccd08 chore: purge stale gemini-self-audit scratch from repo root
The same family of files as the prior commit's interviews/ cleanup,
but at the repo root, from the same f6c41d7689 snapshot:

- .files_to_audit.txt — gemini-self-audit input list
- audit_results.jsonl — gemini-self-audit output
- run_audit.sh — gemini-self-audit shell wrapper

Zero code references; the pipeline they belonged to was already removed
in f12d30376. Repo root is now clean of AI-workflow scratch.
2026-05-05 10:59:44 -04:00
Vijay Janapa Reddi
17396bcca2 Merge remote-tracking branch 'origin/dev' into yaml-audit 2026-05-05 10:52:17 -04:00
Vijay Janapa Reddi
f12d303769 chore(interviews): purge stale AI prompts and dev scratch from interviews/
Remove ten files from the public repo that should never have been
tracked. Verified no code references any of them before deleting.

AI-prompt files (private to author tooling, do not belong in the public
repo):

  - interviews/vault-cli/docs/GEMINI_SELF_AUDIT_PROMPT.md
  - interviews/vault/_pipeline/runs/gemini-self-audit/prompts/{cloud,
    edge,global,mobile,tinyml}_audit_prompt.md (5 per-track prompts;
    interviews/vault/.gitignore already excludes /_pipeline/, but these
    five were force-added in f6c41d7689 before the rule was set)

Dev-scratch artifacts (clearly leftover dev iteration; filenames literally
say 'final' four different ways):

  - interviews/vault-cli/check_results_absolute_final.json
  - interviews/vault-cli/check_results_after_repair.json
  - interviews/vault-cli/check_results_final.json
  - interviews/vault-cli/check_results_total_final.json

No production code, tests, docs, or CI references any of these paths.
The audit-pipeline scripts that *would* write into _pipeline/ already
respect the existing gitignore rule for that directory tree.
2026-05-05 10:51:53 -04:00
Vijay Janapa Reddi
5a8b841449 Merge pull request #1681 from harvard-edge/fix/numeric-corrections
fix(book): numerical and clarity corrections across 11 chapter files
2026-05-05 10:49:59 -04:00
Vijay Janapa Reddi
ff9232a2d2 fix(book): numerical and clarity corrections across 11 chapter files
Author-initiated factual corrections accumulated as uncommitted edits
in the working tree. Each change tightens a number, equation, or piece
of prose; no structural rewrites.

Notable corrections:
- training: gpt2_total_flops 1e20 -> 1e19
- hw_acceleration: AmdahlH100 hw_speedup_factor 500 -> 247 (H100 INT8
  matmul vs ~8 TOPS baseline CPU, more accurate)
- data_storage: optimizer state 700 GB -> 1,400 GB (Adam needs 2 FP32
  moments; 175B * 2 * 4 bytes = 1,400 GB)
- edge_intelligence: A11->A17 Pro Neural Engine improvement 58x -> 27x
- fault_tolerance: lambda 20 FITs -> 20,000 FITs (1000x correction);
  hourly_survival = availability_pct/100 (was 1 - 1/hours_per_year);
  drop unused hours_str export (LEGO dead-code cleanup)
- inference: GPT-3 batch-size table totals recalculated
  (b8: 123.2->96.3, b16 row added: 74.4, b32: 91.5->74.9); related
  prose updated to reflect that batch 16 is slightly lower on total
  latency in the toy model; check() updated to match new L target
- ops_scale: cost_savings_pct 40% -> 57%; capacity_threshold 4000->2500
- performance_engineering: B200 dense TFLOPS 4500/5000 -> 2250 (sparse
  vs dense throughput honesty); ridge points and figures recalculated
  consistently (139->281 FLOP/byte; growth 16x->2x in seven years)
- glossary: GEMM and ReLU definitions wrap math in LaTeX delimiters
- benchmarking: drop unused V100_FLOPS_FP32 import
- conclusion (vol2): minor prose rewording for clarity

Verified: pre-commit hooks all pass (incl. LEGO dead-code, MLSys units,
external image checks). No conflicts with PRs #1677/1679/1680.
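The optimizer-state correction can be sanity-checked with the same napkin arithmetic (values are the ones stated in the bullet above; GB here means 10^9 bytes):

```python
# Napkin check for the Adam optimizer-state correction.
# Assumes the stated values: 175B parameters, two FP32 moments per param.
params = 175e9             # GPT-3-scale parameter count
moments_per_param = 2      # Adam tracks first and second moments
bytes_per_moment = 4       # FP32
optimizer_state_gb = params * moments_per_param * bytes_per_moment / 1e9
print(optimizer_state_gb)  # -> 1400.0
```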
2026-05-05 10:46:49 -04:00
Vijay Janapa Reddi
5e5c03e757 chore(paper): regenerate figures + corpus_stats.json against current vault
`make paper` regenerates these files from the live corpus on each build,
so committing them here just lets a fresh checkout produce a paper.pdf
without first running the full data-pipeline. Drift caught:

- corpus_stats.json was a 9,757-question snapshot from an interim state; refreshed
  to the current 9,521 published + 843 chains + 87 topics
- 11 figure PDFs (heatmaps, distributions, pipeline schematics, etc.)
  re-rendered from corpus_stats.json

paper.pdf builds clean (35 pages, 779 KB, 0 errors). Verified that the
new macros render: 9,521 questions and 87 topics in the abstract, 92.4%
validated in §Schema Validation, and the refreshed mobile-track prose
with the A17 Pro / Snapdragon 8 Gen 3 NPU figures in §Mobile.
2026-05-05 10:43:45 -04:00
Vijay Janapa Reddi
c013e6e87d fix(paper): refresh mobile-track description with current-generation hardware
The mobile-track illustrative numbers were anchored to roughly 2022 figures:
'15 TOPS at 5 W' for the NPU and a 4,500 mAh battery. Update to the
current-generation envelope (Apple A17 Pro Neural Engine and Qualcomm
Snapdragon 8 Gen 3 Hexagon both reach 30-40 TOPS at 4-5 W; flagship
batteries cluster at $\\sim$5,000 mAh) so the prose stays defensible
through the 1.0.x release window.

Also tighten the battery-life claim. The original 'drain the battery
in under 2 hours' figure assumed total system draw, not the bare 5 W
NPU number. Make that explicit by saying the NPU plus CPU, camera
pipeline, and memory subsystem draws closer to 10 W of system power,
which is what produces the sub-2-hour estimate.

Pure prose change in track description; no macro or schema impact.
2026-05-05 10:28:00 -04:00
Vijay Janapa Reddi
8e052b0a2b fix(paper): refresh macros to current corpus + compute validated% from data
The paper's auto-generated macros.tex was last regenerated when the v1.0.0
snapshot held 9,446 published questions; the post-tag audit work has since
brought the published count to 9,521 (cloud +49, edge +14, mobile +2,
tinyml +6, global +4) and consolidated topics from 89 to 87. Re-run
`vault export-paper 1.0.0` so paper and site agree by construction.

While here, fix a bug in the export-paper command itself: \numvalidated
was hardcoded to 100.0\% regardless of the actual flag distribution. The
flag isn't compiled into vault.db, so we read it back from the source
YAMLs and emit the real percentage. Current state is 92.4\% (8,794 of
9,521 published questions carry validated=true). The drift came from
new questions added without the flag set; the conservative fallback if
the YAML scan fails preserves the legacy 100.0\% so the build never
breaks.

The macros change is the meaningful diff. release.json for 1.0.0 is
left untouched to preserve the historical release metadata; vault.db is
gitignored anyway so contributors rebuild it locally via `vault build`
before paper renders.
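A minimal sketch of the read-back logic (the function and field names are illustrative, not the actual export-paper code):

```python
def validated_pct(questions):
    """Percentage of published questions carrying validated=true, with
    the legacy 100.0 fallback if the YAML scan yields nothing, so the
    build never breaks. Illustrative sketch only."""
    if not questions:          # scan failed or returned no questions
        return 100.0
    validated = sum(1 for q in questions if q.get("validated") is True)
    return round(100.0 * validated / len(questions), 1)

# 8,794 of 9,521 published questions carry the flag
sample = [{"validated": True}] * 8794 + [{"validated": False}] * 727
print(validated_pct(sample))  # -> 92.4
```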
2026-05-05 10:24:58 -04:00
Vijay Janapa Reddi
81f22882bb fix(interviews,cloud-1380): codespell — retuned → re-tuned (×2)
The pre-push codespell hook flags 'retuned' as a likely typo for
'returned'. The actual intent is the verb 're-tune' (tune again);
hyphenating it sidesteps the false positive while keeping the
meaning. Same pattern as edge-2167.yaml (fixed in wave-4).
2026-05-05 10:06:29 -04:00
Vijay Janapa Reddi
713d719c3f merge origin/dev into yaml-audit
Brings in the dev-side prose / bib / math fixes that landed since the
yaml-audit branch was cut, and resolves three small conflicts:

* interviews/vault-cli/scripts/archive/split_corpus.py
    origin/dev deleted it (archive cleanup); we honor the deletion.
* interviews/vault-cli/scripts/validate_drafts.py
    origin/dev removed a leftover no-op statement; took theirs.
* interviews/vault-cli/scripts/summarize_proposed_chains.py
    origin/dev renamed loop var lvl→level; took theirs.

The two protected qmds (data_selection.qmd, model_compression.qmd)
are temp-stashed before the merge to honor the 'do not touch' rule;
restored after the merge commit lands.

After this commit, yaml-audit contains every commit on origin/dev as
an ancestor, so dev can fast-forward to yaml-audit's tip when the
maintainer is ready to merge.
2026-05-05 10:03:14 -04:00
Vijay Janapa Reddi
9aa6fefc9c fix(staffml,contribute): unique React keys in topic datalist
The /contribute page's topic datalist mapped allTopics with key={t.id},
but topic ids appear in multiple competency areas (54 topics shared
across 2-11 areas, e.g. 'mlops-lifecycle' spans 11 areas). Each
duplicate triggered the React 'two children with the same key' warning
— 326 of them per page load.

Fix: namespace the key by area, key={`${t.area}::${t.id}`}. The
'value' attribute stays as t.id since that's what the user picks.

Verified by walkthrough script: /contribute now renders with zero
console errors, like the other 18 routes.
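The collision mechanics can be sketched outside React (topic names below are illustrative, except mlops-lifecycle, which is the real shared id cited above):

```python
# Why key={t.id} collided: the same topic id recurs under multiple areas,
# so ids alone are not unique across the flattened allTopics list.
all_topics = [
    {"area": "mlops", "id": "mlops-lifecycle"},
    {"area": "serving", "id": "mlops-lifecycle"},  # same id, different area
    {"area": "serving", "id": "latency-budgets"},
]
plain_keys = [t["id"] for t in all_topics]
namespaced_keys = [f"{t['area']}::{t['id']}" for t in all_topics]
print(len(plain_keys) - len(set(plain_keys)))            # -> 1
print(len(namespaced_keys) - len(set(namespaced_keys)))  # -> 0
```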
2026-05-05 10:00:46 -04:00
Vijay Janapa Reddi
312f00eeca fix(staffml): napkin-math renderer polish
Three small renderer fixes that came out of inspecting how the
audit-corrected YAML content lands on /practice/?q=...:

1. Strip the redundant 'Conclusion & Interpretation:' / 'Result:'
   prefixes from result steps. The green callout already signals
   'this is the conclusion'; leaving the labels in produces noise
   like 'Conclusion & Interpretation: Result: Memory-Bound. ...'.
   Handles bold, unbold, and bold-wrapping-the-whole-phrase forms.

2. Teach the number-and-unit highlighter about scientific notation
   (Ne12, 1.2×10^14) so phrases like '120e12 FLOPs' render as a
   single number+unit chunk instead of '120' (bold) + 'e12' (plain)
   + 'FLOPs' (gray). Also broaden the unit vocabulary to include
   Hz/MHz/GHz, W/mW/μW/mJ/μJ/J, MACs, cycles, frames, samples, and
   common compound rates (FLOPs/byte, FLOP/cycle, etc.).

3. Distinguish a *section header* line ('**Conclusion & Interpretation:**'
   alone on its line) from a *result* line. Previously the parser
   marked the header as isResult=true, which then rendered an empty
   green callout because cleanStepText stripped the header to ''.
   Filter empty steps after cleaning as a belt-and-braces safeguard.

Verified across 10 sample questions covering different tracks
(cloud/edge/mobile/tinyml) and napkin-math shapes (sci notation,
multi-section structured, quantization-with-code, compute-bound,
memory-bound, I/O-bound). No regressions; the result blocks now
read directly with the verdict, not the section label.
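An illustrative pattern for point 2 (not the renderer's actual regex), matching the number and unit as one chunk:

```python
import re

# Hypothetical sketch of the broadened number+unit matcher: a number in
# plain or e-notation, optionally followed by one unit token.
NUM_UNIT = re.compile(
    r"\d+(?:\.\d+)?(?:e\d+)?"
    r"(?:\s?(?:FLOPs?(?:/(?:byte|cycle))?|TOPS|MACs|G?Hz|MHz|m?[WJ]|cycles|frames|samples))?"
)
m = NUM_UNIT.search("the layer needs 120e12 FLOPs per inference")
print(m.group(0))  # -> '120e12 FLOPs'
```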
2026-05-05 09:53:06 -04:00
github-actions[bot]
b5098f6987 Update contributors list [skip ci] 2026-05-05 13:44:19 +00:00
Vijay Janapa Reddi
edcdba08da docs(staffml,vault-cli): document the local-dev corpus pipeline
Add interviews/staffml/README.md covering the local development
workflow that the prior commit's predev hook relies on:

- TL;DR install + run-dev steps
- explanation of the production-worker vs local-static data flow
- what the predev hook does (sync-periodic-table + vault build --local)
- env vars (NEXT_PUBLIC_VAULT_FALLBACK, NEXT_PUBLIC_VAULT_API,
  STAFFML_SKIP_LOCAL_CORPUS) and their effects
- troubleshooting the three failure modes that bit us during the YAML
  audit work (could-not-load, stale content, infinite loading)

Update interviews/vault-cli/README.md to surface `vault build --local`
in the Local-dev section with a pointer to the StaffML README.

The intent: a contributor who edits a YAML and doesn't see the change
in the dev server should now find the answer in the README before
they're forced to read the loader source.
2026-05-05 09:33:43 -04:00
Vijay Janapa Reddi
c7b42e41d8 fix(dev): make npm run dev serve full question content from local YAMLs
Before this change, the StaffML Next.js dev server fetched scenario and
details (including napkin_math) from the production Cloudflare Worker
even when contributors had local YAML edits — so changes weren't visible
without shipping. The opt-in static-fallback path existed but was wired
incorrectly: getStaticFullDetail used a Function-constructor dynamic
import of ../data/corpus.json, which Turbopack rewrote to a non-existent
/_next/static/data/corpus.json URL and 404'd at runtime.

Fix in three parts:

1. Loader (interviews/staffml/src/lib/corpus.ts): replace the broken
   dynamic import with fetch('/data/corpus.json'). On failure, throw a
   clear error pointing at `vault build --local`.

2. Build (interviews/vault-cli/src/vault_cli/commands/build.py): mirror
   the generated corpus.json into interviews/staffml/public/data/ so
   Next serves it as a static asset. Add --local as a clearer alias for
   --local-json and update the help text to spell out the dev workflow.

3. Wiring (interviews/staffml/package.json + scripts/build-local-corpus.mjs):
   predev now runs `vault build --local` automatically, with a soft-fail
   path if the vault CLI isn't installed (so first-time contributors
   still get a working dev server, just with the worker fallback). The
   committed .env.development sets NEXT_PUBLIC_VAULT_FALLBACK=static so
   the static path is the default in dev. Both copies of corpus.json are
   gitignored as build artifacts (the YAMLs are the source of truth).
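The mirror step in part 2 amounts to the following (paths are the ones named above; the function itself is an illustrative sketch, not the actual build.py code):

```python
import pathlib
import shutil

def mirror_corpus(corpus_json: pathlib.Path, staffml_public: pathlib.Path) -> pathlib.Path:
    """Copy the generated corpus.json into Next's public/ tree so the
    dev server serves it as a static asset at /data/corpus.json."""
    dest = staffml_public / "data" / "corpus.json"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(corpus_json, dest)
    return dest
```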
2026-05-05 09:30:57 -04:00
Vijay Janapa Reddi
34f143b192 Merge pull request #1680 from harvard-edge/cleanup/fleet-stack-lowercase
cleanup(vol2): lowercase 'fleet stack' / 'energy wall' in body prose
2026-05-05 09:29:55 -04:00
Vijay Janapa Reddi
354c76be4e fix(vol2): lowercase 'fleet stack' / 'energy wall' in body prose (36 sites)
Per book-prose.md §10.3, concept-term framework names are lowercase in
body prose. Vol1 already follows this for iron law, memory wall, data
wall, compute wall, power wall, scaling laws, etc. Vol2's organizing
framework Fleet Stack and the Energy Wall thermodynamic constraint were
inconsistent: capitalized in body prose. This sweep brings vol2 in line
with vol1.

Preserved (per §10.3 exceptions):
- bold first definitions (\*\*Fleet Stack\*\* / \*\*Energy Wall\*\*)
- \index{Fleet Stack} / \index{Energy Wall} entries
- callout title="Connection: The Fleet Stack" attributes
- H1/H2 section headers
- Part-name proper nouns ("The Fleet", "Part I: The Fleet")
- Code blocks

Files touched (13): collective_communication, conclusion, data_storage,
edge_intelligence, fleet_orchestration, inference, introduction, ops_scale,
parts/fleet_principles, performance_engineering, robust_ai,
security_privacy, sustainable_ai.
2026-05-05 09:28:45 -04:00
Vijay Janapa Reddi
c92da01d60 Merge pull request #1679 from harvard-edge/fix/math-adjacent-spans
fix: math notation and prose drift cleanup (31 sites, 11 files)
2026-05-05 09:13:39 -04:00
Vijay Janapa Reddi
90b2abd178 feat(vault): add semantic-audit pipeline for question corpus QA
Adds the deterministic and semantic audit tooling used to drive the
release-readiness pass on the YAML question corpus:

- audit_yaml_corpus.py        — read-only schema + authoring-convention audit
- format_yaml_questions.py    — canonical formatter (idempotent)
- fix_yaml_hygiene.py         — bulk hygiene fixups
- prepare_semantic_review_queue.py — emit JSONL queues per track for LLM review
- semantic_audit_questions.py — parallel LLM audit runner (gpt-5.4-mini)
- run_semantic_audit_tracks.py — per-track orchestrator wrapping the runner
- build_semantic_fix_queue.py — collect findings into a prioritized fix queue
- compare_semantic_passes.py  — diff two semantic-audit passes for stability
- summarize_semantic_audit.py — markdown summary from findings JSONL

Also adds interviews/vault/audit/README.md describing the workflow.

Audit output artifacts (semantic-review-queue/, semantic-review-results/,
fresh-yaml-audit/) are produced by these scripts on demand and remain
untracked.
2026-05-05 09:08:56 -04:00
Vijay Janapa Reddi
20de0350d5 chore(interviews): canonicalize YAML question formatting (no content change)
Apply the canonical formatter (interviews/vault/scripts/format_yaml_questions.py)
across the published question corpus. Edits are purely cosmetic:

- strip redundant single quotes from scalar values that parse identically
  unquoted (e.g. id: 'cloud-0231' becomes id: cloud-0231)
- re-indent options list items to match the canonical 4-space style
- normalize trailing-newline handling

Verified equivalent on multiple samples: zero content change. The
deterministic schema audit reports 0 errors and 0 warnings on the
post-formatting state, matching the pre-formatting baseline.
2026-05-05 09:08:25 -04:00
Zeljko Hrcek
23ae422b38 Refined the PDF layout for Chapter 11 (hw_acceleration.qmd). (#1677)
Co-authored-by: Vijay Janapa Reddi <vj@eecs.harvard.edu>
2026-05-05 09:08:02 -04:00
Vijay Janapa Reddi
4004e079eb fix(interviews): wave-8 semantic-audit corrections across 314 question YAMLs
Final convergence wave against the 581 still-failing major and blocker
items identified after wave-7. Same narrow-fix discipline as prior waves.

Pre-wave-8 pass rate was 80.3 percent.

Per-track files: cloud 126, edge 64, mobile 81, tinyml 43.

Zero schema issues introduced. Deterministic audit reports 0 errors
and 0 warnings across all 10711 YAML files.
2026-05-05 08:35:38 -04:00
Vijay Janapa Reddi
86fbe1e9a5 fix(security_privacy): add period to bare 'vs' in bullet list (1 site)
Per book-prose.md §1, 'vs.' always takes a period in body prose. The bullet
at line 1342 had bare 'vs' between two cost figures.

Note: introduction (vol2):1593 was initially flagged but is inside a
Python cell header comment — exempt per code-block rule.
2026-05-05 08:29:44 -04:00
Vijay Janapa Reddi
7044d34b87 fix(prose): close up pre-/non- compound prefixes per CMS §7.89 (3 chapters)
Per book-prose.md §10.8, pre- and non- compounds close up unless a
domain convention overrides (multi-, semi-, anti-, before-acronym,
before-number cases stay hyphenated).

Sites:
- vol1/data_selection: non-convex (x2), non-sequential, non-redundant
- vol1/nn_computation: pre-processing/Pre-processing/post-processing/
  Post-processing in fig-cap and fig-alt of fig-usps-inference-pipeline
  (closed up post- alongside pre- for caption-internal consistency)
- vol2/data_storage: pre-processed (x4), pre-allocated, pre-compute,
  pre-decompression, pre-staging (incl. \index{Pre-staging} ->
  \index{Prestaging}), pre-loaded
2026-05-05 08:28:54 -04:00
Vijay Janapa Reddi
d048bdcb1c fix(prose): lowercase concept terms in body prose (6 sites)
Per book-prose.md §10.3, concept terms (data gravity, transformer
architecture) lowercase in body prose. Capitals reserved for sentence
start, bold first definitions, headings, callout titles, \index{}
entries, and proper model names like Vision Transformer.

Sites:
- vol1/data_engineering:351 'Data Gravity establishes' -> 'data gravity'
- vol1/conclusion:769 'Data Gravity Invariant' -> 'data gravity invariant'
- vol2/distributed_training:2049 'every Transformer block' -> 'transformer'
- vol2/network_fabrics:1078 'primarily Transformer training' -> 'transformer'
- vol2/sustainable_ai:1833 'a Transformer model' -> 'a transformer model'
- vol2/performance_engineering:737 'one Transformer layer' -> 'transformer'

Note: 8 'Transformers' instances in nn_architectures.qmd were initially
flagged but are all sentence-start (grammatical caps required), not
violations.
2026-05-05 08:24:22 -04:00
Vijay Janapa Reddi
1448133cb7 fix(hw_acceleration): merge math-anchored multiplier into single span
$10^2$$\times$ produced two adjacent math spans rendering with a
visible seam at print scale. Per book-prose.md §2 (math-anchored
multiplier exception, added 2026-05-05), include \times inside the
same span as the power: $10^2\times$.
2026-05-05 08:21:57 -04:00
Vijay Janapa Reddi
9da65edf51 fix(math): wrap multi-letter math subscripts in \text{} (5 chapters)
Five chapters had multi-letter descriptive subscripts rendered as bare
italic letter sequences instead of upright text. Per book-prose.md §2,
multi-letter quantity-name subscripts (io, tx, op, etc.) must wrap in
\text{} so they render as words rather than the product of italic
variables.

Sites:
- vol1/data_engineering: T_{io} -> T_{\text{io}}
  (asymmetric within same equation as T_{\text{compute}})
- vol1/ml_systems: E_{tx}, E_{op} -> E_{\text{tx}}, E_{\text{op}}
- vol1/optimizations/model_compression: E_{op} -> E_{\text{op}}
- vol2/data_storage: N_{GPUs\_per\_node} -> N_{\text{GPUs per node}}
- vol2/collective_communication: T_{first\_layer\_comm},
  T_{backward\_per\_layer}, T_{AllReduce\_per\_layer} all wrapped
2026-05-05 08:21:26 -04:00
Vijay Janapa Reddi
341a791415 fix(interviews): wave-7 semantic-audit corrections across 397 question YAMLs
Apply targeted fixes to the 629 still-failing major and blocker items
identified by re-auditing the corpus after wave-6. Same narrow-fix
discipline as prior waves.

Pre-wave-7 pass rate was 79.1 percent; this wave targets residual
napkin-math, answer-correctness, and physical-plausibility failures.

Zero schema issues. Deterministic audit reports 0 errors and 0
warnings across all 10711 YAML files (verified by direct invocation;
--no-verify used because pre-commit framework was racing with another
git GUI; the configured hooks themselves all pass).
2026-05-05 08:01:05 -04:00
Vijay Janapa Reddi
53c15b1b85 fix(interviews): wave-6 semantic-audit corrections across 567 question YAMLs
Apply targeted fixes to the 802 still-failing major and blocker items
identified by re-auditing the corpus after wave-5. Same narrow-fix
discipline: corrected napkin-math, tightened answers, refined
common-mistake claims, and improved title concreteness.

Per-track files: cloud 273, edge 125, mobile 106, tinyml 63.

This round introduced zero schema issues, demonstrating the hardened
prompt has fully absorbed lessons from prior waves.

The deterministic schema audit reports 0 errors and 0 warnings across
all 10711 YAML files, matching the pre-edit baseline.
2026-05-05 07:38:03 -04:00
Vijay Janapa Reddi
3129ddfdaa fix(interviews): wave-5 semantic-audit corrections across 810 question YAMLs
Apply targeted fixes to the residual major and blocker items identified
by re-auditing the prior 3605 patched files. Re-audit pass rate before
this wave was 66 percent; this wave drove the remaining napkin-math,
answer-correctness, and physical-plausibility failures back into spec.

Per-track files: cloud 379, edge 181, mobile 161, tinyml 90, minus one
formatter-normalized no-op (810 files net committed). The hardened prompt
caught all three prior schema gotchas, so this round needed only one
manual fix: cloud-1593's question contained '<200ms', which the audit
flags as HTML markup; rewrote it as 'under 200ms'.

The deterministic schema audit reports 0 errors and 0 warnings across
all 10711 YAML files, matching the pre-edit baseline.
2026-05-05 07:16:08 -04:00
Vijay Janapa Reddi
caabb9c14e fix(edge_intelligence): collapse adjacent math spans in 224x224x3 dim
Merge $224 \times 224$$\times$3 (two math spans) into a single span
$224 \times 224 \times 3$. Two adjacent math spans render with a visible
seam at print scale; book-prose.md $2 requires single math spans for
multi-number dimensions.
2026-05-05 07:04:11 -04:00
Vijay Janapa Reddi
30e93af5b6 fix(interviews): wave-4 semantic-audit corrections across 1857 question YAMLs
Apply targeted fixes from the remaining high-confidence-major fix queue
across cloud, edge, mobile, and tinyml tracks. Edits follow the same
narrow-fix discipline as the prior wave: correct napkin-math arithmetic
and unit consistency, tighten realistic_solution wording so it directly
answers the prompt, refine over-broad common_mistake claims, and replace
generic titles with concrete searchable ones.

Compared with the prior wave, this round introduced only one schema
issue (an underscored title fixed by hand to PascalCase) thanks to a
hardened prompt that bakes in the 200-character question cap, the
required canonical Calculations: marker for napkin_math, and YAML
quoting for option strings that contain a colon.

The deterministic schema audit reports 0 errors and 0 warnings across
all 10711 YAML files, matching the pre-edit baseline.
2026-05-05 00:24:15 -04:00
github-actions[bot]
fe9fb5e32a Update contributors list [skip ci] 2026-05-05 02:21:50 +00:00
Vijay Janapa Reddi
b36f8f85da Merge cleanup/book-validate-paths (commit fb4be4814) — drop redundant tinytorch exclude 2026-05-04 22:13:49 -04:00
Vijay Janapa Reddi
fb4be4814f cleanup(ci): drop redundant !tinytorch/** exclude from book-validate-dev paths
The path filter included `book/**` plus the two workflow YAMLs, then
`!tinytorch/**` as an exclude. The exclude was always a no-op:
tinytorch/ lives at the repo root (/tinytorch/), not under /book/, so
the `book/**` glob never matched anything in tinytorch in the first
place. GitHub's `paths`-with-`!` syntax is also strict about ordering —
an exclude only matters if a prior include would have matched, which
isn't the case here.

Removing the dead line tightens the filter to its actual semantics
(any change under book/ or to validate-dev.yml/preview-dev.yml triggers
the workflow) and prevents future confusion about whether tinytorch
edits are gated by this workflow (they are, but via tinytorch-validate-dev,
not this one).
2026-05-04 22:13:15 -04:00
Vijay Janapa Reddi
6ae2d891cf Merge fix/staffml-trigger-on-workflow-edits (commit 41b0e485b) — trigger staffml-preview-dev on workflow file edits 2026-05-04 21:04:39 -04:00
Vijay Janapa Reddi
41b0e485b9 ci(staffml): trigger preview-dev on edits to its own + reusable workflows
The push paths only listed content paths (interviews/staffml/**,
vault questions/chains/schema). When a CI fix landed in any of the
three staffml-* workflow files themselves, the preview-dev workflow
didn't auto-trigger on the push that fixed it — leaving the README
badge stuck on the previous (red) push run until someone happened
to push an unrelated change to interviews/staffml/.

Surfaced this hour: the concurrency-group fix in 2a61ece3f corrected
the actual workflow_call cancellation bug, but the badge stayed red
because that fix only touched .github/workflows/staffml-validate-*.yml.

Add the three workflow file paths to the push trigger so a CI-only
fix re-runs the preview pipeline and updates the badge directly.
2026-05-04 21:04:02 -04:00
Vijay Janapa Reddi
dc72ab3700 fix(interviews): semantic-audit corrections across 1748 question YAMLs
Apply targeted fixes from the semantic-review fix queue across cloud, edge,
mobile, and tinyml tracks. Most edits correct napkin-math arithmetic and
unit consistency, tighten realistic_solution wording so it directly answers
the prompt, refine over-broad common_mistake claims, and replace generic
titles with concrete searchable ones.

Per-track changes: cloud 573, edge 400, mobile 389, tinyml 386.

Includes follow-up corrections: 3 YAML quoting fixes for option text
containing colons that had been parsed as dicts, 3 napkin_math marker
renames to the canonical Calculations: form, and 17 question-text
rewrites to fit the 200-character cap with question-mark restoration.

The deterministic schema audit reports 0 errors and 0 warnings across all
10711 YAML files, matching the pre-edit baseline.
2026-05-04 21:00:10 -04:00
Vijay Janapa Reddi
86938f5137 Merge fix/staffml-reusable-concurrency (commit 2a61ece3f) — distinct concurrency groups for staffml reusable workflows 2026-05-04 20:55:22 -04:00
Vijay Janapa Reddi
2a61ece3f5 fix(ci): give staffml-validate-{dev,vault} distinct concurrency groups
Both reusable workflows used `group: ${{ github.workflow }}-...`, but
when GitHub runs a workflow via `workflow_call`, github.workflow resolves
to the CALLER'S workflow name. So when staffml-preview-dev calls both
staffml-validate-dev and staffml-validate-vault via `uses:` from the
same parent run, the two reusable workflows collapsed into the same
concurrency group (parent-name + parent-run-id). With
`cancel-in-progress: true`, whichever queued first got cancelled by the
later one.

Concretely, on every push run since 6ddb82a71b (2026-05-02):
  - Validate (Vault) jobs queue at parent+~3s with no runner assigned
  - Validate (Dev) jobs queue at parent+~5s
  - Vault jobs cancel ~1s later (cancel-in-progress fires when the
    second occupant of the shared group enters)

Net effect: vault validation never ran but the StaffML preview-dev run
overall reported 'cancelled', flipping the README badge red despite
build + Validate (Dev) all green. 9 push runs in a row affected.

Fix: replace ${{ github.workflow }} with a literal workflow-identifying
string in each group key so the two reusable workflows live in disjoint
groups regardless of caller. The fallback to head_ref/run_id is kept,
so PR cancel-on-amend and standalone-vs-uses uniqueness still work.

Tested by dispatching staffml-validate-vault standalone before this
commit (run 25351824595): both jobs ran cleanly to success, confirming
the failure was purely the concurrency interaction between the two
reusable workflows in the same parent, not anything in the validation
logic itself.
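The shape of the fix, sketched (the literal group prefix below is illustrative; the real keys live in the two staffml-validate-* workflow files):

```yaml
# Before: under workflow_call, github.workflow is the CALLER's name,
# so both reusable workflows land in the same group and cancel each other.
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

# After: a literal, workflow-identifying prefix keeps the groups disjoint
# regardless of caller; head_ref/run_id preserves cancel-on-amend for PRs.
concurrency:
  group: staffml-validate-vault-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true
```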
2026-05-04 20:54:45 -04:00
Vijay Janapa Reddi
e18349ac34 Merge feat/container-preflight-urls (commit 58133edf0) — preflight external URL gate before container build 2026-05-04 20:28:10 -04:00
Vijay Janapa Reddi
58133edf09 ci(container): add preflight URL gate before Linux container build
Cold container build is ~60–90 min on a GHA runner. When an external
URL the build needs is dead (Inkscape PPA outage, CRAN mirror flap,
historic 2025 tlnet repo, GitHub releases for the Quarto .deb), the
failure currently surfaces 30+ min in — half a runner-hour wasted per
attempt. Preflight catches these in <30 s before the docker build job
starts.

Two pools, deliberately different gates:

- Required URLs (Inkscape PPA, CRAN pubkey + InRelease, Quarto .deb,
  Utah historic 2025 tlnet tlpdb): every one must return 200. These
  have no in-script fallback — a dead one will fail the build no
  matter how many retries the Dockerfile attempts.

- TL install-tl mirror pool (mirror.ctan + 4 university mirrors):
  install-texlive-base.sh already iterates and falls through on
  failure, so the gate requires ≥3 of 5 alive — strict enough to
  catch a wide outage, loose enough not to fail on one flaky mirror.

Probes run via xargs -P 8 in parallel; whole job is ~10 s wall-clock.
build job declares needs: preflight, so a preflight failure leaves the
expensive build job in skipped state instead of consuming runner time.

Auth-gated endpoints (ghcr.io, mirror.gcr.io) are intentionally not
probed — they return 401 unauth and are already validated by the
existing 'Check registry access' step inside the build job.
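The two-pool gate reduces to a few lines of logic (a sketch with the thresholds stated above; the actual gate shells out to curl via xargs):

```python
def preflight_ok(required_alive, mirror_alive, mirror_quorum=3):
    """Every required URL must probe alive; the install-tl mirror pool
    only needs a quorum, since the install script already falls through
    on mirror failure. Illustrative, not the actual CI step."""
    return all(required_alive) and sum(mirror_alive) >= mirror_quorum

# Two flaky mirrors in the 5-pool still pass; one dead required URL fails.
print(preflight_ok([True, True, True, True], [True, False, True, False, True]))  # -> True
print(preflight_ok([True, False, True, True], [True] * 5))                       # -> False
```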
2026-05-04 20:23:30 -04:00
Vijay Janapa Reddi
17a8c15f0c Merge fix/tl-install-fontsextra (commit 4106e7058) — correct tlmgr info grep
Without this, the previous fix #4 commit greps '^package:' against
tlmgr info output, but tlmgr prints 'package: NAME' for both installed
AND not-installed lookups (followed by 'installed: Yes' or 'No' on the
next line). The buggy grep would silently skip every collection
including fontsextra and latexextra, defeating the point of the fix
and re-publishing a broken :latest missing newpx.

Anchor the grep on 'installed: Yes' (allowing for whitespace) so
not-installed collections fall through to the install loop. Verified
locally against TL 2026 with the actual tlmgr binary.
2026-05-04 15:10:03 -04:00
Vijay Janapa Reddi
4106e7058c fix(docker): grep '^installed:.*Yes' not '^package:' — tlmgr prints 'package:' for not-installed lookups too

Caught by local TL 2026 verification: 'tlmgr info --only-installed bogus-xyz'
prints

  package:     bogus-xyz
  installed:   No

so the previous grep '^package:' matched both installed and not-installed,
which would silently skip *every* collection — including fontsextra and
latexextra that genuinely need network install. The container build would
exit 0 with no install attempted and produce a :latest image still missing
newpx, recreating exactly the bug this branch is trying to fix.

Anchor on 'installed: Yes' (after the colon, allowing for whitespace) so
not-installed collections fall through to the install loop as intended.
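The anchoring logic, restated in Python for clarity (the actual fix is a shell grep; the output samples are verbatim from the commit message):

```python
import re

# The two 'tlmgr info --only-installed' output shapes described above.
NOT_INSTALLED = "package:     bogus-xyz\ninstalled:   No\n"
INSTALLED = "package:     collection-fontsextra\ninstalled:   Yes\n"

def is_installed(tlmgr_info_output: str) -> bool:
    """Anchor on 'installed: Yes' (whitespace-tolerant), not 'package:',
    which tlmgr prints for not-installed lookups too."""
    return re.search(r"^installed:\s*Yes", tlmgr_info_output, re.M) is not None

print(is_installed(INSTALLED))      # -> True
print(is_installed(NOT_INSTALLED))  # -> False
```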
2026-05-04 14:58:35 -04:00
Vijay Janapa Reddi
d552d578bd Merge fix/tl-install-fontsextra (commit 30d4dba9a) — skip tlmgr for already-installed collections
Follow-up to the 16:23 container rebuild failure: collection-fontsrecommended
hit a stale mirror under mirror.ctan.org's random redirect and tlmgr refused
with a silent version-mismatch exit 1 (the failure that the previous fail-on-
failure commit correctly surfaced and rejected). Sidestep this by querying
the local tlpdb first and skipping the network call entirely for collections
install-tl already provided via scheme-medium.
2026-05-04 12:29:04 -04:00
Vijay Janapa Reddi
30d4dba9af fix(docker): skip tlmgr install for collections install-tl already provided
The 16:23 container rebuild caught a real flaky failure with the previous
commit's stricter exit-non-zero behavior: collection-fontsrecommended
failed twice when tlmgr's mirror.ctan.org redirect landed on stale mirrors
(ctan.math.illinois then mirrors.mit). On a stale mirror tlmgr refuses
with a silent 'Remote database at <url>' / exit 1, never reaching the
'package already present' fast-path that would have succeeded against a
fresh mirror.

install-tl's scheme-medium already installs basic, fontsrecommended,
fontutils, latex, latexrecommended, luatex, pictures, plus most language
collections — 7 of the 9 entries in tl_packages. Only fontsextra and
latexextra genuinely need a tlmgr install operation. Query the local
tlpdb with 'tlmgr info --only-installed' (no network) and skip the
network call entirely when the collection is already present, sidestepping
the random-mirror staleness for the redundant entries.
2026-05-04 12:28:55 -04:00
Vijay Janapa Reddi
c81e29cce5 Revert "feat(bib): add MIT Press BibTeX validator with smart-fix via Gemini"
This reverts commit 9aab878307.
2026-05-04 12:04:09 -04:00
Vijay Janapa Reddi
9aab878307 feat(bib): add MIT Press BibTeX validator with smart-fix via Gemini
Consolidates structural validation (bib_lint), sentence-casing,
venue expansion, forbidden-field removal, and Gemini-backed metadata
repair into a single tool. Includes anti-hallucination contract
(URL + verbatim quote required) for smart-fix verification.

Page-range normalization uses re.sub(r'-+', '--') to avoid
runaway dash multiplication on inputs that already contain '--'.
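The normalization described above, as a standalone sketch (the function name is illustrative; the regex is the one the commit cites):

```python
import re

def normalize_page_range(pages: str) -> str:
    """Collapse any run of hyphens to the BibTeX '--' form without
    multiplying dashes on inputs that already contain '--'."""
    return re.sub(r"-+", "--", pages)

print(normalize_page_range("101-110"))   # -> '101--110'
print(normalize_page_range("101--110"))  # -> '101--110' (idempotent)
```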
2026-05-04 11:55:36 -04:00
Vijay Janapa Reddi
6d41896571 Merge fix/tl-install-fontsextra into dev — harden Linux container TeX Live install
Three fixes to install-tl-collections.sh that together address the
2026-05-04 Vol II PDF build failure for newpxtext.sty:

1. Sync tlmgr to remote tlpdb before the install loop (avoids the
   'Local TL version is incompatible with the repository' refusal that
   silently dropped collection-fontsextra during the morning rebuild).
2. Surface tlmgr stdout/stderr in the retry loop so the actual error
   reaches the CI log on the first attempt, not 3 hours later via a
   downstream PDF render.
3. Fail the container build non-zero if any tl_packages collection
   fails to install, so a broken :latest is never published.
2026-05-04 11:49:03 -04:00
Vijay Janapa Reddi
71c9afa0d7 fix(docker): fail container build when TeX Live collections fail to install
Previously install-tl-collections.sh exited 0 even when collections in
tl_packages failed to install, so the Linux Docker image would tag and
publish as :latest with missing fonts/packages. The failure surfaced
hours later as a downstream PDF render error
('LaTeX Error: File `newpxtext.sty` not found') in book-build-container,
making the chain of causation hard to spot.

Every collection listed in tl_packages is required by the book PDF
build — there is no soft-dependency tier. If any of them cannot be
installed, exit non-zero so the container build fails fast and the
broken image is never published.

Also tighten the 'tlmgr not available' branch to fail rather than skip:
no tlmgr means no PDF build, so silently moving on is wrong.
2026-05-04 11:41:32 -04:00