11 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
c8b54a1a8f polish(tinytorch): cross-reference audit + remaining SVG polish
Cross-reference audit subagent: scanned all 30 scoped .qmd files for
orphan table / figure / listing labels (caption with {#tbl-/fig-/lst-...}
but no corresponding @label reference in prose). Added natural
references for orphans so every labeled artifact is now introduced in
the surrounding text.
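The orphan scan described above can be sketched as a small script. This is a minimal sketch, not the subagent's actual code; the regexes and the `find_orphans` helper are mine:

```python
import re

# Quarto labels look like {#tbl-...}, {#fig-...}, {#lst-...};
# prose references look like @tbl-..., @fig-..., @lst-...
LABEL_RE = re.compile(r"\{#((?:tbl|fig|lst)-[\w-]+)\}")
REF_RE = re.compile(r"@((?:tbl|fig|lst)-[\w-]+)")

def find_orphans(text):
    """Return labels defined in text that no @ref ever points at."""
    labels = set(LABEL_RE.findall(text))
    refs = set(REF_RE.findall(text))
    return sorted(labels - refs)

sample = (
    ": Loss shapes drive memory {#tbl-04-losses}\n"
    "As @tbl-04-losses shows, batching matters.\n"
    "![Flow](flow.svg){#fig-04-flow}\n"
)
print(find_orphans(sample))  # fig-04-flow is defined but never referenced
```

In the real audit each of the 30 .qmd files would be read and scanned the same way, with refs allowed to live in a different file than their labels.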

Final counts: 247 labels defined, 216 referenced (87% coverage). The
remaining ~30 orphans were either self-describing (milestone-result
tables whose placement is obvious from context) or in files I left
untouched to preserve the existing voice.

Also included: tiers-optimization-dependencies.svg updates from the
earlier Gemini consistency audit that had been left uncommitted.

Audit report at .claude/_reviews/crossref-audit-report.md.
2026-04-23 14:59:42 -04:00
Vijay Janapa Reddi
0f704b8023 polish(tinytorch): systems-first module hooks + glossary + svg newlines
Wave 4 editorial content across 20 modules + new glossary back matter:

  1. Module opener hooks (20 new 2-3 sentence paragraphs between the
     chapter heading and Module Info callout). Every hook LEADS with
     the systems angle (memory, bandwidth, arithmetic intensity,
     cache, HBM, roofline, KV cache, hardware utilization, etc.) and
     connects back to the ML story. Reinforces that this is a lab
     guide for ML systems, not an ML-theory textbook.

  2. Code-listing captions on substantive code blocks (roughly >10
     lines, defines a class/function/algorithm). Populates Quarto's
     List of Listings front matter. Combined across F1/F2/L/O
     subagent waves: roughly 60 listings now carry
     '**Listing N.M — Brief description**' captions.

  3. Figure alt-text audit across 20 module diagrams. Most already
     carried objective, specific alt-text; a handful were rewritten
     for precision.

  4. Glossary added as back matter (tinytorch/quarto/glossary.qmd + registered
     in pdf/_quarto.yml). 90 alphabetical entries spanning tensor /
     memory / autograd / training / architecture / optimization
     terms. One-sentence definitions. Module cross-references where
     the term is central. Lab-guide voice, not dictionary.

  5. Style discipline: no em-dashes in prose (caption templates
     '— Description' are the only exception, required by parser).
     All agent outputs and the hand-revised hooks audited for em-dash
     use.

  6. SVG trailing-newline hygiene: 8 SVGs touched by the Gemini style
     audit had lost their trailing newline. Restored per the SVG
     file-hygiene rule.
2026-04-23 14:52:51 -04:00
Vijay Janapa Reddi
88716495ac polish(tinytorch): 151 table captions + 20 Further-Reading hyperlinks
Subagent A: add a Quarto caption + {#tbl-...} label to every pipe table
so each indexes into the List of Tables (previously empty). 151 tables
across 28 files: 20 modules, 7 milestones, big-picture,
getting-started, conclusion. Caption style: single bold sentence
under 15 words; labels namespaced by module number.
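The caption style described above, sketched in Quarto source (the table content and label here are invented for illustration):

```markdown
| Optimizer | State per param | Memory multiplier |
|-----------|-----------------|-------------------|
| SGD       | none            | 1x                |
| Adam      | m, v            | 3x                |

: **Adam triples optimizer memory relative to SGD.** {#tbl-07-optimizer-memory}
```

Quarto picks the `: caption {#tbl-...}` line up into the List of Tables, which is what makes the labeling pass worthwhile.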

Subagent B: hyperlink well-known external references in Further
Reading / Additional Resources sections. 20 links added spanning
arXiv papers (Attention Is All You Need, Scaling Laws, Deep
Compression, AlexNet, LeNet, Rosenblatt 1958, Rumelhart/Hinton/
Williams 1986), Jay Alammar's Illustrated Transformer / Word2Vec,
Karpathy's Recipe for Training Neural Nets, Ruder's gradient descent
overview, Jurafsky & Martin SLP3, Cybenko's UAT, Drepper cache paper,
HuggingFace Transformers.

6 references left unlinked (physical textbooks and ambiguous blog
posts where a canonical URL couldn't be verified).
2026-04-23 14:06:08 -04:00
Vijay Janapa Reddi
6c730f57c8 polish(tinytorch/modules): streamline callout classes
Audit found two inconsistencies that this fixes:

  1. Systems Implication callouts split 9 note / 12 warning — the
     12 warning instances were pre-existing 'legacy' classification.
     Per the preamble convention (Systems Implication = insight =
     note, not warning), convert all 12 to callout-note. Result:
     21/21 Systems Implication callouts now use callout-note (blue
     bar), consistent semantic signal across all modules.

  2. Answer callouts (101 instances) were callout-note. Since
     Check-Your-Understanding wrappers are callout-tip (green), and
     answers are the 'reveal' to questions (productive/reward
     semantic, not neutral info), switch all 101 to callout-tip
     collapse=true. Green callout feels like reveal, matches CYU's
     visual language, and distinguishes answers from the 149 generic
     notes used for Coming-Up/Module-Info/Further-Reading boxes.
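In Quarto source, a converted Answer callout would look roughly like this (the title attribute and body text are placeholders, not quoted from the modules):

```markdown
::: {.callout-tip collapse="true" title="Answer"}
Cross-entropy over B samples and C classes touches every logit once,
hence the O(B × C) cost noted in the module.
:::
```

The `collapse="true"` attribute gives the click-to-reveal behavior that matches the question/answer semantic.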

Final callout inventory across 20 modules:
  callout-note     ~140  (Systems Implication, Coming-Up, Further-
                           Reading, Module-Info, Historical Context)
  callout-tip      ~180  (Check Your Understanding wrappers,
                           Answers, Key Takeaways titles)
  callout-warning   ~22  (Save-Your-Progress, Performance-Note,
                           other warnings — not Systems Implication)

No content change — only class swaps. Titles untouched.
2026-04-23 13:43:24 -04:00
Vijay Janapa Reddi
2094bb5c83 polish(tinytorch/modules): Wave 3b editorial additions across all 20 modules
Four parallel subagents completed the content work derived from Wave 2
audits. Each of the 20 module files now has:

  - A Check Your Understanding callout (callout-tip) at chapter end,
    with 3-5 concept-specific checkboxes keyed to that chapter's
    unique content. Checkbox content targets specific concepts, NOT
    generic 'did you understand this' wording. Callout titles are
    text-only ('Check Your Understanding — <Module>') — no emojis,
    survives the strip filter, renders identically cross-platform.

  - A Key Takeaways section (3-4 bullet recap + Coming-next hook
    to the next module) inserted before Further Reading / What's
    Next. Serves as the section students flip back to when reviewing.

Per-module audit gap fills:
  - 01_tensor: normalized to canonical 12-section order; added
    Get Started section; removed duplicate Further Reading fragment.
  - 04_losses: added explicit O(B × C) complexity line in Core Concepts.
  - 07_optimizers: added SGD O(P) / Adam O(P) + O(2P) memory line.
  - 11_embeddings: added Systems Implication callout (sparse gradients,
    HBM layout, distributed sharding).
  - 13_transformers: added O(N² · d) attention complexity + Systems
    Implication callout (KV-cache memory growth under autoregressive
    decoding).
  - 14_profiling: added FLOP/bandwidth complexity framing + Systems
    Implication callout (reading the roofline as a decision tree).
  - 15_quantization: added constant-factor speedup complexity framing.
  - 16_compression: added pruning O(N log N) / compression ratio math.
  - 17_acceleration: added fused-kernel memory-complexity reduction.
  - 19_benchmarking: added Little's Law L = λW block equation.
  - 20_capstone: added end-to-end inference complexity decomposition
    with each optimization module mapped to its attacking term.
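The Little's Law block added to 19_benchmarking, in its conventional form (the symbol glosses are mine, not the module's exact prose):

```latex
% Average number of requests in flight (L) equals
% arrival rate (lambda, in req/s) times mean time in system (W, in s)
L = \lambda W
```

For a serving benchmark this ties measured throughput and latency to concurrency: at 100 req/s with 50 ms mean latency, 5 requests are in flight on average.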

20 modules × avg ~24 lines added each (~480 lines of new pedagogy).
All callout titles are text-only. No ::: fence regressions.

Audit trail:
  .claude/_reviews/section-consistency-report.md
  .claude/_reviews/systems-emphasis-report.md
  .claude/_reviews/wave-plan.md
2026-04-23 13:34:16 -04:00
Vijay Janapa Reddi
0fde765058 polish(tinytorch/modules): Wave 3a — strip emojis, remove duplicate Get Started
Audit findings from .claude/_reviews/section-consistency-report.md and
systems-emphasis-report.md:

  - Strip leading emoji prefixes from Systems Implication callout titles
     across 17 modules. strip-emojis.lua removes them from the PDF
     render anyway (XeLaTeX + Latin Modern fonts don't cover emoji
     ranges cross-platform), so the source now matches what ships.

  - Remove duplicate trailing '## Get Started' section from modules
    10, 11, 13, 14 — copy-paste artifacts.

  - Update pdf/_quarto.yml preamble comment: callout conventions
    are class + title-word, no emoji prefixes.
2026-04-23 13:26:29 -04:00
Vijay Janapa Reddi
93fbc8fcbc fix(tinytorch/tokenization): restore missing python fence; keep TOC spacing
Two fixes:

1. Critical rendering bug in modules/10_tokenization.qmd: a Python code
   block used for inline {python} variable refs was missing its
   opening ```{python} fence. Pandoc interpreted the two `#` comments
   inside ("# Character tokenizer: vocab 100, dim 512, float32" and
   "# BPE tokenizer: vocab 50,000, dim 512, float32") as top-level
   markdown headings, which the book class rendered as Chapters 19
   and 20 with the prose lines below dumped as plain text.

   Restoring the opening fence (with `#| echo: false`) puts the
   variable defs back inside a proper code chunk. The inline
   {python} tradeoff_char_embed etc. refs in the next paragraph
   now resolve correctly.
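Per the description above, the restored chunk looks roughly like this. Only the two comments and the tradeoff_char_embed name are quoted from the commit; the arithmetic bodies are my reconstruction (vocab × dim × 4 bytes):

````markdown
```{python}
#| echo: false
# Character tokenizer: vocab 100, dim 512, float32
tradeoff_char_embed = 100 * 512 * 4 / 1e6      # ~0.2 MB
# BPE tokenizer: vocab 50,000, dim 512, float32
tradeoff_bpe_embed = 50_000 * 512 * 4 / 1e6    # ~102 MB
```
````

Without the opening fence line, everything inside falls out into plain markdown, which is exactly how the two `#` comments became phantom chapter headings.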

2. Revert the TOC spacing change from the previous commit. User
   confirmed the default book-class TOC leading is already fine;
   tocloft bumps made it too airy. arraystretch 1.2 for tables
   and enumitem itemsep 0.3em for lists remain — those targeted
   visible defects.
2026-04-23 13:05:53 -04:00
Vijay Janapa Reddi
bedec16f5a Merge tinytorch-updates into dev 2026-04-22 18:20:55 -04:00
Vijay Janapa Reddi
b4de873c91 docs(tinytorch): inject systems engineering perspective into quarto modules
- Add 'Hardware Reality' systems callouts (Compute/Memory) to all 20 modules
- Enhance 'Seminal Papers' sections with systems implications
- Polish narrative flow to bridge algorithmic concepts and hardware constraints
- Standardize Quarto callout blocks for systems insights
2026-04-22 18:19:51 -04:00
Vijay Janapa Reddi
ff9f8a21d6 polish(tinytorch): deep narrative pass across all 30 chapters
Parallel agent pass (one per chapter, 30 chapters) followed by
orchestrator-level cross-chapter polish, framed against the
"iconic lab book" bar codified in tools/CHAPTER_POLISH_BRIEF.md.

Per-chapter agent edits (5–15 high-leverage changes each):
- Sharper openers — first paragraph earns the chapter in two
  sentences; killed "in this module we will..." throat-clearing.
- Stronger "What's Next?" bridges — every chapter now ends on
  a concrete question the next chapter answers, not a passive
  feature list.
- Tightened verbose prose; cut hedging adverbs ("essentially",
  "fundamentally", "remarkably") and hype words ("powerful",
  "elegant", "revolutionary", "comprehensive").
- House-rule enforcement: blank lines before bullet lists,
  callouts titled, ASCII-art fences given `text` language tag.

Project-wide bug fix surfaced by the pass:
  pdf/_quarto.yml has `execute: enabled: false` set globally,
  which means every `{python} foo` inline shortcode and every
  `{python}` setup chunk was rendering as raw text in BOTH
  HTML and PDF. Across the 20 modules + 6 milestones + 3 front/
  back-matter chapters this added up to ~300 broken inline
  shortcodes and ~50 dead setup blocks. The agents inlined the
  pre-computed values directly into prose/tables/answer blocks
  and removed the dead chunks.
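The global setting in question, as described (key path from the commit; the file's other keys are not shown):

```yaml
# pdf/_quarto.yml
execute:
  enabled: false   # no chunk runs, so inline {python} shortcodes leak as raw text
```

With execution off project-wide, inlining the pre-computed values was the only way to keep the prose correct in both outputs.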

Cross-chapter consistency fixes (orchestrator pass):
- Removed the `##  PyTorch` panel-tabset emoji that the
  remaining agents missed (07_optimizers); XeLaTeX renders this
  as a phantom-glyph trap when the tab title promotes to an
  H2 in PDF.
- Standardized milestone H2 `## YOUR Code Powers This` →
  `## Your Code Powers This` across the 6 milestone chapters
  (was inconsistent — only 04_cnn had been normalized).
- Removed leftover scaffolding line `(see ../assets/images/...)`
  in modules/13_transformers.qmd line 332.
- Deleted three remaining orphan `{python}` blocks the agents
  conservatively left in modules/15_quantization.qmd (×2) and
  modules/18_memoization.qmd (×1) since their consumers were
  already inlined.
- modules/04_losses.qmd: `\medskip` between Case 1 and Case 2
  in the multi-label callout was too subtle in tcolorbox; now
  `\vspace{1em}` so the cases visibly separate in PDF.

Net effect: PDF dropped from 384 pages → 336 pages (-12.5%)
and 2.0 MB → 1.8 MB. Zero `{python}` text leakage in the
rendered PDF (verified via pdftotext grep).

Verified visually:
- Foundation Tier Part page (flameorange + torchnavy branding)
- Module 03 Layers diagram (redrawn) + sharper figure caption
- Module 18 Memoization opener (5050→100 collapse hook)
- Conclusion close ("Don't import torch. You built it." as
  uncontested final sentence)
- Case 1 / Case 2 separation in 04_losses (\vspace{1em} fix)

Reports collected from all 30 agents flagged a project-wide
follow-up worth a separate pass: every module repeats ~300
lines of inline `<style>`/`<script>`/action-cards HTML at the
top, which is HTML-only by construction (correctly inside a
`{=html}` raw block) but should probably be hoisted into a
shared partial/include.
2026-04-22 18:06:52 -04:00
Vijay Janapa Reddi
edbea966bf refactor(tinytorch): rename site-quarto/ to quarto/
Brings the TinyTorch lab guide's Quarto project in line with
book/quarto/, the only other in-tree Quarto publication that builds
both web and PDF outputs from a single source. The previous name had
three redundancies:

  - already under tinytorch/, so "site-" prefix wasn't disambiguating
  - also produces the PDF lab guide, so "site-" was misleading
  - the top-level site/ dir made "site-quarto" read as "the site's
    quarto config" rather than "the tinytorch site, in quarto"

After this rename the convention is straightforward:

  book/quarto/        -> the textbook (web + PDF)
  tinytorch/quarto/   -> the TinyTorch lab guide (web + PDF)
  mlsysim/docs/       -> mlsysim API reference (kept as docs/, since it
                        really is API reference, not a publication)

Touches 7 GitHub workflows, both .gitignore files, the rename target's
own self-references (Makefile, _quarto.yml configs, STYLE.md,
measure-pdf-images.py), and 6 copies of subscribe-modal.js plus a few
shared scripts/configs whose comments documented the old path.

Verified: rebuilt pdf/TinyTorch-Guide.pdf (2.1M) cleanly from the new
location with 'make pdf' from tinytorch/quarto/.
2026-04-22 14:38:18 -04:00