cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-03-11 17:49:25 -05:00

Author	SHA1	Message	Date
Zeljko Hrcek	1f85111486	Updated a figure in chapter 15	2026-02-25 10:54:38 +01:00
Zeljko Hrcek	eccbd9d5d6	Updated a figure in chapter 15	2026-02-25 10:47:25 +01:00
Zeljko Hrcek	96efa3bf29	Merge pull request #1190 from Zeljko-Hrcek/fix/training Update training chapter and add missing color definition	2026-02-21 09:55:21 +01:00
Zeljko Hrcek	ae2ef83dd3	Merge branch 'feature/book-volumes' into fix/training	2026-02-21 09:54:24 +01:00
Zeljko Hrcek	403994ea2e	Update training chapter and add missing color definition	2026-02-21 09:28:04 +01:00
Vijay Janapa Reddi	739b48622f	Add war story callout with proper icon formats and supporting files - Add war story callout definition in custom-numbered-blocks.yml - Create war story icon in all three formats (SVG, PNG, PDF) matching the 64x64 stroke-only style used by all other callout icons - Add war story bibliography and PDF config entry - Add first war story ("The Quadratic Wall") in nn_architectures - Include icon conversion utility script	2026-02-19 07:38:16 -05:00
Vijay Janapa Reddi	15f7e139a7	Clean up Vol1 manuscript: fix code conventions, cross-refs, and formatting - Fix quiz filenames in 4 chapters (footnote_context -> chapter-specific) - Replace raw _value variables in prose with formatted _str versions - Add missing engine: jupyter in ml_ops and responsible_engr YAML - Remove debug comments in nn_computation, translate Serbian TikZ comments - Fix fairness figure label mismatch (96% -> 91%) in responsible_engr - Replace hardcoded placeholder strings with computed values in ml_systems - Fix cross-ref issues: "DL Primer" -> @sec- ref, "Part II" -> "Part I" - Fix duplicate latency-budget-calc label, "dotted" -> "dashed" arrow - Normalize magnitude references (seven orders), fix table spacing - Clean up truncated comment blocks in data_selection - Remove internal authoring HTML comment in hw_acceleration	2026-02-18 17:28:31 -05:00
Vijay Janapa Reddi	5c45391557	Refactors Distributed Training chapter for Systems Sandwich alignment Aligns the Distributed Training chapter with the Volume 2 'Systems Sandwich' framework, establishing it as the 'Operational Layer' of the Machine Learning Fleet. Key changes: - Refactors 'Purpose' and 'Learning Objectives' to use rhetorical pivots and focus on the 'Physics of the Fleet'. - Updates Python setup cell to use the 'Safe Class Namespace' pattern (P.I.C.O.) and adds Archetype A (GPT-4) constants. - Rewrites 'Multi-Machine Scaling Fundamentals' to center on the 'Communication-Computation Ratio' and the 'Law of Distributed Efficiency'. - Cross-references the Volume 2 Introduction definitions to create a cohesive narrative.	2026-02-18 15:25:00 -05:00
Vijay Janapa Reddi	ede45403c3	Refactors Volume 2 Introduction for alignment with Volume 1 Gold Standard Aligns the rhetorical style and quantitative rigor of the Volume 2 Introduction with the established Volume 1 standards. Introduces the "Machine Learning Fleet" narrative as the central engineering challenge of Volume 2, shifting from single-node optimization to cluster-scale orchestration. Key changes: - Establishes the "Law of Distributed Efficiency" and "CI Ratio" (Communication Intensity) as new quantitative frameworks. - Defines the "Reliability Gap" to address statistical failure certainty in massive clusters. - Refactors all TikZ diagrams (Systems Sandwich, Roadmap, AI Triad) to use project-standard colors and Helvetica font. - Updates the "Lighthouse Archetypes" to focus on throughput, latency, and resource-bound fleet challenges. - Implements P.I.C.O. math patterns for fleet-scale calculations.	2026-02-18 15:20:33 -05:00
Vijay Janapa Reddi	85c60b5935	Refactors framework evolution section Renames and restructures the framework evolution section to "The Ladder of Abstraction," emphasizing the problem-solving nature of each abstraction layer. Clarifies the role of each layer (BLAS/LAPACK, NumPy, Deep Learning Frameworks) in solving specific problems related to performance, usability, and differentiation, respectively. Highlights the trade-offs between productivity and transparency as we move up the abstraction ladder.	2026-02-17 18:49:49 -05:00
Vijay Janapa Reddi	67b4d48470	Clarifies notation for units and precision Refines the notation section to explicitly state the use of decimal SI prefixes for data, memory, and compute. Updates wording for clarity and consistency, specifically addressing units, storage, and compute contexts. Ensures that the book uses only decimal SI prefixes and specifies the formatting of numbers and units.	2026-02-16 10:39:14 -05:00
Vijay Janapa Reddi	dcc8afb4cc	Parallel builds: add HTML/EPUB format picker, clearer reporting - debugCommands: prompt for format (PDF, HTML, EPUB) in Build All Chapters (Parallel) - parallelDebug: clearer success/fail messages, Open Reports Folder, REPORT.md header - README: document volume + format selection for parallel builds	2026-02-15 17:41:10 -05:00
Vijay Janapa Reddi	e6051001b1	Parallel builds: track callout PDFs, fix extension errors, remove copy step - .gitignore: allow book/quarto/assets/images/icons/callouts/*.pdf - Add missing callout PDFs (takeaways, pitfall, fallacy) so worktrees have them - Extension: runtime repo-root check and clearer errors for parallel debug - Extension: remove copyCalloutPdfsToWorktree (PDFs now in repo by default) - README: document parallel build requirements; drop callout copy note	2026-02-15 16:50:25 -05:00
Vijay Janapa Reddi	37d26f6def	Book volumes: content, VS Code ext, CLI debug, and build updates	2026-02-15 16:02:54 -05:00
Vijay Janapa Reddi	73a956a09b	chore(volumes,vscode-ext): batch volume updates and tooling improvements Checkpoint the branch-wide content/config revisions together with workbench enhancements so chapter rendering and developer workflows stay aligned. This captures the current validation-driven formatting and parallel build/debug improvements in one commit.	2026-02-15 14:03:27 -05:00
Vijay Janapa Reddi	cac84290df	feat(vscode-ext): auto-open PDF in editor after build completes Use a file watcher to detect when the PDF is created/modified during build, then automatically open it in VS Code. Build still runs in the visible terminal so users see progress. Also fix LaTeX comma-in-title bug in foldbox.tex by bracing the title argument inside tcolorbox options.	2026-02-15 12:13:04 -05:00
Vijay Janapa Reddi	1db2dacfe7	style(vol2): fix lowercase x multiplication notation across all chapters Convert all remaining lowercase 'x' used as multiplication (e.g., "1000x faster") to $\times$ across 17 vol2 chapters. These were flagged by the new lowercase_x_multiplication validator check. Simplifies the validator regex from a fragile word-list approach to a broader pattern matching digit-x-lowercase (e.g., \dx\s+[a-z]) which naturally excludes hardware counts (8x A100) and hex literals (0x61). Includes the conversion script in _archive.	2026-02-15 11:53:51 -05:00
Vijay Janapa Reddi	5d68f0a2e0	style: standardize multiplication notation to $\times$ across all chapters Convert all Unicode × (U+00D7) to LaTeX $\times$ in prose, tables, and math contexts across both volumes. Unicode × is preserved only inside fig-alt text for accessibility screen readers. One instance inside a plain markdown backtick code span (frameworks.qmd) was reverted to Unicode × since LaTeX doesn't render in code spans. Updates validate.py with a new lowercase-x-as-multiplication check and refines the latex_adjacent warning to distinguish _str variables (safe) from raw inline Python. Updates validate_inline_refs.py comments to reflect the new convention. Includes the conversion script in _archive.	2026-02-15 11:43:45 -05:00
Vijay Janapa Reddi	f65e68222a	fix(vol1): correct stale backward references across 3 chapters Audited all 52 backward-looking prose references ("recall", "as we saw", "introduced earlier") across all 16 Vol I chapters. Found 46 valid and 6 with issues; fixed the 4 actionable ones: - benchmarking: fix dual attribution for energy-movement claim - hw_acceleration: fix imprecise "100x" energy gap to "orders-of-magnitude" - hw_acceleration: change "introduced in" to "mentioned in" for HBM ref - conclusion: correct invariant attribution from data_engineering to Part I Audit report: .claude/_reviews/2026-02-15_backward-reference-audit.yaml	2026-02-15 09:52:32 -05:00
Vijay Janapa Reddi	577b5d3cc7	feat(vscode-ext): add health status bar and validation checks Add a persistent health indicator to the extension: a status bar item that shows pass/warn/error at a glance, plus a health summary node at the top of the Pre-commit tree view. Fast in-process TypeScript checks run on file save, editor switch, and startup (<100ms per file). Checks: duplicate labels, unclosed div fences, missing figure alt-text, and unresolved in-file cross-references. - Add src/validation/qmdChecks.ts with four pure check functions - Add src/validation/healthManager.ts with central status tracker - Wire HealthManager into extension.ts with status bar and event hooks - Add expandable health summary node to PrecommitTreeProvider - Register showHealthDetails command in package.json	2026-02-14 16:18:16 -05:00
Vijay Janapa Reddi	97118ba0d8	style(vol1): fix remaining multiplication notation violations Second pass catching ~37 additional instances missed in the initial cleanup, including prose in frameworks, glossary definitions, footnotes, fig-caps, fig-alts, table cells, and callout content. All remaining `Nx` patterns are now exclusively inside Python code blocks (comments, docstrings, f-strings) or are mathematical variable expressions (e.g., derivative = 2x), which are correct as-is.	2026-02-14 15:46:57 -05:00
Vijay Janapa Reddi	c9d21b768b	feat(binder): add `render plots` command for matplotlib figure gallery Integrate figure rendering into the binder CLI so plots can be previewed without a full Quarto build. Extracts Python code blocks with fig-* labels from QMD files, renders them to PNG, and outputs a browsable gallery at _output/plots/<chapter>/. Also fixes the package import chain so `binder` works correctly as an installed entry point. - Add book/cli/commands/render.py with RenderCommand class - Wire into main.py with help table entry and command dispatch - Add matplotlib>=3.7.0 to pyproject.toml dependencies - Add book/quarto/_output/ to .gitignore - Archive standalone render_figures.py to _archive/	2026-02-14 12:43:23 -05:00
Vijay Janapa Reddi	a03ce064ec	style(vol1): standardize multiplication notation across all chapters Replace ~115 inconsistent multiplication symbols with the codified standard from book-prose.md: - Multiplier/ratio: N× (Unicode ×, no space) — e.g. "4× speedup" - Dimensions: M × N (Unicode ×, spaces) — e.g. "224 × 224" - Math mode: $\times$ only inside LaTeX expressions Fixes applied across 20 files: - Lowercase x → Unicode × in prose (~80 instances) - Standalone $\times$ → Unicode × for multipliers (~32 instances) - Dimension spacing corrections (3 instances) - Hyphen → en dash in ranges (e.g. "2-4×" → "2–4×") Protected content (code blocks, TikZ, display math) left untouched.	2026-02-14 12:09:32 -05:00
Vijay Janapa Reddi	40fcce9955	content(vol1): add thesis-driven titles to all Key Takeaways callouts Every callout-takeaways block across Vol 1 now has a title attribute that captures the chapter's core insight rather than repeating the chapter name. Titles are drawn from each chapter's purpose question or central thesis, answering "what should a student remember six months from now?" Examples: "Constraints Drive Architecture" (Intro), "Perfectly Available, Perfectly Wrong" (ML Ops), "Architecture Is Infrastructure" (Network Architectures).	2026-02-14 11:23:29 -05:00
Vijay Janapa Reddi	6bca8ef9b0	fix(vscode-ext): fix typed reference colors and simplify settings Typed references (@tbl-, @fig-, @sec-, @lst-, @eq-) and label definitions ({#tbl-...}, {#fig-...}) were all rendering in generic blue because overlapping decorations overrode the typed colors. Fix by collecting typed matches first and excluding them from generic/structural buckets. Also fix label-definition regexes to match labels with trailing attributes (e.g. {#fig-foo fig-env=...}). Change footnote colors from invisible slate-gray to distinct pink/rose for clear visual separation from other reference types. Remove all 27 individual color-override settings (mlsysbook.color*) since only the preset picker (subtle/balanced/vivid) is needed.	2026-02-13 18:21:25 -05:00
Vijay Janapa Reddi	f4391ce26f	refactor(vscode-ext): remove diagnostics system and clean up highlighter Remove QmdDiagnosticsManager and WorkspaceLabelIndex which caused false-positive blue squiggles during workspace index loading. Pre-commit hooks already validate cross-references and inline Python at commit time. Also remove div fence marker highlighting (Quarto handles natively), add !inFence guards to label line checks, remove broken reference decoration, and change footnote colors from gold to muted slate-gray to convey their marginal/supplementary nature.	2026-02-13 16:54:42 -05:00
Vijay Janapa Reddi	e3cc9f7af3	refactor: rename ml_ml_workflow files, consolidate CLI, and clean up scripts Remove redundant ml_ prefix from ml_workflow chapter files and update all Quarto config references. Consolidate custom scripts into native binder subcommands and archive obsolete tooling.	2026-02-13 11:06:28 -05:00
Vijay Janapa Reddi	acd571095a	fix(binder): include all label types by default, enable pattern checks - Default label types now include Equation (was missing from default set) - --check-patterns now defaults to True for inline-refs - Removed redundant --all-types from VSCode extension command All five label types (Figure, Table, Section, Equation, Listing) are now always checked unless explicitly filtered with --figures/--tables/etc.	2026-02-12 23:46:58 -05:00
Vijay Janapa Reddi	e41c2af2b7	fix(binder): update fix command prog name in argparse	2026-02-12 23:44:55 -05:00
Vijay Janapa Reddi	a0a7f7c658	feat(binder): restructure CLI into check/fix/format hierarchy Reorganize binder commands into a clean three-verb quality system: check — grouped validation (refs, labels, headers, footnotes, figures, rendering) with --scope for granularity fix — content management (headers, footnotes, glossary, images) format — auto-formatters (blanks, python, lists, divs, tables) Key changes: - validate → check (with backward-compat alias) - maintain → fix (with backward-compat alias) - 17 flat checks grouped into 6 semantic categories - --scope flag narrows to individual checks within a group - New FormatCommand with native blanks/lists + script delegation - Updated pre-commit hooks, VSCode extension, and help output	2026-02-12 23:37:56 -05:00
Vijay Janapa Reddi	8caeac9cc7	refactor(binder): rename validate/maintain subcommands for clarity Rename verbose compound names to clean, noun-based names: - section-ids → headers - forbidden-footnotes → footnote-placement - footnotes → footnote-refs - figure-completeness → figures - figure-placement → float-flow - index-placement → indexes - render-patterns → rendering - dropcap → dropcaps - part-keys → parts - image-refs → images Updated in: validate.py, maintenance.py, pre-commit hooks, VSCode extension.	2026-02-12 23:26:17 -05:00
Vijay Janapa Reddi	755e4cc6a6	feat(binder): consolidate 18 custom scripts into native binder subcommands Port all custom validation and maintenance scripts into the binder CLI as native subcommands, eliminating the need for standalone scripts. New `binder validate` subcommands (10): - section-ids: verify all headers have {#sec-...} IDs - forbidden-footnotes: check footnotes in tables/captions/divs - footnotes: validate footnote refs/defs (undefined, unused, duplicate) - figure-completeness: check figures have captions and alt-text - figure-placement: audit figure/table proximity to first reference - index-placement: check LaTeX \index{} placement - render-patterns: detect problematic rendering patterns - dropcap: validate drop cap compatibility - part-keys: validate \part{key:...} against summaries.yml - image-refs: validate image references exist on disk New `binder maintain` subcommands (2): - section-ids (add/repair/list/remove): full section ID lifecycle - footnotes (cleanup/reorganize/remove): footnote management Updated 11 pre-commit hooks to use binder commands instead of scripts. Updated VSCode extension commands to use binder CLI. All validators verified against original script output (parity confirmed).	2026-02-12 23:20:54 -05:00
Vijay Janapa Reddi	6b4af17b8f	feat(vscode-ext): workspace-wide cross-reference validation and section ID commands - Add WorkspaceLabelIndex: scans all .qmd files on activation, updates incrementally on save, provides hasLabel() for cross-file validation - Extend QmdDiagnosticsManager to validate references against workspace index (not just current file); triggered on save only, not keystrokes - Add broken reference decoration (red wavy underline) in chunk highlighter for refs that don't resolve to any label in the workspace - Add commands: Add Missing Section IDs, Verify Section IDs, Validate Cross-References (command palette) - Enable diagnostics by default (save-triggered, not noisy) - Support YAML-style label definitions (#\| label:, #\| fig-label:, etc.)	2026-02-12 22:53:50 -05:00
Vijay Janapa Reddi	d39ff325c0	feat(content): add missing section IDs across Vol 1 and enforce via pre-commit - Run manage_section_ids.py to add 294 missing section IDs and standardize 62 non-conforming IDs to hierarchy-based format - Fix 12 double-hash bugs (e.g., -cbb8-cbb8 -> -cbb8) from script - All cross-references updated to match new IDs (19 refs across files) - Verified clean: check_unreferenced_labels.py and check_duplicate_labels.py pass - Add book-verify-section-ids pre-commit hook (runs --verify mode, ~1.5s)	2026-02-12 22:53:13 -05:00
Vijay Janapa Reddi	f75bd2e490	fix(vscode-ext): force matplotlib Agg backend more aggressively Use force=True, switch_backend(), and close("all") after execution to prevent plot windows from appearing during Python value resolution.	2026-02-12 21:57:31 -05:00
Vijay Janapa Reddi	1995a17d6b	feat(vscode-ext): add inline Python value resolution and reduce visual noise - Add Python value resolver that executes QMD code blocks and resolves inline `{python} var` references to their actual values - Display resolved values as ghost text, hover tooltips, and CodeLens - Unwrap IPython.display.Markdown objects (.data) and display_value dicts for readable output instead of object reprs - Suppress matplotlib plot windows during resolution (Agg backend) - Split inline Python highlighting into keyword + variable decorations - Disable callout/div body and table region background highlights by default (too noisy); fence markers and cross-references still colored - Add settings: showInlinePythonValues, showInlinePythonCodeLens, highlightCalloutBackground, highlightTables (default: false)	2026-02-12 21:53:39 -05:00
Vijay Janapa Reddi	df8c3174e5	Refactor: centralize Quarto shared config and streamline PDF build fragments. Moves common diagram and PDF title/build settings into shared metadata layers, simplifies per-volume configs to keep only volume-specific values, and carries related chapter figure text/asset updates needed in the current working set.	2026-02-12 20:14:31 -05:00
Vijay Janapa Reddi	2390c3ab31	Refactor: consolidate Quarto config layers and content reorganization. Unifies Quarto metadata into shared base/format/volume fragments while carrying through chapter path, asset, and tooling updates to keep the repository consistent and easier to maintain.	2026-02-12 15:38:55 -05:00
Vijay Janapa Reddi	b3e86c5cc6	Removes image backup files. Deletes image backup files from different timestamps. These files are no longer needed and are removed to clean up the repository.	2026-02-12 06:32:59 -05:00
Vijay Janapa Reddi	d9cb03cf38	Refactor: Systematic Goal/Show/How header audit for Volume 1 - Completed full standardization of 150+ calculation headers across all 16 Volume 1 chapters. - Replaced legacy 'Why:' blocks with the 'Goal/Show/How' documentation pattern. - Finalized P.I.C.O. class refactors for complex cells in frameworks and serving. - Verified header consistency across introduction, ml_systems, training, and optimizations. - Performed minor stabilization in book/vscode-ext logic.	2026-02-11 21:33:27 -05:00
Vijay Janapa Reddi	c015b9d80a	Refactor: stabilize non-PDF build workflows and semantic editor cues. Standardize Quarto config/style handling for HTML/EPUB volume builds, add explicit binder reset commands by format, and align QMD reference/label highlighting so structural tokens share consistent visual semantics.	2026-02-11 20:36:16 -05:00
Vijay Janapa Reddi	ce68808185	Fix: make pipe table prettifier apply visible alignment changes Treat internal spacing changes as real formatting differences and normalize separator padding so table prettification is applied consistently. Save files before running pre-commit fixers from the extension so results match editor state.	2026-02-11 18:46:18 -05:00
Vijay Janapa Reddi	c16333cbad	Refactor: Finalize Volume 2 P.I.C.O. refactor and TikZ standardization - Refactored ops_scale.qmd to P.I.C.O. pattern and standardized constants. - Standardized TikZ colors in storage.qmd (fig-storage-hierarchy) and distributed_training.qmd. - Verified elimination of magic numbers (1e9, 3600, etc.) across all Volume 2 Python cells. - Completed full conversion of narrative guards to check() helper in Volume 2.	2026-02-11 17:34:55 -05:00
Vijay Janapa Reddi	66c4970c51	Refactor: standardize all appendix files across Volume 1 and 2 - Refactored appendix_algorithm.qmd, appendix_data.qmd, and appendix_dam.qmd to P.I.C.O. pattern. - Integrated standardized constants (BILLION, THOUSAND, SEC_PER_HOUR) in all appendix calculations. - Simplified narrative guards in appendices using the check() helper. - Verified backmatter consistency across both volumes.	2026-02-11 16:19:14 -05:00
Vijay Janapa Reddi	9a1e9a53cd	Refactor: standardize Volume 2 constants and move checkpoint math - Refactored Volume 2 core chapters (Intro, Dist Training, Storage) to P.I.C.O. pattern. - Moved Young-Daly checkpoint optimization logic from Storage to Distributed Training. - Eliminated magic numbers (1e9, 3600, etc.) from all Volume 2 calculation cells. - Implemented central 'chapter-start' cells with shared imports for consistency with Vol 1.	2026-02-11 16:05:04 -05:00
Vijay Janapa Reddi	77f478c1b5	Refactor: standardize constants, simplify invariants, and polish Vol 1 docs - Standardized time (SEC_PER_HOUR, SEC_PER_DAY), memory (KIB_TO_BYTES), and scale (BILLION, TRILLION) constants in mlsys/constants.py. - Refactor reordered time constants to fix NameError in constants.py. - Refactored 90+ narrative guards across Volume 1 to use the simplified check() helper. - Eliminated remaining magic numbers (1024, 3600, 1e9, etc.) from Python calculation cells. - Adopted 'Goal / Show / How' documentation structure for key P.I.C.O. scenarios. - Performed minor stabilization and feature polish in book/vscode-ext.	2026-02-11 15:57:00 -05:00
Vijay Janapa Reddi	abe634ead2	Refactor: unify binder workflows and polish authoring UX Consolidate Binder/extension command behavior, improve chapter navigation and QMD editor ergonomics, and carry forward Volume 1 content updates so build/debug and writing workflows stay aligned.	2026-02-11 12:34:06 -05:00
Vijay Janapa Reddi	ff3797a1d8	Refactor: Finalize Volume 1 and update CLI/VSCode tooling - Completed full Volume 1 refactor to Safe Class Namespace pattern. - Fixed render errors and verified all 16 chapters. - Updated 'binder' CLI with native validation and maintenance namespaces. - Enhanced VS Code extension with Chapter Navigator and Run History. - Integrated 'binder validate' into pre-commit workflows.	2026-02-11 09:25:50 -05:00
Vijay Janapa Reddi	41ec86ba10	Refactor: Fix Render Errors in Vol 1 - ml_systems.qmd: Fixed NameError in CloudEdgeTCO/EdgeSizing exports (replaced undefined '_value' vars with class attributes). - training.qmd: Fixed IndentationError in TrainingDimensions class.	2026-02-11 09:08:06 -05:00
Vijay Janapa Reddi	ce867fe486	Refactor: Finalize Chapter 8 (Model Training) Migration - Refactored FlashAttentionSpeedup, GradientAccumulation, and TrainingCarbonFootprint to Safe Class Namespace pattern. - Added P.I.C.O. structure and invariant checks. - Exported variables for prose integration. - Fixed inline Python validation warnings (replaced LaTeX math with Unicode).	2026-02-11 08:37:28 -05:00

10330 Commits