cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-03 00:07:08 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	e3cc9f7af3	refactor: rename ml_ml_workflow files, consolidate CLI, and clean up scripts Remove redundant ml_ prefix from ml_workflow chapter files and update all Quarto config references. Consolidate custom scripts into native binder subcommands and archive obsolete tooling.	2026-02-13 11:06:28 -05:00
Vijay Janapa Reddi	2390c3ab31	Refactor: consolidate Quarto config layers and content reorganization. Unifies Quarto metadata into shared base/format/volume fragments while carrying through chapter path, asset, and tooling updates to keep the repository consistent and easier to maintain.	2026-02-12 15:38:55 -05:00
Vijay Janapa Reddi	d9cb03cf38	Refactor: Systematic Goal/Show/How header audit for Volume 1 - Completed full standardization of 150+ calculation headers across all 16 Volume 1 chapters. - Replaced legacy 'Why:' blocks with the 'Goal/Show/How' documentation pattern. - Finalized P.I.C.O. class refactors for complex cells in frameworks and serving. - Verified header consistency across introduction, ml_systems, training, and optimizations. - Performed minor stabilization in book/vscode-ext logic.	2026-02-11 21:33:27 -05:00
Vijay Janapa Reddi	ce68808185	Fix: make pipe table prettifier apply visible alignment changes Treat internal spacing changes as real formatting differences and normalize separator padding so table prettification is applied consistently. Save files before running pre-commit fixers from the extension so results match editor state.	2026-02-11 18:46:18 -05:00
Vijay Janapa Reddi	83ce92624e	Editorial Corrections & Code Hardening (Volume 1) This commit refactors the underlying Python calculation cells for Chapters 1-16 to strictly enforce mathematical consistency with the narrative. Key Text/Numeric Updates (For Editorial Review): 1. Chapter 3 (Workflow) - Edge Necessity Scenario: - Change: Increased clinic patient count from 100 to 150. - Reason: With 100 patients, the calculated upload time was ~5.5 hours, which fits within the 8-hour clinic day, contradicting the chapter's conclusion that 'Edge is Mandatory.' Increasing to 150 pushes upload time to >8 hours, mathematically validating the narrative. 2. Chapter 1 (Introduction) - Model Drift Scenario: - Change: Reduced monthly accuracy drift rate from 8.0% to 0.8%. - Reason: An 8% monthly drop is a catastrophic failure that would be immediately noticed. A 0.8% drop correctly models the 'silent failure' (boiling frog) scenario described in the text. 3. Chapter 3 (Workflow) - Velocity vs Quality: - Change: Reduced 'Large Model' accuracy gain per iteration from 0.5% to 0.15%. - Reason: The original rate caused the large model to hit 99% accuracy almost instantly, invalidating the 'Velocity is a Feature' argument. The new rate correctly models diminishing returns, allowing the faster (small) model to win. 4. Chapter 15 (Responsible Engineering) - TCO Analysis: - Verification: Verified and stabilized the 3-year Total Cost of Ownership (TCO) calculations. Confirmed that Inference TCO (.5M) dominates Training TCO (8K) by ~40x, supporting the 'Efficiency as Responsibility' thesis. Technical Changes (Code Only): - Refactored all calculation cells to use the P.I.C.O. (Parameters, Invariants, Calculation, Outputs) design pattern. - Added assertion guards (Invariants) to prevent future regressions where math contradicts prose. - Fixed variable scope issues in Chapter 10 (Model Compression) and Chapter 15. - Disabled false-positive linter warnings for standard LaTeX spacing.	2026-02-10 14:59:26 -05:00
Vijay Janapa Reddi	4dd1bf70aa	Fix pre-commit issues: cross-refs, footnotes, unreferenced tables, SVG hook - Fix broken cross-refs in training.qmd (em-dash parsed as part of ID) - Remove footnote from table cell in ml_systems.qmd - Add @tbl- references for 22 unreferenced tables across 5 files - Comment out stale SVG prevention hook in pre-commit config - Auto-fixes from bibtex-tidy, blank-line collapse, pipe-table prettify	2026-02-09 07:57:16 -05:00
Vijay Janapa Reddi	3dbaa04ebf	fix: resolve all pre-commit hook failures across Vol 1 and Vol 2 Content fixes: - Add references for all 8 appendix_machine tables in surrounding prose - Remove cross-volume refs (@sec-distributed-training, @sec-security-privacy) and replace with self-contained prose - Fix broken cross-refs (em-dashes, @sec-data-engineering → @sec-data-engineering-ml) - Fix unreferenced equations (@eq-memory-wall, @eq-training-iron-law) - Fix nested/forbidden footnotes (hw_acceleration, introduction, dl_primer) - Fix drop cap incompatibility in conclusion.qmd - Fix codespell false positive ("trough" added to ignore list) - Add closer @tbl/@fig references near definitions across all chapters - Replace inline fmt() calls with pre-computed _str variables (dl_primer) Checker improvements: - figure_table_flow_audit.py: exclude code block lines from gap calculation, add forward-reference tolerance, broaden code block detection to all fenced blocks (tikz, etc.) - check_render_patterns.py: improve $...$ parsing with shortest-match spans, add exponent exception for {python} in ^{...}, exit 0 on warnings-only	2026-02-08 02:01:49 -05:00
Vijay Janapa Reddi	3d54da6305	fix: resolve inline Python build errors across Vol 1 chapters Fix NameError build failures in ml_systems, data_engineering, and benchmarking chapters caused by missing imports and variables referenced before their defining code cells. - ml_systems: add missing Kparam and Bparam imports from physx.constants - data_engineering: compute transfer_time_10g_md preview in setup cell, add md_math import, add deduplication-dividend-calc cell, convert hardcoded values to physics engine units - benchmarking: compute BERT roofline preview values in roofline-example-calc cell before they are referenced in narrative text, convert hardcoded values to inline Python, condense redundant footnotes Also includes physics engine integration improvements across all Vol 1 chapters: unit-safe conversions, inline Python for previously hardcoded values, streamlined footnotes with cross-references, and new content validation scripts. All 21 Vol 1 chapters pass PDF build tests.	2026-02-06 09:57:25 -05:00
Vijay Janapa Reddi	e942b552ba	fix: resolve cross-reference issues and add missing table/figure refs - Update check_unreferenced_labels.py to detect YAML id: frontmatter - Add references to all unreferenced tables and listings in Vol1 - Scope unreferenced labels hook to Vol1 only (Vol2 has WIP chapters) - Fix inline Python in LaTeX math blocks across multiple chapters - Update test_units.py to use Dense (not Sparse) H100 FLOPS values - Update validate_inline_refs.py regex to ignore escaped dollar signs Key files fixed: - appendix_algorithm.qmd: @tbl-tensor-op-ref, @fig-broadcasting-rules - appendix_data.qmd: @tbl-data-gravity, @tbl-serialization-cost - appendix_dam.qmd: @tbl-dam-overlap, @tbl-bottleneck-actions, etc. - appendix_machine.qmd: @tbl-latency-hierarchy, @tbl-hardware-cheatsheet - frameworks.qmd: @lst-gradient-accumulation, @lst-custom-autograd-function - dnn_architectures.qmd: @lst-conv_layer_spatial	2026-02-06 06:03:19 -05:00
Vijay Janapa Reddi	a75e8b80e5	Update book chapters and clean up testing artifacts - Update all vol1 and vol2 chapter content with formatting improvements - Add pre-commit hooks for additional validation checks - Remove obsolete testing artifacts (appendix_dam, appendix_data, dl_primer, glossary) - Add new testing logs for vol2 chapters and appendix_assumptions/notation - Add utility scripts for table rendering checks and prettification - Remove deprecated hw_acceleration.rmarkdown file	2026-02-01 23:28:30 -05:00
Vijay Janapa Reddi	8578982175	Update grid-to-pipe table converter with alignment support - Properly preserves left/center/right alignment from grid tables - Added --check mode for pre-commit warning - Added book-check-grid-tables hook to warn about grid tables - Grid tables should be converted to pipe for better inline Python support	2026-02-01 22:50:09 -05:00
Vijay Janapa Reddi	a85d513cd1	Fix list spacing: add before, remove between items Updated fix_bullet_spacing.py to handle both cases: 1. Add blank line BEFORE lists (intro text followed by bullet) 2. Remove blank lines BETWEEN consecutive list items Fixed 70 issues across 17 files in vol1, vol2, and frontmatter.	2026-02-01 22:21:32 -05:00
Vijay Janapa Reddi	f94e5514cf	Add bullet spacing check to pre-commit hooks - Updated fix_bullet_spacing.py with --check mode for CI validation - Added book-fix-bullet-spacing hook to auto-fix missing blank lines before bullet lists during commits - Script now provides clear error messages with line numbers	2026-02-01 22:18:19 -05:00
Vijay Janapa Reddi	6a343e8767	Add blank line before bullet lists for proper PDF rendering Fixed 19 bullet lists across vol1 and vol2 that were missing the blank line before the list starts. This ensures proper rendering in PDF/LaTeX. Added fix_bullet_spacing.py utility script for automated detection and fixing of this pattern.	2026-02-01 21:19:03 -05:00
Vijay Janapa Reddi	86d2e15372	Convert all appendix grid tables to pipe tables - appendix_data.qmd: Data Gravity and Serialization tables - appendix_dam.qmd: DAM Components, Troubleshooting, Tooling, Scorecard tables - appendix_algorithm.qmd: Tensor Primitives table - appendix_machine.qmd: Numerical Formats table Pipe tables handle inline Python code better than grid tables. Also adds utility script for future grid-to-pipe conversions.	2026-02-01 20:36:06 -05:00
Vijay Janapa Reddi	7ad6d51f96	Update two-volume textbook content, config, and tooling - Edit all Vol 1 and Vol 2 chapters for print readiness and pedagogical clarity - Update Quarto config files for both volumes (PDF, HTML, EPUB) - Add frontmatter updates (about, acknowledgements, socratiq) - Remove unused _brand assets (scss, favicon, scripts, manifest) - Add new utility scripts (audit_figure_placement, format_div_spacing, audit_refs) - Update format_python_in_qmd script - Add references.bib entries and seminal papers corpus	2026-01-30 02:42:59 -05:00
Vijay Janapa Reddi	632c53f0a0	feat: add cross-chapter footnote audit utility New script to identify duplicate footnote definitions across chapters. Helps catch repeated definitions that should be differentiated or consolidated for consistency.	2026-01-24 11:18:24 -05:00
Vijay Janapa Reddi	74a6c4d760	fix: resolve pre-commit hook failures - Add 'ure' to codespell ignore list (false positive from regex) - Fix 17 broken cross-references pointing to non-existent sections - Rename 4 duplicate figure labels in vol2 (interpretability-spectrum, train-data-parallelism, model-parallelism, layers-blocks) - Remove missing bibliography reference from data_efficiency frontmatter - Add missing footnote definition (fn-containerization-orchestration) - Remove 5 unused footnote definitions - Update validate_part_keys.py to support two-volume directory structure	2026-01-24 09:35:39 -05:00
Vijay Janapa Reddi	ab5b180fc5	Improves script execution from different directories Updates the script to locate necessary files and directories relative to the script's execution point, allowing it to run correctly whether invoked from the root or book/ directory. This change enhances the script's flexibility and usability within the project's directory structure.	2026-01-07 11:57:47 -05:00
Vijay Janapa Reddi	9781727d60	refactor: rename advanced_intro to introduction and update scripts - Renamed vol2/advanced_intro to vol2/introduction for consistency - Updated all scripts and configs to use vol1/ instead of core/ - Updated pre-commit config to check all contents/ not just vol1/ - Updated path references in Lua filters, Python scripts, and configs	2026-01-01 14:46:52 -05:00
Vijay Janapa Reddi	853eb03ee8	style: apply consistent whitespace and formatting across codebase	2025-12-13 14:05:34 -05:00
Vijay Janapa Reddi	1cca4139f3	Fix pre-commit config paths after restructure - Update format_tables.py to use workspace-relative path (quarto/contents/) - Update validate_part_keys.py script to use book/quarto paths - Scripts in book/tools/ that calculate workspace_root need paths relative to book/ - Other scripts need full book/quarto/contents/ paths	2025-12-05 14:16:13 -08:00
Vijay Janapa Reddi	7b92e11193	Repository Restructuring: Prepare for TinyTorch Integration (#1068) * Restructure: Move book content to book/ subdirectory - Move quarto/ → book/quarto/ - Move cli/ → book/cli/ - Move docker/ → book/docker/ - Move socratiQ/ → book/socratiQ/ - Move tools/ → book/tools/ - Move scripts/ → book/scripts/ - Move config/ → book/config/ - Move docs/ → book/docs/ - Move binder → book/binder Git history fully preserved for all moved files. Part of repository restructuring to support MLSysBook + TinyTorch. Pre-commit hooks bypassed for this commit as paths need updating. * Update pre-commit hooks for book/ subdirectory - Update all quarto/ paths to book/quarto/ - Update all tools/ paths to book/tools/ - Update config/linting to book/config/linting - Update project structure checks Pre-commit hooks will now work with new directory structure. * Update .gitignore for book/ subdirectory structure - Update quarto/ paths to book/quarto/ - Update assets/ paths to book/quarto/assets/ - Maintain all existing ignore patterns * Update GitHub workflows for book/ subdirectory - Update all quarto/ paths to book/quarto/ - Update cli/ paths to book/cli/ - Update tools/ paths to book/tools/ - Update docker/ paths to book/docker/ - Update config/ paths to book/config/ - Maintain all workflow functionality * Update CLI config to support book/ subdirectory - Check for book/quarto/ path first - Fall back to quarto/ for backward compatibility - Maintain full CLI functionality * Create new root and book READMEs for dual structure - Add comprehensive root README explaining both projects - Create book-specific README with quick start guide - Document repository structure and navigation - Prepare for TinyTorch integration	2025-12-05 14:04:21 -08:00

23 Commits
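
Note on commit 83ce92624e: the message describes calculation cells organized as P.I.C.O. (Parameters, Invariants, Calculation, Outputs) with assertion guards that fail when the math contradicts the prose. The repository's actual cells are not shown in this log, so the following is only a minimal sketch of that idea applied to the Chapter 3 clinic scenario. All names are hypothetical, and the per-patient upload time is an assumption derived by scaling the "~5.5 hours for 100 patients" figure linearly.

```python
"""Hypothetical P.I.C.O.-style cell for the Chapter 3 edge-necessity scenario."""

# --- Parameters ---------------------------------------------------------
CLINIC_DAY_HOURS = 8.0  # "8-hour clinic day" from the commit message
# Assumption: upload time scales linearly with patient count, calibrated
# from the "~5.5 hours for 100 patients" figure in the commit message.
UPLOAD_HOURS_PER_PATIENT = 5.5 / 100


def upload_hours(patients: int) -> float:
    """Calculation: total upload time in hours for one day's patients."""
    return patients * UPLOAD_HOURS_PER_PATIENT


def edge_necessity_cell(patients: int) -> dict:
    """Run the cell: calculate, check the invariant, then format outputs."""
    # --- Calculation ----------------------------------------------------
    hours = upload_hours(patients)

    # --- Invariant ------------------------------------------------------
    # The narrative concludes that edge processing is mandatory, which only
    # holds if the upload cannot finish within the clinic day.
    assert hours > CLINIC_DAY_HOURS, (
        f"Upload takes {hours:.2f} h, which fits in a "
        f"{CLINIC_DAY_HOURS:.0f} h day; the prose would be contradicted."
    )

    # --- Outputs --------------------------------------------------------
    # Pre-computed string for use in inline text.
    return {"upload_hours": hours, "upload_hours_str": f"{hours:.2f}"}


if __name__ == "__main__":
    print(edge_necessity_cell(150)["upload_hours_str"])
```

Under these assumptions, 150 patients passes the guard (about 8.25 hours), while the earlier value of 100 patients trips the assertion (about 5.5 hours), which is the kind of regression the commit message says the invariants are meant to catch.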