Commit Graph

23 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
e3cc9f7af3 refactor: rename ml_workflow files, consolidate CLI, and clean up scripts
Remove redundant ml_ prefix from ml_workflow chapter files and update all
Quarto config references. Consolidate custom scripts into native binder
subcommands and archive obsolete tooling.
2026-02-13 11:06:28 -05:00
Vijay Janapa Reddi
2390c3ab31 Refactor: consolidate Quarto config layers and content reorganization.
Unifies Quarto metadata into shared base/format/volume fragments while carrying through chapter path, asset, and tooling updates to keep the repository consistent and easier to maintain.
2026-02-12 15:38:55 -05:00
Vijay Janapa Reddi
d9cb03cf38 Refactor: Systematic Goal/Show/How header audit for Volume 1
- Completed full standardization of 150+ calculation headers across all 16 Volume 1 chapters.
- Replaced legacy 'Why:' blocks with the 'Goal/Show/How' documentation pattern.
- Finalized P.I.C.O. class refactors for complex cells in frameworks and serving.
- Verified header consistency across introduction, ml_systems, training, and optimizations.
- Performed minor stabilization in book/vscode-ext logic.
2026-02-11 21:33:27 -05:00
Vijay Janapa Reddi
ce68808185 Fix: make pipe table prettifier apply visible alignment changes
Treat internal spacing changes as real formatting differences and normalize separator padding so table prettification is applied consistently. Save files before running pre-commit fixers from the extension so results match editor state.
2026-02-11 18:46:18 -05:00
Vijay Janapa Reddi
83ce92624e Editorial Corrections & Code Hardening (Volume 1)
This commit refactors the underlying Python calculation cells for Chapters 1-16
to strictly enforce mathematical consistency with the narrative.

**Key Text/Numeric Updates (For Editorial Review):**

1.  **Chapter 3 (Workflow) - Edge Necessity Scenario:**
    -   *Change:* Increased clinic patient count from **100** to **150**.
    -   *Reason:* With 100 patients, the calculated upload time was ~5.5 hours, which fits within the 8-hour clinic day, contradicting the chapter's conclusion that 'Edge is Mandatory.' Increasing to 150 pushes upload time to >8 hours, mathematically validating the narrative.

2.  **Chapter 1 (Introduction) - Model Drift Scenario:**
    -   *Change:* Reduced monthly accuracy drift rate from **8.0%** to **0.8%**.
    -   *Reason:* An 8% monthly drop is a catastrophic failure that would be immediately noticed. A 0.8% drop correctly models the 'silent failure' (boiling frog) scenario described in the text.

3.  **Chapter 3 (Workflow) - Velocity vs Quality:**
    -   *Change:* Reduced 'Large Model' accuracy gain per iteration from **0.5%** to **0.15%**.
    -   *Reason:* The original rate caused the large model to hit 99% accuracy almost instantly, invalidating the 'Velocity is a Feature' argument. The new rate correctly models diminishing returns, allowing the faster (small) model to win.

4.  **Chapter 15 (Responsible Engineering) - TCO Analysis:**
    -   *Verification:* Verified and stabilized the 3-year Total Cost of Ownership (TCO) calculations. Confirmed that Inference TCO (.5M) dominates Training TCO (8K) by ~40x, supporting the 'Efficiency as Responsibility' thesis.

**Technical Changes (Code Only):**
-   Refactored all calculation cells to use the **P.I.C.O. (Parameters, Invariants, Calculation, Outputs)** design pattern.
-   Added assertion guards (Invariants) to prevent future regressions where math contradicts prose.
-   Fixed variable scope issues in Chapter 10 (Model Compression) and Chapter 15.
-   Disabled false-positive linter warnings for standard LaTeX spacing.
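
The Chapter 3 edge-necessity change and the assertion guards listed above can be illustrated with a small P.I.C.O.-style cell. This is only a sketch, not the book's actual cell: the per-patient upload time is back-derived from the ~5.5 hours quoted for 100 patients, linear scaling with patient count is an assumption, and every name below is hypothetical.

```python
# Hypothetical P.I.C.O.-style cell (Parameters, Invariants, Calculation, Outputs).
# Assumption: upload time scales linearly with the number of patients.

# Parameters
HOURS_PER_100_PATIENTS = 5.5   # from the commit message (~5.5 h for 100 patients)
CLINIC_DAY_HOURS = 8.0         # from the commit message (8-hour clinic day)


def upload_hours(patients: int) -> float:
    # Calculation: scale the quoted 100-patient figure linearly.
    return patients * HOURS_PER_100_PATIENTS / 100


old_hours = upload_hours(100)
new_hours = upload_hours(150)

# Invariants: fail loudly if the math stops supporting the prose.
assert old_hours < CLINIC_DAY_HOURS, "100 patients should fit in a clinic day"
assert new_hours > CLINIC_DAY_HOURS, "narrative claims edge is mandatory"

# Outputs: a pre-formatted string for the narrative to reference.
new_hours_str = f"{new_hours:.2f}"
```

Under these assumptions the 150-patient case lands just above the 8-hour day, which is the condition the invariant protects.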
2026-02-10 14:59:26 -05:00
Vijay Janapa Reddi
4dd1bf70aa Fix pre-commit issues: cross-refs, footnotes, unreferenced tables, SVG hook
- Fix broken cross-refs in training.qmd (em-dash parsed as part of ID)
- Remove footnote from table cell in ml_systems.qmd
- Add @tbl- references for 22 unreferenced tables across 5 files
- Comment out stale SVG prevention hook in pre-commit config
- Auto-fixes from bibtex-tidy, blank-line collapse, pipe-table prettify
2026-02-09 07:57:16 -05:00
Vijay Janapa Reddi
3dbaa04ebf fix: resolve all pre-commit hook failures across Vol 1 and Vol 2
Content fixes:
- Add references for all 8 appendix_machine tables in surrounding prose
- Remove cross-volume refs (@sec-distributed-training, @sec-security-privacy)
  and replace with self-contained prose
- Fix broken cross-refs (em-dashes, @sec-data-engineering → @sec-data-engineering-ml)
- Fix unreferenced equations (@eq-memory-wall, @eq-training-iron-law)
- Fix nested/forbidden footnotes (hw_acceleration, introduction, dl_primer)
- Fix drop cap incompatibility in conclusion.qmd
- Fix codespell false positive ("trough" added to ignore list)
- Add closer @tbl/@fig references near definitions across all chapters
- Replace inline fmt() calls with pre-computed _str variables (dl_primer)

Checker improvements:
- figure_table_flow_audit.py: exclude code block lines from gap calculation,
  add forward-reference tolerance, broaden code block detection to all fenced
  blocks (tikz, etc.)
- check_render_patterns.py: improve $...$ parsing with shortest-match spans,
  add exponent exception for {python} in ^{...}, exit 0 on warnings-only
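
One way to read "shortest-match spans" for $...$ parsing is a non-greedy regular expression that also skips escaped dollar signs. The pattern below is an assumption for illustration, not the code in check_render_patterns.py, and it deliberately ignores display math ($$...$$).

```python
import re

# Assumed pattern: an unescaped "$", the shortest possible body, then the
# next unescaped "$". The lookbehinds skip "\$" (a literal dollar sign).
INLINE_MATH = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$")


def inline_math_spans(line: str) -> list[str]:
    """Return the bodies of inline math spans found on one line."""
    return INLINE_MATH.findall(line)


spans = inline_math_spans(r"Costs \$5, where $a+b$ and $c$ are variables.")
```

A greedy body (`.+` instead of `.+?`) would merge the two spans into one, which is the failure mode shortest-match parsing avoids.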
2026-02-08 02:01:49 -05:00
Vijay Janapa Reddi
3d54da6305 fix: resolve inline Python build errors across Vol 1 chapters
Fix NameError build failures in ml_systems, data_engineering, and
benchmarking chapters caused by missing imports and variables referenced
before their defining code cells.

- ml_systems: add missing Kparam and Bparam imports from physx.constants
- data_engineering: compute transfer_time_10g_md preview in setup cell,
  add md_math import, add deduplication-dividend-calc cell, convert
  hardcoded values to physics engine units
- benchmarking: compute BERT roofline preview values in roofline-example-calc
  cell before they are referenced in narrative text, convert hardcoded
  values to inline Python, condense redundant footnotes

Also includes physics engine integration improvements across all Vol 1
chapters: unit-safe conversions, inline Python for previously hardcoded
values, streamlined footnotes with cross-references, and new content
validation scripts.

All 21 Vol 1 chapters pass PDF build tests.
2026-02-06 09:57:25 -05:00
Vijay Janapa Reddi
e942b552ba fix: resolve cross-reference issues and add missing table/figure refs
- Update check_unreferenced_labels.py to detect YAML id: frontmatter
- Add references to all unreferenced tables and listings in Vol1
- Scope unreferenced labels hook to Vol1 only (Vol2 has WIP chapters)
- Fix inline Python in LaTeX math blocks across multiple chapters
- Update test_units.py to use Dense (not Sparse) H100 FLOPS values
- Update validate_inline_refs.py regex to ignore escaped dollar signs

Key files fixed:
- appendix_algorithm.qmd: @tbl-tensor-op-ref, @fig-broadcasting-rules
- appendix_data.qmd: @tbl-data-gravity, @tbl-serialization-cost
- appendix_dam.qmd: @tbl-dam-overlap, @tbl-bottleneck-actions, etc.
- appendix_machine.qmd: @tbl-latency-hierarchy, @tbl-hardware-cheatsheet
- frameworks.qmd: @lst-gradient-accumulation, @lst-custom-autograd-function
- dnn_architectures.qmd: @lst-conv_layer_spatial
2026-02-06 06:03:19 -05:00
Vijay Janapa Reddi
a75e8b80e5 Update book chapters and clean up testing artifacts
- Update all vol1 and vol2 chapter content with formatting improvements
- Add pre-commit hooks for additional validation checks
- Remove obsolete testing artifacts (appendix_dam, appendix_data, dl_primer, glossary)
- Add new testing logs for vol2 chapters and appendix_assumptions/notation
- Add utility scripts for table rendering checks and prettification
- Remove deprecated hw_acceleration.rmarkdown file
2026-02-01 23:28:30 -05:00
Vijay Janapa Reddi
8578982175 Update grid-to-pipe table converter with alignment support
- Properly preserves left/center/right alignment from grid tables
- Added --check mode for pre-commit warning
- Added book-check-grid-tables hook to warn about grid tables
- Grid tables should be converted to pipe for better inline Python support
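
As a rough illustration of alignment preservation: Pandoc grid tables mark alignment with colons on the `=` separator line under the header, and pipe tables use colons on the `-` separator row. The helper below is a hypothetical sketch of that one step, not the converter in the repository.

```python
def pipe_separator(grid_header_sep: str) -> str:
    """Turn a grid-table header separator into a pipe-table separator row.

    Example input: "+:=====+:====:+=====:+"
    """
    cells = grid_header_sep.strip().strip("+").split("+")
    parts = []
    for cell in cells:
        left = cell.startswith(":")    # colon on the left edge
        right = cell.endswith(":")     # colon on the right edge
        dashes = "-" * max(3, len(cell) - left - right)
        parts.append((":" if left else "") + dashes + (":" if right else ""))
    return "|" + "|".join(parts) + "|"


row = pipe_separator("+:=====+:====:+=====:+")
```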
2026-02-01 22:50:09 -05:00
Vijay Janapa Reddi
a85d513cd1 Fix list spacing: add before, remove between items
Updated fix_bullet_spacing.py to handle both cases:
1. Add blank line BEFORE lists (intro text followed by bullet)
2. Remove blank lines BETWEEN consecutive list items

Fixed 70 issues across 17 files in vol1, vol2, and frontmatter.
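
The two cases above can be sketched as a single pass over the lines of a file. This is an illustrative reimplementation, not the contents of fix_bullet_spacing.py; it assumes single-line `- ` or `* ` items and does not skip fenced code blocks.

```python
def is_bullet(line: str) -> bool:
    return line.lstrip().startswith(("- ", "* "))


def fix_bullet_spacing(lines: list[str]) -> list[str]:
    out: list[str] = []
    for i, line in enumerate(lines):
        if not line.strip():
            # Case 2: drop blank lines that sit between two list items.
            next_text = next((l for l in lines[i + 1:] if l.strip()), "")
            if out and is_bullet(out[-1]) and is_bullet(next_text):
                continue
        elif is_bullet(line) and out and out[-1].strip() and not is_bullet(out[-1]):
            # Case 1: intro text directly followed by a bullet.
            out.append("")
        out.append(line)
    return out


fixed = fix_bullet_spacing(["Intro:", "- a", "", "- b", "", "Next"])
```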
2026-02-01 22:21:32 -05:00
Vijay Janapa Reddi
f94e5514cf Add bullet spacing check to pre-commit hooks
- Updated fix_bullet_spacing.py with --check mode for CI validation
- Added book-fix-bullet-spacing hook to auto-fix missing blank lines
  before bullet lists during commits
- Script now provides clear error messages with line numbers
2026-02-01 22:18:19 -05:00
Vijay Janapa Reddi
6a343e8767 Add blank line before bullet lists for proper PDF rendering
Fixed 19 bullet lists across vol1 and vol2 that were missing the blank
line before the list starts. This ensures proper rendering in PDF/LaTeX.

Added fix_bullet_spacing.py utility script for automated detection
and fixing of this pattern.
2026-02-01 21:19:03 -05:00
Vijay Janapa Reddi
86d2e15372 Convert all appendix grid tables to pipe tables
- appendix_data.qmd: Data Gravity and Serialization tables
- appendix_dam.qmd: DAM Components, Troubleshooting, Tooling, Scorecard tables
- appendix_algorithm.qmd: Tensor Primitives table
- appendix_machine.qmd: Numerical Formats table

Pipe tables handle inline Python code better than grid tables.
Also adds utility script for future grid-to-pipe conversions.
2026-02-01 20:36:06 -05:00
Vijay Janapa Reddi
7ad6d51f96 Update two-volume textbook content, config, and tooling
- Edit all Vol 1 and Vol 2 chapters for print readiness and pedagogical clarity
- Update Quarto config files for both volumes (PDF, HTML, EPUB)
- Add frontmatter updates (about, acknowledgements, socratiq)
- Remove unused _brand assets (scss, favicon, scripts, manifest)
- Add new utility scripts (audit_figure_placement, format_div_spacing, audit_refs)
- Update format_python_in_qmd script
- Add references.bib entries and seminal papers corpus
2026-01-30 02:42:59 -05:00
Vijay Janapa Reddi
632c53f0a0 feat: add cross-chapter footnote audit utility
New script to identify duplicate footnote definitions across chapters.
Helps catch repeated definitions that should be differentiated or
consolidated for consistency.
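
A minimal sketch of such an audit, assuming Pandoc-style footnote definitions of the form `[^label]: text` at the start of a line. The function name and the in-memory `chapters` mapping are hypothetical; the real script presumably reads .qmd files from disk.

```python
import re
from collections import defaultdict

# A footnote definition starts a line with "[^label]:".
FOOTNOTE_DEF = re.compile(r"^\[\^([^\]]+)\]:", re.MULTILINE)


def duplicate_footnotes(chapters: dict[str, str]) -> dict[str, list[str]]:
    """Map each footnote label defined in more than one chapter to those chapters."""
    seen: dict[str, list[str]] = defaultdict(list)
    for name, text in chapters.items():
        for label in set(FOOTNOTE_DEF.findall(text)):
            seen[label].append(name)
    return {label: sorted(files) for label, files in seen.items() if len(files) > 1}


report = duplicate_footnotes({
    "a.qmd": "Text[^fn-x].\n\n[^fn-x]: One.\n",
    "b.qmd": "[^fn-x]: Two.\n[^fn-y]: Only here.\n",
})
```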
2026-01-24 11:18:24 -05:00
Vijay Janapa Reddi
74a6c4d760 fix: resolve pre-commit hook failures
- Add 'ure' to codespell ignore list (false positive from regex)
- Fix 17 broken cross-references pointing to non-existent sections
- Rename 4 duplicate figure labels in vol2 (interpretability-spectrum,
  train-data-parallelism, model-parallelism, layers-blocks)
- Remove missing bibliography reference from data_efficiency frontmatter
- Add missing footnote definition (fn-containerization-orchestration)
- Remove 5 unused footnote definitions
- Update validate_part_keys.py to support two-volume directory structure
2026-01-24 09:35:39 -05:00
Vijay Janapa Reddi
ab5b180fc5 Improves script execution from different directories
Updates the script to locate required files and directories relative to
where it is invoked, so it runs correctly from either the repository root
or the book/ directory.
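
One possible shape for this lookup, shown only as a sketch: the function name is hypothetical and the commit does not say how the script actually detects its location.

```python
from pathlib import Path


def find_book_dir(cwd: Path) -> Path:
    """Return the book/ directory when invoked from the repo root or from book/."""
    if (cwd / "book").is_dir():   # invoked from the repository root
        return cwd / "book"
    if cwd.name == "book":        # invoked from inside book/
        return cwd
    raise FileNotFoundError(f"cannot locate book/ from {cwd}")
```

A caller would typically pass `Path.cwd()` and build every other path from the returned directory.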
2026-01-07 11:57:47 -05:00
Vijay Janapa Reddi
9781727d60 refactor: rename advanced_intro to introduction and update scripts
- Renamed vol2/advanced_intro to vol2/introduction for consistency
- Updated all scripts and configs to use vol1/ instead of core/
- Updated pre-commit config to check all contents/ not just vol1/
- Updated path references in Lua filters, Python scripts, and configs
2026-01-01 14:46:52 -05:00
Vijay Janapa Reddi
853eb03ee8 style: apply consistent whitespace and formatting across codebase
2025-12-13 14:05:34 -05:00
Vijay Janapa Reddi
1cca4139f3 Fix pre-commit config paths after restructure
- Update format_tables.py to use workspace-relative path (quarto/contents/)
- Update validate_part_keys.py script to use book/quarto paths
- Scripts in book/tools/ that calculate workspace_root need paths relative to book/
- Other scripts need full book/quarto/contents/ paths
2025-12-05 14:16:13 -08:00
Vijay Janapa Reddi
7b92e11193 Repository Restructuring: Prepare for TinyTorch Integration (#1068)
* Restructure: Move book content to book/ subdirectory

- Move quarto/ → book/quarto/
- Move cli/ → book/cli/
- Move docker/ → book/docker/
- Move socratiQ/ → book/socratiQ/
- Move tools/ → book/tools/
- Move scripts/ → book/scripts/
- Move config/ → book/config/
- Move docs/ → book/docs/
- Move binder → book/binder

Git history fully preserved for all moved files.

Part of repository restructuring to support MLSysBook + TinyTorch.

Pre-commit hooks bypassed for this commit as paths need updating.

* Update pre-commit hooks for book/ subdirectory

- Update all quarto/ paths to book/quarto/
- Update all tools/ paths to book/tools/
- Update config/linting to book/config/linting
- Update project structure checks

Pre-commit hooks will now work with new directory structure.

* Update .gitignore for book/ subdirectory structure

- Update quarto/ paths to book/quarto/
- Update assets/ paths to book/quarto/assets/
- Maintain all existing ignore patterns

* Update GitHub workflows for book/ subdirectory

- Update all quarto/ paths to book/quarto/
- Update cli/ paths to book/cli/
- Update tools/ paths to book/tools/
- Update docker/ paths to book/docker/
- Update config/ paths to book/config/
- Maintain all workflow functionality

* Update CLI config to support book/ subdirectory

- Check for book/quarto/ path first
- Fall back to quarto/ for backward compatibility
- Maintain full CLI functionality
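
The lookup order described above could look roughly like this. It is a hedged sketch, not the CLI's actual config code, and `resolve_quarto_dir` is a hypothetical name.

```python
from pathlib import Path


def resolve_quarto_dir(repo_root: Path) -> Path:
    """Prefer book/quarto/, falling back to the legacy quarto/ layout."""
    candidates = (repo_root / "book" / "quarto", repo_root / "quarto")
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"no quarto directory found under {repo_root}")
```

Checking the new location first means a checkout that still contains a stale top-level quarto/ directory resolves to the restructured path.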

* Create new root and book READMEs for dual structure

- Add comprehensive root README explaining both projects
- Create book-specific README with quick start guide
- Document repository structure and navigation
- Prepare for TinyTorch integration
2025-12-05 14:04:21 -08:00