Remove redundant ml_ prefix from ml_workflow chapter files and update all
Quarto config references. Consolidate custom scripts into native binder
subcommands and archive obsolete tooling.
Unifies Quarto metadata into shared base/format/volume fragments while carrying through chapter path, asset, and tooling updates to keep the repository consistent and easier to maintain.
- Completed full standardization of 150+ calculation headers across all 16 Volume 1 chapters.
- Replaced legacy 'Why:' blocks with the 'Goal/Show/How' documentation pattern.
- Finalized P.I.C.O. class refactors for complex cells in frameworks and serving.
- Verified header consistency across introduction, ml_systems, training, and optimizations.
- Performed minor stabilization in book/vscode-ext logic.
Treat internal spacing changes as real formatting differences and normalize separator padding so table prettification is applied consistently. Save files before running pre-commit fixers from the extension so results match editor state.
This commit refactors the underlying Python calculation cells for Chapters 1-16
to strictly enforce mathematical consistency with the narrative.
**Key Text/Numeric Updates (For Editorial Review):**
1. **Chapter 3 (Workflow) - Edge Necessity Scenario:**
- *Change:* Increased clinic patient count from **100** to **150**.
- *Reason:* With 100 patients, the calculated upload time was ~5.5 hours, which fits within the 8-hour clinic day, contradicting the chapter's conclusion that 'Edge is Mandatory.' Increasing to 150 pushes upload time to >8 hours, mathematically validating the narrative.
2. **Chapter 1 (Introduction) - Model Drift Scenario:**
- *Change:* Reduced monthly accuracy drift rate from **8.0%** to **0.8%**.
- *Reason:* An 8% monthly drop is a catastrophic failure that would be immediately noticed. A 0.8% drop correctly models the 'silent failure' (boiling frog) scenario described in the text.
3. **Chapter 3 (Workflow) - Velocity vs Quality:**
- *Change:* Reduced 'Large Model' accuracy gain per iteration from **0.5%** to **0.15%**.
- *Reason:* The original rate caused the large model to hit 99% accuracy almost instantly, invalidating the 'Velocity is a Feature' argument. The new rate correctly models diminishing returns, allowing the faster (small) model to win.
4. **Chapter 15 (Responsible Engineering) - TCO Analysis:**
- *Verification:* Verified and stabilized the 3-year Total Cost of Ownership (TCO) calculations. Confirmed that Inference TCO (.5M) dominates Training TCO (8K) by ~40x, supporting the 'Efficiency as Responsibility' thesis.
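
The Chapter 3 patient-count change can be sanity-checked with simple arithmetic. The sketch below is not the chapter's actual calculation cell. It assumes upload time scales linearly with patient count and derives a per-patient figure from the stated ~5.5 hours for 100 patients; `upload_hours` and the constants are hypothetical names. The assertions play the role of the invariant guards described under Technical Changes.

```python
# Hypothetical check; the linear model and all names are illustrative assumptions.

# Parameters
CLINIC_DAY_HOURS = 8.0
HOURS_PER_PATIENT = 5.5 / 100  # assumed: derived from ~5.5 h for 100 patients


def upload_hours(patients: int) -> float:
    """Total upload time, assuming it scales linearly with patient count."""
    return patients * HOURS_PER_PATIENT


# Calculation
old_hours = upload_hours(100)
new_hours = upload_hours(150)

# Invariants: the numbers must agree with the narrative.
assert old_hours < CLINIC_DAY_HOURS  # 100 patients fit within the clinic day
assert new_hours > CLINIC_DAY_HOURS  # 150 patients do not

# Outputs
print(f"100 patients: {old_hours:.2f} h, 150 patients: {new_hours:.2f} h")
```

If a later edit lowers the patient count again, the second assertion fails at build time instead of silently contradicting the prose.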
**Technical Changes (Code Only):**
- Refactored all calculation cells to use the **P.I.C.O. (Parameters, Invariants, Calculation, Outputs)** design pattern.
- Added assertion guards (Invariants) to prevent future regressions where math contradicts prose.
- Fixed variable scope issues in Chapter 10 (Model Compression) and Chapter 15.
- Disabled false-positive linter warnings for standard LaTeX spacing.
- Fix broken cross-refs in training.qmd (em-dash parsed as part of ID)
- Remove footnote from table cell in ml_systems.qmd
- Add @tbl- references for 22 unreferenced tables across 5 files
- Comment out stale SVG prevention hook in pre-commit config
- Auto-fixes from bibtex-tidy, blank-line collapse, pipe-table prettify
Fix NameError build failures in ml_systems, data_engineering, and
benchmarking chapters caused by missing imports and variables referenced
before their defining code cells.
- ml_systems: add missing Kparam and Bparam imports from physx.constants
- data_engineering: compute transfer_time_10g_md preview in setup cell,
  add md_math import, add deduplication-dividend-calc cell, convert
  hardcoded values to physics engine units
- benchmarking: compute BERT roofline preview values in roofline-example-calc
  cell before they are referenced in narrative text, convert hardcoded
  values to inline Python, condense redundant footnotes
Also includes physics engine integration improvements across all Vol 1
chapters: unit-safe conversions, inline Python for previously hardcoded
values, streamlined footnotes with cross-references, and new content
validation scripts.
All 21 Vol 1 chapters pass PDF build tests.
- Properly preserves left/center/right alignment from grid tables
- Added --check mode for pre-commit warning
- Added book-check-grid-tables hook to warn about grid tables
- Grid tables should be converted to pipe tables for better inline Python support
Updated fix_bullet_spacing.py to handle both cases:
1. Add blank line BEFORE lists (intro text followed by bullet)
2. Remove blank lines BETWEEN consecutive list items
Fixed 70 issues across 17 files in vol1, vol2, and frontmatter.
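
The two cases handled by fix_bullet_spacing.py can be sketched roughly as follows. This is a minimal illustration, not the script itself: the function name, the regex, and the line-list interface are assumptions, and the sketch ignores wrapped continuation lines, nested lists, and fenced code blocks.

```python
import re

# Assumed bullet pattern: "-", "*", "+", or "1." followed by whitespace.
BULLET = re.compile(r"^\s*([-*+]|\d+\.)\s+")


def fix_bullet_spacing(lines: list[str]) -> list[str]:
    """Return a copy of `lines` with both spacing cases normalized."""
    out: list[str] = []
    for i, line in enumerate(lines):
        is_bullet = bool(BULLET.match(line))

        # Case 1: non-blank, non-bullet text directly followed by a bullet
        # gets a blank line inserted before the list.
        if is_bullet and out and out[-1].strip() and not BULLET.match(out[-1]):
            out.append("")

        # Case 2: a blank line sitting between two list items is dropped.
        if not line.strip():
            prev_is_bullet = bool(out) and bool(BULLET.match(out[-1]))
            next_text = next((l for l in lines[i + 1:] if l.strip()), None)
            if prev_is_bullet and next_text is not None and BULLET.match(next_text):
                continue

        out.append(line)
    return out
```

For example, `["Intro:", "- a", "", "- b"]` becomes `["Intro:", "", "- a", "- b"]`.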
- Updated fix_bullet_spacing.py with --check mode for CI validation
- Added book-fix-bullet-spacing hook to auto-fix missing blank lines
  before bullet lists during commits
- Script now provides clear error messages with line numbers
Fixed 19 bullet lists across vol1 and vol2 that were missing the blank
line before the list starts. This ensures proper rendering in PDF/LaTeX.
Added fix_bullet_spacing.py utility script for automated detection
and fixing of this pattern.
New script to identify duplicate footnote definitions across chapters.
Helps catch repeated definitions that should be differentiated or
consolidated for consistency.
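
A duplicate check of this kind might look like the sketch below. The commit does not say whether the script matches on footnote ID or on definition text; this sketch assumes Pandoc-style `[^id]: text` definitions matched by ID, and `find_duplicate_footnotes` is a hypothetical name.

```python
import re
from collections import defaultdict

# Matches a footnote definition at the start of a line, e.g. "[^fn-gpu]: ...".
FOOTNOTE_DEF = re.compile(r"^\[\^([^\]]+)\]:", re.MULTILINE)


def find_duplicate_footnotes(chapters: dict[str, str]) -> dict[str, list[str]]:
    """Map each footnote ID defined more than once to the files defining it.

    `chapters` maps a file name to that file's full text.
    """
    seen: dict[str, list[str]] = defaultdict(list)
    for name, text in chapters.items():
        for match in FOOTNOTE_DEF.finditer(text):
            seen[match.group(1)].append(name)
    return {fid: files for fid, files in seen.items() if len(files) > 1}
```

Inline references such as `text[^fn-gpu]` are not counted, since only definitions anchored at the start of a line match.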
Updates the script to locate required files and directories
relative to the script's execution point, so it runs correctly
whether invoked from the repository root or the book/ directory.
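
The commit does not show how paths are resolved. One common way to make a script independent of the caller's working directory is to anchor on the script file's own location and walk upward to a known marker directory. The sketch below assumes that approach; `find_book_root` and the `quarto` marker are hypothetical.

```python
from pathlib import Path


def find_book_root(script_path: str, marker: str = "quarto") -> Path:
    """Walk up from the script's own directory to the first one containing `marker`."""
    start = Path(script_path).resolve().parent
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"no directory containing {marker!r} above {start}")
```

A script would call `find_book_root(__file__)` once at startup and build every other path from the result, so the current working directory never enters the lookup.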
- Renamed vol2/advanced_intro to vol2/introduction for consistency
- Updated all scripts and configs to use vol1/ instead of core/
- Updated pre-commit config to check all contents/ not just vol1/
- Updated path references in Lua filters, Python scripts, and configs
- Update format_tables.py to use workspace-relative path (quarto/contents/)
- Update validate_part_keys.py script to use book/quarto paths
- Scripts in book/tools/ that calculate workspace_root need paths relative to book/
- Other scripts need full book/quarto/contents/ paths
* Restructure: Move book content to book/ subdirectory
- Move quarto/ → book/quarto/
- Move cli/ → book/cli/
- Move docker/ → book/docker/
- Move socratiQ/ → book/socratiQ/
- Move tools/ → book/tools/
- Move scripts/ → book/scripts/
- Move config/ → book/config/
- Move docs/ → book/docs/
- Move binder → book/binder
Git history fully preserved for all moved files.
Part of repository restructuring to support MLSysBook + TinyTorch.
Pre-commit hooks bypassed for this commit as paths need updating.
* Update pre-commit hooks for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update all tools/ paths to book/tools/
- Update config/linting to book/config/linting
- Update project structure checks
Pre-commit hooks will now work with new directory structure.
* Update .gitignore for book/ subdirectory structure
- Update quarto/ paths to book/quarto/
- Update assets/ paths to book/quarto/assets/
- Maintain all existing ignore patterns
* Update GitHub workflows for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update cli/ paths to book/cli/
- Update tools/ paths to book/tools/
- Update docker/ paths to book/docker/
- Update config/ paths to book/config/
- Maintain all workflow functionality
* Update CLI config to support book/ subdirectory
- Check for book/quarto/ path first
- Fall back to quarto/ for backward compatibility
- Maintain full CLI functionality
* Create new root and book READMEs for dual structure
- Add comprehensive root README explaining both projects
- Create book-specific README with quick start guide
- Document repository structure and navigation
- Prepare for TinyTorch integration
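
The CLI config change listed above (check book/quarto/ first, fall back to quarto/) amounts to an ordered path lookup. A minimal sketch, assuming the lookup is a plain directory check; `resolve_quarto_dir` is a hypothetical name, not the CLI's actual function:

```python
from pathlib import Path


def resolve_quarto_dir(repo_root: Path) -> Path:
    """Prefer the new book/quarto/ layout, fall back to the legacy quarto/."""
    for relative in ("book/quarto", "quarto"):
        candidate = repo_root / relative
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"no quarto directory found under {repo_root}")
```

Because the new layout is listed first, a checkout that contains both directories resolves to book/quarto/.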