- book-prose: allow compound × for simple products; require × alone only when
followed by word/unit; Unicode × only in fig-alt
- Revert split × back to compound (e.g. $3 \times 10^{-4}$)
- data_engineering: 8× A100 → 8$\times$ A100 (LaTeX in table)
- appendix_dam: Python outputs use LaTeX ×
- hw_acceleration: table dimensions use compound math ($4\times4\times4$)
- benchmarking: fix Python equation string
Ensures cover images in Vol. 2 chapters fill the available width, improving visual presentation across different screen sizes.
Removes duplicate cover image from the introduction chapter.
Corrects a typographical error in Appendix Machine regarding energy ratios.
Refactors the build process to leverage shared output file resolution logic, ensuring consistency across build and debug commands.
Improves validation by streamlining bibliography handling and adding stricter citation matching.
Updates diagram dependencies and adjusts content for clarity and accuracy.
Figures should carry their caption only in the fig-cap attribute, not
duplicate it as trailing text. Removed redundant captions from:
- introduction.qmd: fig-loss-vs-n-d, fig-data-scaling-regimes, fig-scaling-regimes
- sustainable_ai.qmd: fig-datacenter-energy-usage
- Vol2 PDF: remove half-title, disable lof/lot/lol, toc/number-depth 2, align titlepage
- Vol2 about.qmd: add Beyond This Book and Using This Book supplementary blocks
Replace four distinct colors (Brown, Blue, Green, Red) with an ETHZ Blue
intensity gradient (25%→50%→75%→100%) in fig-fleet-stack and
fig-vol2-roadmap, matching the vol1 mlsysstack crimson gradient pattern.
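The gradient can be sketched as simple tints of one base color. The hex value below is an assumption, not necessarily the exact ETHZ Blue used in the figures, and `tint` is a hypothetical helper, not code from the repo:

```python
def tint(hex_color, fraction):
    """Blend a hex color toward white; fraction=1.0 keeps the full color."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    mix = lambda c: round(255 + (c - 255) * fraction)
    return f"#{mix(r):02X}{mix(g):02X}{mix(b):02X}"

eth_blue = "#215CAF"  # assumed ETHZ Blue hex; substitute the book's value
shades = [tint(eth_blue, f) for f in (0.25, 0.50, 0.75, 1.00)]
```

Using intensity steps of one hue, rather than four unrelated colors, keeps the stack figures consistent with the vol1 crimson pattern.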
- Vol 2: reference/citation at document end, sky-blue logo, cover image position
- Vol 1: match cover image position (bg-image-left 0.175, bg-image-bottom 8)
- Add recolor_cover_logo.py for hue-shift variants of cover logo
- Wrap Python figures in div with fig-env, fig-pos, fig-cap, fig-alt
- Remove #| label from blocks when div has #fig-xxx
- Add responsive figure CSS (max-width, height: auto) for HTML
- Add figure headers to Python blocks (Context, Goal, Show, How, Imports, Exports)
- Add Lua filter to convert <br> in table cells to \makecell for PDF
- Use pipe table with • and <br> in hw_acceleration hardware evolution table
- Add makecell package; set arraystretch to 1.6; top-align makecell cells
- Register filter in PDF config
Add Context/Goal/Show/How/Imports/Exports headers to all Python figure
blocks (#| label: fig-*) in Vol 1 and Vol 2, matching the setup-block
pattern. Headers placed after Quarto options and before imports.
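A minimal sketch of the resulting block layout (the header labels come from the commit; the cell body and export name are illustrative placeholders):

```python
#| label: fig-example
#| fig-cap: "Illustrative placeholder"
# Context: sketch of the six-header pattern (illustrative, not a real cell).
# Goal: show header placement after Quarto options and before imports.
# Show: a placeholder value standing in for actual figure code.
# How: plain comments in the order Context/Goal/Show/How/Imports/Exports.
# Imports: standard library only.
# Exports: fig_placeholder
import math

fig_placeholder = round(math.pi, 2)  # stands in for the actual figure code
```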
- extension.ts: wrap activate() body in try/catch so activation failures
surface in the Output channel instead of crashing silently
- workspace.ts: return undefined when no book/binder marker is found
instead of returning the first workspace folder unconditionally
After the class-based namespace isolation pass, missing EXPORTS bridge
variables were discovered by running all chapters through the HTML build pipeline.
Vol1 fixes:
- nn_computation: add hog_grid_str/hog_bins_str exports; convert generator
  expressions to for-loops (in Python 3, generator and comprehension scopes
  cannot see the enclosing class namespace);
  add mnist_large/small_l1/l2 exports for footnote inline Python
- ml_systems: add cloud_compute/memory/ai_frac, mobile_tops/bw/ratio/
bottleneck/compute/memory_frac, cloud_thresh_bw_str, edge_thresh_bw_str
exports; complete ResnetMobile EXPORTS section
- data_selection: fix FpScalingCalc invariant (raise min_samples_threshold
  50→150 so the 100 expected rare samples fall below the threshold, as the
  invariant asserts)
- model_compression: FusionCalc bandwidth_reduction invariant 50→40%
- nn_architectures: add 'param' unit to lighthouse-table-specs imports
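The generator-expression fix reflects a real Python 3 scoping rule: a comprehension or generator body gets its own scope, which skips the enclosing class namespace. A minimal illustration (class and variable names hypothetical):

```python
class Calc:
    base = [1, 2, 3]
    scale = 10
    # A comprehension here would raise NameError: its body scope skips
    # the class namespace, so `scale` would not be visible:
    #   scaled = [x * scale for x in base]
    scaled = []
    for x in base:            # a plain for-loop runs directly in class scope
        scaled.append(x * scale)
```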
Vol2 fixes:
- data_storage: add missing 'watt' import to chapter setup cell
- fault_tolerance: export per_node_gbs raw float for prose arithmetic
- appendix_fleet: export rho_7b raw float for fmt() call in prose
- appendix_c3: add .magnitude to calc_effective_flops() result (returns
Quantity since formulas.py upgrade, not raw float)
- appendix_reliability: wrap worked-example-young-daly in class with EXPORTS
All 43 chapters with Python cells verified passing after fixes.
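The appendix_c3 fix follows the usual pattern for unit-carrying values: once a formula returns a Quantity, callers must unwrap `.magnitude` before doing plain-float arithmetic. A sketch with a minimal stand-in class (not the real formulas.py or pint API):

```python
class Quantity:
    """Minimal stand-in for a pint-style unit-carrying value (illustrative)."""
    def __init__(self, magnitude, units):
        self.magnitude, self.units = magnitude, units

def calc_effective_flops_sketch(peak_flops, utilization):
    # Hypothetical: returns a Quantity, as after the formulas.py upgrade
    return Quantity(peak_flops * utilization, "flops")

effective = calc_effective_flops_sketch(1e15, 0.4).magnitude  # unwrap the float
```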
- hw_acceleration: escape % in callout title 'The Five-Percent Utilization Mystery'
(LaTeX treats % as comment char in div attribute titles, truncating the box)
- data_selection: escape % in callout title 'The Ninety-Nine Percent Sparsity Trap'
(same \fbxSimple runaway argument error)
- model_compression: remove 28-line orphaned stale class body (merge artifact);
add missing mat_dim=4096 to LowRankFactorization class parameters
- model_serving: move littles-law-calc code cell before the prose that references
its exported variables (serving_qps_str etc. used before they were defined)
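The ordering matters because the cell derives its exports from Little's Law (L = λW): in-flight requests equal arrival rate times residence time. A hedged sketch of what such a cell computes (all numbers illustrative, not the book's):

```python
serving_qps = 1200.0   # assumed arrival rate λ, requests/s
latency_s = 0.050      # assumed mean residence time W, seconds
in_flight = serving_qps * latency_s       # Little's Law: L = λ · W
serving_qps_str = f"{serving_qps:,.0f}"   # export consumed later by prose
```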
Fixes a series of LaTeX/Pandoc compilation errors across Vol2 so every
chapter builds cleanly with `binder build pdf <chapter> --vol2 -v`.
Key fixes applied:
- Citations removed from fig-caps, table cells, and footnote definitions
(Quarto 1.8 `marginCitePlaceholderInlineWithProtection` bug with
`citation-location: margin`); citations restored to surrounding prose
- TikZ nodes with `\\` line breaks given `align=center/left` to exit
LR mode (robust_ai, sustainable_ai)
- `\argmax` → `\operatorname{arg\,max}` (undefined in amsmath)
- `\texorpdfstring` wrapping for math in section headers (notation)
- Multi-line `{python}` inline expressions in grid tables converted to
pipe tables (appendix_communication)
- Math expressions split across grid table row boundaries converted to
pipe tables to avoid `\{\beta\}\$` rendering corruption
- Stale class references (`ImageNetBottleneck`, `PrefetchBuffer`,
`CheckpointStorage`) fixed → `StorageEconomics.*` (data_storage)
- Missing `batch_per_gpu` factor in aggregate bandwidth formula (data_storage)
- Duplicate `xytext` keyword in `ax.annotate()` call (edge_intelligence)
- `<` HTML entity mixed with unescaped `$` in table cells fixed (security_privacy)
- Incorrect `check()` invariant corrected (appendix_fleet)
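The data_storage bandwidth fix can be sanity-checked with a back-of-envelope formula; all numbers below are illustrative assumptions, not the chapter's actual parameters:

```python
num_gpus = 8
batch_per_gpu = 32          # the factor that was missing from the formula
sample_bytes = 0.5e6        # assumed ~0.5 MB per training sample
step_time_s = 0.25          # assumed time per training step

# Aggregate read bandwidth the storage system must sustain (bytes/s):
aggregate_bw = num_gpus * batch_per_gpu * sample_bytes / step_time_s
```

Dropping `batch_per_gpu` understates the requirement by the batch size, here a factor of 32.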
Refines book abstracts, table of contents, and diagram configurations for improved clarity and structure.
This commit enhances the descriptions of both Volume I and Volume II, emphasizing their respective focuses. It also introduces a framework decision tree to guide the selection of parallel training strategies and inference frameworks, and diagrams for visualizing hardware constraints.
Creates a YAML configuration file specifically for generating the PDF version of Volume II: Machine Learning Systems at Scale.
This configuration defines the project structure, book metadata (title, author, abstract), chapter organization, and PDF-specific settings like cover page design, table of contents depth, and inclusion of LaTeX files for custom styling.
This allows for independent building and customization of the PDF output for Volume II.
Improves the data pipeline debugging flowchart by adding visual cues.
These cues help to highlight the type of data issue being investigated
and make the flowchart easier to understand.
Enhances the conclusion of Volume 1, improving clarity and flow by:
- Refining wording and structure for better readability
- Clarifying the connection between theoretical invariants and practical applications
- Adding information for clarity and context
Audits and refactors Volume 2 chapters to ensure all Python calculation cells adhere to the P.I.C.O. (Parameters, Invariants, Calculation, Outputs) standard.
- Consolidates storage specifications and economics into StorageSetup and StorageEconomics classes in data_storage.qmd.
- Refactors collective communication math into the AllReduceCost class in collective_communication.qmd.
- Standardizes infrastructure and performance engineering setups in compute_infrastructure.qmd and performance_engineering.qmd.
- Corrects NameErrors and missing imports in benchmarking and platform ROI calculations.
- Ensures all prose variables are correctly exported and scoped within Safe Class Namespaces to prevent global pollution and ensure mathematical consistency across the fleet-scale narrative.
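For context, the kind of cost model an AllReduceCost class typically encodes is the standard ring all-reduce bound; this is a sketch under standard assumptions, not the book's actual class:

```python
def ring_allreduce_seconds(size_bytes, n_gpus, link_bw_bytes_per_s):
    """Ring all-reduce: each GPU moves 2*(N-1)/N of the buffer over its link."""
    traffic = 2 * (n_gpus - 1) / n_gpus * size_bytes
    return traffic / link_bw_bytes_per_s
```

For a 1 GB gradient buffer across 8 GPUs on 100 GB/s links this gives 17.5 ms, independent of GPU count in the large-N limit.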
Audits all Volume 1 chapters to identify and repair structural errors in Python calculation cells introduced during the P.I.C.O. refactor.
- Consolidates redundant memory calculations and fixes missing imports in nn_computation.qmd.
- Refactors AttentionMemory in nn_architectures.qmd to resolve NameErrors and duplicated blocks.
- Cleans up QuantizationSpeedup and restores MobileNetCompressionAnchor in model_compression.qmd.
- Resolves missing Models and Hardware imports in benchmarking.qmd.
- Updates LighthouseModels in ml_systems.qmd with missing variables for MobileNet and KWS.
- Corrects indentation and structural integrity across all Volume 1 calculation scenarios to ensure valid rendering and mathematical consistency.
Restructures Volume II to improve narrative flow and address scale impediments, including reordering of sections and addition of introductory material.
Introduces a "Master Map" to guide readers through the volume's layered progression.
Adds callout notes to bridge concepts between sections.
Moves references.qmd to backmatter and adjusts chapter organization for clarity.
Updates hardware parameterization and network performance modeling within code blocks.
Deepens understanding of abstract principles by adding concrete examples and numerical anchors.
These additions provide tangible context, illustrating the practical implications of the concepts discussed and grounding the treatment of constraints, economics, and performance.
Updates concept map YAML files for various chapters in volume 1, including introduction, benchmarking, data engineering, data selection, frameworks, hardware acceleration, ML systems, MLOps, ML workflow, model serving, NN architectures, NN computation, optimizations, responsible engineering, and training.
Replaces the old YAML structure with a new structure focused on primary and secondary concepts, technical terms, methodologies, and formulas, emphasizing the core concepts and their relationships within each chapter. The generated dates are updated to reflect a future date.
Insert thesis declarations, spine reconnections, and evidence elevations
that make the book's central claim explicit: ML systems engineering is a
distinct discipline governed by permanent physical laws. No restructuring
or deletions; insertions only, matching the surrounding rhetorical register.