Add proper equation labels ({#eq-...}) and prose references (@eq-...)
to 138 equations across 15 Volume 1 chapters following the gold-standard
pattern from serving.qmd.
Key changes:
- Label all display math equations with {#eq-kebab-case-name}
- Add @eq-name references in prose before each equation
- Equations include: Iron Law, Amdahl's Law, Roofline Model,
activation functions, backpropagation, attention mechanisms,
queuing theory, quantization, and system throughput formulas
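The gold-standard pattern pairs a prose reference with a labeled display equation. A minimal sketch (the label name here is illustrative, not one of the actual 138 labels):

```markdown
As @eq-amdahl-law shows, the serial fraction bounds achievable speedup:

$$
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
$$ {#eq-amdahl-law}
```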
Also includes:
- PDF formatting improvements (newpage directives for Vol 2)
- LaTeX header updates for chapter styling
- Pre-commit config and validation script updates
- Add large decorative chapter number in upper-right corner using TikZ
- Remove redundant "Chapter X" prefix (number serves this purpose)
- Rewrite dropcap filter to process elements in document order
(fixes bug where filter processed all headers before any paragraphs)
- Add PDF-conditional page break before Learning Objectives in intro
- Adjust section spacing for tighter layout
Design inspired by Harris & Harris "Digital Design and Computer Architecture"
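A minimal sketch of the TikZ overlay (command name assumed; the actual header wires this into the chapter title format, e.g. via titlesec):

```latex
% Place a large, light chapter number in the upper-right page corner.
% Requires two compilation passes (remember picture).
\usepackage{tikz}
\newcommand{\decorativechapternumber}{%
  \begin{tikzpicture}[remember picture, overlay]
    \node[anchor=north east, xshift=-0.75in, yshift=-0.75in,
          font=\fontsize{72}{72}\selectfont\bfseries, text=black!30]
      at (current page.north east) {\thechapter};
  \end{tikzpicture}}
```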
- ml_systems: move [^fn-dgx-spark-edge] footnote out of table cell
into the table caption text
- data_selection: rename fig-foundation-cost-data to foundation-cost-calc
(computation cell, not a figure)
- Auto-formatter fixes: collapse blank lines, prettify pipe tables
Fix NameError build failures in ml_systems, data_engineering, and
benchmarking chapters caused by missing imports and variables referenced
before their defining code cells.
- ml_systems: add missing Kparam and Bparam imports from physx.constants
- data_engineering: compute transfer_time_10g_md preview in setup cell,
add md_math import, add deduplication-dividend-calc cell, convert
hardcoded values to physics engine units
- benchmarking: compute BERT roofline preview values in roofline-example-calc
cell before they are referenced in narrative text, convert hardcoded
values to inline Python, condense redundant footnotes
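The "compute before narrate" fix can be sketched as a setup cell that derives the roofline preview values before any inline reference uses them. All numbers below are illustrative stand-ins, not the book's constants:

```python
# Illustrative roofline-example-calc cell: derive preview values first,
# so narrative text can reference them without NameError.
FLOPS_PER_INFERENCE = 22.5e9   # assumed: ~22.5 GFLOPs per BERT-base inference
BYTES_MOVED = 0.5e9            # assumed: ~0.5 GB of memory traffic
PEAK_FLOPS = 312e12            # assumed accelerator peak (FLOP/s)
PEAK_BW = 1.55e12              # assumed memory bandwidth (B/s)

arithmetic_intensity = FLOPS_PER_INFERENCE / BYTES_MOVED      # FLOP/byte
ridge_point = PEAK_FLOPS / PEAK_BW                            # FLOP/byte
attainable = min(PEAK_FLOPS, arithmetic_intensity * PEAK_BW)  # FLOP/s
compute_bound = arithmetic_intensity >= ridge_point           # memory-bound here
```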
Also includes physics engine integration improvements across all Vol 1
chapters: unit-safe conversions, inline Python for previously hardcoded
values, streamlined footnotes with cross-references, and new content
validation scripts.
All 21 Vol 1 chapters pass PDF build tests.
Systematic redundancy removal for MIT Press submission. Applied a two-phase
editorial process: (1) identified all conceptual repetition and near-duplication
across main text, examples, and callouts; (2) executed targeted edits to
eliminate redundant content while preserving tone and structure.
Files modified (22 chapters):
- Frontmatter: about, acknowledgements, notation
- Part I: introduction, ml_systems, workflow, data_engineering
- Part II: dl_primer, dnn_architectures, frameworks, training
- Part III: data_selection, hw_acceleration, benchmarking
- Part IV: serving, ops, responsible_engr, conclusion
- Appendices: appendix_dam, appendix_machine, appendix_algorithm, appendix_data
Net reduction: ~72 lines of redundant content removed
- data_engineering.qmd: Move pricing footnote after callout block
- data_engineering.qmd: Convert SATA footnote to inline text
- dl_primer.qmd: Move GPT-4 estimate note to table caption
- introduction.qmd: Move Box quote after callout, remove unused fn-algorithm
All footnotes now follow Quarto rendering rules.
- Audited all .qmd files in Volume 1 to identify hardcoded numerical constants.
- Replaced hardcoded numbers with dynamic Python variables derived from `physx/constants.py`.
- Updated `physx/constants.py` with missing constants (e.g., battery specs, dataset sizes).
- Created new Python calculation blocks in chapters to derive local metrics (e.g., energy per inference, training costs) from global constants.
- Ensured mathematical consistency across chapters by linking all values to a single source of truth.
- Fixed a citation in references.bib.
This ensures that future updates to core constants (e.g., hardware specs) will automatically propagate throughout the text.
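The single-source-of-truth pattern can be sketched as follows; the constant and helper names are illustrative, not necessarily those in `physx/constants.py`:

```python
# Global constants (one source of truth; values are public figures or
# assumed placeholders marked as such).
BYTES_FP16 = 2                  # bytes per FP16 parameter
GPT3_PARAMS = 175e9             # GPT-3 parameter count
ENERGY_PER_MAC_PJ = 4.6         # assumed: picojoules per MAC

def inference_energy_j(macs, pj_per_mac=ENERGY_PER_MAC_PJ):
    """Derive a local chapter metric (energy per inference) from globals."""
    return macs * pj_per_mac * 1e-12

# A chapter-level calc cell derives values instead of hardcoding them:
gpt3_weights_gb = GPT3_PARAMS * BYTES_FP16 / 1e9
```

Updating a global constant then propagates to every derived value at build time.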
- Fix LaTeX equations in appendix_dam using md() for proper rendering
- Bibtex tidy reformatting of references.bib
- Table alignment fixes across multiple chapters
- Minor formatting cleanup from pre-commit hooks
Replace hardcoded byte sizes (2 for FP16, 4 for FP32) and model parameters
with global constants from physx/constants.py for consistency.
Changes:
- Add model_memory() helper to physx/formulas.py for standardized memory calculations
- Replace manual memory calculations with model_memory(params, bytes_per_param, unit)
- Use BYTES_FP16, BYTES_FP32 constants instead of hardcoded 2/4 values
- Use GPT2_PARAMS, GPT3_PARAMS constants instead of local 1.5e9/175e9 values
Files updated: hw_acceleration, dnn_architectures, training, data_engineering,
dl_primer, frameworks
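A plausible shape for the helper (the actual signature in `physx/formulas.py` may differ):

```python
# Constants mirrored from physx/constants.py.
BYTES_FP16 = 2
BYTES_FP32 = 4
GPT2_PARAMS = 1.5e9
GPT3_PARAMS = 175e9

_UNITS = {"B": 1, "KB": 1e3, "MB": 1e6, "GB": 1e9}

def model_memory(params, bytes_per_param, unit="GB"):
    """Standardized memory footprint: params x bytes/param, scaled to unit."""
    return params * bytes_per_param / _UNITS[unit]

gpt2_fp32_gb = model_memory(GPT2_PARAMS, BYTES_FP32)  # replaces 1.5e9 * 4 / 1e9
gpt3_fp16_gb = model_memory(GPT3_PARAMS, BYTES_FP16)  # replaces 175e9 * 2 / 1e9
```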
Apply consistent header format to setup cells in appendix_machine.qmd
and appendix_dam.qmd. All compute cells now follow the same structured
pattern used throughout ml_systems.qmd and other chapters.
Convert magic numbers and hardcoded calculations to Python-computed
inline references following the Computed Arithmetic Rule. Changes span
appendices (D·A·M, Machine Foundations), all main chapters, and glossary.
Key improvements:
- Amdahl's/Gustafson's Law examples now compute all derived values
- Training time formula example uses computed days/minutes
- Little's Law example computes concurrent requests from QPS×latency
- Bandwidth-latency example parameterizes link speed and ping
- Glossary consolidates forward pass/forward propagation entries
- Add audit_narrative.py script for prose validation
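The Computed Arithmetic Rule in miniature: every number shown in prose is derived from named inputs rather than hardcoded. Input values below are illustrative:

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: speedup with parallel fraction p on n workers."""
    return 1.0 / ((1.0 - p) + p / n)

def littles_law_concurrency(qps, latency_s):
    """Little's Law: L = lambda x W (concurrent requests = QPS x latency)."""
    return qps * latency_s

speedup_16x = amdahl_speedup(p=0.95, n=16)        # ~9.14x (assumed inputs)
concurrent = littles_law_concurrency(1000, 0.250)  # 250 concurrent requests
```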
Improve the prose flow of all ~260 figure references across all 16
chapters and appendices. Replace generic verbs (illustrates, shows, depicts)
with directive, student-engaging language that tells readers what to observe
and why.
Also fix 16 factual inaccuracies found during verification audit:
- introduction: correct compute growth from "five" to "eight" orders of magnitude
- frameworks: fix inverted slope descriptions and crossover magnitudes in
compilation continuum; correct "embedded targets" to "language bindings"
- data_engineering: remove fabricated "feature engineering" stage from TFX
pipeline; remove unverifiable animal species names from hard labels
- benchmarking: correct power units from "microwatts/megawatts" to
"milliwatts/hundreds of kilowatts"
- responsible_engr: correct governance pillar labels to match figure caption
- ml_systems: fix cloud ML examples, mobile ML characteristics, and hybrid
sync description to match actual figure content
- training: correct LLM scaling curve attribution; fix node color description
- hw_acceleration: fix tiling diagram description
- model_compression: fix quantization error distribution description
- dnn_architectures: fix im2col kernel size; fix attention visualization
- Comment out all chapters except hw_acceleration in PDF config for focused testing
- Add missing physx.constants imports to ml_systems TCO calculation block
- Update figure manifest to reflect single-chapter build
Per MIT Press production feedback (Feb 2026):
- Change paper size from 7x10 to 8x10 inches
- Set 1/2" top margin to header
- Set 5/8" bottom margin
- Set 7/8" gutter (inner margin)
- Move page numbers to outside edge (standard book convention)
- Change PDF layout from TwoPageRight to SinglePage for preflight
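A sketch of the geometry settings implied by the spec above (not the exact include file; the outer margin value is an assumption, since only the gutter was specified):

```latex
% 8x10in trim, 1/2in top-to-header, 5/8in bottom, 7/8in gutter,
% twoside so folios sit on the outside edge.
\usepackage[paperwidth=8in, paperheight=10in,
            top=0.5in, bottom=0.625in,
            inner=0.875in, outer=0.625in,  % outer margin: assumed value
            includehead, twoside]{geometry}
```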
Also adds copyedit configs for double-spaced PDFs:
- _quarto-pdf-vol1-copyedit.yml
- _quarto-pdf-vol2-copyedit.yml
Purpose sections are abstract by design: they teach principles,
not specific hardware. Replace GPU/TPU references with
"accelerators" in the three Vol 1 purpose sections that
named specific hardware (serving, hw_acceleration, dl_primer).
Per-volume accent colors:
- Vol1: Harvard Crimson (#A51C30)
- Vol2: ETH Zurich Blue (#1F407A)
Architecture:
- themes/_theme-harvard.scss, _theme-eth.scss: Color variables
- _base-styles.scss, _dark-mode-base.scss: Shared styles using $accent
- style-vol1/2.scss, dark-mode-vol1/2.scss: Entry points per volume
Each volume now has its own distinct visual identity while sharing
the same underlying style rules.
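The layering can be sketched roughly as follows (selector rules are illustrative; only the accent values are documented above):

```scss
// themes/_theme-harvard.scss (Vol 1 entry points import this first)
$accent: #A51C30;  // Harvard Crimson
// themes/_theme-eth.scss sets $accent: #1F407A; (ETH Zurich Blue)

// _base-styles.scss: shared rules consume whichever $accent was imported
a:hover { color: $accent; }
h1 { border-bottom: 2px solid $accent; }
```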
AutoML content is better suited for Volume II's optimization chapter
(distributed-scale model search). Moved from vol1/optimizations/ to
vol2/optimization/ to keep it accessible for future integration.
Two issues:
1. LaTeX parser regex only matched numeric figure numbers (e.g., 1.1)
but appendices use letter prefixes (B.1, C.2, D.1). Changed \d+ to
[A-Z\d]+ so all 214 figures are captured.
2. --scan-all mode picked up _shelved QMD files that aren't in the
actual build, causing a count mismatch. Added _shelved to skip list.
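The regex change, isolated (the real parser matches LaTeX aux output, so the surrounding pattern differs; only the character-class fix is the point):

```python
import re

# Before: only numeric figure numbers (1.1) matched.
OLD = re.compile(r"Figure (\d+\.\d+)")
# After: letter-prefixed appendix figures (B.1, C.2, D.1) match too.
NEW = re.compile(r"Figure ([A-Z\d]+\.\d+)")

line = "Figure B.3: Memory hierarchy"
missed = OLD.search(line)            # None: appendix figure was dropped
caught = NEW.search(line).group(1)   # "B.3"
```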
- Add pre-render hook to clear stale LaTeX data between builds
- Add post-render hook to generate FIGURE_LIST.txt in output dir
- LaTeX captures figure numbers and pages during compilation
- Use deferred write for accurate page numbers (after float placement)
- Python merges with QMD captions and alt-text
- Output automatically appears in _build/pdf-vol1/ after each build
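The merge step can be sketched as below; record and field names are assumptions, not the script's actual schema:

```python
def merge_figures(latex_records, qmd_captions):
    """Join LaTeX-captured (number, page) records with QMD captions.

    latex_records: {fig_id: (number, page)} from the deferred LaTeX write.
    qmd_captions:  {fig_id: caption} scraped from the QMD sources.
    Returns the FIGURE_LIST.txt body.
    """
    lines = []
    for fig_id, (number, page) in sorted(latex_records.items()):
        caption = qmd_captions.get(fig_id, "(no caption)")
        lines.append(f"Figure {number} (p. {page}): {caption}")
    return "\n".join(lines)

text = merge_figures(
    {"fig-roofline": ("9.2", 241)},
    {"fig-roofline": "Roofline model for BERT inference"},
)
```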
Quarto requires `#|` directives to appear at the very start of a code block.
Fixed 93+ code blocks across 15 files where imports preceded the
`#| echo: false` directive, causing code to appear in the rendered PDFs.
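The corrected cell ordering looks like this (cell content is illustrative):

```python
#| echo: false
#| warning: false
# The `#|` directives above must be the first lines of the cell; an import
# placed before them demotes them to ordinary comments, so the cell's code
# leaks into the rendered PDF.
import math

area = math.pi * 2**2  # example computation the cell would hide
```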
- Added PYTHONPATH='.' to quarto execute config
- Modified viz.setup_plot() to return (fig, ax, COLORS, plt)
- Cleaned up all plotting cells to use simple imports
- No more sys.path manipulation needed in individual cells
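A sketch of the new setup_plot contract (the real helper lives in the repo's viz module; the COLORS palette and keys here are illustrative):

```python
COLORS = {"primary": "#A51C30", "secondary": "#1F407A"}  # assumed palette

def setup_plot(figsize=(6, 4)):
    """Return (fig, ax, COLORS, plt) so a plotting cell needs one call."""
    import matplotlib
    matplotlib.use("Agg")             # headless backend for CI/PDF builds
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=figsize)
    return fig, ax, COLORS, plt
```

A plotting cell then reduces to `fig, ax, COLORS, plt = setup_plot()` with no sys.path manipulation.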