Commit Graph

1585 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
01a7ee2a18 Add FP8 to Vol 1 precision list and tidy Vol 2 notation ending.
FP8 is a data format relevant to single-node systems (H100 supports
it natively), so it belongs in Vol 1's precision list rather than as
an orphaned bullet in Vol 2. Network throughput units (Gbps vs GB/s)
remain Vol 2-only under a proper "Additional Units" subsection.
2026-03-05 15:34:02 -05:00
Vijay Janapa Reddi
7992c8ff13 Switch book validation to container-only fail-fast builds.
Remove baremetal workflow usage and add explicit Linux/Windows preflight toolchain checks so missing dependencies fail immediately before render.
2026-03-05 15:33:19 -05:00
Vijay Janapa Reddi
71b6090064 Split notation into per-volume files with Vol2 inheriting from Vol1.
Vol 1 owns its notation (vol1/frontmatter/notation.qmd) as a
standalone document free of distributed systems concepts. Vol 2
(vol2/frontmatter/notation.qmd) uses {{< include >}} to pull in
Vol 1's notation, then adds distributed systems notation on top.
This eliminates duplication while keeping Vol 1 independent of
Vol 2's publication timeline.

- Remove shared/notation.qmd (single shared file)
- Create vol1/frontmatter/notation.qmd (core ML systems notation)
- Create vol2/frontmatter/notation.qmd (extends Vol 1 via include)
- Update all 8 config files to point to volume-specific paths
- Remove contents/shared/ render glob from HTML configs
2026-03-05 11:23:51 -05:00
Vijay Janapa Reddi
2100099efb Standardize LaTeX subscripts to \text{} across both volumes.
Replace D_{vol}, R_{peak}, L_{lat} with D_{\text{vol}},
R_{\text{peak}}, L_{\text{lat}} in all QMD files and notation.qmd
to match the canonical notation convention. Also escape bare
FLOPs/$ to FLOPs/\$ in vol1 introduction. 288 replacements
across 24 files.
2026-03-05 11:04:34 -05:00
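
Note on 2100099efb: the log does not show how the 288 replacements were made. A minimal sketch of one way to do such a rewrite is below, assuming that every bare multi-letter subscript on a single-letter symbol should be wrapped. The pattern and the function name `wrap_subscripts` are illustrative assumptions, not the project's actual tooling.

```python
import re

# Assumed pattern: a single-letter symbol followed by a bare multi-letter
# subscript, e.g. D_{vol}. Subscripts already wrapped in \text{} begin with a
# backslash, so the letters-only group does not match them again.
BARE_SUBSCRIPT = re.compile(r"([A-Za-z])_\{([A-Za-z]{2,})\}")


def wrap_subscripts(source: str) -> tuple[str, int]:
    """Return the rewritten text and the number of replacements made."""
    # In the replacement template, "\\" emits one literal backslash.
    return BARE_SUBSCRIPT.subn(r"\1_{\\text{\2}}", source)
```

Because already-wrapped subscripts are skipped, running the function twice leaves the text unchanged, and the returned count gives a figure that can be checked against the one reported in the commit message.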
Vijay Janapa Reddi
048492d0e1 Move scaling regimes section from Vol1 intro to Vol2 intro.
The "Scaling the Machine: From Node to Fleet" section, including the
TikZ stack-comparison figure and layer walkthrough prose, belongs in
Vol2 where the reader crosses the scaling boundary. Vol1 now flows
directly from Samples/Dollar into ML vs. Traditional Software, with
a two-sentence scope statement in Book Organization. The fn-gpu-parallel
footnote is relocated next to its callsite on line 116.
2026-03-05 08:39:48 -05:00
Vijay Janapa Reddi
d767eb4212 Fix PowerShell quoting in Windows Docker Python alias step.
Use single-quoted python -c code in the Docker RUN command so the command parses correctly under pwsh -Command and avoids parser errors during image build.
2026-03-05 08:00:38 -05:00
Vijay Janapa Reddi
9314128cf7 Fix Windows python3 availability in container build paths.
Preserve container PATH during Windows docker-run steps and create/verify a python3 alias from Scoop Python so Quarto/Jupyter kernels that invoke python3 work reliably in both install-time and final verification checks.
2026-03-05 07:42:29 -05:00
Vijay Janapa Reddi
887fc9eb81 Harden final Windows tool checks for existence and exit status.
Update Dockerfile and workflow final verification steps to normalize native command exit detection and fail explicitly on non-zero exits, avoiding false positives and false negatives in PowerShell checks.
2026-03-04 19:46:38 -05:00
Vijay Janapa Reddi
75850ffd7a Fix TeX verification false failure in Windows Docker build.
Avoid checking $LASTEXITCODE after a piped lualatex command by capturing command output first and normalizing exit-code detection, preventing false non-zero failures.
2026-03-04 19:44:45 -05:00
Vijay Janapa Reddi
3fdae4a365 Defer Ghostscript runtime checks to final verification.
Keep fail-fast installation checks for Ghostscript command presence, but defer executing Ghostscript until the final verification phase where full runtime dependencies and PATH are in place.
2026-03-04 18:03:23 -05:00
Vijay Janapa Reddi
300457a013 Fix Ghostscript verification in Windows container build.
Verify Ghostscript through the Scoop shim (`gs`) and restore the Ghostscript `lib` path in image PATH so DLL-dependent checks pass during install and final verification.
2026-03-04 17:46:20 -05:00
Vijay Janapa Reddi
e0e9714669 Fix Windows Dockerfile PATH parse error.
Collapse multiline PATH ENV into a single valid Dockerfile instruction so Windows container builds parse and start correctly.
2026-03-04 17:24:04 -05:00
Vijay Janapa Reddi
77a37a7707 Harden Windows build tool installation and verification flow.
Add immediate per-tool post-install checks with explicit command resolution and exit handling, keep end-of-job final verification with isolated per-tool reporting, and ensure rsvg-convert is installed/verified for Quarto PDF SVG conversion.
2026-03-04 17:21:20 -05:00
Vijay Janapa Reddi
12268748b2 Overhaul Windows container build for PATH and tool reliability
- Add explicit ENV PATH directive (Phase 15) so Docker layers
  inherit tool paths instead of relying on registry writes
- Reorder phases: TeX Live moved last (slowest, fail last)
- Create stable symlink C:\texlive\bin\windows for year-agnostic PATH
- Skip pip self-upgrade to avoid WinError 3 shim lock
- Use gswin64c (correct Scoop binary name) instead of gs
- Add rsvg-convert fallback to Chocolatey if Scoop fails
- Replace fragile verification loop with Test-Tool function
- Relax ErrorActionPreference for Chocolatey TeX Live in baremetal
2026-03-04 17:16:34 -05:00
Vijay Janapa Reddi
fc0290df44 Fix broken \right) in DAM appendix equation
The \right) delimiter was split across two lines, causing
a LaTeX compilation error ("Missing \right. inserted").
2026-03-04 16:41:08 -05:00
Vijay Janapa Reddi
41097ce0c9 Harden TeX Live path resolution in Windows container
Add fallback search for tlmgr.bat when year-directory pattern
fails, validate bin directory exists before use, and verify
lualatex.exe path explicitly. Adds diagnostic output to help
debug future path resolution issues.
2026-03-04 15:37:55 -05:00
Vijay Janapa Reddi
3fce198695 Fix PowerShell quoting in Windows container verification
Replace double-quoted string interpolation in throw statement
with -f format operator to prevent Docker RUN flattening from
stripping quotes and causing PowerShell parse errors.
2026-03-04 15:07:03 -05:00
Vijay Janapa Reddi
c394766b98 Refines numerical output precision
Applies consistent decimal formatting to f-string representations of various metrics throughout the book, enhancing readability. Also updates specific hardcoded numerical examples for clarity.
2026-03-04 14:14:48 -05:00
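
Note on c394766b98: the commit does not list the format specifiers it applied. The sketch below only illustrates the general technique of fixing decimal precision inside f-strings. The metric names and values are made up for illustration and do not come from the book.

```python
def format_metrics(latency_ms: float, throughput: float, utilization: float) -> dict:
    """Render three hypothetical metrics with explicit, fixed precision."""
    return {
        "latency": f"{latency_ms:.2f} ms",            # two decimal places
        "throughput": f"{throughput:,.1f} samples/s",  # thousands separator, one decimal
        "utilization": f"{utilization:.1%}",           # fraction shown as a percentage
    }


# Illustrative values only.
report = format_metrics(12.3456, 1234567.891, 0.8731)
```

With these inputs the rendered strings are `12.35 ms`, `1,234,567.9 samples/s`, and `87.3%`.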
Vijay Janapa Reddi
0600bf8633 Enhance content rendering and consistency
Improves math expression rendering in the Log-Sum-Exp example for clarity and LaTeX robustness.
Standardizes the display of "MTBF" by removing unnecessary `\text{}` commands throughout the 'Fault Tolerance' chapter.
Updates various internal cross-references and section IDs to ensure correct linking and improved navigation.
Clarifies explanations in the error feedback worked example and the communication-computation overlap condition.
Adds a new bibliography entry for the Goodfellow et al. (2016) Deep Learning book.
2026-03-04 13:53:41 -05:00
Vijay Janapa Reddi
226b68ad1f Fix CI tool verification and switch Quarto to Scoop install
- Baremetal workflow: verification step now tracks failures and exits
  non-zero when tools are missing (previously always reported success)
- Baremetal workflow: R.exe --version replaced with Rscript --version
  to avoid PowerShell Invoke-History alias collision
- Windows Dockerfile: Quarto install switched from direct zip download
  (C:\quarto-1.9.27\bin) to scoop install extras/quarto for consistent
  PATH handling via Scoop shims
- Windows Dockerfile: final verification rewritten with failure tracking
  and exit 1 on missing tools
2026-03-04 13:52:02 -05:00
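
Note on 226b68ad1f: the actual verification steps are written in PowerShell and are not shown in the log. The idea of tracking failures and exiting non-zero can be sketched in Python as follows. The function names and the injectable `resolve` parameter are assumptions made so the logic can be exercised without the real tools installed.

```python
import shutil


def find_missing(tools, resolve=shutil.which):
    """Return the tools that `resolve` cannot locate, preserving input order."""
    return [tool for tool in tools if resolve(tool) is None]


def verify(tools, resolve=shutil.which):
    """Print one status line per tool and return a process exit code."""
    missing = find_missing(tools, resolve)
    for tool in tools:
        print(f"{tool}: {'MISSING' if tool in missing else 'ok'}")
    # Non-zero when anything is missing, so the CI step cannot report success.
    return 1 if missing else 0
```

A caller would pass the result to `sys.exit`, for example `sys.exit(verify(["quarto", "Rscript", "lualatex", "rsvg-convert"]))`. Every tool is checked before the exit code is decided, so one run reports all missing tools rather than only the first.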
Vijay Janapa Reddi
e2e9095b00 precommit fixes
2026-03-04 13:07:08 -05:00
Vijay Janapa Reddi
da69ea1a69 Writing updates
2026-03-04 12:56:04 -05:00
Vijay Janapa Reddi
501ac92bb1 Adds SVG XML validation hook
Introduces a pre-commit hook to ensure SVG image files are well-formed XML,
preventing potential rendering or processing issues. This leverages `lxml` for
parsing, which has been added as a new dependency.

Corrects missing whitespace between attributes in existing SVG figures to
comply with the new validation requirements.
2026-03-04 11:07:06 -05:00
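
Note on 501ac92bb1: the hook itself parses with `lxml` and is not reproduced in the log. The sketch below swaps in the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies. It shows the same kind of well-formedness check, not the project's hook. The function name `svg_error` is an assumption.

```python
import xml.etree.ElementTree as ET


def svg_error(svg_text: str):
    """Return None if the text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(svg_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Well-formed: attributes are separated by whitespace.
good = '<svg xmlns="http://www.w3.org/2000/svg"><rect x="1" y="2"/></svg>'
# Not well-formed: no whitespace between the two attributes.
bad = '<svg xmlns="http://www.w3.org/2000/svg"><rect x="1"y="2"/></svg>'
```

The second string reproduces the class of defect the commit says it corrected in existing figures. A pre-commit entry point would call `svg_error` on each staged `.svg` file and return non-zero if any call returns a message.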
Vijay Janapa Reddi
4bd062c545 Standardizes citation formatting
Streamlines in-text references by removing redundant author mentions and consolidating to Quarto's native `@` syntax. Improves consistency and readability of academic citations.
2026-03-04 10:00:41 -05:00
Vijay Janapa Reddi
3097fe9931 Merge branch 'feature/book-volumes' into dev
2026-03-04 09:59:37 -05:00
Vijay Janapa Reddi
4b5a0a27f2 Enhances documentation readability and tone
Standardizes the narrative to a third-person, objective perspective.
Improves conciseness and flow of explanations throughout the content.
Refines punctuation for better readability and impact.
2026-03-04 09:29:46 -05:00
Zeljko Hrcek
9fa4cb214e Updated figures in chapter 9
2026-03-04 15:23:28 +01:00
Vijay Janapa Reddi
31ced18ec3 Refines conceptual language and generalizes examples
Updates various sections to use more general and impactful language.
Replaces specific numerical examples and ranges with qualitative descriptions (e.g., "orders of magnitude," "substantial fraction," "staggering capital expenditure").
Improves the timelessness and broad applicability of the content by reducing reliance on potentially outdated figures.
2026-03-04 08:57:36 -05:00
Vijay Janapa Reddi
870dc85a9f Installs rsvg-convert and refactors Quarto
Adds rsvg-convert to the Windows Dockerfile to enable SVG-to-PDF conversion for Quarto, including its verification.

Refactors Quarto document source:
- Renames a speculative decoding footnote for improved clarity.
- Standardizes an internal TikZ drawing macro boolean flag from `\ifbox@dashed` to `\ifboxdashed` across several files.
2026-03-04 08:08:39 -05:00
Vijay Janapa Reddi
210f8b173d Adds new inference SVG diagrams and unifies background styles
Enhances the `inference` chapter with several new SVG diagrams, providing visual explanations for complex topics. These figures illustrate:
- Tensor, pipeline, and expert parallelism request routing
- Horizontal scaling with shard groups
- Global load balancing across multiple regions
- Edge caching strategies (hit/miss paths)
- Spot-aware traffic distribution

Updates the `inference.qmd` document to integrate these new diagrams, replacing previous textual and ASCII-art descriptions for improved clarity and presentation.

Applies a widespread style standardization to existing SVG diagrams, uniformly setting the main background fill color to `#fff` (pure white) and a consistent corner radius (`rx="4"`) for the primary canvas rectangle to enhance visual consistency throughout the book.
2026-03-03 19:29:27 -05:00
Vijay Janapa Reddi
0b8e0209d0 Remove trailing blank lines and fix whitespace across chapters
Removes trailing blank lines from 14 chapters, fixes table column
alignment and missing newline at end of file in vol2 conclusion,
and adds missing section ID to dedication page.
2026-03-03 18:56:38 -05:00
Vijay Janapa Reddi
f98af8baae Fix QMD chapter structure: title order, chapter-start placement, div spacing
- Move chapter title (# Title) to immediately after YAML frontmatter in all vol1/vol2 chapters
- Move chapter-start Python cells to after Learning Objectives callout (not before cover image)
- Remove empty chapter-start cells (no executable code) from vol1 conclusion, data_engineering, introduction
- Move conclusion-roofline-setup cell (vol1 conclusion) to after Learning Objectives
- Add missing blank lines after chapter titles before ::: blocks
- Add missing blank lines before/after all ::: div markers across all chapter files (2028 insertions)
2026-03-03 18:55:11 -05:00
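
Note on f98af8baae: the log does not say how the 2028 blank lines around `:::` markers were inserted. One possible approach is sketched below. The function name is an assumption, and the sketch deliberately ignores the case of `:::` appearing inside fenced code blocks, which a real tool would need to handle.

```python
def pad_div_markers(text: str) -> str:
    """Ensure a blank line separates every ::: marker from adjacent content."""
    out = []
    for line in text.split("\n"):
        prev = out[-1] if out else ""
        is_marker = line.lstrip().startswith(":::")
        prev_is_marker = prev.lstrip().startswith(":::")
        # Add a separator only between two non-blank lines where at least
        # one of them is a div marker.
        if out and line.strip() and prev.strip() and (is_marker or prev_is_marker):
            out.append("")
        out.append(line)
    return "\n".join(out)
```

The function is idempotent: text that already has the blank lines passes through unchanged, so it is safe to run on every chapter file.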
Vijay Janapa Reddi
bd5dd6f088 Enhances Quarto build for robustness and dynamic cross-referencing
Configures explicit `render` paths for both volumes to ensure complete and correct builds, particularly for selective rendering workflows.

Replaces the static cross-reference fix script with a dynamic version. This new script automatically discovers and resolves internal links from QMD sources, improving maintainability and ensuring links remain functional during partial book builds.

Adds a new script to check and auto-fix bibliography completeness, facilitating self-contained volumes.

Removes redundant empty Python code blocks from chapter QMDs and refines frontmatter content for consistency.
2026-03-03 16:04:25 -05:00
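
Note on bd5dd6f088: the dynamic cross-reference script is not included in the log. A minimal sketch of the discovery step is below, assuming labels are declared as `{#sec-...}`-style attributes and referenced with Quarto's `@` syntax. It only reports references with no matching label. It does not handle labels set through code-cell options, and the function name and prefix list are assumptions.

```python
import re

# Assumed label prefixes; adjust to whatever the sources actually use.
LABEL = re.compile(r"\{#((?:sec|fig|tbl|eq)-[\w-]+)")
REF = re.compile(r"@((?:sec|fig|tbl|eq)-[\w-]+)")


def unresolved_refs(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each file name to the references whose labels are defined nowhere."""
    defined = {label for text in sources.values() for label in LABEL.findall(text)}
    report = {}
    for name, text in sources.items():
        missing = sorted({ref for ref in REF.findall(text) if ref not in defined})
        if missing:
            report[name] = missing
    return report
```

Labels are collected from every source passed in, not from a hard-coded table. A partial build can therefore pass the full set of QMD files for label discovery and render only a subset of them.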
Vijay Janapa Reddi
0dfbba8f45 Removes generated book compilation files
Deletes temporary files generated during Quarto book compilation, including index files and the figure manifest. This keeps the repository clean by untracking build artifacts.
2026-03-03 15:03:11 -05:00
Vijay Janapa Reddi
d9c2906e40 Adds Windows book builds and refines Quarto content
Introduces Windows HTML, PDF, and EPUB build configurations for both Volume I and Volume II in the GitHub Actions workflow, expanding the available output formats for the book.

Updates Quarto callout and figure syntax from `::::` to `:::` across numerous content files for consistency and compatibility.

Removes unreferenced `war_stories.bib` and `data_engineering.bib` bibliography files and their corresponding entries in Quarto configuration.

Standardizes internal references to the 'Responsible AI' chapter by updating `@sec-responsible-engineering` to `@sec-responsible-ai` for improved linking accuracy throughout the text.
2026-03-03 14:50:20 -05:00
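
Note on d9c2906e40: renaming a reference such as `@sec-responsible-engineering` with a plain search-and-replace risks clobbering longer labels that share the same prefix. A hedged sketch of a safer rename is below. The helper name and the longer label used in the example are hypothetical.

```python
import re


def rename_ref(text: str, old: str, new: str) -> str:
    """Rename @old to @new without touching labels that merely start with old."""
    # The lookahead rejects matches followed by another label character.
    pattern = re.compile("@" + re.escape(old) + r"(?![\w-])")
    return pattern.sub(lambda _match: "@" + new, text)


# "sec-responsible-engineering-summary" is a made-up label for illustration.
sample = "See @sec-responsible-engineering and @sec-responsible-engineering-summary."
renamed = rename_ref(sample, "sec-responsible-engineering", "sec-responsible-ai")
```

Only the exact reference is rewritten. The longer, hypothetical label is left alone.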
Vijay Janapa Reddi
91e5c320c5 Improves Quarto build and table rendering
Adds PYTHONPATH and MPLBACKEND environment variables to Quarto PDF configurations. This ensures Python code blocks execute reliably, particularly for plot generation.

Refactors table styling in appendix content to use direct Quarto block attributes for the `.column-page` class, simplifying markup and improving consistent layout.
2026-03-03 11:27:25 -05:00
Vijay Janapa Reddi
59b23a22c5 Activates appendices in PDF configurations
Uncomments the `appendices` section in Quarto PDF configuration files for both volumes. This ensures that the specified appendix content is included in the generated PDF output.
2026-03-03 11:19:13 -05:00
Vijay Janapa Reddi
10a2dc4303 fix(docker): add librsvg2-bin to Linux container for rsvg-convert
Quarto's Lua filter calls rsvg-convert to convert SVG figures to PDF
during PDF builds. librsvg2-dev was present (C headers/lib) but the
binary package librsvg2-bin was missing, causing a FATAL build error:
  'Could not convert a SVG to a PDF. Please ensure rsvg-convert is on path'

Also adds rsvg-convert to the Phase 2 verification checks so missing
tools are caught at image build time, not at render time.
2026-03-03 11:05:56 -05:00
Vijay Janapa Reddi
e6bad1fd45 refactor(docker): extract TeX Live install logic into standalone script
Move the ~100-line Phase 4 inline PowerShell block into
book/docker/windows/install_texlive.ps1. The Dockerfile now simply
COPYs and calls the script. Benefits:
- Script can be tested and updated independently of the Dockerfile
- Cleaner, readable PS syntax (no backtick line-continuation noise)
- Docker layer only invalidates when the script actually changes
2026-03-03 08:24:15 -05:00
Vijay Janapa Reddi
96f03a672b fix(build): fix three container build failures across epub, pdf, and html targets
- Remove invalid `output-file` from `project:` block in both EPUB configs
  (Quarto schema only allows `output-file` under `book:`, not `project:`)
- Move `language` to top-level `lang:` and remove HTML-only keys from
  EPUB format blocks (`fig-caption`, `footnotes-hover`, `citations-hover`,
  `code-copy`, `code-line-numbers`, `description`) per Quarto EPUB spec
- Add `matplotlib>=3.7.0` to requirements.txt — was missing from container
  image, causing ModuleNotFoundError during figure rendering
- Add `_matplotlib_available` guard in `viz.setup_plot()` to raise a clear
  ImportError instead of a cryptic AttributeError when matplotlib is absent
2026-03-03 08:14:59 -05:00
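
Note on 96f03a672b: the body of `viz.setup_plot()` is not shown in the log. The sketch below illustrates the guard pattern the commit describes. The `available` parameter, the return value, and the error text are assumptions added so the failure path can be demonstrated. Only the `_matplotlib_available` name and the `matplotlib>=3.7.0` requirement come from the commit message.

```python
import importlib.util

# True when matplotlib can be located in the current environment.
_matplotlib_available = importlib.util.find_spec("matplotlib") is not None


def setup_plot(available=None):
    """Hypothetical stand-in for viz.setup_plot(); the real signature may differ."""
    if available is None:
        available = _matplotlib_available
    if not available:
        # Fail early with an actionable message, not a later AttributeError.
        raise ImportError(
            "matplotlib is required for figure rendering; "
            "install it with: pip install 'matplotlib>=3.7.0'"
        )
    import matplotlib.pyplot as plt  # imported only after the guard passes

    return plt
```

Raising `ImportError` at the call site makes a missing dependency visible in the first traceback line, which is the behavior the commit message asks for.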
Vijay Janapa Reddi
2349e63094 fix(ci): consolidate black version — drop workflow pin, floor to >=24.0.0
The CI workflow hard-pinned black==24.10.0 separately from requirements.txt
(which said >=23.0.0), causing version skew that reformatted 11 QMD files
on every CI run. Remove the override and let requirements.txt be the single
source of truth, bumped to >=24.0.0 to align with current latest.
2026-03-03 07:39:31 -05:00
Vijay Janapa Reddi
6cb39f40ab fix(build): set PYTHONPATH for mlsysim, move output-file to book:, add volume to job name
2026-03-03 07:32:40 -05:00
Vijay Janapa Reddi
79a1015a5c fix(docker): avoid backtick escaping in cmd /c call for install-tl-windows.bat
2026-03-02 22:58:27 -05:00
Vijay Janapa Reddi
83cb23d178 fix(docker): use cmd /c for .bat invocation and fix exit in pwsh inline mode
2026-03-02 22:44:52 -05:00
Vijay Janapa Reddi
b316005230 fix(ci): reformat Python blocks with Black 24.10.0 and fix PS string interpolation
CI pins black==24.10.0 but requirements.txt had black>=23.0.0, causing
pre-commit to reformat 11 QMD files on the CI run and fail. Format all
affected files locally with 24.10.0 to match CI expectations.

Also fix PowerShell PATH string interpolation in Windows Dockerfile:
use explicit concatenation instead of nested method call inside a
double-quoted string, which can be unreliable in some PS contexts.
2026-03-02 22:21:41 -05:00
Vijay Janapa Reddi
bb0cecbe3d chore: add git hooks to run pre-commit on all files (matches CI)
- book/tools/git-hooks/pre-commit: runs pre-commit run --all-files
- setup.sh: one-time config (git config core.hooksPath)
- Ensures local commits pass same checks as CI
2026-03-02 20:45:01 -05:00
Vijay Janapa Reddi
159f4588c8 fix(docker): replace Chocolatey texlive with direct install-tl and mirror fallback
Chocolatey's texlive wrapper sets ErrorActionPreference=Stop and relies on
install-tl picking a random CTAN mirror at runtime. When that mirror is
flaky (as mirrors.rit.edu was), the entire build fails with no fallback.

Switch to calling install-tl-windows.bat directly:
- Set ErrorActionPreference=Continue so we own error handling
- Write a profile with instopt_adjustrepo=0 to prevent auto-mirror switching
- Pass -repository explicitly, trying Illinois → MIT → mirror.ctan.org in order
- Pin tlmgr repository post-install to the same stable mirror
- Remove Chocolatey texlive dependency entirely
2026-03-02 20:38:22 -05:00
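
Note on 159f4588c8: the real installer logic is PowerShell and is not reproduced in the log. The ordered fallback it describes (Illinois, then MIT, then mirror.ctan.org) can be sketched generically as below. The function name is an assumption, and the strings in the example are placeholder labels, not actual repository URLs.

```python
def first_working(candidates, attempt):
    """Try each candidate in order; return the first for which attempt() succeeds."""
    errors = []
    for candidate in candidates:
        try:
            attempt(candidate)
            return candidate
        except Exception as exc:  # record the failure and move to the next mirror
            errors.append(f"{candidate}: {exc}")
    raise RuntimeError("all candidates failed:\n" + "\n".join(errors))


def simulated_install(mirror):
    """Stand-in for an installer call; only the 'mit' label succeeds here."""
    if mirror != "mit":
        raise OSError("unreachable")


chosen = first_working(["illinois", "mit", "mirror.ctan.org"], simulated_install)
```

Returning the candidate that worked lets the caller reuse it afterwards, matching the commit's step of pinning the tlmgr repository to the same mirror after installation.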
Vijay Janapa Reddi
0cc0361f60 fix: remove --params mirror arg from choco texlive install
The InstallerParameters flag passed to install-tl via --params was
corrupting the installer profile, causing abs_path($::installerdir)
to return undef and triggering the 'uninitialized value $tmp' Perl
error at install-tl line 651. Install without params and set the
tlmgr repository mirror post-install instead.
2026-03-02 20:16:37 -05:00
Vijay Janapa Reddi
954b7942c2 chore: harden Windows TeX Live install and default to latest
Improve Windows container reliability by pinning TeX Live installer mirrors with fallback and setting safer Chocolatey CI defaults. Make TeX Live version configurable via build arg and default to latest while retaining override support.
2026-03-02 19:32:40 -05:00
Vijay Janapa Reddi
f64ba2962c chore: resolve pre-commit warning backlog and stabilize checks
Normalize book prose/style issues across touched chapters and remove remaining structural warnings so validation output is clean and reproducible in CI. Also tighten inline/times-spacing validation behavior to reduce noisy false positives while preserving strict checks.
2026-03-02 19:04:35 -05:00