Commit Graph

1005 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
1052b2be31 Update book workflows for volume-only builds
Switch container/baremetal/validate/preview/live flows to vol1+vol2 artifacts, keep baremetal in dev validation, and add stable single-book navbar link.
2026-03-02 09:45:40 -05:00
Vijay Janapa Reddi
a7f9367e42 Merge dev into feature/book-volumes: CI, contributors, workflows
# Conflicts:
#	README.md
2026-03-02 09:38:47 -05:00
Vijay Janapa Reddi
48b519c42e Merge feature/tinytorch-core into feature/book-volumes
# Conflicts:
#	README.md
#	tinytorch/src/01_tensor/01_tensor.py
#	tinytorch/src/15_quantization/ABOUT.md
2026-03-02 09:38:08 -05:00
Vijay Janapa Reddi
73db0e021a Streamlines chapter introduction
Removes a sentence that summarized the chapter's structure.
This change simplifies the immediate opening, aligning with broader content organization efforts.
2026-03-02 09:37:06 -05:00
Vijay Janapa Reddi
8a1b0b8cd5 Reorganizes Introduction chapter content and prose
Moves the 'Scaling the Machine: From Node to Fleet' section to a more logical position
within the chapter, following the discussion on defining ML systems.

Refines various sentences for improved clarity, conciseness, and a more formal,
impersonal tone. Adds an introductory sentence to better outline the chapter's
structure and movements.
2026-03-02 08:38:57 -05:00
Vijay Janapa Reddi
533cfa6e99 fix: pre-commit hooks — all 48 checks now pass
- book/quarto/mlsys/__init__.py: add repo-root sys.path injection so
  mlsysim is importable when scripts run from book/quarto/ context
- book/quarto/mlsys/{constants,formulas,formatting,hardware}.py: new
  compatibility shims that re-export from mlsysim.core.* and mlsysim.fmt
- mlsysim/viz/__init__.py: remove try/except for dashboard import; use
  explicit "import from mlsysim.viz.dashboard" pattern instead
- .codespell-ignore-words.txt: add "covert" (legitimate security term)
- book/tools/scripts/reference_check_log.txt: delete generated artifact
- Various QMD, bib, md files: auto-formatted by pre-commit hooks
  (trailing whitespace, bibtex-tidy, pipe table alignment)
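
The repo-root `sys.path` injection described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in `book/quarto/mlsys/__init__.py`: the helper names `find_repo_root` and `inject_repo_root` are hypothetical, and the sketch assumes the repo root is the nearest ancestor directory that contains an `mlsysim/` directory.

```python
import sys
from pathlib import Path


def find_repo_root(start: Path, marker: str = "mlsysim") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Hypothetical helper: treating the presence of an `mlsysim/` directory as
    the sign of the repo root is an assumption made for this sketch.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"no ancestor of {start} contains {marker!r}")


def inject_repo_root(start: Path, marker: str = "mlsysim") -> Path:
    """Put the repo root at the front of sys.path (only once) and return it."""
    root = find_repo_root(start, marker)
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root
```

A package `__init__.py` could call `inject_repo_root(Path(__file__).parent)` before importing `mlsysim`, so the import resolves no matter which working directory the script was started from.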
2026-03-01 17:30:24 -05:00
Vijay Janapa Reddi
c30f2a3bfd refactor: move mlsysim to repo root, extract fmt module from viz
Moves the mlsysim package from book/quarto/mlsysim/ to the repo root
so it is importable as a proper top-level package across the codebase.

Key changes:
- mlsysim/fmt.py: new top-level module for all formatting helpers (fmt,
  sci, check, md_math, fmt_full, fmt_split, etc.), moved out of viz/
- mlsysim/viz/__init__.py: now exports only plot utilities; dashboard.py
  (marimo-only) is no longer wildcard-exported and must be imported
  explicitly by marimo labs
- mlsysim/__init__.py: added `from . import fmt` and `from .core import
  constants`; removed broken `from .viz import plots as viz` alias
- execute-env.yml: fixed PYTHONPATH from "../../.." to "../.." so
  chapters resolve to repo root, not parent of repo
- 51 QMD files: updated `from mlsysim.viz import <fmt-fns>` to
  `from mlsysim.fmt import <fmt-fns>`
- book/quarto/mlsys/: legacy shadow package contents cleaned up;
  stub __init__.py remains for backward compat
- All Vol1 and Vol2 chapters verified to build with `binder build pdf`
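
A bulk import migration like the one across the QMD files can be sketched as a line-based rewrite. This is only an illustration: the set of formatter names below is taken from the list in this message (which ends in "etc.", so it is incomplete), the function name `rewrite_fmt_imports` is hypothetical, and the sketch deliberately leaves any import line untouched if it mentions a name outside that set.

```python
import re

# Formatter names listed in the commit message; the real set is larger ("etc.").
FMT_NAMES = {"fmt", "sci", "check", "md_math", "fmt_full", "fmt_split"}

# Matches a single-line `from mlsysim.viz import a, b, c` statement.
VIZ_IMPORT = re.compile(r"^(\s*)from mlsysim\.viz import (.+)$", re.MULTILINE)


def rewrite_fmt_imports(text: str) -> str:
    """Point formatter-only imports at mlsysim.fmt; leave all others alone."""

    def repl(match: re.Match) -> str:
        names = [name.strip() for name in match.group(2).split(",")]
        if all(name in FMT_NAMES for name in names):
            return f"{match.group(1)}from mlsysim.fmt import {match.group(2)}"
        # Mixed or plot-only imports need manual review, so keep them as-is.
        return match.group(0)

    return VIZ_IMPORT.sub(repl, text)
```

Skipping mixed lines is a conservative choice: a line that imports both a formatter and a plot utility would need to be split by hand rather than redirected wholesale.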
2026-03-01 17:24:11 -05:00
Vijay Janapa Reddi
6a763c2552 Fix Node 1 NVLink ring arrowhead tangents in hierarchical-allreduce.svg
Offset the 2nd bezier control point x from the endpoint x on all four
Node 1 ring arcs so orient="auto" computes a diagonal arrival angle
instead of a straight vertical arrowhead.
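
The geometry behind this fix can be checked with a few lines of Python. The tangent at the end of a cubic bezier points from the second control point to the endpoint, and a marker with orient="auto" is rotated to that direction. The coordinates below are made up for illustration and are not taken from hierarchical-allreduce.svg; the function name `arrival_angle_deg` is hypothetical, and the sketch does not handle the degenerate case where the control point equals the endpoint.

```python
import math


def arrival_angle_deg(control2, end):
    """Angle (degrees) of the end tangent of a cubic bezier.

    The end tangent runs from the second control point to the endpoint.
    Angles follow SVG's coordinate system, where y grows downward.
    """
    dx = end[0] - control2[0]
    dy = end[1] - control2[1]
    if dx == 0 and dy == 0:
        raise ValueError("control point equals endpoint; not handled in this sketch")
    return math.degrees(math.atan2(dy, dx))


end = (300.0, 200.0)  # illustrative endpoint

# Control point directly above the endpoint: the arrowhead arrives vertically.
vertical = arrival_angle_deg((300.0, 150.0), end)

# Control point offset in x: the arrowhead arrives on a diagonal.
diagonal = arrival_angle_deg((280.0, 150.0), end)
```

With the x coordinates equal, `vertical` is about 90 degrees; after the offset, `diagonal` falls strictly between 0 and 90 degrees.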
2026-03-01 16:02:21 -05:00
Vijay Janapa Reddi
b0d826df64 Add Vol 2 textbook-quality SVG figures across all 17 chapters
Generated and audited 122 SVG figures covering all Vol 2 chapters:
introduction, compute_infrastructure, network_fabrics, data_storage,
distributed_training, collective_communication, fault_tolerance,
performance_engineering, inference, fleet_orchestration, ops_scale,
edge_intelligence, responsible_ai, robust_ai, security_privacy,
sustainable_ai. All figures follow the shared SVG style guide
(680x460 viewBox, Helvetica Neue, no embedded titles). Layout audit
applied 11 fixes for text overflow, out-of-bounds elements, and
missing arrowheads.
2026-03-01 15:51:20 -05:00
Vijay Janapa Reddi
bf9c402827 Adds callout-definition blocks to all Vol.2 chapters and fixes pre-commit hook errors
- Adds standardized callout-definition blocks with bold term + clear definition
  to all Vol.2 chapters (distributed training, inference, network fabrics, etc.)
- Fixes caption_inline_python errors: replaces Python inline refs in table
  captions with static text in responsible_engr, appendix_fleet, appendix_reliability,
  compute_infrastructure
- Fixes undefined_inline_ref errors: adds missing code fence for PlatformEconomics
  class in ops_scale.qmd; converts display math blocks with Python refs to prose
- Fixes render-pattern errors: moves inline Python outside $...$ math delimiters
  in conclusion, fleet_orchestration, inference, introduction, network_fabrics,
  responsible_ai, security_privacy, sustainable_ai, distributed_training
- Fixes dropcap errors: restructures drop-cap sentences in hw_acceleration and
  nn_architectures to not start with cross-references
- Fixes unreferenced-label errors: removes @ prefix from @sec-/@tbl- refs inside
  Python comment strings in training, model_compression, ml_systems
- Adds clientA to codespell ignore words (TikZ node label in edge_intelligence)
- Updates mlsys constants, hardware, models, and test_units for Vol.2 calculations
- Updates _quarto.yml and references.bib for two-volume structure
2026-03-01 10:44:33 -05:00
Vijay Janapa Reddi
69736d3bdb updates 2026-02-28 18:20:47 -05:00
Vijay Janapa Reddi
ae6f5d9f11 Refines book structure; modularizes embedded code and updates content
Updates Quarto configurations to reorder, add, and rename appendices across all output formats for both volumes, and includes previously commented chapters in PDF builds.

Encapsulates Python calculation logic and exported variables within dedicated classes across numerous Quarto documents, improving modularity, maintainability, and clarity of in-text references.

Refines MLOps definitions, corrects TCO calculation with distinct inference GPU rates, adjusts distributed training scaling scenarios (e.g., commodity network bandwidth), and clarifies network fabric details (e.g., FEC latency).
2026-02-28 17:00:09 -05:00
Vijay Janapa Reddi
d299e49d10 update 2026-02-28 16:25:00 -05:00
Vijay Janapa Reddi
72d64a5499 cell updates 2026-02-28 13:03:38 -05:00
Vijay Janapa Reddi
2ce322def1 LEGO updates, callout updates 2026-02-28 11:47:42 -05:00
Vijay Janapa Reddi
c8dd1782d3 Math updates 2026-02-28 08:28:51 -05:00
Vijay Janapa Reddi
30f4cb1453 Renames Volume II parts and refines content for clarity
Renames Volume II parts from V-VIII to I-IV, updating all corresponding references in the about section, volume introduction, and individual part principle files.

Refines various textual elements across the book for improved conciseness and readability. Cleans up markdown formatting, including removal of unnecessary horizontal rules and empty code blocks. Adjusts footnote placement for better consistency.

Adds new reliability calculation parameters and corrects a tikz diagram rendering issue.
2026-02-27 18:00:41 -05:00
Vijay Janapa Reddi
dcf48671e2 Merge remote-tracking branch 'origin/feature/book-volumes' into feature/book-volumes 2026-02-27 08:09:51 -05:00
Vijay Janapa Reddi
b02b38aa32 fix: resolve PDF build failures in distributed_training and robust_ai
distributed_training: fix unclosed code cell (backticks appended to comment
line), add missing variable computations (a100_mem, nvlink_a100, etc.),
reorder LEGO cells so inline Python refs follow their defining cells, fix
duplicate cell label and stray code fence near young-daly-calc.

robust_ai: add missing TikZ definitions (gear macro, brain/skull pics,
LinePE style) to the data poisoning diagram so it compiles standalone.
2026-02-27 08:08:43 -05:00
Vijay Janapa Reddi
9cba37c92d Refactor TikZ figures and standardize code constants
Introduces reusable `pic` definitions for common elements across numerous TikZ diagrams, enhancing modularity and visual consistency. Improves diagram readability through explicit node positioning and refined styling.

Standardizes hardware and model constants in Python code by using specific `mlsys.constants` and dedicated setup classes, improving maintainability and clarity.

Addresses minor LaTeX formatting in math blocks and refines unit-aware calculations.
2026-02-27 07:15:37 -05:00
Zeljko Hrcek
6de84f20e6 Update chapter 20 figures 2026-02-27 12:02:50 +01:00
Vijay Janapa Reddi
303cd26669 refactor: use fmt_percent across Vol 1 and Vol 2 to prevent Pint precision bugs
This commit standardizes percentage formatting across the entire codebase to prevent critical rendering bugs (like the `19250000000000%` effective utilization bug in Vol 2).

Root Cause:
When dividing two Pint Quantities (e.g., `flop/second` by `TFLOPs/second`), Pint keeps the mixed unit (`flop/TFLOPs`) instead of reducing it. The raw `.magnitude` of that fraction is therefore too large by a factor of $10^{12}$. Passing it to `fmt(x * 100)` multiplied the inflated magnitude by 100, producing the incorrect display.

Fix:
1. Fortified `fmt_percent` and `display_percent` in `mlsys/formatting.py` to defensively strip units using `.m_as('')`. This forces Pint to cancel out the units (e.g., `flop/TFLOPs` becomes `1.0`) *before* extracting the number.
2. Replaced all instances of `fmt(X * 100)` with the fortified `fmt_percent(X)` across Vol 1 and Vol 2.
3. Fixed inline f-strings in `appendix_assumptions.qmd` by moving formatting logic into the Python setup cell as `_str` variables, adhering to the book's standard practice.
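
The failure mode and the defensive pattern can be reproduced without Pint using a small stand-in. Everything here is a sketch under assumptions: `FakeRatio` is a hypothetical class that only imitates the two Pint members named in this message (`.magnitude` and `.m_as('')`), the sample numbers were chosen to reproduce the bad string quoted above, and the `fmt_percent` shown is not the implementation in `mlsys/formatting.py`.

```python
class FakeRatio:
    """Hypothetical stand-in for a Pint Quantity in an unreduced unit such as flop/TFLOPs."""

    def __init__(self, magnitude: float, to_dimensionless: float):
        self.magnitude = magnitude          # raw number in the mixed unit
        self._scale = to_dimensionless      # factor that cancels the mixed unit

    def m_as(self, units: str) -> float:
        """Imitates Quantity.m_as(''): convert to dimensionless, return the number."""
        if units != "":
            raise ValueError("this stand-in only converts to dimensionless")
        return self.magnitude * self._scale


def fmt_percent(value, precision: int = 1) -> str:
    """Format a fraction as a percentage, cancelling units first when present."""
    if hasattr(value, "m_as"):
        value = value.m_as("")  # reduce flop/TFLOPs to a plain number first
    return f"{float(value) * 100:.{precision}f}%"


# Illustrative utilization: 1.925e11 in flop/TFLOPs, where 1 flop/TFLOPs = 1e-12.
utilization = FakeRatio(1.925e11, 1e-12)

naive = f"{utilization.magnitude * 100:.0f}%"   # the old fmt(x * 100) pattern
safe = fmt_percent(utilization, precision=2)    # units cancelled before formatting
```

Here `naive` is `19250000000000%`, matching the bug described above, while `safe` is `19.25%`. Plain floats pass through unchanged, so `fmt_percent(0.5)` gives `50.0%`.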

Validation:
- Audited all `.magnitude` extractions in the codebase to ensure they are safe (e.g., explicitly converting to dimensionless units first).
- Ran `validate_inline_refs.py` and confirmed no Python variables are trapped inside LaTeX math mode.
- Successfully built full PDFs for both Volume 1 and Volume 2.
2026-02-26 20:59:43 -05:00
Vijay Janapa Reddi
96336ab0c6 fix: resolve Vol 2 PDF build failures and Pint unit display bugs
- Add missing attributes to FleetFoundations in appendix_fleet.qmd
- Fix regression_testing.png image path in fault_tolerance.qmd
- Add pgfplots package to header-includes.tex for TikZ compatibility
- Fortify fmt_percent in formatting.py to handle Pint Quantities properly, fixing the 19250000000000% display bug
2026-02-26 20:46:12 -05:00
Vijay Janapa Reddi
734e6fc987 fix: contribution guidelines link (main branch, CONTRIBUTING.md) 2026-02-26 17:42:08 -05:00
Vijay Janapa Reddi
baebb4c6d7 fix(vol1): model_serving PDF build — Python cell and TikZ
- Remove duplicate indented block in resnet-spectrum-calc cell that caused
  IndentationError (partial EXPORTS + stray class-body lines).
- Fix TikZ in fig-server-anatomy: add missing 'to' in brain path segments,
  remove stray/double commas in node and draw options.
2026-02-26 17:35:42 -05:00
Vijay Janapa Reddi
141a1efbe3 Refactor Volume 2 TikZ diagrams for structural integrity and positioning 2026-02-26 16:05:29 -05:00
Vijay Janapa Reddi
5e0c9a2f5d Update book quarto mlsys (hardware, validate_inline_refs, engine) 2026-02-26 15:23:07 -05:00
Vijay Janapa Reddi
73e39a0b8e Update book index 2026-02-26 15:23:04 -05:00
Vijay Janapa Reddi
2be59e3cec Update shared frontmatter (about, socratiq) 2026-02-26 15:23:04 -05:00
Vijay Janapa Reddi
0e992b79ae Update vol2 content and config 2026-02-26 15:23:03 -05:00
Vijay Janapa Reddi
c8447dd556 Update vol1 content and config 2026-02-26 15:11:04 -05:00
Vijay Janapa Reddi
45a3ad829e feat(landing): refine DAM/C3 hexagon wireframe visibility 2026-02-26 13:14:46 -05:00
Vijay Janapa Reddi
9420cfb87e feat(landing): replace sliders with DAM/C3 hexagon cube animation 2026-02-26 13:12:38 -05:00
Vijay Janapa Reddi
fe4daeb728 chore(landing): remove unused background variations 2026-02-26 12:47:54 -05:00
Vijay Janapa Reddi
59cffeef48 feat(landing): add matrix and particle background variations 2026-02-26 12:31:08 -05:00
Vijay Janapa Reddi
e0a71023e4 chore(landing): remove separate layout files in favor of unified light/dark mode 2026-02-26 11:43:19 -05:00
Vijay Janapa Reddi
fadef036e0 feat(landing): add multiple background animation variations and fix index.qmd 2026-02-26 11:42:59 -05:00
Vijay Janapa Reddi
809fd5ffce feat(landing): add dark/cyberpunk and minimal/brutalist variations 2026-02-26 11:37:48 -05:00
Vijay Janapa Reddi
293623e8e7 feat(landing): update modern landing page with pixel bg and animations 2026-02-26 11:33:23 -05:00
Vijay Janapa Reddi
bdf8f7decd Merge remote-tracking branch 'origin/feature/book-volumes' into feature/book-volumes 2026-02-26 08:10:51 -05:00
Zeljko Hrcek
b16f8f36cd A figure has been updated in chapter 18 2026-02-26 12:57:51 +01:00
Zeljko Hrcek
81e9c34ba7 Updated a figure in chapter 16 2026-02-26 10:19:42 +01:00
Vijay Janapa Reddi
79b7925b95 Landing site: two-volume hub with vol1/vol2 navbar, hero, cards, local covers 2026-02-25 15:08:33 -05:00
Vijay Janapa Reddi
9dbdac00a1 refactor: final Gold Standard polish across both volumes; ensure all mathematical variables render correctly and narrative is authoritatively consistent 2026-02-25 08:39:30 -05:00
Vijay Janapa Reddi
2de66f1c0f refactor: complete Gold Standard audit for core foundation chapters; unify Volume 1 and Volume 2 math; verify physical realism of hardware constants 2026-02-25 08:31:21 -05:00
Vijay Janapa Reddi
a78f5dd893 Merge pull request #1201 from harvard-edge/fix/ch15 and resolve README conflict 2026-02-25 08:27:02 -05:00
Vijay Janapa Reddi
c990d0037e Merge remote-tracking branch 'origin/fix/ch15' into feature/book-volumes 2026-02-25 07:54:56 -05:00
Vijay Janapa Reddi
aafc8f5d95 Merge remote-tracking branch 'origin/feature/book-volumes' into feature/book-volumes 2026-02-25 07:47:27 -05:00
Zeljko Hrcek
1f85111486 Updated a figure in chapter 15 2026-02-25 10:54:38 +01:00
Zeljko Hrcek
eccbd9d5d6 Updated a figure in chapter 15 2026-02-25 10:47:25 +01:00