Commit Graph

14 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
4ceb25fe83 word 2026-03-04 13:07:17 -05:00
Vijay Janapa Reddi
533cfa6e99 fix: pre-commit hooks — all 48 checks now pass
- book/quarto/mlsys/__init__.py: add repo-root sys.path injection so
  mlsysim is importable when scripts run from book/quarto/ context
- book/quarto/mlsys/{constants,formulas,formatting,hardware}.py: new
  compatibility shims that re-export from mlsysim.core.* and mlsysim.fmt
- mlsysim/viz/__init__.py: remove try/except for dashboard import; use
  explicit "import from mlsysim.viz.dashboard" pattern instead
- .codespell-ignore-words.txt: add "covert" (legitimate security term)
- book/tools/scripts/reference_check_log.txt: delete generated artifact
- Various QMD, bib, md files: auto-formatted by pre-commit hooks
  (trailing whitespace, bibtex-tidy, pipe table alignment)
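
The sys.path injection described above for book/quarto/mlsys/__init__.py can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: the helper name `inject_repo_root` and the rule "walk upward until a directory containing `mlsysim/` is found" are both hypothetical.

```python
import sys
from pathlib import Path
from typing import Optional


def inject_repo_root(start: Path, marker: str = "mlsysim") -> Optional[Path]:
    """Walk upward from `start` and put the first directory containing `marker` on sys.path.

    Hypothetical helper: the real __init__.py may locate the repo root differently.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            root = str(candidate)
            if root not in sys.path:
                sys.path.insert(0, root)  # front of the path so the repo copy is found first
            return candidate
    return None  # marker not found; sys.path is left untouched
```

Under these assumptions, a package `__init__.py` could call `inject_repo_root(Path(__file__).parent)` before any `import mlsysim` statement, which would make the import work regardless of the directory the script is launched from.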
2026-03-01 17:30:24 -05:00
Vijay Janapa Reddi
bf9c402827 Adds callout-definition blocks to all Vol.2 chapters and fixes pre-commit hook errors
- Adds standardized callout-definition blocks with bold term + clear definition
  to all Vol.2 chapters (distributed training, inference, network fabrics, etc.)
- Fixes caption_inline_python errors: replaces Python inline refs in table
  captions with static text in responsible_engr, appendix_fleet, appendix_reliability,
  compute_infrastructure
- Fixes undefined_inline_ref errors: adds missing code fence for PlatformEconomics
  class in ops_scale.qmd; converts display math blocks with Python refs to prose
- Fixes render-pattern errors: moves inline Python outside $...$ math delimiters
  in conclusion, fleet_orchestration, inference, introduction, network_fabrics,
  responsible_ai, security_privacy, sustainable_ai, distributed_training
- Fixes dropcap errors: restructures drop-cap sentences in hw_acceleration and
  nn_architectures to not start with cross-references
- Fixes unreferenced-label errors: removes @ prefix from @sec-/@tbl- refs inside
  Python comment strings in training, model_compression, ml_systems
- Adds clientA to codespell ignore words (TikZ node label in edge_intelligence)
- Updates mlsys constants, hardware, models, and test_units for Vol.2 calculations
- Updates _quarto.yml and references.bib for two-volume structure
2026-03-01 10:44:33 -05:00
Vijay Janapa Reddi
83ce92624e Editorial Corrections & Code Hardening (Volume 1)
This commit refactors the underlying Python calculation cells for Chapters 1-16
to strictly enforce mathematical consistency with the narrative.

**Key Text/Numeric Updates (For Editorial Review):**

1.  **Chapter 3 (Workflow) - Edge Necessity Scenario:**
    -   *Change:* Increased clinic patient count from **100** to **150**.
    -   *Reason:* With 100 patients, the calculated upload time was ~5.5 hours, which fits within the 8-hour clinic day, contradicting the chapter's conclusion that 'Edge is Mandatory.' Increasing to 150 pushes upload time to >8 hours, mathematically validating the narrative.

2.  **Chapter 1 (Introduction) - Model Drift Scenario:**
    -   *Change:* Reduced monthly accuracy drift rate from **8.0%** to **0.8%**.
    -   *Reason:* An 8% monthly drop is a catastrophic failure that would be immediately noticed. A 0.8% drop correctly models the 'silent failure' (boiling frog) scenario described in the text.

3.  **Chapter 3 (Workflow) - Velocity vs Quality:**
    -   *Change:* Reduced 'Large Model' accuracy gain per iteration from **0.5%** to **0.15%**.
    -   *Reason:* The original rate caused the large model to hit 99% accuracy almost instantly, invalidating the 'Velocity is a Feature' argument. The new rate correctly models diminishing returns, allowing the faster (small) model to win.

4.  **Chapter 15 (Responsible Engineering) - TCO Analysis:**
    -   *Verification:* Verified and stabilized the 3-year Total Cost of Ownership (TCO) calculations. Confirmed that Inference TCO (.5M) dominates Training TCO (8K) by ~40x, supporting the 'Efficiency as Responsibility' thesis.
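
The arithmetic behind item 1 can be checked with a short sketch. It assumes upload time scales linearly with patient count, which the commit message does not state. The ~5.5 hour figure for 100 patients and the 8-hour clinic day come from the message; all names are illustrative.

```python
CLINIC_DAY_HOURS = 8.0          # clinic day length, from the commit message
HOURS_PER_100_PATIENTS = 5.5    # approximate upload time for 100 patients, from the commit message


def upload_hours(patients: int) -> float:
    """Estimate daily upload time, assuming it scales linearly with patient count."""
    return HOURS_PER_100_PATIENTS * patients / 100


def fits_in_clinic_day(patients: int) -> bool:
    """True when the estimated upload finishes within one clinic day."""
    return upload_hours(patients) <= CLINIC_DAY_HOURS


for n in (100, 150):
    print(n, round(upload_hours(n), 2), fits_in_clinic_day(n))
```

Under the linear assumption, 150 patients give about 8.25 hours, which no longer fits in the 8-hour day, whereas 100 patients do fit.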

**Technical Changes (Code Only):**
-   Refactored all calculation cells to use the **P.I.C.O. (Parameters, Invariants, Calculation, Outputs)** design pattern.
-   Added assertion guards (Invariants) to prevent future regressions where math contradicts prose.
-   Fixed variable scope issues in Chapter 10 (Model Compression) and Chapter 15.
-   Disabled false-positive linter warnings for standard LaTeX spacing.
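
A calculation cell in the P.I.C.O. style might look like the sketch below. It is not taken from the book's code. The function name, the 6-month horizon, the 5-point alarm threshold, and the linear drift model are all hypothetical; only the 0.8% and 8.0% monthly rates come from the message. The invariant that depends on the result is checked after the calculation, which may differ from the ordering used in the actual cells.

```python
def drift_cell(monthly_drift_pct: float = 0.8,
               months: int = 6,
               alarm_threshold_pct: float = 5.0) -> dict:
    """Hypothetical P.I.C.O.-style cell for the 'silent failure' drift scenario."""
    # Parameters: the function arguments above are the only tunable inputs.

    # Invariants (preconditions on the parameters).
    assert 0 < monthly_drift_pct < 100, "drift rate must be a percentage"
    assert months > 0, "horizon must be positive"

    # Calculation: linear accumulation of drift (an assumption of this sketch).
    total_drift_pct = monthly_drift_pct * months

    # Invariants (narrative guard): the prose calls the failure 'silent',
    # so cumulative drift must stay below the assumed alarm threshold.
    assert total_drift_pct < alarm_threshold_pct, (
        f"{total_drift_pct:.1f}% drift would be noticed; prose says it is silent"
    )

    # Outputs: raw value plus a pre-formatted string for inline use in the text.
    return {
        "total_drift_pct": total_drift_pct,
        "total_drift_str": f"{total_drift_pct:.1f}",
    }
```

With the defaults the cell passes. Calling `drift_cell(monthly_drift_pct=8.0)` trips the narrative guard, which is the kind of regression the assertion guards are meant to catch.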
2026-02-10 14:59:26 -05:00
Vijay Janapa Reddi
3dbaa04ebf fix: resolve all pre-commit hook failures across Vol 1 and Vol 2
Content fixes:
- Add references for all 8 appendix_machine tables in surrounding prose
- Remove cross-volume refs (@sec-distributed-training, @sec-security-privacy)
  and replace with self-contained prose
- Fix broken cross-refs (em-dashes, @sec-data-engineering → @sec-data-engineering-ml)
- Fix unreferenced equations (@eq-memory-wall, @eq-training-iron-law)
- Fix nested/forbidden footnotes (hw_acceleration, introduction, dl_primer)
- Fix drop cap incompatibility in conclusion.qmd
- Fix codespell false positive ("trough" added to ignore list)
- Add closer @tbl/@fig references near definitions across all chapters
- Replace inline fmt() calls with pre-computed _str variables (dl_primer)

Checker improvements:
- figure_table_flow_audit.py: exclude code block lines from gap calculation,
  add forward-reference tolerance, broaden code block detection to all fenced
  blocks (tikz, etc.)
- check_render_patterns.py: improve $...$ parsing with shortest-match spans,
  add exponent exception for {python} in ^{...}, exit 0 on warnings-only
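
The shortest-match parsing and the exponent exception could be sketched as below. This is an assumed reconstruction, not the contents of check_render_patterns.py: the regexes, the function name `find_python_in_math`, and the choice to treat each line independently are all illustrative.

```python
import re

# Non-greedy body so each $...$ pair is matched separately (shortest-match spans).
MATH_SPAN = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$")
# Quarto-style inline Python reference, e.g. `{python} value_str`.
INLINE_PY = re.compile(r"`\{python\}[^`]*`")
# Inline Python used as an exponent, e.g. ^{`{python} exp_str`}, which is tolerated.
EXPONENT_PY = re.compile(r"\^\{\s*`\{python\}[^`]*`\s*\}")


def find_python_in_math(line: str) -> list:
    """Return the $...$ spans on `line` that contain inline Python outside an exponent."""
    hits = []
    for match in MATH_SPAN.finditer(line):
        body = EXPONENT_PY.sub("", match.group(1))  # drop the allowed exponent form
        if INLINE_PY.search(body):
            hits.append(match.group(0))
    return hits


print(find_python_in_math("Cost is $x = `{python} a_str`$ and $y$ here"))
print(find_python_in_math("Scale $10^{`{python} exp_str`}$"))
```

A greedy pattern would treat everything from the first `$` to the last `$` on the line as one span. The non-greedy form keeps `$y$` separate, so only the first span is reported. A sketch like this would still misfire on lines that use `$` as a currency sign, which a real checker would need to handle.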
2026-02-08 02:01:49 -05:00
Vijay Janapa Reddi
42bc152f7d Fixes pgfplots dimension overflow in data_selection chapter
- Fixes fig-amortization-comparison: scales Y-axis values from 12000 to 12
  to avoid LaTeX dimension limit (~16383pt)
- Fixes fig-compute-optimal-frontier: replaces problematic \fill...plot
  with proper \addplot[fill=...] \closedcycle for log-scale coordinates
- Updates figure reference text to use @fig-selection-inequality
- Adds ch_data_selection.py calculation module
- Updates viz.py with new plot functions
- Various chapter updates across vol1 and vol2
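
The Y-axis rescaling can be illustrated with a small pre-processing sketch. It assumes the data is prepared in Python before being written into the plot. The `headroom` and `step` parameters are hypothetical safety margins, not pgfplots options, and the limit constant is the approximate value quoted above.

```python
TEX_DIMEN_LIMIT_PT = 16383.0   # approximate TeX dimension limit cited in the commit message


def rescale_for_plot(values, headroom=0.5, step=1000.0):
    """Divide values by `step` until the largest magnitude is below headroom * limit.

    Returns (scaled_values, divisor). `headroom` and `step` are illustrative
    assumptions that leave room for intermediate arithmetic inside the plot.
    """
    if not 0 < headroom <= 1 or step <= 1:
        raise ValueError("headroom must be in (0, 1] and step must exceed 1")
    divisor = 1.0
    peak = max((abs(v) for v in values), default=0.0)
    while peak / divisor > headroom * TEX_DIMEN_LIMIT_PT:
        divisor *= step
    return [v / divisor for v in values], divisor


scaled, divisor = rescale_for_plot([3000, 12000])
print(scaled, divisor)
```

With these assumed margins, a peak of 12000 is divided by 1000 and plotted as 12. The axis label would then need to state the unit (for example, thousands).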
2026-02-02 06:20:23 -05:00
Vijay Janapa Reddi
74a6c4d760 fix: resolve pre-commit hook failures
- Add 'ure' to codespell ignore list (false positive from regex)
- Fix 17 broken cross-references pointing to non-existent sections
- Rename 4 duplicate figure labels in vol2 (interpretability-spectrum,
  train-data-parallelism, model-parallelism, layers-blocks)
- Remove missing bibliography reference from data_efficiency frontmatter
- Add missing footnote definition (fn-containerization-orchestration)
- Remove 5 unused footnote definitions
- Update validate_part_keys.py to support two-volume directory structure
2026-01-24 09:35:39 -05:00
Vijay Janapa Reddi
4d3c58c537 Merge origin/dev into feat/volume-restructure
Resolved conflicts:
- .codespell-ignore-words.txt: combined entries from both branches
- introduction.qmd: kept dev's AI Triangle framework description
- ml_systems.qmd: kept dev's concise physical constraints paragraph
2026-01-11 08:38:38 -05:00
Vijay Janapa Reddi
adcbed3ed3 fix(citations): update Volume I bibliography files and add cross-references
This commit includes:
- Bibliography reformatting across all Volume I chapters
- Updated cross-references in Vol II chapters
- Added 'fpr' to codespell ignore list
- Updated symlink to point to vol1 PDF config

Changes span both volumes as part of ongoing volume restructure work.
2026-01-10 16:10:14 -05:00
Vijay Janapa Reddi
d43bf848c6 feat(glossary): add responsible engineering terms and update glossary tools
- Add responsible_engr concepts, glossary, and quizzes files
- Update serving glossary with new terms
- Update build_global_glossary.py script
- Add Clos, Marz, Pease to codespell ignore (proper names)
2026-01-07 13:36:55 -05:00
JEON HYUNJUN(Luciano)
f2f5f6d2b6 Add Korean, Chinese, and Japanese README translations (#1102)
Add README translations in Chinese (zh), Japanese (ja), and Korean (ko) with language switcher links.

Changes made by maintainer:
- Standardized file names to ISO 639-1 codes
- Fixed year target (2026 → 2030) to match main README
- Added language switcher to all READMEs
2026-01-07 08:40:13 -05:00
Vijay Janapa Reddi
6ab52bff7e chore: add ROUGE and FPR to codespell ignore list
ROUGE is a legitimate ML evaluation metric (Recall-Oriented Understudy
for Gisting Evaluation) used in text summarization benchmarks.

FPR is False Positive Rate, a standard ML classification metric.
2026-01-06 17:27:25 -05:00
Vijay Janapa Reddi
9f6d44cffc fix: add 'rin' to codespell ignore list (contributor name) 2026-01-05 18:58:37 -05:00
Vijay Janapa Reddi
0484c68add fix: update codespell config to ignore false positives
- Exclude minified JS files (bundle.js, *.min.js) from spelling checks
- Exclude .bib files from end-of-file-fixer (bibtex-tidy handles them)
- Add ignore words for: variable names (currentY, initialY), Python
  methods (assertIn), acronyms (SER, ALS), and valid technical terms
- Skip .tex files from codespell
2025-12-14 10:26:42 -05:00