cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-03-09 07:15:51 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	4ceb25fe83	word	2026-03-04 13:07:17 -05:00
Vijay Janapa Reddi	533cfa6e99	fix: pre-commit hooks — all 48 checks now pass - book/quarto/mlsys/__init__.py: add repo-root sys.path injection so mlsysim is importable when scripts run from book/quarto/ context - book/quarto/mlsys/{constants,formulas,formatting,hardware}.py: new compatibility shims that re-export from mlsysim.core.* and mlsysim.fmt - mlsysim/viz/__init__.py: remove try/except for dashboard import; use explicit "import from mlsysim.viz.dashboard" pattern instead - .codespell-ignore-words.txt: add "covert" (legitimate security term) - book/tools/scripts/reference_check_log.txt: delete generated artifact - Various QMD, bib, md files: auto-formatted by pre-commit hooks (trailing whitespace, bibtex-tidy, pipe table alignment)	2026-03-01 17:30:24 -05:00
Vijay Janapa Reddi	bf9c402827	Adds callout-definition blocks to all Vol.2 chapters and fixes pre-commit hook errors - Adds standardized callout-definition blocks with bold term + clear definition to all Vol.2 chapters (distributed training, inference, network fabrics, etc.) - Fixes caption_inline_python errors: replaces Python inline refs in table captions with static text in responsible_engr, appendix_fleet, appendix_reliability, compute_infrastructure - Fixes undefined_inline_ref errors: adds missing code fence for PlatformEconomics class in ops_scale.qmd; converts display math blocks with Python refs to prose - Fixes render-pattern errors: moves inline Python outside $...$ math delimiters in conclusion, fleet_orchestration, inference, introduction, network_fabrics, responsible_ai, security_privacy, sustainable_ai, distributed_training - Fixes dropcap errors: restructures drop-cap sentences in hw_acceleration and nn_architectures to not start with cross-references - Fixes unreferenced-label errors: removes @ prefix from @sec-/@tbl- refs inside Python comment strings in training, model_compression, ml_systems - Adds clientA to codespell ignore words (TikZ node label in edge_intelligence) - Updates mlsys constants, hardware, models, and test_units for Vol.2 calculations - Updates _quarto.yml and references.bib for two-volume structure	2026-03-01 10:44:33 -05:00
Vijay Janapa Reddi	83ce92624e	Editorial Corrections & Code Hardening (Volume 1) This commit refactors the underlying Python calculation cells for Chapters 1-16 to strictly enforce mathematical consistency with the narrative. Key Text/Numeric Updates (For Editorial Review): 1. Chapter 3 (Workflow) - Edge Necessity Scenario: - Change: Increased clinic patient count from 100 to 150. - Reason: With 100 patients, the calculated upload time was ~5.5 hours, which fits within the 8-hour clinic day, contradicting the chapter's conclusion that 'Edge is Mandatory.' Increasing to 150 pushes upload time to >8 hours, mathematically validating the narrative. 2. Chapter 1 (Introduction) - Model Drift Scenario: - Change: Reduced monthly accuracy drift rate from 8.0% to 0.8%. - Reason: An 8% monthly drop is a catastrophic failure that would be immediately noticed. A 0.8% drop correctly models the 'silent failure' (boiling frog) scenario described in the text. 3. Chapter 3 (Workflow) - Velocity vs Quality: - Change: Reduced 'Large Model' accuracy gain per iteration from 0.5% to 0.15%. - Reason: The original rate caused the large model to hit 99% accuracy almost instantly, invalidating the 'Velocity is a Feature' argument. The new rate correctly models diminishing returns, allowing the faster (small) model to win. 4. Chapter 15 (Responsible Engineering) - TCO Analysis: - Verification: Verified and stabilized the 3-year Total Cost of Ownership (TCO) calculations. Confirmed that Inference TCO (.5M) dominates Training TCO (8K) by ~40x, supporting the 'Efficiency as Responsibility' thesis. Technical Changes (Code Only): - Refactored all calculation cells to use the P.I.C.O. (Parameters, Invariants, Calculation, Outputs) design pattern. - Added assertion guards (Invariants) to prevent future regressions where math contradicts prose. - Fixed variable scope issues in Chapter 10 (Model Compression) and Chapter 15. - Disabled false-positive linter warnings for standard LaTeX spacing.	2026-02-10 14:59:26 -05:00
Vijay Janapa Reddi	3dbaa04ebf	fix: resolve all pre-commit hook failures across Vol 1 and Vol 2 Content fixes: - Add references for all 8 appendix_machine tables in surrounding prose - Remove cross-volume refs (@sec-distributed-training, @sec-security-privacy) and replace with self-contained prose - Fix broken cross-refs (em-dashes, @sec-data-engineering → @sec-data-engineering-ml) - Fix unreferenced equations (@eq-memory-wall, @eq-training-iron-law) - Fix nested/forbidden footnotes (hw_acceleration, introduction, dl_primer) - Fix drop cap incompatibility in conclusion.qmd - Fix codespell false positive ("trough" added to ignore list) - Add closer @tbl/@fig references near definitions across all chapters - Replace inline fmt() calls with pre-computed _str variables (dl_primer) Checker improvements: - figure_table_flow_audit.py: exclude code block lines from gap calculation, add forward-reference tolerance, broaden code block detection to all fenced blocks (tikz, etc.) - check_render_patterns.py: improve $...$ parsing with shortest-match spans, add exponent exception for {python} in ^{...}, exit 0 on warnings-only	2026-02-08 02:01:49 -05:00
Vijay Janapa Reddi	42bc152f7d	Fixes pgfplots dimension overflow in data_selection chapter - Fixes fig-amortization-comparison: scales Y-axis values from 12000 to 12 to avoid LaTeX dimension limit (~16383pt) - Fixes fig-compute-optimal-frontier: replaces problematic \fill...plot with proper \addplot[fill=...] \closedcycle for log-scale coordinates - Updates figure reference text to use @fig-selection-inequality - Adds ch_data_selection.py calculation module - Updates viz.py with new plot functions - Various chapter updates across vol1 and vol2	2026-02-02 06:20:23 -05:00
Vijay Janapa Reddi	74a6c4d760	fix: resolve pre-commit hook failures - Add 'ure' to codespell ignore list (false positive from regex) - Fix 17 broken cross-references pointing to non-existent sections - Rename 4 duplicate figure labels in vol2 (interpretability-spectrum, train-data-parallelism, model-parallelism, layers-blocks) - Remove missing bibliography reference from data_efficiency frontmatter - Add missing footnote definition (fn-containerization-orchestration) - Remove 5 unused footnote definitions - Update validate_part_keys.py to support two-volume directory structure	2026-01-24 09:35:39 -05:00
Vijay Janapa Reddi	4d3c58c537	Merge origin/dev into feat/volume-restructure Resolved conflicts: - .codespell-ignore-words.txt: combined entries from both branches - introduction.qmd: kept dev's AI Triangle framework description - ml_systems.qmd: kept dev's concise physical constraints paragraph	2026-01-11 08:38:38 -05:00
Vijay Janapa Reddi	adcbed3ed3	fix(citations): update Volume I bibliography files and add cross-references This commit includes: - Bibliography reformatting across all Volume I chapters - Updated cross-references in Vol II chapters - Added 'fpr' to codespell ignore list - Updated symlink to point to vol1 PDF config Changes span both volumes as part of ongoing volume restructure work.	2026-01-10 16:10:14 -05:00
Vijay Janapa Reddi	d43bf848c6	feat(glossary): add responsible engineering terms and update glossary tools - Add responsible_engr concepts, glossary, and quizzes files - Update serving glossary with new terms - Update build_global_glossary.py script - Add Clos, Marz, Pease to codespell ignore (proper names)	2026-01-07 13:36:55 -05:00
JEON HYUNJUN(Luciano)	f2f5f6d2b6	Add README Korean, Chinese, and Japanese (#1102) Add README translations in Chinese (zh), Japanese (ja), and Korean (ko) with language switcher links. Changes made by maintainer: - Standardized file names to ISO 639-1 codes - Fixed year target (2026 → 2030) to match main README - Added language switcher to all READMEs	2026-01-07 08:40:13 -05:00
Vijay Janapa Reddi	6ab52bff7e	chore: add ROUGE and FPR to codespell ignore list ROUGE is a legitimate ML evaluation metric (Recall-Oriented Understudy for Gisting Evaluation) used in text summarization benchmarks. FPR is False Positive Rate, a standard ML classification metric.	2026-01-06 17:27:25 -05:00
Vijay Janapa Reddi	9f6d44cffc	fix: add 'rin' to codespell ignore list (contributor name)	2026-01-05 18:58:37 -05:00
Vijay Janapa Reddi	0484c68add	fix: update codespell config to ignore false positives - Exclude minified JS files (bundle.js, *.min.js) from spelling checks - Exclude .bib files from end-of-file-fixer (bibtex-tidy handles them) - Add ignore words for: variable names (currentY, initialY), Python methods (assertIn), acronyms (SER, ALS), and valid technical terms - Skip .tex files from codespell	2025-12-14 10:26:42 -05:00

14 Commits
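
Note on commit 83ce92624e: the P.I.C.O. (Parameters, Invariants, Calculation, Outputs) pattern and its assertion guards could look roughly like the sketch below, applied to the Chapter 3 edge-necessity scenario. This is an illustration only, not the repository's actual cell. The function name `edge_necessity`, the per-patient rate of 0.055 hours (back-derived from the "~5.5 hours for 100 patients" figure in the commit message), and the assumption that upload time scales linearly with patient count are all hypothetical. The invariant is checked after the calculation here because it guards a computed value.

```python
def edge_necessity(patients: int,
                   hours_per_patient: float = 0.055,
                   clinic_day_hours: float = 8.0) -> dict:
    """Hypothetical P.I.C.O.-style calculation cell (not the book's code)."""
    # Parameters: the function arguments. 0.055 h/patient is an assumed rate,
    # back-derived from ~5.5 h for 100 patients in the commit message.

    # Calculation: assumes upload time grows linearly with patient count.
    upload_hours = patients * hours_per_patient

    # Invariant: the prose claims the upload cannot fit in a clinic day,
    # so fail loudly if the numbers ever stop supporting that claim.
    assert upload_hours > clinic_day_hours, (
        f"upload takes {upload_hours:.2f} h, which fits in a "
        f"{clinic_day_hours:.0f} h day; the narrative would be contradicted"
    )

    # Outputs: the raw value plus a pre-formatted string for inline use.
    return {"upload_hours": upload_hours,
            "upload_hours_str": f"{upload_hours:.2f}"}


result = edge_necessity(150)
```

Under these assumptions, `edge_necessity(150)` passes the guard, while `edge_necessity(100)` raises `AssertionError`, which is the kind of regression the commit message says the invariants are meant to catch. Python skips `assert` statements when run with `-O`, so a stricter variant might raise an explicit exception instead.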