cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-04-29 00:59:07 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	4ae406160d	feat: add Quarto equation labels and cross-references across Vol 1 Add proper equation labels ({#eq-...}) and prose references (@eq-...) to 138 equations across 15 Volume 1 chapters following the gold-standard pattern from serving.qmd. Key changes: - Label all display math equations with {#eq-kebab-case-name} - Add @eq-name references in prose before each equation - Equations include: Iron Law, Amdahl's Law, Roofline Model, activation functions, backpropagation, attention mechanisms, queuing theory, quantization, and system throughput formulas Also includes: - PDF formatting improvements (newpage directives for Vol 2) - LaTeX header updates for chapter styling - Pre-commit config and validation script updates	2026-02-07 09:40:01 -05:00
Vijay Janapa Reddi	8f7cbbd58e	feat: add Harris & Harris-style chapter opening design - Add large decorative chapter number in upper-right corner using TikZ - Remove redundant "Chapter X" prefix (number serves this purpose) - Rewrite dropcap filter to process elements in document order (fixes bug where filter processed all headers before any paragraphs) - Add PDF-conditional page break before Learning Objectives in intro - Adjust section spacing for tighter layout Design inspired by Harris & Harris "Digital Design and Computer Architecture"	2026-02-06 16:38:14 -05:00
Vijay Janapa Reddi	7ef35ebc1c	refactor: standardize Python compute cells across Volume 1 - Move 38 Python cells from inside callouts to before callouts - Add header box formatting to all 268 compute cells (100% compliance) - Fix unescaped dollar signs for currency values - Fix inline Python inside LaTeX math blocks - Update validator to exclude _str variables from false positives Chapters updated: serving, training, data_engineering, ml_systems, data_selection, dl_primer, benchmarking, dnn_architectures, hw_acceleration, model_compression, conclusion, frameworks, introduction, ops, workflow, responsible_engr, appendix_* Validation: 3,708 inline refs, 0 errors, 0 warnings	2026-02-06 16:06:30 -05:00
Vijay Janapa Reddi	44a61a0ab1	fix: resolve duplicate cell label 'foundation-cost-calc' in data_selection Rename to 'foundation-amortization-data' to avoid collision with the existing 'foundation-cost-calc' cell earlier in the chapter.	2026-02-06 10:02:09 -05:00
Vijay Janapa Reddi	0343a8a536	fix: resolve pre-commit errors (footnote, label, formatting) - ml_systems: move [^fn-dgx-spark-edge] footnote out of table cell into the table caption text - data_selection: rename fig-foundation-cost-data to foundation-cost-calc (computation cell, not a figure) - Auto-formatter fixes: collapse blank lines, prettify pipe tables	2026-02-06 10:00:45 -05:00
Vijay Janapa Reddi	3d54da6305	fix: resolve inline Python build errors across Vol 1 chapters Fix NameError build failures in ml_systems, data_engineering, and benchmarking chapters caused by missing imports and variables referenced before their defining code cells. - ml_systems: add missing Kparam and Bparam imports from physx.constants - data_engineering: compute transfer_time_10g_md preview in setup cell, add md_math import, add deduplication-dividend-calc cell, convert hardcoded values to physics engine units - benchmarking: compute BERT roofline preview values in roofline-example-calc cell before they are referenced in narrative text, convert hardcoded values to inline Python, condense redundant footnotes Also includes physics engine integration improvements across all Vol 1 chapters: unit-safe conversions, inline Python for previously hardcoded values, streamlined footnotes with cross-references, and new content validation scripts. All 21 Vol 1 chapters pass PDF build tests.	2026-02-06 09:57:25 -05:00
Vijay Janapa Reddi	1d19aa676b	refactor: remove conceptual redundancies across Volume 1 chapters Systematic redundancy removal for MIT Press submission. Applied two-phase editorial process: (1) identified all conceptual repetition and near-duplication across main text, examples, and callouts; (2) executed targeted edits to eliminate redundant content while preserving tone and structure. Files modified (22 chapters): - Frontmatter: about, acknowledgements, notation - Part I: introduction, ml_systems, workflow, data_engineering - Part II: dl_primer, dnn_architectures, frameworks, training - Part III: data_selection, hw_acceleration, benchmarking - Part IV: serving, ops, responsible_engr, conclusion - Appendices: appendix_dam, appendix_machine, appendix_algorithm, appendix_data Net reduction: ~72 lines of redundant content removed	2026-02-06 06:49:25 -05:00
Vijay Janapa Reddi	0e17889b20	fix: add caption and alt-text to composite figure in socratiq.qmd The fig-quizzes composite figure was missing required fig-cap and fig-alt attributes. Added descriptive caption and accessibility text.	2026-02-06 06:10:48 -05:00
Vijay Janapa Reddi	56657d8152	fix: move footnotes out of forbidden locations (callouts, tables) - data_engineering.qmd: Move pricing footnote after callout block - data_engineering.qmd: Convert SATA footnote to inline text - dl_primer.qmd: Move GPT-4 estimate note to table caption - introduction.qmd: Move Box quote after callout, remove unused fn-algorithm All footnotes now follow Quarto rendering rules.	2026-02-06 06:07:08 -05:00
Vijay Janapa Reddi	e942b552ba	fix: resolve cross-reference issues and add missing table/figure refs - Update check_unreferenced_labels.py to detect YAML id: frontmatter - Add references to all unreferenced tables and listings in Vol1 - Scope unreferenced labels hook to Vol1 only (Vol2 has WIP chapters) - Fix inline Python in LaTeX math blocks across multiple chapters - Update test_units.py to use Dense (not Sparse) H100 FLOPS values - Update validate_inline_refs.py regex to ignore escaped dollar signs Key files fixed: - appendix_algorithm.qmd: @tbl-tensor-op-ref, @fig-broadcasting-rules - appendix_data.qmd: @tbl-data-gravity, @tbl-serialization-cost - appendix_dam.qmd: @tbl-dam-overlap, @tbl-bottleneck-actions, etc. - appendix_machine.qmd: @tbl-latency-hierarchy, @tbl-hardware-cheatsheet - frameworks.qmd: @lst-gradient-accumulation, @lst-custom-autograd-function - dnn_architectures.qmd: @lst-conv_layer_spatial	2026-02-06 06:03:19 -05:00
Vijay Janapa Reddi	962427ffa2	refactor: continue Physics Engine integration across Volume 1 - Update appendix files with dynamic variable references - Consolidate references.bib entries - Apply inline Python patterns to remaining chapters - Fix notation and formatting consistency	2026-02-06 05:18:43 -05:00
Vijay Janapa Reddi	23d76ac82e	Refactor Volume 1 to use dynamic variables from Physics Engine - Audited all .qmd files in Volume 1 to identify hardcoded numerical constants. - Replaced hardcoded numbers with dynamic Python variables derived from `physx/constants.py`. - Updated `physx/constants.py` with missing constants (e.g., battery specs, dataset sizes). - Created new Python calculation blocks in chapters to derive local metrics (e.g., energy per inference, training costs) from global constants. - Ensured mathematical consistency across chapters by linking all values to a single source of truth. - Fixed a citation in references.bib. This ensures that future updates to core constants (e.g., hardware specs) will automatically propagate throughout the text.	2026-02-06 04:59:21 -05:00
Vijay Janapa Reddi	e5d9dc06e1	refactor: replace remaining hardcoded byte sizes with constants Additional locations updated to use BYTES_FP32, BYTES_FP16, and ALLREDUCE_FACTOR from physx/constants.py instead of hardcoded values. Files updated: - appendix_algorithm.qmd: bytes_per_fp32 → BYTES_FP32.magnitude - dl_primer.qmd: bytes_per_param for MNIST → BYTES_FP32.magnitude - hw_acceleration.qmd: bytes_per_float for tensor calc → BYTES_FP32.magnitude - serving.qmd: bytes_per_param for KV cache → BYTES_FP16.magnitude - training.qmd: bytes_per_param_fp16 → BYTES_FP16.magnitude	2026-02-06 03:30:01 -05:00
Vijay Janapa Reddi	2f9899153c	style: pre-commit fixes and inline Python improvements - Fix LaTeX equations in appendix_dam using md() for proper rendering - Bibtex tidy reformatting of references.bib - Table alignment fixes across multiple chapters - Minor formatting cleanup from pre-commit hooks	2026-02-06 03:28:13 -05:00
Vijay Janapa Reddi	8ce4e20549	refactor: use global constants for byte sizes and model parameters Replace hardcoded byte sizes (2 for FP16, 4 for FP32) and model parameters with global constants from physx/constants.py for consistency. Changes: - Add model_memory() helper to physx/formulas.py for standardized memory calculations - Replace manual memory calculations with model_memory(params, bytes_per_param, unit) - Use BYTES_FP16, BYTES_FP32 constants instead of hardcoded 2/4 values - Use GPT2_PARAMS, GPT3_PARAMS constants instead of local 1.5e9/175e9 values Files updated: hw_acceleration, dnn_architectures, training, data_engineering, dl_primer, frameworks	2026-02-06 03:26:33 -05:00
Vijay Janapa Reddi	184fdf34b8	Fix and verify bibliography references Comprehensive verification and cleanup of references.bib: - Verified and updated 38 entries with missing URLs/ISBNs/DOIs - Fixed critical error in vaswani2017attention (Transformer paper) - Had incorrect DOI from "Shenzhen Medical Academy" dated 2025 - Corrected to proper 2017 NeurIPS publication with arXiv URL - Removed 4 fabricated/unverifiable references - Chowdhery2021 (fake Edge TPU paper, not cited) - Cheng2022 (fake memory-efficient DL survey) - huang2023adaptive (fake autonomous driving paper) - yu2023efficient (fake early exit paper) - Added 2 verified replacement references - chen2024eellm (EE-LLM: ICML 2024) - seo2023neuroflow (NeuroFlow: arXiv 2023) - Updated citations in model_compression.qmd to use verified sources Key papers verified: GPT-3, BERT, Transformer, InstructGPT, Switch Transformers, Vision Transformer, CLIP, DALL-E, ResNet, SimCLR, AlpaServe, Ansor, GShard, Clockwork, DeepSpeed, TensorFlow Lite Micro, MLPerf Mobile, Edge Impulse, and many more. Results: 759 entries (down from 760), 92.5% with verification metadata, all critical errors and fabrications eliminated.	2026-02-06 02:17:31 -05:00
Vijay Janapa Reddi	75bb63d9e3	style: standardize compute cell headers with PURPOSE/INPUT/PROCESS/OUTPUT Apply consistent header format to setup cells in appendix_machine.qmd and appendix_dam.qmd. All compute cells now follow the same structured pattern used throughout ml_systems.qmd and other chapters.	2026-02-06 01:33:49 -05:00
Vijay Janapa Reddi	c1a3d08284	refactor: replace hardcoded arithmetic with computed inline refs across Vol1 Convert magic numbers and hardcoded calculations to Python-computed inline references following the Computed Arithmetic Rule. Changes span appendices (D·A·M, Machine Foundations), all main chapters, and glossary. Key improvements: - Amdahl's/Gustafson's Law examples now compute all derived values - Training time formula example uses computed days/minutes - Little's Law example computes concurrent requests from QPS×latency - Bandwidth-latency example parameterizes link speed and ping - Glossary consolidates forward pass/forward propagation entries - Add audit_narrative.py script for prose validation	2026-02-06 01:32:17 -05:00
Vijay Janapa Reddi	2cee4e9b81	Add lead-in sentence for Statistics of Representation callout Add contextual lead-in before the notebook callout to maintain consistency with other chapters' callout patterns.	2026-02-06 00:55:31 -05:00
Vijay Janapa Reddi	40e71b54c7	Improve figure reference narrative and fix factual inaccuracies across Vol1 Improve how all ~260 figure references flow in the prose across all 16 chapters and appendices. Replace generic verbs (illustrates, shows, depicts) with directive, student-engaging language that tells readers what to observe and why. Also fix 16 factual inaccuracies found during verification audit: - introduction: correct compute growth from "five" to "eight" orders of magnitude - frameworks: fix inverted slope descriptions and crossover magnitudes in compilation continuum; correct "embedded targets" to "language bindings" - data_engineering: remove fabricated "feature engineering" stage from TFX pipeline; remove unverifiable animal species names from hard labels - benchmarking: correct power units from "microwatts/megawatts" to "milliwatts/hundreds of kilowatts" - responsible_engr: correct governance pillar labels to match figure caption - ml_systems: fix cloud ML examples, mobile ML characteristics, and hybrid sync description to match actual figure content - training: correct LLM scaling curve attribution; fix node color description - hw_acceleration: fix tiling diagram description - model_compression: fix quantization error distribution description - dnn_architectures: fix im2col kernel size; fix attention visualization	2026-02-06 00:08:02 -05:00
Vijay Janapa Reddi	41ef0cacdb	wip: isolate hw_acceleration chapter and fix missing imports - Comment out all chapters except hw_acceleration in PDF config for focused testing - Add missing physx.constants imports to ml_systems TCO calculation block - Update figure manifest to reflect single-chapter build	2026-02-05 22:18:03 -05:00
Vijay Janapa Reddi	17c2de646d	docs: apply stashed prose improvements and tighten figure references	2026-02-05 21:17:56 -05:00
Vijay Janapa Reddi	3f12a6555e	refactor: rename DAM Taxonomy to D·A·M taxonomy and standardize terminology	2026-02-05 21:09:52 -05:00
Vijay Janapa Reddi	605c48737d	Docs: clarify Vol1 figure context Tighten surrounding narrative for key figures to note units and illustrative assumptions without changing the book's structure or tone.	2026-02-05 15:44:53 -05:00
Vijay Janapa Reddi	83f05a51b5	feat(pdf): update layout to MIT Press 8x10 specifications Per MIT Press production feedback (Feb 2026): - Change paper size from 7x10 to 8x10 inches - Set 1/2" top margin to header - Set 5/8" bottom margin - Set 7/8" gutter (inner margin) - Move page numbers to outside edge (standard book convention) - Change PDF layout from TwoPageRight to SinglePage for preflight Also adds copyedit configs for double-spaced PDFs: - _quarto-pdf-vol1-copyedit.yml - _quarto-pdf-vol2-copyedit.yml	2026-02-05 14:28:36 -05:00
Vijay Janapa Reddi	20b54a774e	chore: update volume configs and frontmatter assets Remove legacy _quarto.yml and figure index, adjust volume config files, refresh acknowledgements/references, and add theme and epub assets.	2026-02-04 17:42:17 -05:00
Vijay Janapa Reddi	0fba57e1b0	docs: annotate egress pricing baseline Document AWS 2024 egress pricing as the baseline and note it in the data gravity callout.	2026-02-04 17:41:07 -05:00
Vijay Janapa Reddi	354bbeee31	refactor: standardize vol1 constants and conversions Route canonical time, precision, pricing, and reference values through physx. Update vol1 QMDs to use shared constants and conversion factors.	2026-02-04 17:32:43 -05:00
Vijay Janapa Reddi	563061a0aa	refactor: centralize canonical constants in physx Move energy, network, AlexNet, and carbon baselines into physx constants. Wire vol1 QMDs to consume those constants for consistent formatting.	2026-02-04 17:21:11 -05:00
Vijay Janapa Reddi	19fb2fba78	fix: use accelerator-first terminology in purpose sections Purpose sections are abstract by design—they teach principles, not specific hardware. Replace GPU/TPU references with "accelerators" in the three Vol 1 purpose sections that named specific hardware (serving, hw_acceleration, dl_primer).	2026-02-04 17:20:10 -05:00
Vijay Janapa Reddi	47bd285d29	fix: clarify carbon conversion and derive low-util energy Make the CO2 conversion formula explicit about the hour term. Compute low-utilization joules/token from idle power and throughput.	2026-02-04 16:38:22 -05:00
Vijay Janapa Reddi	668cc25030	refactor: inline QMD plots and slim viz helpers Move remaining plot logic into QMD blocks and keep physx/viz styling-only. Update preview scripts to use local plot code.	2026-02-04 16:34:31 -05:00
Vijay Janapa Reddi	ab9d9b49a5	feat: Add volume-specific theming system - Vol1: Harvard Crimson (#A51C30) - Vol2: ETH Zurich Blue (#1F407A) Architecture: - themes/_theme-harvard.scss, _theme-eth.scss: Color variables - _base-styles.scss, _dark-mode-base.scss: Shared styles using $accent - style-vol1/2.scss, dark-mode-vol1/2.scss: Entry points per volume Each volume now has its own distinct visual identity while sharing the same underlying style rules.	2026-02-04 15:48:52 -05:00
Vijay Janapa Reddi	e236277925	Move shelved AutoML section from vol1 to vol2 optimization AutoML content is better suited for Volume II's optimization chapter (distributed-scale model search). Moved from vol1/optimizations/ to vol2/optimization/ to keep it accessible for future integration.	2026-02-04 15:16:47 -05:00
Vijay Janapa Reddi	29fabf35c1	Fix figure list to handle appendix figures and exclude shelved files Two issues: 1. LaTeX parser regex only matched numeric figure numbers (e.g., 1.1) but appendices use letter prefixes (B.1, C.2, D.1). Changed \d+ to [A-Z\d]+ so all 214 figures are captured. 2. --scan-all mode picked up _shelved QMD files that aren't in the actual build, causing a count mismatch. Added _shelved to skip list.	2026-02-04 15:14:36 -05:00
Vijay Janapa Reddi	ac3c9ab2e5	Fix figure list regex to handle LaTeX braces and apostrophes Three regex bugs caused missing/truncated captions in the figure list: 1. div_pattern broke on LaTeX {} (e.g., $W_{hh}$, \index{...}) — fixed with greedy .* anchored to end-of-line 2. Caption/alt regex [^"']+ truncated at apostrophes (e.g., Moore's) — fixed by matching double-quote delimiters only: "([^"]*)" 3. Duplicate figures when ::: div wraps a code block — added dedup logic Fixes applied to both generate_figure_list.py and figure_list_for_press.py. Regenerated FIGURE_LIST_VOL1.csv: 182 figures, 0 empty captions. mit-submission-v1	2026-02-04 15:03:17 -05:00
Vijay Janapa Reddi	f0edc97e0d	remove vol 1 title	2026-02-04 08:19:32 -05:00
Vijay Janapa Reddi	8fb27cc973	mit release (after fig alt issue fix)	2026-02-04 08:14:34 -05:00
Vijay Janapa Reddi	c63e1429f2	figure listing	2026-02-04 07:25:56 -05:00
Vijay Janapa Reddi	765896b90d	fix principle references	2026-02-04 07:25:42 -05:00
Vijay Janapa Reddi	ace5f2f673	Fix malformed equation in serving.qmd Convert plain text equation to proper LaTeX math block with label for @eq-precision-throughput cross-reference to work.	2026-02-04 02:32:52 -05:00
Vijay Janapa Reddi	7f0e31bfb4	Fix fenced div and footnote warnings - Fix malformed div in networking.qmd: :::.column-margin -> ::: {.column-margin} - Add missing footnote reference [^fn-box-model] in introduction.qmd	2026-02-04 02:25:22 -05:00
Vijay Janapa Reddi	9d0eb24fa3	Enable all chapters in PDF vol1 config for full book builds Uncomment all frontmatter, chapters, and appendices so future builds include the complete book with all figure numbers.	2026-02-04 02:21:19 -05:00
Vijay Janapa Reddi	1a36108b49	Consolidate figure list scripts into single file with --clear flag - Merge clear_figure_cache.py into generate_figure_list.py - Pre-render: generate_figure_list.py --clear - Post-render: generate_figure_list.py - Single file easier to maintain	2026-02-04 02:17:56 -05:00
Vijay Janapa Reddi	a702f879ae	Add automatic figure list generation for MIT Press - Add pre-render hook to clear stale LaTeX data between builds - Add post-render hook to generate FIGURE_LIST.txt in output dir - LaTeX captures figure numbers and pages during compilation - Use deferred write for accurate page numbers (after float placement) - Python merges with QMD captions and alt-text - Output automatically appears in _build/pdf-vol1/ after each build	2026-02-04 02:13:16 -05:00
Vijay Janapa Reddi	6c5ffae4cb	stale	2026-02-04 01:23:41 -05:00
Vijay Janapa Reddi	9e857f318d	fixes	2026-02-04 01:18:33 -05:00
Vijay Janapa Reddi	d29965a0c3	Fix: move #| directives before imports in all code blocks Quarto requires #| directives to be at the start of code blocks. Fixed 93+ code blocks across 15 files where imports came before the echo: false directive, causing code to be visible in PDFs.	2026-02-04 00:36:33 -05:00
Vijay Janapa Reddi	8094efe659	Simplify plotting code: project-wide PYTHONPATH + viz returns plt - Added PYTHONPATH='.' to quarto execute config - Modified viz.setup_plot() to return (fig, ax, COLORS, plt) - Cleaned up all plotting cells to use simple imports - No more sys.path manipulation needed in individual cells	2026-02-04 00:30:16 -05:00
Vijay Janapa Reddi	0ef4842d91	Fix: use '.' for sys.path to import physx module sys.path.insert(0, '.') adds the project root to Python's module search path, allowing 'from physx import viz' to find the physx package.	2026-02-04 00:26:27 -05:00
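
The two figure-list regex fixes described in commits 29fabf35c1 and ac3c9ab2e5 can be illustrated with a minimal sketch. The log states only the changed fragments (`\d+` to `[A-Z\d]+`, and `[^"']+` to `"([^"]*)"`). The full patterns, the `fig-cap=` prefix, the sample inputs, and all names below are assumptions for illustration, not the code in generate_figure_list.py.

```python
import re

# Hypothetical figure-number patterns; only the \d+ -> [A-Z\d]+ change
# is taken from the commit message.
NUMERIC_ONLY = re.compile(r"^(\d+)\.(\d+)$")
WITH_APPENDIX = re.compile(r"^([A-Z\d]+)\.(\d+)$")


def matched_labels(pattern, labels):
    """Return the figure labels that the given pattern accepts."""
    return [label for label in labels if pattern.match(label)]


labels = ["1.1", "12.3", "B.1", "C.2", "D.1"]
numeric_hits = matched_labels(NUMERIC_ONLY, labels)   # appendix labels are dropped
all_hits = matched_labels(WITH_APPENDIX, labels)      # appendix labels are kept

# Hypothetical caption patterns; only the character-class change is
# taken from the commit message.
OLD_CAPTION = re.compile(r'''fig-cap=["']([^"']+)''')  # stops at any quote or apostrophe
NEW_CAPTION = re.compile(r'fig-cap="([^"]*)"')         # double-quote delimiters only

attr = 'fig-cap="Moore\'s Law scaling"'
old_caption = OLD_CAPTION.search(attr).group(1)
new_caption = NEW_CAPTION.search(attr).group(1)

print(numeric_hits, all_hits)
print(repr(old_caption), repr(new_caption))
```

Under these assumptions, the numeric-only pattern skips the appendix-style labels, and the older caption pattern cuts the caption at the apostrophe in "Moore's", which matches the truncation the commit describes.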


9937 Commits