mirror of https://github.com/harvard-edge/cs249r_book.git
synced 2026-03-11 17:49:25 -05:00

Vol1: improve book abstracts and chapter content

- Config: academic, standalone abstracts for PDF/EPUB/copyedit
- Chapters: ml_systems, nn_architectures, nn_computation, training
@@ -45,7 +45,7 @@ book:
       orcid: "0000-0002-5259-7721"

   abstract: |
-    Volume I of Machine Learning Systems presents a comprehensive introduction to understanding and engineering machine learning systems, with a focus on mastering the machine learning node. This volume establishes the foundations through four progressive stages: building conceptual models, engineering complete workflows, optimizing for real-world constraints, and deploying with confidence.
+    Foundations of machine learning systems engineering for single-machine deployment. ML systems are treated as infrastructure governed by physical constraints: data movement, memory bandwidth, and compute limits shape design decisions from model architecture to deployment target. The treatment progresses through conceptual models, end-to-end workflows, optimization under operational constraints, and production deployment. Quantitative reasoning and enduring principles are emphasized over transient tools. Suitable for undergraduate and graduate courses in computer science and engineering, and for practitioners designing flexible, efficient, and robust ML systems.

   repo-url: https://github.com/harvard-edge/cs249r_book
@@ -53,7 +53,7 @@ book:
     left: |
       Written, edited and curated by Prof. Vijay Janapa Reddi (Harvard University)
     right: |
-      This book was built with <a href="https://quarto.org/">Quarto</a>.
+      Built with <a href="https://quarto.org/">Quarto</a>.

   chapters:
     - index.qmd
@@ -70,7 +70,7 @@ website:
       text: "Volume I: Introduction"
       href: ./
     - icon: journal
-      text: "Volume II: Advanced"
+      text: "Volume II: At Scale"
       href: ../vol2/
     - text: "---"
     - icon: fire
@@ -35,7 +35,7 @@ book:
       roles: "Author, editor and curator."

   abstract: |
-    Volume I of Machine Learning Systems presents a comprehensive introduction to understanding and engineering machine learning systems, with a focus on mastering the machine learning node. This volume establishes the foundations through four progressive stages: building conceptual models, engineering complete workflows, optimizing for real-world constraints, and deploying with confidence. The content bridges theoretical foundations and practical engineering, emphasizing the systems context that engineers need to master when building AI solutions. While ML applications and tools evolve rapidly, the engineering principles for building ML systems remain largely consistent. This volume distills these enduring concepts for anyone seeking to build flexible, efficient, and robust ML systems.
+    Foundations of machine learning systems engineering for single-machine deployment. ML systems are treated as infrastructure governed by physical constraints: data movement, memory bandwidth, and compute limits shape design decisions from model architecture to deployment target. The treatment progresses through conceptual models, end-to-end workflows, optimization under operational constraints, and production deployment. Quantitative reasoning and enduring principles are emphasized over transient tools. Suitable for undergraduate and graduate courses in computer science and engineering, and for practitioners designing flexible, efficient, and robust ML systems.

   repo-url: https://github.com/harvard-edge/cs249r_book
@@ -43,7 +43,7 @@ book:
     left: |
       Written, edited and curated by Prof. Vijay Janapa Reddi (Harvard University)
     right: |
-      This book was built with <a href="https://quarto.org/">Quarto</a>.
+      Built with <a href="https://quarto.org/">Quarto</a>.

   chapters:
     - index.qmd
@@ -34,7 +34,7 @@ book:
       roles: "Author, editor and curator."

   abstract: |
-    This book presents a comprehensive introduction to understanding and engineering machine learning systems, with a focus on mastering the machine learning node. It establishes the foundations through four progressive stages: building conceptual models, engineering complete workflows, optimizing for real-world constraints, and deploying with confidence. The content bridges theoretical foundations and practical engineering, emphasizing the systems context that engineers need to master when building AI solutions. While ML applications and tools evolve rapidly, the engineering principles for building ML systems remain largely consistent. This book distills these enduring concepts for anyone seeking to build flexible, efficient, and robust ML systems.
+    Foundations of machine learning systems engineering for single-machine deployment. ML systems are treated as infrastructure governed by physical constraints: data movement, memory bandwidth, and compute limits shape design decisions from model architecture to deployment target. The treatment progresses through conceptual models, end-to-end workflows, optimization under operational constraints, and production deployment. Quantitative reasoning and enduring principles are emphasized over transient tools. Suitable for undergraduate and graduate courses in computer science and engineering, and for practitioners designing flexible, efficient, and robust ML systems.

   repo-url: https://github.com/harvard-edge/cs249r_book
@@ -42,7 +42,7 @@ book:
     left: |
       Written, edited and curated by Prof. Vijay Janapa Reddi (Harvard University)
     right: |
-      This book was built with <a href="https://quarto.org/">Quarto</a>.
+      Built with <a href="https://quarto.org/">Quarto</a>.

   chapters:
     - index.qmd
@@ -50,58 +50,58 @@ book:
   # ==================================================
   # Volume I Frontmatter
   # ==================================================
-  - contents/vol1/frontmatter/dedication.qmd
-  - contents/vol1/frontmatter/foreword.qmd
-  - contents/vol1/frontmatter/about.qmd
-  - contents/vol1/frontmatter/acknowledgements.qmd
-  - contents/vol1/frontmatter/notation.qmd
+  # - contents/vol1/frontmatter/dedication.qmd
+  # - contents/vol1/frontmatter/foreword.qmd
+  # - contents/vol1/frontmatter/about.qmd
+  # - contents/vol1/frontmatter/acknowledgements.qmd
+  # - contents/vol1/frontmatter/notation.qmd

   # ==================================================
   # Part I: Foundations
   # ==================================================
-  - contents/vol1/parts/foundations_principles.qmd
-  - contents/vol1/introduction/introduction.qmd
-  - contents/vol1/ml_systems/ml_systems.qmd
-  - contents/vol1/ml_workflow/ml_workflow.qmd
-  - contents/vol1/data_engineering/data_engineering.qmd
+  # - contents/vol1/parts/foundations_principles.qmd
+  # - contents/vol1/introduction/introduction.qmd
+  # - contents/vol1/ml_systems/ml_systems.qmd
+  # - contents/vol1/ml_workflow/ml_workflow.qmd
+  # - contents/vol1/data_engineering/data_engineering.qmd

   # ==================================================
   # Part II: Build
   # ==================================================
-  - contents/vol1/parts/build_principles.qmd
-  - contents/vol1/nn_computation/nn_computation.qmd
-  - contents/vol1/nn_architectures/nn_architectures.qmd
-  - contents/vol1/frameworks/frameworks.qmd
-  - contents/vol1/training/training.qmd
+  # - contents/vol1/parts/build_principles.qmd
+  # - contents/vol1/nn_computation/nn_computation.qmd
+  # - contents/vol1/nn_architectures/nn_architectures.qmd
+  # - contents/vol1/frameworks/frameworks.qmd
+  # - contents/vol1/training/training.qmd

   # ==================================================
   # Part III: Optimize
   # ==================================================
-  - contents/vol1/parts/optimize_principles.qmd
-  - contents/vol1/data_selection/data_selection.qmd
-  - contents/vol1/optimizations/model_compression.qmd
-  - contents/vol1/hw_acceleration/hw_acceleration.qmd
+  # - contents/vol1/parts/optimize_principles.qmd
+  # - contents/vol1/data_selection/data_selection.qmd
+  # - contents/vol1/optimizations/model_compression.qmd
+  # - contents/vol1/hw_acceleration/hw_acceleration.qmd
   - contents/vol1/benchmarking/benchmarking.qmd

   # ==================================================
   # Part IV: Deploy
   # ==================================================
-  - contents/vol1/parts/deploy_principles.qmd
-  - contents/vol1/model_serving/model_serving.qmd
-  - contents/vol1/ml_ops/ml_ops.qmd
-  - contents/vol1/responsible_engr/responsible_engr.qmd
-  - contents/vol1/conclusion/conclusion.qmd
-  - contents/vol1/backmatter/references.qmd
+  # - contents/vol1/parts/deploy_principles.qmd
+  # - contents/vol1/model_serving/model_serving.qmd
+  # - contents/vol1/ml_ops/ml_ops.qmd
+  # - contents/vol1/responsible_engr/responsible_engr.qmd
+  # - contents/vol1/conclusion/conclusion.qmd
+  # - contents/vol1/backmatter/references.qmd

   # ==================================================
   # Appendices (uses Appendix A, B, C... numbering)
   # ==================================================
   # appendices:
-  - contents/vol1/backmatter/appendix_dam.qmd
-  - contents/vol1/backmatter/appendix_machine.qmd
-  - contents/vol1/backmatter/appendix_algorithm.qmd
-  - contents/vol1/backmatter/appendix_data.qmd
-  - contents/vol1/backmatter/glossary/glossary.qmd
+  # - contents/vol1/backmatter/appendix_dam.qmd
+  # - contents/vol1/backmatter/appendix_machine.qmd
+  # - contents/vol1/backmatter/appendix_algorithm.qmd
+  # - contents/vol1/backmatter/appendix_data.qmd
+  # - contents/vol1/backmatter/glossary/glossary.qmd

   bibliography:
     - contents/vol1/backmatter/references.bib
@@ -705,7 +705,7 @@ These archetypes map naturally to deployment paradigms: **Compute Beasts** and *
 # └─────────────────────────────────────────────────────────────────────────────
 from mlsys import Models
 from mlsys.constants import (
-    RESNET50_FLOPs, GFLOPs, Mparam, Bparam, byte, MB, GB
+    RESNET50_FLOPs, GFLOPs, Mparam, Bparam, Kparam, byte, MB, GB, KB
 )
 from mlsys.formatting import fmt, check
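For readers without the book's mlsys helper package, the unit constants imported above can be approximated with plain multipliers. A minimal sketch, assuming the naming convention implies decimal scale factors (these stand-in values are assumptions, not the actual mlsys.constants source):

```python
# Hypothetical stand-ins mirroring the mlsys.constants naming convention.
Kparam = 1_000          # thousand parameters
Mparam = 1_000_000      # million parameters
Bparam = 1_000_000_000  # billion parameters
byte = 1
KB = 1_000
MB = 1_000_000
GB = 1_000_000_000

# Worked example: ResNet-50 has roughly 25.6 Mparam; at 4 bytes/param
# (fp32), the weights alone occupy about 102 MB.
resnet50_params = 25.6 * Mparam
weights_bytes = resnet50_params * 4 * byte
print(f"{weights_bytes / MB:.1f} MB")
```

The decimal (rather than binary) scale factors are a guess; the real constants may use 2^10-based units.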
@@ -297,6 +297,9 @@ class LighthouseSpecs:
 a100_mem_str = fmt(a100_mem, precision=0)

+# Transformer quadratic scaling: doubling sequence length quadruples attention memory
+transformer_scaling_ratio_str = "4"
+
 # Note: No exports. Use LighthouseSpecs.variable directly.
 ```
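The hard-coded ratio of 4 in the hunk above is easy to sanity-check: attention scores form a seq_len × seq_len matrix per head, so memory grows with the square of sequence length. A minimal sketch (function name and default head count are illustrative, not from the book's code):

```python
def attention_score_bytes(seq_len, num_heads=12, bytes_per_elem=4):
    # One (seq_len x seq_len) score matrix per head, fp32 elements assumed.
    return num_heads * seq_len * seq_len * bytes_per_elem

base = attention_score_bytes(1024)
doubled = attention_score_bytes(2048)
print(doubled // base)  # 4: doubling seq_len quadruples attention memory
```

The ratio is independent of head count and element width, since both cancel.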
@@ -66,6 +66,10 @@ Neural networks reduce to a small set of mathematical operations. Matrix multipl
 from mlsys.constants import *
 from mlsys.formatting import fmt, sci
 from mlsys.formulas import model_memory
+
+# MNIST 784→128→64→10 MAC count (used in Purpose / From Logic to Arithmetic)
+_inf_madd = 784 * 128 + 128 * 64 + 64 * 10  # 109,184
+inf_madd_total_str = f"{_inf_madd:,}"
 ```

 ## From Logic to Arithmetic {#sec-neural-computation-deep-learning-systems-engineering-foundation-597f}
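The MAC total added in the hunk above can be verified layer by layer: each fully connected layer of shape (in_dim, out_dim) performs in_dim × out_dim multiply-accumulates per forward pass (bias additions are ignored, as in the diff). A self-contained sketch, with no mlsys dependency:

```python
# Per-layer MAC counts for the MNIST 784 -> 128 -> 64 -> 10 MLP.
layer_dims = [784, 128, 64, 10]

macs_per_layer = [
    in_dim * out_dim
    for in_dim, out_dim in zip(layer_dims, layer_dims[1:])
]
total_macs = sum(macs_per_layer)

print(macs_per_layer)     # per-layer: 784*128, 128*64, 64*10
print(f"{total_macs:,}")  # 109,184 — matches inf_madd_total_str above
```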
@@ -17,6 +17,14 @@ engine: jupyter
 :::

+```{python}
+#| label: purpose-anchor
+#| echo: false
+# Early anchor for Purpose section inline refs (training-setup runs later)
+gpt2_train_cost_str = "$50,000"
+gpt2_inf_cost_str = "$0.0001"
+```

 ## Purpose {.unnumbered}

 \begin{marginfigure}