mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-03-11 17:49:25 -05:00
refactor: rename Vol I tagline from Operate to Deploy
Update Volume I tagline from "Build, Optimize, Operate" to "Build, Optimize, Deploy" across all documentation.

- "Deploy" better matches Part IV content (Serving, MLOps, Responsible Engineering)
- "Operate" implies ongoing management, which is more Volume II territory
- Also fixes the hw_acceleration table to use proper grid table format
@@ -45,17 +45,17 @@ This textbook is organized into **two volumes** following the Hennessy & Patters
 | Volume | Theme | Focus |
 |--------|-------|-------|
-| **Volume I** | Build, Optimize, Operate | Single-machine ML systems, foundational principles |
+| **Volume I** | Build, Optimize, Deploy | Single-machine ML systems, foundational principles |
 | **Volume II** | Scale, Distribute, Govern | Distributed systems at production scale |

-#### Volume I: Build, Optimize, Operate
+#### Volume I: Build, Optimize, Deploy

 | Part | Focus | Chapters |
 |------|-------|----------|
-| **ML Foundations** | Core concepts | Introduction, ML Systems, DL Primer, Architectures |
-| **System Development** | Building blocks | Workflow, Data Engineering, Frameworks, Training |
-| **Model Optimization** | Making it fast | Efficient AI, Optimizations, HW Acceleration, Benchmarking |
-| **System Operations** | Making it work | MLOps, Responsible Engineering |
+| **Foundations** | Core concepts | Introduction, ML Systems, DL Primer, Architectures |
+| **Development** | Building blocks | Workflow, Data Engineering, Frameworks, Training |
+| **Optimization** | Making it fast | Efficient AI, Optimizations, HW Acceleration, Benchmarking |
+| **Deployment** | Making it work | Serving, MLOps, Responsible Engineering |

 #### Volume II: Scale, Distribute, Govern
@@ -124,7 +124,7 @@ cd book
 book/
 ├── quarto/                  # Book source (Quarto markdown)
 │   ├── contents/            # Chapter content
-│   │   ├── vol1/            # Volume I: Build, Optimize, Operate
+│   │   ├── vol1/            # Volume I: Build, Optimize, Deploy
 │   │   ├── vol2/            # Volume II: Scale, Distribute, Govern
 │   │   ├── frontmatter/     # Preface, about, changelog
 │   │   └── backmatter/      # References, glossary
@@ -10,14 +10,14 @@
 This textbook is organized into two volumes following the Hennessy & Patterson pedagogical model:

-- **Volume I: Build, Optimize, Operate** - Foundational knowledge for single-machine ML systems
+- **Volume I: Build, Optimize, Deploy** - Foundational knowledge for single-machine ML systems
 - **Volume II: Scale, Distribute, Govern** - Advanced distributed systems at production scale

 Each volume stands alone as a complete learning experience while together forming a comprehensive treatment of the field.

 ---

-## Volume I: Build, Optimize, Operate
+## Volume I: Build, Optimize, Deploy

 ### Goal
 A reader completes Volume I and can competently build, optimize, and deploy ML systems on a single machine with awareness of responsible practices.
@@ -18,12 +18,12 @@ This textbook is organized into two volumes following the **Hennessy & Patterson
 | Volume | Theme | Focus | Analogy |
 |--------|-------|-------|---------|
-| **Volume I** | Build, Optimize, Operate | Single-machine ML systems, foundational principles | "Computer Organization and Design" |
+| **Volume I** | Build, Optimize, Deploy | Single-machine ML systems, foundational principles | "Computer Organization and Design" |
 | **Volume II** | Scale, Distribute, Govern | Distributed systems at production scale | "Computer Architecture" |

 **Volume I** teaches you to *understand* ML systems. **Volume II** teaches you to *build* ML systems at scale.

-**Volume I: Build, Optimize, Operate** establishes the foundations through four progressive stages:
+**Volume I: Build, Optimize, Deploy** establishes the foundations through four progressive stages:

 - **Foundations** (Part I): Build your conceptual foundation, establishing the mental models that underpin all effective systems work.
@@ -118,7 +118,7 @@ SocratiQ is still a work in progress, and we welcome your feedback to make it be
 This work takes you from understanding ML systems conceptually to building and deploying them in practice. The content is organized into two volumes following the Hennessy & Patterson pedagogical model, each containing four parts that develop specific capabilities.

-**Volume I: Build, Optimize, Operate**
+**Volume I: Build, Optimize, Deploy**

 1. **Part I: Foundations**
    *Master the fundamentals.* Build intuition for how ML systems differ from traditional software, understand the hardware-software stack, and gain fluency with essential architectures and mathematical foundations.
@@ -1126,13 +1126,19 @@ The energy differential established earlier—where memory access costs dominate
 @tbl-accelerator-economics provides concrete cost-performance data for representative accelerators, but the economic analysis must account for utilization efficiency and energy consumption patterns that determine real-world performance.

-| **Accelerator** | **List Price (USD)** | **Peak FLOPS (FP16)** | **Memory Bandwidth** | **Price/Performance** |
-|-----------------|----------------------|-----------------------|----------------------|-----------------------|
-| NVIDIA V100 | ~$9,000 (2017-19) | 125 TFLOPS | 900 GB/s | $72/TFLOP |
-| NVIDIA A100 | $15,000 | 312 TFLOPS (FP16) | 1,935 GB/s | $48/TFLOP |
-| NVIDIA H100 | $25,000-30,000 | 756 TFLOPS (TF32) | 3,350 GB/s | $33/TFLOP |
-| Google TPUv4 | ~$8,000* | 275 TFLOPS (BF16) | 1,200 GB/s | $29/TFLOP |
-| Intel Gaudi 2 | $12,000 | 200 TFLOPS (INT8) | 800 GB/s | $60/TFLOP |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+
+| **Accelerator**   | **List Price (USD)** | **Peak FLOPS (FP16)** | **Memory Bandwidth** | **Price/Performance** |
++==================:+=====================:+======================:+=====================:+======================:+
+| **NVIDIA V100**   | ~$9,000 (2017-19)    | 125 TFLOPS            | 900 GB/s             | $72/TFLOP             |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+
+| **NVIDIA A100**   | $15,000              | 312 TFLOPS (FP16)     | 1,935 GB/s           | $48/TFLOP             |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+
+| **NVIDIA H100**   | $25,000-30,000       | 756 TFLOPS (TF32)     | 3,350 GB/s           | $33/TFLOP             |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+
+| **Google TPUv4**  | ~$8,000*             | 275 TFLOPS (BF16)     | 1,200 GB/s           | $29/TFLOP             |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+
+| **Intel Gaudi 2** | $12,000              | 200 TFLOPS (INT8)     | 800 GB/s             | $60/TFLOP             |
++-------------------+----------------------+-----------------------+----------------------+-----------------------+

 : **Accelerator Cost-Performance Comparison**: Hardware costs must be evaluated against computational capabilities to determine optimal deployment strategies. While newer accelerators like H100 offer better price-performance ratios, total cost of ownership includes power consumption, cooling requirements, and infrastructure costs that significantly impact operational economics. *TPU pricing estimated from cloud rates. {#tbl-accelerator-economics}
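The table's Price/Performance column is derivable from the other two numeric columns, so it can be sanity-checked mechanically. A minimal sketch, assuming the low end of each listed price range and the table's cloud-rate estimate for TPUv4:

```python
# Sanity check of the $/TFLOP column in tbl-accelerator-economics.
# Prices and peak TFLOPS are taken from the table; where a range is
# listed (H100), the low end is assumed.
accelerators = {
    "NVIDIA V100":   (9_000, 125),
    "NVIDIA A100":   (15_000, 312),
    "NVIDIA H100":   (25_000, 756),   # low end of $25,000-30,000
    "Google TPUv4":  (8_000, 275),    # estimated from cloud rates
    "Intel Gaudi 2": (12_000, 200),
}

for name, (price_usd, tflops) in accelerators.items():
    # $/TFLOP = list price / peak TFLOPS, rounded to the nearest dollar
    print(f"{name}: ${round(price_usd / tflops)}/TFLOP")
    # e.g. "NVIDIA V100: $72/TFLOP"
```

Each rounded quotient reproduces the table's published ratio, confirming the column is simple list price over peak throughput rather than a utilization-adjusted figure.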