---
title: "Customization Guide"
subtitle: "Adapt the curriculum to your program's format and emphasis"
---

The 16-week syllabi are designed as complete, ready-to-use courses. But not every program has 16 weeks, and not every audience has the same emphasis. This guide shows how to adapt.

---

## 10-Week Quarter Version (Foundations)

For quarter systems, compress the 16-week Foundations syllabus:

| Quarter Week | Content (from 16-week) | What Changes |
|:---|:---|:---|
| 1 | Weeks 1–2 | Introduction + ML Systems combined; Module 01 only |
| 2 | Week 3 | ML Workflow; Module 02 starts |
| 3 | Weeks 4–5 | Data Engineering + Neural Computation; Modules 02 + 03 |
| 4 | Week 6 | NN Architectures; Module 04 |
| 5 | Weeks 7–8 | Frameworks + Training; Modules 05 + 06 |
| 6 | Week 9 | Data Selection; Module 07 |
| 7 | Week 10 | Model Compression + Lab 09 (Quantization); Module 08 |
| 8 | Week 11 | HW Acceleration + Lab 10 (Roofline) |
| 9 | Weeks 13–14 | Serving + Operations; Labs 12–13 |
| 10 | Week 16 | Capstone (AI Olympics, reduced scope) |

**What gets dropped:** Benchmarking (Week 12) and Responsible Engineering (Week 15). Assign both as optional reading, and fold the key responsibility points into the capstone rubric.

**What gets compressed:** TinyTorch Modules 01 and 02 are doubled up in Weeks 1 and 3. Labs 00, 01, and 03 become optional.

::: {.callout-caution}
## Quarter Tradeoff

The 10-week version sacrifices breathing room. Consider reducing Decision Logs to 100 words and assigning only 2 Design Challenges instead of 4.
:::

---

## 3-Day Workshop Version

For short workshops with experienced practitioners:

| Day | Focus | Materials |
|:---|:---|:---|
| **Day 1** | The Physics of Inference | Iron Law introduction + Labs 01, 05, 09 (Magnitude Gap, Architecture Tradeoffs, Quantization) |
| **Day 2** | The Optimization Frontier | Labs 10, 11 (Roofline, Benchmarking) + TinyTorch Module 08 speed-run |
| **Day 3** | Production Deployment | Labs 12, 13 (Tail Latency, Drift Detection) + mini Design Challenge |

Focus exclusively on the **Iron Law** and the **Interactive Labs**. Skip TinyTorch except as a demo, and skip formal assessment; use the labs for hands-on discovery.
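
If participants want the formula up front, the classic statement of the Iron Law is the processor-performance identity below. This assumes the course uses the standard Hennessy & Patterson formulation (the same framing as the Week 1 reading in the seminar version):

$$
\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}
$$

Each factor is a different lever: algorithms and compilers move the instruction count, microarchitecture the cycles per instruction, and circuit technology the cycle time.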

---

## For Software Engineers

If your audience is experienced developers, lean into **TinyTorch**:

| Weeks | Focus | Modules |
|:---|:---|:---|
| 1–4 | Building the Autograd Engine | Modules 01–06 (Tensor → Autograd) |
| 5–8 | From CNNs to Transformers | Modules 09–13 (Conv → Transformer) |
| 9–12 | Production Optimization | Modules 14–19 (Profiling → Benchmarking) |
| 13–16 | Capstone: Torch Olympics | Module 20 + competition |

Use textbook chapters as background reading, not lecture material. Labs serve as validation checkpoints, not primary pedagogy.
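
To give a flavor of what Modules 01–06 build toward, here is a minimal reverse-mode autograd sketch; the class and method names are illustrative, not TinyTorch's actual API.

```python
# Minimal reverse-mode autograd sketch (illustrative; not TinyTorch's API).
# Each Scalar records how it was produced so backward() can apply the chain rule.
class Scalar:
    def __init__(self, data, parents=()):
        self.data = data                  # forward value
        self.grad = 0.0                   # accumulated dL/dself
        self._parents = parents           # nodes this value depends on
        self._backward_fn = lambda: None  # propagates grad to parents

    def __add__(self, other):
        out = Scalar(self.data + other.data, (self, other))
        def backward_fn():                # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Scalar(self.data * other.data, (self, other))
        def backward_fn():                # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for p in node._parents:
                    visit(p)
                order.append(node)
        visit(self)
        self.grad = 1.0                   # seed: dL/dL = 1
        for node in reversed(order):
            node._backward_fn()

x = Scalar(3.0)
y = x * x + x                             # y = x^2 + x
y.backward()
print(y.data, x.grad)                     # 12.0 7.0  (dy/dx = 2x + 1)
```

The topological sort plus chain-rule loop in `backward()` is the essential idea; most of what a full framework adds on top is operator coverage and tensor bookkeeping.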

---

## For Computer Architects

Shift the focus toward **Hardware Acceleration** and **mlsysim**:

- Use the hardware Zoo in `mlsysim` to compare architectures (H100, B200, edge devices)
- Spend 2 weeks on the Roofline model; have students plot multiple workloads (see the sketch after this list)
- Extend model compression to 2 weeks (quantization + pruning as hardware-aware optimizations)
- Use hardware kits extensively; make them mandatory, not optional
- Reduce TinyTorch to Modules 01–03 (enough to understand what frameworks do)
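
As a concrete starting point for the Roofline weeks, here is a minimal sketch of the calculation students would script. The device figures are rough public numbers and the names are illustrative; in practice, pull the actual specs from the `mlsysim` hardware Zoo.

```python
# Roofline model: attainable throughput = min(peak compute, bandwidth * intensity).
# Device figures below are rough, for illustration only; check vendor specs.

def attainable_tflops(peak_tflops, mem_bw_tbs, intensity_flops_per_byte):
    """Roofline bound on throughput for a workload of given operational intensity."""
    return min(peak_tflops, mem_bw_tbs * intensity_flops_per_byte)

devices = {
    "H100-class GPU": {"peak_tflops": 989.0, "mem_bw_tbs": 3.35},  # ~dense BF16, HBM3
    "edge accelerator": {"peak_tflops": 4.0, "mem_bw_tbs": 0.1},   # hypothetical
}

# Operational intensity (FLOPs per byte of memory traffic) for contrasting workloads.
workloads = {
    "large-batch matmul": 300.0,
    "single-token decode": 1.0,
}

for dev, spec in devices.items():
    ridge = spec["peak_tflops"] / spec["mem_bw_tbs"]  # intensity where the roofs meet
    print(f"{dev}: ridge point ~{ridge:.0f} FLOPs/byte")
    for wl, intensity in workloads.items():
        bound = attainable_tflops(spec["peak_tflops"], spec["mem_bw_tbs"], intensity)
        regime = "compute" if intensity >= ridge else "memory"
        print(f"  {wl:>20}: {bound:7.1f} TFLOP/s ({regime}-bound)")
```

Having students compute the ridge point by hand before plotting is a quick way to make the memory-bound regime of decode-style workloads concrete.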

---

## Graduate Seminar Version

For a graduate-level seminar (assumes strong systems background):

| Week | Topic | Textbook | Paper |
|:---|:---|:---|:---|
| 1 | The Iron Law | Vol I: Intro + ML Systems | Hennessy & Patterson, "A New Golden Age" (2019) |
| 2 | Memory Hierarchy | Vol I: HW Acceleration | Williams et al., "Roofline" (2009) |
| 3 | Quantization | Vol I: Model Compression | Dettmers et al., "LLM.int8()" (2022) |
| 4 | Serving Systems | Vol I: Model Serving | Yu et al., "Orca" (2022) |
| 5 | Distributed Training | Vol II: Distributed Training | Shoeybi et al., "Megatron-LM" (2020) |
| 6 | 3D Parallelism | Vol II: Distributed Training | Narayanan et al., "Efficient Large-Scale Training" (2021) |
| 7 | Collective Comms | Vol II: Collective Comm. | Patarasuk & Yuan, "Bandwidth Optimal All-Reduce" (2009) |
| 8 | Fault Tolerance | Vol II: Fault Tolerance | Jeon et al., "Large-Scale GPU Clusters" (2019) |
| 9 | Inference at Scale | Vol II: Inference | Kwon et al., "vLLM/PagedAttention" (2023) |
| 10 | KV-Cache Optimization | Vol II: Inference | Ainslie et al., "GQA" (2023) |
| 11 | Edge Intelligence | Vol II: Edge Intelligence | Lin et al., "MCUNet" (2020) |
| 12 | Fleet Operations | Vol II: Ops at Scale | Zhao et al., "ATC'24 Fleet Analysis" (2024) |
| 13 | Sustainability | Vol II: Sustainable AI | Patterson et al., "Carbon Emissions and AI" (2021) |
| 14 | Student presentations | — | — |

**Assessment:** 40% paper presentations, 30% lab Decision Logs (selected labs only), 30% semester project (original system design or benchmarking study).

---

## Mixing and Matching Components

Each component is independently adoptable:

| Pattern | Components Used | Typical Context |
|:---|:---|:---|
| **Textbook Only** | Vol I or II as required reading | Supplement for existing ML course |
| **Textbook + Labs** | Readings + interactive labs | Active learning without coding assignments |
| **TinyTorch Only** | 20 modules as programming assignments | Systems programming course |
| **Labs Only** | Interactive labs as in-class activities | Active learning supplement for any course |
| **Hardware Kits Only** | Edge deployment labs | Embedded systems course |
| **Full Stack** | All components integrated | Dedicated ML Systems course |

::: {.callout-tip}
## Start Small, Layer Up

If adopting for the first time, start with **Textbook + Labs** for one semester. Add TinyTorch the second time you teach it. Add hardware kits the third. Each component is valuable on its own.
:::