---
title: "For Instructors"
subtitle: "Reproducible, hardware-independent exercises — paired with 35 lecture decks and 266 diagrams."
---

MLSYSIM provides a framework for assigning analytically grounded problem sets where every answer is deterministic and reproducible — regardless of what hardware your students have access to. Combined with the companion [lecture slides](https://mlsysbook.ai/slides/), it forms a complete teaching toolkit for ML systems courses.

---

## Why MLSYSIM for Teaching?

| Challenge | How MLSYSIM Helps |
|:----------|:------------------|
| Students lack GPU access | All analysis runs on a laptop — no cloud credits needed |
| Homework answers vary by hardware | Vetted registry specs produce identical results everywhere |
| Hard to grade open-ended systems questions | Analytical solvers give deterministic, verifiable outputs |
| Specifications become stale | Registry updated from official datasheets; one update propagates everywhere |
| Students memorize without understanding | "Predict first" exercises build genuine intuition |
| No time to build slides from scratch | 35 Beamer decks with speaker notes, active learning, and SVG diagrams ready to use |

---

## The Teaching Ecosystem

MLSYSIM is one component of a larger open teaching toolkit:

| Resource | What It Provides | Link |
|:---------|:-----------------|:-----|
| **Textbook** | Two-volume open textbook — foundations (Vol I) and scale (Vol II) | [mlsysbook.ai](https://mlsysbook.ai) |
| **Lecture Slides** | 35 Beamer decks, 1,099 slides, 266 SVG diagrams, speaker notes on every slide | [Slides Portal](https://mlsysbook.ai/slides/) |
| **MLSYSIM** | 6 analytical solvers, typed hardware registry, deterministic assignments | [Getting Started](getting-started.qmd) |
| **TinyML Courseware** | 4-course sequence with 178 slide decks for embedded ML | [TinyML Slides](https://mlsysbook.ai/slides/tinyml.html) |
| **Teaching Guide** | 16-week semester plans, active learning taxonomy, customization guide | [Teaching Guide](https://mlsysbook.ai/slides/teaching.html) |

---

## Course Integration Patterns

### Pattern 1 — Textbook Companion (Full Semester)

Map MLSYSIM tutorials and assignments directly to textbook chapters and lecture decks. The table below shows one possible 16-week arrangement using Volume I slides.

| Week | Lecture Slides | Textbook Topic | MLSYSIM Assignment |
|:-----|:---------------|:---------------|:-------------------|
| 2 | [Introduction](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/01_introduction.pdf) | The Iron Law of ML Systems | Read the [Hello, Roofline](tutorials/00_hello_roofline.qmd) warmup — identify the bottleneck equation |
| 5 | [NN Computation](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/05_nn_computation.pdf) | FLOPs, memory footprint | [Hello, Roofline](tutorials/00_hello_roofline.qmd) — roofline analysis, batch size sweep |
| 8 | [Model Training](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/08_training.pdf) | Training memory budget | [Solver Guide](solver-guide.qmd) — TrainingStateSolver, ZeRO stages |
| 11 | [HW Acceleration](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/11_hw_acceleration.pdf) | Roofline model, accelerator comparison | Hardware comparison assignment (see below) |
| 13 | [Model Serving](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/13_model_serving.pdf) | TTFT, ITL, KV-cache | [Two Phases of Inference](tutorials/02_two_phases.qmd) — serving latency analysis |

For a **Volume II** course on distributed systems:

| Week | Lecture Slides | Textbook Topic | MLSYSIM Assignment |
|:-----|:---------------|:---------------|:-------------------|
| 3 | [Compute Infrastructure](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/02_compute_infrastructure.pdf) | GPU clusters, interconnects | TCO analysis with EconomicsModel |
| 5 | [Distributed Training](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/05_distributed_training.pdf) | 3D parallelism, scaling | [Scaling to 1000 GPUs](tutorials/06_scaling_1000_gpus.qmd) — parallelism strategies |
| 7 | [Fault Tolerance](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/07_fault_tolerance.pdf) | Checkpointing, MTBF | ReliabilityModel — Young-Daly checkpoint interval |
| 10 | [Performance Engineering](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/10_performance_engineering.pdf) | Profiling, optimization | Multi-solver composition (see capstone ideas below) |
| 15 | [Sustainable AI](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/15_sustainable_ai.pdf) | Energy, carbon, water | [Geography is a Systems Variable](tutorials/07_geography.qmd) — carbon footprint |

::: {.callout-tip}
## Semester Plans
The [Teaching Guide](https://mlsysbook.ai/slides/teaching.html#suggested-semester-plans) provides complete 16-week schedules for Volume I, Volume II, and a combined 32-week sequence — with timing estimates for every deck.
:::

### Pattern 2 — Standalone Labs

Use individual tutorials as self-contained lab assignments in any systems course. Each tutorial includes exercises with clear expected outputs:

| Tutorial | Duration | Key Concepts | Pairs With Slides |
|:---------|:---------|:-------------|:------------------|
| [Hello, Roofline](tutorials/00_hello_roofline.qmd) | 15 min | Roofline model, memory vs. compute bound | [HW Acceleration](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/11_hw_acceleration.pdf) |
| [Geography is a Systems Variable](tutorials/07_geography.qmd) | 20 min | Energy, carbon footprint, regional grids | [Sustainable AI](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/15_sustainable_ai.pdf) |
| [Two Phases of Inference](tutorials/02_two_phases.qmd) | 25 min | TTFT vs. ITL, KV-cache pressure | [Model Serving](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/13_model_serving.pdf) |
| [Scaling to 1000 GPUs](tutorials/06_scaling_1000_gpus.qmd) | 30 min | Data/tensor/pipeline parallelism | [Distributed Training](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/05_distributed_training.pdf) |

### Pattern 3 — Capstone Projects

Have advanced students compose multiple solvers to answer research-style questions. See [Writing a Custom Solver](solver-guide.qmd#writing-a-custom-solver) for the custom solver API.

---

## Assignment Ideas

### Homework: Hardware Comparison (30 min)

> Using `Engine.solve()`, compare ResNet-50 inference latency on the A100, H100, and Jetson AGX at batch sizes 1, 32, and 256. For each configuration, state whether the workload is memory-bound or compute-bound and explain why the bottleneck shifts with batch size.

**Pairs with**: [HW Acceleration slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/11_hw_acceleration.pdf) (roofline model, ridge point) and [Benchmarking slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/12_benchmarking.pdf) (measurement methodology).

### Homework: Training Memory Budget (30 min)

> Using the TrainingStateSolver, calculate the memory required to train GPT-2 (1.5B parameters) in FP16 with the Adam optimizer under ZeRO Stage 0, Stage 1, and Stage 3. Explain *why* each stage reduces memory and what trade-off it introduces.

**Pairs with**: [Model Training slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/08_training.pdf) and [Distributed Training slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/05_distributed_training.pdf).

### Lab: Carbon-Aware Training (45 min)

> Using the SustainabilityModel, calculate the carbon footprint of training GPT-3 on a 256-GPU H100 cluster in Quebec vs. US Average vs. Poland. Produce a table and a 2-paragraph analysis of why datacenter location matters more than hardware choice for carbon emissions.

**Pairs with**: [Sustainable AI slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/15_sustainable_ai.pdf) (grid carbon intensity, PUE).

### Lab: LLM Serving Capacity Planning (45 min)

> Using the ServingModel, determine the maximum sequence length at which Llama-3.1-70B can serve a single request on an 8-GPU H100 node without exceeding memory. Then calculate TTFT and ITL at sequence lengths of 1K, 4K, and 16K tokens. At what point does KV-cache pressure dominate?

**Pairs with**: [Model Serving slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/13_model_serving.pdf) and [Inference at Scale slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/09_inference.pdf).

### Exam Question: Back-of-Envelope

> A GPU has 1,979 TFLOP/s peak compute (FP16) and 3.35 TB/s memory bandwidth. (a) What is the ridge point in FLOP/Byte? (b) A model layer has arithmetic intensity of 50 FLOP/Byte — is it compute-bound or memory-bound? (c) Another layer has arithmetic intensity of 400 FLOP/Byte — which regime is it in, and what does that imply about the benefit of moving to a GPU with 2x the bandwidth? Show your work.

**Pairs with**: [HW Acceleration slides](https://github.com/harvard-edge/cs249r_book/releases/download/slides-latest/11_hw_acceleration.pdf) (roofline model, ridge point derivation).

### Capstone: Multi-Solver Design Study (1 week)

> Design a training cluster for a 70B-parameter model. Use the DistributedModel to select a parallelism strategy, the EconomicsModel for TCO over 6 months, the SustainabilityModel to compare three datacenter locations, and the ReliabilityModel to determine checkpoint frequency. Present your analysis as a 3-page technical memo with quantitative justification for each decision.

**Pairs with**: the full [Volume II slide set](https://mlsysbook.ai/slides/vol2.html) — infrastructure, training, fault tolerance, and sustainability.

---

## Grading Notes

Because MLSYSIM produces deterministic output from vetted specifications:

- **Answer keys are stable** — the same `mlsysim` version produces identical numbers for every student, every semester
- **Partial credit is straightforward** — grade the reasoning (which solver, which inputs, which bottleneck explanation), not just the number
- **"Predict first" questions are easy to assess** — students submit their prediction *before* running code; compare the two for a conceptual understanding score

::: {.callout-note}
## Version Pinning
Pin the version in your assignment instructions (`pip install mlsysim==0.1.0`) so answer keys remain valid even after new releases update specifications.
:::

---

## Reproducibility Guarantee

All specifications in the [MLSys Zoo](zoo/index.qmd) are:

- **Sourced** from official manufacturer datasheets and published benchmarks
- **Typed** with `pint.Quantity` for dimensional correctness — unit errors are caught at runtime
- **Frozen** per release — `mlsysim==0.1.0` always produces the same answers

This means your answer key works for every student, every semester.

---

## Jupyter & Quarto Compatibility

All tutorials run in:

- **Jupyter Notebooks** — standard `.ipynb` workflow
- **Quarto documents** — render to HTML, PDF, or slides with `quarto render`
- **Google Colab** — `pip install mlsysim` in the first cell, then go

No GPU runtime required. CPU-only environments work perfectly because MLSYSIM computes from equations, not empirical profiling.

---

## Getting Started

1. Point students to the [Getting Started](getting-started.qmd) guide for installation
2. Assign the [Hello, Roofline](tutorials/00_hello_roofline.qmd) tutorial as a warmup
3. Browse the [Solver Guide](solver-guide.qmd) to select solvers for your course topics
4. Pair each assignment with the relevant [lecture slides](https://mlsysbook.ai/slides/) for classroom context
5. Use the [MLSys Zoo](zoo/index.qmd) for available hardware, model, and infrastructure specifications

---

## Related Resources

- **[Solver Guide](solver-guide.qmd)** — which solver maps to which topic
- **[Math Foundations](math.qmd)** — all equations, for your own reference and exam prep
- **[Accuracy & Validation](accuracy.qmd)** — how close are analytical estimates to empirical results?
- **[Paper PDF](mlsysim-paper.pdf)** — the MLSys·im paper describing the framework design and validation
- **[Lecture Slides Portal](https://mlsysbook.ai/slides/)** — 35 Beamer decks with speaker notes and active learning
- **[Teaching Guide](https://mlsysbook.ai/slides/teaching.html)** — semester plans, customization, and the active learning taxonomy
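As a grading aid, the back-of-envelope exam question above can be checked with a few lines of plain Python arithmetic — no `mlsysim` needed, since the roofline ridge point is just peak compute divided by peak bandwidth (the numbers below are the ones quoted in the question):

```python
# Roofline ridge point: the arithmetic intensity (FLOP/Byte) at which a
# kernel transitions from memory-bound to compute-bound.
peak_flops = 1979e12   # FP16 peak compute, FLOP/s (from the exam question)
peak_bw = 3.35e12      # memory bandwidth, Byte/s

ridge = peak_flops / peak_bw
print(f"(a) ridge point = {ridge:.0f} FLOP/Byte")  # ~591

for ai in (50, 400):
    regime = "compute-bound" if ai >= ridge else "memory-bound"
    print(f"AI = {ai} FLOP/Byte -> {regime}")

# (b) 50 < 591, so that layer is memory-bound: 2x bandwidth roughly
#     halves its runtime.
# (c) 400 < 591, so it is also memory-bound on this GPU — but doubling
#     bandwidth halves the ridge point to ~295, making the layer
#     compute-bound, so it gains less than the full 2x.
new_ridge = peak_flops / (2 * peak_bw)
print(f"(c) ridge point at 2x bandwidth = {new_ridge:.0f} FLOP/Byte")
```

Note the subtlety in part (c): an answer of "memory-bound, so 2x speedup" misses that the layer crosses the ridge point once bandwidth doubles.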
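For the training memory budget homework, a back-of-envelope answer key can be sketched without the TrainingStateSolver, using the standard per-parameter byte counts for mixed-precision Adam (2 B FP16 weights, 2 B FP16 gradients, 12 B FP32 optimizer states). This sketch counts model states only — it ignores activations and fragmentation, which the real solver may include — and the 8-GPU count is an illustrative assumption, not part of the assignment:

```python
def zero_memory_gb(n_params: float, n_gpus: int, stage: int) -> float:
    """Per-GPU model-state memory (GB) for mixed-precision Adam training.

    Byte counts per parameter: 2 (FP16 weights) + 2 (FP16 grads)
    + 12 (FP32 master weights, momentum, variance) = 16 at Stage 0.
    Each ZeRO stage shards one more component across the data-parallel group.
    """
    weights, grads, optim = 2.0, 2.0, 12.0
    if stage >= 1:
        optim /= n_gpus      # Stage 1: shard optimizer states
    if stage >= 2:
        grads /= n_gpus      # Stage 2: shard gradients too
    if stage >= 3:
        weights /= n_gpus    # Stage 3: shard the parameters themselves
    return n_params * (weights + grads + optim) / 1e9

P = 1.5e9  # GPT-2 parameter count, as in the assignment
for stage in (0, 1, 3):
    print(f"ZeRO Stage {stage}, 8 GPUs: {zero_memory_gb(P, 8, stage):.2f} GB/GPU")
# Stage 0: 24.00 GB, Stage 1: 8.25 GB, Stage 3: 3.00 GB
```

The "why" half of the question falls out of the comments: each stage trades memory for communication, with Stage 3 adding parameter all-gathers on every forward and backward pass.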
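The fault-tolerance assignment rests on the Young-Daly first-order approximation for the optimal checkpoint interval, $\tau_{opt} = \sqrt{2\,\delta\,M}$, where $\delta$ is the time to write one checkpoint and $M$ is the mean time between failures. A minimal sketch — the 5-minute checkpoint time and 24-hour MTBF are made-up illustrative numbers, not values from the MLSYSIM registry or its ReliabilityModel:

```python
import math

def young_daly_interval(checkpoint_secs: float, mtbf_secs: float) -> float:
    """Young-Daly optimal checkpoint interval: tau = sqrt(2 * delta * MTBF)."""
    return math.sqrt(2 * checkpoint_secs * mtbf_secs)

# Illustrative inputs (NOT from the registry):
delta = 5 * 60      # 5 minutes to write a checkpoint
mtbf = 24 * 3600    # cluster-wide MTBF of 24 hours

tau = young_daly_interval(delta, mtbf)
print(f"checkpoint every {tau / 3600:.1f} h")  # prints "checkpoint every 2.0 h"
```

A useful discussion prompt: MTBF shrinks roughly linearly with node count, so $\tau_{opt}$ shrinks with $\sqrt{N}$ — large clusters must checkpoint far more often than intuition suggests.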