mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-02 10:39:10 -05:00
- book/quarto/mlsys/__init__.py: add repo-root sys.path injection so
mlsysim is importable when scripts run from book/quarto/ context
- book/quarto/mlsys/{constants,formulas,formatting,hardware}.py: new
compatibility shims that re-export from mlsysim.core.* and mlsysim.fmt
- mlsysim/viz/__init__.py: remove try/except for dashboard import; use
explicit "import from mlsysim.viz.dashboard" pattern instead
- .codespell-ignore-words.txt: add "covert" (legitimate security term)
- book/tools/scripts/reference_check_log.txt: delete generated artifact
- Various QMD, bib, md files: auto-formatted by pre-commit hooks
(trailing whitespace, bibtex-tidy, pipe table alignment)
4.5 KiB
📐 Mission Plan: 16_ml_conclusion (System Synthesis)
1. Chapter Context
- Chapter Title: ML Conclusion: The Architect's Synthesis.
- Core Invariant: System Synthesis (The D·A·M Convergence) and the Conservation of Complexity.
- The Struggle: Synthesizing all 12 physical invariants to solve a final, integrated engineering crisis. Understanding that complexity is never destroyed, only shifted between Data, Algorithm, and Machine.
- Target Duration: 45 Minutes.
2. The 4-Track Storyboard (Single-Node Finales)
| Track | Persona | Fixed North Star Mission | The "Synthesis" Crisis |
|---|---|---|---|
| Cloud Titan | LLM Architect | Maximize Llama-3-70B serving. | The Multimodal Memory Wall. You are adding a Vision Encoder to your chatbot. You must balance the memory-bound decoding of the LLM with the compute-bound encoding of the Vision model on one card. |
| Edge Guardian | AV Systems Lead | Deterministic 10ms safety loop. | The Shadow Mode Gap. A new Transformer-based end-to-end model has arrived. You must prove it respects the 10ms budget and the 'Verification Gap' before it is allowed to touch the steering wheel. |
| Mobile Nomad | AR Glasses Dev | 60FPS AR translation. | The Thermal Intelligence Ceiling. Marketing wants to add 'Reasoning' to the 60FPS filters. You must find the exact point where intelligence exceeds the fixed 2W thermal envelope. |
| Tiny Pioneer | Hearable Lead | Neural isolation in <10ms under 1mW. | The Universal Translator Paradox. You are scaling from 1 language to 50. The Embedding Table Gravity now exceeds your 256KB SRAM. You must engineer a 'Streaming Weights' strategy. |
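
The Cloud Titan crisis depends on which roof each workload hits. The following is a minimal roofline sketch, not part of the lab code. The accelerator figures (400 TFLOP/s peak, 2 TB/s memory bandwidth) and both arithmetic intensities are hypothetical values chosen for illustration; none of them come from this plan.

```python
# Roofline sketch: attainable throughput = min(compute roof, bandwidth roof).
# All constants below are illustrative assumptions, not measured values.

PEAK_TFLOPS = 400.0   # hypothetical peak compute, TFLOP/s
MEM_BW_TB_S = 2.0     # hypothetical memory bandwidth, TB/s

# Assumed arithmetic intensities in FLOP per byte moved from memory.
DECODE_INTENSITY = 1.0    # batch-1 LLM decoding: ~2 FLOPs per 2-byte FP16 weight
ENCODE_INTENSITY = 400.0  # vision encoding with heavy weight reuse (assumed)


def attainable_tflops(intensity: float, peak_tflops: float, mem_bw_tb_s: float) -> float:
    """Return the roofline-limited throughput in TFLOP/s."""
    # TB/s * FLOP/byte = TFLOP/s, so the units line up with the compute roof.
    return min(peak_tflops, mem_bw_tb_s * intensity)


def regime(intensity: float, peak_tflops: float, mem_bw_tb_s: float) -> str:
    """Classify a workload by comparing its intensity to the ridge point."""
    ridge = peak_tflops / mem_bw_tb_s  # FLOP/byte where the two roofs meet
    return "memory-bound" if intensity < ridge else "compute-bound"


for name, intensity in [("decode", DECODE_INTENSITY), ("encode", ENCODE_INTENSITY)]:
    print(
        name,
        regime(intensity, PEAK_TFLOPS, MEM_BW_TB_S),
        attainable_tflops(intensity, PEAK_TFLOPS, MEM_BW_TB_S),
    )
```

Under these assumptions the two workloads sit on opposite sides of the ridge point, which is why co-locating them on one card forces a trade-off rather than a single optimization.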
3. The 3-Part Mission (The KATs)
Part 1: The Diagnostic Challenge (Exploration - 15 Mins)
- Objective: Diagnose an unexplained production failure using the full "Systems Detective" toolkit.
- The "Lock" (Prediction): "If accuracy drops 10% but latency remains stable, is the bottleneck more likely to be 'Data Drift' or 'Framework Overhead'?"
- The Workbench:
  - Action: Probe the system for Statistical Drift, Training-Serving Skew, and Iron Law bottlenecks simultaneously.
  - Observation: The Diagnostic HUD. A multi-gauge view showing PSI (Drift), MFU (Utilization), and latency components.
- Reflect: "Patterson asks: 'Identify the exact invariant that was violated.' Use the diagnostic data to prove your root-cause analysis."
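
Two of the HUD gauges reduce to short formulas. The sketch below assumes PSI is the usual Population Stability Index over pre-binned counts and MFU is achieved FLOP/s divided by peak FLOP/s. The function names and the sample histograms are hypothetical and are not taken from the lab code.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two histograms with identical bins.

    PSI = sum over bins of (a - e) * ln(a / e), where e and a are the
    bin fractions of the reference (training) and live (serving) data.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("histograms must use the same bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    score = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Clamp to eps so empty bins do not produce log(0) or division by zero.
        e_frac = max(e / e_total, eps)
        a_frac = max(a / a_total, eps)
        score += (a_frac - e_frac) * math.log(a_frac / e_frac)
    return score


def mfu(achieved_flops_per_s, peak_flops_per_s):
    """Model FLOPs Utilization: fraction of peak compute actually achieved."""
    return achieved_flops_per_s / peak_flops_per_s


# Hypothetical feature histograms: training was uniform, serving has skewed.
training_bins = [25, 25, 25, 25]
serving_bins = [10, 20, 30, 40]

print("PSI, no drift:", psi(training_bins, training_bins))
print("PSI, drifted: ", psi(training_bins, serving_bins))
print("MFU:", mfu(120e12, 400e12))
```

A commonly cited rule of thumb treats PSI above roughly 0.1 as a moderate shift and above roughly 0.25 as a major one; any threshold used in the HUD would be a design choice. A rising PSI alongside flat MFU and latency gauges is the pattern the "Lock" question is probing for.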
Part 2: The Upgrade Paradox (Trade-off - 15 Mins)
- Objective: Navigate a multi-objective Pareto Frontier to "Design the Future" of your track.
- The "Lock" (Prediction): "Will upgrading to a larger model improve your 'Samples-per-Dollar' if the system becomes more memory-bound?"
- The Workbench:
  - Sliders: All DAM levers (Data resolution, Algorithm complexity, Machine BW/TFLOPS).
  - Instruments: Master Synthesis Radar. A radar chart showing Accuracy, Latency, Energy, Carbon, and TCO.
  - The 15-Iteration Rule: Students must "Dimension" a next-generation system that hits a 2x accuracy goal while staying within the same 3-year TCO budget.
- Reflect: "Jeff Dean observes: 'You successfully reduced the Algorithm complexity, but the total system cost stayed the same.' Explain how the 'Conservation of Complexity' manifested in your design."
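
The Pareto Frontier in this part can be made concrete with a small dominance filter. This is a sketch under stated assumptions: the `Design` record, the three axes kept (accuracy, latency, TCO), and every candidate value are hypothetical. The radar in the lab also tracks Energy and Carbon, which would extend the same comparison.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    """One candidate system configuration (hypothetical record)."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better
    tco_usd: float     # 3-year total cost of ownership, lower is better


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every axis and better on at least one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency_ms <= b.latency_ms
        and a.tco_usd <= b.tco_usd
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency_ms < b.latency_ms
        or a.tco_usd < b.tco_usd
    )
    return no_worse and strictly_better


def pareto_frontier(designs: List[Design], tco_budget_usd: float) -> List[Design]:
    """Keep designs within budget that no other in-budget design dominates."""
    feasible = [d for d in designs if d.tco_usd <= tco_budget_usd]
    return [d for d in feasible if not any(dominates(o, d) for o in feasible)]


# Illustrative candidates only; none of these numbers come from the lab.
candidates = [
    Design("baseline", 0.80, 20.0, 100_000),
    Design("bigger-model", 0.88, 35.0, 100_000),
    Design("bigger-model-more-hw", 0.88, 20.0, 180_000),
    Design("wasteful", 0.79, 25.0, 100_000),
]

frontier = pareto_frontier(candidates, tco_budget_usd=100_000)
print([d.name for d in frontier])
```

In this toy data the larger model stays on the frontier only by paying in latency, and the design that buys the latency back falls outside the fixed budget. That is one way the "Conservation of Complexity" can show up in a student's design.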
Part 3: Node to Fleet (Synthesis - 15 Mins)
- Objective: Identify the physical limit of the "Single Node" and prepare for Volume 2.
- The "Lock" (Prediction): "At what exact model scale or user load does your single-node architecture hit an absolute physical wall that no further local optimization can fix?"
- The Workbench:
  - Interaction: Scale-out Scrubber. Slide the load until the "Node Feasibility" gauge turns permanently red.
  - The "Stakeholder" Challenge: The CEO demands a 100x scale-up. You must use Amdahl's Law to prove that single-node physics have reached their saturation point.
- Reflect (The Ledger): "Defend your 'Graduation' to Volume 2. What specific constraint (Network, Reliability, or Contention) is now forcing you to move from 'The Node' to 'The Fleet'?"
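
The Amdahl's Law argument in the "Stakeholder" Challenge can be sketched in a few lines. The parallel fraction of 0.95 is an assumed value for illustration; the plan does not specify one.

```python
def amdahl_speedup(parallel_fraction: float, n: float) -> float:
    """Amdahl's Law: speedup when the parallel part runs n times faster."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as n grows without limit: 1 / serial fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


def min_parallel_fraction(target_speedup: float) -> float:
    """Parallel fraction the workload must exceed for the target to be reachable."""
    return 1.0 - 1.0 / target_speedup


P = 0.95        # assumed parallel fraction, illustrative only
TARGET = 100.0  # the CEO's 100x scale-up

print("speedup with 100x more parallel resources:", amdahl_speedup(P, 100))
print("ceiling as resources grow without limit:  ", max_speedup(P))
print("parallel fraction needed for 100x:        ", min_parallel_fraction(TARGET))
```

With 5% of the work serial, the ceiling is about 20x no matter how much hardware is added, so a 100x target is out of reach unless more than 99% of the workload parallelizes.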
4. Visual Layout Specification
- Primary: `SynthesisRadar` (Covering the full D·A·M spectrum).
- Secondary: `NodeSaturationPlot` (Throughput vs. Load showing the "Vertical Scaling Wall").
- Math Peek: Toggle for the 12 Quantitative Invariants of Volume 1.