mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-03-11 17:49:25 -05:00)
feat(labs): rewrite lab_00 as ML Systems architect portal
Complete rewrite of lab_00_introduction.py with four sections:
1. The 95% Problem: ML systems vs ML framing (not models, infrastructure)
2. Physical Constraints: speed of light, thermodynamics, memory physics
3. Four Deployment Regimes: Cloud/Edge/Mobile/TinyML constraint walls
4. Interface Orientation: live cockpit tour (tabs, levers, prediction lock, MathPeek)

Each concept block gates the next via mo.stop() with structured checks:
- Check 1: radio MCQ (silent degradation / ML systems domain)
- Check 2: multiselect (AV latency / speed-of-light constraint)
- Check 3: radio scenario (ICU sensor / TinyML constraint analysis)

Interface tour uses real mo.ui components (dropdown, slider, accordion) so students build motor memory for the cockpit before Lab 01 content begins.

Design Ledger initialized at completion with deployment context + check answers.

Fix: DesignLedger import corrected to labs.core.state (not mlsysim.sim.ledger).

Verified: Exit 0 under Python 3.13 with marimo 0.19.6.
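The speed-of-light constraint behind Check 2 (AV latency) can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not code from this commit: the 5,570 km London to New York great-circle distance, the 2/3 velocity factor for optical fiber, and the helper names `min_latency_ms` and `fits_budget` are assumptions introduced here. Only the 10 ms decision-loop budget is taken from the lab text.

```python
# Illustrative latency-floor arithmetic; NOT taken from lab_00_introduction.py.
# All constants except the speed of light are assumptions for this sketch.
C_KM_PER_S = 299_792.458   # speed of light in vacuum, km/s
LONDON_NYC_KM = 5_570.0    # assumed great-circle distance, km
FIBER_FACTOR = 2.0 / 3.0   # assumed: light in fiber travels at roughly 2/3 c


def min_latency_ms(distance_km: float, round_trip: bool = True,
                   velocity_factor: float = 1.0) -> float:
    """Return the propagation-delay lower bound in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity_factor must be in (0, 1]")
    legs = 2 if round_trip else 1
    seconds = legs * distance_km / (C_KM_PER_S * velocity_factor)
    return seconds * 1000.0


def fits_budget(distance_km: float, budget_ms: float, **kwargs) -> bool:
    """True if the physical latency floor alone fits inside the budget."""
    return min_latency_ms(distance_km, **kwargs) <= budget_ms


if __name__ == "__main__":
    vacuum = min_latency_ms(LONDON_NYC_KM)
    fiber = min_latency_ms(LONDON_NYC_KM, velocity_factor=FIBER_FACTOR)
    print(f"vacuum round trip: {vacuum:.1f} ms")
    print(f"fiber round trip:  {fiber:.1f} ms")
    print("fits a 10 ms loop:", fits_budget(LONDON_NYC_KM, 10.0))
```

Under these assumptions the vacuum round trip alone is several times larger than a 10 ms budget, before any queuing, routing, or inference time is added, which is the point the check is designed to make.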
@@ -6,7 +6,7 @@
"cells": [
{
"id": "Hbol",
"code_hash": "d1e137ab08858d9220092662911a6492",
"code_hash": "ddf5f33f9e09f77c352bce879bc06a23",
"outputs": [
{
"type": "data",
@@ -19,12 +19,12 @@
},
{
"id": "MJUe",
"code_hash": "33c294b315853e0a7442dadb2c4be01b",
"code_hash": "e7fb484f9c20fb8cdd29e615e072214b",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
"text/html": "<div style='display: flex;flex: 1;flex-direction: column;justify-content: flex-start;align-items: normal;flex-wrap: nowrap;gap: 0.5rem'>\n<style>\n :root {\n --blue: #006395;\n --red: #CB202D;\n --green: #008F45;\n --text: #2c3e50;\n }\n\n .lab-container {\n font-family: 'Inter', -apple-system, sans-serif;\n color: var(--text);\n max-width: 900px;\n margin: auto;\n }\n\n .lab-card {\n background-color: white;\n border: 1px solid #dee2e6;\n border-radius: 12px;\n padding: 1.5rem;\n height: 100%;\n box-shadow: 0 4px 12px rgba(0,0,0,0.03);\n display: flex;\n flex-direction: column;\n }\n\n .lab-card h3 {\n margin-top: 0;\n font-size: 0.9rem;\n text-transform: uppercase;\n letter-spacing: 0.1em;\n color: #7f8c8d;\n border-bottom: 1px solid #eee;\n padding-bottom: 8px;\n margin-bottom: 1rem;\n }\n\n .metric-row {\n display: flex;\n justify-content: space-between;\n padding: 8px 0;\n border-bottom: 1px solid #f8f9fa;\n }\n\n .metric-label { font-weight: 500; color: #7f8c8d; font-size: 0.9rem; }\n .metric-value { font-family: 'SF Mono', monospace; font-weight: 600; color: var(--blue); }\n\n .prediction-box {\n background-color: #FFE5CC;\n border-left: 4px solid #CC5500;\n padding: 1rem;\n border-radius: 8px;\n margin: 1rem 0;\n }\n\n .feasibility-banner {\n padding: 1rem;\n border-radius: 10px;\n font-weight: 700;\n text-align: center;\n margin-bottom: 2rem;\n box-shadow: 0 4px 15px rgba(0,0,0,0.1);\n }\n\n .stakeholder-card {\n background: #f8f9fa;\n border-radius: 12px;\n padding: 1.5rem;\n border-left: 6px solid var(--blue);\n margin-top: 1rem;\n }\n</style>\n<span class=\"markdown prose dark:prose-invert contents\"><div style=\"background: linear-gradient(135deg, #0f172a 0%, #1e293b 100%);\n padding: 36px 44px; border-radius: 16px; color: white;\n box-shadow: 0 8px 32px rgba(0,0,0,0.3);\">\n <div style=\"font-size: 0.72rem; font-weight: 700; letter-spacing: 0.18em;\n color: #475569; text-transform: uppercase; margin-bottom: 10px;\">\n Machine Learning 
Systems \u00b7 Volume I \u00b7 Lab 00\n </div>\n <h1 style=\"margin: 0 0 10px 0; font-size: 2.4rem; font-weight: 900;\n color: #f8fafc; line-height: 1.1; letter-spacing: -0.02em;\">\n The Architect's Portal\n </h1>\n <p style=\"margin: 0 0 20px 0; font-size: 1.05rem; color: #94a3b8;\n max-width: 620px; line-height: 1.65;\">\n This course is not about machine learning. It is about the infrastructure\n that makes machine learning possible \u2014 and the physical laws that govern it.\n </span>\n <div style=\"display: flex; gap: 12px; flex-wrap: wrap;\">\n <span style=\"background: rgba(99,102,241,0.15); color: #a5b4fc;\n padding: 5px 14px; border-radius: 20px; font-size: 0.8rem;\n font-weight: 600; border: 1px solid rgba(99,102,241,0.25);\">\n Orientation \u00b7 3 Concept Checks \u00b7 Interface Tour\n </span>\n <span style=\"background: rgba(16,185,129,0.15); color: #6ee7b7;\n padding: 5px 14px; border-radius: 20px; font-size: 0.8rem;\n font-weight: 600; border: 1px solid rgba(16,185,129,0.25);\">\n 20\u201325 min\n </span>\n <span style=\"background: rgba(245,158,11,0.15); color: #fcd34d;\n padding: 5px 14px; border-radius: 20px; font-size: 0.8rem;\n font-weight: 600; border: 1px solid rgba(245,158,11,0.25);\">\n No prior reading required\n </span>\n </div>\n</div></span></div>"
}
}
],
@@ -32,12 +32,12 @@
},
{
"id": "vblA",
"code_hash": "37f6a477d1e84e01fe3f525d912a417b",
"code_hash": "c99e97c8c1fe662b0b310749f4fde94c",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
"text/html": "<div style='display: flex;flex: 1;flex-direction: column;justify-content: flex-start;align-items: normal;flex-wrap: nowrap;gap: 0.5rem'><span class=\"markdown prose dark:prose-invert contents\"><hr /></span><span class=\"markdown prose dark:prose-invert contents\"><h2 id=\"the-95-problem\">The 95% Problem</h2>\n<span class=\"paragraph\">When Google published a study of their internal ML systems in 2015, they found\nsomething that surprised the field. In a production ML system, the actual model \u2014\nthe neural network, the training algorithm, the matrix math \u2014 accounts for roughly\n<strong>5% of the total codebase</strong>.</span>\n<span class=\"paragraph\">The other <strong>95%</strong> is infrastructure: data pipelines, serving systems, monitoring,\nhardware resource management, configuration, feature stores, deployment tooling.</span>\n<span class=\"paragraph\">This has a direct implication for how you should think about your role as an engineer:</span></span>\n <div style=\"display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 16px 0;\">\n <div style=\"background: #fef2f2; border: 1px solid #fecaca; border-radius: 12px;\n padding: 20px; border-left: 5px solid #ef4444;\">\n <div style=\"font-weight: 800; color: #991b1b; margin-bottom: 8px;\">\n ML Engineering\n </div>\n <div style=\"color: #7f1d1d; font-size: 0.9rem; line-height: 1.6;\">\n Build and improve the model. Choose the architecture.\n Tune hyperparameters. Improve accuracy. <br/><br/>\n <strong>Optimizes the 5%.</strong>\n </div>\n </div>\n <div style=\"background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 12px;\n padding: 20px; border-left: 5px solid #16a34a;\">\n <div style=\"font-weight: 800; color: #14532d; margin-bottom: 8px;\">\n ML Systems Engineering\n </div>\n <div style=\"color: #14532d; font-size: 0.9rem; line-height: 1.6;\">\n Build the infrastructure that makes the model run reliably\n at scale, within hardware constraints, in production. 
<br/><br/>\n <strong>Optimizes the 95%.</strong>\n </div>\n </div>\n </div>\n <span class=\"markdown prose dark:prose-invert contents\"><span class=\"paragraph\">A model that achieves 99% accuracy in a Jupyter notebook is <strong>not a product</strong>.\nIt becomes a product only when it can run in real-time on real hardware,\nserve thousands of concurrent users, recover from failures, detect when it\ndegrades, and update without downtime. That is the engineering this course teaches.</span></span></div>"
}
}
],
@@ -45,7 +45,7 @@
},
{
"id": "bkHC",
"code_hash": "e7a18c8049ed6bfd00fc0fef89bdd96d",
"code_hash": "1bcdf120c15fbde8d591fa469285e078",
"outputs": [
{
"type": "data",
@@ -58,12 +58,12 @@
},
{
"id": "lEQa",
"code_hash": "3d14a02aa63fe0fd8cafcc26f1ef8172",
"code_hash": "5233f52c0b3cbb1ccbb35040a0f91a2b",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
"text/html": "<div style='display: flex;flex: 1;flex-direction: column;justify-content: flex-start;align-items: normal;flex-wrap: nowrap;gap: 0.5rem'><marimo-ui-element object-id='bkHC-0' random-id='c9ca4c5a-bfe0-71f6-6120-0eca8dbbc058'><marimo-radio data-initial-value='null' data-label='"<span class=\"markdown prose dark:prose-invert contents\"><span class=\"paragraph\"><strong>Check your understanding.</strong> A startup ships a model with 94% accuracy.\nSix months later, accuracy has silently dropped to 81% in production \u2014 but no code\nhas changed. As an ML Systems engineer, which part of the system is your <em>primary</em>\ndomain for diagnosing and fixing this?</span></span>"' data-options='["A) The model architecture \u2014 choosing transformers over CNNs","B) The training algorithm \u2014 selecting Adam vs SGD","C) The serving infrastructure \u2014 how the model runs reliably in production","D) The dataset size \u2014 gathering more labeled training examples"]' data-inline='false' data-disabled='false'></marimo-radio></marimo-ui-element>\n <div style=\"background: #fef2f2; border: 1.5px solid #ef4444;\n border-radius: 10px; padding: 18px 20px; margin-top: 8px;\">\n <div style=\"font-weight: 700; color: #ef4444; margin-bottom: 6px;\">\n \u26a0\ufe0f Not quite\n </div>\n <div style=\"color: #1e293b; font-size: 0.92rem; line-height: 1.6;\">\n **Not quite.** More training data would help if you were retraining \u2014 but the immediate problem is that you don't even *know* the model is degrading until someone complains. The systems problem is the absence of monitoring. Data collection is part of the solution, but detecting the problem comes first.\n </div>\n </div>\n </div>"
}
}
],
@@ -71,12 +71,12 @@
},
{
"id": "PKri",
"code_hash": "63bd617772be04f09aef7b9bd42c6461",
"code_hash": "cda5d32f0c0cb1ffa41f016278e72b82",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
"text/html": "<div style='display: flex;flex: 1;flex-direction: column;justify-content: flex-start;align-items: normal;flex-wrap: nowrap;gap: 0.5rem'><span class=\"markdown prose dark:prose-invert contents\"><hr /></span><span class=\"markdown prose dark:prose-invert contents\"><h2 id=\"why-constraints-drive-architecture\">Why Constraints Drive Architecture</h2>\n<span class=\"paragraph\">The same model cannot simply be \"resized\" to run everywhere.\nThree physical laws carve the deployment landscape into distinct regimes\nthat no amount of software engineering can bridge:</span></span>\n <div style=\"display: grid; grid-template-columns: repeat(3, 1fr); gap: 14px; margin: 16px 0;\">\n\n <div style=\"background: white; border: 1px solid #e2e8f0; border-radius: 12px;\n padding: 18px; border-top: 4px solid #6366f1;\">\n <div style=\"font-size: 1.4rem; margin-bottom: 6px;\">\u26a1</div>\n <div style=\"font-weight: 800; color: #1e293b; font-size: 0.95rem; margin-bottom: 6px;\">\n The Speed of Light\n </div>\n <div style=\"color: #64748b; font-size: 0.85rem; line-height: 1.5;\">\n London to New York = roughly 36 ms minimum round-trip.\n A self-driving car that needs a 10 ms decision loop\n <strong>cannot route to a remote datacenter</strong>.\n Physics sets this floor. 
No GPU upgrade helps.\n </div>\n </div>\n\n <div style=\"background: white; border: 1px solid #e2e8f0; border-radius: 12px;\n padding: 18px; border-top: 4px solid #ef4444;\">\n <div style=\"font-size: 1.4rem; margin-bottom: 6px;\">\ud83c\udf21\ufe0f</div>\n <div style=\"font-weight: 800; color: #1e293b; font-size: 0.95rem; margin-bottom: 6px;\">\n Thermodynamics\n </div>\n <div style=\"color: #64748b; font-size: 0.85rem; line-height: 1.5;\">\n Heat accumulates faster than a small enclosure can dissipate it.\n A smartphone running a heavy model continuously\n <strong>throttles its processor after 90 seconds</strong>.\n No software fix prevents heat.\n </div>\n </div>\n\n <div style=\"background: white; border: 1px solid #e2e8f0; border-radius: 12px;\n padding: 18px; border-top: 4px solid #10b981;\">\n <div style=\"font-size: 1.4rem; margin-bottom: 6px;\">\ud83d\udcbe</div>\n <div style=\"font-weight: 800; color: #1e293b; font-size: 0.95rem; margin-bottom: 6px;\">\n Memory Physics\n </div>\n <div style=\"color: #64748b; font-size: 0.85rem; line-height: 1.5;\">\n Moving data through memory costs energy and takes time.\n A microcontroller with 256 KB of SRAM\n <strong>cannot page memory from disk</strong>.\n If the model doesn't fit, it doesn't run.\n </div>\n </div>\n\n </div>\n <span class=\"markdown prose dark:prose-invert contents\"><span class=\"paragraph\">These three constraints \u2014 latency floors, power limits, and memory capacity \u2014\ndivide the world into four fundamentally different deployment environments.\nEngineers who treat deployment as an afterthought collide with these walls\nafter months of architectural work.</span>\n<span class=\"paragraph\"><strong>The insight of ML Systems engineering:</strong> choose your regime <em>first</em>,\nbecause the physics of that regime constrains every design decision that follows.</span></span></div>"
}
}
],
@@ -84,7 +84,7 @@
},
{
"id": "Xref",
"code_hash": "36b13dca6d552ab03eb36816b0cb2cad",
"code_hash": "b7e2d50be5fd599276333a173d5a52ac",
"outputs": [
{
"type": "data",
@@ -97,7 +97,7 @@
},
{
"id": "SFPL",
"code_hash": "6afaf345be9de9338a6d263ea95c39d2",
"code_hash": "c8eed0bb6c348abf5c520c0b82abbb5c",
"outputs": [
{
"type": "data",
@@ -110,23 +110,120 @@
},
{
"id": "BYtC",
"code_hash": "1bda80f2be4d3658e0baa43fbe7ae8c1",
"code_hash": "c5affd7fc479b743727501b988968c2b",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
}
}
],
"console": []
},
{
"id": "RGSE",
"code_hash": "bc63f9af4a914e5e2ce16d1ae16548d6",
"outputs": [
{
"type": "data",
"data": {
"text/plain": ""
}
}
],
"console": []
},
{
"id": "Kclp",
"code_hash": "94be6867fba4661cf7ad924e093bc5ca",
"outputs": [
{
"type": "error",
"ename": "exception",
"evalue": "name 'view' is not defined",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
],
"console": [
"console": []
},
{
"id": "emfo",
"code_hash": "0393b9cb5eaf6dbb04d44c962fd8fcc1",
"outputs": [
{
"type": "stream",
"name": "stderr",
"text": "<span class=\"codehilite\"><div class=\"highlight\"><pre><span></span><span class=\"gt\">Traceback (most recent call last):</span>\n File <span class=\"nb\">"/var/folders/nv/p2yc903d60vbvprnhf4hhvhc0000gq/T/marimo_32436/__marimo__cell_BYtC_.py"</span>, line <span class=\"m\">1</span>, in <span class=\"n\"><module></span>\n<span class=\"w\"> </span><span class=\"n\">view</span>\n<span class=\"gr\">NameError</span>: <span class=\"n\">name 'view' is not defined</span>\n</pre></div>\n</span>",
"mimetype": "application/vnd.marimo+traceback"
"type": "error",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
]
],
"console": []
},
{
"id": "Hstk",
"code_hash": "2d43ec0662d1feb3264fe81613dbd973",
"outputs": [
{
"type": "error",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
],
"console": []
},
{
"id": "nWHF",
"code_hash": "288b8aeadc385e04249311d6fd71ca10",
"outputs": [
{
"type": "error",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
],
"console": []
},
{
"id": "iLit",
"code_hash": "1d3104cde642f65c3f442f3eacd64162",
"outputs": [
{
"type": "error",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
],
"console": []
},
{
"id": "ZHCJ",
"code_hash": "aea079cd3a2333ee1fad70908380fd0a",
"outputs": [
{
"type": "error",
"ename": "ancestor-stopped",
"evalue": "This cell wasn't run because an ancestor was stopped with `mo.stop`: ",
"traceback": []
}
],
"console": []
},
{
"id": "ROlb",
"code_hash": "d2f0420508045e548e25db5b3fb08abd",
"outputs": [
{
"type": "data",
"data": {
"text/html": "\n<div style=\"display:flex; gap:28px; align-items:center; padding:12px 24px;\n background:#0f172a; border-radius:10px; margin-top:32px;\n font-family:'SF Mono','Fira Code',monospace; font-size:0.8rem;\n border:1px solid #1e293b;\">\n <div style=\"color:#475569; font-weight:600; letter-spacing:0.06em;\">\ud83d\uddc2\ufe0f DESIGN LEDGER</div>\n <div>\n <span style=\"color:#475569;\">Context: </span>\n <span style=\"color:#475569; font-weight:700;\">EDGE</span>\n </div>\n <div>\n <span style=\"color:#475569;\">Chapter: </span>\n <span style=\"color:#e2e8f0;\">0</span>\n </div>\n <div>\n <span style=\"color:#475569;\">Status: </span>\n <span style=\"color:#4ade80;\">Active \u2014 Chapter 0</span>\n </div>\n</div>\n"
}
}
],
"console": []
}
]
}
File diff suppressed because it is too large