mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-04-27 09:59:49 -05:00
Iterates on the post-merge 404 redesign across all 8 sub-sites:
- SVG roofline-plot fonts bumped for readability without changing layout: axis labels 9.5pt to 14pt, region labels 9.5pt to 14pt, title and "your page (404)" annotation 11pt to 16pt.
- Random-joke font shrunk from 1.6rem to 1.3rem (1.1rem on mobile) so the joke no longer dominates the SVG above it.
- Removed the static "Looks like this page slipped past our load balancer" subline from 6 sub-sites; it read as a competing static joke alongside the random rotation. Slides and instructors keep their informational sublines.
- Joke pool tightened 97 to 79 via a strict ML-systems-centric test: if you could swap "page" for any generic web/ops resource and the joke still works, cut. Cuts removed the H&P canonical material (Amdahl's Law, TLB miss, Dennard scaling, false sharing, fat-tree topology) that is general computer architecture rather than ML systems specifically. 19 borderline jokes were rewritten to anchor punchlines in concepts only ML practitioners decode (KV cache, gradient AllReduce, prefill, ZeRO, speculative-decode acceptance, 1F1B schedule, ridge point, BPE merges).
232 lines
16 KiB
Plaintext
---
title: "Page Not Found"
---

```{=html}
<style>
.j404-wrap { max-width: 720px; margin: 4rem auto 2rem; padding: 0 1.5rem; text-align: center; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", system-ui, sans-serif; }
.j404-eyebrow { font-size: 0.75rem; font-weight: 700; letter-spacing: 0.18em; text-transform: uppercase; color: #a31f34; margin-bottom: 1.5rem; }
.j404-illustration { max-width: 520px; margin: 0 auto 1.75rem; }
.j404-illustration svg { width: 100%; height: auto; display: block; }
.j404-joke { font-size: 1.3rem; font-weight: 500; line-height: 1.45; color: #1a1a1a; margin: 0 auto 1.25rem; font-style: italic; min-height: 3.5rem; }
.j404-sub { color: #555; font-size: 1rem; margin-bottom: 2.25rem; }
.j404-actions { display: flex; gap: 0.75rem; flex-wrap: wrap; justify-content: center; margin-bottom: 1rem; }
.j404-btn { padding: 0.7rem 1.4rem; border-radius: 6px; font-weight: 600; text-decoration: none; font-size: 0.95rem; transition: transform 0.1s ease, border-color 0.15s ease, color 0.15s ease, background 0.15s ease; display: inline-block; }
.j404-btn:hover { transform: translateY(-1px); }
.j404-btn { text-decoration: none !important; }
.j404-btn-primary { background: #a31f34 !important; color: #fff !important; }
.j404-btn-primary:hover { background: #8c1a2c !important; color: #fff !important; }
.j404-btn-secondary { background: #fff !important; color: #1a1a1a !important; border: 1.5px solid #d0d0d0; }
.j404-btn-secondary:hover { border-color: #a31f34 !important; color: #a31f34 !important; }
.j404-refresh { margin: 0 auto 2.5rem; display: block; font-size: 0.82rem; color: #888; background: none; border: none; cursor: pointer; text-decoration: underline; text-decoration-style: dotted; text-underline-offset: 3px; font-family: inherit; }
.j404-refresh:hover { color: #a31f34 !important; }
.j404-meta-row { display: flex; gap: 0.6rem 0.9rem; align-items: center; justify-content: center; flex-wrap: wrap; margin: 0 auto 2.5rem; }
.j404-meta-row .j404-refresh { margin: 0; display: inline; }
.j404-meta-sep { color: #ccc; font-size: 0.85rem; user-select: none; }
.j404-kits-row { display: flex; gap: 0.5rem 0.9rem; flex-wrap: wrap; justify-content: center; font-size: 0.82rem; margin-bottom: 1.75rem; color: #777; }
.j404-kits-row a { color: #555 !important; text-decoration: none !important; }
.j404-kits-row a:hover { color: #a31f34 !important; }
.j404-kits-row span { color: #c0c0c0; }
.j404-nav { display: flex; gap: 0.6rem 1.1rem; flex-wrap: wrap; justify-content: center; font-size: 0.85rem; border-top: 1px solid #e5e5e5; padding-top: 1.5rem; }
.j404-nav a { color: #555 !important; text-decoration: none !important; }
.j404-nav a:hover { color: #a31f34 !important; }
@media (max-width: 540px) { .j404-wrap { margin-top: 2.5rem; } .j404-joke { font-size: 1.1rem; } }
@media (prefers-color-scheme: dark) {
  .j404-joke { color: #f0f0f0; }
  .j404-sub { color: #aaa; }
  .j404-btn-secondary { background: transparent !important; color: #f0f0f0 !important; border-color: #444; }
  .j404-btn-secondary:hover { border-color: #ff6b80 !important; color: #ff6b80 !important; }
  .j404-refresh { color: #999; }
  .j404-refresh:hover { color: #ff6b80 !important; }
  .j404-meta-sep { color: #555; }
  .j404-kits-row { color: #aaa; }
  .j404-kits-row a { color: #aaa !important; }
  .j404-kits-row a:hover { color: #ff6b80 !important; }
  .j404-kits-row span { color: #555; }
  .j404-nav { border-top-color: #333; }
  .j404-nav a { color: #aaa !important; }
  .j404-nav a:hover { color: #ff6b80 !important; }
}
</style>

<div class="j404-wrap">
  <p class="j404-eyebrow">Route not found</p>
  <div class="j404-illustration">
    <svg viewBox="0 0 680 280" xmlns="http://www.w3.org/2000/svg" font-family="Helvetica Neue, Helvetica, Arial, sans-serif" role="img" aria-labelledby="r404-title r404-desc">
      <title id="r404-title">Roofline plot: the requested page is outside the achievable region</title>
      <desc id="r404-desc">A roofline plot. The Y axis is throughput in FLOPs per second. The X axis is arithmetic intensity in FLOPs per byte. A blue memory-bandwidth slope rises from the origin to a ridge point, where it meets a green horizontal compute roof. The achievable region under the lines is shaded faintly. A red X marks the requested page far above the slope at low arithmetic intensity, in unreachable territory, with the annotation "your page (404)".</desc>
      <defs>
        <marker id="r404-arrow-red" markerWidth="8" markerHeight="6" refX="7" refY="3" orient="auto"><path d="M0,0 L8,3 L0,6 Z" fill="#a31f34"/></marker>
        <marker id="r404-arrow" markerWidth="8" markerHeight="6" refX="7" refY="3" orient="auto"><path d="M0,0 L8,3 L0,6 Z" fill="#555"/></marker>
      </defs>
      <rect width="680" height="280" fill="#fff" rx="4"/>
      <!-- Faint grid -->
      <g stroke="#eee" stroke-width="0.8">
        <line x1="100" y1="80" x2="620" y2="80"/>
        <line x1="100" y1="120" x2="620" y2="120"/>
        <line x1="100" y1="160" x2="620" y2="160"/>
        <line x1="100" y1="200" x2="620" y2="200"/>
        <line x1="200" y1="40" x2="200" y2="240"/>
        <line x1="300" y1="40" x2="300" y2="240"/>
        <line x1="400" y1="40" x2="400" y2="240"/>
        <line x1="500" y1="40" x2="500" y2="240"/>
        <line x1="600" y1="40" x2="600" y2="240"/>
      </g>
      <!-- Achievable region (faint shade) -->
      <path d="M 100,240 L 340,80 L 610,80 L 610,240 Z" fill="#f7f7f7"/>
      <!-- Title -->
      <text x="340" y="22" text-anchor="middle" font-size="16" font-weight="700" fill="#333">Roofline analysis of your request</text>
      <!-- Axes -->
      <line x1="100" y1="240" x2="100" y2="40" stroke="#555" stroke-width="1.2" marker-end="url(#r404-arrow)"/>
      <line x1="100" y1="240" x2="630" y2="240" stroke="#555" stroke-width="1.2" marker-end="url(#r404-arrow)"/>
      <!-- Memory-bandwidth slope (blue = compute / data movement boundary) -->
      <line x1="100" y1="240" x2="340" y2="80" stroke="#4a90c4" stroke-width="2.5"/>
      <!-- Compute roof (green = compute-bound region) -->
      <line x1="340" y1="80" x2="610" y2="80" stroke="#3d9e5a" stroke-width="2.5"/>
      <!-- Ridge point -->
      <circle cx="340" cy="80" r="3.5" fill="#333"/>
      <text x="340" y="64" text-anchor="middle" font-size="13" fill="#555">ridge point</text>
      <!-- Region labels -->
      <text x="200" y="155" text-anchor="middle" font-size="14" font-weight="700" fill="#4a90c4">memory-bound</text>
      <text x="490" y="120" text-anchor="middle" font-size="14" font-weight="700" fill="#3d9e5a">compute-bound</text>
      <!-- "Achievable" italics inside the shaded region -->
      <text x="430" y="200" text-anchor="middle" font-size="12" font-style="italic" fill="#999">achievable region</text>
      <!-- The X marker for "your page" - placed above the slope (unreachable) -->
      <g stroke="#a31f34" stroke-width="3" stroke-linecap="round">
        <line x1="155" y1="115" x2="175" y2="135"/>
        <line x1="175" y1="115" x2="155" y2="135"/>
      </g>
      <!-- Curved annotation arrow + label -->
      <path d="M 280 145 Q 230 138 188 130" fill="none" stroke="#a31f34" stroke-width="1.4" marker-end="url(#r404-arrow-red)"/>
      <text x="290" y="149" text-anchor="start" font-size="16" font-weight="700" fill="#a31f34">your page (404)</text>
      <!-- Axis labels -->
      <text x="50" y="140" text-anchor="middle" font-size="14" fill="#555" transform="rotate(-90 50 140)">throughput (FLOP/s)</text>
      <text x="365" y="265" text-anchor="middle" font-size="14" fill="#555">arithmetic intensity (FLOP/byte)</text>
    </svg>
  </div>
  <p class="j404-joke" id="j404-joke"> </p>
  <div class="j404-actions">
    <a class="j404-btn j404-btn-primary" href="/">← Kits home</a>
    <a class="j404-btn j404-btn-secondary" href="https://mlsysbook.ai/games/">Open the playground →</a>
  </div>
  <div class="j404-meta-row">
    <button class="j404-refresh" type="button" onclick="window.j404Pick && window.j404Pick()">↻ another joke</button>
    <span class="j404-meta-sep" aria-hidden="true">·</span>
    <a class="j404-refresh" href="https://github.com/harvard-edge/cs249r_book/issues/new?template=404_joke.yml" target="_blank" rel="noopener">✎ contribute one</a>
  </div>
  <div class="j404-kits-row" aria-label="Specific kits">
    <a href="/contents/getting-started.html">Getting started</a><span>·</span>
    <a href="/contents/arduino/nicla_vision/nicla_vision.html">Arduino Nicla</a><span>·</span>
    <a href="/contents/seeed/xiao_esp32s3/xiao_esp32s3.html">XIAO ESP32S3</a><span>·</span>
    <a href="/contents/seeed/grove_vision_ai_v2/grove_vision_ai_v2.html">Grove Vision AI</a><span>·</span>
    <a href="/contents/raspi/raspi.html">Raspberry Pi</a>
  </div>
  <nav class="j404-nav" aria-label="Site navigation">
    <a href="https://mlsysbook.ai/">mlsysbook.ai</a>
    <a href="https://mlsysbook.ai/vol1/">Volume I</a>
    <a href="https://mlsysbook.ai/vol2/">Volume II</a>
    <a href="https://mlsysbook.ai/tinytorch/">TinyTorch</a>
    <a href="https://mlsysbook.ai/labs/">Labs</a>
    <a href="https://mlsysbook.ai/slides/">Slides</a>
    <a href="https://mlsysbook.ai/instructors/">Instructors</a>
    <a href="https://github.com/harvard-edge/cs249r_book/issues/new">Report broken link</a>
  </nav>
</div>

<script>
(function() {
  const jokes = [
    "Attention is all you need. This page needed something else.",
    "L1 missed. L2 missed. L3 missed. HBM missed. Your weights are in another castle.",
    "We trained on this page. It was in the test set the whole time.",
    "p50 says the page exists. p99 disagrees. p99.9 timed out mid-decode.",
    "The decoder hallucinated a 404 with high confidence and low calibration.",
    "Cold start. The container is downloading 80 GB of weights to serve you a 404.",
    "OOM during prefill. The prompt was longer than the context window we lied about supporting.",
    "Head-of-line blocking. A 32k-token request landed in the batch ahead of you and just started decoding.",
    "Pipeline parallel stage 3 stalled. The bubble ate your URL.",
    "The router in our MoE never activated this expert. It atrophied.",
    "Loss went to NaN at step 47,000. We rolled back to step 46,500. The page rolled back further.",
    "Position embedding ran out at index 4096. The page lived at 4097.",
    "Catastrophic forgetting claimed this route during continual learning.",
    "The teacher model knew this page. The student did not absorb it during distillation.",
    "Training-serving skew: the URL existed at training time but not at inference.",
    "Hot-swap loaded the new model. The new model has never heard of this URL.",
    "Canary saw 0.4 percent error and held the rollout. You are in the 0.4 percent.",
    "The arithmetic intensity here is 0.3 FLOPs per byte. We are firmly memory-bound and pageless.",
    "The energy budget was 1 mJ per inference. The page cost 2 mJ to find.",
    "The mixture-of-experts gating function routed your request to expert 0, which is unimplemented.",
    "Megatron sharded the page tensor-parallel across 8 ways. We needed 9.",
    "Pipeline bubble fraction hit 100 percent. The whole training run was bubble.",
    "Weak scaling held until we asked for the page. Then communication ate the gradient.",
    "Rendezvous timed out at 30 minutes. The straggler with your shard showed up at 31.",
    "Data parallel replica 17 has a different version of this page. AllReduce will not be reconciling.",
    "Tensor parallel rank 0 has the page. Tensor parallel rank 1 disagrees. The all-gather chose chaos.",
    "Launched the run on 8,192 H100s. Your shard sits on the one in the Reno DC that just lost cooling.",
    "We trained for 47 hours. The loss diverged at hour 46. So did the page.",
    "Training cost: 4.2 million dollars. The page was cut in the post-mortem.",
    "Pipeline schedule was 1F1B. The page was the F that never reached its B.",
    "The all-to-all for expert parallelism routed the page to expert 64 of 64. There are 63 experts.",
    "Reward model assigned this page a score of -2.3. The policy stopped sampling it.",
    "RLHF raters labeled this URL as unhelpful. The policy learned.",
    "The page exists, but only at temperature 0.0. You queried at 0.7.",
    "Compute-optimal scaling told us to train on fewer tokens. This page was one of them.",
    "We computed the gradient with respect to this page and found it identically zero.",
    "Sharded across 8 GPUs. The GPU with your shard is busy on someone else's training job.",
    "GPU memory is fragmented enough that we can fit the page or the KV cache, not both.",
    "Inference latency budget consumed entirely by tokenization. No tokens were generated.",
    "Decode phase is memory-bound. So is the team's enthusiasm for fixing this.",
    "Distribution shift: 404s are now the dominant class.",
    "Batch size auto-tuner picked 1. We are now serving you, alone, at the wrong end of the roofline.",
    "The model registry pointed at v3. v3 was deleted by the cleanup job last Tuesday.",
    "Retraining was triggered by your request. Come back in 6 to 8 weeks.",
    "Stale weights served this page until 14 minutes ago. Now no checkpoint serves it.",
    "Vector index has not been rebuilt since the last embedding model swap. Neither has trust.",
    "The thermal sensor read 95 C. The accelerator throttled the page to a lower frequency, then to zero tokens per second.",
    "Tensor cores idle. SMs idle. The page is also idle, somewhere on the die.",
    "We pruned 90 percent of the weights to fit the MCU. We pruned the page too.",
    "We tried 1-bit quantization on the URL. The page rounded to zero.",
    "The KV cache was offloaded to swap. There is no swap on a microcontroller.",
    "NCCL hung at 86 percent of the AllReduce. The remaining 14 percent contained the page.",
    "Elastic training rescaled the world size mid-step. The page belonged to the old world.",
    "Fault recovery loaded the last checkpoint. The page was added a thousand steps after.",
    "We bucketed gradients at 25 MB. The page was 26.",
    "Recomputation skipped the layer with the page. Backward pass produced confident garbage.",
    "Wall-clock time to convergence: 19 days. Wall-clock time to find this page: longer.",
    "The tokenizer split this URL into bytes the BPE merges had never seen before.",
    "Continuous batching evicted this page mid-generation. No tokens were saved.",
    "The KV cache for this route was flushed when the replica hot-swapped weights.",
    "Request queue depth exceeded the batcher's patience. Dropped before prefill.",
    "Batch window closed before your request joined. The next decode step is in 40 ms, or never.",
    "Speculative decoding accepted 73 percent of drafted tokens. Today's page was in the 27.",
    "This route violated the throughput SLO and was demoted to best-effort inference.",
    "Traffic shifted 5 percent to the new model. You were in the 5 percent. The new model never saw this page in training.",
    "Shadow inference served this page. Production never noticed it was missing.",
    "Data drift detector fired on this URL. The route was retrained out of existence.",
    "Warm pool of model replicas was sized for predicted traffic. Your traffic was unpredicted.",
    "Prefix cache hit. The prefix matched a different prompt by 31 of 32 tokens.",
    "The page was quantized to INT4 for the edge. The decimal point wandered off.",
    "This page was scheduled on the systolic array. Row 7 is still propagating partial sums.",
    "The ridge point is at 76 FLOPs per byte. This page came in at 0.4.",
    "NVLink saturated between GPUs. The page was queued behind the gradient AllReduce.",
    "Ring AllReduce completed in 2(N-1) hops. The page was at hop 2N.",
    "GPU-direct RDMA bypassed the CPU on the gradient path. It also bypassed this page.",
    "Gradient bucketing flushed the bucket containing this page into the void.",
    "Training resumed from a checkpoint that predated the page by 3,000 steps.",
    "Micro-batch 14 of 16 contained the page. Micro-batch 14 was skipped to fit the pipeline schedule.",
    "ZeRO-2 offloaded the gradient for this page to CPU. The CPU was OOM."
  ];
  // Show a random joke, avoiding an immediate repeat of the previous one.
  function pick() {
    const el = document.getElementById('j404-joke');
    if (!el) return;
    let i = Math.floor(Math.random() * jokes.length);
    // On a repeat, step to the next index (wrapping at the end of the pool).
    if (el.dataset.lastIdx === String(i) && jokes.length > 1) i = (i + 1) % jokes.length;
    el.dataset.lastIdx = String(i);
    el.textContent = jokes[i];
  }
  // Exposed globally for the refresh button's inline onclick handler.
  window.j404Pick = pick;
  pick();
})();
</script>
```
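
The picker in the script above avoids showing the same joke twice in a row by stepping to the next index whenever the random draw repeats the last one. As a minimal sketch, that rule can be pulled out into a pure function so it can be exercised without a DOM. The name `nextIndex` and the injectable `rand` parameter are hypothetical and are not part of the page.

```javascript
// Hypothetical helper, not part of the page: choose an index in [0, n)
// that differs from `last` whenever the pool has more than one entry.
// `rand` is injectable so the behaviour can be tested deterministically.
function nextIndex(n, last, rand = Math.random) {
  if (n <= 0) return -1; // empty pool: nothing to pick
  let i = Math.floor(rand() * n);
  // Same rule as the page: on a repeat, step to the neighbouring index.
  if (i === last && n > 1) i = (i + 1) % n;
  return i;
}

// Usage sketch with a three-item stand-in pool.
const pool = ["a", "b", "c"];
let last = -1;
for (let k = 0; k < 5; k++) {
  last = nextIndex(pool.length, last);
  console.log(pool[last]);
}
```

One consequence of this rule is that the entry right after the previous pick is chosen with probability 2/n rather than 1/n. If that bias ever matters, drawing from the n-1 remaining indices directly would remove it.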