mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-07 10:08:50 -05:00
dev
17 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
825d9571a6 |
chore: remove archived content and refresh contributor docs
- Remove retired _archive/ and scripts/archive/ trees (site, book filters, games, vault); vault CHANGELOG points to git history for old scripts.
- CONTRIBUTING: site project row, site/ in area map, root vs TinyTorch pre-commit, vault schema drift wording.
- Newsletter CLI: path-agnostic news alias; tinytorch pre-commit comments; add tools/ and staffml-vault-types READMEs for maintainers.
|
||
|
|
9d4732f4ff |
fix(games): module-relative imports across all 13 remaining games
The same path-prefix bug that broke Lander on dev preview affected the other
13 games too. Fixing all of them in one batch so the entire catalog works
on /cs249r_book_dev/, mlsysbook.ai/, and localhost equally.
Pattern applied:
.qmd include-in-header script:
import "/assets/games/X.mjs" → import "../assets/games/X.mjs"
.mjs ES imports:
from "/assets/games/runtime.mjs" → from "./runtime.mjs"
from "/assets/games/vendor/pixi.min.mjs" → from "./vendor/pixi.min.mjs"
Files touched (10 .mjs + 13 .qmd):
.mjs: allreduce, batch, cluster, kvcache, moe, oom, pipeline, prune,
quantization, topology
.qmd: allreduce, batch, checkpoint, cluster, kvcache, loader, moe, oom,
pipeline, prune, quantization, roofline, topology
(checkpoint, loader, roofline .mjs already used 'import * as runtime from
./runtime.mjs' — only their qmd files needed updating)
Verification: all 14 games rendered locally (quarto render games/), served
via python3 -m http.server, swept with Playwright headless Chromium.
Result: 14/14 pass — canvas mounted, MLSP runtime ready, game registered,
no JS errors, no 4xx network requests. Visual screenshots confirm each
game's HUD/title/content paints correctly.
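Why the pattern works can be checked with the standard `URL` constructor, which resolves a specifier against a base the same way a browser resolves an ES module import. This is a minimal sketch, not code from the repository; the host and the exact module location below are assumed for illustration.

```javascript
// Assumed location of a game module on the dev preview (illustrative only).
const moduleUrl =
  "https://harvard-edge.github.io/cs249r_book_dev/assets/games/lander.mjs";

// Root-absolute specifier: resolved against the origin, so the
// /cs249r_book_dev/ site prefix is dropped.
const rootAbsolute = new URL("/assets/games/runtime.mjs", moduleUrl).href;

// Module-relative specifier: resolved against the importing module's
// directory, so the site prefix is preserved wherever the site is hosted.
const moduleRelative = new URL("./runtime.mjs", moduleUrl).href;

console.log(rootAbsolute);
console.log(moduleRelative);
```

The relative form gives the same result on any prefix, which is why one change covers the dev preview, the production site, and localhost.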
|
||
|
|
021feb5875 |
fix(lander): playtest hotfix — module-relative imports + ENTER-to-launch
User reported the live dev preview was broken (blank canvas, 'doesn't do
anything'). Playwright probe confirmed all .mjs imports 404'd:
[http 404] https://harvard-edge.github.io/assets/games/runtime.mjs
[http 404] https://harvard-edge.github.io/assets/games/lander.mjs
Root cause: dev preview lives at /cs249r_book_dev/ but every game imported its
modules via root-absolute paths (/assets/games/...). The dev-URL rewrite script
only handles https://mlsysbook.ai/..., not root-relative paths. All 14 games
have this bug; Lander is fixed here.
Path-prefix fix:
- lander.qmd: /assets/games/X.mjs → ../assets/games/X.mjs
- lander.mjs: /assets/games/runtime.mjs → ./runtime.mjs (sibling)
- lander.mjs: /assets/games/vendor/pixi.min.mjs → ./vendor/pixi.min.mjs
- runtime.mjs: /assets/games/vendor/pixi.min.mjs → ./vendor/pixi.min.mjs
- runtime.mjs: pixi-filters dynamic import → ./vendor/pixi-filters.min.mjs
UX feedback (bundled): user asked 'say hit enter to start so people don't feel
rushed and then they can read what's expected':
- READY CTA 'press UP to launch' → 'press ENTER to launch'
- Added italic 'Take your time — read the controls.' hint above the CTA
- Keydown accepts Enter, Space, OR ↑ as launch; any of the three works
- Center touch zone calls new shared launch() helper
- 'How to play' instructions updated to match
Verification: rendered locally (quarto render games/lander.qmd), served via
python3 -m http.server, probed with Playwright (headless Chromium). Page loads,
READY shows new CTA, Enter dismisses overlay, ↑ thrusts, crash triggers
per-failure aha card with correct share text. Zero console errors.
Outstanding: other 13 games still have the same path-prefix bug. Either apply
the same per-file fix, or extend rewrite-dev-urls.sh to also rewrite
/assets/... paths.
|
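The three-key launch rule can be sketched as a small predicate over `KeyboardEvent.key` values. The helper names and the `state.phase` field are hypothetical and not taken from lander.mjs; only the accepted keys come from the message above.

```javascript
// Keys accepted as "launch": Enter, Space, or the up arrow.
// These are the standard KeyboardEvent.key strings for those keys.
const LAUNCH_KEYS = new Set(["Enter", " ", "ArrowUp"]);

// Hypothetical helper: true when a keydown event should launch the game.
function isLaunchKey(event) {
  return LAUNCH_KEYS.has(event.key);
}

// Hypothetical wiring: launch only from the READY screen, so the up arrow
// still means "thrust" once play has started.
function handleKeydown(event, state, launch) {
  if (state.phase === "ready" && isLaunchKey(event)) {
    launch();
    return true;
  }
  return false;
}
```

Routing the keyboard and the center touch zone through one shared `launch()` callback keeps the two input paths from drifting apart.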
||
|
|
25016a9923 |
feat(lander): iter 9 — mobile / touch / a11y — three touch zones, reduce-motion, aria-live
The game was unplayable on a phone (keyboard-only controls). Animations
ignored OS reduce-motion preference. No aria announcement for game-over.
- Three invisible Pixi touch zones: left ⅓ steer-left, right ⅓ steer-right,
center ⅓ thrust + tap-to-launch from READY screen
- Pointer fallbacks (pointerupoutside, pointercancel) prevent stuck-key state
- Z-order re-pinned after touch-zone creation so retry pill stays clickable
- reduceMotion flag from matchMedia('(prefers-reduced-motion: reduce)')
- safeShake + safeBurst wrappers gate all 5 shakes and 4 big crash bursts
- Goal-pad pulse and CTA pulse held static when reduce-motion is set
- New aria-live='assertive' span; onGameOver writes per-reason announcement
- 'How to play' gains a mobile/tablet bullet
Lens-bounded change. Iter 10 (final ship-readiness pass) is next.
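A minimal sketch of the reduce-motion gating described above, assuming the wrappers simply skip the effect when the preference is set. `gateOnMotion` and the recorded effects are hypothetical stand-ins for the runtime's shake and burst helpers.

```javascript
// Read the OS preference once. Outside a browser there is no matchMedia,
// so this sketch assumes motion is allowed.
const reduceMotion =
  typeof matchMedia === "function" &&
  matchMedia("(prefers-reduced-motion: reduce)").matches;

// Wrap an effect so it becomes a no-op when reduced motion is requested.
// `isReduced` is injectable so the gate can be exercised without a browser.
function gateOnMotion(effect, isReduced = () => reduceMotion) {
  return (...args) => (isReduced() ? undefined : effect(...args));
}

// Hypothetical effects that only record that they ran.
const effectLog = [];
const safeShake = gateOnMotion((strength) => effectLog.push(["shake", strength]));
const safeBurst = gateOnMotion((x, y) => effectLog.push(["burst", x, y]));
```

Gating at the wrapper keeps every call site unchanged: game code calls `safeShake(...)` unconditionally and the preference is checked in one place.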
|
||
|
|
1e97ae6da9 |
feat(lander): iter 8 — landing page — trim duplicate HUD, refresh copy, a11y
The landing page lagged the in-canvas state machine. DOM HUD duplicated VRAM
and Speed values now drawn in-canvas. 'How to play' didn't mention READY
screen, RETRY button, daily seed, trajectory marker, or altitude line. 'The
Systems Concept' only framed the win path.
- DOM HUD trimmed to a key-cap controls row; VRAM/Speed values moved to
.mlsp-sr-only aria-live span (kept for AT, hidden visually)
- 'How to play' rewritten to match current state (READY-to-launch, all six
affordances called out)
- 'The Systems Concept' lists all five failure modes mapped to real training:
diverged, local-min, missed-basin, off-course, OOM
- Tail line invites the other 13 games
- New shared CSS: .mlsp-controls-line (key-cap row), .mlsp-sr-only (standard
visually-hidden pattern); reusable across the catalog
Lens-bounded change. Iter 9 (mobile/touch/a11y) is next.
|
||
|
|
eb98094d53 |
feat(lander): iter 7 — replay loop — daily seed, best score, retry button, share row
Lander was missing every replay-loop affordance the other 13 games have:
no daily seed, no best-score persistence, no visible retry button, no share
artifact, dead-static game-over state.
- dailySeed('lander') → terrain RNG; same loss surface worldwide today
- bestScore integration: lowest impact speed stored per-day; top-center chip
shows 'Day #N · your softest landing today: X m/s'
- In-canvas RETRY pill (Pixi Container, eventMode=static), MIT-red background,
visible only after gameOverFired
- buildShareText(state) per outcome: emoji-grid lines for win + 5 failure modes,
with ⭐ new personal best marker
- attachShareRow in lander.qmd appends share text + 📋 copy button to aha card,
with success feedback (✅ copied → reverts)
- New shared CSS in common.css: .mlsp-share-row, .mlsp-share-text, .mlsp-share-btn
(reusable by every other game)
Lens-bounded change. Iter 8 (landing page) is next.
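The per-day best score is a lower-is-better record. The sketch below assumes a `localStorage`-like object with `getItem`/`setItem`; the key layout and function names are hypothetical, and an in-memory stand-in is used so it runs outside a browser.

```javascript
// Minimal in-memory stand-in for localStorage (same getItem/setItem shape).
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => data.set(key, String(value)),
  };
}

// Record an impact speed for a given day number. The stored value is replaced
// only when the new speed is strictly lower (a softer landing).
// Returns true when the run is a new personal best for that day.
function recordLanding(storage, day, speed) {
  const key = `lander:best:${day}`; // hypothetical key layout
  const previous = storage.getItem(key);
  if (previous !== null && Number(previous) <= speed) return false;
  storage.setItem(key, speed);
  return true;
}
```

Keying by day number means yesterday's record never blocks today's, which matches the daily-seed framing: a new terrain, a new best to beat.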
|
||
|
|
50fd1ade50 |
feat(lander): iter 5 — pedagogy — per-failure aha messages, real OOM
Six possible outcomes, but all produced the identical aha card. The OOM event
didn't even end the game — flashed text once and let play continue. And
'GRADIENT EXPLOSION' misuses ML terminology (gradient explosion is NaN
propagation, not landing in the wrong place).
- state.reason tracks which failure occurred ('win'|'diverged'|'local-min'|
'off-course'|'missed-basin'|'oom')
- 'GRADIENT EXPLOSION' → 'MISSED THE BASIN' (accurate to loss-surface metaphor)
- OOM now properly ends the game with full juice (matches real training)
- New AHA[reason] map: six distinct messages, each mapping the failure to its
real ML systems counterpart
- api.aha(reason) returns the right card; lander.qmd attachAha consumes it
- Bug fix: state.gameOverFired guard so onGameOver fires once, not every frame
Lens-bounded change. Iter 6 (visual polish) is next.
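The fire-once guard and the reason-keyed message map can be sketched together. The message strings are placeholders (the real copy is not reproduced here) and `makeGame` is a hypothetical wrapper; only `state.reason`, `state.gameOverFired`, the six reason keys, and `aha(reason)` come from the message above.

```javascript
// Placeholder copy keyed by outcome.
const AHA = {
  "win": "placeholder: win",
  "diverged": "placeholder: diverged",
  "local-min": "placeholder: local-min",
  "off-course": "placeholder: off-course",
  "missed-basin": "placeholder: missed-basin",
  "oom": "placeholder: oom",
};

function makeGame(onGameOver) {
  const state = { reason: null, gameOverFired: false };

  // Safe to call every frame: only the first call records the reason
  // and notifies the page.
  function endGame(reason) {
    if (state.gameOverFired) return;
    state.gameOverFired = true;
    state.reason = reason;
    onGameOver(reason);
  }

  // Look up the card for a given outcome.
  const aha = (reason) => AHA[reason];

  return { state, endGame, aha };
}
```

Without the guard, a game loop that checks the end condition each frame would call `onGameOver` repeatedly for as long as the condition stays true.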
|
||
|
|
81f8ced46a |
fix(playground): refine game UX and visuals
Improve the playground shell, game explanations, and several mini-game interactions so the games read more clearly as teaching artifacts and work better in fullscreen. |
||
|
|
12d4aeaa7d |
feat(playground): complete 14-game MLSysBook arcade gallery
- Added 'KV Cache Packer' to teach PagedAttention and KV fragmentation
- Added 'Cluster Commander' to teach Slurm scheduling and fleet fragmentation
- Registered all 14 games in the runtime registry
- Fixed WebGL rendering loops to avoid performance overhead and crashes
- Updated 404 pages across all workspaces to route to the new games Playground
- Overrode default Quarto 'S' search shortcut to Shift+? to free up typing controls
|
||
|
|
31f38e5c33 |
feat(games): MLSysBook Playground v2 — PixiJS migration + Straggler ships, Roofline archived
Two threads landing together:
PixiJS migration (Pulse Prune + OOM):
- Vendored PixiJS v8 + pixi-filters v6 to site/assets/games/vendor/
(pixi-filters bundle patched to resolve "pixi.js" import locally)
- Added shared Pixi runtime (runtime.mjs): mountPixiOnCanvas,
pop/flash/burst/floatText/shake juice helpers, tween + lazy getFilters,
daily seed + best score; legacy window.MLSP bridge preserved for canvas-2D games
- Pulse Prune rewritten on Pixi (prune.mjs v8): GlowFilter on inference pulse +
18-position trail, dense bursts and bloom rings on critical cuts/win
- OOM rewritten on Pixi (oom.mjs v6) with pedagogy fixes from audit: KV cache
replaced with parameters block (training game, not inference); block ratios
rescaled to real HBM proportions (act=6, opt=6, params=3, grad=1); activation
freeing flipped FIFO -> LIFO (autograd traversal order); aha card cites
Korthikanti et al. 2022; visual lift via tween-in placement, ambient particles
at >=40% fill, GlowFilter on params + win state
Straggler (new game #4):
- New game on Pixi teaching distributed-training tail latency and ring all-reduce
- 8 GPU-creature sprite cast (idle/waiting/sighing/ash) generated as iconic
vector mascots, transparency-cleaned and sized for runtime in sprites/
- Synchronous-ring rhythm mechanic: tap each round to advance the ring; miss the
rhythm and the cluster stalls -> non-player GPUs accumulate idle time; 4.5s
stall throttles the player -> game over
- 4-second grace period auto-passes (tutorial), round duration ramps from
2.4s -> 0.7s over 60s, phase-2 gather pass spawns at t=30s
- HUD reports steps + cluster idle %; aha card cites Sergeev & Del Balso 2018
Catalogue cleanup:
- Roofline Runner archived to site/games/_archive/ and
site/assets/games/_archive/ (kept in registry.js with available:false rather
than deleted)
- Gallery (index.qmd) updated: 4 cards = Pulse Prune, Straggler, OOM, Sharp Shot
- Brand sweep "MLSys Playground" -> "MLSysBook Playground" across qmd footers,
registry header, common.css comment, 404 aria-label, and all .js share-text
Tooling:
- pre-commit codespell: skip site/assets/games/vendor/ (third-party minified
bundles); fix "alltime" -> "all-time" in user-visible strings; rename
fromI/toI -> fromIdx/toIdx in straggler ring math
Tested: all 4 games load with zero JS errors in headless Chromium.
|
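The FIFO -> LIFO change mirrors how backpropagation walks the graph: the activation saved last in the forward pass is the first one consumed in the backward pass. The sketch below shows only that ordering; `makeActivationStack` is a hypothetical name and is not code from oom.mjs.

```javascript
// Activations are saved in forward order and freed in reverse (LIFO).
function makeActivationStack() {
  const live = [];
  return {
    // Forward pass: save this layer's activation.
    forward(layer) {
      live.push(layer);
    },
    // Backward pass: free the most recently saved activation first.
    backward() {
      return live.pop();
    },
    size: () => live.length,
  };
}
```

A FIFO queue would free the first layer's activation first, which is the one the backward pass needs last.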
||
|
|
b9e9948b43 |
feat(games): round-6 synthesis — OOM connective tissue, Prune simplification + progress bar, Sharp Shot replaces Quantization Cliff
Three redesigns shipped together after a round-6 consultation (indie
designer + Song Han + beginner-player). Each addresses a specific
complaint from playtests:
1. OOM step-bar driven by PACKING, not a wall clock
Old: stepCountdown ticked down 7s, independent of what the player
did. User feedback: 'I don't know what the step is doing. Is it
a timer?' — all three reviewers flagged the missing causal chain.
New: state.stepProgress increments on every block placed. Every
6 blocks fires step() (clears gradients). Every 3 blocks fires
backward (consumes oldest activations). The right-side bar now
*fills* as you pack, not counts down. Causation is visible: your
packing IS the training loop.
2. Pulse Prune simplification + visible sparsity goal bar
Old aha text overclaimed (name-dropped lottery-ticket and 2:4 sparsity,
neither of which the game models). Emma-the-beginner also flagged the
win condition wasn't legible — 'snip things and hope?'.
New aha text is one honest sentence: 'Magnitude is a usable proxy
for importance (Han et al. 2015). Real pruning adds a fine-tuning
step to recover accuracy — you just did the cut.' Song Han's
exact recommendation.
HUD now has a visible sparsity progress bar (green, fills toward
the 60% goal with a target tick at the end) above the accuracy
bar. The goal is a bar you fill, not a number you infer.
3. Sharp Shot replaces Quantization Cliff
Old: static precision-dial form. Reviewers called it a spreadsheet.
Beginner said she'd bounce off it. Game designer: 'barely a game
— three clicks and a deploy button.'
New: target-shooting game. Per-layer precision dials on the left
panel; target downrange. Song Han's per-layer visual mapping:
- edge layers (embedding, output) at int4 → POSITION DRIFT
(target actually moves away from where you aim, teaching the
LLM.int8 edge-layer cliff; Dettmers 2022)
- attention layers at low precision → JITTER (softmax noise)
- FFN layers at low precision → BLUR (contrast loss, tolerant)
10 shots per round, hit 7 to ship. When you miss, the game briefly
reveals the true target position in dashed red so you can see how
far your sight was misaligned — pedagogy through failure.
Controls: mouse or arrow keys to aim, space / click to fire, 1-6
keys cycle layer precision. Immediate visual feedback (no deploy
phase; the picture updates as you dial).
4. Naming
Quantization Cliff → Sharp Shot on the gallery card, the
registry, the page title, and inside the game. 'Quantization'
stays in the aha card and page prose (that's where it belongs),
but beginners who bounce on jargon won't bounce on 'Sharp Shot.'
Emma's recommendation.
Files: 3 game .js files rewritten, registry.js updated for rename,
index.qmd gallery card updated, quantization.qmd page prose rewritten
for the new mechanic. Previous Roofline Climber redesign deferred
(indie designer flagged it as three genres stacked; current Roofline
works and rewriting it risked losing more than gaining).
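The packing-driven OOM step bar in item 1 can be sketched as a counter with two thresholds. The message does not say whether `stepProgress` resets after each step or in which order the two events fire on a shared block; this sketch assumes a running counter with modulo checks and fires backward before step. `makePacker` and `barFill` are hypothetical names.

```javascript
function makePacker({ onBackward, onStep }) {
  const state = { stepProgress: 0 };

  // Called once per block placed; the player's packing drives the loop.
  function placeBlock() {
    state.stepProgress += 1;
    if (state.stepProgress % 3 === 0) onBackward(); // consume oldest activations
    if (state.stepProgress % 6 === 0) onStep();     // clear gradients
    return state.stepProgress;
  }

  // Fraction of the way to the next step(), for a bar that fills as you pack.
  const barFill = () => (state.stepProgress % 6) / 6;

  return { state, placeBlock, barFill };
}
```

Because nothing here reads a clock, an idle player sees no progress, which is the causal link the reviewers asked for.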
|
||
|
|
f38179b65a |
feat(games): iteration 3 — juice module, emoji-grid shares, deploy stagger, citation fixes
Iteration 2 surfaced four fresh critiques (viral, juice, accessibility,
academic). This commit ships the highest-leverage fixes from those four,
keeping mobile + accessibility for a future pass.
1. Shared juice module (common.js)
Adds MLSP.pop, MLSP.flash, MLSP.tickJuice, MLSP.drawJuice, plus
easeOutCubic and easeOutBack curves. Pop = expanding ring on
score events; flash = full-screen tinted wash on catastrophes.
Wired into all four games via 4 lines apiece.
Also adds MLSP.dayNumber() — days since 2026-04-22, used for the
'Day N' framing in share text (Wordle pattern).
2. Emoji-grid share artifacts in all four games
Critical viral fix from round 2 — text-only shares don't escape
ML-Twitter, but emoji grids spread.
- Pulse Prune: 4×6 grid of input → hidden weights, with pruned
(⬛) / kept-bright (🟦) / kept-dim (🟩) / critical-mistake (🟥).
- Roofline Runner: 10-cell histogram of last 10 catch outcomes
(🟦 caught-good, 🟥 above-ceiling, ⬛ missed).
- OOM: 8×4 sampled HBM map at game-over (🟦 act, 🟥 grad, 🟧 opt,
🟩 KV, ⬛ empty) — captures the visually distinctive game state.
- Quantization Cliff: 6-emoji precision ladder (🟦 fp32 / 🟩 fp16
/ 🟧 int8 / 🟥 int4) — the decision space made shareable.
3. Quantization Cliff: deploy stagger reveal
Old behavior: deploy() revealed all 6 layer accuracy drops
simultaneously. Round-2 juice review called this 'the flattest
game by an order of magnitude.'
New: reveal staggers per-layer at 150ms intervals, each with a
pop ring colored by the drop magnitude. Deploy button shows
'deploying…' and is disabled during the reveal sequence. Final
accuracy shown only after the last layer reveals, with a
full-screen flash (green if shipped, red if off-spec).
4. OOM: massively louder STEP() event
Round-2 juice review: 'fireStepEvent fires burst(... 3) per cell
— three particles per cell is criminally undersized for the
marquee event.'
New step event triggers a 360ms full-screen green flash, 8
particles per freed cell (was 3), AND a pop ring per freed cell.
Backward event gets its own blue flash + pops. Game-over
overflow gets red flash. The signature beat is now signature.
5. Aha card factual fixes (per academic review)
- Pulse Prune: Han et al. (2015) cited as primary for magnitude
pruning; LTH framed correctly as a separate further claim.
- Roofline Runner: Williams, Waterman, Patterson 2009 cited;
attention correctly described as spanning compute-/memory-bound
depending on sequence length and prefill-vs-decode phase.
- OOM: training vs inference regimes correctly distinguished —
optimizer state lives during training, KV cache during
inference; they don't typically coexist.
- Quantization Cliff: HAQ (Wang 2019) and HAWQ (Dong 2019) cited
as the actual per-layer bit-allocation methods. AWQ and GPTQ
reframed correctly as uniform-precision techniques (they
minimize accuracy cost AT a chosen precision, not allocate
bits across precisions).
6. Page prose updates
- games/quantization.qmd: prose updated to match new aha card,
distinguishing bit-allocation vs uniform-precision techniques.
- games/prune.qmd: Han et al. (2015) added as the primary
citation alongside the LTH reference.
Skipped this round (next iteration if needed):
- Mobile redesign (Pulse Prune touch radius, OOM cell size on
small screens) — accessibility review identified these but they
require layout work, not surgical edits.
- Reduced-motion media-query check inside canvas games.
- Today's-puzzle banner on the gallery page.
Files: 5 game .js files, 2 page .qmd files.
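The day counter and the two easing curves from item 1 can be sketched as below. The epoch date comes from the message; computing in UTC and treating the epoch itself as day 0 are assumptions. The easing formulas are the commonly used definitions of these curve names, not copied from common.js.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days since 2026-04-22, using UTC calendar dates (assumption).
function dayNumber(now = new Date()) {
  const epoch = Date.UTC(2026, 3, 22); // months are 0-based: 3 = April
  const today = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate()
  );
  return Math.floor((today - epoch) / MS_PER_DAY);
}

// Decelerating curve from 0 to 1.
const easeOutCubic = (t) => 1 - Math.pow(1 - t, 3);

// Overshoots slightly past 1 before settling, which gives a "pop" feel.
function easeOutBack(t) {
  const c1 = 1.70158;
  const c3 = c1 + 1;
  return 1 + c3 * Math.pow(t - 1, 3) + c1 * Math.pow(t - 1, 2);
}
```

Both curves take progress `t` in [0, 1], so a ring radius could be driven as `maxRadius * easeOutBack(elapsed / duration)`.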
|
||
|
|
f41e940741 |
feat(games): iteration 2 — fix the three critical issues from playtest reviews
Five-reviewer playtest converged on three fixes that everyone agreed
needed to land before this catalogue ships. This commit applies all
three plus the daily-seed parity fix the author flagged.
1. OOM mechanic rewrite: lifetime-driven freeing, not Tetris row-clear
Three reviewers (author, production-engineer, hardware) independently
said the row-clear mechanic teaches a falsehood — real GPU
allocators do not compact, they return blocks to a free-list.
Filling a contiguous span does not free memory.
New mechanic ties freeing to ML semantics:
- STEP! event fires every 7s: every red gradient block clears
simultaneously (mimics .step() releasing gradients), bonus
score per gradient freed.
- BACKWARD event fires every 4.5s: oldest 1-2 activation
blocks dissolve (mimics activations consumed during backward).
- Optimizer state and KV cache persist across the run.
Visual phase indicator at top of canvas shows current phase
(FORWARD / BACKWARD / STEP). Right-side countdown bar shows
time-to-next-step. The satisfying clearing moment is preserved
but now teaches the real lifetime-driven memory pattern.
2. Roofline op intensity bands
Hardware reviewer flagged: kernel y-positions were random
independent of op type, so GEMM could spawn at low intensity.
This taught the wrong intuition (kernel position is arbitrary)
when it should teach the opposite (op type determines intensity).
Each op now has a fixed intensity band: GEMM 0.65-0.85
(compute-bound), conv 0.50-0.65, attn 0.35-0.50, gelu 0.18-0.28,
layernorm 0.12-0.20, softmax 0.10-0.18, elem 0.05-0.12. Y is
still randomised within above/below ceiling.
Plus David's predictive-landing-reticle suggestion: a dotted
ghost circle at each kernel's landing point shows where to be
before the kernel arrives. Turns reaction into anticipation.
Plus a defensive fix: the hit-detection window now spans
+-16 to +-8 px around the target rather than 'crossed targetX
this frame', so dt spikes can't teleport kernels past their
hit window.
3. Quantization Cliff: deterministic accuracy
Gamer reviewer flagged: '(3 + rand() * 2)' in the accuracy
calculation made the same configuration produce different
results across deploys. Players literally could not learn
the system. That is noise hiding the lesson.
Replaced the random multiplier with a constant. Sensitivity
is still hidden and seeded once per day, but for a given
configuration the result is deterministic. Game becomes a
real puzzle.
4. Daily seed parity
Author flagged: only Pulse Prune was using a seeded PRNG.
'Daily seed' is a brand commitment; 3 of 4 games defaulting
to Math.random was inconsistent. All four games now seed a
mulberry32 PRNG from today's date.
5. Aha card factual edits (per production-engineer review)
- Prune: '5-20x' replaced with 'substantially shrink parameter
counts (combined with quantization and distillation)' since
the larger multipliers conflate techniques.
- OOM: full rewrite to describe lifetime-driven memory, with
fragmentation correctly framed as a failure mode, not a
technique.
- Roofline: added that real engineering raises a kernel's
intensity (fusion, tiling) rather than catching what falls.
- Quantization: SmoothQuant replaced with AWQ/GPTQ — those
are the actual bit-allocation search techniques.
6. Gallery taglines: verb-first, beginner-friendly
Beginner reviewer: 'lead with the verb and the human stakes,
not the concept name.' Updated all four card taglines on
/games/ from concept-name-first to action-first prose.
Files: 4 game .js, 1 gallery .qmd. Static server in _build/ already
shows the new code; just refresh.
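A minimal sketch of the daily-seed idea from item 4: derive a 32-bit seed from today's date and feed it to mulberry32. The mulberry32 body is the widely circulated public-domain version; the FNV-1a string hash and the `game:date` seed format are assumptions, since the message does not say how the date is turned into a seed.

```javascript
// mulberry32: small seeded PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// FNV-1a 32-bit string hash (assumed choice for turning text into a seed).
function hashString(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Same game + same UTC date -> same random sequence for every player.
function dailyRng(game, now = new Date()) {
  const isoDate = now.toISOString().slice(0, 10); // e.g. "2026-05-07"
  return mulberry32(hashString(`${game}:${isoDate}`));
}
```

Every draw that shapes the puzzle must come from this generator; a single stray `Math.random()` call would break parity between players.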
|
||
|
|
9766bdcd92 |
feat(games): four-game MLSys Playground (simplified, fun-first)
User direction reset: 'I don't think this game is accurately correct,
in fact. It's very confusing. I think we're making it way too
complicated. It should be something fun. Pruning is good; roof line
is good. Let's build up those other games as well.'
This commit pivots the entire playground from simulation-grade to
arcade-grade, and ships four games at once:
1. PULSE PRUNE — simplified. Removes all the fine-tune dynamics,
ceiling/staleness, patience, rewiring, daily-seed complexity.
Core loop is now a 45-second timer: click dim weights, keep
accuracy above 50%, hit 60% sparsity to win. Same ML-themed
visuals (network, inference pulse, hover tooltips) but no
simulation. 300 lines, was 650.
2. ROOFLINE RUNNER — new. Log-log chart drawn on canvas with the
classic bent roofline ceiling. Kernels (GEMM, attn, softmax,
etc.) fly in from the right at various heights. Player moves
a crosshair with mouse or arrow keys. Catch kernels BELOW the
ceiling for +1; kernels ABOVE the ceiling are -1 (unrealisable
throughput). Three lives. 30-second rounds.
3. OOM — new. Tensor Tetris. Activations (blue, wide), gradients
(red, narrow), optimizer states (orange square), KV cache
(green, tall) fall into a bounded HBM region. Move with arrow
keys; space for hard drop. Complete rows free memory
(allocator reclaim metaphor). Score = blocks placed before
overflow.
4. QUANTIZATION CLIFF — new. Six layers stacked vertically, each
with a precision dial (fp32 / fp16 / int8 / int4). Bit budget
is 96 (half of 6 × 32). Click a layer to cycle its precision.
Press deploy to reveal accuracy — but you get only THREE
deploys per run. Sensitivity is hidden and uneven: first and
last layers (embeddings, output) hate low precision; middle
layers tolerate int4 fine. Hit 85% accuracy within budget to
ship.
Infrastructure:
- site/assets/games/{prune,roofline,oom,quantization}.js: four
self-contained game modules. Each ~200-300 lines.
- site/assets/games/registry.js: all four games marked available.
404 randomizer now rotates across all four.
- site/games/{prune,roofline,oom,quantization}.qmd: standalone
pages per game.
- site/games/index.qmd: gallery updated to show all four live.
Design principle swap:
- OLD: 'Feel the constraint' / 'subtraction under scarcity' —
pedagogical thesis demanding simulation-grade depth.
- NEW: fun first, ML aesthetic as flavor, teaching as a bonus in
the aha card. Each game is an arcade mechanic with an ML theme
painted on, not a simulator pretending to be a game.
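The Quantization Cliff bit budget in item 4 is simple arithmetic: six layers at fp32 would cost 6 × 32 = 192 bits, and the budget is half of that. A sketch under the assumption that the budget is the sum of per-layer bit widths; the function names and the dial's cycle order are hypothetical.

```javascript
const BITS = { fp32: 32, fp16: 16, int8: 8, int4: 4 };
const CYCLE = ["fp32", "fp16", "int8", "int4"]; // assumed click order
const BUDGET = 96; // half of 6 × 32

// Total bits used by a configuration (one precision name per layer).
function totalBits(layers) {
  return layers.reduce((sum, precision) => sum + BITS[precision], 0);
}

function withinBudget(layers) {
  return totalBits(layers) <= BUDGET;
}

// Clicking a layer advances its dial, wrapping from int4 back to fp32.
function cyclePrecision(precision) {
  return CYCLE[(CYCLE.indexOf(precision) + 1) % CYCLE.length];
}
```

For example, keeping the first and last layers at fp32 leaves 32 bits for the four middle layers, which forces them to int8 on average.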
|
||
|
|
4e47e6258b |
feat(games): Pulse Prune v3 — fix pulse routing + ship 4-reviewer design synthesis
Two threads landed in this commit: the bugs the user reported
(pulse-routing + pacing) and the design-level fixes from a
4-reviewer synthesis (deep-mechanic, meaningful-interaction,
viral browser-game, friction-as-feature lenses).
Bug fixes (the things you saw):
- Pulse now routes through the real graph. Previously the
inference pulse was drawn as a straight line from a random
input to a random output, ignoring topology — pruning had
real scoring effects but no visible inference consequence.
Now spawnPulse picks a concrete path via pickOutgoingUnpruned:
input -> unpruned weight -> hidden -> unpruned weight -> output.
If a leg has no unpruned outgoing weight, the pulse dies at
that node with a visible dropped marker and tiny screenshake.
Cut every outgoing weight from input-2 and you literally watch
inferences from input-2 die at the source.
- Slowed pacing. One pulse at a time, 850ms per leg, 500ms gap.
Reduced fine-tune jitter (0.015 -> 0.006) so weight thicknesses
stop flickering. Slowed activation glow (0.82 -> 0.92) and
neuron pulse decay (0.88 -> 0.94).
- Activations glow only on the chosen path edges, not all weights
in the current layer. Screen is dramatically less noisy.
Design-level fixes:
- HUD stripped. On-canvas HUD is now one accuracy bar (with
ceiling tick + red dashed floor), sparsity %, and the daily
seed line. Combo counter and classification tally are gone
from the canvas surface — they live in the score model and
appear only on game-over and in the share artifact. (Two
reviewers independently flagged 'too many numbers competing.')
- Combo -> Confidence redesign. Old combo rewarded fast clicking
on dim weights (tick-gaming). New patience system rewards how
long the weight stayed dim BEFORE you cut it: +5 patience for
weights small for 5+ ticks, +0 hasty for fresh dips,
-load-bearing flag with screenshake for high-importance cuts.
- 3-second ghost demo. Game opens by playing itself: small red
ghost cursor slides toward the smallest-magnitude weight,
the weight pulses, the cursor clicks it, then a 'your turn'
banner appears. First user pointerdown takes control.
- Failure-as-content. Game-over reveals the network skeleton
of your destruction: pruned weights drawn as red dashed lines
over the surviving graph, load-bearing mistakes drawn thicker
in MIT red. Translucent overlay lets the corpse ghost through.
- Emoji-grid share artifact (Wordle pattern). Share text now
embeds a 5x8 grid of the input-hidden weight matrix:
blue square = kept high-magnitude, green = kept low-magnitude,
black = pruned cleanly, red = pruned but load-bearing
(a mistake). Visually recognisable on social — the single
highest-leverage fix from the viral-browser-games reviewer.
Plan document for the full design synthesis (the four reviewer
perspectives, the new 'subtraction under scarcity' thesis, the
revised game-4 lineup) lives at .claude/_reviews/mlsys-playground-plan-2026-04-22.md
locally. M2 (Roofline reimagined as architecture-not-reflex),
M3 (OOM with mercy moments), and M4 (Quantization Cliff,
commit-based) wait on user signoff.
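The pulse-routing fix can be sketched as a two-leg walk over an edge list. Only the `pickOutgoingUnpruned` name and the input -> hidden -> output shape come from the message; the edge representation, `routePulse`, and the injectable `rand` are assumptions.

```javascript
// Hypothetical edge list: each weight connects `from` -> `to` and may be pruned.
function pickOutgoingUnpruned(weights, node, rand = Math.random) {
  const live = weights.filter((w) => w.from === node && !w.pruned);
  if (live.length === 0) return null;
  return live[Math.floor(rand() * live.length)];
}

// Walk input -> hidden -> output. If a leg has no unpruned outgoing weight,
// the pulse dies at that node and the partial path is returned.
function routePulse(weights, input, legs = 2, rand = Math.random) {
  const path = [input];
  let node = input;
  for (let i = 0; i < legs; i++) {
    const weight = pickOutgoingUnpruned(weights, node, rand);
    if (!weight) return { path, diedAt: node };
    node = weight.to;
    path.push(node);
  }
  return { path, diedAt: null };
}
```

Returning `diedAt` gives the renderer the node at which to draw the dropped marker.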
|
||
|
|
d094a1aad4 |
feat(games): rewrite Prune as reactive Pulse Prune with fine-tune dynamics
v1 was a one-sided puzzle — the player cut weights, nothing pushed back.
Three parallel design reviews (lab-designer, Song Han as Prune's original
author, and a gamer gut-check) converged on the same fix: give the
network its own heartbeat so pruning becomes reactive instead of
contemplative.
v2 runs three simultaneous beats:
- Fine-tune tick (~2 Hz): weight magnitudes drift each tick toward
importance-weighted targets, surviving weights strengthen as
sparsity grows, the accuracy ceiling reflects current capacity with
a concave falloff, and staleness penalises sitting idle. This makes
the lottery-ticket intuition — the network rewires around your
cuts — physically felt rather than explained.
- Inference pulse (~1.1 Hz): a sample sprite flows left-to-right
through the network, activates edges along its path (which glow
green), pulses the source and target neurons, and ticks the
correct/attempted counter at the output. Misclassifications cause
a small accuracy nudge and screenshake.
- Player cuts: clicking a weight that has stayed small across
N consecutive ticks scores a combo multiplier; cutting a high-
importance weight triggers screenshake, a red particle burst, and
resets the combo.
Added retention mechanics:
- Daily seed: everyone playing today gets the same network, generated
by mulberry32 seeded from the ISO date. Daily best persists until
tomorrow's puzzle replaces it.
- Share button on the aha card copies a score summary to clipboard.
- All-time best persists separately from daily best.
Added game feel:
- Screenshake on bad cuts and misclassifications.
- Particle bursts (green for clean cuts, red for critical cuts).
- Floating score texts rise from cut points.
- Edges glow when carrying an inference pulse.
- Neurons pulse with red rings when activated by a sample.
Pedagogy change: the aha card now teaches iterative pruning and gradual
magnitude pruning (Zhu & Gupta 2017) plus lottery-ticket intuition
(Frankle & Carbin 2018), because players now feel those phenomena
rather than the simpler 'most weights are small' observation of v1.
Files:
- site/assets/games/prune.js: full rewrite, 9.5K → 24K.
- site/games/prune.qmd: updated prose, added share button wiring,
canvas upsized to 680x460 to match the book's SVG canvas default.
- site/404.qmd: HUD now surfaces combo and classification tally;
share button wired identically to the standalone page; canvas
upsized to 680x460.
|
||
|
|
748f4a8d1f |
feat(site): launch MLSys Playground with Prune as first mini-game
Introduces a new /games/ sub-section — MLSys Playground — designed around
the 'Feel the constraint' thesis: small browser games that turn real ML
systems concepts into 30-second playable loops.
This commit ships the foundation plus the first game, Prune:
- site/assets/games/common.{css,js}: shared palette and runtime used by
every game (best-score persistence via localStorage, canvas
coordinate helpers, aha-card renderer, Box-Muller gauss).
- site/assets/games/registry.js: single source of truth listing all
games. Adding game N+1 is one entry plus flipping an 'available' flag.
- site/assets/games/prune.js: the Prune game — a 5-8-3 neural network
on a canvas where you click low-magnitude weights to remove them
while keeping accuracy above 60 percent. Teaches magnitude-based
pruning and lottery-ticket intuition.
- site/games/prune.qmd: standalone page playable at /games/prune/.
- site/games/index.qmd: gallery landing with Prune live and placeholder
cards for Roofline Runner and OOM (games 2 and 3).
- site/404.qmd: rewritten to dynamically pick a random available game
from the registry and embed it. Today that picks Prune; when
Roofline and OOM land, the 404 rotates across all three.
Design comes from parallel review by lab-designer, author's vision
(Vijay Reddi), Song Han's efficiency-game lens, and Soumith Chintala's
framework-engineer perspective — all four converged on Prune as a
strong first game for fast time-to-aha and genuine mechanical novelty
(grow-by-subtracting has no close analog in arcade gaming).
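The registry-driven 404 page can be sketched as a filter plus a random pick. The `available` flag comes from the message; the entry shape, the `id` field, and `pickAvailableGame` are assumptions.

```javascript
// Hypothetical registry entries; only the `available` flag is named in the commit.
const registry = [
  { id: "prune", available: true },
  { id: "roofline", available: false },
  { id: "oom", available: false },
];

// Pick one available game at random; null when nothing is live yet.
function pickAvailableGame(games, rand = Math.random) {
  const live = games.filter((game) => game.available);
  if (live.length === 0) return null;
  return live[Math.floor(rand() * live.length)];
}
```

With only Prune flagged available, every call returns Prune; flipping the other flags widens the rotation without touching the 404 page.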
|