7 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
e1863d1a38 fix(mlperf-edu): use script-relative paths for paper figure outputs
Replace hardcoded /Users/VJ/GitHub/mlperf-edu/paper/figures/... paths
in generate_all_curves.py and generate_all_curves_v2.py with paths
derived from os.path.dirname(__file__), so the figure-generation
scripts work for any user/checkout location.
2026-04-20 14:52:49 -04:00
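The fix above can be sketched as follows. This is an illustrative reconstruction, not the repo's actual code; the helper name and directory layout are assumptions:

```python
import os

def figures_dir(script_path):
    """Return a figures directory relative to the script's own location,
    in the spirit of the fix above: paths derived from the script file,
    not a hardcoded /Users/... checkout path. Layout is illustrative."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, "paper", "figures")

# In generate_all_curves.py this would be invoked as figures_dir(__file__),
# so any user/checkout location resolves to its own paper/figures directory.
```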
Vijay Janapa Reddi
b693a0832d mlperf-edu: sync iters 7-10 (LoRA + compression + cost+DQ + distributed)
2026-04-16 18:28:49 -04:00
Vijay Janapa Reddi
d16c7585c8 mlperf-edu: sync iter-6 (LLM serving, 23 workloads, 16 measured)
2026-04-16 17:48:30 -04:00
Vijay Janapa Reddi
9aa876e2ed mlperf-edu: sync iter-5 (provenance + roofline emitter)
Iter-5 from standalone: real Merkle-style provenance manifest
(src/mlperf/manifest.py) replacing the iter-1 era str(report) self-hash,
plus a roofline-coordinate emitter (src/mlperf/roofline.py) that
populates the iter-4 taxonomy axes empirically.

Smoke test: tamper detection works (mutated 1 byte -> weights.sha256
FAIL). Roofline emitter SNR 179x (gate >= 4x).

Working group sign-off: Dean (proposer + verifier).
Branch parked; not for merge to dev. 5 of 10 iterations complete.
2026-04-16 15:27:03 -04:00
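A minimal sketch of the Merkle-style idea behind src/mlperf/manifest.py, assuming per-artifact sha256 leaves rolled up into a root digest (function names and manifest shape are hypothetical, not the repo's API):

```python
import hashlib
import json

def build_manifest(blobs):
    """Hash each artifact (leaf), then hash the sorted leaf digests
    into a single root. Any one-byte mutation in any artifact changes
    its leaf and therefore the root."""
    leaves = {name: hashlib.sha256(data).hexdigest()
              for name, data in blobs.items()}
    root = hashlib.sha256(
        json.dumps(leaves, sort_keys=True).encode()).hexdigest()
    return {"leaves": leaves, "root": root}

def verify(blobs, manifest):
    """Recompute and compare the root; False signals tampering."""
    return build_manifest(blobs)["root"] == manifest["root"]
```

This mirrors the smoke test described above: mutating one byte of the weights artifact flips its sha256 leaf, the root no longer matches, and verification fails.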
Vijay Janapa Reddi
30f80aaf1f mlperf-edu: sync iter-3 (NanoGPT prefill/decode split)
Snapshots iter-3 from the standalone repo. Adds:
  - Real KV-cache plumbing in gpt2_infer.py (CausalSelfAttention,
    GPTBlock, GPT2WhiteBox now support use_kv_cache + past_key_values).
  - NanoGPTWhiteBox unified forward signature returning either
    (logits, loss) for training or (logits, present_kvs) for inference.
    max_seq_len bumped 1024 -> 2048 per Dean's sizing math.
  - Two new workloads (nanogpt-prefill, nanogpt-decode) sharing the
    same trained checkpoint. Prefill demonstrates compute-bound
    behavior (~289 FLOP/byte at ctx=1792); decode demonstrates the
    bandwidth-bound regime (~0.5 FLOP/byte) that dominates LLM serving.
  - smoke_nanogpt_phases.py harness with intensity-ratio gate >= 5x;
    measured 578x on M-series MPS.

Working group sign-off: Dean (proposer + verifier).
Branch parked; not for merge to dev. Three iterations complete; seven
remaining per the autonomous loop plan.
2026-04-16 15:08:22 -04:00
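The prefill/decode intensity gap above can be illustrated with a back-of-envelope estimator (a sketch, not the repo's roofline code; the parameter count is arbitrary). A transformer forward does roughly 2 * n_params FLOPs per token, while the fp32 weights must be streamed once per step regardless of how many tokens share that read:

```python
def intensity(tokens_per_step, n_params, dtype_bytes=4):
    """Rough arithmetic intensity (FLOP/byte), counting only weight
    traffic: ~2*n_params FLOPs per token over n_params*dtype_bytes
    bytes read per step. Ignores activation and KV-cache traffic,
    so this is an upper bound."""
    flops = 2 * n_params * tokens_per_step
    bytes_moved = n_params * dtype_bytes
    return flops / bytes_moved

# Prefill amortizes one weight read over the whole context;
# decode amortizes it over a single token.
prefill = intensity(tokens_per_step=1792, n_params=10_000_000)
decode = intensity(tokens_per_step=1, n_params=10_000_000)  # 0.5 FLOP/byte
```

The decode bound of 2/dtype_bytes = 0.5 FLOP/byte matches the ~0.5 quoted above; the measured prefill figure (~289 FLOP/byte) sits below this weight-only upper bound, plausibly because real runs also move activations and KV-cache bytes.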
Vijay Janapa Reddi
efaa075ba8 mlperf-edu: sync iter-1 and iter-2 from standalone repo
Snapshots the autonomous-iteration work happening in the standalone
/Users/VJ/GitHub/mlperf-edu/ repo. Two iterations folded in:

  iter-1: code-defect cleanup (Patterson + Dean sign-off)
    - Remove dead simulated_loss + load_real_wikitext_data from
      nanogpt_train.py; align NanoGPTWhiteBox vocab to char-level
      (50,257 -> 128, dropping 19.3M unused embedding params).
    - Fix two broken examples.{edge,mobile} imports in inference paths.
    - Reconcile README benchmark table with workloads.yaml (was wrong
      on 7 of 16 workloads).

  iter-2: DLRM DRAM-resident variant (Emer sign-off)
    - New MicroDLRMDRAM with 2M-row hash-mapped virtual EmbeddingBag,
      sized so per-batch byte transfer (8 MB at B=8192, m_spa=256)
      exceeds PyTorch's ~50 us dispatch floor and exhibits the
      bandwidth-bound regime production DLRM lives in.
    - Smoke test asserts pure-lookup gap >= 3x; current host shows
      4.29x end-to-end and 3.49x lookup-only.

Branch is parked; not for merge to dev. Iteration log lives in the
standalone repo under .iteration_log/ (gitignored locally).
2026-04-16 14:59:42 -04:00
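The iter-2 sizing math above checks out with a one-liner (illustrative function, assuming one fp32 row of width m_spa gathered per sample):

```python
def embedding_bytes_per_batch(batch_size, m_spa, dtype_bytes=4):
    """Bytes pulled from the embedding table per batch, under the
    assumption of one m_spa-wide fp32 row per sample."""
    return batch_size * m_spa * dtype_bytes

mb = embedding_bytes_per_batch(8192, 256) / 2**20  # 8.0, the 8 MB figure above
```

At 8 MB per batch the transfer time dwarfs PyTorch's ~50 us kernel-dispatch floor, which is what lets the variant expose the bandwidth-bound regime rather than dispatch overhead.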
Vijay Janapa Reddi
a9878ad6bd feat: import mlperf-edu pedagogical benchmark suite
Snapshot of the standalone /Users/VJ/GitHub/mlperf-edu/ repo as of
2026-04-16, brought into MLSysBook as a parked feature branch for
backup and iteration. Not for merge to dev.

Contents (88 files, ~2.3 MB):
- 16 reference workloads (cloud / edge / tiny / agent divisions)
- LoadGen proxy harness + SUT plugin protocol
- Compliance checker, autograder, hardware fingerprint
- Paper draft (paper.tex) with TikZ/SVG figure sources
- Three lab examples + practitioner workflow configs
- Workload + dataset YAML registries (single source of truth)

Excluded (per mlperf-edu/.gitignore + size constraints):
- Datasets (6.6 GB), checkpoints (260 MB), gpt2 weights (523 MB)
- Generated PDFs, .venv, build artifacts
2026-04-16 14:15:05 -04:00