Mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-05-25 09:42:03 -05:00)
Align the MLSys·im code, docs, paper, website, workflows, and lab wheel for the 0.1.1 release. This also fixes runtime/API issues found during release review and prepares the paper PDF plus archive package.
---
title: "MLSys·im Paper"
---
MLSys·im is a first-principles analytical modeling framework for reasoning about ML infrastructure before provisioning hardware. It separates workload demand from hardware supply, infrastructure context, and fleet topology, then uses typed solvers to identify the constraints that bind performance, cost, carbon, and reliability.

```{mermaid}
%%{init: {'theme': 'neutral'}}%%
%%| fig-cap: "The MLSys·im 5-layer stack. Workloads, hardware, infrastructure, and systems provide typed inputs; resolvers produce performance, cost, carbon, and reliability outputs."
flowchart TB
    A["<b>Layer A: Workloads</b><br/>TransformerWorkload, CNNWorkload, SSMWorkload<br/><i>Parameters, FLOPs, arithmetic intensity</i>"]
    B["<b>Layer B: Hardware</b><br/>HardwareNode, ComputeCore, MemoryHierarchy<br/><i>Peak FLOP/s, bandwidth, capacity, TDP</i>"]
    C["<b>Layer C: Infrastructure</b><br/>GridProfile, Datacenter<br/><i>Carbon intensity, PUE, WUE</i>"]
    D["<b>Layer D: Systems</b><br/>Node, Fleet, NetworkFabric<br/><i>Topology, accelerators/node, fabric bandwidth</i>"]
    E["<b>Layer E: Resolvers</b><br/>SingleNode · Distributed · Serving<br/>Economics · Sustainability · Reliability"]
    F["<b>Results</b><br/>PerformanceProfile · SystemEvaluation"]

    A --> E
    B --> D
    C --> D
    D --> E
    E --> F
```

For the full architecture, validation anchors, and limitations, read the paper: [open `mlsysim-paper.pdf`](mlsysim-paper.pdf).
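To make "identify the constraints that bind" concrete, the following is a minimal roofline-style sketch of keeping workload demand separate from hardware supply. It is not the MLSys·im API: the `Workload` and `Hardware` classes, the `binding_constraint` function, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Demand side (hypothetical stand-in for a Layer A workload)."""
    flops: float        # total floating-point operations
    bytes_moved: float  # total bytes read from or written to memory


@dataclass(frozen=True)
class Hardware:
    """Supply side (hypothetical stand-in for a Layer B hardware node)."""
    peak_flops: float     # peak FLOP/s
    mem_bandwidth: float  # peak bytes/s


def binding_constraint(w: Workload, hw: Hardware) -> tuple[str, float]:
    """Return the slower resource and its lower-bound time in seconds."""
    compute_time = w.flops / hw.peak_flops
    memory_time = w.bytes_moved / hw.mem_bandwidth
    if compute_time >= memory_time:
        return "compute", compute_time
    return "memory", memory_time


# Illustrative numbers only, not taken from the paper.
workload = Workload(flops=2e12, bytes_moved=4e10)
accelerator = Hardware(peak_flops=1e15, mem_bandwidth=2e12)
name, seconds = binding_constraint(workload, accelerator)
print(name, seconds)
```

Because demand and supply are separate objects, the same workload can be checked against different hardware descriptions without changing either one. The resolvers described in the paper cover more dimensions (cost, carbon, reliability) than this two-resource comparison.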