Commit Graph

2 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
a78f1bd8b0 feat(mlsysim): add documentation site, typed registries, and 6-solver core
Complete MLSYSIM v0.1.0 implementation with:

- Documentation website (Quarto): landing page with animated hero
  and capability carousel, 4 tutorials (hello world, LLM serving,
  distributed training, sustainability), hardware/model/fleet/infra
  catalogs, solver guide, whitepaper, math foundations, glossary,
  and full quartodoc API reference
- Typed registry system: Hardware (18 devices across 5 tiers),
  Models (15 workloads), Systems (fleets, clusters, fabrics),
  Infrastructure (grid profiles, rack configs, datacenters)
- Core types: Pint-backed Quantity, Metadata provenance tracking,
  custom exception hierarchy (OOMError, SLAViolation)
- SimulationConfig with YAML/JSON loading and pre-validation
- Scenario system tying workloads to systems with SLA constraints
- Multi-level evaluation scorecard (feasibility, performance, macro)
- Examples, tests, and Jetson Orin NX spec fix (100 → 25 TFLOP/s)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 15:59:51 -05:00
Vijay Janapa Reddi
f213260153 feat(mlsysim): align analytical solvers with industry-standard literature
Updated solvers to use literature-grade models for:
- Roofline Performance (Williams et al. 2009)
- Transformer Scaling (6PD rule, Kaplan et al. 2020)
- Training Memory (Shoeybi et al. 2019)
- Pipeline Parallelism (Huang et al. 2019)
- LLM Serving (Pope et al. 2023)
- Reliability (Young-Daly checkpoint interval; Young 1974, Daly 2006)

Introduced Hierarchical Communication Modeling and MFU/HFU metrics.
Fixed test suite imports and return key mismatches.
Updated Smart Doorbell scorecard reference in ml_systems.qmd.
Restored core __init__.py exports for backward compatibility.
2026-03-07 15:02:26 -05:00
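
Three of the models named in commit f213260153 have well-known closed forms, sketched below for orientation. This is a minimal illustration of the textbook formulas only: the function names, parameters, and example numbers are hypothetical and are not taken from the mlsysim codebase, whose actual solver signatures are not shown in this log.

```python
import math


def roofline_attainable_flops(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Roofline bound (Williams et al. 2009).

    peak_flops: peak compute throughput in FLOP/s
    mem_bandwidth: memory bandwidth in bytes/s
    arithmetic_intensity: FLOPs performed per byte moved
    Returns the attainable throughput in FLOP/s.
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def training_flops_6pd(num_params, num_tokens):
    """6PD rule: roughly 6 FLOPs per parameter per training token."""
    return 6.0 * num_params * num_tokens


def young_daly_interval(checkpoint_cost_s, mtbf_s):
    """First-order Young-Daly optimal checkpoint interval, in seconds.

    checkpoint_cost_s: time to write one checkpoint
    mtbf_s: mean time between failures of the whole system
    """
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
    # At 50 FLOP/byte the kernel sits below the ridge point (100 FLOP/byte),
    # so it is memory-bound.
    print(roofline_attainable_flops(100e12, 1e12, 50))

    # Hypothetical run: 1B parameters trained on 20B tokens.
    print(training_flops_6pd(1e9, 20e9))

    # Hypothetical cluster: 60 s checkpoint cost, 24 h MTBF.
    print(young_daly_interval(60, 86400))
```

With these illustrative inputs, the roofline bound is 50 TFLOP/s, the training cost is 1.2e20 FLOPs, and the checkpoint interval is about 3220 seconds.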