cs249r_book/mlsysim/docs/api/index.qmd
Vijay Janapa Reddi 2bbe3e1a69 docs(mlsysim): redesign website, add 12 tutorials, and CLI entry points
Replace 9 old tutorials with 12 new numbered tutorials (00-11) covering
roofline through full-stack audit. Redesign landing page, add
models-and-solvers and extending-the-engine guides. Add __main__.py,
cli.py, and cli/ package for command-line interface.
2026-03-12 16:04:51 -04:00


# API Reference {.doc .doc-index}

## Core API

Primary objects and solvers.

| | |
| --- | --- |
| [hardware](hardware.qmd#mlsysim.hardware) | |
| [models](models.qmd#mlsysim.models) | |
| [infra](infra.qmd#mlsysim.infra) | |
| [systems](systems.qmd#mlsysim.systems) | |
| [core](core.qmd#mlsysim.core) | |
| [core.solver.SingleNodeModel](core.solver.SingleNodeModel.qmd#mlsysim.core.solver.SingleNodeModel) | Resolves single-node hardware Roofline bounds and feasibility. |
| [core.solver.ServingModel](core.solver.ServingModel.qmd#mlsysim.core.solver.ServingModel) | Analyzes the two-phase LLM serving lifecycle: prefill and decode. |
| [core.solver.DistributedModel](core.solver.DistributedModel.qmd#mlsysim.core.solver.DistributedModel) | Resolves fleet-wide communication, synchronization, and pipelining constraints. |
| [core.solver.DataModel](core.solver.DataModel.qmd#mlsysim.core.solver.DataModel) | Analyzes the 'Data Wall': the throughput bottleneck between storage and compute. |
| [core.solver.ScalingModel](core.solver.ScalingModel.qmd#mlsysim.core.solver.ScalingModel) | Analyzes the 'Scaling Physics' of model training (Chinchilla scaling laws). |
| [core.solver.OrchestrationModel](core.solver.OrchestrationModel.qmd#mlsysim.core.solver.OrchestrationModel) | Analyzes Cluster Orchestration and Queueing (Little's Law). |
| [core.solver.CompressionModel](core.solver.CompressionModel.qmd#mlsysim.core.solver.CompressionModel) | Analyzes model compression trade-offs (Accuracy vs. Efficiency). |
| [core.solver.SustainabilityModel](core.solver.SustainabilityModel.qmd#mlsysim.core.solver.SustainabilityModel) | Calculates Datacenter-scale Sustainability metrics. |
| [core.solver.EconomicsModel](core.solver.EconomicsModel.qmd#mlsysim.core.solver.EconomicsModel) | Calculates Total Cost of Ownership (TCO) including Capex and Opex. |
| [core.solver.ContinuousBatchingModel](core.solver.ContinuousBatchingModel.qmd#mlsysim.core.solver.ContinuousBatchingModel) | Analyzes production LLM serving with Continuous Batching and PagedAttention. |
| [core.solver.WeightStreamingModel](core.solver.WeightStreamingModel.qmd#mlsysim.core.solver.WeightStreamingModel) | Analyzes Wafer-Scale inference (e.g., Cerebras CS-3) using Weight Streaming. |
| [core.solver.TailLatencyModel](core.solver.TailLatencyModel.qmd#mlsysim.core.solver.TailLatencyModel) | Analyzes queueing delays and P99 tail latency for deployed inference (M/M/c). |
| [core.solver.ReliabilityModel](core.solver.ReliabilityModel.qmd#mlsysim.core.solver.ReliabilityModel) | Calculates Mean Time Between Failures (MTBF) and optimal checkpointing intervals. |
| [core.solver.CheckpointModel](core.solver.CheckpointModel.qmd#mlsysim.core.solver.CheckpointModel) | Analyzes checkpoint I/O burst penalties and MFU impact. |
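
Several entries above name textbook relations (Roofline bounds, Little's Law, checkpoint intervals). The sketch below shows what those relations compute, in plain Python. It does not call `mlsysim`: every function name and parameter here is hypothetical, and the checkpoint formula is Young's first-order approximation, chosen as an assumption since this index does not say which formula `ReliabilityModel` uses.

```python
import math


def roofline_time_s(flops, bytes_moved, peak_flops_per_s, mem_bw_bytes_per_s):
    """Roofline lower bound on runtime: the slower of compute and memory traffic."""
    compute_s = flops / peak_flops_per_s
    memory_s = bytes_moved / mem_bw_bytes_per_s
    return max(compute_s, memory_s)


def is_memory_bound(flops, bytes_moved, peak_flops_per_s, mem_bw_bytes_per_s):
    """A kernel is memory-bound when its arithmetic intensity is below the ridge point."""
    intensity = flops / bytes_moved               # FLOPs per byte moved
    ridge = peak_flops_per_s / mem_bw_bytes_per_s  # FLOPs per byte at the roofline knee
    return intensity < ridge


def littles_law_in_flight(arrival_rate_per_s, mean_latency_s):
    """Little's Law: mean items in the system = arrival rate x mean time in system."""
    return arrival_rate_per_s * mean_latency_s


def young_checkpoint_interval_s(checkpoint_cost_s, mtbf_s):
    """Young's approximation (an assumption here): T_opt ~ sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
    peak, bw = 1e14, 1e12
    print(roofline_time_s(1e12, 1e9, peak, bw))
    print(is_memory_bound(1e9, 1e9, peak, bw))
    print(littles_law_in_flight(100.0, 0.5))
    print(young_checkpoint_interval_s(50.0, 10_000.0))
```

The solver pages linked in the table remain the authority on the actual inputs, units, and formulas each model uses.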