cs249r_book/mlsysim/tutorial/prerequisites.md
Vijay Janapa Reddi 1eb30f5f86 fix(mlsysim): harden release QA and paper artifacts
Align the MLSys·im code, docs, paper, website, workflows, and lab wheel for the 0.1.1 release. This also fixes runtime/API issues found during release review and prepares the paper PDF plus archive package.
2026-04-25 10:06:01 -04:00

MLSys·im Tutorial: Prerequisites and Setup

Everything you need to run the hands-on exercises.


System Requirements

| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.10 | 3.12+ |
| OS | macOS, Linux, Windows | Any |
| RAM | 2 GB | 4 GB |
| Disk | 50 MB | 100 MB |
| GPU | Not required | Not required |
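
To confirm the Python requirement before installing, a short check like the following may help. It is a minimal sketch: `MIN_PYTHON` and `meets_minimum` are hypothetical names used here for illustration and are not part of mlsysim.

```python
import sys

# Minimum interpreter version, taken from the requirements table above.
MIN_PYTHON = (3, 10)


def meets_minimum(version, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version tuple satisfies the minimum."""
    return tuple(version[:2]) >= tuple(minimum)


current = sys.version_info
status = "OK" if meets_minimum(current) else "too old for the tutorial"
print(f"Python {current.major}.{current.minor}: {status}")
```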

mlsysim is a first-principles infrastructure modeling tool: it models hardware performance analytically, without executing any GPU kernels. All exercises run on a laptop CPU in seconds.
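
To give a feel for what "analytical" means here, the sketch below estimates latency with a simple roofline-style bound: the larger of compute time and memory-transfer time. It is an illustration only and not mlsysim's actual implementation. The function name `roofline_latency` and all numeric values are assumptions chosen for the example.

```python
def roofline_latency(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Estimate latency as the slower of compute time and memory-transfer time.

    Returns (latency_in_seconds, bottleneck_label).
    """
    compute_s = flops / peak_flops_per_s
    memory_s = bytes_moved / bandwidth_bytes_per_s
    if compute_s >= memory_s:
        return compute_s, "Compute"
    return memory_s, "Memory"


# Hypothetical workload and accelerator (illustrative round numbers only).
latency_s, bottleneck = roofline_latency(
    flops=8e9,                   # operations per inference
    bytes_moved=100e6,           # bytes read and written per inference
    peak_flops_per_s=300e12,     # peak arithmetic throughput
    bandwidth_bytes_per_s=2e12,  # memory bandwidth
)
print(f"Estimated latency: {latency_s * 1e3:.3f} ms, bottleneck={bottleneck}")
```

No kernel is launched anywhere in this calculation, which is why estimates of this kind need no GPU.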


Installation

Option A: Install from PyPI

pip install mlsysim

Option B: Install from source (for tutorial development)

git clone https://github.com/harvard-edge/cs249r_book.git
cd cs249r_book/mlsysim
pip install -e ".[full]"

The [full] extra includes visualization libraries (plotly, matplotlib) and optimization solvers (scipy, ortools) used in some exercises.

Option C: Minimal install (exercises only)

pip install mlsysim

Core dependencies installed automatically:

  • pint (physical units)
  • pydantic (data validation)
  • numpy (numerical computation)
  • typer and rich (CLI interface)
  • pyyaml (configuration files)
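
If you want to confirm that these dependencies actually landed in your environment, one possible check uses the standard-library `importlib.metadata` module. This is a sketch: `CORE_DEPENDENCIES` and `installed_versions` are hypothetical helper names, and the list simply mirrors the bullets above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the dependency list above.
CORE_DEPENDENCIES = ["pint", "pydantic", "numpy", "typer", "rich", "pyyaml"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(CORE_DEPENDENCIES).items():
    print(f"{name:10s} {ver or 'NOT INSTALLED'}")
```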

Verification

Run this command to verify your installation:

python3 -c "
import mlsysim
print(f'mlsysim v{mlsysim.__version__}')

# Quick smoke test: ResNet-50 on A100
from mlsysim import Engine, Hardware, Models
p = Engine.solve(Models.ResNet50, Hardware.A100, batch_size=1)
print(f'ResNet-50 on A100: {p.latency:~P.2f}, bottleneck={p.bottleneck}')

# Verify registries
print(f'Hardware: {len(Hardware.list())} accelerators')
print(f'Models:   {len(Models.list())} workloads')
print('All checks passed.')
"

Expected output (approximate):

mlsysim v0.1.1
ResNet-50 on A100: 0.XX ms, bottleneck=Memory
Hardware: XX accelerators
Models:   XX workloads
All checks passed.

If the import succeeds and the smoke test prints a latency value, you are ready for the tutorial.


Troubleshooting FAQ

1. ModuleNotFoundError: No module named 'mlsysim'

Cause: mlsysim is not installed in your active Python environment.

Fix:

# Check which Python you are using
which python3
python3 --version

# Install in the active environment (python3 -m pip targets the same interpreter as python3)
python3 -m pip install mlsysim

# If using conda:
conda activate your_env
pip install mlsysim
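
The same diagnosis can be done from inside Python, which avoids any mismatch between the `pip` and `python3` on your PATH. The sketch below uses only the standard library. `locate_module` is a hypothetical helper name.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file the import system would load `name` from, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


print(f"Interpreter: {sys.executable}")
location = locate_module("mlsysim")
if location is None:
    print("mlsysim is not importable from this interpreter.")
else:
    print(f"mlsysim would be loaded from: {location}")
```

If the interpreter path is not the one you expected, install mlsysim with that interpreter or switch to the intended environment.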

2. ModuleNotFoundError: No module named 'pint'

Cause: Core dependency not installed (rare with pip, common with manual source installs).

Fix:

# Quote each requirement so the shell does not treat ">" as output redirection
pip install "pint>=0.23" "pydantic>=2.0" "numpy>=1.24"

3. ImportError: cannot import name 'Hardware' from 'mlsysim'

Cause: You have an outdated version of mlsysim or a name collision with another package.

Fix:

# Check version
python3 -c "import mlsysim; print(mlsysim.__version__)"

# Upgrade to latest
pip install --upgrade mlsysim

# If name collision, check for local files named mlsysim.py
ls mlsysim.py 2>/dev/null && echo "Remove or rename this file!"
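
A cross-platform variant of the collision check can be sketched in Python. It looks for a stray `mlsysim.py` file or an `mlsysim` directory in a given folder, since either may be picked up before or instead of the installed package, depending on your setup. `find_shadowing` is a hypothetical helper name, and the demo runs in a temporary directory so nothing in your project is modified.

```python
import tempfile
from pathlib import Path


def find_shadowing(directory, package="mlsysim"):
    """List paths in `directory` that could shadow the installed package."""
    directory = Path(directory)
    candidates = [directory / f"{package}.py", directory / package]
    return [path for path in candidates if path.exists()]


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    clean = find_shadowing(tmp)
    (Path(tmp) / "mlsysim.py").write_text("# stray file\n")
    shadowed = [path.name for path in find_shadowing(tmp)]

print(clean)     # nothing found in the empty directory
print(shadowed)  # the stray file is reported

# On your own machine, check the directory you launch Python from.
print(find_shadowing(Path.cwd()))
```

Note that running this from the root of a source checkout will report the `mlsysim` directory itself. That is expected and only a problem if the package is not installed from it.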

4. pint.DimensionalityError or unexpected unit errors

Cause: Mixing raw numbers with Pint Quantities. mlsysim uses physical units throughout: all inputs and outputs carry dimensions.

Fix:

# Wrong: passing a raw number where a Quantity is expected
bandwidth = 100  # missing units!

# Right: use the unit registry
from mlsysim import ureg
bandwidth = 100 * ureg.GB / ureg.s

General rule: If a function expects bandwidth, memory, or time, pass a Pint Quantity with explicit units. The error message will tell you which units are expected.


Optional: Visualization Setup

Some exercises include optional visualization. Install the viz extra to enable plotting (the quotes keep shells such as zsh from interpreting the brackets):

pip install "mlsysim[viz]"

This adds plotly and matplotlib for plot_roofline() and plot_evaluation_scorecard().


Tutorial Files

After setup, the exercise file is at:

mlsysim/tutorial/exercises.md

Each exercise is self-contained. You can copy code blocks into a Python REPL, Jupyter notebook, or any IDE. No special notebook infrastructure is required.