---
title: "Agentic Workflows & MCP"
subtitle: "Using MLSys·im as the physics backend for LLM Agents."
---

The ultimate vision for `mlsysim` is not just to educate humans, but to serve as the **ground-truth physics engine for autonomous AI systems**.

Large Language Models (like Claude 3.5 Sonnet, GPT-4o, or Gemini Pro) are excellent at writing code and structuring YAML, but they frequently hallucinate complex math. If you ask an LLM to calculate the Inter-Token Latency of a 70B model on 8x H100s with PagedAttention, it will confidently guess wrong.

By wrapping `mlsysim` in the **Model Context Protocol (MCP)**, you give your agents the ability to dynamically design hardware clusters, run them through a dimensionally strict physics engine, and interpret the precise bottlenecks to iteratively improve the design.

---
## 1. Using MLSys·im with Claude Desktop (MCP)

We provide a production-ready MCP server that exposes the `mlsysim` engine to Claude Desktop.

### Setup

1. Ensure you have installed `mlsysim` and the `mcp` Python package:

   ```bash
   pip install mlsysim mcp
   ```

2. Open your Claude Desktop configuration file:

   - **macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`
   - **Windows:** `%APPDATA%\Claude\claude_desktop_config.json`

3. Add the `mlsysim` server:

   ```json
   {
     "mcpServers": {
       "mlsysim": {
         "command": "python3",
         "args": ["-m", "mlsysim.examples.mcp_server"]
       }
     }
   }
   ```

4. Restart Claude Desktop. You will now see a hammer icon 🛠️ indicating the tools are available.

### What to ask Claude

You can now ask Claude questions that require deep hardware simulation:

> *"I need to serve Llama-3 70B. Can you use your mlsysim tool to find out if it fits on a single H100? If it doesn't, design a cluster that does, and tell me the annual TCO."*

Claude will automatically generate the required YAML schema, call the `evaluate_cluster_yaml` tool, see the Out-of-Memory (OOM) failure, correct its design to use 2 nodes, and return the final mathematical truth to you.

---

## 2. The Agentic "Predict-Compute-Reflect" Loop

If you are building your own multi-agent system (using LangChain, AutoGen, or raw Gemini APIs), `mlsysim`'s schema architecture is built specifically for you.

* **The Input:** Export our schema using `mlsysim schema --type plan`. Feed this JSON schema directly into your LLM's system prompt or tool definition. The LLM instantly knows how to structure the request.
* **The Execution:** Call `mlsysim eval your_file.yaml --output json` (or use the Python API).
* **The Feedback:** Because `mlsysim` outputs a strictly-typed, flat JSON dictionary, your agent can easily parse the results. If `f_status == "FAIL"`, the agent reads the `f_summary` (e.g., "OOM: Requires 140 GB but only has 80 GB") and adjusts its design autonomously.

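The loop above can be sketched in a few lines of Python. The `mlsysim eval` invocation and the `f_status`/`f_summary` fields are the ones documented here; the `propose_revision` callback is a hypothetical stand-in for whatever LLM call rewrites your YAML plan.

```python
import json
import subprocess


def evaluate_plan(yaml_path):
    """Compute step: score a YAML plan and parse the flat JSON scorecard."""
    result = subprocess.run(
        ["mlsysim", "eval", yaml_path, "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def reflect(report):
    """Reflect step: return the failure summary the LLM should react to,
    or None when the plan passed and the loop can stop."""
    if report.get("f_status") == "FAIL":
        return report.get("f_summary", "unspecified failure")
    return None


def design_loop(propose_revision, yaml_path, max_iters=5):
    """Predict-Compute-Reflect driver. `propose_revision(path, feedback)`
    is your LLM call; it rewrites the YAML file in place."""
    report = evaluate_plan(yaml_path)
    for _ in range(max_iters):
        feedback = reflect(report)
        if feedback is None:
            break                                  # plan passed
        propose_revision(yaml_path, feedback)      # Predict: LLM revises plan
        report = evaluate_plan(yaml_path)          # Compute: re-score it
    return report
```

Capping the loop at `max_iters` keeps a stubborn agent from burning tokens on a plan it cannot fix.
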
We have included a conceptual Python implementation of this loop in our repository at [`mlsysim/examples/gemini_design_loop.py`](https://github.com/harvard-edge/cs249r_book/blob/dev/mlsysim/examples/gemini_design_loop.py).

---

## 3. Exposed MCP Tools

When running as an MCP server, `mlsysim` exposes the following tools to the connected agent:

| Tool | Description |
|:-----|:-----------|
| `get_schemas` | Return the current JSON schema for valid MLSys·im YAML plans |
| `evaluate_cluster_yaml` | Evaluate a YAML cluster specification through the full 3-lens scorecard (Feasibility, Performance, Macro) |

The agent can call these tools programmatically. The YAML schema can be exported with:

```bash
mlsysim schema --type plan
```

Feed this schema into your agent's system prompt or tool definition so it knows how to structure valid requests.
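One way to wire this up, sketched here assuming an OpenAI-style function-calling envelope (adapt the shape to your framework): export the schema once, then embed it as the tool's parameter schema.

```python
import json
import subprocess


def export_plan_schema():
    """Run the documented CLI command and parse the JSON schema it prints."""
    out = subprocess.run(
        ["mlsysim", "schema", "--type", "plan"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def as_tool_definition(schema):
    """Embed the plan schema in a function-calling tool definition.
    The envelope below follows the common OpenAI-style layout; LangChain,
    AutoGen, and Gemini each use slightly different shapes."""
    return {
        "type": "function",
        "function": {
            "name": "evaluate_cluster_yaml",
            "description": "Score a cluster plan with the mlsysim engine.",
            "parameters": schema,
        },
    }
```
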
---
## 4. Troubleshooting

**Claude doesn't show the hammer icon:**
: Make sure you restarted Claude Desktop after editing the config. Check that `python3 -m mlsysim.examples.mcp_server` (the same command the config launches) runs without errors in your terminal.

**Agent gets OOM errors:**
: This is expected behavior — it means the model doesn't fit on the specified hardware. The agent should read the error message and adjust (e.g., add nodes, reduce precision, or pick larger hardware).

**Agent hallucinates hardware specs:**
: Remind the agent to call `get_schemas` and use registry names from the schema/docs rather than inventing specs. The `llms.txt` file at the root of the docs site contains agent-specific guidance.

---

## 5. Why This Matters

The "academic simulator graveyard" is filled with tools that were too hard for humans to compile and too unstructured for machines to use.

By defining `mlsysim` through **strict Pydantic schemas** and standardizing the **22 ML Systems Walls**, we have created an intermediate representation (IR) that both humans and AI agents can understand. In the near future, you will not manually calculate whether a new model architecture is viable; you will ask your Agentic Architect to run 10,000 simulations against the `mlsysim` physics engine while you sleep.