# Contributing to MLSysBook

Thanks for your interest in MLSysBook! This repository is the home for the
**ML Systems textbook** plus a family of sibling projects — TinyTorch, Co-Labs,
Hardware Kits, MLSys·im, MLPerf EDU, and StaffML. Most contributions land in
exactly one of those projects, so this top-level guide just gets you to the
right place.

> [!IMPORTANT]
> Please read our [Code of Conduct](CODE_OF_CONDUCT.md) before contributing.
> Security issues should follow [SECURITY.md](SECURITY.md), not the public
> issue tracker.

## Pick your project

| If you want to... | Project | Read this guide |
|---|---|---|
| Fix a typo, improve a chapter, add a figure | **Textbook** | [`book/docs/CONTRIBUTING.md`](book/docs/CONTRIBUTING.md) |
| Add or fix a TinyTorch module / test / milestone | **TinyTorch** | [`tinytorch/CONTRIBUTING.md`](tinytorch/CONTRIBUTING.md) |
| Improve a hardware lab or board recipe | **Hardware Kits** | [`kits/README.md`](kits/README.md) |
| Add or fix an interactive Co-Lab | **Labs** | [`labs/README.md`](labs/README.md) |
| Contribute an MLSys·im model, scenario, or scorecard | **MLSys·im** | [`mlsysim/docs/contributing.qmd`](mlsysim/docs/contributing.qmd) |
| Add a workload to the MLPerf EDU benchmark suite | **MLPerf EDU** | [`mlperf-edu/README.md`](mlperf-edu/README.md) |
| Author or fix a StaffML interview question | **StaffML** | [`interviews/CONTRIBUTING.md`](interviews/CONTRIBUTING.md) |
| Improve teaching materials, syllabi, or rubrics | **Instructors** | [`instructors/README.md`](instructors/README.md) |
| Update slides for a chapter | **Slides** | [`slides/README.md`](slides/README.md) |
| Change the unified landing site, newsletter wiring, or games | **Site** | [`site/README.md`](site/README.md) |

Not sure which one applies? Open a
[Discussion](https://github.com/harvard-edge/cs249r_book/discussions) and we'll
help route it.

## Common gotchas first-time contributors hit

These are the things that aren't obvious from reading any single sub-project's
README. Each links to the canonical doc rather than restating it.

* **TinyTorch uses the `tito` CLI for everything.** Module status, tests,
  exports, and environment health all flow through `tito` (`tito --version`,
  `tito system health`, `tito module status`, `tito module test NN`). See
  [`tinytorch/CONTRIBUTING.md`](tinytorch/CONTRIBUTING.md) for the full
  command list. If `tito` isn't on your PATH after
  `pip install -e tinytorch/`, re-activate your venv.
* **TinyTorch source edits need an export step.** When you change a file
  under `tinytorch/src/`, the in-package version under `tinytorch/tinytorch/`
  is regenerated by `tito src export` (see
  [`tinytorch/CONTRIBUTING.md`](tinytorch/CONTRIBUTING.md), "Module
  Development"). `tinytorch/tinytorch/*` is gitignored — the source of
  truth is `src/`.
* **Co-Labs run in the browser via Pyodide / WebAssembly.** Imports must be
  Pyodide-compatible (no compiled-only packages without a wheel) and every
  Marimo cell that produces a UI element must `return` it so the dataflow
  routes it onward — that's release invariant #4 in
  [`labs/PROTOCOL.md`](labs/PROTOCOL.md). The lab test suite enforces both.
* **Don't commit large binaries.** Distribution PDFs, EPUBs, podcast MP3s,
  and JS bundles balloon `.git` (see issues
  [#1393](https://github.com/harvard-edge/cs249r_book/issues/1393) and
  [#1175](https://github.com/harvard-edge/cs249r_book/issues/1175)). The
  root `.gitattributes` is set up so future EPUB / PDF / MP3 / MP4 / WAV /
  WASM additions land in Git LFS automatically. Generated artifacts
  (`bundle.js`, `corpus.json`, search indexes) are gitignored; regenerate
  them locally rather than committing.
* **Where each area lives** — the table above is the authoritative map.
  At a glance: `book/` for the textbook, `tinytorch/` for the framework,
  `labs/` for browser labs, `kits/` for hardware recipes, `mlsysim/` for
  the simulator, `instructors/` for teaching materials, `slides/` for
  per-chapter decks, `interviews/` for StaffML, `site/` for the unified
  landing and newsletter.
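
The TinyTorch export gotcha above boils down to an edit, export, test loop. The sketch below chains only the `tito` subcommands named in this guide. The wrapper name `tinytorch_check` is hypothetical, and whether `tito src export` needs further arguments is not covered here, so treat [`tinytorch/CONTRIBUTING.md`](tinytorch/CONTRIBUTING.md) as the reference.

```bash
# Hypothetical wrapper (not part of tito): re-export the package from
# tinytorch/src/ and then run one module's tests.
# $1 is the module number, i.e. the NN in `tito module test NN`.
tinytorch_check() {
  if [ -z "$1" ]; then
    echo "usage: tinytorch_check NN" >&2
    return 2
  fi
  # Only run the tests if the export succeeded.
  tito src export && tito module test "$1"
}
```

Chaining with `&&` means a failed export stops the run before tests execute against a stale `tinytorch/tinytorch/` copy.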

## Universal policies (apply to every project)

These conventions hold across the whole monorepo. The per-project guides
specialize them.

### 1. Branch from `dev`, not `main`

`main` tracks the published live site. All work merges to `dev` first and ships
to `main` on release.

```bash
git checkout dev
git pull origin dev
git checkout -b iss123-short-descriptive-slug
```

Reference the issue number in the branch name when one exists
(`iss42-fix-figure-caption`); otherwise use a descriptive name such as
`feat/tinytorch-conv-module`.
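
If you create branches often, the issue-number pattern is easy to script. This is a minimal sketch and not a repository tool: `branch_name` is a hypothetical helper that turns an issue number and a free-form title into an `issNN-slug` name.

```bash
# Hypothetical helper: build an `issNN-slug` branch name.
# $1 is the issue number; the remaining arguments form the title.
branch_name() {
  if [ "$#" -lt 2 ]; then
    echo "usage: branch_name ISSUE TITLE..." >&2
    return 2
  fi
  issue=$1
  shift
  # Lowercase the title, collapse every run of non-alphanumeric
  # characters into one hyphen, then trim hyphens at both ends.
  slug=$(printf '%s' "$*" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//')
  printf 'iss%s-%s\n' "$issue" "$slug"
}
```

From an up-to-date `dev` checkout, `git checkout -b "$(branch_name 42 "Fix figure caption")"` would create `iss42-fix-figure-caption`.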

### 2. Set up pre-commit hooks (one time per clone)

This repo runs ~60 pre-commit checks (BibTeX validation, figure-div syntax,
markdown link checks, EPUB hygiene, vault schema drift, and more) defined in
`.pre-commit-config.yaml`. They catch issues that would otherwise burn
maintainer review cycles. Install them once per fresh clone:

```bash
pip install pre-commit
pre-commit install
```

This is enough to contribute to **any** sub-project. By default,
`pre-commit install` activates **only** the hooks in the root
`.pre-commit-config.yaml`. TinyTorch additionally ships
`tinytorch/.pre-commit-config.yaml` (markdown collapse, CLI doc checks); run it
when you need those checks:
`cd tinytorch && pre-commit run --config .pre-commit-config.yaml --all-files`.
Some projects also have their own setup step that installs project-specific
tooling (and may wire up pre-commit for you as a convenience):

| Project | Project-specific setup |
|---|---|
| Textbook | `./book/binder setup` (also installs Quarto / Java / epubcheck checks) |
| TinyTorch | `pip install -r tinytorch/requirements.txt && pip install -e tinytorch/` |
| StaffML / vault-cli | `pip install -e "interviews/vault-cli/[dev]"` |
| MLSys·im | `pip install -e "mlsysim/[dev]"` |
| MLPerf EDU | `pip install -e "mlperf-edu/[dev]"` |
| StaffML site | `cd interviews/staffml && npm install` |

See each sub-project's CONTRIBUTING / README for the full development loop.
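
Once installed, the hooks run automatically on `git commit`. To run them earlier, `pre-commit run --files` checks only the paths you name. The sketch below wraps that call; the function name `check_paths` is hypothetical and exists only for illustration.

```bash
# Hypothetical helper: run the installed hook set on specific paths
# without making a commit.
check_paths() {
  if [ "$#" -eq 0 ]; then
    echo "usage: check_paths PATH..." >&2
    return 2
  fi
  pre-commit run --files "$@"
}

# To check the whole repository instead (slower):
#   pre-commit run --all-files
```

For example, `check_paths book/quarto/contents/vol1/introduction/introduction.qmd` runs the hooks against that one chapter file.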

### 3. Stage files explicitly

Do **not** use `git add .` — it's easy to commit unrelated edits, secrets, or
build artifacts. Stage paths individually:

```bash
git add book/quarto/contents/vol1/introduction/introduction.qmd
git commit -m "Fix caption formatting in introduction (issue #14)"
```
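
Before committing, it can help to look at exactly what is staged. `git diff --cached --name-only` lists staged paths, and the sketch below filters that list for the binary types this guide routes to Git LFS. Both function names are hypothetical, and the extension list only mirrors the prose in this guide; the root `.gitattributes` remains authoritative.

```bash
# Hypothetical helper: succeeds when a path ends in one of the
# extensions this guide lists as LFS-tracked (case-insensitive).
needs_lfs() {
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.epub|*.pdf|*.mp3|*.mp4|*.wav|*.wasm) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the staged paths that look like LFS-type binaries.
# Prints nothing when no such file is staged.
staged_lfs_candidates() {
  git diff --cached --name-only | while IFS= read -r f; do
    if needs_lfs "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

For any path it prints, `git check-attr filter -- <path>` shows what Git will actually do; a path covered by LFS reports `filter: lfs`.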

### 4. Open a Pull Request to `dev`

* Reference the issue number (`Fixes #123` or `Related to #456`).
* Mark drafts with `[WIP]` in the title or use GitHub's "Draft PR" mode.
* Use the [PR template](.github/PULL_REQUEST_TEMPLATE.md) — it asks the
  questions reviewers will ask anyway.
* CI will render the affected project (book / tinytorch / staffml / etc.); fix
  any failures before requesting review.
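
If you prefer the command line to the web UI, the steps above can be sketched with the GitHub CLI. This assumes `gh` is installed and authenticated, which this guide does not require. The function name `open_draft_pr` is hypothetical.

```bash
# Hypothetical helper: push the current branch and open a draft PR
# against `dev`. Assumes the GitHub CLI (`gh`) is available.
open_draft_pr() {
  branch=$(git rev-parse --abbrev-ref HEAD) || return 1
  case "$branch" in
    dev|main)
      echo "refusing to open a PR from '$branch'; create a topic branch first" >&2
      return 1 ;;
  esac
  # With no --title/--body, gh prompts for them interactively.
  git push -u origin "$branch" &&
    gh pr create --base dev --draft
}
```

Passing `--base dev` explicitly keeps the PR from targeting whatever the repository's default branch happens to be.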

### 5. Code of Conduct

By contributing you agree to abide by the
[Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). Report concerns to
`vj@eecs.harvard.edu` or `nkhoshnevis@g.harvard.edu`.

### 6. License of contributions

By submitting a PR you agree to license your contribution under the project's
[license](LICENSE.md): Creative Commons Attribution-NonCommercial-ShareAlike
4.0 International for content, with code components dual-licensed under their
project-local terms (see each sub-project's `LICENSE` for specifics).

## Reporting bugs and asking questions

* **Found a real bug or specific issue?** Open an
  [issue](https://github.com/harvard-edge/cs249r_book/issues) using the
  template that fits (we have eight: book, TinyTorch bug, MLSys·im bug, new
  challenge, interview question, StaffML report/contribute, and more).
* **General question or design discussion?**
  [Discussions](https://github.com/harvard-edge/cs249r_book/discussions) is the
  better fit.
* **Security issue?** See [SECURITY.md](SECURITY.md) — please do **not** open a
  public issue.

## Contributor recognition

We use the [All Contributors](https://allcontributors.org) bot. After your PR
merges, a maintainer (or you, on your own PR) can comment:

```text
@all-contributors please add @your-username for doc, code, ideas
```

You'll be added to the project's recognition table in the README. See
[`book/docs/CONTRIBUTING.md`](book/docs/CONTRIBUTING.md#contribution-types) for
the full list of contribution types.

---

Thanks for helping make MLSysBook better. The community runs on people who
take the time to fix one typo, file one good bug, or write one careful PR.