cs249r_book/docs/VERSIONING.md
Vijay Janapa Reddi 789509080d docs: contributor-facing versioning guide
Documents the unified versioning convention adopted across all
publishable artifacts: how to publish (operator), how to verify a
deployed release (consumer), what each tier means, where the contract
lives. Written last so it reflects what actually shipped, not what
was planned.

Pairs with shared/release/README.md (operator-internal contract) —
this file is for first-time contributors trying to understand "why
do I get five inputs when I run the publish workflow?".
2026-04-28 18:22:25 -04:00


Versioning

How releases work across the MLSysBook monorepo. One pattern for all publishable artifacts; one source of truth per release; one URL the operator visits to ship a release.

Why this exists

Before this convention landed, every project had its own version story:

  • StaffML built every publish as --release-id publish-live (a constant string), and the manifest drifted to 0.1.2-dev while releases/0.1.0/release.json said 0.1.0. Citations couldn't be trusted.
  • TinyTorch hand-edited the version in 6 files per release; one missed edit could silently downgrade the source.
  • Book had per-volume tags but no cross-coordination — Vol I and Vol II could disagree about which release shipped together.
  • MLSYSIM had a PyPI version, a paper, and a docs site that all claimed different things at any given time.
  • Kits/Labs/Instructors had no versions at all — just commit SHAs.

The unified convention fixes the operational problem ("did I ship the right thing? what's deployed right now?") and the citation problem ("v0.1.0 means exactly these bytes, forever").

What every release produces

Each <project>-publish-live workflow run produces:

  1. A git tag: <project>-v<release_id>, e.g. staffml-v0.1.1.
  2. A release-manifest.json at the deploy URL — e.g. https://mlsysbook.ai/staffml/release-manifest.json. Cacheable, readable by anyone, parseable by tools.
  3. A draft GitHub Release — for the human-facing changelog with auto-generated commit list (Tier A also runs AI-enhanced notes).
  4. A footer pill on the live site — small inline "v0.1.1 · Apr 28, 2026" element. Click to copy hash. Best-effort chrome.

Tier A projects additionally produce:

  1. A Merkle-style release_hash: a SHA-256 over input bytes, backed by a per-file index and recorded in the manifest (Tier B records a flat SHA-256 instead). This is the citation anchor: a paper referencing "MLSysBook v0.1.1 (hash 6883e85)" now points at exactly one set of bytes.
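A minimal sketch of what a Merkle-style directory hash could look like. The leaf and root construction here (per-file SHA-256, then a SHA-256 over sorted "path<TAB>hash" lines) is an assumption for illustration; the function names are hypothetical and this is not necessarily what scripts/version/release.py does.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of one file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def merkle_style_hash(root: Path) -> tuple[str, list[dict]]:
    """Return (release_hash, per-file index) for every file under root.

    Assumed construction: each leaf is a per-file SHA-256; the top-level
    digest is a SHA-256 over the sorted "path<TAB>hash" lines.
    """
    files = [
        {"path": p.relative_to(root).as_posix(), "hash": file_sha256(p)}
        for p in root.rglob("*")
        if p.is_file()
    ]
    # Sort by relative path so the result does not depend on walk order.
    files.sort(key=lambda entry: entry["path"])

    top = hashlib.sha256()
    for entry in files:
        top.update(f"{entry['path']}\t{entry['hash']}\n".encode("utf-8"))
    return top.hexdigest(), files
```

Because the index is sorted before hashing, two builds over the same bytes yield the same 64-character digest, and changing any single file changes it.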

How to publish (operator)

  1. Go to the project's "Publish (Live)" workflow in GitHub Actions.
  2. Click "Run workflow" and fill in:
    • release_type: patch (small fixes), minor (new content), major (breaking changes). Default patch.
    • description: One-line summary. Becomes the release-notes title.
    • site_only: Check this for CSS/copy-only redeploys that should NOT bump the release_id (citation integrity requires that a given version map to fixed bytes, so re-tagging existing releases is forbidden).
    • explicit_version: For non-incremental jumps (e.g. 0.1.x → 0.10.0 alongside a coordinated launch). Leave blank to auto-bump.
    • confirm: Type PUBLISH. The workflow refuses to proceed if this isn't exact — stops accidental clicks.
  3. Wait for the workflow to complete. The draft GitHub Release will appear at https://github.com/<repo>/releases. Review the auto-generated notes, then publish.
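The way these inputs could combine into a new release_id can be sketched as follows. This is an illustrative reading of the inputs described above, not the actual logic in _release-prepare.yml; the function names and the handling of edge cases are assumptions.

```python
import re

# Bare semver: MAJOR.MINOR.PATCH with no prefix.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def next_release_id(prior: str, release_type: str = "patch",
                    explicit_version: str = "", site_only: bool = False,
                    confirm: str = "") -> str:
    """Compute the release_id for a publish run (hypothetical helper)."""
    if confirm != "PUBLISH":
        raise ValueError("confirm must be exactly 'PUBLISH'")
    match = SEMVER.match(prior)
    if not match:
        raise ValueError(f"prior release_id is not bare semver: {prior!r}")
    if site_only:
        # Site-only redeploys keep the existing release_id.
        return prior
    if explicit_version:
        if not SEMVER.match(explicit_version):
            raise ValueError(f"explicit_version is not bare semver: {explicit_version!r}")
        return explicit_version

    major, minor, patch = (int(part) for part in match.groups())
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown release_type: {release_type!r}")


def tag_name(project: str, release_id: str) -> str:
    """Tag format from this guide: <project>-v<release_id>."""
    return f"{project}-v{release_id}"
```

For example, `tag_name("staffml", next_release_id("0.1.0", confirm="PUBLISH"))` gives `staffml-v0.1.1`.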

How to verify a deployed version

# What release is live right now?
curl -s https://mlsysbook.ai/staffml/release-manifest.json | jq .

# What's in release 0.1.1?
gh release view staffml-v0.1.1
jq . releases/staffml-0.1.1/release.json

# Does the deployed manifest match what's tagged?
curl -s https://mlsysbook.ai/staffml/release-manifest.json | jq -r .releaseHash
git show staffml-v0.1.1:releases/staffml-0.1.1/release.json | jq -r .release_hash

If those two hashes differ, something between tag and deploy went wrong — file an issue.
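For scripted checks, the same comparison can be sketched in Python. The key names (releaseHash in the deployed manifest, release_hash in the tagged release.json) are taken from the commands above; the helper name is hypothetical, and fetching the two documents is left to the caller.

```python
import json


def hashes_match(manifest_text: str, release_text: str) -> bool:
    """Compare the deployed manifest's hash with the tagged release.json's hash.

    Both arguments are raw JSON text, e.g. the output of the curl and
    git show commands. A missing hash on either side counts as a mismatch.
    """
    deployed = json.loads(manifest_text).get("releaseHash")
    tagged = json.loads(release_text).get("release_hash")
    return deployed is not None and deployed == tagged
```

Treating a missing key as a mismatch is a deliberate choice in this sketch, so that a malformed manifest fails the check instead of passing silently.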

Who lives in which tier

| Project | Tier | Why |
| --- | --- | --- |
| StaffML | A | Citable question bank; authors will reference vX.Y.Z in papers |
| TinyTorch | A | Educational framework; cited in syllabi and papers |
| Book Vol I | A | Textbook, multiple per-volume tag tracks (vol1-v*) |
| Book Vol II | A | Textbook, separate tag track (vol2-v*) |
| MLSYSIM | A | Site identity binds to PyPI + paper, all citable |
| Kits | B | Hardware deployment labs, iterate fast, not formally cited |
| Labs | B | Marimo notebooks, evolve constantly |
| Instructors | B | Instructor guide, lower citation stakes |

Tier A and Tier B share the workflow UX. They differ in:

  • Hash detail: Tier A includes a per-file files: [{path, hash}] index in release.json (Merkle-ish: lets a consumer verify a single question without downloading the whole corpus). Tier B uses a flat SHA-256.
  • Release notes: Tier A runs AI-enhanced summarization. Tier B uses plain auto-generated commit lists.
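The per-file index is what makes single-file verification possible. A minimal sketch, assuming each hash in files is the hex SHA-256 of that file's raw bytes (the helper name and the example path are hypothetical):

```python
import hashlib


def verify_file(release: dict, path: str, content: bytes) -> bool:
    """Check one file's bytes against the files index of a parsed release.json.

    Returns False if the path is not listed or the digest differs.
    """
    index = {entry["path"]: entry["hash"] for entry in release.get("files", [])}
    expected = index.get(path)
    return expected is not None and hashlib.sha256(content).hexdigest() == expected


# Usage with a hand-built index; the path is illustrative only.
release = {
    "files": [
        {"path": "questions/q001.json",
         "hash": hashlib.sha256(b'{"q": 1}').hexdigest()},
    ]
}
print(verify_file(release, "questions/q001.json", b'{"q": 1}'))
```

A consumer only needs release.json and the one file in question; a Tier B release, with its flat hash, would require re-hashing the whole input set.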

Architecture in one paragraph

scripts/version/release.py is the canonical implementation of every versioning operation: hash a directory, compute next release_id, emit a release.json, emit a build-time manifest. The reusable workflow .github/workflows/_release-prepare.yml validates confirm, computes the new release_id from prior tag + bump, and outputs values the caller workflow uses to drive its own build. Each project's <project>-publish-live.yml calls _release-prepare.yml first, runs its existing build with the computed release_id, emits a manifest into the build output (so it deploys at the canonical URL), and then tags + creates a release. The shared/release/release-pill.html fragment fetches the manifest at runtime and renders the footer pill; each project's Quarto config sets a <meta name="release-manifest"> tag so the snippet finds the right URL.

Contract reference

| Field | Source | Notes |
| --- | --- | --- |
| releaseId | _release-prepare.yml output | Bare semver, no prefix |
| releaseHash | release.py compute-hash | 64 hex chars; Merkle-ish (Tier A) or flat SHA-256 (Tier B) |
| schemaVersion | Manifest emitter caller | Project-internal; rarely changes |
| tier | Manifest emitter caller | A or B |
| project | Manifest emitter caller | Short identifier |
| buildDate | Manifest emitter | UTC ISO 8601; set at emit time |

The full schema is in scripts/version/schema.json.

Backfill

Pre-cutover releases (everything tagged before this convention landed) do NOT have a releases/<id>/release.json artifact. They keep their existing tags and behave normally; only releases cut after this convention landed get the full treatment. Backfilling historical releases is a separate, optional follow-up task.

Out of scope (today)

  • Periodic-Table — no publish workflow exists; we'd be adding versioning to a project that doesn't deploy. Establish publishing first.
  • Reusable build orchestration — the reusable workflow only does prepare. Each project keeps its own build/test/deploy because those steps are project-specific (Quarto vs Next.js vs Marimo). Trying to generalize the build itself produced too-thin or too-rigid abstractions in earlier drafts.
  • PyPI version unification — MLSYSIM's PyPI version comes from pyproject.toml and remains the canonical source for the package. The site's release identity is a separate (compatible) track. The paper's identity rides on the site release.
  • Cross-project release coordination — there's no "stamp all of MLSysBook with the same release_id" path. Each project bumps independently. The book's per-volume coordination is the only intentional exception.

Files at a glance

docs/VERSIONING.md                              ← this file
scripts/version/release.py                      ← Python helpers + CLI
scripts/version/schema.json                     ← JSON Schema for release.json
shared/release/README.md                        ← contract documentation
shared/release/release-pill.html                ← footer snippet
shared/release/release-card.html                ← about-page snippet
.github/workflows/_release-prepare.yml          ← reusable workflow

# Per-project: each <project>-publish-live.yml calls _release-prepare,
# emits a manifest before deploy, tags + drafts a release after.
.github/workflows/staffml-publish-live.yml
.github/workflows/tinytorch-publish-live.yml
.github/workflows/book-publish-live.yml
.github/workflows/mlsysim-publish-live.yml
.github/workflows/kits-publish-live.yml
.github/workflows/labs-publish-live.yml
.github/workflows/instructors-publish-live.yml

# Per-project Quarto/Next config: meta tag + pill include
interviews/staffml/src/components/Footer.tsx    ← StaffML uses build-time bake
tinytorch/quarto/_quarto.yml
book/quarto/config/_quarto-html-vol1.yml
book/quarto/config/_quarto-html-vol2.yml
mlsysim/docs/config/_quarto-html.yml
kits/config/_quarto-html.yml
labs/config/_quarto-html.yml
instructors/_quarto.yml