Vijay Janapa Reddi b8183404b8 chore(release): shared versioning infrastructure
Lays the foundation for unified release versioning across MLSysBook
publishable artifacts. Pure additions: no existing builds, configs,
or sources are touched.

scripts/version/release.py
  Python CLI with helpers:
  - compute-id: semver bump from previous tag (patch/minor/major/none/explicit)
  - compute-hash: deterministic SHA-256 over input directories with per-file index
  - emit-release: writes releases/<project>-<id>/release.json (canonical artifact)
  - emit-manifest: writes the build-time manifest that the deployable bundles
  Tier A (citable) emits per-file Merkle index; Tier B (lite) is flat.

scripts/version/schema.json
  JSON Schema for release.json. Validates project/tier/release_id/release_hash
  + Tier A's files[] index. Used by validators in CI.
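Based on the fields named above, a Tier A release.json that passes this schema plausibly looks like the following. Only project/tier/release_id/release_hash and the files[] index come from the description; the concrete values and any per-file field names are assumptions.

```json
{
  "project": "mlsysbook",
  "tier": "A",
  "release_id": "0.1.0",
  "release_hash": "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
  "files": [
    { "path": "index.html", "hash": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08" }
  ]
}
```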

shared/release/release-pill.html
  Footer snippet — fetches deployable manifest at runtime, renders
  "v0.1.0 · Apr 26, 2026" pill. Configured per-project via
  <meta name="release-manifest"> tag. Silent on any fetch failure.

shared/release/release-card.html
  About-page snippet — fuller release-identity card with
  click-to-copy hash. Same fetch + meta-tag conventions.

shared/release/README.md
  Operator-facing contract documentation.

.github/workflows/_release-prepare.yml
  Reusable workflow_call. Validates confirm == "PUBLISH", computes
  new_release_id from previous tag + bump (delegates to release.py
  for canonical math). Outputs new_release_id/new_tag/previous_*
  for caller's downstream build and finalize steps. Refuses to
  re-tag existing releases (citation integrity).

Caller workflows still own their build commands and tag/release
creation; this only standardizes the input shape and version math.
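A caller wiring into the reusable workflow might look roughly like this. The confirm == "PUBLISH" gate and the new_release_id output come from the description above; the input names (confirm, bump) and job layout are assumptions about the standardized input shape, not a verified contract.

```yaml
jobs:
  prepare:
    uses: ./.github/workflows/_release-prepare.yml
    with:
      confirm: ${{ inputs.confirm }}   # must equal "PUBLISH" to proceed
      bump: patch                      # patch/minor/major/none/explicit

  build:
    needs: prepare
    runs-on: ubuntu-latest
    steps:
      # The caller still owns its build commands and tag/release creation.
      - run: echo "Building ${{ needs.prepare.outputs.new_release_id }}"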
2026-04-28 18:06:07 -04:00

Figure Audit Automation

This directory contains figure_audit.py, a script that automates the visual auditing of figures in the ML Systems textbook.

What it does

The script orchestrates a multimodal audit of every figure across Volume 1 and Volume 2 of the textbook. It ensures that the prose, the captions (fig-cap), and the alt-text (fig-alt) precisely match the content of the fully rendered visual images.

  1. Discovery: It scans the book/quarto/contents/ directory to identify all .qmd chapters containing figures.
  2. Visual Extraction: It resolves the corresponding published HTML URL for each chapter, parses the HTML, and downloads the exact rendered <img src="..."> and inline <svg> visual assets locally.
  3. Auditing: It dispatches parallel worker tasks via the gemini CLI. The CLI is given explicit instructions to load the local images visually, compare them directly against the .qmd source text, and evaluate them based on the figure-audit-brief.md rubric.
  4. Reporting: It generates strict, granular YAML output files in .claude/_reviews/Figure Audit/, detailing any misalignments (e.g., the text claims 10^4 but the chart shows 10^3) along with surgically precise .qmd fix recommendations.
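Step 2's visual extraction can be illustrated with a stdlib-only sketch: parse a chapter's rendered HTML and collect the <img src="..."> URLs to download, resolving relative paths against the published page URL. The real figure_audit.py presumably handles more (inline <svg>, retries, local caching), and these helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImgCollector(HTMLParser):
    """Collects resolved <img src> URLs from a rendered chapter page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative srcs against the published chapter URL.
                self.sources.append(urljoin(self.base_url, src))

def extract_image_urls(html: str, base_url: str) -> list:
    parser = ImgCollector(base_url)
    parser.feed(html)
    return parser.sources
```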

How to use it

Run the script from the repository root:

python3 scripts/figure_audit.py

Prerequisites

  • You must have the gemini CLI installed and authenticated on your local machine.
  • The script assumes the rendered HTML book is available at https://harvard-edge.github.io/cs249r_book_dev/... (used purely to scrape the final image variants).

Applying the fixes

Once figure_audit.py finishes running, your .claude/_reviews/Figure Audit/ directory will be populated with .yml files containing proposed_fix entries.

These fixes are written as precise, minimal adjustments targeting the .qmd source files. They can either be applied manually by a human reviewing the YAML reports, or parsed programmatically/agentically to apply the diffs across the workspace.
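One way to apply such a fix programmatically, assuming each report entry carries exact find/replace strings for its target .qmd (the "find"/"replace" field names are hypothetical; check the actual YAML structure before relying on them):

```python
def apply_fix(qmd_text: str, fix: dict) -> str:
    """Apply one proposed_fix record as an exact, minimal substitution.

    `fix` is assumed to look like {"find": "...", "replace": "..."};
    the real report fields may differ.
    """
    find, replace = fix["find"], fix["replace"]
    if qmd_text.count(find) != 1:
        # Refuse ambiguous or stale fixes rather than guessing.
        raise ValueError(f"expected exactly one match for {find!r}")
    return qmd_text.replace(find, replace, 1)
```

Requiring a unique match keeps the change surgical: if the source drifted since the audit ran, the fix fails loudly instead of editing the wrong spot.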