cs249r_book/.pre-commit-history.md
Vijay Janapa Reddi f53e2ed8a4 refactor(pre-commit): one hook per binder check group
Consolidate 80 hooks → 45 by routing every book-* check through
`./book/binder check <group>`. Hook names now mirror the binder command
tree (book-check-refs, book-check-prose, book-check-footnotes, ...).
Each hook runs the curated default scopes for its group; opt-in scopes
remain reachable on the CLI via `--scope <name>` or `--all-scopes`.

The 31-hook reduction in book validators (52 → 21) is fragmentation removal, not
coverage removal. Where N hooks used to call `binder check <group>
--scope <one>` each (one process spawn per scope), one hook now calls
`binder check <group>` and the scope curation lives in the binder's
GROUPS dict (per-Scope `default` flag, see prior commit).

Naming changes:
  - `book-validate-*` / `book-mlsys-*` / `mitpress-*` prefixes retired.
    A rule's provenance (MIT Press style policy, MLSys-specific) belongs
    in a comment / .claude/rules/book-prose.md, not in a command name.
  - `book-format-*` (binder format ...) and `book-check-*`
    (binder check ...) are the only book-* prefixes.

Two hooks intentionally retain explicit flags:
  - `book-check-headers --vol1` (vol2 sections in early development)
  - `book-check-labels --vol1` (vol2 has many forward refs not yet authored)
  Both flag-sets are documented inline.

Retired-hook history moved to .pre-commit-history.md so the live config
is not a graveyard. .pre-commit-history.md captures the 2026-04-26 retirements
(vault-corpus-guard, staffml-validate-vault), the 2026-05-03
figure-placement deferral, and the markdownlint / yamllint evaluations.

Hook count summary:
  before  after  section
  -----   -----  ------------------------------------------
     12      8   Generic third-party (pre-commit-hooks + codespell)
      3      3   Bibliography
      2      2   Repo guards
      5      5   Book formatters
     52     21   Book validators (one per binder check group)
      2      2   Manual stage
      1      1   CI hygiene
      3      3   Vault / StaffML
     ===     ===
     80     45
2026-05-06 08:44:47 -04:00

Pre-commit Hook History

A short log of hooks that have lived in .pre-commit-config.yaml and been retired. Each entry: when, why, and where the equivalent coverage now lives (if anywhere). Keep this file when you remove a hook so the institutional context is not lost.

The active configuration lives in .pre-commit-config.yaml. To see the current curated set of binder check <group> scopes, run ./book/binder check <group> help for any group.

2026-05-06 — pre-commit restructure

Collapsed 80 individual hooks into 45 by routing every book-* check through ./book/binder check <group>. Each binder group now has one pre-commit hook; the curated subset of scopes is encoded in book/cli/commands/validate.py (Scope(default=True)). Heavier or opt-in scopes remain reachable via --scope <name> or --all-scopes on the CLI.
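The default-versus-opt-in selection described above can be sketched as follows. This is a minimal illustration, not the code in book/cli/commands/validate.py: the `select_scopes` helper and the `captions` scope name are assumptions made for the example; only the per-scope `default` flag, the GROUPS table, and the `flow` scope of the figures group come from this log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    name: str
    default: bool = False  # True: runs when neither --scope nor --all-scopes is given


# Hypothetical group table; "captions" is an illustrative scope name only.
GROUPS = {
    "figures": [
        Scope("captions", default=True),
        Scope("flow", default=False),  # opt-in, like the deferred placement check
    ],
}


def select_scopes(group, scope=None, all_scopes=False):
    """Return the scope names a single `binder check <group>` call would run."""
    scopes = GROUPS[group]
    if all_scopes:
        return [s.name for s in scopes]
    if scope is not None:
        matches = [s.name for s in scopes if s.name == scope]
        if not matches:
            raise ValueError(f"unknown scope {scope!r} for group {group!r}")
        return matches
    # Pre-commit path: only the curated defaults run.
    return [s.name for s in scopes if s.default]
```

Under these assumptions, the pre-commit hook corresponds to `select_scopes("figures")`, while `select_scopes("figures", scope="flow")` mirrors a manual `--scope flow` run.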

Migrated four scripts into binder so every book-* check has one entry point:

| Script (was) | New scope |
| --- | --- |
| book/tools/audit/index/check_anti_patterns.py | index/anti-patterns |
| book/tools/audit/index/check_tag_placement.py | index/tag-placement |
| book/tools/audit/index/check_xref_resolves.py | index/xref-resolves |
| scripts/check_lego_vars.py | code/lego-dead-code |

The standalone scripts remain runnable on the command line for ad-hoc use.

Hooks consolidated under their binder group (one example per group; the collapse is the same shape elsewhere — see .pre-commit-config.yaml for the current list and git log for the full mapping):

| Old hook ids (consolidated) | New hook |
| --- | --- |
| book-validate-citations, book-check-references, book-check-duplicate-citation-year, ... | book-check-refs |
| book-check-contractions, book-check-duplicate-words, mitpress-acknowledgements, ... | book-check-prose |
| mitpress-spaced-emdash, mitpress-spaced-slash, mitpress-vs-period, mitpress-eg-ie-comma, mitpress-hyphen-range | book-check-punctuation |
| book-check-percent-spacing, book-check-unit-spacing, book-check-binary-units, mitpress-percent-in-captions | book-check-numbers |

The mitpress-* prefix is retired — provenance of a rule belongs in a comment / .claude/rules/book-prose.md, not in a command name.

2026-04-26 — vault-corpus-guard retired

Was: pre-commit hook that prevented tracked changes to interviews/vault/corpus.json and interviews/staffml/src/data/corpus.json.

Retired alongside the files themselves. Both are now build artifacts (gitignored, regenerated by vault build --local-json). Production reads from the Cloudflare D1 worker plus the bundled corpus-summary.json. With no tracked corpus.json there is nothing for the guard to protect.

2026-04-26 — staffml-validate-vault retired from pre-commit

Was: a full integrity pass run by interviews/vault-cli/scripts/validate-vault.py.

Retired from pre-commit because a full integrity pass requires vault build --local-json first (corpus.json is now a build artifact, see above). That belongs in StaffML/Vault CI (staffml-validate-dev, staffml-validate-vault, preview/publish), not in every pre-commit on unrelated paths. Sparse mode (taxonomy + manifest only) still runs in CI when corpus.json is absent.
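The full-versus-sparse split can be pictured as a check on whether the build artifact exists. The sketch below is an assumption about the shape of that decision, not the logic in validate-vault.py; the `validation_mode` function name and its return values are hypothetical.

```python
from pathlib import Path


def validation_mode(vault_dir):
    """Pick 'full' when corpus.json has been built, otherwise 'sparse'.

    Hypothetical helper: 'sparse' stands for the taxonomy + manifest checks
    that need no corpus.json; 'full' also needs the built corpus.
    """
    corpus = Path(vault_dir) / "corpus.json"
    return "full" if corpus.is_file() else "sparse"
```

On a fresh checkout, where corpus.json is gitignored and not yet built, this returns "sparse"; after `vault build --local-json` it would return "full".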

2026-05-03 — book-check-figure-placement deferred

Was: ./book/binder check figures --scope flow — a "near first reference" heuristic blocking figures placed too far from their first cite.

Deferred to the copyeditor's PDF layout pass. The "near first reference" heuristic conflicts with the repositioning the copyeditor does to balance pages in print. Re-enable when we own figure placement again. The scope still exists in book/cli/commands/validate.py with default=False and can be run manually via ./book/binder check figures --scope flow.
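For readers unfamiliar with the deferred check, a "near first reference" heuristic might look roughly like this. It is a sketch under stated assumptions, not the binder's implementation: it assumes Quarto-style labels (`{#fig-...}`) and references (`@fig-...`), measures distance in source lines, and uses a hypothetical `max_gap` threshold of 30.

```python
import re

FIG_DEF = re.compile(r"\{#(fig-[\w-]+)")  # figure label, e.g. {#fig-overview}
FIG_REF = re.compile(r"@(fig-[\w-]+)")    # cross-reference, e.g. @fig-overview


def far_figures(lines, max_gap=30):
    """Return (label, gap) for figures placed more than max_gap lines
    away from their first reference. The threshold is illustrative."""
    first_ref = {}
    defined_at = {}
    for lineno, line in enumerate(lines):
        for label in FIG_REF.findall(line):
            first_ref.setdefault(label, lineno)
        for label in FIG_DEF.findall(line):
            defined_at.setdefault(label, lineno)
    flagged = []
    for label, def_line in defined_at.items():
        ref_line = first_ref.get(label)
        if ref_line is not None and abs(def_line - ref_line) > max_gap:
            flagged.append((label, abs(def_line - ref_line)))
    return flagged
```

A line-count rule of this kind is what conflicts with print layout: a figure moved several pages to balance a spread would be flagged even though the placement is deliberate.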

Disabled / opt-in third-party hooks

The following third-party hooks have been considered and skipped. Listed here so we don't re-evaluate cold every quarter:

  • markdownlint-cli (igorshubovych/markdownlint-cli v0.45.0): Considered for ^book/quarto/contents/.*\.qmd$. Replaced by mdformat (formatter) plus the binder markup and prose groups (linters). Adding markdownlint on top would surface the same issues twice.

  • yamllint: Considered for ^book/.*\.(yml|yaml)$. Generic pre-commit check-yaml covers parse errors; yamllint adds style rules we don't enforce. Re-evaluate if the book picks up substantial YAML config beyond Quarto's.
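To see which paths the two file patterns above would select, they can be tried directly in Python, since pre-commit treats `files` patterns as Python regular expressions applied with `re.search`. The example paths below are hypothetical and chosen only to exercise the patterns.

```python
import re

QMD_FILES = re.compile(r"^book/quarto/contents/.*\.qmd$")
YAML_FILES = re.compile(r"^book/.*\.(yml|yaml)$")


def matching(pattern, paths):
    """Return the paths a hook with this `files` pattern would receive."""
    return [p for p in paths if pattern.search(p)]


# Hypothetical repository paths, for illustration only.
paths = [
    "book/quarto/contents/vol1/intro.qmd",
    "book/quarto/_quarto.yml",
    "book/config/extra.yaml",
    ".pre-commit-config.yaml",
    "README.md",
]

qmd_hits = matching(QMD_FILES, paths)
yaml_hits = matching(YAML_FILES, paths)
```

Note that `.pre-commit-config.yaml` sits outside `book/`, so the YAML pattern does not select it; only the generic check-yaml hook would cover that file.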