- Add book/tools/dependencies/required_r_packages.R for a single source of truth
- install_packages.R: fail if any package not loadable after install; source shared list
- verify: match R_LIBS_USER and requireNamespace only (no re-source of full install/tinytex)
- Dockerfile: COPY and cleanup /tmp/required_r_packages.R with install/verify scripts
- mkdir R_LIBS_USER before Rscript; set .libPaths + cloud.r-project.org in install_packages.R
- Pre-flight step fails within seconds if cmake or pkg-config is missing (avoids a 30–60 min compile followed by an fs/rmarkdown/tidyverse verify failure)
- Ncpus=1 for R package compiles; realistic time hint for the long CRAN compile step
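The pre-flight idea above can be sketched as follows. This is an illustration only: the real step presumably lives in the Dockerfile or a shell script, and the name `missing_build_tools` is hypothetical. Only the two tool names come from the commit message.

```python
import shutil

# Tool names taken from the commit message above.
REQUIRED_TOOLS = ("cmake", "pkg-config")


def missing_build_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]


def preflight_exit_code(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return 1 and report what is missing, or 0 when every tool is present."""
    missing = missing_build_tools(tools, which)
    if missing:
        print("Pre-flight failed, missing: " + ", ".join(missing))
        return 1
    return 0
```

Checking PATH costs milliseconds, so a missing system library surfaces before the long CRAN compile starts rather than after it.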
Folds both EPUB checks into the binder CLI so pre-commit, CI, and
interactive `./binder check epub` all exercise the same Python code
(no subprocess-to-script). Adds epubcheck to book/tools/dependencies
so pip install gives every contributor the validator, and simplifies
the CI job from ~60 lines of shell+Java setup to three binder calls.
New CLI surface
---------------
  ./binder check epub                      # both scopes (default)
  ./binder check epub --scope hygiene      # source-level invariants
  ./binder check epub --scope epubcheck    # W3C validator on built EPUBs
  ./binder check epub --scope epubcheck \
      --max-fatal 0 --max-errors 0         # tighten thresholds
  ./binder check epub --json               # machine-readable output
Implementation
--------------
New file book/cli/commands/_epub_checks.py holds the primitives:
  find_hygiene_issues()
      Walks SVG + BIB source and returns EpubIssue records for the four
      invariant classes (svg-c0, svg-dupe-marker,
      bib-url-escape-{underscore,percent}, bib-url-raw-angle).
  run_epubcheck_on()
      Invokes the epubcheck binary or Python wrapper, parses JSON
      messages, and returns one EpubIssue per location with severity
      counts for threshold checks.
  emit_github_annotations()
      Under GITHUB_ACTIONS=true, emits `::error file=...` so findings
      appear inline on the PR diff.
  _discover_built_epubs()
      Finds the most recent EPUB per volume under
      book/quarto/_build/epub-vol*/.
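A minimal sketch of the annotation step described above, assuming a simplified record shape. The `EpubIssue` fields and the `format_annotation` helper are hypothetical and may not match the real code. The `::error file=...,line=...,title=...::message` form is the GitHub Actions workflow command syntax.

```python
import os
from dataclasses import dataclass


@dataclass
class EpubIssue:
    # Hypothetical fields; the real EpubIssue record may differ.
    code: str
    file: str
    line: int
    message: str


def format_annotation(issue: EpubIssue) -> str:
    """Format one issue as a GitHub Actions `::error` workflow command."""
    return (
        f"::error file={issue.file},line={issue.line},"
        f"title={issue.code}::{issue.message}"
    )


def emit_github_annotations(issues, environ=os.environ) -> int:
    """Print annotations only under GITHUB_ACTIONS=true; return how many were printed."""
    if environ.get("GITHUB_ACTIONS") != "true":
        return 0
    emitted = 0
    for issue in issues:
        print(format_annotation(issue))
        emitted += 1
    return emitted
```

Messages containing newlines or percent signs would need escaping before being emitted; the sketch leaves that out.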
validate.py gains:
- GROUPS['epub'] with 'hygiene' and 'epubcheck' scopes (the old
  'structure' scope delegated to validate_epub.py and had a
  pre-existing broken exit code; retired from default runs).
- New --max-fatal and --max-errors flags.
- Native _run_epub_hygiene and _run_epubcheck methods that call
  _epub_checks.py directly and wrap results into ValidationIssue.
  No subprocess-to-my-own-script.
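How the --max-fatal and --max-errors thresholds might gate the exit status can be sketched as below. The function name is hypothetical, and the severity labels "FATAL" and "ERROR" are assumed to match what epubcheck reports in its JSON output. The limits are required arguments because the real defaults are not stated here.

```python
from collections import Counter


def exceeds_thresholds(severities, max_fatal, max_errors):
    """Return True when FATAL or ERROR counts exceed the given limits.

    `severities` is an iterable of severity labels, one per reported issue.
    """
    counts = Counter(severities)
    return counts["FATAL"] > max_fatal or counts["ERROR"] > max_errors
```

With both limits at 0, a single ERROR is enough to fail the check; warnings never count against either limit in this sketch.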
Pre-commit wiring
-----------------
book-epub-hygiene now runs `./book/binder check epub --scope hygiene`.
Trigger files extended to include validate.py and _epub_checks.py so
the hook re-runs when the check itself changes.
CI wiring
---------
epub-validate job drops ~60 lines of shell + java jar download and
calls `./binder check epub --scope epubcheck` twice: once with --json
to capture a machine-readable artifact, once without for the
threshold-gated human summary. Java is still installed via
actions/setup-java because epubcheck is a jar.
Requirements
------------
`epubcheck>=5.1.0` added to book/tools/dependencies/requirements.txt
so every contributor gets the validator alongside the rest of the
binder runtime. Wrapper bundles the jar; JRE 8+ is still a host
requirement (brew install temurin / apt install default-jre). When
neither the Python wrapper nor the system binary is available, the
binder check emits an `epubcheck-missing` issue with install
instructions rather than silently passing.
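The "report, don't silently pass" behaviour can be sketched as follows. This is an assumption-laden illustration: both function names and the dict shape are hypothetical, and the real check wraps its result in an issue record rather than a plain dict.

```python
import importlib.util
import shutil


def epubcheck_available(which=shutil.which, find_spec=importlib.util.find_spec):
    """True if either the Python wrapper or a system binary can be found."""
    return find_spec("epubcheck") is not None or which("epubcheck") is not None


def missing_validator_issue(available):
    """Return an `epubcheck-missing` issue when no validator is available."""
    if available:
        return None
    return {
        "code": "epubcheck-missing",
        "message": (
            "epubcheck not found. Install the binder requirements "
            "(book/tools/dependencies/requirements.txt) and make sure "
            "a JRE 8+ is installed on the host."
        ),
    }
```

Returning an issue instead of an empty result means a machine without the validator fails visibly instead of reporting "0 issues".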
Removed
-------
book/tools/audit/checks/epub_hygiene.py — its logic now lives in
book/cli/commands/_epub_checks.py.
Documentation
-------------
book/cli/README.md gains an "EPUB Checks — Two Layers, One CLI
Surface" section with an error-code table, fix-at-source guidance
per code, and the defense-in-depth rationale (hygiene →
epub_postprocess → epubcheck) so future contributors understand
where each check fits in the pipeline.
Validation
----------
  ./binder check epub                      → 0 issues, ~7s (both scopes)
  ./binder check epub --scope hygiene      → 0 issues, ~100ms
  pre-commit run book-epub-hygiene         → passes
  YAML parse of pre-commit + CI            → both OK
Two project-health fixes flagged by the MIT Press release audit:
1. The `book-cleanup-artifacts` hook ran on every pre-commit and
destroyed `_build/`, `.quarto/`, and `index_files/`. This made
sequential PDF builds (vol1 -> vol2) fail with "FATAL Error opening
book citations file" because the second build had no cached state to
reuse. Move the hook to `stages: [manual]` so it runs only when
explicitly invoked (`pre-commit run --hook-stage manual
book-cleanup-artifacts`, or directly via `./book/binder clean
artifacts`). The hook still works -- it just no longer auto-fires.
2. Add `pydantic`, `scipy`, and `seaborn` to the canonical requirements
file. The audit had to install these by hand in a fresh Python 3.14
environment before pre-commit ran cleanly; codifying them keeps future
contributors from hitting the same wall.
No content changes.
Introduces a pre-commit hook to ensure SVG image files are well-formed XML,
preventing potential rendering or processing issues. This leverages `lxml` for
parsing, which has been added as a new dependency.
Corrects missing whitespace between attributes in existing SVG figures to
comply with the new validation requirements.
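A sketch of the well-formedness check, for illustration. Note the substitution: the hook itself uses `lxml`, while this example uses the standard-library `xml.etree.ElementTree` parser so that it runs without extra dependencies. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def svg_wellformed_error(svg_text):
    """Return None if the text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(svg_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Missing whitespace between two attributes is a well-formedness error.
good = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"/>'
bad = '<svg xmlns="http://www.w3.org/2000/svg" width="10"height="10"/>'
```

Here `svg_wellformed_error(good)` returns None, while `svg_wellformed_error(bad)` returns a parse error message, which is the class of defect fixed in the existing figures.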
- Remove invalid `output-file` from `project:` block in both EPUB configs
(Quarto schema only allows `output-file` under `book:`, not `project:`)
- Move `language` to top-level `lang:` and remove HTML-only keys from
EPUB format blocks (`fig-caption`, `footnotes-hover`, `citations-hover`,
`code-copy`, `code-line-numbers`, `description`) per Quarto EPUB spec
- Add `matplotlib>=3.7.0` to requirements.txt — was missing from container
image, causing ModuleNotFoundError during figure rendering
- Add `_matplotlib_available` guard in `viz.setup_plot()` to raise a clear
ImportError instead of a cryptic AttributeError when matplotlib is absent
The CI workflow hard-pinned black==24.10.0 separately from requirements.txt
(which said >=23.0.0), causing version skew that reformatted 11 QMD files
on every CI run. Remove the override and let requirements.txt be the single
source of truth, with its floor bumped to >=24.0.0 to align with the current
latest release.
- Quarto 1.9.27: Linux (.deb), Windows (direct download; Scoop Extras has 1.8.27)
- R 4.5.2: Linux (CRAN jammy-cran40), Windows (Scoop main/r)
- Baremetal: quarto-actions/setup for both Linux and Windows
- Remove ggrepel version pin (R 4.5.x supports ggrepel 0.9.7)
- Update docs: BUILD.md, CONTAINER_BUILDS.md, docker READMEs
Test suite validates structural invariants across chapters (progressive
knowledge building, cross-reference consistency, volume boundary compliance).
Update requirements.txt with new dependency.
* Restructure: Move book content to book/ subdirectory
- Move quarto/ → book/quarto/
- Move cli/ → book/cli/
- Move docker/ → book/docker/
- Move socratiQ/ → book/socratiQ/
- Move tools/ → book/tools/
- Move scripts/ → book/scripts/
- Move config/ → book/config/
- Move docs/ → book/docs/
- Move binder → book/binder
Git history fully preserved for all moved files.
Part of repository restructuring to support MLSysBook + TinyTorch.
Pre-commit hooks were bypassed for this commit because their paths still need updating.
* Update pre-commit hooks for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update all tools/ paths to book/tools/
- Update config/linting to book/config/linting
- Update project structure checks
Pre-commit hooks will now work with new directory structure.
* Update .gitignore for book/ subdirectory structure
- Update quarto/ paths to book/quarto/
- Update assets/ paths to book/quarto/assets/
- Maintain all existing ignore patterns
* Update GitHub workflows for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update cli/ paths to book/cli/
- Update tools/ paths to book/tools/
- Update docker/ paths to book/docker/
- Update config/ paths to book/config/
- Maintain all workflow functionality
* Update CLI config to support book/ subdirectory
- Check for book/quarto/ path first
- Fall back to quarto/ for backward compatibility
- Maintain full CLI functionality
* Create new root and book READMEs for dual structure
- Add comprehensive root README explaining both projects
- Create book-specific README with quick start guide
- Document repository structure and navigation
- Prepare for TinyTorch integration