Add `width="100%"` to every HTML content and contributor table across all
project READMEs so they render full-width on GitHub instead of collapsing
to natural content width. Cell-level `width="X%"` percentages were already
in place but only take effect once the table itself has an explicit width.
Also update the contributor-sync scripts so the auto-generated tables stay
consistent on the next bot run:
- .github/workflows/contributors/generate_main_readme.py
- .github/workflows/contributors/generate_readme_tables.py
Scope: 27 files, 85 tables. Sub-project READMEs that already use the
"card" pattern (labs/, kits/ content sections with <table width="98%">
wrappers) are intentionally untouched.
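A minimal sketch of the table-width rule described above, assuming a plain regex pass over README text. The function name `add_full_width` is hypothetical and not taken from the generator scripts; tables that already declare a width (such as the 98% card wrappers) are left alone.

```python
import re

# Opening <table ...> tag; group 1 is the raw attribute string (may be empty).
TABLE_OPEN = re.compile(r"<table\b([^>]*)>", re.IGNORECASE)


def add_full_width(html: str) -> str:
    """Add width="100%" to <table> tags that declare no width of their own."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        if re.search(r"\bwidth\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # explicit width already present: keep as-is
        return f'<table width="100%"{attrs}>'

    return TABLE_OPEN.sub(fix, html)
```

Running the pass twice is harmless, which is useful if the bot regenerates tables that were already patched.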
After the FATAL/URL/OPF/cross-ref fixes, the residual epubcheck errors
were all RSC-005 "attribute alt not allowed here" (214 vol1, 307 vol2)
and a small tail of RSC-005 "Duplicate id=arrow" on 6 SVG files in
vol2. Two independent mechanisms, one error class.
Post-process: alt -> aria-label on non-img wrappers
Quarto emits `fig-alt` as alt="..." on the enclosing
<div class="quarto-figure">, while the inner <img> already carries
alt="" (empty). Strict XHTML forbids alt on non-image elements.
Fix: in the sanitize_xml_for_epubcheck pass, rewrite alt="..." on
any element other than img/area/input to aria-label="...". If an
aria-label already exists on the element the alt is stripped instead
of duplicated. 214 rewrites vol1 + 271 rewrites vol2.
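The rewrite rule above can be sketched roughly as follows. This is an illustration, not the code in sanitize_xml_for_epubcheck: the regexes and the name `alt_to_aria_label` are assumptions, and the sketch assumes double-quoted attribute values that contain no `>` character.

```python
import re

ALT_ALLOWED = {"img", "area", "input"}
# Opening tag: group 1 = element name, group 2 = attributes, group 3 = optional "/".
OPEN_TAG = re.compile(r"<([A-Za-z][\w:-]*)(\s[^<>]*?)?(/?)>")
ALT_ATTR = re.compile(r'\salt="([^"]*)"')


def alt_to_aria_label(xhtml: str) -> str:
    """Move alt="..." to aria-label="..." on elements where alt is not allowed."""

    def fix(match: re.Match) -> str:
        name, attrs, slash = match.group(1), match.group(2) or "", match.group(3)
        if name.lower() in ALT_ALLOWED or not ALT_ATTR.search(attrs):
            return match.group(0)
        if re.search(r"\saria-label=", attrs):
            # An aria-label already exists: strip the alt instead of duplicating.
            attrs = ALT_ATTR.sub("", attrs, count=1)
        else:
            attrs = ALT_ATTR.sub(
                lambda a: f' aria-label="{a.group(1)}"', attrs, count=1
            )
        return f"<{name}{attrs}{slash}>"

    return OPEN_TAG.sub(fix, xhtml)
```

The inner `<img alt="">` is untouched because `img` is on the allow-list.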
Source SVGs: remove duplicate <marker> defs
Seven SVGs under contents/vol2/.../images/svg/ contained a
duplicated block of <marker id="arrow"/>, <marker id="arrow-red"/>,
<marker id="arrow-green"/> inside their <defs> element — evidently
a copy-paste slip during authoring. Six of the seven are referenced
in the vol2 EPUB and were raising RSC-005 "Duplicate id" under
epubcheck. Fix: dedupe markers by id in-place (keep first, drop
subsequent). 3 duplicates removed per file.
Affected files:
book/quarto/contents/vol2/ops_scale/images/svg/ecommerce-dependency-graph.svg
book/quarto/contents/vol2/ops_scale/images/svg/time-travel.svg
book/quarto/contents/vol2/responsible_ai/images/svg/reward-hacking-loop.svg
book/quarto/contents/vol2/robust_ai/images/svg/autoencoder.svg
book/quarto/contents/vol2/robust_ai/images/svg/gradient-attack.svg
book/quarto/contents/vol2/sustainable_ai/images/svg/ai-lca.svg
book/quarto/contents/vol2/sustainable_ai/images/svg/water-cycle.svg
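A sketch of the keep-first dedupe, assuming the standard-library ElementTree parser. The name `dedupe_markers` is hypothetical. ElementTree drops comments and the XML declaration on round-trip, so the actual in-place edit may well have been done differently.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep the SVG namespace unprefixed on output


def dedupe_markers(svg_text: str) -> tuple[str, int]:
    """Drop <marker> elements whose id was already seen; keep the first."""
    root = ET.fromstring(svg_text)
    seen: set[str] = set()
    removed = 0
    for defs in root.iter(f"{{{SVG_NS}}}defs"):
        # Copy the child list so removal does not disturb iteration.
        for marker in list(defs.findall(f"{{{SVG_NS}}}marker")):
            marker_id = marker.get("id")
            if marker_id is None:
                continue
            if marker_id in seen:
                defs.remove(marker)
                removed += 1
            else:
                seen.add(marker_id)
    return ET.tostring(root, encoding="unicode"), removed
```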
Validation (epubcheck 5.3.0, both volumes rebuilt from scratch):
Vol1: 0 fatals / 0 errors / 0 warnings / 0 infos
Vol2: 0 fatals / 0 errors / 0 warnings / 0 infos
Combined with the prior two commits, this closes the epubcheck work
for issues #1014, #1052, #1148. Total error reduction: 1000 -> 0
(vol1 346->0, vol2 654->0).
Extends epub_postprocess.py (already invoked by the EPUB configs as a
post-render hook) with three string-level passes that fix the 11 FATAL
epubcheck errors that caused Kindle and ClearView to reject the EPUB,
plus two smaller error classes:
- RSC-016 FATAL: "--" inside HTML comment bodies.
Quarto's EPUB filter wraps raw TikZ source in <!-- ... --> for
figures that have a PNG fallback. TikZ arrow syntax (\draw (a) -- (b))
produces "--" inside the comment, violating the XML comment spec.
Fix: replace every "--" inside a comment body with "- -" (XML-safe).
3 FATALs cleared in vol2.
- RSC-016 FATAL: bare <br> tags.
Pandoc emits HTML5 bare <br> inside XHTML for multi-line table
cells that use the book's `•<br>` bullet convention. Strict XML
parsers (Kindle, epubcheck) require self-closing <br/>.
Fix: rewrite <br> to <br/>.
1 FATAL vol1 + 2 FATALs vol2 cleared (plus 41 more non-FATAL fixes).
- RSC-016 FATAL: C0 control chars in SVG aria-label attributes.
The matplotlib->SVG pipeline emits aria-label values containing
U+0003 / U+000F / U+001D characters from raw-bytes representations.
XML 1.0 forbids these in attribute values.
Fix: strip C0 controls (except TAB/LF/CR) from every aria-label.
2 FATALs vol1 + 3 FATALs vol2 cleared.
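The three FATAL passes above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, not those in epub_postprocess.py, and aria-label values are assumed to be double-quoted.

```python
import re

COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
BARE_BR = re.compile(r"<br\s*>", re.IGNORECASE)
ARIA_LABEL = re.compile(r'aria-label="([^"]*)"')
# C0 controls other than TAB (0x09), LF (0x0A) and CR (0x0D).
C0_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def fix_comment_dashes(text: str) -> str:
    """Replace "--" inside comment bodies with "- -"."""

    def fix(match: re.Match) -> str:
        body = match.group(1)
        while "--" in body:  # repeat so runs like "---" are fully split
            body = body.replace("--", "- -")
        if body.endswith("-"):  # a body ending in "-" would close as "--->"
            body += " "
        return f"<!--{body}-->"

    return COMMENT.sub(fix, text)


def fix_bare_br(text: str) -> str:
    """Rewrite HTML5-style <br> as self-closing <br/>."""
    return BARE_BR.sub("<br/>", text)


def strip_c0_from_aria_labels(text: str) -> str:
    """Remove forbidden C0 control characters from aria-label values."""
    return ARIA_LABEL.sub(
        lambda m: 'aria-label="' + C0_FORBIDDEN.sub("", m.group(1)) + '"', text
    )
```

All three are idempotent, so re-running the post-render hook on an already sanitized tree changes nothing.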
Also fixes two non-FATAL classes in the same pipeline since they are
mechanical string fixes on the same extracted EPUB tree:
- RSC-020: BibTeX escape leaks in href URLs (\_, \%) and raw/encoded
angle brackets in DOI URLs (SICI DOIs like
10.1002/(sici)...<995::aid-spe111>3.0.co;2-6).
Fix: unescape \_ -> _, \% -> %, and percent-encode < / > / &lt; / &gt;
inside every href.
27 RSC-020 errors cleared (13 vol1 + 14 vol2); 0 remaining.
- OPF-014: nav.xhtml contains <math> elements but the OPF manifest
does not declare the mathml property on the nav item.
Fix: add `mathml` to the space-separated properties= attribute
of the nav manifest entry in content.opf.
2 OPF-014 errors cleared (1 per volume); 0 remaining.
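The two non-FATAL fixes can be illustrated as below. Function names are hypothetical, the example URL in the usage note is made up, and the OPF pass assumes the nav item is identified by `nav` in its properties attribute.

```python
import re

HREF = re.compile(r'href="([^"]*)"')
ITEM = re.compile(r"<item\b[^>]*>")
PROPS = re.compile(r'properties="([^"]*)"')


def clean_href(url: str) -> str:
    """Undo BibTeX escapes and percent-encode angle brackets in one URL."""
    url = url.replace(r"\_", "_").replace(r"\%", "%")
    for raw, encoded in (("&lt;", "%3C"), ("&gt;", "%3E"), ("<", "%3C"), (">", "%3E")):
        url = url.replace(raw, encoded)
    return url


def fix_hrefs(text: str) -> str:
    """Apply clean_href to every double-quoted href value."""
    return HREF.sub(lambda m: f'href="{clean_href(m.group(1))}"', text)


def add_nav_mathml(opf: str) -> str:
    """Add mathml to the properties of the manifest item that carries nav."""

    def fix(match: re.Match) -> str:
        tag = match.group(0)
        props = PROPS.search(tag)
        if props is None:
            return tag
        values = props.group(1).split()
        if "nav" not in values or "mathml" in values:
            return tag
        values.append("mathml")
        return tag[: props.start(1)] + " ".join(values) + tag[props.end(1):]

    return ITEM.sub(fix, opf)
```

For example, `fix_hrefs` turns `href="https://example.org/a\_b&lt;1&gt;"` into `href="https://example.org/a_b%3C1%3E"`.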
Validation (epubcheck 5.3.0, on both volumes):
- vol1: 346 errors -> 215 errors, 3 FATAL -> 0 FATAL
- vol2: 654 errors -> 307 errors, 8 FATAL -> 0 FATAL
- Residual is 521 RSC-005 + 1 RSC-007, tracked as P4/P6.
Relates to: #1014 (EPUB load failures), #1052 (ClearView "--" rejection),
#1148 (Kindle E999). These three issues require FATAL=0 to be resolved;
full resolution is pending validation on a CI-built artifact.
Standardize table formatting across 25 README files to use
HTML tables with consistent styling (thead/tbody, column widths,
bold labels) matching the main README's presentation.
- Add PYTHONUTF8=1 env var to all Windows Docker run commands (PEP 540)
- Fix generate_figure_list.py to explicitly use encoding='utf-8' in
write_text() instead of relying on system default (cp1252 on Windows)
- Root cause: the ≈ character (\u2248) in Vol I content triggered charmap
codec errors under cp1252
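A small sketch of why the explicit encoding matters: U+2248 has no cp1252 representation, so a write that falls back to the Windows default fails, while an explicit UTF-8 write succeeds. `write_figure_list` and `encodable` are hypothetical helpers, not the functions in generate_figure_list.py.

```python
from pathlib import Path


def encodable(text: str, codec: str) -> bool:
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def write_figure_list(path: Path, lines: list[str]) -> None:
    """Write the list with an explicit encoding, independent of the OS default."""
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Setting PYTHONUTF8=1 covers code paths that still omit `encoding=`, since UTF-8 mode makes UTF-8 the default text encoding.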
Configures explicit `render` paths for both volumes to ensure complete and correct builds, particularly for selective rendering workflows.
Replaces the static cross-reference fix script with a dynamic version. This new script automatically discovers and resolves internal links from QMD sources, improving maintainability and ensuring links remain functional during partial book builds.
Adds a new script to check and auto-fix bibliography completeness, facilitating self-contained volumes.
Removes redundant empty Python code blocks from chapter QMDs and refines frontmatter content for consistency.
- Fix rendering: dimensions (e.g. 224×224) use single math span $N\times M$
- Revert multipliers to N$\times$ / N--M$\times$ per LaTeX convention
- Fix malformed $N\times$ M → $N\times M$ across vol1/vol2
- Add revert_times_multipliers.py (one-off) and fix_times_math.py (dimension-only)
- Update book-prose guidelines in .claude/rules (dimension vs multiplier)
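A rough illustration of the malformed-dimension fix, assuming purely numeric operands. The regex and the name `fix_dimension_math` are assumptions, not the contents of fix_times_math.py. Multipliers written as N$\times$ do not match because the number sits outside the math span.

```python
import re

# "$224\times$ 224": the second operand was left outside the math span.
MALFORMED_DIM = re.compile(r"\$(\d+)\\times\$\s+(\d+)\b")


def fix_dimension_math(text: str) -> str:
    """Rewrite $N\\times$ M as a single math span $N\\times M$."""
    return MALFORMED_DIM.sub(
        lambda m: f"${m.group(1)}\\times {m.group(2)}$", text
    )
```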
Checkpoint the branch-wide content/config revisions together with workbench enhancements so chapter rendering and developer workflows stay aligned. This captures the current validation-driven formatting and parallel build/debug improvements in one commit.
Remove redundant ml_ prefix from ml_workflow chapter files and update all
Quarto config references. Consolidate custom scripts into native binder
subcommands and archive obsolete tooling.
Unifies Quarto metadata into shared base/format/volume fragments while carrying through chapter path, asset, and tooling updates to keep the repository consistent and easier to maintain.
Standardize Quarto config/style handling for HTML/EPUB volume builds, add explicit binder reset commands by format, and align QMD reference/label highlighting so structural tokens share consistent visual semantics.
Refactors figure list generation to reliably locate and clear LaTeX manifest files.
- Searches for the figure manifest in both the quarto root and build output directory,
handling cases where the post-render step moves the file.
- Clears stale manifests from both locations to avoid incorrect figure counts from
previous builds.
- Moves the LaTeX manifest to the build output directory to keep the source
tree clean.
- Updates the merge script to find the manifest dynamically.
This prevents issues where figure counts are mismatched due to outdated or
missing manifest files.
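The locate-and-clear behavior above might be sketched like this. The helper names and the manifest filename used in the usage note are hypothetical; the directory list stands in for the quarto root and the build output directory.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_manifest(dirs: Iterable[Path], name: str) -> Optional[Path]:
    """Return the first existing manifest among the candidate directories."""
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def clear_stale_manifests(dirs: Iterable[Path], name: str) -> int:
    """Delete the manifest from every candidate directory; return the count."""
    removed = 0
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            candidate.unlink()
            removed += 1
    return removed
```

A pre-render step would call `clear_stale_manifests([quarto_root, build_dir], "figure_manifest.txt")`, and the merge step would call `find_manifest` with the same list.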
Training and Frameworks chapters restructured for clarity.
Data Selection chapter expanded. Header-includes.tex updated.
Various minor fixes across all chapter files.
Two issues:
1. LaTeX parser regex only matched numeric figure numbers (e.g., 1.1)
but appendices use letter prefixes (B.1, C.2, D.1). Changed \d+ to
[A-Z\d]+ so all 214 figures are captured.
2. --scan-all mode picked up _shelved QMD files that aren't in the
actual build, causing a count mismatch. Added _shelved to skip list.
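Both fixes in miniature, assuming the figure number has the shape chapter.index and that shelved files live under a directory named _shelved. The pattern and helper names are illustrative rather than copied from the parser.

```python
import re
from pathlib import PurePosixPath

OLD_FIGNUM = re.compile(r"\d+\.\d+")       # numeric chapters only
NEW_FIGNUM = re.compile(r"[A-Z\d]+\.\d+")  # also accepts appendix letters
SKIP_DIRS = {"_shelved"}


def is_figure_number(label: str) -> bool:
    """True for labels such as 1.1 or B.1."""
    return NEW_FIGNUM.fullmatch(label) is not None


def in_build(path: str) -> bool:
    """False for paths that pass through a skipped directory."""
    return SKIP_DIRS.isdisjoint(PurePosixPath(path).parts)
```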
- Add pre-render hook to clear stale LaTeX data between builds
- Add post-render hook to generate FIGURE_LIST.txt in output dir
- LaTeX captures figure numbers and pages during compilation
- Use deferred write for accurate page numbers (after float placement)
- Python merges with QMD captions and alt-text
- Output automatically appears in _build/pdf-vol1/ after each build
Replaces hardcoded numerical values with symbolic Python variables derived from defined constants and formulas.
This improves code maintainability and consistency, ensuring calculations are based on accurate and up-to-date physical values.
Update fix_cross_references.py and generate_glossary.py to reflect
the renamed chapter sections in Vol 1 (e.g., sec-ml-systems to
sec-ml-system-architecture, sec-dl-primer to
sec-deep-learning-systems-foundations).
- Renamed vol2/advanced_intro to vol2/introduction for consistency
- Updated all scripts and configs to use vol1/ instead of core/
- Updated pre-commit config to check all contents/ not just vol1/
- Updated path references in Lua filters, Python scripts, and configs
* Restructure: Move book content to book/ subdirectory
- Move quarto/ → book/quarto/
- Move cli/ → book/cli/
- Move docker/ → book/docker/
- Move socratiQ/ → book/socratiQ/
- Move tools/ → book/tools/
- Move scripts/ → book/scripts/
- Move config/ → book/config/
- Move docs/ → book/docs/
- Move binder → book/binder
Git history fully preserved for all moved files.
Part of repository restructuring to support MLSysBook + TinyTorch.
Pre-commit hooks bypassed for this commit as paths need updating.
* Update pre-commit hooks for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update all tools/ paths to book/tools/
- Update config/linting to book/config/linting
- Update project structure checks
Pre-commit hooks will now work with new directory structure.
* Update .gitignore for book/ subdirectory structure
- Update quarto/ paths to book/quarto/
- Update assets/ paths to book/quarto/assets/
- Maintain all existing ignore patterns
* Update GitHub workflows for book/ subdirectory
- Update all quarto/ paths to book/quarto/
- Update cli/ paths to book/cli/
- Update tools/ paths to book/tools/
- Update docker/ paths to book/docker/
- Update config/ paths to book/config/
- Maintain all workflow functionality
* Update CLI config to support book/ subdirectory
- Check for book/quarto/ path first
- Fall back to quarto/ for backward compatibility
- Maintain full CLI functionality
* Create new root and book READMEs for dual structure
- Add comprehensive root README explaining both projects
- Create book-specific README with quick start guide
- Document repository structure and navigation
- Prepare for TinyTorch integration
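For the CLI config change above, the lookup order (book/quarto/ first, then quarto/ for backward compatibility) can be sketched as follows. `resolve_quarto_dir` is a hypothetical name, not the CLI's actual function.

```python
from pathlib import Path

# Preferred layout first, legacy layout second.
CANDIDATES = ("book/quarto", "quarto")


def resolve_quarto_dir(repo_root: Path) -> Path:
    """Return the first existing quarto directory under repo_root."""
    for relative in CANDIDATES:
        candidate = repo_root / relative
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"no quarto directory found under {repo_root}")
```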