Mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-03-11 17:49:25 -05:00)
Scripts Directory
Automation scripts and tools for the Machine Learning Systems textbook.
Deprecation Note
For workflows now exposed by Binder, prefer ./book/binder ... commands over direct script execution.
- Validation checks: use ./book/binder validate ...
- Maintenance utilities: use ./book/binder maintain ...
Scripts remain available as internal utilities, but direct invocation is soft-deprecated for Binder-covered tasks.
Directory Structure
scripts/
├── common/ Shared base classes, config, logging, validators
├── content/ Content validation, formatting, and editing tools
├── docs/ Script documentation
├── genai/ AI-assisted tools (quizzes, footnotes, dash fixes)
├── glossary/ Glossary generation and consolidation
├── images/ Image processing, compression, validation
├── infrastructure/ CI/CD and Docker utilities
├── maintenance/ Repo health, image casing, build artifact cleanup
├── publish/ MIT Press release builder, figure extraction, deployment
├── socratiQ/ SocratiQ integration
├── testing/ Debug builds, test runners, linters
└── utilities/ Footnote analysis, ref auditing, JSON/EPUB validation
Key Scripts by Task
Content Editing
- content/format_blank_lines.py - Normalize blank lines in .qmd files
- content/format_tables.py - Format Quarto tables
- content/section_splitter.py - Split chapters into sections for processing
- content/relocate_figures.py - Move figures closer to first reference
- content/manage_section_ids.py - Manage @sec- cross-reference IDs
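One case of blank-line normalization is worth illustrating: Pandoc can read a list as a continuation of the preceding paragraph when no blank line separates them. The sketch below shows one way such a fix might work. It is not the code in content/format_blank_lines.py; the function name `ensure_blank_line_before_lists` and both regular expressions are hypothetical, and it ignores edge cases such as YAML front matter.

```python
import re

# Hypothetical patterns: a bullet or numbered list item, and a code fence.
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")
FENCE = re.compile(r"^\s*(```|~~~)")


def ensure_blank_line_before_lists(text: str) -> str:
    """Insert a blank line before any list that directly follows a paragraph."""
    out = []
    in_fence = False
    for line in text.split("\n"):
        if FENCE.match(line):
            # Toggle on every fence line so code blocks are left untouched.
            in_fence = not in_fence
        elif not in_fence and LIST_ITEM.match(line) and out:
            prev = out[-1]
            starts_new_list = (
                prev.strip() != ""             # previous line is not blank
                and not LIST_ITEM.match(prev)  # and is not a list item itself
                and not prev[0].isspace()      # and is not an indented continuation
            )
            if starts_new_list:
                out.append("")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Key points:\n- first\n- second\n"
    print(ensure_blank_line_before_lists(sample))
```

The function is idempotent under these assumptions: running it a second time on its own output adds no further blank lines, which matters for a formatter applied repeatedly across many files.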
Validation
- Reference check: ./book/binder validate references (native CLI; validates .bib vs academic DBs via hallucinator). See README_REFERENCE_CHECK.md.
- content/check_duplicate_labels.py - Find duplicate labels
- content/check_fig_references.py - Validate figure references
- content/check_unreferenced_labels.py - Find unused labels
- content/validate_citations.py - Check citation formatting
- utilities/validate_epub.py - Validate EPUB output
- utilities/validate_json.py - Validate JSON files
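To make the idea of a duplicate-label check concrete, here is a minimal sketch. It is not the implementation of content/check_duplicate_labels.py; the function name `find_duplicate_labels` and the `LABEL` pattern are hypothetical. It also assumes labels appear as `{#sec-...}` or `{#fig-...}` at the start of an attribute block, so labels declared in other ways (for example in code-cell options) would be missed.

```python
import re
from collections import defaultdict

# Hypothetical pattern: an identifier at the start of a {#...} attribute block.
LABEL = re.compile(r"\{#([A-Za-z][\w-]*)")


def find_duplicate_labels(documents):
    """Map each label defined more than once to its (file, line) locations.

    `documents` maps a file name to that file's text.
    """
    seen = defaultdict(list)
    for name, text in documents.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in LABEL.finditer(line):
                seen[match.group(1)].append((name, lineno))
    # Keep only labels that occur at more than one location.
    return {label: locs for label, locs in seen.items() if len(locs) > 1}


if __name__ == "__main__":
    docs = {
        "a.qmd": "# Intro {#sec-intro}\n![plot](plot.png){#fig-plot}\n",
        "b.qmd": "# Other {#sec-intro}\n",
    }
    for label, locations in find_duplicate_labels(docs).items():
        print(label, locations)
```

Passing file contents as a dictionary keeps the sketch independent of the repository layout; a real tool would read the .qmd files from disk.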
Publishing
- publish/mit-press-release.sh - Build MIT Press PDFs (regular or copy-edit)
- publish/extract_figures.py - Extract figure lists for MIT Press submission
- publish/publish.sh - Full release workflow with versioning
- publish/render_compress_publish.py - Render, compress, and publish
Images
- images/compress_images.py - Compress images for web/PDF
- images/validate_image_references.py - Check image references
- images/convert_svg_to_png.py - SVG to PNG conversion
Glossary
- glossary/build_global_glossary.py - Build master glossary from chapters
- glossary/consolidate_similar_terms.py - Merge near-duplicate terms
AI Tools
- genai/quizzes.py - Generate quiz questions
- genai/footnote_assistant.py - AI-assisted footnote writing
Usage
Run all Python scripts with python3. Most accept --help to list their options.
python3 book/tools/scripts/content/format_blank_lines.py path/to/file.qmd
python3 book/tools/scripts/publish/extract_figures.py --vol 1
./book/tools/scripts/publish/mit-press-release.sh --vol1