mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-03-08 23:03:55 -05:00
Scripts Directory
Automation scripts and tools for the Machine Learning Systems textbook.
Deprecation Note
For workflows now exposed by Binder, prefer ./book/binder ... commands over direct script execution.
- Validation checks: use ./book/binder validate ...
- Maintenance utilities: use ./book/binder maintain ...
Scripts remain available as internal utilities, but direct invocation is soft-deprecated for Binder-covered tasks.
Directory Structure
scripts/
├── common/ Shared base classes, config, logging, validators
├── content/ Content validation, formatting, and editing tools
├── docs/ Script documentation
├── genai/ AI-assisted tools (quizzes, footnotes, dash fixes)
├── glossary/ Glossary generation and consolidation
├── images/ Image processing, compression, validation
├── infrastructure/ CI/CD and Docker utilities
├── maintenance/ Repo health, image casing, build artifact cleanup
├── publish/ MIT Press release builder, figure extraction, deployment
├── socratiQ/ SocratiQ integration
├── testing/ Debug builds, test runners, linters
└── utilities/ Footnote analysis, ref auditing, JSON/EPUB validation
Key Scripts by Task
Content Editing
- content/format_blank_lines.py - Normalize blank lines in .qmd files
- content/format_tables.py - Format Quarto tables
- content/section_splitter.py - Split chapters into sections for processing
- content/relocate_figures.py - Move figures closer to first reference
- content/manage_section_ids.py - Manage @sec- cross-reference IDs
Validation
- Reference check: ./book/binder validate references (native CLI; validates .bib vs academic DBs via hallucinator). See README_REFERENCE_CHECK.md.
- content/check_duplicate_labels.py - Find duplicate labels
- content/check_fig_references.py - Validate figure references
- content/check_unreferenced_labels.py - Find unused labels
- content/validate_citations.py - Check citation formatting
- utilities/validate_epub.py - Validate EPUB output
- utilities/validate_json.py - Validate JSON files
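To illustrate what a duplicate-label check involves, here is a minimal sketch. It is not the logic of content/check_duplicate_labels.py; it assumes labels are written as Quarto attribute IDs such as {#sec-intro} or {#fig-plot}, and the names LABEL_PATTERN and find_duplicate_labels are hypothetical.

```python
import re
from collections import Counter

# Assumption: labels appear as Quarto attribute IDs, e.g. {#sec-intro} or {#fig-plot}.
# The real script may recognize other label forms.
LABEL_PATTERN = re.compile(r"\{#([A-Za-z][\w-]*)")


def find_duplicate_labels(text):
    """Return the labels that occur more than once in text, sorted by name."""
    counts = Counter(LABEL_PATTERN.findall(text))
    return sorted(label for label, n in counts.items() if n > 1)


sample = """
## Intro {#sec-intro}
![Plot](a.png){#fig-plot}
## Intro again {#sec-intro}
"""

print(find_duplicate_labels(sample))  # prints ['sec-intro']
```

For real checks, prefer the repository's own script or the matching ./book/binder validate ... command.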
Publishing
- publish/mit-press-release.sh - Build MIT Press PDFs (regular or copy-edit)
- publish/extract_figures.py - Extract figure lists for MIT Press submission
- publish/publish.sh - Full release workflow with versioning
- publish/render_compress_publish.py - Render, compress, and publish
Images
- images/compress_images.py - Compress images for web/PDF
- images/validate_image_references.py - Check image references
- images/convert_svg_to_png.py - SVG to PNG conversion
Glossary
- glossary/build_global_glossary.py - Build master glossary from chapters
- glossary/consolidate_similar_terms.py - Merge near-duplicate terms
AI Tools
- genai/quizzes.py - Generate quiz questions
- genai/footnote_assistant.py - AI-assisted footnote writing
Usage
All Python scripts use python3. Most support --help for options.
python3 book/tools/scripts/content/format_blank_lines.py path/to/file.qmd
python3 book/tools/scripts/publish/extract_figures.py --vol 1
./book/tools/scripts/publish/mit-press-release.sh --vol1
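The invocation pattern above (python3 for .py files, direct execution for shell scripts, all under book/tools/scripts/) can be sketched as a small helper that builds the command line. This is an illustrative sketch only: SCRIPTS_ROOT and script_command are hypothetical names, not part of the repository, and the helper builds the argument list without running anything.

```python
from pathlib import PurePosixPath

# Assumption: scripts live under book/tools/scripts/, as in the usage examples above.
SCRIPTS_ROOT = PurePosixPath("book/tools/scripts")


def script_command(relative_path, *args):
    """Build the argv list for a script given its path relative to SCRIPTS_ROOT."""
    script = SCRIPTS_ROOT / relative_path
    if script.suffix == ".py":
        # Python scripts are run through python3.
        return ["python3", str(script), *args]
    # Anything else (e.g. .sh) is executed directly from the repo root.
    return [f"./{script}", *args]


print(script_command("content/format_blank_lines.py", "path/to/file.qmd"))
print(script_command("publish/mit-press-release.sh", "--vol1"))
```

The resulting list can be passed to subprocess.run(...) from the repository root. For tasks covered by Binder, the ./book/binder commands remain the preferred entry point.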