mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-03-09 07:15:51 -05:00
# Scripts Directory

Automation scripts and tools for the Machine Learning Systems textbook.

## Deprecation Note

For workflows now exposed by Binder, prefer `./book/binder ...` commands over direct script execution.

- Validation checks: use `./book/binder validate ...`
- Maintenance utilities: use `./book/binder maintain ...`

Scripts remain available as internal utilities, but direct invocation is soft-deprecated for Binder-covered tasks.

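As a rough sketch of the preference described above, a small wrapper can call the Binder CLI when it is present and report clearly when it is not. The `BINDER` variable and the `check_references` function name are hypothetical, introduced only for illustration; the `validate references` subcommand is the one this README lists under Validation.

```bash
# Hypothetical wrapper: prefer the Binder CLI for Binder-covered tasks.
# BINDER and check_references are illustrative names, not part of the repo.
BINDER="${BINDER:-./book/binder}"

check_references() {
  if [ -x "$BINDER" ]; then
    # Binder-covered task: delegate to the CLI instead of calling a script.
    "$BINDER" validate references
  else
    echo "Binder CLI not found at $BINDER" >&2
    return 127
  fi
}
```

Calling `check_references` from the repository root would then run the Binder command; from anywhere else it fails with exit status 127 rather than silently falling back to a script.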
## Directory Structure

```
scripts/
├── common/          Shared base classes, config, logging, validators
├── content/         Content validation, formatting, and editing tools
├── docs/            Script documentation
├── genai/           AI-assisted tools (quizzes, footnotes, dash fixes)
├── glossary/        Glossary generation and consolidation
├── images/          Image processing, compression, validation
├── infrastructure/  CI/CD and Docker utilities
├── maintenance/     Repo health, image casing, build artifact cleanup
├── publish/         MIT Press release builder, figure extraction, deployment
├── socratiQ/        SocratiQ integration
├── testing/         Debug builds, test runners, linters
└── utilities/       Footnote analysis, ref auditing, JSON/EPUB validation
```

## Key Scripts by Task

### Content Editing
- `content/format_blank_lines.py` - Normalize blank lines in .qmd files
- `content/format_tables.py` - Format Quarto tables
- `content/section_splitter.py` - Split chapters into sections for processing
- `content/relocate_figures.py` - Move figures closer to first reference
- `content/manage_section_ids.py` - Manage `@sec-` cross-reference IDs

### Validation
- **Reference check** - `./book/binder validate references` (native CLI; validates `.bib` entries against academic databases via [hallucinator](https://github.com/gianlucasb/hallucinator)). See [README_REFERENCE_CHECK.md](README_REFERENCE_CHECK.md).
- `content/check_duplicate_labels.py` - Find duplicate labels
- `content/check_fig_references.py` - Validate figure references
- `content/check_unreferenced_labels.py` - Find unused labels
- `content/validate_citations.py` - Check citation formatting
- `utilities/validate_epub.py` - Validate EPUB output
- `utilities/validate_json.py` - Validate JSON files

### Publishing
- `publish/mit-press-release.sh` - Build MIT Press PDFs (regular or copy-edit)
- `publish/extract_figures.py` - Extract figure lists for MIT Press submission
- `publish/publish.sh` - Full release workflow with versioning
- `publish/render_compress_publish.py` - Render, compress, and publish

### Images
- `images/compress_images.py` - Compress images for web/PDF
- `images/validate_image_references.py` - Check image references
- `images/convert_svg_to_png.py` - SVG to PNG conversion

### Glossary
- `glossary/build_global_glossary.py` - Build master glossary from chapters
- `glossary/consolidate_similar_terms.py` - Merge near-duplicate terms

### AI Tools
- `genai/quizzes.py` - Generate quiz questions
- `genai/footnote_assistant.py` - AI-assisted footnote writing

## Usage

Run Python scripts with `python3`. Most support `--help` to list their options.

```bash
python3 book/tools/scripts/content/format_blank_lines.py path/to/file.qmd
python3 book/tools/scripts/publish/extract_figures.py --vol 1
./book/tools/scripts/publish/mit-press-release.sh --vol1
```

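To apply a single-file script across many chapters, a loop over `find` output is one option. The sketch below assumes `format_blank_lines.py` takes one `.qmd` path per call, as in the example above. `SCRIPT`, `DRY_RUN`, and `format_all_qmd` are hypothetical names used only for illustration.

```bash
# Sketch: run a single-file script on every .qmd file under a directory.
# SCRIPT, DRY_RUN, and format_all_qmd are illustrative, not part of the repo.
SCRIPT="${SCRIPT:-book/tools/scripts/content/format_blank_lines.py}"

format_all_qmd() {
  dir="$1"
  # Sort for a stable processing order; read line by line to keep spaces in paths.
  find "$dir" -type f -name '*.qmd' | sort | while IFS= read -r file; do
    if [ -n "${DRY_RUN:-}" ]; then
      # Dry run: print the command instead of executing it.
      echo "python3 $SCRIPT $file"
    else
      python3 "$SCRIPT" "$file" || echo "failed: $file" >&2
    fi
  done
}
```

Running `DRY_RUN=1 format_all_qmd path/to/chapters` first prints the commands that would run, which is a cheap way to confirm the file selection before any file is modified.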