MLSysBook Scripts Directory
This directory contains all automation scripts, tools, and utilities for the Machine Learning Systems book project. The scripts are organized into logical categories for easy discovery and maintenance.
📁 Directory Structure
tools/scripts/
├── build/ # Build and development scripts
├── content/ # Content management and editing tools
├── maintenance/ # System maintenance and updates
├── testing/ # Test scripts and validation
├── utilities/ # General utility scripts
├── docs/ # Documentation for scripts
├── genai/ # AI and generation tools
├── cross_refs/ # Cross-reference management
├── publish/ # Publishing and deployment
└── ai_menu/ # AI menu and interface tools
🔨 Build Scripts (build/)
Scripts for building, cleaning, and development workflows:
- clean.sh - Comprehensive cleanup script (build artifacts, caches, temp files)
- standardize_sources.sh - Standardize source file formatting
- generate_stats.py - Generate statistics about the Quarto project
Usage Examples
# Clean all build artifacts
./build/clean.sh
# Deep clean including caches and virtual environments
./build/clean.sh --deep
# Generate project statistics
python build/generate_stats.py
📝 Content Management (content/)
Tools for managing, editing, and validating book content:
- improve_figure_captions.py - Enhance figure captions using AI
- manage_section_ids.py - Manage section IDs and cross-references
- find_unreferenced_labels.py - Find unused labels and references
- find_duplicate_labels.py - Detect duplicate labels
- extract_headers.py - Extract headers from content files
- find_acronyms.py - Find and manage acronyms
- find_fig_references.py - Analyze figure references
- fix_bibliography.py - Fix bibliography formatting
- sync_bibliographies.py - Synchronize bibliography files
- clean_callout_titles.py - Clean callout title formatting
- collapse_blank_lines.py - Remove excessive blank lines
Usage Examples
# Improve figure captions
python content/improve_figure_captions.py
# Find unreferenced labels
python content/find_unreferenced_labels.py
# Manage section IDs
python content/manage_section_ids.py
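To make the idea of a duplicate-label check concrete, here is a minimal sketch. It is not the implementation of find_duplicate_labels.py: the label syntax (Quarto-style identifiers such as `{#sec-intro}`) and the function name `find_duplicate_labels` are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed label syntax: Quarto-style identifiers such as {#fig-plot} or {#sec-intro}.
LABEL_PATTERN = re.compile(r"\{#([A-Za-z][\w-]*)")


def find_duplicate_labels(documents):
    """Return {label: [(file_name, line_no), ...]} for labels defined more than once.

    `documents` maps a file name to that file's text content.
    """
    seen = defaultdict(list)
    for name, text in documents.items():
        for line_no, line in enumerate(text.split("\n"), start=1):
            for match in LABEL_PATTERN.finditer(line):
                seen[match.group(1)].append((name, line_no))
    # Keep only labels that occur in more than one place.
    return {label: places for label, places in seen.items() if len(places) > 1}
```

Passing in-memory text rather than file paths keeps the sketch easy to try on a small sample before pointing anything at the real content tree.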
🔧 Maintenance Scripts (maintenance/)
System maintenance, updates, and changelog management:
- generate_release_content.py - Generate changelog entries and release notes
- fix_changelog.py - Fix changelog formatting issues
- update_texlive_packages.py - Update LaTeX package dependencies
- cleanup_old_runs.sh - Clean up old build runs
Usage Examples
# Generate changelog entry or release notes
python maintenance/generate_release_content.py
# Update LaTeX packages
python maintenance/update_texlive_packages.py
🧪 Testing Scripts (testing/)
Test scripts and validation tools:
- run_tests.py - Run comprehensive test suite
- test_section_ids.py - Test section ID management
Usage Examples
# Run all tests
python testing/run_tests.py
# Test section ID system
python testing/test_section_ids.py
🛠️ Utilities (utilities/)
General-purpose utility scripts:
- check_ascii.py - Check for non-ASCII characters
- check_images.py - Validate image files and references
- check_sources.py - Comprehensive source file validation
- fix_titles.py - Fix title formatting
- count_footnotes.sh - Count footnotes
- analyze_footnotes.sh - Detailed footnote analysis
Usage Examples
# Check for non-ASCII characters
python utilities/check_ascii.py
# Validate images
python utilities/check_images.py
# Check source files
python utilities/check_sources.py
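As an illustration of what a non-ASCII check does, the sketch below reports the position of every character outside the ASCII range. It is a hypothetical stand-in, not the code of check_ascii.py; the function name `find_non_ascii` and the output shape are assumptions.

```python
def find_non_ascii(text):
    """Return (line_no, column, character) for each non-ASCII character in text.

    Lines are split on "\n" only, so Unicode line separators are reported
    as characters instead of being silently consumed as line breaks.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:  # ASCII covers code points 0 through 127
                hits.append((line_no, column, char))
    return hits
```

Line and column numbers are 1-based so they match what most editors display.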
📖 Documentation (docs/)
Documentation for scripts and systems:
- README.md - General scripts documentation
- SECTION_ID_SYSTEM.md - Section ID management system guide
- FIGURE_CAPTIONS.md - Figure caption enhancement guide
🤖 Specialized Tools
AI and Generation (genai/)
Tools for AI-powered content generation and enhancement.
Cross-References (cross_refs/)
Advanced cross-reference management and validation tools.
Publishing (publish/)
Scripts for publishing and deployment workflows. Note: The main publishing workflow is now handled by ./binder publish.
AI Menu (ai_menu/)
AI-powered menu and interface tools.
🚀 Quick Start
First Time Setup
# Make all scripts executable
find tools/scripts -name "*.sh" -exec chmod +x {} \;
# Install Python dependencies (if needed)
pip install -r tools/dependencies/requirements.txt
Common Workflows
Before Working on Content
# Clean workspace
./build/clean.sh
# Check project health
python utilities/check_sources.py
Content Editing Session
# Improve figures
python content/improve_figure_captions.py
# Find issues
python content/find_unreferenced_labels.py
python content/find_duplicate_labels.py
# Clean up formatting
python content/collapse_blank_lines.py
Before Publishing
# Full validation
python testing/run_tests.py
python utilities/check_images.py
python utilities/check_ascii.py
# Update changelog
python maintenance/generate_release_content.py
# Final cleanup
./build/clean.sh
# Publish (using binder)
./binder publish
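The pre-publish checks can also be chained so that the first failure stops the run. The following is a minimal sketch under stated assumptions: it is run from tools/scripts/, each script signals failure with a non-zero exit code, and the script names are those given in the category listings. `run_steps` and `PRE_PUBLISH_STEPS` are hypothetical names, not part of the repository.

```python
import subprocess
import sys

# Paths are relative to tools/scripts/ and use the names from the category listings.
PRE_PUBLISH_STEPS = [
    [sys.executable, "testing/run_tests.py"],
    [sys.executable, "utilities/check_images.py"],
    [sys.executable, "utilities/check_ascii.py"],
    [sys.executable, "maintenance/generate_release_content.py"],
    ["./build/clean.sh"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; return the first failing command, or None.

    Assumes a non-zero exit code means the step failed.
    """
    for command in steps:
        print("running:", " ".join(command))
        result = runner(command)
        if result.returncode != 0:
            return command
    return None
```

Calling `run_steps(PRE_PUBLISH_STEPS)` returns `None` when every step succeeds, at which point `./binder publish` can be run by hand. The `runner` parameter exists only so the sequencing logic can be exercised without launching the real scripts.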
📋 Script Categories Summary
| Category | Purpose | Count | Key Scripts |
|---|---|---|---|
| build | Development & building | 3 | clean.sh, generate_stats.py |
| content | Content management | 11 | manage_section_ids.py, improve_figure_captions.py |
| maintenance | System maintenance | 4 | generate_release_content.py, update_texlive_packages.py |
| testing | Testing & validation | 2 | run_tests.py, test_section_ids.py |
| utilities | General utilities | 6 | check_sources.py, check_ascii.py |
| docs | Documentation | 3 | Various .md files |
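The counts in this table have to be kept in sync by hand. A short sketch like the one below could recompute them; it assumes a "script" is any .py or .sh file directly inside a category directory (no recursion), which may not match how the table was originally tallied. `count_scripts` is a hypothetical helper, not an existing tool in this directory.

```python
from pathlib import Path

SCRIPT_SUFFIXES = {".py", ".sh"}  # assumption: what counts as a "script"


def count_scripts(root, suffixes=SCRIPT_SUFFIXES):
    """Map each immediate subdirectory of root to its number of matching files."""
    counts = {}
    for category in sorted(Path(root).iterdir()):
        if category.is_dir():
            counts[category.name] = sum(
                1
                for entry in category.iterdir()
                if entry.is_file() and entry.suffix in suffixes
            )
    return counts
```

For the docs row, the same function can be called with `suffixes={".md"}` to count documentation files instead.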
🔍 Finding the Right Script
By Purpose
- Need to clean up? → build/clean.sh
- Content has issues? → utilities/check_sources.py
- Figures need improvement? → content/improve_figure_captions.py
- Want project stats? → build/generate_stats.py
- Need to test changes? → testing/run_tests.py
By File Type
- .sh scripts - Shell scripts (mostly in build/ and utilities/)
- .py scripts - Python scripts (distributed across categories)
- .md files - Documentation (in docs/)
🤝 Contributing New Scripts
When adding new scripts:
- Choose the right category based on the script's primary purpose
- Follow naming conventions - descriptive, lowercase with underscores
- Add documentation - Include usage examples and descriptions
- Update this README - Add the script to the appropriate section
- Make executable - chmod +x for shell scripts
- Test thoroughly - Ensure scripts work in different environments
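The naming convention in the checklist above can be checked mechanically. This sketch encodes one possible reading of "lowercase with underscores": a lowercase letter first, then lowercase letters, digits, or underscores, ending in .py or .sh. Both the exact rule and the name `follows_naming_convention` are assumptions for illustration.

```python
import re

# Assumed rule: lowercase start, then lowercase letters, digits, or underscores,
# followed by a .py or .sh extension.
SCRIPT_NAME = re.compile(r"[a-z][a-z0-9_]*\.(py|sh)")


def follows_naming_convention(filename):
    """Return True if filename matches the assumed script naming rule."""
    return SCRIPT_NAME.fullmatch(filename) is not None
```

Names such as find_duplicate_labels.py and clean.sh pass under this rule, while FixTitles.py and fix-titles.py do not.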
📞 Support
For issues with specific scripts:
- Check the script's docstring or comments
- Look for documentation in the docs/ directory
- Run scripts with the --help flag if available
- Review this README for context and examples