Update BUILD.md with improved binder CLI documentation

- Replace 'hello' command with 'doctor' for system health checks
- Add detailed build command examples for single/multiple chapters
- Add dedicated shortcuts for PDF and EPUB builds
- Restructure requirements section with clear table format
- Add version requirements for all dependencies
- Improve clarity on Python dependencies and their purposes
Vijay Janapa Reddi
2025-10-02 08:41:00 -04:00
parent 7aa2703afd
commit 706b02ae45


@@ -12,15 +12,23 @@ For most users, the easiest way is using our **Book Binder CLI**:
# First time setup
./binder setup
# Welcome and overview
./binder hello
# System health check
./binder doctor
# Quick preview
# Quick chapter preview (HTML with live reload)
./binder preview intro
# Build everything
./binder build html
./binder build pdf
# Build specific chapter(s)
./binder build intro # Single chapter (HTML)
./binder build intro,ml_systems # Multiple chapters (HTML)
# Build complete book
./binder build # Complete book (HTML)
./binder pdf # Complete book (PDF)
./binder epub # Complete book (EPUB)
# Get help
./binder help
```
The `binder` tool automatically handles all dependencies, configuration, and build processes for you!
@@ -40,20 +48,46 @@ By default, Quarto can build the HTML version pretty easily. But **building the
---
## ✅ What You’ll Need (And Why)
## ✅ What You'll Need (And Why)
| Tool | Why It's Needed |
|------|------------------|
| **Quarto** | The core tool that converts the `.qmd` files into HTML/PDF |
| **R** | Some chapters include R code chunks and R-based plots |
| **R packages** | Supporting packages (defined in `install_packages.R`) |
| **TinyTeX + TeX Live** | Needed for LaTeX → PDF rendering |
| **Inkscape** | Converts `.svg` diagrams into `.pdf` (especially TikZ) |
| **Ghostscript** | Compresses large PDF files |
| **Python 3.9+** | Needed for PDF compression scripts and Book Binder CLI |
| **System libraries** | Fonts and rendering support on Linux systems |
| Tool | Why It's Needed | Version |
|------|------------------|---------|
| **Quarto** | The core tool that converts the `.qmd` files into HTML/PDF | 1.7.31+ |
| **Python** | Required for Book Binder CLI and build scripts | 3.9+ |
| **Python packages** | Dependencies (see `tools/dependencies/requirements.txt`) | See below |
| **R** | Some chapters include R code chunks and R-based plots | 4.0+ |
| **R packages** | Supporting packages (defined in `tools/dependencies/install_packages.R`) | Latest |
| **TinyTeX + TeX Live** | Needed for LaTeX → PDF rendering | Latest |
| **Inkscape** | Converts `.svg` diagrams into `.pdf` (especially TikZ) | 1.0+ |
| **Ghostscript** | Compresses large PDF files | Latest |
| **System libraries** | Fonts and rendering support (Linux systems) | Various |
Don’t worry — this guide will walk you through installing all of them, step by step.
Don't worry — this guide will walk you through installing all of them, step by step.
### Python Dependencies
The project uses a modern Python packaging setup with `pyproject.toml`. Core dependencies include:
**Core Build Dependencies:**
- `jupyterlab-quarto>=0.3.0` - Quarto integration
- `jupyter>=1.0.0` - Jupyter notebook support
- `pybtex>=0.24.0` - Bibliography processing
- `pypandoc>=1.11` - Document conversion
- `pyyaml>=6.0` - Configuration management
- `rich>=13.0.0` - CLI formatting and output
**Data Processing:**
- `pandas>=2.0.0` - Data manipulation
- `numpy>=1.24.0` - Numerical computing
- `Pillow>=9.0.0` - Image processing
**Additional Tools:**
- `openai>=1.0.0` - AI-assisted content tools
- `gradio>=4.0.0` - Interactive interfaces
- `ghostscript>=0.7` - PDF compression
- `pre-commit>=3.0.0` - Code quality hooks
For the complete list, see `tools/dependencies/requirements.txt` and `pyproject.toml`.
---
@@ -99,10 +133,10 @@ Once R is installed, open it by typing `R`, then run:
```r
install.packages("remotes")
source("install_packages.R")
source("tools/dependencies/install_packages.R")
```
This installs everything the book needs to render code, plots, etc.
This installs everything the book needs to render code, plots, etc. The R package dependencies are centrally managed in `tools/dependencies/install_packages.R`.
---
@@ -160,24 +194,65 @@ sudo apt-get install -y ghostscript
---
### 8. 🐍 Install Python 3 and pip (used for helper scripts)
### 8. 🐍 Install Python 3.9+ and Dependencies
```sh
sudo apt-get install -y python3 python3-pip
sudo apt-get install -y python3 python3-pip python3-venv
```
Test with:
```sh
python3 --version
python3 --version # Should be 3.9 or higher
pip3 --version
```
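The "3.9 or higher" check can also be scripted. The following is a minimal sketch, not part of the Binder CLI: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```sh
# version_ge ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
# Hypothetical helper; relies on `sort -V` (version sort).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example: compare the interpreter on PATH against the 3.9 minimum.
if command -v python3 >/dev/null 2>&1; then
    actual=$(python3 --version 2>&1 | awk '{print $2}')
    if version_ge "$actual" "3.9"; then
        echo "Python $actual is new enough"
    else
        echo "Python $actual is too old (need 3.9+)"
    fi
fi
```

The same helper could be pointed at other minimums from the requirements table, for example `version_ge "$(quarto --version)" "1.7.31"`.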
### 9. 📦 Install Python Dependencies
The project uses modern Python packaging. Install all dependencies with:
```sh
# Option 1: Using pip (recommended)
pip install -r requirements.txt
# Option 2: Install in development mode (includes CLI as command)
pip install -e .
# Option 3: Using a virtual environment (best practice)
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
**What gets installed:**
- Book Binder CLI and all build tools
- Jupyter and Quarto integration packages
- Data processing libraries (pandas, numpy)
- AI/ML tools for content assistance
- Pre-commit hooks for code quality
The `requirements.txt` file points to `tools/dependencies/requirements.txt`, which contains all production and development dependencies.
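When following the virtual-environment option, it can help to confirm that the environment is actually active before running `pip install`. A small sketch, assuming the standard activation script created by `python3 -m venv`, which exports `VIRTUAL_ENV`; the function name `in_venv` is illustrative only.

```sh
# in_venv: succeeds when a virtual environment is active in this shell.
# Assumption: activation exported VIRTUAL_ENV, as the stock venv scripts do.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "Using virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment active; run: source .venv/bin/activate"
fi
```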
---
### 9. 🧪 Test That It All Works
### 10. 🧪 Test That It All Works
Once you’ve installed everything, you're ready to try building the book!
Once you've installed everything, run the health check:
```sh
./binder doctor
```
This will verify:
- ✅ Quarto installation
- ✅ Python and dependencies
- ✅ R and required packages
- ✅ LaTeX and TinyTeX
- ✅ Inkscape and Ghostscript
- ✅ Configuration files
- ✅ Build directory structure
If everything passes, you're ready to build the book!
---
@@ -189,165 +264,419 @@ Navigate to the root folder of the project:
cd path/to/MLSysBook
```
### 🚀 **NEW: Dual-Configuration System**
### 🚀 **Dual-Configuration System**
The book now uses a **dual-configuration approach** that automatically switches between optimized settings for different output formats:
The book uses a **dual-configuration approach** that automatically switches between optimized settings for different output formats:
- **`quarto/_quarto-html.yml`** → Optimized for interactive website (clean navigation, TikZ→SVG, no citations)
- **`quarto/_quarto-pdf.yml`** → Optimized for academic PDF (full citations, LaTeX rendering, book structure)
- **`quarto/config/_quarto-html.yml`** → Optimized for interactive website (clean navigation, TikZ→SVG, cross-references)
- **`quarto/config/_quarto-pdf.yml`** → Optimized for academic PDF (full citations, LaTeX rendering, book structure)
- **`quarto/config/_quarto-epub.yml`** → Optimized for EPUB (e-reader format, reflowable content)
The build system automatically handles configuration switching using symlinks — **no manual file copying needed!**
The Binder CLI automatically handles configuration switching using symlinks — **no manual file management needed!**
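To make the symlink mechanism concrete, here is a hedged sketch of what switching configurations amounts to. It is not the Binder CLI's actual implementation; `use_config` is a hypothetical helper, and it assumes it is run from the `quarto/` directory with the `config/_quarto-<format>.yml` layout described above.

```sh
# use_config FORMAT: point _quarto.yml at config/_quarto-FORMAT.yml.
# Hypothetical helper; run from the quarto/ directory.
use_config() {
    target="config/_quarto-$1.yml"
    if [ ! -f "$target" ]; then
        echo "No such configuration: $target" >&2
        return 1
    fi
    # -s creates a symlink, -f replaces an existing _quarto.yml link
    ln -sf "$target" _quarto.yml
}

# Usage (from the quarto/ directory):
#   use_config pdf && quarto render --to titlepage-pdf
#   rm -f _quarto.yml
```

Refusing to link to a missing file keeps a typo in the format name from leaving a dangling `_quarto.yml` behind.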
---
### 🔹 **Build Commands (Recommended)**
### 🔹 **Build Commands (Book Binder CLI)**
Use these **automated commands** that handle configuration switching:
The **recommended way** to build the book is using the Book Binder CLI:
#### Interactive Build (Recommended)
#### Build Complete Book
```sh
make build
```
- Choose format interactively (HTML/PDF/Both)
- User-friendly prompts
- Perfect for development workflow
#### Build Website (HTML)
```sh
make build-html
```
- Uses HTML-optimized configuration
- TikZ diagrams → SVG conversion
- Clean navigation without chapter numbers
- Interactive quizzes and cross-references
#### Build PDF Book
```sh
make build-pdf
```
- Uses PDF-optimized configuration
- Full LaTeX rendering with citations
- Professional book formatting
- Traditional chapter numbering
#### Build Both Formats
```sh
make build-all
./binder build # Complete website (HTML)
./binder pdf # Complete book (PDF)
./binder epub # Complete e-book (EPUB)
```
#### Development Preview
#### Build Specific Chapter(s)
```sh
make preview # HTML preview with live reload
make preview-pdf # PDF preview
./binder build intro # Single chapter (HTML)
./binder build intro,ml_systems # Multiple chapters (HTML)
./binder pdf intro # Single chapter (PDF, selective build)
```
You'll find outputs in the `build/html/` folder for HTML and `build/pdf/` for PDF.
#### Preview Mode (Live Reload)
```sh
./binder preview # Preview complete book
./binder preview intro # Preview specific chapter
./binder preview intro,ml_systems # Preview multiple chapters
```
#### Management Commands
```sh
./binder clean # Clean build artifacts
./binder status # Show current status
./binder list # List all available chapters
./binder doctor # Run comprehensive health check
./binder help # Show all commands
```
**Output Locations:**
- **HTML:** `build/html/`
- **PDF:** `build/pdf/`
- **EPUB:** `build/epub/`
---
### 🔹 **Manual Commands (Advanced)**
### 🔹 **Advanced: Direct Quarto Commands**
If you need direct control, these commands work but require manual configuration management:
If you need direct control without the Binder CLI:
#### Website (HTML) version:
```sh
cd book
ln -sf _quarto-html.yml _quarto.yml
cd quarto
ln -sf config/_quarto-html.yml _quarto.yml
quarto render --to html
rm _quarto.yml
```
#### PDF version:
```sh
cd book
ln -sf _quarto-pdf.yml _quarto.yml
cd quarto
ln -sf config/_quarto-pdf.yml _quarto.yml
quarto render --to titlepage-pdf
rm _quarto.yml
```
**Note:** The automated `make` commands are recommended as they handle configuration switching and cleanup automatically.
#### EPUB version:
```sh
cd quarto
ln -sf config/_quarto-epub.yml _quarto.yml
quarto render --to epub
```
**Important:** The Binder CLI is strongly recommended as it:
- ✅ Handles configuration switching automatically
- ✅ Manages build artifacts and cleanup
- ✅ Provides progress indicators
- ✅ Validates system health
- ✅ Supports fast/selective builds
---
## 🪟 Setup on **Windows**
1. **Install Quarto**
Download from [quarto.org](https://quarto.org/docs/download/)
### Prerequisites
- Windows 10 or later
- Administrator access for some installations
2. **Install R**
Download from [CRAN](https://cran.r-project.org/)
### 1. Install Quarto
Download and install from [quarto.org](https://quarto.org/docs/download/)
3. **Install R Packages**
Open R and run:
```r
install.packages("remotes")
source("install_packages.R")
```
### 2. Install Python 3.9+
Download from [python.org](https://www.python.org/downloads/) or install it from the Microsoft Store.
4. **Install TinyTeX**
```r
install.packages("tinytex")
tinytex::install_tinytex()
```
**Important:** Check "Add Python to PATH" during installation.
5. **Install Inkscape, Ghostscript, Python**
Open PowerShell (as Administrator), then run:
```powershell
choco install inkscape ghostscript python -y
```
### 3. Install R
Download from [CRAN](https://cran.r-project.org/)
6. **Test Everything Works**
Open a new terminal and try:
```powershell
quarto render --to html
quarto render --to titlepage-pdf
```
### 4. Install R Packages
Open R and run:
```r
install.packages("remotes")
source("tools/dependencies/install_packages.R")
```
### 5. Install TinyTeX
From R console:
```r
install.packages("tinytex")
tinytex::install_tinytex()
```
### 6. Install Inkscape and Ghostscript (Using Chocolatey)
Open PowerShell (as Administrator):
```powershell
# Install Chocolatey if not already installed
Set-ExecutionPolicy Bypass -Scope Process -Force
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072
iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
# Install tools
choco install inkscape ghostscript -y
```
### 7. Install Python Dependencies
Open Command Prompt or PowerShell in the project directory:
```powershell
# Create virtual environment (recommended)
python -m venv .venv
.venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
```
### 8. Test Everything Works
Run the health check:
```powershell
python binder doctor
```
Or test building:
```powershell
python binder build intro
python binder pdf
```
---
## 💡 Troubleshooting Tips
**Quarto not found?**
Make sure it’s in your PATH and installed correctly.
**PDF build fails?**
- Check that LaTeX and Inkscape are working.
- Make sure you're using `--to titlepage-pdf` and not just `--to pdf`.
**Compression script doesn’t work?**
- Make sure Ghostscript is installed and accessible.
- You may need to install Python packages:
```sh
pip3 install pikepdf ghostscript PyPDF2
```
---
## 🎉 That’s It!
Once everything is set up, you’ll be able to:
- Preview changes locally
- Build clean HTML and PDF versions
- Contribute to the book like a pro 💪
Let me know if you'd like this saved as `manual_setup.md` or included in your Quarto documentation!
---
## 🔧 Additional Troubleshooting
**Icon files missing for foldbox callouts?**
If you see errors like `File 'icon_callout-quiz-question.pdf' not found`, the PNG icons need to be converted to PDF format for LaTeX rendering:
### Common Installation Issues
**Quarto not found?**
```sh
cd quarto/_extensions/ute/custom-numbered-blocks/style/icons
convert icon_callout-quiz-question.png icon_callout-quiz-question.pdf
convert icon_callout-quiz-answer.png icon_callout-quiz-answer.pdf
convert icon_callout-chapter-connection.png icon_callout-chapter-connection.pdf
convert icon_callout-resource-exercises.png icon_callout-resource-exercises.pdf
convert Icon_callout-resource-slides.png icon_callout-resource-slides.pdf
convert Icon_callout-resource-videos.png icon_callout-resource-videos.pdf
# Verify installation
quarto --version
# Check PATH (Linux/macOS)
echo $PATH | grep quarto
# Reinstall if needed
# Linux: sudo dpkg -i quarto-*.deb
# macOS: brew install --cask quarto
# Windows: Download from quarto.org
```
**Note:** This requires ImageMagick to be installed. On macOS: `brew install imagemagick`, on Ubuntu: `sudo apt-get install imagemagick`.
**Python version issues?**
```sh
# Check Python version (must be 3.9+)
python --version
python3 --version
# Use specific version if multiple installed
python3.9 --version
```
**Dependencies not installing?**
```sh
# Upgrade pip first
pip install --upgrade pip setuptools wheel
# Try with verbose output
pip install -r requirements.txt -v
# If SSL errors occur
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org -r requirements.txt
```
### Build Issues
**PDF build fails?**
- Verify LaTeX is installed: `pdflatex --version`
- Verify Inkscape is installed: `inkscape --version`
- Check TinyTeX path: `tinytex::tinytex_root()` in R
- Try rebuilding from scratch:
```sh
./binder clean
./binder pdf
```
**Chapter not found?**
```sh
# List all available chapters
./binder list
# Use exact chapter names (case-sensitive)
./binder build intro # ✓ correct
./binder build Intro # ✗ wrong
```
**Build artifacts detected?**
```sh
# Clean all build artifacts
./binder clean
# Check status
./binder status
# Run health check
./binder doctor
```
**Configuration issues?**
```sh
# Check current configuration
ls -la quarto/_quarto.yml
# Should be a symlink to config/_quarto-html.yml or config/_quarto-pdf.yml
# If not, recreate:
cd quarto
ln -sf config/_quarto-html.yml _quarto.yml
```
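The symlink check can be wrapped in a small function that reports the current target, or fails when `_quarto.yml` is missing or is a regular file. This is only a sketch; `config_link_target` is a hypothetical name and not a Binder command.

```sh
# config_link_target FILE: print the symlink target of FILE, or fail
# if FILE is missing or is not a symlink. Hypothetical helper.
config_link_target() {
    if [ -L "$1" ]; then
        readlink "$1"
    else
        echo "$1 is not a symlink" >&2
        return 1
    fi
}

# Example, run from the project root:
config_link_target quarto/_quarto.yml || echo "Recreate the link with ln -sf"
```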
### System-Specific Issues
**macOS: Inkscape not in PATH?**
```sh
# Add Inkscape to PATH
echo 'export PATH="/Applications/Inkscape.app/Contents/MacOS:$PATH"' >> ~/.zshrc
source ~/.zshrc
```
**Linux: Missing system libraries?**
```sh
# Install common missing libraries
sudo apt-get install -y libcairo2-dev libharfbuzz-dev libfribidi-dev \
libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev
```
**Windows: Permission errors?**
```powershell
# Run PowerShell as Administrator
# Relax the execution policy for the current session only
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process
```
### Getting Help
If you're still having issues:
1. **Run the health check**: `./binder doctor`
2. **Check the logs**: Look for detailed error messages
3. **Consult documentation**:
- [BINDER.md](BINDER.md) - Binder CLI guide
- [DEVELOPMENT.md](DEVELOPMENT.md) - Development setup
- [MAINTENANCE_GUIDE.md](MAINTENANCE_GUIDE.md) - Maintenance tasks
4. **Ask for help**:
- GitHub Discussions: https://github.com/harvard-edge/cs249r_book/discussions
- GitHub Issues: https://github.com/harvard-edge/cs249r_book/issues
---
## 📦 Modern Python Packaging
The project uses modern Python packaging standards with `pyproject.toml`:
### Project Structure
```
MLSysBook/
├── pyproject.toml # Python project configuration
├── requirements.txt # Points to tools/dependencies/requirements.txt
├── tools/
│ └── dependencies/
│ ├── requirements.txt # Actual dependencies
│ └── install_packages.R # R dependencies
└── cli/ # Modular CLI package
├── main.py # CLI entry point
├── commands/ # Command implementations
├── core/ # Core functionality
└── utils/ # Utilities
```
### Installation Options
**Standard Installation (Recommended):**
```sh
pip install -r requirements.txt
```
**Development Installation:**
```sh
# Installs package in editable mode with CLI as command
pip install -e .
# Now you can use:
binder build
mlsysbook build # Alternative command name
```
**With Optional Dependencies:**
```sh
# Install with AI features
pip install -e ".[ai]"
# Install with development tools
pip install -e ".[dev]"
# Install everything
pip install -e ".[ai,dev]"
```
### Key Features
The `pyproject.toml` defines:
- **Minimum Python version**: 3.9+
- **Core dependencies**: Listed in `dependencies` section
- **Optional dependencies**: AI tools, dev tools, build tools
- **Entry points**: `binder` and `mlsysbook` commands
- **Code quality tools**: Black, isort, pylint, mypy configurations
- **Testing setup**: Pytest with coverage
### Benefits
- ✅ Standards-compliant packaging
- ✅ Proper dependency management
- ✅ CLI installed as system command
- ✅ Supports pip, poetry, and other tools
- ✅ Easy distribution and installation
---
## 🎉 That's It!
Once everything is set up, you'll be able to:
### Development Workflow
- 🚀 **Preview changes locally** with live reload: `./binder preview intro`
- 🔨 **Build individual chapters** for fast iteration: `./binder build intro`
- 📚 **Build complete book** in multiple formats: `./binder build`, `./binder pdf`, `./binder epub`
- 🔍 **Validate your setup** anytime: `./binder doctor`
- 🧹 **Clean up artifacts**: `./binder clean`
### Contributing
- 📝 **Make edits** to chapter content in `quarto/contents/`
- ✅ **Test locally** before committing
- 🤝 **Follow best practices** with pre-commit hooks
- 💪 **Contribute like a pro** to the open-source book
### Next Steps
1. Read [BINDER.md](BINDER.md) for complete CLI reference
2. Check [DEVELOPMENT.md](DEVELOPMENT.md) for development guidelines
3. Review [contribute.md](contribute.md) for contribution guidelines
4. Join discussions at [GitHub Discussions](https://github.com/harvard-edge/cs249r_book/discussions)
---
## 📖 Additional Resources
### Documentation
- **[BINDER.md](BINDER.md)** - Complete Book Binder CLI reference
- **[DEVELOPMENT.md](DEVELOPMENT.md)** - Development guidelines and workflow
- **[MAINTENANCE_GUIDE.md](MAINTENANCE_GUIDE.md)** - Maintenance tasks and troubleshooting
- **[contribute.md](contribute.md)** - Contribution guidelines
- **[PUBLISH_LIVE_WORKFLOW.md](PUBLISH_LIVE_WORKFLOW.md)** - Publishing workflow
### Community
- **[GitHub Discussions](https://github.com/harvard-edge/cs249r_book/discussions)** - Ask questions and share knowledge
- **[GitHub Issues](https://github.com/harvard-edge/cs249r_book/issues)** - Report bugs and request features
- **[MLSysBook.org](https://mlsysbook.org)** - Main website and learning platform
### Tools and Scripts
The `tools/scripts/` directory contains various utilities:
- **`content/`** - Content management tools
- **`cross_refs/`** - Cross-reference management
- **`genai/`** - AI-assisted content tools
- **`glossary/`** - Glossary management
- **`maintenance/`** - System maintenance scripts
- **`publish/`** - Publishing and deployment tools
Run `./binder help` to see all available commands and their descriptions.
---
## 🙏 Contributing
We welcome contributions! The easiest way to get started:
1. **Fork and clone** the repository
2. **Set up your environment**: `./binder setup`
3. **Make your changes** to content or code
4. **Test locally**: `./binder preview <chapter>`
5. **Submit a pull request**
For detailed contribution guidelines, see [contribute.md](contribute.md).
---
**Last Updated**: October 2025
**Project**: Machine Learning Systems - Principles and Practices
**Website**: https://mlsysbook.ai