Containerized Build System
Overview
This document describes the containerized build system for MLSysBook, which reduces Linux build times from about 45 minutes to 5-10 minutes.
Architecture
Container Strategy
- Linux builds: Use pre-built container with all dependencies
- Windows builds: Use pre-built Windows container (see Windows Container under Container Contents)
- Container registry: GitHub Container Registry (ghcr.io)
Performance Benefits
Traditional Linux Build (~45 minutes):
├── Install system packages (5-10 min)
├── Install TeX Live (15-20 min)
├── Install R packages (5-10 min)
├── Install Python packages (2-5 min)
├── Install Quarto (1-2 min)
└── Build content (5-10 min)
Containerized Linux Build (5-10 minutes):
├── Pull container (30 seconds)
├── Checkout code (30 seconds)
└── Build content (5-10 min)
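The totals above can be checked by summing the per-step ranges. The sketch below does this with plain shell arithmetic; the step values are the estimates listed above (expressed in seconds so the 30-second steps fit), and the helper name `sum_range` is hypothetical.

```bash
# Sum "min max" pairs (in seconds) and print the total range in minutes.
# sum_range is a hypothetical helper used only for this illustration.
sum_range() {
  lo=0
  hi=0
  while [ "$#" -ge 2 ]; do
    lo=$((lo + $1))
    hi=$((hi + $2))
    shift 2
  done
  echo "$((lo / 60))-$((hi / 60)) min"
}

# Traditional build: system packages, TeX Live, R, Python, Quarto, content
traditional=$(sum_range 300 600  900 1200  300 600  120 300  60 120  300 600)

# Containerized build: pull (30 s), checkout (30 s), content (5-10 min)
containerized=$(sum_range 30 30  30 30  300 600)

echo "traditional:   $traditional"
echo "containerized: $containerized"
```

With these estimates the steps sum to 33-57 minutes for the traditional build and 6-11 minutes for the containerized build, so nearly all of the remaining time is the content build itself.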
Files
Core Files
- book/docker/linux/Dockerfile - Linux build container definition
- book/docker/linux/README.md - Linux container documentation
- book/docker/linux/.dockerignore - Linux build exclusions
- book/docker/windows/Dockerfile - Windows build container definition
- book/docker/windows/README.md - Windows container documentation
- book/docker/windows/.dockerignore - Windows build exclusions
Container Lifecycle
- Build: Weekly automatic rebuilds + manual triggers
  - Linux container: Sunday 12am
  - Windows container: Sunday 2am
- Storage: GitHub Container Registry (ghcr.io)
- Usage: Pulled fresh for each build job
- Cleanup: GitHub manages old images automatically
Usage
Registry Paths
- Linux Registry: ghcr.io/harvard-edge/cs249r_book/quarto-linux
- Windows Registry: ghcr.io/harvard-edge/cs249r_book/quarto-windows
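To use one of these images outside the workflow, pull it by registry path and tag. This is a minimal sketch, assuming the `latest` tag is published and that you are already authenticated to ghcr.io if the package is not public; `image_ref` and `pull_image` are hypothetical helper names.

```bash
# Registry prefix taken from the paths listed above.
REGISTRY="ghcr.io/harvard-edge/cs249r_book"

# Build a full image reference from a platform ("linux" or "windows")
# and an optional tag (defaults to "latest"). Hypothetical helper.
image_ref() {
  platform="$1"
  tag="${2:-latest}"
  echo "${REGISTRY}/quarto-${platform}:${tag}"
}

# Pull the image for the given platform and tag.
pull_image() {
  docker pull "$(image_ref "$@")"
}

# Print the Linux reference with the default tag.
image_ref linux
```

For example, `pull_image linux latest` would fetch the Linux image; swapping the tag for `dev` or `main` selects another of the published tags.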
Manual Builds
You can build the containers locally using these commands:
- Linux: docker build -f book/docker/linux/Dockerfile -t mlsysbook-linux .
- Windows: docker build -f book/docker/windows/Dockerfile -t mlsysbook-windows .
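Once an image is built, a quick smoke test is to run a single command inside it. The sketch below wraps `docker run` for that purpose; `run_in_image` is a hypothetical helper, and mounting the current directory at `/workspace` is an assumption rather than a path the Dockerfiles are known to expect.

```bash
# Run a command inside a locally built image, with the current directory
# mounted at /workspace (assumed mount point) and used as the working dir.
run_in_image() {
  image="$1"
  shift
  docker run --rm -v "$PWD":/workspace -w /workspace "$image" "$@"
}
```

For example, `run_in_image mlsysbook-linux quarto --version` should print the Quarto version baked into the Linux image.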
Manual Build Test
# Test containerized build
gh workflow run book-build-container.yml \
  --field build_linux=true \
  --field build_windows=false \
  --field build_html=true \
  --field build_pdf=false \
  --field build_epub=false \
  --field build_target=vol1 \
  --field target=dev
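After dispatching the workflow, the run can be followed from the same terminal. This sketch uses `gh run list` and `gh run watch`, and assumes the most recent run of the workflow is the one just dispatched (it can take a few seconds to appear); `latest_run_id` and `watch_latest_run` are hypothetical helper names.

```bash
WORKFLOW="book-build-container.yml"

# Print the database ID of the most recent run of the workflow.
latest_run_id() {
  gh run list --workflow "$WORKFLOW" --limit 1 \
    --json databaseId --jq '.[0].databaseId'
}

# Follow that run until it finishes; returns non-zero if the run fails
# or if no run could be found.
watch_latest_run() {
  run_id="$(latest_run_id)" || return 1
  case "$run_id" in
    ''|null)
      echo "no runs found for $WORKFLOW" >&2
      return 1
      ;;
  esac
  gh run watch "$run_id" --exit-status
}
```

Calling `watch_latest_run` right after the dispatch command blocks until the build completes, which makes it usable in a local script that needs the build result.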
Container Information
- Linux Registry: ghcr.io/harvard-edge/cs249r_book/quarto-linux
- Windows Registry: ghcr.io/harvard-edge/cs249r_book/quarto-windows
- Tags: latest, main, dev, branch-specific tags
- Linux Size: ~2-3GB (includes TeX Live, R, Python packages)
- Windows Size: ~4-5GB (includes Windows Server Core + dependencies)
Workflow Integration
Current Workflows
- book-build-container.yml - Containerized book build matrix
- infra-container-linux.yml - Linux container image management
- infra-container-windows.yml - Windows container image management
Migration Status
- ✅ Phase 1: Containerized builds tested and validated
- ✅ Phase 2: Performance significantly improved (45min → 5-10min)
- ✅ Phase 3: Container workflow is the only build method
Container Contents
Pre-installed Dependencies
Linux Container
- System: Ubuntu 22.04 with all required libraries
- TeX Live: Full distribution (texlive-full)
- R: R-base with all required packages
- Python: Python 3.13 with all requirements
- Quarto: Version 1.9.27
- Tools: Inkscape, Ghostscript, fonts
Windows Container
- System: Windows Server Core 2022
- TeX: MiKTeX distribution
- R: R-base with all required packages
- Python: Python 3.x with all requirements
- Quarto: Version 1.9.27
- Tools: Inkscape, Ghostscript, Chocolatey package manager
Environment Variables
R_LIBS_USER=/usr/local/lib/R/library
QUARTO_LOG_LEVEL=INFO
PYTHONIOENCODING=utf-8
LANG=en_US.UTF-8
LC_ALL=en_US.UTF-8
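One way to confirm these values inside a running container is to compare the live environment against the expected list. This is a minimal sketch; `check_env` and `check_all_env` are hypothetical helpers, and the expected values are copied from the list above.

```bash
# Compare one environment variable against its expected value.
# Prints one line and returns non-zero on a mismatch.
check_env() {
  name="$1"
  expected="$2"
  actual="$(printenv "$name" || true)"
  if [ "$actual" = "$expected" ]; then
    echo "ok       $name=$actual"
  else
    echo "MISMATCH $name: expected '$expected', got '$actual'"
    return 1
  fi
}

# Check every variable listed above; returns non-zero if any differ.
check_all_env() {
  status=0
  check_env R_LIBS_USER /usr/local/lib/R/library || status=1
  check_env QUARTO_LOG_LEVEL INFO || status=1
  check_env PYTHONIOENCODING utf-8 || status=1
  check_env LANG en_US.UTF-8 || status=1
  check_env LC_ALL en_US.UTF-8 || status=1
  return "$status"
}
```

Running `check_all_env` inside the container prints one line per variable, which makes a drifted value easy to spot in the build log.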
Troubleshooting
Container Build Issues
- Check container build logs in Actions
- Verify dependency files are up to date
- Test locally with docker build -f book/docker/linux/Dockerfile -t test .
Build Issues
- Check if container exists: ghcr.io/harvard-edge/cs249r_book/quarto-linux:latest
- Verify container has all dependencies
- Review container preflight/toolchain logs first
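A dependency check of this kind can be approximated by testing that each expected tool is on the PATH. The sketch below is not the workflow's actual preflight script; `preflight` is a hypothetical helper, and the command names in `TOOLS` are assumptions about which binaries the listed dependencies provide.

```bash
# Report each tool as found or missing; return non-zero if any is missing.
preflight() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found    $tool"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Assumed binary names for Quarto, R, Python, Inkscape, Ghostscript, TeX Live.
TOOLS="quarto Rscript python3 inkscape gs lualatex"
```

Inside the container, `preflight $TOOLS` lists every tool with its status. To check that the image itself exists without pulling it, `docker manifest inspect ghcr.io/harvard-edge/cs249r_book/quarto-linux:latest` should succeed when the tag is published and readable.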
Performance Issues
- Monitor container pull times
- Check disk space in container
- Verify memory allocation
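These checks map onto a few standard commands. The sketch assumes a Linux runner where `date +%s`, `df`, and `free` are available; `time_cmd` and `report_resources` are hypothetical helpers.

```bash
# Run a command and report its wall-clock time in whole seconds on stderr.
# The command's own exit status is preserved.
time_cmd() {
  start=$(date +%s)
  "$@"
  rc=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit $rc): $*" >&2
  return "$rc"
}

# Show free disk space on the root filesystem and available memory.
report_resources() {
  df -h /
  free -h
}
```

For example, `time_cmd docker pull ghcr.io/harvard-edge/cs249r_book/quarto-linux:latest` records the pull time, and `report_resources` run inside the container shows the disk and memory the build actually sees.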
Future Enhancements
Potential Improvements
- Multi-stage builds for smaller images
- Windows containers for Windows builds
- Layer optimization for faster pulls
- Parallel builds for multiple formats
Monitoring
- Container build frequency
- Build time improvements
- Error rates vs traditional builds
- Resource usage optimization
Recovery Plan
If issues arise:
- Rebuild and republish the container image
- Fix preflight failures before render starts
- Re-run the container workflow
- Investigate dependency drift in Dockerfiles
Security
Container Security
- Uses official Ubuntu base image
- Minimal attack surface
- Regular base image updates
- GitHub security scanning enabled
Access Control
- Container registry access via GitHub Actions
- No external dependencies
- All builds run in isolated containers