mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-04-29 00:59:07 -05:00
docs: remove outdated documentation files
- Remove ENHANCED_BUILD_MANAGER_TESTING.md (references non-existent workflows)
- Remove PRE_COMMIT_VALIDATION_SUMMARY.md (redundant with PART_KEY_VALIDATION.md)
- Remove PRE_COMMIT_EXTERNAL_IMAGES.md (incorrect paths and obsolete references)

Reduces docs directory from 13 to 10 files, removing redundancy and outdated information.
@@ -1,159 +0,0 @@
# Enhanced Build Manager Testing Plan

## Overview

The enhanced build manager provides intelligent container-based builds with fallback to traditional builds. This document outlines the testing strategy for the feature branch.

## Architecture

### Smart Container Management
1. **Container Health Check**: Verifies if containers exist and are up-to-date
2. **Conditional Building**: Only rebuilds containers when needed
3. **Intelligent Routing**: Uses fast containers when available, traditional builds otherwise
4. **Comprehensive Reporting**: Clear visibility into which strategy was used

### Performance Benefits
- **Fast path**: 5-10 minutes (with containers)
- **Traditional path**: 45 minutes (without containers)
- **Graceful degradation**: Always works even if containers fail

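The timings above imply a rough break-even that can be checked with simple arithmetic. The sketch below is illustrative only: it takes the 5-10 minute and 45 minute figures from this section and the 20-30 minute first-time container build estimate given under Expected Outcomes, and the function names are hypothetical.

```python
# Rough timing arithmetic; all figures are this plan's own estimates, in minutes.
FAST_PATH = (5, 10)           # container-based build (best, worst)
TRADITIONAL_PATH = 45         # build without containers
CONTAINER_REBUILD = (20, 30)  # one-time container build, per Expected Outcomes


def first_run_with_rebuild() -> tuple[int, int]:
    """Best/worst total when containers must be built before the fast build."""
    return (CONTAINER_REBUILD[0] + FAST_PATH[0], CONTAINER_REBUILD[1] + FAST_PATH[1])


def savings_per_cached_run() -> tuple[int, int]:
    """Worst/best minutes saved per run once containers already exist."""
    return (TRADITIONAL_PATH - FAST_PATH[1], TRADITIONAL_PATH - FAST_PATH[0])


if __name__ == "__main__":
    print("first run incl. rebuild:", first_run_with_rebuild())
    print("saved per cached run:", savings_per_cached_run())
```

If these estimates hold, even a first run that has to rebuild containers would finish in less time than the 45-minute traditional build, and every later run would save roughly 35-40 minutes.
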
## Testing Strategy

### Phase 1: Container Build Testing
Test that containers build correctly from feature branch:

```bash
# Test Linux container build
gh workflow run build-linux-container.yml --ref feature/enhanced-build-manager

# Test Windows container build
gh workflow run build-windows-container.yml --ref feature/enhanced-build-manager
```

### Phase 2: Enhanced Manager Testing
Test the enhanced manager with different scenarios:

```bash
# Test 1: Full enhanced manager with container building
gh workflow run build-manager-enhanced.yml \
  --ref feature/enhanced-build-manager \
  --field force_container_rebuild=true \
  --field build_format=html

# Test 2: Enhanced manager using existing containers
gh workflow run build-manager-enhanced.yml \
  --ref feature/enhanced-build-manager \
  --field force_container_rebuild=false \
  --field build_format=html

# Test 3: Test with specific branch
gh workflow run build-manager-enhanced.yml \
  --ref feature/enhanced-build-manager \
  --field test_branch=feature/enhanced-build-manager \
  --field build_format=html
```

### Phase 3: Individual Workflow Testing
Test individual workflows still work:

```bash
# Test container-based build directly
gh workflow run quarto-build-container.yml \
  --ref feature/enhanced-build-manager \
  --field os=ubuntu-latest \
  --field format=html

# Test traditional build directly
gh workflow run quarto-build.yml \
  --ref feature/enhanced-build-manager \
  --field os=ubuntu-latest \
  --field format=html
```

## Expected Outcomes

### Successful Container Path
1. Container health check finds containers available
2. Skips container building (unless forced)
3. Uses `quarto-build-container.yml` for fast builds
4. Completes in 5-10 minutes

### Successful Traditional Path
1. Container health check finds containers unavailable
2. Skips container building
3. Uses `quarto-build.yml` for traditional builds
4. Completes in ~45 minutes

### Container Building Path
1. Container health check determines rebuild needed
2. Builds containers (may take 20-30 minutes first time)
3. Uses newly built containers for fast builds
4. Future runs are fast (5-10 minutes)

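The three paths above amount to a small routing decision. A minimal sketch in Python, assuming the health check yields one of three states; the `Health` enum, the `plan_build` function, the `REBUILD_STEP` label, and the `container_path_ok` flag are hypothetical names for illustration, and only the two workflow file names come from this plan:

```python
from enum import Enum

CONTAINER_WORKFLOW = "quarto-build-container.yml"
TRADITIONAL_WORKFLOW = "quarto-build.yml"
REBUILD_STEP = "rebuild containers"  # hypothetical label for the container build


class Health(Enum):
    AVAILABLE = "available"            # containers exist and are up to date
    REBUILD_NEEDED = "rebuild_needed"  # health check asks for a rebuild
    UNAVAILABLE = "unavailable"        # no usable containers


def plan_build(health: Health, force_rebuild: bool = False,
               container_path_ok: bool = True) -> list[str]:
    """Return the ordered steps for one run, following the paths described above."""
    steps: list[str] = []
    if force_rebuild or health is Health.REBUILD_NEEDED:
        steps.append(REBUILD_STEP)
    if steps or health is Health.AVAILABLE:
        steps.append(CONTAINER_WORKFLOW)
        if not container_path_ok:
            # Graceful degradation: fall back to the traditional build.
            steps.append(TRADITIONAL_WORKFLOW)
    else:
        steps.append(TRADITIONAL_WORKFLOW)
    return steps
```

Returning the list of steps, rather than running them, keeps the sketch easy to check against each expected outcome and doubles as the "which strategy was used" report.
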
## Container Naming Convention

The enhanced manager uses project-based naming:

- **Linux Container**: `ghcr.io/harvard-edge/cs249r_book/quarto-linux:latest`
- **Windows Container**: `ghcr.io/harvard-edge/cs249r_book/quarto-windows:latest`

This clearly identifies containers as belonging to the ML Systems book project and scales well for future projects.

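One way to keep these names consistent is to derive them from a single helper. The following is only a sketch of that idea, not how the workflows actually build the names; `REGISTRY`, `PROJECT`, and `image_ref` are hypothetical identifiers, and the resulting strings match the two references listed above.

```python
REGISTRY = "ghcr.io"
PROJECT = "harvard-edge/cs249r_book"


def image_ref(platform: str, tag: str = "latest") -> str:
    """Build a container reference following the project-based convention."""
    return f"{REGISTRY}/{PROJECT}/quarto-{platform}:{tag}"


if __name__ == "__main__":
    for platform in ("linux", "windows"):
        print(image_ref(platform))
```
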
## Safety Features

### Branch Isolation
- Feature branch won't trigger automatic builds on main/dev
- Manual testing only via `workflow_dispatch`
- No impact on production workflows

### Fallback Protection
- Always falls back to working traditional builds
- Never breaks existing functionality
- Comprehensive error reporting

### Consistency Enforcement
- Single source of truth for container names
- Standardized container references across workflows
- Prevents the naming mismatches we just fixed

## Success Criteria

### Must Have
- [ ] Containers build successfully from feature branch
- [ ] Enhanced manager completes without errors
- [ ] Traditional builds still work as fallback
- [ ] Clear reporting of which strategy was used

### Nice to Have
- [ ] Performance improvement visible in build times
- [ ] Container reuse works (second run much faster)
- [ ] Windows containers also work (when implemented)

## Migration Plan

Once testing is successful:

1. **Validate**: All tests pass on feature branch
2. **Review**: Code review and documentation update
3. **Merge**: Merge to dev branch for broader testing
4. **Monitor**: Watch dev branch builds for any issues
5. **Deploy**: Enable for main branch after dev validation

## Rollback Plan

If issues arise:
1. **Immediate**: Use manual `workflow_dispatch` with traditional builds
2. **Short-term**: Revert to original `build-manager.yml`
3. **Long-term**: Fix issues on feature branch and re-test

## Key Benefits

### For Development
- **Faster iteration**: 5-10 min builds instead of 45 min
- **Better reliability**: Fallback ensures builds always work
- **Clear feedback**: Know immediately which strategy was used

### For Production
- **Consistency**: Single manager orchestrates all builds
- **Performance**: Dramatic build time reduction
- **Maintenance**: Centralized container management
@@ -1,89 +0,0 @@
# Pre-commit Hook: External Image Validation

## Overview

This repository includes a pre-commit hook that validates all Quarto markdown files (`.qmd`) to ensure they don't contain external image references. This helps maintain build reliability and ensures the book can be built offline.

## How It Works

The `validate-external-images` hook runs automatically before each commit and:

1. **Scans** all `.qmd` files in `book/contents/`
2. **Detects** images with `#fig-` references that use external URLs (http/https)
3. **Fails** the commit if external images are found
4. **Provides** clear instructions on how to fix the issue

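The scan-and-detect steps above can be sketched in a few lines of Python. This is an illustrative approximation, not the code of `download_external_images.py`: the regular expression assumes figures written as `![caption](url){#fig-id}` with no image title and no `]` inside the caption, and `EXTERNAL_FIG`, `find_external_figures`, and `scan` are hypothetical names.

```python
import re
from pathlib import Path

# Matches ![caption](http(s)://url){... #fig-id ...}; captions containing "]" are not handled.
EXTERNAL_FIG = re.compile(
    r"!\[[^\]]*\]"                    # ![caption]
    r"\((https?://[^)\s]+)\)"         # (external URL)
    r"\{[^}]*#(fig-[\w-]+)[^}]*\}"    # {#fig-id ...}
)


def find_external_figures(text: str) -> list[tuple[str, str]]:
    """Return (figure id, URL) pairs for externally hosted figures in one document."""
    return [(m.group(2), m.group(1)) for m in EXTERNAL_FIG.finditer(text)]


def scan(root: str) -> dict[str, list[tuple[str, str]]]:
    """Map each .qmd file under root to its external figures, skipping clean files."""
    report = {}
    for path in sorted(Path(root).rglob("*.qmd")):
        hits = find_external_figures(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

A hook built on this sketch would exit non-zero whenever `scan("book/contents/")` returns a non-empty report, which is what blocks the commit.
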
## Example Output

When external images are detected:

```
❌ External image found in book/contents/core/chapter/chapter.qmd: fig-example → https://example.com/image.png

💡 To fix, run:
python3 tools/scripts/download_external_images.py -f book/contents/core/chapter/chapter.qmd
```

## Fixing External Images

When the hook fails, you have two options:

### Option 1: Download Images Locally (Recommended)

**Single file:**
```bash
python3 tools/scripts/download_external_images.py -f path/to/file.qmd
```

**All files in directory:**
```bash
python3 tools/scripts/download_external_images.py -d book/contents/
```

**Preview what would be downloaded:**
```bash
python3 tools/scripts/download_external_images.py -d book/contents/ --dry-run
```

### Option 2: Bypass Hook (Not Recommended)

Only use this for special cases where external images are intentionally required:

```bash
git commit --no-verify -m "your commit message"
```

## Benefits

- **Build Reliability**: Local images prevent broken builds due to external URL changes
- **Offline Capability**: The book can be built without internet access
- **Performance**: Faster builds with local images
- **Consistency**: Ensures all contributors follow the same image management practices

## Configuration

The hook is configured in `.pre-commit-config.yaml`:

```yaml
- id: validate-external-images
  name: "Check for external images in Quarto files"
  entry: python3 tools/scripts/download_external_images.py --validate book/contents/
  language: system
  pass_filenames: false
  files: ^book/contents/.*\.qmd$
```

## Running Manually

You can run the validation manually at any time:

```bash
# Check all files
pre-commit run validate-external-images --all-files

# Check only staged files
pre-commit run validate-external-images

# Run the validation script directly
python3 tools/scripts/download_external_images.py --validate book/contents/
```
@@ -1,72 +0,0 @@
# Pre-commit Part Key Validation - Summary

## What We Implemented

You were absolutely right! Instead of doing validation in the GitHub workflow, we moved it to **pre-commit hooks** where it belongs. This catches issues before they even get committed, let alone pushed to the workflow.

## ✅ **What's Now in Place:**

### 1. Pre-commit Hook
- **Location**: `.pre-commit-config.yaml`
- **Trigger**: Runs on every commit
- **Action**: Validates all part keys in `.qmd` files
- **Result**: Blocks commit if invalid keys found

### 2. Validation Script
- **Location**: `scripts/validate_part_keys.py`
- **Function**: Scans all 65+ `.qmd` files
- **Checks**: Validates against `book/part_summaries.yml`
- **Output**: Detailed error report with file/line numbers

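The core check described above can be sketched as follows. This is a rough illustration, not the contents of `validate_part_keys.py`: it assumes part keys appear in the text as `key:<name>` tokens (the shape of the `key:xxx` error mentioned in this summary), and it takes the set of valid keys as an argument because the schema of `book/part_summaries.yml` is not shown here. `KEY_TOKEN` and `find_invalid_keys` are hypothetical names.

```python
import re

# Hypothetical token shape: "key:" followed by an identifier.
KEY_TOKEN = re.compile(r"\bkey:\s*([A-Za-z0-9_-]+)")


def find_invalid_keys(text: str, valid_keys: set[str]) -> list[tuple[int, str]]:
    """Return (line number, key) for every key not present in valid_keys."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in KEY_TOKEN.finditer(line):
            key = match.group(1)
            if key not in valid_keys:
                problems.append((line_no, key))
    return problems
```

Reporting the line number alongside each unknown key is what would make the file/line error report possible; a hook would exit non-zero whenever any file yields a non-empty list.
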
### 3. Easy-to-Use Tools
- **Quick check**: `pre-commit run validate-part-keys --all-files`
- **Wrapper script**: `./scripts/check_keys.sh`
- **Direct validation**: `python3 scripts/validate_part_keys.py`

## 🚀 **Benefits of Pre-commit Approach:**

1. **Catches issues early** - before commit, not after push
2. **Faster feedback** - no waiting for CI/CD
3. **Prevents broken commits** - keeps history clean
4. **Developer-friendly** - immediate feedback
5. **Reduces CI/CD load** - fewer failed builds

## 📊 **Current Status:**

- ✅ **15 valid keys** in `part_summaries.yml`
- ✅ **65+ .qmd files** scanned
- ✅ **0 issues** found
- ✅ **Pre-commit hook** working perfectly

## 🔧 **How to Use:**

### For Developers:
```bash
# Normal workflow (validation runs automatically)
git add .
git commit -m "Your changes"
# If invalid keys found, commit is blocked
```

### For Manual Testing:
```bash
# Test validation
pre-commit run validate-part-keys --all-files

# Or run directly
python3 scripts/validate_part_keys.py
```

## 🛠️ **Removed from Workflow:**

- ❌ Removed validation step from `.github/workflows/quarto-build.yml`
- ✅ Validation now happens in pre-commit hooks
- ✅ Faster, more efficient, developer-friendly

## 🎯 **Result:**

The `key:xxx` error you were seeing will now be **caught before commit**, preventing it from ever reaching the build process. This is much more efficient and user-friendly than catching it in the workflow.

---

*This approach is much better because it catches issues at the source (during development) rather than after they've been pushed to the repository.*