docs: remove outdated documentation files

- Remove ENHANCED_BUILD_MANAGER_TESTING.md (references non-existent workflows)
- Remove PRE_COMMIT_VALIDATION_SUMMARY.md (redundant with PART_KEY_VALIDATION.md)
- Remove PRE_COMMIT_EXTERNAL_IMAGES.md (incorrect paths and obsolete references)

Reduces docs directory from 13 to 10 files, removing redundancy and outdated information.
Vijay Janapa Reddi
2025-08-19 11:43:35 -04:00
parent dde41fdf43
commit e1cef50bbf
3 changed files with 0 additions and 320 deletions


@@ -1,159 +0,0 @@
# Enhanced Build Manager Testing Plan
## Overview
The enhanced build manager provides intelligent container-based builds with fallback to traditional builds. This document outlines the testing strategy for the feature branch.
## Architecture
### Smart Container Management
1. **Container Health Check**: Verifies if containers exist and are up-to-date
2. **Conditional Building**: Only rebuilds containers when needed
3. **Intelligent Routing**: Uses fast containers when available, traditional builds otherwise
4. **Comprehensive Reporting**: Clear visibility into which strategy was used
### Performance Benefits
- **Fast path**: 5-10 minutes (with containers)
- **Traditional path**: 45 minutes (without containers)
- **Graceful degradation**: Always works even if containers fail
## Testing Strategy
### Phase 1: Container Build Testing
Test that containers build correctly from feature branch:
```bash
# Test Linux container build
gh workflow run build-linux-container.yml --ref feature/enhanced-build-manager
# Test Windows container build
gh workflow run build-windows-container.yml --ref feature/enhanced-build-manager
```
### Phase 2: Enhanced Manager Testing
Test the enhanced manager with different scenarios:
```bash
# Test 1: Full enhanced manager with container building
gh workflow run build-manager-enhanced.yml \
--ref feature/enhanced-build-manager \
--field force_container_rebuild=true \
--field build_format=html
# Test 2: Enhanced manager using existing containers
gh workflow run build-manager-enhanced.yml \
--ref feature/enhanced-build-manager \
--field force_container_rebuild=false \
--field build_format=html
# Test 3: Test with specific branch
gh workflow run build-manager-enhanced.yml \
--ref feature/enhanced-build-manager \
--field test_branch=feature/enhanced-build-manager \
--field build_format=html
```
### Phase 3: Individual Workflow Testing
Test individual workflows still work:
```bash
# Test container-based build directly
gh workflow run quarto-build-container.yml \
--ref feature/enhanced-build-manager \
--field os=ubuntu-latest \
--field format=html
# Test traditional build directly
gh workflow run quarto-build.yml \
--ref feature/enhanced-build-manager \
--field os=ubuntu-latest \
--field format=html
```
## Expected Outcomes
### Successful Container Path
1. Container health check finds containers available
2. Skips container building (unless forced)
3. Uses `quarto-build-container.yml` for fast builds
4. Completes in 5-10 minutes
### Successful Traditional Path
1. Container health check finds containers unavailable
2. Skips container building
3. Uses `quarto-build.yml` for traditional builds
4. Completes in ~45 minutes
### Container Building Path
1. Container health check determines rebuild needed
2. Builds containers (may take 20-30 minutes first time)
3. Uses newly built containers for fast builds
4. Future runs are fast (5-10 minutes)
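The three paths above can be summarized as a small routing decision. The following is only a sketch under stated assumptions: the `ContainerHealth` states and the `plan_build` function are hypothetical names for illustration, not part of the actual workflows. Only the two workflow file names come from this document.

```python
from enum import Enum


class ContainerHealth(Enum):
    """Hypothetical outcomes of the container health check."""
    AVAILABLE = "available"
    UNAVAILABLE = "unavailable"
    NEEDS_REBUILD = "needs_rebuild"


def plan_build(health, force_rebuild=False):
    """Return (rebuild_containers, workflow_file) for a health check result.

    Illustrative only: this mirrors the three expected outcomes described
    above and is not the real manager's logic.
    """
    if force_rebuild or health is ContainerHealth.NEEDS_REBUILD:
        # Container building path: rebuild first, then use the fast build.
        return True, "quarto-build-container.yml"
    if health is ContainerHealth.AVAILABLE:
        # Container path: reuse existing containers, skip rebuilding.
        return False, "quarto-build-container.yml"
    # Traditional path: no containers, skip rebuilding, use the slow build.
    return False, "quarto-build.yml"
```

The sketch does not model the runtime fallback described under Safety Features, where a failed container build still ends in a traditional build.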
## Container Naming Convention
The enhanced manager uses project-based naming:
- **Linux Container**: `ghcr.io/harvard-edge/cs249r_book/quarto-linux:latest`
- **Windows Container**: `ghcr.io/harvard-edge/cs249r_book/quarto-windows:latest`
This clearly identifies containers as belonging to the ML Systems book project and scales well for future projects.
## Safety Features
### Branch Isolation
- Feature branch won't trigger automatic builds on main/dev
- Manual testing only via `workflow_dispatch`
- No impact on production workflows
### Fallback Protection
- Always falls back to working traditional builds
- Never breaks existing functionality
- Comprehensive error reporting
### Consistency Enforcement
- Single source of truth for container names
- Standardized container references across workflows
- Prevents the naming mismatches we just fixed
## Success Criteria
### Must Have
- [ ] Containers build successfully from feature branch
- [ ] Enhanced manager completes without errors
- [ ] Traditional builds still work as fallback
- [ ] Clear reporting of which strategy was used
### Nice to Have
- [ ] Performance improvement visible in build times
- [ ] Container reuse works (second run much faster)
- [ ] Windows containers also work (when implemented)
## Migration Plan
Once testing is successful:
1. **Validate**: All tests pass on feature branch
2. **Review**: Code review and documentation update
3. **Merge**: Merge to dev branch for broader testing
4. **Monitor**: Watch dev branch builds for any issues
5. **Deploy**: Enable for main branch after dev validation
## Rollback Plan
If issues arise:
1. **Immediate**: Use manual `workflow_dispatch` with traditional builds
2. **Short-term**: Revert to original `build-manager.yml`
3. **Long-term**: Fix issues on feature branch and re-test
## Key Benefits
### For Development
- **Faster iteration**: 5-10 min builds instead of 45 min
- **Better reliability**: Fallback ensures builds always work
- **Clear feedback**: Know immediately which strategy was used
### For Production
- **Consistency**: Single manager orchestrates all builds
- **Performance**: Dramatic build time reduction
- **Maintenance**: Centralized container management


@@ -1,89 +0,0 @@
# Pre-commit Hook: External Image Validation
## Overview
This repository includes a pre-commit hook that validates all Quarto markdown files (`.qmd`) to ensure they don't contain external image references. This helps maintain build reliability and ensures the book can be built offline.
## How It Works
The `validate-external-images` hook runs automatically before each commit and:
1. **Scans** all `.qmd` files in `book/contents/`
2. **Detects** images with `#fig-` references that use external URLs (http/https)
3. **Fails** the commit if external images are found
4. **Provides** clear instructions on how to fix the issue
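The detection step could look roughly like the sketch below. It assumes figures are written in the Quarto form `![caption](URL){#fig-label}`; the regular expression and the function names are illustrative assumptions, not the contents of `download_external_images.py`.

```python
import re

# Assumed figure syntax: ![caption](URL){#fig-label ...}
# Only http/https URLs attached to a #fig- label are matched.
EXTERNAL_FIG = re.compile(
    r'!\[[^\]]*\]\((https?://[^)\s]+)\)\{[^}]*#(fig-[\w-]+)[^}]*\}'
)


def find_external_images(qmd_text):
    """Return (label, url) pairs for externally hosted figures."""
    return [(m.group(2), m.group(1)) for m in EXTERNAL_FIG.finditer(qmd_text)]


def format_report(path, qmd_text):
    """Build one message per external image, in the style of the hook output."""
    return [
        f"❌ External image found in {path}: {label} → {url}"
        for label, url in find_external_images(qmd_text)
    ]
```

A hook built on this would exit with a non-zero status whenever `format_report` returns any lines, which is what blocks the commit. Captions containing nested brackets are not handled by this simple pattern.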
## Example Output
When external images are detected:
```
❌ External image found in book/contents/core/chapter/chapter.qmd: fig-example → https://example.com/image.png
💡 To fix, run:
python3 tools/scripts/download_external_images.py -f book/contents/core/chapter/chapter.qmd
```
## Fixing External Images
When the hook fails, you have two options:
### Option 1: Download Images Locally (Recommended)
**Single file:**
```bash
python3 tools/scripts/download_external_images.py -f path/to/file.qmd
```
**All files in directory:**
```bash
python3 tools/scripts/download_external_images.py -d book/contents/
```
**Preview what would be downloaded:**
```bash
python3 tools/scripts/download_external_images.py -d book/contents/ --dry-run
```
### Option 2: Bypass Hook (Not Recommended)
Only use this for special cases where external images are intentionally required:
```bash
git commit --no-verify -m "your commit message"
```
## Benefits
- **Build Reliability**: Local images prevent broken builds due to external URL changes
- **Offline Capability**: The book can be built without internet access
- **Performance**: Faster builds with local images
- **Consistency**: Ensures all contributors follow the same image management practices
## Configuration
The hook is configured in `.pre-commit-config.yaml`:
```yaml
- id: validate-external-images
  name: "Check for external images in Quarto files"
  entry: python3 tools/scripts/download_external_images.py --validate book/contents/
  language: system
  pass_filenames: false
  files: ^book/contents/.*\.qmd$
```
## Running Manually
You can run the validation manually at any time:
```bash
# Check all files
pre-commit run validate-external-images --all-files
# Check only staged files
pre-commit run validate-external-images
# Run the validation script directly
python3 tools/scripts/download_external_images.py --validate book/contents/
```


@@ -1,72 +0,0 @@
# Pre-commit Part Key Validation - Summary
## What We Implemented
Validation was moved out of the GitHub workflow and into **pre-commit hooks**. This catches issues before they are committed, let alone pushed to the workflow.
## ✅ **What's Now in Place:**
### 1. Pre-commit Hook
- **Location**: `.pre-commit-config.yaml`
- **Trigger**: Runs on every commit
- **Action**: Validates all part keys in `.qmd` files
- **Result**: Blocks commit if invalid keys found
### 2. Validation Script
- **Location**: `scripts/validate_part_keys.py`
- **Function**: Scans all 65+ `.qmd` files
- **Checks**: Validates against `book/part_summaries.yml`
- **Output**: Detailed error report with file/line numbers
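A minimal sketch of the kind of check described here, assuming part keys appear in `.qmd` text as `key:<name>` (the form mentioned in the error this summary refers to). The pattern, the function name, and passing the valid keys as a plain set are all assumptions for illustration. The real script reads them from `book/part_summaries.yml`, whose layout is not shown in this document.

```python
import re

# Assumed key syntax: the literal prefix "key:" followed by an identifier.
KEY_PATTERN = re.compile(r'key:([A-Za-z0-9_-]+)')


def find_invalid_part_keys(qmd_text, valid_keys):
    """Return (line_number, key) for every key not present in valid_keys."""
    problems = []
    for line_no, line in enumerate(qmd_text.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            key = match.group(1)
            if key not in valid_keys:
                problems.append((line_no, key))
    return problems
```

Reporting the line number alongside the key is what makes the file/line error report possible. A hook would run this over each `.qmd` file and fail if any list is non-empty.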
### 3. Easy-to-Use Tools
- **Quick check**: `pre-commit run validate-part-keys --all-files`
- **Wrapper script**: `./scripts/check_keys.sh`
- **Direct validation**: `python3 scripts/validate_part_keys.py`
## 🚀 **Benefits of Pre-commit Approach:**
1. **Catches issues early** - before commit, not after push
2. **Faster feedback** - no waiting for CI/CD
3. **Prevents broken commits** - keeps history clean
4. **Developer-friendly** - immediate feedback
5. **Reduces CI/CD load** - fewer failed builds
## 📊 **Current Status:**
- **15 valid keys** in `part_summaries.yml`
- **65+ .qmd files** scanned
- **0 issues** found
- **Pre-commit hook** working perfectly
## 🔧 **How to Use:**
### For Developers:
```bash
# Normal workflow (validation runs automatically)
git add .
git commit -m "Your changes"
# If invalid keys found, commit is blocked
```
### For Manual Testing:
```bash
# Test validation
pre-commit run validate-part-keys --all-files
# Or run directly
python3 scripts/validate_part_keys.py
```
## 🛠️ **Removed from Workflow:**
- ❌ Removed validation step from `.github/workflows/quarto-build.yml`
- ✅ Validation now happens in pre-commit hooks
- ✅ Faster, more efficient, developer-friendly
## 🎯 **Result:**
The `key:xxx` error you were seeing will now be **caught before commit**, preventing it from ever reaching the build process. This is much more efficient and user-friendly than catching it in the workflow.
---
*This approach is much better because it catches issues at the source (during development) rather than after they've been pushed to the repository.*