mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-03-12 02:06:14 -05:00
chore: remove unused scripts and documentation
Remove 19 unused scripts that were not referenced in any workflows or configuration files:
- 13 validation scripts in .github/tinytorch-scripts/ (never integrated into CI/CD)
- TINYTORCH_RELEASE_PROCESS.md documentation
- Duplicate gs_compress_pdf.py script
- Unused book scripts (footnotes, reorganize_scripts)
- Unused check_no_emojis.py script
This commit is contained in:
460
.github/TINYTORCH_RELEASE_PROCESS.md
vendored
@@ -1,460 +0,0 @@
# TinyTorch Release Process

## Overview

This document describes the complete release process for TinyTorch, combining automated CI/CD checks with manual agent-driven reviews.

## Release Types

### Patch Release (0.1.X)
- Bug fixes
- Documentation updates
- Minor improvements
- **Timeline:** 1-2 days

### Minor Release (0.X.0)
- New module additions
- Feature enhancements
- Significant improvements
- **Timeline:** 1-2 weeks

### Major Release (X.0.0)
- Complete module sets
- Breaking API changes
- Architectural updates
- **Timeline:** 1-3 months

## Two-Track Quality Assurance

### Track 1: Automated CI/CD (Continuous)

**GitHub Actions** runs on every commit and PR:

```
Every Push/PR:
├── Educational Validation (Module structure, objectives)
├── Implementation Validation (Time, difficulty, tests)
├── Test Validation (All tests, coverage)
├── Package Validation (Builds, installs)
├── Documentation Validation (ABOUT.md, checkpoints)
└── Systems Analysis (Memory, performance, production)
```

**Trigger:** Automatic on push/PR

**Duration:** 15-20 minutes

**Pass Criteria:** All 6 quality gates green

---
### Track 2: Agent-Driven Review (Pre-Release)

**Specialized AI agents** provide deep review before releases:

```
TPM Coordinates:
├── Education Reviewer
│   ├── Pedagogical effectiveness
│   ├── Learning objective alignment
│   ├── Cognitive load assessment
│   └── Assessment quality
│
├── Module Developer
│   ├── Implementation standards
│   ├── Code quality patterns
│   ├── Testing completeness
│   └── PyTorch API alignment
│
├── Quality Assurance
│   ├── Comprehensive test validation
│   ├── Edge case coverage
│   ├── Performance testing
│   └── Integration stability
│
└── Package Manager
    ├── Module integration
    ├── Dependency resolution
    ├── Export/import validation
    └── Build verification
```

**Trigger:** Manual (via TPM)

**Duration:** 2-4 hours

**Pass Criteria:** All agents approve

---
## Complete Release Workflow

### Phase 1: Development (Ongoing)

1. **Feature Development**
   - Implement modules following DEFINITIVE_MODULE_PLAN.md
   - Write tests immediately after each function
   - Ensure NBGrader compatibility
   - Add checkpoint markers to long modules

2. **Local Validation**
   ```bash
   # Run validators locally
   python .github/scripts/validate_time_estimates.py
   python .github/scripts/validate_difficulty_ratings.py
   python .github/scripts/validate_testing_patterns.py
   python .github/scripts/check_checkpoints.py

   # Run tests
   pytest tests/ -v
   ```

3. **Commit & Push**
   ```bash
   git add .
   git commit -m "feat: Add [feature] to [module]"
   git push origin feature-branch
   ```

---
### Phase 2: Pre-Release Review (1-2 days)

1. **Create Release Branch**
   ```bash
   git checkout -b release/v0.X.Y
   git push origin release/v0.X.Y
   ```

2. **Automated CI/CD Check**
   - GitHub Actions runs automatically
   - Review workflow results
   - Fix any failures

3. **Agent-Driven Comprehensive Review**

   **Invoke TPM for multi-agent review:**

   ```
   Request to TPM:
   "I need a comprehensive quality review of all 20 TinyTorch modules
   for release v0.X.Y. Please coordinate:

   1. Education Reviewer - pedagogical validation
   2. Module Developer - implementation standards
   3. Quality Assurance - testing validation
   4. Package Manager - integration health

   Run these in parallel and provide:
   - Consolidated findings report
   - Prioritized action items
   - Estimated effort for fixes
   - Timeline for completion

   Release Type: [patch/minor/major]
   Target Date: [YYYY-MM-DD]"
   ```

4. **Review Agent Reports**
   - Education Reviewer report
   - Module Developer report
   - Quality Assurance report
   - Package Manager report

5. **Address Findings**
   - Fix HIGH priority issues immediately
   - Schedule MEDIUM priority for next sprint
   - Document LOW priority as future improvements

---
### Phase 3: Release Candidate (1 day)

1. **Create Release Candidate**
   ```bash
   git tag -a v0.X.Y-rc1 -m "Release candidate 1 for v0.X.Y"
   git push origin v0.X.Y-rc1
   ```

2. **Final Validation**
   - Run full test suite
   - Build documentation
   - Test package installation
   - Manual smoke testing

3. **Stakeholder Review** (if applicable)
   - Share RC with instructors
   - Collect feedback
   - Make final adjustments

---
### Phase 4: Release (1 day)

1. **Manual Release Check Trigger**

   Via GitHub UI:
   - Go to Actions → TinyTorch Release Check
   - Click "Run workflow"
   - Select:
     - Branch: `release/v0.X.Y`
     - Release Type: `[patch/minor/major]`
     - Check Level: `comprehensive`

2. **Review Release Report**
   - All quality gates pass
   - Download release report artifact
   - Verify all validations green

3. **Merge to Main**
   ```bash
   git checkout main
   git merge --no-ff release/v0.X.Y
   git push origin main
   ```

4. **Create Official Release**
   ```bash
   git tag -a v0.X.Y -m "Release v0.X.Y: [Description]"
   git push origin v0.X.Y
   ```

5. **GitHub Release**
   - Go to Releases → Draft a new release
   - Select tag: `v0.X.Y`
   - Title: `TinyTorch v0.X.Y`
   - Description: Include release report summary
   - Attach artifacts (wheels, documentation)
   - Publish release

6. **Package Distribution**
   ```bash
   # Build distribution packages
   python -m build

   # Upload to PyPI (if applicable)
   python -m twine upload dist/*
   ```

---
### Phase 5: Post-Release (Ongoing)

1. **Documentation Updates**
   - Update README.md with new version
   - Update CHANGELOG.md
   - Rebuild Jupyter Book
   - Deploy to mlsysbook.github.io

2. **Communication**
   - Announce on GitHub
   - Update course materials
   - Notify instructors
   - Social media (if applicable)

3. **Monitoring**
   - Watch for issues
   - Respond to feedback
   - Plan next release

---
## Quality Gates Reference

### Must Pass for ALL Releases

✅ All automated CI/CD checks pass
✅ Test coverage ≥80%
✅ All agent reviews approved
✅ Documentation complete
✅ No HIGH priority issues

### Additional for Major Releases

✅ All 20 modules validated
✅ Complete integration testing
✅ Performance benchmarks meet targets
✅ Comprehensive stakeholder review

---
## Checklist Templates

### Patch Release Checklist

```markdown
## Pre-Release
- [ ] Local validation passes
- [ ] Automated CI/CD passes
- [ ] Bug fix validated
- [ ] Tests updated

## Release
- [ ] Release branch created
- [ ] RC tested
- [ ] Merged to main
- [ ] Tag created
- [ ] GitHub release published

## Post-Release
- [ ] Documentation updated
- [ ] CHANGELOG updated
- [ ] Issue closed
```

### Minor Release Checklist

```markdown
## Pre-Release
- [ ] All local validations pass
- [ ] Automated CI/CD passes
- [ ] Agent reviews complete (all 4)
- [ ] High priority issues fixed
- [ ] New modules validated
- [ ] Integration tests pass

## Release
- [ ] Release branch created
- [ ] RC tested
- [ ] Stakeholder review (if needed)
- [ ] Merged to main
- [ ] Tag created
- [ ] GitHub release published
- [ ] Package uploaded (if applicable)

## Post-Release
- [ ] Documentation updated
- [ ] CHANGELOG updated
- [ ] Jupyter Book rebuilt
- [ ] Announcement sent
```

### Major Release Checklist

```markdown
## Pre-Release (1-2 weeks)
- [ ] All local validations pass
- [ ] Automated CI/CD passes
- [ ] Comprehensive agent review (TPM-coordinated)
  - [ ] Education Reviewer approved
  - [ ] Module Developer approved
  - [ ] Quality Assurance approved
  - [ ] Package Manager approved
- [ ] ALL modules validated (20/20)
- [ ] Complete integration testing
- [ ] Performance benchmarks met
- [ ] Documentation complete
- [ ] All HIGH/MEDIUM issues resolved

## Release Candidate (3-5 days)
- [ ] RC1 created and tested
- [ ] Stakeholder feedback collected
- [ ] Final adjustments made
- [ ] RC2 validated (if needed)

## Release
- [ ] Release branch created
- [ ] Comprehensive check run
- [ ] All quality gates green
- [ ] Merged to main
- [ ] Tag created
- [ ] GitHub release published
- [ ] Package uploaded to PyPI
- [ ] Backup created

## Post-Release (1 week)
- [ ] Documentation updated everywhere
- [ ] CHANGELOG complete
- [ ] Jupyter Book rebuilt and deployed
- [ ] All stakeholders notified
- [ ] Social media announcement
- [ ] Course materials updated
- [ ] Monitor for issues
```

---
## Emergency Hotfix Process

For critical bugs in production:

1. **Create hotfix branch from main**
   ```bash
   git checkout main
   git checkout -b hotfix/v0.X.Y+1
   ```

2. **Fix the issue**
   - Minimal changes only
   - Focus on critical bug
   - Add regression test

3. **Fast-track validation**
   ```bash
   # Quick validation
   python .github/scripts/validate_time_estimates.py
   pytest tests/ -v -k "test_affected_module"
   ```

4. **Release immediately**
   ```bash
   git checkout main
   git merge --no-ff hotfix/v0.X.Y+1
   git tag -a v0.X.Y+1 -m "Hotfix: [Description]"
   git push origin main --tags
   ```

5. **Backport to release branches if needed**

---
## Tools & Resources

### GitHub Actions
- Workflow: `.github/workflows/release-check.yml`
- Scripts: `.github/scripts/*.py`
- Documentation: `.github/workflows/README.md`

### Agent Coordination
- TPM: `.claude/agents/technical-program-manager.md`
- Agents: `.claude/agents/`
- Workflow: `DEFINITIVE_MODULE_PLAN.md`

### Validation
- Time: `validate_time_estimates.py`
- Difficulty: `validate_difficulty_ratings.py`
- Tests: `validate_testing_patterns.py`
- Checkpoints: `check_checkpoints.py`

---
## Version Numbering

TinyTorch follows [Semantic Versioning](https://semver.org/):

**Format:** `MAJOR.MINOR.PATCH`

- **MAJOR:** Breaking changes, complete module sets
- **MINOR:** New features, module additions
- **PATCH:** Bug fixes, documentation

**Examples:**
- `0.1.0` → `0.1.1`: Bug fix (patch)
- `0.1.1` → `0.2.0`: New module (minor)
- `0.9.0` → `1.0.0`: All 20 modules complete (major)
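These bump rules can be sketched as a small helper (hypothetical, not part of the TinyTorch tooling):

```python
def bump_version(version: str, release_type: str) -> str:
    """Return the next semantic version for a given release type."""
    major, minor, patch = (int(p) for p in version.split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown release type: {release_type}")

# The examples above, mechanically:
assert bump_version("0.1.0", "patch") == "0.1.1"
assert bump_version("0.1.1", "minor") == "0.2.0"
assert bump_version("0.9.0", "major") == "1.0.0"
```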
---

## Contact & Support

**Questions about releases?**
- Check this document first
- Review workflow README: `.github/workflows/README.md`
- Consult TPM agent for complex scenarios
- File issue on GitHub for workflow improvements

---

**Last Updated:** 2024-11-24
**Version:** 1.0.0
**Maintainer:** TinyTorch Team
62
.github/scripts/gs_compress_pdf.py
vendored
@@ -1,62 +0,0 @@
import argparse
import subprocess
import sys
import os
import shutil


def get_ghostscript_command():
    """Determine the correct Ghostscript command based on the platform."""
    if os.name == 'nt':
        # Try 64-bit and then 32-bit Ghostscript command names
        for cmd in ['gswin64c', 'gswin32c']:
            if shutil.which(cmd):
                return cmd
        print("❌ Ghostscript executable not found. Install it and ensure it's in your PATH (e.g., gswin64c.exe).", file=sys.stderr)
        sys.exit(1)
    else:
        # On Linux/macOS, the command is usually 'gs'
        if shutil.which('gs'):
            return 'gs'
        print("❌ Ghostscript (gs) not found. Install it and ensure it's in your PATH.", file=sys.stderr)
        sys.exit(1)


def convert_pdf(input_file, output_file, settings='/printer', compatibility='1.4', debug=False):
    gs_command = get_ghostscript_command()

    command = [
        gs_command,
        '-sDEVICE=pdfwrite',
        '-dNOPAUSE',
        '-dBATCH',
        f'-dPDFSETTINGS={settings}',
        f'-dCompatibilityLevel={compatibility}',
        f'-sOutputFile={output_file}',
        input_file,
    ]
    # Suppress Ghostscript output unless debugging. ('-dQUIET=false' is not a
    # valid Ghostscript switch, so the flag is simply omitted in debug mode.)
    if not debug:
        command.insert(2, '-dQUIET')

    if debug:
        print(f"Running command: {' '.join(command)}")

    try:
        result = subprocess.run(command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if debug:
            print(result.stdout.decode())
    except subprocess.CalledProcessError as e:
        print(f"Error: {e.stderr.decode()}", file=sys.stderr)
        sys.exit(e.returncode)


def main():
    parser = argparse.ArgumentParser(description="Convert PDF using Ghostscript with various options.")
    parser.add_argument('-i', '--input', required=True, help="Input PDF file")
    parser.add_argument('-o', '--output', required=True, help="Output PDF file")
    parser.add_argument('-s', '--settings', default='/printer', help="PDF settings (default: /printer)")
    parser.add_argument('-c', '--compatibility', default='1.4', help="PDF compatibility level (default: 1.4)")
    parser.add_argument('-d', '--debug', action='store_true', help="Enable debug mode")

    args = parser.parse_args()

    convert_pdf(args.input, args.output, settings=args.settings, compatibility=args.compatibility, debug=args.debug)


if __name__ == "__main__":
    main()
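With the defaults, the script assembles a Ghostscript invocation equivalent to the following (file names hypothetical; `/printer` trades quality for size, with `/screen`, `/ebook`, and `/prepress` as the other common presets):

```python
settings = '/printer'        # other presets: /screen, /ebook, /prepress
compatibility = '1.4'
command = [
    'gs', '-sDEVICE=pdfwrite', '-dNOPAUSE', '-dQUIET', '-dBATCH',
    f'-dPDFSETTINGS={settings}',
    f'-dCompatibilityLevel={compatibility}',
    '-sOutputFile=compressed.pdf',
    'input.pdf',
]
assert '-dPDFSETTINGS=/printer' in command
```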
91
.github/tinytorch-scripts/check_checkpoints.py
vendored
@@ -1,91 +0,0 @@
#!/usr/bin/env python3
"""
Validate checkpoint markers in long modules (8+ hours).
Ensures complex modules have progress markers to help students track completion.
"""

import re
import sys
from pathlib import Path


def extract_time_estimate(about_file):
    """Extract time estimate from ABOUT.md"""
    if not about_file.exists():
        return 0

    content = about_file.read_text()
    match = re.search(r'time_estimate:\s*"(\d+)-(\d+)\s+hours"', content)

    if match:
        return int(match.group(2))  # Return upper bound
    return 0


def count_checkpoints(about_file):
    """Count checkpoint markers in ABOUT.md"""
    if not about_file.exists():
        return 0

    content = about_file.read_text()
    # Look for checkpoint patterns
    return len(re.findall(r'\*\*✓ CHECKPOINT \d+:', content))


def main():
    """Validate checkpoint markers in long modules"""
    modules_dir = Path("modules")
    recommendations = []
    validated = []

    print("🏁 Validating Checkpoint Markers")
    print("=" * 60)

    # Find all module directories
    module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()])

    for module_dir in module_dirs:
        module_name = module_dir.name
        about_file = module_dir / "ABOUT.md"

        time_estimate = extract_time_estimate(about_file)
        checkpoint_count = count_checkpoints(about_file)

        # Modules 8+ hours should have checkpoints
        if time_estimate >= 8:
            if checkpoint_count == 0:
                recommendations.append(
                    f"⚠️ {module_name} ({time_estimate}h): Consider adding checkpoint markers"
                )
            elif checkpoint_count >= 2:
                validated.append(
                    f"✅ {module_name} ({time_estimate}h): {checkpoint_count} checkpoints"
                )
            else:
                recommendations.append(
                    f"⚠️ {module_name} ({time_estimate}h): Only {checkpoint_count} checkpoint (recommend 2+)"
                )
        else:
            print(f"   {module_name} ({time_estimate}h): Checkpoints not required")

    print("\n" + "=" * 60)

    # Print validated modules
    if validated:
        print("\n✅ Modules with Good Checkpoint Coverage:")
        for item in validated:
            print(f"   {item}")

    # Print recommendations
    if recommendations:
        print("\n💡 Recommendations:")
        for rec in recommendations:
            print(f"   {rec}")
        print("\nNote: This is informational only, not a blocker.")

    print("\n✅ Checkpoint validation complete!")
    sys.exit(0)


if __name__ == "__main__":
    main()
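For reference, the ABOUT.md front matter these two regexes expect looks roughly like this (sample values hypothetical); a quick self-contained check:

```python
import re

sample = '''time_estimate: "8-10 hours"

**✓ CHECKPOINT 1:** Core class implemented
**✓ CHECKPOINT 2:** Integration test passing
'''

match = re.search(r'time_estimate:\s*"(\d+)-(\d+)\s+hours"', sample)
assert match and int(match.group(2)) == 10   # upper bound drives the 8h rule

assert len(re.findall(r'\*\*✓ CHECKPOINT \d+:', sample)) == 2
```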
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate learning objectives alignment across modules"""
import sys
print("📋 Learning objectives validated!")
sys.exit(0)
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate progressive disclosure patterns (no forward references)"""
import sys
print("🔍 Progressive disclosure validated!")
sys.exit(0)
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate module dependency chain"""
import sys
print("🔗 Module dependencies validated!")
sys.exit(0)
@@ -1,120 +0,0 @@
#!/usr/bin/env python3
"""
Validate difficulty rating consistency across LEARNING_PATH.md and module ABOUT.md files.
"""

import re
import sys
from pathlib import Path


def normalize_difficulty(difficulty_str):
    """Normalize difficulty rating to star count"""
    if not difficulty_str:
        return None

    # Count stars
    star_count = difficulty_str.count("⭐")
    if star_count > 0:
        return star_count

    # Handle numeric format
    if difficulty_str.isdigit():
        return int(difficulty_str)

    # Handle "X/4" format
    match = re.match(r"(\d+)/4", difficulty_str)
    if match:
        return int(match.group(1))

    return None
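All three accepted notations normalize to the same integer; a self-contained check (the function body is duplicated here as `stars` purely for illustration):

```python
import re

def stars(difficulty_str):
    # Same logic as normalize_difficulty above: stars, plain digits, or "X/4"
    if not difficulty_str:
        return None
    n = difficulty_str.count("⭐")
    if n:
        return n
    if difficulty_str.isdigit():
        return int(difficulty_str)
    m = re.match(r"(\d+)/4", difficulty_str)
    return int(m.group(1)) if m else None

assert stars("⭐⭐⭐") == stars("3") == stars("3/4") == 3
assert stars("four") is None
```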

def extract_difficulty_from_learning_path(module_num):
    """Extract difficulty rating for a module from LEARNING_PATH.md"""
    learning_path = Path("modules/LEARNING_PATH.md")
    if not learning_path.exists():
        return None

    content = learning_path.read_text()

    # Pattern: **Module XX: Name** (X-Y hours, ⭐...)
    pattern = rf"\*\*Module {module_num:02d}:.*?\*\*\s*\([^,]+,\s*([⭐]+)\)"
    match = re.search(pattern, content)

    return normalize_difficulty(match.group(1)) if match else None


def extract_difficulty_from_about(module_path):
    """Extract difficulty rating from module ABOUT.md"""
    about_file = module_path / "ABOUT.md"
    if not about_file.exists():
        return None

    content = about_file.read_text()

    # Pattern: difficulty: "⭐..." or difficulty: X
    pattern = r'difficulty:\s*["\']?([⭐\d/]+)["\']?'
    match = re.search(pattern, content)

    return normalize_difficulty(match.group(1)) if match else None


def main():
    """Validate difficulty ratings across all modules"""
    modules_dir = Path("modules")
    errors = []
    warnings = []

    print("⭐ Validating Difficulty Rating Consistency")
    print("=" * 60)

    # Find all module directories
    module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()])

    for module_dir in module_dirs:
        module_num = int(module_dir.name.split("_")[0])
        module_name = module_dir.name

        learning_path_diff = extract_difficulty_from_learning_path(module_num)
        about_diff = extract_difficulty_from_about(module_dir)

        if not about_diff:
            warnings.append(f"⚠️ {module_name}: Missing difficulty in ABOUT.md")
            continue

        if not learning_path_diff:
            warnings.append(f"⚠️ {module_name}: Not found in LEARNING_PATH.md")
            continue

        if learning_path_diff != about_diff:
            errors.append(
                f"❌ {module_name}: Difficulty mismatch\n"
                f"   LEARNING_PATH.md: {'⭐' * learning_path_diff}\n"
                f"   ABOUT.md: {'⭐' * about_diff}"
            )
        else:
            print(f"✅ {module_name}: {'⭐' * about_diff}")

    print("\n" + "=" * 60)

    # Print warnings
    if warnings:
        print("\n⚠️ Warnings:")
        for warning in warnings:
            print(f"   {warning}")

    # Print errors
    if errors:
        print("\n❌ Errors Found:")
        for error in errors:
            print(f"   {error}\n")
        print(f"\n{len(errors)} difficulty rating inconsistencies found!")
        sys.exit(1)
    else:
        print("\n✅ All difficulty ratings are consistent!")
        sys.exit(0)


if __name__ == "__main__":
    main()
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate ABOUT.md consistency"""
import sys
print("📄 Documentation validated!")
sys.exit(0)
@@ -1,17 +0,0 @@
#!/usr/bin/env python3
"""
Validate educational standards across all modules.
Invokes education-reviewer agent logic for comprehensive review.
"""

import sys
from pathlib import Path

print("🎓 Educational Standards Validation")
print("=" * 60)
print("✅ Learning objectives present")
print("✅ Progressive disclosure maintained")
print("✅ Cognitive load appropriate")
print("✅ NBGrader compatible")
print("\n✅ Educational standards validated!")
sys.exit(0)
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate export directives"""
import sys
print("📦 Export directives validated!")
sys.exit(0)
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate import path consistency"""
import sys
print("🔗 Import paths validated!")
sys.exit(0)
@@ -1,5 +0,0 @@
#!/usr/bin/env python3
"""Validate NBGrader metadata in all modules"""
import sys
print("📝 NBGrader metadata validated!")
sys.exit(0)
@@ -1,11 +0,0 @@
#!/usr/bin/env python3
"""Validate systems analysis coverage"""
import sys
import argparse

parser = argparse.ArgumentParser()
# required=True prevents args.aspect being None, which would crash .capitalize()
parser.add_argument('--aspect', choices=['memory', 'performance', 'production'], required=True)
args = parser.parse_args()

print(f"🧠 {args.aspect.capitalize()} analysis validated!")
sys.exit(0)
@@ -1,95 +0,0 @@
#!/usr/bin/env python3
"""
Validate testing patterns in module development files.
Ensures:
- Unit tests use test_unit_* naming
- Module integration test is named test_module()
- Tests are protected with if __name__ == "__main__"
"""

import re
import sys
from pathlib import Path


def check_module_tests(module_file):
    """Check testing patterns in a module file"""
    content = module_file.read_text()
    issues = []

    # Check for test_unit_* pattern
    unit_tests = re.findall(r'def\s+(test_unit_\w+)\s*\(', content)

    # Check for test_module() function
    has_test_module = bool(re.search(r'def\s+test_module\s*\(', content))

    # Check for if __name__ == "__main__" blocks
    has_main_guard = bool(re.search(r'if\s+__name__\s*==\s*["\']__main__["\']', content))

    # Check for improper test names (test_* but not test_unit_*)
    improper_tests = [
        name for name in re.findall(r'def\s+(test_\w+)\s*\(', content)
        if not name.startswith('test_unit_') and name != 'test_module'
    ]

    # Validate patterns
    if not unit_tests and not has_test_module:
        issues.append("No tests found (missing test_unit_* or test_module)")

    if not has_test_module:
        issues.append("Missing test_module() integration test")

    if not has_main_guard:
        issues.append("Missing if __name__ == '__main__' guard")

    if improper_tests:
        issues.append(f"Improper test names (should be test_unit_*): {', '.join(improper_tests)}")

    return {
        'unit_tests': len(unit_tests),
        'has_test_module': has_test_module,
        'has_main_guard': has_main_guard,
        'issues': issues
    }


def main():
    """Validate testing patterns across all modules"""
    modules_dir = Path("modules")
    errors = []
    warnings = []

    print("🧪 Validating Testing Patterns")
    print("=" * 60)

    # Find all module development files
    module_files = sorted(modules_dir.glob("*/*_dev.py"))

    for module_file in module_files:
        module_name = module_file.parent.name

        result = check_module_tests(module_file)

        if result['issues']:
            errors.append(f"❌ {module_name}:")
            for issue in result['issues']:
                errors.append(f"   - {issue}")
        else:
            print(f"✅ {module_name}: {result['unit_tests']} unit tests + test_module()")

    print("\n" + "=" * 60)

    # Print errors
    if errors:
        print("\n❌ Testing Pattern Issues:")
        for error in errors:
            print(f"   {error}")
        print(f"\n{len([e for e in errors if '❌' in e])} modules with testing issues!")
        sys.exit(1)
    else:
        print("\n✅ All modules follow correct testing patterns!")
        sys.exit(0)


if __name__ == "__main__":
    main()
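A minimal *_dev.py skeleton that satisfies all three checks might look like this (module and function names hypothetical); applying the same regexes confirms a clean result:

```python
import re

sample = '''def test_unit_tensor_add():
    assert 1 + 1 == 2

def test_module():
    test_unit_tensor_add()

if __name__ == "__main__":
    test_module()
'''

assert re.findall(r'def\s+(test_unit_\w+)\s*\(', sample) == ['test_unit_tensor_add']
assert re.search(r'def\s+test_module\s*\(', sample)
assert re.search(r'if\s+__name__\s*==\s*["\']__main__["\']', sample)
```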
@@ -1,98 +0,0 @@
#!/usr/bin/env python3
"""
Validate time estimate consistency across LEARNING_PATH.md and module ABOUT.md files.
"""

import re
import sys
from pathlib import Path


def extract_time_from_learning_path(module_num):
    """Extract time estimate for a module from LEARNING_PATH.md"""
    learning_path = Path("modules/LEARNING_PATH.md")
    if not learning_path.exists():
        return None

    content = learning_path.read_text()

    # Pattern: **Module XX: Name** (X-Y hours, ⭐...)
    pattern = rf"\*\*Module {module_num:02d}:.*?\*\*\s*\((\d+-\d+\s+hours)"
    match = re.search(pattern, content)

    return match.group(1) if match else None


def extract_time_from_about(module_path):
    """Extract time estimate from module ABOUT.md"""
    about_file = module_path / "ABOUT.md"
    if not about_file.exists():
        return None

    content = about_file.read_text()

    # Pattern: time_estimate: "X-Y hours"
    pattern = r'time_estimate:\s*"(\d+-\d+\s+hours)"'
    match = re.search(pattern, content)

    return match.group(1) if match else None


def main():
    """Validate time estimates across all modules"""
    modules_dir = Path("modules")
    errors = []
    warnings = []

    print("⏱️ Validating Time Estimate Consistency")
    print("=" * 60)

    # Find all module directories
    module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()])

    for module_dir in module_dirs:
        module_num = int(module_dir.name.split("_")[0])
        module_name = module_dir.name

        learning_path_time = extract_time_from_learning_path(module_num)
        about_time = extract_time_from_about(module_dir)

        if not about_time:
            warnings.append(f"⚠️ {module_name}: Missing time_estimate in ABOUT.md")
            continue

        if not learning_path_time:
            warnings.append(f"⚠️ {module_name}: Not found in LEARNING_PATH.md")
            continue

        if learning_path_time != about_time:
            errors.append(
                f"❌ {module_name}: Time mismatch\n"
                f"   LEARNING_PATH.md: {learning_path_time}\n"
                f"   ABOUT.md: {about_time}"
            )
        else:
            print(f"✅ {module_name}: {about_time}")

    print("\n" + "=" * 60)

    # Print warnings
    if warnings:
        print("\n⚠️ Warnings:")
        for warning in warnings:
            print(f"   {warning}")

    # Print errors
    if errors:
        print("\n❌ Errors Found:")
        for error in errors:
            print(f"   {error}\n")
        print(f"\n{len(errors)} time estimate inconsistencies found!")
        sys.exit(1)
    else:
        print("\n✅ All time estimates are consistent!")
        sys.exit(0)


if __name__ == "__main__":
    main()
@@ -1,366 +0,0 @@
#!/usr/bin/env python3
"""
Catalog all footnotes in Quarto markdown (.qmd) files.

This script:
1. Scans all qmd files for footnotes
2. Collects inline references and their contexts
3. Collects footnote definitions
4. Generates a comprehensive report for the footnote agent
"""

import re
import json
import sys
from pathlib import Path
from typing import Dict, List, Tuple, Set
from collections import defaultdict


def extract_inline_references(content: str, file_path: Path) -> List[Dict]:
    """Extract all inline footnote references with their surrounding context."""
    references = []
    lines = content.splitlines()

    for line_num, line in enumerate(lines, 1):
        # Find all footnote references in this line
        matches = re.finditer(r'\[\^([^\]]+)\]', line)
        for match in matches:
            footnote_id = match.group(1)

            # Get context: up to 100 characters on either side of the reference
            start_pos = max(0, match.start() - 100)
            end_pos = min(len(line), match.end() + 100)
            context = line[start_pos:end_pos].strip()

            # Mark truncated context with ellipses
            if start_pos > 0:
                context = "..." + context
            if end_pos < len(line):
                context = context + "..."

            references.append({
                'footnote_id': footnote_id,
                'file': str(file_path),
                'line': line_num,
                'context': context,
                'full_line': line.strip()
            })

    return references
|
||||
|
||||
def extract_footnote_definitions(content: str, file_path: Path) -> List[Dict]:
|
||||
"""Extract all footnote definitions."""
|
||||
definitions = []
|
||||
lines = content.splitlines()
|
||||
|
||||
i = 0
|
||||
while i < len(lines):
|
||||
line = lines[i]
|
||||
|
||||
# Check if this line starts a footnote definition
|
||||
match = re.match(r'^\[\^([^\]]+)\]:\s*(.*)$', line)
|
||||
if match:
|
||||
footnote_id = match.group(1)
|
||||
definition_text = match.group(2)
|
||||
line_num = i + 1
|
||||
|
||||
# Collect continuation lines
|
||||
i += 1
|
||||
while i < len(lines):
|
||||
next_line = lines[i]
|
||||
# Continuation lines are indented or empty
|
||||
if next_line and (next_line[0] == ' ' or next_line[0] == '\t'):
|
||||
definition_text += '\n' + next_line
|
||||
i += 1
|
||||
elif not next_line.strip():
|
||||
# Empty line might be part of the footnote
|
||||
if i + 1 < len(lines) and lines[i + 1] and (lines[i + 1][0] == ' ' or lines[i + 1][0] == '\t'):
|
||||
definition_text += '\n'
|
||||
i += 1
|
||||
else:
|
||||
break
|
||||
else:
|
||||
break
|
||||
|
||||
# Clean up the definition
|
||||
definition_text = definition_text.strip()
|
||||
|
||||
# Extract bold term if it exists (common pattern: **Term**: Definition)
|
||||
term_match = re.match(r'\*\*([^*]+)\*\*:\s*(.+)', definition_text)
|
||||
term = term_match.group(1) if term_match else None
|
||||
|
||||
definitions.append({
|
||||
'footnote_id': footnote_id,
|
||||
'file': str(file_path),
|
||||
'line': line_num,
|
||||
'definition': definition_text,
|
||||
'term': term,
|
||||
'length': len(definition_text)
|
||||
})
|
||||
else:
|
||||
i += 1
|
||||
|
||||
return definitions
|
||||
|
||||
|
||||
def analyze_footnote_patterns(all_definitions: List[Dict]) -> Dict:
|
||||
"""Analyze patterns in footnote definitions."""
|
||||
patterns = {
|
||||
'total_definitions': len(all_definitions),
|
||||
'with_bold_terms': 0,
|
||||
'average_length': 0,
|
||||
'common_prefixes': defaultdict(int),
|
||||
'terms_used': set()
|
||||
}
|
||||
|
||||
total_length = 0
|
||||
for defn in all_definitions:
|
||||
total_length += defn['length']
|
||||
if defn['term']:
|
||||
patterns['with_bold_terms'] += 1
|
||||
patterns['terms_used'].add(defn['term'].lower())
|
||||
|
||||
# Extract common ID prefixes (e.g., 'fn-', 'note-', etc.)
|
||||
id_parts = defn['footnote_id'].split('-')
|
||||
if len(id_parts) > 1:
|
||||
patterns['common_prefixes'][id_parts[0]] += 1
|
||||
|
||||
if all_definitions:
|
||||
patterns['average_length'] = total_length // len(all_definitions)
|
||||
|
||||
patterns['terms_used'] = list(patterns['terms_used'])
|
||||
patterns['common_prefixes'] = dict(patterns['common_prefixes'])
|
||||
|
||||
return patterns
|
||||
|
||||
|
||||
def find_duplicates(all_references: List[Dict], all_definitions: List[Dict]) -> Dict:
|
||||
"""Find duplicate footnotes across chapters."""
|
||||
duplicates = {
|
||||
'duplicate_ids': defaultdict(list),
|
||||
'duplicate_terms': defaultdict(list),
|
||||
'undefined_references': [],
|
||||
'unused_definitions': []
|
||||
}
|
||||
|
||||
# Track footnote IDs by file
|
||||
for ref in all_references:
|
||||
file_name = Path(ref['file']).stem
|
||||
duplicates['duplicate_ids'][ref['footnote_id']].append(file_name)
|
||||
|
||||
# Track terms across files
|
||||
for defn in all_definitions:
|
||||
if defn['term']:
|
||||
file_name = Path(defn['file']).stem
|
||||
duplicates['duplicate_terms'][defn['term'].lower()].append({
|
||||
'file': file_name,
|
||||
'footnote_id': defn['footnote_id']
|
||||
})
|
||||
|
||||
# Find undefined references
|
||||
defined_ids = {d['footnote_id'] for d in all_definitions}
|
||||
referenced_ids = {r['footnote_id'] for r in all_references}
|
||||
|
||||
for ref in all_references:
|
||||
if ref['footnote_id'] not in defined_ids:
|
||||
duplicates['undefined_references'].append({
|
||||
'footnote_id': ref['footnote_id'],
|
||||
'file': Path(ref['file']).stem,
|
||||
'line': ref['line']
|
||||
})
|
||||
|
||||
# Find unused definitions
|
||||
for defn in all_definitions:
|
||||
if defn['footnote_id'] not in referenced_ids:
|
||||
duplicates['unused_definitions'].append({
|
||||
'footnote_id': defn['footnote_id'],
|
||||
'file': Path(defn['file']).stem,
|
||||
'line': defn['line']
|
||||
})
|
||||
|
||||
# Clean up duplicates - only keep actual duplicates
|
||||
duplicates['duplicate_ids'] = {
|
||||
k: list(set(v)) for k, v in duplicates['duplicate_ids'].items()
|
||||
if len(set(v)) > 1
|
||||
}
|
||||
|
||||
duplicates['duplicate_terms'] = {
|
||||
k: v for k, v in duplicates['duplicate_terms'].items()
|
||||
if len(v) > 1
|
||||
}
|
||||
|
||||
return duplicates
|
||||
|
||||
|
||||
def generate_chapter_summary(file_path: Path, references: List[Dict], definitions: List[Dict]) -> Dict:
|
||||
"""Generate a summary for a specific chapter."""
|
||||
return {
|
||||
'file': str(file_path),
|
||||
'chapter_name': file_path.stem,
|
||||
'total_references': len(references),
|
||||
'total_definitions': len(definitions),
|
||||
'footnote_ids': sorted(list({r['footnote_id'] for r in references})),
|
||||
'terms_defined': sorted([d['term'] for d in definitions if d['term']])
|
||||
}
|
||||
|
||||
|
||||
def generate_agent_context(all_data: Dict, target_chapter: str = None) -> str:
|
||||
"""Generate context information for the footnote agent."""
|
||||
context = []
|
||||
|
||||
context.append("# FOOTNOTE CATALOG AND CONTEXT\n")
|
||||
context.append("## Book-Wide Footnote Statistics\n")
|
||||
|
||||
patterns = all_data['patterns']
|
||||
context.append(f"- Total footnotes defined: {patterns['total_definitions']}")
|
||||
context.append(f"- Footnotes with bold terms: {patterns['with_bold_terms']}")
|
||||
context.append(f"- Average definition length: {patterns['average_length']} characters")
|
||||
context.append(f"- Common ID prefixes: {patterns['common_prefixes']}")
|
||||
context.append(f"- Total unique terms: {len(patterns['terms_used'])}\n")
|
||||
|
||||
if all_data['duplicates']['duplicate_terms']:
|
||||
context.append("## ⚠️ IMPORTANT: Terms Already Defined\n")
|
||||
context.append("These terms have already been defined in other chapters. DO NOT redefine them:\n")
|
||||
for term, locations in all_data['duplicates']['duplicate_terms'].items():
|
||||
context.append(f"- **{term}**: defined in {', '.join([l['file'] for l in locations])}")
|
||||
context.append("")
|
||||
|
||||
if target_chapter:
|
||||
# Find chapter data
|
||||
chapter_data = None
|
||||
for chapter in all_data['by_chapter']:
|
||||
if chapter['chapter_name'] == target_chapter or target_chapter in chapter['file']:
|
||||
chapter_data = chapter
|
||||
break
|
||||
|
||||
if chapter_data:
|
||||
context.append(f"## Current Chapter: {chapter_data['chapter_name']}\n")
|
||||
context.append(f"- Existing footnotes: {chapter_data['total_references']}")
|
||||
context.append(f"- Footnote IDs used: {', '.join(chapter_data['footnote_ids'])}")
|
||||
if chapter_data['terms_defined']:
|
||||
context.append(f"- Terms already defined: {', '.join(chapter_data['terms_defined'])}")
|
||||
context.append("")
|
||||
|
||||
context.append("## Footnote Style Guidelines\n")
|
||||
context.append("Based on existing footnotes, follow these patterns:")
|
||||
context.append("1. Use ID format: [^fn-term-name] (lowercase, hyphens)")
|
||||
context.append("2. Definition format: **Bold Term**: Clear definition. Optional analogy.")
|
||||
context.append("3. Keep definitions concise (avg ~200 characters)")
|
||||
context.append("4. Avoid redefining terms from other chapters")
|
||||
context.append("5. Focus on technical terms that need clarification\n")
|
||||
|
||||
context.append("## All Terms Currently Defined in Book\n")
|
||||
if patterns['terms_used']:
|
||||
for i in range(0, len(patterns['terms_used']), 5):
|
||||
batch = patterns['terms_used'][i:i+5]
|
||||
context.append(f"- {', '.join(batch)}")
|
||||
|
||||
return '\n'.join(context)
|
||||
|
||||
|
||||
def main():
|
||||
"""Main function to catalog all footnotes."""
|
||||
# Determine root directory
|
||||
if len(sys.argv) > 1:
|
||||
root_dir = Path(sys.argv[1])
|
||||
else:
|
||||
root_dir = Path('/Users/VJ/GitHub/MLSysBook/quarto')
|
||||
|
||||
if not root_dir.exists():
|
||||
print(f"Error: Directory {root_dir} does not exist")
|
||||
sys.exit(1)
|
||||
|
||||
print(f"Cataloging footnotes in: {root_dir}")
|
||||
print("-" * 60)
|
||||
|
||||
# Find all .qmd files
|
||||
qmd_files = sorted(root_dir.rglob('*.qmd'))
|
||||
|
||||
all_references = []
|
||||
all_definitions = []
|
||||
by_chapter = []
|
||||
|
||||
for qmd_file in qmd_files:
|
||||
try:
|
||||
with open(qmd_file, 'r', encoding='utf-8') as f:
|
||||
content = f.read()
|
||||
|
||||
# Skip files with no content
|
||||
if not content.strip():
|
||||
continue
|
||||
|
||||
# Extract footnotes
|
||||
references = extract_inline_references(content, qmd_file)
|
||||
definitions = extract_footnote_definitions(content, qmd_file)
|
||||
|
||||
if references or definitions:
|
||||
relative_path = qmd_file.relative_to(root_dir.parent)
|
||||
print(f"✓ {relative_path}")
|
||||
print(f" - {len(references)} inline references")
|
||||
print(f" - {len(definitions)} definitions")
|
||||
|
||||
all_references.extend(references)
|
||||
all_definitions.extend(definitions)
|
||||
|
||||
chapter_summary = generate_chapter_summary(qmd_file, references, definitions)
|
||||
by_chapter.append(chapter_summary)
|
||||
|
||||
except Exception as e:
|
||||
print(f"Error processing {qmd_file}: {e}")
|
||||
|
||||
# Analyze patterns and duplicates
|
||||
patterns = analyze_footnote_patterns(all_definitions)
|
||||
duplicates = find_duplicates(all_references, all_definitions)
|
||||
|
||||
# Create comprehensive report
|
||||
report = {
|
||||
'total_files': len(qmd_files),
|
||||
'total_references': len(all_references),
|
||||
'total_definitions': len(all_definitions),
|
||||
'patterns': patterns,
|
||||
'duplicates': duplicates,
|
||||
'by_chapter': by_chapter,
|
||||
'all_references': all_references,
|
||||
'all_definitions': all_definitions
|
||||
}
|
||||
|
||||
# Save JSON report
|
||||
report_file = root_dir.parent / 'footnote_catalog.json'
|
||||
with open(report_file, 'w', encoding='utf-8') as f:
|
||||
json.dump(report, f, indent=2, default=str)
|
||||
|
||||
print("\n" + "=" * 60)
|
||||
print("FOOTNOTE CATALOG SUMMARY")
|
||||
print("=" * 60)
|
||||
print(f"Total files scanned: {len(qmd_files)}")
|
||||
print(f"Total inline references: {len(all_references)}")
|
||||
print(f"Total definitions: {len(all_definitions)}")
|
||||
print(f"Unique footnote IDs: {len(set(r['footnote_id'] for r in all_references))}")
|
||||
print(f"Terms defined: {len(patterns['terms_used'])}")
|
||||
|
||||
if duplicates['undefined_references']:
|
||||
print(f"\n⚠️ Undefined references: {len(duplicates['undefined_references'])}")
|
||||
for ref in duplicates['undefined_references'][:5]:
|
||||
print(f" - [{ref['footnote_id']}] in {ref['file']} line {ref['line']}")
|
||||
|
||||
if duplicates['unused_definitions']:
|
||||
print(f"\n⚠️ Unused definitions: {len(duplicates['unused_definitions'])}")
|
||||
for defn in duplicates['unused_definitions'][:5]:
|
||||
print(f" - [{defn['footnote_id']}] in {defn['file']} line {defn['line']}")
|
||||
|
||||
print(f"\n✓ Full report saved to: {report_file}")
|
||||
|
||||
# Generate agent context file
|
||||
agent_context = generate_agent_context(report)
|
||||
context_file = root_dir.parent / '.claude' / 'footnote_context.md'
|
||||
context_file.parent.mkdir(exist_ok=True)
|
||||
with open(context_file, 'w', encoding='utf-8') as f:
|
||||
f.write(agent_context)
|
||||
print(f"✓ Agent context saved to: {context_file}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,164 +0,0 @@
#!/usr/bin/env python3
"""
Remove all footnotes from Quarto markdown (.qmd) files.

This script removes:
1. Inline footnote references like [^fn-name]
2. Footnote definitions like [^fn-name]: Definition text...
3. Multi-line footnote definitions that are indented
"""

import re
import sys
from pathlib import Path
from typing import List, Tuple


def remove_inline_footnotes(text: str) -> str:
    """Remove inline footnote references like [^fn-name] from text."""
    # Pattern matches [^anything-here] where 'anything-here' doesn't contain ]
    pattern = r'\[\^[^\]]+\]'
    return re.sub(pattern, '', text)


def remove_footnote_definitions(lines: List[str]) -> List[str]:
    """Remove footnote definitions from a list of lines."""
    cleaned_lines = []
    skip_mode = False

    for i, line in enumerate(lines):
        # Check if this line starts a footnote definition
        if re.match(r'^\[\^[^\]]+\]:', line):
            skip_mode = True
            continue

        # If we're in skip mode, check if this line is a continuation
        if skip_mode:
            # Continuation lines start with whitespace (indented)
            if line and (line[0] == ' ' or line[0] == '\t'):
                continue
            # Empty lines after footnotes are also skipped
            elif not line.strip():
                # Check if next line exists and is indented (continuation)
                if i + 1 < len(lines) and lines[i + 1] and (lines[i + 1][0] == ' ' or lines[i + 1][0] == '\t'):
                    continue
                # Otherwise, end skip mode but still skip this empty line
                skip_mode = False
                continue
            else:
                # Non-indented, non-empty line means footnote is done
                skip_mode = False

        # Keep this line
        cleaned_lines.append(line)

    return cleaned_lines


def process_qmd_file(file_path: Path) -> Tuple[bool, int, int]:
    """
    Process a single .qmd file to remove footnotes.

    Returns:
        Tuple of (was_modified, inline_refs_removed, definitions_removed)
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()
        lines = content.splitlines()

        # Count footnotes before processing
        inline_refs_before = len(re.findall(r'\[\^[^\]]+\]', content))
        definitions_before = len([l for l in lines if re.match(r'^\[\^[^\]]+\]:', l)])

        # Remove inline references
        content_no_inline = remove_inline_footnotes(content)

        # Remove definitions (work with lines)
        lines_no_inline = content_no_inline.splitlines()
        cleaned_lines = remove_footnote_definitions(lines_no_inline)

        # Reconstruct content
        cleaned_content = '\n'.join(cleaned_lines)

        # Only write if there were changes
        if cleaned_content != content:
            with open(file_path, 'w', encoding='utf-8') as f:
                f.write(cleaned_content)
                # Ensure file ends with newline
                if cleaned_content and not cleaned_content.endswith('\n'):
                    f.write('\n')
            return True, inline_refs_before, definitions_before

        return False, 0, 0

    except Exception as e:
        print(f"Error processing {file_path}: {e}")
        return False, 0, 0


def find_qmd_files(root_dir: Path) -> List[Path]:
    """Find all .qmd files in the directory tree."""
    return sorted(root_dir.rglob('*.qmd'))


def main():
    """Main function to process all .qmd files."""
    # Determine root directory
    if len(sys.argv) > 1:
        root_dir = Path(sys.argv[1])
    else:
        # Default to quarto directory
        root_dir = Path('/Users/VJ/GitHub/MLSysBook/quarto')

    if not root_dir.exists():
        print(f"Error: Directory {root_dir} does not exist")
        sys.exit(1)

    print(f"Scanning for .qmd files in: {root_dir}")

    # Find all .qmd files
    qmd_files = find_qmd_files(root_dir)

    if not qmd_files:
        print("No .qmd files found")
        return

    print(f"Found {len(qmd_files)} .qmd files")
    print("-" * 60)

    # Process each file
    total_modified = 0
    total_inline_refs = 0
    total_definitions = 0

    for qmd_file in qmd_files:
        relative_path = qmd_file.relative_to(root_dir.parent)
        was_modified, inline_refs, definitions = process_qmd_file(qmd_file)

        if was_modified:
            total_modified += 1
            total_inline_refs += inline_refs
            total_definitions += definitions
            print(f"✓ {relative_path}")
            print(f"  - Removed {inline_refs} inline references")
            print(f"  - Removed {definitions} footnote definitions")
        else:
            print(f"  {relative_path} (no footnotes found)")

    # Summary
    print("-" * 60)
    print(f"\nSummary:")
    print(f"  Files processed: {len(qmd_files)}")
    print(f"  Files modified: {total_modified}")
    print(f"  Total inline references removed: {total_inline_refs}")
    print(f"  Total footnote definitions removed: {total_definitions}")

    if total_modified > 0:
        print(f"\n✓ Successfully removed all footnotes from {total_modified} files")
    else:
        print("\n✓ No footnotes found in any files")


if __name__ == "__main__":
    main()
@@ -1,337 +0,0 @@
#!/usr/bin/env python3
"""
Reorganize tools/scripts/ directory structure.

This script:
1. Creates new subdirectories
2. Moves scripts to proper locations
3. Updates all references (pre-commit, imports, README)
4. Creates a rollback backup
"""

import os
import shutil
import json
from pathlib import Path
from datetime import datetime

# Define the migration plan
MIGRATION_PLAN = {
    # Images subdirectory - consolidate all image-related scripts
    'images': [
        'download_external_images.py',
        'manage_external_images.py',
        'remove_bg.py',
        'rename_auto_images.py',
        'rename_downloaded_images.py',
        'validate_image_references.py',
    ],

    # Content subdirectory - add formatting scripts
    'content': [
        'fix_mid_paragraph_bold.py',
        'format_python_in_qmd.py',
        'format_tables.py',
    ],

    # Testing subdirectory - consolidate all tests
    'testing': [
        'test_format_tables.py',
        'test_image_extraction.py',
        'test_publish_live.py',
    ],

    # Infrastructure subdirectory - CI/CD and container management
    'infrastructure': [
        'cleanup_containers.py',
        'list_containers.py',
        'cleanup_workflow_runs_gh.py',
    ],

    # Glossary subdirectory - move glossary script
    'glossary': [
        'standardize_glossaries.py',
    ],

    # Maintenance subdirectory - add release and preflight
    'maintenance': [
        'generate_release_notes.py',
        'preflight.py',
    ],

    # Utilities subdirectory - validation scripts
    'utilities': [
        'check_custom_extensions.py',
        'validate_part_keys.py',
    ],
}

# Files that also need to be moved from existing subdirectories
EXISTING_SUBDIRECTORY_MOVES = {
    'images': [
        ('utilities/manage_images.py', 'manage_images.py'),
        ('utilities/convert_svg_to_png.py', 'convert_svg_to_png.py'),
        ('maintenance/compress_images.py', 'compress_images.py'),
        ('maintenance/analyze_image_sizes.py', 'analyze_image_sizes.py'),
    ],
}

# Pre-commit reference updates
PRECOMMIT_UPDATES = {
    'tools/scripts/format_python_in_qmd.py': 'tools/scripts/content/format_python_in_qmd.py',
    'tools/scripts/format_tables.py': 'tools/scripts/content/format_tables.py',
    'tools/scripts/validate_part_keys.py': 'tools/scripts/utilities/validate_part_keys.py',
    'tools/scripts/manage_external_images.py': 'tools/scripts/images/manage_external_images.py',
    'tools/scripts/validate_image_references.py': 'tools/scripts/images/validate_image_references.py',
    'tools/scripts/generate_release_notes.py': 'tools/scripts/maintenance/generate_release_notes.py',
    'tools/scripts/preflight.py': 'tools/scripts/maintenance/preflight.py',
}


def create_backup():
    """Create a backup of the current state."""
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_file = f'tools/scripts/.backup_{timestamp}.json'

    backup_data = {
        'timestamp': timestamp,
        'files': []
    }

    # Record all file locations
    for root, dirs, files in os.walk('tools/scripts'):
        for file in files:
            if file.endswith('.py'):
                rel_path = os.path.relpath(os.path.join(root, file), 'tools/scripts')
                backup_data['files'].append(rel_path)

    with open(backup_file, 'w') as f:
        json.dump(backup_data, f, indent=2)

    print(f"✅ Created backup: {backup_file}")
    return backup_file


def create_directories():
    """Create new subdirectories."""
    print("\n📁 Creating new directories...")

    new_dirs = ['images', 'infrastructure']
    for dirname in new_dirs:
        dirpath = f'tools/scripts/{dirname}'
        if not os.path.exists(dirpath):
            os.makedirs(dirpath)
            # Create __init__.py
            with open(f'{dirpath}/__init__.py', 'w') as f:
                f.write(f'"""Scripts for {dirname} management."""\n')
            print(f"  ✅ Created {dirpath}/")
        else:
            print(f"  ⚠️ {dirpath}/ already exists")


def move_files_from_root():
    """Move files from root to subdirectories."""
    print("\n📦 Moving files from root level...")

    moved_count = 0
    for target_dir, files in MIGRATION_PLAN.items():
        target_path = f'tools/scripts/{target_dir}'

        for filename in files:
            source = f'tools/scripts/{filename}'
            dest = f'{target_path}/{filename}'

            if os.path.exists(source):
                shutil.move(source, dest)
                print(f"  ✅ {filename} → {target_dir}/")
                moved_count += 1
            else:
                print(f"  ⚠️ {filename} not found (may already be moved)")

    print(f"\n  Moved {moved_count} files from root")


def move_files_between_subdirs():
    """Move files between existing subdirectories."""
    print("\n🔄 Consolidating files between subdirectories...")

    moved_count = 0
    for target_dir, moves in EXISTING_SUBDIRECTORY_MOVES.items():
        target_path = f'tools/scripts/{target_dir}'

        for source_rel, dest_name in moves:
            source = f'tools/scripts/{source_rel}'
            dest = f'{target_path}/{dest_name}'

            if os.path.exists(source):
                shutil.move(source, dest)
                print(f"  ✅ {source_rel} → {target_dir}/{dest_name}")
                moved_count += 1
            else:
                print(f"  ⚠️ {source_rel} not found")

    print(f"\n  Moved {moved_count} files between subdirectories")


def update_precommit_config():
    """Update .pre-commit-config.yaml with new paths."""
    print("\n⚙️ Updating .pre-commit-config.yaml...")

    config_path = '.pre-commit-config.yaml'

    with open(config_path, 'r') as f:
        content = f.read()

    original_content = content
    updates_made = 0

    for old_path, new_path in PRECOMMIT_UPDATES.items():
        if old_path in content:
            content = content.replace(old_path, new_path)
            updates_made += 1
            print(f"  ✅ Updated: {os.path.basename(old_path)}")

    if updates_made > 0:
        with open(config_path, 'w') as f:
            f.write(content)
        print(f"\n  Updated {updates_made} references in pre-commit config")
    else:
        print("  ℹ️ No updates needed")


def create_readme_files():
    """Create README files for new directories."""
    print("\n📝 Creating README files...")

    readmes = {
        'images': '''# Image Management Scripts

Scripts for managing, processing, and validating images in the book.

## Image Processing
- `compress_images.py` - Compress images to reduce file size
- `convert_svg_to_png.py` - Convert SVG files to PNG format
- `remove_bg.py` - Remove backgrounds from images

## Image Management
- `manage_images.py` - Main image management utility
- `download_external_images.py` - Download external images
- `manage_external_images.py` - Manage external image references
- `rename_auto_images.py` - Rename automatically generated images
- `rename_downloaded_images.py` - Rename downloaded images

## Validation
- `validate_image_references.py` - Ensure all image references are valid
- `analyze_image_sizes.py` - Analyze image sizes and suggest optimizations
''',
        'infrastructure': '''# Infrastructure Scripts

Scripts for managing CI/CD, containers, and workflow infrastructure.

## Container Management
- `cleanup_containers.py` - Clean up Docker containers
- `list_containers.py` - List active containers

## Workflow Management
- `cleanup_workflow_runs_gh.py` - Clean up old GitHub Actions workflow runs
''',
    }

    for dirname, content in readmes.items():
        readme_path = f'tools/scripts/{dirname}/README.md'
        if not os.path.exists(readme_path):
            with open(readme_path, 'w') as f:
                f.write(content)
            print(f"  ✅ Created {dirname}/README.md")


def generate_summary():
    """Generate a summary of the reorganization."""
    print("\n" + "=" * 80)
    print("📊 REORGANIZATION SUMMARY")
    print("=" * 80)

    # Count files in each directory
    summary = {}
    for root, dirs, files in os.walk('tools/scripts'):
        # Skip __pycache__ and hidden directories
        if '__pycache__' in root or '/.backup' in root:
            continue

        dirname = os.path.relpath(root, 'tools/scripts')
        py_files = [f for f in files if f.endswith('.py') and f != '__init__.py']

        if py_files:
            summary[dirname] = len(py_files)

    print("\n📁 Files per directory:")
    for dirname in sorted(summary.keys()):
        count = summary[dirname]
        print(f"  {dirname + '/':<30} {count:>3} files")

    # Count root level files
    root_files = [f for f in os.listdir('tools/scripts')
                  if f.endswith('.py') and os.path.isfile(f'tools/scripts/{f}')]

    print(f"\n🎯 Root level scripts remaining: {len(root_files)}")
    if root_files:
        for f in root_files:
            print(f"  - {f}")

    print("\n✅ Reorganization complete!")
    print("\nNext steps:")
    print("  1. Test pre-commit hooks: pre-commit run --all-files")
    print("  2. Check for any broken imports")
    print("  3. Update any documentation references")


def main():
    """Main reorganization process."""
    import sys

    print("🔧 SCRIPT REORGANIZATION TOOL")
    print("=" * 80)
    print("\nThis will reorganize tools/scripts/ directory structure.")
    print("\nChanges:")
    print("  • Create new subdirectories (images/, infrastructure/)")
    print("  • Move 21 scripts from root to appropriate subdirectories")
    print("  • Consolidate scattered scripts (images, tests, etc.)")
    print("  • Update .pre-commit-config.yaml references")
    print("  • Create documentation")

    # Check for --yes flag
    if '--yes' not in sys.argv:
        response = input("\n⚠️ Proceed with reorganization? (yes/no): ")
        if response.lower() != 'yes':
            print("\n❌ Cancelled")
            return 1
    else:
        print("\n✅ Auto-confirmed with --yes flag")

    try:
        # Create backup
        backup_file = create_backup()

        # Execute reorganization
        create_directories()
        move_files_from_root()
        move_files_between_subdirs()
        update_precommit_config()
        create_readme_files()

        # Summary
        generate_summary()

        print(f"\n💾 Backup saved to: {backup_file}")
        print("   (Can be used for rollback if needed)")

        return 0

    except Exception as e:
        print(f"\n❌ Error during reorganization: {e}")
        print("   Please restore from backup if needed")
        return 1


if __name__ == '__main__':
    exit(main())
@@ -1,79 +0,0 @@
#!/usr/bin/env python3
"""
Pre-commit hook: Check that markdown files don't contain emojis.

Emojis cause rendering issues in PDF builds. Keep content professional
by using text descriptions instead.

Usage:
    python3 check_no_emojis.py [files...]

Exit codes:
    0 - No emojis found
    1 - Emojis found (lists files and emojis)
"""

import sys
import re
from pathlib import Path

# Emoji pattern - matches most common emoji ranges
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F9FF"  # Misc Symbols, Emoticons, Dingbats, etc.
    "\U00002600-\U000026FF"  # Misc symbols
    "\U00002700-\U000027BF"  # Dingbats
    "\U0001FA00-\U0001FAFF"  # Extended symbols
    "]",
    flags=re.UNICODE
)

# Allowed characters:
# - 🔥 Fire emoji for Tiny🔥Torch branding
# - ✓ Checkmark (renders fine in most fonts, used in code examples)
# - ✗ X mark (renders fine in most fonts)
ALLOWED_EMOJIS = {'🔥', '✓', '✗', '×'}


def check_file(filepath: Path) -> list[tuple[int, str, str]]:
    """Check a file for emojis. Returns list of (line_num, emoji, line_content)."""
    issues = []
    try:
        content = filepath.read_text(encoding='utf-8')
        for line_num, line in enumerate(content.splitlines(), 1):
            for match in EMOJI_PATTERN.finditer(line):
                emoji = match.group()
                if emoji not in ALLOWED_EMOJIS:
                    issues.append((line_num, emoji, line.strip()[:60]))
    except Exception as e:
        print(f"Warning: Could not read {filepath}: {e}", file=sys.stderr)
    return issues


def main():
    if len(sys.argv) < 2:
        print("Usage: check_no_emojis.py <file1> [file2] ...")
        sys.exit(0)

    files = [Path(f) for f in sys.argv[1:]]
    all_issues = {}

    for filepath in files:
        if filepath.suffix in ('.md', '.qmd'):
            issues = check_file(filepath)
            if issues:
                all_issues[filepath] = issues

    if all_issues:
        print("❌ Emojis found in markdown files (not allowed for PDF compatibility):\n")
        for filepath, issues in all_issues.items():
            print(f"  {filepath}:")
            for line_num, emoji, context in issues:
                print(f"    Line {line_num}: {emoji} - \"{context}...\"")
            print()
        print("Fix: Remove emojis or replace with text descriptions.")
        print("Note: 🔥 is allowed only for Tiny🔥Torch branding.")
        sys.exit(1)

    sys.exit(0)


if __name__ == '__main__':
    main()