diff --git a/.github/TINYTORCH_RELEASE_PROCESS.md b/.github/TINYTORCH_RELEASE_PROCESS.md deleted file mode 100644 index 20eb5f483..000000000 --- a/.github/TINYTORCH_RELEASE_PROCESS.md +++ /dev/null @@ -1,460 +0,0 @@ -# TinyTorch Release Process - -## Overview - -This document describes the complete release process for TinyTorch, combining automated CI/CD checks with manual agent-driven reviews. - -## Release Types - -### Patch Release (0.1.X) -- Bug fixes -- Documentation updates -- Minor improvements -- **Timeline:** 1-2 days - -### Minor Release (0.X.0) -- New module additions -- Feature enhancements -- Significant improvements -- **Timeline:** 1-2 weeks - -### Major Release (X.0.0) -- Complete module sets -- Breaking API changes -- Architectural updates -- **Timeline:** 1-3 months - -## Two-Track Quality Assurance - -### Track 1: Automated CI/CD (Continuous) - -**GitHub Actions** runs on every commit and PR: - -``` -Every Push/PR: -├── Educational Validation (Module structure, objectives) -├── Implementation Validation (Time, difficulty, tests) -├── Test Validation (All tests, coverage) -├── Package Validation (Builds, installs) -├── Documentation Validation (ABOUT.md, checkpoints) -└── Systems Analysis (Memory, performance, production) -``` - -**Trigger:** Automatic on push/PR - -**Duration:** 15-20 minutes - -**Pass Criteria:** All 6 quality gates green - ---- - -### Track 2: Agent-Driven Review (Pre-Release) - -**Specialized AI agents** provide deep review before releases: - -``` -TPM Coordinates: -├── Education Reviewer -│ ├── Pedagogical effectiveness -│ ├── Learning objective alignment -│ ├── Cognitive load assessment -│ └── Assessment quality -│ -├── Module Developer -│ ├── Implementation standards -│ ├── Code quality patterns -│ ├── Testing completeness -│ └── PyTorch API alignment -│ -├── Quality Assurance -│ ├── Comprehensive test validation -│ ├── Edge case coverage -│ ├── Performance testing -│ └── Integration stability -│ -└── Package Manager - ├── 
Module integration - ├── Dependency resolution - ├── Export/import validation - └── Build verification -``` - -**Trigger:** Manual (via TPM) - -**Duration:** 2-4 hours - -**Pass Criteria:** All agents approve - ---- - -## Complete Release Workflow - -### Phase 1: Development (Ongoing) - -1. **Feature Development** - - Implement modules following DEFINITIVE_MODULE_PLAN.md - - Write tests immediately after each function - - Ensure NBGrader compatibility - - Add checkpoint markers to long modules - -2. **Local Validation** - ```bash - # Run validators locally - python .github/scripts/validate_time_estimates.py - python .github/scripts/validate_difficulty_ratings.py - python .github/scripts/validate_testing_patterns.py - python .github/scripts/check_checkpoints.py - - # Run tests - pytest tests/ -v - ``` - -3. **Commit & Push** - ```bash - git add . - git commit -m "feat: Add [feature] to [module]" - git push origin feature-branch - ``` - ---- - -### Phase 2: Pre-Release Review (1-2 days) - -1. **Create Release Branch** - ```bash - git checkout -b release/v0.X.Y - git push origin release/v0.X.Y - ``` - -2. **Automated CI/CD Check** - - GitHub Actions runs automatically - - Review workflow results - - Fix any failures - -3. **Agent-Driven Comprehensive Review** - - **Invoke TPM for multi-agent review:** - - ``` - Request to TPM: - "I need a comprehensive quality review of all 20 TinyTorch modules - for release v0.X.Y. Please coordinate: - - 1. Education Reviewer - pedagogical validation - 2. Module Developer - implementation standards - 3. Quality Assurance - testing validation - 4. Package Manager - integration health - - Run these in parallel and provide: - - Consolidated findings report - - Prioritized action items - - Estimated effort for fixes - - Timeline for completion - - Release Type: [patch/minor/major] - Target Date: [YYYY-MM-DD]" - ``` - -4. 
**Review Agent Reports** - - Education Reviewer report - - Module Developer report - - Quality Assurance report - - Package Manager report - -5. **Address Findings** - - Fix HIGH priority issues immediately - - Schedule MEDIUM priority for next sprint - - Document LOW priority as future improvements - ---- - -### Phase 3: Release Candidate (1 day) - -1. **Create Release Candidate** - ```bash - git tag -a v0.X.Y-rc1 -m "Release candidate 1 for v0.X.Y" - git push origin v0.X.Y-rc1 - ``` - -2. **Final Validation** - - Run full test suite - - Build documentation - - Test package installation - - Manual smoke testing - -3. **Stakeholder Review** (if applicable) - - Share RC with instructors - - Collect feedback - - Make final adjustments - ---- - -### Phase 4: Release (1 day) - -1. **Manual Release Check Trigger** - - Via GitHub UI: - - Go to Actions → TinyTorch Release Check - - Click "Run workflow" - - Select: - - Branch: `release/v0.X.Y` - - Release Type: `[patch/minor/major]` - - Check Level: `comprehensive` - -2. **Review Release Report** - - All quality gates pass - - Download release report artifact - - Verify all validations green - -3. **Merge to Main** - ```bash - git checkout main - git merge --no-ff release/v0.X.Y - git push origin main - ``` - -4. **Create Official Release** - ```bash - git tag -a v0.X.Y -m "Release v0.X.Y: [Description]" - git push origin v0.X.Y - ``` - -5. **GitHub Release** - - Go to Releases → Draft a new release - - Select tag: `v0.X.Y` - - Title: `TinyTorch v0.X.Y` - - Description: Include release report summary - - Attach artifacts (wheels, documentation) - - Publish release - -6. **Package Distribution** - ```bash - # Build distribution packages - python -m build - - # Upload to PyPI (if applicable) - python -m twine upload dist/* - ``` - ---- - -### Phase 5: Post-Release (Ongoing) - -1. 
**Documentation Updates** - - Update README.md with new version - - Update CHANGELOG.md - - Rebuild Jupyter Book - - Deploy to mlsysbook.github.io - -2. **Communication** - - Announce on GitHub - - Update course materials - - Notify instructors - - Social media (if applicable) - -3. **Monitoring** - - Watch for issues - - Respond to feedback - - Plan next release - ---- - -## Quality Gates Reference - -### Must Pass for ALL Releases - -✅ All automated CI/CD checks pass -✅ Test coverage ≥80% -✅ All agent reviews approved -✅ Documentation complete -✅ No HIGH priority issues - -### Additional for Major Releases - -✅ All 20 modules validated -✅ Complete integration testing -✅ Performance benchmarks meet targets -✅ Comprehensive stakeholder review - ---- - -## Checklist Templates - -### Patch Release Checklist - -```markdown -## Pre-Release -- [ ] Local validation passes -- [ ] Automated CI/CD passes -- [ ] Bug fix validated -- [ ] Tests updated - -## Release -- [ ] Release branch created -- [ ] RC tested -- [ ] Merged to main -- [ ] Tag created -- [ ] GitHub release published - -## Post-Release -- [ ] Documentation updated -- [ ] CHANGELOG updated -- [ ] Issue closed -``` - -### Minor Release Checklist - -```markdown -## Pre-Release -- [ ] All local validations pass -- [ ] Automated CI/CD passes -- [ ] Agent reviews complete (all 4) -- [ ] High priority issues fixed -- [ ] New modules validated -- [ ] Integration tests pass - -## Release -- [ ] Release branch created -- [ ] RC tested -- [ ] Stakeholder review (if needed) -- [ ] Merged to main -- [ ] Tag created -- [ ] GitHub release published -- [ ] Package uploaded (if applicable) - -## Post-Release -- [ ] Documentation updated -- [ ] CHANGELOG updated -- [ ] Jupyter Book rebuilt -- [ ] Announcement sent -``` - -### Major Release Checklist - -```markdown -## Pre-Release (1-2 weeks) -- [ ] All local validations pass -- [ ] Automated CI/CD passes -- [ ] Comprehensive agent review (TPM-coordinated) - - [ ] Education 
Reviewer approved - - [ ] Module Developer approved - - [ ] Quality Assurance approved - - [ ] Package Manager approved -- [ ] ALL modules validated (20/20) -- [ ] Complete integration testing -- [ ] Performance benchmarks met -- [ ] Documentation complete -- [ ] All HIGH/MEDIUM issues resolved - -## Release Candidate (3-5 days) -- [ ] RC1 created and tested -- [ ] Stakeholder feedback collected -- [ ] Final adjustments made -- [ ] RC2 validated (if needed) - -## Release -- [ ] Release branch created -- [ ] Comprehensive check run -- [ ] All quality gates green -- [ ] Merged to main -- [ ] Tag created -- [ ] GitHub release published -- [ ] Package uploaded to PyPI -- [ ] Backup created - -## Post-Release (1 week) -- [ ] Documentation updated everywhere -- [ ] CHANGELOG complete -- [ ] Jupyter Book rebuilt and deployed -- [ ] All stakeholders notified -- [ ] Social media announcement -- [ ] Course materials updated -- [ ] Monitor for issues -``` - ---- - -## Emergency Hotfix Process - -For critical bugs in production: - -1. **Create hotfix branch from main** - ```bash - git checkout main - git checkout -b hotfix/v0.X.Y+1 - ``` - -2. **Fix the issue** - - Minimal changes only - - Focus on critical bug - - Add regression test - -3. **Fast-track validation** - ```bash - # Quick validation - python .github/scripts/validate_time_estimates.py - pytest tests/ -v -k "test_affected_module" - ``` - -4. **Release immediately** - ```bash - git checkout main - git merge --no-ff hotfix/v0.X.Y+1 - git tag -a v0.X.Y+1 -m "Hotfix: [Description]" - git push origin main --tags - ``` - -5. 
**Backport to release branches if needed** - ---- - -## Tools & Resources - -### GitHub Actions -- Workflow: `.github/workflows/release-check.yml` -- Scripts: `.github/scripts/*.py` -- Documentation: `.github/workflows/README.md` - -### Agent Coordination -- TPM: `.claude/agents/technical-program-manager.md` -- Agents: `.claude/agents/` -- Workflow: `DEFINITIVE_MODULE_PLAN.md` - -### Validation -- Time: `validate_time_estimates.py` -- Difficulty: `validate_difficulty_ratings.py` -- Tests: `validate_testing_patterns.py` -- Checkpoints: `check_checkpoints.py` - ---- - -## Version Numbering - -TinyTorch follows [Semantic Versioning](https://semver.org/): - -**Format:** `MAJOR.MINOR.PATCH` - -- **MAJOR:** Breaking changes, complete module sets -- **MINOR:** New features, module additions -- **PATCH:** Bug fixes, documentation - -**Examples:** -- `0.1.0` → `0.1.1`: Bug fix (patch) -- `0.1.1` → `0.2.0`: New module (minor) -- `0.9.0` → `1.0.0`: All 20 modules complete (major) - ---- - -## Contact & Support - -**Questions about releases?** -- Check this document first -- Review workflow README: `.github/workflows/README.md` -- Consult TPM agent for complex scenarios -- File issue on GitHub for workflow improvements - ---- - -**Last Updated:** 2024-11-24 -**Version:** 1.0.0 -**Maintainer:** TinyTorch Team diff --git a/.github/scripts/gs_compress_pdf.py b/.github/scripts/gs_compress_pdf.py deleted file mode 100644 index 5a0cd1c9a..000000000 --- a/.github/scripts/gs_compress_pdf.py +++ /dev/null @@ -1,62 +0,0 @@ -import argparse -import subprocess -import sys -import os -import shutil - -def get_ghostscript_command(): - """Determine the correct Ghostscript command based on the platform.""" - if os.name == 'nt': - # Try 64-bit and then 32-bit Ghostscript command names - for cmd in ['gswin64c', 'gswin32c']: - if shutil.which(cmd): - return cmd - print("❌ Ghostscript executable not found. 
Install it and ensure it's in your PATH (e.g., gswin64c.exe).", file=sys.stderr) - sys.exit(1) - else: - # On Linux/macOS, the command is usually 'gs' - if shutil.which('gs'): - return 'gs' - print("❌ Ghostscript (gs) not found. Install it and ensure it's in your PATH.", file=sys.stderr) - sys.exit(1) - -def convert_pdf(input_file, output_file, settings='/printer', compatibility='1.4', debug=False): - gs_command = get_ghostscript_command() - - command = [ - gs_command, - '-sDEVICE=pdfwrite', - '-dNOPAUSE', - '-dQUIET' if not debug else '-dQUIET=false', - '-dBATCH', - f'-dPDFSETTINGS={settings}', - f'-dCompatibilityLevel={compatibility}', - f'-sOutputFile={output_file}', - input_file - ] - - if debug: - print(f"Running command: {' '.join(command)}") - - try: - result = subprocess.run(command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE) - if debug: - print(result.stdout.decode()) - except subprocess.CalledProcessError as e: - print(f"Error: {e.stderr.decode()}", file=sys.stderr) - sys.exit(e.returncode) - -def main(): - parser = argparse.ArgumentParser(description="Convert PDF using Ghostscript with various options.") - parser.add_argument('-i', '--input', required=True, help="Input PDF file") - parser.add_argument('-o', '--output', required=True, help="Output PDF file") - parser.add_argument('-s', '--settings', default='/printer', help="PDF settings (default: /printer)") - parser.add_argument('-c', '--compatibility', default='1.4', help="PDF compatibility level (default: 1.4)") - parser.add_argument('-d', '--debug', action='store_true', help="Enable debug mode") - - args = parser.parse_args() - - convert_pdf(args.input, args.output, settings=args.settings, compatibility=args.compatibility, debug=args.debug) - -if __name__ == "__main__": - main() diff --git a/.github/tinytorch-scripts/check_checkpoints.py b/.github/tinytorch-scripts/check_checkpoints.py deleted file mode 100755 index 054497035..000000000 --- 
a/.github/tinytorch-scripts/check_checkpoints.py +++ /dev/null @@ -1,91 +0,0 @@ -#!/usr/bin/env python3 -""" -Validate checkpoint markers in long modules (8+ hours). -Ensures complex modules have progress markers to help students track completion. -""" - -import re -import sys -from pathlib import Path - - -def extract_time_estimate(about_file): - """Extract time estimate from ABOUT.md""" - if not about_file.exists(): - return 0 - - content = about_file.read_text() - match = re.search(r'time_estimate:\s*"(\d+)-(\d+)\s+hours"', content) - - if match: - return int(match.group(2)) # Return upper bound - return 0 - - -def count_checkpoints(about_file): - """Count checkpoint markers in ABOUT.md""" - if not about_file.exists(): - return 0 - - content = about_file.read_text() - # Look for checkpoint patterns - return len(re.findall(r'\*\*✓ CHECKPOINT \d+:', content)) - - -def main(): - """Validate checkpoint markers in long modules""" - modules_dir = Path("modules") - recommendations = [] - validated = [] - - print("🏁 Validating Checkpoint Markers") - print("=" * 60) - - # Find all module directories - module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()]) - - for module_dir in module_dirs: - module_name = module_dir.name - about_file = module_dir / "ABOUT.md" - - time_estimate = extract_time_estimate(about_file) - checkpoint_count = count_checkpoints(about_file) - - # Modules 8+ hours should have checkpoints - if time_estimate >= 8: - if checkpoint_count == 0: - recommendations.append( - f"⚠️ {module_name} ({time_estimate}h): Consider adding checkpoint markers" - ) - elif checkpoint_count >= 2: - validated.append( - f"✅ {module_name} ({time_estimate}h): {checkpoint_count} checkpoints" - ) - else: - recommendations.append( - f"⚠️ {module_name} ({time_estimate}h): Only {checkpoint_count} checkpoint (recommend 2+)" - ) - else: - print(f" {module_name} ({time_estimate}h): Checkpoints not required") - - print("\n" + "=" * 60) - - # 
Print validated modules - if validated: - print("\n✅ Modules with Good Checkpoint Coverage:") - for item in validated: - print(f" {item}") - - # Print recommendations - if recommendations: - print("\n💡 Recommendations:") - for rec in recommendations: - print(f" {rec}") - print("\nNote: This is informational only, not a blocker.") - - print("\n✅ Checkpoint validation complete!") - sys.exit(0) - - -if __name__ == "__main__": - main() diff --git a/.github/tinytorch-scripts/check_learning_objectives.py b/.github/tinytorch-scripts/check_learning_objectives.py deleted file mode 100755 index bd63aa903..000000000 --- a/.github/tinytorch-scripts/check_learning_objectives.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate learning objectives alignment across modules""" -import sys -print("📋 Learning objectives validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/check_progressive_disclosure.py b/.github/tinytorch-scripts/check_progressive_disclosure.py deleted file mode 100755 index df3145ae3..000000000 --- a/.github/tinytorch-scripts/check_progressive_disclosure.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate progressive disclosure patterns (no forward references)""" -import sys -print("🔍 Progressive disclosure validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_dependencies.py b/.github/tinytorch-scripts/validate_dependencies.py deleted file mode 100755 index 5d576aca7..000000000 --- a/.github/tinytorch-scripts/validate_dependencies.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate module dependency chain""" -import sys -print("🔗 Module dependencies validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_difficulty_ratings.py b/.github/tinytorch-scripts/validate_difficulty_ratings.py deleted file mode 100755 index 3c9bbd182..000000000 --- a/.github/tinytorch-scripts/validate_difficulty_ratings.py +++ /dev/null @@ -1,120 +0,0 @@ -#!/usr/bin/env python3 -""" 
-Validate difficulty rating consistency across LEARNING_PATH.md and module ABOUT.md files. -""" - -import re -import sys -from pathlib import Path - - -def normalize_difficulty(difficulty_str): - """Normalize difficulty rating to star count""" - if not difficulty_str: - return None - - # Count stars - star_count = difficulty_str.count("⭐") - if star_count > 0: - return star_count - - # Handle numeric format - if difficulty_str.isdigit(): - return int(difficulty_str) - - # Handle "X/4" format - match = re.match(r"(\d+)/4", difficulty_str) - if match: - return int(match.group(1)) - - return None - - -def extract_difficulty_from_learning_path(module_num): - """Extract difficulty rating for a module from LEARNING_PATH.md""" - learning_path = Path("modules/LEARNING_PATH.md") - if not learning_path.exists(): - return None - - content = learning_path.read_text() - - # Pattern: **Module XX: Name** (X-Y hours, ⭐...) - pattern = rf"\*\*Module {module_num:02d}:.*?\*\*\s*\([^,]+,\s*([⭐]+)\)" - match = re.search(pattern, content) - - return normalize_difficulty(match.group(1)) if match else None - - -def extract_difficulty_from_about(module_path): - """Extract difficulty rating from module ABOUT.md""" - about_file = module_path / "ABOUT.md" - if not about_file.exists(): - return None - - content = about_file.read_text() - - # Pattern: difficulty: "⭐..." or difficulty: X - pattern = r'difficulty:\s*["\']?([⭐\d/]+)["\']?' 
- match = re.search(pattern, content) - - return normalize_difficulty(match.group(1)) if match else None - - -def main(): - """Validate difficulty ratings across all modules""" - modules_dir = Path("modules") - errors = [] - warnings = [] - - print("⭐ Validating Difficulty Rating Consistency") - print("=" * 60) - - # Find all module directories - module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()]) - - for module_dir in module_dirs: - module_num = int(module_dir.name.split("_")[0]) - module_name = module_dir.name - - learning_path_diff = extract_difficulty_from_learning_path(module_num) - about_diff = extract_difficulty_from_about(module_dir) - - if not about_diff: - warnings.append(f"⚠️ {module_name}: Missing difficulty in ABOUT.md") - continue - - if not learning_path_diff: - warnings.append(f"⚠️ {module_name}: Not found in LEARNING_PATH.md") - continue - - if learning_path_diff != about_diff: - errors.append( - f"❌ {module_name}: Difficulty mismatch\n" - f" LEARNING_PATH.md: {'⭐' * learning_path_diff}\n" - f" ABOUT.md: {'⭐' * about_diff}" - ) - else: - print(f"✅ {module_name}: {'⭐' * about_diff}") - - print("\n" + "=" * 60) - - # Print warnings - if warnings: - print("\n⚠️ Warnings:") - for warning in warnings: - print(f" {warning}") - - # Print errors - if errors: - print("\n❌ Errors Found:") - for error in errors: - print(f" {error}\n") - print(f"\n{len(errors)} difficulty rating inconsistencies found!") - sys.exit(1) - else: - print("\n✅ All difficulty ratings are consistent!") - sys.exit(0) - - -if __name__ == "__main__": - main() diff --git a/.github/tinytorch-scripts/validate_documentation.py b/.github/tinytorch-scripts/validate_documentation.py deleted file mode 100755 index e499f98c7..000000000 --- a/.github/tinytorch-scripts/validate_documentation.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate ABOUT.md consistency""" -import sys -print("📄 Documentation validated!") -sys.exit(0) diff --git 
a/.github/tinytorch-scripts/validate_educational_standards.py b/.github/tinytorch-scripts/validate_educational_standards.py deleted file mode 100755 index 844a2e90a..000000000 --- a/.github/tinytorch-scripts/validate_educational_standards.py +++ /dev/null @@ -1,17 +0,0 @@ -#!/usr/bin/env python3 -""" -Validate educational standards across all modules. -Invokes education-reviewer agent logic for comprehensive review. -""" - -import sys -from pathlib import Path - -print("🎓 Educational Standards Validation") -print("=" * 60) -print("✅ Learning objectives present") -print("✅ Progressive disclosure maintained") -print("✅ Cognitive load appropriate") -print("✅ NBGrader compatible") -print("\n✅ Educational standards validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_exports.py b/.github/tinytorch-scripts/validate_exports.py deleted file mode 100755 index 1df2c79e0..000000000 --- a/.github/tinytorch-scripts/validate_exports.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate export directives""" -import sys -print("📦 Export directives validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_imports.py b/.github/tinytorch-scripts/validate_imports.py deleted file mode 100755 index 66bd54576..000000000 --- a/.github/tinytorch-scripts/validate_imports.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate import path consistency""" -import sys -print("🔗 Import paths validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_nbgrader.py b/.github/tinytorch-scripts/validate_nbgrader.py deleted file mode 100755 index 470d764fc..000000000 --- a/.github/tinytorch-scripts/validate_nbgrader.py +++ /dev/null @@ -1,5 +0,0 @@ -#!/usr/bin/env python3 -"""Validate NBGrader metadata in all modules""" -import sys -print("📝 NBGrader metadata validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_systems_analysis.py b/.github/tinytorch-scripts/validate_systems_analysis.py deleted file 
mode 100755 index 803ad732f..000000000 --- a/.github/tinytorch-scripts/validate_systems_analysis.py +++ /dev/null @@ -1,11 +0,0 @@ -#!/usr/bin/env python3 -"""Validate systems analysis coverage""" -import sys -import argparse - -parser = argparse.ArgumentParser() -parser.add_argument('--aspect', required=True, choices=['memory', 'performance', 'production']) -args = parser.parse_args() - -print(f"🧠 {args.aspect.capitalize()} analysis validated!") -sys.exit(0) diff --git a/.github/tinytorch-scripts/validate_testing_patterns.py b/.github/tinytorch-scripts/validate_testing_patterns.py deleted file mode 100755 index 8dc8308e5..000000000 --- a/.github/tinytorch-scripts/validate_testing_patterns.py +++ /dev/null @@ -1,95 +0,0 @@ -#!/usr/bin/env python3 -""" -Validate testing patterns in module development files. -Ensures: -- Unit tests use test_unit_* naming -- Module integration test is named test_module() -- Tests are protected with if __name__ == "__main__" -""" - -import re -import sys -from pathlib import Path - - -def check_module_tests(module_file): - """Check testing patterns in a module file""" - content = module_file.read_text() - issues = [] - - # Check for test_unit_* pattern - unit_tests = re.findall(r'def\s+(test_unit_\w+)\s*\(', content) - - # Check for test_module() function - has_test_module = bool(re.search(r'def\s+test_module\s*\(', content)) - - # Check for if __name__ == "__main__" blocks - has_main_guard = bool(re.search(r'if\s+__name__\s*==\s*["\']__main__["\']', content)) - - # Check for improper test names (test_* but not test_unit_*) - improper_tests = [ - name for name in re.findall(r'def\s+(test_\w+)\s*\(', content) - if not name.startswith('test_unit_') and name != 'test_module' - ] - - # Validate patterns - if not unit_tests and not has_test_module: - issues.append("No tests found (missing test_unit_* or test_module)") - - if not has_test_module: - issues.append("Missing test_module() integration test") - - if not has_main_guard: - issues.append("Missing 
if __name__ == '__main__' guard") - - if improper_tests: - issues.append(f"Improper test names (should be test_unit_*): {', '.join(improper_tests)}") - - return { - 'unit_tests': len(unit_tests), - 'has_test_module': has_test_module, - 'has_main_guard': has_main_guard, - 'issues': issues - } - - -def main(): - """Validate testing patterns across all modules""" - modules_dir = Path("modules") - errors = [] - warnings = [] - - print("🧪 Validating Testing Patterns") - print("=" * 60) - - # Find all module development files - module_files = sorted(modules_dir.glob("*/*_dev.py")) - - for module_file in module_files: - module_name = module_file.parent.name - - result = check_module_tests(module_file) - - if result['issues']: - errors.append(f"❌ {module_name}:") - for issue in result['issues']: - errors.append(f" - {issue}") - else: - print(f"✅ {module_name}: {result['unit_tests']} unit tests + test_module()") - - print("\n" + "=" * 60) - - # Print errors - if errors: - print("\n❌ Testing Pattern Issues:") - for error in errors: - print(f" {error}") - print(f"\n{len([e for e in errors if '❌' in e])} modules with testing issues!") - sys.exit(1) - else: - print("\n✅ All modules follow correct testing patterns!") - sys.exit(0) - - -if __name__ == "__main__": - main() diff --git a/.github/tinytorch-scripts/validate_time_estimates.py b/.github/tinytorch-scripts/validate_time_estimates.py deleted file mode 100755 index 8555557ea..000000000 --- a/.github/tinytorch-scripts/validate_time_estimates.py +++ /dev/null @@ -1,98 +0,0 @@ -#!/usr/bin/env python3 -""" -Validate time estimate consistency across LEARNING_PATH.md and module ABOUT.md files. 
-""" - -import re -import sys -from pathlib import Path - - -def extract_time_from_learning_path(module_num): - """Extract time estimate for a module from LEARNING_PATH.md""" - learning_path = Path("modules/LEARNING_PATH.md") - if not learning_path.exists(): - return None - - content = learning_path.read_text() - - # Pattern: **Module XX: Name** (X-Y hours, ⭐...) - pattern = rf"\*\*Module {module_num:02d}:.*?\*\*\s*\((\d+-\d+\s+hours)" - match = re.search(pattern, content) - - return match.group(1) if match else None - - -def extract_time_from_about(module_path): - """Extract time estimate from module ABOUT.md""" - about_file = module_path / "ABOUT.md" - if not about_file.exists(): - return None - - content = about_file.read_text() - - # Pattern: time_estimate: "X-Y hours" - pattern = r'time_estimate:\s*"(\d+-\d+\s+hours)"' - match = re.search(pattern, content) - - return match.group(1) if match else None - - -def main(): - """Validate time estimates across all modules""" - modules_dir = Path("modules") - errors = [] - warnings = [] - - print("⏱️ Validating Time Estimate Consistency") - print("=" * 60) - - # Find all module directories - module_dirs = sorted([d for d in modules_dir.iterdir() if d.is_dir() and d.name[0].isdigit()]) - - for module_dir in module_dirs: - module_num = int(module_dir.name.split("_")[0]) - module_name = module_dir.name - - learning_path_time = extract_time_from_learning_path(module_num) - about_time = extract_time_from_about(module_dir) - - if not about_time: - warnings.append(f"⚠️ {module_name}: Missing time_estimate in ABOUT.md") - continue - - if not learning_path_time: - warnings.append(f"⚠️ {module_name}: Not found in LEARNING_PATH.md") - continue - - if learning_path_time != about_time: - errors.append( - f"❌ {module_name}: Time mismatch\n" - f" LEARNING_PATH.md: {learning_path_time}\n" - f" ABOUT.md: {about_time}" - ) - else: - print(f"✅ {module_name}: {about_time}") - - print("\n" + "=" * 60) - - # Print warnings - if warnings: - 
print("\n⚠️ Warnings:") - for warning in warnings: - print(f" {warning}") - - # Print errors - if errors: - print("\n❌ Errors Found:") - for error in errors: - print(f" {error}\n") - print(f"\n{len(errors)} time estimate inconsistencies found!") - sys.exit(1) - else: - print("\n✅ All time estimates are consistent!") - sys.exit(0) - - -if __name__ == "__main__": - main() diff --git a/book/scripts/catalog_footnotes.py b/book/scripts/catalog_footnotes.py deleted file mode 100644 index 59e084fd1..000000000 --- a/book/scripts/catalog_footnotes.py +++ /dev/null @@ -1,366 +0,0 @@ -#!/usr/bin/env python3 -""" -Catalog all footnotes in Quarto markdown (.qmd) files. - -This script: -1. Scans all qmd files for footnotes -2. Collects inline references and their contexts -3. Collects footnote definitions -4. Generates a comprehensive report for the footnote agent -""" - -import re -import json -import sys -from pathlib import Path -from typing import Dict, List, Tuple, Set -from collections import defaultdict - - -def extract_inline_references(content: str, file_path: Path) -> List[Dict]: - """Extract all inline footnote references with their surrounding context.""" - references = [] - lines = content.splitlines() - - for line_num, line in enumerate(lines, 1): - # Find all footnote references in this line - matches = re.finditer(r'\[\^([^\]]+)\]', line) - for match in matches: - footnote_id = match.group(1) - - # Get context (the sentence containing the footnote) - # Find sentence boundaries - start_pos = max(0, match.start() - 100) - end_pos = min(len(line), match.end() + 100) - context = line[start_pos:end_pos].strip() - - # Clean up context - if start_pos > 0: - context = "..." + context - if end_pos < len(line): - context = context + "..." 
- - references.append({ - 'footnote_id': footnote_id, - 'file': str(file_path), - 'line': line_num, - 'context': context, - 'full_line': line.strip() - }) - - return references - - -def extract_footnote_definitions(content: str, file_path: Path) -> List[Dict]: - """Extract all footnote definitions.""" - definitions = [] - lines = content.splitlines() - - i = 0 - while i < len(lines): - line = lines[i] - - # Check if this line starts a footnote definition - match = re.match(r'^\[\^([^\]]+)\]:\s*(.*)$', line) - if match: - footnote_id = match.group(1) - definition_text = match.group(2) - line_num = i + 1 - - # Collect continuation lines - i += 1 - while i < len(lines): - next_line = lines[i] - # Continuation lines are indented or empty - if next_line and (next_line[0] == ' ' or next_line[0] == '\t'): - definition_text += '\n' + next_line - i += 1 - elif not next_line.strip(): - # Empty line might be part of the footnote - if i + 1 < len(lines) and lines[i + 1] and (lines[i + 1][0] == ' ' or lines[i + 1][0] == '\t'): - definition_text += '\n' - i += 1 - else: - break - else: - break - - # Clean up the definition - definition_text = definition_text.strip() - - # Extract bold term if it exists (common pattern: **Term**: Definition) - term_match = re.match(r'\*\*([^*]+)\*\*:\s*(.+)', definition_text) - term = term_match.group(1) if term_match else None - - definitions.append({ - 'footnote_id': footnote_id, - 'file': str(file_path), - 'line': line_num, - 'definition': definition_text, - 'term': term, - 'length': len(definition_text) - }) - else: - i += 1 - - return definitions - - -def analyze_footnote_patterns(all_definitions: List[Dict]) -> Dict: - """Analyze patterns in footnote definitions.""" - patterns = { - 'total_definitions': len(all_definitions), - 'with_bold_terms': 0, - 'average_length': 0, - 'common_prefixes': defaultdict(int), - 'terms_used': set() - } - - total_length = 0 - for defn in all_definitions: - total_length += defn['length'] - if defn['term']: - 
patterns['with_bold_terms'] += 1 - patterns['terms_used'].add(defn['term'].lower()) - - # Extract common ID prefixes (e.g., 'fn-', 'note-', etc.) - id_parts = defn['footnote_id'].split('-') - if len(id_parts) > 1: - patterns['common_prefixes'][id_parts[0]] += 1 - - if all_definitions: - patterns['average_length'] = total_length // len(all_definitions) - - patterns['terms_used'] = list(patterns['terms_used']) - patterns['common_prefixes'] = dict(patterns['common_prefixes']) - - return patterns - - -def find_duplicates(all_references: List[Dict], all_definitions: List[Dict]) -> Dict: - """Find duplicate footnotes across chapters.""" - duplicates = { - 'duplicate_ids': defaultdict(list), - 'duplicate_terms': defaultdict(list), - 'undefined_references': [], - 'unused_definitions': [] - } - - # Track footnote IDs by file - for ref in all_references: - file_name = Path(ref['file']).stem - duplicates['duplicate_ids'][ref['footnote_id']].append(file_name) - - # Track terms across files - for defn in all_definitions: - if defn['term']: - file_name = Path(defn['file']).stem - duplicates['duplicate_terms'][defn['term'].lower()].append({ - 'file': file_name, - 'footnote_id': defn['footnote_id'] - }) - - # Find undefined references - defined_ids = {d['footnote_id'] for d in all_definitions} - referenced_ids = {r['footnote_id'] for r in all_references} - - for ref in all_references: - if ref['footnote_id'] not in defined_ids: - duplicates['undefined_references'].append({ - 'footnote_id': ref['footnote_id'], - 'file': Path(ref['file']).stem, - 'line': ref['line'] - }) - - # Find unused definitions - for defn in all_definitions: - if defn['footnote_id'] not in referenced_ids: - duplicates['unused_definitions'].append({ - 'footnote_id': defn['footnote_id'], - 'file': Path(defn['file']).stem, - 'line': defn['line'] - }) - - # Clean up duplicates - only keep actual duplicates - duplicates['duplicate_ids'] = { - k: list(set(v)) for k, v in duplicates['duplicate_ids'].items() - if 
len(set(v)) > 1 - } - - duplicates['duplicate_terms'] = { - k: v for k, v in duplicates['duplicate_terms'].items() - if len(v) > 1 - } - - return duplicates - - -def generate_chapter_summary(file_path: Path, references: List[Dict], definitions: List[Dict]) -> Dict: - """Generate a summary for a specific chapter.""" - return { - 'file': str(file_path), - 'chapter_name': file_path.stem, - 'total_references': len(references), - 'total_definitions': len(definitions), - 'footnote_ids': sorted(list({r['footnote_id'] for r in references})), - 'terms_defined': sorted([d['term'] for d in definitions if d['term']]) - } - - -def generate_agent_context(all_data: Dict, target_chapter: str = None) -> str: - """Generate context information for the footnote agent.""" - context = [] - - context.append("# FOOTNOTE CATALOG AND CONTEXT\n") - context.append("## Book-Wide Footnote Statistics\n") - - patterns = all_data['patterns'] - context.append(f"- Total footnotes defined: {patterns['total_definitions']}") - context.append(f"- Footnotes with bold terms: {patterns['with_bold_terms']}") - context.append(f"- Average definition length: {patterns['average_length']} characters") - context.append(f"- Common ID prefixes: {patterns['common_prefixes']}") - context.append(f"- Total unique terms: {len(patterns['terms_used'])}\n") - - if all_data['duplicates']['duplicate_terms']: - context.append("## ⚠️ IMPORTANT: Terms Already Defined\n") - context.append("These terms have already been defined in other chapters. 
DO NOT redefine them:\n") - for term, locations in all_data['duplicates']['duplicate_terms'].items(): - context.append(f"- **{term}**: defined in {', '.join([l['file'] for l in locations])}") - context.append("") - - if target_chapter: - # Find chapter data - chapter_data = None - for chapter in all_data['by_chapter']: - if chapter['chapter_name'] == target_chapter or target_chapter in chapter['file']: - chapter_data = chapter - break - - if chapter_data: - context.append(f"## Current Chapter: {chapter_data['chapter_name']}\n") - context.append(f"- Existing footnotes: {chapter_data['total_references']}") - context.append(f"- Footnote IDs used: {', '.join(chapter_data['footnote_ids'])}") - if chapter_data['terms_defined']: - context.append(f"- Terms already defined: {', '.join(chapter_data['terms_defined'])}") - context.append("") - - context.append("## Footnote Style Guidelines\n") - context.append("Based on existing footnotes, follow these patterns:") - context.append("1. Use ID format: [^fn-term-name] (lowercase, hyphens)") - context.append("2. Definition format: **Bold Term**: Clear definition. Optional analogy.") - context.append("3. Keep definitions concise (avg ~200 characters)") - context.append("4. Avoid redefining terms from other chapters") - context.append("5. 
Focus on technical terms that need clarification\n") - - context.append("## All Terms Currently Defined in Book\n") - if patterns['terms_used']: - for i in range(0, len(patterns['terms_used']), 5): - batch = patterns['terms_used'][i:i+5] - context.append(f"- {', '.join(batch)}") - - return '\n'.join(context) - - -def main(): - """Main function to catalog all footnotes.""" - # Determine root directory - if len(sys.argv) > 1: - root_dir = Path(sys.argv[1]) - else: - root_dir = Path('/Users/VJ/GitHub/MLSysBook/quarto') - - if not root_dir.exists(): - print(f"Error: Directory {root_dir} does not exist") - sys.exit(1) - - print(f"Cataloging footnotes in: {root_dir}") - print("-" * 60) - - # Find all .qmd files - qmd_files = sorted(root_dir.rglob('*.qmd')) - - all_references = [] - all_definitions = [] - by_chapter = [] - - for qmd_file in qmd_files: - try: - with open(qmd_file, 'r', encoding='utf-8') as f: - content = f.read() - - # Skip files with no content - if not content.strip(): - continue - - # Extract footnotes - references = extract_inline_references(content, qmd_file) - definitions = extract_footnote_definitions(content, qmd_file) - - if references or definitions: - relative_path = qmd_file.relative_to(root_dir.parent) - print(f"✓ {relative_path}") - print(f" - {len(references)} inline references") - print(f" - {len(definitions)} definitions") - - all_references.extend(references) - all_definitions.extend(definitions) - - chapter_summary = generate_chapter_summary(qmd_file, references, definitions) - by_chapter.append(chapter_summary) - - except Exception as e: - print(f"Error processing {qmd_file}: {e}") - - # Analyze patterns and duplicates - patterns = analyze_footnote_patterns(all_definitions) - duplicates = find_duplicates(all_references, all_definitions) - - # Create comprehensive report - report = { - 'total_files': len(qmd_files), - 'total_references': len(all_references), - 'total_definitions': len(all_definitions), - 'patterns': patterns, - 
'duplicates': duplicates, - 'by_chapter': by_chapter, - 'all_references': all_references, - 'all_definitions': all_definitions - } - - # Save JSON report - report_file = root_dir.parent / 'footnote_catalog.json' - with open(report_file, 'w', encoding='utf-8') as f: - json.dump(report, f, indent=2, default=str) - - print("\n" + "=" * 60) - print("FOOTNOTE CATALOG SUMMARY") - print("=" * 60) - print(f"Total files scanned: {len(qmd_files)}") - print(f"Total inline references: {len(all_references)}") - print(f"Total definitions: {len(all_definitions)}") - print(f"Unique footnote IDs: {len(set(r['footnote_id'] for r in all_references))}") - print(f"Terms defined: {len(patterns['terms_used'])}") - - if duplicates['undefined_references']: - print(f"\n⚠️ Undefined references: {len(duplicates['undefined_references'])}") - for ref in duplicates['undefined_references'][:5]: - print(f" - [{ref['footnote_id']}] in {ref['file']} line {ref['line']}") - - if duplicates['unused_definitions']: - print(f"\n⚠️ Unused definitions: {len(duplicates['unused_definitions'])}") - for defn in duplicates['unused_definitions'][:5]: - print(f" - [{defn['footnote_id']}] in {defn['file']} line {defn['line']}") - - print(f"\n✓ Full report saved to: {report_file}") - - # Generate agent context file - agent_context = generate_agent_context(report) - context_file = root_dir.parent / '.claude' / 'footnote_context.md' - context_file.parent.mkdir(exist_ok=True) - with open(context_file, 'w', encoding='utf-8') as f: - f.write(agent_context) - print(f"✓ Agent context saved to: {context_file}") - - -if __name__ == "__main__": - main() diff --git a/book/scripts/remove_footnotes.py b/book/scripts/remove_footnotes.py deleted file mode 100755 index a9a228cb0..000000000 --- a/book/scripts/remove_footnotes.py +++ /dev/null @@ -1,164 +0,0 @@ -#!/usr/bin/env python3 -""" -Remove all footnotes from Quarto markdown (.qmd) files. - -This script removes: -1. Inline footnote references like [^fn-name] -2. 
Footnote definitions like [^fn-name]: Definition text... -3. Multi-line footnote definitions that are indented -""" - -import re -import sys -from pathlib import Path -from typing import List, Tuple - - -def remove_inline_footnotes(text: str) -> str: - """Remove inline footnote references like [^fn-name] from text.""" - # Pattern matches [^anything-here] where 'anything-here' doesn't contain ] - pattern = r'\[\^[^\]]+\]' - return re.sub(pattern, '', text) - - -def remove_footnote_definitions(lines: List[str]) -> List[str]: - """Remove footnote definitions from a list of lines.""" - cleaned_lines = [] - skip_mode = False - - for i, line in enumerate(lines): - # Check if this line starts a footnote definition - if re.match(r'^\[\^[^\]]+\]:', line): - skip_mode = True - continue - - # If we're in skip mode, check if this line is a continuation - if skip_mode: - # Continuation lines start with whitespace (indented) - if line and (line[0] == ' ' or line[0] == '\t'): - continue - # Empty lines after footnotes are also skipped - elif not line.strip(): - # Check if next line exists and is indented (continuation) - if i + 1 < len(lines) and lines[i + 1] and (lines[i + 1][0] == ' ' or lines[i + 1][0] == '\t'): - continue - # Otherwise, end skip mode but still skip this empty line - skip_mode = False - continue - else: - # Non-indented, non-empty line means footnote is done - skip_mode = False - - # Keep this line - cleaned_lines.append(line) - - return cleaned_lines - - -def process_qmd_file(file_path: Path) -> Tuple[bool, int, int]: - """ - Process a single .qmd file to remove footnotes. 
- - Returns: - Tuple of (was_modified, inline_refs_removed, definitions_removed) - """ - try: - with open(file_path, 'r', encoding='utf-8') as f: - content = f.read() - lines = content.splitlines() - - # Count footnotes before processing - inline_refs_before = len(re.findall(r'\[\^[^\]]+\]', content)) - definitions_before = len([l for l in lines if re.match(r'^\[\^[^\]]+\]:', l)]) - - # Remove inline references - content_no_inline = remove_inline_footnotes(content) - - # Remove definitions (work with lines) - lines_no_inline = content_no_inline.splitlines() - cleaned_lines = remove_footnote_definitions(lines_no_inline) - - # Reconstruct content - cleaned_content = '\n'.join(cleaned_lines) - - # Only write if there were changes - if cleaned_content != content: - with open(file_path, 'w', encoding='utf-8') as f: - f.write(cleaned_content) - # Ensure file ends with newline - if cleaned_content and not cleaned_content.endswith('\n'): - f.write('\n') - return True, inline_refs_before, definitions_before - - return False, 0, 0 - - except Exception as e: - print(f"Error processing {file_path}: {e}") - return False, 0, 0 - - -def find_qmd_files(root_dir: Path) -> List[Path]: - """Find all .qmd files in the directory tree.""" - return sorted(root_dir.rglob('*.qmd')) - - -def main(): - """Main function to process all .qmd files.""" - # Determine root directory - if len(sys.argv) > 1: - root_dir = Path(sys.argv[1]) - else: - # Default to quarto directory - root_dir = Path('/Users/VJ/GitHub/MLSysBook/quarto') - - if not root_dir.exists(): - print(f"Error: Directory {root_dir} does not exist") - sys.exit(1) - - print(f"Scanning for .qmd files in: {root_dir}") - - # Find all .qmd files - qmd_files = find_qmd_files(root_dir) - - if not qmd_files: - print("No .qmd files found") - return - - print(f"Found {len(qmd_files)} .qmd files") - print("-" * 60) - - # Process each file - total_modified = 0 - total_inline_refs = 0 - total_definitions = 0 - - for qmd_file in qmd_files: - 
relative_path = qmd_file.relative_to(root_dir.parent) - was_modified, inline_refs, definitions = process_qmd_file(qmd_file) - - if was_modified: - total_modified += 1 - total_inline_refs += inline_refs - total_definitions += definitions - print(f"✓ {relative_path}") - print(f" - Removed {inline_refs} inline references") - print(f" - Removed {definitions} footnote definitions") - else: - print(f" {relative_path} (no footnotes found)") - - # Summary - print("-" * 60) - print(f"\nSummary:") - print(f" Files processed: {len(qmd_files)}") - print(f" Files modified: {total_modified}") - print(f" Total inline references removed: {total_inline_refs}") - print(f" Total footnote definitions removed: {total_definitions}") - - if total_modified > 0: - print(f"\n✓ Successfully removed all footnotes from {total_modified} files") - else: - print("\n✓ No footnotes found in any files") - - -if __name__ == "__main__": - main() diff --git a/book/tools/scripts/reorganize_scripts.py b/book/tools/scripts/reorganize_scripts.py deleted file mode 100755 index cf0262e8e..000000000 --- a/book/tools/scripts/reorganize_scripts.py +++ /dev/null @@ -1,337 +0,0 @@ -#!/usr/bin/env python3 -""" -Reorganize tools/scripts/ directory structure. - -This script: -1. Creates new subdirectories -2. Moves scripts to proper locations -3. Updates all references (pre-commit, imports, README) -4. 
Creates a rollback backup -""" - -import os -import shutil -import json -from pathlib import Path -from datetime import datetime - -# Define the migration plan -MIGRATION_PLAN = { - # Images subdirectory - consolidate all image-related scripts - 'images': [ - 'download_external_images.py', - 'manage_external_images.py', - 'remove_bg.py', - 'rename_auto_images.py', - 'rename_downloaded_images.py', - 'validate_image_references.py', - ], - - # Content subdirectory - add formatting scripts - 'content': [ - 'fix_mid_paragraph_bold.py', - 'format_python_in_qmd.py', - 'format_tables.py', - ], - - # Testing subdirectory - consolidate all tests - 'testing': [ - 'test_format_tables.py', - 'test_image_extraction.py', - 'test_publish_live.py', - ], - - # Infrastructure subdirectory - CI/CD and container management - 'infrastructure': [ - 'cleanup_containers.py', - 'list_containers.py', - 'cleanup_workflow_runs_gh.py', - ], - - # Glossary subdirectory - move glossary script - 'glossary': [ - 'standardize_glossaries.py', - ], - - # Maintenance subdirectory - add release and preflight - 'maintenance': [ - 'generate_release_notes.py', - 'preflight.py', - ], - - # Utilities subdirectory - validation scripts - 'utilities': [ - 'check_custom_extensions.py', - 'validate_part_keys.py', - ], -} - -# Files that also need to be moved from existing subdirectories -EXISTING_SUBDIRECTORY_MOVES = { - 'images': [ - ('utilities/manage_images.py', 'manage_images.py'), - ('utilities/convert_svg_to_png.py', 'convert_svg_to_png.py'), - ('maintenance/compress_images.py', 'compress_images.py'), - ('maintenance/analyze_image_sizes.py', 'analyze_image_sizes.py'), - ], -} - -# Pre-commit reference updates -PRECOMMIT_UPDATES = { - 'tools/scripts/format_python_in_qmd.py': 'tools/scripts/content/format_python_in_qmd.py', - 'tools/scripts/format_tables.py': 'tools/scripts/content/format_tables.py', - 'tools/scripts/validate_part_keys.py': 'tools/scripts/utilities/validate_part_keys.py', - 
'tools/scripts/manage_external_images.py': 'tools/scripts/images/manage_external_images.py', - 'tools/scripts/validate_image_references.py': 'tools/scripts/images/validate_image_references.py', - 'tools/scripts/generate_release_notes.py': 'tools/scripts/maintenance/generate_release_notes.py', - 'tools/scripts/preflight.py': 'tools/scripts/maintenance/preflight.py', -} - - -def create_backup(): - """Create a backup of the current state.""" - timestamp = datetime.now().strftime('%Y%m%d_%H%M%S') - backup_file = f'tools/scripts/.backup_{timestamp}.json' - - backup_data = { - 'timestamp': timestamp, - 'files': [] - } - - # Record all file locations - for root, dirs, files in os.walk('tools/scripts'): - for file in files: - if file.endswith('.py'): - rel_path = os.path.relpath(os.path.join(root, file), 'tools/scripts') - backup_data['files'].append(rel_path) - - with open(backup_file, 'w') as f: - json.dump(backup_data, f, indent=2) - - print(f"✅ Created backup: {backup_file}") - return backup_file - - -def create_directories(): - """Create new subdirectories.""" - print("\n📁 Creating new directories...") - - new_dirs = ['images', 'infrastructure'] - for dirname in new_dirs: - dirpath = f'tools/scripts/{dirname}' - if not os.path.exists(dirpath): - os.makedirs(dirpath) - # Create __init__.py - with open(f'{dirpath}/__init__.py', 'w') as f: - f.write(f'"""Scripts for {dirname} management."""\n') - print(f" ✅ Created {dirpath}/") - else: - print(f" ⚠️ {dirpath}/ already exists") - - -def move_files_from_root(): - """Move files from root to subdirectories.""" - print("\n📦 Moving files from root level...") - - moved_count = 0 - for target_dir, files in MIGRATION_PLAN.items(): - target_path = f'tools/scripts/{target_dir}' - - for filename in files: - source = f'tools/scripts/{filename}' - dest = f'{target_path}/{filename}' - - if os.path.exists(source): - shutil.move(source, dest) - print(f" ✅ {filename} → {target_dir}/") - moved_count += 1 - else: - print(f" ⚠️ {filename} 
not found (may already be moved)") - - print(f"\n Moved {moved_count} files from root") - - -def move_files_between_subdirs(): - """Move files between existing subdirectories.""" - print("\n🔄 Consolidating files between subdirectories...") - - moved_count = 0 - for target_dir, moves in EXISTING_SUBDIRECTORY_MOVES.items(): - target_path = f'tools/scripts/{target_dir}' - - for source_rel, dest_name in moves: - source = f'tools/scripts/{source_rel}' - dest = f'{target_path}/{dest_name}' - - if os.path.exists(source): - shutil.move(source, dest) - print(f" ✅ {source_rel} → {target_dir}/{dest_name}") - moved_count += 1 - else: - print(f" ⚠️ {source_rel} not found") - - print(f"\n Moved {moved_count} files between subdirectories") - - -def update_precommit_config(): - """Update .pre-commit-config.yaml with new paths.""" - print("\n⚙️ Updating .pre-commit-config.yaml...") - - config_path = '.pre-commit-config.yaml' - - with open(config_path, 'r') as f: - content = f.read() - - original_content = content - updates_made = 0 - - for old_path, new_path in PRECOMMIT_UPDATES.items(): - if old_path in content: - content = content.replace(old_path, new_path) - updates_made += 1 - print(f" ✅ Updated: {os.path.basename(old_path)}") - - if updates_made > 0: - with open(config_path, 'w') as f: - f.write(content) - print(f"\n Updated {updates_made} references in pre-commit config") - else: - print(" ℹ️ No updates needed") - - -def create_readme_files(): - """Create README files for new directories.""" - print("\n📝 Creating README files...") - - readmes = { - 'images': '''# Image Management Scripts - -Scripts for managing, processing, and validating images in the book. 
- -## Image Processing -- `compress_images.py` - Compress images to reduce file size -- `convert_svg_to_png.py` - Convert SVG files to PNG format -- `remove_bg.py` - Remove backgrounds from images - -## Image Management -- `manage_images.py` - Main image management utility -- `download_external_images.py` - Download external images -- `manage_external_images.py` - Manage external image references -- `rename_auto_images.py` - Rename automatically generated images -- `rename_downloaded_images.py` - Rename downloaded images - -## Validation -- `validate_image_references.py` - Ensure all image references are valid -- `analyze_image_sizes.py` - Analyze image sizes and suggest optimizations -''', - 'infrastructure': '''# Infrastructure Scripts - -Scripts for managing CI/CD, containers, and workflow infrastructure. - -## Container Management -- `cleanup_containers.py` - Clean up Docker containers -- `list_containers.py` - List active containers - -## Workflow Management -- `cleanup_workflow_runs_gh.py` - Clean up old GitHub Actions workflow runs -''', - } - - for dirname, content in readmes.items(): - readme_path = f'tools/scripts/{dirname}/README.md' - if not os.path.exists(readme_path): - with open(readme_path, 'w') as f: - f.write(content) - print(f" ✅ Created {dirname}/README.md") - - -def generate_summary(): - """Generate a summary of the reorganization.""" - print("\n" + "=" * 80) - print("📊 REORGANIZATION SUMMARY") - print("=" * 80) - - # Count files in each directory - summary = {} - for root, dirs, files in os.walk('tools/scripts'): - # Skip __pycache__ and hidden directories - if '__pycache__' in root or '/.backup' in root: - continue - - dirname = os.path.relpath(root, 'tools/scripts') - py_files = [f for f in files if f.endswith('.py') and f != '__init__.py'] - - if py_files: - summary[dirname] = len(py_files) - - print("\n📁 Files per directory:") - for dirname in sorted(summary.keys()): - count = summary[dirname] - print(f" {dirname + '/':<30} {count:>3} 
files") - - # Count root level files - root_files = [f for f in os.listdir('tools/scripts') - if f.endswith('.py') and os.path.isfile(f'tools/scripts/{f}')] - - print(f"\n🎯 Root level scripts remaining: {len(root_files)}") - if root_files: - for f in root_files: - print(f" - {f}") - - print("\n✅ Reorganization complete!") - print("\nNext steps:") - print(" 1. Test pre-commit hooks: pre-commit run --all-files") - print(" 2. Check for any broken imports") - print(" 3. Update any documentation references") - - -def main(): - """Main reorganization process.""" - import sys - - print("🔧 SCRIPT REORGANIZATION TOOL") - print("=" * 80) - print("\nThis will reorganize tools/scripts/ directory structure.") - print("\nChanges:") - print(" • Create new subdirectories (images/, infrastructure/)") - print(" • Move 21 scripts from root to appropriate subdirectories") - print(" • Consolidate scattered scripts (images, tests, etc.)") - print(" • Update .pre-commit-config.yaml references") - print(" • Create documentation") - - # Check for --yes flag - if '--yes' not in sys.argv: - response = input("\n⚠️ Proceed with reorganization? 
(yes/no): ") - if response.lower() != 'yes': - print("\n❌ Cancelled") - return 1 - else: - print("\n✅ Auto-confirmed with --yes flag") - - try: - # Create backup - backup_file = create_backup() - - # Execute reorganization - create_directories() - move_files_from_root() - move_files_between_subdirs() - update_precommit_config() - create_readme_files() - - # Summary - generate_summary() - - print(f"\n💾 Backup saved to: {backup_file}") - print(" (Can be used for rollback if needed)") - - return 0 - - except Exception as e: - print(f"\n❌ Error during reorganization: {e}") - print(" Please restore from backup if needed") - return 1 - - -if __name__ == '__main__': - exit(main()) diff --git a/tinytorch/site/scripts/check_no_emojis.py b/tinytorch/site/scripts/check_no_emojis.py deleted file mode 100755 index 0b1c6f316..000000000 --- a/tinytorch/site/scripts/check_no_emojis.py +++ /dev/null @@ -1,79 +0,0 @@ -#!/usr/bin/env python3 -""" -Pre-commit hook: Check that markdown files don't contain emojis. - -Emojis cause rendering issues in PDF builds. Keep content professional -by using text descriptions instead. - -Usage: - python3 check_no_emojis.py [files...] - -Exit codes: - 0 - No emojis found - 1 - Emojis found (lists files and emojis) -""" - -import sys -import re -from pathlib import Path - -# Emoji pattern - matches most common emoji ranges -EMOJI_PATTERN = re.compile( - "[" - "\U0001F300-\U0001F9FF" # Misc Symbols, Emoticons, Dingbats, etc. - "\U00002600-\U000026FF" # Misc symbols - "\U00002700-\U000027BF" # Dingbats - "\U0001FA00-\U0001FAFF" # Extended symbols - "]", - flags=re.UNICODE -) - -# Allowed characters: -# - 🔥 Fire emoji for Tiny🔥Torch branding -# - ✓ Checkmark (renders fine in most fonts, used in code examples) -# - ✗ X mark (renders fine in most fonts) -ALLOWED_EMOJIS = {'🔥', '✓', '✗', '×'} - -def check_file(filepath: Path) -> list[tuple[int, str, str]]: - """Check a file for emojis. 
Returns list of (line_num, emoji, line_content).""" - issues = [] - try: - content = filepath.read_text(encoding='utf-8') - for line_num, line in enumerate(content.splitlines(), 1): - for match in EMOJI_PATTERN.finditer(line): - emoji = match.group() - if emoji not in ALLOWED_EMOJIS: - issues.append((line_num, emoji, line.strip()[:60])) - except Exception as e: - print(f"Warning: Could not read {filepath}: {e}", file=sys.stderr) - return issues - -def main(): - if len(sys.argv) < 2: - print("Usage: check_no_emojis.py <file1> [file2] ...") - sys.exit(0) - - files = [Path(f) for f in sys.argv[1:]] - all_issues = {} - - for filepath in files: - if filepath.suffix in ('.md', '.qmd'): - issues = check_file(filepath) - if issues: - all_issues[filepath] = issues - - if all_issues: - print("❌ Emojis found in markdown files (not allowed for PDF compatibility):\n") - for filepath, issues in all_issues.items(): - print(f" {filepath}:") - for line_num, emoji, context in issues: - print(f" Line {line_num}: {emoji} - \"{context}...\"") - print() - print("Fix: Remove emojis or replace with text descriptions.") - print("Note: 🔥 is allowed only for Tiny🔥Torch branding.") - sys.exit(1) - - sys.exit(0) - -if __name__ == '__main__': - main()