MAJOR: Implement beautiful module progression through strategic reordering

This commit reorders modules 06-10 into the "inevitable discovery" progression, in which each module creates the need for the next. The new order follows expert validation and educational design principles.

## Module Reordering Summary

**Previous Order (Problems)**:
- 05_losses → 06_autograd → 07_dataloader → 08_optimizers → 09_spatial → 10_training
- Issues: Autograd before optimizers, DataLoader before training, scattered dependencies

**New Order (Beautiful Progression)**:
- 05_losses → 06_optimizers → 07_autograd → 08_training → 09_spatial → 10_dataloader
- Benefits: Each module creates inevitable need for the next

## Pedagogical Flow Achieved

**05_losses** → "Need systematic weight updates" → **06_optimizers**
**06_optimizers** → "Need automatic gradients" → **07_autograd**
**07_autograd** → "Need systematic training" → **08_training**
**08_training** → "MLPs hit limits on images" → **09_spatial**
**09_spatial** → "Training is too slow" → **10_dataloader**

## Technical Changes

### Module Directory Renaming
- `06_autograd` → `07_autograd`
- `07_dataloader` → `10_dataloader`
- `08_optimizers` → `06_optimizers`
- `10_training` → `08_training`
- `09_spatial` → `09_spatial` (no change)
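
A renumbering like this is easy to get subtly wrong (a duplicated or dropped number, a topic renamed by accident). The sketch below is one way to sanity-check the mapping listed above. The helper names `split_name` and `check_renames` are hypothetical and are not part of the TinyTorch tooling.

```python
# Old -> new directory names, copied from the renaming list above.
RENAMES = {
    "06_autograd": "07_autograd",
    "07_dataloader": "10_dataloader",
    "08_optimizers": "06_optimizers",
    "10_training": "08_training",
    "09_spatial": "09_spatial",
}


def split_name(dirname):
    """Split '06_autograd' into (6, 'autograd')."""
    number, _, topic = dirname.partition("_")
    return int(number), topic


def check_renames(renames):
    """Validate the mapping and return the new names in module order."""
    old_numbers = sorted(split_name(old)[0] for old in renames)
    new_numbers = sorted(split_name(new)[0] for new in renames.values())
    # The renumbering must be a permutation: same numbers, no duplicates.
    if old_numbers != new_numbers or len(set(new_numbers)) != len(new_numbers):
        raise ValueError("new numbers are not a permutation of the old ones")
    # Only the numeric prefix may change, never the topic.
    for old, new in renames.items():
        if split_name(old)[1] != split_name(new)[1]:
            raise ValueError(f"topic changed in rename {old} -> {new}")
    return sorted(renames.values(), key=lambda name: split_name(name)[0])


new_order = check_renames(RENAMES)
print(" -> ".join(new_order))
```
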

### System Integration Updates
- **MODULE_TO_CHECKPOINT mapping**: Updated in tito/commands/export.py
- **Test directories**: Renamed module_XX directories to match new numbers
- **Documentation**: Updated all references in MD files and agent configurations
- **CLI integration**: Updated next-steps suggestions for proper flow

### Agent Configuration Updates
- **Quality Assurance**: Updated module audit status with new numbers
- **Module Developer**: Updated work tracking with new sequence
- **Documentation**: Updated MASTER_PLAN_OF_RECORD.md with beautiful progression

## Educational Benefits

1. **Inevitable Discovery**: Each module naturally leads to the next
2. **Cognitive Load**: Concepts introduced exactly when needed
3. **Motivation**: Students understand WHY each tool is necessary
4. **Synthesis**: Everything flows toward complete ML systems understanding
5. **Professional Alignment**: Matches real ML engineering workflows

## Quality Assurance

- All CLI commands still function
- Checkpoint system mappings updated
- Documentation consistency maintained
- Test directory structure aligned
- Agent configurations synchronized

**Impact**: This reordering transforms TinyTorch from a collection of modules into a coherent educational journey where each step naturally motivates the next, creating optimal conditions for deep learning systems understanding.
Vijay Janapa Reddi
2025-09-24 15:56:47 -04:00
parent 0d87b6603f
commit 2f23f757e7
68 changed files with 5875 additions and 2399 deletions

@@ -0,0 +1,166 @@
# Module 20: Capstone - Complete ML System Integration
## Overview
Combine everything you've learned to build a complete, optimized ML system from scratch. This is your masterpiece - demonstrating mastery of both ML algorithms and systems engineering.
## Project Options
### Option 1: Optimized CIFAR-10 Trainer
**Goal**: 75% accuracy with minimal resources
- Start with your Module 08 trainer (`08_training`)
- Apply all optimizations (acceleration, quantization, pruning)
- Achieve same accuracy with 10x less compute/memory
- Deploy on resource-constrained device
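
As a rough illustration of one of the optimizations named above, here is a minimal sketch of unstructured magnitude pruning on a flat list of weights. It uses plain Python lists rather than TinyTorch tensors, and the function name `magnitude_prune` is an assumption made for this example, not an API from the course.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    `sparsity` is the fraction of weights to remove, between 0.0 and 1.0.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0.0 and 1.0")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.4, -0.1, 0.07, 0.9]
pruned = magnitude_prune(weights, sparsity=0.5)
kept = sum(1 for w in pruned if w != 0.0)
print(f"kept {kept} of {len(weights)} weights")
```

Zeroing weights only saves memory or compute once the zeros are stored or skipped efficiently, so a sketch like this is a starting point for measurement rather than a finished optimization.
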
### Option 2: Efficient GPT Inference Engine
**Goal**: Real-time text generation on CPU
- Implement KV caching for transformers
- Quantize model to INT8
- Optimize attention computation
- Generate 100 tokens/second on laptop CPU
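
To make the INT8 step concrete, the following is a minimal sketch of symmetric per-tensor quantization in plain Python. The function names `quantize_int8` and `dequantize` are hypothetical, and a real engine would likely operate on tensors and may use per-channel scales instead.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the INT8 values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.6f}, max reconstruction error={max_error:.6f}")
```

Each weight then needs one byte instead of four (for float32), at the cost of a rounding error of at most half a quantization step per value.
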
### Option 3: Custom Challenge
**Goal**: Define your own optimization challenge
- Pick a problem you care about
- Set performance targets
- Apply systematic optimization
- Document the journey
## What You'll Demonstrate
### 1. Full Stack Understanding
- Build complete training pipeline
- Implement model architecture
- Add optimization layers
- Deploy to production
### 2. Systems Engineering
- Profile and identify bottlenecks
- Apply appropriate optimizations
- Measure and validate improvements
- Handle resource constraints
### 3. Scientific Approach
- Baseline measurements
- Systematic optimization
- Ablation studies
- Reproducible results
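
Baseline measurements are only useful if they are repeatable. The sketch below shows one way to time a function with warmup runs and a median over several repeats, using only the standard library. The helper names `benchmark` and `speedup` and the default repeat counts are assumptions for illustration.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=20):
    """Return the median wall-clock time of fn(*args) in seconds."""
    # Warmup runs absorb one-time costs such as cache population.
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    # The median is less sensitive to outliers than the mean.
    return statistics.median(timings)


def speedup(baseline_seconds, optimized_seconds):
    """How many times faster the optimized version is than the baseline."""
    return baseline_seconds / optimized_seconds


def sum_of_squares(n):
    """Stand-in workload; replace with a forward pass or training step."""
    return sum(i * i for i in range(n))


median_seconds = benchmark(sum_of_squares, 10_000)
print(f"median: {median_seconds * 1e3:.3f} ms")
```

Running the same helper on the baseline and on each optimized variant gives the per-step numbers an ablation study needs.
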
## Capstone Structure
### Week 1: Planning & Baseline
```python
# 1. Choose project and define success metrics
metrics = {
    'accuracy_target': 75.0,
    'inference_time': '<10ms',
    'memory_usage': '<100MB',
    'model_size': '<10MB',
}
# 2. Build baseline system
baseline = build_baseline_model()
baseline_metrics = evaluate(baseline)
# 3. Profile and identify opportunities
bottlenecks = profile_system(baseline)
```
### Week 2: Optimization Sprint
```python
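# 4. Apply optimizations one at a time so each gain can be attributed
optimizations = [
    apply_acceleration,
    apply_quantization,
    apply_pruning,
    apply_caching,
]
# Assumes evaluate() returns a dict with a numeric 'inference_time' entry
baseline_time = baseline_metrics['inference_time']
optimized = baseline
# 5. Measure the cumulative improvement after each step
for optimization in optimizations:
    optimized = optimization(optimized)
    optimized_metrics = evaluate(optimized)
    speedup = baseline_time / optimized_metrics['inference_time']
    print(f"{optimization.__name__}: {speedup:.1f}x faster than baseline")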
```
### Week 3: Polish & Deploy
```python
# 6. Final optimization pass
final_model = fine_tune_optimizations(optimized)
# 7. Create deployment package
deployment = package_for_production(final_model)
# 8. Document results
write_technical_report(baseline, final_model, metrics)
```
## Deliverables
### 1. Working System
- Complete codebase on GitHub
- README with setup instructions
- Demonstration video/notebook
### 2. Technical Report
- Problem statement and approach
- Baseline vs optimized metrics
- Optimization journey and decisions
- Lessons learned
### 3. Performance Analysis
- Comprehensive benchmarks
- Ablation study results
- Resource utilization graphs
- Comparison with PyTorch/TensorFlow
## Evaluation Criteria
### Technical Excellence (40%)
- Correctness of implementation
- Quality of optimizations
- Code organization and style
### Performance Achievement (30%)
- Meeting stated goals
- Improvement over baseline
- Resource efficiency
### Systems Understanding (30%)
- Appropriate optimization choices
- Understanding of tradeoffs
- Scientific methodology
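
The weights above combine into a final score as a simple weighted sum. The sketch below assumes each category is graded on a 0-100 scale, which the rubric does not state, and the dictionary keys and `final_score` name are chosen for this example only.

```python
# Category weights from the evaluation criteria above.
WEIGHTS = {
    "technical_excellence": 0.40,
    "performance_achievement": 0.30,
    "systems_understanding": 0.30,
}


def final_score(scores):
    """Weighted sum of per-category scores (assumed to be on a 0-100 scale)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the rubric categories")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


example = {
    "technical_excellence": 90,
    "performance_achievement": 80,
    "systems_understanding": 70,
}
print(f"final score: {final_score(example):.1f}")
```
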
## Example Projects from Past Students
### "TinyYOLO" - Real-time Object Detection
- 30 FPS on Raspberry Pi
- 90% size reduction through pruning
- Custom INT8 kernels for ARM
### "NanoGPT" - Edge Language Model
- 100MB model generates Shakespeare
- KV caching + quantization
- Runs on 2015 laptop
### "SwiftCNN" - Instant Image Classification
- <1ms inference on iPhone
- Structured pruning + iOS Metal
- 95% of ResNet accuracy at 10% size
## Resources
- All previous module code
- TinyTorch optimization library
- Benchmarking tools
- Community Discord for help
## Success Criteria
- ✅ Complete working system with all optimizations
- ✅ 10x+ improvement in speed OR memory
- ✅ Professional documentation and analysis
- ✅ Understanding of when/why to apply each optimization
- ✅ Ready for ML systems engineering roles!
## Final Note
This is your chance to show everything you've learned. Build something you're proud of - something that demonstrates not just that you can implement ML algorithms, but that you understand how to build production ML systems.
**Remember**: The goal isn't perfection, it's demonstrating systematic thinking about performance, memory, and deployment constraints - the real challenges of ML engineering.

@@ -0,0 +1,30 @@
name: Capstone
number: 20
type: project
difficulty: advanced
estimated_hours: 15-20

description: |
  Final project combining all optimization techniques. Students build an optimized
  end-to-end ML system and compete on the global leaderboard.

learning_objectives:
  - Combine all optimization techniques
  - Build complete optimized systems
  - Deploy efficient ML models
  - Compete on performance metrics

prerequisites:
  - All previous modules (1-19)

skills_developed:
  - System integration
  - Holistic optimization
  - Production deployment
  - Performance engineering

final_projects:
  - Optimized CIFAR-10 trainer
  - Efficient GPT inference engine
  - Memory-constrained deployment
  - Custom optimization challenge