This commit implements the pedagogically optimal "inevitable discovery" module progression based on expert validation and educational design principles.

## Module Reordering Summary

**Previous Order (Problems)**:
- 05_losses → 06_autograd → 07_dataloader → 08_optimizers → 09_spatial → 10_training
- Issues: Autograd before optimizers, DataLoader before training, scattered dependencies

**New Order (Beautiful Progression)**:
- 05_losses → 06_optimizers → 07_autograd → 08_training → 09_spatial → 10_dataloader
- Benefits: Each module creates an inevitable need for the next

## Pedagogical Flow Achieved

- **05_losses** → "Need systematic weight updates" → **06_optimizers**
- **06_optimizers** → "Need automatic gradients" → **07_autograd**
- **07_autograd** → "Need systematic training" → **08_training**
- **08_training** → "MLPs hit limits on images" → **09_spatial**
- **09_spatial** → "Training is too slow" → **10_dataloader**

## Technical Changes

### Module Directory Renaming
- `06_autograd` → `07_autograd`
- `07_dataloader` → `10_dataloader`
- `08_optimizers` → `06_optimizers`
- `10_training` → `08_training`
- `09_spatial` → `09_spatial` (no change)

### System Integration Updates
- **MODULE_TO_CHECKPOINT mapping**: Updated in tito/commands/export.py
- **Test directories**: Renamed module_XX directories to match the new numbers
- **Documentation**: Updated all references in MD files and agent configurations
- **CLI integration**: Updated next-steps suggestions for the proper flow

### Agent Configuration Updates
- **Quality Assurance**: Updated module audit status with the new numbers
- **Module Developer**: Updated work tracking with the new sequence
- **Documentation**: Updated MASTER_PLAN_OF_RECORD.md with the new progression

## Educational Benefits

1. **Inevitable Discovery**: Each module naturally leads to the next
2. **Cognitive Load**: Concepts are introduced exactly when needed
3. **Motivation**: Students understand WHY each tool is necessary
4. **Synthesis**: Everything flows toward complete ML systems understanding
5. **Professional Alignment**: Matches real ML engineering workflows

## Quality Assurance

- ✅ All CLI commands still function
- ✅ Checkpoint system mappings updated
- ✅ Documentation consistency maintained
- ✅ Test directory structure aligned
- ✅ Agent configurations synchronized

**Impact**: This reordering transforms TinyTorch from a collection of modules into a coherent educational journey where each step naturally motivates the next, creating optimal conditions for deep learning systems understanding.
# Module 20: Capstone - Complete ML System Integration

## Overview

Combine everything you've learned to build a complete, optimized ML system from scratch. This is your masterpiece - demonstrating mastery of both ML algorithms and systems engineering.

## Project Options
### Option 1: Optimized CIFAR-10 Trainer

**Goal**: 75% accuracy with minimal resources

- Start with your Module 10 trainer
- Apply all optimizations: acceleration, quantization, pruning (see the quantization sketch below)
- Achieve the same accuracy with 10x less compute/memory
- Deploy on a resource-constrained device
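Of the listed optimizations, quantization is a good one to prototype first. Here is a minimal sketch of symmetric per-tensor INT8 quantization in plain NumPy; the function names are illustrative, not part of the TinyTorch API:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: float32 -> (int8, scale)."""
    scale = max(np.abs(weights).max() / 127.0, 1e-12)  # avoid divide-by-zero
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor to check the accuracy cost."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print("worst-case error:", np.abs(w - dequantize(q, scale)).max())  # ~scale / 2
```

Storing `q` plus one float instead of `w` is already roughly a 4x memory reduction, before any speed gains from integer arithmetic.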
### Option 2: Efficient GPT Inference Engine

**Goal**: Real-time text generation on CPU

- Implement KV caching for transformers (sketched below)
- Quantize the model to INT8
- Optimize the attention computation
- Generate 100 tokens/second on a laptop CPU
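The key idea behind KV caching: during autoregressive generation, each new token's query attends over keys and values that were already computed for every earlier token, so store them instead of recomputing. A minimal single-head NumPy sketch (the class and function names are illustrative, not a TinyTorch API):

```python
import numpy as np

class KVCache:
    """Append each new token's key/value instead of recomputing the prefix."""
    def __init__(self):
        self.k, self.v = [], []

    def append(self, k_t, v_t):
        self.k.append(k_t)
        self.v.append(v_t)
        return np.stack(self.k), np.stack(self.v)  # shapes: (seq_len, d_head)

def attend(q_t, K, V):
    """Single-query attention over the cached keys/values."""
    scores = K @ q_t / np.sqrt(q_t.shape[-1])   # (seq_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V                           # (d_head,)

d = 64
cache = KVCache()
for step in range(5):                            # one generated token per step
    q_t, k_t, v_t = (np.random.randn(d) for _ in range(3))
    K, V = cache.append(k_t, v_t)
    out = attend(q_t, K, V)
```

Without the cache, step t would recompute keys and values for all t tokens; with it, each step costs one append plus a single (t x d) matmul.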
### Option 3: Custom Challenge

**Goal**: Define your own optimization challenge

- Pick a problem you care about
- Set performance targets
- Apply systematic optimization
- Document the journey
## What You'll Demonstrate

### 1. Full Stack Understanding

- Build a complete training pipeline
- Implement the model architecture
- Add optimization layers
- Deploy to production

### 2. Systems Engineering

- Profile and identify bottlenecks
- Apply appropriate optimizations
- Measure and validate improvements
- Handle resource constraints

### 3. Scientific Approach

- Baseline measurements
- Systematic optimization
- Ablation studies (see the sketch below)
- Reproducible results
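A concrete way to run the ablation study: rebuild the system with exactly one optimization removed and compare against the full stack. This sketch reuses the placeholder helpers from the weekly plan below and assumes `evaluate` returns a dict with an `'accuracy'` key:

```python
# Hypothetical ablation harness built on the placeholder helpers below.
optimizations = {
    "acceleration": apply_acceleration,
    "quantization": apply_quantization,
    "pruning": apply_pruning,
}

def build(skip=None):
    """Build the full optimized system, optionally leaving one step out."""
    model = build_baseline_model()
    for name, apply_fn in optimizations.items():
        if name != skip:
            model = apply_fn(model)
    return model

full_acc = evaluate(build())["accuracy"]
for name in optimizations:
    ablated_acc = evaluate(build(skip=name))["accuracy"]
    print(f"dropping {name} changes accuracy by {ablated_acc - full_acc:+.2f}")
```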
## Capstone Structure

### Week 1: Planning & Baseline

```python
# 1. Choose a project and define success metrics up front
metrics = {
    'accuracy_target': 75.0,      # percent
    'inference_time': '<10ms',
    'memory_usage': '<100MB',
    'model_size': '<10MB',
}

# 2. Build the baseline system (helpers come from your earlier modules)
baseline = build_baseline_model()
baseline_metrics = evaluate(baseline)

# 3. Profile and identify optimization opportunities
bottlenecks = profile_system(baseline)
```
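The `profile_system` helper above is yours to implement; one plausible version, using only Python's built-in profiler and taking a zero-argument callable rather than the model itself:

```python
import cProfile
import pstats

def profile_system(run_fn):
    """Profile one end-to-end run and report the hottest call sites."""
    profiler = cProfile.Profile()
    profiler.enable()
    run_fn()                      # e.g. one training epoch or one inference batch
    profiler.disable()
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)

# usage: profile_system(lambda: evaluate(baseline))
```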
### Week 2: Optimization Sprint

```python
# 4. Apply optimizations systematically, one at a time
optimizations = [apply_acceleration, apply_quantization, apply_pruning, apply_caching]

# measure_inference_time is a placeholder: see the benchmark harness below
baseline_time = measure_inference_time(baseline)

# 5. Measure the improvement after each step, not just at the end
optimized = baseline
for apply_fn in optimizations:
    optimized = apply_fn(optimized)
    optimized_time = measure_inference_time(optimized)
    print(f"{apply_fn.__name__}: {baseline_time / optimized_time:.1f}x faster")
```
### Week 3: Polish & Deploy

```python
# 6. Final optimization pass
final_model = fine_tune_optimizations(optimized)

# 7. Create the deployment package
deployment = package_for_production(final_model)

# 8. Document the results
write_technical_report(baseline, final_model, metrics)
```
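`package_for_production` is likewise unspecified; one plausible shape for it, assuming the model's weights are available as a dict of NumPy arrays:

```python
import json
import os
import numpy as np

def package_for_production(weights: dict, config: dict, path: str = "deploy"):
    """Serialize compressed weight arrays plus a JSON config that records
    everything needed to rebuild the model without the training code."""
    os.makedirs(path, exist_ok=True)
    np.savez_compressed(os.path.join(path, "weights.npz"), **weights)
    with open(os.path.join(path, "config.json"), "w") as f:
        json.dump(config, f, indent=2)
    return path
```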
## Deliverables

### 1. Working System

- Complete codebase on GitHub
- README with setup instructions
- Demonstration video/notebook

### 2. Technical Report

- Problem statement and approach
- Baseline vs. optimized metrics
- Optimization journey and decisions
- Lessons learned

### 3. Performance Analysis

- Comprehensive benchmarks (see the timing harness below)
- Ablation study results
- Resource utilization graphs
- Comparison with PyTorch/TensorFlow
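For trustworthy benchmark numbers, report the median of many timed runs after a few warmup iterations rather than a single measurement. A minimal harness using only the standard library (`model.forward` and `batch` below are placeholders):

```python
import statistics
import time

def benchmark(fn, *args, warmup=3, runs=20):
    """Median wall-clock time per call; warmup runs absorb one-time costs."""
    for _ in range(warmup):
        fn(*args)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# e.g.: print(f"{1000 * benchmark(model.forward, batch):.2f} ms per batch")
```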
## Evaluation Criteria

### Technical Excellence (40%)

- Correctness of implementation
- Quality of optimizations
- Code organization and style

### Performance Achievement (30%)

- Meeting stated goals
- Improvement over baseline
- Resource efficiency

### Systems Understanding (30%)

- Appropriate optimization choices
- Understanding of tradeoffs
- Scientific methodology
## Example Projects from Past Students

### "TinyYOLO" - Real-Time Object Detection

- 30 FPS on a Raspberry Pi
- 90% size reduction through pruning
- Custom INT8 kernels for ARM

### "NanoGPT" - Edge Language Model

- 100MB model generates Shakespeare
- KV caching + quantization
- Runs on a 2015 laptop

### "SwiftCNN" - Instant Image Classification

- <1ms inference on an iPhone
- Structured pruning + iOS Metal
- 95% of ResNet accuracy at 10% of the size
## Resources

- All previous module code
- TinyTorch optimization library
- Benchmarking tools
- Community Discord for help

## Success Criteria

- ✅ Complete working system with all optimizations
- ✅ 10x+ improvement in speed OR memory
- ✅ Professional documentation and analysis
- ✅ Understanding of when/why to apply each optimization
- ✅ Ready for ML systems engineering roles!
## Final Note

This is your chance to show everything you've learned. Build something you're proud of - something that demonstrates not just that you can implement ML algorithms, but that you understand how to build production ML systems.

Remember: the goal isn't perfection, it's demonstrating systematic thinking about performance, memory, and deployment constraints - the real challenges of ML engineering.