mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-05-01 17:22:34 -05:00

Files

Vijay Janapa Reddi 2f23f757e7 MAJOR: Implement beautiful module progression through strategic reordering

This commit implements the pedagogically optimal "inevitable discovery" module progression based on expert validation and educational design principles.

## Module Reordering Summary

**Previous Order (Problems)**:
- 05_losses → 06_autograd → 07_dataloader → 08_optimizers → 09_spatial → 10_training
- Issues: Autograd before optimizers, DataLoader before training, scattered dependencies

**New Order (Beautiful Progression)**:
- 05_losses → 06_optimizers → 07_autograd → 08_training → 09_spatial → 10_dataloader
- Benefits: Each module creates inevitable need for the next

## Pedagogical Flow Achieved

**05_losses** → "Need systematic weight updates" → **06_optimizers**
**06_optimizers** → "Need automatic gradients" → **07_autograd**
**07_autograd** → "Need systematic training" → **08_training**
**08_training** → "MLPs hit limits on images" → **09_spatial**
**09_spatial** → "Training is too slow" → **10_dataloader**

## Technical Changes

### Module Directory Renaming
- `06_autograd` → `07_autograd`
- `07_dataloader` → `10_dataloader`
- `08_optimizers` → `06_optimizers`
- `10_training` → `08_training`
- `09_spatial` → `09_spatial` (no change)

### System Integration Updates
- **MODULE_TO_CHECKPOINT mapping**: Updated in tito/commands/export.py
- **Test directories**: Renamed module_XX directories to match new numbers
- **Documentation**: Updated all references in MD files and agent configurations
- **CLI integration**: Updated next-steps suggestions for proper flow

### Agent Configuration Updates
- **Quality Assurance**: Updated module audit status with new numbers
- **Module Developer**: Updated work tracking with new sequence
- **Documentation**: Updated MASTER_PLAN_OF_RECORD.md with beautiful progression

## Educational Benefits

1. **Inevitable Discovery**: Each module naturally leads to the next
2. **Cognitive Load**: Concepts introduced exactly when needed
3. **Motivation**: Students understand WHY each tool is necessary
4. **Synthesis**: Everything flows toward complete ML systems understanding
5. **Professional Alignment**: Matches real ML engineering workflows

## Quality Assurance

- ✅ All CLI commands still function
- ✅ Checkpoint system mappings updated
- ✅ Documentation consistency maintained
- ✅ Test directory structure aligned
- ✅ Agent configurations synchronized

**Impact**: This reordering transforms TinyTorch from a collection of modules into a coherent educational journey where each step naturally motivates the next, creating optimal conditions for deep learning systems understanding.

2025-09-24 15:56:47 -04:00

6.0 KiB

Raw Blame History

TinyTorch Complete Module Roadmap

20-Module ML Systems Course with Competition System

PHASE 1: FOUNDATION (Modules 1-6)

Build the core mathematical infrastructure for neural networks.

Module 01: setup - Development environment configuration
Module 02: tensor - Core data structures with autodiff support (backward design: built-in grad support)
Module 03: activations - ReLU, Sigmoid, nonlinearity functions
Module 04: layers - Dense layers, network building blocks
Module 05: losses - MSE, CrossEntropy, BCE loss functions
Module 06: autograd - Automatic differentiation engine

Capability Unlocked: Networks can learn through backpropagation Historical Example: XOR Problem (1969) - Solve what stumped AI for a decade

PHASE 2: TRAINING SYSTEMS (Modules 7-10)

Build complete training pipelines for real datasets.

Module 07: dataloader - Data pipelines, batching, real datasets (moved from 09)
Module 08: optimizers - SGD, Adam optimization algorithms
Module 09: spatial - Conv2D, pooling for image processing (moved from 07)
Module 10: training - Complete training loops with validation

Capability Unlocked: Train deep networks on real datasets Historical Examples:

After Module 9: LeNet (1998) - First CNN for digit recognition
After Module 10: AlexNet (2012) - Deep learning revolution

PHASE 3: LANGUAGE MODELS (Modules 11-14)

Build modern transformer architectures for NLP.

Module 11: tokenization - Text preprocessing and tokenization
Module 12: embeddings - Word vectors, positional encoding
Module 13: attention - Self-attention mechanisms
Module 14: transformers - Complete transformer architecture

Capability Unlocked: Build GPT-style language models Historical Example: GPT (2018) - Foundation of modern AI

PHASE 4: SYSTEM OPTIMIZATION (Modules 15-19)

Transform educational code into production-ready systems through progressive optimization.

Module 15: acceleration - Core performance optimization
- Journey from educational loops to optimized operations
- Cache-friendly blocking for matrix multiplication
- NumPy vectorization (10-100x speedups)
- Transparent backend dispatch (existing code runs faster automatically!)
Module 16: caching - Memory optimization patterns
- KV caching for transformer inference
- Incremental computation techniques
- Autoregressive generation optimization
- Memory vs computation tradeoffs
Module 17: precision - Numerical optimization
- Post-training INT8 quantization
- Calibration and scaling techniques
- Accuracy vs performance tradeoffs
- Memory footprint reduction
Module 18: compression - Model size optimization
- Magnitude-based pruning
- Structured vs unstructured sparsity
- Knowledge distillation basics
- Deployment optimization
Module 19: benchmarking - Performance analysis
- Profiling and bottleneck identification
- Memory usage analysis
- Comparative benchmarking
- Scientific performance measurement

PHASE 5: CAPSTONE PROJECT (Module 20)

Module 20: capstone - Complete ML system
- Combine all optimization techniques
- Build optimized end-to-end systems
- Example projects:
  - Optimized CIFAR-10 trainer (75% accuracy, minimal resources)
  - Efficient GPT inference engine (memory-constrained)
  - Custom optimization challenge
- Deploy production-ready ML systems

Key Design Principles

1. Backward Design Philosophy

Each module is designed with future needs in mind:

Tensors (Module 2): Built with gradient support from day 1
Layers (Module 4): Parameter management ready for optimizers
Training (Module 10): Memory tracking for optimization modules
Transformers (Module 14): KV structure ready for caching

2. Backend Dispatch Architecture

# Students run SAME code throughout
model.train()  # Uses appropriate backend automatically

# Module 1-14: Naive backend (for learning)
# Module 15+: Optimized backend (for performance)
# Zero code changes needed!

3. Progressive Optimization Journey

Understanding through implementation (Modules 1-14): Build with loops for clarity
Systematic optimization (Modules 15-19): Transform loops into production code
Transparent acceleration: Optimizations work automatically on existing code
Real-world techniques: Learn optimizations used in PyTorch/TensorFlow

4. Historical Context

Examples map to ML breakthroughs:

1957: Perceptron (Module 4)
1969: XOR Solution (Module 6)
1998: LeNet (Module 9)
2012: AlexNet (Module 10)
2018: GPT (Module 14)

Learning Progression

Weeks 1-6: Foundation

Students build mathematical infrastructure and understand how neural networks work.

Weeks 7-10: Training Systems

Students build complete training pipelines and understand how to scale to real datasets.

Weeks 11-14: Modern AI

Students build transformer architectures that power ChatGPT and modern AI.

Weeks 15-19: System Optimization

Students transform educational code into production-ready systems through progressive optimization techniques.

Week 20: Capstone Project

Students combine all techniques to build complete, optimized ML systems from scratch.

Success Metrics

By completion, students will have:

✅ Built every component of modern ML systems from scratch
✅ Recreated the major breakthroughs in AI history
✅ Transformed educational loops into production-ready code (10-100x speedups)
✅ Understood why PyTorch, TensorFlow are designed the way they are
✅ Mastered real-world optimization techniques (caching, quantization, pruning)
✅ Built complete ML systems that transparently optimize themselves

Ultimate Goal: Students who can read PyTorch source code and think "I understand why they did it this way - I built this myself in TinyTorch!"

6.0 KiB Raw Blame History