TinyTorch Complete Module Roadmap
20-Module ML Systems Course with Competition System
PHASE 1: FOUNDATION (Modules 1-6)
Build the core mathematical infrastructure for neural networks.
- Module 01: `setup` - Development environment configuration
- Module 02: `tensor` - Core data structures with autodiff support (backward design: built-in grad support)
- Module 03: `activations` - ReLU, Sigmoid, and other nonlinearities
- Module 04: `layers` - Dense layers, network building blocks
- Module 05: `losses` - MSE, CrossEntropy, and BCE loss functions
- Module 06: `autograd` - Automatic differentiation engine
Capability Unlocked: Networks can learn through backpropagation.
Historical Example: XOR Problem (1969) - solve the problem that stumped AI for a decade.
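To make the capability concrete, here is a hand-rolled XOR network in plain NumPy (an illustrative sketch, not the TinyTorch API built in Modules 2-6). The backward pass is derived by hand - exactly the bookkeeping that Module 6's autograd engine automates:

```python
# Two-layer network learning XOR via hand-derived backprop (illustrative).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer

for step in range(10_000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid output
    # Backward pass: BCE gradients, derived by hand
    dz2 = (p - y) / len(X)                      # dLoss/d(output logits)
    dW2, db2 = h.T @ dz2, dz2.sum(0)
    dh = dz2 @ W2.T * (1 - h**2)                # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad                     # plain SGD step

print(p.round().ravel())  # typically converges to [0. 1. 1. 0.]
```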
PHASE 2: TRAINING SYSTEMS (Modules 7-10)
Build complete training pipelines for real datasets.
- Module 07: `dataloader` - Data pipelines, batching, real datasets (moved from 09)
- Module 08: `optimizers` - SGD and Adam optimization algorithms
- Module 09: `spatial` - Conv2D and pooling for image processing (moved from 07)
- Module 10: `training` - Complete training loops with validation
Capability Unlocked: Train deep networks on real datasets.
Historical Examples:
- After Module 9: LeNet (1998) - First CNN for digit recognition
- After Module 10: AlexNet (2012) - Deep learning revolution
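In miniature, the Phase 2 pipeline looks like this: shuffling, batching, forward/backward, SGD updates, and validation, shown as runnable NumPy logistic regression. (The real modules wrap these steps in DataLoader/Optimizer/Trainer-style classes; this sketch is illustrative, not the course API.)

```python
# A miniature of the Module 7-10 training pipeline on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                    # stand-in "dataset"
y = (X @ rng.normal(size=20) > 0).astype(float)    # separable labels

w, b, lr, batch_size = np.zeros(20), 0.0, 0.1, 32
for epoch in range(5):
    order = rng.permutation(len(X))                # Module 07: shuffle
    for start in range(0, len(X), batch_size):     # Module 07: batch
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        p = 1 / (1 + np.exp(-(xb @ w + b)))        # forward pass
        dz = (p - yb) / len(xb)                    # BCE gradient (Modules 05-06)
        w -= lr * (xb.T @ dz)                      # Module 08: SGD update
        b -= lr * dz.sum()
    preds = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5
    print(f"epoch {epoch}: accuracy {(preds == y).mean():.3f}")  # Module 10: validation
```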
PHASE 3: LANGUAGE MODELS (Modules 11-14)
Build modern transformer architectures for NLP.
- Module 11: `tokenization` - Text preprocessing and tokenization
- Module 12: `embeddings` - Word vectors, positional encoding
- Module 13: `attention` - Self-attention mechanisms
- Module 14: `transformers` - Complete transformer architecture
Capability Unlocked: Build GPT-style language models.
Historical Example: GPT (2018) - the foundation of modern AI.
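The core computation behind Modules 13-14, sketched as a single attention head in plain NumPy (learned Q/K/V projections and causal masking omitted for brevity):

```python
# Scaled dot-product self-attention, single head (illustrative sketch).
import numpy as np

def self_attention(X):
    """X: (seq_len, d_model). The full modules add learned Wq/Wk/Wv projections."""
    Q = K = V = X
    scores = Q @ K.T / np.sqrt(X.shape[-1])        # scaled similarity
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over keys
    return weights @ V                             # attention-weighted values

out = self_attention(np.random.default_rng(0).normal(size=(5, 8)))
print(out.shape)  # (5, 8)
```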
PHASE 4: SYSTEM OPTIMIZATION (Modules 15-19)
Transform educational code into production-ready systems through progressive optimization.
- Module 15: `acceleration` - Core performance optimization
  - Journey from educational loops to optimized operations
  - Cache-friendly blocking for matrix multiplication (sketched below)
  - NumPy vectorization (10-100x speedups)
  - Transparent backend dispatch (existing code runs faster automatically!)
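A sketch of that journey - the same matrix multiply written three ways (block size and shapes are illustrative):

```python
# Educational loops -> cache-friendly blocking -> vectorized NumPy.
import numpy as np

def matmul_naive(A, B):
    """Triple loop: the 'educational' version from earlier modules."""
    M, K = A.shape
    N = B.shape[1]
    C = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i, j] += A[i, k] * B[k, j]
    return C

def matmul_blocked(A, B, bs=32):
    """Cache-friendly blocking: operate on bs x bs tiles that stay in cache."""
    M, K = A.shape
    N = B.shape[1]
    C = np.zeros((M, N))
    for i in range(0, M, bs):
        for j in range(0, N, bs):
            for k in range(0, K, bs):
                C[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]
    return C

A, B = np.ones((64, 64)), np.ones((64, 64))
assert np.allclose(matmul_naive(A, B), A @ B)      # A @ B: the vectorized path
assert np.allclose(matmul_blocked(A, B), A @ B)
```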
- Module 16: `caching` - Memory optimization patterns
  - KV caching for transformer inference (sketched below)
  - Incremental computation techniques
  - Autoregressive generation optimization
  - Memory vs. computation tradeoffs
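A minimal sketch of the KV-caching idea (illustrative class, not the actual TinyTorch internals): keys and values for past tokens are stored once, so each decode step attends over the cached prefix instead of recomputing it:

```python
# KV cache for autoregressive decoding (illustrative sketch).
import numpy as np

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):               # one call per generated token
        self.keys.append(k)
        self.values.append(v)

    def attend(self, q):
        K = np.stack(self.keys)           # (t, d): cached, never recomputed
        V = np.stack(self.values)
        scores = K @ q / np.sqrt(len(q))
        w = np.exp(scores - scores.max())
        w /= w.sum()                      # softmax over cached positions
        return w @ V

cache, rng = KVCache(), np.random.default_rng(0)
for t in range(10):                       # one decode step per token
    x = rng.normal(size=8)                # stand-in for the new token's state
    cache.append(x, x)                    # real models project x to k and v
    out = cache.attend(x)                 # attends over the full cached prefix
```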
- Module 17: `precision` - Numerical optimization
  - Post-training INT8 quantization (example below)
  - Calibration and scaling techniques
  - Accuracy vs. performance tradeoffs
  - Memory footprint reduction
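A sketch of per-tensor post-training INT8 quantization (helper names are illustrative):

```python
# Map float32 weights to int8 with one scale factor, then dequantize.
import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0                   # calibration: fit the range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
print(f"{w.nbytes} -> {q.nbytes} bytes (4x smaller), max error {err:.4f}")
```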
- Module 18: `compression` - Model size optimization
  - Magnitude-based pruning (example below)
  - Structured vs. unstructured sparsity
  - Knowledge distillation basics
  - Deployment optimization
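A sketch of unstructured magnitude-based pruning (helper name is illustrative):

```python
# Zero out the smallest-magnitude weights; keep the mask for deployment.
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    threshold = np.quantile(np.abs(w), sparsity)  # cutoff for the bottom 90%
    mask = np.abs(w) >= threshold
    return w * mask, mask

w = np.random.default_rng(0).normal(size=(512, 512))
pruned, mask = magnitude_prune(w)
print(f"kept {mask.mean():.1%} of weights")       # ~10.0%
```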
- Module 19: `benchmarking` - Performance analysis
  - Profiling and bottleneck identification
  - Memory usage analysis
  - Comparative benchmarking
  - Scientific performance measurement (sketched below)
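A sketch of the measurement discipline behind the module - warm up, repeat, and report a robust statistic instead of trusting a single timing (helper is illustrative):

```python
# Repeatable micro-benchmark with warmup and a median estimate.
import time
import numpy as np

def benchmark(fn, *args, warmup=3, repeats=20):
    for _ in range(warmup):                  # prime caches and allocators
        fn(*args)
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return np.median(times)                  # median resists outlier noise

A = np.random.default_rng(0).normal(size=(512, 512))
print(f"matmul median: {benchmark(lambda a: a @ a, A) * 1e3:.2f} ms")
```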
PHASE 5: CAPSTONE PROJECT (Module 20)
- Module 20: `capstone` - Complete ML system
  - Combine all optimization techniques
  - Build optimized end-to-end systems
  - Example projects:
    - Optimized CIFAR-10 trainer (75% accuracy, minimal resources)
    - Efficient GPT inference engine (memory-constrained)
    - Custom optimization challenge
  - Deploy production-ready ML systems
Key Design Principles
1. Backward Design Philosophy
Each module is designed with future needs in mind:
- Tensors (Module 2): Built with gradient support from day 1 (sketched after this list)
- Layers (Module 4): Parameter management ready for optimizers
- Training (Module 10): Memory tracking for optimization modules
- Transformers (Module 14): KV structure ready for caching
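For example, the Module 2 Tensor can reserve its gradient fields from day one so that Module 6's autograd and Module 8's optimizers slot in without redesign (an illustrative sketch, not the actual TinyTorch class):

```python
# A Tensor designed "backward": gradient hooks exist before autograd does.
import numpy as np

class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float32)
        self.requires_grad = requires_grad
        self.grad = None           # populated by Module 6's backward pass
        self._backward_fn = None   # hook for the autograd graph

t = Tensor([[1.0, 2.0]], requires_grad=True)
```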
2. Backend Dispatch Architecture
```python
# Students run the SAME code throughout
model.train()  # uses the appropriate backend automatically

# Modules 1-14: naive backend (for learning)
# Module 15+:  optimized backend (for performance)
# Zero code changes needed!
```
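One way such dispatch can work - a registry of operation implementations that Module 15 swaps out while student-facing calls stay identical (an illustrative sketch, not the actual TinyTorch internals):

```python
# Backend dispatch via a mutable operation registry (illustrative sketch).
import numpy as np

_BACKEND = {
    "matmul": lambda a, b: [[sum(x * y for x, y in zip(row, col))
                             for col in zip(*b)] for row in a],  # naive loops
}

def use_optimized_backend():
    _BACKEND["matmul"] = lambda a, b: (np.asarray(a) @ np.asarray(b)).tolist()

def matmul(a, b):
    return _BACKEND["matmul"](a, b)       # student code calls this either way

print(matmul([[1, 2]], [[3], [4]]))       # naive backend -> [[11]]
use_optimized_backend()
print(matmul([[1, 2]], [[3], [4]]))       # same call, vectorized -> [[11]]
```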
3. Progressive Optimization Journey
- Understanding through implementation (Modules 1-14): Build with loops for clarity
- Systematic optimization (Modules 15-19): Transform loops into production code
- Transparent acceleration: Optimizations work automatically on existing code
- Real-world techniques: Learn optimizations used in PyTorch/TensorFlow
4. Historical Context
Examples map to ML breakthroughs:
- 1957: Perceptron (Module 4)
- 1969: XOR Solution (Module 6)
- 1998: LeNet (Module 9)
- 2012: AlexNet (Module 10)
- 2018: GPT (Module 14)
Learning Progression
Weeks 1-6: Foundation
Students build mathematical infrastructure and understand how neural networks work.
Weeks 7-10: Training Systems
Students build complete training pipelines and understand how to scale to real datasets.
Weeks 11-14: Modern AI
Students build transformer architectures that power ChatGPT and modern AI.
Weeks 15-19: System Optimization
Students transform educational code into production-ready systems through progressive optimization techniques.
Week 20: Capstone Project
Students combine all techniques to build complete, optimized ML systems from scratch.
Success Metrics
By completion, students will have:
- ✅ Built every component of modern ML systems from scratch
- ✅ Recreated the major breakthroughs in AI history
- ✅ Transformed educational loops into production-ready code (10-100x speedups)
- ✅ Understood why PyTorch and TensorFlow are designed the way they are
- ✅ Mastered real-world optimization techniques (caching, quantization, pruning)
- ✅ Built complete ML systems that transparently optimize themselves
Ultimate Goal: Students who can read PyTorch source code and think "I understand why they did it this way - I built this myself in TinyTorch!"