TinyTorch Complete Module Roadmap
20-Module ML Systems Course with Competition System
PHASE 1: FOUNDATION (Modules 1-6)
Build the core mathematical infrastructure for neural networks.
- Module 01: `setup` - Development environment configuration
- Module 02: `tensor` - Core data structures with autodiff support (backward design: built-in grad support)
- Module 03: `activations` - ReLU, Sigmoid, and other nonlinearities
- Module 04: `layers` - Dense layers, network building blocks
- Module 05: `losses` - MSE, CrossEntropy, and BCE loss functions
- Module 06: `autograd` - Automatic differentiation engine
Capability Unlocked: Networks can learn through backpropagation.
Historical Example: XOR Problem (1969) - solve the problem that stumped AI for a decade.
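To make the capability concrete, here is a hand-rolled XOR network in plain NumPy (an illustrative sketch, not the TinyTorch API built in Modules 2-6). The backward pass is derived by hand - exactly the bookkeeping that Module 6's autograd engine automates:

```python
# Two-layer network learning XOR via hand-derived backprop (illustrative).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer

for step in range(10_000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid output
    # Backward pass: BCE gradients, derived by hand
    dz2 = (p - y) / len(X)                      # dLoss/d(output logits)
    dW2, db2 = h.T @ dz2, dz2.sum(0)
    dh = dz2 @ W2.T * (1 - h**2)                # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad                     # plain SGD step

print(p.round().ravel())  # typically converges to [0. 1. 1. 0.]
```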
PHASE 2: TRAINING SYSTEMS (Modules 7-10)
Build complete training pipelines for real datasets.
- Module 07: `dataloader` - Data pipelines, batching, real datasets (moved from 09)
- Module 08: `optimizers` - SGD and Adam optimization algorithms
- Module 09: `spatial` - Conv2D and pooling for image processing (moved from 07)
- Module 10: `training` - Complete training loops with validation
Capability Unlocked: Train deep networks on real datasets.
Historical Examples:
- After Module 9: LeNet (1998) - First CNN for digit recognition
- After Module 10: AlexNet (2012) - Deep learning revolution
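In miniature, the Phase 2 pipeline looks like this: shuffling, batching, forward/backward, SGD updates, and validation, shown as runnable NumPy logistic regression. (The real modules wrap these steps in DataLoader/Optimizer/Trainer-style classes; this sketch is illustrative, not the course API.)

```python
# A miniature of the Module 7-10 training pipeline on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                    # stand-in "dataset"
y = (X @ rng.normal(size=20) > 0).astype(float)    # separable labels

w, b, lr, batch_size = np.zeros(20), 0.0, 0.1, 32
for epoch in range(5):
    order = rng.permutation(len(X))                # Module 07: shuffle
    for start in range(0, len(X), batch_size):     # Module 07: batch
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        p = 1 / (1 + np.exp(-(xb @ w + b)))        # forward pass
        dz = (p - yb) / len(xb)                    # BCE gradient (Modules 05-06)
        w -= lr * (xb.T @ dz)                      # Module 08: SGD update
        b -= lr * dz.sum()
    preds = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5
    print(f"epoch {epoch}: accuracy {(preds == y).mean():.3f}")  # Module 10: validation
```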
PHASE 3: LANGUAGE MODELS (Modules 11-14)
Build modern transformer architectures for NLP.
- Module 11: `tokenization` - Text preprocessing and tokenization
- Module 12: `embeddings` - Word vectors, positional encoding
- Module 13: `attention` - Self-attention mechanisms
- Module 14: `transformers` - Complete transformer architecture
Capability Unlocked: Build GPT-style language models.
Historical Example: GPT (2018) - the foundation of modern AI.
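The core computation behind Modules 13-14, sketched as a single attention head in plain NumPy (learned Q/K/V projections and causal masking omitted for brevity):

```python
# Scaled dot-product self-attention, single head (illustrative sketch).
import numpy as np

def self_attention(X):
    """X: (seq_len, d_model). The full modules add learned Wq/Wk/Wv projections."""
    Q = K = V = X
    scores = Q @ K.T / np.sqrt(X.shape[-1])        # scaled similarity
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over keys
    return weights @ V                             # attention-weighted values

out = self_attention(np.random.default_rng(0).normal(size=(5, 8)))
print(out.shape)  # (5, 8)
```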
PHASE 4: SYSTEM OPTIMIZATION (Modules 15-19)
Transform educational code into production-ready systems through progressive optimization.
- Module 15: `acceleration` - Core performance optimization
  - Journey from educational loops to optimized operations
  - Cache-friendly blocking for matrix multiplication (sketched below)
  - NumPy vectorization (10-100x speedups)
  - Transparent backend dispatch (existing code runs faster automatically!)
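A sketch of that journey - the same matrix multiply written three ways (block size and shapes are illustrative):

```python
# Educational loops -> cache-friendly blocking -> vectorized NumPy.
import numpy as np

def matmul_naive(A, B):
    """Triple loop: the 'educational' version from earlier modules."""
    M, K = A.shape
    N = B.shape[1]
    C = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i, j] += A[i, k] * B[k, j]
    return C

def matmul_blocked(A, B, bs=32):
    """Cache-friendly blocking: operate on bs x bs tiles that stay in cache."""
    M, K = A.shape
    N = B.shape[1]
    C = np.zeros((M, N))
    for i in range(0, M, bs):
        for j in range(0, N, bs):
            for k in range(0, K, bs):
                C[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]
    return C

A, B = np.ones((64, 64)), np.ones((64, 64))
assert np.allclose(matmul_naive(A, B), A @ B)      # A @ B: the vectorized path
assert np.allclose(matmul_blocked(A, B), A @ B)
```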
- Module 16: `caching` - Memory optimization patterns
  - KV caching for transformer inference (sketched below)
  - Incremental computation techniques
  - Autoregressive generation optimization
  - Memory vs. computation tradeoffs
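A minimal sketch of the KV-caching idea (illustrative class, not the actual TinyTorch internals): keys and values for past tokens are stored once, so each decode step attends over the cached prefix instead of recomputing it:

```python
# KV cache for autoregressive decoding (illustrative sketch).
import numpy as np

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):               # one call per generated token
        self.keys.append(k)
        self.values.append(v)

    def attend(self, q):
        K = np.stack(self.keys)           # (t, d): cached, never recomputed
        V = np.stack(self.values)
        scores = K @ q / np.sqrt(len(q))
        w = np.exp(scores - scores.max())
        w /= w.sum()                      # softmax over cached positions
        return w @ V

cache, rng = KVCache(), np.random.default_rng(0)
for t in range(10):                       # one decode step per token
    x = rng.normal(size=8)                # stand-in for the new token's state
    cache.append(x, x)                    # real models project x to k and v
    out = cache.attend(x)                 # attends over the full cached prefix
```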
- Module 17: `precision` - Numerical optimization
  - Post-training INT8 quantization (example below)
  - Calibration and scaling techniques
  - Accuracy vs. performance tradeoffs
  - Memory footprint reduction
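A sketch of per-tensor post-training INT8 quantization (helper names are illustrative):

```python
# Map float32 weights to int8 with one scale factor, then dequantize.
import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0                   # calibration: fit the range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
print(f"{w.nbytes} -> {q.nbytes} bytes (4x smaller), max error {err:.4f}")
```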
- Module 18: `compression` - Model size optimization
  - Magnitude-based pruning (example below)
  - Structured vs. unstructured sparsity
  - Knowledge distillation basics
  - Deployment optimization
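A sketch of unstructured magnitude-based pruning (helper name is illustrative):

```python
# Zero out the smallest-magnitude weights; keep the mask for deployment.
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    threshold = np.quantile(np.abs(w), sparsity)  # cutoff for the bottom 90%
    mask = np.abs(w) >= threshold
    return w * mask, mask

w = np.random.default_rng(0).normal(size=(512, 512))
pruned, mask = magnitude_prune(w)
print(f"kept {mask.mean():.1%} of weights")       # ~10.0%
```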
- Module 19: `benchmarking` - Performance analysis
  - Profiling and bottleneck identification
  - Memory usage analysis
  - Comparative benchmarking
  - Scientific performance measurement (sketched below)
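A sketch of the measurement discipline behind the module - warm up, repeat, and report a robust statistic instead of trusting a single timing (helper is illustrative):

```python
# Repeatable micro-benchmark with warmup and a median estimate.
import time
import numpy as np

def benchmark(fn, *args, warmup=3, repeats=20):
    for _ in range(warmup):                  # prime caches and allocators
        fn(*args)
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return np.median(times)                  # median resists outlier noise

A = np.random.default_rng(0).normal(size=(512, 512))
print(f"matmul median: {benchmark(lambda a: a @ a, A) * 1e3:.2f} ms")
```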
PHASE 5: CAPSTONE PROJECT (Module 20)
- Module 20: `capstone` - Complete ML system
  - Combine all optimization techniques
  - Build optimized end-to-end systems
  - Example projects:
    - Optimized CIFAR-10 trainer (75% accuracy, minimal resources)
    - Efficient GPT inference engine (memory-constrained)
    - Custom optimization challenge
  - Deploy production-ready ML systems
Key Design Principles
1. Backward Design Philosophy
Each module is designed with future needs in mind:
- Tensors (Module 2): Built with gradient support from day 1 (sketched after this list)
- Layers (Module 4): Parameter management ready for optimizers
- Training (Module 10): Memory tracking for optimization modules
- Transformers (Module 14): KV structure ready for caching
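For example, the Module 2 Tensor can reserve its gradient fields from day one so that Module 6's autograd and Module 8's optimizers slot in without redesign (an illustrative sketch, not the actual TinyTorch class):

```python
# A Tensor designed "backward": gradient hooks exist before autograd does.
import numpy as np

class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float32)
        self.requires_grad = requires_grad
        self.grad = None           # populated by Module 6's backward pass
        self._backward_fn = None   # hook for the autograd graph

t = Tensor([[1.0, 2.0]], requires_grad=True)
```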
2. Backend Dispatch Architecture
```python
# Students run the SAME code throughout
model.train()  # uses the appropriate backend automatically

# Modules 1-14: naive backend (for learning)
# Module 15+:  optimized backend (for performance)
# Zero code changes needed!
```
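One way such dispatch can work - a registry of operation implementations that Module 15 swaps out while student-facing calls stay identical (an illustrative sketch, not the actual TinyTorch internals):

```python
# Backend dispatch via a mutable operation registry (illustrative sketch).
import numpy as np

_BACKEND = {
    "matmul": lambda a, b: [[sum(x * y for x, y in zip(row, col))
                             for col in zip(*b)] for row in a],  # naive loops
}

def use_optimized_backend():
    _BACKEND["matmul"] = lambda a, b: (np.asarray(a) @ np.asarray(b)).tolist()

def matmul(a, b):
    return _BACKEND["matmul"](a, b)       # student code calls this either way

print(matmul([[1, 2]], [[3], [4]]))       # naive backend -> [[11]]
use_optimized_backend()
print(matmul([[1, 2]], [[3], [4]]))       # same call, vectorized -> [[11]]
```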
3. Progressive Optimization Journey
- Understanding through implementation (Modules 1-14): Build with loops for clarity
- Systematic optimization (Modules 15-19): Transform loops into production code
- Transparent acceleration: Optimizations work automatically on existing code
- Real-world techniques: Learn optimizations used in PyTorch/TensorFlow
4. Historical Context
Examples map to ML breakthroughs:
- 1957: Perceptron (Module 4)
- 1969: XOR Solution (Module 6)
- 1998: LeNet (Module 9)
- 2012: AlexNet (Module 10)
- 2018: GPT (Module 14)
Learning Progression
Weeks 1-6: Foundation
Students build mathematical infrastructure and understand how neural networks work.
Weeks 7-10: Training Systems
Students build complete training pipelines and understand how to scale to real datasets.
Weeks 11-14: Modern AI
Students build transformer architectures that power ChatGPT and modern AI.
Weeks 15-19: System Optimization
Students transform educational code into production-ready systems through progressive optimization techniques.
Week 20: Capstone Project
Students combine all techniques to build complete, optimized ML systems from scratch.
Success Metrics
By completion, students will have:
- ✅ Built every component of modern ML systems from scratch
- ✅ Recreated the major breakthroughs in AI history
- ✅ Transformed educational loops into production-ready code (10-100x speedups)
- ✅ Understood why PyTorch and TensorFlow are designed the way they are
- ✅ Mastered real-world optimization techniques (caching, quantization, pruning)
- ✅ Built complete ML systems that transparently optimize themselves
Ultimate Goal: Students who can read PyTorch source code and think "I understand why they did it this way - I built this myself in TinyTorch!"