# TinyTorch

**Build ML Systems From First Principles**

![Python](https://img.shields.io/badge/python-3.8+-blue.svg) ![License](https://img.shields.io/badge/license-MIT-green.svg) [![Documentation](https://img.shields.io/badge/docs-jupyter_book-orange.svg)](https://mlsysbook.github.io/TinyTorch/) ![Status](https://img.shields.io/badge/status-active-success.svg)

A Harvard University course that teaches ML systems engineering by building a complete deep learning framework from scratch. From tensors to transformers, understand every line of code powering modern AI.

## What You'll Build

A **complete ML framework** capable of:

- Training neural networks on CIFAR-10 to 75%+ accuracy (reliably achievable!)
- Building GPT-style language models
- Implementing modern optimizers (Adam, learning rate scheduling)
- Production deployment with monitoring and MLOps

All built from scratch using only NumPy - no PyTorch, no TensorFlow!

## Quick Start

```bash
# Clone and setup
git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip install -e .

# Start learning
cd modules/source/01_setup
jupyter lab setup_dev.py

# Track progress
tito checkpoint status
```

## Learning Journey

### 20 Progressive Modules

#### Part I: Neural Network Foundations (Modules 1-8)

Build and train neural networks from scratch

| Module | Topic | What You Build | ML Systems Learning |
|--------|-------|----------------|-------------------|
| 01 | Setup | Development environment | CLI tools, dependency management, testing frameworks |
| 02 | Tensor | N-dimensional arrays + gradients | **Memory layout, cache efficiency**, broadcasting semantics |
| 03 | Activations | ReLU + Softmax + derivatives | **Numerical stability**, saturation analysis, gradient flow |
| 04 | Layers | Linear + Module + parameter management | **Parameter counting**, weight initialization, modularity patterns |
| 05 | Loss | MSE + CrossEntropy + gradient computation | **Numerical precision**, loss landscape analysis, convergence metrics |
| 06 | Autograd | Automatic differentiation engine | **Computational graphs**, memory management, gradient accumulation |
| 07 | Optimizers | SGD + Adam + learning schedules | **Memory efficiency** (Adam uses 3x SGD), convergence dynamics |
| 08 | Training | Complete training loops + evaluation | **Training dynamics**, checkpoint systems, performance monitoring |

**Milestone Achievement**: Train XOR solver and MNIST classifier after Module 8

---

#### Part II: Computer Vision (Modules 9-10)

Build CNNs that classify real images

| Module | Topic | What You Build | ML Systems Learning |
|--------|-------|----------------|-------------------|
| 09 | Spatial | Conv2d + MaxPool2d + CNN operations | **Parameter scaling** (filters × channels), spatial locality, convolution efficiency |
| 10 | DataLoader | Efficient data pipelines + CIFAR-10 | **Batch processing**, memory-mapped I/O, data pipeline bottlenecks |

**Milestone Achievement**: CIFAR-10 CNN with 75%+ accuracy

---

#### Part III: Language Models (Modules 11-14)

Build transformers that generate text

| Module | Topic | What You Build | ML Systems Learning |
|--------|-------|----------------|-------------------|
| 11 | Tokenization | Text processing + vocabulary | **Vocabulary scaling** (memory vs sequence length), tokenization bottlenecks |
| 12 | Embeddings | Token embeddings + positional encoding | **Embedding tables** (vocab × dim parameters), lookup performance |
| 13 | Attention | Multi-head attention mechanisms | **O(N²) scaling**, memory bottlenecks, attention optimization |
| 14 | Transformers | Complete transformer blocks | **Layer scaling**, memory requirements, architectural trade-offs |

**Milestone Achievement**: TinyGPT language generation

---

#### Part IV: System Optimization (Modules 15-20)

Profile, optimize, and benchmark ML systems

| Module | Topic | What You Build | ML Systems Learning |
|--------|-------|----------------|-------------------|
| 15 | Profiling | Performance analysis + bottleneck detection | **Memory profiling**, FLOP counting, **Amdahl's Law**, performance measurement |
| 16 | Acceleration | Hardware optimization + cache-friendly algorithms | **Cache hierarchies**, memory access patterns, **vectorization vs loops** |
| 17 | Quantization | Model compression + precision reduction | **Precision trade-offs** (FP32→INT8), memory reduction, accuracy preservation |
| 18 | Compression | Pruning + knowledge distillation | **Sparsity patterns**, parameter reduction, **compression ratios** |
| 19 | Caching | Memory optimization + KV caching | **Memory vs compute trade-offs**, cache management, generation efficiency |
| 20 | Benchmarking | **TinyMLPerf competition framework** | **Competitive optimization**, relative performance metrics, innovation scoring |

**Milestone Achievement**: TinyMLPerf optimization competition

---

## Learning Philosophy

**Most courses teach you to USE frameworks. TinyTorch teaches you to UNDERSTAND them.**

```python
# Traditional Course:
import torch
model.fit(X, y)  # Magic happens

# TinyTorch:
# You implement every component
# You measure memory usage
# You optimize performance
# You understand the systems
```

### Why Build Your Own Framework?

- **Deep Understanding** - Know exactly what `loss.backward()` does
- **Systems Thinking** - Understand memory, compute, and scaling
- **Debugging Skills** - Fix problems at any level of the stack
- **Production Ready** - Learn patterns used in real ML systems

## Key Features

### For Students

- **Interactive Demos**: Rich CLI visualizations for every concept
- **Checkpoint System**: Track your learning progress
- **Immediate Testing**: Validate your implementations instantly
- **Real Datasets**: Train on CIFAR-10, not toy examples

### For Instructors

- **NBGrader Integration**: Automated grading workflow
- **Progress Tracking**: Monitor student achievements
- **Jupyter Book**: Professional course website
- **Complete Solutions**: Reference implementations included

## Milestone Examples

As you complete modules, exciting examples unlock to show your framework in action:

### After Module 04: First Neural Network

```bash
cd examples/perceptron_1957
python rosenblatt_perceptron.py  # Build the first trainable neural network (1957)
```

### After Module 06: Multi-Layer Networks

```bash
cd examples/xor_1969
python minsky_xor_problem.py  # Solve the XOR problem with multi-layer networks (1969)
```

### After Module 08: Real Computer Vision

```bash
cd examples/mnist_mlp_1986
python train_mlp.py  # Achieve 95%+ accuracy on MNIST (1986)
```

### After Module 10: Modern CNNs

```bash
cd examples/cifar_cnn_modern
python train_cnn.py  # Achieve 75%+ accuracy on CIFAR-10
```

### After Module 14: Language Models

```bash
cd examples/gpt_2018
python train_gpt.py  # Generate text with your transformer implementation
```

### After Module 20: TinyMLPerf Competition

```bash
# Use TinyMLPerf to benchmark your optimizations
tito benchmark run --event mlp_sprint
tito benchmark run --event cnn_marathon
tito benchmark run --event transformer_decathlon
# Compete in ML systems optimization benchmarks
```

### After Module 20: Complete Optimization Suite

```bash
# Use TinyMLPerf to benchmark and optimize your complete framework
tito benchmark run --comprehensive
python examples/optimization_showcase.py  # Professional ML systems optimization
```

**These aren't toy demos** - they're real ML applications achieving solid results with YOUR framework built from scratch and optimized for performance!

## Testing & Validation

All demos and modules are thoroughly tested:

```bash
# Run comprehensive test suite (recommended)
tito test --comprehensive

# Run checkpoint tests
tito checkpoint test 01

# Test specific modules
tito test --module tensor

# Run all module tests
python tests/run_all_modules.py
```

- **20 modules** passing all tests with 100% health status
- **16 capability checkpoints** tracking learning progress
- **Complete optimization pipeline** from profiling to benchmarking
- **TinyMLPerf competition framework** for performance excellence
- **KISS principle design** for clear, maintainable code

## Documentation

- **[Course Website](https://mlsysbook.github.io/TinyTorch/)** - Complete interactive course
- **[Instructor Guide](docs/INSTRUCTOR_GUIDE.md)** - Teaching resources
- **[Student Quickstart](docs/STUDENT_QUICKSTART.md)** - Getting started guide
- **[CIFAR-10 Training Guide](docs/cifar10-training-guide.md)** - Detailed training walkthrough

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

MIT License - see [LICENSE](LICENSE) for details.

## Acknowledgments

Created by [Prof. Vijay Janapa Reddi](https://vijay.seas.harvard.edu) at Harvard University.

Special thanks to students and contributors who helped refine this educational framework.

---

**Start Small. Go Deep. Build ML Systems.**
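
As a small illustration of the kind of systems analysis listed for Module 07 (the "Adam uses 3x SGD" memory note), here is a minimal NumPy sketch. It is not TinyTorch code: the function name `optimizer_state_bytes` and the layer shapes are hypothetical, and it assumes plain SGD keeps no extra state while Adam keeps two moment buffers per parameter. Gradients and activations are not counted.

```python
import numpy as np

def optimizer_state_bytes(param_shapes, optimizer="sgd", dtype=np.float32):
    """Bytes held for parameters plus per-parameter optimizer state.

    Assumption for this sketch: plain SGD stores no extra buffers, while
    Adam stores two buffers (first and second moment estimates), each
    shaped like the parameter it tracks. Gradients are not counted.
    """
    extra_buffers = {"sgd": 0, "adam": 2}[optimizer]
    itemsize = np.dtype(dtype).itemsize
    n_params = sum(int(np.prod(shape)) for shape in param_shapes)
    return n_params * itemsize * (1 + extra_buffers)

# Hypothetical two-layer MLP (784 -> 128 -> 10): weights and biases.
shapes = [(784, 128), (128,), (128, 10), (10,)]

sgd_bytes = optimizer_state_bytes(shapes, "sgd")
adam_bytes = optimizer_state_bytes(shapes, "adam")

print(f"SGD:  {sgd_bytes / 1024:.1f} KiB")
print(f"Adam: {adam_bytes / 1024:.1f} KiB ({adam_bytes / sgd_bytes:.0f}x SGD)")
```

The ratio depends on what is counted: adding one gradient buffer per parameter to both optimizers would give 4 buffers against 2, so the 3x figure here reflects parameters plus optimizer state only.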