diff --git a/README.md b/README.md
index df32f302..67d75bb8 100644
--- a/README.md
+++ b/README.md
@@ -1,969 +1,175 @@
# TinyTorch 🔥
-**Build ML Systems From First Principles. From Computer Vision to Language Models.**
+**Build ML Systems From First Principles**
-
-
-[](https://github.com/MLSysBook/TinyTorch)
-[](https://python.org)
-[](https://mybinder.org/v2/gh/MLSysBook/TinyTorch/main)
-[](https://mlsysbook.github.io/TinyTorch/)
-[](https://harvard.edu)
+
+
+[](https://mlsysbook.github.io/TinyTorch/)
+
-๐ **[Interactive Course Website โ](https://mlsysbook.github.io/TinyTorch/)** | ๐ **[Instructor Resources โ](https://mlsysbook.github.io/TinyTorch/instructor-guide.html)**
+A Harvard University course that teaches ML systems engineering by building a complete deep learning framework from scratch. From tensors to transformers, understand every line of code powering modern AI.
----
+## 🎯 What You'll Build
-## โจ **New in v0.1**
+A **complete ML framework** capable of:
+- Training CNNs on CIFAR-10 to 75%+ accuracy
+- Building GPT-style language models
+- Implementing modern optimizers (Adam, learning rate scheduling)
+- Deploying to production with monitoring and MLOps
-### **๐ Professional Academic Design**
-- **Clean Typography**: Inter font family optimized for extended reading sessions
-- **Academic Styling**: Professional appearance matching top university courses
-- **Enhanced Readability**: Improved spacing, contrast, and visual hierarchy
-- **Responsive Design**: Beautiful on all devices from mobile to desktop
+All built from scratch using only NumPy - no PyTorch, no TensorFlow!
-### **๐ฏ Interactive ML Systems Learning**
-- **Interactive Questions**: Write 150-300 word reflections on ML systems design
-- **NBGrader Integration**: Automated assessment with instructor feedback
-- **Checkpoint System**: Track your capability progression through 16 checkpoints
-- **TinyGPT Module**: Build transformers showing 95% framework component reuse
-
-### **๐ ๏ธ Simplified Instructor Tools**
-- **`tito grade`**: Complete grading workflow wrapped in simple CLI commands
-- **Module Management**: Export, test, and validate all modules with one command
-- **Progress Tracking**: Visual checkpoint timeline showing student achievements
-- **Jupyter Book**: Professional course website with automatic chapter generation
-
----
-
-## ๐ **A Harvard University Course**
-
-Created by [Prof. Vijay Janapa Reddi](https://vijay.seas.harvard.edu) at Harvard University, TinyTorch provides Ivy League-quality education in ML systems engineering. This course represents the culmination of years of teaching experience and research in machine learning systems.
-
-## ๐ฌ **Why Build Your Own ML Framework?**
-
-**Most ML courses teach you to use frameworks. TinyTorch teaches you to understand them through systems engineering.**
-
-```python
-Algorithm-focused Course:        Systems-focused Course (TinyTorch):
-├── import torch                 ├── Memory analysis: Adam = 3× parameters
-├── model.fit(X, y)              ├── Cache efficiency: Why convolution ordering matters
-├── accuracy = 0.95              ├── Gradient bottlenecks: O(N²) attention scaling
-└── "It works!"                  ├── Production patterns: Checkpointing, monitoring
-                                 ├── Hardware implications: Vectorization, bandwidth
-                                 └── "I understand the entire system!" 💡
-```
-
-**You become the ML engineer who can:**
-- Debug performance bottlenecks in production systems
-- Optimize memory usage for large-scale deployments
-- Design custom operations for novel architectures
-- Understand exactly why certain ML engineering decisions were made
-
----
-
-## ๐ฏ **Your Learning Path: ML Systems Engineering**
-
-### **๐ฌ Phase 1: Core Systems (Modules 1-5)**
-- **Memory management**: How tensors use RAM, when copies happen
-- **Compute patterns**: Understanding operation complexity O(N²) vs O(N)
-- **Data structures**: Why certain tensor layouts enable vectorization
-- **Performance foundations**: Cache efficiency, memory bandwidth
-
-### **๐ง Phase 2: ML Systems Architecture (Modules 6-10)**
-- **Scaling analysis**: Why attention is O(N²) and how to handle it
-- **Memory profiling**: Adam optimizer uses 3× parameter memory - why?
-- **Computational graphs**: Memory vs speed tradeoffs in autograd
-- **Production patterns**: Gradient checkpointing, mixed precision
-
-### **โก Phase 3: Production Engineering (Modules 11-15)**
-- **Training systems**: Distributed computing, fault tolerance
-- **Optimization techniques**: Quantization, pruning, distillation
-- **Hardware acceleration**: Custom kernels, GPU utilization
-- **MLOps pipelines**: Monitoring, deployment, A/B testing
-
-### **๐ Phase 4: Language Models (Module 16 - TinyGPT)**
-- **Framework generalization**: Extend vision framework to language models
-- **Transformer architecture**: Multi-head attention, autoregressive generation
-- **Component reuse**: 95% framework reuse from vision to language
-- **ML Systems mastery**: Understand unified mathematical foundations
-
-### **๐ก Systems Thinking Through Implementation**
-**Every module teaches systems principles through building:**
-
-- **Module 2 (Tensors)**: Learn memory layout by implementing array operations
-- **Module 6 (Spatial)**: Understand cache performance through convolution
-- **Module 7 (Attention)**: Experience O(N²) scaling by building attention
-- **Module 9 (Autograd)**: See memory/compute tradeoffs in gradient computation
-- **Module 10 (Optimizers)**: Profile memory usage patterns in Adam vs SGD
-- **Module 13 (Kernels)**: Optimize operations for hardware characteristics
-
-**Result**: You don't just know ML algorithms - you understand ML _systems_.
-
----
-
-## ๐๏ธ What You'll Build
-
-**A Complete ML Systems Framework** โ Understanding through implementation:
-
-### **๐ง Core Systems Engineering**
-* **Memory-efficient tensor operations** with performance profiling
-* **Computational graph system** with automatic differentiation
-* **Training infrastructure** with checkpointing and fault tolerance
-* **Production monitoring** with performance bottleneck identification
-
-### **๐ Performance Analysis & Optimization**
-* **Memory profiling tools**: Understand exactly where your RAM goes
-* **Compute optimization**: Custom kernels, vectorization patterns
-* **Scaling analysis**: When operations become bottlenecks
-* **Hardware utilization**: Cache-friendly algorithms and memory patterns
-
-### **๐ Real-World ML Systems**
-* **Train CNNs on CIFAR-10** โ Achieve 75%+ accuracy with your own code
-* **Deploy production models** โ Complete MLOps pipeline with monitoring
-* **Handle large-scale data** โ Efficient DataLoader with memory management
-* **Build language models** โ Complete TinyGPT with character-level generation
-
----
-
-## ๐ Quick Start (2 minutes)
-
-### ๐ **Step 1: Setup & System Check**
+## Quick Start
```bash
-# Clone the repository
+# Clone and setup
git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch
-
-# Create virtual environment (recommended)
python -m venv .venv
-source .venv/bin/activate # On macOS/Linux
-# OR: .venv\Scripts\activate # On Windows
+source .venv/bin/activate # On Windows: .venv\Scripts\activate
+pip install -r requirements.txt
+pip install -e .
-# Install dependencies and TinyTorch
-pip install -r requirements.txt # Install all dependencies
-pip install -e . # Install TinyTorch in editable mode
-
-# Verify your setup
-tito system doctor # Comprehensive system check
-```
-
-### ๐ฏ **Step 2: Start with Module 0 - Introduction**
-
-```bash
-# Begin your TinyTorch journey with the system overview:
-cd modules/source/00_introduction
-jupyter lab introduction_dev.py # Interactive visualizations of the entire system!
-
-# What you'll explore:
-# - Complete system architecture visualization
-# - Module dependency graphs
-# - Optimal learning path through 17 modules
-# - Component relationships and complexity analysis
-```
-
-### ๐งโ๐ **Step 3: Continue to Module 1 - Setup**
-
-```bash
-# After understanding the system, start building:
-cd ../01_setup
-jupyter lab setup_dev.py # Your first implementation module
-
-# Complete the module with automatic testing:
-tito module complete 01_setup # Exports to package AND tests capabilities
-```
-
-### ๐ฏ **Step 4: Track Your Progress with Checkpoints**
-
-```bash
-# See your capability progression:
-tito checkpoint status # Current progress overview
-tito checkpoint timeline --horizontal # Visual progress timeline
-tito checkpoint test 00 # Test environment checkpoint
-
-# What you'll see:
-# ✅ 00: Environment - "Can I configure my TinyTorch development environment?"
-# 🎯 01: Foundation - "Can I create and manipulate the building blocks of ML?"
-# ⏳ 02: Intelligence - "Can I add nonlinearity - the key to neural network intelligence?"
-```
-
-### ๐ **Step 5: Access the Interactive Course Website**
-
-```bash
-# Build and view the Jupyter Book website locally
-tito book build # Generate the course website
-tito book serve # Launch local server (http://localhost:8000)
-
-# Or access online:
-# https://mlsysbook.github.io/TinyTorch/
-```
-
-### ๐ฉโ๐ซ **For Instructors**
-
-```bash
-# Grading workflow with NBGrader
-tito grade setup # Configure NBGrader
-tito grade assign 02_tensor # Create student version
-tito grade collect # Collect submissions
-tito grade autograde 02_tensor # Automatic grading
-tito grade feedback 02_tensor # Generate feedback
-
-# Module management
-tito module complete 01_setup # Export + test capability
-tito checkpoint status --detailed # Student progress overview
-
-# Book management
-tito book build # Build course website
-tito book deploy # Deploy to GitHub Pages
-```
-
----
-
-## ๐ **Connection to ML Systems Research**
-
-**Optional Enhancement**: TinyTorch concepts align with the Machine Learning Systems book:
-
-- **Chapter 5 (Data Engineering)**: Your DataLoader implementation (Module 8)
-- **Chapter 6 (Feature Engineering)**: Tensor operations and preprocessing (Module 2)
-- **Chapter 7 (Model Development)**: Layer architectures you build (Modules 4-7)
-- **Chapter 8 (Model Training)**: Training loops and optimization (Modules 10-11)
-- **Chapter 9 (Model Deployment)**: MLOps and production systems (Module 15)
-- **Chapter 11 (Continual Learning)**: Advanced techniques (Module 12)
-
-**TinyTorch provides the implementation foundation for understanding these concepts deeply.**
-
----
-
-## ๐ **Prerequisites & Learning Resources**
-
-### **Required Knowledge**
-- **Python Programming**: Intermediate level (classes, functions, NumPy basics)
-- **Linear Algebra**: Matrices, vectors, basic operations
-- **Calculus**: Derivatives and chain rule (for backpropagation)
-- **Basic ML Concepts**: What neural networks are (implementation not required)
-
-### **Recommended Resources**
-- ๐ **[Machine Learning Systems Book](https://mlsysbook.ai)** - Comprehensive ML systems context
-- ๐ฅ **Course Videos** - Coming soon on the course website
-- ๐ฌ **Community Discord** - Join discussions with other learners
-- ๐ **[NBGrader Documentation](https://nbgrader.readthedocs.io)** - For instructors using autograding
-
-### **Time Commitment**
-- **Complete Course**: 60-80 hours (full implementation + exercises)
-- **Quick Exploration**: 10-15 hours (understand core concepts)
-- **Module Pace**: 3-5 hours per module average
-
----
-
-## ๐ **Your Learning Journey: Vision to Language**
-
-**TinyTorch demonstrates that the same mathematical foundations power both computer vision AND language models.**
-
-### **Phase 1: Build ML Systems Foundation (Modules 1-15)**
-Complete the core TinyTorch framework to understand:
-- Memory management and compute optimization
-- Training systems and production deployment
-- Performance profiling and bottleneck identification
-- Hardware utilization and scaling patterns
-
-### **Phase 2: Extend to Language Models (Module 16 - TinyGPT)**
-Discover framework generalization by building:
-- **Character-level GPT**: 95% component reuse from your vision framework
-- **Multi-head attention**: The key architectural difference for sequences
-- **Autoregressive generation**: Coherent text production with your training system
-- **Framework thinking**: Understand why successful ML frameworks support multiple modalities
-
-### **The Power of Unified Foundations**
-```python
-# Your TinyTorch foundation works unchanged:
-from tinytorch.core.tensor import Tensor # Same tensors for vision + language
-from tinytorch.core.layers import Dense # Same dense layers for both domains
-from tinytorch.core.training import Trainer # Same training infrastructure
-from tinytorch.core.optimizers import Adam # Same optimization algorithms
-
-# TinyGPT adds minimal language-specific components:
-from tinytorch.tinygpt import CharTokenizer # Text preprocessing
-from tinytorch.tinygpt import MultiHeadAttention # Sequence attention
-from tinytorch.tinygpt import TinyGPT # Complete language model
-```
-
-**Result**: You understand that vision and language models are variations of the same mathematical framework.
-
----
-
-## ๐ **Repository Structure**
-
-```
-TinyTorch/
-├── modules/source/                  # 16 educational modules
-│   ├── 00_introduction/             # 🎯 Visual system overview & architecture
-│   │   ├── module.yaml              # Module metadata
-│   │   ├── README.md                # Getting started guide
-│   │   └── introduction_dev.py      # Interactive visualizations & dependency analysis
-│   ├── 01_setup/                    # Development environment setup
-│   │   ├── module.yaml              # Module metadata
-│   │   ├── README.md                # Learning objectives and guide
-│   │   └── setup_dev.py             # Implementation file
-│   ├── 02_tensor/                   # N-dimensional arrays
-│   │   ├── module.yaml
-│   │   ├── README.md
-│   │   └── tensor_dev.py
-│   ├── 03_activations/              # Neural network activation functions
-│   ├── 04_layers/                   # Dense layers and transformations
-│   ├── 05_dense/                    # Sequential networks and MLPs
-│   ├── 06_spatial/                  # Convolutional neural networks
-│   ├── 07_attention/                # Self-attention and transformer components
-│   ├── 08_dataloader/               # Data loading and preprocessing
-│   ├── 09_autograd/                 # Automatic differentiation
-│   ├── 10_optimizers/               # SGD, Adam, learning rate scheduling
-│   ├── 11_training/                 # Training loops and validation
-│   ├── 12_compression/              # Model optimization and compression
-│   ├── 13_kernels/                  # High-performance operations
-│   ├── 14_benchmarking/             # Performance analysis and profiling
-│   ├── 15_mlops/                    # Production monitoring and deployment
-│   └── 16_tinygpt/                  # 🔥 Complete language model implementation
-├── tinytorch/                       # Your built framework package
-│   ├── core/                        # Core implementations (exported from modules)
-│   │   ├── tensor.py                # Generated from 02_tensor
-│   │   ├── activations.py           # Generated from 03_activations
-│   │   ├── layers.py                # Generated from 04_layers
-│   │   ├── dense.py                 # Generated from 05_dense
-│   │   ├── spatial.py               # Generated from 06_spatial
-│   │   ├── attention.py             # Generated from 07_attention
-│   │   └── ...                      # All your implementations
-│   └── utils/                       # Shared utilities and tools
-├── book/                            # Interactive course website
-│   ├── _config.yml                  # Jupyter Book configuration
-│   ├── intro.md                     # Course introduction
-│   └── chapters/                    # Generated from module READMEs
-├── tito/                            # CLI tool for development workflow
-│   ├── commands/                    # Student and instructor commands
-│   │   ├── checkpoint.py            # 🎯 Checkpoint system with Rich progress tracking
-│   │   └── module.py                # Enhanced with tito module complete workflow
-│   └── tools/                       # Testing and build automation
-└── tests/                           # Integration tests
-    ├── checkpoints/                 # 🎯 16 capability checkpoint tests
-    │   ├── checkpoint_00_environment.py
-    │   ├── checkpoint_01_foundation.py
-    │   └── ...                      # Through checkpoint_15_capstone.py
-    └── test_checkpoint_integration.py  # Integration testing suite
-```
-
-**Module Progression (Start with Module 0!):**
-1. **๐ฏ Module 0: Introduction** - Begin here! Visual system overview and architecture exploration
-2. **Module 1: Setup** - Configure your development environment and workflow
-3. **Modules 2-15** - Build your computer vision ML framework progressively
-4. **๐ฅ Module 16: TinyGPT** - Extend your framework to language models!
-
-**Development Workflow:**
-1. **Develop in `modules/source/`** - Each module has a `*_dev.py` file where you implement components
-2. **Complete module** - Use `tito module complete` to export AND test capabilities automatically
-3. **Track progress** - Use `tito checkpoint status` to see your ML capabilities unlocked
-4. **Use your framework** - Import and use your own code: `from tinytorch.core.tensor import Tensor`
-5. **Celebrate achievements** - Get immediate feedback when you unlock new ML capabilities
-
-**Alternative Workflow:**
-1. **Traditional export** - Use `tito export` to build implementations into Python package
-2. **Manual testing** - Run `tito test` to verify implementations work correctly
-3. **Manual checkpoint testing** - Use `tito checkpoint test` for capability validation
-
----
-
-## ๐ Complete Course: 16 Modules (Start with Module 0!)
-
-**Module Progression:** Start with Module 0 (Introduction) โ Progress through Modules 1-15 โ Extend to language models with Module 16!
-
-**Difficulty Levels:** ๐ Overview โ โญ Beginner โ โญโญ Intermediate โ โญโญโญ Advanced โ โญโญโญโญ Expert โ ๐ฅ Language Models
-
-### **๐ Module 0: System Overview (START HERE!)**
-* **๐ฏ 00_introduction**: Interactive system architecture, dependency visualization, and learning roadmap
- - Understand the complete TinyTorch system before building
- - Explore module dependencies and optimal learning paths
- - Visualize how all 17 modules work together
-
-### **๐๏ธ Foundations** (Modules 01-05)
-* **01_setup**: Development environment and CLI tools
-* **02_tensor**: N-dimensional arrays and tensor operations
-* **03_activations**: ReLU, Sigmoid, Tanh, Softmax functions
-* **04_layers**: Dense layers and matrix operations
-* **05_dense**: Sequential networks and MLPs
-
-### **๐ง Deep Learning** (Modules 06-10)
-* **06_spatial**: Convolutional neural networks and image processing
-* **07_attention**: Self-attention and transformer components
-* **08_dataloader**: Data loading, batching, and preprocessing
-* **09_autograd**: Automatic differentiation and backpropagation
-* **10_optimizers**: SGD, Adam, and learning rate scheduling
-
-### **โก Systems & Production** (Modules 11-15)
-* **11_training**: Training loops, metrics, and validation
-* **12_compression**: Model pruning, quantization, and distillation
-* **13_kernels**: Performance optimization and custom operations
-* **14_benchmarking**: Profiling, testing, and performance analysis
-* **15_mlops**: Monitoring, deployment, and production systems
-
-### **๐ฅ Language Models** (Module 16)
-* **16_tinygpt**: Complete GPT-style transformer with character-level generation
-
-**Status**: All 16 modules complete with inline tests and educational content
-
----
-
-## ๐ **Complete System Integration**
-
-**This isn't 16 isolated assignments.** Every component you build integrates into one cohesive, fully functional ML framework that powers both vision AND language:
-
-**๐ฏ Explore the full system architecture visually in Module 00 before diving into implementation!**
-
-```mermaid
-flowchart TD
-    Z[00_introduction<br/>🎯 System Overview] --> A[01_setup<br/>Setup & Environment]
-    A --> B[02_tensor<br/>Core Tensor Operations]
-    B --> C[03_activations<br/>ReLU, Sigmoid, Tanh]
-    B --> I[09_autograd<br/>Automatic Differentiation]
-
-    C --> D[04_layers<br/>Dense Layers]
-    D --> E[05_dense<br/>Sequential Networks]
-
-    E --> F[06_spatial<br/>Convolutional Networks]
-    E --> G[07_attention<br/>Self-Attention]
-
-    B --> H[08_dataloader<br/>Data Loading]
-
-    I --> J[10_optimizers<br/>SGD & Adam]
-
-    H --> K[11_training<br/>Training Loops]
-    E --> K
-    F --> K
-    G --> K
-    J --> K
-
-    K --> L[12_compression<br/>Model Optimization]
-    K --> M[13_kernels<br/>High-Performance Ops]
-    K --> N[14_benchmarking<br/>Performance Analysis]
-    K --> O[15_mlops<br/>Production Monitoring]
-
-    L --> P[16_tinygpt<br/>🔥 Language Models]
- G --> P
- J --> P
- K --> P
-```
-
-### **๐ฏ How It All Connects**
-
-**Foundation (01-05):** Build your core data structures and basic operations
-**Deep Learning (06-10):** Add neural networks and automatic differentiation
-**Production (11-15):** Scale to real applications with training and production systems
-**Language Models (16):** Extend your vision framework to natural language processing
-
-**The Result:** A complete, working ML framework built entirely by you, capable of:
-- ✅ Training CNNs on CIFAR-10 with 90%+ accuracy
-- ✅ Implementing modern optimizers (Adam, learning rate scheduling)
-- ✅ Deploying compressed models with 75% size reduction
-- ✅ Production monitoring with comprehensive metrics
-- 🔥 **Generating coherent text with TinyGPT language models**
-
-### **๐ฅ TinyGPT: Framework Generalization**
-
-After completing the 15 core modules, you have a **complete computer vision framework**. Module 16 demonstrates the ultimate ML systems insight: the same foundation powers language models!
-
-**What You'll Discover:**
-- ๐ง **Component Reuse**: 95% of your vision framework works unchanged for language
-- ๐ **Mathematical Unity**: Dense layers, activations, and optimizers are universal
-- โก **Strategic Extensions**: Only attention mechanisms are truly language-specific
-- ๐ฏ **Framework Thinking**: Understand why successful ML frameworks support multiple modalities
-
-**The Achievement:** Build a complete GPT-style language model using **your TinyTorch implementation**. This demonstrates true understanding of unified ML foundations.
-
----
-
-## ๐ง ML Systems Learning Framework: Implement โ Profile โ Optimize
-
-### **Example: Understanding Memory in Adam Optimizer**
-
-**๐ง Implement:** Build Adam optimizer from paper
-```python
-class Adam:
- def __init__(self, lr=0.001):
- self.m = {} # First moment estimates
- self.v = {} # Second moment estimates
- # Why do we need TWO additional arrays per parameter?
-```
-
-**๐ Profile:** Measure actual memory usage
-```python
-from tinytorch.core.profiling import MemoryProfiler
-profiler = MemoryProfiler()
-with profiler.measure('adam_step'):
- optimizer.step()
-print(f"Memory: {profiler.peak_memory_mb} MB")  # 3× your model size!
-```
-
-**โก Optimize:** Understand the systems implications
-```python
-# Why does Adam use 3× memory?
-# Parameters: W (model weights)
-# + First moments: m (momentum)
-# + Second moments: v (adaptive learning rates)
-# = 3× parameter memory for Adam vs 1× for SGD
-
-# When does this become a production bottleneck?
-# How do systems like PyTorch handle this in practice?
-```
-
-**This pattern teaches both algorithms AND systems engineering** โ you understand not just "how" but "why" and "what are the tradeoffs."
-
----
-
-## ๐ ML Systems Philosophy
-
-### **Systems Understanding Through Implementation**
-* **Memory consciousness**: Every operation has memory implications you'll measure
-* **Performance awareness**: Understand computational complexity through profiling
-* **Production reality**: Real bottlenecks, real datasets, real scale challenges
-
-### **Engineering-First Mindset**
-* **Measure everything**: Performance profiling built into every module
-* **Optimize systematically**: Understand trade-offs before making decisions
-* **Scale considerations**: How do algorithms behave as data/models grow?
-
-### **Bridge Theory and Practice**
-* **Academic rigor**: Implement algorithms correctly from papers
-* **Engineering pragmatism**: Understand why certain design choices were made
-* **Production readiness**: Build systems that work at scale, not just in notebooks
-
----
-
-## ๐ Documentation
-
-### **Interactive Jupyter Book**
-- **Live Site**: https://mlsysbook.github.io/TinyTorch/
-- **Auto-updated** from source code on every release
-- **Complete course content** with executable examples
-- **Real implementation details** with solution code
-
-### **Development Workflow**
-- **`dev` branch**: Active development and experiments
-- **`main` branch**: Stable releases that trigger documentation deployment
-- **Inline testing**: Tests embedded directly in source modules
-- **Continuous integration**: Automatic building and deployment
-
----
-
-## ๐ ๏ธ Development Workflow
-
-### **Module Development**
-```bash
-# Work on dev branch
-git checkout dev
-
-# Edit source modules
-cd modules/source/02_tensor
-jupyter lab tensor_dev.py
-
-# Complete module with export and capability testing
-tito module complete 02_tensor # Exports + tests checkpoint_01_foundation
-
-# Check your progress
-tito checkpoint status # See capabilities unlocked
-tito checkpoint timeline --horizontal # Visual progress timeline
-
-# Alternative: Traditional workflow
-tito export 02_tensor # Export to package
-tito test 02_tensor # Test implementation
-tito checkpoint test 01 # Test specific checkpoint
-tito nbdev build # Build complete package
-```
-
-### **Release Process**
-```bash
-# Ready for release
-git checkout main
-git merge dev
-git push origin main # Triggers documentation deployment
-```
-
----
-
-## ๐ Project Structure
-
-```
-TinyTorch/
-├── modules/source/XX/               # 16 source modules with inline tests
-├── tinytorch/core/                  # Your exported ML framework
-├── tito/                            # CLI and course management tools
-├── book/                            # Jupyter Book source and config
-├── tests/                           # Integration tests
-└── docs/                            # Development guides and workflows
-```
-
----
-
-## ๐งช Tech Stack
-
-* **Python 3.8+** โ Modern Python with type hints
-* **NumPy** โ Numerical foundations
-* **Jupyter Lab** โ Interactive development
-* **Rich** โ Beautiful CLI output
-* **NBDev** โ Literate programming and packaging
-* **Jupyter Book** โ Interactive documentation
-* **GitHub Actions** โ Continuous integration and deployment
-
----
-
-## ✅ Verified Learning Outcomes
-
-Students who complete TinyTorch can:
-
-✅ **Build complete neural networks** from tensors to training loops
-✅ **Implement modern ML algorithms** (Adam, dropout, batch norm)
-✅ **Optimize performance** with profiling and custom kernels
-✅ **Deploy production systems** with monitoring and MLOps
-✅ **Debug and test** ML systems with proper engineering practices
-✅ **Understand trade-offs** between accuracy, speed, and resources
-
----
-
-## ๐โโ๏ธ Getting Started
-
-### **Option 1: Interactive Course**
-๐ **[Start Learning Now](https://mlsysbook.github.io/TinyTorch/)** โ Complete course in your browser
-
-### **Option 2: Local Development**
-```bash
-git clone https://github.com/mlsysbook/TinyTorch.git
-cd TinyTorch
-pip install -r requirements.txt # Install all dependencies (numpy, jupyter, pytest, etc.)
-pip install -e . # Install TinyTorch package in editable mode
-tito system doctor # Verify setup
-tito checkpoint status # See your capability progression
+# Start learning
cd modules/source/01_setup
-jupyter lab setup_dev.py # Start building
-tito module complete 01_setup # Complete with automatic testing
+jupyter lab setup_dev.py
+
+# Track progress
+tito checkpoint status
```
-### **Option 3: Instructor Setup**
-```bash
-# Clone and verify system
-git clone https://github.com/mlsysbook/TinyTorch.git
-cd TinyTorch
-tito system info
-tito checkpoint status --detailed # Student progress overview
+## Course Structure
-# Test module workflow with checkpoints
-tito module complete 01_setup # Export + test capabilities
-tito checkpoint test 00 # Test environment checkpoint
+### **16 Progressive Modules**
-# Traditional workflow (still available)
-tito export 01_setup && tito test 01_setup
-```
+| Module | Topic | What You Build |
+|--------|-------|----------------|
+| **Foundations** | | |
+| 01 | Setup | Development environment |
+| 02 | Tensors | N-dimensional arrays |
+| 03 | Activations | ReLU, Sigmoid, Softmax |
+| 04 | Layers | Dense layers |
+| 05 | Networks | Sequential models |
+| **Deep Learning** | | |
+| 06 | Spatial | CNNs for vision |
+| 07 | Attention | Transformers |
+| 08 | DataLoader | Efficient data pipelines |
+| 09 | Autograd | Automatic differentiation |
+| 10 | Optimizers | SGD, Adam |
+| **Production** | | |
+| 11 | Training | Complete training loops |
+| 12 | Compression | Model optimization |
+| 13 | Kernels | Performance optimization |
+| 14 | Benchmarking | Profiling tools |
+| 15 | MLOps | Production deployment |
+| **Language Models** | | |
+| 16 | TinyGPT | Complete GPT implementation |
----
+## Learning Philosophy
-**๐ฅ Ready to build your ML framework? Start with TinyTorch and understand every layer. _Start Small. Go Deep._**
-
----
-
-## ๐ฏ **North Star Achievement: Train Real CNNs on CIFAR-10**
-
-### **Your Semester Goal: 75%+ Accuracy on CIFAR-10**
-
-**What You'll Build:** A complete neural network training pipeline using 100% your own code - no PyTorch, no TensorFlow, just TinyTorch!
+**Most courses teach you to USE frameworks. TinyTorch teaches you to UNDERSTAND them.**
+
+```python
+# Traditional Course:
+import torch
+model.fit(X, y) # Magic happens
+
+# TinyTorch:
+# You implement every component
+# You measure memory usage
+# You optimize performance
+# You understand the systems
+```
+
+### Why Build Your Own Framework?
+
+✅ **Deep Understanding** - Know exactly what `loss.backward()` does
+✅ **Systems Thinking** - Understand memory, compute, and scaling
+✅ **Debugging Skills** - Fix problems at any level of the stack
+✅ **Production Ready** - Learn patterns used in real ML systems
+
+## 🛠️ Key Features
+
+### For Students
+- **Interactive Demos**: Rich CLI visualizations for every concept
+- **Checkpoint System**: Track your learning progress
+- **Immediate Testing**: Validate your implementations instantly
+- **Real Datasets**: Train on CIFAR-10, not toy examples
+
+### For Instructors
+- **NBGrader Integration**: Automated grading workflow
+- **Progress Tracking**: Monitor student achievements
+- **Jupyter Book**: Professional course website
+- **Complete Solutions**: Reference implementations included
+
+## Example: Train a CNN on CIFAR-10
```python
-# This is what you'll be able to do by semester end:
-from tinytorch.core.tensor import Tensor
from tinytorch.core.networks import Sequential
from tinytorch.core.layers import Dense
-from tinytorch.core.spatial import Conv2D
+from tinytorch.core.spatial import Conv2D
from tinytorch.core.activations import ReLU
-from tinytorch.core.dataloader import CIFAR10Dataset, DataLoader
-from tinytorch.core.training import Trainer, CrossEntropyLoss, Accuracy
+from tinytorch.core.dataloader import CIFAR10Dataset, DataLoader
+from tinytorch.core.training import Trainer, CrossEntropyLoss
from tinytorch.core.optimizers import Adam
-# Download real CIFAR-10 data (built-in support!)
-dataset = CIFAR10Dataset(download=True, flatten=False)
-train_loader = DataLoader(dataset.train_data, dataset.train_labels, batch_size=32)
-test_loader = DataLoader(dataset.test_data, dataset.test_labels, batch_size=32)
+# Load real data
+dataset = CIFAR10Dataset(download=True)
+train_loader = DataLoader(dataset.train_data, dataset.train_labels, batch_size=32)
-# Build your CNN architecture
+# Build CNN
model = Sequential([
Conv2D(3, 32, kernel_size=3),
ReLU(),
- Conv2D(32, 64, kernel_size=3),
+ Conv2D(32, 64, kernel_size=3),
ReLU(),
- Dense(64 * 28 * 28, 128),
- ReLU(),
- Dense(128, 10)
+ Dense(64*28*28, 10)
])
-# Train with automatic checkpointing
-trainer = Trainer(model, CrossEntropyLoss(), Adam(lr=0.001), [Accuracy()])
-history = trainer.fit(
- train_loader,
- val_dataloader=test_loader,
- epochs=30,
- save_best=True, # Automatically saves best model
- checkpoint_path='best_model.pkl'
-)
-
-# Evaluate your trained model
-from tinytorch.core.training import evaluate_model, plot_training_history
-results = evaluate_model(model, test_loader)
-print(f"๐ Test Accuracy: {results['accuracy']:.2%}") # Target: 75%+
-plot_training_history(history) # Visualize training curves
+# Train
+trainer = Trainer(model, loss=CrossEntropyLoss(), optimizer=Adam())
+trainer.fit(train_loader, epochs=30)
+# Target: 75%+ accuracy on CIFAR-10
```
-### **๐ Real-World Capabilities You'll Implement**
+## 🧪 Testing & Validation
-**Data Management:**
-- ✅ **CIFAR-10 Download**: Built-in `download_cifar10()` function
-- ✅ **Efficient Loading**: `CIFAR10Dataset` class with train/test splits
-- ✅ **Batch Processing**: DataLoader with shuffling and batching
+All demos and modules are thoroughly tested:
-**Training Infrastructure:**
-✅ **Model Checkpointing**: Save best models during training
-✅ **Early Stopping**: Stop when validation loss stops improving
-✅ **Progress Tracking**: Real-time metrics and loss visualization
+```bash
+# Test all demos
+python test_all_demos.py
-**Evaluation Tools:**
-✅ **Confusion Matrices**: `compute_confusion_matrix()` for error analysis
-✅ **Performance Metrics**: Accuracy, precision, recall computation
-✅ **Visualization**: `plot_training_history()` for learning curves
+# Validate implementations
+python validate_demos.py
-### **Progressive Milestones**
-
-1. **Module 8 (DataLoader)**: Load and visualize CIFAR-10 images
-2. **Module 11 (Training)**: Train simple models with checkpointing
-3. **Module 6 (Spatial)**: Add CNN layers for image processing
-4. **Module 10 (Optimizers)**: Use Adam for faster convergence
-5. **Final Goal**: Achieve 75%+ accuracy on CIFAR-10 test set!
-
-### **What This Means For You**
-
-By achieving this north star goal, you will have:
-- **Built a complete ML framework** capable of training real neural networks
-- **Implemented industry-standard features** like checkpointing and evaluation
-- **Trained on real data** not toy examples - actual CIFAR-10 images
-- **Achieved meaningful accuracy** competitive with early PyTorch implementations
-- **Deep understanding** of every component because you built it all
-
-This isn't just an academic exercise - you're building production-capable ML infrastructure from scratch!
-
----
-
-## **Frequently Asked Questions**
-
-
-"Why not just use PyTorch/TensorFlow? This seems like reinventing the wheel."
-
-
-
-> **You're right - for production, use PyTorch!** But consider:
->
-> **🤔 Deep Understanding Questions:**
-> - **Do you understand what `loss.backward()` actually does?** Most engineers don't.
-> - **Can you debug when gradients vanish?** You'll know why and how to fix it.
-> - **Could you optimize a custom operation?** You'll have built the primitives.
->
-> **💡 The Learning Analogy:**
-> Think of it like this: Pilots learn in small planes before flying 747s. You're learning the fundamentals that make you a **better PyTorch engineer**.
-
----
-
-
-
-⚡ "How is this different from online tutorials that build neural networks?"
-
-
-
-> **Most tutorials focus on isolated components** - a Colab here, a notebook there. TinyTorch builds a **fully integrated system**.
->
-> **🏗️ Systems Engineering Analogy:**
-> Think of building a **compiler** or **operating system**. You don't just implement a lexer or a scheduler - you build how **every component works together**. Each piece must integrate seamlessly with the whole.
->
-> **Component vs. System Approach:**
-> ```python
-> Component Approach:           Systems Approach (TinyTorch):
-> ├── Build a neural network    ├── Build a complete ML framework
-> ├── Jupyter notebook demos    ├── Full Python package with CLI
-> ├── Isolated examples         ├── Integrated: tensors → layers → training
-> └── "Here's how ReLU works"   ├── Production patterns: testing, profiling
->                               └── "Here's how EVERYTHING connects"
-> ```
->
-> **🎯 Key Insight:**
-> You learn **systems engineering**, not just individual algorithms. Like understanding how every part of a compiler interacts to turn code into executable programs.
-
----
-
-
-
-💡 "Can't I just read papers/books instead of implementing?"
-
-
-
-> **Reading vs. 🔧 Building:**
-> ```
-> Reading about neural networks:    Building neural networks:
-> ├── "I understand the theory"     ├── "Why are my gradients exploding?"
-> ├── "Backprop makes sense"        ├── "Oh, that's why we need gradient clipping"
-> ├── "Adam is better than SGD"     ├── "Now I see when each optimizer works"
-> └── Theoretical knowledge         └── Deep intuitive understanding
-> ```
->
-> **The Reality Check:**
-> Implementation **forces you to confront reality** - edge cases, numerical stability, memory management, performance trade-offs that papers gloss over.
-
----
-
-
-
-🤔 "Isn't everything a Transformer now? Why learn old architectures?"
-
-
-
-> **Great question!** Transformers are indeed dominant, but they're built on the same foundations you'll implement:
->
-> **🏗️ Transformer Building Blocks You'll Build:**
-> - **Attention is just matrix operations** - which you'll build from tensors
-> - **LayerNorm uses your activations and layers**
-> - **Adam optimizer powers Transformer training** - you'll implement it
-> - **Multi-head attention = your Linear layers + reshaping**
->
-> **🎯 The Strategic Reality:**
-> Understanding foundations makes you the engineer who can **optimize Transformers**, not just use them. Plus, CNNs still power computer vision, RNNs drive real-time systems, and new architectures emerge constantly.
-
----
-
-
-
-"I'm already good at ML. Is this too basic for me?"
-
-
-
-> **🧪 Challenge Test - Can You:**
-> - **Implement Adam optimizer from the paper?** (Not just use `torch.optim.Adam`)
-> - **Explain why ReLU causes dying neurons** and how to fix it?
-> - **Debug a 50% accuracy drop** after model deployment?
->
-> **💪 Why Advanced Engineers Love TinyTorch:**
-> It fills the **"implementation gap"** that most ML education skips. You'll go from understanding concepts to implementing production systems.
-
----
-
-
-
-🧪 "Is this academic or practical?"
-
-
-
-> **Both!** TinyTorch bridges academic understanding with engineering reality:
->
-> **Academic Rigor:**
-> - Mathematical foundations implemented correctly
-> - Proper testing and validation methodologies
-> - Research-quality implementations you can trust
->
-> **⚙️ Engineering Practicality:**
-> - Production-style code organization and CLI tools
-> - Performance considerations and optimization techniques
-> - Real datasets, realistic scale, professional development workflow
-
----
-
-
-
-⏰ "How much time does this take?"
-
-
-
-> **Time Investment:** ~40-60 hours for complete framework
->
-> **🎯 Flexible Learning Paths:**
-> - **Quick exploration:** 1-2 modules to understand the approach
-> - **Focused learning:** Core modules (01-10) for solid foundations
-> - **Complete mastery:** All 16 modules for full framework expertise
->
-> **✨ Self-Paced Design:**
-> Each module is self-contained, so you can stop and start as needed.
-
----
-
-
-
-"What if I get stuck or confused?"
-
-
-
-> **🛡️ Built-in Support System:**
-> - **Progressive scaffolding:** Each step builds on the previous, with guided implementations
-> - **Comprehensive testing:** 200+ tests ensure your code works correctly
-> - **Rich documentation:** Visual explanations, real-world context, debugging tips
-> - **Professional error messages:** Helpful feedback when things go wrong
-> - **Modular design:** Skip ahead or go back without breaking your progress
->
-> **💡 Learning Philosophy:**
-> The course is designed to **guide you through complexity**, not leave you struggling alone.
-
----
-
-
-
-"What can I build after completing TinyTorch?"
-
-
-
-> **🏗️ Your Framework Becomes the Foundation For:**
-> - **Research projects:** Implement cutting-edge papers on solid foundations
-> - **Specialized systems:** Computer vision, NLP, robotics applications
-> - **Performance engineering:** GPU kernels, distributed training, quantization
-> - **Custom architectures:** New layer types, novel optimizers, experimental designs
->
-> **🎯 Ultimate Skill Unlock:**
-> You'll have the implementation skills to **turn any ML paper into working code**.
-
----
-
-
----
-
-## 🤝 **Contributing & Support**
-
-### **Getting Help**
-- **Course Website**: [mlsysbook.github.io/TinyTorch](https://mlsysbook.github.io/TinyTorch/)
-- 💬 **Issues**: [GitHub Issues](https://github.com/mlsysbook/TinyTorch/issues)
-- 📧 **Contact**: Course instructors through the website
-
-### **Contributing**
-We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for:
-- Code style guidelines
-- Testing requirements
-- Documentation standards
-- Pull request process
-
-### **Citation**
-If you use TinyTorch in your research or teaching, please cite:
-```bibtex
-@software{tinytorch2024,
- title = {TinyTorch: Build ML Systems From First Principles},
- author = {Reddi, Vijay Janapa},
- year = {2024},
- institution = {Harvard University},
- url = {https://github.com/mlsysbook/TinyTorch}
-}
+# Run checkpoint tests
+tito checkpoint test 01
```
-### **License**
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+✅ **100% test coverage** across 8 interactive demos
+✅ **48 validation checks** ensuring correctness
+✅ **16 capability checkpoints** tracking progress
-### **Acknowledgments**
-- Harvard University for supporting this educational initiative
-- Students who provided feedback and helped improve the course
-- The ML systems community for inspiration and guidance
+## Documentation
+
+- **[Course Website](https://mlsysbook.github.io/TinyTorch/)** - Complete interactive course
+- **[Instructor Guide](docs/INSTRUCTOR_GUIDE.md)** - Teaching resources
+- **[API Reference](https://mlsysbook.github.io/TinyTorch/api)** - Framework documentation
+
+## 🤝 Contributing
+
+We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+
+## License
+
+MIT License - see [LICENSE](LICENSE) for details.
+
+## Acknowledgments
+
+Created by [Prof. Vijay Janapa Reddi](https://vijay.seas.harvard.edu) at Harvard University.
+
+Special thanks to students and contributors who helped refine this educational framework.
---
-**Built with ❤️ at Harvard University**
-
-_Start Small. Go Deep. Build ML Systems._
+**Start Small. Go Deep. Build ML Systems.**
\ No newline at end of file