# TinyTorch 15-Module Structure
## Three-Part Journey: MLPs → CNNs → Transformers
### Part I: Multi-Layer Perceptrons (Modules 1-5)
**Goal**: Build neural networks that can solve XOR

| Module | Topic | What You Build |
|--------|-------|----------------|
| 01 | Setup | Development environment |
| 02 | Tensors | N-dimensional arrays |
| 03 | Activations | ReLU, Sigmoid, Softmax |
| 04 | Layers | Dense layers |
| 05 | Networks | Sequential models |

**Capstone**: XORNet - Proves neural networks can learn non-linear functions
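
To make the capstone concrete, here is a minimal NumPy sketch of a two-layer MLP learning XOR. It mirrors the Dense / ReLU / Sigmoid pieces built in Modules 02-05, but the names and shapes are illustrative rather than the actual TinyTorch API:

```python
# Minimal NumPy stand-in for the Part I capstone: Dense -> ReLU -> Dense -> Sigmoid on XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # Dense(2 -> 8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # Dense(8 -> 1)

lr = 0.5
for step in range(2000):
    h = np.maximum(0, X @ W1 + b1)               # ReLU hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))         # Sigmoid output
    d_logits = (p - y) / len(X)                  # BCE gradient w.r.t. pre-sigmoid logits
    dW2, db2 = h.T @ d_logits, d_logits.sum(0)
    d_h = (d_logits @ W2.T) * (h > 0)            # backprop through ReLU
    dW1, db1 = X.T @ d_h, d_h.sum(0)
    W1 -= lr * dW1                               # plain SGD updates
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(np.round(p.ravel(), 2))  # should approach [0, 1, 1, 0]
```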
---
### Part II: Convolutional Neural Networks (Modules 6-10)
**Goal**: Build CNNs for image classification

| Module | Topic | What You Build |
|--------|-------|----------------|
| 06 | Spatial | Conv2D, MaxPool2D |
| 07 | DataLoader | Efficient data pipelines |
| 08 | Autograd | Automatic differentiation |
| 09 | Optimizers | SGD, Adam |
| 10 | Training | Complete training loops |

**Capstone**: CIFAR-10 with three approaches:

1. **Random Baseline**: ~10% accuracy (chance)
2. **MLP Approach**: ~55% accuracy (no convolutions)
3. **CNN Approach**: ~60%+ accuracy (WITH Conv2D!)

This progression shows WHY convolutions matter for vision!
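
The jump from the MLP to the CNN comes entirely from the spatial ops introduced in Module 06. As a rough illustration (not the TinyTorch API), here is what a Conv2D and MaxPool2D forward pass compute, in plain NumPy with no padding or stride options:

```python
# Naive Conv2D + MaxPool2D forward pass for a single (C, H, W) image.
import numpy as np

def conv2d(x, w, b):
    """x: (C_in, H, W), w: (C_out, C_in, K, K), b: (C_out,) -> (C_out, H-K+1, W-K+1)."""
    c_out, c_in, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]                    # local receptive field
            out[:, i, j] = (w * patch).sum(axis=(1, 2, 3)) + b
    return out

def maxpool2d(x, k=2):
    """Non-overlapping k x k max pooling over (C, H, W)."""
    c, h, w = x.shape
    x = x[:, : h // k * k, : w // k * k]                      # drop ragged edges
    return x.reshape(c, h // k, k, w // k, k).max(axis=(2, 4))

rng = np.random.default_rng(0)
img = rng.normal(size=(3, 32, 32))                            # CIFAR-10-sized input
feat = maxpool2d(np.maximum(0, conv2d(img, rng.normal(size=(8, 3, 3, 3)), np.zeros(8))))
print(feat.shape)                                             # (8, 15, 15)
```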
---
### Part III: Transformers (Modules 11-15)
**Goal**: Build transformers for text generation

| Module | Topic | What You Build |
|--------|-------|----------------|
| 11 | Embeddings | Token & positional encoding |
| 12 | Attention | Multi-head attention |
| 13 | Normalization | LayerNorm for stable training |
| 14 | Transformers | Complete transformer blocks |
| 15 | Generation | Autoregressive decoding |

**Capstone**: TinyGPT - Character-level text generation
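
To preview what Modules 12-14 combine into, here is a minimal NumPy sketch of one pre-norm transformer block: LayerNorm, causally masked scaled dot-product attention (single-head for brevity, whereas Module 12 builds the multi-head version), and a small feed-forward network. Names and shapes are illustrative, not the TinyTorch API:

```python
# One pre-norm transformer block over a (seq_len, d_model) matrix of token embeddings.
import numpy as np

def layer_norm(x, eps=1e-5):
    mu, var = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])                   # scaled dot-product
    scores += np.triu(np.full(scores.shape, -1e9), k=1)       # causal mask: no peeking ahead
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                 # softmax over keys
    return weights @ v

def transformer_block(x, Wq, Wk, Wv, W1, W2):
    x = x + attention(layer_norm(x), Wq, Wk, Wv)              # residual + attention
    return x + np.maximum(0, layer_norm(x) @ W1) @ W2         # residual + feed-forward

rng = np.random.default_rng(0)
d, seq = 16, 5
x = rng.normal(size=(seq, d))                                 # 5 token embeddings
params = [0.1 * rng.normal(size=s) for s in [(d, d)] * 3 + [(d, 4 * d), (4 * d, d)]]
print(transformer_block(x, *params).shape)                    # (5, 16)
```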
---
## Why This Structure Works
### Pedagogical Excellence
- **Each part introduces ONE major innovation**:
  - Part I: Fully connected networks (the foundation)
  - Part II: Convolutions (spatial processing)
  - Part III: Attention (sequence processing)
### Historical Accuracy
- **Follows ML evolution**:
  - 1980s-90s: MLPs dominate
  - 2012: AlexNet shows CNNs beat MLPs on ImageNet
  - 2017: Transformers revolutionize NLP
### Dependency-Driven Design
- **Nothing unnecessary**: Each module is needed for its capstone
- **Progressive complexity**: Each part builds on the previous
- **Clear motivation**: Students see WHY each innovation matters
## Module Dependencies
```
Part I: Foundations
├── 02_tensor (required by everything)
├── 03_activations (required by 04)
├── 04_layers (required by 05)
└── 05_networks (combines all above)
    └── ✅ XORNet works!

Part II: Computer Vision
├── 06_spatial (Conv2D - THE KEY!)
├── 07_dataloader (handle real data)
├── 08_autograd (enable learning)
├── 09_optimizers (gradient descent)
└── 10_training (put it all together)
    └── ✅ CIFAR-10 CNN works!

Part III: Language Models
├── 11_embeddings (discrete → continuous)
├── 12_attention (THE KEY!)
├── 13_normalization (stable training)
├── 14_transformers (attention + FFN)
└── 15_generation (sampling strategies)
    └── ✅ TinyGPT works!
```
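
The last dependency, 15_generation, is essentially a sampling loop wrapped around the trained model: feed the current sequence through, take the logits for the final position, and sample the next token. A minimal sketch of that loop follows; `model` here is a hypothetical callable returning per-position logits, not the TinyTorch API:

```python
# Autoregressive decoding with temperature-scaled sampling.
import numpy as np

def sample_next(logits, temperature=1.0, rng=np.random.default_rng()):
    """Sample a token id from unnormalized logits after temperature scaling."""
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

def generate(model, prompt_ids, n_new, temperature=0.8):
    ids = list(prompt_ids)
    for _ in range(n_new):
        logits = model(np.array(ids))[-1]   # logits for the next token only
        ids.append(sample_next(logits, temperature))
    return ids
```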
## What We Dropped
- **Module 16 (Regularization)**: Important but not essential for the capstones
- **Module 17 (Systems)**: Kernels, benchmarking - advanced optimization

These could be bonus content or a separate "Production ML" course.
## The Beauty of 15 Modules
- **3 parts × 5 modules = 15**: Perfect symmetry!
- **Each part is self-contained**: Students can stop after any part
- **Clear progression**: MLP → CNN → Transformer
- **Manageable scope**: Achievable in one semester