# TinyTorch Module Reorganization Migration Guide
## 🎯 **What Changed: Simplified, Better Learning Path**
This reorganization applies surgical fixes that produce a superior pedagogical structure. Students can now **train neural networks after just 7 modules** instead of 11!
## 📚 **New Module Structure**
### **Before → After Comparison**
| OLD Module | OLD Topic | → | NEW Module | NEW Topic | **Key Improvement** |
|------------|-----------|---|------------|-----------|-------------------|
| 02 | Tensor | → | 02 | Tensor + **Basic Autograd** | **Gradients from the start!** |
| 03 | 6 Activations | → | 03 | **ReLU + Softmax ONLY** | **Focus on essentials** |
| 04 | Just Layers | → | 04 | Linear + Module + **Flatten** | **Complete building blocks** |
| 05 | Networks | → | 05 | **Loss Functions** | **Clear separation: what to optimize** |
| 06 | Autograd | → | ~~merged~~ | _(integrated into 02)_ | **No forward dependencies** |
| 07 | Spatial | → | 08 | **CNN Ops** | **CNN after fundamentals** |
| 08 | Optimizers | → | 06 | **Optimizers** | **Clear separation: how to optimize** |
| 09 | DataLoader | → | 09 | DataLoader | _(same position)_ |
| 10 | Training | → | 07 | **Training** | **Complete training after Module 7!** |
## 🚀 **Import Path Changes**
### **Critical Updates for Examples and Code**
#### **OLD Import Paths (BROKEN):**
```python
# These imports will FAIL after the reorganization
from tinytorch.core.networks import Module    # ❌ WRONG - moved to layers
from tinytorch.core.spatial import Flatten    # ❌ WRONG - moved to layers
from tinytorch.core.autograd import backward  # ❌ WRONG - moved to tensor
```
#### **NEW Import Paths (CORRECT):**
```python
# Updated imports that work with the reorganized structure
from tinytorch.core.layers import Module, Linear, Flatten    # ✅ CORRECT
from tinytorch.core.losses import MSELoss, CrossEntropyLoss  # ✅ CORRECT
from tinytorch.core.tensor import Tensor                     # ✅ Has backward() built-in
```
### **PyTorch-Style Import Pattern:**
```python
# Recommended pattern matching PyTorch conventions
from tinytorch import nn, optim
from tinytorch.core.tensor import Tensor

class MLP(nn.Module):  # Module base class from layers
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)  # Linear from layers
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = nn.F.flatten(x, start_dim=1)  # Flatten from layers
        x = nn.F.relu(self.fc1(x))        # ReLU from activations
        return self.fc2(x)

# Training setup
model = MLP()
optimizer = optim.SGD(model.parameters())  # From optimizers (Module 06)
loss_fn = nn.CrossEntropyLoss()            # From losses (Module 05)
```
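For completeness, here is a minimal sketch of a training loop on top of this pattern. The `dataloader` variable is hypothetical, and the `zero_grad()`/`backward()`/`step()` methods are assumptions modeled on PyTorch's conventions, which this guide implies but does not spell out:

```python
# Minimal training-loop sketch (assumes PyTorch-style optimizer/loss APIs;
# `dataloader` is a hypothetical iterable of (images, labels) batches).
for epoch in range(10):
    for images, labels in dataloader:   # DataLoader arrives in Module 09
        optimizer.zero_grad()           # assumed: clears accumulated gradients
        logits = model(images)          # forward pass through the MLP above
        loss = loss_fn(logits, labels)  # CrossEntropyLoss from Module 05
        loss.backward()                 # autograd built into Tensor (Module 02)
        optimizer.step()                # SGD parameter update from Module 06
```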
## 🎓 **Example Updates Required**
### **XOR Example** (`examples/xornet/train_xor.py`)
**OLD Dependencies:** Modules 02-10 (9 modules!)

**NEW Dependencies:** Modules 02-07 (6 modules!)
```python
# Updated module references in comments:
# Module 02: Tensor + gradients (was separate autograd)
# Module 03: ReLU only (was 6 activations)
# Module 04: Linear + Module + Flatten (was separate modules)
# Module 05: MSE loss (was in training)
# Module 06: SGD optimizer (renumbered from 08)
# Module 07: Training loops (renumbered from 10)
```
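Putting those references together, a sketch of the updated XOR setup might look like the following. The import paths come from the examples earlier in this guide; the dataset literals and the `XORNet` class itself are illustrative, not taken from the actual `train_xor.py`:

```python
# Illustrative XOR setup under the new module layout (not the actual file).
from tinytorch import nn, optim
from tinytorch.core.tensor import Tensor
from tinytorch.core.losses import MSELoss

X = Tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # XOR inputs
y = Tensor([[0.0], [1.0], [1.0], [0.0]])                      # XOR targets

class XORNet(nn.Module):               # Module base class (Module 04)
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)     # Linear layer (Module 04)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        return self.fc2(nn.F.relu(self.fc1(x)))  # ReLU (Module 03)

model = XORNet()
loss_fn = MSELoss()                        # MSE loss (Module 05)
optimizer = optim.SGD(model.parameters())  # SGD optimizer (Module 06)
```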
### **MNIST Example** (`examples/mnist/train_mlp.py`)
**OLD Dependencies:** Modules 02-10

**NEW Dependencies:** Modules 02-07
```python
# Key changes:
# - Flatten operation moved to Module 04
# - CrossEntropy loss moved to Module 05
# - Adam optimizer renumbered to Module 06
```
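In code, the MNIST-relevant pieces would move as sketched below. `Flatten` and `CrossEntropyLoss` use the import paths shown earlier; `optim.Adam` is an assumption based on the note above that Adam now lives in Module 06:

```python
# Sketch of the MNIST-relevant pieces under the new layout.
from tinytorch.core.layers import Flatten           # moved to Module 04
from tinytorch.core.losses import CrossEntropyLoss  # moved to Module 05
from tinytorch import optim

flatten = Flatten()           # turns 28x28 images into 784-length vectors
loss_fn = CrossEntropyLoss()
# Assumption: Adam shares SGD's constructor shape from the earlier example;
# `model` stands for an MLP like the one in the import-pattern section.
optimizer = optim.Adam(model.parameters())
```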
### **CIFAR-10 Example** (`examples/cifar10/train_cnn.py`)
**OLD Dependencies:** Modules 02-10

**NEW Dependencies:** Modules 02-09
```python
# Key changes:
# - Conv2d/MaxPool2d moved to Module 08 (CNN Ops)
# - DataLoader remains Module 09
# - Training infrastructure available from Module 07
```
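This guide doesn't spell out the import path for Module 08's CNN ops, so the following sketch assumes they surface under `nn` with PyTorch-style names; treat the exact identifiers as assumptions to verify against the module:

```python
# Hypothetical CNN building blocks from Module 08. The Conv2d/MaxPool2d
# names and signatures are assumed to follow PyTorch conventions.
from tinytorch import nn

conv1 = nn.Conv2d(3, 16, kernel_size=3)  # 3 input channels for CIFAR-10 RGB
pool = nn.MaxPool2d(2)                   # 2x2 max pooling for downsampling
```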
## 🎯 **Key Pedagogical Improvements**
### **✅ What Students Gain:**
1. **Faster Training Capability:**
   - OLD: Train networks after 11 modules
   - NEW: **Train networks after 7 modules**

2. **Gradients From Start:**
   - OLD: Wait until Module 06 for gradients
   - NEW: **Gradients available in Module 02** (see the sketch after this list)

3. **Essential Activations Only:**
   - OLD: Learn 6 activation functions
   - NEW: **Master ReLU + Softmax (90% of use cases)**

4. **Complete Building Blocks:**
   - OLD: Scattered across modules
   - NEW: **Linear + Module + Flatten all in Module 04**

5. **Clear Separation:**
   - OLD: Mixed loss and training concepts
   - NEW: **Loss functions (what to optimize) vs Optimizers (how to optimize)**
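The sketch referenced in item 2: a minimal demonstration of gradients in Module 02. The guide confirms only that `backward()` is built into `Tensor`; the `requires_grad` flag and `.grad` attribute are assumptions modeled on PyTorch:

```python
# Minimal autograd demo (requires_grad/.grad assumed to mirror PyTorch).
from tinytorch.core.tensor import Tensor

x = Tensor([2.0], requires_grad=True)  # assumed PyTorch-style flag
y = x * x                              # builds the graph for y = x^2
y.backward()                           # backward() ships with Tensor (Module 02)
print(x.grad)                          # expected: dy/dx = 2x = 4.0
```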
### **🎆 Learning Acceleration:**
| Capability | OLD Path | NEW Path | **Improvement** |
|------------|----------|----------|-----------------|
| Basic Neural Networks | Module 11 | **Module 7** | **4 modules faster** |
| Gradient Computation | Module 6 | **Module 2** | **4 modules earlier** |
| Complete Training | Module 11 | **Module 7** | **4 modules faster** |
| CNN Training | Module 11 | **Module 9** | **2 modules faster** |
## 🔧 **Migration Checklist for Instructors**
### **Code Updates:**
- [ ] Update all example files with new module numbers
- [ ] Fix import statements in examples and documentation
- [ ] Update README files with correct prerequisites
- [ ] Test all examples run with new module structure
### **Documentation Updates:**
- [ ] Main README reflects 12-module structure
- [ ] Example documentation shows correct dependencies
- [ ] Module README files updated with new flow
- [ ] Learning path documentation emphasizes acceleration
### **Educational Messaging:**
- [ ] Emphasize "train neural networks after 7 modules"
- [ ] Highlight "gradients from Module 02"
- [ ] Explain "focus on essential activations"
- [ ] Celebrate "no forward dependencies"
## 🎉 **Success Metrics**
The reorganization is successful when:
✅ **All examples run with updated module references**

✅ **Documentation has zero old module number references**

✅ **Students can train networks faster than before**

✅ **Import statements use new consolidated paths**

✅ **Clear pedagogical benefits are communicated**
## 🚨 **Breaking Changes Summary**
| Change Type | Impact | Required Action |
|-------------|--------|-----------------|
| **Module Renumbering** | Examples break | Update all module references per the mapping table above |
| **Import Path Changes** | Code breaks | Update imports from networks/spatial to layers |
| **Function Consolidation** | API changes | Use Linear instead of Dense; unified Module base |
| **Concept Reorganization** | Learning path | Update prerequisites and dependency chains |
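To make the function-consolidation row concrete: code that used the old `Dense` name needs the one-line rename below (the old call is shown as a comment, since `Dense` no longer exists):

```python
from tinytorch.core.layers import Linear

# layer = Dense(784, 128)  # OLD name - breaks after the reorganization
layer = Linear(784, 128)   # NEW consolidated name from Module 04
```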
---
**The reorganized structure eliminates confusion, removes forward dependencies, and gets students building and training neural networks far sooner. This is a pedagogical win that makes TinyTorch a superior learning platform.**