TinyTorch/docs/tinytorch-textbook-alignment.md

Commit 753ae52ae0 by Vijay Janapa Reddi (2025-09-24)

MAJOR: Implement beautiful module progression through strategic reordering
This commit implements the pedagogically optimal "inevitable discovery" module progression based on expert validation and educational design principles.

## Module Reordering Summary

**Previous Order (Problems)**:
- 05_losses → 06_autograd → 07_dataloader → 08_optimizers → 09_spatial → 10_training
- Issues: Autograd before optimizers, DataLoader before training, scattered dependencies

**New Order (Beautiful Progression)**:
- 05_losses → 06_optimizers → 07_autograd → 08_training → 09_spatial → 10_dataloader
- Benefits: Each module creates inevitable need for the next

## Pedagogical Flow Achieved

**05_losses** → "Need systematic weight updates" → **06_optimizers**
**06_optimizers** → "Need automatic gradients" → **07_autograd**
**07_autograd** → "Need systematic training" → **08_training**
**08_training** → "MLPs hit limits on images" → **09_spatial**
**09_spatial** → "Training is too slow" → **10_dataloader**

## Technical Changes

### Module Directory Renaming
- `06_autograd` → `07_autograd`
- `07_dataloader` → `10_dataloader`
- `08_optimizers` → `06_optimizers`
- `10_training` → `08_training`
- `09_spatial` → `09_spatial` (no change)

### System Integration Updates
- **MODULE_TO_CHECKPOINT mapping**: Updated in tito/commands/export.py (sketched below)
- **Test directories**: Renamed module_XX directories to match new numbers
- **Documentation**: Updated all references in MD files and agent configurations
- **CLI integration**: Updated next-steps suggestions for proper flow
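
For illustration, a hypothetical sketch of what the reordered mapping could look like; the actual contents of `tito/commands/export.py` are not reproduced here, and the checkpoint values are assumptions:

```python
# Hypothetical sketch of the reordered mapping in tito/commands/export.py.
# Only the new module numbering is taken from this commit; the checkpoint
# values are placeholder assumptions.
MODULE_TO_CHECKPOINT = {
    "05_losses": 5,
    "06_optimizers": 6,
    "07_autograd": 7,
    "08_training": 8,
    "09_spatial": 9,
    "10_dataloader": 10,
}
```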

### Agent Configuration Updates
- **Quality Assurance**: Updated module audit status with new numbers
- **Module Developer**: Updated work tracking with new sequence
- **Documentation**: Updated MASTER_PLAN_OF_RECORD.md with beautiful progression

## Educational Benefits

1. **Inevitable Discovery**: Each module naturally leads to the next
2. **Cognitive Load**: Concepts introduced exactly when needed
3. **Motivation**: Students understand WHY each tool is necessary
4. **Synthesis**: Everything flows toward complete ML systems understanding
5. **Professional Alignment**: Matches real ML engineering workflows

## Quality Assurance

- All CLI commands still function
- Checkpoint system mappings updated
- Documentation consistency maintained
- Test directory structure aligned
- Agent configurations synchronized

**Impact**: This reordering transforms TinyTorch from a collection of modules into a coherent educational journey where each step naturally motivates the next, creating optimal conditions for deep learning systems understanding.


# TinyTorch Tutorial Structure & ML Systems Textbook Alignment

## Overview

TinyTorch is designed as a companion to the Machine Learning Systems textbook, providing hands-on implementation experience for each theoretical concept. Students build ML systems from scratch to understand why production frameworks work the way they do.

## Textbook Chapter → TinyTorch Module Mapping

### Part I: Foundations (Chapters 1-5 → Modules 1-6)

| Textbook Chapter | TinyTorch Modules | What Students Build |
|------------------|-------------------|---------------------|
| Ch 1: Introduction | Module 01: Setup | Development environment |
| Ch 2: ML Systems | Module 02: Tensor | Core data structures with educational loops |
| Ch 3: DL Primer | Module 03: Activations | Nonlinearity functions |
| Ch 4: DNN Architectures | Module 04: Layers<br>Module 05: Losses | Network building blocks |
| Ch 5: AI Workflow | Module 06: Autograd | Automatic differentiation |

**Milestone**: After Module 6, students can solve the XOR problem, their first neural network that learns!
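
To make the milestone concrete, here is a minimal NumPy sketch of a two-layer network learning XOR. It uses hand-derived gradients; the actual TinyTorch solution would use the students' own `Tensor` and `Autograd` implementations, so every name below is illustrative:

```python
# Minimal NumPy sketch of the Module 6 milestone: a 2-layer MLP learning XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output layer

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for step in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Backward pass (binary cross-entropy + chain rule)
    dp = p - y
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = (dp @ W2.T) * (1 - h**2)
    dW1, db1 = X.T @ dh, dh.sum(0)
    # SGD update
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.1 * grad

print(p.round(3))  # ≈ [[0], [1], [1], [0]] after training
```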

### Part II: Training Systems (Chapters 6-8 → Modules 7-10)

| Textbook Chapter | TinyTorch Modules | What Students Build |
|------------------|-------------------|---------------------|
| Ch 6: Data Engineering | Module 07: DataLoader | Batching, shuffling, real datasets |
| Ch 7: AI Frameworks | Module 08: Optimizers | SGD, Adam, learning algorithms |
| Ch 8: AI Training | Module 09: Spatial<br>Module 10: Training | CNNs, training loops |

**Milestone**: After Module 10, students train a CNN on CIFAR-10 to 75% accuracy!

### Part III: Language Models (Not in textbook → Modules 11-14)

| Concept | TinyTorch Modules | What Students Build |
|---------|-------------------|---------------------|
| NLP Foundations | Module 11: Tokenization<br>Module 12: Embeddings | Text processing pipeline |
| Modern AI | Module 13: Attention<br>Module 14: Transformers | GPT-style architecture |

**Milestone**: After Module 14, students build TinyGPT from scratch!
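
The heart of Modules 13-14 is scaled dot-product attention. A rough NumPy sketch of the computation students implement (the function name and shapes are illustrative, not TinyTorch's actual API):

```python
# Minimal sketch of scaled dot-product attention (Module 13's core idea).
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                # weighted sum of value vectors

Q = np.random.randn(5, 8)   # 5 tokens, head dimension 8
K = np.random.randn(5, 8)
V = np.random.randn(5, 8)
out = attention(Q, K, V)    # (5, 8): one context-aware vector per token
```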

### Part IV: System Optimization (Chapters 9-12 → Modules 15-19)

| Textbook Chapter | TinyTorch Modules | What Students Build |
|------------------|-------------------|---------------------|
| Ch 9: Efficient AI | Module 15: Acceleration | Loops → blocking → NumPy |
| Ch 10: Model Optimizations | Module 17: Precision<br>Module 18: Compression | Quantization, pruning |
| Ch 11: AI Acceleration | Module 16: Caching | KV cache for transformers |
| Ch 12: Benchmarking AI | Module 19: Benchmarking | Profiling tools |

**Key Innovation**: Students first implement with loops (Modules 2-14), then optimize (Modules 15-19).
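
To make that transformation concrete, here is a sketch of the same matrix multiply in both styles; this is illustrative, not the actual Module 15 code:

```python
# The same matrix multiply, first as educational loops (Modules 2-14 style),
# then as the vectorized version students switch to in Module 15.
import numpy as np

def matmul_loops(A, B):
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):              # every output row
        for j in range(m):          # every output column
            for p in range(k):      # dot product, one term at a time
                out[i, j] += A[i, p] * B[p, j]
    return out

def matmul_numpy(A, B):
    return A @ B                    # one call, orders of magnitude faster

A, B = np.random.randn(64, 32), np.random.randn(32, 16)
assert np.allclose(matmul_loops(A, B), matmul_numpy(A, B))
```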

### Part V: Production & Capstone (Chapters 13-20 → Module 20)

| Textbook Chapter | TinyTorch Module | Integration |
|------------------|------------------|-------------|
| Ch 13: ML Operations | Module 20: Capstone | Deploy optimized system |
| Ch 14-20: Advanced Topics | Module 20: Capstone | Apply to final project |

## Current Order (Phase 2: Modules 7-10)

7. DataLoader → 8. Optimizers → 9. Spatial → 10. Training

## Alternative Order A: Training-First

7. Optimizers → 8. Training → 9. DataLoader → 10. Spatial

**Pros**: Get to the training loop quickly
**Cons**: Training without real data feels artificial

## Alternative Order B: Architecture-First

7. Spatial → 8. DataLoader → 9. Optimizers → 10. Training  

**Pros**: Build complete architectures early
**Cons**: Can't train CNNs without optimizers

## Alternative Order C: Data-Last (Your Suggestion)

7. Optimizers → 8. Spatial → 9. Training → 10. DataLoader

**Pros**: Build and train on toy data first, then scale to real data
**Cons**: Module 9 training would be limited without batching

7. Optimizers → 8. Spatial → 9. Training (toy) → 10. DataLoader (real)

**Why This Works Best:**

1. **Module 7 (Optimizers)**: Learn SGD/Adam on simple problems
2. **Module 8 (Spatial)**: Build CNN layers (can test with random data)
3. **Module 9 (Training)**: Complete training loops on toy datasets
4. **Module 10 (DataLoader)**: Scale to real datasets (CIFAR-10)

This creates a natural progression:

- First train small networks on toy data (XOR, simple patterns)
- Then scale to real vision problems (CIFAR-10)
- DataLoader becomes the "scaling" module

## Pedagogical Flow Principles

### 1. Build Before Optimize

- Modules 1-14: Use educational loops for understanding
- Modules 15-19: Transform to production code
- Students see WHY optimizations matter

### 2. Milestones Drive Motivation

- Module 6: Solve XOR (historical breakthrough)
- Module 10: Real CNN on real data
- Module 14: Build GPT architecture
- Module 20: Deploy optimized system

### 3. Theory → Implementation → Systems

Each module follows:

1. Mathematical foundation (textbook theory)
2. Naive implementation (understanding)
3. Systems analysis (memory, performance)
4. Optimization path (how to improve)

## Example Module Flow: Training Systems

### Module 7: Optimizers (Learn the algorithms)

```python
# Start simple: optimize a parabola with plain SGD
import numpy as np

def sgd_step(params, grads, lr=0.01):
    params -= lr * grads  # in-place update of a NumPy parameter array

# Build up to Adam: momentum + RMSprop (standard defaults shown)
def adam_step(params, grads, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m[:] = b1 * m + (1 - b1) * grads           # momentum term
    v[:] = b2 * v + (1 - b2) * grads ** 2      # RMSprop term
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    params -= lr * m_hat / (np.sqrt(v_hat) + eps)
```
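
A quick usage sketch (illustrative): minimizing the parabola f(x) = (x - 3)^2 with the `sgd_step` defined above.

```python
# Minimize f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
x = np.array([0.0])
for _ in range(200):
    sgd_step(x, grads=2 * (x - 3), lr=0.1)
print(x)  # ≈ [3.0]
```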

### Module 8: Spatial (Build CNN components)

```python
# Educational convolution with loops (stride 1, no padding)
K = kernel.shape[0]                  # square K x K kernel
H_out, W_out = input.shape[0] - K + 1, input.shape[1] - K + 1
output = np.zeros((H_out, W_out))    # np is NumPy, imported above
for i in range(H_out):
    for j in range(W_out):
        for k in range(K):
            for l in range(K):
                output[i, j] += input[i + k, j + l] * kernel[k, l]
```

### Module 9: Training (Put it together - toy data)

```python
# Train on synthetic data first
X = np.random.randn(100, 28, 28, 1)          # random "images"
y = (X.sum(axis=(1, 2, 3)) > 0).astype(int)  # label from a simple rule

model = SimpleCNN()   # CNN assembled from Module 8's spatial layers
train(model, X, y)    # works! But it's a toy problem
```

### Module 10: DataLoader (Scale to reality)

```python
# Now load real CIFAR-10
dataset = CIFAR10Dataset()
loader = DataLoader(dataset, batch_size=32)

# Same training code, real data!
train(model, loader)  # 75% accuracy on CIFAR-10!
```
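
For reference, a minimal sketch of what such a `DataLoader` could look like inside, assuming each `dataset[i]` returns an `(x, y)` pair; this is an illustration, not TinyTorch's actual implementation:

```python
# Minimal DataLoader sketch: shuffle indices each epoch, then yield batches.
import numpy as np

class DataLoader:
    def __init__(self, dataset, batch_size=32, shuffle=True):
        self.dataset = dataset        # anything supporting len() and indexing
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        indices = np.arange(len(self.dataset))
        if self.shuffle:
            np.random.shuffle(indices)          # new order every epoch
        for start in range(0, len(indices), self.batch_size):
            batch = indices[start:start + self.batch_size]
            xs, ys = zip(*(self.dataset[i] for i in batch))
            yield np.stack(xs), np.array(ys)    # one batch of inputs, labels
```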

## Integration with Textbook Teaching

### Suggested Course Structure (15-week semester)

**Weeks 1-3: Foundations**

- **Read**: Chapters 1-3
- **Build**: Modules 1-3 (Setup, Tensor, Activations)
- **Understand**: Why we need gradients in tensors from day 1

**Weeks 4-6: Architecture**

- **Read**: Chapters 4-5
- **Build**: Modules 4-6 (Layers, Losses, Autograd)
- **Milestone**: XOR problem solved!

**Weeks 7-9: Training Systems**

- **Read**: Chapters 6-8
- **Build**: Modules 7-10 (Optimizers, Spatial, Training, DataLoader)
- **Milestone**: CIFAR-10 CNN trained!

**Weeks 10-12: Modern AI**

- **Read**: Supplementary NLP materials
- **Build**: Modules 11-14 (Tokenization through Transformers)
- **Milestone**: TinyGPT generates text!

**Weeks 13-14: Optimization**

- **Read**: Chapters 9-12
- **Build**: Modules 15-19 (Acceleration through Benchmarking)
- **Transform**: Loops → production code

**Week 15: Capstone**

- **Read**: Chapter 13
- **Build**: Module 20 (Complete optimized system)
- **Deploy**: Working ML system

## Key Insights for Textbook Alignment

### 1. Systems Thinking Through Building

Your textbook explains WHY; TinyTorch shows HOW by building it.

### 2. Historical Progression

Examples follow ML history: Perceptron → XOR → LeNet → AlexNet → GPT.

### 3. Production Patterns

Every optimization in TinyTorch mirrors real PyTorch/TensorFlow.

### 4. Gradual Complexity

- Start: Triple-nested loops (understanding)
- End: Vectorized operations (performance)
- Students see the journey!

## Recommendation: Update Module Order

Based on this analysis, I recommend reordering Phase 2 modules:

**Current**: 7. DataLoader, 8. Optimizers, 9. Spatial, 10. Training
**Proposed**: 7. Optimizers, 8. Spatial, 9. Training, 10. DataLoader

This better aligns with your textbook's flow and creates a more natural progression from toy problems to real datasets.

## Next Steps

1. Update module numbering to reflect new order
2. Adjust Module 9 (Training) to work with synthetic data
3. Make Module 10 (DataLoader) the "scaling up" module
4. Update examples to show progression: toy → real data

This structure ensures TinyTorch perfectly complements your ML Systems textbook while maintaining pedagogical clarity!