# Optimization Algorithm Comparison
Compare SGD, Momentum, and Adam optimizers to see how different algorithms navigate the loss landscape!
## What This Demonstrates
- Different optimization strategies and their trade-offs
- Convergence speed comparison between optimizers
- Why Adam is popular for deep learning
- YOUR implementations of all major optimizers
## Running the Comparison

```bash
python compare.py
```

Expected output:

```text
⚡ Optimizer Comparison with TinyTorch
======================================================================
🏃 Training with different optimizers...
------------------------------------------------------------
Training with SGD:
Initial loss: 4.2315
Final loss: 0.0234
Improvement: 99.4%
Training with Momentum:
Initial loss: 4.2315
Final loss: 0.0156
Improvement: 99.6%
Training with Adam:
Initial loss: 4.2315
Final loss: 0.0098
Improvement: 99.8%
📊 Loss Curves (lower is better):
------------------------------------------------------------
Epoch 0: SGD: 4.2315 ████████████████████ Momentum: 4.2315 ████████████████████ Adam: 4.2315 ████████████████████
Epoch 5: SGD: 1.5234 ███████ Momentum: 0.8976 ████ Adam: 0.2134 █
Epoch 10: SGD: 0.6789 ███ Momentum: 0.2345 █ Adam: 0.0567
Epoch 15: SGD: 0.3456 █ Momentum: 0.0876 Adam: 0.0234
...
🏆 Best optimizer: Adam (lowest final loss)
```
## Optimizers Compared
### SGD (Stochastic Gradient Descent)

```
w = w - learning_rate * gradient
```
- Simple and reliable
- Can be slow to converge
- Fixed learning rate
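
To make the update rule concrete, here is a minimal stand-alone sketch in plain NumPy (not the TinyTorch API); the toy quadratic loss, its target values, and the `grad` helper are illustrative assumptions, not part of the example code in this directory:

```python
import numpy as np

# Toy quadratic loss: L(w) = sum((w - target)^2), so grad(w) = 2 * (w - target)
target = np.array([3.0, -2.0])
grad = lambda w: 2.0 * (w - target)

w = np.zeros(2)            # parameters to optimize
learning_rate = 0.1

for step in range(50):
    w = w - learning_rate * grad(w)   # plain SGD update, as in the rule above

print(w)   # converges toward [3.0, -2.0]
```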
### Momentum

```
velocity = momentum * velocity - learning_rate * gradient
w = w + velocity
```
- Accelerates in consistent directions
- Dampens oscillations
- Helps escape shallow local minima
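
The same toy problem with a velocity term added (again a plain NumPy sketch under the same illustrative assumptions, not the TinyTorch optimizer class):

```python
import numpy as np

target = np.array([3.0, -2.0])            # illustrative toy problem
grad = lambda w: 2.0 * (w - target)

w = np.zeros(2)
velocity = np.zeros_like(w)
learning_rate, momentum = 0.1, 0.9

for step in range(100):
    velocity = momentum * velocity - learning_rate * grad(w)
    w = w + velocity       # accumulated velocity speeds up consistent directions

print(w)   # approaches [3.0, -2.0]
```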
### Adam (Adaptive Moment Estimation)

```
m = β₁ * m + (1 - β₁) * gradient     # First moment
v = β₂ * v + (1 - β₂) * gradient²    # Second moment
w = w - learning_rate * m / (√v + ε)
```
- Adaptive learning rates per parameter
- Combines momentum with RMSprop
- Often fastest convergence
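
A plain NumPy sketch of the same loop with Adam (same illustrative toy problem, not the TinyTorch API). Note that it includes the bias-correction terms from the original Adam paper, which the simplified update above omits:

```python
import numpy as np

target = np.array([3.0, -2.0])            # illustrative toy problem
grad = lambda w: 2.0 * (w - target)

w = np.zeros(2)
m = np.zeros_like(w)   # running mean of gradients (first moment)
v = np.zeros_like(w)   # running mean of squared gradients (second moment)
learning_rate, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 201):
    g = grad(w)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)          # bias correction (omitted in the
    v_hat = v / (1 - beta2 ** t)          #  simplified formula above)
    w = w - learning_rate * m_hat / (np.sqrt(v_hat) + eps)

print(w)   # approaches [3.0, -2.0]
```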
## Key Insights
| Optimizer | Pros | Cons | Best For |
|---|---|---|---|
| SGD | Simple, stable | Slow convergence | Final fine-tuning |
| Momentum | Faster than SGD | Requires tuning | General training |
| Adam | Fast, adaptive | Can generalize worse than tuned SGD | Most deep learning |
## Mathematical Foundation
Your TinyTorch implements:
- First-order optimization (gradient-based)
- Second-moment (squared-gradient) estimation (Adam)
- Momentum accumulation
- Adaptive learning rates
## Requirements
- Module 10 (Optimizers) completed
- TinyTorch package exported
## Next Steps
Try experimenting with:
- Different learning rates (see the sketch after this list)
- Various momentum values
- Complex loss landscapes
- Your own optimization algorithms!
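
For example, a quick learning-rate sweep on the same toy quadratic used in the sketches above (plain NumPy, not the actual compare.py script):

```python
import numpy as np

target = np.array([3.0, -2.0])            # illustrative toy problem
grad = lambda w: 2.0 * (w - target)
loss = lambda w: float(np.sum((w - target) ** 2))

for lr in [0.01, 0.1, 0.5]:
    w = np.zeros(2)
    for step in range(20):
        w = w - lr * grad(w)              # plain SGD, as in the sketch above
    print(f"lr={lr}: final loss {loss(w):.4f}")
```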