TinyTorch

mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-03-11 21:53:34 -05:00

Vijay Janapa Reddi 0c2a33ed40 fix(autograd): Add EmbeddingBackward and ReshapeBackward

Critical fixes for transformer gradient flow:

EmbeddingBackward:
- Implements scatter-add gradient accumulation for embedding lookups
- Added to Module 05 (autograd_dev.py)
- Module 11 imports and uses it in Embedding.forward()
- Gradients now flow back to embedding weights
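
To illustrate the scatter-add accumulation described above, here is a minimal NumPy sketch. It is not the code in autograd_dev.py: the function name `embedding_backward` and its arguments are hypothetical, and plain arrays stand in for TinyTorch tensors. The point it shows is that a token id appearing more than once must have its gradients summed, not overwritten.

```python
import numpy as np

def embedding_backward(grad_output, indices, vocab_size):
    """Hypothetical helper: gradient of an embedding lookup w.r.t. the weight table."""
    embed_dim = grad_output.shape[-1]
    grad_weight = np.zeros((vocab_size, embed_dim), dtype=grad_output.dtype)
    # np.add.at is unbuffered, so rows selected by repeated indices accumulate.
    # A plain `grad_weight[idx] += g` would keep only one contribution per row.
    np.add.at(
        grad_weight,
        indices.reshape(-1),
        grad_output.reshape(-1, embed_dim),
    )
    return grad_weight

# Token id 1 appears twice in the batch, token id 3 once.
indices = np.array([[1, 3, 1]])
grad_output = np.ones((1, 3, 2))
grad_weight = embedding_backward(grad_output, indices, vocab_size=4)
```

Rows for ids that never appear in the batch stay at zero, which is why only the looked-up embeddings are updated.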

ReshapeBackward:
- reshape() was breaking the computation graph (its output had no _grad_fn)
- Added backward function that reshapes gradient back to original shape
- Patched Tensor.reshape() in enable_autograd()
- Critical for GPT forward pass (logits.reshape before loss)
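
The reshape fix can be sketched as follows, assuming a backward node only needs to remember the input shape. The class name `ReshapeBackwardSketch` and its `apply` method are illustrative assumptions, not the actual TinyTorch interface, and NumPy arrays stand in for tensors.

```python
import numpy as np

class ReshapeBackwardSketch:
    """Hypothetical backward node for reshape: restores the gradient's original shape."""

    def __init__(self, original_shape):
        self.original_shape = original_shape

    def apply(self, grad_output):
        # Reshape moves no data between elements, so the gradient is the
        # upstream gradient viewed in the input's shape.
        return grad_output.reshape(self.original_shape)

# Mirrors the GPT case: logits are flattened before the loss is computed.
logits = np.zeros((2, 3, 5))           # (batch, seq_len, vocab), sizes are made up
flat = logits.reshape(-1, 5)           # (batch * seq_len, vocab)
grad_fn = ReshapeBackwardSketch(logits.shape)

grad_flat = np.arange(flat.size, dtype=float).reshape(flat.shape)
grad_logits = grad_fn.apply(grad_flat)
```

Without a node like this attached to the reshaped output, the backward pass stops at the flattened logits and nothing upstream receives a gradient.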

Results:
- Before: 0/37 parameters receive gradients, loss stuck
- After: 13/37 parameters receive gradients (35%)
- Single-batch overfitting: loss 4.46 → 0.03 (about 99.3% reduction)
- MODEL NOW LEARNS! 🎉

Remaining work: 24 parameters still missing gradients (likely attention)
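
One way to track the remaining gap is to count how many parameters hold a gradient after a backward pass. The sketch below assumes each parameter exposes a `grad` attribute that stays `None` until a gradient arrives. The helper name `gradient_coverage` is hypothetical, and the parameter list is mock data shaped to match the counts reported above, not output from the real model.

```python
from types import SimpleNamespace

def gradient_coverage(params):
    """Hypothetical helper: return (parameters with a gradient, total parameters)."""
    with_grad = sum(1 for p in params if getattr(p, "grad", None) is not None)
    return with_grad, len(params)

# Mock parameters standing in for a model's parameter list after backward().
params = [SimpleNamespace(grad=0.0) for _ in range(13)]
params += [SimpleNamespace(grad=None) for _ in range(24)]

received, total = gradient_coverage(params)
missing = total - received
print(f"{received}/{total} parameters receive gradients, {missing} missing")
```

Checking `is not None` instead of truthiness matters here, because a gradient of exactly zero still counts as received.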

Tests added:
- tests/milestones/test_05_transformer_architecture.py (Phase 1)
- Multiple debug scripts to isolate issues

2025-10-28 07:56:20 -04:00

__init__.py

Add exported package files and cleanup

2025-09-30 12:38:56 -04:00

embeddings.py

fix(autograd): Add EmbeddingBackward and ReshapeBackward

2025-10-28 07:56:20 -04:00

tokenization.py

feat: Complete transformer integration with milestones

2025-10-19 12:46:58 -04:00