TinyTorch

mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-06-03 15:15:50 -05:00

Files

Vijay Janapa Reddi 578b6d7d84 fix(autograd): Add SoftmaxBackward and patch Softmax.forward()

- Implemented SoftmaxBackward with proper gradient formula
- Patched Softmax.forward() in enable_autograd()
- Fixed LayerNorm gamma/beta to have requires_grad=True
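
For reference, the gradient a softmax backward pass usually computes is dL/dx_i = y_i * (g_i - sum_j g_j * y_j), where y = softmax(x) and g is the upstream gradient. The block below is a minimal standalone sketch of that formula for a single 1-D vector in plain Python. It is not the SoftmaxBackward class from this commit; the function names `softmax` and `softmax_backward` are illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a 1-D list of floats."""
    m = max(xs)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(ys, grad_out):
    """Vector-Jacobian product of softmax (hypothetical helper).

    ys       -- softmax output saved from the forward pass
    grad_out -- upstream gradient dL/dy, same length as ys
    Returns dL/dx, using dL/dx_i = y_i * (g_i - sum_j g_j * y_j).
    """
    dot = sum(g * y for g, y in zip(grad_out, ys))
    return [y * (g - dot) for y, g in zip(ys, grad_out)]
```

Because softmax is unchanged by adding a constant to every input, the returned gradients sum to (approximately) zero, which makes a quick sanity check alongside a finite-difference comparison.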

Progress:
- Softmax now correctly computes gradients
- LayerNorm parameters initialized with requires_grad
- Still debugging missing gradients in: Q/K/V projections, LayerNorms in blocks, MLP first layer

Current: 9/21 parameters receive gradients (was 0/21)
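
A count such as "9/21 parameters receive gradients" can be produced by walking the model's parameters after a backward pass and checking which ones still have no gradient. The sketch below assumes a hypothetical `Param` record with `name`, `requires_grad`, and `grad` fields; these names are illustrative and are not taken from TinyTorch's actual Tensor API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Param:
    """Hypothetical stand-in for a trainable tensor."""
    name: str
    requires_grad: bool = True
    grad: Optional[List[float]] = None  # None means backward never reached it


def grad_coverage(params: List[Param]) -> Tuple[int, int, List[str]]:
    """Return (with_grad, trainable, names_missing_grad)."""
    trainable = [p for p in params if p.requires_grad]
    missing = [p.name for p in trainable if p.grad is None]
    return len(trainable) - len(missing), len(trainable), missing


# Toy example: one parameter received a gradient, one did not,
# and one is frozen and therefore excluded from the count.
params = [
    Param("proj.weight", grad=[0.1, -0.2]),
    Param("norm.gamma"),
    Param("frozen.bias", requires_grad=False),
]
got, total, missing = grad_coverage(params)
print(f"{got}/{total} parameters receive gradients; missing: {missing}")
```

Listing the names of the parameters that are still missing a gradient, rather than only the count, points directly at the layer whose forward pass breaks the computation graph.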

2025-10-28 08:04:19 -04:00

transformers_dev.ipynb    fix(autograd): Add SoftmaxBackward and patch Softmax.forward()    2025-10-28 08:04:19 -04:00
transformers_dev.py       fix(autograd): Add SoftmaxBackward and patch Softmax.forward()    2025-10-28 08:04:19 -04:00