Mirror of https://github.com/MLSysBook/TinyTorch.git (synced 2026-04-28 13:22:33 -05:00)
- Change from x.data * mask to Tensor multiplication (x * mask_tensor * scale)
- Preserves computation graph and gradient flow
- Required for transformer with dropout regularization
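
The difference between the two forms can be sketched with a toy autograd tensor. This is only an illustration, not TinyTorch's actual code: the `Tensor` class, `dropout_detached`, and `dropout_tracked` below are hypothetical stand-ins, and `scale = 1 / (1 - p)` is assumed to be the usual inverted-dropout factor. Multiplying the raw `.data` array yields a result with no link back to `x`, so `x` never receives a gradient; multiplying Tensors records the operation and the gradient reaches `x`.

```python
import numpy as np


class Tensor:
    """Hypothetical minimal autograd tensor (not TinyTorch's real class)."""

    def __init__(self, data, parents=(), backward_fn=None):
        self.data = np.asarray(data, dtype=float)
        self.grad = None
        self._parents = parents          # tensors this one was computed from
        self._backward_fn = backward_fn  # maps upstream grad -> parent grads

    def __mul__(self, other):
        if isinstance(other, Tensor):
            return Tensor(
                self.data * other.data,
                (self, other),
                lambda g: (g * other.data, g * self.data),
            )
        # Plain scalar: only `self` takes part in the graph.
        return Tensor(self.data * other, (self,), lambda g: (g * other,))

    def backward(self, grad=None):
        # Naive recursive backprop; sufficient for this tree-shaped graph.
        if grad is None:
            grad = np.ones_like(self.data)
        self.grad = grad if self.grad is None else self.grad + grad
        if self._backward_fn is not None:
            for parent, g in zip(self._parents, self._backward_fn(grad)):
                parent.backward(g)


def dropout_detached(x, mask, scale):
    # Old form: works on the raw array, so the result has no parents.
    return Tensor(x.data * mask * scale)


def dropout_tracked(x, mask, scale):
    # New form: Tensor multiplication keeps the graph intact.
    return x * Tensor(mask) * scale


p = 0.5                      # assumed drop probability
scale = 1.0 / (1.0 - p)      # assumed inverted-dropout scaling
mask = np.array([1.0, 0.0, 1.0, 0.0])  # fixed mask for reproducibility

x_detached = Tensor([1.0, 2.0, 3.0, 4.0])
dropout_detached(x_detached, mask, scale).backward()

x_tracked = Tensor([1.0, 2.0, 3.0, 4.0])
out_tracked = dropout_tracked(x_tracked, mask, scale)
out_tracked.backward()

print(x_detached.grad)  # no gradient reached x
print(x_tracked.grad)   # gradient equals mask * scale
```

Both forms produce the same forward values; only the tracked form lets layers placed before dropout receive gradients during training.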