mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-27 23:27:31 -05:00
- Change from x.data * mask to Tensor multiplication (x * mask_tensor * scale)
- Preserves computation graph and gradient flow
- Required for transformer with dropout regularization
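
The difference between the two forms can be illustrated with a small sketch. This is not TinyTorch's code: the `Tensor` class below is a hypothetical, minimal stand-in that supports only multiplication, and the names `dropout`, `dropout_detached`, `p`, and `rng` are assumptions made for illustration. The scale is assumed to be the usual inverted-dropout factor `1 / (1 - p)`, which the commit message does not state.

```python
import numpy as np


class Tensor:
    """Hypothetical minimal autograd tensor (illustration only, not TinyTorch's)."""

    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float64)
        self.requires_grad = requires_grad
        self.grad = None
        self._parents = ()        # tensors this one was computed from
        self._backward_fn = None  # maps upstream grad to per-parent grads

    def __mul__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        out = Tensor(self.data * other.data,
                     requires_grad=self.requires_grad or other.requires_grad)
        out._parents = (self, other)
        # Product rule. This sketch does not reduce broadcast gradients,
        # which is fine here because only x requires a gradient.
        out._backward_fn = lambda g: (g * other.data, g * self.data)
        return out

    def backward(self, grad=None):
        if grad is None:
            grad = np.ones_like(self.data)
        self.grad = grad if self.grad is None else self.grad + grad
        if self._backward_fn is not None:
            for parent, g in zip(self._parents, self._backward_fn(grad)):
                if parent.requires_grad:
                    parent.backward(g)


def dropout_detached(x, p, rng):
    """Old form: works on x.data, so the result has no link back to x."""
    mask = (rng.random(x.data.shape) >= p).astype(np.float64)
    return Tensor(x.data * mask / (1.0 - p))


def dropout(x, p, rng):
    """New form: Tensor multiplication keeps x in the computation graph."""
    mask = (rng.random(x.data.shape) >= p).astype(np.float64)
    mask_tensor = Tensor(mask)         # constant, no gradient needed
    scale = Tensor(1.0 / (1.0 - p))    # assumed inverted-dropout scaling
    return x * mask_tensor * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = Tensor(np.ones((2, 3)), requires_grad=True)
    dropout(x, 0.5, rng).backward()
    print("graph-preserving grad:", x.grad)

    x_old = Tensor(np.ones((2, 3)), requires_grad=True)
    dropout_detached(x_old, 0.5, rng).backward()
    print("detached grad:", x_old.grad)
```

In this sketch the detached version leaves the input's `grad` unset, because the output is built from raw arrays and records no parents. The Tensor-multiplication version gives each input element a gradient equal to `mask * scale`, so layers before the dropout still receive gradients during training.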