mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-05-10 08:12:33 -05:00
- Change from x.data * mask to Tensor multiplication (x * mask_tensor * scale)
- Preserves computation graph and gradient flow
- Required for transformer with dropout regularization
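
A minimal sketch of why the change matters. It assumes a toy `Tensor` class with a tiny reverse-mode autograd; this class, along with `dropout_detached` and `dropout_tracked`, is hypothetical and written only for illustration, not taken from TinyTorch. Multiplying raw `.data` arrays produces a new leaf with no link back to `x`, while multiplying Tensors records the operation so the gradient reaches `x`.

```python
import numpy as np


class Tensor:
    """Hypothetical toy tensor (NOT TinyTorch's class): supports only `*`."""

    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float64)
        self.requires_grad = requires_grad
        self.grad = None
        self._parents = ()        # tensors this one was computed from
        self._backward_fn = None  # maps upstream grad -> one grad per parent

    def __mul__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        out = Tensor(self.data * other.data,
                     requires_grad=self.requires_grad or other.requires_grad)
        out._parents = (self, other)
        # Product rule: d(a*b)/da = b, d(a*b)/db = a.
        # Broadcast reduction is omitted; fine here because the scalar
        # operand never requires a gradient.
        out._backward_fn = lambda g: (g * other.data, g * self.data)
        return out

    def backward(self, grad=None):
        if grad is None:
            grad = np.ones_like(self.data)
        self.grad = grad if self.grad is None else self.grad + grad
        if self._backward_fn is not None:
            for parent, g in zip(self._parents, self._backward_fn(grad)):
                if parent.requires_grad:
                    parent.backward(g)


def dropout_detached(x, p, rng):
    """Old pattern: operate on x.data, so the result is a fresh leaf."""
    mask = rng.random(x.data.shape) >= p
    return Tensor(x.data * mask / (1.0 - p))  # graph is cut here


def dropout_tracked(x, p, rng):
    """New pattern: x * mask_tensor * scale, every step recorded."""
    mask = rng.random(x.data.shape) >= p
    mask_tensor = Tensor(mask.astype(np.float64))  # constant, no grad needed
    scale = 1.0 / (1.0 - p)                        # inverted-dropout scaling
    return x * mask_tensor * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    x_old = Tensor(np.ones(8), requires_grad=True)
    dropout_detached(x_old, 0.5, rng).backward()
    print("detached grad:", x_old.grad)  # no gradient reaches x_old

    x_new = Tensor(np.ones(8), requires_grad=True)
    dropout_tracked(x_new, 0.5, rng).backward()
    print("tracked grad: ", x_new.grad)  # mask * scale for each element
```

In the tracked version the gradient with respect to `x` is `mask * scale`, so dropped positions receive zero gradient and kept positions are scaled by `1 / (1 - p)`, which is what layers placed before the dropout need in order to train.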