[PR #1781] fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs #15732

Open
opened 2026-05-20 14:04:53 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1781
Author: @Shashank-Tripathi-07
Created: 5/18/2026
Status: 🔄 Open

Base: dev ← Head: fix/tinytorch-audit-20260518


📝 Commits (1)

  • b4fae46 fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs

📊 Changes

1 file changed (+10 additions, -6 deletions)

📝 tinytorch/src/06_autograd/06_autograd.py (+10 -6)

📄 Description

Summary

  • Fixes a correctness bug in MatmulBackward.apply() in tinytorch/src/06_autograd/06_autograd.py where gradient computation was wrong when input B is a 1D vector.

Bug Details

Root cause: When B is 1D with shape (k,), NumPy's .T is a no-op, because a 1D array has no second axis to swap. The old code then called np.matmul(grad_output, b.data) with grad_output of shape (m,). That call either raises a shape error (when m != k) or silently computes a dot product and returns a scalar (when m == k), instead of producing the correct (m, k) gradient matrix.
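
The failure mode described above can be reproduced with plain NumPy. This is a minimal sketch, not the tinytorch code: the names `b`, `b_same_len`, and `grad_output` and the sizes `m = 3`, `k = 4` are illustrative assumptions.

```python
import numpy as np

m, k = 3, 4
b = np.arange(k, dtype=float)      # 1D vector, shape (k,)
grad_output = np.ones(m)           # upstream gradient, shape (m,)

# Transposing a 1D array leaves its shape unchanged.
transposed_shape = b.T.shape

# Case m != k: matmul of shapes (m,) and (k,) is rejected.
try:
    np.matmul(grad_output, b.T)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Case m == k: matmul of two 1D arrays is an inner product,
# so the result is a scalar rather than an (m, k) matrix.
b_same_len = np.arange(m, dtype=float)   # hypothetical vector with k == m
silent_result = np.matmul(grad_output, b_same_len.T)
```

The second case is the more dangerous one, because nothing fails at the call site and the wrong-shaped gradient only surfaces later.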

Correct math: For A(m,k) @ b(k,) -> out(m,):

  • grad_A = np.outer(grad_output, b) -- shape (m, k) (was broken)
  • grad_b = a.T @ grad_output -- shape (k,) (was broken for 1D a)

The symmetric case (1D a with 2D B) is also fixed using np.outer(a, grad_output).

The 2D+ batched path using np.swapaxes is preserved unchanged and correct.
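
The three cases above can be sketched as a single function. This is an illustration under stated assumptions, not the code from the PR: `matmul_backward` is a hypothetical helper that takes raw NumPy arrays, it assumes the 2D operand is exactly 2D in the vector cases, and the general path does not reduce broadcast batch dimensions.

```python
import numpy as np

def matmul_backward(a, b, grad_output):
    """Hypothetical helper: gradients of (a @ b) with respect to a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    g = np.asarray(grad_output, dtype=float)

    if a.ndim == 1 and b.ndim == 1:
        # Vector-vector products are outside the cases discussed here.
        raise NotImplementedError("1D @ 1D is not covered by this sketch")

    if b.ndim == 1:
        # A(m, k) @ b(k,) -> out(m,)
        grad_a = np.outer(g, b)        # shape (m, k)
        grad_b = a.T @ g               # shape (k,)
        return grad_a, grad_b

    if a.ndim == 1:
        # a(k,) @ B(k, n) -> out(n,)
        grad_a = b @ g                 # shape (k,)
        grad_b = np.outer(a, g)        # shape (k, n)
        return grad_a, grad_b

    # 2D+ path: swap the last two axes instead of using .T
    grad_a = g @ np.swapaxes(b, -1, -2)
    grad_b = np.swapaxes(a, -1, -2) @ g
    return grad_a, grad_b

A = np.arange(12, dtype=float).reshape(3, 4)
vec = np.arange(4, dtype=float)
grad_A, grad_vec = matmul_backward(A, vec, np.ones(3))
```

Branching on `ndim` keeps the vector cases explicit, so the shapes of both gradients always match the shapes of their inputs.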

Test plan

  • Verify MatmulBackward with A(m,k) @ b(k,) produces grad_A of shape (m, k) and grad_b of shape (k,)
  • Verify existing 2D matmul test (A(2,2) @ B(2,2)) still passes
  • Verify batched matmul (3D inputs) still passes
  • Run tinytorch test suite: pytest tinytorch/tests/
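
One way the first item of the plan could be checked independently of tinytorch is a finite-difference comparison. The sketch below is an assumption about how such a check might look, not a test from the repository: `numerical_grad` is a hypothetical helper, and the weight vector `w` stands in for `grad_output` by defining the scalar loss `w @ (A @ b)`.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
m, k = 3, 4
A = rng.standard_normal((m, k))
b = rng.standard_normal(k)
w = rng.standard_normal(m)          # plays the role of grad_output

# Scalar loss whose gradient with respect to the output is exactly w.
num_grad_A = numerical_grad(lambda A_: float(w @ (A_ @ b)), A)
num_grad_b = numerical_grad(lambda b_: float(w @ (A @ b_)), b)

# Analytic gradients from the formulas in the description.
ana_grad_A = np.outer(w, b)         # shape (m, k)
ana_grad_b = A.T @ w                # shape (k,)

assert np.allclose(num_grad_A, ana_grad_A, atol=1e-6)
assert np.allclose(num_grad_b, ana_grad_b, atol=1e-6)
```

Using m != k in the check matters, since the m == k case is the one where the old behavior would not raise.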

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-20 14:04:53 -05:00

Reference: github-starred/cs249r_book#15732