[PR #1781] fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs #15732

GiteaMirror · 2026-05-20T14:04:53-05:00

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1781
Author: @Shashank-Tripathi-07
Created: 5/18/2026
Status: 🔄 Open

Base: dev ← Head: fix/tinytorch-audit-20260518

📝 Commits (1)

b4fae46 fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs

📊 Changes

1 file changed (+10 additions, -6 deletions)

📝 tinytorch/src/06_autograd/06_autograd.py (+10 -6)

📄 Description

Summary

Fixes a correctness bug in MatmulBackward.apply() in tinytorch/src/06_autograd/06_autograd.py where gradient computation was wrong when input B is a 1D vector.

Bug Details

Root cause: When B is 1D with shape (k,), NumPy's .T is a no-op because a 1D array has no second axis to swap. The old code therefore called np.matmul(grad_output, b.data) on two 1D arrays, which raises a shape error when m != k and silently returns a scalar dot product when m == k. Neither outcome is the correct (m, k) gradient matrix.
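
The failure mode can be reproduced with plain NumPy. This is a minimal sketch, not the tinytorch code itself: plain arrays stand in for tensor data, and the sizes m and k are arbitrary illustrative choices.

```python
import numpy as np

m, k = 3, 4
b = np.arange(k, dtype=float)      # 1D vector, shape (k,)
grad_output = np.ones(m)           # upstream gradient, shape (m,)

# .T on a 1D array returns the same shape: there is no axis to swap.
assert b.T.shape == b.shape == (k,)

# m != k: matmul of two 1D arrays with different lengths raises ValueError.
try:
    np.matmul(grad_output, b.T)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# m == k: the same call silently returns a 0-d inner product,
# not an (m, k) matrix.
square_result = np.matmul(np.ones(k), b.T)

print(mismatch_raised, np.ndim(square_result))  # True 0
```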

Correct math: For A(m,k) @ b(k,) -> out(m,):

grad_A = np.outer(grad_output, b) -- shape (m, k) (was broken)
grad_b = a.T @ grad_output -- shape (k,) (was broken for 1D a)
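
The two formulas above can be checked against central finite differences. The sketch below is an illustration under stated assumptions: `matvec_grads` and `numeric_grad` are hypothetical helper names, not part of tinytorch, and the loss is taken to be `w . (a @ b)` so that `w` plays the role of grad_output.

```python
import numpy as np

def matvec_grads(a, b, grad_output):
    """Gradients of out = a @ b for a of shape (m, k) and b of shape (k,)."""
    grad_a = np.outer(grad_output, b)   # shape (m, k)
    grad_b = a.T @ grad_output          # shape (k,)
    return grad_a, grad_b

def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of the scalar f() with respect to x."""
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + eps
        hi = f()
        x[idx] = old - eps
        lo = f()
        x[idx] = old                    # restore the perturbed entry
        g[idx] = (hi - lo) / (2 * eps)
    return g

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4))
b = rng.standard_normal(4)
w = rng.standard_normal(3)              # stands in for grad_output

grad_a, grad_b = matvec_grads(a, b, w)
loss = lambda: float(w @ (a @ b))       # reads a and b at call time

assert grad_a.shape == (3, 4) and grad_b.shape == (4,)
assert np.allclose(grad_a, numeric_grad(loss, a), atol=1e-5)
assert np.allclose(grad_b, numeric_grad(loss, b), atol=1e-5)
```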

The symmetric case (1D a with 2D B) is also fixed: the gradient with respect to B is computed as np.outer(a, grad_output), which has the same shape as B.
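
A short sketch of the symmetric case, assuming a has shape (k,) and B has shape (k, n). The expression used for grad_a is not quoted from the PR; it is derived here from the standard 2D rule, and the check promotes a to a (1, k) row to compare against that rule.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 4, 3
a = rng.standard_normal(k)          # 1D left operand, shape (k,)
B = rng.standard_normal((k, n))
g = rng.standard_normal(n)          # grad_output, same shape as a @ B

grad_a = B @ g                      # shape (k,), assumed form (see lead-in)
grad_B = np.outer(a, g)             # shape (k, n), as in the PR description

# Reference: treat a as a (1, k) row and apply the usual 2D matmul rule
# grad_A = G @ B.T and grad_B = A.T @ G.
a2, g2 = a[None, :], g[None, :]
ref_a = (g2 @ B.T)[0]
ref_B = a2.T @ g2

assert grad_a.shape == (k,) and grad_B.shape == (k, n)
assert np.allclose(grad_a, ref_a) and np.allclose(grad_B, ref_B)
```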

The 2D+ batched path using np.swapaxes is preserved unchanged and correct.
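
For context, a batched path based on np.swapaxes typically looks like the sketch below. It assumes both operands share the same batch shape with no broadcasting, and it is not copied from the tinytorch source. np.einsum is used only as an independent reference.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((2, 3, 4))    # (batch, m, k)
b = rng.standard_normal((2, 4, 5))    # (batch, k, n)
g = rng.standard_normal((2, 3, 5))    # grad_output, shape of a @ b

# Swap only the last two axes; .T on a 3D array would reverse all axes.
assert a.T.shape == (4, 3, 2)
grad_a = g @ np.swapaxes(b, -1, -2)   # (batch, m, k)
grad_b = np.swapaxes(a, -1, -2) @ g   # (batch, k, n)

# Independent reference written index by index.
ref_a = np.einsum('bmn,bkn->bmk', g, b)
ref_b = np.einsum('bmk,bmn->bkn', a, g)

assert grad_a.shape == a.shape and grad_b.shape == b.shape
assert np.allclose(grad_a, ref_a) and np.allclose(grad_b, ref_b)
```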

Test plan

Verify MatmulBackward with A(m,k) @ b(k,) produces grad_A of shape (m, k) and grad_b of shape (k,)
Verify existing 2D matmul test (A(2,2) @ B(2,2)) still passes
Verify batched matmul (3D inputs) still passes
Run tinytorch test suite: pytest tinytorch/tests/

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-20 14:04:53 -05:00

Reference: github-starred/cs249r_book#15732