mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-21 13:31:55 -05:00
[PR #1781] fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs #15732
📋 Pull Request Information
Original PR: https://github.com/harvard-edge/cs249r_book/pull/1781
Author: @Shashank-Tripathi-07
Created: 5/18/2026
Status: 🔄 Open
Base: dev ← Head: fix/tinytorch-audit-20260518
📝 Commits (1)
b4fae46 fix(tinytorch): correct MatmulBackward gradients for 1D vector inputs
📊 Changes
1 file changed (+10 additions, -6 deletions)
📝 tinytorch/src/06_autograd/06_autograd.py (+10 -6)
📄 Description
Summary
Fixes a bug in `MatmulBackward.apply()` in `tinytorch/src/06_autograd/06_autograd.py` where the gradient computation was wrong when input `B` is a 1D vector.

Bug Details
Root cause: When `B` is 1D with shape `(k,)`, NumPy's `.T` is a no-op (1D arrays have no second axis to swap). The old code then called `np.matmul(grad_output, b.data)`, which either raises a shape error (when `m != k`) or silently computes a dot product returning a scalar (when `m == k`) instead of the correct `(m, k)` gradient matrix.

Correct math: For `A(m,k) @ b(k,) -> out(m,)`:
- `grad_A = np.outer(grad_output, b)` -- shape `(m, k)` (was broken)
- `grad_b = a.T @ grad_output` -- shape `(k,)` (was broken for 1D `a`)

The symmetric case (1D `a` with 2D `B`) is also fixed using `np.outer(a, grad_output)`.

The 2D+ batched path using `np.swapaxes` is preserved unchanged and correct.

Test plan
- `MatmulBackward` with `A(m,k) @ b(k,)` produces `grad_A` of shape `(m, k)` and `grad_b` of shape `(k,)`
- The 2D case (`A(2,2) @ B(2,2)`) still passes
- `tinytorch` test suite: `pytest tinytorch/tests/`

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
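
The gradient rules listed under Bug Details can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the helper name `matmul_backward`, its array-in/array-out signature, and its handling of the 1D-by-1D dot product are all hypothetical, and the real `MatmulBackward.apply()` works on tensor objects rather than raw arrays.

```python
import numpy as np


def matmul_backward(a, b, grad_output):
    """Hypothetical helper (not the PR's code): gradients of `a @ b`.

    Takes plain NumPy arrays and returns (grad_a, grad_b).
    """
    grad_output = np.asarray(grad_output)

    if a.ndim == 1 and b.ndim == 1:
        # Assumption, not covered by the PR text: a(k,) @ b(k,) is a dot
        # product with a scalar output, so each gradient is a scaled copy.
        return grad_output * b, grad_output * a

    if b.ndim == 1:
        # A(m,k) @ b(k,) -> out(m,)
        if a.ndim != 2:
            raise ValueError("this sketch only handles 2D `a` with 1D `b`")
        grad_a = np.outer(grad_output, b)  # shape (m, k)
        grad_b = a.T @ grad_output         # shape (k,)
        return grad_a, grad_b

    if a.ndim == 1:
        # a(k,) @ B(k,n) -> out(n,)
        if b.ndim != 2:
            raise ValueError("this sketch only handles 1D `a` with 2D `b`")
        grad_a = b @ grad_output           # shape (k,)
        grad_b = np.outer(a, grad_output)  # shape (k, n)
        return grad_a, grad_b

    # 2D+ batched path: swap only the last two axes of each operand.
    grad_a = grad_output @ np.swapaxes(b, -1, -2)
    grad_b = np.swapaxes(a, -1, -2) @ grad_output
    return grad_a, grad_b


m, k = 3, 4
A = np.arange(m * k, dtype=float).reshape(m, k)
b = np.ones(k)
g = np.ones(m)  # upstream gradient, same shape as A @ b

# `.T` on a 1D array changes nothing, so the transposed expression
# cannot produce an (m, k) matrix; with m != k it raises instead.
assert b.T.shape == b.shape
try:
    np.matmul(g, b.T)
except ValueError as exc:
    print("transposed-matmul expression fails:", type(exc).__name__)

grad_A, grad_b = matmul_backward(A, b, g)
print(grad_A.shape, grad_b.shape)  # (3, 4) (4,)
```

Branching on `ndim` keeps the 1D cases separate from the batched `np.swapaxes` path, which matches the split described in the PR; whether the actual fix is organised this way is not stated in the description.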