[GH-ISSUE #1342] backward class math audit — finite diff mismatch #5721
Originally created by @profvjreddi on GitHub (Apr 16, 2026).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/1342
so the finite-diff tests from #1336 flagged some value mismatches. i did a code read of every Backward class that came up, and the math is actually right, so this is narrower than it first looked.
audit results (reading math against textbook derivatives):
so the "value mismatch" failures are most likely:
loss.backward(np.ones_like(loss.data))propagates through.sum()for 0-d outputswhat needs to happen:
note: tanh bypass was a separate real bug, fixed in #1343 and tracked in #1341.
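to make the 0-d seeding path concrete, here is a minimal sketch assuming a toy `Tensor` wrapper; the class, its fields, and `sum_backward` are hypothetical illustrations, not this repo's API:

```python
import numpy as np

# Hypothetical minimal autograd Tensor; only the sum op is implemented,
# just enough to show how a 0-d seed flows through .sum().
class Tensor:
    def __init__(self, data, parents=(), backward_fn=None):
        self.data = np.asarray(data, dtype=float)
        self.grad = None
        self._parents = parents
        self._backward_fn = backward_fn  # maps upstream grad -> parent grads

    def sum(self):
        def sum_backward(upstream):
            # d(sum)/dx_i = 1 for every element, so the 0-d upstream
            # gradient is broadcast back to the input's shape.
            return (np.broadcast_to(upstream, self.data.shape).copy(),)
        return Tensor(self.data.sum(), parents=(self,), backward_fn=sum_backward)

    def backward(self, seed):
        # Seed the output grad, then propagate to parents
        # (a single-op chain is enough for this sketch).
        self.grad = np.asarray(seed, dtype=float)
        if self._backward_fn is not None:
            for parent, g in zip(self._parents, self._backward_fn(self.grad)):
                parent.grad = g if parent.grad is None else parent.grad + g

x = Tensor(np.array([1.0, 2.0, 3.0]))
loss = x.sum()                          # 0-d output
loss.backward(np.ones_like(loss.data))  # seed is a 0-d array of 1.0
print(loss.grad.shape)  # () -> 0-d
print(x.grad)           # [1. 1. 1.]
```

the point is that the 0-d seed is broadcast to the input shape by the sum backward, so a seeding bug at this step shows up as a value mismatch even when the Backward class math is correct.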
@profvjreddi commented on GitHub (Apr 16, 2026):
updating the scope above after a code review pass: every Backward class that came up here has correct math against the textbook derivative. the real issue is tolerance tuning or something in the test setup, not the Backward classes. this saves anyone picking this up from chasing the wrong thing.
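for context on why tolerance tuning can produce these failures, here is a sketch of a central-difference gradient checker; the function name, signature, and default tolerances are assumptions for illustration, not the code from #1336:

```python
import numpy as np

# Hypothetical central-difference checker. `f` maps an ndarray to a scalar
# and `analytic_grad` is the gradient under test.
def check_gradient(f, x, analytic_grad, eps=1e-6, rtol=1e-4, atol=1e-6):
    x = np.asarray(x, dtype=np.float64)
    numeric = np.zeros_like(x)
    for i in np.ndindex(x.shape):
        orig = x[i]
        x[i] = orig + eps
        f_plus = f(x)
        x[i] = orig - eps
        f_minus = f(x)
        x[i] = orig  # restore before the next coordinate
        numeric[i] = (f_plus - f_minus) / (2 * eps)  # O(eps^2) truncation error
    ok = np.allclose(numeric, analytic_grad, rtol=rtol, atol=atol)
    return ok, numeric

# Example: y = sum(x**2), dy/dx = 2x.
x0 = np.array([0.5, -1.0, 2.0])
ok, num = check_gradient(lambda v: float((v ** 2).sum()), x0.copy(), 2 * x0)
print(ok, num)  # True, approximately [1. -2. 4.]
```

central differences carry truncation error on the order of eps squared plus floating-point cancellation, so tight `rtol`/`atol` values can flag a correct Backward implementation as a "value mismatch", which is the failure mode diagnosed above.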
@profvjreddi commented on GitHub (Apr 17, 2026):
Closing this out — the audit is complete and there's nothing remaining:
- Every Backward class checks out against the textbook derivatives, including the broadcast handling in `_reduce_broadcast_grad` (an illustrative sketch of that reduction is at the end of this comment).
- The finite-difference checker lived on the `test/autograd-gradient-correctness` branch, which has been merged and deleted.
- The permanent test suite (`tests/06_autograd/`) uses analytical expected values with `np.allclose` defaults; no finite-difference checker exists in the current codebase, so there are no tolerance-sensitive tests left to tune.
- If a numerical gradient checker is added in the future, per-op tolerance tuning (especially for GELU) should be considered at that time.
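since `_reduce_broadcast_grad` is only mentioned in passing, here is an illustrative reimplementation of the usual broadcast-reduction technique, assuming it sums the upstream gradient back down to the operand's shape; the repo's actual signature and behavior may differ:

```python
import numpy as np

# Sketch in the spirit of `_reduce_broadcast_grad`: undo numpy broadcasting
# by summing the upstream gradient over the broadcast axes.
def reduce_broadcast_grad(grad, shape):
    # 1. Sum away leading axes that broadcasting prepended.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # 2. Sum over axes where the operand had size 1 but grad does not.
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

# x of shape (1, 3) broadcast against y of shape (2, 3): the upstream grad
# has shape (2, 3) and must be reduced back to (1, 3) for x.
upstream = np.ones((2, 3))
print(reduce_broadcast_grad(upstream, (1, 3)).shape)  # (1, 3)
print(reduce_broadcast_grad(upstream, (3,)).shape)    # (3,)
```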