mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-06 17:49:07 -05:00
[GH-ISSUE #974] Missing transpose in Equation of Section 4.2.2 (matrix multiplication inconsistency) #1652
Originally created by @VThuong99 on GitHub (Oct 11, 2025).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/974
In Section 4.2.2, the layer computation equation is currently written as:
However, this is inconsistent with the presentation of the layer’s outputs h^(l) in Section 3.4.2.4.
To make the matrix dimensions consistent, the equation should be:
Suggested fix: Add the transpose to h^(l−1) in the equation and to h^(0) in the example shown in Section 4.2.2.
Location: Section 4.2.2, in the example immediately below the equation.
I may be mistaken, but it seems there might be a minor inconsistency here. If confirmed, I’d be happy to contribute a PR to address it.
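A quick shape check can illustrate the kind of dimension mismatch described above. This is only a sketch under assumed conventions, not the book's actual notation: it supposes the weight matrix has shape (n_out, n_in) and the previous layer's activations are stored as a row vector of shape (1, n_in). The names `W`, `b`, `h_prev`, and `shapes_compatible` are hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration (not taken from the book).
n_in, n_out = 3, 2
rng = np.random.default_rng(0)

W = rng.standard_normal((n_out, n_in))   # assumed weight layout: (n_out, n_in)
b = rng.standard_normal((n_out, 1))      # assumed bias stored as a column vector
h_prev = rng.standard_normal((1, n_in))  # assumed: activations stored as a row vector


def shapes_compatible(left, right):
    """Return True if the matrix product left @ right is defined."""
    return left.shape[1] == right.shape[0]


# Without the transpose: (n_out, n_in) @ (1, n_in) is not a valid product.
without_transpose_ok = shapes_compatible(W, h_prev)

# With the transpose: (n_out, n_in) @ (n_in, 1) is valid.
with_transpose_ok = shapes_compatible(W, h_prev.T)

# Pre-activation for the layer under the assumed convention.
z = W @ h_prev.T + b

print(without_transpose_ok, with_transpose_ok, z.shape)
```

If the book instead treats h as a column vector throughout, the product is already well-formed without a transpose, so which fix applies depends on the convention chosen in Section 3.4.2.4.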
@profvjreddi commented on GitHub (Oct 11, 2025):
Thanks so much for catching this and writing it up clearly. 🙏 You’re right that the equations should be kept consistent.
I’m traveling right now and won’t be able to look at it in detail until Monday, but if you’re up for it, I’d love for you to open a quick PR with the fix. I’ll make sure to review it as soon as I’m back.
And if you happen to spot any other little inconsistencies or improvements along the way, please keep them coming. These kinds of contributions make the material better for everyone, and I really value and appreciate your time. Plus, anyone who submits a PR automatically gets acknowledged as a contributor 🙂 Thanks again!
@VThuong99 commented on GitHub (Oct 11, 2025):
Hi, I’ve reviewed the notation again. Initially, I noticed that in Section 3.4.2.4 the formula includes a transpose, while in Section 4.2.2 it does not. However, since the following explanations and the example itself don’t depend on whether h is treated as a row or column vector, the simplest and most consistent fix is just to adjust the computation in the example (as I did in this PR).
If you prefer to make it fully consistent with the earlier section, an alternative would be to also transpose h in the main formula.
@profvjreddi commented on GitHub (Oct 17, 2025):
Thanks for doing this @VThuong99 - I am back on my computer now, so if you have additional fixes, do let me know.