[GH-ISSUE #1150] Chapter 12 - Attention #5671
Originally created by @ngbolin on GitHub (Feb 2, 2026).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/1150
Hi,
Wanted to clarify the following computation when deriving the complexity for Attention:
Shouldn't it be n * n * d instead of n * n, since we are taking the dot product between Q and K, where d operations are computed for each query-key pair?
@ngbolin commented on GitHub (Feb 2, 2026):
To add on, the value for Total Memory should be the sum of all the previous columns, not the same as the Optimizer memory.
@profvjreddi commented on GitHub (Feb 4, 2026):
Hi @ngbolin — great catches on both points! 💯
Attention Complexity: You're absolutely right. The time complexity of QK^T is O(n² × d), not O(n²), since each of the n² query-key pairs requires a d-dimensional dot product. O(n²) only describes the memory complexity (the attention weight matrix has n² entries). I've updated the explanation to clearly distinguish between time and memory complexity.
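To make the time/memory split concrete, here is a minimal NumPy sketch (the values of n and d are hypothetical, not the chapter's):

```python
import numpy as np

n, d = 512, 64            # hypothetical sequence length and head dimension
Q = np.random.randn(n, d)
K = np.random.randn(n, d)

scores = Q @ K.T          # attention logits, shape (n, n)

# Time: each of the n*n entries is a d-dimensional dot product,
# so computing QK^T costs O(n^2 * d) multiply-accumulates.
flops = n * n * d
# Memory: only the n x n score matrix has to be materialized -> O(n^2).
print(scores.shape, flops)  # (512, 512) 16777216
```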
Total Memory Column: Also correct — the Total Memory column was just reprinting the Optimizer value instead of summing Forward + Gradients + Optimizer. Thanks for catching that! I just fixed it too, and the multiplier now should show ~7× inference memory.
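For the column arithmetic itself, a hedged sketch (the per-component values below are placeholders, not the chapter's actual numbers; only the sum-then-divide structure matters):

```python
# Placeholder memory components, in GB (hypothetical, not the chapter's table)
forward, gradients, optimizer = 2.0, 1.0, 4.0
inference = 1.0   # weights only

total = forward + gradients + optimizer   # Total Memory = sum of the columns
print(total, f"{total / inference:.0f}x inference")  # 7.0 7x inference
```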
Both fixes are in PR #1155. Thanks for the careful reading! 🙏
@all-contributors please add @ngbolin for bug, doc in TinyTorch
@github-actions[bot] commented on GitHub (Feb 4, 2026):
I've added @ngbolin as a contributor to tinytorch! 🎉
Recognized for: doc
Based on: @all-contributors please add @ngbolin for bug, doc in TinyTorch
The contributor list has been updated in:
tinytorch/.all-contributorsrc
tinytorch/README.md
README.md
We love recognizing our contributors! ❤️
@ngbolin commented on GitHub (Feb 6, 2026):
Hi @profvjreddi - for Module 15 on Quantization, can you explain why the Zero Point is at 88? Intuitively, because the input ranges from -1.5 to 2.8, the ZP should be -0.65/0.017 ≈ -38 (where 0.65 is the midpoint between -1.5 and 2.8), and 0.2 should quantize to -26.5 instead of 12...
@profvjreddi commented on GitHub (Feb 13, 2026):
Great catch @ngbolin, you're right! The zero point of 88 was wrong.
Let me explain what's going on so the fix makes sense.
What the zero point actually is
The goal is to squeeze a continuous FP32 range into 256 discrete INT8 values (-128 to 127). The zero point answers one question: which INT8 value represents FP32 zero?
This matters because zero is special in neural networks. ReLU outputs zero for all negative inputs, sparse weights are zero, and dropout zeroes out activations. If FP32 zero doesn't map cleanly to an integer, you get quantization error at the most important value.
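A two-line illustration of that round-trip property (the scale and zero point here are arbitrary example values):

```python
scale, zp = 0.05, 17                # arbitrary example values
q = round(0.0 / scale + zp)         # FP32 zero quantizes to the zero point itself
x_hat = (q - zp) * scale            # and dequantizes back to exactly 0.0
print(q, x_hat)                     # 17 0.0
```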
Walking through the example
Our tensor has values in [-1.5, 2.8]. We need to map that onto [-128, 127]:
Scale tells us how wide each INT8 step is: scale = (max - min) / (127 - (-128)) = (2.8 - (-1.5)) / 255 = 4.3 / 255 ≈ 0.01686 (≈ 0.017).
Zero point tells us where FP32 zero lands. We know -1.5 must map to -128 (the bottom of the INT8 range), and the quantization formula is q = round(value / scale + zp). Plugging in:
-128 = round(-1.5 / 0.01686 + zp) = round(-88.95 + zp)  →  zp = -39
So FP32 zero sits at INT8 value -39: q = round(0 / 0.01686 + (-39)) = -39.
Now we can quantize 0.2:
q = round(0.2 / 0.01686 + (-39)) = round(-27.14) = -27
Result: [-128, -27, 127] with ZP = -39.
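For readers who want to check the arithmetic, a minimal NumPy sketch of the signed INT8 scheme above (quantize_int8 is a hypothetical helper for illustration, not the TinyTorch API):

```python
import numpy as np

def quantize_int8(x, min_val, max_val):
    """Affine quantization of FP32 values into signed INT8 ([-128, 127])."""
    scale = (max_val - min_val) / 255.0   # 256 levels -> 255 steps
    zp = -128 - round(min_val / scale)    # INT8 value that represents FP32 zero
    q = np.clip(np.round(x / scale + zp), -128, 127).astype(np.int8)
    return q, scale, zp

x = np.array([-1.5, 0.2, 2.8], dtype=np.float32)
q, scale, zp = quantize_int8(x, -1.5, 2.8)
print(round(scale, 5))  # 0.01686
print(zp)               # -39
print(q)                # [-128  -27  127]

# The buggy diagrams used the UINT8 (0..255) convention instead:
# zp_uint8 = round(-min_val / scale) = 89 here, i.e. the signed
# zero point shifted by the 128 offset between the two ranges.
```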
Where 88 came from
I mixed up the UINT8 (0 to 255) and signed INT8 (-128 to 127) conventions when writing the documentation. The UINT8 formula gives zp = round(-min_val / scale) = round(1.5 / 0.017) = 88. That's the right answer if your integer range starts at 0 instead of -128: the two conventions differ by exactly the 128 offset between the ranges. The code was always using the correct signed INT8 formula, but I wrote the diagram numbers with the wrong one.
This bug was pervasive across the module. The same mistake was in several other diagrams and docstrings too. I've fixed all of them in 99b0eb1 and verified nothing else in the codebase was affected. Thanks for the careful reading, really appreciate it!
@profvjreddi commented on GitHub (Feb 13, 2026):
@all-contributors please add @ as a contributor for ✍️ Doc in TinyTorch
@github-actions[bot] commented on GitHub (Feb 13, 2026):
I've added @ngbolin as a contributor to tinytorch! 🎉
Recognized for: doc
Project: tinytorch (explicitly mentioned in comment)
Based on: @all-contributors please add @ as a contributor for ✍️ Doc in TinyTorch
The contributor list has been updated in:
tinytorch/.all-contributorsrc
tinytorch/README.md
README.md
We love recognizing our contributors! ❤️