[GH-ISSUE #1131] Module 06 "Autograd": Tensor.__init__() got an unexpected keyword argument 'requires_grad' #5665
Originally created by @ngbolin on GitHub (Jan 23, 2026).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/1131
Hello,
I realised that in Module 06 ("Autograd"), the Tensor class we have built does not accept the "requires_grad" parameter, resulting in the following error:
Tensor.__init__() got an unexpected keyword argument 'requires_grad'
Should we have used the Tensor class from tinytorch's library instead? Thank you!
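For illustration, a minimal sketch of the failing pattern, using a stand-in Tensor class (not the module's actual code) that lacks requires_grad support:

```python
class Tensor:
    """Stand-in for the Tensor class built earlier in the module,
    before autograd support is added (no requires_grad parameter yet)."""
    def __init__(self, data):
        self.data = data

try:
    # The keyword is not accepted yet, so this raises:
    # TypeError: Tensor.__init__() got an unexpected keyword argument 'requires_grad'
    x = Tensor([1.0, 2.0, 3.0], requires_grad=True)
except TypeError as err:
    print(err)
```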
@ngbolin commented on GitHub (Jan 24, 2026):
Hi, here is a proposed fix: move the test_unit_function_classes() call, in both the .ipynb and .py files, to after the enable_autograd() function has been defined and applied, since enable_autograd() enhances the existing Tensor class with autograd capabilities, including support for the "requires_grad" parameter.
Once this is done in both the .ipynb and .py files, I was able to complete Module 06!
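For reference, a rough sketch of the suggested ordering, with placeholder bodies standing in for the module's real definitions:

```python
def enable_autograd():
    # In the module, this patches the existing Tensor class with autograd
    # support, including acceptance of the requires_grad keyword argument.
    ...

enable_autograd()  # apply the patch first

def test_unit_function_classes():
    # This test constructs tensors with requires_grad=True, so it only
    # works once enable_autograd() has been applied.
    ...

test_unit_function_classes()  # run the unit test only after the patch
```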
@profvjreddi commented on GitHub (Jan 24, 2026):
Hey @ngbolin -- you're good at this. 👍
This is, in fact, exactly the bug I was fixing just now, about an hour ago. I realized I messed up some things. I really appreciate you trying to help me. I'm going to try and do a release today with a lot of fixes that I've been working on over the holidays and January.
@ngbolin commented on GitHub (Jan 24, 2026):
No worries at all @profvjreddi!
Apologies for troubling you, but I am not sure whether the following was fixed for Module 07, where test_unit_sgd_optimizer() is defined and run to check that the SGD optimizer is correctly implemented.
For some reason, the parameter gradient param.grad is "forgotten" when the optimizer is created. As a result, when we apply the following formula after constructing the SGD optimizer (optimizer = SGD([param], lr=0.1)), param.grad is undefined:
expected = original_data - 0.1 * param.grad.data
After some tinkering, I realised that when you move the assignment of param.grad to after the optimizer is created, the code runs as it should; the SGD optimizer's super().__init__(params) call appears to reset the existing gradients on the param Tensor.
Once the assignment of param.grad is moved to after the optimizer is created, the test passes as expected (see image below). Thank you!
Alternatively, we could replace param.grad.data in the expected-value formula with the true gradients [0.99, 1.98].
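To make the ordering issue concrete, here is a small self-contained sketch; the Tensor and SGD classes below are simplified stand-ins written for illustration, not the module's actual implementations:

```python
import numpy as np

class Tensor:
    """Simplified stand-in for the module's Tensor class."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float64)
        self.grad = None

class SGD:
    """Simplified stand-in for the module's SGD optimizer."""
    def __init__(self, params, lr):
        self.params = list(params)
        self.lr = lr
        for p in self.params:
            p.grad = None  # this reset is what wipes gradients set earlier

    def step(self):
        for p in self.params:
            if p.grad is not None:
                p.data = p.data - self.lr * p.grad.data

param = Tensor([1.0, 2.0])
original_data = param.data.copy()

optimizer = SGD([param], lr=0.1)    # create the optimizer first...
param.grad = Tensor([0.99, 1.98])   # ...then set the gradient so it is not reset

optimizer.step()
expected = original_data - 0.1 * param.grad.data
assert np.allclose(param.data, expected)
```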
@profvjreddi commented on GitHub (Jan 24, 2026):
There's an annoying bug. Please don't apologize; I should really be thanking you because you are helping us improve this. I actually believe I fixed this bug yesterday, but I have to release it as a new update.
@profvjreddi commented on GitHub (Jan 25, 2026):
This bug has been fixed and merged to the dev branch in PR #1136.
Root cause: Optimizer.__init__ resets param.grad = None for all parameters (line 276 in 07_optimizers.py). When tests set gradients before creating the optimizer, the gradients were wiped out.
Fix: All test functions now set gradients after creating the optimizer.
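For illustration only (this is not the actual diff from PR #1136), and reusing the stand-in Tensor and SGD classes from the sketch in the earlier comment, the change in each test amounts to swapping the order of two lines:

```python
# Before: gradient is wiped out by Optimizer.__init__'s reset.
param.grad = Tensor([0.99, 1.98])
optimizer = SGD([param], lr=0.1)   # param.grad is now None again

# After: gradient is set once the optimizer already exists, so it survives.
optimizer = SGD([param], lr=0.1)
param.grad = Tensor([0.99, 1.98])
optimizer.step()
```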
Verification: All 20 inline module tests pass, and CI stages 1-5 are green.
@ngbolin commented on GitHub (Jan 27, 2026):
Thanks for the fix! :-)