[GH-ISSUE #1131] Module 06 "Autograd": Tensor.__init__() got an unexpected keyword argument 'requires_grad' #5665

Closed
opened 2026-04-21 21:39:15 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @ngbolin on GitHub (Jan 23, 2026).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/1131

Hello,

I realised that in the 6th module ("Autograd"), the tensors that we have built do not accept the "requires_grad" parameter, resulting in the following error.

Should we have used the Tensor class from tinytorch's library instead? Thank you!

Image
GiteaMirror added the type: bug and area: tinytorch labels 2026-04-21 21:39:15 -05:00
Author
Owner

@ngbolin commented on GitHub (Jan 24, 2026):

Hi, suggesting a proposed fix: move the test_unit_function_classes() function in both the .ipynb file and the .py file to after the enable_autograd() function is defined, since enable_autograd() enhances the existing Tensor class with autograd capabilities, including support for the "requires_grad" parameter.

Once this is done for both files, I was able to get module 06 done!

Image Image
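The ordering dependency described above can be illustrated with a minimal sketch. The names `Tensor` and `enable_autograd()` follow the module, but the bodies here are hypothetical stand-ins, not the module's actual implementation:

```python
class Tensor:
    """Minimal stand-in for the module's Tensor class (no autograd yet)."""
    def __init__(self, data):
        self.data = data


def enable_autograd():
    """Hypothetical sketch: patch Tensor so __init__ accepts requires_grad."""
    original_init = Tensor.__init__

    def init_with_grad(self, data, requires_grad=False):
        original_init(self, data)
        self.requires_grad = requires_grad
        self.grad = None

    Tensor.__init__ = init_with_grad


# Before enable_autograd(), Tensor(..., requires_grad=True) raises
# TypeError -- which is why tests using that keyword must run after it.
enable_autograd()
t = Tensor([1.0, 2.0], requires_grad=True)
print(t.requires_grad)  # True
```

This is why moving the test after `enable_autograd()` resolves the `unexpected keyword argument` error: the keyword only exists once the class has been patched.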
Author
Owner

@profvjreddi commented on GitHub (Jan 24, 2026):

Hey @ngbolin -- you're good at this. 👍

This is, in fact, exactly the bug I was fixing just now, about an hour ago. I realized I messed up some things. I really appreciate you trying to help me. I'm going to try to do a release today with a lot of fixes that I've been working on over the holidays and January.

Author
Owner

@ngbolin commented on GitHub (Jan 24, 2026):

No worries at all @profvjreddi!

Apologies for troubling you, but I'm not sure whether the following was fixed for module 07, where the function test_unit_sgd_optimizer() is defined and run to check that the SGD optimizer is correctly implemented.

For some reason, the parameter gradient, param.grad, is "forgotten" when the optimizer is defined. Thus, when we apply the following formula after defining the SGD optimizer (optimizer = SGD([param], lr=0.1)), param.grad becomes undefined:

expected = original_data - 0.1 * param.grad.data

I tried tinkering with it, and I realized that when you move the definition of param.grad to after the optimizer is defined, the code runs as it should; the SGD optimizer's super().__init__(params) call seems to replace the existing gradients on the param Tensor.

Once we move the definition of param.grad to after the optimizer is defined, it works as normal (see image below). Thank you!

Image

Alternatively, we could also replace the param.grad.data with the true gradients [0.99, 1.98].
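The behavior described above can be reproduced with a minimal sketch. The `Optimizer` and `SGD` bodies here are hypothetical, modeled on the reset behavior this thread describes, not the module's actual code:

```python
class Tensor:
    """Minimal stand-in: holds data and an optional gradient."""
    def __init__(self, data):
        self.data = data
        self.grad = None


class Optimizer:
    """Hypothetical base class whose __init__ clears existing gradients."""
    def __init__(self, params):
        self.params = params
        for p in params:
            p.grad = None  # this reset is what "forgets" the gradients


class SGD(Optimizer):
    def __init__(self, params, lr=0.01):
        super().__init__(params)  # wipes param.grad via the base class
        self.lr = lr


param = Tensor([1.0, 2.0])
param.grad = Tensor([0.1, 0.2])   # gradient set BEFORE the optimizer...
optimizer = SGD([param], lr=0.1)
print(param.grad)                 # None -- the gradient was wiped

param.grad = Tensor([0.1, 0.2])   # ...set AFTER, so it survives
print(param.grad.data)            # [0.1, 0.2]
```

Under this model, any gradient assigned before constructing the optimizer is lost, which is why reordering the test (or using known gradient values directly) makes it pass.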

Author
Owner

@profvjreddi commented on GitHub (Jan 24, 2026):

There's an annoying bug. Please don't apologize; I should really be thanking you because you are helping us improve this. I actually believe I fixed this bug yesterday, but I have to release it as a new update.

Author
Owner

@profvjreddi commented on GitHub (Jan 25, 2026):

This bug has been fixed and merged to the dev branch in PR #1136.

Root cause: Optimizer.__init__ resets param.grad = None for all parameters (line 276 in 07_optimizers.py). When tests set gradients before creating the optimizer, the gradients were wiped out.

Fix: All test functions now set gradients after creating the optimizer:

```python
# Before (buggy):
param.grad = Tensor([0.1, 0.2])   # Set BEFORE optimizer
optimizer = SGD([param], lr=0.1)  # This resets param.grad to None!

# After (fixed):
optimizer = SGD([param], lr=0.1)  # Create optimizer first
param.grad = Tensor([0.1, 0.2])   # Set gradient AFTER
```
Verification: All 20 inline module tests pass, and CI stages 1-5 are green.

Author
Owner

@ngbolin commented on GitHub (Jan 27, 2026):

Thanks for the fix! :-)


Reference: github-starred/cs249r_book#5665