mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-06 17:49:07 -05:00
[PR #1335] [MERGED] fix(tinytorch): ensure model params have requires_grad=True in trainer_init #7281
📋 Pull Request Information
Original PR: https://github.com/harvard-edge/cs249r_book/pull/1335
Author: @Shashank-Tripathi-07
Created: 4/16/2026
Status: ✅ Merged
Merged: 4/16/2026
Merged by: @profvjreddi
Base: dev ← Head: fix/issue-1334-trainer-requires-grad

📝 Commits (2)
6e8ef66 fix(tinytorch): ensure model params have requires_grad=True in trainer_init
bc501a2 fix(tinytorch): guard requires_grad loop against non-Tensor params in trainer_init

📊 Changes
1 file changed (+11 additions, -2 deletions)
📝 tinytorch/src/08_training/08_training.py (+11 -2)

📄 Description
Fixes #1334 — test_unit_trainer_optimizer_update fails with "Parameters should change after optimizer update" because param.grad is never populated.

Root cause:
- Linear.__init__ creates weights with requires_grad=False.
- tracked_mse_forward only attaches a backward graph if predictions.requires_grad — so loss.backward() returns immediately and no gradients flow.
- Optimizer.__init__ does set the flag, but only on construction; any ordering edge case silently breaks the chain.

Fix: iterate over model.parameters() in trainer_init and set requires_grad = True, making the Trainer the authoritative owner of gradient tracking.

Changed file: tinytorch/src/08_training/08_training.py — requires_grad = True loop in solution block; APPROACH docstring updated.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.