[PR #1335] [MERGED] fix(tinytorch): ensure model params have requires_grad=True in trainer_init #9031

Closed
opened 2026-05-03 01:14:14 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1335
Author: @Shashank-Tripathi-07
Created: 4/16/2026
Status: Merged
Merged: 4/16/2026
Merged by: @profvjreddi

Base: dev ← Head: fix/issue-1334-trainer-requires-grad


📝 Commits (2)

  • 6e8ef66 fix(tinytorch): ensure model params have requires_grad=True in trainer_init
  • bc501a2 fix(tinytorch): guard requires_grad loop against non-Tensor params in trainer_init

📊 Changes

1 file changed (+11 additions, -2 deletions)


📝 tinytorch/src/08_training/08_training.py (+11 -2)

📄 Description

Fixes #1334: test_unit_trainer_optimizer_update fails with "Parameters should change after optimizer update" because param.grad is never populated.

Root cause: Linear.__init__ creates weights with requires_grad=False. tracked_mse_forward only attaches a backward graph if predictions.requires_grad — so loss.backward() returns immediately and no gradients flow. Optimizer.__init__ does set the flag but only on construction; any ordering edge case silently breaks the chain.
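For context, here is a toy stand-in (not tinytorch code; every name below is hypothetical, only the gating behavior described above is mimicked) showing how a False requires_grad flag silently turns the backward pass into a no-op:

```python
class ToyTensor:
    """Hypothetical stand-in for a tinytorch tensor with a requires_grad flag."""
    def __init__(self, value, requires_grad=False):
        self.value = value
        self.requires_grad = requires_grad
        self.grad = None

def tracked_loss(pred):
    # Mirrors the gate described for tracked_mse_forward: a backward
    # function is only attached when the input requests gradients.
    out = ToyTensor(pred.value ** 2)
    if pred.requires_grad:
        out.backward_fn = lambda: setattr(pred, "grad", 2 * pred.value)
    return out

pred = ToyTensor(3.0)                          # flag defaults to False, like Linear's weights
loss = tracked_loss(pred)
getattr(loss, "backward_fn", lambda: None)()   # backward is effectively a no-op
print(pred.grad)                               # None -> the optimizer has nothing to apply
```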

Fix: iterate model.parameters() in trainer_init and set requires_grad = True on each Tensor parameter (non-Tensor entries are skipped, per commit bc501a2), making the Trainer the authoritative owner of gradient tracking. A sketch of the change follows below.
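A minimal sketch of the merged change, assuming trainer_init stores the model and that tinytorch parameters expose a writable requires_grad attribute; the exact signature and the surrounding solution-block code may differ:

```python
def trainer_init(self, model, optimizer, loss_fn):
    """Set up the trainer and take ownership of gradient tracking."""
    self.model = model
    self.optimizer = optimizer
    self.loss_fn = loss_fn

    # Ensure every learnable parameter requests gradients, regardless of
    # how Linear.__init__ or Optimizer.__init__ left the flag.
    for param in model.parameters():
        # Guard against non-Tensor entries a layer might expose
        # (commit bc501a2); only flip the flag where it exists.
        if hasattr(param, "requires_grad"):
            param.requires_grad = True
```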

| File | Change |
|------|--------|
| tinytorch/src/08_training/08_training.py | requires_grad = True loop in solution block; APPROACH docstring updated |

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-03 01:14:14 -05:00
Reference: github-starred/cs249r_book#9031