[PR #1136] [MERGED] Fix optimizer gradient bug and CI improvements #2359

GiteaMirror · 2026-04-11T08:25:20-05:00

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1136
Author: @profvjreddi
Created: 1/25/2026
Status: ✅ Merged
Merged: 1/25/2026
Merged by: @profvjreddi

Base: dev ← Head: feature/tinytorch

📝 Commits (10+)

ed709c9 fix(tests): resolve import errors for honest test execution
22a94d2 fix(ci): increase unit test timeout from 10 to 30 minutes
f524506 fix(tests): resolve API mismatches and fix test infrastructure
770dac3 fix(tests): correct API calls in system milestone tests
1dab26b fix(tests): add optimizer creation to enable gradient flow in tests
975c92a fix(cli): resolve Rich markup rendering in module completion message
aafd7a8 refactor(modules): standardize formatting and fix NBGrader directives
389989e refactor(tests): clean up test folder and fix gradient flow issues
999fd13 refactor(tests): reorganize test folders and fix misplaced files
d53722e fix(tests): skip flaky performance and transformer training tests in CI

📊 Changes

161 files changed (+3813 additions, -29990 deletions)

View changed files

📝 .github/workflows/tinytorch-ci.yml (+18 -78)
➖ .vscode/settings.json (+0 -23)
📝 tinytorch/milestones/02_1969_xor/01_xor_crisis.py (+2 -2)
📝 tinytorch/milestones/03_1986_mlp/01_rumelhart_tinydigits.py (+3 -10)
📝 tinytorch/milestones/06_2018_mlperf/02_generation_speedup.py (+1 -1)
📝 tinytorch/src/01_tensor/01_tensor.py (+52 -92)
📝 tinytorch/src/02_activations/02_activations.py (+204 -205)
📝 tinytorch/src/03_layers/03_layers.py (+123 -92)
📝 tinytorch/src/04_losses/04_losses.py (+115 -71)
📝 tinytorch/src/05_dataloader/05_dataloader.py (+161 -35)
📝 tinytorch/src/06_autograd/06_autograd.py (+87 -41)
📝 tinytorch/src/07_optimizers/07_optimizers.py (+144 -96)
📝 tinytorch/src/08_training/08_training.py (+117 -66)
📝 tinytorch/src/09_convolutions/09_convolutions.py (+301 -80)
📝 tinytorch/src/10_tokenization/10_tokenization.py (+240 -97)
📝 tinytorch/src/11_embeddings/11_embeddings.py (+142 -64)
📝 tinytorch/src/12_attention/12_attention.py (+170 -145)
📝 tinytorch/src/13_transformers/13_transformers.py (+89 -43)
📝 tinytorch/src/14_profiling/14_profiling.py (+249 -176)
📝 tinytorch/src/15_quantization/15_quantization.py (+290 -233)

...and 80 more files

📄 Description

Summary

This PR includes:

1. Bug Fixes:
   - fix(optimizers): Set gradients AFTER optimizer creation (fixes #1131)
   - fix(tokenization): Make vocab_size optional in BPETokenizer.train()
   - fix: Restore Conv2dBackward and MaxPool2dBackward for CNN gradient flow
2. Test Infrastructure:
   - Skip slow milestone tests in unit test runs to prevent CI timeouts
   - Add full milestone run tests (tests/milestones/test_milestones_run.py)
   - Remove redundant and flaky tests
3. Style Improvements:
   - Standardize unit test emoji to test tube across all 20 modules
   - Minor formatting fixes in milestone files
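
To illustrate why the ordering in the first bug fix can matter, here is a minimal sketch. It does not use TinyTorch's real classes: `Param` and `ToySGD` are hypothetical stand-ins, and the assumption that an optimizer clears existing gradients when it is constructed is mine, not something the PR description states. Under that assumption, a gradient assigned before the optimizer exists is lost, while one assigned afterwards is applied.

```python
class Param:
    """Hypothetical trainable value (a stand-in, not TinyTorch's Tensor)."""

    def __init__(self, data):
        self.data = [float(x) for x in data]
        self.grad = None


class ToySGD:
    """Hypothetical SGD optimizer.

    Assumption for illustration only: constructing the optimizer
    resets any gradient already stored on its parameters.
    """

    def __init__(self, params, lr=0.1):
        self.params = list(params)
        self.lr = lr
        for p in self.params:
            p.grad = None  # assumed behavior: construction clears gradients

    def step(self):
        for p in self.params:
            if p.grad is None:
                continue  # nothing to apply for this parameter
            p.data = [w - self.lr * g for w, g in zip(p.data, p.grad)]


# Problematic order: gradient set BEFORE the optimizer is created.
w_before = Param([1.0, 2.0])
w_before.grad = [1.0, 1.0]
opt_before = ToySGD([w_before])  # wipes the gradient set above
opt_before.step()                # silently does nothing

# Fixed order: gradient set AFTER the optimizer is created.
w_after = Param([1.0, 2.0])
opt_after = ToySGD([w_after])
w_after.grad = [1.0, 1.0]
opt_after.step()                 # weights are updated

print("set before:", w_before.data)
print("set after: ", w_after.data)
```

In this toy, the "set before" weights stay at their initial values, which is the kind of silent no-op a test can miss unless it checks that the parameters actually changed.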

Test plan

- [x] All 20 inline module tests pass locally
- [x] CI stages 1-5 all pass (run 21324095576)
- [x] Unit tests run in 2m28s (previously timed out at 12m32s)
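
The description does not say how the slow milestone tests are excluded from unit test runs. One common way to get that effect, sketched here with the standard library's `unittest` only, is to gate slow tests behind an opt-in environment variable. The variable name `RUN_MILESTONES`, the test class, and the `run_suite` helper are all hypothetical and are not taken from the repository.

```python
import io
import os
import unittest

# Hypothetical opt-in switch: slow tests run only when RUN_MILESTONES=1.
RUN_SLOW = os.environ.get("RUN_MILESTONES") == "1"


class TestMilestones(unittest.TestCase):
    @unittest.skipUnless(RUN_SLOW, "slow milestone run; set RUN_MILESTONES=1")
    def test_full_milestone_run(self):
        # Placeholder for a long-running end-to-end milestone script.
        self.assertTrue(True)

    def test_fast_unit_check(self):
        # Cheap check that always runs in the unit test stage.
        self.assertEqual(sum([1, 2, 3]), 6)


def run_suite():
    """Run the suite quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMilestones)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


result = run_suite()
print("tests run:", result.testsRun, "skipped:", len(result.skipped))
```

With this layout, a default run stays fast because the slow test is reported as skipped, and a separate job can set the variable to exercise the full milestone run.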

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-11 08:25:20 -05:00

GiteaMirror closed this issue 2026-04-11 08:25:20 -05:00

Reference: github-starred/cs249r_book#2359