[PR #1136] [MERGED] Fix optimizer gradient bug and CI improvements #2359

Closed
opened 2026-04-11 08:25:20 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1136
Author: @profvjreddi
Created: 1/25/2026
Status: Merged
Merged: 1/25/2026
Merged by: @profvjreddi

Base: dev ← Head: feature/tinytorch


📝 Commits (10+)

  • ed709c9 fix(tests): resolve import errors for honest test execution
  • 22a94d2 fix(ci): increase unit test timeout from 10 to 30 minutes
  • f524506 fix(tests): resolve API mismatches and fix test infrastructure
  • 770dac3 fix(tests): correct API calls in system milestone tests
  • 1dab26b fix(tests): add optimizer creation to enable gradient flow in tests
  • 975c92a fix(cli): resolve Rich markup rendering in module completion message
  • aafd7a8 refactor(modules): standardize formatting and fix NBGrader directives
  • 389989e refactor(tests): clean up test folder and fix gradient flow issues
  • 999fd13 refactor(tests): reorganize test folders and fix misplaced files
  • d53722e fix(tests): skip flaky performance and transformer training tests in CI

📊 Changes

161 files changed (+3813 additions, -29990 deletions)

📝 .github/workflows/tinytorch-ci.yml (+18 -78)
.vscode/settings.json (+0 -23)
📝 tinytorch/milestones/02_1969_xor/01_xor_crisis.py (+2 -2)
📝 tinytorch/milestones/03_1986_mlp/01_rumelhart_tinydigits.py (+3 -10)
📝 tinytorch/milestones/06_2018_mlperf/02_generation_speedup.py (+1 -1)
📝 tinytorch/src/01_tensor/01_tensor.py (+52 -92)
📝 tinytorch/src/02_activations/02_activations.py (+204 -205)
📝 tinytorch/src/03_layers/03_layers.py (+123 -92)
📝 tinytorch/src/04_losses/04_losses.py (+115 -71)
📝 tinytorch/src/05_dataloader/05_dataloader.py (+161 -35)
📝 tinytorch/src/06_autograd/06_autograd.py (+87 -41)
📝 tinytorch/src/07_optimizers/07_optimizers.py (+144 -96)
📝 tinytorch/src/08_training/08_training.py (+117 -66)
📝 tinytorch/src/09_convolutions/09_convolutions.py (+301 -80)
📝 tinytorch/src/10_tokenization/10_tokenization.py (+240 -97)
📝 tinytorch/src/11_embeddings/11_embeddings.py (+142 -64)
📝 tinytorch/src/12_attention/12_attention.py (+170 -145)
📝 tinytorch/src/13_transformers/13_transformers.py (+89 -43)
📝 tinytorch/src/14_profiling/14_profiling.py (+249 -176)
📝 tinytorch/src/15_quantization/15_quantization.py (+290 -233)

...and 80 more files

📄 Description

Summary

This PR includes:

  1. Bug Fixes:

    • fix(optimizers): Set gradients AFTER optimizer creation (fixes #1131)
    • fix(tokenization): Make vocab_size optional in BPETokenizer.train()
    • fix: Restore Conv2dBackward and MaxPool2dBackward for CNN gradient flow
  2. Test Infrastructure:

    • Skip slow milestone tests in unit test runs to prevent CI timeouts
    • Add full milestone run tests (tests/milestones/test_milestones_run.py)
    • Remove redundant and flaky tests
  3. Style Improvements:

    • Standardize unit test emoji to test tube across all 20 modules
    • Minor formatting fixes in milestone files
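
The first bug fix (set gradients after optimizer creation) is an ordering issue. The PR text does not show the affected code, so the following is only a minimal sketch of how such a bug can arise, assuming an optimizer whose constructor resets gradient state on the parameters it receives. `Param` and `SGD` here are hypothetical stand-ins, not TinyTorch's actual classes.

```python
# Hypothetical stand-ins for illustration; not TinyTorch's actual API.
class Param:
    def __init__(self, value):
        self.value = value
        self.grad = None


class SGD:
    def __init__(self, params, lr=0.1):
        self.params = list(params)
        self.lr = lr
        # Assumption: construction resets gradient state on every parameter,
        # so any gradient assigned earlier is discarded.
        for p in self.params:
            p.grad = None

    def step(self):
        for p in self.params:
            if p.grad is None:
                continue  # no gradient available, parameter is left unchanged
            p.value -= self.lr * p.grad


def update_with_grad_before_optimizer():
    w = Param(1.0)
    w.grad = 0.5      # assigned too early: wiped by SGD.__init__
    opt = SGD([w])
    opt.step()
    return w.value


def update_with_grad_after_optimizer():
    w = Param(1.0)
    opt = SGD([w])
    w.grad = 0.5      # assigned after construction: survives until step()
    opt.step()
    return w.value
```

Under this assumption the first ordering silently performs no update, while the second applies the expected step of `lr * grad`. A test that sets gradients by hand therefore has to create the optimizer first.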

Test plan

  • All 20 inline module tests pass locally
  • CI stages 1-5 all pass (run 21324095576)
  • Unit tests run in 2m28s (previously timed out at 12m32s)
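
The drop in unit test time comes from keeping slow milestone runs out of the default unit test pass. The PR does not say which mechanism is used, so the sketch below shows one possible approach with the standard library's `unittest` skip decorators. The `RUN_MILESTONES` environment variable, the `milestones_enabled` helper, and the test bodies are all hypothetical.

```python
import os
import unittest


def milestones_enabled(env=os.environ):
    # Hypothetical opt-in switch: slow milestone tests run only when
    # RUN_MILESTONES=1 is present in the environment.
    return env.get("RUN_MILESTONES") == "1"


class MilestoneTests(unittest.TestCase):
    def test_quick_unit_check(self):
        # Always runs as part of the regular unit test pass.
        self.assertEqual(sum(range(4)), 6)

    @unittest.skipUnless(milestones_enabled(), "slow; set RUN_MILESTONES=1 to run")
    def test_full_milestone_run(self):
        # Placeholder for a long-running end-to-end milestone script.
        self.assertEqual(sum(range(1000)), 499500)
```

With this layout, the default run reports the milestone test as skipped, and a separate CI stage can set the variable to run it in full.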

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-11 08:25:20 -05:00
Reference: github-starred/cs249r_book#2359