[GH-ISSUE #1129] Error: tito milestone run 03 #5664

Closed
opened 2026-04-21 21:39:06 -05:00 by GiteaMirror · 6 comments

Originally created by @Takosaga on GitHub (Jan 22, 2026).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/1129

Originally assigned to: @profvjreddi on GitHub.

load_digit_dataset() raises an error. It does run once the dataset is created manually from datasets/tinydigits, though that step also requires scikit-learn.

GiteaMirror added the "type: bug" and "area: tinytorch" labels 2026-04-21 21:39:06 -05:00

@profvjreddi commented on GitHub (Jan 22, 2026):

Status: Already Fixed

The load_digit_dataset() function works correctly in the current codebase. The TinyDigits dataset files (train.pkl and test.pkl) are bundled in datasets/tinydigits/.

The issue you encountered was likely from an older version before the dataset was bundled, or the dataset files weren't present in your installation.

We also fixed a minor issue where interactive prompts (Run batch size experiment? and Sync achievement?) would cause EOFError when running in non-interactive mode. These now gracefully skip in non-interactive environments.

Please try updating to the latest version (0.1.4), which I will release this morning, and let us know if you still encounter issues!
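
For readers wondering what "gracefully skip in non-interactive environments" can look like, here is a minimal sketch. It is not the TinyTorch implementation: the helper name `ask_yes_no` and its `default` parameter are hypothetical, and the approach (check whether stdin is a terminal, and also catch `EOFError`) is only one plausible way to get the behavior described above.

```python
import sys


def ask_yes_no(question, default=False):
    """Ask a yes/no question; return `default` when nobody can answer.

    Hypothetical helper for illustration only, not part of tito.
    """
    # With no terminal on stdin (CI, pipes, redirected input),
    # input() would hit end-of-file and raise EOFError, so skip the prompt.
    if not sys.stdin.isatty():
        return default
    try:
        answer = input(f"{question} [y/N] ").strip().lower()
    except EOFError:
        # stdin was closed while the prompt was waiting.
        return default
    return answer in ("y", "yes")
```

Called as `ask_yes_no("Run batch size experiment?")`, this returns the default without prompting when stdin is not a terminal, so the run continues instead of crashing.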


@profvjreddi commented on GitHub (Jan 22, 2026):

@all-contributors please add @Takosaga for bug


@profvjreddi commented on GitHub (Jan 22, 2026):

(Testing some automated workflows -- so ignore all the github actions 🤗 )


@profvjreddi commented on GitHub (Jan 22, 2026):

v0.1.4 is now released and should fix the load_digit_dataset() error. We also fixed interactive prompts that were causing EOFError in non-interactive environments.

To update, try:

tito system update

If that doesn't work, you can re-run the install script:

curl -sSL mlsysbook.ai/tinytorch/install.sh | bash

Then try tito milestone run 03 again - let me know if you hit any other issues!

You've been added to the TinyTorch Contributors list (https://github.com/harvard-edge/cs249r_book?tab=readme-ov-file#-tinytorch-contributors) for helping find this bug. It's a small start, but I'm hoping folks who help test and improve TinyTorch will eventually teach this material and help spread the learning to others. Thanks for being part of this! 🙏


@Takosaga commented on GitHub (Jan 23, 2026):

Did a fresh install

tito system update
You're on the latest version
Version: v0.1.4

tito milestone run 03
The Data:
╭──────────────────────── 📊 Dataset ────────────────────────╮
│ Loading TinyDigits Dataset │
│ Curated 8×8 handwritten digits optimized for fast learning │
╰────────────────────────────────────────────────────────────╯
Traceback (most recent call last):
  File "/home/takosaga/Projects/tiny_torch_temp/tinytorch/milestones/03_1986_mlp/01_rumelhart_tinydigits.py", line 630, in <module>
    train_mlp()
  File "/home/takosaga/Projects/tiny_torch_temp/tinytorch/milestones/03_1986_mlp/01_rumelhart_tinydigits.py", line 392, in train_mlp
    train_images, train_labels, test_images, test_labels = load_digit_dataset()
                                                           ^^^^^^^^^^^^^^^^^^^^
  File "/home/takosaga/Projects/tiny_torch_temp/tinytorch/milestones/03_1986_mlp/01_rumelhart_tinydigits.py", line 223, in load_digit_dataset
    train_data = pickle.load(f)
                 ^^^^^^^^^^^^^^
_pickle.UnpicklingError: invalid load key, 'v'.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ Part TinyDigits completed with errors


@profvjreddi commented on GitHub (Jan 23, 2026):

Ah, found it! The dataset files are stored in Git LFS, and it looks like you got the pointer files instead of the actual data. That's why pickle is choking on 'v' - it's trying to read the text "version https://git-lfs..." as binary data.

I'm going to remove these from LFS since they're tiny anyway (~300KB) - no reason to have that dependency. Will push a fix shortly!
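
The failure mode described above can be caught before unpickling. The sketch below is illustrative only and is not the fix applied in the repository (the fix was to take the files out of LFS). The names `LFS_POINTER_PREFIX`, `looks_like_lfs_pointer`, and `load_pickle_checked` are hypothetical, and the check relies only on the text prefix quoted in this thread.

```python
import pickle

# Prefix quoted in the thread; a real pointer file continues past this point.
LFS_POINTER_PREFIX = b"version https://git-lfs"


def looks_like_lfs_pointer(path):
    """Return True if the file begins with the Git LFS pointer prefix."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def load_pickle_checked(path):
    """Unpickle `path`, failing early with a clear message on an LFS pointer."""
    if looks_like_lfs_pointer(path):
        raise RuntimeError(
            f"{path} looks like a Git LFS pointer file, not the real data"
        )
    with open(path, "rb") as f:
        return pickle.load(f)
```

With a guard like this, a checkout that received pointer files would report the LFS problem directly instead of surfacing as `_pickle.UnpicklingError: invalid load key, 'v'`, which comes from pickle treating the leading `v` of the pointer text as an opcode.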

Reference: github-starred/cs249r_book#5664