TinyTorch

github-starred/TinyTorch


mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-05-03 03:09:33 -05:00

Commit Graph


Author

SHA1

Message

Date

Vijay Janapa Reddi

97fece7b5f

Finalize Module 08 and add integration tests

Added integration tests for DataLoader:
- test_dataloader_integration.py in tests/integration/
  - Training workflow integration
  - Shuffle consistency across epochs
  - Memory efficiency verification
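The checks listed above could look roughly like the following. This is a minimal sketch, not the contents of test_dataloader_integration.py: `iterate_batches` is a hypothetical stand-in for the real DataLoader, and the helper names are invented for illustration.

```python
import random


def iterate_batches(samples, batch_size, shuffle, rng):
    """Hypothetical stand-in for a DataLoader: lazily yields lists of samples."""
    order = list(range(len(samples)))
    if shuffle:
        rng.shuffle(order)  # a fresh permutation each time an epoch starts
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


def flatten(batches):
    """Concatenate an iterable of batches into one flat list."""
    return [item for batch in batches for item in batch]


def check_shuffle_consistency(samples, batch_size, seed=0):
    """Each epoch must visit every sample exactly once, whatever the order."""
    rng = random.Random(seed)
    epoch_1 = flatten(iterate_batches(samples, batch_size, True, rng))
    epoch_2 = flatten(iterate_batches(samples, batch_size, True, rng))
    assert sorted(epoch_1) == sorted(samples)
    assert sorted(epoch_2) == sorted(samples)
    return epoch_1, epoch_2


def check_lazy_batching(samples, batch_size):
    """Batches are built on demand, so asking for one batch builds only that batch."""
    loader = iterate_batches(samples, batch_size, False, random.Random(0))
    first = next(loader)
    assert len(first) == min(batch_size, len(samples))
    return first
```

Calling `check_shuffle_consistency(list(range(10)), 3)` exercises two shuffled epochs. A real memory test would likely measure allocations rather than rely on generator laziness alone.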

Updated Module 08:
- Added note about optional performance analysis
- Clarified that analysis functions can be run manually
- Clean flow: text → code → tests

Updated datasets/tiny/README.md:
- Minor formatting fixes

Module 08 is now complete and ready to export:
✅ Dataset abstraction
✅ TensorDataset implementation
✅ DataLoader with batching/shuffling
✅ ASCII visualizations for understanding
✅ Unit tests (in module)
✅ Integration tests (in tests/)
✅ Performance analysis tools (optional)
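
For orientation, the three pieces named in this checklist (Dataset, TensorDataset, DataLoader) can be sketched in plain Python. This is an illustrative assumption about their shape, not TinyTorch's actual code: it uses lists where the real module presumably uses its own Tensor type, and the `seed` parameter is invented here to make shuffling reproducible.

```python
import random


class Dataset:
    """Abstract interface: a sized, indexable collection of samples."""

    def __len__(self):
        raise NotImplementedError

    def __getitem__(self, index):
        raise NotImplementedError


class TensorDataset(Dataset):
    """Wraps parallel sequences (e.g. features and labels) of equal length."""

    def __init__(self, *arrays):
        if not arrays:
            raise ValueError("at least one array is required")
        if any(len(a) != len(arrays[0]) for a in arrays):
            raise ValueError("all arrays must have the same length")
        self.arrays = arrays

    def __len__(self):
        return len(self.arrays[0])

    def __getitem__(self, index):
        # One sample is a tuple holding the index-th entry of every array.
        return tuple(a[index] for a in self.arrays)


class DataLoader:
    """Iterates over a dataset in batches, optionally in shuffled order."""

    def __init__(self, dataset, batch_size=1, shuffle=False, seed=None):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self._rng = random.Random(seed)  # hypothetical: seeded for reproducibility

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return (len(self.dataset) + self.batch_size - 1) // self.batch_size

    def __iter__(self):
        indices = list(range(len(self.dataset)))
        if self.shuffle:
            self._rng.shuffle(indices)
        for start in range(0, len(indices), self.batch_size):
            chunk = indices[start:start + self.batch_size]
            samples = [self.dataset[i] for i in chunk]
            # Transpose [(x, y), (x, y)] into ([x, x], [y, y]).
            yield tuple(list(column) for column in zip(*samples))


# Usage: five samples in batches of two give three batches, the last one partial.
dataset = TensorDataset([[0.0], [1.0], [2.0], [3.0], [4.0]], [0, 1, 0, 1, 0])
loader = DataLoader(dataset, batch_size=2)
batches = list(loader)
```

Keeping the index permutation inside `__iter__` means each new epoch reshuffles without copying the underlying data.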

Next: Export with 'bin/tito export 08_dataloader'

2025-09-30 16:07:55 -04:00

Vijay Janapa Reddi

79f8fe38d0

Add tiny datasets infrastructure with 8×8 digits

Created datasets/tiny/ for shipping small datasets with TinyTorch:

New Structure:
- datasets/tiny/digits_8x8.npz (67KB, 1,797 samples)
  - 8×8 handwritten digits from UCI/sklearn
  - Normalized to [0-1], ready for immediate use
  - Perfect for DataLoader learning (Module 08)

- datasets/tiny/README.md
  - Full documentation and usage examples
  - Philosophy: tiny (learn) → full (practice) → custom (master)

- datasets/tiny/create_digits_8x8.py
  - Extraction script showing how dataset was created
  - Reproducible from sklearn.datasets.load_digits()
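
An extraction along the lines described above might look like the sketch below. It is not the contents of create_digits_8x8.py: the archive key names `images` and `labels`, the float32 dtype, and all function names are assumptions. The division by 16 relies on sklearn's digits having integer pixel intensities from 0 to 16.

```python
import numpy as np


def normalize_digits(images):
    """Scale raw pixel intensities (0..16 in sklearn's digits) to [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 16.0


def save_digits(path, images, labels):
    """Write normalized images and integer labels to a compressed .npz file."""
    # Key names "images" and "labels" are assumptions, not confirmed by the source.
    np.savez_compressed(
        path,
        images=normalize_digits(images),
        labels=np.asarray(labels, dtype=np.int64),
    )


def load_digits_archive(path):
    """Read the arrays back from the archive written by save_digits."""
    with np.load(path) as archive:
        return archive["images"], archive["labels"]


def extract_from_sklearn(path):
    """Build the archive from sklearn's bundled digits (requires scikit-learn)."""
    from sklearn.datasets import load_digits  # imported lazily on purpose

    digits = load_digits()  # digits.images has shape (1797, 8, 8)
    save_digits(path, digits.images, digits.target)
```

Under these assumptions, running `extract_from_sklearn("datasets/tiny/digits_8x8.npz")` would regenerate the file, and `load_digits_archive` returns arrays that can be wrapped in a dataset for batching.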

Updated .gitignore:
- Ignore datasets/* (downloaded large files)
- Allow datasets/tiny/ (shipped small files)
- Allow datasets/README.md and download scripts
- Selectively ignore .npz files (not in tiny/)

Benefits:
✅ Zero download friction for Module 08
✅ Offline-friendly (planes, classrooms, slow networks)
✅ Real handwritten digits (not synthetic noise)
✅ Git-friendly size (67KB vs 10MB MNIST)
✅ Same shape/format students will use for CNNs

Progression:
- Module 08: Learn DataLoader with 8×8 digits
- Milestone 03: Train on full 28×28 MNIST
- Milestone 04: Scale to CIFAR-10

2025-09-30 15:05:34 -04:00

2 Commits