Created datasets/tiny/ for shipping small datasets with TinyTorch:
New Structure:
- datasets/tiny/digits_8x8.npz (67KB, 1,797 samples)
  - 8×8 handwritten digits from UCI/sklearn
  - Normalized to [0, 1], ready for immediate use
  - Well suited to learning DataLoader (Module 08)
- datasets/tiny/README.md
  - Full documentation and usage examples
  - Philosophy: tiny (learn) → full (practice) → custom (master)
- datasets/tiny/create_digits_8x8.py
  - Extraction script showing how the dataset was created
  - Reproducible from sklearn.datasets.load_digits()
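
The extraction step can be sketched roughly as follows. This is a minimal sketch, not the contents of create_digits_8x8.py: it assumes pixels are scaled by 16, the maximum intensity in load_digits(), and the array keys `images` and `labels` and all function names are hypothetical.

```python
import numpy as np


def normalize_digits(images):
    """Scale raw digit pixels (0..16 in sklearn's copy) to float32 in [0, 1]."""
    return (np.asarray(images, dtype=np.float32) / 16.0).astype(np.float32)


def save_digits(path, images, labels):
    """Write a compressed .npz archive; the key names are assumptions."""
    np.savez_compressed(
        path,
        images=normalize_digits(images),
        labels=np.asarray(labels, dtype=np.int64),
    )


def regenerate(path="datasets/tiny/digits_8x8.npz"):
    """Rebuild the file from scikit-learn (only needed when regenerating)."""
    from sklearn.datasets import load_digits

    digits = load_digits()  # digits.images: (1797, 8, 8), digits.target: (1797,)
    save_digits(path, digits.images, digits.target)
```

Keeping the scikit-learn import inside regenerate() means students who only load the shipped file never need scikit-learn installed.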
Updated .gitignore:
- Ignore datasets/* (downloaded large files)
- Allow datasets/tiny/ (shipped small files)
- Allow datasets/README.md and download scripts
- Ignore .npz files except those in datasets/tiny/
Benefits:
✅ Zero download friction for Module 08
✅ Offline-friendly (planes, classrooms, slow networks)
✅ Real handwritten digits (not synthetic noise)
✅ Git-friendly size (67KB vs ~10MB for MNIST)
✅ Same shape/format students will use for CNNs
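
For illustration, loading the shipped file needs only NumPy. A minimal sketch, assuming the archive stores its arrays under the hypothetical keys `images` and `labels`; `iter_batches` is a stand-in for illustration, not TinyTorch's DataLoader:

```python
import numpy as np


def load_tiny_digits(path="datasets/tiny/digits_8x8.npz"):
    """Load images and labels; the key names are assumed, check the README."""
    with np.load(path) as data:
        return data["images"], data["labels"]


def iter_batches(images, labels, batch_size=32, shuffle=True, seed=0):
    """Yield (images, labels) mini-batches in an optionally shuffled order."""
    order = np.arange(len(images))
    if shuffle:
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]
```

With 1,797 samples and a batch size of 32, this yields 57 batches, the last holding 5 samples.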
Progression:
- Module 08: Learn DataLoader with 8×8 digits
- Milestone 03: Train on full 28×28 MNIST
- Milestone 04: Scale to CIFAR-10
- Created download_mnist.py script to fetch the Fashion-MNIST dataset
- Added README explaining the dataset format and download process
- Fashion-MNIST is used as an accessible alternative to the original MNIST
- Same file format as MNIST, so it works with existing examples unchanged
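
Fashion-MNIST is distributed in the same gzip-compressed IDX layout as the original MNIST, which is why existing loaders carry over. A minimal parser sketch for that layout follows; `read_idx` and `read_idx_gz` are hypothetical helpers, not necessarily what download_mnist.py does:

```python
import gzip
import struct

import numpy as np


def read_idx(raw: bytes) -> np.ndarray:
    """Parse an uncompressed IDX buffer of unsigned bytes into an array."""
    # Header: two zero bytes, a data-type code, then the number of dimensions.
    zeros, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zeros != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    shape = struct.unpack(f">{ndim}I", raw[4:4 + 4 * ndim])
    data = np.frombuffer(raw, dtype=np.uint8, offset=4 + 4 * ndim)
    return data.reshape(shape)


def read_idx_gz(path) -> np.ndarray:
    """Read a gzip-compressed IDX file from disk."""
    with gzip.open(path, "rb") as f:
        return read_idx(f.read())
```

Image files carry three dimensions (count, rows, columns) and label files carry one, so the same function handles both.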