Added integration tests for DataLoader:
- test_dataloader_integration.py in tests/integration/
- Training workflow integration
- Shuffle consistency across epochs
- Memory efficiency verification
Updated Module 08:
- Added note about optional performance analysis
- Clarified that analysis functions can be run manually
- Clean flow: text → code → tests
Updated datasets/tiny/README.md:
- Minor formatting fixes
Module 08 is now complete and ready to export:
✅ Dataset abstraction
✅ TensorDataset implementation
✅ DataLoader with batching/shuffling
✅ ASCII visualizations for understanding
✅ Unit tests (in module)
✅ Integration tests (in tests/)
✅ Performance analysis tools (optional)
Next: Export with 'bin/tito export 08_dataloader'
Fixed issue where performance analysis functions were called every time
the module was imported, instead of only when needed.
Changes:
- Commented out analyze_dataloader_performance() bare call
- Commented out analyze_memory_usage() bare call
- Removed redundant test_training_integration() comment
These functions are still defined and can be called manually for
performance insights, but won't run on every import.
The test_module() function still calls all necessary tests when
the module is run as __main__.
Result: Module imports cleanly without running expensive performance
benchmarks unless explicitly requested.
Added educational ASCII art showing:
1. **Actual pixel values** - What 8×8 digit images look like as numbers
- Shows digits 5, 3, and 8 with real pixel values (0-16 range)
- Helps students understand images are just 2D arrays
2. **Visual representation** - How humans see the digits
- ASCII art showing recognizable digit shapes
- Connects abstract numbers to concrete patterns
3. **Shape transformations** - How DataLoader batches data
- Individual: (8, 8) → Batched: (32, 8, 8)
- Shows what the model actually receives
4. **Complete example** - Loading and using tiny digits dataset
- Real code showing datasets/tiny/digits_8x8.npz usage
- Demonstrates the full DataLoader workflow
Benefits:
✅ Students visualize what image data IS
✅ Understand DataLoader's batching transformation
✅ See connection between numbers and visual patterns
✅ Ready to work with real datasets in milestones
This makes the abstract concept of 'image tensors' concrete and visual.
- 07_training: Export Trainer, CosineSchedule, clip_grad_norm only
- 08_dataloader: Export Dataset, DataLoader, TensorDataset only
Continues professional selective export pattern across all modules.
Development utilities remain in development, clean public API exported.
- Remove circular imports where modules imported from themselves
- Convert tinytorch.core imports to sys.path relative imports
- Only import dependencies that are actually used in each module
- Preserve documentation imports in markdown cells
- Use consistent relative path pattern across all modules
- Remove hardcoded absolute paths in favor of relative imports
Affected modules: 02_activations, 03_layers, 04_losses, 06_optimizers,
07_training, 09_spatial, 12_attention, 17_quantization
- Part I: Foundations (Modules 1-5) - Build MLPs, solve XOR
- Part II: Computer Vision (Modules 6-11) - Build CNNs, classify CIFAR-10
- Part III: Language Models (Modules 12-17) - Build transformers, generate text
Key changes:
- Renamed 05_dense to 05_networks for clarity
- Moved 08_dataloader to 07_dataloader (swap with attention)
- Moved 07_attention to 13_attention (Part III)
- Renamed 12_compression to 16_regularization
- Created placeholder dirs for new language modules (12,14,15,17)
- Moved old modules 13-16 to temp_holding for content migration
- Updated README with three-part structure
- Added comprehensive documentation in docs/three-part-structure.md
This structure gives students three natural exit points with concrete achievements at each level.
Module Standardization:
- Applied consistent introduction format to all 17 modules
- Every module now has: Welcome, Learning Goals, Build→Use→Reflect, What You'll Achieve, Systems Reality Check
- Focused on systems thinking, performance, and production relevance
- Consistent 5 learning goals with systems/performance/scaling emphasis
Agent Structure Fixes:
- Recreated missing documentation-publisher.md agent
- Clear separation: Documentation Publisher (content) vs Educational ML Docs Architect (structure)
- All 10 agents now present and properly defined
- No overlapping responsibilities between agents
Improvements:
- Consistent Build→Use→Reflect pattern (not Understand or Analyze)
- What You'll Achieve section (not What You'll Learn)
- Systems Reality Check in every module
- Production context and performance insights emphasized
Enhancements for achieving 75% accuracy on CIFAR-10:
Module 08 (DataLoader):
- Add download_cifar10() function for real dataset downloading
- Implement CIFAR10Dataset class for loading real CV data
- Simple implementation focused on educational value
Module 11 (Training):
- Add model checkpointing (save_checkpoint/load_checkpoint)
- Enhanced fit() with save_best parameter
- Add evaluation tools: compute_confusion_matrix, evaluate_model
- Add plot_training_history for tracking progress
These minimal changes enable students to:
1. Download and load real CIFAR-10 data
2. Train CNNs with checkpointing
3. Evaluate model performance
4. Achieve our north star goal of 75% accuracy
This comprehensive update ensures all TinyTorch modules follow consistent NBGrader
formatting guidelines and proper Python module structure:
- Fix test execution patterns: All test calls now wrapped in if __name__ == "__main__" blocks
- Add ML Systems Thinking Questions to modules missing them
- Standardize NBGrader formatting (BEGIN/END SOLUTION blocks, STEP-BY-STEP, etc.)
- Remove unused imports across all modules
- Fix syntax errors (apostrophes, special characters)
- Ensure modules can be imported without running tests
Affected modules: All 17 development modules (00-16)
Agent workflow: Module Developer → QA Agent → Package Manager coordination
Testing: Comprehensive QA validation completed
- Fixed test functions to only run when modules executed directly
- Added proper __name__ == '__main__' guards to all test calls
- Fixed syntax errors from incorrect replacements in Module 13 and 15
- Modules now import properly without executing tests
- ProductionBenchmarkingProfiler (Module 14) and ProductionMLSystemProfiler (Module 16) fully working
- Other profiler classes present but require full numpy environment to test completely
- Add documentation for test_unit_dataset_interface function
- Add documentation for test_unit_dataloader function
- Add documentation for test_unit_simple_dataset function
- Add documentation for test_unit_dataloader_pipeline function
- Ensures every code function has preceding explanatory markdown cell
- Maintains educational clarity and structure
- Insert ## 🔧 DEVELOPMENT header before first test function
- Organizes module according to educational structure guidelines
- Maintains all existing functionality and test execution
- Improves readability and navigation for educational use
Stops the automatic execution of the integration test.
This change prevents the test from running every time the module is loaded,
allowing for more focused and controlled testing.
Removes redundant "DEVELOPMENT" headers from several notebook files.
These headers are no longer necessary and declutter the notebook content, improving readability and focus on the core content and testing sections.
- Added ## 🔧 DEVELOPMENT section before Step 1 where development begins
- Added ## 🤖 AUTO TESTING section before auto testing block
- Updated to ## 🎯 MODULE SUMMARY: Data Loading Systems
Improves notebook organization without changing any code logic or content.