TinyTorch

Mirror of https://github.com/MLSysBook/TinyTorch.git, synced 2026-05-29 15:35:56 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	97fece7b5f	Finalize Module 08 and add integration tests Added integration tests for DataLoader: - test_dataloader_integration.py in tests/integration/ - Training workflow integration - Shuffle consistency across epochs - Memory efficiency verification Updated Module 08: - Added note about optional performance analysis - Clarified that analysis functions can be run manually - Clean flow: text → code → tests Updated datasets/tiny/README.md: - Minor formatting fixes Module 08 is now complete and ready to export: ✅ Dataset abstraction ✅ TensorDataset implementation ✅ DataLoader with batching/shuffling ✅ ASCII visualizations for understanding ✅ Unit tests (in module) ✅ Integration tests (in tests/) ✅ Performance analysis tools (optional) Next: Export with 'bin/tito export 08_dataloader'	2025-09-30 16:07:55 -04:00
Vijay Janapa Reddi	779c47ed7a	Clean up Module 08: Remove unconditional function calls Fixed issue where performance analysis functions were called every time the module was imported, instead of only when needed. Changes: - Commented out analyze_dataloader_performance() bare call - Commented out analyze_memory_usage() bare call - Removed redundant test_training_integration() comment These functions are still defined and can be called manually for performance insights, but won't run on every import. The test_module() function still calls all necessary tests when the module is run as __main__. Result: Module imports cleanly without running expensive performance benchmarks unless explicitly requested.	2025-09-30 15:26:00 -04:00
Vijay Janapa Reddi	ce158d94dc	Add ASCII visualizations to Module 08 for understanding image data Added educational ASCII art showing: 1. Actual pixel values - What 8×8 digit images look like as numbers - Shows digits 5, 3, and 8 with real pixel values (0-16 range) - Helps students understand images are just 2D arrays 2. Visual representation - How humans see the digits - ASCII art showing recognizable digit shapes - Connects abstract numbers to concrete patterns 3. Shape transformations - How DataLoader batches data - Individual: (8, 8) → Batched: (32, 8, 8) - Shows what the model actually receives 4. Complete example - Loading and using tiny digits dataset - Real code showing datasets/tiny/digits_8x8.npz usage - Demonstrates the full DataLoader workflow Benefits: ✅ Students visualize what image data IS ✅ Understand DataLoader's batching transformation ✅ See connection between numbers and visual patterns ✅ Ready to work with real datasets in milestones This makes the abstract concept of 'image tensors' concrete and visual.	2025-09-30 15:22:30 -04:00
Vijay Janapa Reddi	98a02d0efa	Simplify Module 08: Focus on DataLoader mechanics, not dataset downloads Removed synthetic download functions (download_mnist, download_cifar10): - These were placeholder stubs generating random noise - Conflicted with 'Real Data, Real Systems' philosophy - Added scope creep (dataset management vs data loading) Module 08 now focuses purely on: ✅ Dataset abstraction (interface design) ✅ TensorDataset implementation (in-memory wrapper) ✅ DataLoader mechanics (batching, shuffling, iteration) Real datasets handled in examples/milestones: - datasets/tiny/digits_8x8.npz ships with repo (instant) - Milestone 03: MNIST download + training - Milestone 04: CIFAR-10 download + CNN training Separation of concerns: - Module 08: Learn DataLoader abstraction (synthetic test data) - Examples: Apply DataLoader to real data (actual datasets) This follows PyTorch's pattern: - torch.utils.data.DataLoader (abstraction) - torchvision.datasets (actual data) Tests still pass 100% with simplified synthetic data.	2025-09-30 15:10:08 -04:00
Vijay Janapa Reddi	b678fe8f77	feat: implement selective exports for modules 07-08 - 07_training: Export Trainer, CosineSchedule, clip_grad_norm only - 08_dataloader: Export Dataset, DataLoader, TensorDataset only Continues professional selective export pattern across all modules. Development utilities remain in development, clean public API exported.	2025-09-30 09:51:45 -04:00
Vijay Janapa Reddi	acb772dd92	Clean up module imports: convert tinytorch.core to sys.path style - Remove circular imports where modules imported from themselves - Convert tinytorch.core imports to sys.path relative imports - Only import dependencies that are actually used in each module - Preserve documentation imports in markdown cells - Use consistent relative path pattern across all modules - Remove hardcoded absolute paths in favor of relative imports Affected modules: 02_activations, 03_layers, 04_losses, 06_optimizers, 07_training, 09_spatial, 12_attention, 17_quantization	2025-09-30 08:58:58 -04:00
Vijay Janapa Reddi	bc634c586f	Restructure TinyTorch into three-part learning journey (17 modules) - Part I: Foundations (Modules 1-5) - Build MLPs, solve XOR - Part II: Computer Vision (Modules 6-11) - Build CNNs, classify CIFAR-10 - Part III: Language Models (Modules 12-17) - Build transformers, generate text Key changes: - Renamed 05_dense to 05_networks for clarity - Moved 08_dataloader to 07_dataloader (swap with attention) - Moved 07_attention to 13_attention (Part III) - Renamed 12_compression to 16_regularization - Created placeholder dirs for new language modules (12,14,15,17) - Moved old modules 13-16 to temp_holding for content migration - Updated README with three-part structure - Added comprehensive documentation in docs/three-part-structure.md This structure gives students three natural exit points with concrete achievements at each level.	2025-09-22 09:50:48 -04:00
Vijay Janapa Reddi	ebf43e45ce	Fix critical module implementation issues 04_layers: Complete rewrite implementing matrix multiplication and Dense layer - Clean matmul() function with proper tensor operations - Dense layer class with weight/bias initialization and forward pass - Comprehensive testing covering basic operations and edge cases 05_dense: Fix import path errors for module dependencies - Correct directory names in fallback imports (01_tensor → 02_tensor, etc.) - Ensure proper module chain imports work correctly 08_dataloader: Fix execution blocking and dataset issues - Wrap problematic execution code in main block to prevent import chain blocking - Fix TensorDataset → TestDataset and add missing get_sample_shape() method - Enable proper dataloader pipeline functionality 09_autograd: Fix syntax error from incomplete markdown cell - Remove unterminated triple-quoted string literal causing parser failure - Clean up markdown cell formatting for jupytext compatibility	2025-09-18 16:42:21 -04:00
Vijay Janapa Reddi	176dffc226	Standardize all module introductions and fix agent structure Module Standardization: - Applied consistent introduction format to all 17 modules - Every module now has: Welcome, Learning Goals, Build→Use→Reflect, What You'll Achieve, Systems Reality Check - Focused on systems thinking, performance, and production relevance - Consistent 5 learning goals with systems/performance/scaling emphasis Agent Structure Fixes: - Recreated missing documentation-publisher.md agent - Clear separation: Documentation Publisher (content) vs Educational ML Docs Architect (structure) - All 10 agents now present and properly defined - No overlapping responsibilities between agents Improvements: - Consistent Build→Use→Reflect pattern (not Understand or Analyze) - What You'll Achieve section (not What You'll Learn) - Systems Reality Check in every module - Production context and performance insights emphasized	2025-09-18 14:16:58 -04:00
Vijay Janapa Reddi	7978998061	Fix module structure ordering across all modules Standardize module structure to ensure correct section ordering: - if __name__ block → ML Systems Thinking → Module Summary (always last) Fixed 10 modules with incorrect ordering: • 02_tensor, 04_layers, 05_dense, 06_spatial • 08_dataloader, 09_autograd, 10_optimizers, 11_training • 12_compression (consolidated 3 scattered if blocks) • 15_mlops (consolidated 6 scattered if blocks) All 17 modules now follow consistent structure: 1. Content and implementations 2. Main execution block (if __name__) 3. ML Systems Thinking Questions 4. Module Summary (always last section) Updated CLAUDE.md with explicit ordering requirements to prevent future issues.	2025-09-17 17:33:09 -04:00
Vijay Janapa Reddi	fb01c7ab51	Add minimal enhancements for CIFAR-10 north star goal Enhancements for achieving 75% accuracy on CIFAR-10: Module 08 (DataLoader): - Add download_cifar10() function for real dataset downloading - Implement CIFAR10Dataset class for loading real CV data - Simple implementation focused on educational value Module 11 (Training): - Add model checkpointing (save_checkpoint/load_checkpoint) - Enhanced fit() with save_best parameter - Add evaluation tools: compute_confusion_matrix, evaluate_model - Add plot_training_history for tracking progress These minimal changes enable students to: 1. Download and load real CIFAR-10 data 2. Train CNNs with checkpointing 3. Evaluate model performance 4. Achieve our north star goal of 75% accuracy	2025-09-17 00:15:13 -04:00
Vijay Janapa Reddi	c366e9d1c2	Standardize NBGrader formatting and fix test execution patterns across all modules This comprehensive update ensures all TinyTorch modules follow consistent NBGrader formatting guidelines and proper Python module structure: - Fix test execution patterns: All test calls now wrapped in if __name__ == "__main__" blocks - Add ML Systems Thinking Questions to modules missing them - Standardize NBGrader formatting (BEGIN/END SOLUTION blocks, STEP-BY-STEP, etc.) - Remove unused imports across all modules - Fix syntax errors (apostrophes, special characters) - Ensure modules can be imported without running tests Affected modules: All 17 development modules (00-16) Agent workflow: Module Developer → QA Agent → Package Manager coordination Testing: Comprehensive QA validation completed	2025-09-16 19:48:54 -04:00
Vijay Janapa Reddi	f2842b7935	Standardize all modules to follow NBGrader style guide - Updated 7 non-compliant modules for consistency - Module 01_setup: Added EXAMPLE USAGE sections with code examples - Module 02_tensor: Added STEP-BY-STEP IMPLEMENTATION and LEARNING CONNECTIONS - Module 05_dense: Added LEARNING CONNECTIONS to all functions - Module 06_spatial: Added STEP-BY-STEP and LEARNING CONNECTIONS - Module 08_dataloader: Added LEARNING CONNECTIONS sections - Module 11_training: Added STEP-BY-STEP and LEARNING CONNECTIONS - Module 14_benchmarking: Added STEP-BY-STEP and LEARNING CONNECTIONS - All modules now follow consistent format per NBGRADER_STYLE_GUIDE.md - Preserved all existing solution blocks and functionality	2025-09-16 16:48:14 -04:00
Vijay Janapa Reddi	4fced96023	Fix module test execution issues - Fixed test functions to only run when modules executed directly - Added proper __name__ == '__main__' guards to all test calls - Fixed syntax errors from incorrect replacements in Module 13 and 15 - Modules now import properly without executing tests - ProductionBenchmarkingProfiler (Module 14) and ProductionMLSystemProfiler (Module 16) fully working - Other profiler classes present but require full numpy environment to test completely	2025-09-16 00:17:32 -04:00
Vijay Janapa Reddi	ce77693723	Removes development heading from notebook Removes a redundant development heading from the dataloader notebook, streamlining the document's structure and improving readability.	2025-07-20 18:02:37 -04:00
Vijay Janapa Reddi	8a7550b1fb	Add missing markdown documentation to 08_dataloader module - Add documentation for test_unit_dataset_interface function - Add documentation for test_unit_dataloader function - Add documentation for test_unit_simple_dataset function - Add documentation for test_unit_dataloader_pipeline function - Ensures every code function has preceding explanatory markdown cell - Maintains educational clarity and structure	2025-07-20 17:49:03 -04:00
Vijay Janapa Reddi	35fa89a457	Add section organization to 08_dataloader module: Add DEVELOPMENT section header - Insert ## 🔧 DEVELOPMENT header before first test function - Organizes module according to educational structure guidelines - Maintains all existing functionality and test execution - Improves readability and navigation for educational use	2025-07-20 14:05:03 -04:00
Vijay Janapa Reddi	38aee6ab19	Deprecate AUTO TESTING: Remove run_module_tests_auto from all _dev.py modules. Standardize on full-module test execution for reliable, context-aware testing.	2025-07-20 13:28:10 -04:00
Vijay Janapa Reddi	f48f791d28	Removes integration test execution Stops the automatic execution of the integration test. This change prevents the test from running every time the module is loaded, allowing for more focused and controlled testing.	2025-07-20 12:59:54 -04:00
Vijay Janapa Reddi	1d799b4f24	Removes autogenerated markdown section Removes a markdown section that appears to be autogenerated documentation, cleaning up the code.	2025-07-20 12:57:02 -04:00
Vijay Janapa Reddi	b69cff7f4d	Fix test function calls in spatial and dataloader modules - move test calls outside __main__ blocks	2025-07-20 12:54:15 -04:00
Vijay Janapa Reddi	c39923fe5c	Simplify plot handling - remove _should_show_plots functions and plot guards	2025-07-20 12:47:14 -04:00
Vijay Janapa Reddi	99b32d2719	Removes development headers from notebooks Removes redundant "DEVELOPMENT" headers from several notebook files. These headers are no longer necessary and declutter the notebook content, improving readability and focus on the core content and testing sections.	2025-07-20 12:39:21 -04:00
Vijay Janapa Reddi	701278f932	Standardize section headers for 08_dataloader module	2025-07-20 12:29:02 -04:00
Vijay Janapa Reddi	9869308a51	Fix test naming and enhance plot detection	2025-07-20 12:20:00 -04:00
Vijay Janapa Reddi	a63f0aa221	✨ Add structural organization headers to 08_dataloader module - Added ## 🔧 DEVELOPMENT section before Step 1 where development begins - Added ## 🤖 AUTO TESTING section before auto testing block - Updated to ## 🎯 MODULE SUMMARY: Data Loading Systems Improves notebook organization without changing any code logic or content.	2025-07-20 10:01:34 -04:00
Vijay Janapa Reddi	2fbd0d9915	✅ Fix 08_dataloader: Move Module Summary AFTER STANDARDIZED MODULE TESTING CORRECTED ORDER: ✅ BEFORE: Module Summary (line 1054) → STANDARDIZED MODULE TESTING (wrong order) ✅ AFTER: Integration tests → STANDARDIZED MODULE TESTING → Module Summary ✅ Changes: 1. ✅ Removed Module Summary from wrong location (before testing section) 2. ✅ Added Module Summary after run_module_tests_auto call 3. ✅ Correct pattern: ## 🧪 Module Testing (1055) → ## 🎯 Module Summary (1115) 4. ✅ No code between STANDARDIZED MODULE TESTING and Module Summary Module 08_dataloader now follows the exact pattern the user requested	2025-07-20 09:29:36 -04:00
Vijay Janapa Reddi	a65db71762	✅ Fix 08_dataloader: Move STANDARDIZED MODULE TESTING before Module Summary CORRECTED ORDER: ✅ BEFORE: Module Summary (line 979) → STANDARDIZED MODULE TESTING (line 1137) ❌ ✅ AFTER: STANDARDIZED MODULE TESTING → Module Summary ✅ Changes: - Moved complete testing section (Module Testing + standardized cell + integration tests + run_module_tests_auto) to line 979 - Moved Module Summary section to follow after testing - Removed duplicate testing sections - Now follows correct pattern: Testing → Summary Module 08_dataloader now has proper ordering	2025-07-20 09:16:12 -04:00
Vijay Janapa Reddi	8f99f0e61a	🚀 Training System: Standardize test naming in ML training pipeline - DataLoader: test_integration_* → test_module_* (module dependency tests) - Autograd: test_variable_class → test_unit_variable_class - Autograd: test_add_operation → test_unit_add_operation - Autograd: test_multiply_operation → test_unit_multiply_operation - Autograd: test_subtract_operation → test_unit_subtract_operation - Autograd: test_chain_rule → test_unit_chain_rule - Autograd: test_neural_network_training → test_module_neural_network_training - Optimizers: test_integration_* → test_module_* (module dependency tests) - Training: All test_* → test_unit_* except test_training → test_module_training - Completes test standardization for complete training pipeline	2025-07-20 08:39:13 -04:00
Vijay Janapa Reddi	9d637e80ef	refactor: Implement learner-focused module progression with better naming ✅ Renamed modules for clearer pedagogical flow: - 05_networks → 05_dense (multi-layer dense/fully connected networks) - 06_cnn → 06_spatial (convolutional networks for spatial patterns) - 06_attention → 07_attention (attention mechanisms for sequences) ✅ Shifted remaining modules down by 1: - 07_dataloader → 08_dataloader - 08_autograd → 09_autograd - 09_optimizers → 10_optimizers - 10_training → 11_training - 11_compression → 12_compression - 12_kernels → 13_kernels - 13_benchmarking → 14_benchmarking - 14_mlops → 15_mlops - 15_capstone → 16_capstone ✅ Updated module metadata (module.yaml files): - Updated names, descriptions, dependencies - Fixed prerequisite chains and enables relationships - Updated export paths to match new names New learner progression: Foundation → Individual Layers → Dense Networks → Spatial Networks → Attention Networks → Training Pipeline Perfect pedagogical flow: Build one layer → Stack dense layers → Add spatial patterns → Add attention mechanisms → Learn to train them all.	2025-07-18 00:12:50 -04:00

30 Commits