TinyTorch

mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-06-04 10:15:52 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	791b09a950	Fix modules 10-13 tests and add CLAUDE.md - Add CLAUDE.md entry point for Claude AI system - Fix tito test command to set PYTHONPATH for module imports - Fix embeddings export directive placement for nbdev - Fix attention module to export imports properly - Fix transformers embedding index casting to int	2025-10-25 17:04:00 -04:00
Vijay Janapa Reddi	6603e00850	refactor: Update transformers module and milestone compatibility - Update transformers module to match tokenization style with improved ASCII diagrams - Fix attention module to use proper multi-head interface - Update transformer era milestone for refined module integration - Fix import paths and ensure forward() method consistency - All transformer components now work seamlessly together	2025-10-25 16:42:02 -04:00
Vijay Janapa Reddi	77e2e7fd4a	refactor: Update attention module to match tokenization style - Clean import structure following TinyTorch dependency chain - Add proper export declarations for key functions and classes - Standardize NBGrader cell structure and testing patterns - Enhance ASCII diagrams with improved formatting - Align documentation style with tokenization module standards - Maintain all core functionality and educational value	2025-10-25 15:26:33 -04:00
Vijay Janapa Reddi	4d70e308ff	refactor: Update embeddings module to match tokenization style - Standardize import structure following TinyTorch dependency chain - Enhance section organization with 6 clear educational sections - Add comprehensive ASCII diagrams matching tokenization patterns - Improve code organization and function naming consistency - Strengthen systems analysis and performance documentation - Align package integration documentation with module standards 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 14:58:30 -04:00
Vijay Janapa Reddi	805608e3d4	fix: Adjust ASCII diagram spacing for consistent alignment	2025-10-24 17:51:11 -04:00
Vijay Janapa Reddi	c43c5d89c6	docs: Improve tokenization module with enhanced ASCII diagrams Following module developer guidelines, added comprehensive visual diagrams: 1. Text-to-Numbers Pipeline (Introduction): - Added full boxed diagram showing 4-step tokenization process - Clear visual flow from human text to numerical IDs - Each step explained inline with the diagram 2. Character Tokenization Process: - Step-by-step vocabulary building visualization - Shows corpus → unique chars → vocab with IDs - Encoding process with ID lookup visualization - Decoding process with reverse lookup - All in clear nested boxes 3. BPE Training Algorithm: - Comprehensive 4-step process with nested boxes - Pair frequency analysis with bar charts (████) - Before/After merge visualizations - Iteration examples showing vocabulary growth - Final results with key insights 4. Memory Layout for Embedding Tables: - Visual bars showing relative memory sizes - Character (204KB) vs BPE-50K (102MB) vs Word-100K (204MB) - Shows fp32/fp16/int8 precision trade-offs - Real production model examples (GPT-2/3, BERT, T5, LLaMA) - Clear table format for comparison Educational improvements: - More visual, less text-heavy - Clearer step-by-step flows - Better intuition building - Production context throughout - Following module developer ASCII diagram patterns Students now see: - HOW tokenization works (not just WHAT) - WHY different strategies exist - WHAT the memory implications are - HOW production models make these choices	2025-10-24 17:51:11 -04:00
Vijay Janapa Reddi	6efe1124c0	refactor: Standardize imports across modules 10-17 to match 01-09 Enforce consistent import pattern across all modules: - Direct imports from tinytorch.core.* (no fallbacks) - Remove all sys.path.append manipulations - Remove try/except import fallbacks - Remove mock/dummy class fallbacks Fixed modules: - Module 10 (tokenization): Removed try/except fallback - Module 12 (attention): Removed sys.path.append for tensor/layers - Module 15 (profiling): Removed sys.path + mock Tensor/Linear/Conv2d - Module 16 (acceleration): Removed hardcoded path + importlib + mock Tensor - Module 17 (quantization): Removed sys.path + disabled fallback block All modules now follow the same pattern as modules 01-09: from tinytorch.core.tensor import Tensor from tinytorch.core.layers import Linear # etc. No development fallbacks - assume tinytorch package is installed.	2025-10-24 17:51:10 -04:00
Vijay Janapa Reddi	76fb4326dd	feat: Complete transformer integration with milestones - Add tokenization module (tinytorch/text/tokenization.py) - Update Milestone 05 transformer demos (validation, TinyCoder, Shakespeare) - Update book chapters with milestones overview - Update README and integration plan - Sync module notebooks and metadata	2025-10-19 12:46:58 -04:00
Vijay Janapa Reddi	95274448bd	feat: Add Milestone 04 (CNN Revolution 1998) + Clean spatial imports Milestone 04 - CNN Revolution: ✅ Complete 5-Act narrative structure (Challenge → Reflection) ✅ SimpleCNN architecture: Conv2d → ReLU → MaxPool → Linear ✅ Trains on 8x8 digits dataset (1,437 train, 360 test) ✅ Achieves 84.2% accuracy with only 810 parameters ✅ Demonstrates spatial operations preserve structure ✅ Beautiful visual output with progress tracking Key Features: - Conv2d (1→8 channels, 3×3 kernel) detects local patterns - MaxPool2d (2×2) provides translation invariance - 100× fewer parameters than equivalent MLP - Training completes in ~105 seconds (50 epochs) - Sample predictions table shows 9/10 correct Module 09 Spatial Improvements: - Removed ugly try/except import pattern - Clean imports: 'from tinytorch.core.tensor import Tensor' - Matches PyTorch style (simple and professional) - No fallback logic needed All 4 milestones now follow consistent 5-Act structure!	2025-09-30 17:04:41 -04:00
Vijay Janapa Reddi	cf575b4829	fix: Update Module 09 spatial for standalone classes Changes: - Removed broken _SimplifiedTensor and internal Module helper classes - Updated imports to use tinytorch.core instead of dev modules - Removed Module inheritance from Conv2d, MaxPool2d, AvgPool2d, SimpleCNN - All spatial classes now standalone like Linear in layers module This allows spatial module to export cleanly and import correctly: from tinytorch.core.spatial import Conv2d, MaxPool2d, AvgPool2d Smoke test: Conv2d(1,3,8,8) → (1,16,6,6) ✓	2025-09-30 16:54:21 -04:00
Vijay Janapa Reddi	828c3d9081	feat: Add CrossEntropyLoss autograd support + Milestone 03 MLP on digits Key Changes: - Implemented CrossEntropyBackward for gradient computation - Integrated CrossEntropyLoss into enable_autograd() patching - Created comprehensive loss gradient test suite - Milestone 03: MLP digits classifier (77.5% accuracy) - Shipped tiny 8x8 digits dataset (67KB) for instant demos - Updated DataLoader module with ASCII visualizations Tests: - All 3 losses (MSE, BCE, CrossEntropy) now have gradient flow - MLP successfully learns digit classification (6.9% → 77.5%) - Integration tests pass Technical: - CrossEntropyBackward: softmax - one_hot gradient - Numerically stable via log-softmax - Works with raw class labels (no one-hot needed)	2025-09-30 16:22:09 -04:00
Vijay Janapa Reddi	3830e4bfc3	Finalize Module 08 and add integration tests Added integration tests for DataLoader: - test_dataloader_integration.py in tests/integration/ - Training workflow integration - Shuffle consistency across epochs - Memory efficiency verification Updated Module 08: - Added note about optional performance analysis - Clarified that analysis functions can be run manually - Clean flow: text → code → tests Updated datasets/tiny/README.md: - Minor formatting fixes Module 08 is now complete and ready to export: ✅ Dataset abstraction ✅ TensorDataset implementation ✅ DataLoader with batching/shuffling ✅ ASCII visualizations for understanding ✅ Unit tests (in module) ✅ Integration tests (in tests/) ✅ Performance analysis tools (optional) Next: Export with 'bin/tito export 08_dataloader'	2025-09-30 16:07:55 -04:00
Vijay Janapa Reddi	683615d04f	Clean up Module 08: Remove unconditional function calls Fixed issue where performance analysis functions were called every time the module was imported, instead of only when needed. Changes: - Commented out analyze_dataloader_performance() bare call - Commented out analyze_memory_usage() bare call - Removed redundant test_training_integration() comment These functions are still defined and can be called manually for performance insights, but won't run on every import. The test_module() function still calls all necessary tests when the module is run as __main__. Result: Module imports cleanly without running expensive performance benchmarks unless explicitly requested.	2025-09-30 15:26:00 -04:00
Vijay Janapa Reddi	b6f4a0bee6	Add ASCII visualizations to Module 08 for understanding image data Added educational ASCII art showing: 1. Actual pixel values - What 8×8 digit images look like as numbers - Shows digits 5, 3, and 8 with real pixel values (0-16 range) - Helps students understand images are just 2D arrays 2. Visual representation - How humans see the digits - ASCII art showing recognizable digit shapes - Connects abstract numbers to concrete patterns 3. Shape transformations - How DataLoader batches data - Individual: (8, 8) → Batched: (32, 8, 8) - Shows what the model actually receives 4. Complete example - Loading and using tiny digits dataset - Real code showing datasets/tiny/digits_8x8.npz usage - Demonstrates the full DataLoader workflow Benefits: ✅ Students visualize what image data IS ✅ Understand DataLoader's batching transformation ✅ See connection between numbers and visual patterns ✅ Ready to work with real datasets in milestones This makes the abstract concept of 'image tensors' concrete and visual.	2025-09-30 15:22:30 -04:00
Vijay Janapa Reddi	38b089b52f	Simplify Module 08: Focus on DataLoader mechanics, not dataset downloads Removed synthetic download functions (download_mnist, download_cifar10): - These were placeholder stubs generating random noise - Conflicted with 'Real Data, Real Systems' philosophy - Added scope creep (dataset management vs data loading) Module 08 now focuses purely on: ✅ Dataset abstraction (interface design) ✅ TensorDataset implementation (in-memory wrapper) ✅ DataLoader mechanics (batching, shuffling, iteration) Real datasets handled in examples/milestones: - datasets/tiny/digits_8x8.npz ships with repo (instant) - Milestone 03: MNIST download + training - Milestone 04: CIFAR-10 download + CNN training Separation of concerns: - Module 08: Learn DataLoader abstraction (synthetic test data) - Examples: Apply DataLoader to real data (actual datasets) This follows PyTorch's pattern: - torch.utils.data.DataLoader (abstraction) - torchvision.datasets (actual data) Tests still pass 100% with simplified synthetic data.	2025-09-30 15:10:08 -04:00
Vijay Janapa Reddi	82fd89d5b3	Remove unnecessary matplotlib import from losses module Issue: xor_crisis.py was failing with ImportError on matplotlib architecture mismatch Root cause: losses_dev.py imported matplotlib.pyplot but never used it Fix: - ✅ Removed unused imports: matplotlib.pyplot, time - ✅ Re-exported module 04_losses to update tinytorch package - ✅ Verified both milestone 02 scripts now run successfully The matplotlib import was causing failures on M2 Macs where matplotlib was installed for wrong architecture (x86_64 vs arm64). Since it was never used, removing it eliminates the dependency entirely. Tested: - ✅ milestones/02_xor_crisis_1969/xor_crisis.py (49% accuracy - expected failure) - ✅ milestones/02_xor_crisis_1969/xor_solved.py (100% accuracy - perfect!)	2025-09-30 14:16:42 -04:00
Vijay Janapa Reddi	d032e4278b	Add ReLUBackward and complete XOR milestone scripts New Features: - Add ReLUBackward for proper ReLU gradient computation - Patch ReLU.forward() in enable_autograd() for gradient tracking - Create polished XOR milestone scripts matching perceptron style XOR Milestone Scripts (milestones/02_xor_crisis_1969/): - xor_crisis.py: Shows single-layer perceptron FAILING (~50% accuracy) - xor_solved.py: Shows multi-layer network SUCCEEDING (75%+ accuracy) - Beautiful rich output with tables, panels, historical context - Pedagogically structured like the perceptron milestone Results: ✅ Single-layer: Stuck at ~50% (proves the crisis) ✅ Multi-layer: 75% accuracy (proves hidden layers work!) ✅ ReLU gradients flow correctly through network ✅ All 4 core activations now support autograd: - Sigmoid ✓, ReLU ✓, Tanh ✓ (future), GELU ✓ (future) Historical Significance: This recreates the exact problem that killed AI for 17 years and demonstrates the solution that started the modern era!	2025-09-30 14:10:11 -04:00
Vijay Janapa Reddi	9129935d5b	Add MSEBackward and organize comprehensive test suite New Features: - Add MSEBackward gradient computation for regression tasks - Patch MSELoss in enable_autograd() for gradient tracking - All 3 loss functions now support autograd: MSE, BCE, CrossEntropy Test Suite Organization: - Reorganize tests/ into focused directories - Create tests/integration/ for cross-module tests - Create tests/05_autograd/ for autograd edge cases - Create tests/debugging/ for common student pitfalls - Add comprehensive tests/README.md explaining test philosophy Integration Tests: - Move test_gradient_flow.py to integration/ - 20 comprehensive gradient flow tests - Tests cover: tensors, layers, activations, losses, optimizers - Tests validate: basic ops, chain rule, broadcasting, training loops - 19/20 tests passing (MSE now fixed!) Results: ✅ Perceptron learns: 50% → 93% accuracy ✅ Clean test organization guides future development ✅ Tests catch the exact bugs that broke training Pedagogical Value: - Test organization teaches testing best practices - Gradient flow tests show what integration testing catches - Sets foundation for debugging/diagnostic tests	2025-09-30 13:57:40 -04:00
Vijay Janapa Reddi	dc61a1b041	Clean up gradient broadcasting logic - more pedagogical Refactored gradient accumulation to use clearer two-step approach: 1. Remove extra leading dimensions (batch dims) 2. Sum over dimensions that were size-1 (broadcast dims) Benefits: - Clearer intent: while loop for variable dims, for loop for fixed dims - Better comments with concrete examples - Easier for students to understand broadcasting in backprop - Matches how you'd explain it verbally Same functionality, cleaner code.	2025-09-30 13:53:05 -04:00
Vijay Janapa Reddi	49ea4d6839	Fix gradient propagation: enable autograd and patch activations/losses CRITICAL FIX: Gradients now flow through entire training stack! Changes: 1. Enable autograd in __init__.py - patches Tensor operations on import 2. Extend enable_autograd() to patch Sigmoid and BCE forward methods 3. Fix gradient accumulation to handle broadcasting (bias gradients) 4. Fix optimizer.step() - param.grad is numpy array, not Tensor.data 5. Add debug_gradients.py for systematic gradient flow testing Architecture: - Clean patching pattern - all gradient tracking in enable_autograd() - Activations/losses remain simple (Module 02/04) - Autograd (Module 05) upgrades them with gradient tracking - Pedagogically sound: separation of concerns Results: ✅ All 6 debug tests pass ✅ Perceptron learns: 50% → 93% accuracy ✅ Loss decreases: 0.79 → 0.36 ✅ Weights update correctly through SGD	2025-09-30 13:51:30 -04:00
Vijay Janapa Reddi	af1c313d16	Reset package and export modules 01-07 only (skip broken spatial module)	2025-09-30 13:41:00 -04:00
Vijay Janapa Reddi	5184fa350b	Update autograd module with latest changes	2025-09-30 13:40:51 -04:00
Vijay Janapa Reddi	d1439a0db1	Fix imports: Replace dev-style imports with proper package imports in modules 06-07	2025-09-30 13:40:38 -04:00
Vijay Janapa Reddi	eeb308a691	WIP: Manual edits to tinytorch (WRONG APPROACH - needs revert) WARNING: I incorrectly edited files in tinytorch/ directly: - tinytorch/core/autograd.py - added enable_autograd() manually - tinytorch/core/activations.py - tried to add gradient tracking - tinytorch/core/losses.py - restored from git CORRECT APPROACH: 1. Make ALL changes in modules/source/XX_*/YY_dev.py 2. Add #| export directives for classes to export 3. Run: tito export XX_module 4. NEVER edit tinytorch/ files directly Next steps: - Revert tinytorch/ manual edits - Add proper exports to source modules - Export cleanly	2025-09-30 13:31:31 -04:00
Vijay Janapa Reddi	0015a8cab1	WIP: Add SigmoidBackward and BCEBackward classes to autograd Added: - SigmoidBackward class to modules/source/05_autograd/autograd_dev.py with #| export - BCEBackward class to modules/source/05_autograd/autograd_dev.py with #| export - Both classes exported to tinytorch/core/autograd.py - Updated Sigmoid activation to track gradients using SigmoidBackward - Updated BCE loss to track gradients using BCEBackward ISSUE: Training still not learning - gradients not flowing properly - Loss stays constant at 0.7911 - Weights don't update - Sigmoid.forward() code looks correct but a.requires_grad stays False - Need to investigate why gradient tracking isn't working through activations	2025-09-30 13:23:56 -04:00
Vijay Janapa Reddi	76da686ce0	Update loss function examples to use PyTorch-style callable API Updated docstring examples to use cleaner callable syntax: - loss_fn(predictions, targets) instead of loss_fn.forward(predictions, targets) Applied to: - MSELoss - CrossEntropyLoss - BinaryCrossEntropyLoss Demonstrates proper usage with __call__ methods for cleaner, more Pythonic code.	2025-09-30 12:36:27 -04:00
Vijay Janapa Reddi	fd6f377b77	Update activation examples to use PyTorch-style callable API Updated docstring examples to use cleaner callable syntax: - sigmoid(x) instead of sigmoid.forward(x) - relu(x) instead of relu.forward(x) - tanh(x) instead of tanh.forward(x) - gelu(x) instead of gelu.forward(x) - softmax(x) instead of softmax.forward(x) This demonstrates the proper usage pattern with the __call__ methods we just added, making examples more Pythonic and PyTorch-compatible.	2025-09-30 12:36:00 -04:00
Vijay Janapa Reddi	17cb8049c6	Add __call__ methods to enable PyTorch-style API Enable cleaner API usage by adding __call__ methods to all activation, layer, and loss classes. This allows students to write: - relu(x) instead of relu.forward(x) - layer(x) instead of layer.forward(x) - loss_fn(pred, target) instead of loss_fn.forward(pred, target) Changes: - Module 02 (Activations): Add __call__ to ReLU, Tanh, GELU, Softmax * Sigmoid already had __call__ - Module 03 (Layers): Add __call__ to Dropout * Linear already had __call__ - Module 04 (Losses): Add __call__ to MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss This matches PyTorch's API convention where model(x) calls model.__call__(x) which internally calls model.forward(x). Makes code more Pythonic and intuitive for students familiar with PyTorch. Expected impact: Test pass rates should improve significantly as tests expect PyTorch-style callable API.	2025-09-30 12:33:45 -04:00
Vijay Janapa Reddi	32aabfa78c	Refactor Milestone 1: Clean forward pass with Rich CLI - Reorganized milestone structure to historical progression (01-06) - Created single forward_pass.py with student code clearly at top - Added Rich CLI visualizations: data scatter, network diagram, decision boundary - Show decision boundary using / or \ based on slope - No random seed - students see variability in random weights - Annotated all code with which modules were used (Modules 01-03) - Added introductory panel explaining what to expect - Updated DEFINITIVE_MODULE_PLAN.md with corrected milestone structure	2025-09-30 12:03:19 -04:00
Vijay Janapa Reddi	de3b837bee	Fix nbdev export system across all 20 modules PROBLEM: - nbdev requires #| export directive on EACH cell to export when using # %% markers - Cell markers inside class definitions split classes across multiple cells - Only partial classes were being exported to tinytorch package - Missing matmul, arithmetic operations, and activation classes in exports SOLUTION: 1. Removed # %% cell markers INSIDE class definitions (kept classes as single units) 2. Added #| export to imports cell at top of each module 3. Added #| export before each exportable class definition in all 20 modules 4. Added __call__ method to Sigmoid for functional usage 5. Fixed numpy import (moved to module level from __init__) MODULES FIXED: - 01_tensor: Tensor class with all operations (matmul, arithmetic, shape ops) - 02_activations: Sigmoid, ReLU, Tanh, GELU, Softmax classes - 03_layers: Linear, Dropout classes - 04_losses: MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss classes - 05_autograd: Function, AddBackward, MulBackward, MatmulBackward, SumBackward - 06_optimizers: Optimizer, SGD, Adam, AdamW classes - 07_training: CosineSchedule, Trainer classes - 08_dataloader: Dataset, TensorDataset, DataLoader classes - 09_spatial: Conv2d, MaxPool2d, AvgPool2d, SimpleCNN classes - 10-20: All exportable classes in remaining modules TESTING: - Test functions use 'if __name__ == "__main__"' guards - Tests run in notebooks but NOT on import - Rosenblatt Perceptron milestone working perfectly RESULT: ✅ All 20 modules export correctly ✅ Perceptron (1957) milestone functional ✅ Clean separation: development (modules/source) vs package (tinytorch)	2025-09-30 11:21:04 -04:00
Vijay Janapa Reddi	db1582f81e	feat: implement selective exports for modules 12-13 - 12_attention: Export scaled_dot_product_attention, MultiHeadAttention only - 13_transformers: Export TransformerBlock, GPT only Continues professional selective export pattern across advanced modules. Clean public APIs for transformer architecture components.	2025-09-30 09:58:04 -04:00
Vijay Janapa Reddi	aad98c7383	feat: implement selective exports for modules 09-11 - 09_spatial: Export Conv2d, MaxPool2d, AvgPool2d only - 10_tokenization: Export Tokenizer, CharTokenizer, BPETokenizer only - 11_embeddings: Export Embedding, PositionalEncoding only Continues professional selective export pattern. Clean public APIs, development utilities remain in development environment.	2025-09-30 09:56:50 -04:00
Vijay Janapa Reddi	6d4f23a22d	feat: implement selective exports for modules 07-08 - 07_training: Export Trainer, CosineSchedule, clip_grad_norm only - 08_dataloader: Export Dataset, DataLoader, TensorDataset only Continues professional selective export pattern across all modules. Development utilities remain in development, clean public API exported.	2025-09-30 09:51:45 -04:00
Vijay Janapa Reddi	b428b63b81	feat: implement professional selective export pattern across all modules BREAKING CHANGE: Refactor from whole-module exports to selective function/class exports What Changed: - Separate development utilities from production exports - Each function/class gets individual #| export directive - Clean Prerequisites & Setup sections in all modules - Development helpers (import_previous_module) not exported Module Export Summary: - 01_tensor: Tensor class only - 02_activations: Sigmoid, ReLU, Tanh, GELU, Softmax only - 03_layers: Linear, Dropout only - 04_losses: MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss, log_softmax only - 05_autograd: Function class only - 06_optimizers: SGD, Adam, AdamW only Benefits: ✅ Clean public API (matches PyTorch/TensorFlow patterns) ✅ No development utilities in final package ✅ Professional software education standards ✅ Clear separation of concerns ✅ Educational clarity for students This matches industry standards for educational ML frameworks.	2025-09-30 09:48:47 -04:00
Vijay Janapa Reddi	1a6d36e05f	feat: update advanced modules (09-20) with latest improvements - Update spatial, tokenization, embeddings, attention modules - Update transformers, kv-caching, profiling modules - Update acceleration, quantization, compression modules - Update benchmarking and capstone modules - Align with current TinyTorch standards and patterns	2025-09-30 09:45:00 -04:00
Vijay Janapa Reddi	e82ec44e6a	feat: standardize integration testing with import helpers - Add import_previous_module() helper function to all core modules (01-07) - Standardize cross-module imports for integration testing - Add clear Prerequisites & Setup sections explaining module dependencies - Update integration tests to use standardized import pattern - Maintain clean separation between development and production code This provides a consistent, educational approach to module integration while keeping the codebase maintainable and student-friendly.	2025-09-30 09:42:58 -04:00
Vijay Janapa Reddi	6dbce13c85	Enhance autograd_dev.py with comprehensive documentation and methods ✨ Major improvements to Module 05: Autograd - Add complete Jupyter notebook structure with markdown cells - Enhance all Function classes with detailed mathematical explanations - Add comprehensive unit tests with proper test patterns - Improve enable_autograd() with detailed documentation - Add integration tests for complex computation graphs - Include educational visualizations and examples - Follow TinyTorch standards with ⭐⭐ difficulty rating - All tests pass: Function classes, Tensor autograd, integration scenarios 🎯 Ready for student use with modern PyTorch 2.0 style autograd	2025-09-30 09:22:29 -04:00
Vijay Janapa Reddi	30941e7c6e	Complete autograd cleanup - finalize file rename - Remove autograd_clean.py (now renamed) - Update autograd_dev.py to be the clean implementation - Single clean autograd implementation ready for use	2025-09-30 09:15:35 -04:00
Vijay Janapa Reddi	cc7c7526c8	Clean up module imports: convert tinytorch.core to sys.path style - Remove circular imports where modules imported from themselves - Convert tinytorch.core imports to sys.path relative imports - Only import dependencies that are actually used in each module - Preserve documentation imports in markdown cells - Use consistent relative path pattern across all modules - Remove hardcoded absolute paths in favor of relative imports Affected modules: 02_activations, 03_layers, 04_losses, 06_optimizers, 07_training, 09_spatial, 12_attention, 17_quantization	2025-09-30 08:58:58 -04:00
Vijay Janapa Reddi	be4ad5356d	Clean up modules 04, 05, and 06 by removing unnecessary demonstration functions - Remove demonstrate_complex_computation_graph() function from Module 05 (autograd) - Remove demonstrate_optimizer_integration() function from Module 06 (optimizers) - Module 04 (losses) had no demonstration functions to remove - Keep all core implementations and unit test functions intact - Keep final test_module() function for integration testing - All module tests continue to pass after cleanup 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-09-30 08:09:29 -04:00
Vijay Janapa Reddi	4d27b8f1a1	Fix module test execution pattern with if __name__ == '__main__' guards This change ensures tests run immediately when developing modules but don't execute when modules are imported by other modules. Changes: - Protected all test executions with if __name__ == "__main__" blocks - Unit tests run immediately after function definitions during development - Module integration test (test_module()) runs at end when executed directly - Updated module-developer.md with new testing patterns and examples Benefits: - Students see immediate feedback when developing (python module_dev.py runs all tests) - Clean imports: later modules can import earlier ones without triggering tests - Maintains educational flow: tests visible right after implementations - Compatible with nbgrader and notebook environments Tested: - Module 01 runs all tests when executed directly ✓ - Importing Tensor from tensor_dev doesn't run tests ✓ - Cross-module imports work without test interference ✓ 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2025-09-30 07:42:42 -04:00
Vijay Janapa Reddi	1aae7be8fb	Simplify training module by removing unnecessary model classes Removed complexity from Module 07 (training): - Removed DemoModel and TestModel classes - Unified all tests/demos to use single minimal MockModel - Module now focuses purely on training infrastructure What remains: - Trainer class (the core training orchestrator) - CosineSchedule (learning rate scheduling) - clip_grad_norm (gradient clipping utility) - Training loop mechanics and checkpointing Impact: - Cleaner, more focused module - No distraction from model architecture - Tests training infrastructure, not model building - All tests still pass with simplified mocks The module now teaches exactly what it should: how to train models, not how to build them.	2025-09-30 07:06:46 -04:00
Vijay Janapa Reddi	d67b412c42	Enforce components-only philosophy in modules Major changes to module structure: 1. Updated module-developer.md with clear components-only rule 2. Removed Sequential container from Module 03 (layers) 3. Converted to manual layer composition for transparency Philosophy: - Modules build ATOMIC COMPONENTS (Tensor, Linear, ReLU, etc.) - Milestones/Examples show EXPLICIT COMPOSITION - Students SEE how their components connect - No hidden abstractions or black boxes Module 03 changes: - REMOVED: Sequential class and tests (~200 lines) - KEPT: Linear and Dropout as individual components - UPDATED: Integration demos use manual composition - Result: Students see explicit layer1.forward(x) calls Module 07 changes: - Simplified model classes to minimal test fixtures - Removed complex neural network teaching examples - Focus purely on training infrastructure Impact: - Clearer learning progression - Students understand each component's role - Milestones become showcases of student work - No magic containers hiding the data flow	2025-09-30 07:02:59 -04:00
Vijay Janapa Reddi	b8d631cac9	Simplify module test execution for notebook compatibility Removed redundant test calls from all modules: - Eliminated verbose if __name__ == '__main__': blocks - Removed duplicate individual test calls - Each module now simply calls test_module() directly Changes made to all 9 modules: - Module 01 (Tensor): Simplified from 16-line main block to 1 line - Module 02 (Activations): Simplified from 13-line main block to 1 line - Module 03 (Layers): Simplified from 17-line main block to 1 line - Module 04 (Losses): Simplified from 20-line main block to 1 line - Module 05 (Autograd): Simplified from 19-line main block to 1 line - Module 06 (Optimizers): Simplified from 17-line main block to 1 line - Module 07 (Training): Simplified from 16-line main block to 1 line - Module 08 (DataLoader): Simplified from 17-line main block to 1 line - Module 09 (Spatial): Simplified from 14-line main block to 1 line Impact: - Notebook-friendly: Tests run immediately in Jupyter environments - No redundancy: test_module() already runs all unit tests - Cleaner code: ~140 lines of redundant code removed - Better for students: Simpler, more direct execution flow	2025-09-30 06:51:30 -04:00
Vijay Janapa Reddi	80ac63e851	Remove ML Systems Thinking sections from all modules Cleaned up module structure by removing reflection questions: - Updated module-developer.md to remove ML Systems Thinking from template - Removed ML Systems Thinking sections from all 9 modules: * Module 01 (Tensor): Removed 113 lines of questions * Module 02 (Activations): Removed 24 lines of questions * Module 03 (Layers): Removed 84 lines of questions * Module 04 (Losses): Removed 93 lines of questions * Module 05 (Autograd): Removed 64 lines of questions * Module 06 (Optimizers): Removed questions section * Module 07 (Training): Removed questions section * Module 08 (DataLoader): Removed 35 lines of questions * Module 09 (Spatial): Removed 34 lines of questions Impact: - Modules now flow directly from tests to summary - Cleaner, more focused module structure - Removes assessment burden from implementation modules - Keeps focus on building and understanding code	2025-09-30 06:44:36 -04:00
Vijay Janapa Reddi	eb82dd2af6	Fix all remaining modules to prevent test execution on import Wrapped test code in if __name__ == '__main__': guards for: - Module 02 (activations): 7 test calls protected - Module 03 (layers): 7 test calls protected - Module 04 (losses): 10 test calls protected - Module 05 (autograd): 7 test calls protected - Module 06 (optimizers): 8 test calls protected - Module 07 (training): 7 test calls protected - Module 09 (spatial): 5 test calls protected Impact: - All modules can now be imported cleanly without test execution - Tests still run when modules are executed directly - Clean dependency chain throughout the framework - Follows Python best practices for module structure This completes the fix for the entire module system. Modules can now properly import from each other without triggering test code execution.	2025-09-30 06:40:45 -04:00
Vijay Janapa Reddi	93668e0f5e	Fix module dependency chain - clean imports now work Critical fixes to resolve module import issues: 1. Module 01 (tensor_dev.py): - Wrapped all test calls in if __name__ == '__main__': guards - Tests no longer execute during import - Clean imports now work: from tensor_dev import Tensor 2. Module 08 (dataloader_dev.py): - REMOVED redefined Tensor class (was breaking dependency chain) - Now imports real Tensor from Module 01 - DataLoader uses actual Tensor with full gradient support Impact: - Modules properly build on previous work (no isolated implementations) - Clean dependency chain: each module imports from previous modules - No test execution during imports = fast, clean module loading This resolves the root cause where DataLoader had to redefine Tensor because importing tensor_dev.py would execute all test code.	2025-09-30 06:37:52 -04:00
Vijay Janapa Reddi	915ee8a536	Remove all Variable references - pure Tensor system with clean autograd Major refactoring: - Eliminated Variable class completely from autograd module - Implemented progressive enhancement pattern with enable_autograd() - All modules now use pure Tensor with requires_grad=True - PyTorch 2.0 compatible API throughout - Clean separation: Module 01 has simple Tensor, Module 05 enhances with gradients - Fixed all imports and references across layers, activations, losses - Educational clarity: students learn modern patterns from day one The system now follows the principle: 'One Tensor class to rule them all' No more confusion between Variable and Tensor - everything is just Tensor!	2025-09-30 00:08:31 -04:00
Vijay Janapa Reddi	235b19befd	Partial fix for Module 17 quantization - type conversion and formula corrections	2025-09-29 22:13:21 -04:00
Vijay Janapa Reddi	c68d982443	Fix critical modules for complete ML pipeline: DataLoader through KV-Caching Module Fixes Applied: • Module 08 (DataLoader): Fixed import loop with simplified local Tensor class • Module 09 (Spatial): Fixed import conflicts and reduced analysis input sizes • Module 11 (Embeddings): Fixed test logic error in embedding scaling comparison • Module 12 (Attention): Fixed namespace collision between Tensor classes • Module 14 (KV-Caching): Fixed memory allocation and achieved 10x+ speedup Milestone Achievements: ✅ Milestone 1: Perceptron (Modules 01-04) - ACHIEVED ✅ Milestone 2: MLP (Modules 01-07) - ACHIEVED ✅ Milestone 3: CNN (Modules 01-09) - ACHIEVED ✅ Milestone 4: GPT (Modules 10-14) - ACHIEVED Current Status: 16/20 modules working (80% success rate) Next: Fix remaining modules 17-20 for 100% completion Technical Highlights: • Complete NLP pipeline: tokenization → embeddings → attention → transformers → caching • Production optimizations: O(n²) → O(n) complexity with KV-caching • Systems analysis: memory vs speed trade-offs, scaling strategies • Educational progression: each module builds systematically on previous	2025-09-29 22:02:11 -04:00
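
Note on commit 828c3d9081: the message describes the CrossEntropyBackward gradient as "softmax - one_hot", computed stably via log-softmax and working with raw class labels. The following is a minimal pure-Python sketch of that formula for a single sample. It is not the TinyTorch implementation; the function names `log_softmax`, `cross_entropy`, and `cross_entropy_grad` and their list-based signatures are assumptions made for illustration only.

```python
import math


def log_softmax(logits):
    """Log-softmax of a list of floats, shifted by the max for stability."""
    m = max(logits)
    shifted = [z - m for z in logits]  # largest entry becomes 0, so exp() cannot overflow
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_sum for s in shifted]


def cross_entropy(logits, label):
    """Loss for one sample, given a raw integer class label (no one-hot needed)."""
    return -log_softmax(logits)[label]


def cross_entropy_grad(logits, label):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - one_hot(label)."""
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


logits, label = [2.0, 1.0, 0.1], 0
grad = cross_entropy_grad(logits, label)

# Finite-difference check of the first gradient component.
eps = 1e-6
bumped = [logits[0] + eps] + logits[1:]
numeric = (cross_entropy(bumped, label) - cross_entropy(logits, label)) / eps
```

Because the softmax probabilities sum to 1 and the one-hot vector sums to 1, the gradient components sum to zero, which makes a quick sanity check alongside the finite-difference comparison.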


454 Commits