- Preserve computation graph by using Tensor arithmetic (x - x_max, exp / sum)
- No more .data extraction that breaks gradient flow
- Numerically stable with max subtraction before exp
Required for transformer attention softmax gradient flow
Updated docstring examples to use cleaner callable syntax:
- sigmoid(x) instead of sigmoid.forward(x)
- relu(x) instead of relu.forward(x)
- tanh(x) instead of tanh.forward(x)
- gelu(x) instead of gelu.forward(x)
- softmax(x) instead of softmax.forward(x)
This demonstrates the proper usage pattern with the __call__ methods
we just added, making examples more Pythonic and PyTorch-compatible.
Enable cleaner API usage by adding __call__ methods to all activation,
layer, and loss classes. This allows students to write:
- relu(x) instead of relu.forward(x)
- layer(x) instead of layer.forward(x)
- loss_fn(pred, target) instead of loss_fn.forward(pred, target)
Changes:
- Module 02 (Activations): Add __call__ to ReLU, Tanh, GELU, Softmax
* Sigmoid already had __call__
- Module 03 (Layers): Add __call__ to Dropout
* Linear already had __call__
- Module 04 (Losses): Add __call__ to MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss
This matches PyTorch's API convention where model(x) calls model.__call__(x)
which internally calls model.forward(x). Makes code more Pythonic and
intuitive for students familiar with PyTorch.
Expected impact: Test pass rates should improve significantly as tests
expect PyTorch-style callable API.
PROBLEM:
- nbdev requires #| export directive on EACH cell to export when using # %% markers
- Cell markers inside class definitions split classes across multiple cells
- Only partial classes were being exported to tinytorch package
- Missing matmul, arithmetic operations, and activation classes in exports
SOLUTION:
1. Removed # %% cell markers INSIDE class definitions (kept classes as single units)
2. Added #| export to imports cell at top of each module
3. Added #| export before each exportable class definition in all 20 modules
4. Added __call__ method to Sigmoid for functional usage
5. Fixed numpy import (moved to module level from __init__)
MODULES FIXED:
- 01_tensor: Tensor class with all operations (matmul, arithmetic, shape ops)
- 02_activations: Sigmoid, ReLU, Tanh, GELU, Softmax classes
- 03_layers: Linear, Dropout classes
- 04_losses: MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss classes
- 05_autograd: Function, AddBackward, MulBackward, MatmulBackward, SumBackward
- 06_optimizers: Optimizer, SGD, Adam, AdamW classes
- 07_training: CosineSchedule, Trainer classes
- 08_dataloader: Dataset, TensorDataset, DataLoader classes
- 09_spatial: Conv2d, MaxPool2d, AvgPool2d, SimpleCNN classes
- 10-20: All exportable classes in remaining modules
TESTING:
- Test functions use 'if __name__ == "__main__"' guards
- Tests run in notebooks but NOT on import
- Rosenblatt Perceptron milestone working perfectly
RESULT:
✅ All 20 modules export correctly
✅ Perceptron (1957) milestone functional
✅ Clean separation: development (modules/source) vs package (tinytorch)
BREAKING CHANGE: Refactor from whole-module exports to selective function/class exports
**What Changed:**
- Separate development utilities from production exports
- Each function/class gets individual #| export directive
- Clean Prerequisites & Setup sections in all modules
- Development helpers (import_previous_module) not exported
**Module Export Summary:**
- 01_tensor: Tensor class only
- 02_activations: Sigmoid, ReLU, Tanh, GELU, Softmax only
- 03_layers: Linear, Dropout only
- 04_losses: MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss, log_softmax only
- 05_autograd: Function class only
- 06_optimizers: SGD, Adam, AdamW only
**Benefits:**
✅ Clean public API (matches PyTorch/TensorFlow patterns)
✅ No development utilities in final package
✅ Professional software education standards
✅ Clear separation of concerns
✅ Educational clarity for students
This matches industry standards for educational ML frameworks.
- Add import_previous_module() helper function to all core modules (01-07)
- Standardize cross-module imports for integration testing
- Add clear Prerequisites & Setup sections explaining module dependencies
- Update integration tests to use standardized import pattern
- Maintain clean separation between development and production code
This provides a consistent, educational approach to module integration
while keeping the codebase maintainable and student-friendly.
- Remove circular imports where modules imported from themselves
- Convert tinytorch.core imports to sys.path relative imports
- Only import dependencies that are actually used in each module
- Preserve documentation imports in markdown cells
- Use consistent relative path pattern across all modules
- Remove hardcoded absolute paths in favor of relative imports
Affected modules: 02_activations, 03_layers, 04_losses, 06_optimizers,
07_training, 09_spatial, 12_attention, 17_quantization
✅ Rename all module directories: 00_setup → 01_setup, etc.
✅ Update convert_modules.py mappings for new directory names
✅ Update _toc.yml file paths and titles (1-14 instead of 0-13)
✅ Regenerate all overview pages with new numbering
✅ Fix all broken references in usage-paths and intro
✅ Update chapter references to use natural numbering
Benefits:
- More intuitive course progression starting from 1
- Matches academic course numbering conventions
- Eliminates confusion about 'Module 0' concept
- Cleaner mental model for students and instructors
- All references and links properly updated
Complete transformation: 14 modules now numbered 01-14
✅ Clean source file headers: 'Module X:' → clean descriptive titles
✅ Regenerate overview pages with clean headers
✅ More flexible content that works in any context
✅ Numbers still provided by book TOC structure
Changes:
- Remove 'Module X: ' prefix from all source file headers
- Headers now focus on descriptive content titles
- Book maintains proper chapter ordering via _toc.yml
- Content is more reusable across different presentations
MAJOR IMPROVEMENT: Simplified test discovery logic
- Removed restrictive valid_patterns requirement from testing framework
- Any function starting with 'test_' is now automatically discovered
- Follows standard pytest conventions - no maintenance overhead
- Eliminates need to manually add patterns for new test functions
CLEANED UP: Test function names across all 10 modules
- Removed redundant '_comprehensive' suffix from all test functions
- Updated 40+ test function names to be more concise and readable:
* 00_setup: 6 functions (test_personal_info, test_system_info, etc.)
* 01_tensor: 4 functions (test_tensor_creation, test_tensor_properties, etc.)
* 02_activations: 1 function (test_activations)
* 03_layers: 3 functions (test_matrix_multiplication, test_dense_layer, etc.)
* 04_networks: 4 functions (test_sequential_networks, test_mlp_creation, etc.)
* 05_cnn: 3 functions (test_convolution_operation, test_conv2d_layer, etc.)
* 06_dataloader: 4 functions (test_dataset_interface, test_dataloader, etc.)
* 07_autograd: 6 functions (test_variable_class, test_add_operation, etc.)
* 08_optimizers: 5 functions (test_gradient_descent_step, test_sgd_optimizer, etc.)
* 09_training: 6 functions (test_mse_loss, test_crossentropy_loss, etc.)
* 10_compression: 6 functions (already cleaned up)
VERIFICATION: All tests still pass
- All 10 modules tested successfully with new discovery logic
- Total test count maintained: 47 inline tests across all modules
- No functionality lost, only improved maintainability
RESULT: Much cleaner, more maintainable testing framework following standard conventions
- Updated all _dev.py files to use 'comprehensive test' instead of 'integration test'
- Changed function names: test_*_integration() → test_*_comprehensive()
- Updated markdown headers, print statements, success/error messages
- Clarifies that these are comprehensive tests of single modules, not cross-module integration
- Real cross-module integration tests remain in tests/ directory
- Updated modules: 00_setup, 01_tensor, 02_activations, 03_layers, 04_networks, 05_cnn, 06_dataloader, 07_autograd
- Remove student-facing bloat (learning objectives, time estimates, pedagogical details)
- Remove assessment sections (not needed for operational metadata)
- Streamline to essential system information only:
- Module identification and dependencies
- Package export configuration
- File structure and component listings
- Updated existing files (6): setup, tensor, activations, layers, autograd, optimizers
- Created missing files (3): networks, cnn, dataloader
- Consistent 25-26 line format across all 9 modules
Result: Pure operational metadata for CLI tools and build systems
Perfect for instructor/staff development workflow
🔧 Issues Fixed:
1. MockTensor compatibility: Activations now return same type as input (polymorphic)
2. Empty input handling: Softmax gracefully handles zero-size arrays
✅ Impact: 02_activations external tests now pass 34/34 (was 32/34)
🎯 Technical Changes:
- Changed activation signatures from Tensor -> Tensor to flexible types
- Use type(x)(result) instead of hardcoded Tensor(result)
- Added empty input guard in Softmax: if x.data.size == 0: return type(x)(x.data.copy())
- Applied consistent pattern across ReLU, Sigmoid, Tanh, Softmax
This makes activations more robust and testable without tight coupling to Tensor implementation.
- Move testing utilities from tinytorch/utils/testing.py to tito/tools/testing.py
- Update all module imports to use tito.tools.testing
- Remove testing utilities from core TinyTorch package
- Testing utilities are development tools, not part of the ML library
- Maintains clean separation between library code and development toolchain
- All tests continue to work correctly with improved architecture
- Replaced 3 overlapping documentation files with 1 authoritative source
- Set modules/source/08_optimizers/optimizers_dev.py as reference implementation
- Created comprehensive module-rules.md with complete patterns and examples
- Added living-example approach: use actual working code as template
- Removed redundant files: module-structure-design.md, module-quick-reference.md, testing-design.md
- Updated cursor rules to point to consolidated documentation
- All module development now follows single source of truth
- Fixed indentation issues in 03_layers/layers_dev.py
- Fixed indentation issues in 04_networks/networks_dev.py
- Fixed indentation issues in 05_cnn/cnn_dev.py
- Removed orphaned except/raise statements
- 06_dataloader still has some complex indentation issues to resolve
✅ Updated modules to use consistent testing format:
- 08_optimizers: 'Testing X...' → '🔬 Unit Test: X...'
- 07_autograd: 'Testing X...' → '🔬 Unit Test: X...'
- 02_activations: 'Testing X...' → '🔬 Unit Test: X...'
- 03_layers: 'Testing X...' → '🔬 Unit Test: X...'
🎯 Now all modules follow tensor_dev.py format:
- ✅ Consistent '🔬 Unit Test: [Component]...' format
- ✅ Maintains visual consistency across all modules
- ✅ Clear identification of unit test sections
- ✅ Professional and educational presentation
📊 Status: All 9 modules (00-08) now use unified testing terminology
- Remove all tests/ directories under modules/source/
- Keep main tests/ directory for testing exported functionality
- Update status command to check tests in main tests/ directory
- Update documentation to reflect new test structure
- Reduce maintenance burden by eliminating duplicate test systems
- Focus on inline NBGrader tests for development, main tests for package validation
- Enhanced tensor module documentation with mathematical foundations
- Improved explanations for scalars, vectors, and matrices
- Added NBGrader workflow documentation to activations module
- Cleaned up .cursor/rules/ directory structure
- Updated user preferences for better development workflow
These changes improve the educational content and developer experience
while maintaining the core functionality of all modules.
- Implement 'explain → code → test → repeat' structure across all modules
- Replace comprehensive end-of-module tests with progressive unit tests
- Add rich scaffolding with detailed implementation guidance
- Transform generic TODOs into step-by-step learning instructions
- Connect educational content to real-world ML systems and PyTorch
- Reduce overall codebase by 37% while enhancing learning experience
- Ensure immediate feedback and skill building for students
Modules transformed:
- 01_tensor: Tensor operations and broadcasting
- 02_activations: Activation functions and derivatives
- 03_layers: Linear layers and forward/backward propagation
- 04_networks: Network building and multi-layer composition
- 05_cnn: Convolution operations and CNN architecture
- 06_dataloader: Data pipeline and batch processing
- 07_autograd: Automatic differentiation and computational graphs
- Replace all 'python bin/tito.py' references with correct 'tito' commands
- Update command structure to use proper subcommands (tito system info, tito module test, etc.)
- Add virtual environment activation to all workflows
- Update Makefile to use correct tito commands with .venv activation
- Update activation script to use correct tito path and command examples
- Add Tiny🔥Torch branding to activation script header
- Update documentation to reflect correct CLI usage patterns
- Added detailed explanation of the linear limitation problem
- Enhanced biological inspiration and neuron modeling connections
- Included Universal Approximation Theorem and its implications
- Added real-world impact examples (computer vision, NLP, game playing)
- Comprehensive activation function properties analysis
- Historical timeline of activation function evolution
- Better visual analogies and signal processor metaphors
- Improved connections to previous and next modules
- Replace existing tests with comprehensive educational tests
- Add 12 comprehensive test cases covering all activation functions
- Include ReLU, Sigmoid, Tanh, and Softmax testing
- Add edge cases, numerical stability, and shape preservation tests
- Add function composition and real ML scenario testing
- Provide detailed feedback, hints, and progress tracking
- Follow inline-first testing approach for immediate feedback
- Add 17 intermediate test points across 6 modules for immediate student feedback
- Tensor module: Tests after creation, properties, arithmetic, and operators
- Activations module: Tests after each activation function (ReLU, Sigmoid, Tanh, Softmax)
- Layers module: Tests after matrix multiplication and Dense layer implementation
- Networks module: Tests after Sequential class and MLP creation
- CNN module: Tests after convolution, Conv2D layer, and flatten operations
- DataLoader module: Tests after Dataset interface and DataLoader class
- All tests include visual progress indicators and behavioral explanations
- Maintains NBGrader compliance with proper metadata and point allocation
- Enables steady forward progress and better debugging for students
- 100% test success rate across all modules and integration testing
- Added package structure documentation explaining modules/source/ vs tinytorch.core.
- Enhanced mathematical foundations with linear algebra refresher and Universal Approximation Theorem
- Added real-world applications for each activation function (ReLU, Sigmoid, Tanh, Softmax)
- Included mathematical properties, derivatives, ranges, and computational costs
- Added performance considerations and numerical stability explanations
- Connected to production ML systems (PyTorch, TensorFlow, JAX equivalents)
- Implemented streamlined 'tito export' command with automatic .py → .ipynb conversion
- All functionality preserved: scripts run correctly, tests pass, package integration works
- Ready to continue with remaining modules (layers, networks, cnn, dataloader)
- Remove unnecessary module_paths.txt file for cleaner architecture
- Update export command to discover modules dynamically from modules/source/
- Simplify nbdev command to support --all and module-specific exports
- Use single source of truth: nbdev settings.ini for module paths
- Clean up import structure in setup module for proper nbdev export
- Maintain clean separation between module discovery and export logic
This implements a proper software engineering approach with:
- Single source of truth (settings.ini)
- Dynamic discovery (no hardcoded paths)
- Clean CLI interface (tito package nbdev --export [--all|module])
- Robust error handling with helpful feedback