mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-06-03 17:42:14 -05:00
4aeeb1069ad7c9d213554a6b557b30ce5c0d1da7
- Add comprehensive mock-based tests for Activations module (tests/test_activations.py):
  * TestReLUActivation: 7 test methods covering positive/negative values, mixed inputs, 2D processing
  * TestSigmoidActivation: 6 test methods covering zero input, symmetry, extreme values, 2D processing
  * TestTanhActivation: 6 test methods covering antisymmetry, extreme values, mathematical properties
  * TestSoftmaxActivation: 6 test methods covering probability distribution, numerical stability, batch processing
  * TestActivationIntegration: 3 test methods covering chaining, consistency, shape preservation
  * TestActivationEdgeCases: 3 test methods covering empty input, small values, inf/nan handling
  * Total: 514 lines with MockTensor class avoiding cross-module dependencies
- Add comprehensive mock-based tests for Networks module (tests/test_networks.py):
  * TestSequentialNetwork: 8 test methods covering initialization, layer addition, forward pass, batch processing
  * TestMLPNetwork: 6 test methods covering basic/parameter initialization, network structure, forward pass
  * TestNetworkIntegration: 3 test methods covering composition, equivalence, complex architectures
  * TestNetworkEdgeCases: 4 test methods covering incompatible layers, edge sizes, empty networks
  * TestNetworkPerformance: 2 test methods covering call efficiency and scalability
  * Total: 552 lines with MockTensor and MockLayer classes for isolated testing
- Add comprehensive mock-based tests for CNN module (tests/test_cnn.py):
  * TestConv2DNaive: 6 test methods covering basic convolution, edge detection, different sizes, kernels
  * TestConv2DLayer: 7 test methods covering initialization, forward pass, batch processing, consistency
  * TestFlattenFunction: 6 test methods covering 2D/3D tensors, shape preservation, batch dimensions
  * TestCNNIntegration: 4 test methods covering conv-to-flatten pipeline, multiple layers, feature extraction
  * TestCNNEdgeCases: 4 test methods covering minimal input, large kernels, numerical stability
  * TestCNNPerformance: 4 test methods covering consistency, scalability, efficiency
  * TestCNNMathematicalProperties: 3 test methods covering linearity, translation invariance, bijection
  * Total: 521 lines with MockTensor class for isolated CNN testing
- Add comprehensive mock-based tests for DataLoader module (tests/test_dataloader.py):
  * TestDatasetInterface: 6 test methods covering abstract methods, MockDataset functionality, configurations
  * TestDataLoaderBasic: 4 test methods covering initialization, length calculation, iteration
  * TestDataLoaderShuffling: 3 test methods covering shuffle/no-shuffle behavior, consistency
  * TestDataLoaderEdgeCases: 5 test methods covering empty datasets, single samples, edge cases
  * TestDataLoaderIntegration: 3 test methods covering SimpleDataset, custom datasets, different data types
  * TestDataLoaderPerformance: 3 test methods covering memory efficiency, iteration speed, scalability
  * TestDataLoaderRobustness: 3 test methods covering invalid inputs, error handling, consistency
  * Total: 585 lines with MockTensor and MockDataset classes for isolated testing
- All mock-based tests follow established patterns:
  * Simple, visible mocks instead of complex mocking frameworks
  * Test interface contracts and behavior, not implementation details
  * Avoid dependency cascade where tests fail due to other module bugs
  * Focus on mathematical correctness and architectural patterns
  * Educational value with clear test structure and comprehensive coverage
- Complete mock-based testing implementation: 2,172 lines across 4 modules
- Total testing architecture: 6,200+ lines across inline and mock-based tests
- Ready for production-quality module isolation and validation
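The "simple, visible mocks" pattern described in this commit message could look roughly like the sketch below. It is only an illustration: the `MockTensor` fields and the stand-in `ReLU` class are assumptions made for this example, and the actual contents of tests/test_activations.py may differ.

```python
import unittest


class MockTensor:
    """Minimal stand-in for a tensor: wraps a flat list of floats (hypothetical)."""

    def __init__(self, data):
        self.data = [float(x) for x in data]

    @property
    def shape(self):
        return (len(self.data),)


class ReLU:
    """Stand-in for the activation under test (assumed interface: callable on a tensor)."""

    def __call__(self, x):
        return MockTensor(max(0.0, v) for v in x.data)


class TestReLUActivation(unittest.TestCase):
    """Checks the interface contract only, not how ReLU is implemented."""

    def test_negative_values_become_zero(self):
        out = ReLU()(MockTensor([-2.0, -0.5]))
        self.assertEqual(out.data, [0.0, 0.0])

    def test_positive_values_pass_through(self):
        out = ReLU()(MockTensor([0.5, 3.0]))
        self.assertEqual(out.data, [0.5, 3.0])

    def test_shape_is_preserved(self):
        x = MockTensor([-1.0, 0.0, 1.0])
        self.assertEqual(ReLU()(x).shape, x.shape)
```

Saved as a test file, this can be run with `python -m unittest`. Because the mock lives in the test file itself, a bug in a real tensor module cannot make these activation tests fail.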
🔥 TinyTorch: Build ML Systems from Scratch
A complete Machine Learning Systems course where students build their own ML framework.
🎯 What You'll Build
- Complete ML Framework: Build your own PyTorch-style framework from scratch
- Real Applications: Use your framework to classify CIFAR-10 images
- Production Skills: Learn ML systems engineering, not just algorithms
- Immediate Feedback: See your code working at every step
🚀 Quick Start (2 minutes)
Students
git clone https://github.com/your-org/tinytorch.git
cd tinytorch # Directory name matches the cloned repository
make install # Install dependencies
tito system doctor # Verify setup
cd assignments/source/00_setup # Start with setup
jupyter lab setup_dev.py # Open first assignment
Instructors
# System check
tito system info # Check course status
tito system doctor # Verify environment
# Assignment management
tito nbgrader generate 00_setup # Create student assignments
tito nbgrader release 00_setup # Release to students
tito nbgrader autograde 00_setup # Auto-grade submissions
📚 Course Structure
Core Assignments (6+ weeks of proven content)
- 00_setup (20/20 tests) - Development workflow & CLI tools
- 02_activations (24/24 tests) - ReLU, Sigmoid, Tanh functions
- 03_layers (17/22 tests) - Dense layers & neural building blocks
- 04_networks (20/25 tests) - Sequential networks & MLPs
- 05_cnn (2/2 tests) - Convolution operations
- 06_dataloader (15/15 tests) - CIFAR-10 data loading
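To give a feel for how the layer and network assignments fit together, here is a minimal sketch of a dense layer, an activation, and a sequential container chained into a small MLP. It uses plain Python lists instead of the course's tensor type, and the class names and constructor signatures are assumptions for illustration, not the course's actual API.

```python
class Dense:
    """Fully connected layer on plain Python lists: y = W x + b."""

    def __init__(self, weights, bias):
        self.weights = weights  # one row of weights per output unit
        self.bias = bias

    def __call__(self, x):
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]


class ReLU:
    """Element-wise max(0, x)."""

    def __call__(self, x):
        return [max(0.0, v) for v in x]


class Sequential:
    """Applies layers in order, feeding each output to the next layer."""

    def __init__(self, layers):
        self.layers = list(layers)

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


# A 2 -> 2 -> 1 multi-layer perceptron with hand-picked weights.
mlp = Sequential([
    Dense([[1.0, -1.0], [0.5, 0.5]], [0.0, -2.0]),
    ReLU(),
    Dense([[2.0, 1.0]], [0.5]),
])
print(mlp([3.0, 1.0]))  # [4.5]
```

Every layer is just a callable, so the container needs no knowledge of what each layer does.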
Advanced Features (in development)
- 01_tensor (22/33 tests) - Tensor arithmetic
- 07-13 - Autograd, optimizers, training, MLOps
🛠️ Development Workflow
NBGrader (Assignment Creation & Testing)
tito nbgrader generate 00_setup # Create student assignments
tito nbgrader release 00_setup # Release to students
tito nbgrader collect 00_setup # Collect submissions
tito nbgrader autograde 00_setup # Auto-grade with pytest
nbdev (Package Export & Building)
tito module export 00_setup # Export to tinytorch package
tito module test 00_setup # Test package integration
📈 Student Success Path
Build → Use → Understand → Repeat
- Build: Implement the ReLU() function from scratch
- Use: from tinytorch.core.activations import ReLU - your own code!
- Understand: See how it works in real neural networks
- Repeat: Each assignment builds on previous work
Example: First Assignment
# You implement this:
def hello_tinytorch():
    print("Welcome to TinyTorch!")
# Then immediately use it:
from tinytorch.core.utils import hello_tinytorch
hello_tinytorch() # Your code working!
🎓 Educational Philosophy
Real Data, Real Systems
- Work with CIFAR-10 (not toy datasets)
- Production-style code organization
- Performance and engineering considerations
- Immediate visual feedback
Build Everything from Scratch
- No black boxes or "magic" functions
- Understanding through implementation
- Connect every concept to production systems
- See your code working immediately
📁 Repository Structure
TinyTorch/
├── assignments/source/XX/ # Assignment source files
│ ├── XX_dev.py # Development assignment
│ └── tests/ # Assignment tests
├── tinytorch/ # Your built framework
│ └── core/ # Exported student code
├── tito/ # CLI tools
└── docs/ # Documentation
🔧 Technical Requirements
- Python 3.8+
- Jupyter Lab for development
- PyTorch for comparison and final projects
- NBGrader for assignment management
- nbdev for package building
🎯 Getting Started
Students
- System Check: tito system doctor
- First Assignment: cd assignments/source/00_setup && jupyter lab setup_dev.py
- Build & Test: Follow the notebook, export when complete
- Use Your Code: from tinytorch.core.utils import hello_tinytorch
Instructors
- Course Status: tito system info
- Assignment Management: tito nbgrader generate 00_setup
- Student Release: tito nbgrader release 00_setup
- Auto-grading: tito nbgrader autograde 00_setup
📊 Success Metrics
Students can currently:
- Build and test multi-layer perceptrons
- Implement custom activation functions
- Load and process CIFAR-10 data
- Create basic convolution operations
- Export their code to a working package
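As an illustration of the "basic convolution operations" item, a naive 2D convolution can be written with four nested loops. The sketch below assumes no padding, a stride of 1, and nested Python lists as input. The function name `conv2d_naive` is chosen for this example, and the course's own implementation may look different.

```python
def conv2d_naive(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Slide the kernel over the image and sum the element-wise products.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
kernel = [
    [1.0, 0.0],
    [0.0, -1.0],
]
print(conv2d_naive(image, kernel))  # [[-4.0, -4.0], [-4.0, -4.0]]
```

A 3x3 input with a 2x2 kernel yields a 2x2 output, since each output dimension is the input size minus the kernel size plus 1.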
Verified workflows:
- ✅ Student Journey: receive assignment → implement → export → use
- ✅ Instructor Journey: create → release → collect → grade
- ✅ Package Integration: All core imports work correctly
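One way to spot-check the package integration claim on your own machine is to try importing the exported modules. The helper below is a hypothetical sketch, not part of the `tito` CLI. The two module paths are taken from the import examples earlier in this README and will report as missing until the corresponding assignments have been exported.

```python
import importlib


def check_imports(module_names):
    """Return {module_name: True/False} depending on whether each import succeeds."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Also covers ModuleNotFoundError, which subclasses ImportError.
            results[name] = False
    return results


# Module paths taken from the import examples in this README.
status = check_imports(["tinytorch.core.utils", "tinytorch.core.activations"])
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```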
🎉 TinyTorch is ready for classroom use with 6+ weeks of proven curriculum content!