TinyTorch Testing Architecture
🎯 Overview: Two-Tier Testing Strategy
TinyTorch uses a two-tier testing approach that separates component validation from system integration:
- Inline Tests (modules/) - Component validation, unit tests
- Integration Tests (tests/) - Inter-module integration, edge cases, system tests
This separation follows ML engineering best practices: validate components in isolation, then test how they work together.
📋 Tier 1: Inline Tests (Component Validation)
Location: modules/XX_modulename/*.py
Purpose:
- ✅ Validate individual components work correctly in isolation
- ✅ Test single module functionality
- ✅ Provide immediate feedback during development
- ✅ Educate students about expected behavior
- ✅ Fast execution for rapid iteration
What Gets Tested:
- Individual class/function correctness
- Mathematical operations (forward passes)
- Shape transformations
- Basic edge cases and error handling
- Component-level functionality
Test Pattern:
def test_unit_componentname():
    """🧪 Unit Test: Component Name

    **This is a unit test** - it tests [component] in isolation.
    """
    print("🔬 Unit Test: Component...")
    # Test implementation
    assert condition, "✅ Component works"
    print("✅ Component test passed")
Example: modules/01_tensor/tensor.py
- test_unit_tensor_creation() - Tests tensor creation
- test_unit_arithmetic_operations() - Tests +, -, *, /
- test_unit_matrix_multiplication() - Tests @ operator
- test_unit_shape_manipulation() - Tests reshape, transpose
- test_unit_reduction_operations() - Tests sum, mean, max
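For illustration, the first of these tests might look like the sketch below, following the pattern shown earlier. The Tensor constructor and the .shape attribute used here are assumptions about the module's API, not its actual implementation.

def test_unit_tensor_creation():
    """🧪 Unit Test: Tensor Creation

    **This is a unit test** - it tests Tensor construction in isolation.
    """
    print("🔬 Unit Test: Tensor creation...")
    # Tensor is defined earlier in the same module file, so no import is needed here.
    t = Tensor([[1.0, 2.0], [3.0, 4.0]])  # assumed constructor: nested list of floats
    assert t.shape == (2, 2), "✅ Tensor reports the expected shape"
    print("✅ Tensor creation test passed")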
Execution:
# Run inline tests only
tito test 01_tensor --inline-only
# Tests run when you execute the module file
python modules/01_tensor/tensor.py
Key Characteristics:
- ✅ Fast: Run during development for immediate feedback
- ✅ Isolated: No dependencies on other modules
- ✅ Educational: Shows students what "correct" looks like
- ✅ Component-focused: Tests one thing at a time
📊 Tier 2: Integration Tests (tests/ Directory)
Location: tests/
Purpose:
- ✅ Test how multiple modules work together
- ✅ Validate cross-module dependencies
- ✅ Test realistic workflows and use cases
- ✅ Ensure system-level correctness
- ✅ Catch bugs that unit tests miss
- ✅ Test edge cases and corner scenarios
- ✅ Validate exported code (tinytorch/) works correctly
Key Insight:
Component correctness ≠ System correctness
A tensor might work perfectly in isolation, but fail when gradients flow through layers → activations → losses → optimizers. Integration tests catch these "seam" bugs.
🗂️ Structure of tests/ Directory
1. Module-Specific Integration Tests (tests/XX_modulename/)
Purpose: Test that module N works correctly with all previous modules (1 through N-1)
Example: tests/05_autograd/test_progressive_integration.py
- Tests autograd with Tensor (01), Activations (02), Layers (03), Losses (04)
- Validates that gradients flow correctly through the entire stack built so far
Pattern: Progressive integration
# tests/05_autograd/test_progressive_integration.py
from tinytorch.core.tensor import Tensor
# Linear, ReLU, and MSELoss come from the exported tinytorch package;
# their exact submodule paths depend on how each module is exported.

def test_autograd_with_all_previous_modules():
    # Uses real Tensor, real Layers, real Activations, real Losses
    # Then tests Autograd (05) with all of them
    x = Tensor([[1.0, 2.0]], requires_grad=True)
    target = Tensor([[0.0, 1.0, 0.0]])  # expected output for the 3-unit layer
    layer = Linear(2, 3)
    activation = ReLU()
    loss_fn = MSELoss()

    output = activation(layer(x))
    loss = loss_fn(output, target)
    loss.backward()

    assert x.grad is not None  # Gradient flowed through everything!
Why This Matters:
- Catches integration bugs early
- Ensures modules don't break previous functionality
- Validates the "seams" between modules
2. Cross-Module Integration Tests (tests/integration/)
Purpose: Test multiple modules working together in realistic scenarios
Key Files:
- test_gradient_flow.py - CRITICAL: Validates gradients flow through entire training stack
- test_end_to_end_training.py - Full training loops
- test_module_compatibility.py - Module interfaces
Example: tests/integration/test_gradient_flow.py
def test_complete_training_stack():
    """Test that gradients flow through: Tensor → Layers → Activations → Loss → Autograd → Optimizer"""
    # Uses modules 01, 02, 03, 04, 05, 06, 07
    # Validates the entire training pipeline works
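A fuller version of that test might look like the following sketch. The class names Linear, ReLU, MSELoss, and SGD, their import locations, and the parameters()/step() calls are assumptions about the exported package; the structure of the test, not the exact API, is the point.

from tinytorch.core.tensor import Tensor
# Layer, activation, loss, and optimizer classes come from the exported
# tinytorch package; the names and paths used below are assumptions.

def test_complete_training_stack():
    """Test that gradients flow through: Tensor → Layers → Activations → Loss → Autograd → Optimizer"""
    x = Tensor([[1.0, 2.0]], requires_grad=True)
    target = Tensor([[1.0]])

    layer1 = Linear(2, 4)
    layer2 = Linear(4, 1)
    activation = ReLU()
    loss_fn = MSELoss()
    optimizer = SGD(layer1.parameters() + layer2.parameters(), lr=0.01)  # assumed optimizer API

    # Forward pass through the whole stack
    hidden = activation(layer1(x))
    output = layer2(hidden)
    loss = loss_fn(output, target)

    # Backward pass: autograd must reach every parameter
    loss.backward()
    for param in layer1.parameters() + layer2.parameters():
        assert param.grad is not None, "Gradient did not reach a layer parameter"

    # Optimizer step should run without error on the populated gradients
    optimizer.step()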
Why This Matters:
- Catches bugs that unit tests miss
- Validates the "seams" between modules
- Ensures training actually works end-to-end
- Tests realistic ML workflows
3. Edge Cases & Stress Tests (tests/05_autograd/, tests/debugging/)
Purpose: Test corner cases and common pitfalls
Examples:
- tests/05_autograd/test_broadcasting.py - Broadcasting gradient bugs
- tests/05_autograd/test_computation_graph.py - Graph construction edge cases
- tests/debugging/test_gradient_vanishing.py - Detect vanishing gradients
- tests/debugging/test_common_mistakes.py - "Did you forget backward()?" style tests
Philosophy: When these tests fail, the error message should teach the student what went wrong and how to fix it.
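As a sketch of that philosophy, a broadcasting edge-case test might look like the example below. The Tensor API, NumPy-style broadcasting behavior, and gradient shapes asserted here are assumptions; the instructive failure message is what matters.

from tinytorch.core.tensor import Tensor

def test_broadcast_gradient_shape():
    """Edge case: a gradient flowing back through a broadcast add must match
    the *original* operand's shape, not the broadcast shape."""
    a = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)  # shape (2, 2)
    b = Tensor([10.0, 20.0], requires_grad=True)              # shape (2,), broadcast across rows
    out = (a + b).sum()
    out.backward()
    assert b.grad.shape == (2,), (
        "b.grad has the broadcast shape instead of b's original shape. "
        "Did you forget to sum gradients over the broadcast dimensions in backward()?"
    )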
Why This Matters:
- Catches numerical stability issues
- Tests edge cases that break in production
- Pedagogical: teaches debugging skills
4. Regression Tests (tests/regression/)
Purpose: Ensure previously fixed bugs don't come back
Pattern: Each bug gets a test file
- test_issue_20241125_conv_fc_shapes.py - Tests a specific bug that was fixed
- Documents the bug, root cause, fix, and prevention
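The skeleton below shows what such a file typically looks like; the file name and docstring fields are hypothetical placeholders, not the contents of the real test_issue_20241125_conv_fc_shapes.py.

# tests/regression/test_issue_YYYYMMDD_example.py  (hypothetical file name)

def test_issue_example_does_not_regress():
    """Regression test for a previously fixed bug.

    Bug:        <what the user-visible failure was>
    Root cause: <which module/function was wrong and why>
    Fix:        <what change resolved it>
    Prevention: this test reproduces the original failing input and must keep passing.
    """
    # Reproduce the exact scenario that used to fail, then assert the fixed behavior.
    ...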
Why This Matters:
- Prevents regressions
- Documents historical bugs
- Ensures fixes persist
5. Performance Tests (tests/performance/)
Purpose: Validate system performance characteristics
Examples:
- Memory profiling
- Speed benchmarks
- Scalability tests
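A minimal speed benchmark of the kind listed above might look like the sketch below; the Tensor constructor and the 0.1-second budget are illustrative assumptions, not project thresholds.

import time
from tinytorch.core.tensor import Tensor

def test_matmul_speed_budget():
    """Speed benchmark: a modest matmul loop should finish under an illustrative budget."""
    a = Tensor([[float(i) for i in range(64)] for _ in range(64)])  # assumed constructor
    b = Tensor([[float(i) for i in range(64)] for _ in range(64)])

    start = time.perf_counter()
    for _ in range(10):
        _ = a @ b  # the @ operator itself is covered by the tensor module's inline tests
    elapsed = time.perf_counter() - start

    assert elapsed < 0.1, f"10 matmuls took {elapsed:.3f}s; expected under the illustrative 0.1s budget"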
Why This Matters:
- Ensures implementations are efficient
- Validates performance characteristics
- Catches performance regressions
6. System Tests (tests/system/)
Purpose: Test entire system workflows
Examples:
- End-to-end training pipelines
- Model export/import
- Checkpoint system tests
Why This Matters:
- Validates complete workflows
- Tests production scenarios
- Ensures system-level correctness
7. Checkpoint Tests (tests/checkpoints/)
Purpose: Validate milestone capabilities
Examples:
- checkpoint_01_foundation.py - Tensor operations mastered
- checkpoint_05_learning.py - Autograd working correctly
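A checkpoint test bundles a few capability checks behind one clear pass/fail. The sketch below is a guess at the shape of checkpoint_01_foundation.py; the function name, reshape signature, and .shape attribute are assumed API details.

from tinytorch.core.tensor import Tensor

def test_checkpoint_01_foundation():
    """Checkpoint 01: core tensor operations work together."""
    a = Tensor([[1.0, 2.0], [3.0, 4.0]])
    b = Tensor([[5.0, 6.0], [7.0, 8.0]])

    product = a @ b             # matrix multiplication (@ operator)
    total = (a + b).sum()       # arithmetic followed by a reduction
    reshaped = a.reshape(4, 1)  # shape manipulation

    assert product.shape == (2, 2)
    assert reshaped.shape == (4, 1)
    assert total is not None    # the reduction ran and produced a result
    print("✅ Checkpoint 01 passed: tensor foundation is in place")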
Why This Matters:
- Validates student progress
- Ensures milestones are met
- Provides clear success criteria
🔄 Code Flow: Development → Export → Testing
┌─────────────────────────────────────────────────────────────┐
│ DEVELOPMENT WORKFLOW │
└─────────────────────────────────────────────────────────────┘
1. DEVELOP in modules/
└─> modules/01_tensor/tensor.py
├─> Write code
├─> Write inline tests (test_unit_*)
└─> Run: python modules/01_tensor/tensor.py
2. EXPORT to tinytorch/
└─> tito export 01_tensor
└─> Code exported to tinytorch/core/tensor.py
3. TEST integration
└─> tests/01_tensor/test_progressive_integration.py
├─> Imports from tinytorch.core.tensor (exported code!)
├─> Tests module works with previous modules
└─> Run: pytest tests/01_tensor/
4. TEST cross-module
└─> tests/integration/test_gradient_flow.py
├─> Imports from tinytorch.* (all exported modules)
├─> Tests multiple modules working together
└─> Run: pytest tests/integration/
🎯 Decision Tree: Where Should This Test Go?
Is it testing a single component in isolation?
├─ YES → modules/XX_modulename/*.py (inline test_unit_*)
│
└─ NO → Is it testing module N with previous modules?
├─ YES → tests/XX_modulename/test_progressive_integration.py
│
└─ NO → Is it testing multiple modules together?
├─ YES → tests/integration/test_*.py
│
└─ NO → Is it an edge case or stress test?
├─ YES → tests/XX_modulename/test_*_edge_cases.py
│ OR tests/debugging/test_*.py
│
└─ NO → Is it a regression test?
├─ YES → tests/regression/test_issue_*.py
│
└─ NO → Is it a performance test?
├─ YES → tests/performance/test_*.py
│
└─ NO → Is it a system test?
└─ YES → tests/system/test_*.py
📝 Best Practices
DO:
✅ Write inline tests immediately after implementing a component
✅ Test one thing per inline test function
✅ Use descriptive test function names (test_unit_sigmoid, not test1)
✅ Add integration tests when combining multiple modules
✅ Run inline tests frequently during development
✅ Run full test suite before committing
✅ Test exported code (tinytorch/), not development code (modules/)
✅ Write tests that catch real bugs you've encountered
DON'T:
❌ Mix inline and integration test concerns
❌ Test implementation details in integration tests
❌ Skip inline tests and jump to integration
❌ Test mocked/fake components (use real ones)
❌ Create dependencies between test files
❌ Test code in modules/ directly in tests/ (test tinytorch/ instead)
❌ Duplicate inline tests in tests/ directory
🔍 Key Distinctions
| Aspect | Inline Tests (modules/) | Integration Tests (tests/) |
|---|---|---|
| Location | modules/XX_name/*.py | tests/XX_name/ or tests/integration/ |
| Scope | Single component | Multiple modules |
| Dependencies | None (isolated) | Previous modules |
| Speed | Fast | Slower |
| Purpose | Component correctness | System correctness |
| When to run | During development | Before commit/export |
| What gets tested | modules/ code directly | tinytorch/ exported code |
| Example | test_unit_tensor_creation() | test_tensor_with_layers() |
🚀 Testing Workflow
For Students:
# 1. Work on module
cd modules/01_tensor
vim tensor.py
# 2. Run inline tests (fast feedback)
python tensor.py
# or
tito test 01_tensor --inline-only
# 3. Export to package
tito export 01_tensor
# 4. Run integration tests (full validation)
tito test 01_tensor
# or
pytest tests/01_tensor/
# 5. Run cross-module tests (ensure nothing broke)
pytest tests/integration/
For Instructors:
# Comprehensive test suite
tito test --comprehensive
# Specific module deep dive
tito test 05_autograd --detailed
# All inline tests only (quick check)
tito test --all --inline-only
# Critical integration tests
pytest tests/integration/test_gradient_flow.py -v
💡 Why This Architecture?
Separation of Concerns:
- Inline tests = "Does this component work?"
- Integration tests = "Do these components work together?"
Educational Value:
- Students learn component testing first
- Then learn integration testing
- Mirrors professional ML engineering workflows
Practical Benefits:
- Fast feedback during development (inline tests)
- Comprehensive validation before commit (integration tests)
- Catches bugs at the right level
- Clear mental model: component vs. system
Real-World Alignment:
- Professional ML teams use this pattern
- Unit tests for components
- Integration tests for pipelines
- System tests for workflows
📚 Summary
Think of tests/ as the "system validation layer":
- modules/ inline tests = "Does my component work?"
- tests/XX_modulename/ = "Does my module work with previous modules?"
- tests/integration/ = "Do multiple modules work together?"
- tests/debugging/ = "Are there edge cases I'm missing?"
- tests/regression/ = "Did I break something that was working?"
- tests/performance/ = "Is my implementation efficient?"
- tests/system/ = "Does the entire system work?"
The key insight: tests/ validates that exported code (tinytorch/) works correctly in realistic scenarios, catching bugs that isolated unit tests miss.
Last Updated: 2025-01-XX
Test Infrastructure: Complete (20/20 modules have test directories)
Philosophy: Component correctness ≠ System correctness