TinyTorch Testing Architecture

🎯 Overview: Two-Tier Testing Strategy

TinyTorch uses a two-tier testing approach that separates component validation from system integration:

  1. Inline Tests (modules/) - Component validation, unit tests
  2. Integration Tests (tests/) - Inter-module integration, edge cases, system tests

This separation follows ML engineering best practices: validate components in isolation, then test how they work together.


📋 Tier 1: Inline Tests (Component Validation)

Location: modules/XX_modulename/*.py

Purpose:

  • Validate individual components work correctly in isolation
  • Test single module functionality
  • Provide immediate feedback during development
  • Educate students about expected behavior
  • Fast execution for rapid iteration

What Gets Tested:

  • Individual class/function correctness
  • Mathematical operations (forward passes)
  • Shape transformations
  • Basic edge cases and error handling
  • Component-level functionality

Test Pattern:

def test_unit_componentname():
    """🧪 Unit Test: Component Name
    
    **This is a unit test** - it tests [component] in isolation.
    """
    print("🔬 Unit Test: Component...")
    
    # Test implementation
    assert condition, "Component should <describe expected behavior>"
    
    print("✅ Component test passed")

Example: modules/01_tensor/tensor.py

  • test_unit_tensor_creation() - Tests tensor creation
  • test_unit_arithmetic_operations() - Tests +, -, *, /
  • test_unit_matrix_multiplication() - Tests @ operator
  • test_unit_shape_manipulation() - Tests reshape, transpose
  • test_unit_reduction_operations() - Tests sum, mean, max
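
As a concrete illustration, one of these inline tests might look roughly like the sketch below. It would live in the same file as the Tensor class, so no import of Tensor is needed; the .shape and .data attributes (a NumPy array) are assumptions based on the operations listed above.

import numpy as np

def test_unit_tensor_creation():
    """🧪 Unit Test: Tensor Creation

    **This is a unit test** - it tests Tensor construction in isolation.
    """
    print("🔬 Unit Test: Tensor creation...")

    t = Tensor([[1.0, 2.0], [3.0, 4.0]])
    assert t.shape == (2, 2), "Tensor should report the shape of its data"
    assert np.allclose(t.data, [[1.0, 2.0], [3.0, 4.0]]), "Tensor should preserve its values"

    print("✅ Tensor creation test passed")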

Execution:

# Run inline tests only
tito test 01_tensor --inline-only

# Tests run when you execute the module file
python modules/01_tensor/tensor.py

Key Characteristics:

  • Fast: Run during development for immediate feedback
  • Isolated: No dependencies on other modules
  • Educational: Shows students what "correct" looks like
  • Component-focused: Tests one thing at a time

📊 Tier 2: Integration Tests (tests/ Directory)

Location: tests/

Purpose:

  • Test how multiple modules work together
  • Validate cross-module dependencies
  • Test realistic workflows and use cases
  • Ensure system-level correctness
  • Catch bugs that unit tests miss
  • Test edge cases and corner scenarios
  • Validate exported code (tinytorch/) works correctly

Key Insight:

Component correctness ≠ System correctness

A tensor might work perfectly in isolation, but fail when gradients flow through layers → activations → losses → optimizers. Integration tests catch these "seam" bugs.


🗂️ Structure of tests/ Directory

1. Module-Specific Integration Tests (tests/XX_modulename/)

Purpose: Test that module N works correctly with all previous modules (1 through N-1)

Example: tests/05_autograd/test_progressive_integration.py

  • Tests autograd with Tensor (01), Activations (02), Layers (03), Losses (04)
  • Validates that gradients flow correctly through the entire stack built so far

Pattern: Progressive integration

# tests/05_autograd/test_progressive_integration.py
def test_autograd_with_all_previous_modules():
    # Uses real Tensor, real Layers, real Activations, real Losses
    # Then tests Autograd (05) with all of them
    x = Tensor([[1.0, 2.0]], requires_grad=True)
    target = Tensor([[0.0, 1.0, 0.0]])  # target matching the (1, 3) output of the stack below
    layer = Linear(2, 3)
    activation = ReLU()
    loss_fn = MSELoss()
    
    output = activation(layer(x))
    loss = loss_fn(output, target)
    loss.backward()
    
    assert x.grad is not None  # Gradient flowed through everything!

Why This Matters:

  • Catches integration bugs early
  • Ensures modules don't break previous functionality
  • Validates the "seams" between modules

2. Cross-Module Integration Tests (tests/integration/)

Purpose: Test multiple modules working together in realistic scenarios

Key Files:

  • test_gradient_flow.py - CRITICAL: Validates gradients flow through entire training stack
  • test_end_to_end_training.py - Full training loops
  • test_module_compatibility.py - Module interfaces

Example: tests/integration/test_gradient_flow.py

def test_complete_training_stack():
    """Test that gradients flow through: Tensor → Layers → Activations → Loss → Autograd → Optimizer"""
    # Uses modules 01, 02, 03, 04, 05, 06, 07
    # Validates the entire training pipeline works
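
A fleshed-out version of this test might look like the sketch below. Tensor, Linear, ReLU, and MSELoss appear elsewhere in this document; the SGD optimizer class and its parameters()/step() interface are assumptions.

def test_complete_training_stack():
    """Test that gradients flow through: Tensor → Layers → Activations → Loss → Autograd → Optimizer"""
    x = Tensor([[1.0, 2.0]], requires_grad=True)
    target = Tensor([[0.0, 1.0, 0.0]])

    layer = Linear(2, 3)
    activation = ReLU()
    loss_fn = MSELoss()
    optimizer = SGD(layer.parameters(), lr=0.01)  # hypothetical optimizer API

    # One full training step: forward → loss → backward → update
    output = activation(layer(x))
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

    # Gradients must have reached the input, and the update must not have crashed
    assert x.grad is not None, "Gradient did not flow back to the input tensor"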

Why This Matters:

  • Catches bugs that unit tests miss
  • Validates the "seams" between modules
  • Ensures training actually works end-to-end
  • Tests realistic ML workflows

3. Edge Cases & Stress Tests (tests/05_autograd/, tests/debugging/)

Purpose: Test corner cases and common pitfalls

Examples:

  • tests/05_autograd/test_broadcasting.py - Broadcasting gradient bugs
  • tests/05_autograd/test_computation_graph.py - Graph construction edge cases
  • tests/debugging/test_gradient_vanishing.py - Detect vanishing gradients
  • tests/debugging/test_common_mistakes.py - "Did you forget backward()?" style tests

Philosophy: When these tests fail, the error message should teach the student what went wrong and how to fix it.
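
For example, a debugging-style test can phrase its assertion messages as hints. The sketch below only uses classes already shown in this document; the exact wording in tests/debugging/ will differ.

def test_common_mistake_forgot_backward():
    """Catches the classic mistake of reading .grad before calling backward()."""
    x = Tensor([[1.0, 2.0]], requires_grad=True)
    output = Linear(2, 1)(x)
    loss = MSELoss()(output, Tensor([[0.0]]))

    # Before backward(), no gradient should exist yet
    assert x.grad is None, (
        "x.grad was populated before backward() was called - "
        "gradients should only appear after the backward pass."
    )

    loss.backward()
    assert x.grad is not None, (
        "x.grad is still None after backward(). "
        "Did you forget to build the computation graph, or detach the loss?"
    )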

Why This Matters:

  • Catches numerical stability issues
  • Tests edge cases that break in production
  • Pedagogical: teaches debugging skills

4. Regression Tests (tests/regression/)

Purpose: Ensure previously fixed bugs don't come back

Pattern: Each bug gets a test file

  • test_issue_20241125_conv_fc_shapes.py - Tests a specific bug that was fixed
  • Documents the bug, root cause, fix, and prevention
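
A regression test can follow a simple skeleton like the one below: the docstring carries the documentation, and the test body re-creates the exact scenario that used to fail. The details here are placeholders, not the actual 2024-11-25 bug.

def test_issue_YYYYMMDD_short_description():
    """Regression test skeleton.

    Bug:        what was observed (e.g. shape mismatch between conv output and FC input)
    Root cause: what was actually wrong in the code
    Fix:        what changed, and where
    Prevention: this test fails if the old behavior ever returns
    """
    # Re-create the exact failing scenario here and assert the fixed behavior.
    ...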

Why This Matters:

  • Prevents regressions
  • Documents historical bugs
  • Ensures fixes persist

5. Performance Tests (tests/performance/)

Purpose: Validate system-level performance characteristics

Examples:

  • Memory profiling
  • Speed benchmarks
  • Scalability tests
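
As one sketch of what a speed benchmark might look like (the matrix size and time budget below are arbitrary and would need tuning for the hardware that runs the suite):

import time

def test_matmul_speed_budget():
    """Speed benchmark sketch: a moderately sized matmul should finish quickly."""
    a = Tensor([[float(j) for j in range(256)] for _ in range(256)])
    b = Tensor([[float(j) for j in range(256)] for _ in range(256)])

    start = time.perf_counter()
    _ = a @ b
    elapsed = time.perf_counter() - start

    # Generous budget; tighten or relax to match CI/grading hardware
    assert elapsed < 1.0, f"256x256 matmul took {elapsed:.3f}s, expected under 1.0s"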

Why This Matters:

  • Ensures implementations are efficient
  • Validates performance characteristics
  • Catches performance regressions

6. System Tests (tests/system/)

Purpose: Test entire system workflows

Examples:

  • End-to-end training pipelines
  • Model export/import
  • Checkpoint system tests

Why This Matters:

  • Validates complete workflows
  • Tests production scenarios
  • Ensures system-level correctness

7. Checkpoint Tests (tests/checkpoints/)

Purpose: Validate milestone capabilities

Examples:

  • checkpoint_01_foundation.py - Tensor operations mastered
  • checkpoint_05_learning.py - Autograd working correctly
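
A checkpoint script might boil down to a handful of capability checks like this sketch (the function name, the .shape attribute, and the reshape call signature are assumptions; + and @ are operations mentioned in the inline-test examples above):

def check_foundation_milestone():
    """Checkpoint sketch: the core tensor operations from module 01 all work."""
    a = Tensor([[1.0, 2.0], [3.0, 4.0]])
    b = Tensor([[5.0, 6.0], [7.0, 8.0]])

    assert (a + b).shape == (2, 2)      # arithmetic preserves shape
    assert (a @ b).shape == (2, 2)      # matrix multiplication works
    assert a.reshape((4,)).shape == (4,)  # shape manipulation works

    print("✅ Checkpoint 01: foundation capabilities verified")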

Why This Matters:

  • Validates student progress
  • Ensures milestones are met
  • Provides clear success criteria

🔄 Code Flow: Development → Export → Testing

┌─────────────────────────────────────────────────────────────┐
│                    DEVELOPMENT WORKFLOW                      │
└─────────────────────────────────────────────────────────────┘

1. DEVELOP in modules/
   └─> modules/01_tensor/tensor.py
       ├─> Write code
       ├─> Write inline tests (test_unit_*)
       └─> Run: python modules/01_tensor/tensor.py

2. EXPORT to tinytorch/
   └─> tito export 01_tensor
       └─> Code exported to tinytorch/core/tensor.py

3. TEST integration
   └─> tests/01_tensor/test_progressive_integration.py
       ├─> Imports from tinytorch.core.tensor (exported code!)
       ├─> Tests module works with previous modules
       └─> Run: pytest tests/01_tensor/

4. TEST cross-module
   └─> tests/integration/test_gradient_flow.py
       ├─> Imports from tinytorch.* (all exported modules)
       ├─> Tests multiple modules working together
       └─> Run: pytest tests/integration/

🎯 Decision Tree: Where Should This Test Go?

Is it testing a single component in isolation?
├─ YES → modules/XX_modulename/*.py (inline test_unit_*)
│
└─ NO → Is it testing module N with previous modules?
    ├─ YES → tests/XX_modulename/test_progressive_integration.py
    │
    └─ NO → Is it testing multiple modules together?
        ├─ YES → tests/integration/test_*.py
        │
        └─ NO → Is it an edge case or stress test?
            ├─ YES → tests/XX_modulename/test_*_edge_cases.py
            │         OR tests/debugging/test_*.py
            │
            └─ NO → Is it a regression test?
                ├─ YES → tests/regression/test_issue_*.py
                │
                └─ NO → Is it a performance test?
                    ├─ YES → tests/performance/test_*.py
                    │
                    └─ NO → Is it a system test?
                        └─ YES → tests/system/test_*.py

📝 Best Practices

DO:

  • Write inline tests immediately after implementing a component
  • Test one thing per inline test function
  • Use descriptive test function names (test_unit_sigmoid, not test1)
  • Add integration tests when combining multiple modules
  • Run inline tests frequently during development
  • Run the full test suite before committing
  • Test exported code (tinytorch/), not development code (modules/)
  • Write tests that catch real bugs you've encountered

DON'T:

  • Mix inline and integration test concerns
  • Test implementation details in integration tests
  • Skip inline tests and jump straight to integration
  • Test mocked/fake components (use real ones)
  • Create dependencies between test files
  • Test code in modules/ directly in tests/ (test tinytorch/ instead)
  • Duplicate inline tests in the tests/ directory


🔍 Key Distinctions

| Aspect           | Inline Tests (modules/)     | Integration Tests (tests/)            |
|------------------|-----------------------------|---------------------------------------|
| Location         | modules/XX_name/*.py        | tests/XX_name/ or tests/integration/  |
| Scope            | Single component            | Multiple modules                      |
| Dependencies     | None (isolated)             | Previous modules                      |
| Speed            | Fast                        | Slower                                |
| Purpose          | Component correctness       | System correctness                    |
| When to run      | During development          | Before commit/export                  |
| What gets tested | modules/ code directly      | tinytorch/ exported code              |
| Example          | test_unit_tensor_creation() | test_tensor_with_layers()             |

🚀 Testing Workflow

For Students:

# 1. Work on module
cd modules/01_tensor
vim tensor.py

# 2. Run inline tests (fast feedback)
python tensor.py
# or
tito test 01_tensor --inline-only

# 3. Export to package
tito export 01_tensor

# 4. Run integration tests (full validation)
tito test 01_tensor
# or
pytest tests/01_tensor/

# 5. Run cross-module tests (ensure nothing broke)
pytest tests/integration/

For Instructors:

# Comprehensive test suite
tito test --comprehensive

# Specific module deep dive
tito test 05_autograd --detailed

# All inline tests only (quick check)
tito test --all --inline-only

# Critical integration tests
pytest tests/integration/test_gradient_flow.py -v

💡 Why This Architecture?

Separation of Concerns:

  • Inline tests = "Does this component work?"
  • Integration tests = "Do these components work together?"

Educational Value:

  • Students learn component testing first
  • Then learn integration testing
  • Mirrors professional ML engineering workflows

Practical Benefits:

  • Fast feedback during development (inline tests)
  • Comprehensive validation before commit (integration tests)
  • Catches bugs at the right level
  • Clear mental model: component vs. system

Real-World Alignment:

  • Professional ML teams use this pattern
  • Unit tests for components
  • Integration tests for pipelines
  • System tests for workflows

📚 Summary

Think of tests/ as the "system validation layer":

  1. modules/ inline tests = "Does my component work?"
  2. tests/XX_modulename/ = "Does my module work with previous modules?"
  3. tests/integration/ = "Do multiple modules work together?"
  4. tests/debugging/ = "Are there edge cases I'm missing?"
  5. tests/regression/ = "Did I break something that was working?"
  6. tests/performance/ = "Is my implementation efficient?"
  7. tests/system/ = "Does the entire system work?"

The key insight: tests/ validates that exported code (tinytorch/) works correctly in realistic scenarios, catching bugs that isolated unit tests miss.


Last Updated: 2025-01-XX
Test Infrastructure: Complete (20/20 modules have test directories)
Philosophy: Component correctness ≠ System correctness