Files
TinyTorch/docs/development/testing-design.md
Vijay Janapa Reddi 2f2feeae3f Add comprehensive testing and module structure design documents
- Create testing-design.md analyzing current testing redundancy
- Propose unified testing approach eliminating unit/module distinction
- Create module-structure-design.md with standardized patterns
- Document NBDev educational framework requirements
- Establish design guidelines for future module development
2025-07-12 18:54:24 -04:00

11 KiB

TinyTorch Testing Design Document

Overview

This document analyzes the current testing architecture and proposes a unified approach that eliminates redundancy while maximizing educational value and development efficiency.

Current Testing Structure (Analysis)

What We Have Now

  1. Inline Tests (in *_dev.py files)

    • NBGrader cells with immediate feedback
    • Test individual functions after implementation
    • Labeled as "unit tests" but really immediate feedback
    • Visual feedback with emojis and progress tracking
  2. Module Tests (in tests/test_*.py files)

    • Comprehensive pytest suites
    • Test entire module functionality
    • Professional test structure with classes and fixtures
    • Edge cases and error handling
  3. Integration Tests (planned)

    • Cross-module workflows
    • End-to-end pipelines
  4. System Tests (planned)

    • Performance and scalability
    • Production scenarios

Problems with Current Approach

  1. Redundancy: Testing the same functions twice with different approaches
  2. Complexity: Students need to understand two testing paradigms
  3. Maintenance: Changes require updating tests in multiple places
  4. Artificial Distinction: "Unit vs Module" tests are testing the same code
  5. Scattered Feedback: Tests are in different files with different formats

Proposed Unified Testing Architecture

Core Principle: Progressive Testing Within Notebooks

Instead of separate test files, integrate comprehensive testing directly into the educational notebooks using a "Build → Test → Build → Test" rhythm.

Four-Stage Testing Pipeline

📚 Notebook Tests (Progressive)    →    🔗 Integration Tests    →    🚀 System Tests
   ↓                                        ↓                         ↓
Individual functions                   Cross-module workflows      Production scenarios
Immediate feedback                     End-to-end pipelines       Performance & scale
Educational context                    Real ML workflows          Robustness testing

Stage 1: Progressive Notebook Testing

Replace both inline tests and module tests with comprehensive notebook testing:

# %% [markdown]
"""
### 🧪 Comprehensive Test: Tensor Creation

This tests all tensor creation scenarios with real data and edge cases.
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-creation", "locked": true, "points": 20, "schema_version": 3, "solution": false, "task": false}
import pytest
import numpy as np

class TestTensorCreation:
    """Comprehensive tensor creation tests."""
    
    def test_scalar_creation(self):
        """Test scalar tensor creation."""
        # Basic scalar
        scalar = Tensor(5.0)
        assert scalar.shape == ()
        assert scalar.size == 1
        assert scalar.data.item() == 5.0
        
        # Different types
        int_scalar = Tensor(42)
        assert int_scalar.dtype in [np.int32, np.int64]
        
        float_scalar = Tensor(3.14)
        assert float_scalar.dtype == np.float32
    
    def test_vector_creation(self):
        """Test vector tensor creation."""
        # From list
        vector = Tensor([1, 2, 3, 4, 5])
        assert vector.shape == (5,)
        assert vector.size == 5
        assert np.array_equal(vector.data, np.array([1, 2, 3, 4, 5]))
        
        # From numpy array
        np_array = np.array([10, 20, 30])
        vector_from_np = Tensor(np_array)
        assert np.array_equal(vector_from_np.data, np_array)
    
    def test_matrix_creation(self):
        """Test matrix tensor creation."""
        matrix = Tensor([[1, 2], [3, 4]])
        assert matrix.shape == (2, 2)
        assert matrix.size == 4
        expected = np.array([[1, 2], [3, 4]])
        assert np.array_equal(matrix.data, expected)
    
    def test_dtype_handling(self):
        """Test data type handling."""
        # Explicit dtype
        float_tensor = Tensor([1, 2, 3], dtype='float32')
        assert float_tensor.dtype == np.float32
        
        # Auto dtype detection
        int_tensor = Tensor([1, 2, 3])
        assert int_tensor.dtype in [np.int32, np.int64]
    
    def test_edge_cases(self):
        """Test edge cases and error conditions."""
        # Empty tensor
        empty = Tensor([])
        assert empty.shape == (0,)
        assert empty.size == 0
        
        # Single element
        single = Tensor([42])
        assert single.shape == (1,)
        assert single.size == 1
        
        # Large tensor
        large = Tensor(list(range(1000)))
        assert large.shape == (1000,)
        assert large.size == 1000

# Run the tests with visual feedback
def run_tensor_creation_tests():
    """Run tensor creation tests with educational feedback."""
    print("🔬 Running comprehensive tensor creation tests...")
    
    test_class = TestTensorCreation()
    tests = [
        ('Scalar Creation', test_class.test_scalar_creation),
        ('Vector Creation', test_class.test_vector_creation),
        ('Matrix Creation', test_class.test_matrix_creation),
        ('Data Type Handling', test_class.test_dtype_handling),
        ('Edge Cases', test_class.test_edge_cases)
    ]
    
    passed = 0
    total = len(tests)
    
    for test_name, test_func in tests:
        try:
            test_func()
            print(f"✅ {test_name}: PASSED")
            passed += 1
        except Exception as e:
            print(f"❌ {test_name}: FAILED - {e}")
    
    print(f"\n📊 Results: {passed}/{total} tests passed")
    if passed == total:
        print("🎉 All tensor creation tests passed!")
        print("📈 Progress: Tensor Creation ✓")
    else:
        print("⚠️  Some tests failed - check your implementation")
    
    return passed == total

# Execute tests
run_tensor_creation_tests()

Benefits of Unified Approach

  1. Single Source of Truth: All tests in one place
  2. Educational Context: Tests explain what they're checking
  3. Immediate Feedback: Students see results instantly
  4. Professional Structure: Uses pytest patterns within notebooks
  5. Comprehensive Coverage: Covers functionality, edge cases, and errors
  6. Visual Learning: Clear pass/fail feedback with explanations

Stage 2: Integration Testing

Test cross-module workflows in dedicated integration files:

# tests/integration/test_basic_ml_pipeline.py
def test_tensor_to_activations_pipeline():
    """Test tensor → activation function workflow."""
    from tinytorch.core.tensor import Tensor
    from tinytorch.core.activations import ReLU
    
    # Create tensor
    x = Tensor([-1, 0, 1, 2])
    
    # Apply activation
    relu = ReLU()
    y = relu(x)
    
    # Verify pipeline
    expected = Tensor([0, 0, 1, 2])
    assert np.array_equal(y.data, expected.data)

Stage 3: System Testing

Test production scenarios in dedicated system files:

# tests/system/test_performance.py
def test_tensor_operations_performance():
    """Test tensor operations with large data."""
    import time
    
    # Large tensor operations
    large_tensor = Tensor(np.random.randn(10000, 1000))
    
    start = time.time()
    result = large_tensor + large_tensor
    duration = time.time() - start
    
    # Should complete within reasonable time
    assert duration < 1.0, f"Operation took {duration:.2f}s, expected < 1.0s"

Implementation Strategy

Phase 1: Consolidate Notebook Testing

  1. Remove duplicate tests - eliminate separate module test files
  2. Enhance notebook tests - make them comprehensive with pytest structure
  3. Add visual feedback - maintain educational value with progress tracking
  4. Standardize format - consistent test structure across all modules

Phase 2: Implement Integration Testing

  1. Create integration test taxonomy - basic ML, vision, data pipelines
  2. Implement cross-module tests - verify components work together
  3. Test real workflows - end-to-end ML scenarios

Phase 3: Implement System Testing

  1. Performance testing - speed, memory, throughput
  2. Scalability testing - large datasets, batch processing
  3. Robustness testing - error handling, edge cases

Module Testing Guidelines

Structure for Each Module

# %% [markdown]
"""
# Module X: Component Testing

This section contains comprehensive tests for all module functionality.
Tests are organized by component and include:
- ✅ Basic functionality
- ✅ Edge cases
- ✅ Error handling
- ✅ Integration points
"""

# %% [markdown]
"""
### 🧪 Component A Tests
Tests for the first major component...
"""

# %% nbgrader={"grade": true, "grade_id": "test-component-a", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component A tests here...

# %% [markdown]
"""
### 🧪 Component B Tests
Tests for the second major component...
"""

# %% nbgrader={"grade": true, "grade_id": "test-component-b", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component B tests here...

# %% [markdown]
"""
### 🧪 Integration Tests
Tests for how components work together...
"""

# %% nbgrader={"grade": true, "grade_id": "test-integration", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Integration tests here...

Test Execution

Students run tests within notebooks:

# All tests run automatically as cells execute
# No separate commands needed
# Immediate feedback and progress tracking

Instructors can also run centralized testing:

# Run all notebook tests
tito test --all

# Run specific module
tito test --module tensor

# Run integration tests
tito test --integration

# Run system tests
tito test --system

Migration Plan

Step 1: Audit Current Tests

  • Identify overlapping tests between inline and module tests
  • Catalog test coverage gaps
  • Document test dependencies

Step 2: Consolidate Testing

  • Merge inline and module tests into comprehensive notebook tests
  • Remove duplicate test files
  • Update CLI to support notebook testing

Step 3: Enhance Coverage

  • Add missing edge cases to notebook tests
  • Improve error handling tests
  • Add performance considerations

Step 4: Implement Integration/System Testing

  • Create integration test taxonomy
  • Implement cross-module tests
  • Add system performance tests

Conclusion

The unified testing approach eliminates redundancy while providing better educational value and development efficiency. Students get comprehensive testing within their learning context, while instructors maintain professional testing standards for production validation.

Key Benefits:

  • Simplified: One testing approach, not multiple
  • Educational: Tests explain what they're checking
  • Comprehensive: Full coverage within notebooks
  • Professional: Uses industry-standard pytest patterns
  • Efficient: No duplicate maintenance burden