mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-03-12 08:03:34 -05:00

Vijay Janapa Reddi 2f2feeae3f Add comprehensive testing and module structure design documents

- Create testing-design.md analyzing current testing redundancy
- Propose unified testing approach eliminating unit/module distinction
- Create module-structure-design.md with standardized patterns
- Document NBDev educational framework requirements
- Establish design guidelines for future module development

2025-07-12 18:54:24 -04:00


TinyTorch Testing Design Document

Overview

This document analyzes the current testing architecture and proposes a unified approach that eliminates redundancy while maximizing educational value and development efficiency.

Current Testing Structure (Analysis)

What We Have Now

Inline Tests (in *_dev.py files)
- NBGrader cells with immediate feedback
- Test individual functions after implementation
- Labeled "unit tests" but function as immediate-feedback checks
- Visual feedback with emojis and progress tracking
Module Tests (in tests/test_*.py files)
- Comprehensive pytest suites
- Test entire module functionality
- Professional test structure with classes and fixtures
- Edge cases and error handling
Integration Tests (planned)
- Cross-module workflows
- End-to-end pipelines
System Tests (planned)
- Performance and scalability
- Production scenarios

Problems with Current Approach

Redundancy: Testing the same functions twice with different approaches
Complexity: Students need to understand two testing paradigms
Maintenance: Changes require updating tests in multiple places
Artificial Distinction: "Unit vs Module" tests are testing the same code
Scattered Feedback: Tests are in different files with different formats

Proposed Unified Testing Architecture

Core Principle: Progressive Testing Within Notebooks

Instead of separate per-module test files, integrate comprehensive testing directly into the educational notebooks using a "Build → Test → Build → Test" rhythm. Integration and system tests remain in dedicated files, as described in Stages 2 and 3.

Three-Stage Testing Pipeline

📚 Notebook Tests (Progressive)    →    🔗 Integration Tests    →    🚀 System Tests
   ↓                                        ↓                         ↓
Individual functions                   Cross-module workflows      Production scenarios
Immediate feedback                     End-to-end pipelines       Performance & scale
Educational context                    Real ML workflows          Robustness testing

Stage 1: Progressive Notebook Testing

Replace both inline tests and module tests with comprehensive notebook testing:

# %% [markdown]
"""
### 🧪 Comprehensive Test: Tensor Creation

This tests all tensor creation scenarios with real data and edge cases.
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-creation", "locked": true, "points": 20, "schema_version": 3, "solution": false, "task": false}
import pytest
import numpy as np

class TestTensorCreation:
    """Comprehensive tensor creation tests."""
    
    def test_scalar_creation(self):
        """Test scalar tensor creation."""
        # Basic scalar
        scalar = Tensor(5.0)
        assert scalar.shape == ()
        assert scalar.size == 1
        assert scalar.data.item() == 5.0
        
        # Different types
        int_scalar = Tensor(42)
        assert int_scalar.dtype in [np.int32, np.int64]
        
        float_scalar = Tensor(3.14)
        assert float_scalar.dtype == np.float32
    
    def test_vector_creation(self):
        """Test vector tensor creation."""
        # From list
        vector = Tensor([1, 2, 3, 4, 5])
        assert vector.shape == (5,)
        assert vector.size == 5
        assert np.array_equal(vector.data, np.array([1, 2, 3, 4, 5]))
        
        # From numpy array
        np_array = np.array([10, 20, 30])
        vector_from_np = Tensor(np_array)
        assert np.array_equal(vector_from_np.data, np_array)
    
    def test_matrix_creation(self):
        """Test matrix tensor creation."""
        matrix = Tensor([[1, 2], [3, 4]])
        assert matrix.shape == (2, 2)
        assert matrix.size == 4
        expected = np.array([[1, 2], [3, 4]])
        assert np.array_equal(matrix.data, expected)
    
    def test_dtype_handling(self):
        """Test data type handling."""
        # Explicit dtype
        float_tensor = Tensor([1, 2, 3], dtype='float32')
        assert float_tensor.dtype == np.float32
        
        # Auto dtype detection
        int_tensor = Tensor([1, 2, 3])
        assert int_tensor.dtype in [np.int32, np.int64]
    
    def test_edge_cases(self):
        """Test edge cases and error conditions."""
        # Empty tensor
        empty = Tensor([])
        assert empty.shape == (0,)
        assert empty.size == 0
        
        # Single element
        single = Tensor([42])
        assert single.shape == (1,)
        assert single.size == 1
        
        # Large tensor
        large = Tensor(list(range(1000)))
        assert large.shape == (1000,)
        assert large.size == 1000

# Run the tests with visual feedback
def run_tensor_creation_tests():
    """Run tensor creation tests with educational feedback."""
    print("🔬 Running comprehensive tensor creation tests...")
    
    test_class = TestTensorCreation()
    tests = [
        ('Scalar Creation', test_class.test_scalar_creation),
        ('Vector Creation', test_class.test_vector_creation),
        ('Matrix Creation', test_class.test_matrix_creation),
        ('Data Type Handling', test_class.test_dtype_handling),
        ('Edge Cases', test_class.test_edge_cases)
    ]
    
    passed = 0
    total = len(tests)
    
    for test_name, test_func in tests:
        try:
            test_func()
            print(f"✅ {test_name}: PASSED")
            passed += 1
        except Exception as e:
            # Include the exception type: a bare `assert` has an empty message
            print(f"❌ {test_name}: FAILED - {type(e).__name__}: {e}")
    
    print(f"\n📊 Results: {passed}/{total} tests passed")
    if passed == total:
        print("🎉 All tensor creation tests passed!")
        print("📈 Progress: Tensor Creation ✓")
    else:
        print("⚠️  Some tests failed - check your implementation")
    
    return passed == total

# Execute tests
run_tensor_creation_tests()

Benefits of Unified Approach

Single Source of Truth: All of a module's tests live in its notebook
Educational Context: Tests explain what they're checking
Immediate Feedback: Students see results instantly
Professional Structure: Uses pytest patterns within notebooks
Comprehensive Coverage: Covers functionality, edge cases, and errors
Visual Learning: Clear pass/fail feedback with explanations

Stage 2: Integration Testing

Test cross-module workflows in dedicated integration files:

# tests/integration/test_basic_ml_pipeline.py
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU


def test_tensor_to_activations_pipeline():
    """Test tensor → activation function workflow."""
    # Create tensor
    x = Tensor([-1, 0, 1, 2])
    
    # Apply activation
    relu = ReLU()
    y = relu(x)
    
    # Verify pipeline: ReLU zeroes the negative entry and keeps the rest
    expected = Tensor([0, 0, 1, 2])
    assert np.array_equal(y.data, expected.data)

Stage 3: System Testing

Test production scenarios in dedicated system files:

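# tests/system/test_performance.py
import time

import numpy as np

from tinytorch.core.tensor import Tensor


def test_tensor_operations_performance():
    """Test tensor operations with large data."""
    # Large tensor operations
    large_tensor = Tensor(np.random.randn(10000, 1000))
    
    # perf_counter is monotonic and better suited to timing than time.time
    start = time.perf_counter()
    result = large_tensor + large_tensor
    duration = time.perf_counter() - start
    
    # Check the result as well as the time taken
    assert result.shape == (10000, 1000)
    
    # Should complete within reasonable time
    assert duration < 1.0, f"Operation took {duration:.2f}s, expected < 1.0s"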
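
A single wall-clock measurement against a fixed budget can be flaky on slow or shared machines. One possible mitigation is to time the operation several times and compare the fastest run against the budget. The sketch below illustrates the idea; `best_of` is a hypothetical helper, and the workload is a plain-Python stand-in rather than a real `Tensor` operation.

```python
import time
from typing import Callable


def best_of(func: Callable[[], object], repeats: int = 5) -> float:
    """Return the fastest wall-clock time (in seconds) over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload (assumption): element-wise addition of two Python lists
data = list(range(100_000))


def workload():
    return [a + b for a, b in zip(data, data)]


duration = best_of(workload)
assert duration < 1.0, f"Operation took {duration:.3f}s, expected < 1.0s"
```

Taking the minimum filters out one-off slowdowns such as garbage collection or scheduler noise, at the cost of running the operation several times.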

Implementation Strategy

Phase 1: Consolidate Notebook Testing

Remove duplicate tests - eliminate separate module test files
Enhance notebook tests - make them comprehensive with pytest structure
Add visual feedback - maintain educational value with progress tracking
Standardize format - consistent test structure across all modules
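
One way to standardize the format is to factor the runner loop from the Stage 1 example into a shared helper, so each notebook only supplies its list of named tests. The following is a minimal sketch under that assumption; `run_tests` and its signature are hypothetical and not part of TinyTorch.

```python
from typing import Callable, Iterable, Tuple


def run_tests(title: str, tests: Iterable[Tuple[str, Callable[[], None]]]) -> bool:
    """Run named test callables and print notebook-friendly feedback.

    Hypothetical helper: the pass/fail loop is written once here
    instead of once per module notebook.
    """
    tests = list(tests)
    passed = 0
    print(f"🔬 Running {title} tests...")
    for name, func in tests:
        try:
            func()
        except Exception as exc:  # report the failure and keep going
            print(f"❌ {name}: FAILED - {type(exc).__name__}: {exc}")
        else:
            print(f"✅ {name}: PASSED")
            passed += 1
    print(f"\n📊 Results: {passed}/{len(tests)} tests passed")
    return passed == len(tests)


# Stand-in checks so the sketch runs without a Tensor implementation
def check_addition():
    assert 1 + 1 == 2


def check_broken():
    assert 1 + 1 == 3, "arithmetic is broken"


all_passed = run_tests("demo", [("Addition", check_addition), ("Broken", check_broken)])
```

A module notebook would then call `run_tests("tensor creation", [...])` with the bound methods of its test class, as the Stage 1 runner does by hand.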

Phase 2: Implement Integration Testing

Create integration test taxonomy - basic ML, vision, data pipelines
Implement cross-module tests - verify components work together
Test real workflows - end-to-end ML scenarios

Phase 3: Implement System Testing

Performance testing - speed, memory, throughput
Scalability testing - large datasets, batch processing
Robustness testing - error handling, edge cases
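
For the memory side of performance testing, Python's standard-library `tracemalloc` module can report the peak allocation observed while a function runs. The sketch below shows the idea; `peak_memory` is a hypothetical helper, and the workload is a plain-Python stand-in rather than a TinyTorch operation.

```python
import tracemalloc
from typing import Callable, Tuple


def peak_memory(func: Callable[[], object]) -> Tuple[object, int]:
    """Run `func` and return (result, peak traced bytes while it ran)."""
    tracemalloc.start()
    try:
        result = func()
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Stand-in workload (assumption): build a list of 100,000 floats
result, peak = peak_memory(lambda: [float(i) for i in range(100_000)])
print(f"Peak allocation: {peak / 1024:.0f} KiB")
```

A system test could then assert that `peak` stays below a budget chosen for the operation, in the same way the timing test asserts on `duration`.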

Module Testing Guidelines

Structure for Each Module

# %% [markdown]
"""
# Module X: Component Testing

This section contains comprehensive tests for all module functionality.
Tests are organized by component and include:
- ✅ Basic functionality
- ✅ Edge cases
- ✅ Error handling
- ✅ Integration points
"""

# %% [markdown]
"""
### 🧪 Component A Tests
Tests for the first major component...
"""

# %% nbgrader={"grade": true, "grade_id": "test-component-a", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component A tests here...

# %% [markdown]
"""
### 🧪 Component B Tests
Tests for the second major component...
"""

# %% nbgrader={"grade": true, "grade_id": "test-component-b", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component B tests here...

# %% [markdown]
"""
### 🧪 Integration Tests
Tests for how components work together...
"""

# %% nbgrader={"grade": true, "grade_id": "test-integration", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Integration tests here...
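
Because every graded cell repeats the same nbgrader metadata, copy-paste mistakes such as a reused `grade_id` are easy to make, and nbgrader expects each `grade_id` to be unique within a notebook. A small checker can catch this before release. The sketch below assumes the metadata appears on `# %%` header lines exactly as in the template above; `collect_grade_cells` and `check_grade_cells` are hypothetical helpers.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple

MARKER = "nbgrader="


def collect_grade_cells(source: str) -> List[Dict]:
    """Extract nbgrader metadata dicts from `# %%` cell header lines."""
    decoder = json.JSONDecoder()
    cells = []
    for line in source.splitlines():
        if line.startswith("# %%") and MARKER in line:
            payload = line[line.index(MARKER) + len(MARKER):]
            meta, _ = decoder.raw_decode(payload)  # tolerates trailing text
            cells.append(meta)
    return cells


def check_grade_cells(source: str) -> Tuple[List[str], int]:
    """Return (duplicated grade_ids, total points over graded cells)."""
    cells = collect_grade_cells(source)
    counts = Counter(cell["grade_id"] for cell in cells)
    duplicates = sorted(gid for gid, n in counts.items() if n > 1)
    total = sum(cell.get("points", 0) for cell in cells if cell.get("grade"))
    return duplicates, total


example = '''
# %% nbgrader={"grade": true, "grade_id": "test-component-a", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component A tests here...

# %% nbgrader={"grade": true, "grade_id": "test-component-b", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Comprehensive Component B tests here...
'''

duplicates, total_points = check_grade_cells(example)
print(f"Duplicate grade_ids: {duplicates}, total points: {total_points}")
```

The same pass also yields the point total, which helps confirm that a module's graded cells add up to the intended score.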

Test Execution

Students run tests within notebooks:

# All tests run automatically as cells execute
# No separate commands needed
# Immediate feedback and progress tracking

Instructors can also run centralized testing:

# Run all notebook tests
tito test --all

# Run specific module
tito test --module tensor

# Run integration tests
tito test --integration

# Run system tests
tito test --system
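
To illustrate how flags like these could map onto test locations, here is a minimal `argparse` sketch. It is not `tito`'s actual implementation: the `tests/integration` and `tests/system` directories follow the paths used earlier in this document, while the `modules/<name>` layout and the `resolve_targets` function are assumptions made for the example.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser accepting exactly one of the four test-selection flags."""
    parser = argparse.ArgumentParser(prog="tito-test-sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--all", action="store_true", help="run every test stage")
    group.add_argument("--module", metavar="NAME", help="run one module's notebook tests")
    group.add_argument("--integration", action="store_true", help="run integration tests")
    group.add_argument("--system", action="store_true", help="run system tests")
    return parser


def resolve_targets(argv: Optional[List[str]] = None) -> List[str]:
    """Translate command-line flags into a list of test paths."""
    args = build_parser().parse_args(argv)
    if args.module:
        return [f"modules/{args.module}"]  # hypothetical module layout
    if args.integration:
        return ["tests/integration"]
    if args.system:
        return ["tests/system"]
    # --all: notebook tests first, then integration, then system
    return ["modules", "tests/integration", "tests/system"]


print(resolve_targets(["--module", "tensor"]))
```

Making the flags mutually exclusive keeps each invocation tied to a single stage of the pipeline, which mirrors the four commands listed above.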

Migration Plan

Step 1: Audit Current Tests

Identify overlapping tests between inline and module tests
Catalog test coverage gaps
Document test dependencies
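
The overlap audit can be partly automated. The sketch below uses the standard-library `ast` module to collect `test_*` function names from two sources and compare them. Matching by name is only a heuristic (two tests with the same name may check different things), `collect_test_names` is a hypothetical helper, and the inline sources stand in for the contents of a `*_dev.py` file and a `tests/test_*.py` file.

```python
import ast
from typing import Set


def collect_test_names(source: str) -> Set[str]:
    """Return the names of all functions and methods starting with `test_`."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name.startswith("test_")
    }


# Stand-in for inline tests in a *_dev.py file
inline_src = '''
def test_scalar_creation():
    pass

def test_vector_creation():
    pass
'''

# Stand-in for a pytest suite in tests/test_*.py
module_src = '''
class TestTensorCreation:
    def test_scalar_creation(self):
        pass

    def test_dtype_handling(self):
        pass
'''

inline = collect_test_names(inline_src)
module = collect_test_names(module_src)
overlap = sorted(inline & module)
only_module = sorted(module - inline)
print("Overlapping:", overlap)
print("Only in module tests:", only_module)
```

Names that appear in both sets are candidates for merging in Step 2, while names found only in the module tests point to coverage that the notebook tests would need to absorb.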

Step 2: Consolidate Testing

Merge inline and module tests into comprehensive notebook tests
Remove duplicate test files
Update CLI to support notebook testing

Step 3: Enhance Coverage

Add missing edge cases to notebook tests
Improve error handling tests
Add performance considerations

Step 4: Implement Integration/System Testing

Create integration test taxonomy
Implement cross-module tests
Add system performance tests

Conclusion

The unified testing approach eliminates redundancy while providing better educational value and development efficiency. Students get comprehensive testing within their learning context, while instructors maintain professional testing standards for production validation.

Key Benefits:

Simplified: One testing approach, not multiple
Educational: Tests explain what they're checking
Comprehensive: Full coverage within notebooks
Professional: Uses industry-standard pytest patterns
Efficient: No duplicate maintenance burden
