mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-05-02 12:12:34 -05:00

Files

Vijay Janapa Reddi c5389e9e47 docs: Add comprehensive integration testing rules

- Create focused integration testing rule (256 lines, under 500-line limit)
- Establish core principle: interface compatibility over functionality re-testing
- Define 4 integration test categories: Foundation, Architecture, Training, Inference
- Provide clear DO/DON'T examples with code snippets
- Document testing anti-patterns to avoid
- Include educational testing principles and workflow guidelines
- Reference 17 existing integration test files in tests/ directory
- Update rules README to include new integration-testing.md rule

2025-07-18 00:33:26 -04:00

9.0 KiB

Raw Blame History

TinyTorch Integration Testing Rules

Context: Cross-module testing patterns for TinyTorch's educational ML systems course
Reference: tests/ directory with 17 integration test files
Philosophy: Interface compatibility over functionality re-testing

🎯 Core Principle: Interface Testing, Not Function Re-testing

Integration tests in TinyTorch focus on cross-module interfaces and compatibility, NOT re-testing individual module functionality. Individual functions should already work from inline tests within each module's *_dev.py files.

✅ DO: Interface Compatibility Testing

def test_tensor_attention_interface():
    """Test Tensor operations work correctly with Attention mechanisms."""
    # Test interface compatibility
    tensor = Tensor([[1, 2, 3, 4]])
    attention = SelfAttention(embed_size=4, num_heads=1)
    
    # Focus on: Can these components work together?
    result = attention.forward(tensor)
    assert isinstance(result, Tensor)
    assert result.shape == tensor.shape

❌ DON'T: Functionality Re-testing

def test_attention_math_correctness():
    """DON'T: Re-test attention computation correctness."""
    # This should already be verified in attention_dev.py inline tests
    Q, K, V = create_qkv_matrices()
    scores = np.matmul(Q, K.transpose(-2, -1)) / np.sqrt(d_k)
    # ... detailed math verification belongs in inline tests

🏗️ Integration Test Categories

1. Foundation Integration (Tensor + Core Operations)

Tensor + Activations: Do activation functions work with Tensor operations?
Tensor + Autograd: Does gradient computation integrate with Tensor?
Layers + Dense Networks: Do individual layers compose into networks?

2. Architecture Integration (Component Composition)

Tensor + Attention: Do attention mechanisms work with Tensor operations?
Spatial + Dense: Do CNN layers integrate with fully connected layers?
Complete Pipelines: Do end-to-end architectures work together?

3. Training & Data Integration (Learning Workflows)

DataLoader + Tensor: Does data loading work with tensor operations?
Training Integration: Do complete training workflows function?
ML Pipeline: Do end-to-end machine learning pipelines work?

4. Inference Serving Integration (Production Systems)

Compression + Models: Do compressed models maintain functionality?
Kernels + Operations: Do custom kernels integrate with standard ops?
Benchmarking + Systems: Does performance measurement work across components?

📋 Integration Test Structure

File Naming Convention

test_{module1}_{module2}_integration.py    # Two-module interface testing
test_{workflow}_pipeline_integration.py    # Multi-module workflow testing

Test Function Structure

def test_interface_compatibility():
    """Test that ModuleA outputs work as ModuleB inputs."""
    # Setup components
    module_a = ModuleA()
    module_b = ModuleB()
    
    # Test interface compatibility
    output_a = module_a.process(input_data)
    result = module_b.process(output_a)  # Key: Does this work?
    
    # Assert compatibility, not correctness
    assert isinstance(result, expected_type)
    assert result.shape == expected_shape
    
def test_workflow_integration():
    """Test complete workflow across multiple modules."""
    # Test realistic usage patterns
    pipeline = setup_realistic_pipeline()
    result = pipeline.run(real_data)
    
    # Assert workflow completion
    assert pipeline.successful_completion()

🎓 Educational Testing Principles

Real Components, Not Mocks

# ✅ Good: Use actual TinyTorch components
from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU
from tinytorch.core.layers import Dense

# ❌ Bad: Mock components that don't reflect real behavior
class MockTensor:
    def __init__(self, data): pass

Realistic Scenarios

def test_student_workflow():
    """Test scenarios students will actually encounter."""
    # Use realistic data sizes and patterns
    data = np.random.randn(32, 10)  # Reasonable batch size
    
    # Test common student workflows
    network = create_simple_network()
    predictions = network.forward(data)
    loss = compute_loss(predictions, targets)

Clear Success Criteria

def test_cross_module_compatibility():
    """Integration test with clear educational objectives."""
    print("🔬 Integration Test: Tensor + Activation compatibility...")
    
    # Test specific interface
    result = test_compatibility()
    
    # Clear feedback
    if result.success:
        print("✅ Tensor and Activation modules integrate correctly")
        print("📈 Progress: Cross-module interfaces ✓")
    else:
        print(f"❌ Integration issue: {result.error}")

🔧 Test Implementation Guidelines

Focus Areas for Integration Tests

Data Flow Compatibility
- Output of ModuleA → Input of ModuleB
- Shape and type consistency across modules
- Error propagation across component boundaries
Interface Contracts
- Method signatures work across modules
- Expected behaviors are maintained in composition
- Error handling integrates properly
Workflow Validation
- Complete educational scenarios work end-to-end
- Student-facing APIs function together
- Real-world usage patterns succeed

Testing Anti-Patterns to Avoid

❌ Don't Re-test Module Internals

# Bad: This should be in activation_dev.py inline tests
def test_relu_computation():
    assert relu(np.array([-1, 1])) == np.array([0, 1])

❌ Don't Test Contrived Scenarios

# Bad: Students will never use 10000x10000 tensors
def test_massive_tensor_integration():
    huge_tensor = Tensor(np.random.randn(10000, 10000))

❌ Don't Duplicate Inline Test Coverage

# Bad: If inline tests pass, mathematical correctness is verified
def test_attention_math_again():
    # Detailed mathematical verification already done inline

✅ Good Integration Test Examples

def test_attention_tensor_interface():
    """Verify Attention mechanisms work with Tensor operations."""
    # Setup realistic scenario
    sequence_length, embed_size = 10, 64
    tensor_input = Tensor(np.random.randn(1, sequence_length, embed_size))
    
    # Test interface compatibility
    attention = SelfAttention(embed_size=embed_size)
    result = attention.forward(tensor_input)
    
    # Assert interface contract
    assert isinstance(result, Tensor)
    assert result.shape == tensor_input.shape
    print("✅ Attention-Tensor interface compatibility verified")

def test_complete_transformer_pipeline():
    """Test realistic transformer-like pipeline."""
    # Realistic student project scenario
    vocab_size, seq_len, embed_size = 1000, 20, 128
    
    # Complete pipeline components
    embedding = Embedding(vocab_size, embed_size)
    attention = SelfAttention(embed_size)
    dense = Dense(embed_size, vocab_size)
    
    # Test end-to-end workflow
    tokens = np.random.randint(0, vocab_size, (1, seq_len))
    embedded = embedding(tokens)
    attended = attention.forward(embedded)
    output = dense.forward(attended)
    
    # Verify complete workflow
    assert output.shape == (1, seq_len, vocab_size)
    print("✅ Complete transformer pipeline integration successful")

🚀 Integration Testing Workflow

Development Process

Module completion: Ensure inline tests pass first
Interface design: Define cross-module contracts
Integration tests: Write interface compatibility tests
Pipeline tests: Add workflow validation tests
Student validation: Test realistic usage scenarios

Running Integration Tests

# All integration tests
pytest tests/ -v

# Specific integration area
pytest tests/test_tensor_attention_integration.py -v
pytest tests/test_attention_pipeline_integration.py -v

# Integration test pattern
pytest tests/ -k "integration" -v

Quality Standards

✅ Interface Focus: Tests verify component compatibility
✅ Real Components: Uses actual TinyTorch modules, not mocks
✅ Student Scenarios: Reflects realistic educational workflows
✅ Clear Feedback: Provides educational progress indicators
✅ Complementary: Adds value beyond inline testing

📚 Reference Implementation

Best Examples:

test_tensor_activations_integration.py - Interface compatibility testing
test_attention_pipeline_integration.py - Complete workflow testing
test_layers_networks_integration.py - Component composition testing

Avoid These Patterns:

Tests that duplicate inline test functionality
Mock-based testing that doesn't reflect real component behavior
Contrived scenarios that students won't encounter

Integration tests should answer: "Do these modules work together?" not "Do these modules work correctly?" - correctness is verified by inline tests.

9.0 KiB Raw Blame History