mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-28 21:42:42 -05:00
docs: Add comprehensive integration testing rules
- Create focused integration testing rule (256 lines, under 500-line limit)
- Establish core principle: interface compatibility over functionality re-testing
- Define 4 integration test categories: Foundation, Architecture, Training, Inference
- Provide clear DO/DON'T examples with code snippets
- Document testing anti-patterns to avoid
- Include educational testing principles and workflow guidelines
- Reference 17 existing integration test files in tests/ directory
- Update rules README to include new integration-testing.md rule
@@ -25,6 +25,7 @@ This directory contains focused, actionable rules for TinyTorch development. Eac
### Testing

- **[testing-patterns.mdc](testing-patterns.mdc)** (115 lines) - Testing standards with pytest and real data
- **[integration-testing.md](integration-testing.md)** (238 lines) - Cross-module interface compatibility testing

## Key Design Principles
257  .cursor/rules/integration-testing.md  Normal file
@@ -0,0 +1,257 @@
# TinyTorch Integration Testing Rules

**Context**: Cross-module testing patterns for TinyTorch's educational ML systems course
**Reference**: `tests/` directory with 17 integration test files
**Philosophy**: Interface compatibility over functionality re-testing

## 🎯 **Core Principle: Interface Testing, Not Function Re-testing**

Integration tests in TinyTorch focus on **cross-module interfaces and compatibility**, NOT re-testing individual module functionality. Individual functions should already work from inline tests within each module's `*_dev.py` files.

### ✅ **DO: Interface Compatibility Testing**
```python
def test_tensor_attention_interface():
    """Test Tensor operations work correctly with Attention mechanisms."""
    # Test interface compatibility
    tensor = Tensor([[1, 2, 3, 4]])
    attention = SelfAttention(embed_size=4, num_heads=1)

    # Focus on: Can these components work together?
    result = attention.forward(tensor)
    assert isinstance(result, Tensor)
    assert result.shape == tensor.shape
```

### ❌ **DON'T: Functionality Re-testing**
```python
def test_attention_math_correctness():
    """DON'T: Re-test attention computation correctness."""
    # This should already be verified in attention_dev.py inline tests
    Q, K, V = create_qkv_matrices()
    scores = np.matmul(Q, K.transpose(-2, -1)) / np.sqrt(d_k)
    # ... detailed math verification belongs in inline tests
```

## 🏗️ **Integration Test Categories**

### 1. **Foundation Integration** (Tensor + Core Operations)
- **Tensor + Activations**: Do activation functions work with Tensor operations?
- **Tensor + Autograd**: Does gradient computation integrate with Tensor?
- **Layers + Dense Networks**: Do individual layers compose into networks?

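For illustration, a minimal sketch of a Foundation-level interface check (Tensor + Activations, then layer composition). It reuses the `Tensor`, `ReLU`, and `Dense` APIs shown elsewhere in this document; the exact constructor signatures, and whether activation instances are callable, are assumptions here.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU
from tinytorch.core.layers import Dense


def test_tensor_activation_interface():
    """Interface question: does an activation accept and return Tensors?"""
    x = Tensor(np.random.randn(4, 8))
    relu = ReLU()

    out = relu(x)  # assumes activation instances are callable
    assert isinstance(out, Tensor)
    assert out.shape == x.shape


def test_layer_composition_interface():
    """Interface question: does one Dense layer's output feed the next?"""
    x = Tensor(np.random.randn(4, 8))
    hidden = Dense(8, 16)
    head = Dense(16, 2)

    out = head.forward(hidden.forward(x))
    assert isinstance(out, Tensor)
    assert out.shape == (4, 2)
```
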
### 2. **Architecture Integration** (Component Composition)
- **Tensor + Attention**: Do attention mechanisms work with Tensor operations?
- **Spatial + Dense**: Do CNN layers integrate with fully connected layers?
- **Complete Pipelines**: Do end-to-end architectures work together?

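A hedged sketch of the Spatial + Dense case follows. `Conv2D`, its module path, its `(in_channels, out_channels, kernel_size)` signature, and the Tensor `.data` attribute are illustration-only assumptions; only `Tensor` and `Dense` appear elsewhere in this document.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.spatial import Conv2D  # assumed module path and class name


def test_spatial_dense_interface():
    """Interface question: can a CNN layer's output feed a Dense layer?"""
    images = Tensor(np.random.randn(2, 1, 8, 8))  # (batch, channels, H, W)
    conv = Conv2D(in_channels=1, out_channels=4, kernel_size=3)  # assumed signature
    feature_map = conv.forward(images)

    # Flatten spatial features before the fully connected layer.
    # Shape math assumed: 8x8 input, 3x3 kernel, no padding -> 6x6 feature map.
    flat = Tensor(feature_map.data.reshape(2, -1))  # assumes a .data ndarray attribute
    dense = Dense(4 * 6 * 6, 10)
    logits = dense.forward(flat)

    assert isinstance(logits, Tensor)
    assert logits.shape == (2, 10)
```
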
### 3. **Training & Data Integration** (Learning Workflows)
- **DataLoader + Tensor**: Does data loading work with tensor operations?
- **Training Integration**: Do complete training workflows function?
- **ML Pipeline**: Do end-to-end machine learning pipelines work?

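A hedged sketch of the DataLoader + Tensor case. The `DataLoader` import path, its constructor signature, and its batch-iteration protocol are assumptions; only `Tensor` and `Dense` are shown elsewhere in this document.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.dataloader import DataLoader  # assumed module path


def test_dataloader_tensor_interface():
    """Interface question: do DataLoader batches feed tensor operations?"""
    features = np.random.randn(64, 10)
    labels = np.random.randint(0, 2, size=64)
    loader = DataLoader(features, labels, batch_size=32)  # assumed signature

    model = Dense(10, 2)
    for batch_x, batch_y in loader:  # assumed (features, labels) batch tuples
        out = model.forward(Tensor(batch_x))
        # Interface question: do batch shapes line up with the model?
        assert out.shape == (len(batch_y), 2)
```
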
### 4. **Inference Serving Integration** (Production Systems)
- **Compression + Models**: Do compressed models maintain functionality?
- **Kernels + Operations**: Do custom kernels integrate with standard ops?
- **Benchmarking + Systems**: Does performance measurement work across components?

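A minimal sketch of the benchmarking case, using only the standard library plus the `Tensor` and `Dense` APIs shown in this document: the question is whether a timing harness wraps cleanly around a component without disturbing its interface contract.

```python
import time

import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense


def test_benchmark_wraps_forward_pass():
    """Interface question: does a timing harness compose with a forward pass?"""
    model = Dense(64, 16)
    x = Tensor(np.random.randn(32, 64))

    start = time.perf_counter()
    out = model.forward(x)
    elapsed = time.perf_counter() - start

    # The measured call must still honor the normal interface contract.
    assert isinstance(out, Tensor)
    assert out.shape == (32, 16)
    assert elapsed >= 0.0
```
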
## 📋 **Integration Test Structure**

### **File Naming Convention**
```
test_{module1}_{module2}_integration.py    # Two-module interface testing
test_{workflow}_pipeline_integration.py    # Multi-module workflow testing
```
### **Test Function Structure**
```python
def test_interface_compatibility():
    """Test that ModuleA outputs work as ModuleB inputs."""
    # Setup components
    module_a = ModuleA()
    module_b = ModuleB()

    # Test interface compatibility
    output_a = module_a.process(input_data)
    result = module_b.process(output_a)  # Key: Does this work?

    # Assert compatibility, not correctness
    assert isinstance(result, expected_type)
    assert result.shape == expected_shape


def test_workflow_integration():
    """Test complete workflow across multiple modules."""
    # Test realistic usage patterns
    pipeline = setup_realistic_pipeline()
    result = pipeline.run(real_data)

    # Assert workflow completion
    assert pipeline.successful_completion()
```
## 🎓 **Educational Testing Principles**

### **Real Components, Not Mocks**
```python
# ✅ Good: Use actual TinyTorch components
from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU
from tinytorch.core.layers import Dense

# ❌ Bad: Mock components that don't reflect real behavior
class MockTensor:
    def __init__(self, data): pass
```
### **Realistic Scenarios**
```python
def test_student_workflow():
    """Test scenarios students will actually encounter."""
    # Use realistic data sizes and patterns
    data = np.random.randn(32, 10)  # Reasonable batch size

    # Test common student workflows
    network = create_simple_network()
    predictions = network.forward(data)
    loss = compute_loss(predictions, targets)
```
### **Clear Success Criteria**
```python
def test_cross_module_compatibility():
    """Integration test with clear educational objectives."""
    print("🔬 Integration Test: Tensor + Activation compatibility...")

    # Test specific interface
    result = test_compatibility()

    # Clear feedback
    if result.success:
        print("✅ Tensor and Activation modules integrate correctly")
        print("📈 Progress: Cross-module interfaces ✓")
    else:
        print(f"❌ Integration issue: {result.error}")
```
## 🔧 **Test Implementation Guidelines**

### **Focus Areas for Integration Tests**

1. **Data Flow Compatibility**
   - Output of ModuleA → Input of ModuleB
   - Shape and type consistency across modules
   - Error propagation across component boundaries

2. **Interface Contracts**
   - Method signatures work across modules
   - Expected behaviors are maintained in composition
   - Error handling integrates properly

3. **Workflow Validation**
   - Complete educational scenarios work end-to-end
   - Student-facing APIs function together
   - Real-world usage patterns succeed

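As a concrete illustration of the data-flow focus area above, including error propagation across a component boundary, here is a hedged sketch. It assumes the `Tensor` and `Dense` APIs shown in this document; the exact exception raised on a shape mismatch depends on the Dense implementation and is an assumption.

```python
import numpy as np
import pytest

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense


def test_shape_contract_between_layers():
    """Output shape of one layer must satisfy the next layer's input contract."""
    first = Dense(8, 16)
    second = Dense(16, 4)

    # Data-flow compatibility: output of `first` is a valid input for `second`.
    good = second.forward(first.forward(Tensor(np.random.randn(2, 8))))
    assert good.shape == (2, 4)

    # Error propagation: an incompatible shape should surface as an exception
    # at the boundary, not silently produce a malformed result. (Exception
    # type is an assumption; adjust to whatever Dense actually raises.)
    with pytest.raises(Exception):
        second.forward(Tensor(np.random.randn(2, 8)))
```
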
### **Testing Anti-Patterns to Avoid**

❌ **Don't Re-test Module Internals**
```python
# Bad: This should be in activation_dev.py inline tests
def test_relu_computation():
    assert np.array_equal(relu(np.array([-1, 1])), np.array([0, 1]))
```
❌ **Don't Test Contrived Scenarios**
```python
# Bad: Students will never use 10000x10000 tensors
def test_massive_tensor_integration():
    huge_tensor = Tensor(np.random.randn(10000, 10000))
```
❌ **Don't Duplicate Inline Test Coverage**
```python
# Bad: If inline tests pass, mathematical correctness is verified
def test_attention_math_again():
    # Detailed mathematical verification already done inline
    pass
```
### ✅ **Good Integration Test Examples**

```python
def test_attention_tensor_interface():
    """Verify Attention mechanisms work with Tensor operations."""
    # Setup realistic scenario
    sequence_length, embed_size = 10, 64
    tensor_input = Tensor(np.random.randn(1, sequence_length, embed_size))

    # Test interface compatibility
    attention = SelfAttention(embed_size=embed_size)
    result = attention.forward(tensor_input)

    # Assert interface contract
    assert isinstance(result, Tensor)
    assert result.shape == tensor_input.shape
    print("✅ Attention-Tensor interface compatibility verified")


def test_complete_transformer_pipeline():
    """Test realistic transformer-like pipeline."""
    # Realistic student project scenario
    vocab_size, seq_len, embed_size = 1000, 20, 128

    # Complete pipeline components
    embedding = Embedding(vocab_size, embed_size)
    attention = SelfAttention(embed_size)
    dense = Dense(embed_size, vocab_size)

    # Test end-to-end workflow
    tokens = np.random.randint(0, vocab_size, (1, seq_len))
    embedded = embedding(tokens)
    attended = attention.forward(embedded)
    output = dense.forward(attended)

    # Verify complete workflow
    assert output.shape == (1, seq_len, vocab_size)
    print("✅ Complete transformer pipeline integration successful")
```
## 🚀 **Integration Testing Workflow**

### **Development Process**
1. **Module completion**: Ensure inline tests pass first
2. **Interface design**: Define cross-module contracts
3. **Integration tests**: Write interface compatibility tests
4. **Pipeline tests**: Add workflow validation tests
5. **Student validation**: Test realistic usage scenarios
### **Running Integration Tests**
```bash
# All integration tests
pytest tests/ -v

# Specific integration area
pytest tests/test_tensor_attention_integration.py -v
pytest tests/test_attention_pipeline_integration.py -v

# Integration test pattern
pytest tests/ -k "integration" -v
```
### **Quality Standards**
- ✅ **Interface Focus**: Tests verify component compatibility
- ✅ **Real Components**: Uses actual TinyTorch modules, not mocks
- ✅ **Student Scenarios**: Reflects realistic educational workflows
- ✅ **Clear Feedback**: Provides educational progress indicators
- ✅ **Complementary**: Adds value beyond inline testing
## 📚 **Reference Implementation**

**Best Examples**:
- `test_tensor_activations_integration.py` - Interface compatibility testing
- `test_attention_pipeline_integration.py` - Complete workflow testing
- `test_layers_networks_integration.py` - Component composition testing

**Avoid These Patterns**:
- Tests that duplicate inline test functionality
- Mock-based testing that doesn't reflect real component behavior
- Contrived scenarios that students won't encounter

Integration tests should answer: *"Do these modules work together?"* not *"Do these modules work correctly?"* - correctness is verified by inline tests.