mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-28 21:42:42 -05:00
docs: Add comprehensive integration testing rules
- Create focused integration testing rule (256 lines, under 500-line limit)
- Establish core principle: interface compatibility over functionality re-testing
- Define 4 integration test categories: Foundation, Architecture, Training, Inference
- Provide clear DO/DON'T examples with code snippets
- Document testing anti-patterns to avoid
- Include educational testing principles and workflow guidelines
- Reference 17 existing integration test files in tests/ directory
- Update rules README to include new integration-testing.md rule
@@ -25,6 +25,7 @@ This directory contains focused, actionable rules for TinyTorch development. Eac
### Testing

- **[testing-patterns.mdc](testing-patterns.mdc)** (115 lines) - Testing standards with pytest and real data
- **[integration-testing.md](integration-testing.md)** (238 lines) - Cross-module interface compatibility testing

## Key Design Principles
257  .cursor/rules/integration-testing.md  Normal file
@@ -0,0 +1,257 @@
# TinyTorch Integration Testing Rules

**Context**: Cross-module testing patterns for TinyTorch's educational ML systems course
**Reference**: `tests/` directory with 17 integration test files
**Philosophy**: Interface compatibility over functionality re-testing

## 🎯 **Core Principle: Interface Testing, Not Function Re-testing**

Integration tests in TinyTorch focus on **cross-module interfaces and compatibility**, NOT re-testing individual module functionality. Individual functions should already work from inline tests within each module's `*_dev.py` files.

### ✅ **DO: Interface Compatibility Testing**
```python
def test_tensor_attention_interface():
    """Test Tensor operations work correctly with Attention mechanisms."""
    # Test interface compatibility
    tensor = Tensor([[1, 2, 3, 4]])
    attention = SelfAttention(embed_size=4, num_heads=1)

    # Focus on: Can these components work together?
    result = attention.forward(tensor)
    assert isinstance(result, Tensor)
    assert result.shape == tensor.shape
```

### ❌ **DON'T: Functionality Re-testing**
```python
def test_attention_math_correctness():
    """DON'T: Re-test attention computation correctness."""
    # This should already be verified in attention_dev.py inline tests
    Q, K, V = create_qkv_matrices()
    scores = np.matmul(Q, K.transpose(-2, -1)) / np.sqrt(d_k)
    # ... detailed math verification belongs in inline tests
```

## 🏗️ **Integration Test Categories**

### 1. **Foundation Integration** (Tensor + Core Operations)
- **Tensor + Activations**: Do activation functions work with Tensor operations?
- **Tensor + Autograd**: Does gradient computation integrate with Tensor?
- **Layers + Dense Networks**: Do individual layers compose into networks?

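For illustration, a minimal sketch of a Foundation-level interface check (Tensor + Activations, then layer composition). It reuses the `Tensor`, `ReLU`, and `Dense` APIs shown elsewhere in this document; the exact constructor signatures, and whether activation instances are callable, are assumptions here.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU
from tinytorch.core.layers import Dense


def test_tensor_activation_interface():
    """Interface question: does an activation accept and return Tensors?"""
    x = Tensor(np.random.randn(4, 8))
    relu = ReLU()

    out = relu(x)  # assumes activation instances are callable
    assert isinstance(out, Tensor)
    assert out.shape == x.shape


def test_layer_composition_interface():
    """Interface question: does one Dense layer's output feed the next?"""
    x = Tensor(np.random.randn(4, 8))
    hidden = Dense(8, 16)
    head = Dense(16, 2)

    out = head.forward(hidden.forward(x))
    assert isinstance(out, Tensor)
    assert out.shape == (4, 2)
```
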
### 2. **Architecture Integration** (Component Composition)
- **Tensor + Attention**: Do attention mechanisms work with Tensor operations?
- **Spatial + Dense**: Do CNN layers integrate with fully connected layers?
- **Complete Pipelines**: Do end-to-end architectures work together?

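A hedged sketch of the Spatial + Dense case follows. `Conv2D`, its module path, its `(in_channels, out_channels, kernel_size)` signature, and the Tensor `.data` attribute are illustration-only assumptions; only `Tensor` and `Dense` appear elsewhere in this document.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.spatial import Conv2D  # assumed module path and class name


def test_spatial_dense_interface():
    """Interface question: can a CNN layer's output feed a Dense layer?"""
    images = Tensor(np.random.randn(2, 1, 8, 8))  # (batch, channels, H, W)
    conv = Conv2D(in_channels=1, out_channels=4, kernel_size=3)  # assumed signature
    feature_map = conv.forward(images)

    # Flatten spatial features before the fully connected layer.
    # Shape math assumed: 8x8 input, 3x3 kernel, no padding -> 6x6 feature map.
    flat = Tensor(feature_map.data.reshape(2, -1))  # assumes a .data ndarray attribute
    dense = Dense(4 * 6 * 6, 10)
    logits = dense.forward(flat)

    assert isinstance(logits, Tensor)
    assert logits.shape == (2, 10)
```
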
### 3. **Training & Data Integration** (Learning Workflows)
- **DataLoader + Tensor**: Does data loading work with tensor operations?
- **Training Integration**: Do complete training workflows function?
- **ML Pipeline**: Do end-to-end machine learning pipelines work?

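A hedged sketch of the DataLoader + Tensor case. The `DataLoader` import path, its constructor signature, and its batch-iteration protocol are assumptions; only `Tensor` and `Dense` are shown elsewhere in this document.

```python
import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.dataloader import DataLoader  # assumed module path


def test_dataloader_tensor_interface():
    """Interface question: do DataLoader batches feed tensor operations?"""
    features = np.random.randn(64, 10)
    labels = np.random.randint(0, 2, size=64)
    loader = DataLoader(features, labels, batch_size=32)  # assumed signature

    model = Dense(10, 2)
    for batch_x, batch_y in loader:  # assumed (features, labels) batch tuples
        out = model.forward(Tensor(batch_x))
        # Interface question: do batch shapes line up with the model?
        assert out.shape == (len(batch_y), 2)
```
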
### 4. **Inference Serving Integration** (Production Systems)
- **Compression + Models**: Do compressed models maintain functionality?
- **Kernels + Operations**: Do custom kernels integrate with standard ops?
- **Benchmarking + Systems**: Does performance measurement work across components?

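A minimal sketch of the benchmarking case, using only the standard library plus the `Tensor` and `Dense` APIs shown in this document: the question is whether a timing harness wraps cleanly around a component without disturbing its interface contract.

```python
import time

import numpy as np

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense


def test_benchmark_wraps_forward_pass():
    """Interface question: does a timing harness compose with a forward pass?"""
    model = Dense(64, 16)
    x = Tensor(np.random.randn(32, 64))

    start = time.perf_counter()
    out = model.forward(x)
    elapsed = time.perf_counter() - start

    # The measured call must still honor the normal interface contract.
    assert isinstance(out, Tensor)
    assert out.shape == (32, 16)
    assert elapsed >= 0.0
```
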
## 📋 **Integration Test Structure**

### **File Naming Convention**
```
test_{module1}_{module2}_integration.py    # Two-module interface testing
test_{workflow}_pipeline_integration.py    # Multi-module workflow testing
```
### **Test Function Structure**
```python
def test_interface_compatibility():
    """Test that ModuleA outputs work as ModuleB inputs."""
    # Setup components
    module_a = ModuleA()
    module_b = ModuleB()

    # Test interface compatibility
    output_a = module_a.process(input_data)
    result = module_b.process(output_a)  # Key: Does this work?

    # Assert compatibility, not correctness
    assert isinstance(result, expected_type)
    assert result.shape == expected_shape


def test_workflow_integration():
    """Test complete workflow across multiple modules."""
    # Test realistic usage patterns
    pipeline = setup_realistic_pipeline()
    result = pipeline.run(real_data)

    # Assert workflow completion
    assert pipeline.successful_completion()
```
## 🎓 **Educational Testing Principles**

### **Real Components, Not Mocks**
```python
# ✅ Good: Use actual TinyTorch components
from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU
from tinytorch.core.layers import Dense

# ❌ Bad: Mock components that don't reflect real behavior
class MockTensor:
    def __init__(self, data): pass
```
### **Realistic Scenarios**
```python
def test_student_workflow():
    """Test scenarios students will actually encounter."""
    # Use realistic data sizes and patterns
    data = np.random.randn(32, 10)  # Reasonable batch size

    # Test common student workflows
    network = create_simple_network()
    predictions = network.forward(data)
    loss = compute_loss(predictions, targets)
```
### **Clear Success Criteria**
```python
def test_cross_module_compatibility():
    """Integration test with clear educational objectives."""
    print("🔬 Integration Test: Tensor + Activation compatibility...")

    # Test specific interface
    result = test_compatibility()

    # Clear feedback
    if result.success:
        print("✅ Tensor and Activation modules integrate correctly")
        print("📈 Progress: Cross-module interfaces ✓")
    else:
        print(f"❌ Integration issue: {result.error}")
```
## 🔧 **Test Implementation Guidelines**

### **Focus Areas for Integration Tests**

1. **Data Flow Compatibility**
   - Output of ModuleA → Input of ModuleB
   - Shape and type consistency across modules
   - Error propagation across component boundaries

2. **Interface Contracts**
   - Method signatures work across modules
   - Expected behaviors are maintained in composition
   - Error handling integrates properly

3. **Workflow Validation**
   - Complete educational scenarios work end-to-end
   - Student-facing APIs function together
   - Real-world usage patterns succeed

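As a concrete illustration of the data-flow focus area above, including error propagation across a component boundary, here is a hedged sketch. It assumes the `Tensor` and `Dense` APIs shown in this document; the exact exception raised on a shape mismatch depends on the Dense implementation and is an assumption.

```python
import numpy as np
import pytest

from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense


def test_shape_contract_between_layers():
    """Output shape of one layer must satisfy the next layer's input contract."""
    first = Dense(8, 16)
    second = Dense(16, 4)

    # Data-flow compatibility: output of `first` is a valid input for `second`.
    good = second.forward(first.forward(Tensor(np.random.randn(2, 8))))
    assert good.shape == (2, 4)

    # Error propagation: an incompatible shape should surface as an exception
    # at the boundary, not silently produce a malformed result. (Exception
    # type is an assumption; adjust to whatever Dense actually raises.)
    with pytest.raises(Exception):
        second.forward(Tensor(np.random.randn(2, 8)))
```
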
### **Testing Anti-Patterns to Avoid**

❌ **Don't Re-test Module Internals**
```python
# Bad: This should be in activation_dev.py inline tests
def test_relu_computation():
    assert np.array_equal(relu(np.array([-1, 1])), np.array([0, 1]))
```
❌ **Don't Test Contrived Scenarios**
```python
# Bad: Students will never use 10000x10000 tensors
def test_massive_tensor_integration():
    huge_tensor = Tensor(np.random.randn(10000, 10000))
```
❌ **Don't Duplicate Inline Test Coverage**
```python
# Bad: If inline tests pass, mathematical correctness is verified
def test_attention_math_again():
    # Detailed mathematical verification already done inline
    pass
```
### ✅ **Good Integration Test Examples**

```python
def test_attention_tensor_interface():
    """Verify Attention mechanisms work with Tensor operations."""
    # Setup realistic scenario
    sequence_length, embed_size = 10, 64
    tensor_input = Tensor(np.random.randn(1, sequence_length, embed_size))

    # Test interface compatibility
    attention = SelfAttention(embed_size=embed_size)
    result = attention.forward(tensor_input)

    # Assert interface contract
    assert isinstance(result, Tensor)
    assert result.shape == tensor_input.shape
    print("✅ Attention-Tensor interface compatibility verified")


def test_complete_transformer_pipeline():
    """Test realistic transformer-like pipeline."""
    # Realistic student project scenario
    vocab_size, seq_len, embed_size = 1000, 20, 128

    # Complete pipeline components
    embedding = Embedding(vocab_size, embed_size)
    attention = SelfAttention(embed_size)
    dense = Dense(embed_size, vocab_size)

    # Test end-to-end workflow
    tokens = np.random.randint(0, vocab_size, (1, seq_len))
    embedded = embedding(tokens)
    attended = attention.forward(embedded)
    output = dense.forward(attended)

    # Verify complete workflow
    assert output.shape == (1, seq_len, vocab_size)
    print("✅ Complete transformer pipeline integration successful")
```
## 🚀 **Integration Testing Workflow**

### **Development Process**
1. **Module completion**: Ensure inline tests pass first
2. **Interface design**: Define cross-module contracts
3. **Integration tests**: Write interface compatibility tests
4. **Pipeline tests**: Add workflow validation tests
5. **Student validation**: Test realistic usage scenarios
### **Running Integration Tests**
```bash
# All integration tests
pytest tests/ -v

# Specific integration area
pytest tests/test_tensor_attention_integration.py -v
pytest tests/test_attention_pipeline_integration.py -v

# Integration test pattern
pytest tests/ -k "integration" -v
```
### **Quality Standards**
- ✅ **Interface Focus**: Tests verify component compatibility
- ✅ **Real Components**: Uses actual TinyTorch modules, not mocks
- ✅ **Student Scenarios**: Reflects realistic educational workflows
- ✅ **Clear Feedback**: Provides educational progress indicators
- ✅ **Complementary**: Adds value beyond inline testing
## 📚 **Reference Implementation**

**Best Examples**:
- `test_tensor_activations_integration.py` - Interface compatibility testing
- `test_attention_pipeline_integration.py` - Complete workflow testing
- `test_layers_networks_integration.py` - Component composition testing

**Avoid These Patterns**:
- Tests that duplicate inline test functionality
- Mock-based testing that doesn't reflect real component behavior
- Contrived scenarios that students won't encounter

Integration tests should answer: *"Do these modules work together?"* not *"Do these modules work correctly?"* - correctness is verified by inline tests.