mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-03-12 00:43:34 -05:00

Files

Vijay Janapa Reddi 469af4c3de Remove module-level tests directories, keep only main tests/ for exported package validation

- Remove all tests/ directories under modules/source/
- Keep main tests/ directory for testing exported functionality
- Update status command to check tests in main tests/ directory
- Update documentation to reflect new test structure
- Reduce maintenance burden by eliminating duplicate test systems
- Focus on inline NBGrader tests for development, main tests for package validation

2025-07-13 17:14:14 -04:00

13 KiB

Raw Blame History

TinyTorch Testing Design Document

Overview

This document defines the inline-first testing architecture for TinyTorch, prioritizing student learning effectiveness and flow state over context switching overhead.

Core Philosophy: Learning-First Testing

Primary Goal: Student learning and confidence building
Secondary Goal: Comprehensive validation for grading
Tertiary Goal: Professional testing practices

Key Insight

Students learning ML concepts are already at cognitive capacity. Every context switch is expensive and disrupts the learning flow. We prioritize immediate feedback and confidence building over professional tool complexity.

Three-Tier Testing Architecture

1. Inline Tests (Primary - In Notebooks)

Goal: Immediate feedback and confidence building during development

Location: Embedded in *_dev.py files as NBGrader cells
Dependencies: None (or minimal, well-controlled)
Scope: Comprehensive testing with educational context
Purpose: Build confidence step-by-step while ensuring correctness

Example:

# %% [markdown]
"""
### 🧪 Test Your ReLU Implementation

Let's verify your ReLU function works correctly with various inputs.
This tests the core functionality you just implemented.
"""

# %% nbgrader={"grade": true, "grade_id": "test-relu-comprehensive", "locked": true, "points": 15, "schema_version": 3, "solution": false, "task": false}
def test_relu_comprehensive():
    """Comprehensive test of ReLU function."""
    print("🔬 Testing ReLU function...")
    
    # Test 1: Basic positive numbers
    try:
        result = relu([1, 2, 3])
        expected = [1, 2, 3]
        assert result == expected, f"Positive numbers: expected {expected}, got {result}"
        print("✅ Positive numbers work correctly")
    except Exception as e:
        print(f"❌ Positive numbers failed: {e}")
        return False
    
    # Test 2: Negative numbers (should become 0)
    try:
        result = relu([-1, -2, -3])
        expected = [0, 0, 0]
        assert result == expected, f"Negative numbers: expected {expected}, got {result}"
        print("✅ Negative numbers correctly clipped to 0")
    except Exception as e:
        print(f"❌ Negative numbers failed: {e}")
        return False
    
    # Test 3: Mixed positive and negative
    try:
        result = relu([-2, -1, 0, 1, 2])
        expected = [0, 0, 0, 1, 2]
        assert result == expected, f"Mixed numbers: expected {expected}, got {result}"
        print("✅ Mixed positive/negative numbers work correctly")
    except Exception as e:
        print(f"❌ Mixed numbers failed: {e}")
        return False
    
    # Test 4: Edge case - zero
    try:
        result = relu([0])
        expected = [0]
        assert result == expected, f"Zero: expected {expected}, got {result}"
        print("✅ Zero handled correctly")
    except Exception as e:
        print(f"❌ Zero failed: {e}")
        return False
    
    print("🎉 All ReLU tests passed! Your implementation works correctly.")
    print("📈 Progress: ReLU function ✓")
    return True

# Run the test
success = test_relu_comprehensive()
if not success:
    print("\n💡 Hints for fixing ReLU:")
    print("- ReLU should return max(0, x) for each element")
    print("- Negative numbers should become 0")
    print("- Positive numbers should stay unchanged")
    print("- Zero should remain zero")

Characteristics:

Comprehensive: Test all functionality, edge cases, error conditions
Educational: Explain what's being tested and why
Visual: Clear pass/fail feedback with emojis and progress tracking
Immediate: No context switching required
Encouraging: Build confidence with positive reinforcement
Helpful: Provide hints and guidance when tests fail

2. Module Tests (For Grading - Separate Files with Mocks)

Goal: Comprehensive validation for instructor grading using simple, visible mocks

Location: tests/test_{module}.py files
Dependencies: Simple, visible mock objects (no cross-module dependencies)
Scope: Complete module functionality with professional structure
Purpose: Verify module works correctly for grading and assessment

Example:

# tests/test_layers.py
"""
Comprehensive Layers Module Tests

Tests Dense layer functionality using simple mock objects.
Used for instructor grading and comprehensive validation.
"""

class SimpleTensor:
    """
    Simple mock tensor for testing layers.
    
    Shows exactly what interface the Dense layer expects:
    - .data (numpy array): The actual numerical data
    - .shape (tuple): Dimensions of the data
    """
    def __init__(self, data):
        self.data = np.array(data)
        self.shape = self.data.shape
    
    def __repr__(self):
        return f"SimpleTensor(shape={self.shape})"

class TestDenseLayer:
    """Comprehensive tests for Dense layer - used for grading."""
    
    def test_initialization(self):
        """Test Dense layer creation and weight initialization."""
        layer = Dense(input_size=3, output_size=2)
        
        assert hasattr(layer, 'weights'), "Dense layer should have weights"
        assert hasattr(layer, 'bias'), "Dense layer should have bias"
        assert layer.weights.shape == (3, 2), f"Expected weights shape (3, 2), got {layer.weights.shape}"
        assert layer.bias.shape == (2,), f"Expected bias shape (2,), got {layer.bias.shape}"
    
    def test_forward_pass_comprehensive(self):
        """Comprehensive test of Dense layer forward pass."""
        layer = Dense(input_size=3, output_size=2)
        
        # Test single sample
        input_tensor = SimpleTensor([[1.0, 2.0, 3.0]])
        output = layer(input_tensor)
        
        assert hasattr(output, 'data'), "Layer should return tensor-like object with .data"
        assert hasattr(output, 'shape'), "Layer should return tensor-like object with .shape"
        assert output.shape == (1, 2), f"Expected output shape (1, 2), got {output.shape}"
        
        # Verify computation (y = Wx + b)
        expected = np.dot(input_tensor.data, layer.weights) + layer.bias
        np.testing.assert_array_almost_equal(output.data, expected)
        
        # Test batch processing
        batch_input = SimpleTensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        batch_output = layer(batch_input)
        assert batch_output.shape == (2, 2), f"Expected batch output shape (2, 2), got {batch_output.shape}"
        
        # Test edge cases
        edge_input = SimpleTensor([[0.0, 0.0, 0.0]])
        edge_output = layer(edge_input)
        assert edge_output.shape == (1, 2), "Should handle zero input"

Characteristics:

Self-contained: No dependencies on other TinyTorch modules
Mock-based: Simple, visible mocks that document interfaces
Comprehensive: Test all functionality, edge cases, error conditions
Professional: Use pytest structure and best practices
Grading-focused: Designed for instructor assessment

3. Integration Tests (Cross-Module Validation)

Goal: Verify modules work together using vetted solutions

Location: tests/integration/ directory
Dependencies: Instructor-provided working implementations
Scope: Cross-module workflows and realistic ML scenarios
Purpose: Ensure modules compose correctly without cascade failures

Example:

# tests/integration/test_basic_ml_pipeline.py
"""
Integration Tests - Basic ML Pipeline

Tests how student modules work together with vetted solutions.
No cascade failures - student code tested with known-working dependencies.
"""

from tinytorch.solutions.tensor import Tensor  # Working implementation
from tinytorch.solutions.activations import ReLU  # Working implementation
from student_layers import Dense  # Student's implementation

class TestBasicMLPipeline:
    """Test student modules in realistic ML workflows."""
    
    def test_neural_network_forward_pass(self):
        """Test complete neural network using student's Dense layer."""
        # Create network with student's Dense layers
        layer1 = Dense(input_size=4, output_size=3)  # Student implementation
        activation = ReLU()  # Working implementation
        layer2 = Dense(input_size=3, output_size=2)  # Student implementation
        
        # Create input with working Tensor
        x = Tensor([[1.0, 2.0, 3.0, 4.0]])  # Working tensor
        
        # Forward pass through network
        h1 = layer1(x)  # Student layer with working tensor
        h1_activated = activation(h1)  # Working activation
        output = layer2(h1_activated)  # Student layer
        
        # Verify pipeline works end-to-end
        assert output.shape == (1, 2), "Network should produce correct output shape"
        assert isinstance(output, Tensor), "Network should produce Tensor output"
        
        print("✅ Student's Dense layers work in complete neural network!")

Testing Workflow

For Students (Primary Path)

Implement in notebook: Write code with educational guidance
Test inline: Get immediate feedback without leaving notebook
Build confidence: See progress with visual indicators
Complete module: All inline tests pass before moving on

For Instructors (Grading Path)

Run module tests: Comprehensive validation with mocks
Run integration tests: Verify cross-module functionality
Grade systematically: Clear separation of concerns

Key Principles

1. Learning-First Design

Prioritize student understanding over tool complexity
Minimize context switching and cognitive overhead
Provide immediate, encouraging feedback
Build confidence step by step

2. Flow State Preservation

Keep students in their notebooks
Provide instant validation
Use visual, encouraging feedback
Avoid workflow disruption

3. Comprehensive Coverage

Inline tests are thorough, not just quick checks
Test functionality, edge cases, and error conditions
Provide educational context for each test
Explain what's being tested and why

4. Professional Structure (Where Appropriate)

Module tests use professional pytest structure
Integration tests mirror real-world workflows
Maintain quality standards for grading
Prepare students for industry practices

Implementation Guidelines

Inline Test Design

Comprehensive: Test all functionality thoroughly
Educational: Explain what's being tested
Visual: Use emojis and clear progress indicators
Helpful: Provide hints when tests fail
Encouraging: Build confidence with positive feedback

Mock Design for Module Tests

Simple: Only implement what's needed
Visible: Put mocks at top of files with clear documentation
Educational: Show exactly what interfaces are expected
Minimal: Don't over-engineer the mocks

Test Organization

modules/source/{module}/{module}_dev.py    # Implementation + comprehensive inline tests
tests/test_{module}.py                     # Package tests for exported functionality
tests/integration/                         # Cross-module tests with vetted solutions

CLI Integration

# Primary workflow - students work in notebooks
# Tests run automatically as cells execute

# Instructor grading workflow
tito test --module tensor                  # Run module tests with mocks
tito test --integration                    # Run integration tests
tito test --all                           # Run all tests

Benefits

For Students

No context switching: Stay in flow state
Immediate feedback: Know instantly if code works
Confidence building: Step-by-step validation
Clear guidance: Helpful error messages and hints
Educational value: Learn what good testing looks like

For Instructors

Comprehensive validation: Thorough testing for grading
Clear diagnostics: Know exactly what's working/broken
Independent assessment: Module tests don't depend on student's other modules
Professional standards: Maintain quality without overwhelming students

For the System

Maintainable: Clear separation of learning vs grading concerns
Scalable: Easy to add new modules
Educational: Every test serves a learning purpose
Practical: Balances learning effectiveness with quality assurance

Conclusion

The inline-first testing approach prioritizes student learning effectiveness over tool complexity. Students get comprehensive testing within their learning context, building confidence and understanding step by step. Instructors maintain professional testing standards for grading while avoiding the cognitive overhead that disrupts the learning process.

Key insight: Context switching is expensive for learners. Keep them in flow state while ensuring comprehensive validation.

13 KiB Raw Blame History

TinyTorch Testing Design Document

Overview

Core Philosophy: Learning-First Testing

Key Insight

Three-Tier Testing Architecture

1. Inline Tests (Primary - In Notebooks)

2. Module Tests (For Grading - Separate Files with Mocks)

3. Integration Tests (Cross-Module Validation)

Testing Workflow

For Students (Primary Path)

For Instructors (Grading Path)

Key Principles

1. Learning-First Design

2. Flow State Preservation

3. Comprehensive Coverage

4. Professional Structure (Where Appropriate)

Implementation Guidelines

Inline Test Design

Mock Design for Module Tests

Test Organization

CLI Integration

Benefits

For Students

For Instructors

For the System

Conclusion

13 KiB

Raw Blame History