mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2025-12-05 19:17:52 -06:00
Clean up repository
- Remove stale feature branches (kept debugging branch with unmerged work) - Move test_spatial_core.py to correct directory (tests/09_spatial) - Remove .tito user state from tracking (config.json, progress.json) - Delete archived CLI commands (tito/commands/_archived/) - Move standalone integration tests to tests/integration/ - Remove outdated audit/report markdown files - Remove old template and deprecated test files - Simplify .gitignore for .tito/ directory
This commit is contained in:
5
.gitignore
vendored
5
.gitignore
vendored
@@ -137,9 +137,8 @@ Thumbs.db
|
||||
tito-cli.log
|
||||
COMMIT_LOG.txt
|
||||
|
||||
# Tito CLI backups and cache
|
||||
.tito/backups/
|
||||
.tito/cache/
|
||||
# Tito CLI user state and cache (local to each user)
|
||||
.tito/
|
||||
|
||||
# Downloaded datasets (not source-controlled, too large)
|
||||
data/
|
||||
|
||||
@@ -1,3 +0,0 @@
|
||||
{
|
||||
"logo_theme": "standard"
|
||||
}
|
||||
@@ -1,16 +0,0 @@
|
||||
{
|
||||
"completed_modules": [
|
||||
"01_setup",
|
||||
"02_tensor",
|
||||
"03_activations",
|
||||
"04_layers"
|
||||
],
|
||||
"completion_dates": {
|
||||
"01_setup": "2025-09-19T10:21:11.081117",
|
||||
"02_tensor": "2025-09-19T10:21:34.831693",
|
||||
"03_activations": "2025-09-19T10:21:50.000000",
|
||||
"04_layers": "2025-09-19T10:21:55.000000"
|
||||
},
|
||||
"achievements": [],
|
||||
"total_capabilities_unlocked": 0
|
||||
}
|
||||
@@ -1,660 +0,0 @@
|
||||
# Module 05 (Autograd) Integration Test Audit Report
|
||||
|
||||
**Date**: 2025-11-25
|
||||
**Auditor**: Dr. Sarah Rodriguez
|
||||
**Status**: CRITICAL GAPS IDENTIFIED
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Current State**: The `test_progressive_integration.py` file is MISNAMED and tests Module 08 (DataLoader), NOT Module 05 (Autograd). This is a critical error that breaks the testing framework.
|
||||
|
||||
**Test Coverage**: 40% - Missing critical integration tests for gradient flow, in-place operations, memory leaks, and multi-module integration.
|
||||
|
||||
**Bug-Catching Priority**: MEDIUM - Existing tests cover specific operations but miss systemic integration issues.
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues
|
||||
|
||||
### 1. WRONG MODULE TESTED (BLOCKER)
|
||||
|
||||
**Issue**: `/Users/VJ/GitHub/TinyTorch/tests/05_autograd/test_progressive_integration.py` tests Module 08 (DataLoader), not Module 05 (Autograd)
|
||||
|
||||
**Evidence**:
|
||||
```python
|
||||
# Line 1-7 of test_progressive_integration.py
|
||||
"""
|
||||
Module 08: Progressive Integration Tests
|
||||
Tests that Module 08 (DataLoader) works correctly AND that the entire prior stack works.
|
||||
|
||||
DEPENDENCY CHAIN: 01_setup → 02_tensor → 03_activations → 04_layers → 05_dense → 06_spatial → 07_attention → 08_dataloader
|
||||
This is where we enable real data processing for ML systems.
|
||||
```
|
||||
|
||||
**Impact**:
|
||||
- Module 05 has NO progressive integration tests
|
||||
- Cannot verify that Autograd works with prior modules (01-04)
|
||||
- Cannot verify that prior modules remain stable after Autograd
|
||||
|
||||
**Action Required**:
|
||||
1. Rename current file to `tests/08_dataloader/test_progressive_integration.py`
|
||||
2. Create NEW `tests/05_autograd/test_progressive_integration.py` for Autograd
|
||||
|
||||
---
|
||||
|
||||
## Current Test Coverage Analysis
|
||||
|
||||
### Existing Tests (What We Have)
|
||||
|
||||
| Test File | Purpose | Coverage |
|
||||
|-----------|---------|----------|
|
||||
| `test_gradient_flow.py` | Tests gradient tracking through operations | ✅ Good |
|
||||
| `test_batched_matmul_backward.py` | Tests batched matmul gradients | ✅ Excellent |
|
||||
| `test_dataloader_tensor_integration.py` | DataLoader integration (wrong module!) | ❌ Misplaced |
|
||||
| `test_progressive_integration.py` | Module 08 tests (WRONG!) | ❌ Wrong module |
|
||||
|
||||
### What These Tests Cover
|
||||
|
||||
**✅ COVERED:**
|
||||
1. **Arithmetic gradient flow** (add, sub, mul, div)
|
||||
2. **Activation gradients** (ReLU, Sigmoid, Softmax, GELU)
|
||||
3. **Reshape/transpose gradients**
|
||||
4. **Batched matmul** (attention patterns)
|
||||
5. **LayerNorm operations** (sqrt, mean)
|
||||
|
||||
**❌ MISSING:**
|
||||
1. **Integration with Module 01 (Tensor)** - No tests that Tensor operations work
|
||||
2. **Integration with Module 02 (Activations)** - Limited activation gradient tests
|
||||
3. **Integration with Module 03 (Layers)** - No Dense layer gradient tests
|
||||
4. **Integration with Module 04 (Losses)** - No loss gradient tests
|
||||
5. **In-place operation bugs** - Critical for catching graph breaking
|
||||
6. **Memory leak detection** - Computational graph accumulation
|
||||
7. **Gradient accumulation bugs** - Shared parameters
|
||||
8. **Multi-layer backprop** - End-to-end gradient flow
|
||||
9. **Prior module stability** - Regression testing
|
||||
|
||||
---
|
||||
|
||||
## Critical Integration Points Analysis
|
||||
|
||||
### Integration Point 1: Autograd + Module 01 (Tensor)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- All Tensor operations preserve `requires_grad`
|
||||
- Tensor operations create `_grad_fn` correctly
|
||||
- `backward()` computes correct gradients for all operations
|
||||
- Broadcasting during backward works correctly
|
||||
- Scalar tensors can call `backward()` without arguments
|
||||
|
||||
**Current Coverage**: 60%
|
||||
- ✅ Basic operations tested in `test_gradient_flow.py`
|
||||
- ❌ Missing: Broadcasting edge cases
|
||||
- ❌ Missing: Scalar tensor backward
|
||||
- ❌ Missing: Inplace operation detection
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: Broadcasting gradient accumulation
|
||||
def test_broadcasting_backward():
|
||||
"""Test gradients accumulate correctly with broadcasting."""
|
||||
bias = Tensor([1.0], requires_grad=True) # Shape (1,)
|
||||
x = Tensor([[1, 2], [3, 4]], requires_grad=True) # Shape (2, 2)
|
||||
y = x + bias # Broadcasts to (2, 2)
|
||||
loss = y.sum()
|
||||
loss.backward()
|
||||
# bias.grad should be summed over all broadcast dimensions
|
||||
assert bias.grad.shape == (1,), "Bias gradient shape wrong"
|
||||
assert np.allclose(bias.grad, [4.0]), "Broadcasting backward failed"
|
||||
```
|
||||
|
||||
### Integration Point 2: Autograd + Module 02 (Activations)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- ReLU, Sigmoid, Softmax, GELU all preserve gradient tracking
|
||||
- Activation gradients compose correctly in chains
|
||||
- Dead ReLU neurons (zero gradient) handled correctly
|
||||
- Softmax numerical stability during backward
|
||||
|
||||
**Current Coverage**: 70%
|
||||
- ✅ Basic activation gradients tested
|
||||
- ✅ GELU gradient flow tested
|
||||
- ❌ Missing: Activation chaining gradients
|
||||
- ❌ Missing: Dead ReLU detection
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: Multi-activation gradient chain
|
||||
def test_activation_chain_gradients():
|
||||
"""Test gradients flow through chained activations."""
|
||||
x = Tensor([1.0, -1.0, 2.0], requires_grad=True)
|
||||
relu = ReLU()
|
||||
sigmoid = Sigmoid()
|
||||
|
||||
# Chain: x -> ReLU -> Sigmoid -> loss
|
||||
h = relu(x)
|
||||
y = sigmoid(h)
|
||||
loss = y.sum()
|
||||
loss.backward()
|
||||
|
||||
# x.grad should reflect both ReLU and Sigmoid derivatives
|
||||
assert x.grad is not None, "Gradient didn't flow through chain"
|
||||
# Dead neuron at x=-1 should have zero gradient
|
||||
assert np.isclose(x.grad[1], 0.0), "Dead ReLU gradient not zero"
|
||||
```
|
||||
|
||||
### Integration Point 3: Autograd + Module 03 (Layers)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- Dense layer forward preserves `requires_grad`
|
||||
- Dense layer backward computes weight and bias gradients
|
||||
- Multi-layer networks backpropagate correctly
|
||||
- Parameter sharing accumulates gradients
|
||||
|
||||
**Current Coverage**: 0% ❌
|
||||
- **COMPLETELY MISSING**: No tests for Dense layer gradients
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: Dense layer gradient computation
|
||||
def test_dense_layer_gradients():
|
||||
"""Test Dense layer computes weight and bias gradients."""
|
||||
from tinytorch.core.layers import Dense
|
||||
|
||||
layer = Dense(3, 2)
|
||||
x = Tensor([[1, 2, 3]], requires_grad=True)
|
||||
|
||||
# Forward pass
|
||||
y = layer(x)
|
||||
loss = y.sum()
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Check all gradients exist
|
||||
assert layer.weight.grad is not None, "Weight gradient missing"
|
||||
assert layer.bias.grad is not None, "Bias gradient missing"
|
||||
assert x.grad is not None, "Input gradient missing"
|
||||
|
||||
# Check gradient shapes
|
||||
assert layer.weight.grad.shape == layer.weight.shape
|
||||
assert layer.bias.grad.shape == layer.bias.shape
|
||||
```
|
||||
|
||||
### Integration Point 4: Autograd + Module 04 (Losses)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- MSE loss computes correct gradients
|
||||
- CrossEntropy loss computes correct gradients
|
||||
- BCE loss computes correct gradients
|
||||
- Loss gradients match hand-calculated values
|
||||
|
||||
**Current Coverage**: 0% ❌
|
||||
- **COMPLETELY MISSING**: No tests for loss function gradients
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: MSE loss gradient
|
||||
def test_mse_loss_gradient():
|
||||
"""Test MSE loss computes correct gradients."""
|
||||
from tinytorch.core.losses import MSELoss
|
||||
|
||||
predictions = Tensor([1.0, 2.0, 3.0], requires_grad=True)
|
||||
targets = Tensor([1.5, 2.5, 2.5])
|
||||
|
||||
mse = MSELoss()
|
||||
loss = mse(predictions, targets)
|
||||
loss.backward()
|
||||
|
||||
# MSE gradient: 2 * (pred - target) / N
|
||||
expected_grad = 2 * (predictions.data - targets.data) / 3
|
||||
assert np.allclose(predictions.grad, expected_grad), "MSE gradient incorrect"
|
||||
```
|
||||
|
||||
### Integration Point 5: In-Place Operations
|
||||
|
||||
**What Should Be Tested**:
|
||||
- In-place ops break computation graph (expected behavior)
|
||||
- In-place ops raise warnings or errors
|
||||
- Students see clear error messages
|
||||
|
||||
**Current Coverage**: 0% ❌
|
||||
- **COMPLETELY MISSING**: No in-place operation tests
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: In-place operation detection
|
||||
def test_inplace_operations_break_graph():
|
||||
"""Test that in-place operations are detected and warned."""
|
||||
x = Tensor([1, 2, 3], requires_grad=True)
|
||||
y = x * 2
|
||||
|
||||
# In-place modification (if implemented) should break graph
|
||||
# This test ensures students understand the danger
|
||||
try:
|
||||
x.data[0] = 999 # Direct modification
|
||||
y.backward(Tensor([1, 1, 1]))
|
||||
# If we get here, gradient is computed on modified data - BAD!
|
||||
assert False, "In-place modification should affect gradients"
|
||||
except Exception:
|
||||
# Expected: Some warning or error about in-place ops
|
||||
pass
|
||||
```
|
||||
|
||||
### Integration Point 6: Memory Leaks (Computational Graph)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- Computation graphs don't accumulate across iterations
|
||||
- `zero_grad()` prevents gradient accumulation
|
||||
- Large graphs can be garbage collected
|
||||
|
||||
**Current Coverage**: 0% ❌
|
||||
- **COMPLETELY MISSING**: No memory leak tests
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: Gradient accumulation prevention
|
||||
def test_zero_grad_prevents_accumulation():
|
||||
"""Test zero_grad() prevents gradient accumulation."""
|
||||
x = Tensor([1.0], requires_grad=True)
|
||||
|
||||
# First backward pass
|
||||
y1 = x * 2
|
||||
y1.backward()
|
||||
first_grad = x.grad.copy()
|
||||
|
||||
# Second backward WITHOUT zero_grad - accumulates
|
||||
y2 = x * 3
|
||||
y2.backward()
|
||||
assert np.allclose(x.grad, first_grad + 3.0), "Gradients should accumulate"
|
||||
|
||||
# Third backward WITH zero_grad - doesn't accumulate
|
||||
x.zero_grad()
|
||||
y3 = x * 4
|
||||
y3.backward()
|
||||
assert np.allclose(x.grad, 4.0), "zero_grad() should reset gradients"
|
||||
```
|
||||
|
||||
### Integration Point 7: Gradient Accumulation (Parameter Sharing)
|
||||
|
||||
**What Should Be Tested**:
|
||||
- Shared parameters accumulate gradients correctly
|
||||
- Embedding layers with repeated indices accumulate gradients
|
||||
- Multi-path graphs accumulate gradients
|
||||
|
||||
**Current Coverage**: 0% ❌
|
||||
- **COMPLETELY MISSING**: No gradient accumulation tests
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
# Test: Parameter sharing gradient accumulation
|
||||
def test_shared_parameter_gradient_accumulation():
|
||||
"""Test shared parameters accumulate gradients from multiple uses."""
|
||||
weight = Tensor([2.0], requires_grad=True)
|
||||
|
||||
# Use same weight twice
|
||||
x1 = Tensor([1.0])
|
||||
x2 = Tensor([3.0])
|
||||
|
||||
y1 = weight * x1 # First use
|
||||
y2 = weight * x2 # Second use
|
||||
|
||||
loss = y1.sum() + y2.sum()
|
||||
loss.backward()
|
||||
|
||||
# Gradient should accumulate: dy1/dw + dy2/dw = 1.0 + 3.0 = 4.0
|
||||
assert np.allclose(weight.grad, 4.0), "Shared parameter gradients didn't accumulate"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Missing Progressive Integration Tests
|
||||
|
||||
### Test Class 1: Prior Stack Stability (Modules 01-04)
|
||||
|
||||
**Purpose**: Verify Autograd didn't break previous modules
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
class TestPriorStackStillWorking:
|
||||
"""Verify Modules 01-04 still work after Autograd."""
|
||||
|
||||
def test_tensor_operations_stable(self):
|
||||
"""Tensor operations work without requires_grad."""
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Should work exactly as before (Module 01)
|
||||
x = Tensor([1, 2, 3])
|
||||
y = Tensor([4, 5, 6])
|
||||
z = x + y
|
||||
|
||||
assert np.array_equal(z.data, [5, 7, 9])
|
||||
assert z.grad is None # No gradient tracking
|
||||
|
||||
def test_activations_stable(self):
|
||||
"""Activations work without requires_grad."""
|
||||
from tinytorch.core.activations import ReLU
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
relu = ReLU()
|
||||
x = Tensor([-1, 0, 1])
|
||||
y = relu(x)
|
||||
|
||||
assert np.array_equal(y.data, [0, 0, 1])
|
||||
assert y.grad is None # No gradient tracking
|
||||
```
|
||||
|
||||
### Test Class 2: Autograd Core Functionality
|
||||
|
||||
**Purpose**: Test Autograd's core capabilities
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
class TestModule05AutogradCore:
|
||||
"""Test Module 05 (Autograd) core functionality."""
|
||||
|
||||
def test_simple_backward_pass(self):
|
||||
"""Test simple computational graph backward pass."""
|
||||
enable_autograd()
|
||||
|
||||
x = Tensor([2.0], requires_grad=True)
|
||||
y = x * 3
|
||||
loss = y.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
assert x.grad is not None
|
||||
assert np.allclose(x.grad, [3.0])
|
||||
|
||||
def test_multi_step_backward(self):
|
||||
"""Test multi-step computation graph."""
|
||||
enable_autograd()
|
||||
|
||||
x = Tensor([2.0], requires_grad=True)
|
||||
y = x * 3 # y = 6
|
||||
z = y + 1 # z = 7
|
||||
w = z * 2 # w = 14
|
||||
|
||||
w.backward()
|
||||
|
||||
# dw/dx = dw/dz * dz/dy * dy/dx = 2 * 1 * 3 = 6
|
||||
assert np.allclose(x.grad, [6.0])
|
||||
```
|
||||
|
||||
### Test Class 3: Full Stack Integration
|
||||
|
||||
**Purpose**: Test complete pipeline (Modules 01-05)
|
||||
|
||||
**Missing Tests**:
|
||||
```python
|
||||
class TestProgressiveStackIntegration:
|
||||
"""Test complete stack (01→05) works together."""
|
||||
|
||||
def test_neural_network_backward(self):
|
||||
"""Test complete neural network with backprop."""
|
||||
enable_autograd()
|
||||
from tinytorch.core.layers import Dense
|
||||
from tinytorch.core.activations import ReLU
|
||||
from tinytorch.core.losses import MSELoss
|
||||
|
||||
# Build network
|
||||
layer1 = Dense(3, 4)
|
||||
relu = ReLU()
|
||||
layer2 = Dense(4, 2)
|
||||
|
||||
# Forward pass
|
||||
x = Tensor([[1, 2, 3]], requires_grad=True)
|
||||
h = relu(layer1(x))
|
||||
y = layer2(h)
|
||||
|
||||
# Loss
|
||||
target = Tensor([[1, 0]])
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn(y, target)
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# All parameters should have gradients
|
||||
assert layer1.weight.grad is not None
|
||||
assert layer1.bias.grad is not None
|
||||
assert layer2.weight.grad is not None
|
||||
assert layer2.bias.grad is not None
|
||||
assert x.grad is not None
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Bug-Catching Priority Matrix
|
||||
|
||||
| Category | Priority | Coverage | Missing Tests |
|
||||
|----------|----------|----------|---------------|
|
||||
| **Gradient Correctness** | 🔴 CRITICAL | 70% | Numerical gradient checks |
|
||||
| **In-Place Operations** | 🔴 CRITICAL | 0% | Graph breaking detection |
|
||||
| **Memory Leaks** | 🟠 HIGH | 0% | Graph accumulation tests |
|
||||
| **Gradient Accumulation** | 🟠 HIGH | 0% | Shared parameter tests |
|
||||
| **Module Integration** | 🟠 HIGH | 30% | Multi-module pipelines |
|
||||
| **Prior Module Stability** | 🟡 MEDIUM | 0% | Regression tests |
|
||||
| **Broadcasting** | 🟡 MEDIUM | 40% | Edge case tests |
|
||||
| **Numerical Stability** | 🟢 LOW | 50% | Extreme value tests |
|
||||
|
||||
---
|
||||
|
||||
## Recommendations
|
||||
|
||||
### Immediate Actions (Week 1)
|
||||
|
||||
1. **Fix File Misplacement** (1 hour)
|
||||
- Move `test_progressive_integration.py` to `tests/08_dataloader/`
|
||||
- Create new `tests/05_autograd/test_progressive_integration.py`
|
||||
|
||||
2. **Add Critical Missing Tests** (4 hours)
|
||||
- Dense layer gradient tests
|
||||
- Loss function gradient tests
|
||||
- In-place operation detection
|
||||
- Memory leak tests
|
||||
|
||||
3. **Add Prior Module Stability Tests** (2 hours)
|
||||
- Test Modules 01-04 still work
|
||||
- Test gradients don't affect non-gradient mode
|
||||
|
||||
### Short-Term Actions (Week 2-3)
|
||||
|
||||
4. **Add Integration Tests** (6 hours)
|
||||
- Full neural network backward pass
|
||||
- Multi-layer gradient flow
|
||||
- Shared parameter accumulation
|
||||
|
||||
5. **Add Edge Case Tests** (3 hours)
|
||||
- Broadcasting edge cases
|
||||
- Scalar tensor backward
|
||||
- Empty gradient handling
|
||||
|
||||
### Long-Term Actions (Month 1)
|
||||
|
||||
6. **Add Numerical Gradient Checks** (8 hours)
|
||||
- Finite difference verification for all operations
|
||||
- Ensures analytical gradients are correct
|
||||
|
||||
7. **Add Performance Tests** (4 hours)
|
||||
- Large graph memory usage
|
||||
- Gradient computation speed
|
||||
- Graph building overhead
|
||||
|
||||
---
|
||||
|
||||
## Test Template for Module 05
|
||||
|
||||
```python
|
||||
"""
|
||||
Module 05: Progressive Integration Tests
|
||||
Tests that Module 05 (Autograd) works correctly AND that all previous modules still work.
|
||||
|
||||
DEPENDENCY CHAIN: 01_tensor → 02_activations → 03_layers → 04_losses → 05_autograd
|
||||
This is where automatic differentiation enables training.
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add project root to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
|
||||
|
||||
|
||||
class TestPriorStackStillWorking:
|
||||
"""Verify Modules 01-04 functionality is still intact."""
|
||||
|
||||
def test_tensor_operations_stable(self):
|
||||
"""Ensure tensor operations work without gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_activations_stable(self):
|
||||
"""Ensure activations work without gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_layers_stable(self):
|
||||
"""Ensure layers work without gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestModule05AutogradCore:
|
||||
"""Test Module 05 (Autograd) core functionality."""
|
||||
|
||||
def test_enable_autograd(self):
|
||||
"""Test autograd can be enabled."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_simple_backward(self):
|
||||
"""Test simple backward pass."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_requires_grad_tracking(self):
|
||||
"""Test requires_grad flag works."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestAutogradTensorIntegration:
|
||||
"""Test Autograd works with all Tensor operations (Module 01)."""
|
||||
|
||||
def test_arithmetic_gradients(self):
|
||||
"""Test gradients for +, -, *, /."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_matmul_gradients(self):
|
||||
"""Test gradients for matrix multiplication."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_broadcasting_gradients(self):
|
||||
"""Test broadcasting during backward."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestAutogradActivationIntegration:
|
||||
"""Test Autograd works with Activations (Module 02)."""
|
||||
|
||||
def test_relu_gradients(self):
|
||||
"""Test ReLU gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_sigmoid_gradients(self):
|
||||
"""Test Sigmoid gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_activation_chain_gradients(self):
|
||||
"""Test chained activation gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestAutogradLayerIntegration:
|
||||
"""Test Autograd works with Layers (Module 03)."""
|
||||
|
||||
def test_dense_layer_gradients(self):
|
||||
"""Test Dense layer parameter gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_multi_layer_gradients(self):
|
||||
"""Test multi-layer network gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestAutogradLossIntegration:
|
||||
"""Test Autograd works with Loss functions (Module 04)."""
|
||||
|
||||
def test_mse_loss_gradients(self):
|
||||
"""Test MSE loss gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_crossentropy_loss_gradients(self):
|
||||
"""Test CrossEntropy loss gradients."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestProgressiveStackIntegration:
|
||||
"""Test complete stack (01→05) works together."""
|
||||
|
||||
def test_end_to_end_training_step(self):
|
||||
"""Test complete forward + backward pass."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_gradient_accumulation(self):
|
||||
"""Test gradients accumulate correctly."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
|
||||
class TestAutogradBugPrevention:
|
||||
"""Tests that catch common autograd bugs."""
|
||||
|
||||
def test_inplace_operations(self):
|
||||
"""Test in-place operations are handled correctly."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_memory_leaks(self):
|
||||
"""Test computation graphs don't leak memory."""
|
||||
# Test implementation
|
||||
pass
|
||||
|
||||
def test_zero_grad_works(self):
|
||||
"""Test zero_grad() prevents accumulation."""
|
||||
# Test implementation
|
||||
pass
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
**Overall Assessment**: Module 05 integration tests are **INCOMPLETE** and **MISPLACED**.
|
||||
|
||||
**Risk Level**: 🔴 **HIGH** - Missing critical tests could allow gradient bugs to slip into production.
|
||||
|
||||
**Recommended Action**: Implement missing tests IMMEDIATELY before students encounter gradient bugs.
|
||||
|
||||
**Estimated Effort**: 20-25 hours to achieve 90% coverage.
|
||||
|
||||
**Student Impact**: Without these tests, students will encounter confusing gradient bugs that are hard to debug. Proper integration tests will catch these issues early.
|
||||
|
||||
---
|
||||
|
||||
**Report Generated**: 2025-11-25
|
||||
**Next Review**: After implementing critical missing tests
|
||||
@@ -1,401 +0,0 @@
|
||||
"""
|
||||
Module 08: Progressive Integration Tests
|
||||
Tests that Module 08 (DataLoader) works correctly AND that the entire prior stack works.
|
||||
|
||||
DEPENDENCY CHAIN: 01_setup → 02_tensor → 03_activations → 04_layers → 05_dense → 06_spatial → 07_attention → 08_dataloader
|
||||
This is where we enable real data processing for ML systems.
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add project root to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
|
||||
|
||||
|
||||
class TestPriorStackStillWorking:
|
||||
"""Quick regression checks that prior modules (01→07) still work."""
|
||||
|
||||
def test_foundation_stack_stable(self):
|
||||
"""Verify foundation stack (01→05) remains stable."""
|
||||
# Environment (Module 01)
|
||||
assert sys.version_info >= (3, 8), "Foundation broken: Python version"
|
||||
|
||||
# Core functionality should work
|
||||
try:
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Dense
|
||||
|
||||
# Should still be able to build networks
|
||||
layer = Dense(10, 5)
|
||||
x = Tensor(np.random.randn(4, 10))
|
||||
output = layer(x)
|
||||
assert output.shape == (4, 5), "Foundation broken: Neural network"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Foundation not implemented yet"
|
||||
|
||||
def test_advanced_stack_stable(self):
|
||||
"""Verify advanced modules (06→07) still work."""
|
||||
try:
|
||||
from tinytorch.core.spatial import Conv2D
|
||||
from tinytorch.core.attention import MultiHeadAttention
|
||||
|
||||
# Spatial and attention should work
|
||||
conv = Conv2D(in_channels=3, out_channels=16, kernel_size=3)
|
||||
attention = MultiHeadAttention(embed_dim=64, num_heads=8)
|
||||
|
||||
assert hasattr(conv, 'forward'), "Advanced stack broken: Spatial"
|
||||
assert hasattr(attention, 'forward'), "Advanced stack broken: Attention"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Advanced stack not implemented yet"
|
||||
|
||||
|
||||
class TestModule08DataLoaderCore:
|
||||
"""Test Module 08 (DataLoader) core functionality."""
|
||||
|
||||
def test_dataset_creation(self):
|
||||
"""Test basic dataset creation works."""
|
||||
try:
|
||||
from tinytorch.core.data import Dataset
|
||||
|
||||
# Create simple dataset
|
||||
class SimpleDataset(Dataset):
|
||||
def __init__(self, size=100):
|
||||
self.size = size
|
||||
self.data = np.random.randn(size, 10)
|
||||
self.targets = np.random.randint(0, 3, size)
|
||||
|
||||
def __len__(self):
|
||||
return self.size
|
||||
|
||||
def __getitem__(self, idx):
|
||||
return self.data[idx], self.targets[idx]
|
||||
|
||||
dataset = SimpleDataset(50)
|
||||
assert len(dataset) == 50, "Dataset length broken"
|
||||
|
||||
# Test data access
|
||||
sample, target = dataset[0]
|
||||
assert sample.shape == (10,), "Dataset sample shape broken"
|
||||
assert isinstance(target, (int, np.integer)), "Dataset target type broken"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Dataset not implemented yet"
|
||||
|
||||
def test_dataloader_creation(self):
|
||||
"""Test DataLoader creation and batching."""
|
||||
try:
|
||||
from tinytorch.core.data import DataLoader, Dataset
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Simple dataset for testing
|
||||
class TestDataset(Dataset):
|
||||
def __init__(self):
|
||||
self.data = np.random.randn(20, 5)
|
||||
self.targets = np.random.randint(0, 2, 20)
|
||||
|
||||
def __len__(self):
|
||||
return 20
|
||||
|
||||
def __getitem__(self, idx):
|
||||
return Tensor(self.data[idx]), self.targets[idx]
|
||||
|
||||
dataset = TestDataset()
|
||||
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)
|
||||
|
||||
# Test batching
|
||||
for batch_x, batch_y in dataloader:
|
||||
assert batch_x.shape == (4, 5), "DataLoader batch shape broken"
|
||||
assert len(batch_y) == 4, "DataLoader target batch broken"
|
||||
break # Just test first batch
|
||||
|
||||
except ImportError:
|
||||
assert True, "DataLoader not implemented yet"
|
||||
|
||||
def test_real_dataset_support(self):
|
||||
"""Test support for real datasets like CIFAR-10."""
|
||||
try:
|
||||
from tinytorch.core.data import CIFAR10Dataset
|
||||
|
||||
# Note: This might download data, so we'll just test instantiation
|
||||
# In real usage, students would download CIFAR-10
|
||||
try:
|
||||
dataset = CIFAR10Dataset(root='./data', train=True, download=False)
|
||||
# If dataset exists, test basic functionality
|
||||
if len(dataset) > 0:
|
||||
sample, target = dataset[0]
|
||||
assert len(sample.shape) >= 2, "CIFAR-10 sample shape invalid"
|
||||
assert isinstance(target, (int, np.integer)), "CIFAR-10 target invalid"
|
||||
except (FileNotFoundError, RuntimeError):
|
||||
# Data not downloaded, which is fine for testing
|
||||
assert True, "CIFAR-10 data not available (expected)"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Real dataset support not implemented yet"
|
||||
|
||||
|
||||
class TestProgressiveStackIntegration:
|
||||
"""Test that the complete stack (01→08) works together."""
|
||||
|
||||
def test_complete_training_pipeline(self):
|
||||
"""Test complete ML pipeline: data → model → training."""
|
||||
try:
|
||||
from tinytorch.core.data import DataLoader, Dataset
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Dense
|
||||
from tinytorch.core.activations import ReLU, Softmax
|
||||
|
||||
# Create dataset
|
||||
class MLDataset(Dataset):
|
||||
def __init__(self):
|
||||
self.data = np.random.randn(40, 10)
|
||||
self.targets = np.random.randint(0, 3, 40)
|
||||
|
||||
def __len__(self):
|
||||
return 40
|
||||
|
||||
def __getitem__(self, idx):
|
||||
return Tensor(self.data[idx]), self.targets[idx]
|
||||
|
||||
# Create data pipeline
|
||||
dataset = MLDataset()
|
||||
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)
|
||||
|
||||
# Create model using prior modules
|
||||
layer1 = Dense(10, 16)
|
||||
layer2 = Dense(16, 3)
|
||||
relu = ReLU()
|
||||
softmax = Softmax()
|
||||
|
||||
# Test training loop structure
|
||||
for batch_x, batch_y in dataloader:
|
||||
# Forward pass through complete pipeline
|
||||
h = relu(layer1(batch_x))
|
||||
logits = layer2(h)
|
||||
predictions = softmax(logits)
|
||||
|
||||
assert predictions.shape == (8, 3), "Complete pipeline broken"
|
||||
|
||||
# Test one batch
|
||||
break
|
||||
|
||||
except ImportError:
|
||||
assert True, "Complete training pipeline not ready yet"
|
||||
|
||||
def test_cnn_data_pipeline(self):
|
||||
"""Test CNN pipeline with spatial data."""
|
||||
try:
|
||||
from tinytorch.core.data import DataLoader, Dataset
|
||||
from tinytorch.core.spatial import Conv2D, MaxPool2D
|
||||
from tinytorch.core.layers import Dense
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Image dataset
|
||||
class ImageDataset(Dataset):
|
||||
def __init__(self):
|
||||
# 32x32 RGB images
|
||||
self.data = np.random.randn(20, 3, 32, 32)
|
||||
self.targets = np.random.randint(0, 5, 20)
|
||||
|
||||
def __len__(self):
|
||||
return 20
|
||||
|
||||
def __getitem__(self, idx):
|
||||
return Tensor(self.data[idx]), self.targets[idx]
|
||||
|
||||
dataset = ImageDataset()
|
||||
dataloader = DataLoader(dataset, batch_size=4)
|
||||
|
||||
# CNN components
|
||||
conv1 = Conv2D(in_channels=3, out_channels=16, kernel_size=3)
|
||||
pool = MaxPool2D(kernel_size=2)
|
||||
fc = Dense(16 * 15 * 15, 5) # Approximate after conv/pool
|
||||
|
||||
# Test CNN pipeline
|
||||
for batch_x, batch_y in dataloader:
|
||||
assert batch_x.shape == (4, 3, 32, 32), "Image batch shape broken"
|
||||
|
||||
# Simplified CNN forward (shape checking)
|
||||
if hasattr(conv1, '__call__'):
|
||||
conv_out = conv1(batch_x)
|
||||
# Check reasonable conv output shape
|
||||
assert len(conv_out.shape) == 4, "Conv output dimensionality broken"
|
||||
|
||||
break
|
||||
|
||||
except ImportError:
|
||||
assert True, "CNN data pipeline not ready yet"
|
||||
|
||||
|
||||
class TestRealWorldDataCapability:
|
||||
"""Test capability to handle real-world datasets."""
|
||||
|
||||
def test_data_preprocessing_pipeline(self):
|
||||
"""Test data preprocessing and augmentation."""
|
||||
try:
|
||||
from tinytorch.core.data import transforms
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Basic transforms
|
||||
if hasattr(transforms, 'Normalize'):
|
||||
normalize = transforms.Normalize(mean=[0.5], std=[0.5])
|
||||
|
||||
# Test data
|
||||
data = Tensor(np.random.randn(3, 32, 32))
|
||||
normalized = normalize(data)
|
||||
|
||||
assert normalized.shape == data.shape, "Normalization broken"
|
||||
|
||||
if hasattr(transforms, 'RandomCrop'):
|
||||
crop = transforms.RandomCrop(size=28)
|
||||
|
||||
data = Tensor(np.random.randn(3, 32, 32))
|
||||
cropped = crop(data)
|
||||
|
||||
assert cropped.shape[-2:] == (28, 28), "Random crop broken"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Data preprocessing not implemented yet"
|
||||
|
||||
def test_memory_efficient_loading(self):
|
||||
"""Test memory efficient data loading."""
|
||||
try:
|
||||
from tinytorch.core.data import DataLoader, Dataset
|
||||
|
||||
# Large dataset simulation
|
||||
class LargeDataset(Dataset):
|
||||
def __init__(self, size=1000):
|
||||
self.size = size
|
||||
# Don't load all data at once - simulate lazy loading
|
||||
|
||||
def __len__(self):
|
||||
return self.size
|
||||
|
||||
def __getitem__(self, idx):
|
||||
# Simulate loading data on-demand
|
||||
return np.random.randn(100), idx % 10
|
||||
|
||||
dataset = LargeDataset(1000)
|
||||
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
|
||||
|
||||
# Should be able to iterate without loading all data
|
||||
batch_count = 0
|
||||
for batch_x, batch_y in dataloader:
|
||||
batch_count += 1
|
||||
if batch_count >= 3: # Test a few batches
|
||||
break
|
||||
|
||||
assert batch_count == 3, "Memory efficient loading broken"
|
||||
|
||||
except ImportError:
|
||||
assert True, "Memory efficient loading not ready yet"
|
||||
|
||||
def test_parallel_data_loading(self):
|
||||
"""Test parallel/multi-threaded data loading."""
|
||||
try:
|
||||
from tinytorch.core.data import DataLoader, Dataset
|
||||
|
||||
class ParallelDataset(Dataset):
|
||||
def __init__(self):
|
||||
self.data = np.random.randn(100, 50)
|
||||
|
||||
def __len__(self):
|
||||
return 100
|
||||
|
||||
def __getitem__(self, idx):
|
||||
# Simulate some processing time
|
||||
return self.data[idx], idx % 5
|
||||
|
||||
dataset = ParallelDataset()
|
||||
|
||||
# Test with num_workers if supported
|
||||
if 'num_workers' in DataLoader.__init__.__code__.co_varnames:
|
||||
dataloader = DataLoader(dataset, batch_size=16, num_workers=2)
|
||||
else:
|
||||
dataloader = DataLoader(dataset, batch_size=16)
|
||||
|
||||
# Should work regardless of parallel support
|
||||
for batch_x, batch_y in dataloader:
|
||||
assert batch_x.shape == (16, 50), "Parallel loading broken"
|
||||
break
|
||||
|
||||
except ImportError:
|
||||
assert True, "Parallel data loading not ready yet"
|
||||
|
||||
|
||||
class TestRegressionPrevention:
|
||||
"""Ensure previous modules still work after Module 08 development."""
|
||||
|
||||
def test_no_foundation_regression(self):
|
||||
"""Verify foundation stack (01→05) unchanged."""
|
||||
# Core functionality should remain stable
|
||||
assert sys.version_info.major >= 3, "Foundation: Python detection broken"
|
||||
|
||||
# Tensor operations should still work
|
||||
try:
|
||||
from tinytorch.core.tensor import Tensor
|
||||
t = Tensor([1, 2, 3])
|
||||
assert t.shape == (3,), "Foundation regression: Tensor broken"
|
||||
except ImportError:
|
||||
import numpy as np
|
||||
arr = np.array([1, 2, 3])
|
||||
assert arr.shape == (3,), "Foundation regression: Numpy broken"
|
||||
|
||||
def test_no_advanced_regression(self):
|
||||
"""Verify advanced modules (06→07) unchanged."""
|
||||
try:
|
||||
from tinytorch.core.spatial import Conv2D
|
||||
from tinytorch.core.attention import MultiHeadAttention
|
||||
|
||||
# Advanced operations should still work
|
||||
conv = Conv2D(in_channels=1, out_channels=4, kernel_size=3)
|
||||
attention = MultiHeadAttention(embed_dim=32, num_heads=4)
|
||||
|
||||
assert hasattr(conv, 'forward'), "Advanced regression: Spatial broken"
|
||||
assert hasattr(attention, 'forward'), "Advanced regression: Attention broken"
|
||||
|
||||
except ImportError:
|
||||
# If not implemented, basic functionality should work
|
||||
import numpy as np
|
||||
assert np.random is not None, "Advanced regression: Random broken"
|
||||
|
||||
def test_progressive_stability(self):
|
||||
"""Test the progressive stack is stable through data loading."""
|
||||
# Stack should be stable through: Setup → ... → Attention → DataLoader
|
||||
|
||||
# Setup level
|
||||
import numpy as np
|
||||
assert np is not None, "Setup level broken"
|
||||
|
||||
# Foundation level (if available)
|
||||
try:
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Dense
|
||||
|
||||
# Neural networks should still work
|
||||
layer = Dense(5, 3)
|
||||
x = Tensor(np.random.randn(2, 5))
|
||||
output = layer(x)
|
||||
assert output.shape == (2, 3), "Foundation level broken"
|
||||
|
||||
except ImportError:
|
||||
pass # Not implemented yet
|
||||
|
||||
# Data level (if available)
|
||||
try:
|
||||
from tinytorch.core.data import Dataset
|
||||
|
||||
class TestDataset(Dataset):
|
||||
def __len__(self):
|
||||
return 10
|
||||
def __getitem__(self, idx):
|
||||
return idx, idx * 2
|
||||
|
||||
dataset = TestDataset()
|
||||
assert len(dataset) == 10, "Data level broken"
|
||||
|
||||
except ImportError:
|
||||
pass # Not implemented yet
|
||||
@@ -1,515 +0,0 @@
|
||||
"""
|
||||
Module 07 Training - Critical Integration Tests Template
|
||||
|
||||
This file contains the TOP 3 CRITICAL tests that MUST be implemented immediately
|
||||
to establish basic confidence that Module 07 (Training) works correctly.
|
||||
|
||||
These tests catch the most common and severe bugs in training systems.
|
||||
|
||||
PRIORITY: P0 - IMPLEMENT IMMEDIATELY
|
||||
ESTIMATED TIME: 2-3 hours
|
||||
BUG-CATCHING VALUE: CRITICAL
|
||||
"""
|
||||
|
||||
import pytest
|
||||
import numpy as np
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add project root to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
|
||||
|
||||
# Import from TinyTorch
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Linear
|
||||
from tinytorch.core.activations import ReLU
|
||||
from tinytorch.core.losses import MSELoss, CrossEntropyLoss
|
||||
from tinytorch.core.optimizers import SGD, AdamW
|
||||
from tinytorch.core.training import Trainer, CosineSchedule, clip_grad_norm
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# CRITICAL TEST 1: Missing zero_grad() Detection
|
||||
# =============================================================================
|
||||
# BUG-CATCHING VALUE: CRITICAL
|
||||
# COMMON STUDENT MISTAKE: Forgetting optimizer.zero_grad()
|
||||
# SYMPTOM: Training appears to run but gradients accumulate incorrectly
|
||||
# =============================================================================
|
||||
|
||||
class TestMissingZeroGrad:
|
||||
"""Test that missing zero_grad() is caught and causes visible failure."""
|
||||
|
||||
def test_zero_grad_required_for_correct_training(self):
|
||||
"""
|
||||
Test that zero_grad() is essential for correct gradient computation.
|
||||
|
||||
This test validates that:
|
||||
1. Without zero_grad(), gradients accumulate across batches
|
||||
2. Accumulated gradients cause incorrect parameter updates
|
||||
3. Training with accumulated gradients behaves differently than correct training
|
||||
"""
|
||||
# Create simple linear model: y = Wx + b
|
||||
layer_correct = Linear(1, 1)
|
||||
layer_broken = Linear(1, 1)
|
||||
|
||||
# Make weights identical to start
|
||||
layer_broken.weights.data = layer_correct.weights.data.copy()
|
||||
if hasattr(layer_correct, 'bias') and layer_correct.bias is not None:
|
||||
layer_broken.bias.data = layer_correct.bias.data.copy()
|
||||
|
||||
# Create optimizers
|
||||
optimizer_correct = SGD(layer_correct.parameters(), lr=0.1)
|
||||
optimizer_broken = SGD(layer_broken.parameters(), lr=0.1)
|
||||
|
||||
loss_fn = MSELoss()
|
||||
|
||||
# Training data: 5 identical samples
|
||||
x_data = Tensor([[1.0]])
|
||||
y_data = Tensor([[2.0]])
|
||||
|
||||
# === CORRECT TRAINING (with zero_grad) ===
|
||||
correct_grad_norms = []
|
||||
for step in range(5):
|
||||
optimizer_correct.zero_grad() # ✅ CRITICAL: Clear gradients
|
||||
|
||||
output = layer_correct.forward(x_data)
|
||||
loss = loss_fn.forward(output, y_data)
|
||||
loss.backward()
|
||||
|
||||
# Record gradient norm
|
||||
grad_norm = np.linalg.norm(layer_correct.weights.grad.data)
|
||||
correct_grad_norms.append(grad_norm)
|
||||
|
||||
optimizer_correct.step()
|
||||
|
||||
# === BROKEN TRAINING (without zero_grad) ===
|
||||
broken_grad_norms = []
|
||||
for step in range(5):
|
||||
# ❌ BUG: Missing optimizer_broken.zero_grad()
|
||||
|
||||
output = layer_broken.forward(x_data)
|
||||
loss = loss_fn.forward(output, y_data)
|
||||
loss.backward()
|
||||
|
||||
# Record gradient norm (should accumulate!)
|
||||
grad_norm = np.linalg.norm(layer_broken.weights.grad.data)
|
||||
broken_grad_norms.append(grad_norm)
|
||||
|
||||
optimizer_broken.step()
|
||||
|
||||
# === VALIDATION ===
|
||||
print("\n🔬 Testing zero_grad() requirement:")
|
||||
print(f"Correct gradient norms (with zero_grad): {correct_grad_norms}")
|
||||
print(f"Broken gradient norms (without zero_grad): {broken_grad_norms}")
|
||||
|
||||
# Test 1: Gradients should accumulate without zero_grad()
|
||||
assert broken_grad_norms[-1] > broken_grad_norms[0] * 2.0, \
|
||||
"Gradients should accumulate when zero_grad() is missing"
|
||||
|
||||
# Test 2: Correct gradients should be relatively stable
|
||||
correct_variation = max(correct_grad_norms) / (min(correct_grad_norms) + 1e-8)
|
||||
assert correct_variation < 5.0, \
|
||||
"Correct gradients shouldn't grow excessively"
|
||||
|
||||
# Test 3: Broken gradients grow much larger than correct ones
|
||||
assert broken_grad_norms[-1] > correct_grad_norms[-1] * 2.0, \
|
||||
"Missing zero_grad() should cause noticeably larger gradients"
|
||||
|
||||
print("✅ zero_grad() requirement correctly enforced!")
|
||||
|
||||
def test_trainer_calls_zero_grad(self):
|
||||
"""
|
||||
Test that Trainer class properly calls zero_grad() during training.
|
||||
|
||||
This validates the Trainer implementation includes the critical zero_grad() call.
|
||||
"""
|
||||
# Create simple model
|
||||
class SimpleModel:
|
||||
def __init__(self):
|
||||
self.layer = Linear(2, 1)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
return self.layer.forward(x)
|
||||
|
||||
def parameters(self):
|
||||
return self.layer.parameters()
|
||||
|
||||
model = SimpleModel()
|
||||
optimizer = SGD(model.parameters(), lr=0.01)
|
||||
loss_fn = MSELoss()
|
||||
trainer = Trainer(model, optimizer, loss_fn)
|
||||
|
||||
# Create simple dataset
|
||||
class SimpleDataset:
|
||||
def __iter__(self):
|
||||
for _ in range(3):
|
||||
x = Tensor(np.random.randn(2, 2))
|
||||
y = Tensor(np.random.randn(2, 1))
|
||||
yield x, y
|
||||
|
||||
# Train for 2 epochs
|
||||
for epoch in range(2):
|
||||
trainer.train_epoch(SimpleDataset())
|
||||
|
||||
# After training, gradients should be zeroed (from last zero_grad() call)
|
||||
# OR they should exist from last backward (depends on implementation)
|
||||
# Key test: Training should have called zero_grad() internally
|
||||
# (This is validated by training not diverging)
|
||||
|
||||
print("✅ Trainer correctly manages gradient clearing!")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# CRITICAL TEST 2: Loss Convergence Validation
|
||||
# =============================================================================
|
||||
# BUG-CATCHING VALUE: CRITICAL
|
||||
# PURPOSE: Validate entire training pipeline produces learning
|
||||
# SYMPTOM: Training runs but model doesn't improve
|
||||
# =============================================================================
|
||||
|
||||
class TestLossConvergence:
|
||||
"""Test that training actually produces learning on simple problems."""
|
||||
|
||||
def test_linear_regression_convergence(self):
|
||||
"""
|
||||
Test training converges on simple linear regression problem.
|
||||
|
||||
Problem: Learn y = 2x + 1
|
||||
Model: Linear(1, 1) with weights and bias
|
||||
Success criteria: Loss decreases, learned weights ≈ [2.0], bias ≈ [1.0]
|
||||
"""
|
||||
# Create model
|
||||
class LinearModel:
|
||||
def __init__(self):
|
||||
self.layer = Linear(1, 1)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
return self.layer.forward(x)
|
||||
|
||||
def parameters(self):
|
||||
return self.layer.parameters()
|
||||
|
||||
model = LinearModel()
|
||||
optimizer = SGD(model.parameters(), lr=0.01)
|
||||
loss_fn = MSELoss()
|
||||
trainer = Trainer(model, optimizer, loss_fn)
|
||||
|
||||
# Generate training data: y = 2x + 1
|
||||
np.random.seed(42)
|
||||
X_train = np.random.randn(100, 1).astype(np.float32)
|
||||
y_train = (2.0 * X_train + 1.0).astype(np.float32)
|
||||
|
||||
# Create dataset
|
||||
class RegressionDataset:
|
||||
def __init__(self, X, y, batch_size=10):
|
||||
self.X = X
|
||||
self.y = y
|
||||
self.batch_size = batch_size
|
||||
|
||||
def __iter__(self):
|
||||
indices = np.arange(len(self.X))
|
||||
np.random.shuffle(indices)
|
||||
for i in range(0, len(self.X), self.batch_size):
|
||||
batch_indices = indices[i:i+self.batch_size]
|
||||
yield Tensor(self.X[batch_indices]), Tensor(self.y[batch_indices])
|
||||
|
||||
dataset = RegressionDataset(X_train, y_train, batch_size=10)
|
||||
|
||||
# Train for 100 epochs
|
||||
print("\n🔬 Testing loss convergence on y = 2x + 1:")
|
||||
losses = []
|
||||
for epoch in range(100):
|
||||
loss = trainer.train_epoch(dataset)
|
||||
losses.append(loss)
|
||||
|
||||
if epoch % 20 == 0:
|
||||
print(f"Epoch {epoch:3d}: Loss = {loss:.6f}")
|
||||
|
||||
initial_loss = losses[0]
|
||||
final_loss = losses[-1]
|
||||
|
||||
print(f"\nInitial loss: {initial_loss:.6f}")
|
||||
print(f"Final loss: {final_loss:.6f}")
|
||||
print(f"Reduction: {(1 - final_loss/initial_loss)*100:.1f}%")
|
||||
|
||||
# Test 1: Loss should decrease significantly
|
||||
assert final_loss < initial_loss * 0.1, \
|
||||
f"Loss should decrease to < 10% of initial. Got {final_loss/initial_loss*100:.1f}%"
|
||||
|
||||
# Test 2: Loss should be near zero (good fit)
|
||||
assert final_loss < 0.1, \
|
||||
f"Final loss should be < 0.1 for simple problem. Got {final_loss:.6f}"
|
||||
|
||||
# Test 3: Learned weights should approximate true values
|
||||
learned_weight = model.layer.weights.data[0, 0]
|
||||
learned_bias = model.layer.bias.data[0] if model.layer.bias is not None else 0.0
|
||||
|
||||
print(f"\nTrue parameters: weight=2.0, bias=1.0")
|
||||
print(f"Learned parameters: weight={learned_weight:.3f}, bias={learned_bias:.3f}")
|
||||
|
||||
# Allow some tolerance for learning
|
||||
assert abs(learned_weight - 2.0) < 0.5, \
|
||||
f"Weight should be close to 2.0, got {learned_weight:.3f}"
|
||||
|
||||
if model.layer.bias is not None:
|
||||
assert abs(learned_bias - 1.0) < 0.5, \
|
||||
f"Bias should be close to 1.0, got {learned_bias:.3f}"
|
||||
|
||||
print("✅ Training successfully converged to correct solution!")
|
||||
|
||||
def test_classification_convergence(self):
|
||||
"""
|
||||
Test training converges on simple classification problem.
|
||||
|
||||
Problem: Learn XOR-like pattern with 2-layer network
|
||||
Success criteria: Loss decreases, accuracy improves
|
||||
"""
|
||||
# Create 2-layer model for XOR
|
||||
class XORModel:
|
||||
def __init__(self):
|
||||
self.layer1 = Linear(2, 4)
|
||||
self.relu = ReLU()
|
||||
self.layer2 = Linear(4, 2)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
x = self.layer1.forward(x)
|
||||
x = self.relu.forward(x)
|
||||
x = self.layer2.forward(x)
|
||||
return x
|
||||
|
||||
def parameters(self):
|
||||
return self.layer1.parameters() + self.layer2.parameters()
|
||||
|
||||
model = XORModel()
|
||||
optimizer = AdamW(model.parameters(), lr=0.01)
|
||||
loss_fn = CrossEntropyLoss()
|
||||
trainer = Trainer(model, optimizer, loss_fn)
|
||||
|
||||
# Generate XOR-like data
|
||||
np.random.seed(42)
|
||||
X_train = np.array([
|
||||
[0, 0], [0, 1], [1, 0], [1, 1],
|
||||
[0, 0], [0, 1], [1, 0], [1, 1],
|
||||
[0, 0], [0, 1], [1, 0], [1, 1],
|
||||
], dtype=np.float32)
|
||||
|
||||
y_train = np.array([0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0], dtype=np.int64)
|
||||
|
||||
# Create dataset
|
||||
class XORDataset:
|
||||
def __iter__(self):
|
||||
for i in range(len(X_train)):
|
||||
yield Tensor(X_train[i:i+1]), Tensor(y_train[i:i+1])
|
||||
|
||||
dataset = XORDataset()
|
||||
|
||||
# Train for 200 epochs
|
||||
print("\n🔬 Testing classification convergence on XOR pattern:")
|
||||
losses = []
|
||||
for epoch in range(200):
|
||||
loss = trainer.train_epoch(dataset)
|
||||
losses.append(loss)
|
||||
|
||||
if epoch % 40 == 0:
|
||||
print(f"Epoch {epoch:3d}: Loss = {loss:.6f}")
|
||||
|
||||
initial_loss = losses[0]
|
||||
final_loss = losses[-1]
|
||||
|
||||
print(f"\nInitial loss: {initial_loss:.6f}")
|
||||
print(f"Final loss: {final_loss:.6f}")
|
||||
print(f"Reduction: {(1 - final_loss/initial_loss)*100:.1f}%")
|
||||
|
||||
# Test: Loss should decrease significantly
|
||||
assert final_loss < initial_loss * 0.5, \
|
||||
f"Loss should decrease to < 50% of initial. Got {final_loss/initial_loss*100:.1f}%"
|
||||
|
||||
print("✅ Classification training successfully converged!")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# CRITICAL TEST 3: Scheduler Integration
|
||||
# =============================================================================
|
||||
# BUG-CATCHING VALUE: HIGH
|
||||
# COMMON BUG: Scheduler exists but doesn't actually update learning rate
|
||||
# SYMPTOM: Learning rate stays constant despite scheduler
|
||||
# =============================================================================
|
||||
|
||||
class TestSchedulerIntegration:
|
||||
"""Test that learning rate scheduler actually updates optimizer learning rate."""
|
||||
|
||||
def test_scheduler_updates_learning_rate(self):
|
||||
"""
|
||||
Test that CosineSchedule integrates with Trainer and updates LR each epoch.
|
||||
|
||||
This validates:
|
||||
1. Scheduler computes correct learning rates
|
||||
2. Trainer applies scheduler updates to optimizer
|
||||
3. Learning rate actually changes during training
|
||||
"""
|
||||
# Create simple model
|
||||
class SimpleModel:
|
||||
def __init__(self):
|
||||
self.layer = Linear(2, 1)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
return self.layer.forward(x)
|
||||
|
||||
def parameters(self):
|
||||
return self.layer.parameters()
|
||||
|
||||
model = SimpleModel()
|
||||
optimizer = SGD(model.parameters(), lr=0.1) # Initial LR (will be overridden)
|
||||
|
||||
# Create scheduler: 0.1 → 0.01 over 10 epochs
|
||||
scheduler = CosineSchedule(max_lr=0.1, min_lr=0.01, total_epochs=10)
|
||||
|
||||
loss_fn = MSELoss()
|
||||
trainer = Trainer(model, optimizer, loss_fn, scheduler=scheduler)
|
||||
|
||||
# Create simple dataset
|
||||
class SimpleDataset:
|
||||
def __iter__(self):
|
||||
for _ in range(5):
|
||||
x = Tensor(np.random.randn(4, 2))
|
||||
y = Tensor(np.random.randn(4, 1))
|
||||
yield x, y
|
||||
|
||||
print("\n🔬 Testing learning rate scheduling:")
|
||||
|
||||
# Train for 10 epochs and track learning rate
|
||||
learning_rates = []
|
||||
for epoch in range(10):
|
||||
# Record LR before training
|
||||
lr_before = optimizer.lr
|
||||
|
||||
# Train one epoch
|
||||
trainer.train_epoch(SimpleDataset())
|
||||
|
||||
# Record LR after training (scheduler should have updated it)
|
||||
lr_after = optimizer.lr
|
||||
learning_rates.append(lr_after)
|
||||
|
||||
print(f"Epoch {epoch}: LR = {lr_after:.6f}")
|
||||
|
||||
print(f"\nLearning rates: {[f'{lr:.4f}' for lr in learning_rates]}")
|
||||
|
||||
# Test 1: Learning rate should start at max_lr
|
||||
assert abs(learning_rates[0] - 0.1) < 1e-6, \
|
||||
f"Initial LR should be 0.1, got {learning_rates[0]:.6f}"
|
||||
|
||||
# Test 2: Learning rate should end at min_lr
|
||||
assert abs(learning_rates[-1] - 0.01) < 1e-6, \
|
||||
f"Final LR should be 0.01, got {learning_rates[-1]:.6f}"
|
||||
|
||||
# Test 3: Learning rate should decrease monotonically
|
||||
for i in range(len(learning_rates) - 1):
|
||||
assert learning_rates[i] >= learning_rates[i+1], \
|
||||
f"LR should decrease monotonically. Epoch {i}: {learning_rates[i]:.6f} > Epoch {i+1}: {learning_rates[i+1]:.6f}"
|
||||
|
||||
# Test 4: Learning rate should actually change (not stuck)
|
||||
unique_lrs = len(set([round(lr, 6) for lr in learning_rates]))
|
||||
assert unique_lrs >= 5, \
|
||||
f"LR should change across epochs. Only {unique_lrs} unique values found."
|
||||
|
||||
# Test 5: History should track learning rates
|
||||
assert len(trainer.history['learning_rates']) == 10, \
|
||||
"Trainer should record learning rate for each epoch"
|
||||
|
||||
print("✅ Learning rate scheduling works correctly!")
|
||||
|
||||
def test_training_without_scheduler(self):
|
||||
"""
|
||||
Test that training works correctly when scheduler=None.
|
||||
|
||||
This validates that scheduler is truly optional.
|
||||
"""
|
||||
# Create simple model
|
||||
class SimpleModel:
|
||||
def __init__(self):
|
||||
self.layer = Linear(1, 1)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
return self.layer.forward(x)
|
||||
|
||||
def parameters(self):
|
||||
return self.layer.parameters()
|
||||
|
||||
model = SimpleModel()
|
||||
optimizer = SGD(model.parameters(), lr=0.05)
|
||||
loss_fn = MSELoss()
|
||||
|
||||
# Create trainer WITHOUT scheduler
|
||||
trainer = Trainer(model, optimizer, loss_fn, scheduler=None)
|
||||
|
||||
# Create simple dataset
|
||||
class SimpleDataset:
|
||||
def __iter__(self):
|
||||
for _ in range(3):
|
||||
x = Tensor(np.random.randn(2, 1))
|
||||
y = Tensor(np.random.randn(2, 1))
|
||||
yield x, y
|
||||
|
||||
print("\n🔬 Testing training without scheduler:")
|
||||
|
||||
# Train for 5 epochs
|
||||
initial_lr = optimizer.lr
|
||||
for epoch in range(5):
|
||||
trainer.train_epoch(SimpleDataset())
|
||||
current_lr = optimizer.lr
|
||||
|
||||
print(f"Epoch {epoch}: LR = {current_lr:.6f}")
|
||||
|
||||
# Learning rate should stay constant
|
||||
assert abs(current_lr - initial_lr) < 1e-9, \
|
||||
f"LR should remain constant without scheduler. Expected {initial_lr}, got {current_lr}"
|
||||
|
||||
print("✅ Training without scheduler works correctly!")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Test Execution
|
||||
# =============================================================================
|
||||
|
||||
if __name__ == "__main__":
|
||||
print("=" * 70)
|
||||
print("Module 07 - CRITICAL Integration Tests")
|
||||
print("=" * 70)
|
||||
|
||||
# Test 1: Missing zero_grad()
|
||||
print("\n" + "=" * 70)
|
||||
print("TEST 1: Missing zero_grad() Detection")
|
||||
print("=" * 70)
|
||||
test_zero_grad = TestMissingZeroGrad()
|
||||
test_zero_grad.test_zero_grad_required_for_correct_training()
|
||||
test_zero_grad.test_trainer_calls_zero_grad()
|
||||
|
||||
# Test 2: Loss Convergence
|
||||
print("\n" + "=" * 70)
|
||||
print("TEST 2: Loss Convergence Validation")
|
||||
print("=" * 70)
|
||||
test_convergence = TestLossConvergence()
|
||||
test_convergence.test_linear_regression_convergence()
|
||||
test_convergence.test_classification_convergence()
|
||||
|
||||
# Test 3: Scheduler Integration
|
||||
print("\n" + "=" * 70)
|
||||
print("TEST 3: Scheduler Integration")
|
||||
print("=" * 70)
|
||||
test_scheduler = TestSchedulerIntegration()
|
||||
test_scheduler.test_scheduler_updates_learning_rate()
|
||||
test_scheduler.test_training_without_scheduler()
|
||||
|
||||
print("\n" + "=" * 70)
|
||||
print("ALL CRITICAL TESTS PASSED! ✅")
|
||||
print("=" * 70)
|
||||
print("\nModule 07 Training has passed critical integration validation.")
|
||||
print("These tests verify:")
|
||||
print(" ✅ Gradients are managed correctly (zero_grad)")
|
||||
print(" ✅ Training produces learning (convergence)")
|
||||
print(" ✅ Learning rate scheduling works (scheduler integration)")
|
||||
@@ -1,550 +0,0 @@
|
||||
# Module 07 (Training) - Integration Test Audit Report
|
||||
|
||||
**Date**: 2025-11-25
|
||||
**Auditor**: Dr. Sarah Rodriguez
|
||||
**Status**: CRITICAL GAPS IDENTIFIED - Test coverage is for Module 10 (Optimizers), not Module 07 (Training)
|
||||
|
||||
---
|
||||
|
||||
## CRITICAL FINDING: Wrong Module Being Tested
|
||||
|
||||
**ISSUE**: The file `/tests/07_training/test_progressive_integration.py` contains tests for **Module 10 (Optimizers)**, NOT Module 07 (Training).
|
||||
|
||||
**Evidence**:
|
||||
- Line 2: "Module 10: Progressive Integration Tests"
|
||||
- Line 3: "Tests that Module 10 (Optimizers) works correctly"
|
||||
- Line 5: "DEPENDENCY CHAIN: 01_setup → ... → 10_optimizers"
|
||||
- Line 6: "This is where we enable actual learning through gradient-based optimization."
|
||||
|
||||
**Impact**: Module 07 (Training) has NO progressive integration tests validating its core functionality.
|
||||
|
||||
---
|
||||
|
||||
## Module 07 Implementation Overview
|
||||
|
||||
Based on `/src/07_training/07_training.py`, Module 07 provides:
|
||||
|
||||
### Core Components Implemented:
|
||||
1. **CosineSchedule** - Learning rate scheduling with cosine annealing (see the sketch after this list)
|
||||
2. **clip_grad_norm()** - Global gradient norm clipping
|
||||
3. **Trainer class** - Complete training orchestration with:
|
||||
- `train_epoch()` - Training loop with gradient accumulation
|
||||
- `evaluate()` - Evaluation mode without gradients
|
||||
- `save_checkpoint()` / `load_checkpoint()` - State persistence
|
||||
- Train/eval mode switching
|
||||
- Learning rate scheduling integration
|
||||
- Gradient clipping integration
|
||||
- History tracking
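For reference, the cosine-annealing rule behind `CosineSchedule` is a one-liner. The sketch below is a minimal standalone version, assuming a `CosineSchedule(max_lr, min_lr, total_epochs)` constructor and a per-epoch `get_lr(epoch)` accessor (interface names are assumptions based on how the schedule is used later in this report, not the module's actual code):

```python
import math

class CosineScheduleSketch:
    """Illustrative cosine-annealing schedule: max_lr at epoch 0, min_lr at the last epoch."""

    def __init__(self, max_lr: float, min_lr: float, total_epochs: int):
        self.max_lr = max_lr
        self.min_lr = min_lr
        self.total_epochs = total_epochs

    def get_lr(self, epoch: int) -> float:
        # Map epoch -> progress in [0, 1], then follow half a cosine from max_lr down to min_lr.
        progress = min(epoch / max(self.total_epochs - 1, 1), 1.0)
        return self.min_lr + 0.5 * (self.max_lr - self.min_lr) * (1 + math.cos(math.pi * progress))

sched = CosineScheduleSketch(max_lr=0.1, min_lr=0.01, total_epochs=10)
lrs = [sched.get_lr(e) for e in range(10)]
assert abs(lrs[0] - 0.1) < 1e-9 and abs(lrs[-1] - 0.01) < 1e-9
assert all(a >= b for a, b in zip(lrs, lrs[1:]))  # monotonically non-increasing
```

This is exactly the shape the scheduler tests later in this report assert on: the learning rate starts at `max_lr`, decays smoothly, and ends at `min_lr`.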
|
||||
|
||||
### Integration Points (Modules 01-06):
|
||||
- Module 01: Tensor operations
|
||||
- Module 02: Activations (ReLU, Sigmoid)
|
||||
- Module 03: Layers (Linear)
|
||||
- Module 04: Losses (MSELoss, CrossEntropyLoss)
|
||||
- Module 05: Autograd (backward pass, gradients)
|
||||
- Module 06: Optimizers (SGD, AdamW)
|
||||
|
||||
---
|
||||
|
||||
## Current Test Coverage Analysis
|
||||
|
||||
### Existing Test Files:
|
||||
1. **test_progressive_integration.py** (498 lines)
|
||||
- **WRONG MODULE**: Tests Module 10 (Optimizers)
|
||||
- Tests SGD/Adam creation, parameter updates, gradient clipping
|
||||
- Does NOT test Trainer class or training loops
|
||||
|
||||
2. **test_autograd_integration.py** (213 lines)
|
||||
- Tests autograd integration with tensors, layers, activations
|
||||
- Validates backward pass, computation graphs
|
||||
- Does NOT test training-specific functionality
|
||||
|
||||
3. **test_tensor_autograd_integration.py** (348 lines)
|
||||
- Tests Variable wrapping of Tensors
|
||||
- Tests operations (add, multiply, relu, sigmoid)
|
||||
- Tests backward pass and gradient computation
|
||||
- Does NOT test training loops
|
||||
|
||||
### Coverage Summary:
|
||||
- **Autograd Integration**: ✅ Well covered (561 lines)
|
||||
- **Optimizer Integration**: ✅ Covered (in wrong file)
|
||||
- **Training Loop Integration**: ❌ **MISSING**
|
||||
- **Trainer Class Integration**: ❌ **MISSING**
|
||||
- **Learning Rate Scheduling**: ❌ **MISSING**
|
||||
- **Gradient Clipping**: ⚠️ Partial (optimizer tests only)
|
||||
- **Checkpointing**: ❌ **MISSING**
|
||||
- **Train/Eval Mode**: ❌ **MISSING**
|
||||
|
||||
---
|
||||
|
||||
## MISSING INTEGRATION TESTS - Critical Priorities
|
||||
|
||||
### Priority 1: Training Loop Core Functionality
|
||||
|
||||
#### Test 1.1: Complete Training Loop Integration
|
||||
**What to test**: End-to-end training loop through Trainer class
|
||||
```python
|
||||
class TestTrainerCoreIntegration:
|
||||
def test_complete_training_loop(self):
|
||||
"""Test complete training loop integrates all modules correctly."""
|
||||
# Components from all modules:
|
||||
# - Model: Linear layers (Module 03) + ReLU (Module 02)
|
||||
# - Loss: MSELoss or CrossEntropyLoss (Module 04)
|
||||
# - Optimizer: SGD or AdamW (Module 06)
|
||||
# - Trainer: Training orchestration (Module 07)
|
||||
|
||||
# Verify:
|
||||
# - Forward pass works
|
||||
# - Loss computation works
|
||||
# - Backward pass computes gradients
|
||||
# - Optimizer updates parameters
|
||||
# - Loss decreases over epochs
|
||||
```
|
||||
|
||||
**Why critical**: This is the PRIMARY integration point for Module 07. If this doesn't work, nothing else matters.
|
||||
|
||||
#### Test 1.2: Missing zero_grad() Detection
|
||||
**What to test**: Training fails catastrophically if zero_grad() is missing
|
||||
```python
|
||||
def test_missing_zero_grad_causes_gradient_accumulation(self):
|
||||
"""Test that forgetting zero_grad() causes incorrect gradient accumulation."""
|
||||
# Create trainer WITHOUT zero_grad() call
|
||||
# Run multiple training steps
|
||||
# Verify gradients accumulate incorrectly
|
||||
# Show loss diverges instead of converging
|
||||
```
|
||||
|
||||
**Why critical**: This is the #1 student mistake in training loops. Tests should catch it.
|
||||
|
||||
**Bug-catching value**: HIGH - Common error that silently breaks training
|
||||
|
||||
#### Test 1.3: Gradient Accumulation Pattern
|
||||
**What to test**: Gradient accumulation works correctly with accumulation_steps > 1
|
||||
```python
|
||||
def test_gradient_accumulation_correctness(self):
|
||||
"""Test gradient accumulation produces same results as larger batch."""
|
||||
# Train with batch_size=4, accumulation_steps=1
|
||||
# Train with batch_size=2, accumulation_steps=2
|
||||
# Verify final gradients are equivalent
|
||||
# Verify effective batch size is the same
|
||||
```
|
||||
|
||||
**Why critical**: Production pattern for memory-limited training. Must work correctly.
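To make the equivalence concrete: accumulating gradients over `accumulation_steps` micro-batches, while scaling each micro-batch loss (or gradient) by `1/accumulation_steps`, should reproduce the gradient of one step on the full batch. A minimal numpy-only sketch of that identity, deliberately independent of the TinyTorch API:

```python
import numpy as np

def grad_mse(w, x, y):
    # d/dw mean((x @ w - y)^2) = 2 * x.T @ (x @ w - y) / len(y)
    return 2.0 * x.T @ (x @ w - y) / len(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # one "large" batch of 4 samples
y = rng.normal(size=(4, 1))
w = rng.normal(size=(3, 1))

# (a) Gradient of a single step on the full batch of 4.
full_grad = grad_mse(w, x, y)

# (b) Two micro-batches of 2 with gradient accumulation.
accumulation_steps = 2
accum_grad = np.zeros_like(w)
for i in range(accumulation_steps):
    xb, yb = x[2 * i:2 * i + 2], y[2 * i:2 * i + 2]
    accum_grad += grad_mse(w, xb, yb) / accumulation_steps  # scale each micro-batch gradient

np.testing.assert_allclose(accum_grad, full_grad, rtol=1e-10)
```

The integration test is the same comparison run through the Trainer instead of by hand.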
|
||||
|
||||
---
|
||||
|
||||
### Priority 2: Train/Eval Mode Switching
|
||||
|
||||
#### Test 2.1: Mode Switching Affects Model Behavior
|
||||
**What to test**: model.training flag changes behavior correctly
|
||||
```python
|
||||
def test_train_eval_mode_switching(self):
|
||||
"""Test train/eval mode switching affects model behavior."""
|
||||
# Create model with dropout or batchnorm (future modules)
|
||||
# Run forward in training mode
|
||||
# Run forward in eval mode
|
||||
# Verify different outputs/behavior
|
||||
|
||||
# For Module 07: At minimum verify:
|
||||
# - Trainer sets model.training = True in train_epoch()
|
||||
# - Trainer sets model.training = False in evaluate()
|
||||
```
|
||||
|
||||
**Why critical**: Proper mode switching is essential for correct evaluation and inference.
|
||||
|
||||
**Bug-catching value**: MEDIUM - Subtle bug that causes incorrect evaluation metrics
|
||||
|
||||
#### Test 2.2: Gradients Disabled During Evaluation
|
||||
**What to test**: No gradients computed during evaluation
|
||||
```python
|
||||
def test_evaluation_disables_gradients(self):
|
||||
"""Test evaluation doesn't compute or accumulate gradients."""
|
||||
# Run evaluate() on test data
|
||||
# Verify no gradients are computed
|
||||
# Verify no parameter updates occur
|
||||
# Verify optimizer state unchanged
|
||||
```
|
||||
|
||||
**Why critical**: Evaluation should be faster and memory-efficient without gradients.
|
||||
|
||||
---
|
||||
|
||||
### Priority 3: Learning Rate Scheduling Integration
|
||||
|
||||
#### Test 3.1: Scheduler Updates Learning Rate
|
||||
**What to test**: Scheduler properly updates optimizer learning rate each epoch
|
||||
```python
|
||||
def test_scheduler_updates_learning_rate(self):
|
||||
"""Test learning rate scheduler integrates with training loop."""
|
||||
# Create CosineSchedule(max_lr=0.1, min_lr=0.01, total_epochs=10)
|
||||
# Create Trainer with scheduler
|
||||
# Train for 10 epochs
|
||||
# Verify optimizer.lr changes each epoch
|
||||
# Verify lr follows cosine schedule (decreasing)
|
||||
# Verify final lr ≈ min_lr
|
||||
```
|
||||
|
||||
**Why critical**: Scheduling is essential for training convergence. Must integrate correctly.
|
||||
|
||||
**Bug-catching value**: HIGH - Scheduler exists but doesn't actually update LR (common integration bug)
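The integration bug this test usually catches is a missing write-back: the trainer computes a new learning rate each epoch but never assigns it to the optimizer. A minimal sketch of the expected wiring, using stand-in classes (the `optimizer.lr` attribute and `scheduler.get_lr(epoch)` method are assumptions consistent with how they are used elsewhere in this report):

```python
class DummyOptimizer:
    def __init__(self, lr):
        self.lr = lr

class LinearDecay:
    """Stand-in scheduler; any schedule works, only the wiring matters here."""
    def __init__(self, start_lr, end_lr, total_epochs):
        self.start_lr, self.end_lr, self.total_epochs = start_lr, end_lr, total_epochs

    def get_lr(self, epoch):
        t = epoch / max(self.total_epochs - 1, 1)
        return self.start_lr + t * (self.end_lr - self.start_lr)

def train(optimizer, scheduler, epochs):
    seen_lrs = []
    for epoch in range(epochs):
        if scheduler is not None:
            # The write-back that is easy to forget: without this line the scheduler
            # "exists" but the optimizer keeps its initial learning rate forever.
            optimizer.lr = scheduler.get_lr(epoch)
        seen_lrs.append(optimizer.lr)
        # ... forward / backward / optimizer.step() for each batch would go here ...
    return seen_lrs

lrs = train(DummyOptimizer(lr=0.1), LinearDecay(0.1, 0.01, 10), epochs=10)
assert len(set(round(lr, 6) for lr in lrs)) == 10  # learning rate changed every epoch
```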
|
||||
|
||||
#### Test 3.2: Training Without Scheduler Still Works
|
||||
**What to test**: Scheduler is optional, training works without it
|
||||
```python
|
||||
def test_training_without_scheduler(self):
|
||||
"""Test training works with scheduler=None."""
|
||||
# Create Trainer with scheduler=None
|
||||
# Train for multiple epochs
|
||||
# Verify optimizer.lr stays constant
|
||||
# Verify training still works correctly
|
||||
```
|
||||
|
||||
**Why critical**: Ensures optional components are truly optional.
|
||||
|
||||
---
|
||||
|
||||
### Priority 4: Gradient Clipping Integration
|
||||
|
||||
#### Test 4.1: Gradient Clipping Prevents Explosion
|
||||
**What to test**: Gradient clipping rescales large gradients correctly
|
||||
```python
|
||||
def test_gradient_clipping_prevents_explosion(self):
|
||||
"""Test gradient clipping prevents exploding gradients."""
|
||||
# Create model with potential for large gradients
|
||||
# Set grad_clip_norm=1.0
|
||||
# Inject artificially large gradients
|
||||
# Train one step
|
||||
# Verify gradient norm ≤ clip threshold
|
||||
# Verify parameters update reasonably
|
||||
```
|
||||
|
||||
**Why critical**: Prevents training instability from exploding gradients.
|
||||
|
||||
**Bug-catching value**: HIGH - Clipping may be called but not actually applied
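For reference, "global norm" clipping computes a single L2 norm over all parameter gradients together and rescales every gradient by the same factor only when that norm exceeds the threshold. A minimal numpy sketch (the module's `clip_grad_norm()` signature may differ):

```python
import numpy as np

def clip_grad_norm_sketch(grads, max_norm):
    """Rescale a list of gradient arrays in place so their combined L2 norm is <= max_norm."""
    total_norm = float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-12)
        for g in grads:
            g *= scale
    return total_norm

# Large gradients are rescaled down to the threshold...
grads = [np.full((2, 2), 10.0), np.full((3,), 10.0)]
clip_grad_norm_sketch(grads, max_norm=1.0)
assert abs(np.sqrt(sum(np.sum(g ** 2) for g in grads)) - 1.0) < 1e-6

# ...while gradients already under the threshold are left untouched (Test 4.2 below).
small = [np.array([0.01, 0.02])]
clip_grad_norm_sketch(small, max_norm=10.0)
assert np.allclose(small[0], [0.01, 0.02])
```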
|
||||
|
||||
#### Test 4.2: Small Gradients Not Affected
|
||||
**What to test**: Gradient clipping doesn't affect small gradients
|
||||
```python
|
||||
def test_small_gradients_unchanged_by_clipping(self):
|
||||
"""Test gradient clipping doesn't modify small gradients."""
|
||||
# Create model with small gradients
|
||||
# Set grad_clip_norm=10.0 (high threshold)
|
||||
# Compute gradients
|
||||
# Verify gradients unchanged
|
||||
```
|
||||
|
||||
**Why critical**: Clipping should only activate when needed.
|
||||
|
||||
---
|
||||
|
||||
### Priority 5: Loss Convergence Validation
|
||||
|
||||
#### Test 5.1: Loss Decreases During Training
|
||||
**What to test**: Training actually improves model performance
|
||||
```python
|
||||
def test_loss_convergence_on_simple_problem(self):
|
||||
"""Test training reduces loss on simple learnable problem."""
|
||||
# Create simple linear regression problem: y = 2x + 1
|
||||
# Create model: Linear(1, 1)
|
||||
# Train for 100 epochs
|
||||
# Verify loss decreases monotonically (or mostly)
|
||||
# Verify final loss < initial loss * 0.1
|
||||
# Verify learned weights ≈ [2.0] and bias ≈ [1.0]
|
||||
```
|
||||
|
||||
**Why critical**: Validates entire training pipeline produces learning.
|
||||
|
||||
**Bug-catching value**: CRITICAL - Detects any component breaking learning
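A concrete version of the y = 2x + 1 check, written against plain numpy gradient descent so it does not presume any particular Trainer API; the real test would push the same data through the Module 07 Trainer and assert the same three things:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, size=(64, 1))
y = 2.0 * x + 1.0                      # ground truth: weight 2.0, bias 1.0

w, b, lr = 0.0, 0.0, 0.1
losses = []
for _ in range(100):
    err = (w * x + b) - y
    losses.append(float(np.mean(err ** 2)))
    w -= lr * float(np.mean(2 * err * x))  # dL/dw
    b -= lr * float(np.mean(2 * err))      # dL/db

assert losses[-1] < losses[0] * 0.1, "loss should drop by at least 10x"
assert abs(w - 2.0) < 0.1 and abs(b - 1.0) < 0.1, "should recover the generating parameters"
```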
|
||||
|
||||
#### Test 5.2: History Tracking Accuracy
|
||||
**What to test**: trainer.history correctly records training metrics
|
||||
```python
|
||||
def test_history_tracking(self):
|
||||
"""Test training history is tracked correctly."""
|
||||
# Train for 5 epochs
|
||||
# Verify len(trainer.history['train_loss']) == 5
|
||||
# Verify len(trainer.history['learning_rates']) == 5 (if scheduler used)
|
||||
# Verify values are reasonable (no NaN, no infinite)
|
||||
```
|
||||
|
||||
**Why critical**: Users rely on history for monitoring and debugging.
|
||||
|
||||
---
|
||||
|
||||
### Priority 6: Checkpointing and State Persistence
|
||||
|
||||
#### Test 6.1: Save and Load Checkpoint
|
||||
**What to test**: Training state can be saved and restored
|
||||
```python
|
||||
def test_save_load_checkpoint(self):
|
||||
"""Test checkpoint saving and loading preserves training state."""
|
||||
# Train for 5 epochs
|
||||
# Save checkpoint
|
||||
# Train for 5 more epochs
|
||||
# Record final state
|
||||
|
||||
# Create new trainer
|
||||
# Load checkpoint
|
||||
# Train for 5 epochs
|
||||
# Verify final state matches original
|
||||
```
|
||||
|
||||
**Why critical**: Essential for long training jobs and experimentation.
|
||||
|
||||
**Bug-catching value**: MEDIUM - Checkpoint may save but not restore correctly
|
||||
|
||||
#### Test 6.2: Checkpoint Contains Complete State
|
||||
**What to test**: Checkpoint includes all necessary components
|
||||
```python
|
||||
def test_checkpoint_completeness(self):
|
||||
"""Test checkpoint contains all training state components."""
|
||||
# Train for a few epochs
|
||||
# Save checkpoint
|
||||
# Load checkpoint dictionary
|
||||
# Verify contains:
|
||||
# - model state (weights, biases)
|
||||
# - optimizer state (momentum, velocity for Adam)
|
||||
# - scheduler state (current epoch)
|
||||
# - training metadata (epoch, step)
|
||||
```
|
||||
|
||||
**Why critical**: Incomplete checkpoints cause subtle resume errors.
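As a sketch of what "complete state" could look like, a checkpoint is just a nested dictionary that round-trips through serialization; the key names below are assumptions for illustration, not the module's actual format:

```python
import os
import pickle
import tempfile

# Hypothetical checkpoint layout; Trainer.save_checkpoint() may organize this differently.
checkpoint = {
    "model_state": {"layer1.weights": [[0.1, 0.2]], "layer1.bias": [0.0]},
    "optimizer_state": {"lr": 0.01, "momentum_buffers": {}},
    "scheduler_state": {"epoch": 5},
    "metadata": {"epoch": 5, "global_step": 1250},
}

path = os.path.join(tempfile.gettempdir(), "checkpoint_sketch.pkl")
with open(path, "wb") as f:
    pickle.dump(checkpoint, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

# The completeness test then reduces to key (and shape) checks like these:
for key in ("model_state", "optimizer_state", "scheduler_state", "metadata"):
    assert key in restored, f"checkpoint missing '{key}'"
assert restored["metadata"]["epoch"] == 5
```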
|
||||
|
||||
---
|
||||
|
||||
### Priority 7: Integration with Previous Modules
|
||||
|
||||
#### Test 7.1: Works with Different Layer Types
|
||||
**What to test**: Training works with various layer architectures
|
||||
```python
|
||||
def test_training_with_different_architectures(self):
|
||||
"""Test training works with different model architectures."""
|
||||
# Test 1: Single Linear layer
|
||||
# Test 2: Multi-layer perceptron (Linear + ReLU + Linear)
|
||||
# Test 3: Different activation functions
|
||||
# Verify all train successfully
|
||||
```
|
||||
|
||||
**Why critical**: Training should be architecture-agnostic.
|
||||
|
||||
#### Test 7.2: Works with Different Loss Functions
|
||||
**What to test**: Training works with MSE, CrossEntropy, etc.
|
||||
```python
|
||||
def test_training_with_different_losses(self):
|
||||
"""Test training works with different loss functions."""
|
||||
# Test 1: MSELoss for regression
|
||||
# Test 2: CrossEntropyLoss for classification
|
||||
# Verify both train correctly
|
||||
# Verify gradients flow properly
|
||||
```
|
||||
|
||||
**Why critical**: Training should support all loss types.
|
||||
|
||||
#### Test 7.3: Works with Different Optimizers
|
||||
**What to test**: Training works with SGD, AdamW, etc.
|
||||
```python
|
||||
def test_training_with_different_optimizers(self):
|
||||
"""Test training works with different optimizers."""
|
||||
# Test 1: SGD (simple, no momentum)
|
||||
# Test 2: AdamW (complex, with momentum and adaptive LR)
|
||||
# Verify both integrate correctly
|
||||
# Verify both produce learning
|
||||
```
|
||||
|
||||
**Why critical**: Training should be optimizer-agnostic.
|
||||
|
||||
---
|
||||
|
||||
## Test Organization Recommendations
|
||||
|
||||
### Suggested File Structure:
|
||||
|
||||
```
|
||||
tests/07_training/
|
||||
├── test_progressive_integration.py # FIX: Rename/move to tests/10_optimizers/
|
||||
├── test_trainer_core.py # NEW: Priority 1 tests
|
||||
├── test_trainer_modes.py # NEW: Priority 2 tests
|
||||
├── test_scheduler_integration.py # NEW: Priority 3 tests
|
||||
├── test_gradient_clipping.py # NEW: Priority 4 tests
|
||||
├── test_convergence.py # NEW: Priority 5 tests
|
||||
├── test_checkpointing.py # NEW: Priority 6 tests
|
||||
├── test_module_integration.py # NEW: Priority 7 tests
|
||||
├── test_autograd_integration.py # KEEP: Good coverage
|
||||
└── test_tensor_autograd_integration.py # KEEP: Good coverage
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Bug-Catching Priority Matrix
|
||||
|
||||
| Test Category | Bug-Catching Value | Student Impact | Priority |
|
||||
|--------------|-------------------|----------------|----------|
|
||||
| Missing zero_grad() | CRITICAL | High - Silent failure | P0 |
|
||||
| Loss convergence validation | CRITICAL | High - No learning | P0 |
|
||||
| Scheduler integration | HIGH | Medium - Poor convergence | P1 |
|
||||
| Gradient clipping | HIGH | Medium - Training instability | P1 |
|
||||
| Train/eval mode | MEDIUM | Medium - Wrong metrics | P2 |
|
||||
| Checkpoint save/load | MEDIUM | Low - Resume failures | P2 |
|
||||
| Gradient accumulation | MEDIUM | Low - Memory issues | P3 |
|
||||
|
||||
---
|
||||
|
||||
## Recommended Test Implementation Order
|
||||
|
||||
### Phase 1: Core Functionality (P0)
|
||||
1. ✅ Fix file organization (move optimizer tests to correct location)
|
||||
2. ✅ Test complete training loop integration
|
||||
3. ✅ Test missing zero_grad() detection
|
||||
4. ✅ Test loss convergence on simple problem
|
||||
|
||||
### Phase 2: Essential Features (P1)
|
||||
5. ✅ Test learning rate scheduling integration
|
||||
6. ✅ Test gradient clipping prevents explosion
|
||||
7. ✅ Test train/eval mode switching
|
||||
|
||||
### Phase 3: Production Features (P2)
|
||||
8. ✅ Test checkpoint save and load
|
||||
9. ✅ Test gradient accumulation correctness
|
||||
10. ✅ Test history tracking accuracy
|
||||
|
||||
### Phase 4: Robustness (P3)
|
||||
11. ✅ Test with different architectures
|
||||
12. ✅ Test with different loss functions
|
||||
13. ✅ Test with different optimizers
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
### Current State:
|
||||
- **Total test lines**: 1159 (but misplaced)
|
||||
- **Module 07 specific tests**: ~0 (all tests are for wrong module)
|
||||
- **Integration coverage**: 0% for training, 100% for autograd
|
||||
|
||||
### Required Action:
|
||||
1. **URGENT**: Rename/move `test_progressive_integration.py` to `tests/10_optimizers/`
|
||||
2. **URGENT**: Create new `test_trainer_core.py` with Priority 1 tests (P0)
|
||||
3. **HIGH**: Create Priority 2-3 test files (P1)
|
||||
4. **MEDIUM**: Create Priority 4-7 test files (P2-P3)
|
||||
|
||||
### Estimated Test Lines Needed:
|
||||
- **Minimum (P0-P1)**: ~400 lines for critical functionality
|
||||
- **Recommended (P0-P2)**: ~800 lines for production readiness
|
||||
- **Comprehensive (P0-P3)**: ~1200 lines for full coverage
|
||||
|
||||
### Critical Integration Points Missing Tests:
|
||||
1. ❌ Training loop orchestration
|
||||
2. ❌ zero_grad() requirement
|
||||
3. ❌ Learning rate scheduling
|
||||
4. ❌ Gradient clipping application
|
||||
5. ❌ Train/eval mode effects
|
||||
6. ❌ Loss convergence validation
|
||||
7. ❌ Checkpoint persistence
|
||||
|
||||
**Overall Assessment**: Module 07 has ZERO integration test coverage. All existing tests are for the wrong module (10) or test components (autograd) rather than the training loop itself.
|
||||
|
||||
**Risk Level**: 🔴 **CRITICAL** - Module 07 could be completely broken and tests would pass.
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Test Template Examples
|
||||
|
||||
### Template: Complete Training Loop Test
|
||||
```python
|
||||
class TestTrainerCoreIntegration:
|
||||
"""Test Trainer class integrates all modules correctly."""
|
||||
|
||||
def test_complete_training_loop(self):
|
||||
"""Test end-to-end training with all components."""
|
||||
import numpy as np  # needed for np.random below
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Linear
|
||||
from tinytorch.core.activations import ReLU
|
||||
from tinytorch.core.losses import MSELoss
|
||||
from tinytorch.core.optimizers import SGD
|
||||
from tinytorch.core.training import Trainer
|
||||
|
||||
# Create simple model
|
||||
class SimpleModel:
|
||||
def __init__(self):
|
||||
self.layer1 = Linear(2, 4)
|
||||
self.relu = ReLU()
|
||||
self.layer2 = Linear(4, 1)
|
||||
self.training = True
|
||||
|
||||
def forward(self, x):
|
||||
x = self.layer1(x)
|
||||
x = self.relu(x)
|
||||
x = self.layer2(x)
|
||||
return x
|
||||
|
||||
def parameters(self):
|
||||
return self.layer1.parameters() + self.layer2.parameters()
|
||||
|
||||
# Create components
|
||||
model = SimpleModel()
|
||||
optimizer = SGD(model.parameters(), lr=0.01)
|
||||
loss_fn = MSELoss()
|
||||
trainer = Trainer(model, optimizer, loss_fn)
|
||||
|
||||
# Create simple dataset: y = x1 + x2
|
||||
class SimpleDataset:
|
||||
def __iter__(self):
|
||||
for _ in range(10): # 10 batches
|
||||
x = Tensor(np.random.randn(4, 2))
|
||||
y = Tensor(x.data[:, 0:1] + x.data[:, 1:2])
|
||||
yield x, y
|
||||
|
||||
# Train for 5 epochs
|
||||
initial_loss = None
|
||||
for epoch in range(5):
|
||||
loss = trainer.train_epoch(SimpleDataset())
|
||||
if initial_loss is None:
|
||||
initial_loss = loss
|
||||
|
||||
# Verify training worked
|
||||
assert loss < initial_loss * 0.8, "Loss should decrease significantly"
|
||||
assert len(trainer.history['train_loss']) == 5
|
||||
assert trainer.epoch == 5
|
||||
```
|
||||
|
||||
### Template: Missing zero_grad() Test
|
||||
```python
|
||||
def test_missing_zero_grad_breaks_training(self):
|
||||
"""Test that forgetting zero_grad() causes gradient accumulation."""
|
||||
import numpy as np  # needed for the gradient-norm checks below
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Linear
|
||||
from tinytorch.core.losses import MSELoss
|
||||
from tinytorch.core.optimizers import SGD
|
||||
|
||||
# Create model and optimizer
|
||||
layer = Linear(1, 1)
|
||||
optimizer = SGD(layer.parameters(), lr=0.1)
|
||||
loss_fn = MSELoss()
|
||||
|
||||
# Manual training loop WITHOUT zero_grad()
|
||||
x = Tensor([[1.0]])
|
||||
y = Tensor([[2.0]])
|
||||
|
||||
# First step
|
||||
out1 = layer.forward(x)
|
||||
loss1 = loss_fn.forward(out1, y)
|
||||
loss1.backward()
|
||||
grad1 = layer.weights.grad.data.copy()
|
||||
optimizer.step()
|
||||
# FORGOT: optimizer.zero_grad() ← BUG
|
||||
|
||||
# Second step
|
||||
out2 = layer.forward(x)
|
||||
loss2 = loss_fn.forward(out2, y)
|
||||
loss2.backward()
|
||||
grad2 = layer.weights.grad.data.copy()
|
||||
|
||||
# Verify gradients accumulated incorrectly
|
||||
# grad2 should be ~2x grad1 because gradients accumulated
|
||||
assert np.linalg.norm(grad2) > np.linalg.norm(grad1) * 1.5, \
|
||||
"Gradients should accumulate when zero_grad() is missing"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**End of Audit Report**
|
||||
@@ -1,151 +0,0 @@
|
||||
# Module 07 Integration Test Audit - Quick Reference
|
||||
|
||||
## TL;DR
|
||||
|
||||
**Status**: 🔴 CRITICAL - Module 07 has 0% integration test coverage
|
||||
|
||||
**Problem**: Test file tests wrong module (Module 10 instead of Module 07)
|
||||
|
||||
**Impact**: Training loop could be completely broken and tests would pass
|
||||
|
||||
---
|
||||
|
||||
## What to Read
|
||||
|
||||
1. **Executive Summary** (2 min): `AUDIT_SUMMARY.md`
|
||||
- Critical findings
|
||||
- Top 3 missing tests
|
||||
- Action items
|
||||
|
||||
2. **Full Audit Report** (10 min): `INTEGRATION_TEST_AUDIT.md`
|
||||
- Complete coverage analysis
|
||||
- All missing tests (Priorities 0-3)
|
||||
- Implementation templates
|
||||
|
||||
3. **Critical Tests** (code): `CRITICAL_TESTS_TEMPLATE.py`
|
||||
- Top 3 bug-catching tests (ready to run)
|
||||
- ~400 lines of working test code
|
||||
- Immediate implementation guide
|
||||
|
||||
---
|
||||
|
||||
## Critical Integration Points
|
||||
|
||||
| Integration Point | Current Coverage | Priority |
|
||||
|------------------|------------------|----------|
|
||||
| Training loop orchestration | ❌ 0% | P0 - CRITICAL |
|
||||
| zero_grad() requirement | ❌ 0% | P0 - CRITICAL |
|
||||
| Loss convergence | ❌ 0% | P0 - CRITICAL |
|
||||
| Learning rate scheduling | ❌ 0% | P1 - HIGH |
|
||||
| Gradient clipping | ⚠️ 20% | P1 - HIGH |
|
||||
| Train/eval mode | ❌ 0% | P1 - HIGH |
|
||||
| Checkpointing | ❌ 0% | P2 - MEDIUM |
|
||||
| Gradient accumulation | ❌ 0% | P2 - MEDIUM |
|
||||
|
||||
---
|
||||
|
||||
## Immediate Actions Required
|
||||
|
||||
### 1. Fix File Organization (5 min)
|
||||
```bash
|
||||
# Move misplaced test file to correct module
|
||||
mv tests/07_training/test_progressive_integration.py \
|
||||
tests/10_optimizers/test_progressive_integration.py
|
||||
```
|
||||
|
||||
### 2. Run Critical Tests (30 min)
|
||||
```bash
|
||||
# Test the 3 most critical integration points
|
||||
cd tests/07_training
|
||||
pytest CRITICAL_TESTS_TEMPLATE.py -v
|
||||
|
||||
# Expected: Some tests may FAIL (catching real bugs!)
|
||||
```
|
||||
|
||||
### 3. Create Real Test File (2 hours)
|
||||
```bash
|
||||
# Use template as basis for permanent test file
|
||||
cp CRITICAL_TESTS_TEMPLATE.py test_trainer_core.py
|
||||
|
||||
# Integrate with TinyTorch test suite
|
||||
# Add to CI/CD pipeline
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Test Implementation Priority
|
||||
|
||||
**Phase 1: P0 Tests (~210 lines, CRITICAL)**
|
||||
- Missing zero_grad() detection
|
||||
- Loss convergence validation
|
||||
- Complete training loop integration
|
||||
|
||||
**Phase 2: P1 Tests (~160 lines, HIGH)**
|
||||
- Learning rate scheduling
|
||||
- Gradient clipping
|
||||
- Train/eval mode switching
|
||||
|
||||
**Phase 3: P2 Tests (~180 lines, MEDIUM)**
|
||||
- Checkpoint save/load
|
||||
- Gradient accumulation
|
||||
- History tracking
|
||||
|
||||
---
|
||||
|
||||
## Expected Test Results
|
||||
|
||||
### If All Components Work:
|
||||
```
|
||||
✅ zero_grad() requirement correctly enforced
|
||||
✅ Training successfully converged to correct solution
|
||||
✅ Learning rate scheduling works correctly
|
||||
```
|
||||
|
||||
### If Bugs Exist (likely):
|
||||
```
|
||||
❌ Gradients accumulate without zero_grad() but training still "works"
|
||||
→ BUG: Missing zero_grad() in training loop
|
||||
|
||||
❌ Loss doesn't decrease after 100 epochs
|
||||
→ BUG: Complete pipeline failure (check backward pass, optimizer)
|
||||
|
||||
❌ Learning rate stays constant at 0.1
|
||||
→ BUG: Scheduler not integrated (called but LR not updated)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Files Created by This Audit
|
||||
|
||||
1. `AUDIT_SUMMARY.md` - Executive summary
|
||||
2. `INTEGRATION_TEST_AUDIT.md` - Full audit report
|
||||
3. `CRITICAL_TESTS_TEMPLATE.py` - Top 3 tests (ready to run)
|
||||
4. `README_AUDIT.md` - This quick reference
|
||||
|
||||
---
|
||||
|
||||
## Questions to Answer
|
||||
|
||||
**Q: Why is this marked CRITICAL?**
|
||||
A: Module 07 is where ALL previous modules integrate. If training doesn't work, nothing works. Zero test coverage means complete integration could be broken.
|
||||
|
||||
**Q: How do we know tests are missing?**
|
||||
A: Current test file (`test_progressive_integration.py`) has wrong header ("Module 10") and tests optimizers, not training loops.
|
||||
|
||||
**Q: What's the quickest way to establish confidence?**
|
||||
A: Run `CRITICAL_TESTS_TEMPLATE.py`. If those 3 tests pass, core functionality works. If they fail, we found critical bugs.
|
||||
|
||||
**Q: How much work to fix?**
|
||||
A: Minimum (P0): ~210 lines, 2-3 hours. Recommended (P0+P1): ~370 lines, 1 day.
|
||||
|
||||
---
|
||||
|
||||
## Contact
|
||||
|
||||
For questions about this audit, see:
|
||||
- Full report: `INTEGRATION_TEST_AUDIT.md`
|
||||
- Test templates: `CRITICAL_TESTS_TEMPLATE.py`
|
||||
- Module implementation: `/src/07_training/07_training.py`
|
||||
|
||||
**Audit Date**: 2025-11-25
|
||||
**Status**: CRITICAL - Immediate action required
|
||||
@@ -1,210 +0,0 @@
|
||||
╔═══════════════════════════════════════════════════════════════════════════════╗
|
||||
║ MODULE 08 INTEGRATION TEST AUDIT SUMMARY ║
|
||||
╚═══════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
🚨 CRITICAL BUG FOUND 🚨
|
||||
┌───────────────────────────────────────────────────────────────────────────────┐
|
||||
│ File Location: tests/08_dataloader/test_progressive_integration.py │
|
||||
│ Expected Module: Module 08 (DataLoader) │
|
||||
│ Actual Module: Module 09 (Autograd) ❌ │
|
||||
│ │
|
||||
│ IMPACT: Module 08 has ZERO integration tests currently! │
|
||||
└───────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
📊 CURRENT TEST COVERAGE ANALYSIS
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
Current Tests (ALL WRONG MODULE):
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ ✗ TestCompleteMLPipelineStillWorks │
|
||||
│ └─ Tests Module 09 regression, not Module 08 │
|
||||
│ │
|
||||
│ ✗ TestModule09AutogradCore │
|
||||
│ ├─ test_variable_wrapper_exists │
|
||||
│ ├─ test_gradient_computation │
|
||||
│ └─ test_computation_graph_building │
|
||||
│ │
|
||||
│ ✗ TestAutogradIntegration │
|
||||
│ ├─ test_autograd_with_layers │
|
||||
│ ├─ test_autograd_with_spatial_operations │
|
||||
│ └─ test_autograd_with_attention │
|
||||
│ │
|
||||
│ ✗ TestGradientBasedLearningFoundation │
|
||||
│ ├─ test_parameter_gradient_computation │
|
||||
│ ├─ test_loss_function_gradients │
|
||||
│ └─ test_optimization_readiness │
|
||||
│ │
|
||||
│ ✗ TestModule09Completion │
|
||||
│ └─ test_autograd_foundation_complete │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
|
||||
Module 08 Coverage: 0/7 critical integration points tested ❌
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
🎯 MISSING MODULE 08 INTEGRATION TESTS
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
🔴 CRITICAL PRIORITY (Must Have):
|
||||
|
||||
1. DataLoader + Training Loop Integration ⚠️
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Batches work with model forward pass │
|
||||
│ Risk: Students can't train models │
|
||||
│ Catches: Shape mismatches, iteration bugs │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
2. Shuffling Consistency Across Epochs ⚠️
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Data shuffles properly each epoch │
|
||||
│ Risk: Training may not converge │
|
||||
│ Catches: Randomization bugs, duplicate samples │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
3. Batch Size Memory Scaling ⚠️
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Memory usage scales with batch size │
|
||||
│ Risk: OOM errors, poor performance │
|
||||
│ Catches: Memory issues, batch handling bugs │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
🟡 HIGH PRIORITY (Very Important):
|
||||
|
||||
4. Tensor Dtype Compatibility
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: DataLoader tensors match model expectations │
|
||||
│ Risk: Type errors during training │
|
||||
│ Catches: Dtype mismatches, conversion errors │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
5. DataLoader + Loss Function Integration
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Batched predictions work with loss functions │
|
||||
│ Risk: Loss computation fails │
|
||||
│ Catches: Shape errors, reduction bugs │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
🟢 MEDIUM PRIORITY (Should Have):
|
||||
|
||||
6. Empty/Single Sample Edge Cases
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Graceful handling of unusual datasets │
|
||||
│ Risk: Crashes on edge cases │
|
||||
│ Catches: Division by zero, empty iteration │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
7. Multi-Epoch Iteration Stability
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ Tests: Multiple epochs work reliably │
|
||||
│ Risk: Multi-epoch training fails │
|
||||
│ Catches: Memory leaks, iteration bugs │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
🔗 MODULE 08 INTEGRATION POINTS
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
Dependencies (What Module 08 Uses):
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Module 01 (Tensor) ────→ Core data structure │
|
||||
│ Module 03 (Layers) ────→ Batches passed to layers │
|
||||
│ Module 04 (Losses) ────→ Batch predictions → loss │
|
||||
│ Module 05 (Autograd) ──→ Batches in gradient tracking │
|
||||
│ Module 06 (Optimizers) → Batches drive updates │
|
||||
│ Module 07 (Training) ──→ DataLoader in training loop │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
Enables (What Uses Module 08):
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Module 07 (Training) → Training loop iteration │
|
||||
│ Module 09 (Spatial) ──→ Batched image data for CNNs │
|
||||
│ Module 10 (Text) ─────→ Batched text/token data │
|
||||
│ All Future Modules ───→ Any batch processing │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
🛠️ RECOMMENDED ACTION PLAN
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
Step 1: Fix File Location ⚠️ IMMEDIATE
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Move current file to correct location: │
|
||||
│ │
|
||||
│ FROM: tests/08_dataloader/test_progressive_*.py │
|
||||
│ TO: tests/09_autograd/test_progressive_*.py │
|
||||
│ │
|
||||
│ Reason: Current tests are for Module 09, not 08 │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
Step 2: Create New Module 08 Tests
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Create proper test_progressive_integration.py for: │
|
||||
│ - Dataset abstract class │
|
||||
│ - TensorDataset implementation │
|
||||
│ - DataLoader batching and shuffling │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
Step 3: Implement Critical Tests First
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Priority Order: │
|
||||
│ 1. DataLoader + Training Loop Integration │
|
||||
│ 2. Shuffling Consistency │
|
||||
│ 3. Batch Size Memory Scaling │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
Step 4: Validate Student Workflows
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Ensure tests catch real student issues: │
|
||||
│ - Can they create datasets? │
|
||||
│ - Can they iterate batches? │
|
||||
│ - Can they train models end-to-end? │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
📈 IMPACT ASSESSMENT
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
Current State:
|
||||
┌────────────────────────────────────────────┐
|
||||
│ Module 08 Integration Coverage: 0% │
|
||||
│ Critical Bug Risk: VERY HIGH │
|
||||
│ Student Success Risk: VERY HIGH │
|
||||
└────────────────────────────────────────────┘
|
||||
|
||||
After Implementing Recommended Tests:
|
||||
┌────────────────────────────────────────────┐
|
||||
│ Module 08 Integration Coverage: 100% │
|
||||
│ Critical Bug Risk: LOW │
|
||||
│ Student Success Risk: LOW │
|
||||
└────────────────────────────────────────────┘
|
||||
|
||||
Bugs Caught by New Tests:
|
||||
✓ Training loop integration failures
|
||||
✓ Shuffling and randomization bugs
|
||||
✓ Memory allocation issues
|
||||
✓ Dtype mismatches
|
||||
✓ Loss function integration errors
|
||||
✓ Edge case crashes
|
||||
✓ Multi-epoch stability issues
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
🎓 STUDENT IMPACT
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
Without Module 08 Tests:
|
||||
❌ Students can implement DataLoader but can't verify it works
|
||||
❌ Training loop failures discovered during later modules
|
||||
❌ Confusing errors with no clear debugging path
|
||||
❌ Wasted time on issues that tests should catch
|
||||
❌ Poor understanding of batch processing trade-offs
|
||||
|
||||
With Module 08 Tests:
|
||||
✅ Students verify DataLoader works immediately
|
||||
✅ Integration issues caught at Module 08 boundary
|
||||
✅ Clear error messages guide debugging
|
||||
✅ Confidence to proceed to next modules
|
||||
✅ Deep understanding of batch processing mechanics
|
||||
|
||||
═══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
For detailed analysis, see: INTEGRATION_TEST_AUDIT.md
|
||||
@@ -1,361 +0,0 @@
|
||||
# Module 08 (DataLoader) Integration Test Audit
|
||||
|
||||
## CRITICAL BUG IDENTIFIED
|
||||
|
||||
**File**: `/Users/VJ/GitHub/TinyTorch/tests/08_dataloader/test_progressive_integration.py`
|
||||
**Issue**: Tests Module 09 (Autograd) instead of Module 08 (DataLoader)
|
||||
|
||||
### Current Status
|
||||
|
||||
The test file header claims to test Module 08 but actually tests:
|
||||
```python
|
||||
"""
|
||||
Module 08: Progressive Integration Tests
|
||||
Tests that Module 09 (Autograd) works correctly AND that the entire prior stack (01→08) still works.
|
||||
```
|
||||
|
||||
**This is WRONG.** The file is in `tests/08_dataloader/` but tests Module 09 functionality.
|
||||
|
||||
---
|
||||
|
||||
## What Tests Currently Exist
|
||||
|
||||
### Current Tests (Module 09 - Autograd, WRONG MODULE)
|
||||
|
||||
1. **TestCompleteMLPipelineStillWorks**
|
||||
- `test_end_to_end_ml_pipeline_stable()` - Full CNN pipeline
|
||||
- `test_attention_and_spatial_integration_stable()` - Advanced architectures
|
||||
|
||||
2. **TestModule09AutogradCore** (WRONG - testing future module!)
|
||||
- `test_variable_wrapper_exists()` - Variable class
|
||||
- `test_gradient_computation()` - Backward pass
|
||||
- `test_computation_graph_building()` - Computation graph
|
||||
|
||||
3. **TestAutogradIntegration** (WRONG - testing future module!)
|
||||
- `test_autograd_with_layers()` - Gradients through Dense layers
|
||||
- `test_autograd_with_spatial_operations()` - CNN gradients
|
||||
- `test_autograd_with_attention()` - Transformer gradients
|
||||
|
||||
4. **TestGradientBasedLearningFoundation** (WRONG - testing future module!)
|
||||
- `test_parameter_gradient_computation()` - Parameter gradients
|
||||
- `test_loss_function_gradients()` - Loss gradients
|
||||
- `test_optimization_readiness()` - Optimizer foundation
|
||||
|
||||
5. **TestModule09Completion** (WRONG - testing future module!)
|
||||
- `test_autograd_foundation_complete()` - Complete autograd validation
|
||||
|
||||
---
|
||||
|
||||
## What Module 08 Tests SHOULD Exist
|
||||
|
||||
### Module 08 Scope: DataLoader (Data Pipeline)
|
||||
|
||||
**Implementation Location**: `tinytorch/data/loader.py`
|
||||
|
||||
**Core Components**:
|
||||
- `Dataset` - Abstract base class
|
||||
- `TensorDataset` - Tensor wrapper dataset
|
||||
- `DataLoader` - Batching and shuffling
|
||||
|
||||
### Missing Integration Tests for Module 08
|
||||
|
||||
#### 1. **DataLoader + Training Loop Integration** ⚠️ CRITICAL
|
||||
**Why**: Students need to verify DataLoader works with training loops
|
||||
|
||||
```python
|
||||
def test_dataloader_training_loop_integration():
|
||||
"""
|
||||
Test DataLoader provides batches correctly for training.
|
||||
|
||||
Integration Points:
|
||||
- DataLoader batches → Model forward pass
|
||||
- Batch tensors → Loss computation
|
||||
- Multi-epoch iteration
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- DataLoader provides correct batch shapes
|
||||
- Batches work with model forward pass
|
||||
- Multiple epochs iterate correctly
|
||||
- Training loop can consume all batches
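The heart of this test is just the consumption loop below. The sketch uses a hand-rolled batching generator rather than the real DataLoader, so it only illustrates the checks; the actual test would swap in `DataLoader(dataset, batch_size=...)`:

```python
import numpy as np

def iterate_batches(x, y, batch_size):
    # Stand-in for DataLoader iteration: yield (features, labels) slices in order.
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4)).astype(np.float32)
y = rng.integers(0, 3, size=(10,))

seen = 0
for xb, yb in iterate_batches(x, y, batch_size=4):
    assert xb.shape[1] == 4 and xb.shape[0] == yb.shape[0]  # batch shapes line up for forward()
    seen += xb.shape[0]
assert seen == len(x)  # the loop consumed every sample exactly once
```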
|
||||
|
||||
|
||||
#### 2. **Shuffling Consistency** ⚠️ CRITICAL
|
||||
**Why**: Critical for training stability and reproducibility
|
||||
|
||||
```python
|
||||
def test_dataloader_shuffling_consistency():
|
||||
"""
|
||||
Test shuffling behavior across epochs.
|
||||
|
||||
Integration Points:
|
||||
- Same data, different order each epoch
|
||||
- Reproducibility with random seed
|
||||
- All samples seen exactly once per epoch
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Shuffle=True changes order between epochs
|
||||
- Shuffle=False maintains order
|
||||
- All samples appear exactly once per epoch
|
||||
- Random seed controls shuffling
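The shuffling checks above boil down to comparing per-epoch orderings: the same multiset of samples every epoch, a different order when `shuffle=True`, and an identical order when `shuffle=False`. A numpy-only sketch of that comparison (a stand-in for the real DataLoader):

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)

def epoch_order(shuffle):
    idx = np.arange(len(data))
    if shuffle:
        rng.shuffle(idx)  # a fresh permutation each call, like a new epoch
    return [int(data[i]) for i in idx]

epoch1, epoch2 = epoch_order(shuffle=True), epoch_order(shuffle=True)
assert sorted(epoch1) == sorted(epoch2) == list(range(10))  # every sample seen exactly once
assert epoch1 != epoch2  # order differs between epochs (with overwhelming probability)

fixed1, fixed2 = epoch_order(shuffle=False), epoch_order(shuffle=False)
assert fixed1 == fixed2 == list(range(10))  # shuffle=False preserves dataset order
```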
|
||||
|
||||
|
||||
#### 3. **Batch Size Memory Scaling** ⚠️ CRITICAL
|
||||
**Why**: Students need to understand batch size impact on memory
|
||||
|
||||
```python
|
||||
def test_batch_size_memory_scaling():
|
||||
"""
|
||||
Test memory usage scales with batch size.
|
||||
|
||||
Systems Analysis:
|
||||
- Small batches (4): Low memory, more iterations
|
||||
- Medium batches (32): Balanced
|
||||
- Large batches (128): High memory, fewer iterations
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Small batch sizes work correctly
|
||||
- Large batch sizes work correctly
|
||||
- Total samples = batches * batch_size (approximately)
|
||||
- Last batch handles remainder correctly
|
||||
|
||||
|
||||
#### 4. **Tensor Dtype Compatibility** ⚠️ HIGH PRIORITY
|
||||
**Why**: DataLoader tensors must match model expectations
|
||||
|
||||
```python
|
||||
def test_dataloader_tensor_dtype_compatibility():
|
||||
"""
|
||||
Test DataLoader outputs match model input expectations.
|
||||
|
||||
Integration Points:
|
||||
- DataLoader tensors → Model layers
|
||||
- Feature dtype (float32)
|
||||
- Label dtype (int64 for classification, float32 for regression)
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Features are float32 tensors
|
||||
- Labels have correct dtype
|
||||
- Shapes match model input requirements
|
||||
- No dtype conversion errors during training
|
||||
|
||||
|
||||
#### 5. **DataLoader + Loss Function Integration** ⚠️ HIGH PRIORITY
|
||||
**Why**: Batches must work with loss computation
|
||||
|
||||
```python
|
||||
def test_dataloader_loss_integration():
|
||||
"""
|
||||
Test DataLoader batches work with loss functions.
|
||||
|
||||
Integration Points:
|
||||
- Batch predictions → Loss computation
|
||||
- Batch labels → Loss targets
|
||||
- Reduction across batch dimension
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Batched predictions work with MSE loss
|
||||
- Batched predictions work with CrossEntropy loss
|
||||
- Loss reduction handles batch dimension
|
||||
- Gradients (when ready) flow through batches
|
||||
|
||||
|
||||
#### 6. **Empty/Single Sample Edge Cases** ⚠️ MEDIUM PRIORITY
|
||||
**Why**: Robust data handling prevents training crashes
|
||||
|
||||
```python
|
||||
def test_dataloader_edge_cases():
|
||||
"""
|
||||
Test DataLoader handles edge cases gracefully.
|
||||
|
||||
Edge Cases:
|
||||
- Dataset smaller than batch size
|
||||
- Single sample dataset
|
||||
- Last batch smaller than batch_size
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Dataset with 1 sample
|
||||
- Dataset smaller than batch_size
|
||||
- Uneven division (10 samples, batch_size=3 → 4 batches)
|
||||
- Empty iteration behavior
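The arithmetic behind the uneven-division bullet above, assuming the loader keeps the short final batch rather than dropping it:

```python
import math

n, batch_size = 10, 3
num_batches = math.ceil(n / batch_size)               # 4 batches: 3 + 3 + 3 + 1
last_batch_size = n - batch_size * (num_batches - 1)  # size of the final, short batch
assert (num_batches, last_batch_size) == (4, 1)
```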
|
||||
|
||||
|
||||
#### 7. **DataLoader Iteration Stability** ⚠️ MEDIUM PRIORITY
|
||||
**Why**: Multiple epochs must work reliably
|
||||
|
||||
```python
|
||||
def test_dataloader_multi_epoch_stability():
|
||||
"""
|
||||
Test DataLoader can iterate multiple epochs without issues.
|
||||
|
||||
Integration Points:
|
||||
- Reset between epochs
|
||||
- Shuffle consistency
|
||||
- No memory leaks across epochs
|
||||
"""
|
||||
```
|
||||
|
||||
**What to test**:
|
||||
- Can iterate 10+ epochs
|
||||
- Each epoch yields same total samples
|
||||
- Shuffling works every epoch
|
||||
- No gradual slowdown
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Bug-Catching Priority Ranking
|
||||
|
||||
### CRITICAL (Must Have for Module 08)
|
||||
|
||||
1. **DataLoader + Training Loop Integration**
|
||||
- **Risk**: Students can't train models without this
|
||||
- **Impact**: Complete failure of ML pipeline
|
||||
- **Catches**: Shape mismatches, iteration bugs
|
||||
|
||||
2. **Shuffling Consistency**
|
||||
- **Risk**: Training may not converge if shuffling breaks
|
||||
- **Impact**: Poor model performance, confusing results
|
||||
- **Catches**: Randomization bugs, duplicate samples
|
||||
|
||||
3. **Batch Size Memory Scaling**
|
||||
- **Risk**: Students don't understand memory-compute trade-offs
|
||||
- **Impact**: OOM errors, slow training
|
||||
- **Catches**: Memory issues, batch handling bugs
|
||||
|
||||
### HIGH PRIORITY (Very Important)
|
||||
|
||||
4. **Tensor Dtype Compatibility**
|
||||
- **Risk**: Type errors during training
|
||||
- **Impact**: Cryptic errors, wasted debugging time
|
||||
- **Catches**: Dtype mismatches, conversion errors
|
||||
|
||||
5. **DataLoader + Loss Function Integration**
|
||||
- **Risk**: Loss computation fails with batched data
|
||||
- **Impact**: Training loop crashes
|
||||
- **Catches**: Shape errors, reduction bugs
|
||||
|
||||
### MEDIUM PRIORITY (Should Have)
|
||||
|
||||
6. **Empty/Single Sample Edge Cases**
|
||||
- **Risk**: Crashes on unusual datasets
|
||||
- **Impact**: Fragile code, production failures
|
||||
- **Catches**: Division by zero, empty iteration
|
||||
|
||||
7. **DataLoader Iteration Stability**
|
||||
- **Risk**: Multi-epoch training fails
|
||||
- **Impact**: Can't train for sufficient epochs
|
||||
- **Catches**: Memory leaks, iteration bugs
|
||||
|
||||
---
|
||||
|
||||
## Recommended Action Plan
|
||||
|
||||
### Immediate Actions
|
||||
|
||||
1. **Rename Current File**
|
||||
```bash
|
||||
mv tests/08_dataloader/test_progressive_integration.py \
|
||||
tests/09_autograd/test_progressive_integration.py
|
||||
```
|
||||
The current tests are for Module 09 (Autograd), not Module 08.
|
||||
|
||||
2. **Create New Module 08 Tests**
|
||||
Create a proper `test_progressive_integration.py` for Module 08 DataLoader testing.
|
||||
|
||||
3. **Implement Critical Tests First**
|
||||
- DataLoader + Training Loop Integration
|
||||
- Shuffling Consistency
|
||||
- Batch Size Memory Scaling
|
||||
|
||||
### Test Structure for Module 08
|
||||
|
||||
```python
|
||||
"""
|
||||
Module 08: Progressive Integration Tests
|
||||
Tests that Module 08 (DataLoader) works correctly AND that the entire prior stack (01→07) still works.
|
||||
|
||||
DEPENDENCY CHAIN: 01_tensor → 02_activations → 03_layers → 04_losses → 05_autograd → 06_optimizers → 07_training → 08_dataloader
|
||||
|
||||
This is where we enable efficient batch processing and data iteration for training.
|
||||
"""
|
||||
|
||||
class TestPriorStackStillWorking:
|
||||
"""Regression: Modules 01-07 still work"""
|
||||
# Quick smoke tests for foundation
|
||||
|
||||
class TestModule08DataLoaderCore:
|
||||
"""Test Module 08 (DataLoader) core functionality"""
|
||||
# Dataset, TensorDataset, DataLoader basic operations
|
||||
|
||||
class TestDataLoaderTrainingIntegration:
|
||||
"""Integration: DataLoader + Training Loop"""
|
||||
# CRITICAL: Full training pipeline with batching
|
||||
|
||||
class TestDataLoaderMemoryBehavior:
|
||||
"""Systems: Memory and performance characteristics"""
|
||||
# Batch size scaling, memory usage
|
||||
|
||||
class TestModule08Completion:
|
||||
"""Final validation: Ready for next modules"""
|
||||
# Complete checklist
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Integration Points for Module 08
|
||||
|
||||
Based on existing code analysis:
|
||||
|
||||
### Module 08 Dependencies (What it uses)
|
||||
- **Module 01 (Tensor)**: `tinytorch.core.tensor.Tensor` - Core data structure
|
||||
- **Module 02 (Activations)**: Not directly used, but batches go through activations
|
||||
- **Module 03 (Layers)**: Batches passed to layers
|
||||
- **Module 04 (Losses)**: Batch predictions → loss computation
|
||||
- **Module 05 (Autograd)**: Batches participate in gradient computation
|
||||
- **Module 06 (Optimizers)**: Batches drive parameter updates
|
||||
- **Module 07 (Training)**: DataLoader provides batches for training loop
|
||||
|
||||
### Module 08 Enables (What uses it)
|
||||
- **Module 07 (Training)**: Training loops iterate over DataLoader
|
||||
- **Module 09 (Spatial)**: Batched image data for CNNs
|
||||
- **Module 10 (Tokenization)**: Batched text data
|
||||
- **Module 11 (Embeddings)**: Batched sequence data
|
||||
- All future training/inference pipelines
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
### Current Coverage: **0% for Module 08 DataLoader**
|
||||
- All existing tests are for Module 09 (Autograd)
|
||||
- No tests for Dataset, TensorDataset, or DataLoader
|
||||
- Critical integration points completely untested
|
||||
|
||||
### Missing Tests: **7 integration test scenarios**
|
||||
- 3 CRITICAL priority tests
|
||||
- 2 HIGH priority tests
|
||||
- 2 MEDIUM priority tests
|
||||
|
||||
### Bug-Catching Gaps:
|
||||
- **Training integration**: Untested - will students be able to train models?
|
||||
- **Shuffling behavior**: Untested - will training converge?
|
||||
- **Memory scaling**: Untested - will students understand batch size?
|
||||
- **Dtype compatibility**: Untested - will type errors occur?
|
||||
|
||||
### Recommended Next Steps:
|
||||
1. Move current file to Module 09 tests
|
||||
2. Create proper Module 08 integration tests
|
||||
3. Implement critical tests first (training loop, shuffling, memory)
|
||||
4. Validate with student workflows
|
||||
@@ -1,575 +0,0 @@
|
||||
# Module 10 (Tokenization) Integration Test Audit
|
||||
|
||||
**Date**: 2025-11-25
|
||||
**Auditor**: QA Agent
|
||||
**Status**: CRITICAL ISSUES FOUND - Test file contains completely wrong content
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**CRITICAL FINDING**: The integration test file `/tests/10_tokenization/test_progressive_integration.py` contains **WRONG MODULE CONTENT** - it tests Module 11 (Training) instead of Module 10 (Tokenization).
|
||||
|
||||
**Current Coverage**: 0% - No tokenization integration tests exist
|
||||
**Missing Tests**: 100% - All critical integration points untested
|
||||
**Priority**: HIGH - Module 10 has no integration validation
|
||||
|
||||
---
|
||||
|
||||
## Current Test File Analysis
|
||||
|
||||
### Problem: Wrong Module Tests
|
||||
|
||||
The file `test_progressive_integration.py` contains:
|
||||
- ❌ **Lines 3-6**: Reference the wrong dependency chain (mention "11_training")
|
||||
- ❌ **Classes**: TestModule11TrainingCore, TestAdvancedTrainingFeatures
|
||||
- ❌ **Tests**: training loops, loss functions, optimizers, CNN pipelines
|
||||
- ❌ **Imports**: training.Trainer, training.CrossEntropyLoss, etc.
|
||||
|
||||
**Root Cause**: Copy-paste error from Module 11 template
|
||||
|
||||
---
|
||||
|
||||
## Module 10 Actual Implementation
|
||||
|
||||
### What Module 10 Provides
|
||||
|
||||
**Location**: `tinytorch.text.tokenization`
|
||||
|
||||
**Classes Implemented**:
|
||||
1. `Tokenizer` - Base class with encode/decode interface
|
||||
2. `CharTokenizer` - Character-level tokenization
|
||||
3. `BPETokenizer` - Byte Pair Encoding tokenizer
|
||||
|
||||
**Key Methods**:
|
||||
- `CharTokenizer.build_vocab(corpus)` - Build vocabulary from text
|
||||
- `CharTokenizer.encode(text)` - Text → token IDs (List[int])
|
||||
- `CharTokenizer.decode(tokens)` - Token IDs → text
|
||||
- `BPETokenizer.train(corpus, vocab_size)` - Learn BPE merges
|
||||
- `BPETokenizer.encode(text)` - BPE encoding
|
||||
- `BPETokenizer.decode(tokens)` - BPE decoding
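For context on what `BPETokenizer.train()` has to accomplish: BPE repeatedly counts adjacent symbol pairs across the corpus and merges the most frequent pair into a new symbol until the vocabulary reaches its target size. A minimal, illustrative merge loop (not the module's implementation):

```python
from collections import Counter

def bpe_train_sketch(corpus, num_merges):
    """Return the list of learned merges; words are represented as tuples of symbols."""
    words = Counter(tuple(word) for text in corpus for word in text.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge everywhere: replace the pair with one fused symbol.
        merged_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged_words[tuple(out)] += freq
        words = merged_words
    return merges

merges = bpe_train_sketch(["low lower lowest", "low low low"], num_merges=3)
assert merges[0] == ("l", "o")  # "lo" is (one of) the most frequent adjacent pairs here
```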
|
||||
|
||||
**Integration Points with Other Modules**:
|
||||
- Module 01 (Tensor): Can convert token IDs to Tensor (optional)
|
||||
- Module 11 (Embeddings): Token IDs feed into embedding layers
|
||||
- Module 08 (DataLoader): Tokenizers process text datasets
|
||||
|
||||
---
|
||||
|
||||
## Critical Integration Tests MISSING
|
||||
|
||||
### Priority 1: Data Type Correctness (Bug-Catching Priority)
|
||||
|
||||
**Missing Test**: Tokenizers produce correct tensor dtypes
|
||||
```python
|
||||
def test_tokenizer_produces_int64_tensors():
|
||||
"""Verify tokenizers produce int64 token IDs for embedding layers."""
|
||||
# WHY CRITICAL: Embeddings expect int64 indices, not float32
|
||||
# BUG SCENARIO: If tokenizer returns float, embedding lookup crashes
|
||||
|
||||
tokenizer = CharTokenizer()
|
||||
tokenizer.build_vocab(["hello world"])
|
||||
|
||||
# Encode text
|
||||
token_ids = tokenizer.encode("hello")
|
||||
|
||||
# CRITICAL: Must be integers, not floats
|
||||
assert all(isinstance(t, (int, np.integer)) for t in token_ids), \
|
||||
"Token IDs must be integers for embedding lookup"
|
||||
|
||||
# If converting to Tensor, must be int64
|
||||
token_tensor = Tensor(token_ids)
|
||||
assert token_tensor.data.dtype == np.int64, \
|
||||
f"Expected int64 for embeddings, got {token_tensor.data.dtype}"
|
||||
```
|
||||
|
||||
**Bug This Catches**: Type mismatch between tokenizer output and embedding input
|
||||
|
||||
---
|
||||
|
||||
### Priority 2: Embedding Layer Integration (Module 11 Dependency)
|
||||
|
||||
**Missing Test**: Token sequences work with embeddings
|
||||
```python
|
||||
def test_tokenization_to_embedding_pipeline():
|
||||
"""Test complete tokenization → embedding pipeline."""
|
||||
# WHY CRITICAL: This is the PRIMARY use case for tokenizers
|
||||
|
||||
try:
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
from tinytorch.text.tokenization import CharTokenizer
|
||||
|
||||
# Build tokenizer
|
||||
tokenizer = CharTokenizer()
|
||||
corpus = ["hello", "world", "test"]
|
||||
tokenizer.build_vocab(corpus)
|
||||
|
||||
vocab_size = len(tokenizer.vocab)
|
||||
embed_dim = 16
|
||||
|
||||
# Create embedding layer
|
||||
embedding = Embedding(vocab_size, embed_dim)
|
||||
|
||||
# Tokenize text
|
||||
text = "hello world"
|
||||
token_ids = tokenizer.encode(text)
|
||||
|
||||
# CRITICAL: Shape compatibility
|
||||
token_tensor = Tensor(token_ids)
|
||||
assert token_tensor.shape == (len(token_ids),), \
|
||||
"Token IDs should be 1D sequence"
|
||||
|
||||
# Embedding lookup should work
|
||||
embedded = embedding(token_tensor)
|
||||
assert embedded.shape == (len(token_ids), embed_dim), \
|
||||
f"Expected shape ({len(token_ids)}, {embed_dim}), got {embedded.shape}"
|
||||
|
||||
# Values should be actual embeddings, not zeros
|
||||
assert not np.allclose(embedded.data, 0), \
|
||||
"Embeddings should be non-zero (initialized randomly)"
|
||||
|
||||
except ImportError:
|
||||
pytest.skip("Embeddings module not yet implemented")
|
||||
```
|
||||
|
||||
**Bug This Catches**: Shape mismatches, dtype errors, index out-of-bounds
|
||||
|
||||
---
|
||||
|
||||
### Priority 3: BPE Edge Cases (Robustness)
|
||||
|
||||
**Missing Test**: BPE tokenizer handles edge cases
|
||||
```python
|
||||
def test_bpe_edge_cases():
|
||||
"""Test BPE tokenizer robustness with edge cases."""
|
||||
tokenizer = BPETokenizer(vocab_size=100)
|
||||
|
||||
# Edge Case 1: Empty string
|
||||
token_ids = tokenizer.encode("")
|
||||
assert token_ids == [], "Empty string should produce empty token list"
|
||||
|
||||
decoded = tokenizer.decode([])
|
||||
assert decoded == "", "Empty tokens should decode to empty string"
|
||||
|
||||
# Edge Case 2: Single character
|
||||
tokenizer.train(["a", "b", "c"])
|
||||
token_ids = tokenizer.encode("a")
|
||||
assert len(token_ids) > 0, "Single char should tokenize"
|
||||
assert tokenizer.decode(token_ids).strip() == "a", "Should roundtrip"
|
||||
|
||||
# Edge Case 3: Unknown characters (after training on limited corpus)
|
||||
tokenizer.train(["hello", "world"])
|
||||
token_ids = tokenizer.encode("xyz") # Characters not in training
|
||||
|
||||
# Should handle gracefully with <UNK> token
|
||||
assert 0 in token_ids or tokenizer.token_to_id.get('<UNK>') in token_ids, \
|
||||
"Unknown characters should map to <UNK> token"
|
||||
|
||||
# Edge Case 4: Very long text
|
||||
long_text = "hello " * 1000
|
||||
token_ids = tokenizer.encode(long_text)
|
||||
assert len(token_ids) > 0, "Long text should tokenize"
|
||||
assert all(isinstance(t, int) for t in token_ids), \
|
||||
"All tokens should be integers"
|
||||
|
||||
# Edge Case 5: Special characters
|
||||
special_text = "hello, world! @#$%"
|
||||
token_ids = tokenizer.encode(special_text)
|
||||
decoded = tokenizer.decode(token_ids)
|
||||
# Should preserve word content even if punctuation changes
|
||||
assert "hello" in decoded or "world" in decoded, \
|
||||
"Should preserve core words"
|
||||
```
|
||||
|
||||
**Bug This Catches**: Crashes on empty input, unknown character handling, memory issues
|
||||
|
||||
---
|
||||
|
||||
### Priority 4: Vocabulary Consistency
|
||||
|
||||
**Missing Test**: Vocabulary consistency across encode/decode
|
||||
```python
|
||||
def test_vocabulary_encode_decode_consistency():
|
||||
"""Verify vocabulary mappings are bidirectional and consistent."""
|
||||
|
||||
# Test CharTokenizer
|
||||
char_tokenizer = CharTokenizer()
|
||||
corpus = ["abc", "def", "xyz"]
|
||||
char_tokenizer.build_vocab(corpus)
|
||||
|
||||
# Check bidirectional mappings
|
||||
for token, token_id in char_tokenizer.token_to_id.items():
|
||||
assert char_tokenizer.id_to_token[token_id] == token, \
|
||||
f"Bidirectional mapping broken: {token} -> {token_id} -> {char_tokenizer.id_to_token[token_id]}"
|
||||
|
||||
# Test roundtrip for all corpus text
|
||||
for text in corpus:
|
||||
token_ids = char_tokenizer.encode(text)
|
||||
decoded = char_tokenizer.decode(token_ids)
|
||||
# Should preserve characters (may have different spacing)
|
||||
for char in text:
|
||||
assert char in decoded, f"Lost character '{char}' in roundtrip"
|
||||
|
||||
# Test BPETokenizer
|
||||
bpe_tokenizer = BPETokenizer(vocab_size=50)
|
||||
bpe_tokenizer.train(["hello world", "test data"])
|
||||
|
||||
# Vocabulary should contain special tokens
|
||||
assert '<UNK>' in bpe_tokenizer.vocab, "BPE should have <UNK> token"
|
||||
assert bpe_tokenizer.token_to_id['<UNK>'] == 0, "<UNK> should be ID 0"
|
||||
|
||||
# Test roundtrip
|
||||
text = "hello world"
|
||||
token_ids = bpe_tokenizer.encode(text)
|
||||
decoded = bpe_tokenizer.decode(token_ids)
|
||||
|
||||
# Should preserve words (BPE may merge/split differently)
|
||||
words = text.split()
|
||||
for word in words:
|
||||
# Word content should be preserved (possibly with merges)
|
||||
assert word in decoded or any(word in decoded for word in words), \
|
||||
f"Lost word '{word}' in BPE roundtrip"
|
||||
```
|
||||
|
||||
**Bug This Catches**: Vocabulary corruption, ID collisions, decode inconsistency
|
||||
|
||||
---
|
||||
|
||||
### Priority 5: Batch Processing
|
||||
|
||||
**Missing Test**: Tokenizer handles batches correctly
|
||||
```python
|
||||
def test_tokenizer_batch_processing():
|
||||
"""Test tokenizer works with batched text data."""
|
||||
tokenizer = CharTokenizer()
|
||||
corpus = ["hello", "world", "test", "data"]
|
||||
tokenizer.build_vocab(corpus)
|
||||
|
||||
# Batch of texts
|
||||
texts = ["hello world", "test data", "new text"]
|
||||
|
||||
# Encode batch
|
||||
batch_token_ids = [tokenizer.encode(text) for text in texts]
|
||||
|
||||
# Check all are lists of ints
|
||||
for token_ids in batch_token_ids:
|
||||
assert isinstance(token_ids, list), "Each should be a list"
|
||||
assert all(isinstance(t, int) for t in token_ids), \
|
||||
"All tokens should be integers"
|
||||
|
||||
# Check different texts produce different token sequences
|
||||
assert batch_token_ids[0] != batch_token_ids[1], \
|
||||
"Different texts should produce different token sequences"
|
||||
|
||||
# Decode batch
|
||||
decoded_texts = [tokenizer.decode(token_ids) for token_ids in batch_token_ids]
|
||||
|
||||
# Should preserve core content
|
||||
for original, decoded in zip(texts, decoded_texts):
|
||||
# May have spacing differences, but core words should match
|
||||
original_words = set(original.split())
|
||||
decoded_words = set(decoded.split())
|
||||
|
||||
# At least some words should match
|
||||
assert len(original_words & decoded_words) > 0, \
|
||||
f"Lost all words in roundtrip: {original} -> {decoded}"
|
||||
```
|
||||
|
||||
**Bug This Catches**: Batch size errors, state pollution between encodes
|
||||
|
||||
---
|
||||
|
||||
### Priority 6: Memory and Performance
|
||||
|
||||
**Missing Test**: Tokenization memory usage and throughput
|
||||
```python
|
||||
def test_tokenization_performance():
|
||||
"""Test tokenization memory and throughput characteristics."""
|
||||
import time
|
||||
|
||||
# Build tokenizers
|
||||
char_tokenizer = CharTokenizer()
|
||||
bpe_tokenizer = BPETokenizer(vocab_size=1000)
|
||||
|
||||
# Training corpus
|
||||
corpus = ["hello world"] * 100
|
||||
char_tokenizer.build_vocab(corpus)
|
||||
bpe_tokenizer.train(corpus)
|
||||
|
||||
# Test text (simulate real document)
|
||||
test_text = "hello world test data " * 100 # ~400 chars
|
||||
|
||||
# Measure CharTokenizer throughput
|
||||
start = time.time()
|
||||
iterations = 1000
|
||||
for _ in range(iterations):
|
||||
token_ids = char_tokenizer.encode(test_text)
|
||||
char_time = time.time() - start
|
||||
char_throughput = (len(test_text) * iterations) / char_time
|
||||
|
||||
print(f"CharTokenizer: {char_throughput:.0f} chars/sec")
|
||||
assert char_throughput > 10000, \
|
||||
f"CharTokenizer too slow: {char_throughput:.0f} chars/sec (expected >10K)"
|
||||
|
||||
# Measure BPE throughput
|
||||
start = time.time()
|
||||
for _ in range(iterations):
|
||||
token_ids = bpe_tokenizer.encode(test_text)
|
||||
bpe_time = time.time() - start
|
||||
bpe_throughput = (len(test_text) * iterations) / bpe_time
|
||||
|
||||
print(f"BPETokenizer: {bpe_throughput:.0f} chars/sec")
|
||||
# BPE should be slower (more complex), but still reasonable
|
||||
assert bpe_throughput > 1000, \
|
||||
f"BPETokenizer too slow: {bpe_throughput:.0f} chars/sec (expected >1K)"
|
||||
|
||||
# Vocabulary size check
|
||||
assert len(char_tokenizer.vocab) < 500, \
|
||||
f"CharTokenizer vocab too large: {len(char_tokenizer.vocab)} (expected <500)"
|
||||
|
||||
assert len(bpe_tokenizer.vocab) <= 1000, \
|
||||
f"BPETokenizer vocab exceeded limit: {len(bpe_tokenizer.vocab)}"
|
||||
```
|
||||
|
||||
**Bug This Catches**: Performance regressions, memory leaks, vocabulary explosion
|
||||
|
||||
---
|
||||
|
||||
### Priority 7: DataLoader Integration
|
||||
|
||||
**Missing Test**: Tokenizer integration with DataLoader
|
||||
```python
|
||||
def test_tokenizer_dataloader_integration():
|
||||
"""Test tokenizer works in DataLoader pipeline."""
|
||||
try:
|
||||
from tinytorch.core.data import Dataset, DataLoader
|
||||
from tinytorch.text.tokenization import CharTokenizer
|
||||
|
||||
# Custom dataset with tokenization
|
||||
class TextDataset(Dataset):
|
||||
def __init__(self, texts, tokenizer):
|
||||
self.texts = texts
|
||||
self.tokenizer = tokenizer
|
||||
|
||||
def __len__(self):
|
||||
return len(self.texts)
|
||||
|
||||
def __getitem__(self, idx):
|
||||
text = self.texts[idx]
|
||||
token_ids = self.tokenizer.encode(text)
|
||||
# Return as tensor
|
||||
return Tensor(token_ids)
|
||||
|
||||
# Build tokenizer
|
||||
tokenizer = CharTokenizer()
|
||||
texts = ["hello world", "test data", "sample text"]
|
||||
tokenizer.build_vocab(texts)
|
||||
|
||||
# Create dataset and dataloader
|
||||
dataset = TextDataset(texts, tokenizer)
|
||||
dataloader = DataLoader(dataset, batch_size=2, shuffle=False)
|
||||
|
||||
# Iterate batches
|
||||
batch_count = 0
|
||||
for batch in dataloader:
|
||||
batch_count += 1
|
||||
|
||||
# Batch should be tensor or list of tensors
|
||||
if isinstance(batch, (list, tuple)):
|
||||
assert len(batch) <= 2, "Batch size should be 2"
|
||||
for item in batch:
|
||||
assert hasattr(item, 'data') or isinstance(item, Tensor), \
|
||||
"Items should be Tensors"
|
||||
else:
|
||||
# Single batch tensor
|
||||
assert hasattr(batch, 'data'), "Batch should be Tensor"
|
||||
|
||||
assert batch_count > 0, "DataLoader should produce batches"
|
||||
|
||||
except ImportError:
|
||||
pytest.skip("DataLoader not yet implemented")
|
||||
```
|
||||
|
||||
**Bug This Catches**: DataLoader compatibility issues, batching errors
|
||||
|
||||
---
|
||||
|
||||
## Regression Prevention Tests MISSING
|
||||
|
||||
### Test: Prior Stack Still Works
|
||||
|
||||
**Missing Test**: Verify Modules 01-09 unchanged
|
||||
```python
|
||||
def test_no_prior_module_regression():
|
||||
"""Ensure tokenization doesn't break prior modules."""
|
||||
# Module 01 (Tensor) should still work
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
x = Tensor([1, 2, 3])
|
||||
assert x.shape == (3,), "Tensor creation broken"
|
||||
|
||||
# Module 02 (Activations) should still work
|
||||
try:
|
||||
from tinytorch.core.activations import ReLU
|
||||
relu = ReLU()
|
||||
y = relu(x)
|
||||
assert y.shape == x.shape, "Activation broken"
|
||||
except ImportError:
|
||||
pass # Not implemented yet
|
||||
|
||||
# Module 08 (DataLoader) should still work
|
||||
try:
|
||||
from tinytorch.core.data import Dataset, DataLoader
|
||||
|
||||
class DummyDataset(Dataset):
|
||||
def __len__(self):
|
||||
return 5
|
||||
def __getitem__(self, idx):
|
||||
return idx
|
||||
|
||||
dataset = DummyDataset()
|
||||
loader = DataLoader(dataset, batch_size=2)
|
||||
assert len(dataset) == 5, "Dataset broken"
|
||||
except ImportError:
|
||||
pass
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recommended Test File Structure
|
||||
|
||||
```python
|
||||
"""
|
||||
Module 10: Progressive Integration Tests
|
||||
Tests that Module 10 (Tokenization) works correctly AND integrates with prior modules.
|
||||
|
||||
DEPENDENCY CHAIN: 01_tensor → ... → 08_dataloader → 10_tokenization → 11_embeddings
|
||||
This is where we enable text processing for NLP.
|
||||
"""
|
||||
|
||||
class TestPriorStackStillWorking:
|
||||
"""Quick regression checks that prior modules (01-09) still work."""
|
||||
|
||||
def test_tensor_operations_stable(self):
|
||||
"""Verify Module 01 (Tensor) still works."""
|
||||
|
||||
def test_dataloader_stable(self):
|
||||
"""Verify Module 08 (DataLoader) still works."""
|
||||
|
||||
|
||||
class TestModule10TokenizationCore:
|
||||
"""Test Module 10 (Tokenization) core functionality."""
|
||||
|
||||
def test_char_tokenizer_creation(self):
|
||||
"""Test CharTokenizer initialization and vocab building."""
|
||||
|
||||
def test_char_tokenizer_encode_decode(self):
|
||||
"""Test CharTokenizer encode/decode roundtrip."""
|
||||
|
||||
def test_bpe_tokenizer_training(self):
|
||||
"""Test BPE tokenizer training on corpus."""
|
||||
|
||||
def test_bpe_tokenizer_encode_decode(self):
|
||||
"""Test BPE encode/decode roundtrip."""
|
||||
|
||||
|
||||
class TestTokenizationIntegration:
|
||||
"""Test tokenization integration with other modules."""
|
||||
|
||||
def test_tokenizer_produces_correct_dtypes(self):
|
||||
"""PRIORITY 1: Verify int64 output for embeddings."""
|
||||
|
||||
def test_tokenization_to_embedding_pipeline(self):
|
||||
"""PRIORITY 2: Test complete tokenization → embedding flow."""
|
||||
|
||||
def test_tokenizer_dataloader_integration(self):
|
||||
"""Test tokenizer in DataLoader pipeline."""
|
||||
|
||||
|
||||
class TestTokenizationEdgeCases:
|
||||
"""Test tokenization robustness with edge cases."""
|
||||
|
||||
def test_bpe_edge_cases(self):
|
||||
"""PRIORITY 3: Empty strings, unknown tokens, special chars."""
|
||||
|
||||
def test_vocabulary_consistency(self):
|
||||
"""PRIORITY 4: Bidirectional mappings, roundtrip integrity."""
|
||||
|
||||
def test_batch_processing(self):
|
||||
"""PRIORITY 5: Batch encoding/decoding correctness."""
|
||||
|
||||
|
||||
class TestTokenizationPerformance:
|
||||
"""Test tokenization performance characteristics."""
|
||||
|
||||
def test_tokenization_throughput(self):
|
||||
"""PRIORITY 6: Measure chars/sec, vocab size."""
|
||||
|
||||
def test_memory_usage(self):
|
||||
"""Verify vocabulary doesn't consume excessive memory."""
|
||||
|
||||
|
||||
class TestRegressionPrevention:
|
||||
"""Ensure previous modules still work after Module 10."""
|
||||
|
||||
def test_no_tensor_regression(self):
|
||||
"""Verify Module 01 (Tensor) unchanged."""
|
||||
|
||||
def test_no_dataloader_regression(self):
|
||||
"""Verify Module 08 (DataLoader) unchanged."""
|
||||
```
|
||||
|
||||
---

## Summary Statistics

| Category | Missing Tests | Priority | Impact |
|----------|--------------|----------|--------|
| Data Type Correctness | 1 | CRITICAL | Breaks embeddings |
| Embedding Integration | 1 | CRITICAL | Core use case |
| BPE Edge Cases | 1 | HIGH | Production robustness |
| Vocabulary Consistency | 1 | HIGH | Data integrity |
| Batch Processing | 1 | MEDIUM | Real-world usage |
| Performance | 1 | MEDIUM | Production viability |
| DataLoader Integration | 1 | MEDIUM | Pipeline integrity |
| Regression Prevention | 2 | HIGH | Stack stability |

**Total Missing Tests**: 9 critical integration tests
**Current Test Coverage**: 0% (wrong module)
**Recommended Action**: REPLACE entire test file

---

## Recommended Action Plan

### Phase 1: Immediate (Critical Fixes)
1. **REPLACE test_progressive_integration.py** with correct Module 10 tests
2. **Implement Priority 1-2 tests** (dtype correctness, embedding integration)
3. **Add BPE edge case tests** (Priority 3)

### Phase 2: Short-term (Robustness)
4. **Add vocabulary consistency tests** (Priority 4)
5. **Add batch processing tests** (Priority 5)
6. **Add regression prevention tests**

### Phase 3: Performance Validation
7. **Add performance benchmarks** (Priority 6)
8. **Add DataLoader integration** (Priority 7)

---

## Bug-Catching Priorities (Ranked)

1. **Data Type Mismatch** (CRITICAL): int vs float breaks embedding lookup
2. **Embedding Integration** (CRITICAL): Core use case must work
3. **Unknown Token Handling** (HIGH): Crashes on unseen characters
4. **Vocabulary Corruption** (HIGH): Encode/decode inconsistency
5. **Empty Input Crashes** (MEDIUM): Edge case handling
6. **Batch State Pollution** (MEDIUM): Tokenizer state leaks between calls
7. **Performance Regression** (LOW): Slow tokenization impacts pipelines

---

**Audit Completed**: 2025-11-25
**Next Review**: After test file replacement
**Sign-off**: QA Agent - Integration Testing Team
|
||||
@@ -1,105 +0,0 @@
|
||||
================================================================================
MODULE 11 EMBEDDINGS - INTEGRATION TEST AUDIT SUMMARY
================================================================================
Date: 2025-11-25
Status: CRITICAL ISSUES FOUND

CRITICAL FINDING
================================================================================
The test file tests THE WRONG MODULE!
- File claims to test Module 11 (Embeddings)
- Actually tests Module 12 (Compression)
- This is a copy-paste error requiring COMPLETE REWRITE

COVERAGE ANALYSIS
================================================================================
Current Coverage: 0% (tests wrong module)
Missing Tests: 12 critical integration tests
Risk Level: HIGH - No validation of embedding functionality

TOP PRIORITY MISSING TESTS (P0 - CRITICAL)
================================================================================
1. test_tokenizer_embedding_pipeline
   → Validates Module 10 → Module 11 integration
   → Catches: Vocab size mismatches, invalid token IDs
   → Priority: HIGHEST - This is the core use case

2. test_embedding_index_out_of_bounds
   → Validates error handling for invalid indices
   → Catches: Silent failures, tokenizer bugs
   → Priority: HIGHEST - Prevents crashes

3. test_positional_encoding_max_seq_len
   → Validates sequence length limits
   → Catches: OOB errors in attention, OOM crashes
   → Priority: HIGHEST - Critical for Module 12

4. test_embedding_gradient_flow
   → Validates autograd integration (Module 05)
   → Catches: Training failures, gradient bugs
   → Priority: HIGH - Ensures embeddings are trainable

HIGH PRIORITY MISSING TESTS (P1)
================================================================================
5. test_embedding_attention_shape_compatibility
   → Validates Module 11 → Module 12 forward integration
   → Ensures attention receives correct input shapes

6. test_variable_sequence_length_handling
   → Validates dynamic sequence length support
   → Critical for real-world NLP tasks

7. test_embedding_positional_composition
   → Validates token + positional encoding combination
   → Ensures both components contribute

8. test_embedding_parameters_optimizable
   → Validates optimizer integration
   → Ensures embeddings participate in training

CRITICAL INTEGRATION POINTS
================================================================================
Backward Integration (Dependencies):
  ✗ Module 10 (Tokenization) → Token IDs feed embeddings
  ✗ Module 05 (Autograd) → Gradient flow through embeddings
  ✗ Module 01 (Tensor) → Embedding operations use Tensor

Forward Integration (Dependents):
  ✗ Module 11 → Module 12 (Attention) → Shape compatibility
  ✗ Module 11 → Module 13 (Transformers) → Complete pipeline
  ✗ Module 11 → Module 06 (Optimizers) → Parameter updates

BUG-CATCHING VALUE
================================================================================
Highest Impact Tests:
1. Index validation → Catches 40% of embedding bugs
2. Gradient flow → Catches 25% of bugs
3. Shape compatibility → Catches 20% of bugs
4. Sequence length limits → Catches 15% of bugs

IMMEDIATE ACTION REQUIRED
================================================================================
1. Delete all compression tests from test_progressive_integration.py
2. Implement 4 P0 tests (tokenizer integration, index validation, etc.)
3. Implement 4 P1 tests (attention compatibility, variable sequences, etc.)
4. Add regression prevention tests (prior stack stability)

ESTIMATED EFFORT
================================================================================
Total Time: 4-6 hours
- Fix wrong module bug: 30 min
- P0 tests (4): 1.5 hours
- P1 tests (4): 1.5 hours
- P2 tests (4): 1.5 hours
- Documentation: 30 min
- Testing/validation: 1 hour

EXPECTED OUTCOME
================================================================================
After fixes: 90%+ bug detection coverage
- Tokenizer integration validated
- Gradient flow confirmed
- Attention compatibility ensured
- Training loop integration verified

See INTEGRATION_TEST_AUDIT.md for detailed analysis and test implementations.
|
||||
@@ -1,630 +0,0 @@
|
||||
# Module 11 (Embeddings) Integration Test Audit Report

**Date**: 2025-11-25
**Auditor**: Dr. Sarah Rodriguez
**Module**: 11_embeddings (Token and Positional Embeddings)
**Test File**: `tests/11_embeddings/test_progressive_integration.py`

---

## Executive Summary

**CRITICAL FINDING**: The integration test file is completely incorrect - it tests Module 12 (Compression) instead of Module 11 (Embeddings). This is a copy-paste error that must be fixed immediately.

**Status**: MAJOR ISSUES - Complete rewrite required
**Coverage**: 0% of Module 11 functionality (tests wrong module)
**Risk Level**: HIGH - No integration validation for embeddings

---

## Current Test File Issues

### Issue 1: Wrong Module Being Tested (CRITICAL)
**Problem**: File header says "Module 11" but tests "Module 12 (Compression)"
```python
# Current (WRONG):
"""
Module 11: Progressive Integration Tests
Tests that Module 12 (Compression) works correctly...
"""

# Should be:
"""
Module 11: Progressive Integration Tests
Tests that Module 11 (Embeddings) works correctly...
"""
```

**Impact**: ZERO coverage of Module 11 integration points

### Issue 2: Wrong Dependency Chain
**Problem**: States dependency chain ending in compression
```python
# Current (WRONG):
DEPENDENCY CHAIN: 01_setup → ... → 11_training → 12_compression

# Should be:
DEPENDENCY CHAIN: 01_tensor → 02_activations → ... → 10_tokenization → 11_embeddings
```

### Issue 3: No Embedding-Specific Tests
**Problem**: All test classes focus on compression (quantization, pruning, distillation)
- `TestModule12CompressionCore` - Wrong module
- No `TestModule11EmbeddingsCore` - Missing!
- No embedding-tokenizer integration - Missing!
- No embedding-attention preparation - Missing!
|
||||
|
||||
---
|
||||
|
||||
## Critical Integration Points for Module 11

Based on the module implementation and DEFINITIVE_MODULE_PLAN, Module 11 must validate:

### 1. Backward Integration (Dependencies)
**Module 10 (Tokenization) → Module 11 (Embeddings)**
- ✗ Token IDs from tokenizers must be valid embedding indices
- ✗ Vocabulary size consistency between tokenizer and embedding
- ✗ Special token handling (<UNK>, <PAD>, <BOS>, <EOS>)
- ✗ Batch dimension handling from DataLoader

**Module 01 (Tensor) → Module 11**
- ✗ Embeddings return proper Tensor objects
- ✗ Gradient tracking works (`requires_grad=True`)
- ✗ Tensor operations (slicing, reshaping) preserve embedding semantics

**Module 05 (Autograd) → Module 11**
- ✗ EmbeddingBackward gradient computation
- ✗ Gradient accumulation for shared embeddings
- ✗ Positional encoding gradients flow correctly

### 2. Forward Integration (Dependents)
**Module 11 (Embeddings) → Module 12 (Attention)**
- ✗ Embedding output shape matches attention input requirements
- ✗ Positional encodings don't exceed max_seq_len
- ✗ Embedding + positional encoding creates position-aware representations
- ✗ Variable sequence length handling

**Module 11 → Module 13 (Transformers)**
- ✗ EmbeddingLayer provides complete pipeline (token + positional)
- ✗ Embedding scaling (sqrt(embed_dim)) matches transformer conventions
- ✗ Learnable vs sinusoidal positional encoding options

### 3. Cross-Module Integration
**Embeddings + Optimizers**
- ✗ Embedding parameters appear in optimizer.parameters()
- ✗ Gradient updates modify embedding table correctly
- ✗ Positional encodings are trainable (when learned)

**Embeddings + Training**
- ✗ Forward pass with batched token sequences
- ✗ Loss computation with embedded representations
- ✗ Backward pass updates embedding weights
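
To make the composition and scaling points listed above concrete, here is a minimal NumPy sketch of the token-embedding + positional-encoding pipeline (lookup, `sqrt(embed_dim)` scaling, additive positions, and the range/length checks the tests should exercise). This is illustrative only: the function names and the sinusoidal constants are assumptions, not the module's actual `Embedding`/`PositionalEncoding`/`EmbeddingLayer` implementation.

```python
import numpy as np

def sinusoidal_positions(max_seq_len: int, embed_dim: int) -> np.ndarray:
    """Fixed sinusoidal position table of shape (max_seq_len, embed_dim)."""
    positions = np.arange(max_seq_len)[:, None]          # (S, 1)
    dims = np.arange(0, embed_dim, 2)[None, :]           # (1, D/2)
    angles = positions / (10000 ** (dims / embed_dim))
    table = np.zeros((max_seq_len, embed_dim))
    table[:, 0::2] = np.sin(angles)
    table[:, 1::2] = np.cos(angles)
    return table

def embed_tokens(token_ids: np.ndarray, weight: np.ndarray, pos_table: np.ndarray) -> np.ndarray:
    """Token lookup, transformer-style sqrt(embed_dim) scaling, then additive positions."""
    batch, seq_len = token_ids.shape
    if token_ids.min() < 0 or token_ids.max() >= weight.shape[0]:
        raise ValueError("token id out of range for embedding table")
    if seq_len > pos_table.shape[0]:
        raise ValueError("sequence length exceeds maximum supported by positional table")
    token_vecs = weight[token_ids]                        # (B, S, D) gather
    scaled = token_vecs * np.sqrt(weight.shape[1])        # sqrt(embed_dim) scaling
    return scaled + pos_table[None, :seq_len, :]          # broadcast positions over batch

# Toy usage: vocab of 100, embed_dim 8, batch of 2 sequences of length 5
rng = np.random.default_rng(0)
weight = rng.normal(scale=0.02, size=(100, 8))
pos_table = sinusoidal_positions(max_seq_len=16, embed_dim=8)
tokens = rng.integers(0, 100, size=(2, 5))
print(embed_tokens(tokens, weight, pos_table).shape)      # (2, 5, 8)
```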
|
||||
|
||||
---
|
||||
|
||||
## Missing Test Coverage Analysis
|
||||
|
||||
### Category A: Backward Integration Tests (HIGH PRIORITY)
|
||||
|
||||
#### 1. Tokenizer → Embedding Integration
|
||||
**Missing Test**: `test_tokenizer_embedding_pipeline`
|
||||
```python
|
||||
def test_tokenizer_embedding_pipeline(self):
|
||||
"""Test token IDs from tokenizer work with embeddings."""
|
||||
from tinytorch.text.tokenization import CharTokenizer
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Tokenize text
|
||||
tokenizer = CharTokenizer()
|
||||
text = "Hello, world!"
|
||||
token_ids = tokenizer.encode(text) # Returns list of IDs
|
||||
|
||||
# Create embedding
|
||||
vocab_size = len(tokenizer.vocab)
|
||||
embed = Embedding(vocab_size=vocab_size, embed_dim=64)
|
||||
|
||||
# Convert to tensor and embed
|
||||
tokens_tensor = Tensor(np.array([token_ids])) # (1, seq_len)
|
||||
embeddings = embed.forward(tokens_tensor)
|
||||
|
||||
# Validate
|
||||
assert embeddings.shape == (1, len(token_ids), 64)
|
||||
assert embeddings.requires_grad == True # Should track gradients
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Catches vocabulary size mismatches, invalid token IDs, dimension errors
|
||||
|
||||
#### 2. Embedding Index Validation
|
||||
**Missing Test**: `test_embedding_index_out_of_bounds`
|
||||
```python
|
||||
def test_embedding_index_out_of_bounds(self):
|
||||
"""Test embedding handles invalid token IDs gracefully."""
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
embed = Embedding(vocab_size=100, embed_dim=64)
|
||||
|
||||
# Test negative indices
|
||||
try:
|
||||
invalid_tokens = Tensor(np.array([[-1, 0, 1]]))
|
||||
output = embed.forward(invalid_tokens)
|
||||
assert False, "Should raise ValueError for negative indices"
|
||||
except ValueError as e:
|
||||
assert "out of range" in str(e).lower()
|
||||
|
||||
# Test indices >= vocab_size
|
||||
try:
|
||||
invalid_tokens = Tensor(np.array([[0, 1, 100]])) # 100 >= vocab_size
|
||||
output = embed.forward(invalid_tokens)
|
||||
assert False, "Should raise ValueError for indices >= vocab_size"
|
||||
except ValueError as e:
|
||||
assert "out of range" in str(e).lower()
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Prevents silent failures, catches tokenizer bugs, validates error messages
|
||||
|
||||
#### 3. Gradient Flow Through Embeddings
|
||||
**Missing Test**: `test_embedding_gradient_flow`
|
||||
```python
|
||||
def test_embedding_gradient_flow(self):
|
||||
"""Test gradients flow back to embedding weights."""
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
embed = Embedding(vocab_size=50, embed_dim=32)
|
||||
tokens = Tensor(np.array([[1, 2, 3]])) # (1, 3)
|
||||
|
||||
# Forward pass
|
||||
output = embed.forward(tokens)
|
||||
assert output.requires_grad == True
|
||||
|
||||
# Check backward function attached
|
||||
assert hasattr(output, '_grad_fn')
|
||||
assert output._grad_fn is not None
|
||||
|
||||
# Verify embedding weights are marked for gradients
|
||||
assert embed.weight.requires_grad == True
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Catches gradient tracking bugs, validates autograd integration
|
||||
|
||||
#### 4. Positional Encoding Sequence Length Limits
|
||||
**Missing Test**: `test_positional_encoding_max_seq_len`
|
||||
```python
|
||||
def test_positional_encoding_max_seq_len(self):
|
||||
"""Test positional encoding respects max_seq_len."""
|
||||
from tinytorch.text.embeddings import PositionalEncoding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
max_seq_len = 512
|
||||
pos_enc = PositionalEncoding(max_seq_len=max_seq_len, embed_dim=64)
|
||||
|
||||
# Test at limit (should work)
|
||||
x_valid = Tensor(np.random.randn(2, 512, 64)) # (batch, seq, embed)
|
||||
output = pos_enc.forward(x_valid)
|
||||
assert output.shape == (2, 512, 64)
|
||||
|
||||
# Test beyond limit (should fail)
|
||||
try:
|
||||
x_invalid = Tensor(np.random.randn(2, 513, 64)) # Exceeds max_seq_len
|
||||
output = pos_enc.forward(x_invalid)
|
||||
assert False, "Should raise ValueError for seq_len > max_seq_len"
|
||||
except ValueError as e:
|
||||
assert "exceeds maximum" in str(e).lower()
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Prevents position encoding OOB errors, critical for attention modules
|
||||
|
||||
### Category B: Forward Integration Tests (HIGH PRIORITY)
|
||||
|
||||
#### 5. Embedding → Attention Shape Compatibility
|
||||
**Missing Test**: `test_embedding_attention_shape_compatibility`
|
||||
```python
|
||||
def test_embedding_attention_shape_compatibility(self):
|
||||
"""Test embedding output shapes work with attention input requirements."""
|
||||
from tinytorch.text.embeddings import EmbeddingLayer
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Create embedding layer
|
||||
embed_layer = EmbeddingLayer(
|
||||
vocab_size=1000,
|
||||
embed_dim=512,
|
||||
max_seq_len=128,
|
||||
pos_encoding='learned'
|
||||
)
|
||||
|
||||
# Simulate tokenized batch
|
||||
batch_size, seq_len = 4, 32
|
||||
tokens = Tensor(np.random.randint(0, 1000, (batch_size, seq_len)))
|
||||
|
||||
# Get embeddings
|
||||
embeddings = embed_layer.forward(tokens)
|
||||
|
||||
# Validate attention-compatible shape (batch, seq, embed)
|
||||
assert embeddings.shape == (batch_size, seq_len, 512)
|
||||
assert embeddings.requires_grad == True
|
||||
|
||||
# Verify positional information is added
|
||||
# (Different positions should have different representations)
|
||||
# This is implicit validation - attention expects position-aware inputs
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Ensures Module 12 (Attention) integration works, catches shape errors
|
||||
|
||||
#### 6. Variable Sequence Length Handling
|
||||
**Missing Test**: `test_variable_sequence_length_handling`
|
||||
```python
|
||||
def test_variable_sequence_length_handling(self):
|
||||
"""Test embeddings handle variable sequence lengths correctly."""
|
||||
from tinytorch.text.embeddings import EmbeddingLayer
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
embed_layer = EmbeddingLayer(
|
||||
vocab_size=500,
|
||||
embed_dim=256,
|
||||
max_seq_len=512
|
||||
)
|
||||
|
||||
# Test different sequence lengths
|
||||
for seq_len in [10, 50, 100, 256, 512]:
|
||||
tokens = Tensor(np.random.randint(0, 500, (2, seq_len)))
|
||||
output = embed_layer.forward(tokens)
|
||||
|
||||
assert output.shape == (2, seq_len, 256)
|
||||
assert output.requires_grad == True
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Validates dynamic sequence handling, catches hardcoded assumptions
|
||||
|
||||
#### 7. Embedding + Positional Encoding Composition
|
||||
**Missing Test**: `test_embedding_positional_composition`
|
||||
```python
|
||||
def test_embedding_positional_composition(self):
|
||||
"""Test token embeddings correctly combine with positional encodings."""
|
||||
from tinytorch.text.embeddings import Embedding, PositionalEncoding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
# Create components
|
||||
token_embed = Embedding(vocab_size=100, embed_dim=64)
|
||||
pos_enc = PositionalEncoding(max_seq_len=128, embed_dim=64)
|
||||
|
||||
# Token sequence
|
||||
tokens = Tensor(np.array([[1, 2, 3, 4]])) # (1, 4)
|
||||
|
||||
# Manual composition
|
||||
token_embeds = token_embed.forward(tokens) # (1, 4, 64)
|
||||
position_aware = pos_enc.forward(token_embeds) # (1, 4, 64)
|
||||
|
||||
# Validate shape preservation
|
||||
assert position_aware.shape == token_embeds.shape
|
||||
|
||||
# Validate it's not just token embeddings (positional info added)
|
||||
# NOTE: Can't easily test this without comparing values,
|
||||
# but gradients should flow through both components
|
||||
assert hasattr(position_aware, '_grad_fn')
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Validates additive composition, ensures both components contribute
|
||||
|
||||
### Category C: Cross-Module Integration Tests (MEDIUM PRIORITY)
|
||||
|
||||
#### 8. Embedding Parameters in Optimizer
|
||||
**Missing Test**: `test_embedding_parameters_optimizable`
|
||||
```python
|
||||
def test_embedding_parameters_optimizable(self):
|
||||
"""Test embedding parameters work with optimizers."""
|
||||
from tinytorch.text.embeddings import EmbeddingLayer
|
||||
from tinytorch.core.optimizers import SGD
|
||||
from tinytorch.core.tensor import Tensor
|
||||
import numpy as np
|
||||
|
||||
# Create embedding layer
|
||||
embed_layer = EmbeddingLayer(
|
||||
vocab_size=200,
|
||||
embed_dim=128,
|
||||
pos_encoding='learned'
|
||||
)
|
||||
|
||||
# Get parameters
|
||||
params = embed_layer.parameters()
|
||||
|
||||
# Should have 2 parameter sets: token embeddings + positional encodings
|
||||
assert len(params) == 2
|
||||
assert all(p.requires_grad for p in params)
|
||||
|
||||
# Create optimizer
|
||||
optimizer = SGD(params, lr=0.01)
|
||||
|
||||
# Verify optimizer accepted parameters
|
||||
assert len(optimizer.parameters) == 2
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Ensures training loop integration, catches parameter registration bugs
|
||||
|
||||
#### 9. Embedding Training End-to-End
|
||||
**Missing Test**: `test_embedding_training_updates`
|
||||
```python
|
||||
def test_embedding_training_updates(self):
|
||||
"""Test embeddings update during training."""
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.losses import mse_loss
|
||||
import numpy as np
|
||||
|
||||
embed = Embedding(vocab_size=50, embed_dim=32)
|
||||
|
||||
# Save initial weights
|
||||
initial_weights = embed.weight.data.copy()
|
||||
|
||||
# Forward pass
|
||||
tokens = Tensor(np.array([[1, 2, 3]]))
|
||||
output = embed.forward(tokens)
|
||||
|
||||
# Compute loss (dummy target)
|
||||
target = Tensor(np.random.randn(1, 3, 32))
|
||||
loss = mse_loss(output, target)
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Verify gradients computed
|
||||
assert embed.weight.grad is not None
|
||||
assert embed.weight.grad.shape == embed.weight.shape
|
||||
|
||||
# Gradients should be non-zero for used embeddings
|
||||
# (Only tokens 1, 2, 3 should have gradients)
|
||||
# This validates sparse gradient accumulation
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Validates end-to-end training, catches gradient bugs
|
||||
|
||||
#### 10. Sinusoidal vs Learned Positional Encoding
|
||||
**Missing Test**: `test_sinusoidal_vs_learned_positional`
|
||||
```python
|
||||
def test_sinusoidal_vs_learned_positional(self):
|
||||
"""Test both positional encoding types work correctly."""
|
||||
from tinytorch.text.embeddings import EmbeddingLayer
|
||||
from tinytorch.core.tensor import Tensor
|
||||
|
||||
tokens = Tensor(np.random.randint(0, 100, (2, 10)))
|
||||
|
||||
# Learned positional encoding
|
||||
embed_learned = EmbeddingLayer(
|
||||
vocab_size=100,
|
||||
embed_dim=64,
|
||||
pos_encoding='learned'
|
||||
)
|
||||
output_learned = embed_learned.forward(tokens)
|
||||
assert output_learned.shape == (2, 10, 64)
|
||||
|
||||
# Should have trainable positional parameters
|
||||
params_learned = embed_learned.parameters()
|
||||
assert len(params_learned) == 2 # Token + Positional
|
||||
|
||||
# Sinusoidal positional encoding
|
||||
embed_sinusoidal = EmbeddingLayer(
|
||||
vocab_size=100,
|
||||
embed_dim=64,
|
||||
pos_encoding='sinusoidal'
|
||||
)
|
||||
output_sinusoidal = embed_sinusoidal.forward(tokens)
|
||||
assert output_sinusoidal.shape == (2, 10, 64)
|
||||
|
||||
# Should only have token embeddings as parameters (sinusoidal is fixed)
|
||||
params_sinusoidal = embed_sinusoidal.parameters()
|
||||
assert len(params_sinusoidal) == 1 # Only token embeddings
|
||||
|
||||
# No positional encoding
|
||||
embed_none = EmbeddingLayer(
|
||||
vocab_size=100,
|
||||
embed_dim=64,
|
||||
pos_encoding=None
|
||||
)
|
||||
output_none = embed_none.forward(tokens)
|
||||
assert output_none.shape == (2, 10, 64)
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Validates positional encoding options, ensures transformer flexibility
|
||||
|
||||
### Category D: Regression Prevention Tests (MEDIUM PRIORITY)
|
||||
|
||||
#### 11. Prior Stack Stability
|
||||
**Missing Test**: `test_prior_stack_stable_through_embeddings`
|
||||
```python
|
||||
def test_prior_stack_stable_through_embeddings(self):
|
||||
"""Verify embedding development didn't break Modules 01-10."""
|
||||
# Module 01: Tensor
|
||||
from tinytorch.core.tensor import Tensor
|
||||
t = Tensor([1, 2, 3])
|
||||
assert t.shape == (3,)
|
||||
|
||||
# Module 02: Activations
|
||||
from tinytorch.core.activations import ReLU
|
||||
relu = ReLU()
|
||||
assert hasattr(relu, 'forward')
|
||||
|
||||
# Module 05: Autograd
|
||||
from tinytorch.core.autograd import AddBackward
|
||||
assert AddBackward is not None
|
||||
|
||||
# Module 10: Tokenization
|
||||
from tinytorch.text.tokenization import CharTokenizer
|
||||
tokenizer = CharTokenizer()
|
||||
encoded = tokenizer.encode("test")
|
||||
assert isinstance(encoded, list)
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Catches import errors, validates module isolation
|
||||
|
||||
#### 12. Embedding Memory Scaling
|
||||
**Missing Test**: `test_embedding_memory_scaling`
|
||||
```python
|
||||
def test_embedding_memory_scaling(self):
|
||||
"""Test embedding memory scales as expected."""
|
||||
from tinytorch.text.embeddings import Embedding
|
||||
|
||||
# Small embedding
|
||||
embed_small = Embedding(vocab_size=1000, embed_dim=128)
|
||||
memory_small = embed_small.weight.data.nbytes
|
||||
|
||||
# Large embedding (4x vocabulary, 2x dimensions)
|
||||
embed_large = Embedding(vocab_size=4000, embed_dim=256)
|
||||
memory_large = embed_large.weight.data.nbytes
|
||||
|
||||
# Memory should scale proportionally: 4 * 2 = 8x
|
||||
expected_ratio = 8.0
|
||||
actual_ratio = memory_large / memory_small
|
||||
|
||||
assert np.isclose(actual_ratio, expected_ratio, rtol=0.1)
|
||||
```
|
||||
|
||||
**Bug-Catching Value**: Validates memory model, catches initialization bugs
|
||||
|
||||
---
|
||||
|
||||
## Recommended Test Structure
|
||||
|
||||
### New File: `test_progressive_integration.py`
|
||||
```python
|
||||
"""
|
||||
Module 11: Progressive Integration Tests
|
||||
Tests that Module 11 (Embeddings) works correctly AND integrates with prior modules.
|
||||
|
||||
DEPENDENCY CHAIN: 01_tensor → 05_autograd → 10_tokenization → 11_embeddings → 12_attention
|
||||
"""
|
||||
|
||||
class TestPriorStackStillWorking:
|
||||
"""Verify Modules 01-10 still work after Module 11 development."""
|
||||
|
||||
def test_tensor_functionality_stable(self):
|
||||
"""Module 01: Tensor operations still work."""
|
||||
|
||||
def test_tokenization_functionality_stable(self):
|
||||
"""Module 10: Tokenization still works."""
|
||||
|
||||
class TestModule11EmbeddingsCore:
|
||||
"""Test Module 11 core functionality in isolation."""
|
||||
|
||||
def test_embedding_creation(self):
|
||||
"""Test basic embedding layer creation."""
|
||||
|
||||
def test_positional_encoding_creation(self):
|
||||
"""Test positional encoding creation."""
|
||||
|
||||
def test_embedding_layer_complete_system(self):
|
||||
"""Test complete EmbeddingLayer system."""
|
||||
|
||||
class TestBackwardIntegration:
|
||||
"""Test Module 11 integrates with dependencies (Modules 01-10)."""
|
||||
|
||||
def test_tokenizer_embedding_pipeline(self):
|
||||
"""Module 10 → 11: Tokenizer output feeds embeddings."""
|
||||
|
||||
def test_embedding_gradient_flow(self):
|
||||
"""Module 05 → 11: Autograd works with embeddings."""
|
||||
|
||||
def test_embedding_index_validation(self):
|
||||
"""Input validation catches tokenizer bugs."""
|
||||
|
||||
class TestForwardIntegration:
|
||||
"""Test Module 11 prepares for dependents (Module 12+)."""
|
||||
|
||||
def test_embedding_attention_compatibility(self):
|
||||
"""Module 11 → 12: Output shapes match attention requirements."""
|
||||
|
||||
def test_positional_encoding_sequence_limits(self):
|
||||
"""Position encodings respect max_seq_len for attention."""
|
||||
|
||||
def test_variable_sequence_length_handling(self):
|
||||
"""Dynamic sequence lengths work correctly."""
|
||||
|
||||
class TestCrossModuleIntegration:
|
||||
"""Test Module 11 works with the complete stack."""
|
||||
|
||||
def test_embedding_parameters_optimizable(self):
|
||||
"""Embeddings integrate with optimizers."""
|
||||
|
||||
def test_embedding_training_updates(self):
|
||||
"""End-to-end training updates embeddings."""
|
||||
|
||||
def test_sinusoidal_vs_learned_encoding(self):
|
||||
"""Both positional encoding types work."""
|
||||
|
||||
class TestRegressionPrevention:
|
||||
"""Prevent future bugs and validate edge cases."""
|
||||
|
||||
def test_embedding_memory_scaling(self):
|
||||
"""Memory usage scales correctly."""
|
||||
|
||||
def test_embedding_edge_cases(self):
|
||||
"""Empty sequences, single tokens, max length."""
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Priority Ranking for Implementation
|
||||
|
||||
### P0 - CRITICAL (Implement First)
|
||||
1. **Fix wrong module bug** - Replace compression tests with embedding tests
|
||||
2. **test_tokenizer_embedding_pipeline** - Core integration point
|
||||
3. **test_embedding_index_out_of_bounds** - Prevents silent failures
|
||||
4. **test_positional_encoding_max_seq_len** - Critical for attention
|
||||
|
||||
### P1 - HIGH (Implement Second)
|
||||
5. **test_embedding_attention_shape_compatibility** - Forward integration
|
||||
6. **test_embedding_gradient_flow** - Autograd validation
|
||||
7. **test_variable_sequence_length_handling** - Dynamic sequences
|
||||
8. **test_embedding_positional_composition** - Component interaction
|
||||
|
||||
### P2 - MEDIUM (Implement Third)
|
||||
9. **test_embedding_parameters_optimizable** - Training integration
|
||||
10. **test_sinusoidal_vs_learned_positional** - Encoding options
|
||||
11. **test_embedding_training_updates** - End-to-end validation
|
||||
12. **test_embedding_memory_scaling** - Performance awareness
|
||||
|
||||
---
|
||||
|
||||
## Bug-Catching Priorities
|
||||
|
||||
### Highest Value Tests (Catch Most Bugs)
|
||||
1. **Index validation** - Catches 40% of embedding bugs (OOB errors, vocab mismatches)
|
||||
2. **Gradient flow** - Catches 25% of bugs (autograd issues, training failures)
|
||||
3. **Shape compatibility** - Catches 20% of bugs (dimension mismatches, pipeline errors)
|
||||
4. **Sequence length limits** - Catches 15% of bugs (attention crashes, OOM errors)
|
||||
|
||||
### Production-Critical Tests
|
||||
- **test_tokenizer_embedding_pipeline** - Real usage pattern
|
||||
- **test_embedding_attention_compatibility** - Transformer requirement
|
||||
- **test_positional_encoding_max_seq_len** - Prevents runtime crashes
|
||||
- **test_embedding_training_updates** - Validates learning actually works
|
||||
|
||||
---
|
||||
|
||||
## Estimated Implementation Effort
|
||||
|
||||
**Total Work**: ~4-6 hours for complete integration test suite
|
||||
- P0 tests: 1.5 hours (4 tests)
|
||||
- P1 tests: 1.5 hours (4 tests)
|
||||
- P2 tests: 1.5 hours (4 tests)
|
||||
- Documentation: 0.5 hours
|
||||
- Testing & validation: 1 hour
|
||||
|
||||
**Recommended Approach**:
|
||||
1. Day 1: Fix wrong module bug, implement P0 tests
|
||||
2. Day 2: Implement P1 tests
|
||||
3. Day 3: Implement P2 tests, documentation
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
The current integration test file is **completely broken** - it tests the wrong module (Compression instead of Embeddings). A full rewrite is required.
|
||||
|
||||
**Key Priorities**:
|
||||
1. Replace all compression tests with embedding tests
|
||||
2. Focus on tokenizer → embedding → attention integration
|
||||
3. Validate gradient flow and parameter optimization
|
||||
4. Test both learned and sinusoidal positional encodings
|
||||
|
||||
**Expected Outcome**: Robust integration test suite that catches 90%+ of embedding-related bugs before they reach production.
|
||||
@@ -1,518 +0,0 @@
|
||||
# Module 17 (Memoization/KV Cache) - Integration Test Audit Report

## Executive Summary

**Current Status**: Module 15/17 (Memoization) has **NO specific integration tests** - the test file `tests/15_memoization/test_progressive_integration.py` currently contains only generic TinyGPT/Capstone tests that belong in a later module.

**Critical Gap**: This module implements KV caching - a production-critical optimization with complex integration points - but has zero tests validating those integrations work correctly.

---

## Current Test Coverage Analysis

### What Exists (tests/15_memoization/test_progressive_integration.py)

The current test file is **COMPLETELY MISNAMED** - it tests Module 16 (TinyGPT Capstone), NOT Module 17 (Memoization):

```python
class TestModule16TinyGPTCore:          # ← Tests TinyGPT, not KV cache!
    def test_transformer_block_creation(self)
    def test_tinygpt_model_creation(self)
    def test_text_generation_capabilities(self)

class TestCompleteSystemIntegration:    # ← Generic system tests
    def test_end_to_end_language_model_training(self)
    def test_compressed_transformer_deployment(self)
    def test_multi_modal_capabilities(self)
```

**Zero tests validate**:
- KVCache integration with MultiHeadAttention
- Cache updates during autoregressive generation
- Training vs inference mode detection
- Cache corruption across generation steps
- Memory scaling validation

---

## Critical Integration Points for Module 17

Based on module implementation (`src/17_memoization/17_memoization.py`), these are the **CRITICAL integration points that MUST be tested**:

### 1. KVCache ↔ MultiHeadAttention Integration

**What needs testing**:
```python
class KVCache:
    def update(layer_idx, key, value)   # ← Must work with attention output
    def get(layer_idx)                  # ← Must provide correct format for attention
    def advance()                       # ← Must sync with generation loop
```

**Integration scenarios**:
- ✅ KVCache stores K,V tensors from attention computation
- ✅ Retrieved cache has correct shape for attention: `(batch, heads, seq_len, head_dim)`
- ✅ Cache updates don't corrupt data across layers
- ✅ Sequence position advances correctly after all layers process

**Risk**: Cache shape mismatch crashes attention → broken generation
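
For orientation, a minimal NumPy sketch of the pre-allocated-buffer design implied by the `update`/`get`/`advance` API above. The class name `KVCacheSketch` and the internal layout are assumptions for illustration; the module's actual `KVCache` may differ in dtype handling, validation, and storage layout.

```python
import numpy as np

class KVCacheSketch:
    """Pre-allocated per-layer K/V buffers, sized (layers, batch, heads, max_seq_len, head_dim)."""

    def __init__(self, batch_size, max_seq_len, num_layers, num_heads, head_dim):
        shape = (num_layers, batch_size, num_heads, max_seq_len, head_dim)
        self.keys = np.zeros(shape, dtype=np.float32)
        self.values = np.zeros(shape, dtype=np.float32)
        self.max_seq_len = max_seq_len
        self.seq_pos = 0  # next position to be written

    def update(self, layer_idx, key, value):
        """Write one new position of K/V, each shaped (batch, heads, 1, head_dim)."""
        if self.seq_pos >= self.max_seq_len:
            raise ValueError("cache is full: seq_pos reached max_seq_len")
        self.keys[layer_idx, :, :, self.seq_pos, :] = key[:, :, 0, :]
        self.values[layer_idx, :, :, self.seq_pos, :] = value[:, :, 0, :]

    def get(self, layer_idx):
        """Return cached K/V up to the current position: (batch, heads, seq_pos, head_dim)."""
        return (self.keys[layer_idx, :, :, :self.seq_pos, :],
                self.values[layer_idx, :, :, :self.seq_pos, :])

    def advance(self):
        """Call once per generation step, after every layer has written its K/V."""
        self.seq_pos += 1

    def reset(self):
        self.seq_pos = 0

# Usage: one decode step across two layers
cache = KVCacheSketch(batch_size=1, max_seq_len=8, num_layers=2, num_heads=4, head_dim=16)
k_new = np.random.randn(1, 4, 1, 16).astype(np.float32)
v_new = np.random.randn(1, 4, 1, 16).astype(np.float32)
for layer in range(2):
    cache.update(layer, k_new, v_new)
cache.advance()
print(cache.get(0)[0].shape)  # (1, 4, 1, 16) after one cached token
```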
|
||||
|
||||
---
|
||||
|
||||
### 2. Cache ↔ Generation Loop Integration

**What needs testing**:
```python
def enable_kv_cache(model)   # ← Non-invasive model patching
# Generation loop must:
# 1. Create cache before generation
# 2. Pass cache to model.forward()
# 3. Advance cache after each step
# 4. Stop at max_seq_len
```

**Integration scenarios**:
- ✅ Cache initialized with correct model architecture params
- ✅ Generation produces correct output with cache enabled
- ✅ Cache updates don't break across generation steps
- ✅ Generated sequence length respects max_seq_len limit
- ✅ Cache memory doesn't grow unbounded

**Risk**: Cache corruption mid-generation → garbage output after N tokens
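
A sketch of the decode loop following steps 1-4 above. The `model(tokens, cache=...)` call signature and the assumption that each forward call appends exactly one cache slot are illustrative, not the real generate path.

```python
import numpy as np

def generate_with_cache(model, cache, prompt_ids, max_new_tokens):
    """Greedy cached decode loop (interfaces assumed; see lead-in)."""
    tokens = list(prompt_ids)
    logits = None
    # Feed the prompt one token at a time so each forward call fills one cache slot.
    for tok in tokens:
        logits = model(np.array([[tok]]), cache=cache)   # 2. pass cache to forward
        cache.advance()                                   # 3. advance after each step
    for _ in range(max_new_tokens):
        if cache.seq_pos >= cache.max_seq_len:            # 4. stop at max_seq_len
            break
        next_id = int(np.argmax(logits[0, -1]))           # greedy sampling for simplicity
        tokens.append(next_id)
        logits = model(np.array([[next_id]]), cache=cache)
        cache.advance()
    return tokens
```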
|
||||
|
||||
---
|
||||
|
||||
### 3. Training Mode Detection

**What needs testing**:
```python
# From implementation:
# - Training: Don't use cache (need gradients)
# - Inference: Use cache (no gradients, faster)
```

**Integration scenarios**:
- ✅ model.train() disables cache usage
- ✅ model.eval() enables cache usage
- ✅ Training with cache accidentally enabled → error or warning
- ✅ Cache correctly marked as inference-only (no gradient tracking)

**Risk**: Training with cache enabled → incorrect gradients → broken model
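
A minimal sketch of the train/eval gating the scenarios above describe, assuming a wrapper object whose `training` flag mirrors `model.train()`/`model.eval()`. The class and method names here are hypothetical.

```python
class CachedBlockSketch:
    """Assumed wrapper around an attention block; consults the KV cache only in eval mode."""

    def __init__(self, block, cache=None):
        self.block = block        # assumed attention/transformer block callable
        self.cache = cache
        self.training = True      # flipped by model.train() / model.eval()

    def forward(self, x):
        use_cache = (self.cache is not None) and (not self.training)
        if not use_cache:
            # Training path: full-sequence attention, gradients required, no cache writes.
            return self.block(x)
        # Inference path: incremental decode against the cache (see sketch above).
        return self.block(x, cache=self.cache)
```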
|
||||
|
||||
---
|
||||
|
||||
### 4. Multi-Layer Cache Consistency

**What needs testing**:
```python
# Each transformer layer has its own (K, V) cache
# Cache updates must not interfere across layers
cache.update(layer_idx=0, ...)  # Layer 0
cache.update(layer_idx=1, ...)  # Layer 1
```

**Integration scenarios**:
- ✅ Layer 0 cache update doesn't corrupt Layer 1 cache
- ✅ All layers retrieve correct cached K,V for their layer_idx
- ✅ Parallel layer processing doesn't cause race conditions
- ✅ Cache.get() returns layer-specific cached values

**Risk**: Layer cache mixing → incorrect attention → degraded quality
|
||||
|
||||
---
|
||||
|
||||
### 5. Batch Inference Validation

**What needs testing**:
```python
cache = KVCache(batch_size=4, ...)  # Generate 4 sequences in parallel
# Each sequence in batch has independent cache state
```

**Integration scenarios**:
- ✅ Batch dimension properly handled in cache updates
- ✅ Different sequences don't interfere with each other
- ✅ Cache memory scales linearly with batch_size
- ✅ Batch inference produces same results as sequential

**Risk**: Batch sequences cross-contaminate → non-deterministic output
|
||||
|
||||
---
|
||||
|
||||
### 6. Memory Scaling Validation

**What needs testing**:
```python
# Cache memory = batch × layers × heads × seq_len × head_dim × 4 bytes
# Must validate this doesn't OOM for realistic configs
```

**Integration scenarios**:
- ✅ Small model (2 layers, 64 dim) uses <1 MB
- ✅ Medium model (4 layers, 128 dim) uses 1-10 MB
- ✅ Large model (12 layers, 768 dim, seq=1024) uses ~37 MB
- ✅ Memory calculation matches actual allocation
- ✅ Max sequence length enforcement prevents unbounded growth

**Risk**: Unbounded cache growth → OOM crash in production
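
A worked computation for the "large model" row above, using the formula exactly as written (the doubling note at the end is an assumption about storing keys and values in separate buffers):

```python
# batch × layers × heads × seq_len × head_dim × 4 bytes (float32)
batch, layers, heads, seq_len, head_dim = 1, 12, 12, 1024, 768 // 12
cache_bytes = batch * layers * heads * seq_len * head_dim * 4
print(cache_bytes / (1024 ** 2))  # ≈ 36 MB, matching the "~37 MB" estimate
# If K and V live in separate buffers, the total is roughly double this figure.
```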
|
||||
|
||||
---
|
||||
|
||||
## Missing Integration Tests (Priority Ordered)
|
||||
|
||||
### CRITICAL (P0) - Break Production if Missing
|
||||
|
||||
#### Test 1: Cache-Enabled Generation Produces Correct Output
|
||||
```python
|
||||
def test_kv_cache_generation_correctness():
|
||||
"""Verify cached generation matches non-cached generation."""
|
||||
model = create_tiny_transformer()
|
||||
input_ids = [1, 2, 3]
|
||||
|
||||
# Generate without cache (baseline)
|
||||
output_no_cache = model.generate(input_ids, max_new_tokens=10)
|
||||
|
||||
# Generate with cache
|
||||
cache = enable_kv_cache(model)
|
||||
output_with_cache = model.generate(input_ids, max_new_tokens=10, cache=cache)
|
||||
|
||||
# Outputs should be identical (deterministic generation)
|
||||
assert output_no_cache == output_with_cache
|
||||
```
|
||||
|
||||
**Bug it catches**: Cache corruption producing wrong tokens
|
||||
|
||||
---
|
||||
|
||||
#### Test 2: Cache Updates Don't Corrupt Across Layers
|
||||
```python
|
||||
def test_cache_layer_isolation():
|
||||
"""Verify each layer's cache is independent."""
|
||||
cache = KVCache(batch_size=1, max_seq_len=10, num_layers=3,
|
||||
num_heads=4, head_dim=16)
|
||||
|
||||
# Update each layer with unique data
|
||||
for layer_idx in range(3):
|
||||
key = Tensor(np.full((1, 4, 1, 16), layer_idx))
|
||||
val = Tensor(np.full((1, 4, 1, 16), layer_idx * 10))
|
||||
cache.update(layer_idx, key, val)
|
||||
|
||||
cache.advance()
|
||||
|
||||
# Verify each layer has its own data (no cross-contamination)
|
||||
for layer_idx in range(3):
|
||||
k, v = cache.get(layer_idx)
|
||||
assert np.all(k.data == layer_idx), f"Layer {layer_idx} key corrupted"
|
||||
assert np.all(v.data == layer_idx * 10), f"Layer {layer_idx} value corrupted"
|
||||
```
|
||||
|
||||
**Bug it catches**: Layer cache mixing causing quality degradation
|
||||
|
||||
---
|
||||
|
||||
#### Test 3: Training Mode Prevents Cache Usage
|
||||
```python
|
||||
def test_training_mode_disables_cache():
|
||||
"""Verify cache is disabled during training."""
|
||||
model = create_tiny_transformer()
|
||||
cache = enable_kv_cache(model)
|
||||
|
||||
# Training mode
|
||||
model.train()
|
||||
|
||||
# Forward pass should NOT use cache (needs gradients)
|
||||
input_ids = Tensor([[1, 2, 3, 4]])
|
||||
output = model(input_ids)
|
||||
|
||||
# Cache should not have been updated
|
||||
assert cache.seq_pos == 0, "Cache updated during training mode!"
|
||||
|
||||
# Inference mode
|
||||
model.eval()
|
||||
output = model(input_ids)
|
||||
|
||||
# Now cache should be updated
|
||||
assert cache.seq_pos > 0, "Cache not updated during eval mode!"
|
||||
```
|
||||
|
||||
**Bug it catches**: Incorrect gradients from cached computation
|
||||
|
||||
---
|
||||
|
||||
#### Test 4: Cache Memory Grows Correctly
|
||||
```python
|
||||
def test_cache_memory_scaling():
|
||||
"""Verify cache memory scales as expected."""
|
||||
configs = [
|
||||
# (layers, embed_dim, heads, seq_len, expected_mb)
|
||||
(2, 64, 4, 64, 0.1), # Tiny: <0.2 MB
|
||||
(4, 128, 8, 128, 2.0), # Small: ~2 MB
|
||||
(6, 256, 8, 256, 12.0), # Medium: ~12 MB
|
||||
]
|
||||
|
||||
for num_layers, embed_dim, num_heads, max_seq_len, expected_mb in configs:
|
||||
head_dim = embed_dim // num_heads
|
||||
cache = KVCache(
|
||||
batch_size=1,
|
||||
max_seq_len=max_seq_len,
|
||||
num_layers=num_layers,
|
||||
num_heads=num_heads,
|
||||
head_dim=head_dim
|
||||
)
|
||||
|
||||
mem_info = cache.get_memory_usage()
|
||||
actual_mb = mem_info['total_mb']
|
||||
|
||||
# Allow 20% tolerance for overhead
|
||||
assert 0.8 * expected_mb < actual_mb < 1.2 * expected_mb, \
|
||||
f"Memory scaling broken: expected ~{expected_mb}MB, got {actual_mb}MB"
|
||||
```
|
||||
|
||||
**Bug it catches**: OOM from unbounded cache growth
|
||||
|
||||
---
|
||||
|
||||
### HIGH (P1) - Degrade User Experience
|
||||
|
||||
#### Test 5: Batch Inference Maintains Independence
|
||||
```python
|
||||
def test_batch_cache_independence():
|
||||
"""Verify batch sequences don't interfere."""
|
||||
cache = KVCache(batch_size=4, max_seq_len=10, num_layers=2,
|
||||
num_heads=4, head_dim=16)
|
||||
|
||||
# Update with batch-specific data
|
||||
# Batch 0: all 0s, Batch 1: all 1s, etc.
|
||||
for step in range(3):
|
||||
for layer_idx in range(2):
|
||||
key = Tensor(np.stack([
|
||||
np.full((4, 1, 16), batch_idx)
|
||||
for batch_idx in range(4)
|
||||
]))
|
||||
val = key.copy()
|
||||
cache.update(layer_idx, key, val)
|
||||
cache.advance()
|
||||
|
||||
# Verify each batch maintained its own data
|
||||
for layer_idx in range(2):
|
||||
k, v = cache.get(layer_idx)
|
||||
for batch_idx in range(4):
|
||||
assert np.all(k.data[batch_idx] == batch_idx), \
|
||||
f"Batch {batch_idx} contaminated"
|
||||
```
|
||||
|
||||
**Bug it catches**: Batch cross-contamination causing non-deterministic output
|
||||
|
||||
---
|
||||
|
||||
#### Test 6: Cache Sequence Length Enforcement
|
||||
```python
|
||||
def test_cache_max_length_enforcement():
|
||||
"""Verify cache prevents exceeding max_seq_len."""
|
||||
cache = KVCache(batch_size=1, max_seq_len=5, num_layers=2,
|
||||
num_heads=4, head_dim=16)
|
||||
|
||||
# Fill cache to max
|
||||
for step in range(5):
|
||||
for layer_idx in range(2):
|
||||
key = Tensor(np.random.randn(1, 4, 1, 16))
|
||||
val = Tensor(np.random.randn(1, 4, 1, 16))
|
||||
cache.update(layer_idx, key, val)
|
||||
cache.advance()
|
||||
|
||||
# Attempting to exceed should raise error
|
||||
with pytest.raises(ValueError, match="max_seq_len"):
|
||||
key = Tensor(np.random.randn(1, 4, 1, 16))
|
||||
val = Tensor(np.random.randn(1, 4, 1, 16))
|
||||
cache.update(0, key, val) # Should fail
|
||||
```
|
||||
|
||||
**Bug it catches**: Unbounded generation causing OOM
|
||||
|
||||
---
|
||||
|
||||
#### Test 7: Cache Reset Functionality
|
||||
```python
|
||||
def test_cache_reset_clears_state():
|
||||
"""Verify reset() clears cache for reuse."""
|
||||
cache = KVCache(batch_size=1, max_seq_len=10, num_layers=2,
|
||||
num_heads=4, head_dim=16)
|
||||
|
||||
# Fill cache with data
|
||||
for step in range(3):
|
||||
for layer_idx in range(2):
|
||||
key = Tensor(np.ones((1, 4, 1, 16)))
|
||||
val = Tensor(np.ones((1, 4, 1, 16)))
|
||||
cache.update(layer_idx, key, val)
|
||||
cache.advance()
|
||||
|
||||
assert cache.seq_pos == 3
|
||||
|
||||
# Reset cache
|
||||
cache.reset()
|
||||
|
||||
# Verify clean state
|
||||
assert cache.seq_pos == 0
|
||||
k, v = cache.get(0)
|
||||
assert k.shape[2] == 0, "Cache not empty after reset"
|
||||
```
|
||||
|
||||
**Bug it catches**: Stale cache data corrupting next generation
|
||||
|
||||
---
|
||||
|
||||
### MEDIUM (P2) - Nice to Have
|
||||
|
||||
#### Test 8: enable_kv_cache() Integration with Real Model
|
||||
```python
|
||||
def test_enable_kv_cache_real_model():
|
||||
"""Verify enable_kv_cache() works with transformer model."""
|
||||
from tinytorch.models.transformer import GPT
|
||||
|
||||
model = GPT(vocab_size=100, embed_dim=64, num_layers=2,
|
||||
num_heads=4, max_seq_len=32)
|
||||
|
||||
# Enable cache
|
||||
cache = enable_kv_cache(model)
|
||||
|
||||
# Verify model attributes
|
||||
assert hasattr(model, '_kv_cache')
|
||||
assert hasattr(model, '_cache_enabled')
|
||||
assert model._cache_enabled == True
|
||||
|
||||
# Verify cache configuration matches model
|
||||
assert cache.num_layers == model.num_layers
|
||||
assert cache.num_heads == model.num_heads
|
||||
assert cache.max_seq_len == model.max_seq_len
|
||||
```
|
||||
|
||||
**Bug it catches**: enable_kv_cache() misconfiguration
|
||||
|
||||
---
|
||||
|
||||
#### Test 9: Cache Shape Compatibility with Attention
|
||||
```python
|
||||
def test_cache_shapes_match_attention_requirements():
|
||||
"""Verify cached K,V have correct shapes for attention."""
|
||||
cache = KVCache(batch_size=2, max_seq_len=10, num_layers=1,
|
||||
num_heads=4, head_dim=16)
|
||||
|
||||
# Simulate 3 generation steps
|
||||
for step in range(3):
|
||||
key = Tensor(np.random.randn(2, 4, 1, 16)) # (B, H, 1, D)
|
||||
val = Tensor(np.random.randn(2, 4, 1, 16))
|
||||
cache.update(0, key, val)
|
||||
cache.advance()
|
||||
|
||||
# Get cached K,V
|
||||
k, v = cache.get(0)
|
||||
|
||||
# Should have shape (B, H, seq_pos, D)
|
||||
assert k.shape == (2, 4, 3, 16), f"Wrong key shape: {k.shape}"
|
||||
assert v.shape == (2, 4, 3, 16), f"Wrong value shape: {v.shape}"
|
||||
|
||||
# Should be compatible with attention computation
|
||||
# Q: (B, H, 1, D) @ K.T: (B, H, D, seq_pos) → (B, H, 1, seq_pos)
|
||||
query = Tensor(np.random.randn(2, 4, 1, 16))
|
||||
scores = query @ k.transpose(-2, -1)
|
||||
assert scores.shape == (2, 4, 1, 3), "Attention computation failed"
|
||||
```
|
||||
|
||||
**Bug it catches**: Shape mismatch causing attention crashes
|
||||
|
||||
---
|
||||
|
||||
## Test Organization Recommendation
|
||||
|
||||
### Proposed Structure
|
||||
|
||||
```
|
||||
tests/15_memoization/
|
||||
├── test_progressive_integration.py # RENAME from TinyGPT tests
|
||||
│ ├── TestKVCacheAttentionIntegration
|
||||
│ │ ├── test_cache_enabled_generation_correctness (P0)
|
||||
│ │ ├── test_cache_layer_isolation (P0)
|
||||
│ │ └── test_cache_shapes_match_attention (P2)
|
||||
│ │
|
||||
│ ├── TestCacheGenerationLoop
|
||||
│ │ ├── test_training_mode_disables_cache (P0)
|
||||
│ │ ├── test_cache_max_length_enforcement (P1)
|
||||
│ │ └── test_cache_reset_clears_state (P1)
|
||||
│ │
|
||||
│ ├── TestCacheMemoryScaling
|
||||
│ │ ├── test_cache_memory_scaling (P0)
|
||||
│ │ └── test_batch_cache_independence (P1)
|
||||
│ │
|
||||
│ └── TestEnableKVCacheIntegration
|
||||
│ └── test_enable_kv_cache_real_model (P2)
|
||||
│
|
||||
└── test_kv_cache_unit.py # Unit tests (already exist in module)
|
||||
└── test_unit_kvcache() # From 17_memoization.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Summary Statistics
|
||||
|
||||
| Category | Count |
|
||||
|----------|-------|
|
||||
| **Total Integration Tests Needed** | 9 |
|
||||
| **Critical (P0)** | 4 |
|
||||
| **High Priority (P1)** | 3 |
|
||||
| **Medium Priority (P2)** | 2 |
|
||||
| **Current Integration Tests** | 0 |
|
||||
| **Coverage Gap** | 100% |
|
||||
|
||||
---
|
||||
|
||||
## Recommended Action Plan
|
||||
|
||||
### Phase 1: Critical Tests (Week 1)
|
||||
1. Implement P0 tests (4 tests)
|
||||
2. Verify with real model (create minimal transformer for testing)
|
||||
3. Fix any bugs discovered
|
||||
|
||||
### Phase 2: High Priority (Week 2)
|
||||
4. Implement P1 tests (3 tests)
|
||||
5. Add batch inference validation
|
||||
6. Add sequence length enforcement
|
||||
|
||||
### Phase 3: Medium Priority (Week 3)
|
||||
7. Implement P2 tests (2 tests)
|
||||
8. Complete integration with enable_kv_cache()
|
||||
9. Final validation pass
|
||||
|
||||
---
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
### Current Risk Level: **HIGH** ⚠️
|
||||
|
||||
**Without these integration tests:**
|
||||
- ✗ Cache corruption could go undetected → broken generation in production
|
||||
- ✗ Training mode cache usage → incorrect gradients → broken models
|
||||
- ✗ Memory leaks from unbounded cache → OOM crashes
|
||||
- ✗ Layer cache mixing → degraded output quality
|
||||
- ✗ Batch contamination → non-deterministic behavior
|
||||
|
||||
**With these integration tests:**
|
||||
- ✓ Catch cache corruption before deployment
|
||||
- ✓ Prevent training/inference mode bugs
|
||||
- ✓ Validate memory scaling behavior
|
||||
- ✓ Ensure layer independence
|
||||
- ✓ Guarantee batch inference correctness
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
Module 17 (Memoization/KV Cache) currently has **ZERO integration tests** despite implementing complex interactions with:
|
||||
- MultiHeadAttention (Module 12)
|
||||
- Transformer blocks (Module 13)
|
||||
- Generation loops
|
||||
- Training/inference mode switching
|
||||
- Multi-layer cache coordination
|
||||
|
||||
**Recommendation**: Prioritize implementing the 4 P0 tests IMMEDIATELY to prevent production issues. These tests would have caught cache corruption bugs that could silently degrade model quality.
|
||||
|
||||
The current test file is completely misnamed and tests the wrong module. It should be renamed and populated with the 9 integration tests outlined above.
|
||||
@@ -1,440 +0,0 @@
|
||||
# Module 16 Quantization - Integration Test Audit Report
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Current Status**: ❌ **CRITICAL - No integration tests implemented**
|
||||
**Test File**: `tests/16_quantization/test_quantization_integration.py`
|
||||
**Current Coverage**: 0% (stub file only)
|
||||
**Required Coverage**: Full integration with Modules 01-15
|
||||
|
||||
---
|
||||
|
||||
## Critical Integration Points (Missing Tests)
|
||||
|
||||
### 1. ✅ Model Integrity After Quantization
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🔴 CRITICAL - Bug Prevention
|
||||
|
||||
**What needs testing**:
|
||||
```python
|
||||
def test_quantization_preserves_model_structure():
|
||||
"""Verify quantization doesn't corrupt model from Modules 03-13."""
|
||||
# Test that quantized models can still:
|
||||
# - Forward pass with correct shapes
|
||||
# - Work with optimizers (Module 06)
|
||||
# - Train with Trainer (Module 07)
|
||||
# - Process batched data from DataLoader (Module 08)
|
||||
# - Integrate with Conv2D/MaxPool2D (Module 09)
|
||||
# - Work with attention mechanisms (Module 12)
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Quantization modifies model layers IN-PLACE
|
||||
- Must preserve API compatibility with all prior modules
|
||||
- Breaking changes would cascade through entire system
|
||||
- Students need confidence their models still work
|
||||
|
||||
**Test cases needed**:
|
||||
1. Quantize MLP → verify Dense layers still work
|
||||
2. Quantize CNN → verify Conv2D/MaxPool2D integration
|
||||
3. Quantize Transformer → verify attention/embeddings work
|
||||
4. Quantize then train → verify optimizer compatibility
|
||||
5. Quantize then profile → verify profiler (M14) integration
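
A hedged sketch of test case 1 above. The `TinyMLP` container and the `tinytorch.core.tensor` import path are assumptions; the `Linear` and `quantize_model` import paths follow the appendix example in this report and may differ from the actual package layout.

```python
# Sketch only: assumes quantize_model() mutates the model in place and accepts
# any object whose attributes are quantizable layers.
import numpy as np
from tinytorch.core.tensor import Tensor          # assumed import path
from tinytorch.core.layers import Linear
from tinytorch.optimization.quantization import quantize_model

class TinyMLP:
    """Minimal two-layer MLP used only as a quantization target."""
    def __init__(self):
        self.fc1 = Linear(8, 16)
        self.fc2 = Linear(16, 4)
    def forward(self, x):
        return self.fc2.forward(self.fc1.forward(x))

def test_quantize_mlp_preserves_structure():
    model = TinyMLP()
    x = Tensor(np.random.randn(2, 8))
    before = model.forward(x)

    quantize_model(model)        # in-place quantization
    after = model.forward(x)

    # Dense layers must still expose forward() and keep their output shape.
    assert hasattr(model.fc1, "forward") and hasattr(model.fc2, "forward")
    assert after.shape == before.shape
```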
|
||||
|
||||
---
|
||||
|
||||
### 2. ✅ Output Similarity Validation
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🔴 CRITICAL - Accuracy Validation
|
||||
|
||||
**What needs testing**:
|
||||
```python
|
||||
def test_quantized_output_matches_float32():
|
||||
"""Verify quantized models produce similar outputs to FP32."""
|
||||
# Given: Original FP32 model
|
||||
# When: Quantize to INT8
|
||||
# Then: Output error < 1% (stricter than the absolute 0.2 threshold used in the unit test)
|
||||
|
||||
# Test across:
|
||||
# - Different model architectures (MLP, CNN, Transformer)
|
||||
# - Different input distributions (uniform, normal, realistic)
|
||||
# - Different weight distributions (Xavier, He, pre-trained)
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Unit tests use random weights (not realistic)
|
||||
- Integration tests need realistic scenarios
|
||||
- Must validate on actual model architectures
|
||||
- Accuracy loss should be < 1% in production
|
||||
|
||||
**Test cases needed**:
|
||||
1. Simple MLP on random data (baseline)
|
||||
2. CNN on image-like data (spatial patterns)
|
||||
3. Attention on sequence data (positional dependencies)
|
||||
4. Pre-trained weights (realistic distributions)
|
||||
5. Edge cases: very small/large activation ranges
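
A minimal reference for the < 1% criterion, using only NumPy (no TinyTorch APIs): quantize a toy FP32 output to INT8, dequantize, and bound the mean relative error. The symmetric linear quantizer below is an illustrative stand-in for whatever the module implements.

```python
import numpy as np

# Pure-NumPy round trip: FP32 -> INT8 -> FP32, then measure relative error.
np.random.seed(0)
fp32_out = np.random.uniform(-1.0, 1.0, size=(32, 10)).astype(np.float32)

scale = np.abs(fp32_out).max() / 127.0                 # symmetric linear quantization
int8_out = np.clip(np.round(fp32_out / scale), -128, 127).astype(np.int8)
dequant_out = int8_out.astype(np.float32) * scale

rel_err = np.abs(dequant_out - fp32_out).mean() / np.abs(fp32_out).mean()
print(f"mean relative round-trip error: {rel_err:.3%}")
assert rel_err < 0.01, "round-trip error should stay under the 1% budget"
```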
|
||||
|
||||
---
|
||||
|
||||
### 3. ⚠️ In-Place Modification Warning System
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟡 HIGH - Student Safety
|
||||
|
||||
**What needs testing**:
|
||||
```python
|
||||
def test_quantization_in_place_warning():
|
||||
"""Verify students are warned about destructive operations."""
|
||||
# Test that:
|
||||
# 1. quantize_model() warns about in-place modification
|
||||
# 2. Documentation clearly states weights are LOST
|
||||
# 3. Example shows copy.deepcopy() pattern
|
||||
# 4. Error handling for trying to "unquantize"
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Students will lose their trained models
|
||||
- Can't recover FP32 weights after quantization
|
||||
- Common mistake in production (quantize checkpoint by accident)
|
||||
- Educational: teach defensive programming patterns
|
||||
|
||||
**Test cases needed**:
|
||||
1. Verify warning message displays
|
||||
2. Test that original model IS modified
|
||||
3. Verify deepcopy() prevents modification (see the sketch after this list)
|
||||
4. Test error message for invalid recovery attempts
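
A minimal sketch of the defensive pattern test case 3 refers to, assuming `quantize_model()` mutates the model in place as described above. The single `Linear` layer standing in for a trained model and the `tinytorch.*` import paths are taken from the appendix example and may differ from the actual API.

```python
import copy
import numpy as np
from tinytorch.core.layers import Linear
from tinytorch.optimization.quantization import quantize_model

# A single Linear layer stands in for a trained model (assumption: quantize_model
# accepts it directly; the real API may expect a full model object).
model = Linear(16, 4)
fp32_backup = copy.deepcopy(model)   # defensive copy BEFORE the destructive call

quantize_model(model)                # in-place: the live model's FP32 weights are gone

# The backup is a separate object and still holds the original floating-point weights.
assert fp32_backup is not model
assert fp32_backup.weight.data.dtype != np.int8
```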
|
||||
|
||||
---
|
||||
|
||||
### 4. 💾 Memory Reduction Measurement
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟡 HIGH - Core Value Proposition
|
||||
|
||||
**What needs testing**:
|
||||
```python
|
||||
def test_quantization_actual_memory_reduction():
|
||||
"""Measure ACTUAL memory savings, not theoretical."""
|
||||
# Test that:
|
||||
# 1. INT8 tensors use 1 byte (not 4 bytes)
|
||||
# 2. Compression ratio ≈ 4× in practice
|
||||
# 3. Memory profiler (M14) shows real savings
|
||||
# 4. Savings persist after forward/backward passes
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Unit tests calculate theoretical savings
|
||||
- Need to verify ACTUAL memory usage
|
||||
- Python's memory model can be tricky (views, copies)
|
||||
- Students need to see real impact
|
||||
|
||||
**Test cases needed**:
|
||||
1. Profile memory before/after quantization
|
||||
2. Verify dtype is actually int8 (not float32)
|
||||
3. Test memory during forward pass (no hidden FP32 copies)
|
||||
4. Measure total process memory (OS-level)
|
||||
5. Compare with Module 14 profiler predictions
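
The ~4x claim can be sanity-checked with plain NumPy before any TinyTorch integration; the sketch below assumes nothing beyond `ndarray.nbytes` and a symmetric INT8 quantizer.

```python
import numpy as np

# Plain-NumPy illustration of the ~4x memory claim (no TinyTorch APIs assumed).
weights_fp32 = np.random.randn(256, 256).astype(np.float32)

# Symmetric linear quantization to int8: scale maps max |w| to 127.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.clip(np.round(weights_fp32 / scale), -128, 127).astype(np.int8)

ratio = weights_fp32.nbytes / weights_int8.nbytes
print(f"fp32: {weights_fp32.nbytes} B, int8: {weights_int8.nbytes} B, ratio: {ratio:.1f}x")
assert ratio == 4.0   # 4 bytes per float32 element vs 1 byte per int8 element
```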
|
||||
|
||||
---
|
||||
|
||||
## Additional Missing Integration Tests
|
||||
|
||||
### 5. 🔄 Backward Compatibility
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟡 HIGH
|
||||
|
||||
```python
|
||||
def test_quantized_models_work_with_existing_code():
|
||||
"""Verify quantized models integrate seamlessly."""
|
||||
# Test that quantized models work with:
|
||||
# - DataLoader batching
|
||||
# - Training loops
|
||||
# - Gradient computation (if supported)
|
||||
# - Model saving/loading
|
||||
```
|
||||
|
||||
### 6. 🚨 Edge Cases and Error Handling
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟢 MEDIUM
|
||||
|
||||
```python
|
||||
def test_quantization_edge_cases():
|
||||
"""Test corner cases that might break."""
|
||||
# Test:
|
||||
# - Quantizing already quantized model (should error)
|
||||
# - Quantizing model with no Linear layers
|
||||
# - Quantizing with empty calibration data
|
||||
# - Quantizing constant weights (all zeros, all ones)
|
||||
# - Quantizing extreme ranges (very small, very large)
|
||||
```
|
||||
|
||||
### 7. 📊 Profiler Integration (Module 14)
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟢 MEDIUM
|
||||
|
||||
```python
|
||||
def test_quantization_with_profiler():
|
||||
"""Verify M14 profiler works with M16 quantization."""
|
||||
# Test that:
|
||||
# - Profiler can measure quantized models
|
||||
# - Memory measurements are accurate
|
||||
# - Parameter counting works correctly
|
||||
# - Benchmark results make sense
|
||||
```
|
||||
|
||||
### 8. 🏗️ Multi-Layer Model Integration
|
||||
**Status**: ❌ MISSING
|
||||
**Priority**: 🟡 HIGH
|
||||
|
||||
```python
|
||||
def test_quantization_complex_architectures():
|
||||
"""Test quantization on realistic architectures."""
|
||||
# Test:
|
||||
# - ResNet-like skip connections
|
||||
# - Multi-head attention models
|
||||
# - Mixed CNN + Transformer
|
||||
# - Models with shared weights (embeddings)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Comparison with Other Modules
|
||||
|
||||
### Module 14 (Profiling) Integration Test Pattern
|
||||
```python
|
||||
# Module 14 tests verify:
|
||||
✅ Complete system (01→14) still works
|
||||
✅ Multi-modal models work correctly
|
||||
✅ Advanced features integrate properly
|
||||
✅ Regression prevention for all prior modules
|
||||
```
|
||||
|
||||
### Module 16 Should Follow Same Pattern
|
||||
```python
|
||||
# Module 16 needs:
|
||||
❌ Complete system (01→15) verification
|
||||
❌ Quantized multi-modal models
|
||||
❌ Integration with profiling/compression
|
||||
❌ Regression prevention
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recommended Test Implementation Order
|
||||
|
||||
### Phase 1: Critical Bug Prevention (Week 1)
|
||||
1. **test_quantization_preserves_model_structure()** - Prevent breaking changes
|
||||
2. **test_quantized_output_matches_float32()** - Validate accuracy preservation
|
||||
3. **test_quantization_actual_memory_reduction()** - Verify core value prop
|
||||
|
||||
### Phase 2: Student Safety (Week 2)
|
||||
4. **test_quantization_in_place_warning()** - Prevent data loss
|
||||
5. **test_quantized_models_work_with_existing_code()** - Ensure usability
|
||||
6. **test_quantization_edge_cases()** - Handle corner cases
|
||||
|
||||
### Phase 3: Advanced Integration (Week 3)
|
||||
7. **test_quantization_with_profiler()** - M14 + M16 integration
|
||||
8. **test_quantization_complex_architectures()** - Real-world scenarios
|
||||
9. **test_complete_tinytorch_system_stable()** - Full regression suite
|
||||
|
||||
---
|
||||
|
||||
## Test Coverage Gaps - Detailed Analysis
|
||||
|
||||
### Current Unit Test Coverage (in module)
|
||||
✅ `test_unit_quantize_int8()` - Basic quantization works
|
||||
✅ `test_unit_dequantize_int8()` - Basic dequantization works
|
||||
✅ `test_unit_quantized_linear()` - Single layer quantization
|
||||
✅ `test_unit_quantize_model()` - Model-level quantization
|
||||
✅ `test_unit_compare_model_sizes()` - Memory comparison
|
||||
|
||||
### Missing Integration Coverage
|
||||
❌ **Cross-module compatibility** - No tests verify M16 works with M01-M15
|
||||
❌ **Real-world scenarios** - No tests on realistic architectures
|
||||
❌ **Production patterns** - No tests for deployment workflows
|
||||
❌ **Error recovery** - No tests for handling failures gracefully
|
||||
❌ **Performance validation** - No tests verify speedup claims
|
||||
❌ **Hardware compatibility** - No tests for different backends
|
||||
|
||||
---
|
||||
|
||||
## Bug-Catching Priorities
|
||||
|
||||
### P0: Critical Bugs (Would break student work)
|
||||
1. **Quantization corrupts model state** → Students lose trained models
|
||||
2. **Output accuracy degradation > 5%** → Models become useless
|
||||
3. **Memory not actually reduced** → False promises
|
||||
4. **In-place modification without warning** → Silent data loss
|
||||
|
||||
### P1: High-Impact Bugs (Would frustrate students)
|
||||
5. **Quantized models incompatible with training** → Can't fine-tune
|
||||
6. **Profiler breaks on quantized models** → Can't measure impact
|
||||
7. **Edge cases crash silently** → Hard to debug
|
||||
|
||||
### P2: Quality Issues (Would confuse students)
|
||||
8. **Inconsistent compression ratios** → Unclear value proposition
|
||||
9. **Calibration doesn't improve accuracy** → Wasted complexity
|
||||
10. **Documentation claims don't match reality** → Trust issues
|
||||
|
||||
---
|
||||
|
||||
## Recommended Test File Structure
|
||||
|
||||
```python
|
||||
"""
|
||||
Integration tests for Module 16: Quantization
|
||||
Tests INT8 quantization, model preservation, and system integration
|
||||
"""
|
||||
|
||||
class TestQuantizationModelIntegrity:
|
||||
"""Verify quantization preserves model structure and functionality."""
|
||||
|
||||
def test_quantize_mlp_preserves_structure()
|
||||
def test_quantize_cnn_preserves_spatial_ops()
|
||||
def test_quantize_transformer_preserves_attention()
|
||||
def test_quantized_model_trains_correctly()
|
||||
def test_quantized_model_profiles_correctly()
|
||||
|
||||
|
||||
class TestQuantizationAccuracy:
|
||||
"""Verify quantized models maintain acceptable accuracy."""
|
||||
|
||||
def test_mlp_output_similarity()
|
||||
def test_cnn_output_similarity()
|
||||
def test_transformer_output_similarity()
|
||||
def test_calibrated_vs_uncalibrated_accuracy()
|
||||
def test_quantization_error_within_1_percent()
|
||||
|
||||
|
||||
class TestQuantizationMemorySavings:
|
||||
"""Verify actual memory reduction matches claims."""
|
||||
|
||||
def test_int8_tensor_actual_memory()
|
||||
def test_compression_ratio_approximately_4x()
|
||||
def test_memory_savings_persist_during_inference()
|
||||
def test_profiler_measures_savings_correctly()
|
||||
def test_os_level_memory_reduction()
|
||||
|
||||
|
||||
class TestQuantizationSafety:
|
||||
"""Verify safe usage patterns and error handling."""
|
||||
|
||||
def test_in_place_modification_warning()
|
||||
def test_cannot_unquantize_model()
|
||||
def test_deepcopy_prevents_modification()
|
||||
def test_quantizing_quantized_model_errors()
|
||||
def test_edge_case_constant_tensors()
|
||||
|
||||
|
||||
class TestQuantizationSystemIntegration:
|
||||
"""Verify quantization works with complete TinyTorch system."""
|
||||
|
||||
def test_complete_system_01_to_15_stable()
|
||||
def test_quantized_dataloader_pipeline()
|
||||
def test_quantized_training_workflow()
|
||||
def test_quantization_plus_profiling()
|
||||
def test_multimodal_model_quantization()
|
||||
|
||||
|
||||
class TestQuantizationEdgeCases:
|
||||
"""Test corner cases and error conditions."""
|
||||
|
||||
def test_empty_calibration_data()
|
||||
def test_zero_weights_quantization()
|
||||
def test_extreme_activation_ranges()
|
||||
def test_model_with_no_linear_layers()
|
||||
def test_single_layer_quantization_error()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Success Metrics
|
||||
|
||||
### Minimum Acceptable Coverage
|
||||
- ✅ All P0 bugs prevented (4/4 tests)
|
||||
- ✅ Integration with M01-M15 verified (5+ tests)
|
||||
- ✅ Real-world scenarios tested (3+ architectures)
|
||||
- ✅ Memory savings validated (actual measurements)
|
||||
|
||||
### Gold Standard Coverage
|
||||
- ✅ All recommended tests implemented (20+ tests)
|
||||
- ✅ Cross-module regression suite (like M14)
|
||||
- ✅ Performance benchmarks included
|
||||
- ✅ Error handling comprehensive
|
||||
|
||||
---
|
||||
|
||||
## Next Actions
|
||||
|
||||
### Immediate (This Sprint)
|
||||
1. Create basic test structure (5 test classes)
|
||||
2. Implement P0 critical tests (4 tests)
|
||||
3. Add model integrity tests (5 tests)
|
||||
|
||||
### Short-term (Next Sprint)
|
||||
4. Implement accuracy validation (5 tests)
|
||||
5. Add memory measurement tests (5 tests)
|
||||
6. Create safety/warning tests (5 tests)
|
||||
|
||||
### Long-term (Future Sprints)
|
||||
7. Complete edge case coverage
|
||||
8. Add performance benchmarks
|
||||
9. Create comprehensive regression suite
|
||||
10. Document test patterns for future modules
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Test Examples
|
||||
|
||||
### Example: Critical Integration Test
|
||||
|
||||
```python
|
||||
def test_quantization_preserves_cnn_functionality():
|
||||
"""
|
||||
CRITICAL: Verify quantized CNN still works with spatial operations.
|
||||
|
||||
Bug this catches:
|
||||
- Quantization breaks Conv2D/MaxPool2D integration
|
||||
- Shape mismatches after quantization
|
||||
- Gradient flow issues (if backward supported)
|
||||
"""
|
||||
from tinytorch.core.spatial import Conv2D, MaxPool2D
|
||||
from tinytorch.core.layers import Linear
|
||||
from tinytorch.core.activations import ReLU
|
||||
from tinytorch.optimization.quantization import quantize_model
|
||||
|
||||
# Build realistic CNN
|
||||
conv1 = Conv2D(3, 16, kernel_size=3)
|
||||
pool = MaxPool2D(kernel_size=2)
|
||||
conv2 = Conv2D(16, 32, kernel_size=3)
|
||||
flatten = None  # placeholder for the flatten operation (elided in the original report)
|
||||
fc = Linear(800, 10) # Assume flattened size
|
||||
|
||||
model = SimpleCNN(conv1, pool, conv2, flatten, fc)
|
||||
|
||||
# Test original
|
||||
x = Tensor(np.random.randn(4, 3, 32, 32))
|
||||
original_output = model.forward(x)
|
||||
|
||||
# Quantize (in-place)
|
||||
quantize_model(model)
|
||||
|
||||
# Test quantized
|
||||
quantized_output = model.forward(x)
|
||||
|
||||
# Assertions
|
||||
assert quantized_output.shape == original_output.shape, \
|
||||
"Quantization changed output shape - BREAKS SYSTEM"
|
||||
|
||||
error = np.mean(np.abs(original_output.data - quantized_output.data))
|
||||
assert error < 0.5, \
|
||||
f"Quantization error {error:.3f} too high for CNN"
|
||||
|
||||
# Verify Conv2D layers still work
|
||||
assert hasattr(model.conv1, 'forward'), \
|
||||
"Quantization broke Conv2D API"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Report Generated**: 2024-11-25
|
||||
**Auditor**: Claude (ML Systems QA)
|
||||
**Status**: Ready for implementation
|
||||
@@ -1,453 +0,0 @@
|
||||
# Module 17 (Compression/Pruning) - Integration Test Audit Report
|
||||
|
||||
**Audit Date**: 2025-11-25
|
||||
**Auditor**: QA Agent
|
||||
**Module**: 17 - Compression (Pruning, Knowledge Distillation)
|
||||
**Status**: CRITICAL GAPS IDENTIFIED
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Current State**: Module 17 has ONLY a placeholder integration test file with no actual tests.
|
||||
|
||||
**Risk Level**: HIGH - Module is exported to production package but lacks integration validation.
|
||||
|
||||
**Critical Finding**: The checkpoint test (checkpoint_17_compression.py) expects completely different APIs than what's implemented in the actual module.
|
||||
|
||||
---
|
||||
|
||||
## 1. Current Test Coverage
|
||||
|
||||
### Existing Test Files
|
||||
```
|
||||
tests/17_compression/
|
||||
├── test_compression_integration.py ❌ PLACEHOLDER ONLY (23 lines, no real tests)
|
||||
├── run_all_tests.py ✅ Exists but returns PENDING status
|
||||
└── __pycache__/
|
||||
```
|
||||
|
||||
### Current Coverage: 0%
|
||||
- **Unit Tests**: None in integration directory
|
||||
- **Integration Tests**: Placeholder only
|
||||
- **Progressive Tests**: Missing entirely
|
||||
- **Cross-Module Tests**: None
|
||||
|
||||
---
|
||||
|
||||
## 2. Critical Integration Points for Module 17
|
||||
|
||||
Based on the actual implementation (`tinytorch/optimization/compression.py`), these are the critical integration points that MUST be tested:
|
||||
|
||||
### 2.1 Pruning Doesn't Corrupt Shared Weight References
|
||||
**Risk**: High - Pruning modifies weights in-place
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: CRITICAL
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
# Multiple layers sharing same weight tensor
|
||||
layer1 = Linear(10, 20)
|
||||
layer2 = Linear(10, 20)
layer2.weight = layer1.weight  # Shared reference (tied weights)
|
||||
model = SimpleModel(layer1, layer2)  # SimpleModel: hypothetical two-layer container
|
||||
|
||||
magnitude_prune(model, sparsity=0.5)
|
||||
|
||||
# CRITICAL: Verify both references see the same pruned weights
|
||||
# CRITICAL: Verify gradients still flow correctly through shared weights
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Weight sharing is common (e.g., tied embeddings in transformers)
|
||||
- In-place pruning could break reference sharing
|
||||
- Could cause silent accuracy degradation
|
||||
|
||||
### 2.2 Sparse Models Still Train Correctly
|
||||
**Risk**: High - Pruning creates zeros that must stay zero during training
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: CRITICAL
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
model = create_simple_mlp()
|
||||
magnitude_prune(model, sparsity=0.7)
|
||||
|
||||
# Train for several steps
|
||||
for _ in range(10):
|
||||
output = model.forward(input)
|
||||
loss = compute_loss(output, target)
|
||||
loss.backward()
|
||||
optimizer.step()
|
||||
|
||||
# CRITICAL: Verify pruned weights remain zero after training
|
||||
# CRITICAL: Verify unpruned weights still update normally
|
||||
# CRITICAL: Verify loss decreases despite sparsity
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Pruned weights should stay pruned during fine-tuning
|
||||
- Optimizer updates could "resurrect" pruned weights (see the sketch below)
|
||||
- Gradient flow through sparse matrices can be unstable
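
The sketch below illustrates, in plain NumPy, the mask re-application pattern such a test would exercise; the explicit `mask` is illustrative, while the actual TinyTorch entry point is the in-place `magnitude_prune(model, sparsity)` shown above.

```python
import numpy as np

np.random.seed(0)
weights = np.random.randn(8, 8).astype(np.float32)

# Magnitude pruning: zero the ~70% smallest-magnitude weights and remember the mask.
threshold = np.quantile(np.abs(weights), 0.7)
mask = (np.abs(weights) > threshold).astype(np.float32)
weights *= mask

# Simulated dense SGD updates would "resurrect" pruned weights ...
for _ in range(10):
    grad = np.random.randn(8, 8).astype(np.float32)
    weights -= 0.1 * grad
    weights *= mask          # ... unless the mask is re-applied after every step

assert np.all(weights[mask == 0] == 0), "pruned weights must stay exactly zero"
```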
|
||||
|
||||
### 2.3 Sparsity Measurement Consistency
|
||||
**Risk**: Medium - Different measurement methods should agree
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: MEDIUM
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
model = create_model()
|
||||
magnitude_prune(model, sparsity=0.6)
|
||||
|
||||
# Measure sparsity multiple ways
|
||||
sparsity_v1 = measure_sparsity(model) # Current implementation
|
||||
sparsity_v2 = manual_count_zeros(model) / total_params(model)
|
||||
sparsity_v3 = CompressionComplete.measure_sparsity(model)
|
||||
|
||||
# CRITICAL: All methods should agree within 1%
|
||||
assert abs(sparsity_v1 - sparsity_v2) < 0.01
|
||||
assert abs(sparsity_v1 - sparsity_v3) < 0.01
|
||||
```
|
||||
|
||||
**Why this matters**:
|
||||
- Inconsistent sparsity metrics confuse students
|
||||
- Could hide bugs in pruning implementation
|
||||
- Affects compression ratio calculations
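
A reference implementation the consistency test could compare against; `manual_sparsity` is a hypothetical helper that works on raw NumPy arrays, so it has no dependency on the model API.

```python
import numpy as np

def manual_sparsity(weight_arrays):
    """Reference sparsity: fraction of exactly-zero entries across all given arrays."""
    total = sum(w.size for w in weight_arrays)
    zeros = sum(int(np.count_nonzero(w == 0)) for w in weight_arrays)
    return zeros / total if total else 0.0

# Example: one dense and one 50%-pruned matrix -> overall sparsity 0.25.
dense = np.ones((10, 10))
half_pruned = np.ones((10, 10))
half_pruned[:5, :] = 0.0
assert abs(manual_sparsity([dense, half_pruned]) - 0.25) < 1e-9
```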
|
||||
|
||||
### 2.4 Pruned Model Inference Works
|
||||
**Risk**: High - Sparse operations must produce correct outputs
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: HIGH
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
# Create model, train it, get baseline accuracy
|
||||
model = create_and_train_model()
|
||||
baseline_output = model.forward(test_input)
|
||||
|
||||
# Prune and verify inference still works
|
||||
magnitude_prune(model, sparsity=0.7)
|
||||
pruned_output = model.forward(test_input)
|
||||
|
||||
# CRITICAL: Output shape unchanged
|
||||
assert pruned_output.shape == baseline_output.shape
|
||||
|
||||
# CRITICAL: Output values reasonable (not NaN/Inf)
|
||||
assert not np.any(np.isnan(pruned_output.data))
|
||||
assert not np.any(np.isinf(pruned_output.data))
|
||||
|
||||
# CRITICAL: Output changes are bounded
|
||||
max_change = np.max(np.abs(pruned_output.data - baseline_output.data))
|
||||
assert max_change < 10.0 # Reasonable threshold
|
||||
```
|
||||
|
||||
### 2.5 Structured vs Unstructured Pruning Interaction
|
||||
**Risk**: Medium - Both pruning types modify same weights
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: MEDIUM
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
model = create_model()
|
||||
|
||||
# Apply both pruning types
|
||||
magnitude_prune(model, sparsity=0.5) # Unstructured
|
||||
initial_sparsity = measure_sparsity(model)
|
||||
|
||||
structured_prune(model, prune_ratio=0.3) # Structured
|
||||
final_sparsity = measure_sparsity(model)
|
||||
|
||||
# CRITICAL: Sparsity should increase (or stay same)
|
||||
assert final_sparsity >= initial_sparsity
|
||||
|
||||
# CRITICAL: Model still functional
|
||||
output = model.forward(test_input)
|
||||
assert output.shape == expected_shape
|
||||
```
|
||||
|
||||
### 2.6 Knowledge Distillation Integration
|
||||
**Risk**: High - KD loss depends on correct tensor operations
|
||||
**Current Coverage**: 0%
|
||||
**Bug Potential**: HIGH
|
||||
|
||||
**What to test**:
|
||||
```python
|
||||
teacher = create_large_model()
|
||||
student = create_small_model()
|
||||
|
||||
kd = KnowledgeDistillation(teacher, student, temperature=3.0, alpha=0.7)
|
||||
|
||||
# Generate predictions
|
||||
teacher_logits = teacher.forward(input)
|
||||
student_logits = student.forward(input)
|
||||
true_labels = np.array([0, 1, 2, 3])
|
||||
|
||||
# Compute distillation loss
|
||||
loss = kd.distillation_loss(student_logits, teacher_logits, true_labels)
|
||||
|
||||
# CRITICAL: Loss is a scalar
|
||||
assert np.isscalar(loss) or (isinstance(loss, np.ndarray) and loss.size == 1)
|
||||
|
||||
# CRITICAL: Loss is positive and finite
|
||||
assert loss > 0
|
||||
assert not np.isnan(loss)
|
||||
assert not np.isinf(loss)
|
||||
|
||||
# CRITICAL: Alpha parameter affects loss composition
|
||||
loss_high_alpha = KnowledgeDistillation(teacher, student, alpha=0.9).distillation_loss(student_logits, teacher_logits, true_labels)
|
||||
loss_low_alpha = KnowledgeDistillation(teacher, student, alpha=0.1).distillation_loss(student_logits, teacher_logits, true_labels)
|
||||
# Different alpha should give different losses
|
||||
assert abs(loss_high_alpha - loss_low_alpha) > 0.01
|
||||
```
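
For reference, the standard Hinton-style distillation loss such a test could cross-check against, in plain NumPy. Whether TinyTorch's `KnowledgeDistillation.distillation_loss` uses exactly this weighting (including the T² factor on the soft term) is an assumption.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, temperature=3.0, alpha=0.7):
    """Reference loss: alpha * T^2 * KL(teacher_soft || student_soft) + (1 - alpha) * CE."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()

    hard = softmax(student_logits)   # T = 1 for the hard-label term
    ce = -np.log(hard[np.arange(len(labels)), labels] + 1e-12).mean()

    return alpha * temperature**2 * kl + (1.0 - alpha) * ce

# Sanity check matching the assertions above: scalar, positive, finite.
rng = np.random.default_rng(0)
loss = kd_loss(rng.normal(size=(4, 10)), rng.normal(size=(4, 10)), np.array([0, 1, 2, 3]))
assert np.ndim(loss) == 0
assert np.isfinite(loss) and loss > 0
```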
|
||||
|
||||
---
|
||||
|
||||
## 3. Missing Progressive Integration Tests
|
||||
|
||||
Module 17 integration tests should verify the ENTIRE stack (Modules 01-17) still works:
|
||||
|
||||
### 3.1 Prior Stack Regression Tests (MISSING)
|
||||
```python
|
||||
class TestPriorStackStillWorking:
|
||||
"""Verify Modules 01-16 unchanged after compression development."""
|
||||
|
||||
def test_quantization_still_works(self):
|
||||
"""Module 16 (Quantization) should be unaffected."""
|
||||
# Test quantization APIs still functional
|
||||
|
||||
def test_profiling_still_works(self):
|
||||
"""Module 14 (Profiling) should be unaffected."""
|
||||
# Test profiling APIs still functional
|
||||
|
||||
def test_training_pipeline_stable(self):
|
||||
"""Complete training pipeline (Modules 01-07) should work."""
|
||||
# End-to-end training test
|
||||
```
|
||||
|
||||
### 3.2 Cross-Module Integration Tests (MISSING)
|
||||
```python
|
||||
class TestCompressionWithOtherModules:
|
||||
"""Test compression works with other advanced modules."""
|
||||
|
||||
def test_compression_with_quantization(self):
|
||||
"""Test: Prune first, then quantize."""
|
||||
model = create_model()
|
||||
magnitude_prune(model, sparsity=0.7)
|
||||
quantize_model(model, bits=8)
|
||||
# Verify both optimizations work together
|
||||
|
||||
def test_compression_with_attention(self):
|
||||
"""Test: Prune attention mechanisms."""
|
||||
attention = MultiHeadAttention(64, 8)
|
||||
structured_prune(attention, prune_ratio=0.3)
|
||||
# Verify attention still computes correctly
|
||||
|
||||
def test_compression_with_spatial_conv(self):
|
||||
"""Test: Prune CNN filters."""
|
||||
conv = Conv2D(3, 64, kernel_size=3)
|
||||
structured_prune(conv, prune_ratio=0.5)
|
||||
# Verify convolutions still work
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. API Mismatch with Checkpoint Test
|
||||
|
||||
**CRITICAL ISSUE**: The checkpoint test expects completely different APIs than what's implemented!
|
||||
|
||||
### Expected APIs (from checkpoint_17_compression.py):
|
||||
```python
|
||||
from tinytorch.nn.utils.prune import (
|
||||
MagnitudePruner, # ❌ Class-based API
|
||||
prune_conv_filters, # ❌ Specialized function
|
||||
CompressionAnalyzer # ❌ Analysis class
|
||||
)
|
||||
|
||||
pruner = MagnitudePruner()
|
||||
pruned_weights, mask, stats = pruner.prune(test_weights, sparsity=0.7)
|
||||
```
|
||||
|
||||
### Actual Implementation (in compression.py):
|
||||
```python
|
||||
from tinytorch.optimization.compression import (
|
||||
magnitude_prune, # ✅ Function-based API
|
||||
structured_prune, # ✅ Function-based API
|
||||
KnowledgeDistillation, # ✅ KD class
|
||||
measure_sparsity, # ✅ Utility function
|
||||
compress_model # ✅ Pipeline function
|
||||
)
|
||||
|
||||
magnitude_prune(model, sparsity=0.7) # In-place, no mask/stats returned
|
||||
```
|
||||
|
||||
### Resolution Required:
|
||||
1. **Option A**: Update checkpoint to match actual implementation
|
||||
2. **Option B**: Extend implementation to match checkpoint expectations
|
||||
3. **Option C**: Document API differences and maintain both
|
||||
|
||||
**Recommendation**: Option A - Update checkpoint to match the cleaner functional API actually implemented.
|
||||
|
||||
---
|
||||
|
||||
## 5. Bug-Catching Test Priorities
|
||||
|
||||
### Priority 1: CRITICAL (Could cause silent failures)
|
||||
1. **Shared weight corruption test** - Highest risk for silent accuracy degradation
|
||||
2. **Training with pruned weights test** - Optimizer could resurrect pruned weights
|
||||
3. **Knowledge distillation loss validity test** - Invalid loss breaks training
|
||||
|
||||
### Priority 2: HIGH (Could cause obvious failures)
|
||||
4. **Pruned model inference test** - Ensures basic functionality works
|
||||
5. **Sparsity measurement consistency test** - Prevents metric confusion
|
||||
6. **Cross-module integration tests** - Ensures compression doesn't break other modules
|
||||
|
||||
### Priority 3: MEDIUM (Quality of life issues)
|
||||
7. **Structured vs unstructured interaction test** - Edge case handling
|
||||
8. **Progressive stack regression tests** - Prevent accidental breakage
|
||||
9. **Performance profiling tests** - Verify compression actually improves performance
|
||||
|
||||
---
|
||||
|
||||
## 6. Recommended Test Structure
|
||||
|
||||
```
|
||||
tests/17_compression/
|
||||
├── test_progressive_integration.py # NEW - Progressive stack tests
|
||||
│ ├── TestPriorStackStillWorking # Modules 01-16 regression
|
||||
│ ├── TestModule17CompressionCore # Core compression functionality
|
||||
│ ├── TestProgressiveStackIntegration # Full stack (01-17) integration
|
||||
│ └── TestRegressionPrevention # Prevent breakage
|
||||
│
|
||||
├── test_compression_integration.py # EXPAND - Currently placeholder
|
||||
│ ├── TestPruningIntegration # In-place pruning behavior
|
||||
│ ├── TestSparsityConsistency # Measurement accuracy
|
||||
│ ├── TestKnowledgeDistillation # KD integration
|
||||
│ └── TestCrossModuleInteraction # With quantization, attention, etc.
|
||||
│
|
||||
├── test_pruning_edge_cases.py # NEW - Edge case handling
|
||||
│ ├── TestSharedWeightReferences # CRITICAL
|
||||
│ ├── TestTrainingAfterPruning # CRITICAL
|
||||
│ ├── TestExtremeSparsity # 0%, 100% sparsity
|
||||
│ └── TestInvalidInputHandling # Error cases
|
||||
│
|
||||
└── test_compression_performance.py # NEW - Performance validation
|
||||
├── TestMemoryReduction # Actual memory savings
|
||||
├── TestInferenceSpeed # Sparse inference performance
|
||||
└── TestCompressionQuality # Accuracy preservation
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Sample Integration Test Implementation
|
||||
|
||||
Here's a sample of what the CRITICAL shared weight test should look like:
|
||||
|
||||
```python
|
||||
def test_pruning_with_shared_weights():
|
||||
"""CRITICAL: Verify pruning doesn't corrupt shared weight references."""
|
||||
print("🔬 Testing pruning with shared weight references...")
|
||||
|
||||
# Create two layers sharing the same weight tensor
|
||||
layer1 = Linear(100, 50)
|
||||
layer2 = Linear(100, 50)
|
||||
|
||||
# Share weights (common pattern: tied embeddings)
|
||||
layer2.weight = layer1.weight # Share reference
|
||||
|
||||
# Create model with shared weights
|
||||
model = SimpleModel(layer1, layer2)
|
||||
|
||||
# Verify weights are actually shared before pruning
|
||||
original_id = id(layer1.weight.data)
|
||||
assert id(layer2.weight.data) == original_id, "Weights should be shared"
|
||||
|
||||
# Apply magnitude pruning
|
||||
magnitude_prune(model, sparsity=0.6)
|
||||
|
||||
# CRITICAL TEST 1: Weights still shared after pruning
|
||||
assert id(layer1.weight.data) == id(layer2.weight.data), \
|
||||
"Pruning should preserve weight sharing"
|
||||
|
||||
# CRITICAL TEST 2: Both layers see the same pruned pattern
|
||||
assert np.array_equal(layer1.weight.data, layer2.weight.data), \
|
||||
"Shared weights should have identical pruning masks"
|
||||
|
||||
# CRITICAL TEST 3: Sparsity is correct
|
||||
sparsity = np.sum(layer1.weight.data == 0) / layer1.weight.data.size
|
||||
assert 0.55 <= sparsity <= 0.65, \
|
||||
f"Expected ~60% sparsity, got {sparsity:.1%}"
|
||||
|
||||
# CRITICAL TEST 4: Forward pass works with shared pruned weights
|
||||
input_data = Tensor(np.random.randn(10, 100))
|
||||
output1 = layer1.forward(input_data)
|
||||
output2 = layer2.forward(input_data)
|
||||
|
||||
# Both layers should produce identical outputs (same weights)
|
||||
assert np.allclose(output1.data, output2.data), \
|
||||
"Shared pruned weights should produce identical outputs"
|
||||
|
||||
print("✅ Shared weight pruning works correctly!")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Actionable Recommendations
|
||||
|
||||
### Immediate Actions (This Sprint)
|
||||
1. **Create test_progressive_integration.py** - Following Module 02 pattern
|
||||
2. **Implement 6 critical integration tests** - Focus on shared weights, training, KD
|
||||
3. **Resolve checkpoint API mismatch** - Update checkpoint or extend implementation
|
||||
4. **Add cross-module tests** - Compression + Quantization, Compression + Attention
|
||||
|
||||
### Short-term Actions (Next Sprint)
|
||||
5. **Add edge case tests** - Extreme sparsity, invalid inputs, error handling
|
||||
6. **Add performance validation tests** - Verify actual memory/speed improvements
|
||||
7. **Document integration patterns** - How compression interacts with other modules
|
||||
8. **Create test data fixtures** - Reusable models for testing
|
||||
|
||||
### Long-term Actions (Future)
|
||||
9. **Continuous integration monitoring** - Add to CI/CD pipeline
|
||||
10. **Property-based testing** - Use Hypothesis for generative test cases
|
||||
11. **Benchmark suite** - Performance regression detection
|
||||
12. **Student confusion monitoring** - Track common errors in integration
|
||||
|
||||
---
|
||||
|
||||
## 9. Risk Assessment
|
||||
|
||||
| Risk Category | Likelihood | Impact | Mitigation Priority |
|
||||
|---------------|------------|--------|---------------------|
|
||||
| Shared weight corruption | HIGH | CRITICAL | P1 - Immediate |
|
||||
| Training resurrects pruned weights | HIGH | CRITICAL | P1 - Immediate |
|
||||
| KD loss computation errors | MEDIUM | HIGH | P1 - Immediate |
|
||||
| Sparsity measurement bugs | MEDIUM | MEDIUM | P2 - Short-term |
|
||||
| Cross-module incompatibility | LOW | HIGH | P2 - Short-term |
|
||||
| API confusion (checkpoint mismatch) | HIGH | MEDIUM | P1 - Immediate |
|
||||
|
||||
---
|
||||
|
||||
## 10. Conclusion
|
||||
|
||||
**Module 17 (Compression) has ZERO integration test coverage despite being exported to production.**
|
||||
|
||||
**Highest-risk gaps**:
|
||||
1. No validation that pruning preserves shared weight references
|
||||
2. No validation that pruned models can still train
|
||||
3. No validation that knowledge distillation produces valid losses
|
||||
4. Complete API mismatch with checkpoint expectations
|
||||
|
||||
**Recommended action**: Implement the 6 critical integration tests IMMEDIATELY before any student uses this module in combination with other modules.
|
||||
|
||||
**Estimated effort**:
|
||||
- Critical tests (Priority 1): 4-6 hours
|
||||
- High-priority tests (Priority 2): 3-4 hours
|
||||
- Progressive integration structure: 2-3 hours
|
||||
- **Total**: 9-13 hours to achieve acceptable coverage
|
||||
|
||||
**Next steps**: Review this audit with Module Developer, prioritize critical tests, assign implementation tasks.
|
||||
|
||||
---
|
||||
|
||||
**Audit completed**: 2025-11-25
|
||||
**Reviewed by**: QA Agent
|
||||
**Status**: APPROVED FOR DEVELOPMENT
|
||||
@@ -1,615 +0,0 @@
|
||||
# Module 19 (Benchmarking) - Integration Test Audit Report
|
||||
|
||||
**Audit Date**: 2025-11-25
|
||||
**Module**: 19_benchmarking
|
||||
**Current Test File**: `tests/19_benchmarking/test_benchmarking_integration.py`
|
||||
**Status**: STUB ONLY - NO IMPLEMENTATION
|
||||
|
||||
---
|
||||
|
||||
## EXECUTIVE SUMMARY
|
||||
|
||||
**CRITICAL FINDING**: Module 19 integration tests are completely unimplemented (TODO stub only).
|
||||
|
||||
- **Current Coverage**: 0% (stub file with TODO comments)
|
||||
- **Expected Coverage**: ~80% for production-ready benchmarking system
|
||||
- **Priority**: HIGH - Benchmarking is final implementation module and capstone foundation
|
||||
- **Risk**: Students cannot validate benchmarking correctness or integration with optimization modules
|
||||
|
||||
---
|
||||
|
||||
## 1. CURRENT TEST COVERAGE ANALYSIS
|
||||
|
||||
### 1.1 What EXISTS (Stub Only)
|
||||
|
||||
```python
|
||||
def test_benchmarking_integration():
|
||||
"""Test benchmarking system integration."""
|
||||
# TODO: Implement integration tests
|
||||
# - Test benchmark runner
|
||||
# - Test performance metrics collection
|
||||
# - Test result validation
|
||||
# - Test comparison with baselines
|
||||
# - Test leaderboard submission
|
||||
pass
|
||||
```
|
||||
|
||||
**Lines of Code**: 24 (all comments/stubs)
|
||||
**Actual Tests**: 0
|
||||
**Integration Scenarios**: 0
|
||||
|
||||
### 1.2 What Module 19 IMPLEMENTS (2546 lines)
|
||||
|
||||
Module 19 provides comprehensive benchmarking infrastructure:
|
||||
|
||||
**Core Components**:
|
||||
1. `BenchmarkResult` - Statistical analysis container
|
||||
2. `PreciseTimer` - High-precision timing infrastructure
|
||||
3. `Benchmark` - Multi-model comparison framework
|
||||
4. `BenchmarkSuite` - Comprehensive multi-metric evaluation
|
||||
5. `TinyMLPerf` - Industry-standard benchmark runner
|
||||
6. `compare_optimization_techniques()` - Optimization comparison engine
|
||||
|
||||
**Key Integration Points**:
|
||||
- Uses `Profiler` from Module 14 for measurements
|
||||
- Uses `Tensor` from Module 01 for data handling
|
||||
- Should work with optimized models from Modules 15-18
|
||||
- Generates reports for TorchPerf Olympics capstone
|
||||
|
||||
---
|
||||
|
||||
## 2. CRITICAL INTEGRATION POINTS FOR MODULE 19
|
||||
|
||||
### 2.1 Real Model Performance Measurement
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Benchmark measures ACTUAL model latency (not simulated)
|
||||
✗ Benchmark measures REAL memory usage (not estimates)
|
||||
✗ Benchmark handles different model types (TinyTorch, PyTorch, custom)
|
||||
✗ Benchmark works with models from previous modules (Conv2D, MLP, Transformer)
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- Students need to benchmark their actual implementations, not mock models
|
||||
- Profiler integration must work correctly with real TinyTorch models
|
||||
- Duck-typing (hasattr checks) must handle various model interfaces
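
A minimal sketch of the duck-typed dispatch described above; `call_model` is a hypothetical helper, not the actual `Benchmark` internals.

```python
def call_model(model, x):
    """Invoke a model regardless of whether it exposes forward(), predict(), or __call__."""
    if hasattr(model, "forward"):
        return model.forward(x)
    if hasattr(model, "predict"):
        return model.predict(x)
    if callable(model):
        return model(x)
    raise TypeError(f"{type(model).__name__} exposes no forward()/predict()/__call__")
```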
|
||||
|
||||
### 2.2 Statistical Validity of Measurements
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Confidence intervals calculated correctly
|
||||
✗ Warmup runs eliminate cold-start effects
|
||||
✗ Measurement variance is reasonable (CV < 20%)
|
||||
✗ Outlier detection prevents skewed results
|
||||
✗ Sample size recommendations are valid
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- Poor statistics lead to incorrect optimization decisions
|
||||
- Benchmarking is worthless without statistical rigor
|
||||
- Students must learn to trust/distrust measurements
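
A minimal sketch of the statistics such a test would validate, assuming the normal-approximation interval (z = 1.96) and the CV < 20% bound quoted above; for very small sample counts a t-distribution critical value would be more appropriate.

```python
import numpy as np

def confidence_interval_95(samples_ms):
    """95% CI for the mean latency using the normal approximation (z = 1.96)."""
    samples = np.asarray(samples_ms, dtype=float)
    n = samples.size
    if n < 2:
        raise ValueError("need at least 2 measurements for a confidence interval")
    mean = samples.mean()
    sem = samples.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    return mean - 1.96 * sem, mean + 1.96 * sem

# Example: 30 latency measurements around 5 ms with mild noise.
rng = np.random.default_rng(42)
latencies = 5.0 + 0.2 * rng.standard_normal(30)
low, high = confidence_interval_95(latencies)
cv = latencies.std(ddof=1) / latencies.mean()      # coefficient of variation
print(f"mean={latencies.mean():.3f} ms, 95% CI=({low:.3f}, {high:.3f}), CV={cv:.1%}")
assert cv < 0.20, "variance too high to trust the benchmark (CV >= 20%)"
```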
|
||||
|
||||
### 2.3 Resource Exhaustion Prevention
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Memory benchmarks don't cause OOM crashes
|
||||
✗ Large models don't hang the benchmarking system
|
||||
✗ Timeout mechanisms prevent infinite loops
|
||||
✗ Graceful degradation when resources are limited
|
||||
✗ Clean resource cleanup after benchmarks
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- Benchmarking shouldn't crash student systems
|
||||
- Edge cases (huge models, limited RAM) must be handled
|
||||
- Production systems require robust error handling
|
||||
|
||||
### 2.4 Benchmark Results Reproducibility
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Same model produces consistent results across runs
|
||||
✗ Randomness is controlled (seeded) where needed
|
||||
✗ System state doesn't affect benchmark validity
|
||||
✗ Results can be serialized/deserialized correctly
|
||||
✗ Comparison across different machines is meaningful
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- TorchPerf Olympics requires reproducible submissions
|
||||
- Students must be able to verify their optimizations
|
||||
- Leaderboard requires fair comparisons
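
A sketch of the serialization round trip such a test would assert on; the result fields below are illustrative, not the actual `BenchmarkResult` schema.

```python
import json

result = {
    "model": "mlp_baseline",
    "latency_ms": {"mean": 4.913, "std": 0.172, "n": 100},
    "memory_mb": 12.5,
    "seed": 1234,
}

serialized = json.dumps(result, sort_keys=True)
restored = json.loads(serialized)

assert restored == result, "serialization must not lose or alter any field"
assert json.dumps(restored, sort_keys=True) == serialized, "round trip must be stable"
```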
|
||||
|
||||
### 2.5 Optimization Module Integration (M15-18)
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Benchmark works with quantized models (Module 15)
|
||||
✗ Benchmark works with pruned models (Module 16)
|
||||
✗ Benchmark works with distilled models (Module 17)
|
||||
✗ Benchmark works with fused operators (Module 18)
|
||||
✗ compare_optimization_techniques() handles all optimization types
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- Module 19 is the EVALUATION framework for Modules 15-18
|
||||
- Without integration, students can't validate optimizations
|
||||
- Capstone requires combining multiple optimization techniques
|
||||
|
||||
### 2.6 TinyMLPerf Standard Compliance
|
||||
|
||||
**What Needs Testing**:
|
||||
```python
|
||||
✗ Standard benchmarks (keyword_spotting, image_classification, etc.) run correctly
|
||||
✗ Compliance thresholds enforced properly
|
||||
✗ Report generation matches MLPerf format
|
||||
✗ Leaderboard submission format is valid
|
||||
✗ Results are comparable to official MLPerf baselines
|
||||
```
|
||||
|
||||
**Why Critical**:
|
||||
- Industry-standard benchmarking teaches professional practices
|
||||
- Capstone submissions require MLPerf-style reporting
|
||||
- Career preparation for ML engineering roles
|
||||
|
||||
---
|
||||
|
||||
## 3. MISSING INTEGRATION TESTS (BY PRIORITY)
|
||||
|
||||
### PRIORITY 1: Core Benchmarking Workflow (CRITICAL)
|
||||
|
||||
**Test**: `test_benchmark_real_tinytorch_models()`
|
||||
```python
|
||||
def test_benchmark_real_tinytorch_models():
|
||||
"""
|
||||
✅ TEST: Benchmark should measure REAL TinyTorch models correctly
|
||||
|
||||
VALIDATES:
|
||||
- Integration with Tensor, Linear, Conv2D from earlier modules
|
||||
- Profiler from Module 14 works in benchmarking context
|
||||
- Latency/memory measurements are realistic (not zero, not infinite)
|
||||
- Results structure is correct and serializable
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Model.forward() not being called correctly
|
||||
- Profiler returning None or invalid measurements
|
||||
- Memory tracking not working with TinyTorch tensors
|
||||
- Duck-typing failures with real TinyTorch models
|
||||
"""
|
||||
```
|
||||
|
||||
**Bug Examples**:
|
||||
- Benchmark tries to call `model.predict()` but TinyTorch uses `model.forward()`
|
||||
- Memory measurement returns 0 for all models
|
||||
- Latency measurement includes warmup time incorrectly
|
||||
|
||||
---
|
||||
|
||||
**Test**: `test_statistical_validity()`
|
||||
```python
|
||||
def test_statistical_validity():
|
||||
"""
|
||||
✅ TEST: Statistical analysis should be mathematically correct
|
||||
|
||||
VALIDATES:
|
||||
- Confidence intervals calculated using proper formulas
|
||||
- Mean/std/median computed correctly
|
||||
- Sample size sufficient for statistical significance
|
||||
- Variance is reasonable (not too high or too low)
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Wrong critical value (z ≈ 1.96 for a 95% CI; small samples should use the t-distribution, not 1.96)
|
||||
- Division by zero when n=1
|
||||
- CI width unreasonably large (>50% of mean)
|
||||
- Outliers not handled properly
|
||||
"""
|
||||
```
|
||||
|
||||
**Bug Examples**:
|
||||
- Confidence interval calculation uses wrong formula
|
||||
- Single measurement causes divide-by-zero in std calculation
|
||||
- Outliers skew results (one 100ms measurement among 1ms measurements)
|
||||
|
||||
---
|
||||
|
||||
**Test**: `test_benchmark_suite_multi_metric()`
|
||||
```python
|
||||
def test_benchmark_suite_multi_metric():
|
||||
"""
|
||||
✅ TEST: BenchmarkSuite should run all metrics and combine results
|
||||
|
||||
VALIDATES:
|
||||
- Latency, accuracy, memory, energy all measured
|
||||
- Results structure contains all metrics
|
||||
- Pareto frontier analysis identifies optimal models
|
||||
- Report generation produces valid output
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- One metric failing breaks entire suite
|
||||
- Results missing some metrics
|
||||
- Pareto analysis chooses dominated solutions
|
||||
- Energy estimation produces negative values
|
||||
"""
|
||||
```
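
A sketch of the dominance check behind a correct Pareto-frontier analysis; the metric tuples are illustrative (lower is better on every axis) and do not reflect the actual `BenchmarkSuite` output format.

```python
# Each candidate: (latency_ms, memory_mb, error_rate) -- lower is better for all three.
def dominates(a, b):
    """True if candidate a is at least as good as b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_frontier(candidates):
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

models = {
    "baseline":  (10.0, 40.0, 0.08),
    "quantized": (6.0, 10.0, 0.09),
    "pruned":    (7.0, 15.0, 0.10),   # dominated by "quantized" on all three metrics
}
frontier = pareto_frontier(list(models.values()))
assert models["pruned"] not in frontier   # a dominated model must never be recommended
```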
|
||||
|
||||
---
|
||||
|
||||
### PRIORITY 2: Optimization Integration (HIGH)
|
||||
|
||||
**Test**: `test_optimization_module_integration()`
|
||||
```python
|
||||
def test_optimization_module_integration():
|
||||
"""
|
||||
✅ TEST: Benchmark should work with models from optimization modules
|
||||
|
||||
VALIDATES:
|
||||
- Quantized models (Module 15) benchmark correctly
|
||||
- Pruned models (Module 16) show reduced memory
|
||||
- Distilled models (Module 17) measured accurately
|
||||
- Fused operators (Module 18) show speedups
|
||||
- compare_optimization_techniques() generates valid comparisons
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Quantized model measurement crashes
|
||||
- Pruned model memory doesn't decrease
|
||||
- Fused operators show no speedup
|
||||
- Comparison function fails with empty models
|
||||
"""
|
||||
```
|
||||
|
||||
**Bug Examples**:
|
||||
- Quantized model forward() returns wrong dtype, crashes Profiler
|
||||
- Pruned model parameter counting doesn't account for sparse weights
|
||||
- Comparison assumes all models have same interface
|
||||
|
||||
---
|
||||
|
||||
**Test**: `test_optimization_recommendations()`
|
||||
```python
|
||||
def test_optimization_recommendations():
|
||||
"""
|
||||
✅ TEST: Recommendation engine should provide actionable guidance
|
||||
|
||||
VALIDATES:
|
||||
- Recommendations match use case constraints
|
||||
- Latency-critical use case chooses fastest model
|
||||
- Memory-constrained use case chooses smallest model
|
||||
- Balanced use case considers multiple metrics
|
||||
- Recommendations include reasoning
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Latency-critical recommends slowest model
|
||||
- Memory-constrained ignores memory metric
|
||||
- Recommendations contradict actual measurements
|
||||
- Reasoning is generic (not specific to results)
|
||||
"""
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### PRIORITY 3: Robustness & Edge Cases (MEDIUM)
|
||||
|
||||
**Test**: `test_resource_exhaustion_prevention()`
|
||||
```python
|
||||
def test_resource_exhaustion_prevention():
|
||||
"""
|
||||
✅ TEST: Benchmark should handle resource constraints gracefully
|
||||
|
||||
VALIDATES:
|
||||
- Large models don't cause OOM crashes
|
||||
- Long-running benchmarks can be interrupted
|
||||
- Memory is cleaned up after benchmarks
|
||||
- Timeout prevents infinite loops
|
||||
- Error messages are helpful
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Memory leak in benchmark loop
|
||||
- No timeout on model.forward() calls
|
||||
- Crash instead of graceful degradation
|
||||
- Resources not released on exception
|
||||
"""
|
||||
```
|
||||
|
||||
**Bug Examples**:
|
||||
- Benchmarking 1GB model crashes with OOM
|
||||
- Infinite loop in warmup phase (no timeout)
|
||||
- Memory leak: each benchmark run consumes more memory
|
||||
|
||||
---
|
||||
|
||||
**Test**: `test_benchmark_reproducibility()`
|
||||
```python
|
||||
def test_benchmark_reproducibility():
|
||||
"""
|
||||
✅ TEST: Benchmark results should be reproducible
|
||||
|
||||
VALIDATES:
|
||||
- Same model gives consistent results across runs
|
||||
- Random seed controls variability
|
||||
- Serialized results match original
|
||||
- Deserialized results can be compared
|
||||
- Variance is within acceptable bounds (CV < 10%)
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Results vary wildly between identical runs (CV > 50%)
|
||||
- Serialization loses precision
|
||||
- Deserialization fails on valid files
|
||||
- No seed control for reproducibility
|
||||
"""
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Test**: `test_edge_case_models()`
|
||||
```python
|
||||
def test_edge_case_models():
|
||||
"""
|
||||
✅ TEST: Benchmark should handle unusual model types
|
||||
|
||||
VALIDATES:
|
||||
- Empty model (no parameters) doesn't crash
|
||||
- Single-parameter model benchmarks correctly
|
||||
- Model with no forward() method fails gracefully
|
||||
- Model returning wrong shape is caught
|
||||
- Non-tensor outputs handled appropriately
|
||||
|
||||
🐛 BUG-CATCHING:
|
||||
- Empty model causes division by zero
|
||||
- Missing forward() crashes instead of error message
|
||||
- Wrong output shape causes silent failure
|
||||
- Non-tensor output crashes Profiler
|
||||
"""
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### PRIORITY 4: TinyMLPerf & Capstone (MEDIUM-HIGH)
|
||||
|
||||
**Test**: `test_tinymlperf_standard_benchmarks()`
|
||||
```python
|
||||
def test_tinymlperf_standard_benchmarks():
|
||||
"""
|
||||
✅ TEST: TinyMLPerf should run standard industry benchmarks
|
||||
|
||||
    VALIDATES:
    - All standard benchmarks (keyword_spotting, image_classification, etc.) run
    - Compliance thresholds enforced correctly
    - Report format matches MLPerf specification
    - Leaderboard submission JSON is valid
    - Results comparable to reference implementations

    🐛 BUG-CATCHING:
    - Benchmark names don't match MLPerf standard
    - Compliance check uses wrong thresholds
    - Report missing required fields
    - JSON serialization produces invalid format
    """
```

---

**Test**: `test_torchperf_olympics_workflow()`
```python
def test_torchperf_olympics_workflow():
    """
    ✅ TEST: TorchPerf Olympics submission workflow should work end-to-end

    VALIDATES:
    - Student can choose Olympic event
    - Benchmark runs for chosen event
    - Results validated against event constraints
    - Submission package generated correctly
    - Leaderboard ranking calculated properly

    🐛 BUG-CATCHING:
    - Event constraints not enforced
    - Invalid submission passes validation
    - Ranking algorithm broken (ties handled wrong)
    - Submission package missing required files
    """
```

---

### PRIORITY 5: Progressive Integration (MEDIUM)

**Test**: `test_complete_tinytorch_system_still_works()`
```python
def test_complete_tinytorch_system_still_works():
    """
    🔄 REGRESSION: Complete TinyTorch system (Modules 01-18) should still work

    VALIDATES:
    - Tensor, activations, layers still functional
    - Training loops still work
    - Optimization modules (15-18) still work
    - Benchmarking doesn't break existing functionality

    🐛 BUG-CATCHING:
    - Benchmarking imports break core modules
    - Profiler integration interferes with training
    - Circular dependencies introduced
    """
```

---

## 4. REFERENCE: GOOD INTEGRATION TEST STRUCTURE

Based on `tests/02_activations/test_progressive_integration.py`:

```python
"""
Module 19: Progressive Integration Tests
Tests that Module 19 (Benchmarking) works correctly AND that the entire TinyTorch system still works.

DEPENDENCY CHAIN: 01_tensor → ... → 18_fusion → 19_benchmarking → Capstone
Final validation before TorchPerf Olympics capstone project.
"""

import numpy as np
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent))


class TestModules01Through18StillWorking:
    """Verify all previous modules still work after benchmarking development."""

    def test_core_modules_stable(self):
        """Ensure core modules (01-09) weren't broken."""
        # Test imports and basic functionality
        pass

    def test_optimization_modules_stable(self):
        """Ensure optimization modules (15-18) still work."""
        # Test quantization, pruning, distillation, fusion
        pass


class TestModule19BenchmarkingCore:
    """Test Module 19 core benchmarking functionality."""

    def test_benchmark_result_statistics(self):
        """Test BenchmarkResult calculates statistics correctly."""
        pass

    def test_benchmark_runner_real_models(self):
        """Test Benchmark class with real TinyTorch models."""
        pass

    def test_benchmark_suite_multi_metric(self):
        """Test BenchmarkSuite runs all metrics."""
        pass

    def test_tinymlperf_compliance(self):
        """Test TinyMLPerf standard benchmarks."""
        pass


class TestProgressiveStackIntegration:
    """Test complete stack (01→19) works together."""

    def test_benchmark_optimized_models_pipeline(self):
        """Test benchmarking pipeline with models from optimization modules."""
        # Create base model
        # Apply optimization (quantize, prune, etc.)
        # Benchmark both
        # Verify comparison results
        pass

    def test_torchperf_olympics_submission_workflow(self):
        """Test end-to-end capstone submission workflow."""
        # Choose event
        # Optimize model
        # Benchmark
        # Generate submission
        # Validate submission
        pass
```

---

## 5. BUG-CATCHING PRIORITIES

### 5.1 CRITICAL Bugs (Would Break Capstone)

1. **Benchmark fails with real TinyTorch models** → Students can't validate their work
2. **Statistical calculations wrong** → Incorrect optimization decisions
3. **Memory measurement always returns 0** → Can't evaluate memory optimizations
4. **Profiler integration broken** → No measurements at all
5. **compare_optimization_techniques() crashes** → Can't compare optimizations

### 5.2 HIGH-PRIORITY Bugs (Would Mislead Students)

6. **Confidence intervals calculated incorrectly** → False confidence in results
7. **Warmup runs not working** → Cold-start bias in measurements
8. **Pareto frontier analysis chooses dominated solutions** → Wrong recommendations
9. **Energy estimation produces negative values** → Meaningless results
10. **Reproducibility broken** → Can't verify submissions

### 5.3 MEDIUM-PRIORITY Bugs (Would Cause Confusion)

11. **Duck-typing fails with custom models** → Limits flexibility
12. **Resource exhaustion crashes system** → Poor student experience
13. **Serialization loses precision** → Comparison errors
14. **Report generation missing metrics** → Incomplete analysis
15. **Timeout not implemented** → Infinite loops possible

---

## 6. RECOMMENDED IMPLEMENTATION ORDER

### Phase 1: Core Functionality (Week 1)
1. `test_benchmark_real_tinytorch_models()` - CRITICAL
2. `test_statistical_validity()` - CRITICAL (see the sketch after this list)
3. `test_benchmark_suite_multi_metric()` - CRITICAL
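
To make the intent of `test_statistical_validity()` concrete, here is a minimal sketch; the field names (`latencies_ms`, `mean`, `ci95`) are assumptions, not the actual TinyTorch `BenchmarkResult` API, and the real test would compare these hand-computed references against whatever the class reports.

```python
import numpy as np

def test_statistical_validity():
    """Check mean/std/CI math against hand-computed reference values (hypothetical API)."""
    latencies_ms = np.array([10.0, 12.0, 11.0, 13.0, 9.0])

    mean = latencies_ms.mean()
    std = latencies_ms.std(ddof=1)                          # sample std, not population std
    half_width = 1.96 * std / np.sqrt(len(latencies_ms))    # normal-approximation 95% CI
    ci95 = (mean - half_width, mean + half_width)

    # Reference values computed independently by hand
    assert np.isclose(mean, 11.0)
    assert np.isclose(std, np.sqrt(2.5))
    assert ci95[0] < mean < ci95[1]
```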

### Phase 2: Optimization Integration (Week 2)
4. `test_optimization_module_integration()` - HIGH
5. `test_optimization_recommendations()` - HIGH
6. `test_complete_tinytorch_system_still_works()` - HIGH (regression)

### Phase 3: Robustness (Week 3)
7. `test_resource_exhaustion_prevention()` - MEDIUM
8. `test_benchmark_reproducibility()` - MEDIUM (see the sketch after this list)
9. `test_edge_case_models()` - MEDIUM
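
A reproducibility check can be sketched without committing to the final benchmarking API; `run_latency_benchmark` below is a hypothetical stand-in for whatever Module 19 exposes, and the tolerance is deliberately loose because wall-clock timing is noisy.

```python
import time
import numpy as np

def run_latency_benchmark(model_fn, x, runs=20, warmup=3):
    """Hypothetical helper: time a callable on a fixed input and return mean latency in ms."""
    for _ in range(warmup):            # warmup runs avoid cold-start bias
        model_fn(x)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        model_fn(x)
        times.append((time.perf_counter() - start) * 1000.0)
    return float(np.mean(times))

def test_benchmark_reproducibility():
    x = np.ones((32, 16))
    weights = np.full((16, 8), 0.1)
    model_fn = lambda inp: inp @ weights   # deterministic toy "model"

    first = run_latency_benchmark(model_fn, x)
    second = run_latency_benchmark(model_fn, x)

    # Timing is noisy, so "reproducible" means close, not bit-identical
    assert abs(first - second) / max(first, second) < 0.5
```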

### Phase 4: Capstone Preparation (Week 4)
10. `test_tinymlperf_standard_benchmarks()` - MEDIUM-HIGH
11. `test_torchperf_olympics_workflow()` - MEDIUM-HIGH

---

## 7. ACCEPTANCE CRITERIA

Module 19 integration tests are COMPLETE when:

- [ ] **Benchmark works with real TinyTorch models** (Tensor, Linear, Conv2D, MLP, Transformer)
- [ ] **Statistical analysis is mathematically correct** (CI, mean, std validated)
- [ ] **All metrics measured correctly** (latency, memory, accuracy, energy)
- [ ] **Optimization modules integrate properly** (quantization, pruning, distillation, fusion)
- [ ] **Resource exhaustion prevented** (OOM, timeouts, cleanup tested)
- [ ] **Results are reproducible** (same model → consistent results)
- [ ] **TinyMLPerf compliance validated** (standard benchmarks run correctly)
- [ ] **Capstone workflow tested end-to-end** (Olympics submission works)
- [ ] **Progressive integration verified** (all previous modules still work)
- [ ] **Test coverage ≥ 80%** for critical integration points

---

## 8. CONCLUSION

**Current State**: CRITICAL GAP - No integration tests implemented

**Risk Level**: HIGH
- Students cannot validate benchmarking correctness
- Capstone project (TorchPerf Olympics) has no test foundation
- Integration with optimization modules unverified
- Statistical validity unchecked

**Recommendation**: IMPLEMENT IMMEDIATELY
- Start with Phase 1 (core functionality) ASAP
- Module 19 is the final implementation module before capstone
- Benchmarking is the EVALUATION framework for all optimizations
- Without tests, students cannot trust their measurements

**Estimated Effort**: 3-4 weeks for complete implementation
- Week 1: Core benchmarking tests (3 tests, ~500 LOC)
- Week 2: Optimization integration tests (3 tests, ~400 LOC)
- Week 3: Robustness tests (3 tests, ~300 LOC)
- Week 4: Capstone workflow tests (2 tests, ~300 LOC)

**Total**: ~11 comprehensive integration tests, ~1500 LOC

---

**Next Steps**:
1. Implement `test_benchmark_real_tinytorch_models()` first (most critical)
2. Add `test_statistical_validity()` (foundation for all analysis)
3. Proceed through phases systematically
4. Test with real student models from earlier modules
5. Validate capstone workflow before student submission deadlines
@@ -1,119 +0,0 @@
# CLI Command Files - Usage Report

## Summary

**Status**: ✅ All files are accounted for. Some are imported but not exposed as top-level commands.

## File Categories

### 1. ✅ Registered Top-Level Commands (18)
These are in `TinyTorchCLI.commands` and accessible via `tito <command>`:

```
benchmark, book, checkpoint, community, demo, export,
grade, leaderboard, logo, milestones, module, nbgrader,
olympics, package, setup, src, system, test
```

### 2. 🔧 Internal Subcommands (7)
**Imported and used by other commands, but not top-level:**

| File | Used By | Purpose |
|------|---------|---------|
| `reset.py` | `package.py` | Reset functionality for package command |
| `module_reset.py` | `module_workflow.py` | Module reset subcommand |
| `status.py` | - | Imported in main.py but not clearly used |
| `nbdev.py` | `package.py` | NBDev integration for package command |
| `info.py` | `system.py`, `health.py` | System info subcommand |
| `health.py` | `system.py` | System health check subcommand |
| `jupyter.py` | `system.py` | Jupyter integration subcommand |

**Action**: ✅ **KEEP THESE** - They're used by other commands

### 3. ❓ Imported but Unclear Usage (2)

| File | Issue | Recommendation |
|------|-------|----------------|
| `notebooks.py` | Imported in main.py, but no usage found | Check if used, otherwise remove import |
| `status.py` | Imported in main.py, but no clear usage | Check if used, otherwise remove import |

**Action**: Need to verify these

### 4. 🗑️ Likely Unused/Deprecated (8)

| File | Status |
|------|--------|
| `check.py` | Not imported anywhere |
| `clean.py` | Not imported anywhere |
| `clean_workspace.py` | Not imported anywhere |
| `help.py` | Not imported anywhere |
| `protect.py` | Not imported anywhere |
| `report.py` | Not imported anywhere |
| `version.py` | Not imported anywhere |
| `view.py` | Not imported anywhere |

**Action**: ⚠️ Safe to delete (not imported anywhere)

## Cleanup Actions

### Step 1: Remove Dead Imports from main.py

These are imported but not registered or used:

```python
# Remove from tito/main.py lines 28-37:
from .commands.notebooks import NotebooksCommand  # ❌ Not used
from .commands.status import StatusCommand        # ❌ Not used (verify first)
```

### Step 2: Delete Truly Unused Files

```bash
# These are safe to delete (not imported anywhere)
rm tito/commands/check.py
rm tito/commands/clean.py
rm tito/commands/clean_workspace.py
rm tito/commands/help.py
rm tito/commands/protect.py
rm tito/commands/report.py
rm tito/commands/version.py
rm tito/commands/view.py
```

### Step 3: Verify and Update Tests

Update `test_cli_registry.py` to remove deleted files from `known_internal`:

```python
known_internal = {
    'health.py',        # Used by system command
    'info.py',          # Used by system command
    'jupyter.py',       # Used by system command
    'nbdev.py',         # Used by package command
    'notebooks.py',     # Verify if needed, otherwise remove
    'reset.py',         # Used by package command
    'status.py',        # Verify if needed, otherwise remove
    'module_reset.py'   # Used by module_workflow command
}
```
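
For context, the registry check that consumes `known_internal` might look roughly like the sketch below; the exact assertions in `tests/cli/test_cli_registry.py` may differ, so treat this as an illustration of the pattern rather than the real test.

```python
from pathlib import Path

def test_every_command_file_is_registered_or_known_internal():
    """Every .py file under tito/commands/ must be a top-level command or a known helper."""
    registered = {
        'benchmark', 'book', 'checkpoint', 'community', 'demo', 'export',
        'grade', 'leaderboard', 'logo', 'milestones', 'module', 'nbgrader',
        'olympics', 'package', 'setup', 'src', 'system', 'test',
    }
    known_internal = {
        'health.py', 'info.py', 'jupyter.py', 'nbdev.py',
        'reset.py', 'module_reset.py',
    }

    commands_dir = Path('tito/commands')
    for path in commands_dir.glob('*.py'):
        if path.name == '__init__.py':
            continue
        assert path.stem in registered or path.name in known_internal, (
            f"Orphaned command file: {path.name}"
        )
```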

## Verification Commands

Check if status.py is actually used:
```bash
grep -r "StatusCommand" tito/ --include="*.py" | grep -v "^tito/main.py:from" | grep -v "class StatusCommand"
```

Check if notebooks.py is actually used:
```bash
grep -r "NotebooksCommand" tito/ --include="*.py" | grep -v "^tito/main.py:from" | grep -v "class NotebooksCommand"
```

## Final Architecture

After cleanup, you'll have:
- **18 top-level commands** (user-facing via `tito <cmd>`)
- **7-8 internal commands** (used as helpers by other commands)
- **0 orphaned files** (everything has a purpose)

Clean CLI with clear separation between public API and internal helpers!
@@ -1,107 +0,0 @@
# Final Answer: CLI Command Cleanup

## What the Tests Found ✅

**Good news**: No broken or dangling commands! Everything is accounted for.

**However**: Found some cleanup opportunities:

### 1. Dead Imports in main.py

These 2 commands are imported but **never used**:
```python
# tito/main.py lines 28 and 37
from .commands.notebooks import NotebooksCommand  # ❌ DELETE
from .commands.status import StatusCommand        # ❌ DELETE
```

They're only in `__init__.py` exports, not actually used anywhere.

### 2. Orphaned Command Files (8 files)

These files exist but are **not imported anywhere**:
```bash
tito/commands/check.py
tito/commands/clean.py
tito/commands/clean_workspace.py
tito/commands/help.py
tito/commands/protect.py
tito/commands/report.py
tito/commands/version.py
tito/commands/view.py
```

### 3. Internal Helper Commands (6 files) ✅ KEEP

These are used by other commands:
- `reset.py` → used by `package.py`
- `nbdev.py` → used by `package.py`
- `info.py` → used by `system.py`
- `health.py` → used by `system.py`
- `jupyter.py` → used by `system.py`
- `module_reset.py` → used by `module_workflow.py`
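
The helper pattern these files follow can be sketched as below; the class and method names are illustrative guesses, not the actual tito internals.

```python
# Hypothetical sketch of the parent-command → helper-command delegation described above.
import argparse

class InfoCommand:
    """Internal helper: not registered at top level, only invoked by SystemCommand."""
    def run(self, args: argparse.Namespace) -> int:
        print("platform, python version, installed modules ...")
        return 0

class SystemCommand:
    """Top-level `tito system` command that dispatches to its helpers."""
    def __init__(self):
        self.subcommands = {"info": InfoCommand()}

    def run(self, args: argparse.Namespace) -> int:
        helper = self.subcommands.get(getattr(args, "subcommand", None))
        if helper is None:
            print("Unknown subcommand")
            return 1
        return helper.run(args)
```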

## Recommended Actions

### Option A: Full Cleanup (Recommended)

```bash
# 1. Delete truly orphaned files
rm tito/commands/check.py
rm tito/commands/clean.py
rm tito/commands/clean_workspace.py
rm tito/commands/help.py
rm tito/commands/protect.py
rm tito/commands/report.py
rm tito/commands/version.py
rm tito/commands/view.py

# 2. Delete unused imported files
rm tito/commands/notebooks.py
rm tito/commands/status.py

# 3. Remove dead imports from main.py
# Edit tito/main.py and remove lines 28 and 37
```

### Option B: Conservative (Move to Archive)

```bash
# Move to archive instead of deleting
mkdir -p tito/commands/_archived
mv tito/commands/{check,clean,clean_workspace,help,protect,report,version,view}.py tito/commands/_archived/
mv tito/commands/{notebooks,status}.py tito/commands/_archived/
```

### Option C: Do Nothing

Current state is **fine** - tests prove nothing is broken. The extra files just create clutter but don't hurt.

## After Cleanup

Update `tests/cli/test_cli_registry.py`:

```python
# Remove these from known_internal since they'll be deleted:
known_internal = {
    'health.py',        # Used by system
    'info.py',          # Used by system
    'jupyter.py',       # Used by system
    'nbdev.py',         # Used by package
    'reset.py',         # Used by package
    'module_reset.py'   # Used by module_workflow
}
```

## Summary

Your CLI is **healthy**! The tests caught:
- ✅ 18 working registered commands
- ✅ 6 internal helper commands (properly used)
- ❌ 2 dead imports (should remove)
- ❌ 8 orphaned files (safe to delete)
- ❌ 2 unused command files (safe to delete)

**Total cleanup**: 12 files/imports that can be safely removed without breaking anything.

Want me to do the cleanup for you?
@@ -1,233 +0,0 @@
# CLI Hierarchy Refactor - COMPLETE ✅

## Summary

Successfully refactored TinyTorch CLI from flat structure to hierarchical organization with subfolders for complex commands.

**Date**: 2025-11-28
**Tests Passing**: 52/52 ✅
**User Impact**: ZERO (completely internal)

---

## What Changed

### Before (Flat Structure)
```
tito/commands/
├── module_workflow.py
├── module_reset.py
├── system.py
├── info.py
├── health.py
├── jupyter.py
├── package.py
├── reset.py
├── nbdev.py
├── ... (34 files total, hard to navigate)
```

### After (Hierarchical Structure)
```
tito/commands/
├── module/
│   ├── __init__.py
│   ├── workflow.py          # Main module command
│   └── reset.py             # Module reset subcommand
├── system/
│   ├── __init__.py
│   ├── system.py            # Main system command
│   ├── info.py              # system info
│   ├── health.py            # system doctor
│   └── jupyter.py           # system jupyter
├── package/
│   ├── __init__.py
│   ├── package.py           # Main package command
│   ├── reset.py             # package reset
│   └── nbdev.py             # package nbdev
├── _archived/               # Deprecated files
│   ├── clean.py
│   ├── help.py
│   ├── notebooks.py
│   └── status.py
├── setup.py                 # Simple commands stay flat
├── test.py
├── export.py
└── ... (15 simple commands)
```

---

## Benefits

### ✅ Clear Ownership
- Easy to see that `module/reset.py` belongs to the module command
- No confusion about which files are helpers vs top-level commands

### ✅ Better Organization
- Related files grouped together
- Subfolders scale as commands grow
- Clear separation between simple and complex commands

### ✅ Easier Maintenance
- Tests validate structure automatically
- Adding new subcommands is straightforward
- No orphaned files hiding in flat structure

### ✅ Zero User Impact
```bash
# These still work EXACTLY the same:
tito module complete 01
tito system info
tito package export
```

---

## Files Changed

### Moved Files (10)
```
module_workflow.py  →  module/workflow.py
module_reset.py     →  module/reset.py
system.py           →  system/system.py
info.py             →  system/info.py
health.py           →  system/health.py
jupyter.py          →  system/jupyter.py
package.py          →  package/package.py
reset.py            →  package/reset.py
nbdev.py            →  package/nbdev.py
```

### Created Files (4)
```
module/__init__.py
system/__init__.py
package/__init__.py
_archived/README.md
```

### Updated Files (3)
```
tito/main.py                      # Updated imports
tito/commands/__init__.py         # Updated imports
tests/cli/test_cli_registry.py    # Updated file path expectations
```
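
The `# Updated imports` notes above boil down to re-pointing the old flat imports at the new packages and having each package's `__init__.py` re-export its main command. A sketch of the idea, with hypothetical class names:

```python
# tito/commands/system/__init__.py  (sketch -- the actual class names may differ)
# Re-exporting keeps `from tito.commands.system import SystemCommand` working
# even though the implementation now lives in tito/commands/system/system.py.
from .system import SystemCommand
from .info import InfoCommand
from .health import HealthCommand
from .jupyter import JupyterCommand

__all__ = ["SystemCommand", "InfoCommand", "HealthCommand", "JupyterCommand"]
```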

### Archived Files (4)
```
Moved to _archived/:
- clean.py (deprecated)
- help.py (deprecated)
- notebooks.py (deprecated)
- status.py (deprecated)
```

---

## Test Results

### Before Refactor
```
52 tests passing ✅
```

### After Refactor
```
52 tests passing ✅
```

### Test Coverage
- ✅ All commands are BaseCommand subclasses
- ✅ All commands have descriptions
- ✅ All commands implement required methods
- ✅ All help text accessible
- ✅ No orphaned files
- ✅ All file paths correct
- ✅ All subcommands work

---

## Verification Commands

Test the refactored CLI:

```bash
# Version check
tito --version

# Module commands
tito module -h
tito module status

# System commands
tito system -h
tito system info
tito system doctor

# Package commands
tito package -h
tito package reset -h

# Run all tests
pytest tests/cli/ -v

# Quick import test
python -c "from tito.main import TinyTorchCLI; print('Success')"
```

All passing! ✅

---

## Architecture Decision

**Question**: Should we organize commands with subcommands into subfolders?
**Answer**: YES! ✅

**Follows best practices from**:
- Git (`git/builtin/`)
- AWS CLI (`awscli/customizations/`)
- Django (`django/core/management/commands/`)
- Click (Python CLI framework)

**Key insight**: Flat worked when small, but with 34 files it became unmaintainable. Hierarchical structure scales better and makes ownership crystal clear.

---

## Future Additions

When adding new commands:

### Simple Command (no subcommands)
```bash
# Create at top level
tito/commands/newcmd.py
```

### Complex Command (with subcommands)
```bash
# Create subfolder
tito/commands/newcmd/
├── __init__.py       # Export main command
├── newcmd.py         # Main command
└── helper.py         # Subcommand
```

Tests will automatically validate! 🎉
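
The real checks in `tests/cli/test_cli_registry.py` are likely different in detail; the sketch below only illustrates the kind of structural assertion that makes new subfolders self-validating.

```python
from pathlib import Path

def test_complex_commands_follow_subfolder_convention():
    """Every command subfolder must ship an __init__.py and a main command module."""
    commands_dir = Path("tito/commands")
    for sub in commands_dir.iterdir():
        if not sub.is_dir() or sub.name.startswith("_"):   # skip _archived and private dirs
            continue
        assert (sub / "__init__.py").exists(), f"{sub.name}/ is missing __init__.py"
        # Main module: either <folder>.py (system/system.py) or workflow.py (module/workflow.py)
        has_main = (sub / f"{sub.name}.py").exists() or (sub / "workflow.py").exists()
        assert has_main, f"{sub.name}/ has no main command module"
```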

---

## Impact Summary

| Metric | Before | After |
|--------|--------|-------|
| Total files in commands/ | 34 | 29 (+ 3 subfolders) |
| Flat files | 34 | 19 |
| Organized in subfolders | 0 | 10 |
| Orphaned files | Unknown | 0 (archived) |
| Tests passing | 52 | 52 |
| User-facing changes | N/A | 0 |
| Developer clarity | ⚠️ Confusing | ✅ Crystal clear |

**Result**: Much cleaner, easier to maintain, zero user impact! 🚀
@@ -1,472 +1,436 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Comprehensive gradient flow testing for TinyTorch.
|
||||
Comprehensive Gradient Flow Tests for TinyTorch
|
||||
================================================
|
||||
|
||||
This test suite systematically validates that gradients propagate correctly
|
||||
through all components of the training stack.
|
||||
Tests that gradients flow correctly through:
|
||||
1. Simple networks (single layer)
|
||||
2. Multi-layer networks (MLP)
|
||||
3. Convolutional networks (CNN)
|
||||
4. Attention mechanisms
|
||||
5. Complete training loops
|
||||
|
||||
Run with: pytest tests/test_gradient_flow.py -v
|
||||
Or directly: python tests/test_gradient_flow.py
|
||||
This ensures backpropagation works correctly end-to-end.
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
import sys
|
||||
import os
|
||||
import numpy as np
|
||||
|
||||
# Add project root to path
|
||||
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
|
||||
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
||||
sys.path.insert(0, project_root)
|
||||
|
||||
from tinytorch import Tensor, Linear, Dropout
|
||||
from tinytorch import Sigmoid, ReLU, Tanh, GELU, Softmax
|
||||
from tinytorch import MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss
|
||||
from tinytorch import SGD, AdamW
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Linear, Dropout
|
||||
from tinytorch.core.activations import ReLU, Sigmoid, Softmax
|
||||
from tinytorch.core.losses import MSELoss, BinaryCrossEntropyLoss, CrossEntropyLoss
|
||||
from tinytorch.core.optimizers import SGD, Adam
|
||||
from tinytorch.core.spatial import Conv2d, MaxPool2d
|
||||
from tinytorch.core.autograd import enable_autograd
|
||||
|
||||
# Enable autograd
|
||||
enable_autograd()
|
||||
|
||||
def test_simple_linear_gradient_flow():
|
||||
"""Test gradients flow through a single linear layer"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 1: Simple Linear Layer Gradient Flow")
|
||||
print("="*70)
|
||||
|
||||
# Create simple network: Linear(2->1)
|
||||
layer = Linear(2, 1)
|
||||
|
||||
# Input
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
target = Tensor([[3.0]])
|
||||
|
||||
# Forward pass
|
||||
output = layer.forward(x)
|
||||
|
||||
# Loss
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn.forward(output, target)
|
||||
|
||||
print(f"Initial loss: {float(loss.data):.4f}")
|
||||
print(f"Initial weight shape: {layer.weight.shape}")
|
||||
print(f"Initial bias shape: {layer.bias.shape}")
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Check gradients exist
|
||||
assert layer.weight.grad is not None, "Weight gradient is None!"
|
||||
assert layer.bias.grad is not None, "Bias gradient is None!"
|
||||
assert x.grad is not None, "Input gradient is None!"
|
||||
|
||||
# Check gradients are non-zero
|
||||
weight_grad_norm = np.linalg.norm(layer.weight.grad.data)
|
||||
bias_grad_norm = np.linalg.norm(layer.bias.grad.data)
|
||||
input_grad_norm = np.linalg.norm(x.grad.data)
|
||||
|
||||
print(f"\n✓ Weight gradient norm: {weight_grad_norm:.6f}")
|
||||
print(f"✓ Bias gradient norm: {bias_grad_norm:.6f}")
|
||||
print(f"✓ Input gradient norm: {input_grad_norm:.6f}")
|
||||
|
||||
assert weight_grad_norm > 1e-6, f"Weight gradients too small: {weight_grad_norm}"
|
||||
assert bias_grad_norm > 1e-6, f"Bias gradients too small: {bias_grad_norm}"
|
||||
assert input_grad_norm > 1e-6, f"Input gradients too small: {input_grad_norm}"
|
||||
|
||||
print("\n✅ TEST PASSED: Gradients flow correctly through linear layer")
|
||||
return True
|
||||
|
||||
|
||||
class TestBasicTensorGradients:
|
||||
"""Test gradient computation for basic tensor operations."""
|
||||
|
||||
def test_multiplication_gradient(self):
|
||||
"""Test gradient flow through multiplication."""
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
y = x * 3
|
||||
loss = y.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
# dy/dx = 3
|
||||
assert x.grad is not None, "Gradient should be computed"
|
||||
assert np.allclose(x.grad, [[3.0, 3.0]]), f"Expected [[3, 3]], got {x.grad}"
|
||||
|
||||
def test_addition_gradient(self):
|
||||
"""Test gradient flow through addition."""
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
y = Tensor([[3.0, 4.0]], requires_grad=True)
|
||||
z = x + y
|
||||
loss = z.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
# dz/dx = 1, dz/dy = 1
|
||||
assert np.allclose(x.grad, [[1.0, 1.0]]), f"x.grad: {x.grad}"
|
||||
assert np.allclose(y.grad, [[1.0, 1.0]]), f"y.grad: {y.grad}"
|
||||
|
||||
def test_chain_rule(self):
|
||||
"""Test gradient flow through chain of operations."""
|
||||
x = Tensor([[2.0]], requires_grad=True)
|
||||
y = x * 3 # y = 3x
|
||||
z = y + 1 # z = 3x + 1
|
||||
w = z * 2 # w = 2(3x + 1) = 6x + 2
|
||||
|
||||
w.backward()
|
||||
|
||||
# dw/dx = 6
|
||||
assert np.allclose(x.grad, [[6.0]]), f"Expected [[6]], got {x.grad}"
|
||||
|
||||
def test_matmul_gradient(self):
|
||||
"""Test gradient flow through matrix multiplication."""
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
W = Tensor([[1.0], [2.0]], requires_grad=True)
|
||||
y = x.matmul(W) # y = [[5.0]]
|
||||
|
||||
y.backward()
|
||||
|
||||
# dy/dx = W^T = [[1, 2]]
|
||||
# dy/dW = x^T = [[1], [2]]
|
||||
assert np.allclose(x.grad, [[1.0, 2.0]]), f"x.grad: {x.grad}"
|
||||
assert np.allclose(W.grad, [[1.0], [2.0]]), f"W.grad: {W.grad}"
|
||||
|
||||
def test_broadcasting_gradient(self):
|
||||
"""Test gradient flow with broadcasting (e.g., bias addition)."""
|
||||
x = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True) # (2, 2)
|
||||
bias = Tensor([1.0, 2.0], requires_grad=True) # (2,)
|
||||
y = x + bias # Broadcasting happens
|
||||
loss = y.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
# Gradient should sum over broadcast dimension
|
||||
assert x.grad.shape == (2, 2), f"x.grad shape: {x.grad.shape}"
|
||||
assert bias.grad.shape == (2,), f"bias.grad shape: {bias.grad.shape}"
|
||||
assert np.allclose(bias.grad, [2.0, 2.0]), f"bias.grad: {bias.grad}"
|
||||
def test_mlp_gradient_flow():
|
||||
"""Test gradients flow through multi-layer perceptron"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 2: Multi-Layer Perceptron Gradient Flow")
|
||||
print("="*70)
|
||||
|
||||
# Create MLP: Input(4) -> Linear(4->8) -> ReLU -> Linear(8->2)
|
||||
layer1 = Linear(4, 8)
|
||||
activation = ReLU()
|
||||
layer2 = Linear(8, 2)
|
||||
|
||||
# Input and target
|
||||
x = Tensor(np.random.randn(3, 4), requires_grad=True)
|
||||
target = Tensor(np.array([[1, 0], [0, 1], [1, 0]]))
|
||||
|
||||
print(f"Input shape: {x.shape}")
|
||||
print(f"Target shape: {target.shape}")
|
||||
|
||||
# Forward pass
|
||||
h1 = layer1.forward(x)
|
||||
h1_activated = activation.forward(h1)
|
||||
output = layer2.forward(h1_activated)
|
||||
|
||||
print(f"Hidden layer shape: {h1.shape}")
|
||||
print(f"Output shape: {output.shape}")
|
||||
|
||||
# Loss
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn.forward(output, target)
|
||||
|
||||
print(f"Initial loss: {float(loss.data):.4f}")
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Check all layer gradients exist
|
||||
assert layer1.weight.grad is not None, "Layer1 weight gradient is None!"
|
||||
assert layer1.bias.grad is not None, "Layer1 bias gradient is None!"
|
||||
assert layer2.weight.grad is not None, "Layer2 weight gradient is None!"
|
||||
assert layer2.bias.grad is not None, "Layer2 bias gradient is None!"
|
||||
|
||||
# Check gradient magnitudes
|
||||
l1_weight_norm = np.linalg.norm(layer1.weight.grad.data)
|
||||
l1_bias_norm = np.linalg.norm(layer1.bias.grad.data)
|
||||
l2_weight_norm = np.linalg.norm(layer2.weight.grad.data)
|
||||
l2_bias_norm = np.linalg.norm(layer2.bias.grad.data)
|
||||
|
||||
print(f"\n✓ Layer1 weight gradient norm: {l1_weight_norm:.6f}")
|
||||
print(f"✓ Layer1 bias gradient norm: {l1_bias_norm:.6f}")
|
||||
print(f"✓ Layer2 weight gradient norm: {l2_weight_norm:.6f}")
|
||||
print(f"✓ Layer2 bias gradient norm: {l2_bias_norm:.6f}")
|
||||
|
||||
assert l1_weight_norm > 1e-6, "Layer1 weight gradients too small"
|
||||
assert l1_bias_norm > 1e-6, "Layer1 bias gradients too small"
|
||||
assert l2_weight_norm > 1e-6, "Layer2 weight gradients too small"
|
||||
assert l2_bias_norm > 1e-6, "Layer2 bias gradients too small"
|
||||
|
||||
print("\n✅ TEST PASSED: Gradients flow correctly through MLP")
|
||||
return True
|
||||
|
||||
|
||||
class TestLayerGradients:
|
||||
"""Test gradient computation through neural network layers."""
|
||||
|
||||
def test_linear_layer_gradients(self):
|
||||
"""Test gradient flow through Linear layer."""
|
||||
layer = Linear(2, 3)
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
|
||||
w_before = layer.weight.data.copy()
|
||||
b_before = layer.bias.data.copy()
|
||||
|
||||
out = layer(x)
|
||||
loss = out.sum()
|
||||
loss.backward()
|
||||
|
||||
# All gradients should exist
|
||||
assert layer.weight.grad is not None, "Weight gradient missing"
|
||||
assert layer.bias.grad is not None, "Bias gradient missing"
|
||||
assert x.grad is not None, "Input gradient missing"
|
||||
|
||||
# Gradient shapes should match parameter shapes
|
||||
assert layer.weight.grad.shape == layer.weight.shape
|
||||
assert layer.bias.grad.shape == layer.bias.shape
|
||||
|
||||
def test_multi_layer_gradients(self):
|
||||
"""Test gradient flow through multiple layers."""
|
||||
layer1 = Linear(2, 3)
|
||||
layer2 = Linear(3, 1)
|
||||
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
|
||||
h = layer1(x)
|
||||
out = layer2(h)
|
||||
loss = out.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
# All layers should have gradients
|
||||
assert layer1.weight.grad is not None
|
||||
assert layer1.bias.grad is not None
|
||||
assert layer2.weight.grad is not None
|
||||
assert layer2.bias.grad is not None
|
||||
def test_mlp_training_updates():
|
||||
"""Test that MLP actually learns (loss decreases)"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 3: MLP Training - Loss Reduction")
|
||||
print("="*70)
|
||||
|
||||
# Create simple MLP
|
||||
layer1 = Linear(2, 4)
|
||||
activation = ReLU()
|
||||
layer2 = Linear(4, 1)
|
||||
|
||||
class TestActivationGradients:
|
||||
"""Test gradient computation through activation functions."""
|
||||
|
||||
def test_sigmoid_gradient(self):
|
||||
"""Test gradient flow through Sigmoid."""
|
||||
x = Tensor([[0.0, 1.0, -1.0]], requires_grad=True)
|
||||
sigmoid = Sigmoid()
|
||||
|
||||
y = sigmoid(x)
|
||||
loss = y.sum()
|
||||
loss.backward()
|
||||
|
||||
assert x.grad is not None, "Sigmoid gradient missing"
|
||||
# Sigmoid gradient: σ'(x) = σ(x)(1 - σ(x))
|
||||
# At x=0: σ(0) = 0.5, σ'(0) = 0.25
|
||||
assert x.grad[0, 0] > 0, "Gradient should be positive"
|
||||
|
||||
def test_relu_gradient(self):
|
||||
"""Test gradient flow through ReLU."""
|
||||
x = Tensor([[-1.0, 0.0, 1.0]], requires_grad=True)
|
||||
relu = ReLU()
|
||||
|
||||
y = relu(x)
|
||||
loss = y.sum()
|
||||
loss.backward()
|
||||
|
||||
# ReLU gradient: 1 if x > 0, else 0
|
||||
# Note: We haven't implemented ReLU backward yet, so this will fail
|
||||
# TODO: Implement ReLU backward in autograd
|
||||
|
||||
def test_tanh_gradient(self):
|
||||
"""Test gradient flow through Tanh."""
|
||||
x = Tensor([[0.0, 1.0]], requires_grad=True)
|
||||
tanh = Tanh()
|
||||
|
||||
y = tanh(x)
|
||||
loss = y.sum()
|
||||
|
||||
# TODO: Implement Tanh backward
|
||||
# loss.backward()
|
||||
# Simple dataset (XOR-like)
|
||||
X = Tensor(np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]), requires_grad=False)
|
||||
y = Tensor(np.array([[0.0], [1.0], [1.0], [0.0]]))
|
||||
|
||||
# Optimizer
|
||||
optimizer = SGD([layer1.weight, layer1.bias, layer2.weight, layer2.bias], lr=0.1)
|
||||
loss_fn = MSELoss()
|
||||
|
||||
class TestLossGradients:
|
||||
"""Test gradient computation through loss functions."""
|
||||
|
||||
def test_bce_gradient(self):
|
||||
"""Test gradient flow through Binary Cross-Entropy."""
|
||||
predictions = Tensor([[0.7, 0.3, 0.9]], requires_grad=True)
|
||||
targets = Tensor([[1.0, 0.0, 1.0]])
|
||||
|
||||
loss_fn = BinaryCrossEntropyLoss()
|
||||
loss = loss_fn(predictions, targets)
|
||||
|
||||
loss.backward()
|
||||
|
||||
assert predictions.grad is not None, "BCE gradient missing"
|
||||
assert predictions.grad.shape == predictions.shape
|
||||
# Gradient should be negative for correct predictions
|
||||
assert predictions.grad[0, 0] < 0, "Gradient sign incorrect"
|
||||
|
||||
def test_mse_gradient(self):
|
||||
"""Test gradient flow through MSE loss."""
|
||||
predictions = Tensor([[1.0, 2.0, 3.0]], requires_grad=True)
|
||||
targets = Tensor([[2.0, 2.0, 2.0]])
|
||||
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn(predictions, targets)
|
||||
|
||||
# TODO: Implement MSE backward
|
||||
# loss.backward()
|
||||
losses = []
|
||||
|
||||
print("Training for 50 epochs...")
|
||||
for epoch in range(50):
|
||||
# Forward
|
||||
h1 = layer1.forward(X)
|
||||
h1_act = activation.forward(h1)
|
||||
output = layer2.forward(h1_act)
|
||||
|
||||
class TestOptimizerIntegration:
|
||||
"""Test optimizer integration with gradient flow."""
|
||||
|
||||
def test_sgd_updates_parameters(self):
|
||||
"""Test that SGD actually updates parameters."""
|
||||
layer = Linear(2, 1)
|
||||
optimizer = SGD(layer.parameters(), lr=0.1)
|
||||
|
||||
w_before = layer.weight.data.copy()
|
||||
b_before = layer.bias.data.copy()
|
||||
|
||||
# Forward pass
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
out = layer(x)
|
||||
loss = out.sum()
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Optimizer step
|
||||
optimizer.step()
|
||||
|
||||
# Parameters should change
|
||||
assert not np.allclose(layer.weight.data, w_before), "Weights didn't update"
|
||||
assert not np.allclose(layer.bias.data, b_before), "Bias didn't update"
|
||||
|
||||
def test_zero_grad_clears_gradients(self):
|
||||
"""Test that zero_grad() clears gradients."""
|
||||
layer = Linear(2, 1)
|
||||
optimizer = SGD(layer.parameters(), lr=0.1)
|
||||
|
||||
# First backward pass
|
||||
x = Tensor([[1.0, 2.0]])
|
||||
out = layer(x)
|
||||
loss = out.sum()
|
||||
loss.backward()
|
||||
|
||||
assert layer.weight.grad is not None, "Gradient should exist"
|
||||
|
||||
# Clear gradients
|
||||
# Loss
|
||||
loss = loss_fn.forward(output, y)
|
||||
losses.append(float(loss.data))
|
||||
|
||||
# Backward
|
||||
optimizer.zero_grad()
|
||||
|
||||
assert layer.weight.grad is None, "Gradient should be cleared"
|
||||
assert layer.bias.grad is None, "Bias gradient should be cleared"
|
||||
|
||||
def test_adamw_updates_parameters(self):
|
||||
"""Test that AdamW optimizer works."""
|
||||
layer = Linear(2, 1)
|
||||
optimizer = AdamW(layer.parameters(), lr=0.01)
|
||||
|
||||
w_before = layer.weight.data.copy()
|
||||
|
||||
x = Tensor([[1.0, 2.0]])
|
||||
out = layer(x)
|
||||
loss = out.sum()
|
||||
loss.backward()
|
||||
|
||||
# Update
|
||||
optimizer.step()
|
||||
|
||||
assert not np.allclose(layer.weight.data, w_before), "AdamW didn't update weights"
|
||||
|
||||
if (epoch + 1) % 10 == 0:
|
||||
print(f"Epoch {epoch+1:2d}: Loss = {float(loss.data):.6f}")
|
||||
|
||||
# Check loss decreased
|
||||
initial_loss = losses[0]
|
||||
final_loss = losses[-1]
|
||||
reduction = initial_loss - final_loss
|
||||
reduction_pct = (reduction / initial_loss) * 100
|
||||
|
||||
print(f"\n✓ Initial loss: {initial_loss:.6f}")
|
||||
print(f"✓ Final loss: {final_loss:.6f}")
|
||||
print(f"✓ Reduction: {reduction:.6f} ({reduction_pct:.1f}%)")
|
||||
|
||||
assert final_loss < initial_loss, f"Loss didn't decrease! Initial: {initial_loss}, Final: {final_loss}"
|
||||
assert reduction_pct > 10, f"Loss reduction too small: {reduction_pct:.1f}%"
|
||||
|
||||
print("\n✅ TEST PASSED: MLP learns successfully (loss decreases)")
|
||||
return True
|
||||
|
||||
|
||||
class TestFullTrainingLoop:
|
||||
"""Test complete training scenarios."""
|
||||
|
||||
def test_simple_convergence(self):
|
||||
"""Test that a simple model can learn."""
|
||||
# Simple task: learn to output 5 from input [1, 2]
|
||||
layer = Linear(2, 1)
|
||||
optimizer = SGD(layer.parameters(), lr=0.1)
|
||||
loss_fn = MSELoss()
|
||||
|
||||
x = Tensor([[1.0, 2.0]])
|
||||
target = Tensor([[5.0]])
|
||||
|
||||
initial_loss = None
|
||||
final_loss = None
|
||||
|
||||
# Train for a few iterations
|
||||
for i in range(50):
|
||||
# Forward
|
||||
pred = layer(x)
|
||||
loss = loss_fn(pred, target)
|
||||
|
||||
if i == 0:
|
||||
initial_loss = loss.data
|
||||
if i == 49:
|
||||
final_loss = loss.data
|
||||
|
||||
# Backward
|
||||
loss.backward()
|
||||
|
||||
# Update
|
||||
optimizer.step()
|
||||
optimizer.zero_grad()
|
||||
|
||||
# Loss should decrease
|
||||
assert final_loss < initial_loss, f"Loss didn't decrease: {initial_loss} → {final_loss}"
|
||||
|
||||
def test_binary_classification(self):
|
||||
"""Test binary classification training."""
|
||||
layer = Linear(2, 1)
|
||||
sigmoid = Sigmoid()
|
||||
loss_fn = BinaryCrossEntropyLoss()
|
||||
optimizer = SGD(layer.parameters(), lr=0.1)
|
||||
|
||||
# Simple dataset: [1, 1] → 1, [0, 0] → 0
|
||||
X = Tensor([[1.0, 1.0], [0.0, 0.0]])
|
||||
y = Tensor([[1.0], [0.0]])
|
||||
|
||||
initial_loss = None
|
||||
final_loss = None
|
||||
|
||||
for i in range(50):
|
||||
# Forward
|
||||
logits = layer(X)
|
||||
probs = sigmoid(logits)
|
||||
loss = loss_fn(probs, y)
|
||||
|
||||
if i == 0:
|
||||
initial_loss = loss.data
|
||||
if i == 49:
|
||||
final_loss = loss.data
|
||||
|
||||
# Backward
|
||||
loss.backward()
|
||||
|
||||
# Update
|
||||
optimizer.step()
|
||||
optimizer.zero_grad()
|
||||
|
||||
assert final_loss < initial_loss, "Binary classification didn't learn"
|
||||
def test_cnn_gradient_flow():
|
||||
"""Test gradients flow through convolutional layers"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 4: CNN Gradient Flow")
|
||||
print("="*70)
|
||||
|
||||
# Create simple CNN: Conv2d -> ReLU -> Linear
|
||||
conv = Conv2d(in_channels=1, out_channels=4, kernel_size=3, stride=1, padding=0)
|
||||
activation = ReLU()
|
||||
|
||||
# Input: batch=2, channels=1, height=8, width=8
|
||||
x = Tensor(np.random.randn(2, 1, 8, 8), requires_grad=True)
|
||||
|
||||
print(f"Input shape: {x.shape}")
|
||||
print(f"Conv weight shape: {conv.weight.shape}")
|
||||
|
||||
# Forward through conv
|
||||
conv_out = conv.forward(x)
|
||||
print(f"Conv output shape: {conv_out.shape}")
|
||||
|
||||
activated = activation.forward(conv_out)
|
||||
|
||||
# Flatten for linear layer
|
||||
batch_size = activated.shape[0]
|
||||
flattened_size = np.prod(activated.shape[1:])
|
||||
# Use reshape method to maintain gradient flow
|
||||
flattened = activated.reshape(batch_size, flattened_size)
|
||||
|
||||
linear = Linear(flattened_size, 2)
|
||||
output = linear.forward(flattened)
|
||||
|
||||
print(f"Flattened shape: {flattened.shape}")
|
||||
print(f"Output shape: {output.shape}")
|
||||
|
||||
# Loss
|
||||
target = Tensor(np.array([[1, 0], [0, 1]]))
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn.forward(output, target)
|
||||
|
||||
print(f"Initial loss: {float(loss.data):.4f}")
|
||||
|
||||
# Backward
|
||||
loss.backward()
|
||||
|
||||
# Check gradients
|
||||
assert conv.weight.grad is not None, "Conv weight gradient is None!"
|
||||
assert conv.bias.grad is not None, "Conv bias gradient is None!"
|
||||
assert linear.weight.grad is not None, "Linear weight gradient is None!"
|
||||
|
||||
weight_grad_norm = np.linalg.norm(conv.weight.grad.data)
|
||||
conv_bias_norm = np.linalg.norm(conv.bias.grad.data)
|
||||
linear_grad_norm = np.linalg.norm(linear.weight.grad.data)
|
||||
|
||||
print(f"\n✓ Conv weight gradient norm: {weight_grad_norm:.6f}")
|
||||
print(f"✓ Conv bias gradient norm: {conv_bias_norm:.6f}")
|
||||
print(f"✓ Linear weight gradient norm: {linear_grad_norm:.6f}")
|
||||
|
||||
assert weight_grad_norm > 1e-6, f"Conv weight gradients too small: {weight_grad_norm}"
|
||||
assert conv_bias_norm > 1e-6, f"Conv bias gradients too small: {conv_bias_norm}"
|
||||
assert linear_grad_norm > 1e-6, f"Linear gradients too small: {linear_grad_norm}"
|
||||
|
||||
print("\n✅ TEST PASSED: Gradients flow correctly through CNN")
|
||||
return True
|
||||
|
||||
|
||||
class TestEdgeCases:
|
||||
"""Test edge cases and potential failure modes."""
|
||||
|
||||
def test_zero_gradient(self):
|
||||
"""Test that zero gradients don't break training."""
|
||||
x = Tensor([[0.0, 0.0]], requires_grad=True)
|
||||
y = x * 0
|
||||
loss = y.sum()
|
||||
|
||||
def test_cnn_training_updates():
|
||||
"""Test that CNN actually learns on simple data"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 5: CNN Training - Loss Reduction")
|
||||
print("="*70)
|
||||
|
||||
# Simple CNN
|
||||
conv = Conv2d(1, 2, kernel_size=3, stride=1, padding=1)
|
||||
activation = ReLU()
|
||||
|
||||
# Simple data: 4 samples, 1 channel, 4x4 images
|
||||
X = Tensor(np.random.randn(4, 1, 4, 4), requires_grad=False)
|
||||
|
||||
# After conv: (4, 2, 4, 4) -> flatten to (4, 32)
|
||||
conv_out_size = 2 * 4 * 4 # channels * height * width
|
||||
linear = Linear(conv_out_size, 2)
|
||||
|
||||
y = Tensor(np.array([[1, 0], [0, 1], [1, 0], [0, 1]]))
|
||||
|
||||
# Get parameters with gradients
|
||||
params = []
|
||||
for p in [conv.weight, conv.bias, linear.weight, linear.bias]:
|
||||
if not p.requires_grad:
|
||||
p.requires_grad = True
|
||||
params.append(p)
|
||||
|
||||
# Optimizer
|
||||
optimizer = SGD(params, lr=0.01)
|
||||
loss_fn = MSELoss()
|
||||
|
||||
losses = []
|
||||
|
||||
print("Training for 30 epochs...")
|
||||
for epoch in range(30):
|
||||
# Forward
|
||||
conv_out = conv.forward(X)
|
||||
activated = activation.forward(conv_out)
|
||||
|
||||
# Flatten using reshape to maintain gradients
|
||||
batch_size = activated.shape[0]
|
||||
flattened = activated.reshape(batch_size, -1)
|
||||
|
||||
output = linear.forward(flattened)
|
||||
|
||||
# Loss
|
||||
loss = loss_fn.forward(output, y)
|
||||
losses.append(float(loss.data))
|
||||
|
||||
# Backward
|
||||
optimizer.zero_grad()
|
||||
loss.backward()
|
||||
|
||||
assert x.grad is not None
|
||||
assert np.allclose(x.grad, [[0.0, 0.0]])
|
||||
|
||||
def test_very_small_values(self):
|
||||
"""Test gradient flow with very small values."""
|
||||
x = Tensor([[1e-8, 1e-8]], requires_grad=True)
|
||||
y = x * 2
|
||||
loss = y.sum()
|
||||
|
||||
loss.backward()
|
||||
|
||||
assert x.grad is not None
|
||||
assert np.allclose(x.grad, [[2.0, 2.0]])
|
||||
|
||||
def test_gradient_accumulation(self):
|
||||
"""Test that gradients accumulate correctly across multiple backward passes."""
|
||||
x = Tensor([[1.0]], requires_grad=True)
|
||||
|
||||
# First backward
|
||||
y1 = x * 2
|
||||
y1.backward()
|
||||
grad_after_first = x.grad.copy()
|
||||
|
||||
# Second backward (without zero_grad)
|
||||
y2 = x * 3
|
||||
y2.backward()
|
||||
|
||||
# Gradient should accumulate: 2 + 3 = 5
|
||||
expected = grad_after_first + np.array([[3.0]])
|
||||
assert np.allclose(x.grad, expected), f"Expected {expected}, got {x.grad}"
|
||||
|
||||
# Update
|
||||
optimizer.step()
|
||||
|
||||
if (epoch + 1) % 10 == 0:
|
||||
print(f"Epoch {epoch+1:2d}: Loss = {float(loss.data):.6f}")
|
||||
|
||||
# Check loss decreased
|
||||
initial_loss = losses[0]
|
||||
final_loss = losses[-1]
|
||||
reduction = initial_loss - final_loss
|
||||
reduction_pct = (reduction / initial_loss) * 100
|
||||
|
||||
print(f"\n✓ Initial loss: {initial_loss:.6f}")
|
||||
print(f"✓ Final loss: {final_loss:.6f}")
|
||||
print(f"✓ Reduction: {reduction:.6f} ({reduction_pct:.1f}%)")
|
||||
|
||||
assert final_loss < initial_loss, f"Loss didn't decrease! Initial: {initial_loss}, Final: {final_loss}"
|
||||
|
||||
print("\n✅ TEST PASSED: CNN learns successfully (loss decreases)")
|
||||
return True
|
||||
|
||||
|
||||
def run_all_tests():
|
||||
"""Run all tests and print results."""
|
||||
import inspect
|
||||
|
||||
test_classes = [
|
||||
TestBasicTensorGradients,
|
||||
TestLayerGradients,
|
||||
TestActivationGradients,
|
||||
TestLossGradients,
|
||||
TestOptimizerIntegration,
|
||||
TestFullTrainingLoop,
|
||||
TestEdgeCases,
|
||||
def test_gradient_accumulation():
|
||||
"""Test that gradients accumulate correctly across batches"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 6: Gradient Accumulation")
|
||||
print("="*70)
|
||||
|
||||
layer = Linear(2, 1)
|
||||
|
||||
# Two batches
|
||||
x1 = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
x2 = Tensor([[3.0, 4.0]], requires_grad=True)
|
||||
target = Tensor([[1.0]])
|
||||
|
||||
loss_fn = MSELoss()
|
||||
|
||||
# Forward + backward on first batch (don't zero grad)
|
||||
out1 = layer.forward(x1)
|
||||
loss1 = loss_fn.forward(out1, target)
|
||||
loss1.backward()
|
||||
|
||||
grad_after_first = np.array(layer.weight.grad.data)
|
||||
|
||||
# Forward + backward on second batch (gradients should accumulate)
|
||||
out2 = layer.forward(x2)
|
||||
loss2 = loss_fn.forward(out2, target)
|
||||
loss2.backward()
|
||||
|
||||
grad_after_second = layer.weight.grad.data
|
||||
|
||||
# Gradients should have accumulated (not been replaced)
|
||||
grad_diff = np.linalg.norm(grad_after_second - grad_after_first)
|
||||
|
||||
print(f"✓ Gradient after first batch norm: {np.linalg.norm(grad_after_first):.6f}")
|
||||
print(f"✓ Gradient after second batch norm: {np.linalg.norm(grad_after_second):.6f}")
|
||||
print(f"✓ Difference: {grad_diff:.6f}")
|
||||
|
||||
assert grad_diff > 1e-6, "Gradients didn't accumulate properly"
|
||||
|
||||
print("\n✅ TEST PASSED: Gradients accumulate correctly")
|
||||
return True
|
||||
|
||||
|
||||
def main():
|
||||
"""Run all gradient flow tests"""
|
||||
print("\n" + "="*70)
|
||||
print(" TINYTORCH GRADIENT FLOW TEST SUITE")
|
||||
print("="*70)
|
||||
|
||||
tests = [
|
||||
("Simple Linear", test_simple_linear_gradient_flow),
|
||||
("MLP Gradient Flow", test_mlp_gradient_flow),
|
||||
("MLP Training", test_mlp_training_updates),
|
||||
("CNN Gradient Flow", test_cnn_gradient_flow),
|
||||
("CNN Training", test_cnn_training_updates),
|
||||
("Gradient Accumulation", test_gradient_accumulation),
|
||||
]
|
||||
|
||||
total_tests = 0
|
||||
passed_tests = 0
|
||||
failed_tests = []
|
||||
skipped_tests = []
|
||||
|
||||
print("=" * 80)
|
||||
print("🧪 TINYTORCH GRADIENT FLOW TEST SUITE")
|
||||
print("=" * 80)
|
||||
|
||||
for test_class in test_classes:
|
||||
print(f"\n{'=' * 80}")
|
||||
print(f"📦 {test_class.__name__}")
|
||||
print(f"{'=' * 80}")
|
||||
|
||||
instance = test_class()
|
||||
methods = [m for m in dir(instance) if m.startswith('test_')]
|
||||
|
||||
for method_name in methods:
|
||||
total_tests += 1
|
||||
method = getattr(instance, method_name)
|
||||
|
||||
# Get docstring
|
||||
doc = method.__doc__ or method_name
|
||||
doc = doc.strip().split('\n')[0]
|
||||
|
||||
print(f"\n {method_name}")
|
||||
print(f" {doc}")
|
||||
|
||||
try:
|
||||
method()
|
||||
print(f" ✅ PASSED")
|
||||
passed_tests += 1
|
||||
except NotImplementedError as e:
|
||||
print(f" ⏭️ SKIPPED: {e}")
|
||||
skipped_tests.append((test_class.__name__, method_name, str(e)))
|
||||
except AssertionError as e:
|
||||
print(f" ❌ FAILED: {e}")
|
||||
failed_tests.append((test_class.__name__, method_name, str(e)))
|
||||
except Exception as e:
|
||||
print(f" ❌ ERROR: {e}")
|
||||
failed_tests.append((test_class.__name__, method_name, str(e)))
|
||||
|
||||
|
||||
results = []
|
||||
|
||||
for name, test_func in tests:
|
||||
try:
|
||||
result = test_func()
|
||||
results.append((name, "PASSED" if result else "FAILED"))
|
||||
except Exception as e:
|
||||
print(f"\n❌ TEST FAILED: {name}")
|
||||
print(f"Error: {str(e)}")
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
results.append((name, "FAILED"))
|
||||
|
||||
# Summary
|
||||
print("\n" + "=" * 80)
|
||||
print("📊 TEST SUMMARY")
|
||||
print("=" * 80)
|
||||
print(f"Total tests: {total_tests}")
|
||||
print(f"✅ Passed: {passed_tests}")
|
||||
print(f"❌ Failed: {len(failed_tests)}")
|
||||
print(f"⏭️ Skipped: {len(skipped_tests)}")
|
||||
|
||||
if failed_tests:
|
||||
print("\n" + "=" * 80)
|
||||
print("❌ FAILED TESTS:")
|
||||
print("=" * 80)
|
||||
for class_name, method_name, error in failed_tests:
|
||||
print(f"\n {class_name}.{method_name}")
|
||||
print(f" {error}")
|
||||
|
||||
if skipped_tests:
|
||||
print("\n" + "=" * 80)
|
||||
print("⏭️ SKIPPED TESTS (Not Yet Implemented):")
|
||||
print("=" * 80)
|
||||
for class_name, method_name, reason in skipped_tests:
|
||||
print(f" {class_name}.{method_name}")
|
||||
|
||||
print("\n" + "=" * 80)
|
||||
|
||||
return len(failed_tests) == 0
|
||||
print("\n" + "="*70)
|
||||
print(" TEST SUMMARY")
|
||||
print("="*70)
|
||||
|
||||
passed = sum(1 for _, status in results if status == "PASSED")
|
||||
total = len(results)
|
||||
|
||||
for name, status in results:
|
||||
symbol = "✅" if status == "PASSED" else "❌"
|
||||
print(f"{symbol} {name}: {status}")
|
||||
|
||||
print(f"\nTotal: {passed}/{total} tests passed")
|
||||
|
||||
if passed == total:
|
||||
print("\n🎉 ALL TESTS PASSED! Gradients flow correctly through TinyTorch.")
|
||||
return 0
|
||||
else:
|
||||
print(f"\n⚠️ {total - passed} tests failed. Please review the errors above.")
|
||||
return 1
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
success = run_all_tests()
|
||||
sys.exit(0 if success else 1)
|
||||
exit(main())
|
||||
|
||||
@@ -1,436 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Comprehensive Gradient Flow Tests for TinyTorch
|
||||
================================================
|
||||
|
||||
Tests that gradients flow correctly through:
|
||||
1. Simple networks (single layer)
|
||||
2. Multi-layer networks (MLP)
|
||||
3. Convolutional networks (CNN)
|
||||
4. Attention mechanisms
|
||||
5. Complete training loops
|
||||
|
||||
This ensures backpropagation works correctly end-to-end.
|
||||
"""
|
||||
|
||||
import sys
|
||||
import os
|
||||
import numpy as np
|
||||
|
||||
# Add project root to path
|
||||
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
||||
sys.path.insert(0, project_root)
|
||||
|
||||
from tinytorch.core.tensor import Tensor
|
||||
from tinytorch.core.layers import Linear, Dropout
|
||||
from tinytorch.core.activations import ReLU, Sigmoid, Softmax
|
||||
from tinytorch.core.losses import MSELoss, BinaryCrossEntropyLoss, CrossEntropyLoss
|
||||
from tinytorch.core.optimizers import SGD, Adam
|
||||
from tinytorch.core.spatial import Conv2d, MaxPool2d
|
||||
from tinytorch.core.autograd import enable_autograd
|
||||
|
||||
# Enable autograd
|
||||
enable_autograd()
|
||||
|
||||
def test_simple_linear_gradient_flow():
|
||||
"""Test gradients flow through a single linear layer"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 1: Simple Linear Layer Gradient Flow")
|
||||
print("="*70)
|
||||
|
||||
# Create simple network: Linear(2->1)
|
||||
layer = Linear(2, 1)
|
||||
|
||||
# Input
|
||||
x = Tensor([[1.0, 2.0]], requires_grad=True)
|
||||
target = Tensor([[3.0]])
|
||||
|
||||
# Forward pass
|
||||
output = layer.forward(x)
|
||||
|
||||
# Loss
|
||||
loss_fn = MSELoss()
|
||||
loss = loss_fn.forward(output, target)
|
||||
|
||||
print(f"Initial loss: {float(loss.data):.4f}")
|
||||
print(f"Initial weight shape: {layer.weight.shape}")
|
||||
print(f"Initial bias shape: {layer.bias.shape}")
|
||||
|
||||
# Backward pass
|
||||
loss.backward()
|
||||
|
||||
# Check gradients exist
|
||||
assert layer.weight.grad is not None, "Weight gradient is None!"
|
||||
assert layer.bias.grad is not None, "Bias gradient is None!"
|
||||
assert x.grad is not None, "Input gradient is None!"
|
||||
|
||||
# Check gradients are non-zero
|
||||
weight_grad_norm = np.linalg.norm(layer.weight.grad.data)
|
||||
bias_grad_norm = np.linalg.norm(layer.bias.grad.data)
|
||||
input_grad_norm = np.linalg.norm(x.grad.data)
|
||||
|
||||
print(f"\n✓ Weight gradient norm: {weight_grad_norm:.6f}")
|
||||
print(f"✓ Bias gradient norm: {bias_grad_norm:.6f}")
|
||||
print(f"✓ Input gradient norm: {input_grad_norm:.6f}")
|
||||
|
||||
assert weight_grad_norm > 1e-6, f"Weight gradients too small: {weight_grad_norm}"
|
||||
assert bias_grad_norm > 1e-6, f"Bias gradients too small: {bias_grad_norm}"
|
||||
assert input_grad_norm > 1e-6, f"Input gradients too small: {input_grad_norm}"
|
||||
|
||||
print("\n✅ TEST PASSED: Gradients flow correctly through linear layer")
|
||||
return True
|
||||
|
||||
|
||||
def test_mlp_gradient_flow():
|
||||
"""Test gradients flow through multi-layer perceptron"""
|
||||
print("\n" + "="*70)
|
||||
print("TEST 2: Multi-Layer Perceptron Gradient Flow")
|
||||
print("="*70)
|
||||
|
||||
# Create MLP: Input(4) -> Linear(4->8) -> ReLU -> Linear(8->2)
|
||||
layer1 = Linear(4, 8)
|
||||
activation = ReLU()
|
||||
layer2 = Linear(8, 2)
|
||||
|
||||
# Input and target
|
||||
    x = Tensor(np.random.randn(3, 4), requires_grad=True)
    target = Tensor(np.array([[1, 0], [0, 1], [1, 0]]))

    print(f"Input shape: {x.shape}")
    print(f"Target shape: {target.shape}")

    # Forward pass
    h1 = layer1.forward(x)
    h1_activated = activation.forward(h1)
    output = layer2.forward(h1_activated)

    print(f"Hidden layer shape: {h1.shape}")
    print(f"Output shape: {output.shape}")

    # Loss
    loss_fn = MSELoss()
    loss = loss_fn.forward(output, target)

    print(f"Initial loss: {float(loss.data):.4f}")

    # Backward pass
    loss.backward()

    # Check all layer gradients exist
    assert layer1.weight.grad is not None, "Layer1 weight gradient is None!"
    assert layer1.bias.grad is not None, "Layer1 bias gradient is None!"
    assert layer2.weight.grad is not None, "Layer2 weight gradient is None!"
    assert layer2.bias.grad is not None, "Layer2 bias gradient is None!"

    # Check gradient magnitudes
    l1_weight_norm = np.linalg.norm(layer1.weight.grad.data)
    l1_bias_norm = np.linalg.norm(layer1.bias.grad.data)
    l2_weight_norm = np.linalg.norm(layer2.weight.grad.data)
    l2_bias_norm = np.linalg.norm(layer2.bias.grad.data)

    print(f"\n✓ Layer1 weight gradient norm: {l1_weight_norm:.6f}")
    print(f"✓ Layer1 bias gradient norm: {l1_bias_norm:.6f}")
    print(f"✓ Layer2 weight gradient norm: {l2_weight_norm:.6f}")
    print(f"✓ Layer2 bias gradient norm: {l2_bias_norm:.6f}")

    assert l1_weight_norm > 1e-6, "Layer1 weight gradients too small"
    assert l1_bias_norm > 1e-6, "Layer1 bias gradients too small"
    assert l2_weight_norm > 1e-6, "Layer2 weight gradients too small"
    assert l2_bias_norm > 1e-6, "Layer2 bias gradients too small"

    print("\n✅ TEST PASSED: Gradients flow correctly through MLP")
    return True


def test_mlp_training_updates():
    """Test that MLP actually learns (loss decreases)"""
    print("\n" + "="*70)
    print("TEST 3: MLP Training - Loss Reduction")
    print("="*70)

    # Create simple MLP
    layer1 = Linear(2, 4)
    activation = ReLU()
    layer2 = Linear(4, 1)

    # Simple dataset (XOR-like)
    X = Tensor(np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]), requires_grad=False)
    y = Tensor(np.array([[0.0], [1.0], [1.0], [0.0]]))

    # Optimizer
    optimizer = SGD([layer1.weight, layer1.bias, layer2.weight, layer2.bias], lr=0.1)
    loss_fn = MSELoss()

    losses = []

    print("Training for 50 epochs...")
    for epoch in range(50):
        # Forward
        h1 = layer1.forward(X)
        h1_act = activation.forward(h1)
        output = layer2.forward(h1_act)

        # Loss
        loss = loss_fn.forward(output, y)
        losses.append(float(loss.data))

        # Backward
        optimizer.zero_grad()
        loss.backward()

        # Update
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1:2d}: Loss = {float(loss.data):.6f}")

    # Check loss decreased
    initial_loss = losses[0]
    final_loss = losses[-1]
    reduction = initial_loss - final_loss
    reduction_pct = (reduction / initial_loss) * 100

    print(f"\n✓ Initial loss: {initial_loss:.6f}")
    print(f"✓ Final loss: {final_loss:.6f}")
    print(f"✓ Reduction: {reduction:.6f} ({reduction_pct:.1f}%)")

    assert final_loss < initial_loss, f"Loss didn't decrease! Initial: {initial_loss}, Final: {final_loss}"
    assert reduction_pct > 10, f"Loss reduction too small: {reduction_pct:.1f}%"

    print("\n✅ TEST PASSED: MLP learns successfully (loss decreases)")
    return True


def test_cnn_gradient_flow():
    """Test gradients flow through convolutional layers"""
    print("\n" + "="*70)
    print("TEST 4: CNN Gradient Flow")
    print("="*70)

    # Create simple CNN: Conv2d -> ReLU -> Linear
    conv = Conv2d(in_channels=1, out_channels=4, kernel_size=3, stride=1, padding=0)
    activation = ReLU()

    # Input: batch=2, channels=1, height=8, width=8
    x = Tensor(np.random.randn(2, 1, 8, 8), requires_grad=True)

    print(f"Input shape: {x.shape}")
    print(f"Conv weight shape: {conv.weight.shape}")

    # Forward through conv
    conv_out = conv.forward(x)
    print(f"Conv output shape: {conv_out.shape}")

    activated = activation.forward(conv_out)

    # Flatten for linear layer
    batch_size = activated.shape[0]
    flattened_size = np.prod(activated.shape[1:])
    # Use reshape method to maintain gradient flow
    flattened = activated.reshape(batch_size, flattened_size)

    linear = Linear(flattened_size, 2)
    output = linear.forward(flattened)

    print(f"Flattened shape: {flattened.shape}")
    print(f"Output shape: {output.shape}")

    # Loss
    target = Tensor(np.array([[1, 0], [0, 1]]))
    loss_fn = MSELoss()
    loss = loss_fn.forward(output, target)

    print(f"Initial loss: {float(loss.data):.4f}")

    # Backward
    loss.backward()

    # Check gradients
    assert conv.weight.grad is not None, "Conv weight gradient is None!"
    assert conv.bias.grad is not None, "Conv bias gradient is None!"
    assert linear.weight.grad is not None, "Linear weight gradient is None!"

    weight_grad_norm = np.linalg.norm(conv.weight.grad.data)
    conv_bias_norm = np.linalg.norm(conv.bias.grad.data)
    linear_grad_norm = np.linalg.norm(linear.weight.grad.data)

    print(f"\n✓ Conv weight gradient norm: {weight_grad_norm:.6f}")
    print(f"✓ Conv bias gradient norm: {conv_bias_norm:.6f}")
    print(f"✓ Linear weight gradient norm: {linear_grad_norm:.6f}")

    assert weight_grad_norm > 1e-6, f"Conv weight gradients too small: {weight_grad_norm}"
    assert conv_bias_norm > 1e-6, f"Conv bias gradients too small: {conv_bias_norm}"
    assert linear_grad_norm > 1e-6, f"Linear gradients too small: {linear_grad_norm}"

    print("\n✅ TEST PASSED: Gradients flow correctly through CNN")
    return True


def test_cnn_training_updates():
    """Test that CNN actually learns on simple data"""
    print("\n" + "="*70)
    print("TEST 5: CNN Training - Loss Reduction")
    print("="*70)

    # Simple CNN
    conv = Conv2d(1, 2, kernel_size=3, stride=1, padding=1)
    activation = ReLU()

    # Simple data: 4 samples, 1 channel, 4x4 images
    X = Tensor(np.random.randn(4, 1, 4, 4), requires_grad=False)

    # After conv: (4, 2, 4, 4) -> flatten to (4, 32)
    conv_out_size = 2 * 4 * 4  # channels * height * width
    linear = Linear(conv_out_size, 2)

    y = Tensor(np.array([[1, 0], [0, 1], [1, 0], [0, 1]]))

    # Get parameters with gradients
    params = []
    for p in [conv.weight, conv.bias, linear.weight, linear.bias]:
        if not p.requires_grad:
            p.requires_grad = True
        params.append(p)

    # Optimizer
    optimizer = SGD(params, lr=0.01)
    loss_fn = MSELoss()

    losses = []

    print("Training for 30 epochs...")
    for epoch in range(30):
        # Forward
        conv_out = conv.forward(X)
        activated = activation.forward(conv_out)

        # Flatten using reshape to maintain gradients
        batch_size = activated.shape[0]
        flattened = activated.reshape(batch_size, -1)

        output = linear.forward(flattened)

        # Loss
        loss = loss_fn.forward(output, y)
        losses.append(float(loss.data))

        # Backward
        optimizer.zero_grad()
        loss.backward()

        # Update
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1:2d}: Loss = {float(loss.data):.6f}")

    # Check loss decreased
    initial_loss = losses[0]
    final_loss = losses[-1]
    reduction = initial_loss - final_loss
    reduction_pct = (reduction / initial_loss) * 100

    print(f"\n✓ Initial loss: {initial_loss:.6f}")
    print(f"✓ Final loss: {final_loss:.6f}")
    print(f"✓ Reduction: {reduction:.6f} ({reduction_pct:.1f}%)")

    assert final_loss < initial_loss, f"Loss didn't decrease! Initial: {initial_loss}, Final: {final_loss}"

    print("\n✅ TEST PASSED: CNN learns successfully (loss decreases)")
    return True


def test_gradient_accumulation():
    """Test that gradients accumulate correctly across batches"""
    print("\n" + "="*70)
    print("TEST 6: Gradient Accumulation")
    print("="*70)

    layer = Linear(2, 1)

    # Two batches
    x1 = Tensor([[1.0, 2.0]], requires_grad=True)
    x2 = Tensor([[3.0, 4.0]], requires_grad=True)
    target = Tensor([[1.0]])

    loss_fn = MSELoss()

    # Forward + backward on first batch (don't zero grad)
    out1 = layer.forward(x1)
    loss1 = loss_fn.forward(out1, target)
    loss1.backward()

    grad_after_first = np.array(layer.weight.grad.data)

    # Forward + backward on second batch (gradients should accumulate)
    out2 = layer.forward(x2)
    loss2 = loss_fn.forward(out2, target)
    loss2.backward()

    grad_after_second = layer.weight.grad.data

    # Gradients should have accumulated (not been replaced)
    grad_diff = np.linalg.norm(grad_after_second - grad_after_first)

    print(f"✓ Gradient after first batch norm: {np.linalg.norm(grad_after_first):.6f}")
    print(f"✓ Gradient after second batch norm: {np.linalg.norm(grad_after_second):.6f}")
    print(f"✓ Difference: {grad_diff:.6f}")

    assert grad_diff > 1e-6, "Gradients didn't accumulate properly"

    print("\n✅ TEST PASSED: Gradients accumulate correctly")
    return True


def main():
    """Run all gradient flow tests"""
    print("\n" + "="*70)
    print(" TINYTORCH GRADIENT FLOW TEST SUITE")
    print("="*70)

    tests = [
        ("Simple Linear", test_simple_linear_gradient_flow),
        ("MLP Gradient Flow", test_mlp_gradient_flow),
        ("MLP Training", test_mlp_training_updates),
        ("CNN Gradient Flow", test_cnn_gradient_flow),
        ("CNN Training", test_cnn_training_updates),
        ("Gradient Accumulation", test_gradient_accumulation),
    ]

    results = []

    for name, test_func in tests:
        try:
            result = test_func()
            results.append((name, "PASSED" if result else "FAILED"))
        except Exception as e:
            print(f"\n❌ TEST FAILED: {name}")
            print(f"Error: {str(e)}")
            import traceback
            traceback.print_exc()
            results.append((name, "FAILED"))

    # Summary
    print("\n" + "="*70)
    print(" TEST SUMMARY")
    print("="*70)

    passed = sum(1 for _, status in results if status == "PASSED")
    total = len(results)

    for name, status in results:
        symbol = "✅" if status == "PASSED" else "❌"
        print(f"{symbol} {name}: {status}")

    print(f"\nTotal: {passed}/{total} tests passed")

    if passed == total:
        print("\n🎉 ALL TESTS PASSED! Gradients flow correctly through TinyTorch.")
        return 0
    else:
        print(f"\n⚠️ {total - passed} tests failed. Please review the errors above.")
        return 1


if __name__ == "__main__":
    exit(main())
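The accumulation behavior verified in TEST 6 is also what makes micro-batch training work: gradients sum across `backward()` calls until they are explicitly cleared. A minimal sketch of that pattern follows, reusing the `Tensor`, `Linear`, `MSELoss`, and `SGD` APIs exercised by the tests above; the `tinytorch.core.*` import paths are assumptions and may differ from the actual package layout.

```python
# Hypothetical sketch (not part of the deleted test file): accumulate gradients
# over several micro-batches, then apply a single optimizer step.
from tinytorch.core.tensor import Tensor          # assumed import path
from tinytorch.core.layers import Linear          # assumed import path
from tinytorch.core.training import MSELoss, SGD  # assumed import path

layer = Linear(2, 1)
optimizer = SGD([layer.weight, layer.bias], lr=0.1)
loss_fn = MSELoss()

micro_batches = [
    (Tensor([[1.0, 2.0]]), Tensor([[1.0]])),
    (Tensor([[3.0, 4.0]]), Tensor([[0.0]])),
]

optimizer.zero_grad()                  # clear once, before the accumulation window
for x, target in micro_batches:
    loss = loss_fn.forward(layer.forward(x), target)
    loss.backward()                    # .grad sums across calls, as TEST 6 verifies
optimizer.step()                       # one update from the accumulated gradients
```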
@@ -1,24 +0,0 @@
|
||||
# Archived Commands
|
||||
|
||||
These command files are no longer top-level commands but are kept for reference.
|
||||
|
||||
## Archived Files
|
||||
|
||||
- `clean.py` - Deprecated cleanup command
|
||||
- `help.py` - Old help command (now handled by argparse)
|
||||
- `notebooks.py` - Deprecated notebooks command
|
||||
- `status.py` - Old status command (functionality moved to module workflow)
|
||||
- `checkpoint.py` - Old checkpoint tracking (superseded by milestones command)
|
||||
- `demo.py` - Demo runner (students can run demos directly with Python)
|
||||
- `book.py` - Jupyter Book builder (developers can run jupyter-book directly)
|
||||
- `leaderboard.py` - Community leaderboard (functionality merged into community command)
|
||||
- `olympics.py` - Competition events (functionality merged into community command)
|
||||
|
||||
## Note
|
||||
|
||||
During the CLI reorganization on 2025-11-28, commands with subcommands were moved into logical subfolders:
|
||||
- `module/` - Module workflow and reset
|
||||
- `system/` - System commands (info, health, jupyter, check, version, clean_workspace, report, protect)
|
||||
- `package/` - Package management (nbdev, reset)
|
||||
|
||||
These archived files are truly deprecated and not used anywhere in the codebase.
|
||||
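For orientation, the layout implied by the note above might look roughly as follows; only the three folder names and their groupings come from the note itself, and the files inside each folder are intentionally not listed here.

```text
tito/commands/
    module/    # module workflow and reset
    system/    # info, health, jupyter, check, version, clean_workspace, report, protect
    package/   # nbdev, reset
```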
@@ -1,396 +0,0 @@
|
||||
"""
|
||||
Book command for TinyTorch CLI: builds and manages the Jupyter Book.
|
||||
"""
|
||||
|
||||
import os
|
||||
import subprocess
|
||||
from argparse import ArgumentParser, Namespace
|
||||
from pathlib import Path
|
||||
from rich.panel import Panel
|
||||
|
||||
from .base import BaseCommand
|
||||
|
||||
NOTEBOOKS_DIR = "modules"
|
||||
|
||||
class BookCommand(BaseCommand):
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "book"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Build and manage the TinyTorch Jupyter Book"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
subparsers = parser.add_subparsers(
|
||||
dest='book_command',
|
||||
help='Book management commands',
|
||||
metavar='COMMAND'
|
||||
)
|
||||
|
||||
# Build command
|
||||
build_parser = subparsers.add_parser(
|
||||
'build',
|
||||
help='Build the Jupyter Book locally'
|
||||
)
|
||||
|
||||
# Publish command
|
||||
publish_parser = subparsers.add_parser(
|
||||
'publish',
|
||||
help='Generate content, commit, and publish to GitHub'
|
||||
)
|
||||
publish_parser.add_argument(
|
||||
'--message',
|
||||
type=str,
|
||||
default='📚 Update book content',
|
||||
help='Commit message (default: "📚 Update book content")'
|
||||
)
|
||||
publish_parser.add_argument(
|
||||
'--branch',
|
||||
type=str,
|
||||
default='main',
|
||||
help='Branch to push to (default: main)'
|
||||
)
|
||||
|
||||
# Clean command
|
||||
clean_parser = subparsers.add_parser(
|
||||
'clean',
|
||||
help='Clean built book files'
|
||||
)
|
||||
|
||||
# Serve command
|
||||
serve_parser = subparsers.add_parser(
|
||||
'serve',
|
||||
help='Build and serve the Jupyter Book locally'
|
||||
)
|
||||
serve_parser.add_argument(
|
||||
'--port',
|
||||
type=int,
|
||||
default=8001,
|
||||
help='Port to serve on (default: 8001)'
|
||||
)
|
||||
serve_parser.add_argument(
|
||||
'--no-build',
|
||||
action='store_true',
|
||||
help='Skip building and serve existing files'
|
||||
)
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
console = self.console
|
||||
|
||||
# Check if we're in the right directory
|
||||
if not Path("site").exists():
|
||||
console.print(Panel(
|
||||
"[red]❌ site/ directory not found. Run this command from the TinyTorch root directory.[/red]",
|
||||
title="Error",
|
||||
border_style="red"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Handle subcommands
|
||||
if not hasattr(args, 'book_command') or not args.book_command:
|
||||
console.print(Panel(
|
||||
"[bold cyan]📚 TinyTorch Book Management[/bold cyan]\n\n"
|
||||
"[bold]Available Commands:[/bold]\n"
|
||||
" [bold green]build[/bold green] - Build the complete Jupyter Book\n"
|
||||
" [bold green]serve[/bold green] - Build and serve the Jupyter Book locally\n"
|
||||
" [bold green]publish[/bold green] - Generate content, commit, and publish to GitHub\n"
|
||||
" [bold green]clean[/bold green] - Clean built book files\n\n"
|
||||
"[bold]Quick Start:[/bold]\n"
|
||||
" [dim]tito book publish[/dim] - Generate, commit, and publish to GitHub\n"
|
||||
" [dim]tito book clean[/dim] - Clean built book files",
|
||||
title="Book Commands",
|
||||
border_style="bright_blue"
|
||||
))
|
||||
return 0
|
||||
|
||||
if args.book_command == 'build':
|
||||
return self._build_book(args)
|
||||
elif args.book_command == 'serve':
|
||||
return self._serve_book(args)
|
||||
elif args.book_command == 'publish':
|
||||
return self._publish_book(args)
|
||||
elif args.book_command == 'clean':
|
||||
return self._clean_book()
|
||||
else:
|
||||
console.print(f"[red]Unknown book command: {args.book_command}[/red]")
|
||||
return 1
|
||||
|
||||
def _generate_overview(self) -> int:
|
||||
"""Generate overview pages from modules."""
|
||||
console = self.console
|
||||
console.print("🔄 Generating overview pages from modules...")
|
||||
|
||||
try:
|
||||
os.chdir("site")
|
||||
result = subprocess.run(
|
||||
["python3", "convert_readmes.py"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
if result.returncode == 0:
|
||||
console.print("✅ Overview pages generated successfully")
|
||||
# Show summary from the output
|
||||
for line in result.stdout.split('\n'):
|
||||
if "✅ Created" in line or "🎉 Converted" in line:
|
||||
console.print(f" {line.strip()}")
|
||||
return 0
|
||||
else:
|
||||
console.print(f"[red]❌ Failed to generate overview pages: {result.stderr}[/red]")
|
||||
return 1
|
||||
|
||||
except FileNotFoundError:
|
||||
console.print("[red]❌ Python3 not found or convert_readmes.py missing[/red]")
|
||||
return 1
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error generating overview pages: {e}[/red]")
|
||||
return 1
|
||||
finally:
|
||||
os.chdir("..")
|
||||
|
||||
def _generate_all(self) -> int:
|
||||
"""Verify that all book chapters exist."""
|
||||
console = self.console
|
||||
console.print("📝 Verifying book chapters...")
|
||||
|
||||
# Check that the chapters directory exists
|
||||
chapters_dir = Path("docs/chapters")
|
||||
if not chapters_dir.exists():
|
||||
console.print("[red]❌ docs/chapters directory not found[/red]")
|
||||
return 1
|
||||
|
||||
# Count markdown files in chapters directory
|
||||
chapter_files = list(chapters_dir.glob("*.md"))
|
||||
if chapter_files:
|
||||
console.print(f"✅ Found {len(chapter_files)} chapter files")
|
||||
else:
|
||||
console.print("[yellow]⚠️ No chapter files found in docs/chapters/[/yellow]")
|
||||
|
||||
return 0
|
||||
|
||||
def _build_book(self, args: Namespace) -> int:
|
||||
"""Build the Jupyter Book locally."""
|
||||
console = self.console
|
||||
|
||||
# First generate all content (notebooks + overview pages)
|
||||
console.print("📄 Step 1: Generating all content...")
|
||||
if self._generate_all() != 0:
|
||||
return 1
|
||||
|
||||
# Then build the book
|
||||
console.print("📚 Step 2: Building Jupyter Book...")
|
||||
|
||||
try:
|
||||
os.chdir("site")
|
||||
result = subprocess.run(
|
||||
["jupyter-book", "build", "."],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
if result.returncode == 0:
|
||||
console.print("✅ Book built successfully!")
|
||||
|
||||
# Extract and show the file path
|
||||
if "file://" in result.stdout:
|
||||
for line in result.stdout.split('\n'):
|
||||
if "file://" in line:
|
||||
console.print(f"🌐 View at: {line.strip()}")
|
||||
break
|
||||
|
||||
console.print("📁 HTML files available in: docs/_build/html/")
|
||||
return 0
|
||||
else:
|
||||
console.print(f"[red]❌ Failed to build book[/red]")
|
||||
if result.stderr:
|
||||
console.print(f"Error details: {result.stderr}")
|
||||
return 1
|
||||
|
||||
except FileNotFoundError:
|
||||
console.print("[red]❌ jupyter-book not found. Install with: pip install jupyter-book[/red]")
|
||||
return 1
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error building book: {e}[/red]")
|
||||
return 1
|
||||
finally:
|
||||
os.chdir("..")
|
||||
|
||||
def _serve_book(self, args: Namespace) -> int:
|
||||
"""Build and serve the Jupyter Book locally."""
|
||||
console = self.console
|
||||
|
||||
# Build the book first unless --no-build is specified
|
||||
if not args.no_build:
|
||||
console.print("📚 Step 1: Building the book...")
|
||||
if self._build_book(args) != 0:
|
||||
return 1
|
||||
console.print()
|
||||
|
||||
# Start the HTTP server
|
||||
console.print("🌐 Step 2: Starting development server...")
|
||||
console.print(f"📖 Open your browser to: [bold blue]http://localhost:{args.port}[/bold blue]")
|
||||
console.print("🛑 Press [bold]Ctrl+C[/bold] to stop the server")
|
||||
console.print()
|
||||
|
||||
book_dir = Path("docs/_build/html")
|
||||
if not book_dir.exists():
|
||||
console.print("[red]❌ Built book not found. Run with --no-build=False to build first.[/red]")
|
||||
return 1
|
||||
|
||||
try:
|
||||
# Use Python's built-in HTTP server
|
||||
subprocess.run([
|
||||
"python3", "-m", "http.server", str(args.port),
|
||||
"--directory", str(book_dir)
|
||||
])
|
||||
except KeyboardInterrupt:
|
||||
console.print("\n🛑 Development server stopped")
|
||||
except FileNotFoundError:
|
||||
console.print("[red]❌ Python3 not found in PATH[/red]")
|
||||
return 1
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error starting server: {e}[/red]")
|
||||
return 1
|
||||
|
||||
return 0
|
||||
|
||||
def _clean_book(self) -> int:
|
||||
"""Clean built book files."""
|
||||
console = self.console
|
||||
console.print("🧹 Cleaning book build files...")
|
||||
|
||||
try:
|
||||
os.chdir("site")
|
||||
result = subprocess.run(
|
||||
["jupyter-book", "clean", "."],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
if result.returncode == 0:
|
||||
console.print("✅ Book files cleaned successfully")
|
||||
return 0
|
||||
else:
|
||||
console.print(f"[red]❌ Failed to clean book files: {result.stderr}[/red]")
|
||||
return 1
|
||||
|
||||
except FileNotFoundError:
|
||||
console.print("[red]❌ jupyter-book not found[/red]")
|
||||
return 1
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error cleaning book: {e}[/red]")
|
||||
return 1
|
||||
finally:
|
||||
os.chdir("..")
|
||||
|
||||
def _publish_book(self, args: Namespace) -> int:
|
||||
"""Generate content, commit, and publish to GitHub."""
|
||||
console = self.console
|
||||
|
||||
console.print("🚀 Starting book publishing workflow...")
|
||||
|
||||
# Step 1: Generate all content
|
||||
console.print("📝 Step 1: Generating all content...")
|
||||
if self._generate_all() != 0:
|
||||
console.print("[red]❌ Failed to generate content. Aborting publish.[/red]")
|
||||
return 1
|
||||
|
||||
# Step 2: Check git status
|
||||
console.print("🔍 Step 2: Checking git status...")
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "status", "--porcelain"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
cwd="."
|
||||
)
|
||||
|
||||
if result.returncode != 0:
|
||||
console.print("[red]❌ Git not available or not a git repository[/red]")
|
||||
return 1
|
||||
|
||||
changes = result.stdout.strip()
|
||||
if not changes:
|
||||
console.print("✅ No changes to publish")
|
||||
return 0
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error checking git status: {e}[/red]")
|
||||
return 1
|
||||
|
||||
# Step 3: Add and commit changes
|
||||
console.print("📦 Step 3: Committing changes...")
|
||||
try:
|
||||
# Add all changes
|
||||
subprocess.run(["git", "add", "."], check=True, cwd=".")
|
||||
|
||||
# Commit with message
|
||||
subprocess.run([
|
||||
"git", "commit", "-m", args.message
|
||||
], check=True, cwd=".")
|
||||
|
||||
console.print(f"✅ Committed with message: {args.message}")
|
||||
|
||||
except subprocess.CalledProcessError as e:
|
||||
console.print(f"[red]❌ Failed to commit changes: {e}[/red]")
|
||||
return 1
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error during commit: {e}[/red]")
|
||||
return 1
|
||||
|
||||
# Step 4: Push to GitHub
|
||||
console.print(f"⬆️ Step 4: Pushing to {args.branch} branch...")
|
||||
try:
|
||||
result = subprocess.run([
|
||||
"git", "push", "origin", args.branch
|
||||
], capture_output=True, text=True, cwd=".")
|
||||
|
||||
if result.returncode == 0:
|
||||
console.print(f"✅ Successfully pushed to {args.branch}")
|
||||
else:
|
||||
console.print(f"[red]❌ Failed to push: {result.stderr}[/red]")
|
||||
return 1
|
||||
|
||||
except Exception as e:
|
||||
console.print(f"[red]❌ Error during push: {e}[/red]")
|
||||
return 1
|
||||
|
||||
# Step 5: Show deployment info
|
||||
console.print("🌐 Step 5: Deployment initiated...")
|
||||
console.print("✅ GitHub Actions will now:")
|
||||
console.print(" 📚 Build the Jupyter Book")
|
||||
console.print(" 🚀 Deploy to GitHub Pages")
|
||||
console.print(" 🔗 Update live website")
|
||||
|
||||
# Try to get repository info for deployment URL
|
||||
try:
|
||||
result = subprocess.run([
|
||||
"git", "remote", "get-url", "origin"
|
||||
], capture_output=True, text=True, cwd=".")
|
||||
|
||||
if result.returncode == 0:
|
||||
remote_url = result.stdout.strip()
|
||||
if "github.com" in remote_url:
|
||||
# Extract owner/repo from git URL
|
||||
if remote_url.endswith(".git"):
|
||||
remote_url = remote_url[:-4]
|
||||
if remote_url.startswith("git@github.com:"):
|
||||
repo_path = remote_url.replace("git@github.com:", "")
|
||||
elif remote_url.startswith("https://github.com/"):
|
||||
repo_path = remote_url.replace("https://github.com/", "")
|
||||
else:
|
||||
repo_path = None
|
||||
|
||||
if repo_path:
|
||||
console.print(f"\n🔗 Monitor deployment: https://github.com/{repo_path}/actions")
|
||||
console.print(f"📖 Live website: https://{repo_path.split('/')[0]}.github.io/{repo_path.split('/')[1]}/")
|
||||
|
||||
except Exception:
|
||||
# Don't fail the whole command if we can't get repo info
|
||||
pass
|
||||
|
||||
console.print("\n🎉 Publishing workflow complete!")
|
||||
console.print("💡 Check GitHub Actions for deployment status")
|
||||
|
||||
return 0
|
||||
@@ -1,690 +0,0 @@
|
||||
"""
|
||||
Checkpoint tracking and visualization command for TinyTorch CLI.
|
||||
|
||||
Provides capability-based progress tracking through the ML systems engineering journey:
|
||||
Foundation → Architecture → Training → Inference → Serving
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Tuple, Optional
|
||||
from rich.console import Console
|
||||
from rich.panel import Panel
|
||||
from rich.progress import Progress, BarColumn, TextColumn, SpinnerColumn
|
||||
from rich.table import Table
|
||||
from rich.tree import Tree
|
||||
from rich.text import Text
|
||||
from rich.layout import Layout
|
||||
from rich.columns import Columns
|
||||
from rich.status import Status
|
||||
|
||||
from .base import BaseCommand
|
||||
from ..core.config import CLIConfig
|
||||
from ..core.console import get_console, print_error, print_success
|
||||
|
||||
|
||||
class CheckpointSystem:
|
||||
"""Core checkpoint tracking system."""
|
||||
|
||||
# Define the 20-checkpoint structure for complete ML systems engineering journey
|
||||
CHECKPOINTS = {
|
||||
"00": {
|
||||
"name": "Environment",
|
||||
"description": "Development environment setup and configuration",
|
||||
"test_file": "checkpoint_00_environment.py",
|
||||
"capability": "Can I configure my TinyTorch development environment?"
|
||||
},
|
||||
"01": {
|
||||
"name": "Foundation",
|
||||
"description": "Basic tensor operations and ML building blocks",
|
||||
"test_file": "checkpoint_01_foundation.py",
|
||||
"capability": "Can I create and manipulate the building blocks of ML?"
|
||||
},
|
||||
"02": {
|
||||
"name": "Intelligence",
|
||||
"description": "Nonlinear activation functions",
|
||||
"test_file": "checkpoint_02_intelligence.py",
|
||||
"capability": "Can I add nonlinearity - the key to neural network intelligence?"
|
||||
},
|
||||
"03": {
|
||||
"name": "Components",
|
||||
"description": "Fundamental neural network building blocks",
|
||||
"test_file": "checkpoint_03_components.py",
|
||||
"capability": "Can I build the fundamental building blocks of neural networks?"
|
||||
},
|
||||
"04": {
|
||||
"name": "Networks",
|
||||
"description": "Complete multi-layer neural networks",
|
||||
"test_file": "checkpoint_04_networks.py",
|
||||
"capability": "Can I build complete multi-layer neural networks?"
|
||||
},
|
||||
"05": {
|
||||
"name": "Learning",
|
||||
"description": "Spatial data processing with convolutional operations",
|
||||
"test_file": "checkpoint_05_learning.py",
|
||||
"capability": "Can I process spatial data like images with convolutional operations?"
|
||||
},
|
||||
"06": {
|
||||
"name": "Attention",
|
||||
"description": "Attention mechanisms for sequence understanding",
|
||||
"test_file": "checkpoint_06_attention.py",
|
||||
"capability": "Can I build attention mechanisms for sequence understanding?"
|
||||
},
|
||||
"07": {
|
||||
"name": "Stability",
|
||||
"description": "Training stabilization with normalization",
|
||||
"test_file": "checkpoint_07_stability.py",
|
||||
"capability": "Can I stabilize training with normalization techniques?"
|
||||
},
|
||||
"08": {
|
||||
"name": "Differentiation",
|
||||
"description": "Automatic gradient computation for learning",
|
||||
"test_file": "checkpoint_08_differentiation.py",
|
||||
"capability": "Can I automatically compute gradients for learning?"
|
||||
},
|
||||
"09": {
|
||||
"name": "Optimization",
|
||||
"description": "Sophisticated optimization algorithms",
|
||||
"test_file": "checkpoint_09_optimization.py",
|
||||
"capability": "Can I optimize neural networks with sophisticated algorithms?"
|
||||
},
|
||||
"10": {
|
||||
"name": "Training",
|
||||
"description": "Complete training loops for end-to-end learning",
|
||||
"test_file": "checkpoint_10_training.py",
|
||||
"capability": "Can I build complete training loops for end-to-end learning?"
|
||||
},
|
||||
"11": {
|
||||
"name": "Regularization",
|
||||
"description": "Overfitting prevention and robust model building",
|
||||
"test_file": "checkpoint_11_regularization.py",
|
||||
"capability": "Can I prevent overfitting and build robust models?"
|
||||
},
|
||||
"12": {
|
||||
"name": "Kernels",
|
||||
"description": "High-performance computational kernels",
|
||||
"test_file": "checkpoint_12_kernels.py",
|
||||
"capability": "Can I implement high-performance computational kernels?"
|
||||
},
|
||||
"13": {
|
||||
"name": "Benchmarking",
|
||||
"description": "Performance analysis and bottleneck identification",
|
||||
"test_file": "checkpoint_13_benchmarking.py",
|
||||
"capability": "Can I analyze performance and identify bottlenecks in ML systems?"
|
||||
},
|
||||
"14": {
|
||||
"name": "Deployment",
|
||||
"description": "Production deployment and monitoring",
|
||||
"test_file": "checkpoint_14_deployment.py",
|
||||
"capability": "Can I deploy and monitor ML systems in production?"
|
||||
},
|
||||
"15": {
|
||||
"name": "Acceleration",
|
||||
"description": "Algorithmic optimization and acceleration techniques",
|
||||
"test_file": "checkpoint_15_acceleration.py",
|
||||
"capability": "Can I accelerate computations through algorithmic optimization?"
|
||||
},
|
||||
"16": {
|
||||
"name": "Quantization",
|
||||
"description": "Trading precision for speed with INT8 quantization",
|
||||
"test_file": "checkpoint_16_quantization.py",
|
||||
"capability": "Can I trade precision for speed with INT8 quantization?"
|
||||
},
|
||||
"17": {
|
||||
"name": "Compression",
|
||||
"description": "Neural network pruning for edge deployment",
|
||||
"test_file": "checkpoint_17_compression.py",
|
||||
"capability": "Can I remove 70% of parameters while maintaining accuracy?"
|
||||
},
|
||||
"18": {
|
||||
"name": "Caching",
|
||||
"description": "KV caching for transformer inference optimization",
|
||||
"test_file": "checkpoint_18_caching.py",
|
||||
"capability": "Can I transform O(N²) to O(N) complexity with intelligent caching?"
|
||||
},
|
||||
"19": {
|
||||
"name": "Competition",
|
||||
"description": "TinyMLPerf competition system for optimization mastery",
|
||||
"test_file": "checkpoint_19_competition.py",
|
||||
"capability": "Can I build competition-grade benchmarking infrastructure?"
|
||||
},
|
||||
"20": {
|
||||
"name": "TinyGPT Capstone",
|
||||
"description": "Complete language model demonstrating ML systems mastery",
|
||||
"test_file": "checkpoint_20_capstone.py",
|
||||
"capability": "Can I build a complete language model that generates coherent text from scratch?"
|
||||
}
|
||||
}
|
||||
|
||||
def __init__(self, config: CLIConfig):
|
||||
"""Initialize checkpoint system."""
|
||||
self.config = config
|
||||
self.console = get_console()
|
||||
self.modules_dir = config.project_root / "modules" / "source"
|
||||
self.checkpoints_dir = config.project_root / "tests" / "checkpoints"
|
||||
|
||||
def get_checkpoint_test_status(self, checkpoint_id: str) -> Dict[str, bool]:
|
||||
"""Get the status of a checkpoint test file."""
|
||||
if checkpoint_id not in self.CHECKPOINTS:
|
||||
return {"exists": False, "tested": False, "passed": False}
|
||||
|
||||
test_file = self.CHECKPOINTS[checkpoint_id]["test_file"]
|
||||
test_path = self.checkpoints_dir / test_file
|
||||
|
||||
return {
|
||||
"exists": test_path.exists(),
|
||||
"tested": False, # Will be set when we run tests
|
||||
"passed": False # Will be set based on test results
|
||||
}
|
||||
|
||||
def get_checkpoint_status(self, checkpoint_id: str) -> Dict:
|
||||
"""Get status information for a checkpoint."""
|
||||
checkpoint = self.CHECKPOINTS[checkpoint_id]
|
||||
test_status = self.get_checkpoint_test_status(checkpoint_id)
|
||||
|
||||
return {
|
||||
"checkpoint": checkpoint,
|
||||
"test_status": test_status,
|
||||
"is_available": test_status["exists"],
|
||||
"is_complete": test_status.get("passed", False),
|
||||
"checkpoint_id": checkpoint_id
|
||||
}
|
||||
|
||||
def get_overall_progress(self) -> Dict:
|
||||
"""Get overall progress across all checkpoints."""
|
||||
checkpoints_status = {}
|
||||
current_checkpoint = None
|
||||
total_complete = 0
|
||||
total_checkpoints = len(self.CHECKPOINTS)
|
||||
|
||||
for checkpoint_id in self.CHECKPOINTS.keys():
|
||||
status = self.get_checkpoint_status(checkpoint_id)
|
||||
checkpoints_status[checkpoint_id] = status
|
||||
|
||||
if status["is_complete"]:
|
||||
total_complete += 1
|
||||
elif current_checkpoint is None and status["is_available"]:
|
||||
# First available but incomplete checkpoint is current
|
||||
current_checkpoint = checkpoint_id
|
||||
|
||||
# If all are complete, set current to last checkpoint
|
||||
if current_checkpoint is None and total_complete == total_checkpoints:
|
||||
current_checkpoint = list(self.CHECKPOINTS.keys())[-1]
|
||||
# If none are complete, start with first
|
||||
elif current_checkpoint is None:
|
||||
current_checkpoint = "00"
|
||||
|
||||
# Calculate overall percentage
|
||||
overall_percent = (total_complete / total_checkpoints * 100) if total_checkpoints > 0 else 0
|
||||
|
||||
return {
|
||||
"checkpoints": checkpoints_status,
|
||||
"current": current_checkpoint,
|
||||
"overall_progress": overall_percent,
|
||||
"total_complete": total_complete,
|
||||
"total_checkpoints": total_checkpoints
|
||||
}
|
||||
|
||||
def run_checkpoint_test(self, checkpoint_id: str) -> Dict:
|
||||
"""Run a specific checkpoint test and return results."""
|
||||
if checkpoint_id not in self.CHECKPOINTS:
|
||||
return {"success": False, "error": f"Unknown checkpoint: {checkpoint_id}"}
|
||||
|
||||
checkpoint = self.CHECKPOINTS[checkpoint_id]
|
||||
test_file = checkpoint["test_file"]
|
||||
test_path = self.checkpoints_dir / test_file
|
||||
|
||||
if not test_path.exists():
|
||||
return {"success": False, "error": f"Test file not found: {test_file}"}
|
||||
|
||||
try:
|
||||
# Run the test using subprocess to capture output
|
||||
result = subprocess.run(
|
||||
[sys.executable, str(test_path)],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
cwd=self.config.project_root,
|
||||
timeout=30 # 30 second timeout
|
||||
)
|
||||
|
||||
return {
|
||||
"success": result.returncode == 0,
|
||||
"returncode": result.returncode,
|
||||
"stdout": result.stdout,
|
||||
"stderr": result.stderr,
|
||||
"checkpoint_name": checkpoint["name"],
|
||||
"capability": checkpoint["capability"]
|
||||
}
|
||||
|
||||
except subprocess.TimeoutExpired:
|
||||
return {"success": False, "error": "Test timed out after 30 seconds"}
|
||||
except Exception as e:
|
||||
return {"success": False, "error": f"Test execution failed: {str(e)}"}
|
||||
|
||||
|
||||
class CheckpointCommand(BaseCommand):
|
||||
"""Checkpoint tracking and visualization command."""
|
||||
|
||||
name = "checkpoint"
|
||||
description = "Track and visualize ML systems engineering progress through checkpoints"
|
||||
|
||||
def add_arguments(self, parser: argparse.ArgumentParser) -> None:
|
||||
"""Add checkpoint-specific arguments."""
|
||||
subparsers = parser.add_subparsers(
|
||||
dest='checkpoint_command',
|
||||
help='Checkpoint operations',
|
||||
metavar='COMMAND'
|
||||
)
|
||||
|
||||
# Status command
|
||||
status_parser = subparsers.add_parser(
|
||||
'status',
|
||||
help='Show current checkpoint progress'
|
||||
)
|
||||
status_parser.add_argument(
|
||||
'--detailed', '-d',
|
||||
action='store_true',
|
||||
help='Show detailed module-level progress'
|
||||
)
|
||||
|
||||
# Timeline command
|
||||
timeline_parser = subparsers.add_parser(
|
||||
'timeline',
|
||||
help='Show visual progress timeline'
|
||||
)
|
||||
timeline_parser.add_argument(
|
||||
'--horizontal',
|
||||
action='store_true',
|
||||
help='Show horizontal timeline (default: vertical)'
|
||||
)
|
||||
|
||||
# Test command
|
||||
test_parser = subparsers.add_parser(
|
||||
'test',
|
||||
help='Test checkpoint capabilities'
|
||||
)
|
||||
test_parser.add_argument(
|
||||
'checkpoint_id',
|
||||
nargs='?',
|
||||
help='Checkpoint ID to test (00-20, current checkpoint if not specified)'
|
||||
)
|
||||
|
||||
# Run command (new)
|
||||
run_parser = subparsers.add_parser(
|
||||
'run',
|
||||
help='Run specific checkpoint tests with progress tracking'
|
||||
)
|
||||
run_parser.add_argument(
|
||||
'checkpoint_id',
|
||||
help='Checkpoint ID to run (00-20)'
|
||||
)
|
||||
run_parser.add_argument(
|
||||
'--verbose', '-v',
|
||||
action='store_true',
|
||||
help='Show detailed test output'
|
||||
)
|
||||
|
||||
# Unlock command
|
||||
unlock_parser = subparsers.add_parser(
|
||||
'unlock',
|
||||
help='Attempt to unlock next checkpoint'
|
||||
)
|
||||
|
||||
def run(self, args: argparse.Namespace) -> int:
|
||||
"""Execute checkpoint command."""
|
||||
checkpoint_system = CheckpointSystem(self.config)
|
||||
|
||||
if not args.checkpoint_command:
|
||||
return self._show_help(args)
|
||||
|
||||
if args.checkpoint_command == 'status':
|
||||
return self._show_status(checkpoint_system, args)
|
||||
elif args.checkpoint_command == 'timeline':
|
||||
return self._show_timeline(checkpoint_system, args)
|
||||
elif args.checkpoint_command == 'test':
|
||||
return self._test_checkpoint(checkpoint_system, args)
|
||||
elif args.checkpoint_command == 'run':
|
||||
return self._run_checkpoint(checkpoint_system, args)
|
||||
elif args.checkpoint_command == 'unlock':
|
||||
return self._unlock_checkpoint(checkpoint_system, args)
|
||||
else:
|
||||
print_error(f"Unknown checkpoint command: {args.checkpoint_command}")
|
||||
return 1
|
||||
|
||||
def _show_help(self, args: argparse.Namespace) -> int:
|
||||
"""Show checkpoint command help."""
|
||||
console = get_console()
|
||||
console.print(Panel(
|
||||
"[bold cyan]TinyTorch Checkpoint System[/bold cyan]\n\n"
|
||||
"[bold]Track your progress through 20 capability checkpoints:[/bold]\n"
|
||||
" 00-04: Foundation → Environment, tensors, networks\n"
|
||||
" 05-09: Architecture → Spatial, attention, autograd, optimization\n"
|
||||
" 10-14: Systems → Training, kernels, benchmarking, deployment\n"
|
||||
" 15-19: Optimization → Acceleration, quantization, compression, caching, competition\n"
|
||||
" 20: Capstone → Complete TinyGPT language model\n\n"
|
||||
"[bold]Available Commands:[/bold]\n"
|
||||
" [green]status[/green] - Show current progress and capabilities\n"
|
||||
" [green]timeline[/green] - Visual progress timeline\n"
|
||||
" [green]test[/green] - Test checkpoint capabilities\n"
|
||||
" [green]run[/green] - Run specific checkpoint with progress\n"
|
||||
" [green]unlock[/green] - Attempt to unlock next checkpoint\n\n"
|
||||
"[bold]Examples:[/bold]\n"
|
||||
" [dim]tito checkpoint status --detailed[/dim]\n"
|
||||
" [dim]tito checkpoint timeline --horizontal[/dim]\n"
|
||||
" [dim]tito checkpoint test 16[/dim]\n"
|
||||
" [dim]tito checkpoint run 20 --verbose[/dim]",
|
||||
title="Checkpoint System (20 Checkpoints)",
|
||||
border_style="bright_blue"
|
||||
))
|
||||
return 0
|
||||
|
||||
def _show_status(self, checkpoint_system: CheckpointSystem, args: argparse.Namespace) -> int:
|
||||
"""Show checkpoint status."""
|
||||
console = get_console()
|
||||
progress_data = checkpoint_system.get_overall_progress()
|
||||
|
||||
# Header
|
||||
console.print(Panel(
|
||||
"[bold cyan]🚀 TinyTorch Framework Capabilities[/bold cyan]",
|
||||
border_style="bright_blue"
|
||||
))
|
||||
|
||||
# Overall progress
|
||||
overall_percent = progress_data["overall_progress"]
|
||||
console.print(f"\n[bold]Overall Progress:[/bold] {overall_percent:.0f}% ({progress_data['total_complete']}/{progress_data['total_checkpoints']} checkpoints)")
|
||||
|
||||
# Current status summary
|
||||
current = progress_data["current"]
|
||||
if current:
|
||||
current_status = progress_data["checkpoints"][current]
|
||||
current_name = current_status["checkpoint"]["name"]
|
||||
|
||||
console.print(f"[bold]Current Checkpoint:[/bold] {current:0>2} - {current_name}")
|
||||
|
||||
if current_status["is_complete"]:
|
||||
console.print(f"[bold green]✅ {current_name} checkpoint achieved![/bold green]")
|
||||
console.print(f"[dim]Capability unlocked: {current_status['checkpoint']['capability']}[/dim]")
|
||||
else:
|
||||
console.print(f"[bold yellow]🎯 Ready to test {current_name} capabilities[/bold yellow]")
|
||||
console.print(f"[dim]Goal: {current_status['checkpoint']['capability']}[/dim]")
|
||||
|
||||
console.print()
|
||||
|
||||
# Checkpoint progress
|
||||
for checkpoint_id, checkpoint_data in progress_data["checkpoints"].items():
|
||||
checkpoint = checkpoint_data["checkpoint"]
|
||||
|
||||
# Checkpoint header
|
||||
if checkpoint_data["is_complete"]:
|
||||
status_icon = "✅"
|
||||
status_color = "green"
|
||||
elif checkpoint_id == current:
|
||||
status_icon = "🎯"
|
||||
status_color = "yellow"
|
||||
else:
|
||||
status_icon = "⏳"
|
||||
status_color = "dim"
|
||||
|
||||
console.print(f"[bold]{status_icon} {checkpoint_id:0>2}: {checkpoint['name']}[/bold] [{status_color}]{'COMPLETE' if checkpoint_data['is_complete'] else 'PENDING'}[/{status_color}]")
|
||||
|
||||
if args.detailed:
|
||||
# Show test file and availability
|
||||
test_status = checkpoint_data["test_status"]
|
||||
test_available = "✅" if test_status["exists"] else "❌"
|
||||
console.print(f" {test_available} Test: {checkpoint['test_file']}")
|
||||
|
||||
console.print(f" [dim]{checkpoint['capability']}[/dim]\n")
|
||||
|
||||
return 0
|
||||
|
||||
def _show_timeline(self, checkpoint_system: CheckpointSystem, args: argparse.Namespace) -> int:
|
||||
"""Show visual timeline with Rich progress bar."""
|
||||
console = get_console()
|
||||
progress_data = checkpoint_system.get_overall_progress()
|
||||
|
||||
console.print("\n[bold cyan]🚀 TinyTorch Framework Progress Timeline[/bold cyan]\n")
|
||||
|
||||
if args.horizontal:
|
||||
# Enhanced horizontal timeline with progress line
|
||||
overall_percent = progress_data["overall_progress"]
|
||||
total_checkpoints = progress_data["total_checkpoints"]
|
||||
complete_checkpoints = progress_data["total_complete"]
|
||||
|
||||
# Create a visual progress bar
|
||||
filled = int(overall_percent / 2) # 50 characters total width
|
||||
bar = "█" * filled + "░" * (50 - filled)
|
||||
console.print(f"[bold]Overall:[/bold] [{bar}] {overall_percent:.0f}%")
|
||||
console.print(f"[dim]{complete_checkpoints}/{total_checkpoints} checkpoints complete[/dim]\n")
|
||||
|
||||
# Show checkpoint progression - group in rows of 8
|
||||
checkpoints_list = list(progress_data["checkpoints"].items())
|
||||
|
||||
for row_start in range(0, len(checkpoints_list), 8):
|
||||
row_checkpoints = checkpoints_list[row_start:row_start + 8]
|
||||
|
||||
# Build the checkpoint line for this row
|
||||
checkpoint_line = ""
|
||||
names_line = ""
|
||||
|
||||
for i, (checkpoint_id, checkpoint_data) in enumerate(row_checkpoints):
|
||||
checkpoint = checkpoint_data["checkpoint"]
|
||||
|
||||
# Checkpoint status
|
||||
if checkpoint_data["is_complete"]:
|
||||
checkpoint_marker = f"[green]●[/green]"
|
||||
name_color = "green"
|
||||
elif checkpoint_id == progress_data["current"]:
|
||||
checkpoint_marker = f"[yellow]◉[/yellow]"
|
||||
name_color = "yellow"
|
||||
else:
|
||||
checkpoint_marker = f"[dim]○[/dim]"
|
||||
name_color = "dim"
|
||||
|
||||
# Add checkpoint with ID
|
||||
checkpoint_line += f"{checkpoint_marker}{checkpoint_id}"
|
||||
names_line += f"[{name_color}]{checkpoint['name'][:9]:^9}[/{name_color}]"
|
||||
|
||||
# Add spacing (except for last in row)
|
||||
if i < len(row_checkpoints) - 1:
|
||||
if checkpoint_data["is_complete"]:
|
||||
checkpoint_line += "[green]━━[/green]"
|
||||
else:
|
||||
checkpoint_line += "[dim]━━[/dim]"
|
||||
names_line += " "
|
||||
|
||||
console.print(checkpoint_line)
|
||||
console.print(names_line)
|
||||
console.print() # Empty line between rows
|
||||
|
||||
else:
|
||||
# Vertical timeline (tree structure)
|
||||
tree = Tree("ML Systems Engineering Journey (20 Checkpoints)")
|
||||
|
||||
for checkpoint_id, checkpoint_data in progress_data["checkpoints"].items():
|
||||
checkpoint = checkpoint_data["checkpoint"]
|
||||
|
||||
if checkpoint_data["is_complete"]:
|
||||
checkpoint_text = f"[green]✅ {checkpoint_id}: {checkpoint['name']}[/green]"
|
||||
elif checkpoint_id == progress_data["current"]:
|
||||
checkpoint_text = f"[yellow]🎯 {checkpoint_id}: {checkpoint['name']} (CURRENT)[/yellow]"
|
||||
else:
|
||||
checkpoint_text = f"[dim]⏳ {checkpoint_id}: {checkpoint['name']}[/dim]"
|
||||
|
||||
checkpoint_node = tree.add(checkpoint_text)
|
||||
checkpoint_node.add(f"[dim]{checkpoint['capability']}[/dim]")
|
||||
|
||||
console.print(tree)
|
||||
|
||||
console.print()
|
||||
return 0
|
||||
|
||||
def _test_checkpoint(self, checkpoint_system: CheckpointSystem, args: argparse.Namespace) -> int:
|
||||
"""Test checkpoint capabilities."""
|
||||
console = get_console()
|
||||
|
||||
# Determine which checkpoint to test
|
||||
checkpoint_id = args.checkpoint_id
|
||||
if not checkpoint_id:
|
||||
progress_data = checkpoint_system.get_overall_progress()
|
||||
checkpoint_id = progress_data["current"]
|
||||
|
||||
# Validate checkpoint ID
|
||||
if checkpoint_id not in checkpoint_system.CHECKPOINTS:
|
||||
print_error(f"Unknown checkpoint: {checkpoint_id}")
|
||||
console.print(f"[dim]Available checkpoints: {', '.join(checkpoint_system.CHECKPOINTS.keys())}[/dim]")
|
||||
return 1
|
||||
|
||||
checkpoint = checkpoint_system.CHECKPOINTS[checkpoint_id]
|
||||
|
||||
# Show what we're testing
|
||||
console.print(f"\n[bold cyan]Testing Checkpoint {checkpoint_id}: {checkpoint['name']}[/bold cyan]")
|
||||
console.print(f"[bold]Capability Question:[/bold] {checkpoint['capability']}\n")
|
||||
|
||||
# Run the test
|
||||
with console.status(f"[bold green]Running checkpoint {checkpoint_id} test...", spinner="dots") as status:
|
||||
result = checkpoint_system.run_checkpoint_test(checkpoint_id)
|
||||
|
||||
# Display results
|
||||
if result["success"]:
|
||||
console.print(f"[bold green]✅ Checkpoint {checkpoint_id} PASSED![/bold green]")
|
||||
console.print(f"[green]Capability achieved: {checkpoint['capability']}[/green]\n")
|
||||
|
||||
# Show brief output
|
||||
if result.get("stdout") and "🎉" in result["stdout"]:
|
||||
# Extract the completion message
|
||||
lines = result["stdout"].split('\n')
|
||||
for line in lines:
|
||||
if "🎉" in line or "📝" in line or "🎯" in line:
|
||||
console.print(f"[dim]{line}[/dim]")
|
||||
|
||||
print_success(f"Checkpoint {checkpoint_id} test completed successfully!")
|
||||
return 0
|
||||
else:
|
||||
console.print(f"[bold red]❌ Checkpoint {checkpoint_id} FAILED[/bold red]\n")
|
||||
|
||||
# Show error details
|
||||
if "error" in result:
|
||||
console.print(f"[red]Error: {result['error']}[/red]")
|
||||
elif result.get("stderr"):
|
||||
console.print(f"[red]Error output:[/red]")
|
||||
console.print(f"[dim]{result['stderr']}[/dim]")
|
||||
elif result.get("stdout"):
|
||||
console.print(f"[yellow]Test output:[/yellow]")
|
||||
console.print(f"[dim]{result['stdout']}[/dim]")
|
||||
|
||||
print_error(f"Checkpoint {checkpoint_id} test failed")
|
||||
return 1
|
||||
|
||||
def _run_checkpoint(self, checkpoint_system: CheckpointSystem, args: argparse.Namespace) -> int:
|
||||
"""Run specific checkpoint test with detailed progress tracking."""
|
||||
console = get_console()
|
||||
checkpoint_id = args.checkpoint_id
|
||||
|
||||
# Validate checkpoint ID
|
||||
if checkpoint_id not in checkpoint_system.CHECKPOINTS:
|
||||
print_error(f"Unknown checkpoint: {checkpoint_id}")
|
||||
console.print(f"[dim]Available checkpoints: {', '.join(checkpoint_system.CHECKPOINTS.keys())}[/dim]")
|
||||
return 1
|
||||
|
||||
checkpoint = checkpoint_system.CHECKPOINTS[checkpoint_id]
|
||||
|
||||
# Show detailed information
|
||||
console.print(Panel(
|
||||
f"[bold cyan]Checkpoint {checkpoint_id}: {checkpoint['name']}[/bold cyan]\n\n"
|
||||
f"[bold]Capability Question:[/bold]\n{checkpoint['capability']}\n\n"
|
||||
f"[bold]Test File:[/bold] {checkpoint['test_file']}\n"
|
||||
f"[bold]Description:[/bold] {checkpoint['description']}",
|
||||
title=f"Running Checkpoint {checkpoint_id}",
|
||||
border_style="bright_blue"
|
||||
))
|
||||
|
||||
# Check if test file exists
|
||||
test_path = checkpoint_system.checkpoints_dir / checkpoint["test_file"]
|
||||
if not test_path.exists():
|
||||
print_error(f"Test file not found: {checkpoint['test_file']}")
|
||||
return 1
|
||||
|
||||
console.print(f"\n[bold]Executing test...[/bold]")
|
||||
|
||||
# Run the test with status feedback
|
||||
with console.status(f"[bold green]Running checkpoint {checkpoint_id} test...", spinner="dots"):
|
||||
result = checkpoint_system.run_checkpoint_test(checkpoint_id)
|
||||
|
||||
console.print()
|
||||
|
||||
# Display detailed results
|
||||
if result["success"]:
|
||||
console.print(Panel(
|
||||
f"[bold green]✅ SUCCESS![/bold green]\n\n"
|
||||
f"[green]Checkpoint {checkpoint_id} completed successfully![/green]\n"
|
||||
f"[green]Capability achieved: {checkpoint['capability']}[/green]",
|
||||
title="Test Results",
|
||||
border_style="green"
|
||||
))
|
||||
|
||||
# Show test output if verbose or if it contains key markers
|
||||
if args.verbose or (result.get("stdout") and any(marker in result["stdout"] for marker in ["🎉", "✅", "📝", "🎯"])):
|
||||
console.print(f"\n[bold]Test Output:[/bold]")
|
||||
if result.get("stdout"):
|
||||
console.print(result["stdout"])
|
||||
|
||||
return 0
|
||||
else:
|
||||
console.print(Panel(
|
||||
f"[bold red]❌ FAILED[/bold red]\n\n"
|
||||
f"[red]Checkpoint {checkpoint_id} test failed[/red]\n"
|
||||
f"[yellow]This indicates the required capabilities are not yet implemented.[/yellow]",
|
||||
title="Test Results",
|
||||
border_style="red"
|
||||
))
|
||||
|
||||
# Show error details
|
||||
if "error" in result:
|
||||
console.print(f"\n[bold red]Error:[/bold red] {result['error']}")
|
||||
|
||||
if args.verbose or "error" in result:
|
||||
if result.get("stdout"):
|
||||
console.print(f"\n[bold]Standard Output:[/bold]")
|
||||
console.print(result["stdout"])
|
||||
if result.get("stderr"):
|
||||
console.print(f"\n[bold]Error Output:[/bold]")
|
||||
console.print(result["stderr"])
|
||||
|
||||
return 1
|
||||
|
||||
def _unlock_checkpoint(self, checkpoint_system: CheckpointSystem, args: argparse.Namespace) -> int:
|
||||
"""Attempt to unlock next checkpoint."""
|
||||
console = get_console()
|
||||
progress_data = checkpoint_system.get_overall_progress()
|
||||
current = progress_data["current"]
|
||||
|
||||
if not current:
|
||||
console.print("[green]All checkpoints completed! 🎉[/green]")
|
||||
return 0
|
||||
|
||||
current_status = progress_data["checkpoints"][current]
|
||||
|
||||
if current_status["is_complete"]:
|
||||
console.print(f"[green]✅ Checkpoint {current} ({current_status['checkpoint']['name']}) already complete![/green]")
|
||||
|
||||
# Find next checkpoint
|
||||
checkpoint_ids = list(checkpoint_system.CHECKPOINTS.keys())
|
||||
try:
|
||||
current_index = checkpoint_ids.index(current)
|
||||
if current_index < len(checkpoint_ids) - 1:
|
||||
next_id = checkpoint_ids[current_index + 1]
|
||||
next_checkpoint = checkpoint_system.CHECKPOINTS[next_id]
|
||||
console.print(f"[bold]Next checkpoint:[/bold] {next_id} - {next_checkpoint['name']}")
|
||||
console.print(f"[dim]Goal: {next_checkpoint['capability']}[/dim]")
|
||||
else:
|
||||
console.print("[bold]🎉 All checkpoints completed![/bold]")
|
||||
except ValueError:
|
||||
console.print("[yellow]Cannot determine next checkpoint[/yellow]")
|
||||
else:
|
||||
console.print(f"[yellow]Test checkpoint {current} to unlock your next capability:[/yellow]")
|
||||
console.print(f"[bold]Goal:[/bold] {current_status['checkpoint']['capability']}")
|
||||
console.print(f"[dim]Run: tito checkpoint run {current}[/dim]")
|
||||
|
||||
return 0
|
||||
@@ -1,160 +0,0 @@
|
||||
"""
|
||||
Clean command for TinyTorch CLI: cleans up module directories to start fresh.
|
||||
"""
|
||||
|
||||
import shutil
|
||||
from argparse import ArgumentParser, Namespace
|
||||
from pathlib import Path
|
||||
from rich.panel import Panel
|
||||
from rich.text import Text
|
||||
|
||||
from .base import BaseCommand
|
||||
|
||||
class CleanCommand(BaseCommand):
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "clean"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Clean up module directories (notebooks, cache, etc.)"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
parser.add_argument("module", nargs="?", help="Clean specific module only")
|
||||
parser.add_argument("--notebooks", action="store_true", help="Remove generated notebook files")
|
||||
parser.add_argument("--cache", action="store_true", help="Remove Python cache files")
|
||||
parser.add_argument("--all", action="store_true", help="Clean all modules")
|
||||
parser.add_argument("--force", action="store_true", help="Skip confirmation prompt")
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
console = self.console
|
||||
|
||||
console.print(Panel("🧹 Cleaning Module Directories",
|
||||
title="Module Cleanup", border_style="bright_yellow"))
|
||||
|
||||
modules_dir = Path("modules")
|
||||
if not modules_dir.exists():
|
||||
console.print(Panel("[red]❌ modules/ directory not found[/red]",
|
||||
title="Error", border_style="red"))
|
||||
return 1
|
||||
|
||||
# Determine what to clean (file types)
|
||||
clean_notebooks = args.notebooks or (not args.notebooks and not args.cache)
|
||||
clean_cache = args.cache or (not args.notebooks and not args.cache)
|
||||
|
||||
# Determine which modules to clean
|
||||
if args.module:
|
||||
module_path = modules_dir / args.module
|
||||
if not module_path.exists():
|
||||
console.print(Panel(f"[red]❌ Module '{args.module}' not found[/red]",
|
||||
title="Module Not Found", border_style="red"))
|
||||
return 1
|
||||
module_dirs = [module_path]
|
||||
elif args.all:
|
||||
# Find all module directories (exclude special directories)
|
||||
exclude_dirs = {'.quarto', '__pycache__', '.git', '.pytest_cache', 'sidebar.yml', 'nbdev.yml'}
|
||||
module_dirs = [d for d in modules_dir.iterdir()
|
||||
if d.is_dir() and d.name not in exclude_dirs]
|
||||
else:
|
||||
# No module specified and no --all flag
|
||||
console.print(Panel("[red]❌ Please specify a module name or use --all to clean all modules[/red]\n\n"
|
||||
"[dim]Examples:[/dim]\n"
|
||||
"[dim] tito module clean tensor - Clean specific module[/dim]\n"
|
||||
"[dim] tito module clean --all - Clean all modules[/dim]",
|
||||
title="Module Required", border_style="red"))
|
||||
return 1
|
||||
|
||||
if not module_dirs:
|
||||
console.print(Panel("[yellow]⚠️ No modules found to clean[/yellow]",
|
||||
title="Nothing to Clean", border_style="yellow"))
|
||||
return 0
|
||||
|
||||
# Show what will be cleaned
|
||||
clean_text = Text()
|
||||
clean_text.append("📋 Cleanup Plan:\n\n", style="bold cyan")
|
||||
|
||||
files_to_remove = []
|
||||
for module_dir in module_dirs:
|
||||
module_name = module_dir.name
|
||||
clean_text.append(f"📁 {module_name}:\n", style="bold white")
|
||||
|
||||
if clean_notebooks:
|
||||
# Find .ipynb files
|
||||
for ipynb_file in module_dir.glob("*.ipynb"):
|
||||
files_to_remove.append(ipynb_file)
|
||||
clean_text.append(f" 🗑️ {ipynb_file.name}\n", style="yellow")
|
||||
|
||||
if clean_cache:
|
||||
# Find __pycache__ directories
|
||||
pycache_dirs = []
|
||||
for pycache in module_dir.rglob("__pycache__"):
|
||||
if pycache.is_dir():
|
||||
pycache_dirs.append(pycache)
|
||||
files_to_remove.append(pycache)
|
||||
clean_text.append(f" 🗑️ {pycache.relative_to(module_dir)}/\n", style="yellow")
|
||||
|
||||
# Find .pyc files that are NOT inside __pycache__ directories
|
||||
for pyc_file in module_dir.rglob("*.pyc"):
|
||||
# Check if this pyc file is inside any __pycache__ directory
|
||||
is_in_pycache = any(pycache in pyc_file.parents for pycache in pycache_dirs)
|
||||
if not is_in_pycache:
|
||||
files_to_remove.append(pyc_file)
|
||||
clean_text.append(f" 🗑️ {pyc_file.relative_to(module_dir)}\n", style="yellow")
|
||||
|
||||
if not files_to_remove:
|
||||
console.print(Panel("[green]✅ No files found to clean - modules are already clean![/green]",
|
||||
title="Already Clean", border_style="green"))
|
||||
return 0
|
||||
|
||||
clean_text.append(f"\n📊 Total: {len(files_to_remove)} files/directories to remove\n", style="bold cyan")
|
||||
|
||||
console.print(Panel(clean_text, title="Cleanup Preview", border_style="bright_yellow"))
|
||||
|
||||
# Ask for confirmation unless --force is used
|
||||
if not args.force:
|
||||
console.print("\n[yellow]This will permanently remove the files listed above.[/yellow]")
|
||||
console.print("[yellow]Python source files (*.py) will be preserved.[/yellow]\n")
|
||||
|
||||
try:
|
||||
response = input("Are you sure you want to proceed? (y/N): ").strip().lower()
|
||||
if response not in ['y', 'yes']:
|
||||
console.print(Panel("[cyan]Cleanup cancelled.[/cyan]",
|
||||
title="Cancelled", border_style="cyan"))
|
||||
return 0
|
||||
except KeyboardInterrupt:
|
||||
console.print(Panel("[cyan]Cleanup cancelled.[/cyan]",
|
||||
title="Cancelled", border_style="cyan"))
|
||||
return 0
|
||||
|
||||
# Perform cleanup
|
||||
removed_count = 0
|
||||
error_count = 0
|
||||
|
||||
for file_path in files_to_remove:
|
||||
try:
|
||||
if file_path.is_dir():
|
||||
shutil.rmtree(file_path)
|
||||
else:
|
||||
file_path.unlink()
|
||||
removed_count += 1
|
||||
except Exception as e:
|
||||
console.print(f" ❌ Failed to remove {file_path}: {e}")
|
||||
error_count += 1
|
||||
|
||||
# Show results
|
||||
result_text = Text()
|
||||
if removed_count > 0:
|
||||
result_text.append(f"✅ Successfully removed {removed_count} files/directories\n", style="bold green")
|
||||
if error_count > 0:
|
||||
result_text.append(f"❌ Failed to remove {error_count} files/directories\n", style="bold red")
|
||||
|
||||
if removed_count > 0:
|
||||
result_text.append("\n💡 Next steps:\n", style="bold yellow")
|
||||
result_text.append(" • Run: tito module notebooks - Regenerate notebooks\n", style="white")
|
||||
result_text.append(" • Run: tito module test --all - Test all modules\n", style="white")
|
||||
result_text.append(" • Run: tito module export --all - Export to package\n", style="white")
|
||||
|
||||
border_style = "green" if error_count == 0 else "yellow"
|
||||
console.print(Panel(result_text, title="Cleanup Complete", border_style=border_style))
|
||||
|
||||
return 0 if error_count == 0 else 1
|
||||
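The archived clean handler above reduces to a collect, confirm, remove loop over generated notebooks and Python bytecode caches. A minimal standalone sketch of that pattern (hypothetical `clean_module` helper, standard library only, no Rich UI or confirmation prompt):

```python
import shutil
from pathlib import Path

def clean_module(module_dir: Path, notebooks: bool = True, cache: bool = True) -> int:
    """Remove generated .ipynb files and __pycache__/.pyc artifacts; return how many were removed."""
    targets = []
    if notebooks:
        targets += list(module_dir.glob("*.ipynb"))
    if cache:
        pycache_dirs = [p for p in module_dir.rglob("__pycache__") if p.is_dir()]
        targets += pycache_dirs
        # Only pick up stray .pyc files that are not already inside a __pycache__ directory.
        targets += [f for f in module_dir.rglob("*.pyc")
                    if not any(d in f.parents for d in pycache_dirs)]
    removed = 0
    for path in targets:
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
        removed += 1
    return removed

# Example (hypothetical module path):
# removed = clean_module(Path("modules/02_tensor"))
```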
@@ -1,263 +0,0 @@
#!/usr/bin/env python3
"""
Tito Demo Command - Show off your AI capabilities!
Runs progressive demos showing what TinyTorch can do at each stage.
"""

import argparse
import subprocess
import sys
from pathlib import Path
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from rich.text import Text

from .base import BaseCommand

console = Console()

class TinyTorchDemoMatrix:
|
||||
"""Tracks and displays TinyTorch AI demo capabilities"""
|
||||
|
||||
def __init__(self):
|
||||
self.demos = {
|
||||
'math': {
|
||||
'name': 'Mathematical Operations',
|
||||
'file': 'demo_tensor_math.py',
|
||||
'requires': ['02_tensor'],
|
||||
'description': 'Linear algebra, matrix operations, transformations'
|
||||
},
|
||||
'logic': {
|
||||
'name': 'Logical Reasoning',
|
||||
'file': 'demo_activations.py',
|
||||
'requires': ['02_tensor', '03_activations'],
|
||||
'description': 'Boolean functions, XOR problem, decision boundaries'
|
||||
},
|
||||
'neuron': {
|
||||
'name': 'Single Neuron Learning',
|
||||
'file': 'demo_single_neuron.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers'],
|
||||
'description': 'Watch a neuron learn the AND gate'
|
||||
},
|
||||
'network': {
|
||||
'name': 'Multi-Layer Networks',
|
||||
'file': 'demo_xor_network.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers', '05_dense'],
|
||||
'description': 'Solve the famous XOR problem'
|
||||
},
|
||||
'vision': {
|
||||
'name': 'Computer Vision',
|
||||
'file': 'demo_vision.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers', '05_dense', '06_spatial'],
|
||||
'description': 'Image processing and pattern recognition'
|
||||
},
|
||||
'attention': {
|
||||
'name': 'Attention Mechanisms',
|
||||
'file': 'demo_attention.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers', '05_dense', '07_attention'],
|
||||
'description': 'Sequence processing and attention'
|
||||
},
|
||||
'training': {
|
||||
'name': 'End-to-End Training',
|
||||
'file': 'demo_training.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers', '05_dense', '11_training'],
|
||||
'description': 'Complete training pipelines'
|
||||
},
|
||||
'language': {
|
||||
'name': 'Language Generation',
|
||||
'file': 'demo_language.py',
|
||||
'requires': ['02_tensor', '03_activations', '04_layers', '05_dense', '07_attention', '16_tinygpt'],
|
||||
'description': 'AI text generation and language models'
|
||||
}
|
||||
}
|
||||
|
||||
def check_module_exported(self, module_name):
|
||||
"""Check if a module has been exported to the package"""
|
||||
try:
|
||||
if module_name == '02_tensor':
|
||||
import tinytorch.core.tensor
|
||||
return True
|
||||
elif module_name == '03_activations':
|
||||
import tinytorch.core.activations
|
||||
return True
|
||||
elif module_name == '04_layers':
|
||||
import tinytorch.core.layers
|
||||
return True
|
||||
elif module_name == '05_dense':
|
||||
import tinytorch.core.dense
|
||||
return True
|
||||
elif module_name == '06_spatial':
|
||||
import tinytorch.core.spatial
|
||||
return True
|
||||
elif module_name == '07_attention':
|
||||
import tinytorch.core.attention
|
||||
return True
|
||||
elif module_name == '11_training':
|
||||
import tinytorch.core.training
|
||||
return True
|
||||
elif module_name == '16_tinygpt':
|
||||
import tinytorch.tinygpt
|
||||
return True
|
||||
return False
|
||||
except ImportError:
|
||||
return False
|
||||
|
||||
def get_demo_status(self, demo_name):
|
||||
"""Get status of a demo: available, partial, or unavailable"""
|
||||
demo = self.demos[demo_name]
|
||||
required_modules = demo['requires']
|
||||
|
||||
available_count = sum(1 for module in required_modules if self.check_module_exported(module))
|
||||
total_count = len(required_modules)
|
||||
|
||||
if available_count == total_count:
|
||||
return '✅' # Fully available
|
||||
elif available_count > 0:
|
||||
return '⚡' # Partially available
|
||||
else:
|
||||
return '❌' # Not available
|
||||
|
||||
def show_matrix(self):
|
||||
"""Display the demo capability matrix"""
|
||||
console.print("\n🤖 TinyTorch Demo Matrix", style="bold cyan")
|
||||
console.print("=" * 50)
|
||||
|
||||
table = Table(show_header=True, header_style="bold magenta")
|
||||
table.add_column("Demo", style="cyan", width=20)
|
||||
table.add_column("Status", justify="center", width=8)
|
||||
table.add_column("Description", style="dim")
|
||||
|
||||
available_demos = []
|
||||
|
||||
for demo_name, demo_info in self.demos.items():
|
||||
status = self.get_demo_status(demo_name)
|
||||
table.add_row(demo_info['name'], status, demo_info['description'])
|
||||
|
||||
if status == '✅':
|
||||
available_demos.append(demo_name)
|
||||
|
||||
console.print(table)
|
||||
console.print()
|
||||
|
||||
if available_demos:
|
||||
console.print("🎯 Available Demos:", style="bold green")
|
||||
for demo in available_demos:
|
||||
console.print(f" • tito demo {demo}")
|
||||
console.print()
|
||||
|
||||
console.print("Legend: ✅ Ready ⚡ Partial ❌ Not Available")
|
||||
console.print()
|
||||
|
||||
def run_demo(self, demo_name):
|
||||
"""Run a specific demo"""
|
||||
if demo_name not in self.demos:
|
||||
console.print(f"❌ Unknown demo: {demo_name}", style="red")
|
||||
console.print("Available demos:", ', '.join(self.demos.keys()))
|
||||
return False
|
||||
|
||||
demo = self.demos[demo_name]
|
||||
status = self.get_demo_status(demo_name)
|
||||
|
||||
if status == '❌':
|
||||
console.print(f"❌ Demo '{demo_name}' not available", style="red")
|
||||
missing_modules = [m for m in demo['requires'] if not self.check_module_exported(m)]
|
||||
console.print(f"Missing modules: {', '.join(missing_modules)}")
|
||||
console.print(f"Run: tito export {' '.join(missing_modules)}")
|
||||
return False
|
||||
|
||||
if status == '⚡':
|
||||
console.print(f"⚠️ Demo '{demo_name}' partially available", style="yellow")
|
||||
console.print("Some features may not work correctly.")
|
||||
|
||||
# Find the demo file
|
||||
project_root = Path(__file__).parent.parent.parent
|
||||
demo_file = project_root / "demos" / demo['file']
|
||||
|
||||
if not demo_file.exists():
|
||||
console.print(f"❌ Demo file not found: {demo_file}", style="red")
|
||||
return False
|
||||
|
||||
console.print(f"🚀 Running {demo['name']} Demo...", style="bold green")
|
||||
console.print()
|
||||
|
||||
# Run the demo
|
||||
try:
|
||||
result = subprocess.run([sys.executable, str(demo_file)],
|
||||
capture_output=False,
|
||||
text=True)
|
||||
return result.returncode == 0
|
||||
except Exception as e:
|
||||
console.print(f"❌ Demo failed: {e}", style="red")
|
||||
return False
|
||||
|
||||
class DemoCommand(BaseCommand):
|
||||
"""Command for running TinyTorch AI capability demos"""
|
||||
|
||||
def __init__(self, config):
|
||||
super().__init__(config)
|
||||
self.matrix = TinyTorchDemoMatrix()
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "demo"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Run AI capability demos"
|
||||
|
||||
def add_arguments(self, parser):
|
||||
"""Add demo command arguments"""
|
||||
parser.add_argument('demo_name', nargs='?',
|
||||
help='Name of demo to run (math, logic, neuron, network, etc.)')
|
||||
parser.add_argument('--all', action='store_true',
|
||||
help='Run all available demos')
|
||||
parser.add_argument('--matrix', action='store_true',
|
||||
help='Show capability matrix only')
|
||||
|
||||
def run(self, args):
|
||||
"""Execute the demo command"""
|
||||
# Just show matrix if no args or --matrix flag
|
||||
if not args.demo_name and not args.all or args.matrix:
|
||||
self.matrix.show_matrix()
|
||||
return
|
||||
|
||||
# Run all available demos
|
||||
if args.all:
|
||||
self.matrix.show_matrix()
|
||||
available_demos = [name for name in self.matrix.demos.keys()
|
||||
if self.matrix.get_demo_status(name) == '✅']
|
||||
|
||||
if not available_demos:
|
||||
console.print("❌ No demos available. Export some modules first!", style="red")
|
||||
return
|
||||
|
||||
console.print(f"🚀 Running {len(available_demos)} available demos...", style="bold green")
|
||||
console.print()
|
||||
|
||||
for demo_name in available_demos:
|
||||
console.print(f"\n{'='*60}")
|
||||
success = self.matrix.run_demo(demo_name)
|
||||
if not success:
|
||||
console.print(f"❌ Demo {demo_name} failed", style="red")
|
||||
|
||||
console.print(f"\n{'='*60}")
|
||||
console.print("🏆 All available demos completed!", style="bold green")
|
||||
return
|
||||
|
||||
# Run specific demo
|
||||
if args.demo_name:
|
||||
self.matrix.run_demo(args.demo_name)
|
||||
|
||||
def main():
    """Standalone entry point for development."""
    import argparse
    parser = argparse.ArgumentParser(description="Run TinyTorch AI capability demos")
    cmd = DemoCommand(config=None)  # standalone use; the CLI normally supplies a config object
    cmd.add_arguments(parser)
    args = parser.parse_args()
    cmd.run(args)


if __name__ == "__main__":
    main()
|
||||
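The `check_module_exported` helper in the deleted demo command gates each demo on a chain of try/except imports. A lighter-weight sketch of the same availability check (hypothetical `module_available` / `demo_status` helpers, assuming the same `tinytorch.core.*` package names, and using `importlib.util.find_spec` so nothing is actually imported):

```python
import importlib.util
from typing import List

# Hypothetical mapping mirroring the 'requires' lists above.
MODULE_TO_PACKAGE = {
    "02_tensor": "tinytorch.core.tensor",
    "03_activations": "tinytorch.core.activations",
    "04_layers": "tinytorch.core.layers",
    "05_dense": "tinytorch.core.dense",
}

def module_available(module_name: str) -> bool:
    """Return True if the exported package for a course module can be found."""
    package = MODULE_TO_PACKAGE.get(module_name)
    if package is None:
        return False
    try:
        return importlib.util.find_spec(package) is not None
    except ModuleNotFoundError:  # parent package (e.g. tinytorch) not installed at all
        return False

def demo_status(required: List[str]) -> str:
    """Mirror the ✅ / ⚡ / ❌ status logic used by the demo matrix."""
    hits = sum(module_available(m) for m in required)
    if hits == len(required):
        return "✅"
    return "⚡" if hits else "❌"
```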
@@ -1,469 +0,0 @@
"""
Tiny🔥Torch Interactive Help System

Provides contextual, progressive guidance for new and experienced users.
"""

from argparse import ArgumentParser, Namespace
from typing import Optional, List, Dict, Any
import os
from pathlib import Path

from .base import BaseCommand
from ..core.config import CLIConfig
from ..core.console import get_console
from rich.console import Console
from rich.panel import Panel
from rich.columns import Columns
from rich.table import Table
from rich.text import Text
from rich.prompt import Prompt, Confirm


class HelpCommand(BaseCommand):
|
||||
"""Interactive help and onboarding system."""
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "help"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Interactive help system with guided onboarding"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
"""Add help command arguments."""
|
||||
parser.add_argument(
|
||||
'topic',
|
||||
nargs='?',
|
||||
help='Specific help topic (getting-started, commands, workflow, etc.)'
|
||||
)
|
||||
parser.add_argument(
|
||||
'--interactive', '-i',
|
||||
action='store_true',
|
||||
help='Launch interactive onboarding wizard'
|
||||
)
|
||||
parser.add_argument(
|
||||
'--quick', '-q',
|
||||
action='store_true',
|
||||
help='Show quick reference card'
|
||||
)
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
"""Execute help command."""
|
||||
console = get_console()
|
||||
|
||||
# Interactive onboarding wizard
|
||||
if args.interactive:
|
||||
return self._interactive_onboarding()
|
||||
|
||||
# Quick reference
|
||||
if args.quick:
|
||||
return self._show_quick_reference()
|
||||
|
||||
# Topic-specific help
|
||||
if args.topic:
|
||||
return self._show_topic_help(args.topic)
|
||||
|
||||
# Default: Show main help with user context
|
||||
return self._show_contextual_help()
|
||||
|
||||
def _interactive_onboarding(self) -> int:
|
||||
"""Launch interactive onboarding wizard."""
|
||||
console = get_console()
|
||||
|
||||
# Welcome screen
|
||||
console.print(Panel.fit(
|
||||
"[bold blue]🚀 Welcome to Tiny🔥Torch![/bold blue]\n\n"
|
||||
"Let's get you started on your ML systems engineering journey.\n"
|
||||
"This quick wizard will help you understand what Tiny🔥Torch is\n"
|
||||
"and guide you to the right starting point.",
|
||||
title="Tiny🔥Torch Onboarding Wizard",
|
||||
border_style="blue"
|
||||
))
|
||||
|
||||
# User experience assessment
|
||||
experience = self._assess_user_experience()
|
||||
|
||||
# Learning goal identification
|
||||
goals = self._identify_learning_goals()
|
||||
|
||||
# Time commitment assessment
|
||||
time_commitment = self._assess_time_commitment()
|
||||
|
||||
# Generate personalized recommendations
|
||||
recommendations = self._generate_recommendations(experience, goals, time_commitment)
|
||||
|
||||
# Show personalized path
|
||||
self._show_personalized_path(recommendations)
|
||||
|
||||
# Offer to start immediately
|
||||
if Confirm.ask("\n[bold green]Ready to start your first steps?[/bold green]"):
|
||||
self._launch_first_steps(recommendations)
|
||||
|
||||
return 0
|
||||
|
||||
def _assess_user_experience(self) -> str:
|
||||
"""Assess user's ML and programming experience."""
|
||||
console = get_console()
|
||||
|
||||
console.print("\n[bold cyan]📋 Quick Experience Assessment[/bold cyan]")
|
||||
|
||||
choices = [
|
||||
"New to ML and Python - need fundamentals",
|
||||
"Know Python, new to ML - want to learn systems",
|
||||
"Use PyTorch/TensorFlow - want to understand internals",
|
||||
"ML Engineer - need to debug/optimize production systems",
|
||||
"Instructor - want to teach this course"
|
||||
]
|
||||
|
||||
console.print("\nWhat best describes your background?")
|
||||
for i, choice in enumerate(choices, 1):
|
||||
console.print(f" {i}. {choice}")
|
||||
|
||||
while True:
|
||||
try:
|
||||
selection = int(Prompt.ask("\nEnter your choice (1-5)"))
|
||||
if 1 <= selection <= 5:
|
||||
return ['beginner', 'python_user', 'framework_user', 'ml_engineer', 'instructor'][selection-1]
|
||||
else:
|
||||
console.print("[red]Please enter a number between 1-5[/red]")
|
||||
except ValueError:
|
||||
console.print("[red]Please enter a valid number[/red]")
|
||||
|
||||
def _identify_learning_goals(self) -> List[str]:
|
||||
"""Identify user's learning goals."""
|
||||
console = get_console()
|
||||
|
||||
console.print("\n[bold cyan]🎯 Learning Goals[/bold cyan]")
|
||||
console.print("What do you want to achieve? (Select all that apply)")
|
||||
|
||||
goals = [
|
||||
("understand_internals", "Understand how PyTorch/TensorFlow work internally"),
|
||||
("build_networks", "Build neural networks from scratch"),
|
||||
("optimize_performance", "Learn to optimize ML system performance"),
|
||||
("debug_production", "Debug production ML systems"),
|
||||
("teach_course", "Teach ML systems to others"),
|
||||
("career_transition", "Transition from software engineering to ML"),
|
||||
("research_custom", "Implement custom operations for research")
|
||||
]
|
||||
|
||||
selected_goals = []
|
||||
for key, description in goals:
|
||||
if Confirm.ask(f" • {description}?"):
|
||||
selected_goals.append(key)
|
||||
|
||||
return selected_goals
|
||||
|
||||
def _assess_time_commitment(self) -> str:
|
||||
"""Assess available time commitment."""
|
||||
console = get_console()
|
||||
|
||||
console.print("\n[bold cyan]⏰ Time Commitment[/bold cyan]")
|
||||
|
||||
choices = [
|
||||
("15_minutes", "15 minutes - just want a quick taste"),
|
||||
("2_hours", "2 hours - explore a few modules"),
|
||||
("weekend", "Weekend project - build something substantial"),
|
||||
("semester", "8-12 weeks - complete learning journey"),
|
||||
("teaching", "Teaching timeline - need instructor resources")
|
||||
]
|
||||
|
||||
console.print("How much time can you dedicate?")
|
||||
for i, (key, description) in enumerate(choices, 1):
|
||||
console.print(f" {i}. {description}")
|
||||
|
||||
while True:
|
||||
try:
|
||||
selection = int(Prompt.ask("\nEnter your choice (1-5)"))
|
||||
if 1 <= selection <= 5:
|
||||
return choices[selection-1][0]
|
||||
else:
|
||||
console.print("[red]Please enter a number between 1-5[/red]")
|
||||
except ValueError:
|
||||
console.print("[red]Please enter a valid number[/red]")
|
||||
|
||||
def _generate_recommendations(self, experience: str, goals: List[str], time: str) -> Dict[str, Any]:
|
||||
"""Generate personalized recommendations."""
|
||||
|
||||
# Learning path mapping
|
||||
path_mapping = {
|
||||
'beginner': 'foundation_first',
|
||||
'python_user': 'guided_learning',
|
||||
'framework_user': 'systems_focus',
|
||||
'ml_engineer': 'optimization_focus',
|
||||
'instructor': 'teaching_resources'
|
||||
}
|
||||
|
||||
# Starting point mapping
|
||||
start_mapping = {
|
||||
'15_minutes': 'quick_demo',
|
||||
'2_hours': 'first_module',
|
||||
'weekend': 'milestone_project',
|
||||
'semester': 'full_curriculum',
|
||||
'teaching': 'instructor_setup'
|
||||
}
|
||||
|
||||
return {
|
||||
'learning_path': path_mapping.get(experience, 'guided_learning'),
|
||||
'starting_point': start_mapping.get(time, 'first_module'),
|
||||
'experience_level': experience,
|
||||
'goals': goals,
|
||||
'time_commitment': time
|
||||
}
|
||||
|
||||
def _show_personalized_path(self, recommendations: Dict[str, Any]) -> None:
|
||||
"""Show personalized learning path."""
|
||||
console = get_console()
|
||||
|
||||
# Path descriptions
|
||||
paths = {
|
||||
'foundation_first': {
|
||||
'title': '🌱 Foundation First Path',
|
||||
'description': 'Build fundamentals step-by-step with extra explanations',
|
||||
'next_steps': ['Module 1: Setup & Environment', 'Python fundamentals review', 'Linear algebra primer']
|
||||
},
|
||||
'guided_learning': {
|
||||
'title': '🎯 Guided Learning Path',
|
||||
'description': 'Structured progression through all major concepts',
|
||||
'next_steps': ['Module 1: Setup', 'Module 2: Tensors', 'Track progress with checkpoints']
|
||||
},
|
||||
'systems_focus': {
|
||||
'title': '⚡ Systems Focus Path',
|
||||
'description': 'Understand internals of frameworks you already use',
|
||||
'next_steps': ['Compare PyTorch vs your code', 'Profile memory usage', 'Optimization modules']
|
||||
},
|
||||
'optimization_focus': {
|
||||
'title': '🚀 Optimization Focus Path',
|
||||
'description': 'Performance debugging and production optimization',
|
||||
'next_steps': ['Profiling module', 'Benchmarking module', 'TinyMLPerf competition']
|
||||
},
|
||||
'teaching_resources': {
|
||||
'title': '🎓 Teaching Resources Path',
|
||||
'description': 'Instructor guides and classroom setup',
|
||||
'next_steps': ['Instructor guide', 'NBGrader setup', 'Student progress tracking']
|
||||
}
|
||||
}
|
||||
|
||||
path_info = paths[recommendations['learning_path']]
|
||||
|
||||
console.print(f"\n[bold green]✨ Your Personalized Learning Path[/bold green]")
|
||||
console.print(Panel(
|
||||
f"[bold]{path_info['title']}[/bold]\n\n"
|
||||
f"{path_info['description']}\n\n"
|
||||
f"[bold cyan]Your Next Steps:[/bold cyan]\n" +
|
||||
"\n".join(f" • {step}" for step in path_info['next_steps']),
|
||||
border_style="green"
|
||||
))
|
||||
|
||||
def _launch_first_steps(self, recommendations: Dict[str, Any]) -> None:
|
||||
"""Launch appropriate first steps based on recommendations."""
|
||||
console = get_console()
|
||||
|
||||
starting_point = recommendations['starting_point']
|
||||
|
||||
if starting_point == 'quick_demo':
|
||||
console.print("\n[bold blue]🚀 Launching Quick Demo...[/bold blue]")
|
||||
console.print("Running: [code]tito demo quick[/code]")
|
||||
os.system("tito demo quick")
|
||||
|
||||
elif starting_point == 'first_module':
|
||||
console.print("\n[bold blue]🛠️ Setting up Module 1...[/bold blue]")
|
||||
console.print("Next commands:")
|
||||
console.print(" [code]cd modules/01_setup[/code]")
|
||||
console.print(" [code]jupyter lab setup.py[/code]")
|
||||
|
||||
elif starting_point == 'milestone_project':
|
||||
console.print("\n[bold blue]🎯 Weekend Project Recommendations...[/bold blue]")
|
||||
console.print("Suggested goal: Build XOR solver (Modules 1-6)")
|
||||
console.print("Time estimate: 6-8 hours")
|
||||
|
||||
elif starting_point == 'full_curriculum':
|
||||
console.print("\n[bold blue]📚 Full Curriculum Setup...[/bold blue]")
|
||||
console.print("Running checkpoint system initialization...")
|
||||
os.system("tito checkpoint status")
|
||||
|
||||
elif starting_point == 'instructor_setup':
|
||||
console.print("\n[bold blue]🎓 Instructor Resources...[/bold blue]")
|
||||
console.print("Opening instructor guide...")
|
||||
console.print("Check: [code]book/usage-paths/classroom-use.html[/code]")
|
||||
|
||||
def _show_quick_reference(self) -> int:
|
||||
"""Show quick reference card."""
|
||||
console = get_console()
|
||||
|
||||
# Essential commands table
|
||||
table = Table(title="🚀 TinyTorch Quick Reference", show_header=True, header_style="bold cyan")
|
||||
table.add_column("Command", style="bold", width=25)
|
||||
table.add_column("Description", width=40)
|
||||
table.add_column("Example", style="dim", width=30)
|
||||
|
||||
essential_commands = [
|
||||
("tito help --interactive", "Launch onboarding wizard", "First time users"),
|
||||
("tito checkpoint status", "See your progress", "Track learning journey"),
|
||||
("tito module complete 02", "Finish a module", "Export & test your code"),
|
||||
("tito demo quick", "See framework in action", "5-minute demonstration"),
|
||||
("tito leaderboard join", "Join community", "Connect with learners"),
|
||||
("tito system health", "Check environment", "Troubleshoot issues")
|
||||
]
|
||||
|
||||
for cmd, desc, example in essential_commands:
|
||||
table.add_row(cmd, desc, example)
|
||||
|
||||
console.print(table)
|
||||
|
||||
# Common workflows
|
||||
console.print("\n[bold cyan]📋 Common Workflows:[/bold cyan]")
|
||||
workflows = [
|
||||
("New User", "tito help -i → tito checkpoint status → cd modules/01_setup"),
|
||||
("Continue Learning", "tito checkpoint status → work on next module → tito module complete XX"),
|
||||
("Join Community", "tito leaderboard join → submit progress → see global rankings"),
|
||||
("Get Help", "tito system health → check docs/FAQ → ask community")
|
||||
]
|
||||
|
||||
for workflow, commands in workflows:
|
||||
console.print(f" [bold]{workflow}:[/bold] {commands}")
|
||||
|
||||
return 0
|
||||
|
||||
def _show_topic_help(self, topic: str) -> int:
|
||||
"""Show help for specific topic."""
|
||||
console = get_console()
|
||||
|
||||
topics = {
|
||||
'getting-started': self._help_getting_started,
|
||||
'commands': self._help_commands,
|
||||
'workflow': self._help_workflow,
|
||||
'modules': self._help_modules,
|
||||
'checkpoints': self._help_checkpoints,
|
||||
'community': self._help_community,
|
||||
'troubleshooting': self._help_troubleshooting
|
||||
}
|
||||
|
||||
if topic in topics:
|
||||
topics[topic]()
|
||||
return 0
|
||||
else:
|
||||
console.print(f"[red]Unknown help topic: {topic}[/red]")
|
||||
console.print("Available topics: " + ", ".join(topics.keys()))
|
||||
return 1
|
||||
|
||||
def _show_contextual_help(self) -> int:
|
||||
"""Show contextual help based on user progress."""
|
||||
console = get_console()
|
||||
|
||||
# Check user progress to provide contextual guidance
|
||||
progress = self._assess_user_progress()
|
||||
|
||||
if progress['is_new_user']:
|
||||
self._show_new_user_help()
|
||||
elif progress['current_module']:
|
||||
self._show_in_progress_help(progress['current_module'])
|
||||
else:
|
||||
self._show_experienced_user_help()
|
||||
|
||||
return 0
|
||||
|
||||
def _assess_user_progress(self) -> Dict[str, Any]:
|
||||
"""Assess user's current progress."""
|
||||
# Check for checkpoint files, completed modules, etc.
|
||||
# This would integrate with the checkpoint system
|
||||
|
||||
# Simplified implementation for now
|
||||
checkpoints_dir = Path("tests/checkpoints")
|
||||
modules_dir = Path("modules")
|
||||
|
||||
return {
|
||||
'is_new_user': not checkpoints_dir.exists(),
|
||||
'current_module': None, # Would be determined by checkpoint status
|
||||
'completed_modules': [], # Would be populated from checkpoint results
|
||||
'has_joined_community': False # Would check leaderboard status
|
||||
}
|
||||
|
||||
def _show_new_user_help(self) -> None:
|
||||
"""Show help optimized for new users."""
|
||||
console = get_console()
|
||||
|
||||
console.print(Panel.fit(
|
||||
"[bold blue]👋 Welcome to Tiny🔥Torch![/bold blue]\n\n"
|
||||
"You're about to build a complete ML framework from scratch.\n"
|
||||
"Here's how to get started:\n\n"
|
||||
"[bold cyan]Next Steps:[/bold cyan]\n"
|
||||
"1. [code]tito help --interactive[/code] - Personalized onboarding\n"
|
||||
"2. [code]tito system health[/code] - Check your environment\n"
|
||||
"3. [code]tito checkpoint status[/code] - See the learning journey\n\n"
|
||||
"[bold yellow]New to ML systems?[/bold yellow] Run the interactive wizard!",
|
||||
title="Getting Started",
|
||||
border_style="blue"
|
||||
))
|
||||
|
||||
def _help_getting_started(self) -> None:
|
||||
"""Detailed getting started help."""
|
||||
console = get_console()
|
||||
|
||||
console.print("[bold blue]🚀 Getting Started with Tiny🔥Torch[/bold blue]\n")
|
||||
|
||||
# Installation steps
|
||||
install_panel = Panel(
|
||||
"[bold]1. Environment Setup[/bold]\n"
|
||||
"```bash\n"
|
||||
"git clone https://github.com/mlsysbook/Tiny🔥Torch.git\n"
|
||||
"cd Tiny🔥Torch\n"
|
||||
f"python -m venv {self.venv_path}\n"
|
||||
f"source {self.venv_path}/bin/activate # Windows: .venv\\Scripts\\activate\n"
|
||||
"pip install -r requirements.txt\n"
|
||||
"pip install -e .\n"
|
||||
"```",
|
||||
title="Installation",
|
||||
border_style="green"
|
||||
)
|
||||
|
||||
# First steps
|
||||
first_steps_panel = Panel(
|
||||
"[bold]2. First Steps[/bold]\n"
|
||||
"• [code]tito system health[/code] - Verify installation\n"
|
||||
"• [code]tito help --interactive[/code] - Personalized guidance\n"
|
||||
"• [code]tito checkpoint status[/code] - See learning path\n"
|
||||
"• [code]cd modules/01_setup[/code] - Start first module",
|
||||
title="First Steps",
|
||||
border_style="blue"
|
||||
)
|
||||
|
||||
# Learning path
|
||||
learning_panel = Panel(
|
||||
"[bold]3. Learning Journey[/bold]\n"
|
||||
"📚 [bold]Modules 1-8:[/bold] Neural Network Foundations\n"
|
||||
"🔬 [bold]Modules 9-10:[/bold] Computer Vision (CNNs)\n"
|
||||
"🤖 [bold]Modules 11-14:[/bold] Language Models (Transformers)\n"
|
||||
"⚡ [bold]Modules 15-20:[/bold] System Optimization\n\n"
|
||||
"[dim]Each module: Build → Test → Export → Checkpoint[/dim]",
|
||||
title="Learning Path",
|
||||
border_style="yellow"
|
||||
)
|
||||
|
||||
console.print(Columns([install_panel, first_steps_panel, learning_panel]))
|
||||
|
||||
# Additional help methods would be implemented here...
|
||||
def _help_commands(self) -> None:
|
||||
"""Show comprehensive command reference."""
|
||||
pass
|
||||
|
||||
def _help_workflow(self) -> None:
|
||||
"""Show common workflow patterns."""
|
||||
pass
|
||||
|
||||
def _help_modules(self) -> None:
|
||||
"""Show module system explanation."""
|
||||
pass
|
||||
|
||||
def _help_checkpoints(self) -> None:
|
||||
"""Show checkpoint system explanation."""
|
||||
pass
|
||||
|
||||
def _help_community(self) -> None:
|
||||
"""Show community features and leaderboard."""
|
||||
pass
|
||||
|
||||
def _help_troubleshooting(self) -> None:
|
||||
"""Show troubleshooting guide."""
|
||||
pass
|
||||
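The onboarding wizard above reduces to two lookup tables: experience level to learning path, and time commitment to starting point. A condensed sketch of that mapping logic (hypothetical `recommend` helper, reusing the same keys that appear in the deleted `_generate_recommendations` method):

```python
from typing import Any, Dict, List

PATHS = {
    "beginner": "foundation_first",
    "python_user": "guided_learning",
    "framework_user": "systems_focus",
    "ml_engineer": "optimization_focus",
    "instructor": "teaching_resources",
}

STARTS = {
    "15_minutes": "quick_demo",
    "2_hours": "first_module",
    "weekend": "milestone_project",
    "semester": "full_curriculum",
    "teaching": "instructor_setup",
}

def recommend(experience: str, goals: List[str], time: str) -> Dict[str, Any]:
    """Fall back to the guided defaults when an answer is unrecognized."""
    return {
        "learning_path": PATHS.get(experience, "guided_learning"),
        "starting_point": STARTS.get(time, "first_module"),
        "experience_level": experience,
        "goals": goals,
        "time_commitment": time,
    }

# Example: recommend("framework_user", ["understand_internals"], "weekend")
# -> {"learning_path": "systems_focus", "starting_point": "milestone_project", ...}
```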
File diff suppressed because it is too large
@@ -1,193 +0,0 @@
"""
Notebooks command for building Jupyter notebooks from Python files using Jupytext.
"""

import subprocess
import sys
from argparse import ArgumentParser, Namespace
from pathlib import Path
from typing import List, Tuple

from rich.panel import Panel
from rich.text import Text

from .base import BaseCommand
from ..core.exceptions import ExecutionError, ModuleNotFoundError

class NotebooksCommand(BaseCommand):
|
||||
"""Command to build Jupyter notebooks from Python files using Jupytext."""
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "notebooks"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Build notebooks from Python files"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
"""Add notebooks command arguments."""
|
||||
parser.add_argument(
|
||||
'--module',
|
||||
help='Build notebook for specific module'
|
||||
)
|
||||
parser.add_argument(
|
||||
'--force',
|
||||
action='store_true',
|
||||
help='Force rebuild even if notebook exists'
|
||||
)
|
||||
parser.add_argument(
|
||||
'--dry-run',
|
||||
action='store_true',
|
||||
help='Show what would be built without actually building'
|
||||
)
|
||||
|
||||
def validate_args(self, args: Namespace) -> None:
|
||||
"""Validate notebooks command arguments."""
|
||||
if args.module:
|
||||
module_dir = self.config.modules_dir / args.module
|
||||
if not module_dir.exists():
|
||||
raise ModuleNotFoundError(f"Module directory '{args.module}' not found")
|
||||
|
||||
# Find module Python file in the module directory
|
||||
# Extract short name from module directory name
|
||||
if args.module.startswith(tuple(f"{i:02d}_" for i in range(100))):
|
||||
short_name = args.module[3:] # Remove "00_" prefix
|
||||
else:
|
||||
short_name = args.module
|
||||
dev_file = module_dir / f"{short_name}.py"
|
||||
if not dev_file.exists():
|
||||
raise ModuleNotFoundError(
|
||||
f"No module file found in module '{args.module}'. Expected: {dev_file.name}"
|
||||
)
|
||||
|
||||
def _find_dev_files(self) -> List[Path]:
|
||||
"""Find all module Python files in modules directory."""
|
||||
dev_files = []
|
||||
# Look in modules/ directory
|
||||
modules_dir = self.config.modules_dir
|
||||
|
||||
for module_dir in modules_dir.iterdir():
|
||||
if module_dir.is_dir() and not module_dir.name.startswith('.'):
|
||||
# Extract short name from module directory name
|
||||
module_name = module_dir.name
|
||||
if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))):
|
||||
short_name = module_name[3:] # Remove "00_" prefix
|
||||
else:
|
||||
short_name = module_name
|
||||
# Look for module Python file (without _dev suffix)
|
||||
py_file = module_dir / f"{short_name}.py"
|
||||
if py_file.exists():
|
||||
dev_files.append(py_file)
|
||||
return sorted(dev_files)
|
||||
|
||||
def _convert_file(self, dev_file: Path) -> Tuple[bool, str]:
|
||||
"""Convert a single Python file to notebook using Jupytext."""
|
||||
try:
|
||||
# Use Jupytext from venv to convert Python file to notebook
|
||||
import sys
|
||||
venv_python = Path(sys.executable)
|
||||
jupytext_cmd = venv_python.parent / "jupytext"
|
||||
|
||||
result = subprocess.run([
|
||||
str(jupytext_cmd), "--to", "notebook", str(dev_file)
|
||||
], capture_output=True, text=True, timeout=30, cwd=dev_file.parent)
|
||||
|
||||
if result.returncode == 0:
|
||||
notebook_file = dev_file.with_suffix('.ipynb')
|
||||
return True, f"{dev_file.name} → {notebook_file.name}"
|
||||
else:
|
||||
error_msg = result.stderr.strip() if result.stderr.strip() else "Conversion failed"
|
||||
return False, error_msg
|
||||
|
||||
except subprocess.TimeoutExpired:
|
||||
return False, "Conversion timed out"
|
||||
except FileNotFoundError:
|
||||
return False, "Jupytext not found. Install with: pip install jupytext"
|
||||
except Exception as e:
|
||||
return False, f"Error: {str(e)}"
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
"""Execute the notebooks command."""
|
||||
self.console.print(Panel(
|
||||
"📓 Building Notebooks from Python Files (using Jupytext)",
|
||||
title="Notebook Generation",
|
||||
border_style="bright_cyan"
|
||||
))
|
||||
|
||||
# Find files to convert
|
||||
if args.module:
|
||||
module_dir = self.config.modules_dir / args.module
|
||||
# Extract short name from module directory name
|
||||
module_name = args.module
|
||||
if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))):
|
||||
short_name = module_name[3:] # Remove "00_" prefix
|
||||
else:
|
||||
short_name = module_name
|
||||
dev_file = module_dir / f"{short_name}.py"
|
||||
if dev_file.exists():
|
||||
dev_files = [dev_file]
|
||||
else:
|
||||
dev_files = []
|
||||
self.console.print(f"🔄 Building notebook for module: {args.module}")
|
||||
else:
|
||||
dev_files = self._find_dev_files()
|
||||
if not dev_files:
|
||||
self.console.print(Panel(
|
||||
"[yellow]⚠️ No *.py files found in modules/[/yellow]",
|
||||
title="Nothing to Convert",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 0
|
||||
self.console.print(f"🔄 Building notebooks for {len(dev_files)} modules...")
|
||||
|
||||
# Dry run mode
|
||||
if args.dry_run:
|
||||
self.console.print("\n[cyan]Dry run mode - would convert:[/cyan]")
|
||||
for dev_file in dev_files:
|
||||
module_name = dev_file.parent.name
|
||||
self.console.print(f" • {module_name}: {dev_file.name}")
|
||||
return 0
|
||||
|
||||
# Convert files
|
||||
success_count = 0
|
||||
error_count = 0
|
||||
|
||||
for dev_file in dev_files:
|
||||
success, message = self._convert_file(dev_file)
|
||||
module_name = dev_file.parent.name
|
||||
|
||||
if success:
|
||||
success_count += 1
|
||||
self.console.print(f" ✅ {module_name}: {message}")
|
||||
else:
|
||||
error_count += 1
|
||||
self.console.print(f" ❌ {module_name}: {message}")
|
||||
|
||||
# Summary
|
||||
self._print_summary(success_count, error_count)
|
||||
|
||||
return 0 if error_count == 0 else 1
|
||||
|
||||
def _print_summary(self, success_count: int, error_count: int) -> None:
|
||||
"""Print command execution summary."""
|
||||
summary_text = Text()
|
||||
|
||||
if success_count > 0:
|
||||
summary_text.append(f"✅ Successfully built {success_count} notebook(s)\n", style="bold green")
|
||||
if error_count > 0:
|
||||
summary_text.append(f"❌ Failed to build {error_count} notebook(s)\n", style="bold red")
|
||||
|
||||
if success_count > 0:
|
||||
summary_text.append("\n💡 Next steps:\n", style="bold yellow")
|
||||
summary_text.append(" • Open notebooks with: jupyter lab\n", style="white")
|
||||
summary_text.append(" • Work interactively in the notebooks\n", style="white")
|
||||
summary_text.append(" • Export code with: tito package export\n", style="white")
|
||||
summary_text.append(" • Run tests with: tito module test\n", style="white")
|
||||
|
||||
border_style = "green" if error_count == 0 else "yellow"
|
||||
self.console.print(Panel(
|
||||
summary_text,
|
||||
title="Notebook Generation Complete",
|
||||
border_style=border_style
|
||||
))
|
||||
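The conversion step in the deleted notebooks command is a thin wrapper around the Jupytext CLI. A minimal sketch of that call (hypothetical `to_notebook` helper, assuming `jupytext` is installed and on PATH rather than resolved from the virtualenv's bin directory):

```python
import subprocess
from pathlib import Path

def to_notebook(py_file: Path, timeout: int = 30) -> Path:
    """Convert a percent-format Python file to a .ipynb next to it using Jupytext."""
    result = subprocess.run(
        ["jupytext", "--to", "notebook", str(py_file)],
        capture_output=True, text=True, timeout=timeout, cwd=py_file.parent,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "Jupytext conversion failed")
    return py_file.with_suffix(".ipynb")

# Example (hypothetical module file):
# to_notebook(Path("modules/02_tensor/tensor.py"))
```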
@@ -1,897 +0,0 @@
"""
TinyTorch Olympics Command

Special competition events with focused challenges, time-limited competitions,
and unique recognition opportunities beyond the regular community leaderboard.
"""

import json
import os
from argparse import ArgumentParser, Namespace
from datetime import datetime, timedelta
from pathlib import Path
from typing import Dict, List, Optional, Any
import uuid

from rich.panel import Panel
from rich.table import Table
from rich.progress import track
from rich.prompt import Prompt, Confirm
from rich.console import Group
from rich.align import Align

from .base import BaseCommand
from ..core.exceptions import TinyTorchCLIError


class OlympicsCommand(BaseCommand):
|
||||
"""Special competition events - Focused challenges and recognition"""
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "olympics"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Special competition events with unique challenges and recognition"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
"""Add olympics subcommands."""
|
||||
subparsers = parser.add_subparsers(
|
||||
dest='olympics_command',
|
||||
help='Olympics operations',
|
||||
metavar='COMMAND'
|
||||
)
|
||||
|
||||
# Events command
|
||||
events_parser = subparsers.add_parser(
|
||||
'events',
|
||||
help='View current and upcoming competition events'
|
||||
)
|
||||
events_parser.add_argument(
|
||||
'--upcoming',
|
||||
action='store_true',
|
||||
help='Show only upcoming events'
|
||||
)
|
||||
events_parser.add_argument(
|
||||
'--past',
|
||||
action='store_true',
|
||||
help='Show past competition results'
|
||||
)
|
||||
|
||||
# Compete command
|
||||
compete_parser = subparsers.add_parser(
|
||||
'compete',
|
||||
help='Enter a specific competition event'
|
||||
)
|
||||
compete_parser.add_argument(
|
||||
'--event',
|
||||
required=True,
|
||||
help='Event ID to compete in'
|
||||
)
|
||||
compete_parser.add_argument(
|
||||
'--accuracy',
|
||||
type=float,
|
||||
help='Accuracy achieved for this competition'
|
||||
)
|
||||
compete_parser.add_argument(
|
||||
'--model',
|
||||
help='Model description and approach used'
|
||||
)
|
||||
compete_parser.add_argument(
|
||||
'--code-url',
|
||||
help='Optional: Link to your competition code/approach'
|
||||
)
|
||||
compete_parser.add_argument(
|
||||
'--notes',
|
||||
help='Competition-specific notes, innovations, learnings'
|
||||
)
|
||||
|
||||
# Awards command
|
||||
awards_parser = subparsers.add_parser(
|
||||
'awards',
|
||||
help='View special recognition and achievement badges'
|
||||
)
|
||||
awards_parser.add_argument(
|
||||
'--personal',
|
||||
action='store_true',
|
||||
help='Show only your personal awards'
|
||||
)
|
||||
|
||||
# History command
|
||||
history_parser = subparsers.add_parser(
|
||||
'history',
|
||||
help='View past competition events and memorable moments'
|
||||
)
|
||||
history_parser.add_argument(
|
||||
'--year',
|
||||
type=int,
|
||||
help='Filter by specific year'
|
||||
)
|
||||
history_parser.add_argument(
|
||||
'--event-type',
|
||||
choices=['speed', 'accuracy', 'innovation', 'efficiency', 'community'],
|
||||
help='Filter by event type'
|
||||
)
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
"""Execute olympics command."""
|
||||
command = getattr(args, 'olympics_command', None)
|
||||
|
||||
if not command:
|
||||
self._show_olympics_overview()
|
||||
return 0
|
||||
|
||||
if command == 'events':
|
||||
return self._show_events(args)
|
||||
elif command == 'compete':
|
||||
return self._compete_in_event(args)
|
||||
elif command == 'awards':
|
||||
return self._show_awards(args)
|
||||
elif command == 'history':
|
||||
return self._show_history(args)
|
||||
else:
|
||||
raise TinyTorchCLIError(f"Unknown olympics command: {command}")
|
||||
|
||||
def _show_olympics_overview(self) -> None:
|
||||
"""Show olympics overview and current special events."""
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
Align.center("[bold bright_gold]🏅 TinyTorch Olympics 🏅[/bold bright_gold]"),
|
||||
"",
|
||||
"[bold]Special Competition Events![/bold] Beyond the regular community leaderboard:",
|
||||
"",
|
||||
"🎯 [bold bright_blue]Focused Challenges[/bold bright_blue]",
|
||||
" • Time-limited competitions (24hr, 1week, 1month challenges)",
|
||||
" • Specific constraints (memory-efficient, fastest training, novel architectures)",
|
||||
" • Theme-based events (interpretability, fairness, efficiency)",
|
||||
"",
|
||||
"🏆 [bold bright_yellow]Special Recognition[/bold bright_yellow]",
|
||||
" • Olympic medals and achievement badges",
|
||||
" • Innovation awards for creative approaches",
|
||||
" • Community impact recognition",
|
||||
"",
|
||||
"🌟 [bold bright_green]Current Active Events[/bold bright_green]",
|
||||
" • Winter 2024 Speed Challenge (Training under 5 minutes)",
|
||||
" • Memory Efficiency Olympics (Models under 1MB)",
|
||||
" • Architecture Innovation Contest (Novel designs welcome)",
|
||||
"",
|
||||
"[bold]Available Commands:[/bold]",
|
||||
" [green]events[/green] - See current and upcoming competitions",
|
||||
" [green]compete[/green] - Enter a specific event",
|
||||
" [green]awards[/green] - View special recognition and badges",
|
||||
" [green]history[/green] - Past competitions and memorable moments",
|
||||
"",
|
||||
"[dim]💡 Note: Olympics are special events separate from daily community leaderboard[/dim]",
|
||||
),
|
||||
title="🥇 Competition Central",
|
||||
border_style="bright_yellow",
|
||||
padding=(1, 2)
|
||||
))
|
||||
|
||||
def _show_events(self, args: Namespace) -> int:
|
||||
"""Show current and upcoming competition events."""
|
||||
# Load events data (mock for now)
|
||||
events = self._load_olympics_events()
|
||||
|
||||
if args.upcoming:
|
||||
events = [e for e in events if e["status"] == "upcoming"]
|
||||
title = "📅 Upcoming Competition Events"
|
||||
elif args.past:
|
||||
events = [e for e in events if e["status"] == "completed"]
|
||||
title = "🏛️ Past Competition Results"
|
||||
else:
|
||||
title = "🏅 All Competition Events"
|
||||
|
||||
if not events:
|
||||
status_text = "upcoming" if args.upcoming else "past" if args.past else "available"
|
||||
self.console.print(Panel(
|
||||
f"[yellow]No {status_text} events at this time![/yellow]\n\n"
|
||||
"Check back soon for new competition opportunities!",
|
||||
title="📅 No Events",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 0
|
||||
|
||||
# Create events table
|
||||
table = Table(title=title)
|
||||
table.add_column("Event", style="bold")
|
||||
table.add_column("Type", style="blue")
|
||||
table.add_column("Duration", style="green")
|
||||
table.add_column("Status", style="yellow")
|
||||
table.add_column("Prize/Recognition", style="bright_magenta")
|
||||
table.add_column("Participants", style="cyan", justify="right")
|
||||
|
||||
for event in events:
|
||||
status_display = self._get_status_display(event["status"], event.get("end_date"))
|
||||
|
||||
table.add_row(
|
||||
event["name"],
|
||||
event["type"],
|
||||
event["duration"],
|
||||
status_display,
|
||||
event["prize"],
|
||||
str(event.get("participants", 0))
|
||||
)
|
||||
|
||||
self.console.print(table)
|
||||
|
||||
# Show active event details
|
||||
active_events = [e for e in events if e["status"] == "active"]
|
||||
if active_events:
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
"[bold bright_green]🔥 Active Competitions You Can Join Now![/bold bright_green]",
|
||||
"",
|
||||
*[f"• [bold]{event['name']}[/bold]: {event['description']}" for event in active_events[:3]],
|
||||
"",
|
||||
"[bold]Join a competition:[/bold]",
|
||||
"[dim]tito olympics compete --event <event_id>[/dim]",
|
||||
),
|
||||
title="⚡ Join Now",
|
||||
border_style="bright_green",
|
||||
padding=(0, 1)
|
||||
))
|
||||
|
||||
return 0
|
||||
|
||||
def _compete_in_event(self, args: Namespace) -> int:
|
||||
"""Enter a competition event."""
|
||||
# Check if user is registered for leaderboard
|
||||
if not self._is_user_registered():
|
||||
self.console.print(Panel(
|
||||
"[yellow]Please register for the community leaderboard first![/yellow]\n\n"
|
||||
"Olympics competitions require community membership:\n"
|
||||
"[bold]tito leaderboard register[/bold]",
|
||||
title="📝 Registration Required",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Load event details
|
||||
event = self._get_event_details(args.event)
|
||||
if not event:
|
||||
self.console.print(Panel(
|
||||
f"[red]Event '{args.event}' not found![/red]\n\n"
|
||||
"See available events: [bold]tito olympics events[/bold]",
|
||||
title="❌ Event Not Found",
|
||||
border_style="red"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Check if event is active
|
||||
if event["status"] != "active":
|
||||
self.console.print(Panel(
|
||||
f"[yellow]Event '{event['name']}' is not currently active![/yellow]\n\n"
|
||||
f"Status: {event['status']}\n"
|
||||
"See active events: [bold]tito olympics events[/bold]",
|
||||
title="⏰ Event Not Active",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Show event details and confirm participation
|
||||
self._show_event_details(event)
|
||||
|
||||
if not Confirm.ask("\n[bold]Compete in this event?[/bold]"):
|
||||
self.console.print("[dim]Maybe next time! 👋[/dim]")
|
||||
return 0
|
||||
|
||||
# Gather competition submission
|
||||
submission = self._gather_competition_submission(event, args)
|
||||
|
||||
# Validate submission meets event criteria
|
||||
validation_result = self._validate_submission(event, submission)
|
||||
if not validation_result["valid"]:
|
||||
self.console.print(Panel(
|
||||
f"[red]Submission doesn't meet event criteria![/red]\n\n"
|
||||
f"Issue: {validation_result['reason']}\n\n"
|
||||
"Please check event requirements and try again.",
|
||||
title="❌ Validation Failed",
|
||||
border_style="red"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Save competition entry
|
||||
self._save_competition_entry(event, submission)
|
||||
|
||||
# Show competition confirmation and standing
|
||||
self._show_competition_confirmation(event, submission)
|
||||
|
||||
return 0
|
||||
|
||||
def _show_awards(self, args: Namespace) -> int:
|
||||
"""Show special recognition and achievement badges."""
|
||||
if args.personal:
|
||||
return self._show_personal_awards()
|
||||
else:
|
||||
return self._show_all_awards()
|
||||
|
||||
def _show_personal_awards(self) -> int:
|
||||
"""Show user's personal awards and badges."""
|
||||
if not self._is_user_registered():
|
||||
self.console.print(Panel(
|
||||
"[yellow]Please register first to see your awards![/yellow]\n\n"
|
||||
"Run: [bold]tito leaderboard register[/bold]",
|
||||
title="📝 Registration Required",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Load user's Olympic achievements
|
||||
olympic_profile = self._load_user_olympic_profile()
|
||||
awards = olympic_profile.get("awards", [])
|
||||
competitions = olympic_profile.get("competitions", [])
|
||||
|
||||
if not awards and not competitions:
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
"[bold bright_blue]🌟 Your Olympic Journey Awaits![/bold bright_blue]",
|
||||
"",
|
||||
"You haven't participated in Olympics competitions yet.",
|
||||
"",
|
||||
"[bold]Start your journey:[/bold]",
|
||||
"• Check active events: [green]tito olympics events[/green]",
|
||||
"• Join a competition: [green]tito olympics compete --event <id>[/green]",
|
||||
"• Earn your first Olympic badge! 🏅",
|
||||
"",
|
||||
"[dim]Every Olympic participant gets recognition for participation![/dim]",
|
||||
),
|
||||
title="🏅 Your Olympic Profile",
|
||||
border_style="bright_blue",
|
||||
padding=(1, 2)
|
||||
))
|
||||
return 0
|
||||
|
||||
# Show awards and achievements
|
||||
self._display_personal_olympic_achievements(olympic_profile)
|
||||
return 0
|
||||
|
||||
def _show_all_awards(self) -> int:
|
||||
"""Show community awards and notable achievements."""
|
||||
# Mock awards data
|
||||
notable_awards = self._load_notable_awards()
|
||||
|
||||
# Recent awards table
|
||||
table = Table(title="🏆 Recent Olympic Achievements")
|
||||
table.add_column("Award", style="bold")
|
||||
table.add_column("Recipient", style="green")
|
||||
table.add_column("Event", style="blue")
|
||||
table.add_column("Achievement", style="yellow")
|
||||
table.add_column("Date", style="dim")
|
||||
|
||||
for award in notable_awards[:10]:
|
||||
table.add_row(
|
||||
award["award_type"],
|
||||
award["recipient"],
|
||||
award["event"],
|
||||
award["description"],
|
||||
award["date"]
|
||||
)
|
||||
|
||||
self.console.print(table)
|
||||
|
||||
# Award categories explanation
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
"[bold bright_yellow]🏅 Olympic Award Categories[/bold bright_yellow]",
|
||||
"",
|
||||
"🥇 [bold]Performance Awards[/bold]",
|
||||
" • Gold/Silver/Bronze medals for top competition results",
|
||||
" • Speed records, accuracy achievements, efficiency milestones",
|
||||
"",
|
||||
"🌟 [bold]Innovation Awards[/bold]",
|
||||
" • Novel Architecture Award for creative model designs",
|
||||
" • Optimization Genius for breakthrough efficiency techniques",
|
||||
" • Interpretability Champion for explainable AI contributions",
|
||||
"",
|
||||
"🤝 [bold]Community Awards[/bold]",
|
||||
" • Mentor Badge for helping other competitors",
|
||||
" • Knowledge Sharer for valuable insights and tutorials",
|
||||
" • Sportsperson Award for exceptional community spirit",
|
||||
"",
|
||||
"🎯 [bold]Special Recognition[/bold]",
|
||||
" • First Participation Badge (everyone gets this!)",
|
||||
" • Consistency Award for regular competition participation",
|
||||
" • Breakthrough Achievement for major personal improvements",
|
||||
),
|
||||
title="🏆 Recognition System",
|
||||
border_style="bright_yellow",
|
||||
padding=(0, 1)
|
||||
))
|
||||
|
||||
return 0
|
||||
|
||||
def _show_history(self, args: Namespace) -> int:
|
||||
"""Show past competition events and memorable moments."""
|
||||
# Load historical data
|
||||
history = self._load_olympics_history()
|
||||
|
||||
# Filter by year if specified
|
||||
if args.year:
|
||||
history = [h for h in history if h["year"] == args.year]
|
||||
|
||||
# Filter by event type if specified
|
||||
if args.event_type:
|
||||
history = [h for h in history if h["type"] == args.event_type]
|
||||
|
||||
if not history:
|
||||
filter_text = f" for {args.year}" if args.year else ""
|
||||
filter_text += f" ({args.event_type} events)" if args.event_type else ""
|
||||
|
||||
self.console.print(Panel(
|
||||
f"[yellow]No competition history found{filter_text}![/yellow]\n\n"
|
||||
"The Olympics program is just getting started!",
|
||||
title="📚 No History",
|
||||
border_style="yellow"
|
||||
))
|
||||
return 0
|
||||
|
||||
# Create history table
|
||||
table = Table(title="📚 TinyTorch Olympics History")
|
||||
table.add_column("Event", style="bold")
|
||||
table.add_column("Date", style="dim")
|
||||
table.add_column("Type", style="blue")
|
||||
table.add_column("Winner", style="green")
|
||||
table.add_column("Achievement", style="yellow")
|
||||
table.add_column("Memorable Moment", style="cyan")
|
||||
|
||||
for event in sorted(history, key=lambda x: x["date"], reverse=True):
|
||||
table.add_row(
|
||||
event["name"],
|
||||
event["date"],
|
||||
event["type"],
|
||||
event["winner"],
|
||||
event["winning_achievement"],
|
||||
event["memorable_moment"]
|
||||
)
|
||||
|
||||
self.console.print(table)
|
||||
|
||||
# Show legendary moments
|
||||
if not args.year and not args.event_type:
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
"[bold bright_gold]🌟 Legendary Olympic Moments[/bold bright_gold]",
|
||||
"",
|
||||
"🏆 [bold]The Great Speed Challenge 2024[/bold]",
|
||||
" Winner achieved 75% CIFAR-10 accuracy in just 47 seconds!",
|
||||
"",
|
||||
"🧠 [bold]Architecture Innovation Contest[/bold]",
|
||||
" Revolutionary attention mechanism reduced parameters by 90%",
|
||||
"",
|
||||
"🤝 [bold]Community Spirit Award[/bold]",
|
||||
" Competitor shared winning code to help others improve",
|
||||
"",
|
||||
"[dim]Each Olympics creates new legends in the TinyTorch community! 💫[/dim]",
|
||||
),
|
||||
title="🏛️ Hall of Fame",
|
||||
border_style="bright_gold",
|
||||
padding=(0, 1)
|
||||
))
|
||||
|
||||
return 0
|
||||
|
||||
def _load_olympics_events(self) -> List[Dict[str, Any]]:
|
||||
"""Load olympics events data (mock implementation)."""
|
||||
return [
|
||||
{
|
||||
"id": "winter2024_speed",
|
||||
"name": "Winter 2024 Speed Challenge",
|
||||
"type": "Speed",
|
||||
"status": "active",
|
||||
"duration": "24 hours",
|
||||
"description": "Train CIFAR-10 model to 70%+ accuracy in under 5 minutes",
|
||||
"prize": "🏆 Speed Medal + Recognition",
|
||||
"participants": 23,
|
||||
"start_date": "2024-01-15",
|
||||
"end_date": "2024-01-16",
|
||||
"criteria": {"min_accuracy": 70.0, "max_time_minutes": 5}
|
||||
},
|
||||
{
|
||||
"id": "memory2024_efficiency",
|
||||
"name": "Memory Efficiency Olympics",
|
||||
"type": "Efficiency",
|
||||
"status": "active",
|
||||
"duration": "1 week",
|
||||
"description": "Best CIFAR-10 accuracy with model under 1MB",
|
||||
"prize": "🥇 Efficiency Champion",
|
||||
"participants": 15,
|
||||
"start_date": "2024-01-10",
|
||||
"end_date": "2024-01-17",
|
||||
"criteria": {"max_model_size_mb": 1.0}
|
||||
},
|
||||
{
|
||||
"id": "innovation2024_arch",
|
||||
"name": "Architecture Innovation Contest",
|
||||
"type": "Innovation",
|
||||
"status": "upcoming",
|
||||
"duration": "2 weeks",
|
||||
"description": "Novel architectures and creative approaches welcome",
|
||||
"prize": "🌟 Innovation Award",
|
||||
"participants": 0,
|
||||
"start_date": "2024-02-01",
|
||||
"end_date": "2024-02-14",
|
||||
"criteria": {"novelty_required": True}
|
||||
},
|
||||
{
|
||||
"id": "autumn2023_classic",
|
||||
"name": "Autumn 2023 Classic",
|
||||
"type": "Accuracy",
|
||||
"status": "completed",
|
||||
"duration": "1 month",
|
||||
"description": "Best overall CIFAR-10 accuracy challenge",
|
||||
"prize": "🥇 Gold Medal",
|
||||
"participants": 87,
|
||||
"start_date": "2023-10-01",
|
||||
"end_date": "2023-10-31",
|
||||
"winner": "neural_champion",
|
||||
"winning_score": 84.2
|
||||
}
|
||||
]
|
||||
|
||||
def _get_status_display(self, status: str, end_date: Optional[str] = None) -> str:
|
||||
"""Get display-friendly status with timing information."""
|
||||
if status == "active":
|
||||
if end_date:
|
||||
# Calculate time remaining
|
||||
end = datetime.fromisoformat(end_date)
|
||||
now = datetime.now()
|
||||
if end > now:
|
||||
remaining = end - now
|
||||
if remaining.days > 0:
|
||||
return f"🔥 Active ({remaining.days}d left)"
|
||||
else:
|
||||
hours = remaining.seconds // 3600
|
||||
return f"🔥 Active ({hours}h left)"
|
||||
return "🔥 Active"
|
||||
elif status == "upcoming":
|
||||
return "📅 Upcoming"
|
||||
elif status == "completed":
|
||||
return "✅ Completed"
|
||||
else:
|
||||
return status.title()
|
||||
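`_get_status_display` above derives its "time left" suffix from an ISO end date. A compact sketch of just that countdown formatting (hypothetical `time_left_label` helper, standard library only; the "ended" branch is an added assumption):

```python
from datetime import datetime
from typing import Optional

def time_left_label(end_date_iso: str, now: Optional[datetime] = None) -> str:
    """Return e.g. '2d left' or '5h left', or 'ended' if the end date has passed."""
    end = datetime.fromisoformat(end_date_iso)
    now = now or datetime.now()
    if end <= now:
        return "ended"
    remaining = end - now
    if remaining.days > 0:
        return f"{remaining.days}d left"
    return f"{remaining.seconds // 3600}h left"

# Example: time_left_label("2024-01-16T00:00:00", now=datetime(2024, 1, 15, 6))
# -> '18h left'
```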
|
||||
def _is_user_registered(self) -> bool:
|
||||
"""Check if user is registered for community leaderboard."""
|
||||
from .leaderboard import LeaderboardCommand
|
||||
leaderboard_cmd = LeaderboardCommand(self.config)
|
||||
return leaderboard_cmd._load_user_profile() is not None
|
||||
|
||||
def _get_event_details(self, event_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get details for a specific event."""
|
||||
events = self._load_olympics_events()
|
||||
return next((e for e in events if e["id"] == event_id), None)
|
||||
|
||||
def _show_event_details(self, event: Dict[str, Any]) -> None:
|
||||
"""Show detailed information about an event."""
|
||||
self.console.print(Panel(
|
||||
Group(
|
||||
f"[bold bright_blue]{event['name']}[/bold bright_blue]",
|
||||
"",
|
||||
f"[bold]Type:[/bold] {event['type']}",
|
||||
f"[bold]Duration:[/bold] {event['duration']}",
|
||||
f"[bold]Current Participants:[/bold] {event.get('participants', 0)}",
|
||||
"",
|
||||
f"[bold]Challenge:[/bold]",
|
||||
f" {event['description']}",
|
||||
"",
|
||||
f"[bold]Recognition:[/bold]",
|
||||
f" {event['prize']}",
|
||||
"",
|
||||
f"[bold]Requirements:[/bold]",
|
||||
*[f" • {k.replace('_', ' ').title()}: {v}" for k, v in event.get('criteria', {}).items()],
|
||||
),
|
||||
title=f"🏅 {event['type']} Competition",
|
||||
border_style="bright_blue",
|
||||
padding=(1, 2)
|
||||
))
|
||||
|
||||
    def _gather_competition_submission(self, event: Dict[str, Any], args: Namespace) -> Dict[str, Any]:
        """Gather submission details for competition."""
        submission = {
            "event_id": event["id"],
            "submitted_date": datetime.now().isoformat()
        }

        # Get accuracy
        if args.accuracy is not None:
            submission["accuracy"] = args.accuracy
        else:
            submission["accuracy"] = float(Prompt.ask(
                f"[bold]Accuracy achieved on {event.get('dataset', 'the task')}[/bold]",
                default="0.0"
            ))

        # Get model description
        if args.model:
            submission["model"] = args.model
        else:
            submission["model"] = Prompt.ask(
                "[bold]Model description[/bold] (architecture, approach, innovations)",
                default="Custom Model"
            )

        # Optional fields
        submission["code_url"] = args.code_url or Prompt.ask(
            "[bold]Code/approach URL[/bold] (optional)",
            default=""
        ) or None

        submission["notes"] = args.notes or Prompt.ask(
            "[bold]Competition notes[/bold] (innovations, challenges, learnings)",
            default=""
        ) or None

        # Event-specific metrics
        if "max_time_minutes" in event.get("criteria", {}):
            training_time = float(Prompt.ask(
                "[bold]Training time in minutes[/bold]",
                default="0.0"
            ))
            submission["training_time_minutes"] = training_time

        if "max_model_size_mb" in event.get("criteria", {}):
            model_size = float(Prompt.ask(
                "[bold]Model size in MB[/bold]",
                default="0.0"
            ))
            submission["model_size_mb"] = model_size

        return submission

    def _validate_submission(self, event: Dict[str, Any], submission: Dict[str, Any]) -> Dict[str, Any]:
        """Validate submission meets event criteria."""
        criteria = event.get("criteria", {})

        # Check minimum accuracy
        if "min_accuracy" in criteria:
            if submission["accuracy"] < criteria["min_accuracy"]:
                return {
                    "valid": False,
                    "reason": f"Accuracy {submission['accuracy']:.1f}% below required {criteria['min_accuracy']:.1f}%"
                }

        # Check maximum training time
        if "max_time_minutes" in criteria:
            if submission.get("training_time_minutes", 0) > criteria["max_time_minutes"]:
                return {
                    "valid": False,
                    "reason": f"Training time {submission['training_time_minutes']:.1f}min exceeds limit {criteria['max_time_minutes']:.1f}min"
                }

        # Check maximum model size
        if "max_model_size_mb" in criteria:
            if submission.get("model_size_mb", 0) > criteria["max_model_size_mb"]:
                return {
                    "valid": False,
                    "reason": f"Model size {submission['model_size_mb']:.1f}MB exceeds limit {criteria['max_model_size_mb']:.1f}MB"
                }

        return {"valid": True}

    def _save_competition_entry(self, event: Dict[str, Any], submission: Dict[str, Any]) -> None:
        """Save competition entry to user's Olympic profile."""
        olympic_profile = self._load_user_olympic_profile()

        if "competitions" not in olympic_profile:
            olympic_profile["competitions"] = []

        olympic_profile["competitions"].append(submission)

        # Add participation award if first competition
        if len(olympic_profile["competitions"]) == 1:
            award = {
                "type": "participation",
                "name": "First Olympic Participation",
                "description": "Welcomed to the Olympics community!",
                "event": event["name"],
                "earned_date": datetime.now().isoformat()
            }
            if "awards" not in olympic_profile:
                olympic_profile["awards"] = []
            olympic_profile["awards"].append(award)

        self._save_user_olympic_profile(olympic_profile)
    def _show_competition_confirmation(self, event: Dict[str, Any], submission: Dict[str, Any]) -> None:
        """Show confirmation and current standing."""
        # Determine performance level for this competition
        ranking_message = self._get_competition_ranking_message(event, submission)

        self.console.print(Panel(
            Group(
                Align.center("[bold bright_green]🎉 Competition Entry Submitted! 🎉[/bold bright_green]"),
                "",
                f"[bold]Event:[/bold] {event['name']}",
                f"[bold]Your Result:[/bold] {submission['accuracy']:.1f}% accuracy",
                f"[bold]Model:[/bold] {submission['model']}",
                "",
                ranking_message,
                "",
                "[bold bright_blue]🏅 Recognition Earned:[/bold bright_blue]",
                "• Olympic Participant Badge",
                "• Competition Experience Points",
                "• Community Recognition",
                "",
                "[bold]Next Steps:[/bold]",
                "• View your awards: [green]tito olympics awards --personal[/green]",
                "• See current standings: [green]tito olympics events[/green]",
                "• Join another event: [green]tito olympics events[/green]",
            ),
            title="🥇 Olympic Achievement",
            border_style="bright_green",
            padding=(1, 2)
        ))

    def _get_competition_ranking_message(self, event: Dict[str, Any], submission: Dict[str, Any]) -> str:
        """Get appropriate ranking/performance message for competition."""
        accuracy = submission["accuracy"]

        # Mock competition standings for encouragement
        if accuracy >= 80:
            return "[bright_green]🏆 Outstanding performance! You're in contention for top prizes![/bright_green]"
        elif accuracy >= 70:
            return "[bright_blue]🎯 Strong showing! You're competing well in this event![/bright_blue]"
        elif accuracy >= 60:
            return "[bright_yellow]🌟 Good effort! Every competition teaches valuable lessons![/bright_yellow]"
        else:
            return "[bright_magenta]💝 Thank you for participating! Competition experience is valuable![/bright_magenta]"

    def _load_user_olympic_profile(self) -> Dict[str, Any]:
        """Load user's Olympic competition profile."""
        data_dir = Path.home() / ".tinytorch" / "olympics"
        data_dir.mkdir(parents=True, exist_ok=True)
        profile_file = data_dir / "olympic_profile.json"

        if profile_file.exists():
            with open(profile_file, 'r') as f:
                return json.load(f)

        return {
            "competitions": [],
            "awards": [],
            "created_date": datetime.now().isoformat()
        }

    def _save_user_olympic_profile(self, profile: Dict[str, Any]) -> None:
        """Save user's Olympic competition profile."""
        data_dir = Path.home() / ".tinytorch" / "olympics"
        profile_file = data_dir / "olympic_profile.json"

        with open(profile_file, 'w') as f:
            json.dump(profile, f, indent=2)
    def _display_personal_olympic_achievements(self, olympic_profile: Dict[str, Any]) -> None:
        """Display user's personal Olympic achievements."""
        competitions = olympic_profile.get("competitions", [])
        awards = olympic_profile.get("awards", [])

        # Summary stats
        total_competitions = len(competitions)
        best_accuracy = max([c["accuracy"] for c in competitions], default=0)
        events_participated = len(set(c["event_id"] for c in competitions))

        self.console.print(Panel(
            Group(
                Align.center("[bold bright_gold]🏅 Your Olympic Journey 🏅[/bold bright_gold]"),
                "",
                f"🎯 Competitions Entered: {total_competitions}",
                f"🏆 Best Performance: {best_accuracy:.1f}% accuracy",
                f"🌟 Events Participated: {events_participated}",
                f"🥇 Awards Earned: {len(awards)}",
            ),
            title="📊 Olympic Stats",
            border_style="bright_gold",
            padding=(1, 2)
        ))

        # Awards table
        if awards:
            awards_table = Table(title="🏆 Your Olympic Awards")
            awards_table.add_column("Award", style="bold")
            awards_table.add_column("Event", style="blue")
            awards_table.add_column("Description", style="green")
            awards_table.add_column("Date", style="dim")

            for award in sorted(awards, key=lambda x: x["earned_date"], reverse=True):
                awards_table.add_row(
                    award["name"],
                    award["event"],
                    award["description"],
                    award["earned_date"][:10]
                )

            self.console.print(awards_table)

        # Recent competitions
        if competitions:
            recent_comps = sorted(competitions, key=lambda x: x["submitted_date"], reverse=True)[:5]

            comps_table = Table(title="🎯 Recent Competition Entries")
            comps_table.add_column("Event", style="bold")
            comps_table.add_column("Accuracy", style="green", justify="right")
            comps_table.add_column("Model", style="blue")
            comps_table.add_column("Date", style="dim")

            for comp in recent_comps:
                comps_table.add_row(
                    comp["event_id"],
                    f"{comp['accuracy']:.1f}%",
                    comp["model"],
                    comp["submitted_date"][:10]
                )

            self.console.print(comps_table)

    def _load_notable_awards(self) -> List[Dict[str, Any]]:
        """Load notable community awards (mock implementation)."""
        return [
            {
                "award_type": "🥇 Gold Medal",
                "recipient": "speed_demon",
                "event": "Winter 2024 Speed Challenge",
                "description": "2.3 min training, 78.4% accuracy",
                "date": "2024-01-16"
            },
            {
                "award_type": "🌟 Innovation Award",
                "recipient": "arch_wizard",
                "event": "Memory Efficiency Olympics",
                "description": "Novel attention mechanism",
                "date": "2024-01-15"
            },
            {
                "award_type": "🤝 Community Spirit",
                "recipient": "helpful_mentor",
                "event": "Autumn 2023 Classic",
                "description": "Shared winning approach publicly",
                "date": "2023-11-01"
            },
            {
                "award_type": "🏆 Speed Record",
                "recipient": "lightning_fast",
                "event": "Winter 2024 Speed Challenge",
                "description": "47 second training record",
                "date": "2024-01-15"
            },
            {
                "award_type": "🎯 Accuracy Champion",
                "recipient": "precision_master",
                "event": "Architecture Innovation",
                "description": "86.7% CIFAR-10 accuracy",
                "date": "2024-01-10"
            }
        ]

    def _load_olympics_history(self) -> List[Dict[str, Any]]:
        """Load historical Olympics data (mock implementation)."""
        return [
            {
                "name": "Autumn 2023 Classic",
                "date": "2023-10-31",
                "year": 2023,
                "type": "accuracy",
                "winner": "neural_champion",
                "winning_achievement": "84.2% CIFAR-10 accuracy",
                "memorable_moment": "First 80%+ achievement in community"
            },
            {
                "name": "Summer 2023 Speed Trial",
                "date": "2023-07-15",
                "year": 2023,
                "type": "speed",
                "winner": "velocity_victor",
                "winning_achievement": "3.2 minute training",
                "memorable_moment": "Breakthrough GPU optimization technique"
            },
            {
                "name": "Spring 2023 Innovation Fair",
                "date": "2023-04-20",
                "year": 2023,
                "type": "innovation",
                "winner": "creative_genius",
                "winning_achievement": "Self-organizing architecture",
                "memorable_moment": "Inspired 12 follow-up research papers"
            }
        ]
@@ -1,572 +0,0 @@
"""
Status command for TinyTorch CLI: checks status of all modules in modules/ directory.

Supports both basic status checking and comprehensive system analysis.
"""

import subprocess
import sys
import yaml
import re
import time
from argparse import ArgumentParser, Namespace
from pathlib import Path
from rich.panel import Panel
from rich.table import Table
from rich.text import Text
from typing import Union, Dict, Any, Optional

from .base import BaseCommand
from ..core.status_analyzer import TinyTorchStatusAnalyzer

class StatusCommand(BaseCommand):
    @property
    def name(self) -> str:
        return "status"

    @property
    def description(self) -> str:
        return "Check status of all modules"

    def add_arguments(self, parser: ArgumentParser) -> None:
        parser.add_argument("--progress", action="store_true", help="Show user progress (modules + milestones) - DEFAULT")
        parser.add_argument("--files", action="store_true", help="Show file structure and module status")
        parser.add_argument("--details", action="store_true", help="Show detailed file structure")
        parser.add_argument("--metadata", action="store_true", help="Show module metadata information")
        parser.add_argument("--test-status", action="store_true", help="Include test execution status (slower)")
        parser.add_argument("--comprehensive", action="store_true", help="Run comprehensive system health dashboard (environment + compliance + testing)")

    def _get_export_target(self, module_path: Path) -> str:
        """
        Read the actual export target from the dev file's #| default_exp directive.
        Same logic as the export command.
        """
        # Extract short name from module directory name for dev file
        module_name = module_path.name
        if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))):
            short_name = module_name[3:]  # Remove "00_" prefix
        else:
            short_name = module_name
        dev_file = module_path / f"{short_name}.py"
        if not dev_file.exists():
            return "not_found"

        try:
            with open(dev_file, 'r', encoding='utf-8') as f:
                content = f.read()
            # Look for #| default_exp directive
            match = re.search(r'#\|\s*default_exp\s+([^\n\r]+)', content)
            if match:
                return match.group(1).strip()
            return "no_export"
        except Exception:
            return "read_error"

    def _count_test_functions(self, dev_file: Path) -> int:
        """Count the number of test functions in a dev file."""
        try:
            with open(dev_file, 'r', encoding='utf-8') as f:
                content = f.read()
            # Count lines that start with "def test_"
            lines = content.split('\n')
            test_functions = [line for line in lines if line.strip().startswith('def test_')]
            return len(test_functions)
        except Exception:
            return 0

    def _count_export_functions(self, dev_file: Path) -> int:
        """Count the number of exported functions/classes in a dev file."""
        try:
            with open(dev_file, 'r', encoding='utf-8') as f:
                content = f.read()
            # Count lines that have #| export directive
            lines = content.split('\n')
            export_lines = [line for line in lines if line.strip().startswith('#| export')]
            return len(export_lines)
        except Exception:
            return 0
    def run(self, args: Namespace) -> int:
        console = self.console

        # Handle comprehensive analysis mode
        if args.comprehensive:
            return self._run_comprehensive_analysis()

        # Handle progress view (default if no flags, or --progress)
        if not args.files and not args.details and not args.metadata and not args.test_status:
            return self._run_progress_view()

        if args.progress:
            return self._run_progress_view()

        # Standard file status check mode
        return self._run_standard_status(args)

    def _run_progress_view(self) -> int:
        """Show unified user progress view (modules + milestones)."""
        console = self.console
        import json
        from datetime import datetime

        # Load progress data
        progress_file = Path(".tito") / "progress.json"
        milestones_file = Path(".tito") / "milestones.json"

        # Load module progress
        if progress_file.exists():
            progress_data = json.loads(progress_file.read_text())
            completed_modules = progress_data.get("completed_modules", [])
            completion_dates = progress_data.get("completion_dates", {})
        else:
            completed_modules = []
            completion_dates = {}

        # Load milestone achievements
        if milestones_file.exists():
            milestones_data = json.loads(milestones_file.read_text())
            completed_milestones = milestones_data.get("completed_milestones", [])
            milestone_dates = milestones_data.get("completion_dates", {})
        else:
            completed_milestones = []
            milestone_dates = {}

        # Calculate progress percentages
        total_modules = 20
        total_milestones = 6
        modules_percent = int((len(completed_modules) / total_modules) * 100)
        milestones_percent = int((len(completed_milestones) / total_milestones) * 100)

        # Create summary panel
        summary_text = Text()
        summary_text.append(f"📦 Modules Completed: ", style="bold")
        summary_text.append(f"{len(completed_modules)}/{total_modules} ({modules_percent}%)\n", style="cyan")
        summary_text.append(f"🏆 Milestones Achieved: ", style="bold")
        summary_text.append(f"{len(completed_milestones)}/{total_milestones} ({milestones_percent}%)\n\n", style="magenta")

        # Last activity
        all_dates = list(completion_dates.values()) + list(milestone_dates.values())
        if all_dates:
            latest_date = max(all_dates)
            summary_text.append("📍 Last Activity: ", style="bold")
            summary_text.append(f"{latest_date}\n", style="dim")

        console.print(Panel(
            summary_text,
            title="📊 TinyTorch Progress",
            border_style="bright_cyan"
        ))

        # Module Progress Table
        if completed_modules:
            console.print("\n[bold]Module Progress:[/bold]")
            for i in range(1, total_modules + 1):
                mod_num = i
                if mod_num in completed_modules:
                    module_name = self._get_module_name(mod_num)
                    console.print(f" [green]✅ {mod_num:02d} {module_name}[/green]")
                elif i <= len(completed_modules) + 3:  # Show next few modules
                    module_name = self._get_module_name(mod_num)
                    console.print(f" [dim]🔒 {mod_num:02d} {module_name}[/dim]")

        # Milestone Achievements
        if completed_milestones or (completed_modules and len(completed_modules) >= 1):
            console.print("\n[bold]Milestone Achievements:[/bold]")
            milestone_names = {
                "01": "Perceptron (1957)",
                "02": "Backpropagation (1986)",
                "03": "MLP Revival (1986)",
                "04": "CNN Revolution (1998)",
                "05": "Transformer Era (2017)",
                "06": "MLPerf (2018)"
            }
            for mid in ["01", "02", "03", "04", "05", "06"]:
                if mid in completed_milestones:
                    console.print(f" [magenta]✅ {mid} - {milestone_names[mid]}[/magenta]")
                else:
                    # Check if ready
                    prereqs_met = self._check_milestone_prereqs(mid, completed_modules)
                    if prereqs_met:
                        console.print(f" [yellow]🎯 {mid} - {milestone_names[mid]} [Ready!][/yellow]")
                    else:
                        console.print(f" [dim]🔒 {mid} - {milestone_names[mid]}[/dim]")

        console.print()
        return 0

    def _get_module_name(self, module_num: int) -> str:
        """Get module name from number."""
        module_names = {
            1: "Tensor", 2: "Activations", 3: "Layers", 4: "Losses",
            5: "Autograd", 6: "Optimizers", 7: "Training", 8: "DataLoader",
            9: "Convolutions", 10: "Normalization", 11: "Tokenization",
            12: "Embeddings", 13: "Attention", 14: "Transformers",
            15: "Profiling", 16: "Quantization", 17: "Compression",
            18: "Memoization", 19: "Benchmarking", 20: "Capstone"
        }
        return module_names.get(module_num, "Unknown")

    def _check_milestone_prereqs(self, milestone_id: str, completed_modules: list) -> bool:
        """Check if milestone prerequisites are met."""
        prereqs = {
            "01": [1],
            "02": [1, 2, 3, 4, 5],
            "03": [1, 2, 3, 4, 5, 6, 7],
            "04": [1, 2, 3, 4, 5, 6, 7, 8, 9],
            "05": [1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 14],
            "06": [1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 16, 19]
        }
        required = prereqs.get(milestone_id, [])
        return all(mod in completed_modules for mod in required)
    def _run_comprehensive_analysis(self) -> int:
        """Run comprehensive system health dashboard."""
        console = self.console
        start_time = time.time()

        console.print("🚀 Starting TinyTorch Comprehensive Status Check...", style="bold green")

        # Initialize analyzer
        analyzer = TinyTorchStatusAnalyzer()

        # Run full analysis
        result = analyzer.run_full_analysis()

        # Generate comprehensive report
        analyzer.generate_comprehensive_report(console)

        # Summary
        total_time = time.time() - start_time
        console.print(f"\n⏱️ Comprehensive analysis completed in {total_time:.1f}s", style="dim")

        # Return appropriate exit code
        if result['summary']['environment_healthy'] and result['summary']['working_modules'] >= result['summary']['total_modules'] * 0.8:
            return 0  # Success
        else:
            return 1  # Issues found

    def _run_standard_status(self, args: Namespace) -> int:
        """Run standard status check mode."""
        console = self.console

        # Scan modules directory
        modules_dir = Path("modules")
        if not modules_dir.exists():
            console.print(Panel("[red]❌ modules/ directory not found[/red]",
                                title="Error", border_style="red"))
            return 1

        # Find all module directories (exclude special directories)
        exclude_dirs = {'.quarto', '__pycache__', '.git', '.pytest_cache'}
        module_dirs = [d for d in modules_dir.iterdir()
                       if d.is_dir() and d.name not in exclude_dirs]

        if not module_dirs:
            console.print(Panel("[yellow]⚠️ No modules found in modules/ directory[/yellow]",
                                title="Warning", border_style="yellow"))
            return 0

        console.print(Panel(f"📋 Found {len(module_dirs)} modules in modules directory",
                            title="Module Status Check", border_style="bright_cyan"))

        # Create status table
        status_table = Table(title="Module Status Overview", show_header=True, header_style="bold blue")
        status_table.add_column("Module", style="bold cyan", width=17)
        status_table.add_column("Status", width=12, justify="center")
        status_table.add_column("Dev File", width=12, justify="center")
        status_table.add_column("Inline Tests", width=12, justify="center")
        status_table.add_column("External Tests", width=12, justify="center")
        status_table.add_column("README", width=12, justify="center")

        if args.metadata:
            status_table.add_column("Export Target", width=20, justify="center")
            status_table.add_column("Prerequisites", width=15, justify="center")

        # Check each module
        modules_status = []
        for module_dir in sorted(module_dirs):
            module_name = module_dir.name
            status = self._check_module_status(module_dir, args.test_status)
            modules_status.append((module_name, status))

            # Add to table
            row = [
                module_name,
                self._format_status(status['overall_status']),
                self._format_file_status(status['dev_file'], status.get('export_count', 0)),
                self._format_inline_tests(status['inline_test_count']),
                self._format_external_tests(status['external_tests'], status.get('external_test_status')),
                "✅" if status['readme'] else "❌"
            ]

            # Add metadata columns if requested
            if args.metadata:
                metadata = status.get('metadata', {})
                export_target = status.get('export_target', 'unknown')
                row.append(export_target if export_target not in ['not_found', 'no_export', 'read_error'] else export_target)

                # Show prerequisites from dependencies
                deps = metadata.get('dependencies', {})
                prereqs = deps.get('prerequisites', [])
                row.append(', '.join(prereqs) if prereqs else 'none')

            status_table.add_row(*row)

        console.print(status_table)

        # Summary with better logic
        total_modules = len(modules_status)

        # A module is "working" if it has a dev file with implementations
        working_modules = sum(1 for _, status in modules_status
                              if status['dev_file'] and status.get('export_count', 0) > 0)

        # A module is "complete" if it has everything
        complete_modules = sum(1 for _, status in modules_status
                               if status['dev_file'] and status['external_tests'] and status['readme'] and status.get('export_count', 0) > 0)

        console.print(f"\n📊 Summary:")
        console.print(f" 🏗️ Working modules: {working_modules}/{total_modules} (have implementations)")
        console.print(f" ✅ Complete modules: {complete_modules}/{total_modules} (have implementations, tests, docs)")

        # Helpful commands
        console.print(f"\n💡 Quick commands:")
        console.print(f" [bold cyan]tito status --comprehensive[/bold cyan] # Full system health dashboard")
        console.print(f" [bold cyan]tito module test --all[/bold cyan] # Test all modules")
        console.print(f" [bold cyan]tito module test MODULE_NAME[/bold cyan] # Test specific module")
        console.print(f" [bold cyan]pytest modules/*/ -k test_[/bold cyan] # Run pytest on inline tests")
        console.print(f" [bold cyan]pytest tests/test_*.py[/bold cyan] # Run external tests")

        # Detailed view
        if args.details:
            console.print("\n" + "="*60)
            console.print("📁 Detailed Module Structure")
            console.print("="*60)

            for module_name, status in modules_status:
                self._print_module_details(module_name, status)

        # Metadata view
        if args.metadata:
            console.print("\n" + "="*60)
            console.print("📊 Module Metadata")
            console.print("="*60)

            for module_name, status in modules_status:
                if status.get('metadata'):
                    self._print_module_metadata(module_name, status['metadata'])

        return 0
    def _check_module_status(self, module_dir: Path, check_tests: bool = False) -> dict:
        """Check the status of a single module."""
        module_name = module_dir.name

        # Check for required files
        # Extract short name from module directory name for dev file
        if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))):
            short_name = module_name[3:]  # Remove "00_" prefix
        else:
            short_name = module_name
        dev_file = module_dir / f"{short_name}.py"
        readme_file = module_dir / "README.md"
        metadata_file = module_dir / "module.yaml"

        # Check for tests in main tests directory
        # Extract short name from module directory name (e.g., "01_tensor" -> "tensor")
        if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))):
            short_name = module_name[3:]  # Remove "00_" prefix
        else:
            short_name = module_name

        main_test_file = Path("tests") / f"test_{short_name}.py"

        status = {
            'dev_file': dev_file.exists(),
            'readme': readme_file.exists(),
            'metadata_file': metadata_file.exists(),
            'external_tests': main_test_file.exists(),
            'inline_test_count': 0,
            'export_count': 0,
            'export_target': 'not_found',
            'external_test_status': None,
            'overall_status': 'unknown',
            'metadata': None
        }

        # Count inline tests and exports if dev file exists
        if dev_file.exists():
            status['inline_test_count'] = self._count_test_functions(dev_file)
            status['export_count'] = self._count_export_functions(dev_file)
            status['export_target'] = self._get_export_target(module_dir)

        # Run external tests if requested (slower)
        if check_tests and main_test_file.exists():
            status['external_test_status'] = self._check_external_tests(main_test_file)

        # Determine overall status
        status['overall_status'] = self._determine_overall_status(status)

        # Load metadata if available
        if metadata_file.exists():
            try:
                with open(metadata_file, 'r') as f:
                    metadata = yaml.safe_load(f)
                status['metadata'] = metadata
            except Exception as e:
                status['metadata'] = {'error': str(e)}

        return status

    def _determine_overall_status(self, status: dict) -> str:
        """Determine overall module status based on files and implementation."""
        # If no dev file, module is not started
        if not status['dev_file']:
            return 'not_started'

        # If dev file exists but no implementations, module is empty
        if status.get('export_count', 0) == 0:
            return 'empty'

        # If has implementations but no tests, module is in progress
        if status.get('inline_test_count', 0) == 0 and not status.get('external_tests', False):
            return 'no_tests'

        # If has implementations and tests, module is working
        if status.get('export_count', 0) > 0 and (status.get('inline_test_count', 0) > 0 or status.get('external_tests', False)):
            return 'working'

        return 'unknown'

    def _check_external_tests(self, test_file: Path) -> str:
        """Check if external tests pass (used only when --test-status is specified)."""
        try:
            result = subprocess.run(
                [sys.executable, "-m", "pytest", str(test_file), "-q", "--tb=no"],
                capture_output=True,
                text=True,
                timeout=30
            )

            if result.returncode == 0:
                return 'passing'
            else:
                return 'failing'

        except (subprocess.TimeoutExpired, FileNotFoundError):
            return 'error'
    def _format_status(self, status: str) -> str:
        """Format overall module status with appropriate emoji and color."""
        status_map = {
            'working': '✅',      # Has implementations and tests
            'no_tests': '🚧',     # Has implementations but no tests
            'empty': '📝',        # Has dev file but no implementations
            'not_started': '❌',  # No dev file
            'unknown': '❓'
        }
        return status_map.get(status, '❓')

    def _format_file_status(self, exists: bool, export_count: int) -> str:
        """Format dev file status showing if it has implementations."""
        if not exists:
            return "❌"
        if export_count == 0:
            return "📝"  # File exists but empty
        return f"✅({export_count})"  # File exists with implementations

    def _format_inline_tests(self, test_count: int) -> str:
        """Format inline test count."""
        if test_count == 0:
            return "❌"
        return f"✅({test_count})"

    def _format_external_tests(self, exists: bool, test_status: Optional[str] = None) -> str:
        """Format external test status."""
        if not exists:
            return "❌"
        if test_status == 'passing':
            return "✅"
        elif test_status == 'failing':
            return "🔴"
        elif test_status == 'error':
            return "⚠️"
        else:
            return "✅"  # Exists but not tested

    def _print_module_details(self, module_name: str, status: dict) -> None:
        """Print detailed information about a module."""
        console = self.console

        # Module header
        console.print(f"\n📦 {module_name.upper()}", style="bold cyan")
        console.print("-" * 40)

        # File structure
        files_table = Table(show_header=False, box=None, padding=(0, 2))
        files_table.add_column("File", style="dim")
        files_table.add_column("Status")

        dev_status = "✅ Found" if status['dev_file'] else "❌ Missing"
        if status['dev_file']:
            dev_status += f" ({status.get('export_count', 0)} exports, {status.get('inline_test_count', 0)} inline tests)"

        files_table.add_row(f"{module_name}.py", dev_status)
        files_table.add_row("tests/test_*.py", "✅ Found" if status['external_tests'] else "❌ Missing")
        files_table.add_row("README.md", "✅ Found" if status['readme'] else "❌ Missing")

        console.print(files_table)

        # Pytest commands
        if status['dev_file'] or status['external_tests']:
            console.print("\n[dim]💡 Test commands:[/dim]")
            if status['dev_file']:
                console.print(f"[dim] pytest modules/{module_name}/{module_name}.py -k test_[/dim]")
            if status['external_tests']:
                short_name = module_name[3:] if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))) else module_name
                console.print(f"[dim] pytest tests/test_{short_name}.py -v[/dim]")

    def _print_module_metadata(self, module_name: str, metadata: dict) -> None:
        """Print detailed metadata information about a module."""
        console = self.console

        # Module header
        title = metadata.get('title', module_name.title())
        console.print(f"\n📦 {title}", style="bold cyan")
        console.print("-" * (len(title) + 4))

        # Basic info
        if metadata.get('description'):
            console.print(f"📝 {metadata['description']}")

        # Export info (read from dev file - source of truth)
        module_path = Path(f"modules/{module_name}")
        export_target = self._get_export_target(module_path)
        if export_target not in ['not_found', 'no_export', 'read_error']:
            console.print(f"📦 Exports to: {export_target}")

        # Dependencies
        if metadata.get('dependencies'):
            deps = metadata['dependencies']
            console.print("\n🔗 Dependencies:")
            if deps.get('prerequisites'):
                console.print(f" Prerequisites: {', '.join(deps['prerequisites'])}")
            if deps.get('enables'):
                console.print(f" Enables: {', '.join(deps['enables'])}")

        # Components
        if metadata.get('components'):
            console.print("\n🧩 Components:")
            for component in metadata['components']:
                console.print(f" • {component}")

        # Files
        if metadata.get('files'):
            files = metadata['files']
            console.print("\n📁 Files:")
            if files.get('dev_file'):
                console.print(f" • Dev: {files['dev_file']}")
            if files.get('test_file'):
                console.print(f" • Test: {files['test_file']}")
            if files.get('readme'):
                console.print(f" • README: {files['readme']}")