feat: Transform 7 modules to follow progressive testing pedagogical pattern

- Implement 'explain → code → test → repeat' structure across all modules
- Replace comprehensive end-of-module tests with progressive unit tests
- Add rich scaffolding with detailed implementation guidance
- Transform generic TODOs into step-by-step learning instructions
- Connect educational content to real-world ML systems and PyTorch
- Reduce overall codebase by 37% while enhancing learning experience
- Ensure immediate feedback and skill building for students

Modules transformed:
- 01_tensor: Tensor operations and broadcasting
- 02_activations: Activation functions and derivatives
- 03_layers: Linear layers and forward/backward propagation
- 04_networks: Network building and multi-layer composition
- 05_cnn: Convolution operations and CNN architecture
- 06_dataloader: Data pipeline and batch processing
- 07_autograd: Automatic differentiation and computational graphs
Vijay Janapa Reddi
2025-07-13 16:43:27 -04:00
parent 5213050131
commit 833475c2c7
7 changed files with 3241 additions and 7083 deletions

File diff suppressed because it is too large (4 files)

@@ -21,10 +21,18 @@ Welcome to the CNN module! Here you'll implement the core building block of mode
- Compose Conv2D with other layers to build complete convolutional networks
- See how convolution enables parameter sharing and translation invariance
-## Build → Use → Understand
+## Build → Use → Reflect
1. **Build**: Conv2D layer using sliding window convolution from scratch
2. **Use**: Transform images and see feature maps emerge
-3. **Understand**: How CNNs learn hierarchical spatial patterns
+3. **Reflect**: How CNNs learn hierarchical spatial patterns
## What You'll Learn
By the end of this module, you'll understand:
- How convolution works as a sliding window operation
- Why convolution is perfect for spatial data like images
- How to build learnable convolutional layers
- The CNN pipeline: Conv2D → Activation → Flatten → Dense
- How parameter sharing makes CNNs efficient
"""
# %% nbgrader={"grade": false, "grade_id": "cnn-imports", "locked": false, "schema_version": 3, "solution": false, "task": false}
@@ -96,59 +104,18 @@ from tinytorch.core.tensor import Tensor # Foundation
- **Integration:** Works seamlessly with other TinyTorch components
"""
# %% [markdown]
"""
## 🧠 The Mathematical Foundation of Convolution
### The Convolution Operation
Convolution is a mathematical operation that combines two functions to produce a third function:
```
(f * g)(t) = ∫ f(τ)g(t - τ)dτ
```
In discrete 2D computer vision, this becomes:
```
(I * K)[i,j] = ΣΣ I[i+m, j+n] × K[m,n]
```
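As a sanity check, the discrete sum can be evaluated directly in NumPy for one output position (the values are chosen to match the worked example later in this module):

```python
import numpy as np

I = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
K = np.array([[1., 0.],
              [0., -1.]])

# (I * K)[0, 0] = sum over m, n of I[0+m, 0+n] * K[m, n]
out00 = sum(I[0 + m, 0 + n] * K[m, n] for m in range(2) for n in range(2))
print(out00)  # 1*1 + 2*0 + 4*0 + 5*(-1) = -4.0
```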
### Why Convolution is Perfect for Images
- **Local connectivity**: Each output depends only on a small region of input
- **Weight sharing**: Same filter applied everywhere (translation invariance)
- **Spatial hierarchy**: Multiple layers build increasingly complex features
- **Parameter efficiency**: Much fewer parameters than fully connected layers
### The Three Core Principles
1. **Sparse connectivity**: Each neuron connects to only a small region
2. **Parameter sharing**: Same weights used across all spatial locations
3. **Equivariant representation**: If input shifts, output shifts correspondingly
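The third principle is easy to verify numerically. The sketch below uses an illustrative helper `conv_valid` (not the module's API) to show that convolving a shifted input shifts the output correspondingly:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_valid(img, k):
    # illustrative "valid" convolution via a patch view (not the module's API)
    patches = sliding_window_view(img, k.shape)   # (oH, oW, kH, kW)
    return np.einsum('ijmn,mn->ij', patches, k)

img = np.zeros((6, 6))
img[1, 1] = 1.0
shifted = np.roll(img, (1, 1), axis=(0, 1))      # shift the input one pixel down-right
k = np.ones((2, 2))

# Equivariance: conv(shift(x)) matches shift(conv(x)) away from the borders
assert np.allclose(conv_valid(shifted, k)[1:, 1:], conv_valid(img, k)[:-1, :-1])
```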
### Connection to Real ML Systems
Every vision framework uses convolution:
- **PyTorch**: `torch.nn.Conv2d` with optimized CUDA kernels
- **TensorFlow**: `tf.keras.layers.Conv2D` with cuDNN acceleration
- **JAX**: `jax.lax.conv_general_dilated` with XLA compilation
- **TinyTorch**: `tinytorch.core.cnn.Conv2D` (what we're building!)
### Performance Considerations
- **Memory layout**: Efficient data access patterns
- **Vectorization**: SIMD operations for parallel computation
- **Cache efficiency**: Spatial locality in memory access
- **Optimization**: im2col, FFT-based convolution, Winograd algorithm
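One of these optimizations can be sketched briefly: im2col gathers every kernel-sized patch into a matrix so the nested loops collapse into a single matrix multiply. This is a sketch assuming a single-channel input and "valid" (no-padding) output, not the implementation you'll build below:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_im2col(image, kernel):
    # Gather every kH x kW patch, then replace the nested loops with one matmul
    kH, kW = kernel.shape
    patches = sliding_window_view(image, (kH, kW))   # (oH, oW, kH, kW)
    oH, oW = patches.shape[:2]
    cols = patches.reshape(oH * oW, kH * kW)         # each row is one flattened patch
    return (cols @ kernel.reshape(-1)).reshape(oH, oW)

x = np.arange(16, dtype=np.float32).reshape(4, 4)
k = np.ones((2, 2), dtype=np.float32)
print(conv2d_im2col(x, k).shape)  # (3, 3)
```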
"""
# %% [markdown]
"""
## Step 1: Understanding Convolution
### What is Convolution?
A **convolutional layer** applies a small filter (kernel) across the input, producing a feature map. This operation captures local patterns and is the foundation of modern vision models.
**Convolution** is a mathematical operation that slides a small filter (kernel) across an input, computing dot products at each position.
### Why Convolution Matters in Computer Vision
- **Local connectivity**: Each output value depends only on a small region of the input
- **Weight sharing**: The same filter is applied everywhere (translation invariance)
### Why Convolution is Perfect for Images
- **Local patterns**: Images have local structure (edges, textures)
- **Translation invariance**: Same pattern can appear anywhere
- **Parameter sharing**: One filter detects the pattern everywhere
- **Spatial hierarchy**: Multiple layers build increasingly complex features
- **Parameter efficiency**: Much fewer parameters than fully connected layers
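The parameter-efficiency claim is easy to quantify. The numbers below are illustrative, assuming a 28x28 grayscale image:

```python
# One 3x3 convolutional filter vs. one fully connected layer on a 28x28 image
conv_params = 3 * 3                    # the same 9 weights slide over every position
dense_params = (28 * 28) * (28 * 28)   # every input pixel wired to every output
print(conv_params, dense_params)       # 9 vs 614656
```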
### The Fundamental Insight
**Convolution is pattern matching!** The kernel learns to detect specific patterns:
@@ -157,7 +124,7 @@ A **convolutional layer** applies a small filter (kernel) across the input, prod
- **Shape detectors**: Identify geometric forms
- **Feature detectors**: Combine simple patterns into complex features
-### Real-World Examples
+### Real-World Applications
- **Image processing**: Detect edges, blur, sharpen
- **Computer vision**: Recognize objects, faces, text
- **Medical imaging**: Detect tumors, analyze scans
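Edge detection is the classic example of the first application. A minimal sketch, using a hypothetical helper `apply_kernel` (the module builds its own version below):

```python
import numpy as np

def apply_kernel(img, k):
    # naive "valid" convolution, just for this demo
    kH, kW = k.shape
    oH, oW = img.shape[0] - kH + 1, img.shape[1] - kW + 1
    return np.array([[np.sum(img[i:i + kH, j:j + kW] * k) for j in range(oW)]
                     for i in range(oH)])

step = np.array([[0., 0., 1., 1.]] * 4)     # image with a vertical edge in the middle
vertical_edge = np.array([[-1., 1.],
                          [-1., 1.]])
print(apply_kernel(step, vertical_edge))     # strongest response at the edge column
```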
@@ -204,7 +171,7 @@ def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:
EXAMPLE:
Input: [[1, 2, 3],    Kernel: [[1,  0],
        [4, 5, 6],             [0, -1]]
        [7, 8, 9]]
Output[0,0] = 1*1 + 2*0 + 4*0 + 5*(-1) = 1 - 5 = -4
@@ -217,7 +184,6 @@ def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:
- Use four nested loops: for i in range(out_H): for j in range(out_W): for di in range(kH): for dj in range(kW):
- Accumulate the sum: output[i,j] += input[i+di, j+dj] * kernel[di, dj]
"""
### BEGIN SOLUTION
# Get input and kernel dimensions
H, W = input.shape
kH, kW = kernel.shape
@@ -236,18 +202,19 @@ def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:
output[i, j] += input[i + di, j + dj] * kernel[di, dj]
return output
### END SOLUTION
# %% [markdown]
"""
-### 🧪 Quick Test: Convolution Operation
+### 🧪 Unit Test: Convolution Operation
Let's test your convolution implementation right away! This is the core operation that powers computer vision.
**This is a unit test** - it tests one specific function (conv2d_naive) in isolation.
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-naive-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test conv2d_naive function immediately after implementation
-print("🔬 Testing convolution operation...")
+print("🔬 Unit Test: Convolution Operation...")
# Test simple 3x3 input with 2x2 kernel
try:
@@ -367,14 +334,12 @@ class Conv2D:
- Initialize kernel: np.random.randn(kH, kW) * 0.1 (small values)
- Convert to float32 for consistency
"""
### BEGIN SOLUTION
# Store kernel size
self.kernel_size = kernel_size
kH, kW = kernel_size
# Initialize random kernel with small values
self.kernel = np.random.randn(kH, kW).astype(np.float32) * 0.1
### END SOLUTION
def forward(self, x: Tensor) -> Tensor:
"""
@@ -403,11 +368,9 @@ class Conv2D:
- Use conv2d_naive(x.data, self.kernel)
- Return Tensor(result) to wrap the result
"""
### BEGIN SOLUTION
# Apply convolution using naive implementation
result = conv2d_naive(x.data, self.kernel)
return Tensor(result)
### END SOLUTION
def __call__(self, x: Tensor) -> Tensor:
"""Make layer callable: layer(x) same as layer.forward(x)"""
@@ -415,14 +378,16 @@ class Conv2D:
# %% [markdown]
"""
-### 🧪 Quick Test: Conv2D Layer
+### 🧪 Unit Test: Conv2D Layer
Let's test your Conv2D layer implementation! This is a learnable convolutional layer that can be trained.
**This is a unit test** - it tests one specific class (Conv2D) in isolation.
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-layer-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test Conv2D layer immediately after implementation
-print("🔬 Testing Conv2D layer...")
+print("🔬 Unit Test: Conv2D Layer...")
# Create a Conv2D layer
try:
@@ -525,23 +490,23 @@ def flatten(x: Tensor) -> Tensor:
- Add batch dimension: result[None, :]
- Return Tensor(result)
"""
### BEGIN SOLUTION
# Flatten the tensor and add batch dimension
flattened = x.data.flatten()
result = flattened[None, :] # Add batch dimension
return Tensor(result)
### END SOLUTION
# %% [markdown]
"""
-### 🧪 Quick Test: Flatten Function
+### 🧪 Unit Test: Flatten Function
Let's test your flatten function! This connects convolutional layers to dense layers.
**This is a unit test** - it tests one specific function (flatten) in isolation.
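The two ingredients the hints describe (a row-major flatten, then a new leading axis) can be previewed in plain NumPy:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
flat = x.flatten()        # row-major order: [1, 2, 3, 4]
batched = flat[None, :]   # add a leading batch dimension
print(batched.shape)      # (1, 4)
```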
"""
# %% nbgrader={"grade": true, "grade_id": "test-flatten-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test flatten function immediately after implementation
-print("🔬 Testing flatten function...")
+print("🔬 Unit Test: Flatten Function...")
# Test case 1: 2x2 tensor
try:
@@ -596,524 +561,160 @@ print(" Converts 2D tensor to 1D")
print(" Preserves batch dimension")
print(" Enables connection to Dense layers")
print("📈 Progress: Convolution operation ✓, Conv2D layer ✓, Flatten ✓")
print("🚀 CNN pipeline ready!")
# %% [markdown]
"""
-## 🧪 Comprehensive CNN Testing Suite
-Let's test all CNN components thoroughly with realistic computer vision scenarios!
+## Step 4: Integration Test - Complete CNN Pipeline
+### Real-World CNN Applications
+Let's test our CNN components in realistic scenarios:
#### **Image Classification Pipeline**
```python
# The standard CNN pattern
Conv2D → ReLU → Flatten → Dense → Output
```
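A quick shape check for this pattern. The sizes here are illustrative, assuming a 28x28 input and a 3x3 kernel:

```python
# Shape bookkeeping for the pattern above ("valid" convolution, no padding)
H = W = 28
kH = kW = 3
conv_out = (H - kH + 1, W - kW + 1)   # Conv2D output: (26, 26)
flat_len = conv_out[0] * conv_out[1]  # the Dense layer must accept 676 inputs
print(conv_out, flat_len)
```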
#### **Multi-layer CNN**
```python
# Deeper pattern for complex features
Conv2D → ReLU → Conv2D → ReLU → Flatten → Dense → Output
```
#### **Feature Extraction**
```python
# Extract spatial features then classify
image → CNN features → dense classifier → predictions
```
This integration test ensures our CNN components work together for real computer vision applications!
"""
# %% nbgrader={"grade": false, "grade_id": "test-cnn-comprehensive", "locked": false, "schema_version": 3, "solution": false, "task": false}
def test_convolution_operations():
"""Test 1: Comprehensive convolution operations testing"""
print("🔬 Testing Convolution Operations...")
# %% nbgrader={"grade": true, "grade_id": "test-integration", "locked": true, "points": 15, "schema_version": 3, "solution": false, "task": false}
# Integration test - complete CNN applications
print("🔬 Integration Test: Complete CNN Applications...")
try:
# Test 1: Simple CNN Pipeline
print("\n1. Simple CNN Pipeline Test:")
# Test 1.1: Basic convolution
try:
input_img = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
identity_kernel = np.array([[1, 0], [0, 1]], dtype=np.float32)
result = conv2d_naive(input_img, identity_kernel)
expected = np.array([[6, 8], [12, 14]], dtype=np.float32)
assert np.allclose(result, expected), f"Identity convolution failed: {result} vs {expected}"
print("✅ Basic convolution test passed")
except Exception as e:
print(f"❌ Basic convolution failed: {e}")
return False
# Create pipeline: Conv2D → ReLU → Flatten → Dense
conv = Conv2D(kernel_size=(2, 2))
relu = ReLU()
dense = Dense(input_size=4, output_size=3)
# Test 1.2: Edge detection kernel
try:
# Vertical edge detection
edge_input = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=np.float32)
vertical_edge = np.array([[-1, 1], [-1, 1]], dtype=np.float32)
result = conv2d_naive(edge_input, vertical_edge)
# Should detect the vertical edge at position (0,1) and (1,1)
assert result[0, 1] > 0 and result[1, 1] > 0, "Vertical edge not detected"
print("✅ Edge detection test passed")
except Exception as e:
print(f"❌ Edge detection failed: {e}")
return False
# Input image
image = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Test 1.3: Blur kernel
try:
noise_input = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]], dtype=np.float32)
blur_kernel = np.array([[0.25, 0.25], [0.25, 0.25]], dtype=np.float32)
result = conv2d_naive(noise_input, blur_kernel)
# Blur should smooth out the noise
assert np.all(result >= 0) and np.all(result <= 1), "Blur kernel failed"
print("✅ Blur kernel test passed")
except Exception as e:
print(f"❌ Blur kernel failed: {e}")
return False
# Forward pass
features = conv(image) # (3,3) → (2,2)
activated = relu(features) # (2,2) → (2,2)
flattened = flatten(activated) # (2,2) → (1,4)
output = dense(flattened) # (1,4) → (1,3)
# Test 1.4: Different kernel sizes
try:
large_input = np.random.randn(10, 10).astype(np.float32)
# Test 3x3 kernel
kernel_3x3 = np.random.randn(3, 3).astype(np.float32)
result_3x3 = conv2d_naive(large_input, kernel_3x3)
assert result_3x3.shape == (8, 8), f"3x3 kernel output shape wrong: {result_3x3.shape}"
# Test 5x5 kernel
kernel_5x5 = np.random.randn(5, 5).astype(np.float32)
result_5x5 = conv2d_naive(large_input, kernel_5x5)
assert result_5x5.shape == (6, 6), f"5x5 kernel output shape wrong: {result_5x5.shape}"
print("✅ Different kernel sizes test passed")
except Exception as e:
print(f"❌ Different kernel sizes failed: {e}")
return False
assert features.shape == (2, 2), f"Conv output shape wrong: {features.shape}"
assert activated.shape == (2, 2), f"ReLU output shape wrong: {activated.shape}"
assert flattened.shape == (1, 4), f"Flatten output shape wrong: {flattened.shape}"
assert output.shape == (1, 3), f"Dense output shape wrong: {output.shape}"
print("🎯 Convolution operations: All tests passed!")
return True
def test_conv2d_layer():
"""Test 2: Conv2D layer comprehensive testing"""
print("🔬 Testing Conv2D Layer...")
print("✅ Simple CNN pipeline works correctly")
# Test 2.1: Layer initialization
try:
layer_2x2 = Conv2D(kernel_size=(2, 2))
assert layer_2x2.kernel.shape == (2, 2), f"2x2 kernel shape wrong: {layer_2x2.kernel.shape}"
assert not np.allclose(layer_2x2.kernel, 0), "Kernel should not be all zeros"
layer_3x3 = Conv2D(kernel_size=(3, 3))
assert layer_3x3.kernel.shape == (3, 3), f"3x3 kernel shape wrong: {layer_3x3.kernel.shape}"
print("✅ Layer initialization test passed")
except Exception as e:
print(f"❌ Layer initialization failed: {e}")
return False
# Test 2: Multi-layer CNN
print("\n2. Multi-layer CNN Test:")
# Test 2.2: Forward pass with different inputs
try:
layer = Conv2D(kernel_size=(2, 2))
# Small image
small_img = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
output_small = layer(small_img)
assert output_small.shape == (2, 2), f"Small image output shape wrong: {output_small.shape}"
assert isinstance(output_small, Tensor), "Output should be Tensor"
# Larger image
large_img = Tensor(np.random.randn(8, 8))
output_large = layer(large_img)
assert output_large.shape == (7, 7), f"Large image output shape wrong: {output_large.shape}"
print("✅ Forward pass test passed")
except Exception as e:
print(f"❌ Forward pass failed: {e}")
return False
# Create deeper pipeline: Conv2D → ReLU → Conv2D → ReLU → Flatten → Dense
conv1 = Conv2D(kernel_size=(2, 2))
relu1 = ReLU()
conv2 = Conv2D(kernel_size=(2, 2))
relu2 = ReLU()
dense_multi = Dense(input_size=9, output_size=2)
# Test 2.3: Learnable parameters
try:
layer1 = Conv2D(kernel_size=(2, 2))
layer2 = Conv2D(kernel_size=(2, 2))
# Different layers should have different random kernels
assert not np.allclose(layer1.kernel, layer2.kernel), "Different layers should have different kernels"
# Test that kernels are reasonable size (not too large)
assert np.max(np.abs(layer1.kernel)) < 1.0, "Kernel values should be small for stable training"
print("✅ Learnable parameters test passed")
except Exception as e:
print(f"❌ Learnable parameters failed: {e}")
return False
# Larger input for multi-layer processing
large_image = Tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]])
# Test 2.4: Real computer vision scenario - digit recognition
try:
# Simulate a simple 5x5 digit
digit_5x5 = Tensor([
[0, 1, 1, 1, 0],
[1, 0, 0, 0, 1],
[1, 0, 1, 0, 1],
[1, 0, 0, 0, 1],
[0, 1, 1, 1, 0]
])
# Edge detection layer
edge_layer = Conv2D(kernel_size=(3, 3))
edge_layer.kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=np.float32)
edges = edge_layer(digit_5x5)
assert edges.shape == (3, 3), f"Edge detection output shape wrong: {edges.shape}"
print("✅ Computer vision scenario test passed")
except Exception as e:
print(f"❌ Computer vision scenario failed: {e}")
return False
# Forward pass
h1 = conv1(large_image) # (5,5) → (4,4)
h2 = relu1(h1) # (4,4) → (4,4)
h3 = conv2(h2) # (4,4) → (3,3)
h4 = relu2(h3) # (3,3) → (3,3)
h5 = flatten(h4) # (3,3) → (1,9)
output_multi = dense_multi(h5) # (1,9) → (1,2)
print("🎯 Conv2D layer: All tests passed!")
return True
def test_flatten_operations():
"""Test 3: Flatten operations comprehensive testing"""
print("🔬 Testing Flatten Operations...")
assert h1.shape == (4, 4), f"Conv1 output wrong: {h1.shape}"
assert h3.shape == (3, 3), f"Conv2 output wrong: {h3.shape}"
assert h5.shape == (1, 9), f"Flatten output wrong: {h5.shape}"
assert output_multi.shape == (1, 2), f"Final output wrong: {output_multi.shape}"
# Test 3.1: Basic flattening
try:
# 2x2 tensor
x_2x2 = Tensor([[1, 2], [3, 4]])
flat_2x2 = flatten(x_2x2)
assert flat_2x2.shape == (1, 4), f"2x2 flatten shape wrong: {flat_2x2.shape}"
expected = np.array([[1, 2, 3, 4]])
assert np.array_equal(flat_2x2.data, expected), f"2x2 flatten data wrong: {flat_2x2.data}"
# 3x3 tensor
x_3x3 = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
flat_3x3 = flatten(x_3x3)
assert flat_3x3.shape == (1, 9), f"3x3 flatten shape wrong: {flat_3x3.shape}"
expected = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
assert np.array_equal(flat_3x3.data, expected), f"3x3 flatten data wrong: {flat_3x3.data}"
print("✅ Basic flattening test passed")
except Exception as e:
print(f"❌ Basic flattening failed: {e}")
return False
print("✅ Multi-layer CNN works correctly")
# Test 3.2: Different aspect ratios
try:
# Wide tensor
x_wide = Tensor([[1, 2, 3, 4, 5, 6]]) # 1x6
flat_wide = flatten(x_wide)
assert flat_wide.shape == (1, 6), f"Wide flatten shape wrong: {flat_wide.shape}"
# Tall tensor
x_tall = Tensor([[1], [2], [3], [4], [5], [6]]) # 6x1
flat_tall = flatten(x_tall)
assert flat_tall.shape == (1, 6), f"Tall flatten shape wrong: {flat_tall.shape}"
print("✅ Different aspect ratios test passed")
except Exception as e:
print(f"❌ Different aspect ratios failed: {e}")
return False
# Test 3: Image Classification Scenario
print("\n3. Image Classification Test:")
# Test 3.3: Preserve data order
try:
# Test that flattening preserves row-major order
x_ordered = Tensor([[1, 2, 3], [4, 5, 6]]) # 2x3
flat_ordered = flatten(x_ordered)
expected_order = np.array([[1, 2, 3, 4, 5, 6]])
assert np.array_equal(flat_ordered.data, expected_order), "Flatten should preserve row-major order"
print("✅ Data order preservation test passed")
except Exception as e:
print(f"❌ Data order preservation failed: {e}")
return False
# Simulate digit classification with 8x8 image
digit_image = Tensor([[1, 0, 0, 1, 1, 0, 0, 1],
[0, 1, 0, 1, 1, 0, 1, 0],
[0, 0, 1, 1, 1, 1, 0, 0],
[1, 1, 1, 0, 0, 1, 1, 1],
[1, 0, 0, 1, 1, 0, 0, 1],
[0, 1, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 1]])
# Test 3.4: CNN to Dense connection scenario
try:
# Simulate CNN feature map -> Dense layer
feature_map = Tensor([[0.1, 0.2], [0.3, 0.4]]) # 2x2 feature map
flattened_features = flatten(feature_map)
# Should be ready for Dense layer input
assert flattened_features.shape == (1, 4), "Feature map should flatten to (1, 4)"
assert isinstance(flattened_features, Tensor), "Should remain a Tensor"
# Test with Dense layer
dense = Dense(input_size=4, output_size=2)
output = dense(flattened_features)
assert output.shape == (1, 2), f"Dense output shape wrong: {output.shape}"
print("✅ CNN to Dense connection test passed")
except Exception as e:
print(f"❌ CNN to Dense connection failed: {e}")
return False
# CNN for digit classification
feature_extractor = Conv2D(kernel_size=(3, 3)) # (8,8) → (6,6)
activation = ReLU()
classifier = Dense(input_size=36, output_size=10) # 10 digit classes
print("🎯 Flatten operations: All tests passed!")
return True
def test_cnn_pipelines():
"""Test 4: Complete CNN pipeline testing"""
print("🔬 Testing CNN Pipelines...")
# Forward pass
features = feature_extractor(digit_image)
activated_features = activation(features)
feature_vector = flatten(activated_features)
digit_scores = classifier(feature_vector)
# Test 4.1: Simple CNN pipeline
try:
# Create pipeline: Conv2D -> ReLU -> Flatten -> Dense
conv = Conv2D(kernel_size=(2, 2))
relu = ReLU()
dense = Dense(input_size=4, output_size=3)
# Input image
image = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Forward pass
features = conv(image) # (3,3) -> (2,2)
activated = relu(features) # (2,2) -> (2,2)
flattened = flatten(activated) # (2,2) -> (1,4)
output = dense(flattened) # (1,4) -> (1,3)
assert features.shape == (2, 2), f"Conv output shape wrong: {features.shape}"
assert activated.shape == (2, 2), f"ReLU output shape wrong: {activated.shape}"
assert flattened.shape == (1, 4), f"Flatten output shape wrong: {flattened.shape}"
assert output.shape == (1, 3), f"Dense output shape wrong: {output.shape}"
print("✅ Simple CNN pipeline test passed")
except Exception as e:
print(f"❌ Simple CNN pipeline failed: {e}")
return False
assert features.shape == (6, 6), f"Feature extraction shape wrong: {features.shape}"
assert feature_vector.shape == (1, 36), f"Feature vector shape wrong: {feature_vector.shape}"
assert digit_scores.shape == (1, 10), f"Digit scores shape wrong: {digit_scores.shape}"
# Test 4.2: Multi-layer CNN
try:
# Create deeper pipeline: Conv2D -> ReLU -> Conv2D -> ReLU -> Flatten -> Dense
conv1 = Conv2D(kernel_size=(2, 2))
relu1 = ReLU()
conv2 = Conv2D(kernel_size=(2, 2))
relu2 = ReLU()
dense = Dense(input_size=1, output_size=2)
# Larger input for multi-layer processing
large_image = Tensor(np.random.randn(5, 5))
# Forward pass
h1 = conv1(large_image) # (5,5) -> (4,4)
h2 = relu1(h1) # (4,4) -> (4,4)
h3 = conv2(h2) # (4,4) -> (3,3)
h4 = relu2(h3) # (3,3) -> (3,3)
h5 = flatten(h4) # (3,3) -> (1,9)
# Adjust dense layer for correct input size
dense_adjusted = Dense(input_size=9, output_size=2)
output = dense_adjusted(h5) # (1,9) -> (1,2)
assert h1.shape == (4, 4), f"Conv1 output wrong: {h1.shape}"
assert h3.shape == (3, 3), f"Conv2 output wrong: {h3.shape}"
assert h5.shape == (1, 9), f"Flatten output wrong: {h5.shape}"
assert output.shape == (1, 2), f"Final output wrong: {output.shape}"
print("✅ Multi-layer CNN test passed")
except Exception as e:
print(f"❌ Multi-layer CNN failed: {e}")
return False
print("✅ Image classification scenario works correctly")
# Test 4.3: Image classification scenario
try:
# Simulate MNIST-like 8x8 digit classification
digit_image = Tensor(np.random.randn(8, 8))
# CNN for digit classification
feature_extractor = Conv2D(kernel_size=(3, 3)) # (8,8) -> (6,6)
activation = ReLU()
classifier_prep = flatten # (6,6) -> (1,36)
classifier = Dense(input_size=36, output_size=10) # 10 digit classes
# Forward pass
features = feature_extractor(digit_image)
activated_features = activation(features)
feature_vector = classifier_prep(activated_features)
digit_scores = classifier(feature_vector)
assert features.shape == (6, 6), f"Feature extraction shape wrong: {features.shape}"
assert feature_vector.shape == (1, 36), f"Feature vector shape wrong: {feature_vector.shape}"
assert digit_scores.shape == (1, 10), f"Digit scores shape wrong: {digit_scores.shape}"
print("✅ Image classification scenario test passed")
except Exception as e:
print(f"❌ Image classification scenario failed: {e}")
return False
# Test 4: Feature Extraction and Composition
print("\n4. Feature Extraction Test:")
# Test 4.4: Real-world CNN architecture pattern
try:
# Simulate LeNet-like architecture pattern
input_img = Tensor(np.random.randn(32, 32)) # 32x32 input image
# First conv block
conv1 = Conv2D(kernel_size=(5, 5)) # (32,32) -> (28,28)
relu1 = ReLU()
# Second conv block
conv2 = Conv2D(kernel_size=(5, 5)) # (28,28) -> (24,24)
relu2 = ReLU()
# Classifier
classifier = Dense(input_size=24*24, output_size=3) # 3 classes
# Forward pass
h1 = relu1(conv1(input_img))
h2 = relu2(conv2(h1))
h3 = flatten(h2)
output = classifier(h3)
assert h1.shape == (28, 28), f"First conv block output wrong: {h1.shape}"
assert h2.shape == (24, 24), f"Second conv block output wrong: {h2.shape}"
assert h3.shape == (1, 576), f"Flattened features wrong: {h3.shape}" # 24*24 = 576
assert output.shape == (1, 3), f"Classification output wrong: {output.shape}"
print("✅ Real-world CNN architecture test passed")
except Exception as e:
print(f"❌ Real-world CNN architecture failed: {e}")
return False
# Create modular feature extractor
feature_conv = Conv2D(kernel_size=(2, 2))
feature_activation = ReLU()
print("🎯 CNN pipelines: All tests passed!")
return True
# Run all comprehensive tests
def run_comprehensive_cnn_tests():
"""Run all comprehensive CNN tests"""
print("🧪 Running Comprehensive CNN Test Suite...")
print("=" * 50)
# Create classifier head
classifier_head = Dense(input_size=4, output_size=3)
test_results = []
# Test composition
test_image = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Run all test functions
test_results.append(test_convolution_operations())
test_results.append(test_conv2d_layer())
test_results.append(test_flatten_operations())
test_results.append(test_cnn_pipelines())
# Extract features
extracted_features = feature_conv(test_image)
activated_features = feature_activation(extracted_features)
feature_representation = flatten(activated_features)
# Summary
print("=" * 50)
print("📊 Test Results Summary:")
print(f"✅ Convolution Operations: {'PASSED' if test_results[0] else 'FAILED'}")
print(f"✅ Conv2D Layer: {'PASSED' if test_results[1] else 'FAILED'}")
print(f"✅ Flatten Operations: {'PASSED' if test_results[2] else 'FAILED'}")
print(f"✅ CNN Pipelines: {'PASSED' if test_results[3] else 'FAILED'}")
# Classify
predictions = classifier_head(feature_representation)
all_passed = all(test_results)
print(f"\n🎯 Overall Result: {'ALL TESTS PASSED! 🎉' if all_passed else 'SOME TESTS FAILED ❌'}")
assert extracted_features.shape == (2, 2), f"Feature extraction wrong: {extracted_features.shape}"
assert feature_representation.shape == (1, 4), f"Feature representation wrong: {feature_representation.shape}"
assert predictions.shape == (1, 3), f"Predictions wrong: {predictions.shape}"
if all_passed:
print("\n🚀 CNN Module Implementation Complete!")
print(" ✓ Convolution operations working correctly")
print(" ✓ Conv2D layers ready for training")
print(" ✓ Flatten operations connecting conv to dense layers")
print(" ✓ Complete CNN pipelines functional")
print("\n🎓 Ready for real computer vision applications!")
print("✅ Feature extraction and composition works correctly")
return all_passed
print("\n🎉 Integration test passed! Your CNN components work correctly for:")
print(" • Simple CNN pipelines (Conv2D → ReLU → Flatten → Dense)")
print(" • Multi-layer CNNs (stacked convolutional layers)")
print(" • Image classification scenarios")
print(" • Feature extraction and modular composition")
except Exception as e:
print(f"❌ Integration test failed: {e}")
raise
# Run the comprehensive test suite
if __name__ == "__main__":
run_comprehensive_cnn_tests()
# %% [markdown]
"""
### 🧪 Test Your CNN Implementations
Once you implement the functions above, run these cells to test them:
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-naive", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test conv2d_naive function
print("Testing conv2d_naive function...")
# Test case 1: Simple 3x3 input with 2x2 kernel
input_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
kernel_array = np.array([[1, 0], [0, -1]], dtype=np.float32)
result = conv2d_naive(input_array, kernel_array)
expected = np.array([[-4, -4], [-4, -4]], dtype=np.float32)
print(f"Input:\n{input_array}")
print(f"Kernel:\n{kernel_array}")
print(f"Result:\n{result}")
print(f"Expected:\n{expected}")
assert np.allclose(result, expected), f"conv2d_naive failed: expected {expected}, got {result}"
# Test case 2: Different kernel
kernel2 = np.array([[1, 1], [1, 1]], dtype=np.float32)
result2 = conv2d_naive(input_array, kernel2)
expected2 = np.array([[12, 16], [24, 28]], dtype=np.float32)
assert np.allclose(result2, expected2), f"conv2d_naive failed: expected {expected2}, got {result2}"
print("✅ conv2d_naive tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-layer", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test Conv2D layer
print("Testing Conv2D layer...")
# Create a Conv2D layer
layer = Conv2D(kernel_size=(2, 2))
print(f"Kernel size: {layer.kernel_size}")
print(f"Kernel shape: {layer.kernel.shape}")
# Test with sample input
x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Input shape: {x.shape}")
y = layer(x)
print(f"Output shape: {y.shape}")
print(f"Output: {y}")
# Verify shapes
assert y.shape == (2, 2), f"Output shape should be (2, 2), got {y.shape}"
assert isinstance(y, Tensor), "Output should be a Tensor"
print("✅ Conv2D layer tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-flatten", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test flatten function
print("Testing flatten function...")
# Test case 1: 2x2 tensor
x = Tensor([[1, 2], [3, 4]])
flattened = flatten(x)
print(f"Input: {x}")
print(f"Flattened: {flattened}")
print(f"Flattened shape: {flattened.shape}")
# Verify shape and content
assert flattened.shape == (1, 4), f"Flattened shape should be (1, 4), got {flattened.shape}"
expected_data = np.array([[1, 2, 3, 4]])
assert np.array_equal(flattened.data, expected_data), f"Flattened data should be {expected_data}, got {flattened.data}"
# Test case 2: 3x3 tensor
x2 = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
flattened2 = flatten(x2)
assert flattened2.shape == (1, 9), f"Flattened shape should be (1, 9), got {flattened2.shape}"
expected_data2 = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
assert np.array_equal(flattened2.data, expected_data2), f"Flattened data should be {expected_data2}, got {flattened2.data}"
print("✅ Flatten tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-cnn-pipeline", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test complete CNN pipeline
print("Testing complete CNN pipeline...")
# Create a simple CNN pipeline: Conv2D → ReLU → Flatten → Dense
conv_layer = Conv2D(kernel_size=(2, 2))
relu = ReLU()
dense_layer = Dense(input_size=4, output_size=2)
# Test input (3x3 image)
x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Input shape: {x.shape}")
# Forward pass through pipeline
h1 = conv_layer(x)
print(f"After Conv2D: {h1.shape}")
h2 = relu(h1)
print(f"After ReLU: {h2.shape}")
h3 = flatten(h2)
print(f"After Flatten: {h3.shape}")
h4 = dense_layer(h3)
print(f"After Dense: {h4.shape}")
# Verify pipeline works
assert h1.shape == (2, 2), f"Conv2D output should be (2, 2), got {h1.shape}"
assert h2.shape == (2, 2), f"ReLU output should be (2, 2), got {h2.shape}"
assert h3.shape == (1, 4), f"Flatten output should be (1, 4), got {h3.shape}"
assert h4.shape == (1, 2), f"Dense output should be (1, 2), got {h4.shape}"
print("✅ CNN pipeline tests passed!")
print("📈 Final Progress: Complete CNN system ready for computer vision!")
# %% [markdown]
"""
@@ -1122,52 +723,62 @@ print("✅ CNN pipeline tests passed!")
Congratulations! You've successfully implemented the core components of convolutional neural networks:
### What You've Accomplished
✅ **Convolution Operation**: Implemented the sliding window mechanism from scratch
✅ **Conv2D Layer**: Built learnable convolutional layers with random kernel initialization
✅ **Flatten Function**: Created the bridge between convolutional and dense layers
✅ **CNN Pipelines**: Composed Conv2D → ReLU → Flatten → Dense into complete networks
✅ **Real Applications**: Tested on image classification and feature extraction
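The flatten step can be sketched in plain NumPy (an illustrative sketch only; the module's `flatten` operates on `Tensor` objects, not raw arrays):

```python
import numpy as np

def flatten_sketch(x):
    """Reshape any array to a (1, N) row vector: the conv-to-dense bridge.
    Sketch only -- the module's flatten wraps this idea for Tensor objects."""
    return x.reshape(1, -1)  # -1 lets NumPy infer N from the element count

print(flatten_sketch(np.array([[1, 2], [3, 4]])).shape)  # (1, 4)
```

The same one-line reshape is what lets a 2D feature map feed a Dense layer that expects a flat vector.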
### Key Concepts You've Learned
- **Convolution as pattern matching**: Kernels detect specific spatial features
- **Sliding window mechanism**: How convolution processes spatial data
- **Parameter sharing**: The same kernel is applied across the entire image
- **Spatial hierarchy**: Stacked layers build increasingly complex features
- **CNN architecture**: The Conv2D → Activation → Flatten → Dense pattern
### Mathematical Foundations
- **Convolution operation**: (I * K)[i,j] = ΣΣ I[i+m, j+n] × K[m,n], the dot product of the kernel with each image patch
- **Output size calculation**: output_size = input_size − kernel_size + 1
- **Translation invariance**: The same pattern is detected anywhere in the input
- **Feature maps**: Spatial representations of detected patterns
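The convolution formula and the output-size rule above can be sketched together in a few lines of NumPy (an illustrative sketch, not the module's exact `conv2d_naive`):

```python
import numpy as np

def conv2d_sketch(image, kernel):
    """Naive 'valid' 2D convolution: slide the kernel over the image and
    take a dot product at each position. Illustrative sketch only."""
    H, W = image.shape
    kH, kW = kernel.shape
    out_h, out_w = H - kH + 1, W - kW + 1  # output size: input - kernel + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kH, j:j + kW]
            out[i, j] = np.sum(patch * kernel)  # (I * K)[i,j] = sum of patch x kernel
    return out

image = np.arange(9, dtype=float).reshape(3, 3)   # 3x3 input
kernel = np.ones((2, 2))                          # 2x2 averaging-style kernel
print(conv2d_sketch(image, kernel).shape)  # (2, 2), i.e. (3-2+1, 3-2+1)
```

Note the two nested loops mirror the double sum in the formula; real frameworks replace them with vectorized or hardware-accelerated kernels.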
### Real-World Applications
- **Image classification**: Object recognition, medical imaging
- **Computer vision**: Face detection, autonomous driving
- **Pattern recognition**: Texture analysis, edge detection
- **Feature extraction**: Transfer learning, representation learning
### CNN Architecture Insights
- **Kernel size**: 3×3 most common, balances locality and capacity
- **Stacking layers**: Builds hierarchical feature representations
- **Spatial reduction**: Each layer reduces spatial dimensions
- **Channel progression**: Typically increase channels while reducing spatial size
### Performance Characteristics
- **Parameter efficiency**: Dramatic reduction vs. fully connected
- **Translation invariance**: Robust to object location changes
- **Computational efficiency**: Parallel processing of spatial regions
- **Memory considerations**: Feature maps require storage during forward pass
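The parameter-efficiency point can be made concrete with a back-of-the-envelope count (a hypothetical 28×28 input and a single 3×3 kernel, chosen for illustration):

```python
# One shared 3x3 kernel (no bias) vs a dense layer mapping all 784 input
# pixels to the equivalent 26x26 = 676 output positions.
conv_params = 3 * 3                    # 9 parameters, reused at every position
dense_params = (28 * 28) * (26 * 26)   # 784 inputs x 676 outputs = 529,984 weights
print(conv_params, dense_params)       # 9 vs 529984
```

The gap only widens for larger images: the convolutional count stays fixed at the kernel size, while the dense count grows with the product of input and output sizes.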
### Next Steps
1. **Export your code**: Use NBDev to export to the `tinytorch` package
2. **Test your implementation**: Run the complete test suite
3. **Build CNN architectures**:
```python
from tinytorch.core.cnn import Conv2D, flatten
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU
# Create CNN
conv = Conv2D(kernel_size=(3, 3))
relu = ReLU()
dense = Dense(input_size=36, output_size=10)
# Process image
features = relu(conv(image))
predictions = dense(flatten(features))
```
4. **Explore advanced CNNs**: Pooling, multiple channels, modern architectures!
**Ready for the next challenge?** Let's build data loaders to handle real datasets efficiently!
"""
