TinyTorch/modules/source/05_cnn/cnn_dev.py
Vijay Janapa Reddi 365e2ee394 feat: Add comprehensive intermediate testing across all TinyTorch modules
- Add 17 intermediate test points across 6 modules for immediate student feedback
- Tensor module: Tests after creation, properties, arithmetic, and operators
- Activations module: Tests after each activation function (ReLU, Sigmoid, Tanh, Softmax)
- Layers module: Tests after matrix multiplication and Dense layer implementation
- Networks module: Tests after Sequential class and MLP creation
- CNN module: Tests after convolution, Conv2D layer, and flatten operations
- DataLoader module: Tests after Dataset interface and DataLoader class
- All tests include visual progress indicators and behavioral explanations
- Maintains NBGrader compliance with proper metadata and point allocation
- Enables steady forward progress and better debugging for students
- 100% test success rate across all modules and integration testing
2025-07-12 18:28:35 -04:00

774 lines
28 KiB
Python
# ---
# jupyter:
# jupytext:
# text_representation:
# extension: .py
# format_name: percent
# format_version: '1.3'
# jupytext_version: 1.17.1
# ---
# %% [markdown]
"""
# Module 5: CNN - Convolutional Neural Networks
Welcome to the CNN module! Here you'll implement the core building block of modern computer vision: the convolutional layer.
## Learning Goals
- Understand the convolution operation and its importance in computer vision
- Implement Conv2D with explicit for-loops to understand the sliding window mechanism
- Build convolutional layers that can detect spatial patterns in images
- Compose Conv2D with other layers to build complete convolutional networks
- See how convolution enables parameter sharing and translation equivariance
## Build → Use → Understand
1. **Build**: Conv2D layer using sliding window convolution from scratch
2. **Use**: Transform images and see feature maps emerge
3. **Understand**: How CNNs learn hierarchical spatial patterns
"""
# %% nbgrader={"grade": false, "grade_id": "cnn-imports", "locked": false, "schema_version": 3, "solution": false, "task": false}
#| default_exp core.cnn
#| export
import numpy as np
import os
import sys
from typing import List, Tuple, Optional
import matplotlib.pyplot as plt
# Import from the main package - try package first, then local modules
try:
    from tinytorch.core.tensor import Tensor
    from tinytorch.core.layers import Dense
    from tinytorch.core.activations import ReLU
except ImportError:
    # For development, import from local modules
    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))
    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '02_activations'))
    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '03_layers'))
    from tensor_dev import Tensor
    from activations_dev import ReLU
    from layers_dev import Dense
# %% nbgrader={"grade": false, "grade_id": "cnn-setup", "locked": false, "schema_version": 3, "solution": false, "task": false}
#| hide
#| export
def _should_show_plots():
    """Check if we should show plots (disable during testing)"""
    # Check multiple conditions that indicate we're in test mode
    is_pytest = (
        'pytest' in sys.modules or
        'test' in sys.argv or
        os.environ.get('PYTEST_CURRENT_TEST') is not None or
        any('test' in arg for arg in sys.argv) or
        any('pytest' in arg for arg in sys.argv)
    )
    # Show plots in development mode (when not in test mode)
    return not is_pytest
# %% nbgrader={"grade": false, "grade_id": "cnn-welcome", "locked": false, "schema_version": 3, "solution": false, "task": false}
print("🔥 TinyTorch CNN Module")
print(f"NumPy version: {np.__version__}")
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
print("Ready to build convolutional neural networks!")
# %% [markdown]
"""
## 📦 Where This Code Lives in the Final Package
**Learning Side:** You work in `modules/source/05_cnn/cnn_dev.py`
**Building Side:** Code exports to `tinytorch.core.cnn`
```python
# Final package structure:
from tinytorch.core.cnn import Conv2D, conv2d_naive, flatten # CNN operations!
from tinytorch.core.layers import Dense # Fully connected layers
from tinytorch.core.activations import ReLU # Nonlinearity
from tinytorch.core.tensor import Tensor # Foundation
```
**Why this matters:**
- **Learning:** Focused modules for deep understanding of convolution
- **Production:** Proper organization like PyTorch's `torch.nn.Conv2d`
- **Consistency:** All CNN operations live together in `core.cnn`
- **Integration:** Works seamlessly with other TinyTorch components
"""
# %% [markdown]
"""
## 🧠 The Mathematical Foundation of Convolution
### The Convolution Operation
Convolution is a mathematical operation that combines two functions to produce a third function:
```
(f * g)(t) = ∫ f(τ)g(t - τ)dτ
```
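In discrete 2D computer vision, the kernel is usually applied without flipping (strictly a cross-correlation, which deep learning libraries still call convolution), so the operation becomes: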
```
(I * K)[i,j] = ΣΣ I[i+m, j+n] × K[m,n]
```
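The sum above indexes `I[i+m, j+n]`, so the kernel is applied as-is rather than mirrored the way `g(t - τ)` mirrors `g` in the integral. A minimal sketch of the difference, assuming a single-channel input with no stride or padding; the helper names `cross_correlate2d` and `convolve2d_valid` are illustrative and not part of TinyTorch:

```python
import numpy as np

def cross_correlate2d(image, kernel):
    # Sliding-window sum of elementwise products, kernel used as-is.
    kH, kW = kernel.shape
    out_h = image.shape[0] - kH + 1
    out_w = image.shape[1] - kW + 1
    out = np.zeros((out_h, out_w), dtype=np.result_type(image, kernel))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def convolve2d_valid(image, kernel):
    # Flipped-kernel convolution: mirror the kernel on both axes first.
    return cross_correlate2d(image, kernel[::-1, ::-1])

image = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
kernel = np.array([[1, 0], [0, -1]], dtype=np.float32)

corr = cross_correlate2d(image, kernel)  # every entry is -4
conv = convolve2d_valid(image, kernel)   # every entry is +4
```

For a learned kernel the distinction rarely matters, since training can learn the mirrored weights just as easily.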
### Why Convolution is Perfect for Images
- **Local connectivity**: Each output depends only on a small region of input
- **Weight sharing**: Same filter applied everywhere (translation equivariance)
- **Spatial hierarchy**: Multiple layers build increasingly complex features
- **Parameter efficiency**: Far fewer parameters than fully connected layers
### The Three Core Principles
1. **Sparse connectivity**: Each neuron connects to only a small region
2. **Parameter sharing**: Same weights used across all spatial locations
3. **Equivariant representation**: If input shifts, output shifts correspondingly
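The third principle can be checked numerically. The sketch below is only an illustration under simple assumptions (one channel, no padding, a random 6×6 input); `correlate_valid` is a hypothetical helper built on NumPy's `sliding_window_view`, not a TinyTorch function:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid(image, kernel):
    # windows has shape (out_H, out_W, kH, kW): one kernel-sized patch per output pixel
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum('ijmn,mn->ij', windows, kernel)

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))
kernel = rng.standard_normal((3, 3))

# Shift the input one column to the right (np.roll wraps the last column around)
shifted = np.roll(image, shift=1, axis=1)

out = correlate_valid(image, kernel)            # shape (4, 4)
out_shifted = correlate_valid(shifted, kernel)  # shape (4, 4)

# Away from the wrapped column, the output is the original output moved one column right
matches = np.allclose(out_shifted[:, 1:], out[:, :-1])
```

The comparison skips the first output column because that is where the wrapped-around pixels enter the window.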
### Connection to Real ML Systems
Every vision framework uses convolution:
- **PyTorch**: `torch.nn.Conv2d` with optimized CUDA kernels
- **TensorFlow**: `tf.keras.layers.Conv2D` with cuDNN acceleration
- **JAX**: `jax.lax.conv_general_dilated` with XLA compilation
- **TinyTorch**: `tinytorch.core.cnn.Conv2D` (what we're building!)
### Performance Considerations
- **Memory layout**: Efficient data access patterns
- **Vectorization**: SIMD operations for parallel computation
- **Cache efficiency**: Spatial locality in memory access
- **Optimization**: im2col, FFT-based convolution, Winograd algorithm
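
To make the im2col idea concrete, here is a rough sketch (not how any particular framework implements it): every kernel-sized window is copied into one row of a matrix, so the whole operation collapses into a single matrix-vector product. The names `im2col` and `conv2d_im2col` are illustrative only:

```python
import numpy as np

def im2col(image, kH, kW):
    # Copy each kH x kW window into one row of a 2D matrix.
    H, W = image.shape
    out_h, out_w = H - kH + 1, W - kW + 1
    cols = np.empty((out_h * out_w, kH * kW), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + kH, j:j + kW].ravel()
    return cols, (out_h, out_w)

def conv2d_im2col(image, kernel):
    # One matrix-vector product replaces the two inner kernel loops.
    cols, out_shape = im2col(image, *kernel.shape)
    return (cols @ kernel.ravel()).reshape(out_shape)

image = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
kernel = np.array([[1, 0], [0, -1]], dtype=np.float32)
result = conv2d_im2col(image, kernel)  # 2x2 array, every entry -4
```

The trade-off is memory: overlapping windows are duplicated in `cols` in exchange for handing the arithmetic to an optimized matrix multiply.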
"""
# %% [markdown]
"""
## Step 1: Understanding Convolution
### What is Convolution?
A **convolutional layer** applies a small filter (kernel) across the input, producing a feature map. This operation captures local patterns and is the foundation of modern vision models.
### Why Convolution Matters in Computer Vision
- **Local connectivity**: Each output value depends only on a small region of the input
- **Weight sharing**: The same filter is applied everywhere (translation equivariance)
- **Spatial hierarchy**: Multiple layers build increasingly complex features
- **Parameter efficiency**: Far fewer parameters than fully connected layers
### The Fundamental Insight
**Convolution is pattern matching!** The kernel learns to detect specific patterns:
- **Edge detectors**: Find boundaries between objects
- **Texture detectors**: Recognize surface patterns
- **Shape detectors**: Identify geometric forms
- **Feature detectors**: Combine simple patterns into complex features
### Real-World Examples
- **Image processing**: Detect edges, blur, sharpen
- **Computer vision**: Recognize objects, faces, text
- **Medical imaging**: Detect tumors, analyze scans
- **Autonomous driving**: Identify traffic signs, pedestrians
### Visual Intuition
```
Input Image:    Kernel:     Output Feature Map:
[1, 2, 3]       [1,  0]     [1*1+2*0+4*0+5*(-1), 2*1+3*0+5*0+6*(-1)]
[4, 5, 6]       [0, -1]     [4*1+5*0+7*0+8*(-1), 5*1+6*0+8*0+9*(-1)]
[7, 8, 9]
```
The kernel slides across the input, computing dot products at each position.
Let's implement this step by step!
"""
# %% nbgrader={"grade": false, "grade_id": "conv2d-naive", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """
    Naive 2D convolution (single channel, no stride, no padding).
    Args:
        input: 2D input array (H, W)
        kernel: 2D filter (kH, kW)
    Returns:
        2D output array (H-kH+1, W-kW+1)
    TODO: Implement the sliding window convolution using for-loops.
    APPROACH:
    1. Get input dimensions: H, W = input.shape
    2. Get kernel dimensions: kH, kW = kernel.shape
    3. Calculate output dimensions: out_H = H - kH + 1, out_W = W - kW + 1
    4. Create output array: np.zeros((out_H, out_W))
    5. Use nested loops to slide the kernel:
       - i loop: output rows (0 to out_H-1)
       - j loop: output columns (0 to out_W-1)
       - di loop: kernel rows (0 to kH-1)
       - dj loop: kernel columns (0 to kW-1)
    6. For each (i,j), compute: output[i,j] += input[i+di, j+dj] * kernel[di, dj]
    EXAMPLE:
    Input: [[1, 2, 3],     Kernel: [[1,  0],
            [4, 5, 6],              [0, -1]]
            [7, 8, 9]]
    Output[0,0] = 1*1 + 2*0 + 4*0 + 5*(-1) = 1 - 5 = -4
    Output[0,1] = 2*1 + 3*0 + 5*0 + 6*(-1) = 2 - 6 = -4
    Output[1,0] = 4*1 + 5*0 + 7*0 + 8*(-1) = 4 - 8 = -4
    Output[1,1] = 5*1 + 6*0 + 8*0 + 9*(-1) = 5 - 9 = -4
    HINTS:
    - Start with output = np.zeros((out_H, out_W))
    - Use four nested loops: for i in range(out_H): for j in range(out_W): for di in range(kH): for dj in range(kW):
    - Accumulate the sum: output[i,j] += input[i+di, j+dj] * kernel[di, dj]
    """
    ### BEGIN SOLUTION
    # Get input and kernel dimensions
    H, W = input.shape
    kH, kW = kernel.shape
    # Calculate output dimensions
    out_H, out_W = H - kH + 1, W - kW + 1
    # Initialize output array with the promoted dtype, so an integer input
    # combined with a float kernel is not silently truncated on assignment
    output = np.zeros((out_H, out_W), dtype=np.result_type(input, kernel))
    # Sliding window convolution with four nested loops
    for i in range(out_H):
        for j in range(out_W):
            for di in range(kH):
                for dj in range(kW):
                    output[i, j] += input[i + di, j + dj] * kernel[di, dj]
    return output
    ### END SOLUTION
# %% [markdown]
"""
### 🧪 Quick Test: Convolution Operation
Let's test your convolution implementation right away! This is the core operation that powers computer vision.
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-naive-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test conv2d_naive function immediately after implementation
print("🔬 Testing convolution operation...")
# Test simple 3x3 input with 2x2 kernel
try:
    input_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
    kernel_array = np.array([[1, 0], [0, 1]], dtype=np.float32) # Identity-like kernel
    result = conv2d_naive(input_array, kernel_array)
    expected = np.array([[6, 8], [12, 14]], dtype=np.float32) # 1+5, 2+6, 4+8, 5+9
    print(f"Input:\n{input_array}")
    print(f"Kernel:\n{kernel_array}")
    print(f"Result:\n{result}")
    print(f"Expected:\n{expected}")
    assert np.allclose(result, expected), f"Convolution failed: expected {expected}, got {result}"
    print("✅ Simple convolution test passed")
except Exception as e:
    print(f"❌ Simple convolution test failed: {e}")
    raise
# Test edge detection kernel
try:
    input_array = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]], dtype=np.float32)
    edge_kernel = np.array([[-1, -1], [-1, 3]], dtype=np.float32) # Edge detection
    result = conv2d_naive(input_array, edge_kernel)
    expected = np.array([[0, 0], [0, 0]], dtype=np.float32) # Uniform region = no edges
    assert np.allclose(result, expected), f"Edge detection failed: expected {expected}, got {result}"
    print("✅ Edge detection test passed")
except Exception as e:
    print(f"❌ Edge detection test failed: {e}")
    raise
# Test output shape
try:
    input_5x5 = np.random.randn(5, 5).astype(np.float32)
    kernel_3x3 = np.random.randn(3, 3).astype(np.float32)
    result = conv2d_naive(input_5x5, kernel_3x3)
    expected_shape = (3, 3) # 5-3+1 = 3
    assert result.shape == expected_shape, f"Output shape wrong: expected {expected_shape}, got {result.shape}"
    print("✅ Output shape test passed")
except Exception as e:
    print(f"❌ Output shape test failed: {e}")
    raise
# Show the convolution process
print("🎯 Convolution behavior:")
print(" Slides kernel across input")
print(" Computes dot product at each position")
print(" Output size = Input size - Kernel size + 1")
print("📈 Progress: Convolution operation ✓")
# %% [markdown]
"""
## Step 2: Building the Conv2D Layer
### What is a Conv2D Layer?
A **Conv2D layer** is a learnable convolutional layer that:
- Has learnable kernel weights (initialized randomly)
- Applies convolution to input tensors
- Integrates with the rest of the neural network
### Why Conv2D Layers Matter
- **Feature learning**: Kernels learn to detect useful patterns
- **Composability**: Can be stacked with other layers
- **Efficiency**: Shared weights reduce parameters dramatically
- **Translation equivariance**: The same pattern is detected anywhere in the image, and the response moves with it
### Real-World Applications
- **Image classification**: Recognize objects in photos
- **Object detection**: Find and locate objects
- **Medical imaging**: Detect anomalies in scans
- **Autonomous driving**: Identify road features
### Design Decisions
- **Kernel size**: Typically 3×3 or 5×5 for balance of locality and capacity
- **Initialization**: Small random values to break symmetry
- **Integration**: Works with Tensor class and other layers
"""
# %% nbgrader={"grade": false, "grade_id": "conv2d-class", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
class Conv2D:
    """
    2D Convolutional Layer (single channel, single filter, no stride/pad).
    A learnable convolutional layer that applies a kernel to detect spatial patterns.
    Perfect for building the foundation of convolutional neural networks.
    """
    def __init__(self, kernel_size: Tuple[int, int]):
        """
        Initialize Conv2D layer with random kernel.
        Args:
            kernel_size: (kH, kW) - size of the convolution kernel
        TODO: Initialize a random kernel with small values.
        APPROACH:
        1. Store kernel_size as instance variable
        2. Initialize random kernel with small values
        3. Use proper initialization for stable training
        EXAMPLE:
        Conv2D((2, 2)) creates:
        - kernel: shape (2, 2) with small random values
        HINTS:
        - Store kernel_size as self.kernel_size
        - Initialize kernel: np.random.randn(kH, kW) * 0.1 (small values)
        - Convert to float32 for consistency
        """
        ### BEGIN SOLUTION
        # Store kernel size
        self.kernel_size = kernel_size
        kH, kW = kernel_size
        # Initialize random kernel with small values
        self.kernel = np.random.randn(kH, kW).astype(np.float32) * 0.1
        ### END SOLUTION
    def forward(self, x: Tensor) -> Tensor:
        """
        Forward pass: apply convolution to input tensor.
        Args:
            x: Input tensor (2D for simplicity)
        Returns:
            Output tensor after convolution
        TODO: Implement forward pass using conv2d_naive function.
        APPROACH:
        1. Extract numpy array from input tensor
        2. Apply conv2d_naive with stored kernel
        3. Return result wrapped in Tensor
        EXAMPLE:
        x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # shape (3, 3)
        layer = Conv2D((2, 2))
        y = layer(x) # shape (2, 2)
        HINTS:
        - Use x.data to get numpy array
        - Use conv2d_naive(x.data, self.kernel)
        - Return Tensor(result) to wrap the result
        """
        ### BEGIN SOLUTION
        # Apply convolution using naive implementation
        result = conv2d_naive(x.data, self.kernel)
        return Tensor(result)
        ### END SOLUTION
    def __call__(self, x: Tensor) -> Tensor:
        """Make layer callable: layer(x) same as layer.forward(x)"""
        return self.forward(x)
# %% [markdown]
"""
### 🧪 Quick Test: Conv2D Layer
Let's test your Conv2D layer implementation! This is a learnable convolutional layer that can be trained.
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-layer-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test Conv2D layer immediately after implementation
print("🔬 Testing Conv2D layer...")
# Create a Conv2D layer
try:
    layer = Conv2D(kernel_size=(2, 2))
    print(f"Conv2D layer created with kernel size: {layer.kernel_size}")
    print(f"Kernel shape: {layer.kernel.shape}")
    # Test that kernel is initialized properly
    assert layer.kernel.shape == (2, 2), f"Kernel shape should be (2, 2), got {layer.kernel.shape}"
    assert not np.allclose(layer.kernel, 0), "Kernel should not be all zeros"
    print("✅ Conv2D layer initialization successful")
    # Test with sample input
    x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(f"Input shape: {x.shape}")
    y = layer(x)
    print(f"Output shape: {y.shape}")
    print(f"Output: {y}")
    # Verify shapes
    assert y.shape == (2, 2), f"Output shape should be (2, 2), got {y.shape}"
    assert isinstance(y, Tensor), "Output should be a Tensor"
    print("✅ Conv2D layer forward pass successful")
except Exception as e:
    print(f"❌ Conv2D layer test failed: {e}")
    raise
# Test different kernel sizes
try:
    layer_3x3 = Conv2D(kernel_size=(3, 3))
    x_5x5 = Tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]])
    y_3x3 = layer_3x3(x_5x5)
    assert y_3x3.shape == (3, 3), f"3x3 kernel output should be (3, 3), got {y_3x3.shape}"
    print("✅ Different kernel sizes work correctly")
except Exception as e:
    print(f"❌ Different kernel sizes test failed: {e}")
    raise
# Show the layer behavior
print("🎯 Conv2D layer behavior:")
print(" Learnable kernel weights")
print(" Applies convolution to detect patterns")
print(" Can be trained end-to-end")
print("📈 Progress: Convolution operation ✓, Conv2D layer ✓")
# %% [markdown]
"""
## Step 3: Flattening for Dense Layers
### What is Flattening?
**Flattening** converts multi-dimensional tensors to 1D vectors, enabling connection between convolutional and dense layers.
### Why Flattening is Needed
- **Interface compatibility**: Conv2D outputs a 2D feature map, while Dense expects a row of features with shape (batch, features)
- **Network composition**: Connect spatial features to classification
- **Standard practice**: Almost all CNNs use this pattern
- **Dimension management**: Preserve information while changing shape
### The Pattern
```
Conv2D → ReLU → Conv2D → ReLU → Flatten → Dense → Output
```
### Real-World Usage
- **Classification**: Final layers need 1D input for class probabilities
- **Feature extraction**: Convert spatial features to vector representations
- **Transfer learning**: Extract features from pre-trained CNNs
"""
# %% nbgrader={"grade": false, "grade_id": "flatten-function", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def flatten(x: Tensor) -> Tensor:
    """
    Flatten a 2D tensor into a single row (for connecting to Dense layers).
    Args:
        x: Input tensor to flatten
    Returns:
        Flattened tensor of shape (1, N), with a leading batch dimension added
    TODO: Implement flattening operation.
    APPROACH:
    1. Get the numpy array from the tensor
    2. Use .flatten() to convert to 1D
    3. Add batch dimension with [None, :]
    4. Return Tensor wrapped around the result
    EXAMPLE:
    Input: Tensor([[1, 2], [3, 4]]) # shape (2, 2)
    Output: Tensor([[1, 2, 3, 4]]) # shape (1, 4)
    HINTS:
    - Use x.data.flatten() to get 1D array
    - Add batch dimension: result[None, :]
    - Return Tensor(result)
    """
    ### BEGIN SOLUTION
    # Flatten the tensor and add batch dimension
    flattened = x.data.flatten()
    result = flattened[None, :] # Add batch dimension
    return Tensor(result)
    ### END SOLUTION
# %% [markdown]
"""
### 🧪 Quick Test: Flatten Function
Let's test your flatten function! This connects convolutional layers to dense layers.
"""
# %% nbgrader={"grade": true, "grade_id": "test-flatten-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test flatten function immediately after implementation
print("🔬 Testing flatten function...")
# Test case 1: 2x2 tensor
try:
    x = Tensor([[1, 2], [3, 4]])
    flattened = flatten(x)
    print(f"Input: {x}")
    print(f"Flattened: {flattened}")
    print(f"Flattened shape: {flattened.shape}")
    # Verify shape and content
    assert flattened.shape == (1, 4), f"Flattened shape should be (1, 4), got {flattened.shape}"
    expected_data = np.array([[1, 2, 3, 4]])
    assert np.array_equal(flattened.data, expected_data), f"Flattened data should be {expected_data}, got {flattened.data}"
    print("✅ 2x2 flatten test passed")
except Exception as e:
    print(f"❌ 2x2 flatten test failed: {e}")
    raise
# Test case 2: 3x3 tensor
try:
    x2 = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    flattened2 = flatten(x2)
    assert flattened2.shape == (1, 9), f"Flattened shape should be (1, 9), got {flattened2.shape}"
    expected_data2 = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
    assert np.array_equal(flattened2.data, expected_data2), f"Flattened data should be {expected_data2}, got {flattened2.data}"
    print("✅ 3x3 flatten test passed")
except Exception as e:
    print(f"❌ 3x3 flatten test failed: {e}")
    raise
# Test case 3: Different shapes
try:
    x3 = Tensor([[1, 2, 3, 4], [5, 6, 7, 8]]) # 2x4
    flattened3 = flatten(x3)
    assert flattened3.shape == (1, 8), f"Flattened shape should be (1, 8), got {flattened3.shape}"
    expected_data3 = np.array([[1, 2, 3, 4, 5, 6, 7, 8]])
    assert np.array_equal(flattened3.data, expected_data3), f"Flattened data should be {expected_data3}, got {flattened3.data}"
    print("✅ Different shapes flatten test passed")
except Exception as e:
    print(f"❌ Different shapes flatten test failed: {e}")
    raise
# Show the flattening behavior
print("🎯 Flatten behavior:")
print(" Converts a 2D tensor to a single row of shape (1, N)")
print(" Adds a leading batch dimension of size 1")
print(" Enables connection to Dense layers")
print("📈 Progress: Convolution operation ✓, Conv2D layer ✓, Flatten ✓")
print("🚀 CNN pipeline ready!")
# %% [markdown]
"""
### 🧪 Test Your CNN Implementations
Once you implement the functions above, run these cells to test them:
"""
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-naive", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test conv2d_naive function
print("Testing conv2d_naive function...")
# Test case 1: Simple 3x3 input with 2x2 kernel
input_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
kernel_array = np.array([[1, 0], [0, -1]], dtype=np.float32)
result = conv2d_naive(input_array, kernel_array)
expected = np.array([[-4, -4], [-4, -4]], dtype=np.float32)
print(f"Input:\n{input_array}")
print(f"Kernel:\n{kernel_array}")
print(f"Result:\n{result}")
print(f"Expected:\n{expected}")
assert np.allclose(result, expected), f"conv2d_naive failed: expected {expected}, got {result}"
# Test case 2: Different kernel
kernel2 = np.array([[1, 1], [1, 1]], dtype=np.float32)
result2 = conv2d_naive(input_array, kernel2)
expected2 = np.array([[12, 16], [24, 28]], dtype=np.float32)
assert np.allclose(result2, expected2), f"conv2d_naive failed: expected {expected2}, got {result2}"
print("✅ conv2d_naive tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-conv2d-layer", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test Conv2D layer
print("Testing Conv2D layer...")
# Create a Conv2D layer
layer = Conv2D(kernel_size=(2, 2))
print(f"Kernel size: {layer.kernel_size}")
print(f"Kernel shape: {layer.kernel.shape}")
# Test with sample input
x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Input shape: {x.shape}")
y = layer(x)
print(f"Output shape: {y.shape}")
print(f"Output: {y}")
# Verify shapes
assert y.shape == (2, 2), f"Output shape should be (2, 2), got {y.shape}"
assert isinstance(y, Tensor), "Output should be a Tensor"
print("✅ Conv2D layer tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-flatten", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test flatten function
print("Testing flatten function...")
# Test case 1: 2x2 tensor
x = Tensor([[1, 2], [3, 4]])
flattened = flatten(x)
print(f"Input: {x}")
print(f"Flattened: {flattened}")
print(f"Flattened shape: {flattened.shape}")
# Verify shape and content
assert flattened.shape == (1, 4), f"Flattened shape should be (1, 4), got {flattened.shape}"
expected_data = np.array([[1, 2, 3, 4]])
assert np.array_equal(flattened.data, expected_data), f"Flattened data should be {expected_data}, got {flattened.data}"
# Test case 2: 3x3 tensor
x2 = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
flattened2 = flatten(x2)
assert flattened2.shape == (1, 9), f"Flattened shape should be (1, 9), got {flattened2.shape}"
expected_data2 = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
assert np.array_equal(flattened2.data, expected_data2), f"Flattened data should be {expected_data2}, got {flattened2.data}"
print("✅ Flatten tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-cnn-pipeline", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test complete CNN pipeline
print("Testing complete CNN pipeline...")
# Create a simple CNN pipeline: Conv2D → ReLU → Flatten → Dense
conv_layer = Conv2D(kernel_size=(2, 2))
relu = ReLU()
dense_layer = Dense(input_size=4, output_size=2)
# Test input (3x3 image)
x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Input shape: {x.shape}")
# Forward pass through pipeline
h1 = conv_layer(x)
print(f"After Conv2D: {h1.shape}")
h2 = relu(h1)
print(f"After ReLU: {h2.shape}")
h3 = flatten(h2)
print(f"After Flatten: {h3.shape}")
h4 = dense_layer(h3)
print(f"After Dense: {h4.shape}")
# Verify pipeline works
assert h1.shape == (2, 2), f"Conv2D output should be (2, 2), got {h1.shape}"
assert h2.shape == (2, 2), f"ReLU output should be (2, 2), got {h2.shape}"
assert h3.shape == (1, 4), f"Flatten output should be (1, 4), got {h3.shape}"
assert h4.shape == (1, 2), f"Dense output should be (1, 2), got {h4.shape}"
print("✅ CNN pipeline tests passed!")
# %% [markdown]
"""
## 🎯 Module Summary
Congratulations! You've successfully implemented the core components of convolutional neural networks:
### What You've Accomplished
✅ **Convolution Operation**: Implemented conv2d_naive with sliding window from scratch
✅ **Conv2D Layer**: Built a learnable convolutional layer with random kernel initialization
✅ **Flattening**: Created the bridge between convolutional and dense layers
✅ **CNN Pipeline**: Composed Conv2D → ReLU → Flatten → Dense for complete networks
✅ **Spatial Pattern Detection**: Understanding how convolution detects local features
### Key Concepts You've Learned
- **Convolution is pattern matching**: Kernels detect specific spatial patterns
- **Parameter sharing**: Same kernel applied everywhere for translation equivariance
- **Local connectivity**: Each output depends only on a small input region
- **Spatial hierarchy**: Multiple layers build increasingly complex features
- **Dimension management**: Flattening connects spatial and vector representations
### Mathematical Foundations
- **Convolution operation**: (I * K)[i,j] = ΣΣ I[i+m, j+n] × K[m,n]
- **Sliding window**: Kernel moves across input computing dot products
- **Feature maps**: Convolution outputs that highlight detected patterns
- **Translation equivariance**: The same pattern is detected wherever it appears, and the response shifts with it
### Real-World Applications
- **Computer vision**: Object recognition, face detection, medical imaging
- **Image processing**: Edge detection, noise reduction, enhancement
- **Autonomous systems**: Traffic sign recognition, obstacle detection
- **Scientific imaging**: Satellite imagery, microscopy, astronomy
### Next Steps
1. **Export your code**: `tito package nbdev --export 05_cnn`
2. **Test your implementation**: `tito module test 05_cnn`
3. **Use your CNN components**:
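```python
import numpy as np
from tinytorch.core.tensor import Tensor
from tinytorch.core.cnn import Conv2D, conv2d_naive, flatten
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU
# Create CNN pipeline
conv = Conv2D((3, 3))
relu = ReLU()
dense = Dense(16, 10)
# Example input (illustrative): a 6x6 image, so the 3x3 kernel yields a 4x4 map = 16 features
image = Tensor(np.random.randn(6, 6).astype(np.float32))
# Process image
features = conv(image)
activated = relu(features)
flattened = flatten(activated)
output = dense(flattened)
```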
4. **Move to Module 6**: Start building data loading and preprocessing pipelines!
**Ready for the next challenge?** Let's build efficient data loading systems to feed our networks!
"""