Files
TinyTorch/assignments/source/05_cnn/05_cnn.ipynb
Vijay Janapa Reddi 77150be3a6 Module 00_setup migration: Core functionality complete, NBGrader architecture issue discovered
✅ COMPLETED:
- Instructor solution executes perfectly
- NBDev export works (fixed import directives)
- Package functionality verified
- Student assignment generation works
- CLI integration complete
- Systematic testing framework established

⚠️ CRITICAL DISCOVERY:
- NBGrader requires cell metadata architecture changes
- Current generator creates content correctly but wrong cell types
- Would require major rework of assignment generation pipeline

📊 STATUS:
- Core TinyTorch functionality:  READY FOR STUDENTS
- NBGrader integration: Requires Phase 2 rework
- Ready to continue systematic testing of modules 01-06

🔧 FIXES APPLIED:
- Added #| export directive to imports in enhanced modules
- Fixed generator logic for student scaffolding
- Updated testing framework and documentation
2025-07-12 09:08:45 -04:00


{
"cells": [
{
"cell_type": "markdown",
"id": "ca53839c",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"# Module X: CNN - Convolutional Neural Networks\n",
"\n",
"Welcome to the CNN module! Here you'll implement the core building block of modern computer vision: the convolutional layer.\n",
"\n",
"## Learning Goals\n",
"- Understand the convolution operation (sliding window, local connectivity, weight sharing)\n",
"- Implement Conv2D with explicit for-loops\n",
"- Visualize how convolution builds feature maps\n",
"- Compose Conv2D with other layers to build a simple ConvNet\n",
"- (Stretch) Explore stride, padding, pooling, and multi-channel input\n",
"\n",
"## Build \u2192 Use \u2192 Understand\n",
"1. **Build**: Conv2D layer using sliding window convolution\n",
"2. **Use**: Transform images and see feature maps\n",
"3. **Understand**: How CNNs learn spatial patterns"
]
},
{
"cell_type": "markdown",
"id": "9e0d8f02",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## \ud83d\udce6 Where This Code Lives in the Final Package\n",
"\n",
"**Learning Side:** You work in `modules/cnn/cnn_dev.py` \n",
"**Building Side:** Code exports to `tinytorch.core.layers`\n",
"\n",
"```python\n",
"# Final package structure:\n",
"from tinytorch.core.layers import Dense, Conv2D # Both layers together!\n",
"from tinytorch.core.activations import ReLU\n",
"from tinytorch.core.tensor import Tensor\n",
"```\n",
"\n",
"**Why this matters:**\n",
"- **Learning:** Focused modules for deep understanding\n",
"- **Production:** Proper organization like PyTorch's `torch.nn`\n",
"- **Consistency:** All layers (Dense, Conv2D) live together in `core.layers`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fbd717db",
"metadata": {},
"outputs": [],
"source": [
"#| default_exp core.cnn"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7f22e530",
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"import numpy as np\n",
"from typing import List, Tuple, Optional\n",
"from tinytorch.core.tensor import Tensor\n",
"\n",
"# Setup and imports (for development)\n",
"import matplotlib.pyplot as plt\n",
"from tinytorch.core.layers import Dense\n",
"from tinytorch.core.activations import ReLU"
]
},
{
"cell_type": "markdown",
"id": "f99723c8",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 1: What is Convolution?\n",
"\n",
"### Definition\n",
"A **convolutional layer** applies a small filter (kernel) across the input, producing a feature map. This operation captures local patterns and is the foundation of modern vision models.\n",
"\n",
"### Why Convolution Matters in Computer Vision\n",
"- **Local connectivity**: Each output value depends only on a small region of the input\n",
"- **Weight sharing**: The same filter is applied everywhere (translation invariance)\n",
"- **Spatial hierarchy**: Multiple layers build increasingly complex features\n",
"- **Parameter efficiency**: Much fewer parameters than fully connected layers\n",
"\n",
"### The Fundamental Insight\n",
"**Convolution is pattern matching!** The kernel learns to detect specific patterns:\n",
"- **Edge detectors**: Find boundaries between objects\n",
"- **Texture detectors**: Recognize surface patterns\n",
"- **Shape detectors**: Identify geometric forms\n",
"- **Feature detectors**: Combine simple patterns into complex features\n",
"\n",
"### Real-World Examples\n",
"- **Image processing**: Detect edges, blur, sharpen\n",
"- **Computer vision**: Recognize objects, faces, text\n",
"- **Medical imaging**: Detect tumors, analyze scans\n",
"- **Autonomous driving**: Identify traffic signs, pedestrians\n",
"\n",
"### Visual Intuition\n",
"```\n",
"Input Image: Kernel: Output Feature Map:\n",
"[1, 2, 3] [1, 0] [1*1+2*0+4*0+5*(-1), 2*1+3*0+5*0+6*(-1)]\n",
"[4, 5, 6] [0, -1] [4*1+5*0+7*0+8*(-1), 5*1+6*0+8*0+9*(-1)]\n",
"[7, 8, 9]\n",
"```\n",
"\n",
"The kernel slides across the input, computing dot products at each position.\n",
"\n",
"### The Math Behind It\n",
"For input I (H\u00d7W) and kernel K (kH\u00d7kW), the output O (out_H\u00d7out_W) is:\n",
"```\n",
"O[i,j] = sum(I[i+di, j+dj] * K[di, dj] for di in range(kH), dj in range(kW))\n",
"```\n",
"\n",
"Let's implement this step by step!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa4af055",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:\n",
" \"\"\"\n",
" Naive 2D convolution (single channel, no stride, no padding).\n",
" \n",
" Args:\n",
" input: 2D input array (H, W)\n",
" kernel: 2D filter (kH, kW)\n",
" Returns:\n",
" 2D output array (H-kH+1, W-kW+1)\n",
" \n",
" TODO: Implement the sliding window convolution using for-loops.\n",
" \n",
" APPROACH:\n",
" 1. Get input dimensions: H, W = input.shape\n",
" 2. Get kernel dimensions: kH, kW = kernel.shape\n",
" 3. Calculate output dimensions: out_H = H - kH + 1, out_W = W - kW + 1\n",
" 4. Create output array: np.zeros((out_H, out_W))\n",
" 5. Use nested loops to slide the kernel:\n",
" - i loop: output rows (0 to out_H-1)\n",
" - j loop: output columns (0 to out_W-1)\n",
" - di loop: kernel rows (0 to kH-1)\n",
" - dj loop: kernel columns (0 to kW-1)\n",
" 6. For each (i,j), compute: output[i,j] += input[i+di, j+dj] * kernel[di, dj]\n",
" \n",
" EXAMPLE:\n",
" Input: [[1, 2, 3], Kernel: [[1, 0],\n",
" [4, 5, 6], [0, -1]]\n",
" [7, 8, 9]]\n",
" \n",
" Output[0,0] = 1*1 + 2*0 + 4*0 + 5*(-1) = 1 - 5 = -4\n",
" Output[0,1] = 2*1 + 3*0 + 5*0 + 6*(-1) = 2 - 6 = -4\n",
" Output[1,0] = 4*1 + 5*0 + 7*0 + 8*(-1) = 4 - 8 = -4\n",
" Output[1,1] = 5*1 + 6*0 + 8*0 + 9*(-1) = 5 - 9 = -4\n",
" \n",
" HINTS:\n",
" - Start with output = np.zeros((out_H, out_W))\n",
" - Use four nested loops: for i in range(out_H): for j in range(out_W): for di in range(kH): for dj in range(kW):\n",
" - Accumulate the sum: output[i,j] += input[i+di, j+dj] * kernel[di, dj]\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d83b2c10",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def conv2d_naive(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:\n",
" H, W = input.shape\n",
" kH, kW = kernel.shape\n",
" out_H, out_W = H - kH + 1, W - kW + 1\n",
" output = np.zeros((out_H, out_W), dtype=input.dtype)\n",
" for i in range(out_H):\n",
" for j in range(out_W):\n",
" for di in range(kH):\n",
" for dj in range(kW):\n",
" output[i, j] += input[i + di, j + dj] * kernel[di, dj]\n",
" return output"
]
},
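{
"cell_type": "markdown",
"id": "a1f0c9d2",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Optional: Cross-Check Against a Vectorized Version\n",
"\n",
"As a sanity check (not part of the graded assignment), the loop-based solution can be compared against a vectorized NumPy equivalent. This is a minimal sketch: it assumes NumPy >= 1.20 for `sliding_window_view`, and the helper name `conv2d_vectorized` is illustrative, not part of the TinyTorch API."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e1d8c3",
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: compare conv2d_naive with a vectorized equivalent.\n",
"# conv2d_vectorized is an illustrative helper, not part of the TinyTorch API.\n",
"def conv2d_vectorized(input: np.ndarray, kernel: np.ndarray) -> np.ndarray:\n",
"    # Gather every (kH, kW) window, then contract each window with the kernel\n",
"    windows = np.lib.stride_tricks.sliding_window_view(input, kernel.shape)\n",
"    return np.einsum('ijkl,kl->ij', windows, kernel)\n",
"\n",
"rng = np.random.default_rng(0)\n",
"a = rng.standard_normal((6, 7)).astype(np.float32)\n",
"k = rng.standard_normal((3, 3)).astype(np.float32)\n",
"assert np.allclose(conv2d_naive(a, k), conv2d_vectorized(a, k), atol=1e-5)\n",
"print(\"conv2d_naive matches the vectorized version\")"
]
},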
{
"cell_type": "markdown",
"id": "454a6bad",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### \ud83e\uddea Test Your Conv2D Implementation\n",
"\n",
"Try your function on this simple example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7705032a",
"metadata": {},
"outputs": [],
"source": [
"# Test case for conv2d_naive\n",
"input = np.array([\n",
" [1, 2, 3],\n",
" [4, 5, 6],\n",
" [7, 8, 9]\n",
"], dtype=np.float32)\n",
"kernel = np.array([\n",
" [1, 0],\n",
" [0, -1]\n",
"], dtype=np.float32)\n",
"\n",
"expected = np.array([\n",
" [1*1+2*0+4*0+5*(-1), 2*1+3*0+5*0+6*(-1)],\n",
" [4*1+5*0+7*0+8*(-1), 5*1+6*0+8*0+9*(-1)]\n",
"], dtype=np.float32)\n",
"\n",
"try:\n",
" output = conv2d_naive(input, kernel)\n",
" print(\"\u2705 Input:\\n\", input)\n",
" print(\"\u2705 Kernel:\\n\", kernel)\n",
" print(\"\u2705 Your output:\\n\", output)\n",
" print(\"\u2705 Expected:\\n\", expected)\n",
" assert np.allclose(output, expected), \"\u274c Output does not match expected!\"\n",
" print(\"\ud83c\udf89 conv2d_naive works!\")\n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")\n",
" print(\"Make sure to implement conv2d_naive above!\")"
]
},
{
"cell_type": "markdown",
"id": "53449e22",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 2: Understanding What Convolution Does\n",
"\n",
"Let's visualize how different kernels detect different patterns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05a1ce2c",
"metadata": {},
"outputs": [],
"source": [
"# Visualize different convolution kernels\n",
"print(\"Visualizing different convolution kernels...\")\n",
"\n",
"try:\n",
" # Test different kernels\n",
" test_input = np.array([\n",
" [1, 1, 1, 0, 0],\n",
" [1, 1, 1, 0, 0],\n",
" [1, 1, 1, 0, 0],\n",
" [0, 0, 0, 0, 0],\n",
" [0, 0, 0, 0, 0]\n",
" ], dtype=np.float32)\n",
" \n",
" # Edge detection kernel (horizontal)\n",
" edge_kernel = np.array([\n",
" [1, 1, 1],\n",
" [0, 0, 0],\n",
" [-1, -1, -1]\n",
" ], dtype=np.float32)\n",
" \n",
" # Sharpening kernel\n",
" sharpen_kernel = np.array([\n",
" [0, -1, 0],\n",
" [-1, 5, -1],\n",
" [0, -1, 0]\n",
" ], dtype=np.float32)\n",
" \n",
" # Test edge detection\n",
" edge_output = conv2d_naive(test_input, edge_kernel)\n",
" print(\"\u2705 Edge detection kernel:\")\n",
" print(\" Detects horizontal edges (boundaries between light and dark)\")\n",
" print(\" Output:\\n\", edge_output)\n",
" \n",
" # Test sharpening\n",
" sharpen_output = conv2d_naive(test_input, sharpen_kernel)\n",
" print(\"\u2705 Sharpening kernel:\")\n",
" print(\" Enhances edges and details\")\n",
" print(\" Output:\\n\", sharpen_output)\n",
" \n",
" print(\"\\n\ud83d\udca1 Different kernels detect different patterns!\")\n",
" print(\" Neural networks learn these kernels automatically!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")"
]
},
{
"cell_type": "markdown",
"id": "0b33791b",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 3: Conv2D Layer Class\n",
"\n",
"Now let's wrap your convolution function in a layer class for use in networks. This makes it consistent with other layers like Dense.\n",
"\n",
"### Why Layer Classes Matter\n",
"- **Consistent API**: Same interface as Dense layers\n",
"- **Learnable parameters**: Kernels can be learned from data\n",
"- **Composability**: Can be combined with other layers\n",
"- **Integration**: Works seamlessly with the rest of TinyTorch\n",
"\n",
"### The Pattern\n",
"```\n",
"Input Tensor \u2192 Conv2D \u2192 Output Tensor\n",
"```\n",
"\n",
"Just like Dense layers, but with spatial operations instead of linear transformations."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "118ba687",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"class Conv2D:\n",
" \"\"\"\n",
" 2D Convolutional Layer (single channel, single filter, no stride/pad).\n",
" \n",
" Args:\n",
" kernel_size: (kH, kW) - size of the convolution kernel\n",
" \n",
" TODO: Initialize a random kernel and implement the forward pass using conv2d_naive.\n",
" \n",
" APPROACH:\n",
" 1. Store kernel_size as instance variable\n",
" 2. Initialize random kernel with small values\n",
" 3. Implement forward pass using conv2d_naive function\n",
" 4. Return Tensor wrapped around the result\n",
" \n",
" EXAMPLE:\n",
" layer = Conv2D(kernel_size=(2, 2))\n",
" x = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # shape (3, 3)\n",
" y = layer(x) # shape (2, 2)\n",
" \n",
" HINTS:\n",
" - Store kernel_size as (kH, kW)\n",
" - Initialize kernel with np.random.randn(kH, kW) * 0.1 (small values)\n",
" - Use conv2d_naive(x.data, self.kernel) in forward pass\n",
" - Return Tensor(result) to wrap the result\n",
" \"\"\"\n",
" def __init__(self, kernel_size: Tuple[int, int]):\n",
" \"\"\"\n",
" Initialize Conv2D layer with random kernel.\n",
" \n",
" Args:\n",
" kernel_size: (kH, kW) - size of the convolution kernel\n",
" \n",
" TODO: \n",
" 1. Store kernel_size as instance variable\n",
" 2. Initialize random kernel with small values\n",
" 3. Scale kernel values to prevent large outputs\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Store kernel_size as self.kernel_size\n",
" 2. Unpack kernel_size into kH, kW\n",
" 3. Initialize kernel: np.random.randn(kH, kW) * 0.1\n",
" 4. Convert to float32 for consistency\n",
" \n",
" EXAMPLE:\n",
" Conv2D((2, 2)) creates:\n",
" - kernel: shape (2, 2) with small random values\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" \"\"\"\n",
" Forward pass: apply convolution to input.\n",
" \n",
" Args:\n",
" x: Input tensor of shape (H, W)\n",
" \n",
" Returns:\n",
" Output tensor of shape (H-kH+1, W-kW+1)\n",
" \n",
" TODO: Implement convolution using conv2d_naive function.\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Use conv2d_naive(x.data, self.kernel)\n",
" 2. Return Tensor(result)\n",
" \n",
" EXAMPLE:\n",
" Input x: Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # shape (3, 3)\n",
" Kernel: shape (2, 2)\n",
" Output: Tensor([[val1, val2], [val3, val4]]) # shape (2, 2)\n",
" \n",
" HINTS:\n",
" - x.data gives you the numpy array\n",
" - self.kernel is your learned kernel\n",
" - Use conv2d_naive(x.data, self.kernel)\n",
" - Return Tensor(result) to wrap the result\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" \"\"\"Make layer callable: layer(x) same as layer.forward(x)\"\"\"\n",
" return self.forward(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e18c382",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"class Conv2D:\n",
" def __init__(self, kernel_size: Tuple[int, int]):\n",
" self.kernel_size = kernel_size\n",
" kH, kW = kernel_size\n",
" # Initialize with small random values\n",
" self.kernel = np.random.randn(kH, kW).astype(np.float32) * 0.1\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" return Tensor(conv2d_naive(x.data, self.kernel))\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" return self.forward(x)"
]
},
{
"cell_type": "markdown",
"id": "e288fb18",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### \ud83e\uddea Test Your Conv2D Layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f1a4a6a",
"metadata": {},
"outputs": [],
"source": [
"# Test Conv2D layer\n",
"print(\"Testing Conv2D layer...\")\n",
"\n",
"try:\n",
" # Test basic Conv2D layer\n",
" conv = Conv2D(kernel_size=(2, 2))\n",
" x = Tensor(np.array([\n",
" [1, 2, 3],\n",
" [4, 5, 6],\n",
" [7, 8, 9]\n",
" ], dtype=np.float32))\n",
" \n",
" print(f\"\u2705 Input shape: {x.shape}\")\n",
" print(f\"\u2705 Kernel shape: {conv.kernel.shape}\")\n",
" print(f\"\u2705 Kernel values:\\n{conv.kernel}\")\n",
" \n",
" y = conv(x)\n",
" print(f\"\u2705 Output shape: {y.shape}\")\n",
" print(f\"\u2705 Output: {y}\")\n",
" \n",
" # Test with different kernel size\n",
" conv2 = Conv2D(kernel_size=(3, 3))\n",
" y2 = conv2(x)\n",
" print(f\"\u2705 3x3 kernel output shape: {y2.shape}\")\n",
" \n",
" print(\"\\n\ud83c\udf89 Conv2D layer works!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")\n",
" print(\"Make sure to implement the Conv2D layer above!\")"
]
},
{
"cell_type": "markdown",
"id": "97939763",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 4: Building a Simple ConvNet\n",
"\n",
"Now let's compose Conv2D layers with other layers to build a complete convolutional neural network!\n",
"\n",
"### Why ConvNets Matter\n",
"- **Spatial hierarchy**: Each layer learns increasingly complex features\n",
"- **Parameter sharing**: Same kernel applied everywhere (efficiency)\n",
"- **Translation invariance**: Can recognize objects regardless of position\n",
"- **Real-world success**: Power most modern computer vision systems\n",
"\n",
"### The Architecture\n",
"```\n",
"Input Image \u2192 Conv2D \u2192 ReLU \u2192 Flatten \u2192 Dense \u2192 Output\n",
"```\n",
"\n",
"This simple architecture can learn to recognize patterns in images!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51631fe6",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def flatten(x: Tensor) -> Tensor:\n",
" \"\"\"\n",
" Flatten a 2D tensor to 1D (for connecting to Dense).\n",
" \n",
" TODO: Implement flattening operation.\n",
" \n",
" APPROACH:\n",
" 1. Get the numpy array from the tensor\n",
" 2. Use .flatten() to convert to 1D\n",
" 3. Add batch dimension with [None, :]\n",
" 4. Return Tensor wrapped around the result\n",
" \n",
" EXAMPLE:\n",
" Input: Tensor([[1, 2], [3, 4]]) # shape (2, 2)\n",
" Output: Tensor([[1, 2, 3, 4]]) # shape (1, 4)\n",
" \n",
" HINTS:\n",
" - Use x.data.flatten() to get 1D array\n",
" - Add batch dimension: result[None, :]\n",
" - Return Tensor(result)\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e8f2b50",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def flatten(x: Tensor) -> Tensor:\n",
" \"\"\"Flatten a 2D tensor to 1D (for connecting to Dense).\"\"\"\n",
" return Tensor(x.data.flatten()[None, :])"
]
},
{
"cell_type": "markdown",
"id": "7bdb9f80",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### \ud83e\uddea Test Your Flatten Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c6d92ebc",
"metadata": {},
"outputs": [],
"source": [
"# Test flatten function\n",
"print(\"Testing flatten function...\")\n",
"\n",
"try:\n",
" # Test flattening\n",
" x = Tensor([[1, 2, 3], [4, 5, 6]]) # shape (2, 3)\n",
" flattened = flatten(x)\n",
" \n",
" print(f\"\u2705 Input shape: {x.shape}\")\n",
" print(f\"\u2705 Flattened shape: {flattened.shape}\")\n",
" print(f\"\u2705 Flattened values: {flattened}\")\n",
" \n",
" # Verify the flattening worked correctly\n",
" expected = np.array([[1, 2, 3, 4, 5, 6]])\n",
" assert np.allclose(flattened.data, expected), \"\u274c Flattening incorrect!\"\n",
" print(\"\u2705 Flattening works correctly!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")\n",
" print(\"Make sure to implement the flatten function above!\")"
]
},
{
"cell_type": "markdown",
"id": "9804128d",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 5: Composing a Complete ConvNet\n",
"\n",
"Now let's build a simple convolutional neural network that can process images!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d60d05b9",
"metadata": {},
"outputs": [],
"source": [
"# Compose a simple ConvNet\n",
"print(\"Building a simple ConvNet...\")\n",
"\n",
"try:\n",
" # Create network components\n",
" conv = Conv2D((2, 2))\n",
" relu = ReLU()\n",
" dense = Dense(input_size=4, output_size=1) # 4 features from 2x2 output\n",
" \n",
" # Test input (small 3x3 \"image\")\n",
" x = Tensor(np.random.randn(3, 3).astype(np.float32))\n",
" print(f\"\u2705 Input shape: {x.shape}\")\n",
" print(f\"\u2705 Input: {x}\")\n",
" \n",
" # Forward pass through the network\n",
" conv_out = conv(x)\n",
" print(f\"\u2705 After Conv2D: {conv_out}\")\n",
" \n",
" relu_out = relu(conv_out)\n",
" print(f\"\u2705 After ReLU: {relu_out}\")\n",
" \n",
" flattened = flatten(relu_out)\n",
" print(f\"\u2705 After flatten: {flattened}\")\n",
" \n",
" final_out = dense(flattened)\n",
" print(f\"\u2705 Final output: {final_out}\")\n",
" \n",
" print(\"\\n\ud83c\udf89 Simple ConvNet works!\")\n",
" print(\"This network can learn to recognize patterns in images!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")\n",
" print(\"Check your Conv2D, flatten, and Dense implementations!\")"
]
},
{
"cell_type": "markdown",
"id": "9fe4faf0",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 6: Understanding the Power of Convolution\n",
"\n",
"Let's see how convolution captures different types of patterns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "434133c2",
"metadata": {},
"outputs": [],
"source": [
"# Demonstrate pattern detection\n",
"print(\"Demonstrating pattern detection...\")\n",
"\n",
"try:\n",
" # Create a simple \"image\" with a pattern\n",
" image = np.array([\n",
" [0, 0, 0, 0, 0],\n",
" [0, 1, 1, 1, 0],\n",
" [0, 1, 1, 1, 0],\n",
" [0, 1, 1, 1, 0],\n",
" [0, 0, 0, 0, 0]\n",
" ], dtype=np.float32)\n",
" \n",
" # Different kernels detect different patterns\n",
" edge_kernel = np.array([\n",
" [1, 1, 1],\n",
" [1, -8, 1],\n",
" [1, 1, 1]\n",
" ], dtype=np.float32)\n",
" \n",
" blur_kernel = np.array([\n",
" [1/9, 1/9, 1/9],\n",
" [1/9, 1/9, 1/9],\n",
" [1/9, 1/9, 1/9]\n",
" ], dtype=np.float32)\n",
" \n",
" # Test edge detection\n",
" edge_result = conv2d_naive(image, edge_kernel)\n",
" print(\"\u2705 Edge detection:\")\n",
" print(\" Detects boundaries around the white square\")\n",
" print(\" Result:\\n\", edge_result)\n",
" \n",
" # Test blurring\n",
" blur_result = conv2d_naive(image, blur_kernel)\n",
" print(\"\u2705 Blurring:\")\n",
" print(\" Smooths the image\")\n",
" print(\" Result:\\n\", blur_result)\n",
" \n",
" print(\"\\n\ud83d\udca1 Different kernels = different feature detectors!\")\n",
" print(\" Neural networks learn these automatically from data!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"\u274c Error: {e}\")"
]
},
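{
"cell_type": "markdown",
"id": "c3f2e9a4",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Stretch Preview: Stride and Padding\n",
"\n",
"The stretch goals mention stride, padding, pooling, and multi-channel input. As a minimal sketch of the first two (assumptions: single channel, symmetric zero-padding; the name `conv2d_strided` and its parameters are illustrative, not part of the TinyTorch API), stride and padding change only which window positions are visited and the effective input size:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4a3f0b5",
"metadata": {},
"outputs": [],
"source": [
"# Stretch sketch (not graded): stride + zero-padding on top of the same\n",
"# sliding-window idea. conv2d_strided is illustrative, not a TinyTorch API.\n",
"def conv2d_strided(input: np.ndarray, kernel: np.ndarray,\n",
"                   stride: int = 1, padding: int = 0) -> np.ndarray:\n",
"    if padding > 0:\n",
"        input = np.pad(input, padding)  # zero-pad all four sides\n",
"    H, W = input.shape\n",
"    kH, kW = kernel.shape\n",
"    out_H = (H - kH) // stride + 1\n",
"    out_W = (W - kW) // stride + 1\n",
"    output = np.zeros((out_H, out_W), dtype=input.dtype)\n",
"    for i in range(out_H):\n",
"        for j in range(out_W):\n",
"            region = input[i*stride:i*stride+kH, j*stride:j*stride+kW]\n",
"            output[i, j] = (region * kernel).sum()\n",
"    return output\n",
"\n",
"# stride=1, padding=0 reduces to conv2d_naive; stride=2 halves the output size\n",
"x = np.arange(16, dtype=np.float32).reshape(4, 4)\n",
"k = np.ones((2, 2), dtype=np.float32)\n",
"print(conv2d_strided(x, k, stride=2))  # 2x2 output of window sums"
]
},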
{
"cell_type": "markdown",
"id": "80938b52",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## \ud83c\udfaf Module Summary\n",
"\n",
"Congratulations! You've built the foundation of convolutional neural networks:\n",
"\n",
"### What You've Accomplished\n",
"\u2705 **Convolution Operation**: Understanding the sliding window mechanism \n",
"\u2705 **Conv2D Layer**: Learnable convolutional layer implementation \n",
"\u2705 **Pattern Detection**: Visualizing how kernels detect different features \n",
"\u2705 **ConvNet Architecture**: Composing Conv2D with other layers \n",
"\u2705 **Real-world Applications**: Understanding computer vision applications \n",
"\n",
"### Key Concepts You've Learned\n",
"- **Convolution** is pattern matching with sliding windows\n",
"- **Local connectivity** means each output depends on a small input region\n",
"- **Weight sharing** makes CNNs parameter-efficient\n",
"- **Spatial hierarchy** builds complex features from simple patterns\n",
"- **Translation invariance** allows recognition regardless of position\n",
"\n",
"### What's Next\n",
"In the next modules, you'll build on this foundation:\n",
"- **Advanced CNN features**: Stride, padding, pooling\n",
"- **Multi-channel convolution**: RGB images, multiple filters\n",
"- **Training**: Learning kernels from data\n",
"- **Real applications**: Image classification, object detection\n",
"\n",
"### Real-World Connection\n",
"Your Conv2D layer is now ready to:\n",
"- Learn edge detectors, texture recognizers, and shape detectors\n",
"- Process real images for computer vision tasks\n",
"- Integrate with the rest of the TinyTorch ecosystem\n",
"- Scale to complex architectures like ResNet, VGG, etc.\n",
"\n",
"**Ready for the next challenge?** Let's move on to training these networks!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "03f153f1",
"metadata": {},
"outputs": [],
"source": [
"# Final verification\n",
"print(\"\\n\" + \"=\"*50)\n",
"print(\"\ud83c\udf89 CNN MODULE COMPLETE!\")\n",
"print(\"=\"*50)\n",
"print(\"\u2705 Convolution operation understanding\")\n",
"print(\"\u2705 Conv2D layer implementation\")\n",
"print(\"\u2705 Pattern detection visualization\")\n",
"print(\"\u2705 ConvNet architecture composition\")\n",
"print(\"\u2705 Real-world computer vision context\")\n",
"print(\"\\n\ud83d\ude80 Ready to train networks in the next module!\") "
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}