TinyTorch/assignments/source/03_layers/layers_dev.ipynb
Vijay Janapa Reddi 83fb269d9f Complete migration from modules/ to assignments/source/ structure
- Migrated all Python source files to assignments/source/ structure
- Updated nbdev configuration to use assignments/source as nbs_path
- Updated all tito commands (nbgrader, export, test) to use new structure
- Fixed hardcoded paths in Python files and documentation
- Updated config.py to use assignments/source instead of modules
- Fixed test command to use correct file naming (short names vs full module names)
- Regenerated all notebook files with clean metadata
- Verified complete workflow: Python source → NBGrader → nbdev export → testing

All systems now working: NBGrader (14 source assignments, 1 released), nbdev export (7 generated files), and pytest integration.

The modules/ directory has been retired and replaced with standard NBGrader structure.
2025-07-12 12:06:56 -04:00

{
"cells": [
{
"cell_type": "markdown",
"id": "2668bc45",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"# Module 2: Layers - Neural Network Building Blocks\n",
"\n",
"Welcome to the Layers module! This is where neural networks begin. You'll implement the fundamental building blocks that transform tensors.\n",
"\n",
"## Learning Goals\n",
"- Understand layers as functions that transform tensors: `y = f(x)`\n",
"- Implement Dense layers with linear transformations: `y = Wx + b`\n",
"- Use activation functions from the activations module for nonlinearity\n",
"- See how neural networks are just function composition\n",
"- Build intuition before diving into training\n",
"\n",
"## Build → Use → Understand\n",
"1. **Build**: Dense layers using activation functions as building blocks\n",
"2. **Use**: Transform tensors and see immediate results\n",
"3. **Understand**: How neural networks transform information\n",
"\n",
"## Module Dependencies\n",
"This module builds on the **activations** module:\n",
"- **activations** → **layers** → **networks**\n",
"- Clean separation of concerns: math functions → layer building blocks → full networks"
]
},
{
"cell_type": "markdown",
"id": "530716e8",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 📦 Where This Code Lives in the Final Package\n",
"\n",
"**Learning Side:** You work in `assignments/source/03_layers/layers_dev.py` \n",
"**Building Side:** Code exports to `tinytorch.core.layers`\n",
"\n",
"```python\n",
"# Final package structure:\n",
"from tinytorch.core.layers import Dense, Conv2D # All layers together!\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
"from tinytorch.core.tensor import Tensor\n",
"```\n",
"\n",
"**Why this matters:**\n",
"- **Learning:** Focused modules for deep understanding\n",
"- **Production:** Proper organization like PyTorch's `torch.nn`\n",
"- **Consistency:** All layers (Dense, Conv2D) live together in `core.layers`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f63809e",
"metadata": {},
"outputs": [],
"source": [
"#| default_exp core.layers\n",
"\n",
"# Setup and imports\n",
"import numpy as np\n",
"import sys\n",
"from typing import Union, Optional, Callable\n",
"import math"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "00a72b7c",
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"import numpy as np\n",
"import math\n",
"import sys\n",
"from typing import Union, Optional, Callable\n",
"\n",
"# Import from the main package (rock solid foundation)\n",
"from tinytorch.core.tensor import Tensor\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
"\n",
"# print(\"🔥 TinyTorch Layers Module\")\n",
"# print(f\"NumPy version: {np.__version__}\")\n",
"# print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
"# print(\"Ready to build neural network layers!\")"
]
},
{
"cell_type": "markdown",
"id": "a0ad08ea",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 1: What is a Layer?\n",
"\n",
"### Definition\n",
"A **layer** is a function that transforms tensors. Think of it as a mathematical operation that takes input data and produces output data:\n",
"\n",
"```\n",
"Input Tensor → Layer → Output Tensor\n",
"```\n",
"\n",
"### Why Layers Matter in Neural Networks\n",
"Layers are the fundamental building blocks of all neural networks because:\n",
"- **Modularity**: Each layer has a specific job (linear transformation, nonlinearity, etc.)\n",
"- **Composability**: Layers can be combined to create complex functions\n",
"- **Learnability**: Each layer has parameters that can be learned from data\n",
"- **Interpretability**: Different layers learn different features\n",
"\n",
"### The Fundamental Insight\n",
"**Neural networks are just function composition!**\n",
"```\n",
"x → Layer1 → Layer2 → Layer3 → y\n",
"```\n",
"\n",
"Each layer transforms the data, and the final output is the composition of all these transformations.\n",
"\n",
"### Real-World Examples\n",
"- **Dense Layer**: Learns linear relationships between features\n",
"- **Convolutional Layer**: Learns spatial patterns in images\n",
"- **Recurrent Layer**: Learns temporal patterns in sequences\n",
"- **Activation Layer**: Adds nonlinearity to make networks powerful\n",
"\n",
"### Visual Intuition\n",
"```\n",
"Input: [1, 2, 3] (3 features)\n",
"Dense Layer: y = Wx + b\n",
"Weights W: [[0.1, 0.2, 0.3],\n",
" [0.4, 0.5, 0.6]] (2×3 matrix)\n",
"Bias b: [0.1, 0.2] (2 values)\n",
"Output: [0.1*1 + 0.2*2 + 0.3*3 + 0.1,\n",
" 0.4*1 + 0.5*2 + 0.6*3 + 0.2] = [1.4, 3.2]\n",
"```\n",
"\n",
"Let's start with the most important layer: **Dense** (also called Linear or Fully Connected)."
]
},
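{
"cell_type": "markdown",
"id": "f1a2b3c4",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Quick Check: The Example in Plain NumPy\n",
"\n",
"Here is a minimal sketch (plain NumPy only, no TinyTorch dependencies) that recomputes the worked example above — useful for confirming the arithmetic before you implement anything:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9b8c7d6",
"metadata": {},
"outputs": [],
"source": [
"# Recompute the worked example above with plain NumPy\n",
"import numpy as np\n",
"\n",
"x = np.array([1, 2, 3])\n",
"W = np.array([[0.1, 0.2, 0.3],\n",
"              [0.4, 0.5, 0.6]])   # (2, 3)\n",
"b = np.array([0.1, 0.2])\n",
"\n",
"y = W @ x + b  # one Dense-style transformation\n",
"print(y)  # approximately [1.5 3.4]\n"
]
},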
{
"cell_type": "markdown",
"id": "5d63d076",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 2: Understanding Matrix Multiplication\n",
"\n",
"Before we build layers, let's understand the core operation: **matrix multiplication**. This is what powers all neural network computations.\n",
"\n",
"### Why Matrix Multiplication Matters\n",
"- **Efficiency**: Process multiple inputs at once\n",
"- **Parallelization**: GPU acceleration works great with matrix operations\n",
"- **Batch processing**: Handle multiple samples simultaneously\n",
"- **Mathematical foundation**: Linear algebra is the language of neural networks\n",
"\n",
"### The Math Behind It\n",
"For matrices A (m×n) and B (n×p), the result C (m×p) is:\n",
"```\n",
"C[i,j] = sum(A[i,k] * B[k,j] for k in range(n))\n",
"```\n",
"\n",
"### Visual Example\n",
"```\n",
"A = [[1, 2], B = [[5, 6],\n",
" [3, 4]] [7, 8]]\n",
"\n",
"C = A @ B = [[1*5 + 2*7, 1*6 + 2*8],\n",
" [3*5 + 4*7, 3*6 + 4*8]]\n",
" = [[19, 22],\n",
" [43, 50]]\n",
"```\n",
"\n",
"Let's implement this step by step!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82cc8565",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def matmul_naive(A: np.ndarray, B: np.ndarray) -> np.ndarray:\n",
" \"\"\"\n",
" Naive matrix multiplication using explicit for-loops.\n",
" \n",
" This helps you understand what matrix multiplication really does!\n",
" \n",
" Args:\n",
" A: Matrix of shape (m, n)\n",
" B: Matrix of shape (n, p)\n",
" \n",
" Returns:\n",
" Matrix of shape (m, p) where C[i,j] = sum(A[i,k] * B[k,j] for k in range(n))\n",
" \n",
" TODO: Implement matrix multiplication using three nested for-loops.\n",
" \n",
" APPROACH:\n",
" 1. Get the dimensions: m, n from A and n2, p from B\n",
" 2. Check that n == n2 (matrices must be compatible)\n",
" 3. Create output matrix C of shape (m, p) filled with zeros\n",
" 4. Use three nested loops:\n",
" - i loop: rows of A (0 to m-1)\n",
" - j loop: columns of B (0 to p-1) \n",
" - k loop: shared dimension (0 to n-1)\n",
" 5. For each (i,j), compute: C[i,j] += A[i,k] * B[k,j]\n",
" \n",
" EXAMPLE:\n",
" A = [[1, 2], B = [[5, 6],\n",
" [3, 4]] [7, 8]]\n",
" \n",
" C[0,0] = A[0,0]*B[0,0] + A[0,1]*B[1,0] = 1*5 + 2*7 = 19\n",
" C[0,1] = A[0,0]*B[0,1] + A[0,1]*B[1,1] = 1*6 + 2*8 = 22\n",
" C[1,0] = A[1,0]*B[0,0] + A[1,1]*B[1,0] = 3*5 + 4*7 = 43\n",
" C[1,1] = A[1,0]*B[0,1] + A[1,1]*B[1,1] = 3*6 + 4*8 = 50\n",
" \n",
" HINTS:\n",
" - Start with C = np.zeros((m, p))\n",
" - Use three nested for loops: for i in range(m): for j in range(p): for k in range(n):\n",
" - Accumulate the sum: C[i,j] += A[i,k] * B[k,j]\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ea923f30",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def matmul_naive(A: np.ndarray, B: np.ndarray) -> np.ndarray:\n",
" \"\"\"\n",
" Naive matrix multiplication using explicit for-loops.\n",
" \n",
" This helps you understand what matrix multiplication really does!\n",
" \"\"\"\n",
" m, n = A.shape\n",
" n2, p = B.shape\n",
" assert n == n2, f\"Matrix shapes don't match: A({m},{n}) @ B({n2},{p})\"\n",
" \n",
" C = np.zeros((m, p))\n",
" for i in range(m):\n",
" for j in range(p):\n",
" for k in range(n):\n",
" C[i, j] += A[i, k] * B[k, j]\n",
" return C"
]
},
{
"cell_type": "markdown",
"id": "60fb8544",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Matrix Multiplication"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28898e45",
"metadata": {},
"outputs": [],
"source": [
"# Test matrix multiplication\n",
"print(\"Testing matrix multiplication...\")\n",
"\n",
"try:\n",
" # Test case 1: Simple 2x2 matrices\n",
" A = np.array([[1, 2], [3, 4]], dtype=np.float32)\n",
" B = np.array([[5, 6], [7, 8]], dtype=np.float32)\n",
" \n",
" result = matmul_naive(A, B)\n",
" expected = np.array([[19, 22], [43, 50]], dtype=np.float32)\n",
" \n",
" print(f\"✅ Matrix A:\\n{A}\")\n",
" print(f\"✅ Matrix B:\\n{B}\")\n",
" print(f\"✅ Your result:\\n{result}\")\n",
" print(f\"✅ Expected:\\n{expected}\")\n",
" \n",
" assert np.allclose(result, expected), \"❌ Result doesn't match expected!\"\n",
" print(\"🎉 Matrix multiplication works!\")\n",
" \n",
" # Test case 2: Compare with NumPy\n",
" numpy_result = A @ B\n",
" assert np.allclose(result, numpy_result), \"❌ Doesn't match NumPy result!\"\n",
" print(\"✅ Matches NumPy implementation!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement matmul_naive above!\")"
]
},
{
"cell_type": "markdown",
"id": "d8176801",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 3: Building the Dense Layer\n",
"\n",
"Now let's build the **Dense layer**, the most fundamental building block of neural networks. A Dense layer performs a linear transformation: `y = Wx + b`\n",
"\n",
"### What is a Dense Layer?\n",
"- **Linear transformation**: `y = Wx + b`\n",
"- **W**: Weight matrix (learnable parameters)\n",
"- **x**: Input tensor\n",
"- **b**: Bias vector (learnable parameters)\n",
"- **y**: Output tensor\n",
"\n",
"### Why Dense Layers Matter\n",
"- **Universal approximation**: Paired with nonlinear activations, can approximate any continuous function given enough neurons\n",
"- **Feature learning**: Each neuron learns a different feature\n",
"- **Nonlinearity**: When combined with activation functions, becomes very powerful\n",
"- **Foundation**: All other layers build on this concept\n",
"\n",
"### The Math\n",
"For input x of shape (batch_size, input_size):\n",
"- **W**: Weight matrix of shape (input_size, output_size)\n",
"- **b**: Bias vector of shape (output_size)\n",
"- **y**: Output of shape (batch_size, output_size)\n",
"\n",
"### Visual Example\n",
"```\n",
"Input: x = [1, 2, 3] (3 features)\n",
"Weights: W = [[0.1, 0.2], Bias: b = [0.1, 0.2]\n",
" [0.3, 0.4],\n",
" [0.5, 0.6]]\n",
"\n",
"Step 1: Wx = [0.1*1 + 0.3*2 + 0.5*3, 0.2*1 + 0.4*2 + 0.6*3]\n",
" = [2.2, 2.8]\n",
"\n",
"Step 2: y = Wx + b = [2.2 + 0.1, 2.8 + 0.2] = [2.3, 3.0]\n",
"```\n",
"\n",
"Let's implement this!"
]
},
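{
"cell_type": "markdown",
"id": "1f2e3d4c",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"Before writing the class, here is a minimal NumPy sketch of the worked example above. Note the implementation convention: with `x` as a row vector of shape `(batch, input_size)` and `W` of shape `(input_size, output_size)`, the product is computed as `x @ W`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a6b7c8d",
"metadata": {},
"outputs": [],
"source": [
"# Recompute the Dense example above with plain NumPy\n",
"import numpy as np\n",
"\n",
"x = np.array([[1, 2, 3]])   # (batch=1, input_size=3)\n",
"W = np.array([[0.1, 0.2],\n",
"              [0.3, 0.4],\n",
"              [0.5, 0.6]])  # (input_size=3, output_size=2)\n",
"b = np.array([0.1, 0.2])\n",
"\n",
"y = x @ W + b\n",
"print(y)  # approximately [[2.3 3.0]]\n"
]
},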
{
"cell_type": "code",
"execution_count": null,
"id": "4a916c67",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"class Dense:\n",
" \"\"\"\n",
" Dense (Linear) Layer: y = Wx + b\n",
" \n",
" The fundamental building block of neural networks.\n",
" Performs linear transformation: matrix multiplication + bias addition.\n",
" \n",
" Args:\n",
" input_size: Number of input features\n",
" output_size: Number of output features\n",
" use_bias: Whether to include bias term (default: True)\n",
" use_naive_matmul: Whether to use naive matrix multiplication (for learning)\n",
" \n",
" TODO: Implement the Dense layer with weight initialization and forward pass.\n",
" \n",
" APPROACH:\n",
" 1. Store layer parameters (input_size, output_size, use_bias, use_naive_matmul)\n",
" 2. Initialize weights with small random values (Xavier/Glorot initialization)\n",
" 3. Initialize bias to zeros (if use_bias=True)\n",
" 4. Implement forward pass using matrix multiplication and bias addition\n",
" \n",
" EXAMPLE:\n",
" layer = Dense(input_size=3, output_size=2)\n",
" x = Tensor([[1, 2, 3]]) # batch_size=1, input_size=3\n",
" y = layer(x) # shape: (1, 2)\n",
" \n",
" HINTS:\n",
" - Use np.random.randn() for random initialization\n",
" - Scale weights by sqrt(2/(input_size + output_size)) for Xavier init\n",
" - Store weights and bias as numpy arrays\n",
" - Use matmul_naive or @ operator based on use_naive_matmul flag\n",
" \"\"\"\n",
" \n",
" def __init__(self, input_size: int, output_size: int, use_bias: bool = True, \n",
" use_naive_matmul: bool = False):\n",
" \"\"\"\n",
" Initialize Dense layer with random weights.\n",
" \n",
" Args:\n",
" input_size: Number of input features\n",
" output_size: Number of output features\n",
" use_bias: Whether to include bias term\n",
" use_naive_matmul: Use naive matrix multiplication (for learning)\n",
" \n",
" TODO: \n",
" 1. Store layer parameters (input_size, output_size, use_bias, use_naive_matmul)\n",
" 2. Initialize weights with small random values\n",
" 3. Initialize bias to zeros (if use_bias=True)\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Store the parameters as instance variables\n",
" 2. Calculate scale factor for Xavier initialization: sqrt(2/(input_size + output_size))\n",
" 3. Initialize weights: np.random.randn(input_size, output_size) * scale\n",
" 4. If use_bias=True, initialize bias: np.zeros(output_size)\n",
" 5. If use_bias=False, set bias to None\n",
" \n",
" EXAMPLE:\n",
" Dense(3, 2) creates:\n",
" - weights: shape (3, 2) with small random values\n",
" - bias: shape (2,) with zeros\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" \"\"\"\n",
" Forward pass: y = Wx + b\n",
" \n",
" Args:\n",
" x: Input tensor of shape (batch_size, input_size)\n",
" \n",
" Returns:\n",
" Output tensor of shape (batch_size, output_size)\n",
" \n",
" TODO: Implement matrix multiplication and bias addition\n",
" - Use self.use_naive_matmul to choose between NumPy and naive implementation\n",
" - If use_naive_matmul=True, use matmul_naive(x.data, self.weights)\n",
" - If use_naive_matmul=False, use x.data @ self.weights\n",
" - Add bias if self.use_bias=True\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Perform matrix multiplication: Wx\n",
" - If use_naive_matmul: result = matmul_naive(x.data, self.weights)\n",
" - Else: result = x.data @ self.weights\n",
" 2. Add bias if use_bias: result += self.bias\n",
" 3. Return Tensor(result)\n",
" \n",
" EXAMPLE:\n",
" Input x: Tensor([[1, 2, 3]]) # shape (1, 3)\n",
" Weights: shape (3, 2)\n",
" Output: Tensor([[val1, val2]]) # shape (1, 2)\n",
" \n",
" HINTS:\n",
" - x.data gives you the numpy array\n",
" - self.weights is your weight matrix\n",
" - Use broadcasting for bias addition: result + self.bias\n",
" - Return Tensor(result) to wrap the result\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" \"\"\"Make layer callable: layer(x) same as layer.forward(x)\"\"\"\n",
" return self.forward(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8570d026",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"class Dense:\n",
" \"\"\"\n",
" Dense (Linear) Layer: y = Wx + b\n",
" \n",
" The fundamental building block of neural networks.\n",
" Performs linear transformation: matrix multiplication + bias addition.\n",
" \"\"\"\n",
" \n",
" def __init__(self, input_size: int, output_size: int, use_bias: bool = True, \n",
" use_naive_matmul: bool = False):\n",
" \"\"\"\n",
" Initialize Dense layer with random weights.\n",
" \n",
" Args:\n",
" input_size: Number of input features\n",
" output_size: Number of output features\n",
" use_bias: Whether to include bias term\n",
" use_naive_matmul: Use naive matrix multiplication (for learning)\n",
" \"\"\"\n",
" # Store parameters\n",
" self.input_size = input_size\n",
" self.output_size = output_size\n",
" self.use_bias = use_bias\n",
" self.use_naive_matmul = use_naive_matmul\n",
" \n",
" # Xavier/Glorot initialization\n",
" scale = np.sqrt(2.0 / (input_size + output_size))\n",
" self.weights = np.random.randn(input_size, output_size).astype(np.float32) * scale\n",
" \n",
" # Initialize bias\n",
" if use_bias:\n",
" self.bias = np.zeros(output_size, dtype=np.float32)\n",
" else:\n",
" self.bias = None\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" \"\"\"\n",
" Forward pass: y = Wx + b\n",
" \n",
" Args:\n",
" x: Input tensor of shape (batch_size, input_size)\n",
" \n",
" Returns:\n",
" Output tensor of shape (batch_size, output_size)\n",
" \"\"\"\n",
" # Matrix multiplication\n",
" if self.use_naive_matmul:\n",
" result = matmul_naive(x.data, self.weights)\n",
" else:\n",
" result = x.data @ self.weights\n",
" \n",
" # Add bias\n",
" if self.use_bias:\n",
" result += self.bias\n",
" \n",
" return Tensor(result)\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" \"\"\"Make layer callable: layer(x) same as layer.forward(x)\"\"\"\n",
" return self.forward(x)"
]
},
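{
"cell_type": "markdown",
"id": "7e8f9a0b",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Why the Xavier Scale?\n",
"\n",
"A small standalone sketch (plain NumPy, independent of the Dense class) showing what `sqrt(2 / (input_size + output_size))` buys you: the output variance of the linear map stays the same order of magnitude as the input variance, instead of growing by a factor of `input_size` as layers stack:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0c1d2e3f",
"metadata": {},
"outputs": [],
"source": [
"# Demonstrate the effect of Xavier/Glorot scaling on output variance\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"fan_in, fan_out = 256, 128\n",
"scale = np.sqrt(2.0 / (fan_in + fan_out))\n",
"\n",
"x = rng.standard_normal((1000, fan_in))            # unit-variance inputs\n",
"W = rng.standard_normal((fan_in, fan_out)) * scale\n",
"\n",
"print(f\"input variance:  {x.var():.3f}\")\n",
"print(f\"output variance: {(x @ W).var():.3f}\")  # same order of magnitude, not fan_in times larger\n"
]
},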
{
"cell_type": "markdown",
"id": "90197c65",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Dense Layer"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d9e4d64",
"metadata": {},
"outputs": [],
"source": [
"# Test Dense layer\n",
"print(\"Testing Dense layer...\")\n",
"\n",
"try:\n",
" # Test basic Dense layer\n",
" layer = Dense(input_size=3, output_size=2, use_bias=True)\n",
" x = Tensor([[1, 2, 3]]) # batch_size=1, input_size=3\n",
" \n",
" print(f\"✅ Input shape: {x.shape}\")\n",
" print(f\"✅ Layer weights shape: {layer.weights.shape}\")\n",
" print(f\"✅ Layer bias shape: {layer.bias.shape}\")\n",
" \n",
" y = layer(x)\n",
" print(f\"✅ Output shape: {y.shape}\")\n",
" print(f\"✅ Output: {y}\")\n",
" \n",
" # Test without bias\n",
" layer_no_bias = Dense(input_size=2, output_size=1, use_bias=False)\n",
" x2 = Tensor([[1, 2]])\n",
" y2 = layer_no_bias(x2)\n",
" print(f\"✅ No bias output: {y2}\")\n",
" \n",
" # Test naive matrix multiplication\n",
" layer_naive = Dense(input_size=2, output_size=2, use_naive_matmul=True)\n",
" x3 = Tensor([[1, 2]])\n",
" y3 = layer_naive(x3)\n",
" print(f\"✅ Naive matmul output: {y3}\")\n",
" \n",
" print(\"\\n🎉 All Dense layer tests passed!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement the Dense layer above!\")"
]
},
{
"cell_type": "markdown",
"id": "37532e4d",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 4: Composing Layers with Activations\n",
"\n",
"Now let's see how layers work together! A neural network is just layers composed with activation functions.\n",
"\n",
"### Why Layer Composition Matters\n",
"- **Nonlinearity**: Activation functions make networks powerful\n",
"- **Feature learning**: Each layer learns different levels of features\n",
"- **Universal approximation**: Stacks of layers with nonlinearities can approximate any continuous function\n",
"- **Modularity**: Easy to experiment with different architectures\n",
"\n",
"### The Pattern\n",
"```\n",
"Input → Dense → Activation → Dense → Activation → Output\n",
"```\n",
"\n",
"### Real-World Example\n",
"```\n",
"Input: [1, 2, 3] (3 features)\n",
"Dense(3→2): [1.4, 2.8] (linear transformation)\n",
"ReLU: [1.4, 2.8] (nonlinearity)\n",
"Dense(2→1): [3.2] (final prediction)\n",
"```\n",
"\n",
"Let's build a simple network!"
]
},
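{
"cell_type": "markdown",
"id": "2b3c4d5e",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"The pattern above is literally function composition. Assuming the `Dense` and `ReLU` implementations in this notebook, the whole network can be written as one nested call:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6f7a8b9c",
"metadata": {},
"outputs": [],
"source": [
"# A 2-layer network as a single composed expression: y = f3(f2(f1(x)))\n",
"net_dense1 = Dense(input_size=3, output_size=2)\n",
"net_relu = ReLU()\n",
"net_dense2 = Dense(input_size=2, output_size=1)\n",
"\n",
"x = Tensor([[1.0, 2.0, 3.0]])\n",
"y = net_dense2(net_relu(net_dense1(x)))\n",
"print(y.shape)  # should be (1, 1)\n"
]
},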
{
"cell_type": "code",
"execution_count": null,
"id": "d6e1d85c",
"metadata": {},
"outputs": [],
"source": [
"# Test layer composition\n",
"print(\"Testing layer composition...\")\n",
"\n",
"try:\n",
" # Create a simple network: Dense → ReLU → Dense\n",
" dense1 = Dense(input_size=3, output_size=2)\n",
" relu = ReLU()\n",
" dense2 = Dense(input_size=2, output_size=1)\n",
" \n",
" # Test input\n",
" x = Tensor([[1, 2, 3]])\n",
" print(f\"✅ Input: {x}\")\n",
" \n",
" # Forward pass through the network\n",
" h1 = dense1(x)\n",
" print(f\"✅ After Dense1: {h1}\")\n",
" \n",
" h2 = relu(h1)\n",
" print(f\"✅ After ReLU: {h2}\")\n",
" \n",
" y = dense2(h2)\n",
" print(f\"✅ Final output: {y}\")\n",
" \n",
" print(\"\\n🎉 Layer composition works!\")\n",
" print(\"This is how neural networks work: layers + activations!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure all your layers and activations are working!\")"
]
},
{
"cell_type": "markdown",
"id": "5f2f8a48",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 5: Performance Comparison\n",
"\n",
"Let's compare our naive matrix multiplication with NumPy's optimized version to understand why optimization matters in ML.\n",
"\n",
"### Why Performance Matters\n",
"- **Training time**: Neural networks train for hours/days\n",
"- **Inference speed**: Real-time applications need fast predictions\n",
"- **GPU utilization**: Optimized operations use hardware efficiently\n",
"- **Scalability**: Large models need efficient implementations"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6f490a2",
"metadata": {},
"outputs": [],
"source": [
"# Performance comparison\n",
"print(\"Comparing naive vs NumPy matrix multiplication...\")\n",
"\n",
"try:\n",
" import time\n",
" \n",
" # Create test matrices\n",
" A = np.random.randn(100, 100).astype(np.float32)\n",
" B = np.random.randn(100, 100).astype(np.float32)\n",
" \n",
" # Time naive implementation\n",
" start_time = time.time()\n",
" result_naive = matmul_naive(A, B)\n",
" naive_time = time.time() - start_time\n",
" \n",
" # Time NumPy implementation\n",
" start_time = time.time()\n",
" result_numpy = A @ B\n",
" numpy_time = time.time() - start_time\n",
" \n",
" print(f\"✅ Naive time: {naive_time:.4f} seconds\")\n",
" print(f\"✅ NumPy time: {numpy_time:.4f} seconds\")\n",
" print(f\"✅ Speedup: {naive_time/numpy_time:.1f}x faster\")\n",
" \n",
" # Verify correctness\n",
" assert np.allclose(result_naive, result_numpy), \"Results don't match!\"\n",
" print(\"✅ Results are identical!\")\n",
" \n",
" print(\"\\n💡 This is why we use optimized libraries in production!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")"
]
},
{
"cell_type": "markdown",
"id": "35efc1ca",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🎯 Module Summary\n",
"\n",
"Congratulations! You've built the foundation of neural network layers:\n",
"\n",
"### What You've Accomplished\n",
"✅ **Matrix Multiplication**: Understanding the core operation \n",
"✅ **Dense Layer**: Linear transformation with weights and bias \n",
"✅ **Layer Composition**: Combining layers with activations \n",
"✅ **Performance Awareness**: Understanding optimization importance \n",
"✅ **Testing**: Immediate feedback on your implementations \n",
"\n",
"### Key Concepts You've Learned\n",
"- **Layers** are functions that transform tensors\n",
"- **Matrix multiplication** powers all neural network computations\n",
"- **Dense layers** perform linear transformations: `y = Wx + b`\n",
"- **Layer composition** creates complex functions from simple building blocks\n",
"- **Performance** matters for real-world ML applications\n",
"\n",
"### What's Next\n",
"In the next modules, you'll build on this foundation:\n",
"- **Networks**: Compose layers into complete models\n",
"- **Training**: Learn parameters with gradients and optimization\n",
"- **Convolutional layers**: Process spatial data like images\n",
"- **Recurrent layers**: Process sequential data like text\n",
"\n",
"### Real-World Connection\n",
"Your Dense layer is now ready to:\n",
"- Learn patterns in data through weight updates\n",
"- Transform features for classification and regression\n",
"- Serve as building blocks for complex architectures\n",
"- Integrate with the rest of the TinyTorch ecosystem\n",
"\n",
"**Ready for the next challenge?** Let's move on to building complete neural networks!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9c9187ca",
"metadata": {},
"outputs": [],
"source": [
"# Final verification\n",
"print(\"\\n\" + \"=\"*50)\n",
"print(\"🎉 LAYERS MODULE COMPLETE!\")\n",
"print(\"=\"*50)\n",
"print(\"✅ Matrix multiplication understanding\")\n",
"print(\"✅ Dense layer implementation\")\n",
"print(\"✅ Layer composition with activations\")\n",
"print(\"✅ Performance awareness\")\n",
"print(\"✅ Comprehensive testing\")\n",
"print(\"\\n🚀 Ready to build networks in the next module!\") "
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}