mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-03-11 20:55:19 -05:00
- Migrated all Python source files to assignments/source/ structure
- Updated nbdev configuration to use assignments/source as nbs_path
- Updated all tito commands (nbgrader, export, test) to use new structure
- Fixed hardcoded paths in Python files and documentation
- Updated config.py to use assignments/source instead of modules
- Fixed test command to use correct file naming (short names vs full module names)
- Regenerated all notebook files with clean metadata
- Verified complete workflow: Python source → NBGrader → nbdev export → testing

All systems now working: NBGrader (14 source assignments, 1 released), nbdev export (7 generated files), and pytest integration. The modules/ directory has been retired and replaced with standard NBGrader structure.
1438 lines
50 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"id": "355dc307",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"# Module 3: Networks - Neural Network Architectures\n",
"\n",
"Welcome to the Networks module! This is where we compose layers into complete neural network architectures.\n",
"\n",
"## Learning Goals\n",
"- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`\n",
"- Build common architectures (MLP, CNN) from layers\n",
"- Visualize network structure and data flow\n",
"- See how architecture affects capability\n",
"- Master forward pass inference (no training yet!)\n",
"\n",
"## Build → Use → Understand\n",
"1. **Build**: Compose layers into complete networks\n",
"2. **Use**: Create different architectures and run inference\n",
"3. **Understand**: How architecture design affects network behavior\n",
"\n",
"## Module Dependencies\n",
"This module builds on previous modules:\n",
"- **tensor** → **activations** → **layers** → **networks**\n",
"- Clean composition: math functions → building blocks → complete systems"
]
},
{
"cell_type": "markdown",
"id": "cf724917",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 📦 Where This Code Lives in the Final Package\n",
"\n",
"**Learning Side:** You work in `assignments/source/04_networks/networks_dev.py` \n",
"**Building Side:** Code exports to `tinytorch.core.networks`\n",
"\n",
"```python\n",
"# Final package structure:\n",
"from tinytorch.core.networks import Sequential, MLP\n",
"from tinytorch.core.layers import Dense, Conv2D\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
"from tinytorch.core.tensor import Tensor\n",
"```\n",
"\n",
"**Why this matters:**\n",
"- **Learning:** Focused modules for deep understanding\n",
"- **Production:** Proper organization like PyTorch's `torch.nn`\n",
"- **Consistency:** All network architectures live together in `core.networks`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79460d45",
"metadata": {},
"outputs": [],
"source": [
"#| default_exp core.networks\n",
"\n",
"# Setup and imports\n",
"import numpy as np\n",
"import sys\n",
"from typing import List, Union, Optional, Callable\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.patches as patches\n",
"from matplotlib.patches import FancyBboxPatch, ConnectionPatch\n",
"import seaborn as sns\n",
"\n",
"# Import all the building blocks we need\n",
"from tinytorch.core.tensor import Tensor\n",
"from tinytorch.core.layers import Dense\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax\n",
"\n",
"print(\"🔥 TinyTorch Networks Module\")\n",
"print(f\"NumPy version: {np.__version__}\")\n",
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
"print(\"Ready to build neural network architectures!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2190e04d",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"import numpy as np\n",
"import sys\n",
"from typing import List, Union, Optional, Callable\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.patches as patches\n",
"from matplotlib.patches import FancyBboxPatch, ConnectionPatch\n",
"import seaborn as sns\n",
"\n",
"# Import our building blocks\n",
"from tinytorch.core.tensor import Tensor\n",
"from tinytorch.core.layers import Dense\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c03a46b9",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def _should_show_plots():\n",
" \"\"\"Check if we should show plots (disable during testing)\"\"\"\n",
" return 'pytest' not in sys.modules and 'test' not in sys.argv"
]
},
{
"cell_type": "markdown",
"id": "58e30d14",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 1: What is a Network?\n",
"\n",
"### Definition\n",
"A **network** is a composition of layers that transforms input data into output predictions. Think of it as a pipeline of transformations:\n",
"\n",
"```\n",
"Input → Layer1 → Layer2 → Layer3 → Output\n",
"```\n",
"\n",
"### Why Networks Matter\n",
"- **Function composition**: Complex behavior from simple building blocks\n",
"- **Learnable parameters**: Each layer has weights that can be learned\n",
"- **Architecture design**: Different layouts solve different problems\n",
"- **Real-world applications**: Classification, regression, generation, etc.\n",
"\n",
"### The Fundamental Insight\n",
"**Neural networks are just function composition!**\n",
"- Each layer is a function: `f_i(x)`\n",
"- The network is: `f(x) = f_n(...f_2(f_1(x)))`\n",
"- Complex behavior emerges from simple building blocks\n",
"\n",
"### Real-World Examples\n",
"- **MLP (Multi-Layer Perceptron)**: Classic feedforward network\n",
"- **CNN (Convolutional Neural Network)**: For image processing\n",
"- **RNN (Recurrent Neural Network)**: For sequential data\n",
"- **Transformer**: For attention-based processing\n",
"\n",
"### Visual Intuition\n",
"```\n",
"Input: [1, 2, 3] (3 features)\n",
"Layer1: [1.4, 2.8] (Dense: linear transformation)\n",
"Layer2: [1.4, 2.8] (ReLU: nonlinearity, positive values pass through unchanged)\n",
"Layer3: [0.7] (Dense: final prediction)\n",
"```\n",
"\n",
"### The Math Behind It\n",
"For a network with layers `f_1, f_2, ..., f_n`:\n",
"```\n",
"f(x) = f_n(f_{n-1}(...f_2(f_1(x))))\n",
"```\n",
"\n",
"Each layer transforms the data, and the final output is the composition of all these transformations.\n",
"\n",
"Let's start by building the most fundamental network: **Sequential**."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8de00b9b",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"class Sequential:\n",
" \"\"\"\n",
" Sequential Network: Composes layers in sequence\n",
" \n",
" The most fundamental network architecture.\n",
" Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))\n",
" \n",
" Args:\n",
" layers: List of layers to compose\n",
" \n",
" TODO: Implement the Sequential network with forward pass.\n",
" \n",
" APPROACH:\n",
" 1. Store the list of layers as an instance variable\n",
" 2. Implement forward pass that applies each layer in sequence\n",
" 3. Make the network callable for easy use\n",
" \n",
" EXAMPLE:\n",
" network = Sequential([\n",
" Dense(3, 4),\n",
" ReLU(),\n",
" Dense(4, 2),\n",
" Sigmoid()\n",
" ])\n",
" x = Tensor([[1, 2, 3]])\n",
" y = network(x) # Forward pass through all layers\n",
" \n",
" HINTS:\n",
" - Store layers in self.layers\n",
" - Use a for loop to apply each layer in order\n",
" - Each layer's output becomes the next layer's input\n",
" - Return the final output\n",
" \"\"\"\n",
" \n",
" def __init__(self, layers: List):\n",
" \"\"\"\n",
" Initialize Sequential network with layers.\n",
" \n",
" Args:\n",
" layers: List of layers to compose in order\n",
" \n",
" TODO: Store the layers list\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Store the layers list as self.layers\n",
" 2. This creates the network architecture\n",
" \n",
" EXAMPLE:\n",
" Sequential([Dense(3,4), ReLU(), Dense(4,2)])\n",
" creates a 3-layer network: Dense → ReLU → Dense\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" \"\"\"\n",
" Forward pass through all layers in sequence.\n",
" \n",
" Args:\n",
" x: Input tensor\n",
" \n",
" Returns:\n",
" Output tensor after passing through all layers\n",
" \n",
" TODO: Implement sequential forward pass through all layers\n",
" \n",
" STEP-BY-STEP:\n",
" 1. Start with the input tensor: current = x\n",
" 2. Loop through each layer in self.layers\n",
" 3. Apply each layer: current = layer(current)\n",
" 4. Return the final output\n",
" \n",
" EXAMPLE:\n",
" Input: Tensor([[1, 2, 3]])\n",
" Layer1 (Dense): Tensor([[1.4, 2.8]])\n",
" Layer2 (ReLU): Tensor([[1.4, 2.8]])\n",
" Layer3 (Dense): Tensor([[0.7]])\n",
" Output: Tensor([[0.7]])\n",
" \n",
" HINTS:\n",
" - Use a for loop: for layer in self.layers:\n",
" - Apply each layer: current = layer(current)\n",
" - The output of one layer becomes input to the next\n",
" - Return the final result\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" \"\"\"Make network callable: network(x) same as network.forward(x)\"\"\"\n",
" return self.forward(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4e9f65af",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"class Sequential:\n",
" \"\"\"\n",
" Sequential Network: Composes layers in sequence\n",
" \n",
" The most fundamental network architecture.\n",
" Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))\n",
" \"\"\"\n",
" \n",
" def __init__(self, layers: List):\n",
" \"\"\"Initialize Sequential network with layers.\"\"\"\n",
" self.layers = layers\n",
" \n",
" def forward(self, x: Tensor) -> Tensor:\n",
" \"\"\"Forward pass through all layers in sequence.\"\"\"\n",
" # Apply each layer in order\n",
" for layer in self.layers:\n",
" x = layer(x)\n",
" return x\n",
" \n",
" def __call__(self, x: Tensor) -> Tensor:\n",
" \"\"\"Make network callable: network(x) same as network.forward(x)\"\"\"\n",
" return self.forward(x)"
]
},
{
"cell_type": "markdown",
"id": "88b54128",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Sequential Network"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b814f23",
"metadata": {},
"outputs": [],
"source": [
"# Test the Sequential network\n",
"print(\"Testing Sequential network...\")\n",
"\n",
"try:\n",
" # Create a simple network: 3 → 4 → 2 (two Dense layers plus activations)\n",
" network = Sequential([\n",
" Dense(input_size=3, output_size=4),\n",
" ReLU(),\n",
" Dense(input_size=4, output_size=2),\n",
" Sigmoid()\n",
" ])\n",
" \n",
" print(f\"✅ Network created with {len(network.layers)} layers\")\n",
" \n",
" # Test with sample data\n",
" x = Tensor([[1.0, 2.0, 3.0]])\n",
" print(f\"✅ Input: {x}\")\n",
" \n",
" # Forward pass\n",
" y = network(x)\n",
" print(f\"✅ Output: {y}\")\n",
" print(f\"✅ Output shape: {y.shape}\")\n",
" \n",
" # Verify the network works\n",
" assert y.shape == (1, 2), f\"❌ Expected shape (1, 2), got {y.shape}\"\n",
" assert np.all(y.data >= 0) and np.all(y.data <= 1), \"❌ Sigmoid output should be between 0 and 1\"\n",
" print(\"🎉 Sequential network works!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement the Sequential network above!\")"
]
},
{
"cell_type": "markdown",
"id": "28eb9398",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 2: Understanding Network Architecture\n",
"\n",
"Now let's explore how different architectures affect a network's capabilities.\n",
"\n",
"### What is Network Architecture?\n",
"**Architecture** refers to how layers are arranged and connected. It determines:\n",
"- **Capacity**: How complex a function the network can represent\n",
"- **Efficiency**: How many parameters and computations are needed\n",
"- **Specialization**: What types of problems it's good at\n",
"\n",
"### Common Architectures\n",
"\n",
"#### 1. **MLP (Multi-Layer Perceptron)**\n",
"```\n",
"Input → Dense → ReLU → Dense → ReLU → Dense → Output\n",
"```\n",
"- **Use case**: General-purpose learning\n",
"- **Strengths**: Universal approximation, simple to understand\n",
"- **Weaknesses**: Doesn't exploit spatial structure\n",
"\n",
"#### 2. **CNN (Convolutional Neural Network)**\n",
"```\n",
"Input → Conv2D → ReLU → Conv2D → ReLU → Dense → Output\n",
"```\n",
"- **Use case**: Image processing, spatial data\n",
"- **Strengths**: Parameter sharing, translation invariance\n",
"- **Weaknesses**: Fixed spatial structure\n",
"\n",
"#### 3. **Deep Network**\n",
"```\n",
"Input → Dense → ReLU → Dense → ReLU → Dense → ReLU → Dense → Output\n",
"```\n",
"- **Use case**: Complex pattern recognition\n",
"- **Strengths**: High capacity, can learn complex functions\n",
"- **Weaknesses**: More parameters, harder to train\n",
"\n",
"Let's build some common architectures!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae4fe584",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int, \n",
" activation=ReLU, output_activation=Sigmoid) -> Sequential:\n",
" \"\"\"\n",
" Create a Multi-Layer Perceptron (MLP) network.\n",
" \n",
" Args:\n",
" input_size: Number of input features\n",
" hidden_sizes: List of hidden layer sizes\n",
" output_size: Number of output features\n",
" activation: Activation function for hidden layers (default: ReLU)\n",
" output_activation: Activation function for output layer (default: Sigmoid)\n",
" \n",
" Returns:\n",
" Sequential network with MLP architecture\n",
" \n",
" TODO: Implement MLP creation with alternating Dense and activation layers.\n",
" \n",
" APPROACH:\n",
" 1. Start with an empty list of layers\n",
" 2. For each hidden size, add a Dense layer (current size → hidden size)\n",
" followed by the hidden activation function\n",
" 3. Add the final Dense layer: last hidden size → output_size\n",
" 4. Add the output activation function\n",
" 5. Return Sequential(layers)\n",
" \n",
" EXAMPLE:\n",
" create_mlp(3, [4, 2], 1) creates:\n",
" Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid\n",
" \n",
" HINTS:\n",
" - Start with layers = []\n",
" - Add Dense layers with appropriate input/output sizes\n",
" - Add activation functions between Dense layers\n",
" - Don't forget the final output activation\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3df597d8",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int, \n",
" activation=ReLU, output_activation=Sigmoid) -> Sequential:\n",
" \"\"\"Create a Multi-Layer Perceptron (MLP) network.\"\"\"\n",
" layers = []\n",
" \n",
" # Add hidden layers (Dense + activation for each hidden size)\n",
" current_size = input_size\n",
" for hidden_size in hidden_sizes:\n",
" layers.append(Dense(input_size=current_size, output_size=hidden_size))\n",
" layers.append(activation())\n",
" current_size = hidden_size\n",
" \n",
" # Add output layer\n",
" layers.append(Dense(input_size=current_size, output_size=output_size))\n",
" layers.append(output_activation())\n",
" \n",
" return Sequential(layers)"
]
},
{
"cell_type": "markdown",
"id": "f053d4a8",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your MLP Creation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "efec756b",
"metadata": {},
"outputs": [],
"source": [
"# Test MLP creation\n",
"print(\"Testing MLP creation...\")\n",
"\n",
"try:\n",
" # Create different MLP architectures\n",
" mlp1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)\n",
" mlp2 = create_mlp(input_size=5, hidden_sizes=[8, 4], output_size=2)\n",
" mlp3 = create_mlp(input_size=2, hidden_sizes=[10, 6, 3], output_size=1, activation=Tanh)\n",
" \n",
" print(f\"✅ MLP1: {len(mlp1.layers)} layers\")\n",
" print(f\"✅ MLP2: {len(mlp2.layers)} layers\")\n",
" print(f\"✅ MLP3: {len(mlp3.layers)} layers\")\n",
" \n",
" # Test forward pass\n",
" x = Tensor([[1.0, 2.0, 3.0]])\n",
" y1 = mlp1(x)\n",
" print(f\"✅ MLP1 output: {y1}\")\n",
" \n",
" x2 = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])\n",
" y2 = mlp2(x2)\n",
" print(f\"✅ MLP2 output: {y2}\")\n",
" \n",
" print(\"🎉 MLP creation works!\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement create_mlp above!\")"
]
},
{
"cell_type": "markdown",
"id": "9d1c34b6",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 3: Network Visualization and Analysis\n",
"\n",
"Let's create tools to visualize and analyze network architectures. This helps us understand what our networks are doing.\n",
"\n",
"### Why Visualization Matters\n",
"- **Architecture understanding**: See how data flows through the network\n",
"- **Debugging**: Identify bottlenecks and issues\n",
"- **Design**: Compare different architectures\n",
"- **Communication**: Explain networks to others\n",
"\n",
"### What We'll Build\n",
"1. **Architecture visualization**: Show layer connections\n",
"2. **Data flow visualization**: See how data transforms\n",
"3. **Network comparison**: Compare different architectures\n",
"4. **Behavior analysis**: Understand network capabilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a74a3b28",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def visualize_network_architecture(network: Sequential, title: str = \"Network Architecture\"):\n",
" \"\"\"\n",
" Visualize the architecture of a Sequential network.\n",
" \n",
" Args:\n",
" network: Sequential network to visualize\n",
" title: Title for the plot\n",
" \n",
" TODO: Create a visualization showing the network structure.\n",
" \n",
" APPROACH:\n",
" 1. Create a matplotlib figure\n",
" 2. For each layer, draw a box showing its type and size\n",
" 3. Connect the boxes with arrows showing data flow\n",
" 4. Add labels and formatting\n",
" \n",
" EXAMPLE:\n",
" Input → Dense(3→4) → ReLU → Dense(4→2) → Sigmoid → Output\n",
" \n",
" HINTS:\n",
" - Use plt.subplots() to create the figure\n",
" - Use plt.text() to add layer labels\n",
" - Use plt.arrow() to show connections\n",
" - Add proper spacing and formatting\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1274dbc",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def visualize_network_architecture(network: Sequential, title: str = \"Network Architecture\"):\n",
" \"\"\"Visualize the architecture of a Sequential network.\"\"\"\n",
" if not _should_show_plots():\n",
" print(\"📊 Visualization disabled during testing\")\n",
" return\n",
" \n",
" fig, ax = plt.subplots(1, 1, figsize=(12, 6))\n",
" \n",
" # Calculate positions\n",
" num_layers = len(network.layers)\n",
" x_positions = np.linspace(0, 10, num_layers + 2)\n",
" \n",
" # Draw input\n",
" ax.text(x_positions[0], 0, 'Input', ha='center', va='center', \n",
" bbox=dict(boxstyle='round,pad=0.3', facecolor='lightblue'))\n",
" \n",
" # Draw layers\n",
" for i, layer in enumerate(network.layers):\n",
" layer_name = type(layer).__name__\n",
" ax.text(x_positions[i+1], 0, layer_name, ha='center', va='center',\n",
" bbox=dict(boxstyle='round,pad=0.3', facecolor='lightgreen'))\n",
" \n",
" # Draw arrow\n",
" ax.arrow(x_positions[i], 0, 0.8, 0, head_width=0.1, head_length=0.1, \n",
" fc='black', ec='black')\n",
" \n",
" # Draw output\n",
" ax.text(x_positions[-1], 0, 'Output', ha='center', va='center',\n",
" bbox=dict(boxstyle='round,pad=0.3', facecolor='lightcoral'))\n",
" \n",
" ax.set_xlim(-0.5, 10.5)\n",
" ax.set_ylim(-0.5, 0.5)\n",
" ax.set_title(title)\n",
" ax.axis('off')\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"id": "286f403e",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Network Visualization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2630d356",
"metadata": {},
"outputs": [],
"source": [
"# Test network visualization\n",
"print(\"Testing network visualization...\")\n",
"\n",
"try:\n",
" # Create a test network\n",
" test_network = Sequential([\n",
" Dense(input_size=3, output_size=4),\n",
" ReLU(),\n",
" Dense(input_size=4, output_size=2),\n",
" Sigmoid()\n",
" ])\n",
" \n",
" # Visualize the network\n",
" if _should_show_plots():\n",
" visualize_network_architecture(test_network, \"Test Network Architecture\")\n",
" print(\"✅ Network visualization created!\")\n",
" else:\n",
" print(\"✅ Network visualization skipped during testing\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement visualize_network_architecture above!\")"
]
},
{
"cell_type": "markdown",
"id": "d1b3aaee",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 4: Data Flow Analysis\n",
"\n",
"Let's create tools to analyze how data flows through the network. This helps us understand what each layer is doing.\n",
"\n",
"### Why Data Flow Analysis Matters\n",
"- **Debugging**: See where data gets corrupted\n",
"- **Optimization**: Identify bottlenecks\n",
"- **Understanding**: See what each layer computes\n",
"- **Design**: Choose appropriate layer sizes"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7bc5136d",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = \"Data Flow Through Network\"):\n",
" \"\"\"\n",
" Visualize how data flows through the network.\n",
" \n",
" Args:\n",
" network: Sequential network to analyze\n",
" input_data: Input tensor to trace through the network\n",
" title: Title for the plot\n",
" \n",
" TODO: Create a visualization showing how data transforms through each layer.\n",
" \n",
" APPROACH:\n",
" 1. Trace the input through each layer\n",
" 2. Record the output of each layer\n",
" 3. Create a visualization showing the transformations\n",
" 4. Add statistics (mean, std, range) for each layer\n",
" \n",
" EXAMPLE:\n",
" Input: [1, 2, 3] → Layer1: [1.4, 2.8] → Layer2: [1.4, 2.8] → Output: [0.7]\n",
" \n",
" HINTS:\n",
" - Use a for loop to apply each layer\n",
" - Store intermediate outputs\n",
" - Use plt.subplot() to create multiple subplots\n",
" - Show statistics for each layer output\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c318ea50",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| hide\n",
"#| export\n",
"def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = \"Data Flow Through Network\"):\n",
" \"\"\"Visualize how data flows through the network.\"\"\"\n",
" if not _should_show_plots():\n",
" print(\"📊 Visualization disabled during testing\")\n",
" return\n",
" \n",
" # Trace data through network\n",
" current_data = input_data\n",
" layer_outputs = [current_data.data.flatten()]\n",
" layer_names = ['Input']\n",
" \n",
" for layer in network.layers:\n",
" current_data = layer(current_data)\n",
" layer_outputs.append(current_data.data.flatten())\n",
" layer_names.append(type(layer).__name__)\n",
" \n",
" # Create visualization\n",
" fig, axes = plt.subplots(2, len(layer_outputs), figsize=(15, 8))\n",
" \n",
" for i, (output, name) in enumerate(zip(layer_outputs, layer_names)):\n",
" # Histogram\n",
" axes[0, i].hist(output, bins=20, alpha=0.7)\n",
" axes[0, i].set_title(f'{name}\\nShape: {output.shape}')\n",
" axes[0, i].set_xlabel('Value')\n",
" axes[0, i].set_ylabel('Frequency')\n",
" \n",
" # Statistics\n",
" stats_text = f'Mean: {np.mean(output):.3f}\\nStd: {np.std(output):.3f}\\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]'\n",
" axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes, \n",
" verticalalignment='center', fontsize=10)\n",
" axes[1, i].set_title(f'{name} Statistics')\n",
" axes[1, i].axis('off')\n",
" \n",
" plt.suptitle(title)\n",
" plt.tight_layout()\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"id": "bba1f652",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Data Flow Visualization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af4ed8de",
"metadata": {},
"outputs": [],
"source": [
"# Test data flow visualization\n",
"print(\"Testing data flow visualization...\")\n",
"\n",
"try:\n",
" # Create a test network\n",
" test_network = Sequential([\n",
" Dense(input_size=3, output_size=4),\n",
" ReLU(),\n",
" Dense(input_size=4, output_size=2),\n",
" Sigmoid()\n",
" ])\n",
" \n",
" # Test input\n",
" test_input = Tensor([[1.0, 2.0, 3.0]])\n",
" \n",
" # Visualize data flow\n",
" if _should_show_plots():\n",
" visualize_data_flow(test_network, test_input, \"Test Network Data Flow\")\n",
" print(\"✅ Data flow visualization created!\")\n",
" else:\n",
" print(\"✅ Data flow visualization skipped during testing\")\n",
" \n",
"except Exception as e:\n",
" print(f\"❌ Error: {e}\")\n",
" print(\"Make sure to implement visualize_data_flow above!\")"
]
},
{
"cell_type": "markdown",
"id": "02308b13",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 5: Network Comparison and Analysis\n",
"\n",
"Let's create tools to compare different network architectures and understand their capabilities.\n",
"\n",
"### Why Network Comparison Matters\n",
"- **Architecture selection**: Choose the right network for your problem\n",
"- **Performance analysis**: Understand trade-offs between different designs\n",
"- **Design insights**: Learn what makes networks effective\n",
"- **Research**: Compare new architectures to baselines"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c3634ab",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"#| export\n",
"def compare_networks(networks: List[Sequential], network_names: List[str], \n",
" input_data: Tensor, title: str = \"Network Comparison\"):\n",
" \"\"\"\n",
" Compare multiple networks on the same input.\n",
" \n",
" Args:\n",
" networks: List of Sequential networks to compare\n",
" network_names: Names for each network\n",
" input_data: Input tensor to test all networks\n",
" title: Title for the plot\n",
" \n",
" TODO: Create a comparison visualization showing how different networks process the same input.\n",
" \n",
" APPROACH:\n",
" 1. Run the same input through each network\n",
" 2. Collect the outputs and intermediate results\n",
" 3. Create a visualization comparing the results\n",
" 4. Show statistics and differences\n",
" \n",
" EXAMPLE:\n",
" Compare MLP vs Deep Network vs Wide Network on same input\n",
" \n",
" HINTS:\n",
" - Use a for loop to test each network\n",
" - Store outputs and any relevant statistics\n",
" - Use plt.subplot() to create comparison plots\n",
" - Show both outputs and intermediate layer results\n",
" \"\"\"\n",
" raise NotImplementedError(\"Student implementation required\")"
]
},
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "ce1d5a21",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| hide\n",
|
|
"#| export\n",
|
|
"def compare_networks(networks: List[Sequential], network_names: List[str], \n",
|
|
" input_data: Tensor, title: str = \"Network Comparison\"):\n",
|
|
" \"\"\"Compare multiple networks on the same input.\"\"\"\n",
|
|
" if not _should_show_plots():\n",
|
|
" print(\"📊 Visualization disabled during testing\")\n",
|
|
" return\n",
|
|
" \n",
|
|
" # Test all networks\n",
|
|
" outputs = []\n",
|
|
" for network in networks:\n",
|
|
" output = network(input_data)\n",
|
|
" outputs.append(output.data.flatten())\n",
|
|
" \n",
|
|
" # Create comparison plot\n",
|
|
" fig, axes = plt.subplots(2, len(networks), figsize=(15, 8))\n",
|
|
" \n",
|
|
" for i, (output, name) in enumerate(zip(outputs, network_names)):\n",
|
|
" # Output distribution\n",
|
|
" axes[0, i].hist(output, bins=20, alpha=0.7)\n",
|
|
" axes[0, i].set_title(f'{name}\\nOutput Distribution')\n",
|
|
" axes[0, i].set_xlabel('Value')\n",
|
|
" axes[0, i].set_ylabel('Frequency')\n",
|
|
" \n",
|
|
" # Statistics\n",
|
|
" stats_text = f'Mean: {np.mean(output):.3f}\\nStd: {np.std(output):.3f}\\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]\\nSize: {len(output)}'\n",
|
|
" axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes, \n",
|
|
" verticalalignment='center', fontsize=10)\n",
|
|
" axes[1, i].set_title(f'{name} Statistics')\n",
|
|
" axes[1, i].axis('off')\n",
|
|
" \n",
|
|
" plt.suptitle(title)\n",
|
|
" plt.tight_layout()\n",
|
|
" plt.show()"
|
|
]
|
|
},
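  {
   "cell_type": "markdown",
   "id": "aa11bb01",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "An ungraded sketch before the plotting version: the heart of `compare_networks` is simply running one input through several networks and summarizing each output. The network sizes and names below are arbitrary examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa11bb02",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Ungraded sketch of the core of compare_networks, without plotting:\n",
    "# run one input through several networks and print summary statistics.\n",
    "sketch_nets = [create_mlp(input_size=2, hidden_sizes=[4], output_size=1),\n",
    "               create_mlp(input_size=2, hidden_sizes=[8, 4], output_size=1)]\n",
    "sketch_x = Tensor([[1.0, -1.0]])\n",
    "for sketch_name, sketch_net in zip([\"small\", \"deep\"], sketch_nets):\n",
    "    sketch_y = sketch_net(sketch_x).data.flatten()\n",
    "    print(sketch_name, float(np.mean(sketch_y)), float(np.std(sketch_y)))"
   ]
  },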
  {
   "cell_type": "markdown",
   "id": "d16eb163",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Network Comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab17ac91",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test network comparison\n",
    "print(\"Testing network comparison...\")\n",
    "\n",
    "try:\n",
    "    # Create different networks\n",
    "    network1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)\n",
    "    network2 = create_mlp(input_size=3, hidden_sizes=[8, 4], output_size=1)\n",
    "    network3 = create_mlp(input_size=3, hidden_sizes=[2], output_size=1, activation=Tanh)\n",
    "    \n",
    "    networks = [network1, network2, network3]\n",
    "    names = [\"Small MLP\", \"Deep MLP\", \"Tanh MLP\"]\n",
    "    \n",
    "    # Test input\n",
    "    test_input = Tensor([[1.0, 2.0, 3.0]])\n",
    "    \n",
    "    # Compare networks\n",
    "    if _should_show_plots():\n",
    "        compare_networks(networks, names, test_input, \"Network Architecture Comparison\")\n",
    "        print(\"✅ Network comparison created!\")\n",
    "    else:\n",
    "        print(\"✅ Network comparison skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement compare_networks above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c61fc030",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 6: Practical Network Architectures\n",
    "\n",
    "Now let's create some practical network architectures for common machine learning tasks.\n",
    "\n",
    "### Common Network Types\n",
    "\n",
    "#### 1. **Classification Networks**\n",
    "- **Binary classification**: Output a single probability\n",
    "- **Multi-class classification**: Output a probability distribution\n",
    "- **Use cases**: Image classification, spam detection, sentiment analysis\n",
    "\n",
    "#### 2. **Regression Networks**\n",
    "- **Single output**: Predict a continuous value\n",
    "- **Multiple outputs**: Predict multiple values\n",
    "- **Use cases**: Price prediction, temperature forecasting, demand estimation\n",
    "\n",
    "#### 3. **Feature Extraction Networks**\n",
    "- **Encoder networks**: Compress data into features\n",
    "- **Use cases**: Dimensionality reduction, feature learning, representation learning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f117af1e",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def create_classification_network(input_size: int, num_classes: int, \n",
    "                                  hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"\n",
    "    Create a network for classification tasks.\n",
    "    \n",
    "    Args:\n",
    "        input_size: Number of input features\n",
    "        num_classes: Number of output classes\n",
    "        hidden_sizes: List of hidden layer sizes (default: [input_size // 2])\n",
    "    \n",
    "    Returns:\n",
    "        Sequential network for classification\n",
    "    \n",
    "    TODO: Implement classification network creation.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Use default hidden sizes if none provided\n",
    "    2. Create MLP with appropriate architecture\n",
    "    3. Use Sigmoid for binary classification (num_classes=1)\n",
    "    4. Use Softmax for multi-class classification\n",
    "    \n",
    "    EXAMPLE:\n",
    "    create_classification_network(10, 3) creates:\n",
    "    Dense(10→5) → ReLU → Dense(5→3) → Softmax\n",
    "    \n",
    "    HINTS:\n",
    "    - Use the create_mlp() function\n",
    "    - Choose the output activation based on num_classes\n",
    "    - For binary classification (num_classes=1), use Sigmoid\n",
    "    - For multi-class classification, use Softmax\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "867fa5d4",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def create_classification_network(input_size: int, num_classes: int, \n",
    "                                  hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"Create a network for classification tasks.\"\"\"\n",
    "    if hidden_sizes is None:\n",
    "        hidden_sizes = [input_size // 2]  # Default: half the input width\n",
    "    \n",
    "    # Choose the output activation based on the number of classes\n",
    "    output_activation = Sigmoid if num_classes == 1 else Softmax\n",
    "    \n",
    "    return create_mlp(input_size, hidden_sizes, num_classes, \n",
    "                      activation=ReLU, output_activation=output_activation)"
   ]
  },
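  {
   "cell_type": "markdown",
   "id": "aa11bb03",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "An ungraded sanity sketch: assuming `create_classification_network` is implemented as specified above, a binary classifier (`num_classes=1`) ends in Sigmoid, so every output value should fall in (0, 1). The input values below are arbitrary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa11bb04",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity sketch (ungraded): binary classification outputs are Sigmoid-squashed.\n",
    "# Assumes create_classification_network above has been implemented.\n",
    "sketch_net = create_classification_network(input_size=4, num_classes=1)\n",
    "sketch_out = sketch_net(Tensor([[0.5, -1.0, 2.0, 0.0]]))\n",
    "print(sketch_out)\n",
    "print(\"In (0, 1)?\", 0.0 < float(np.min(sketch_out.data)) and float(np.max(sketch_out.data)) < 1.0)"
   ]
  },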
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8888dc0c",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def create_regression_network(input_size: int, output_size: int = 1,\n",
    "                              hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"\n",
    "    Create a network for regression tasks.\n",
    "    \n",
    "    Args:\n",
    "        input_size: Number of input features\n",
    "        output_size: Number of output values (default: 1)\n",
    "        hidden_sizes: List of hidden layer sizes (default: [input_size // 2])\n",
    "    \n",
    "    Returns:\n",
    "        Sequential network for regression\n",
    "    \n",
    "    TODO: Implement regression network creation.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Use default hidden sizes if none provided\n",
    "    2. Create MLP with appropriate architecture\n",
    "    3. Use no activation on the output layer (linear output)\n",
    "    \n",
    "    EXAMPLE:\n",
    "    create_regression_network(5, 1) creates:\n",
    "    Dense(5→2) → ReLU → Dense(2→1) (no activation)\n",
    "    \n",
    "    HINTS:\n",
    "    - Use create_mlp() but with no output activation\n",
    "    - For regression, we want linear outputs (no activation)\n",
    "    - You can pass None or an identity function as output_activation\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "052bb51a",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def create_regression_network(input_size: int, output_size: int = 1,\n",
    "                              hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"Create a network for regression tasks.\"\"\"\n",
    "    if hidden_sizes is None:\n",
    "        hidden_sizes = [input_size // 2]  # Default: half the input width\n",
    "    \n",
    "    # Create MLP with a linear output (no activation) for regression\n",
    "    return create_mlp(input_size, hidden_sizes, output_size, \n",
    "                      activation=ReLU, output_activation=None)"
   ]
  },
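  {
   "cell_type": "markdown",
   "id": "aa11bb05",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "Another ungraded sanity sketch: assuming `create_regression_network` is implemented as above, its outputs are linear, so unlike the Sigmoid classifier they are not confined to a fixed range. The input values below are arbitrary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa11bb06",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity sketch (ungraded): regression outputs are linear, not squashed.\n",
    "# Assumes create_regression_network above has been implemented.\n",
    "sketch_reg = create_regression_network(input_size=3, output_size=2)\n",
    "print(sketch_reg(Tensor([[10.0, -10.0, 5.0]])))"
   ]
  },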
  {
   "cell_type": "markdown",
   "id": "5dd183e8",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Practical Networks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0cf0dc20",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test practical networks\n",
    "print(\"Testing practical networks...\")\n",
    "\n",
    "try:\n",
    "    # Test classification network\n",
    "    class_net = create_classification_network(input_size=5, num_classes=1)\n",
    "    x_class = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])\n",
    "    y_class = class_net(x_class)\n",
    "    print(f\"✅ Classification output: {y_class}\")\n",
    "    print(f\"✅ Output range: [{np.min(y_class.data):.3f}, {np.max(y_class.data):.3f}]\")\n",
    "    \n",
    "    # Test regression network\n",
    "    reg_net = create_regression_network(input_size=3, output_size=1)\n",
    "    x_reg = Tensor([[1.0, 2.0, 3.0]])\n",
    "    y_reg = reg_net(x_reg)\n",
    "    print(f\"✅ Regression output: {y_reg}\")\n",
    "    print(f\"✅ Output range: [{np.min(y_reg.data):.3f}, {np.max(y_reg.data):.3f}]\")\n",
    "    \n",
    "    print(\"🎉 Practical networks work!\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement the network creation functions above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da4b34d4",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 7: Network Behavior Analysis\n",
    "\n",
    "Let's create tools to analyze how networks behave with different inputs and understand their capabilities.\n",
    "\n",
    "### Why Behavior Analysis Matters\n",
    "- **Understanding**: Learn what patterns networks can learn\n",
    "- **Debugging**: Identify when networks fail\n",
    "- **Design**: Choose appropriate architectures\n",
    "- **Validation**: Ensure networks work as expected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9cbf0f3",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def analyze_network_behavior(network: Sequential, input_data: Tensor, \n",
    "                             title: str = \"Network Behavior Analysis\"):\n",
    "    \"\"\"\n",
    "    Analyze how a network behaves with different inputs.\n",
    "    \n",
    "    Args:\n",
    "        network: Sequential network to analyze\n",
    "        input_data: Input tensor to test\n",
    "        title: Title for the plot\n",
    "    \n",
    "    TODO: Create an analysis showing network behavior and capabilities.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Test the network with the given input\n",
    "    2. Analyze the output characteristics\n",
    "    3. Test with variations of the input\n",
    "    4. Create visualizations showing behavior patterns\n",
    "    \n",
    "    EXAMPLE:\n",
    "    Test network with original input and noisy versions\n",
    "    Show how output changes with input variations\n",
    "    \n",
    "    HINTS:\n",
    "    - Test the original input\n",
    "    - Create variations (noise, scaling, etc.)\n",
    "    - Compare outputs across variations\n",
    "    - Show statistics and patterns\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f002ab23",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def analyze_network_behavior(network: Sequential, input_data: Tensor, \n",
    "                             title: str = \"Network Behavior Analysis\"):\n",
    "    \"\"\"Analyze how a network behaves with different inputs.\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        print(\"📊 Visualization disabled during testing\")\n",
    "        return\n",
    "    \n",
    "    # Create variations (noise level 0.0 reproduces the original input)\n",
    "    noise_levels = [0.0, 0.1, 0.2, 0.5]\n",
    "    outputs = []\n",
    "    \n",
    "    for noise in noise_levels:\n",
    "        noisy_input = Tensor(input_data.data + noise * np.random.randn(*input_data.data.shape))\n",
    "        output = network(noisy_input)\n",
    "        outputs.append(output.data.flatten())\n",
    "    \n",
    "    # Create analysis plot\n",
    "    fig, axes = plt.subplots(2, 2, figsize=(12, 10))\n",
    "    \n",
    "    # Original output\n",
    "    axes[0, 0].hist(outputs[0], bins=20, alpha=0.7)\n",
    "    axes[0, 0].set_title('Original Input Output')\n",
    "    axes[0, 0].set_xlabel('Value')\n",
    "    axes[0, 0].set_ylabel('Frequency')\n",
    "    \n",
    "    # Output stability\n",
    "    output_means = [np.mean(out) for out in outputs]\n",
    "    output_stds = [np.std(out) for out in outputs]\n",
    "    axes[0, 1].plot(noise_levels, output_means, 'bo-', label='Mean')\n",
    "    axes[0, 1].fill_between(noise_levels, \n",
    "                            [m-s for m, s in zip(output_means, output_stds)],\n",
    "                            [m+s for m, s in zip(output_means, output_stds)], \n",
    "                            alpha=0.3, label='±1 Std')\n",
    "    axes[0, 1].set_xlabel('Noise Level')\n",
    "    axes[0, 1].set_ylabel('Output Value')\n",
    "    axes[0, 1].set_title('Output Stability')\n",
    "    axes[0, 1].legend()\n",
    "    \n",
    "    # Output distribution comparison\n",
    "    for output, noise in zip(outputs, noise_levels):\n",
    "        axes[1, 0].hist(output, bins=20, alpha=0.5, label=f'Noise={noise}')\n",
    "    axes[1, 0].set_xlabel('Output Value')\n",
    "    axes[1, 0].set_ylabel('Frequency')\n",
    "    axes[1, 0].set_title('Output Distribution Comparison')\n",
    "    axes[1, 0].legend()\n",
    "    \n",
    "    # Statistics\n",
    "    stats_text = f'Original Mean: {np.mean(outputs[0]):.3f}\\nOriginal Std: {np.std(outputs[0]):.3f}\\nOutput Range: [{np.min(outputs[0]):.3f}, {np.max(outputs[0]):.3f}]'\n",
    "    axes[1, 1].text(0.1, 0.5, stats_text, transform=axes[1, 1].transAxes, \n",
    "                    verticalalignment='center', fontsize=10)\n",
    "    axes[1, 1].set_title('Network Statistics')\n",
    "    axes[1, 1].axis('off')\n",
    "    \n",
    "    plt.suptitle(title)\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58c4d2fe",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Network Behavior Analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4241defa",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test network behavior analysis\n",
    "print(\"Testing network behavior analysis...\")\n",
    "\n",
    "try:\n",
    "    # Create a test network\n",
    "    test_network = create_classification_network(input_size=3, num_classes=1)\n",
    "    test_input = Tensor([[1.0, 2.0, 3.0]])\n",
    "    \n",
    "    # Analyze behavior\n",
    "    if _should_show_plots():\n",
    "        analyze_network_behavior(test_network, test_input, \"Test Network Behavior\")\n",
    "        print(\"✅ Network behavior analysis created!\")\n",
    "    else:\n",
    "        print(\"✅ Network behavior analysis skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement analyze_network_behavior above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5e6395d0",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## 🎯 Module Summary\n",
    "\n",
    "Congratulations! You've built the foundation of neural network architectures:\n",
    "\n",
    "### What You've Accomplished\n",
    "✅ **Sequential Networks**: Composing layers into complete architectures  \n",
    "✅ **MLP Creation**: Building multi-layer perceptrons  \n",
    "✅ **Network Visualization**: Understanding architecture and data flow  \n",
    "✅ **Network Comparison**: Analyzing different architectures  \n",
    "✅ **Practical Networks**: Classification and regression networks  \n",
    "✅ **Behavior Analysis**: Understanding network capabilities  \n",
    "\n",
    "### Key Concepts You've Learned\n",
    "- **Networks** are compositions of layers that transform data\n",
    "- **Architecture design** determines network capabilities\n",
    "- **Sequential networks** are the most fundamental building block\n",
    "- **Different architectures** solve different problems\n",
    "- **Visualization tools** help understand network behavior\n",
    "\n",
    "### What's Next\n",
    "In the next modules, you'll build on this foundation:\n",
    "- **Autograd**: Enable automatic differentiation for training\n",
    "- **Training**: Learn parameters using gradients and optimizers\n",
    "- **Loss Functions**: Define objectives for learning\n",
    "- **Applications**: Solve real problems with neural networks\n",
    "\n",
    "### Real-World Connection\n",
    "Your network architectures are now ready to:\n",
    "- Compose layers into complete neural networks\n",
    "- Create specialized architectures for different tasks\n",
    "- Analyze and understand network behavior\n",
    "- Integrate with the rest of the TinyTorch ecosystem\n",
    "\n",
    "**Ready for the next challenge?** Let's move on to automatic differentiation to enable training!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "090bbc0d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Final verification\n",
    "print(\"\\n\" + \"=\"*50)\n",
    "print(\"🎉 NETWORKS MODULE COMPLETE!\")\n",
    "print(\"=\"*50)\n",
    "print(\"✅ Sequential network implementation\")\n",
    "print(\"✅ MLP creation and architecture design\")\n",
    "print(\"✅ Network visualization and analysis\")\n",
    "print(\"✅ Network comparison tools\")\n",
    "print(\"✅ Practical classification and regression networks\")\n",
    "print(\"✅ Network behavior analysis\")\n",
    "print(\"\\n🚀 Ready to enable training with autograd in the next module!\")"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "main_language": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}