{
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e8d2a035",
|
|
"metadata": {
|
|
"cell_marker": "\"\"\""
|
|
},
|
|
"source": [
|
|
"# Networks - Neural Network Architectures\n",
|
|
"\n",
|
|
"Welcome to the Networks module! This is where we compose layers into complete neural network architectures.\n",
|
|
"\n",
|
|
"## Learning Goals\n",
|
|
"- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`\n",
|
|
"- Build the Sequential network architecture for composing layers\n",
|
|
"- Create common network patterns like MLPs (Multi-Layer Perceptrons)\n",
|
|
"- Visualize network architectures and understand their capabilities\n",
|
|
"- Master forward pass inference through complete networks\n",
|
|
"\n",
|
|
"## Build → Use → Reflect\n",
|
|
"1. **Build**: Sequential networks that compose layers into complete architectures\n",
|
|
"2. **Use**: Create different network patterns and run inference\n",
|
|
"3. **Reflect**: How architecture design affects network behavior and capability\n",
|
|
"\n",
|
|
"## What You'll Learn\n",
|
|
"By the end of this module, you'll understand:\n",
|
|
"- How simple layers combine to create complex behaviors\n",
|
|
"- The fundamental Sequential architecture pattern\n",
|
|
"- How to build MLPs with any number of layers\n",
|
|
"- Different network architectures (shallow, deep, wide)\n",
|
|
"- How neural networks approximate complex functions"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "85869d04",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "networks-imports",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": false,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| default_exp core.networks\n",
|
|
"\n",
|
|
"#| export\n",
|
|
"import numpy as np\n",
|
|
"import sys\n",
|
|
"import os\n",
|
|
"from typing import List, Union, Optional, Callable\n",
|
|
"import matplotlib.pyplot as plt\n",
|
|
"\n",
|
|
"# Import all the building blocks we need - try package first, then local modules\n",
|
|
"try:\n",
|
|
" from tinytorch.core.tensor import Tensor\n",
|
|
" from tinytorch.core.layers import Dense\n",
|
|
" from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax\n",
|
|
"except ImportError:\n",
|
|
" # For development, import from local modules\n",
|
|
" sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))\n",
|
|
" sys.path.append(os.path.join(os.path.dirname(__file__), '..', '02_activations'))\n",
|
|
" sys.path.append(os.path.join(os.path.dirname(__file__), '..', '03_layers'))\n",
|
|
" from tensor_dev import Tensor\n",
|
|
" from activations_dev import ReLU, Sigmoid, Tanh, Softmax\n",
|
|
" from layers_dev import Dense"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "2a0a2310",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "networks-setup",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": false,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| hide\n",
|
|
"#| export\n",
|
|
"def _should_show_plots():\n",
|
|
" \"\"\"Check if we should show plots (disable during testing)\"\"\"\n",
|
|
" # Check multiple conditions that indicate we're in test mode\n",
|
|
" is_pytest = (\n",
|
|
" 'pytest' in sys.modules or\n",
|
|
" 'test' in sys.argv or\n",
|
|
" os.environ.get('PYTEST_CURRENT_TEST') is not None or\n",
|
|
" any('test' in arg for arg in sys.argv) or\n",
|
|
" any('pytest' in arg for arg in sys.argv)\n",
|
|
" )\n",
|
|
" \n",
|
|
" # Show plots in development mode (when not in test mode)\n",
|
|
" return not is_pytest"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "83de2607",
|
|
"metadata": {
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "networks-welcome",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": false,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"print(\"🔥 TinyTorch Networks Module\")\n",
|
|
"print(f\"NumPy version: {np.__version__}\")\n",
|
|
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
|
|
"print(\"Ready to build neural network architectures!\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "eb8d5033",
|
|
"metadata": {
|
|
"cell_marker": "\"\"\""
|
|
},
|
|
"source": [
|
|
"## 📦 Where This Code Lives in the Final Package\n",
|
|
"\n",
|
|
"**Learning Side:** You work in `modules/source/04_networks/networks_dev.py` \n",
|
|
"**Building Side:** Code exports to `tinytorch.core.networks`\n",
|
|
"\n",
|
|
"```python\n",
|
|
"# Final package structure:\n",
|
|
"from tinytorch.core.networks import Sequential, create_mlp # Network architectures!\n",
|
|
"from tinytorch.core.layers import Dense, Conv2D # Building blocks\n",
|
|
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh # Nonlinearity\n",
|
|
"from tinytorch.core.tensor import Tensor # Foundation\n",
|
|
"```\n",
|
|
"\n",
|
|
"**Why this matters:**\n",
|
|
"- **Learning:** Focused modules for deep understanding\n",
|
|
"- **Production:** Proper organization like PyTorch's `torch.nn.Sequential`\n",
|
|
"- **Consistency:** All network architectures live together in `core.networks`\n",
|
|
"- **Integration:** Works seamlessly with layers, activations, and tensors"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "d7ef9807",
|
|
"metadata": {
|
|
"cell_marker": "\"\"\""
|
|
},
|
|
"source": [
|
|
"## Step 1: Understanding Neural Networks as Function Composition\n",
|
|
"\n",
|
|
"### What is a Neural Network?\n",
|
|
"A neural network is simply **function composition** - chaining simple functions together to create complex behaviors:\n",
|
|
"\n",
|
|
"```\n",
|
|
"f(x) = f_n(f_{n-1}(...f_2(f_1(x))))\n",
|
|
"```\n",
|
|
"\n",
|
|
"### Real-World Analogy: Assembly Line\n",
|
|
"Think of an assembly line in a factory:\n",
|
|
"- **Input:** Raw materials (data)\n",
|
|
"- **Stations:** Each worker (layer) transforms the product\n",
|
|
"- **Output:** Final product (predictions)\n",
|
|
"\n",
|
|
"### The Power of Composition\n",
|
|
"```python\n",
|
|
"# Simple functions\n",
|
|
"def add_one(x): return x + 1\n",
|
|
"def multiply_two(x): return x * 2\n",
|
|
"def square(x): return x * x\n",
|
|
"\n",
|
|
"# Composed function\n",
|
|
"def complex_function(x):\n",
|
|
" return square(multiply_two(add_one(x)))\n",
|
|
" \n",
|
|
"# This is what neural networks do!\n",
|
|
"```\n",
"\n",
"### Why This Matters\n",
"- **Universal Approximation:** MLPs with enough hidden units can approximate any continuous function on a compact domain\n",
"- **Hierarchical Learning:** Early layers learn simple features, later layers learn complex patterns\n",
|
|
"- **Composability:** Mix and match layers to create custom architectures\n",
|
|
"- **Scalability:** Add more layers or make them wider as needed\n",
|
|
"\n",
|
|
"### From Modules We've Built\n",
|
|
"- **Tensors:** The data containers that flow through networks\n",
|
|
"- **Activations:** The nonlinear transformations that enable complex behaviors\n",
|
|
"- **Layers:** The building blocks that transform data\n",
|
|
"\n",
|
|
"Now let's build our first network architecture!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "d761b0e8",
|
|
"metadata": {
|
|
"cell_marker": "\"\"\"",
|
|
"lines_to_next_cell": 1
|
|
},
|
|
"source": [
|
|
"## Step 2: Building the Sequential Network\n",
|
|
"\n",
|
|
"### What is Sequential?\n",
|
|
"**Sequential** is the most fundamental network architecture - it applies layers in order:\n",
|
|
"\n",
|
|
"```\n",
|
|
"Sequential([layer1, layer2, layer3]) \n",
|
|
"→ f(x) = layer3(layer2(layer1(x)))\n",
|
|
"```\n",
|
|
"\n",
|
|
"### Why Sequential Matters\n",
|
|
"- **Foundation:** Every neural network library has this pattern\n",
|
|
"- **Simplicity:** Easy to understand and implement\n",
|
|
"- **Flexibility:** Can compose any layers in any order\n",
|
|
"- **Building Block:** Foundation for more complex architectures\n",
|
|
"\n",
|
|
"### The Sequential Pattern\n",
|
|
"```python\n",
|
|
"# PyTorch style\n",
|
|
"model = nn.Sequential(\n",
|
|
" nn.Linear(784, 128),\n",
|
|
" nn.ReLU(),\n",
|
|
" nn.Linear(128, 10)\n",
|
|
")\n",
|
|
"\n",
|
|
"# Our TinyTorch style\n",
|
|
"model = Sequential([\n",
|
|
" Dense(784, 128),\n",
|
|
" ReLU(),\n",
|
|
" Dense(128, 10)\n",
|
|
"])\n",
|
|
"```\n",
|
|
"\n",
|
|
"Let's implement this fundamental architecture!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "442a13a0",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "sequential-class",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| export\n",
|
|
"class Sequential:\n",
|
|
" \"\"\"\n",
|
|
" Sequential Network: Composes layers in sequence\n",
|
|
" \n",
|
|
" The most fundamental network architecture.\n",
|
|
" Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))\n",
|
|
" \"\"\"\n",
|
|
" \n",
|
|
" def __init__(self, layers: Optional[List] = None):\n",
|
|
" \"\"\"\n",
|
|
" Initialize Sequential network with layers.\n",
|
|
" \n",
|
|
" Args:\n",
|
|
" layers: List of layers to compose in order (optional, defaults to empty list)\n",
|
|
" \n",
|
|
" TODO: Store the layers and implement forward pass\n",
|
|
" \n",
|
|
" APPROACH:\n",
|
|
" 1. Store the layers list as an instance variable\n",
|
|
" 2. Initialize empty list if no layers provided\n",
|
|
" 3. Prepare for forward pass implementation\n",
|
|
" \n",
|
|
" EXAMPLE:\n",
|
|
" Sequential([Dense(3,4), ReLU(), Dense(4,2)])\n",
|
|
" creates a 3-layer network: Dense → ReLU → Dense\n",
|
|
" \n",
|
|
" HINTS:\n",
|
|
" - Use self.layers to store the layers\n",
|
|
" - Handle empty initialization case\n",
|
|
" \"\"\"\n",
|
|
" ### BEGIN SOLUTION\n",
|
|
" self.layers = layers if layers is not None else []\n",
|
|
" ### END SOLUTION\n",
|
|
" \n",
|
|
" def forward(self, x: Tensor) -> Tensor:\n",
|
|
" \"\"\"\n",
|
|
" Forward pass through all layers in sequence.\n",
|
|
" \n",
|
|
" Args:\n",
|
|
" x: Input tensor\n",
|
|
" \n",
|
|
" Returns:\n",
|
|
" Output tensor after passing through all layers\n",
|
|
" \n",
|
|
" TODO: Implement sequential forward pass through all layers\n",
|
|
" \n",
|
|
" APPROACH:\n",
|
|
" 1. Start with the input tensor\n",
|
|
" 2. Apply each layer in sequence\n",
|
|
" 3. Each layer's output becomes the next layer's input\n",
|
|
" 4. Return the final output\n",
|
|
" \n",
|
|
" EXAMPLE:\n",
|
|
" Input: Tensor([[1, 2, 3]])\n",
|
|
" Layer1 (Dense): Tensor([[1.4, 2.8]])\n",
|
|
" Layer2 (ReLU): Tensor([[1.4, 2.8]])\n",
|
|
" Layer3 (Dense): Tensor([[0.7]])\n",
|
|
" Output: Tensor([[0.7]])\n",
|
|
" \n",
|
|
" HINTS:\n",
|
|
" - Use a for loop: for layer in self.layers:\n",
|
|
" - Apply each layer: x = layer(x)\n",
|
|
" - The output of one layer becomes input to the next\n",
|
|
" - Return the final result\n",
|
|
" \"\"\"\n",
|
|
" ### BEGIN SOLUTION\n",
|
|
" # Apply each layer in sequence\n",
|
|
" for layer in self.layers:\n",
|
|
" x = layer(x)\n",
|
|
" return x\n",
|
|
" ### END SOLUTION\n",
|
|
" \n",
|
|
" def __call__(self, x: Tensor) -> Tensor:\n",
|
|
" \"\"\"Make the network callable: sequential(x) instead of sequential.forward(x)\"\"\"\n",
|
|
" return self.forward(x)\n",
|
|
" \n",
|
|
" def add(self, layer):\n",
|
|
" \"\"\"Add a layer to the network.\"\"\"\n",
|
|
" self.layers.append(layer)"
|
|
]
|
|
},
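{
"cell_type": "markdown",
"id": "seq-add-sketch",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Sketch: Growing a Network with `add`\n",
"\n",
"Besides passing every layer up front, `Sequential` also exposes an `add` method. A minimal sketch of incremental construction, assuming the `Sequential`, `Dense`, and `ReLU` implementations above:\n",
"\n",
"```python\n",
"net = Sequential()        # start empty\n",
"net.add(Dense(3, 4))      # append layers one at a time\n",
"net.add(ReLU())\n",
"net.add(Dense(4, 2))\n",
"\n",
"y = net(Tensor([[1.0, 2.0, 3.0]]))  # same forward pass as before\n",
"```\n",
"\n",
"Building incrementally is handy when the architecture is decided at runtime, e.g. when the number of layers comes from a config file."
]
},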
{
"cell_type": "markdown",
"id": "8d0cb245",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Unit Test: Sequential Network\n",
"\n",
"Let's test your Sequential network implementation! This is the foundation of all neural network architectures.\n",
"\n",
"**This is a unit test** - it tests one specific class (Sequential network) in isolation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "58bde5f1",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-sequential-immediate",
"locked": true,
"points": 10,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test Sequential network immediately after implementation\n",
"print(\"🔬 Unit Test: Sequential Network...\")\n",
"\n",
"# Create a simple 2-layer network: 3 → 4 → 2\n",
|
|
"try:\n",
|
|
" network = Sequential([\n",
|
|
" Dense(input_size=3, output_size=4),\n",
|
|
" ReLU(),\n",
|
|
" Dense(input_size=4, output_size=2),\n",
|
|
" Sigmoid()\n",
|
|
" ])\n",
|
|
" \n",
|
|
" print(f\"Network created with {len(network.layers)} layers\")\n",
|
|
" print(\"✅ Sequential network creation successful\")\n",
|
|
" \n",
|
|
" # Test with sample data\n",
|
|
" x = Tensor([[1.0, 2.0, 3.0]])\n",
|
|
" print(f\"Input: {x}\")\n",
|
|
" \n",
|
|
" # Forward pass\n",
|
|
" y = network(x)\n",
|
|
" print(f\"Output: {y}\")\n",
|
|
" print(f\"Output shape: {y.shape}\")\n",
|
|
" \n",
|
|
" # Verify the network works\n",
|
|
" assert y.shape == (1, 2), f\"Expected shape (1, 2), got {y.shape}\"\n",
|
|
" print(\"✅ Sequential network produces correct output shape\")\n",
|
|
" \n",
|
|
" # Test that sigmoid output is in valid range\n",
|
|
" assert np.all(y.data >= 0) and np.all(y.data <= 1), \"Sigmoid output should be between 0 and 1\"\n",
|
|
" print(\"✅ Sequential network output is in valid range\")\n",
|
|
" \n",
|
|
" # Test that layers are stored correctly\n",
|
|
" assert len(network.layers) == 4, f\"Expected 4 layers, got {len(network.layers)}\"\n",
|
|
" print(\"✅ Sequential network stores layers correctly\")\n",
|
|
" \n",
|
|
" # Test batch processing\n",
|
|
" x_batch = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])\n",
|
|
" y_batch = network(x_batch)\n",
|
|
" assert y_batch.shape == (2, 2), f\"Expected batch shape (2, 2), got {y_batch.shape}\"\n",
|
|
" print(\"✅ Sequential network handles batch processing\")\n",
|
|
" \n",
|
|
"except Exception as e:\n",
|
|
" print(f\"❌ Sequential network test failed: {e}\")\n",
|
|
" raise\n",
|
|
"\n",
|
|
"# Show the network architecture\n",
|
|
"print(\"🎯 Sequential network behavior:\")\n",
|
|
"print(\" Applies layers in sequence: f(g(h(x)))\")\n",
|
|
"print(\" Input flows through each layer in order\")\n",
|
|
"print(\" Output of layer i becomes input of layer i+1\")\n",
|
|
"print(\"📈 Progress: Sequential network ✓\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "86f50a55",
|
|
"metadata": {
|
|
"cell_marker": "\"\"\"",
|
|
"lines_to_next_cell": 1
|
|
},
|
|
"source": [
|
|
"## Step 3: Building Multi-Layer Perceptrons (MLPs)\n",
|
|
"\n",
|
|
"### What is an MLP?\n",
|
|
"A **Multi-Layer Perceptron** is the classic neural network architecture:\n",
|
|
"\n",
|
|
"```\n",
|
|
"Input → Dense → Activation → Dense → Activation → ... → Dense → Output\n",
|
|
"```\n",
|
|
"\n",
|
|
"### Why MLPs are Important\n",
|
|
"- **Universal approximation**: Can approximate any continuous function\n",
|
|
"- **Foundation**: Basis for understanding all neural networks\n",
|
|
"- **Versatile**: Works for classification, regression, and more\n",
|
|
"- **Simple**: Easy to understand and implement\n",
|
|
"\n",
|
|
"### MLP Architecture Pattern\n",
|
|
"```\n",
|
|
"create_mlp(3, [4, 2], 1) creates:\n",
|
|
"Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid\n",
|
|
"```\n",
|
|
"\n",
|
|
"### Real-World Applications\n",
|
|
"- **Tabular data**: Customer analytics, financial modeling\n",
|
|
"- **Feature learning**: Learning representations from raw data\n",
|
|
"- **Classification**: Spam detection, medical diagnosis\n",
|
|
"- **Regression**: Price prediction, time series forecasting\n",
|
|
"\n",
|
|
"### The MLP Factory Pattern\n",
|
|
"Instead of manually creating each layer, we'll build a function that creates MLPs automatically!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "39d18c19",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "create-mlp",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| export\n",
|
|
"def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int, \n",
|
|
" activation=ReLU, output_activation=Sigmoid) -> Sequential:\n",
|
|
" \"\"\"\n",
|
|
" Create a Multi-Layer Perceptron (MLP) network.\n",
|
|
" \n",
|
|
" Args:\n",
|
|
" input_size: Number of input features\n",
|
|
" hidden_sizes: List of hidden layer sizes\n",
|
|
" output_size: Number of output features\n",
|
|
" activation: Activation function for hidden layers (default: ReLU)\n",
|
|
" output_activation: Activation function for output layer (default: Sigmoid)\n",
|
|
" \n",
|
|
" Returns:\n",
|
|
" Sequential network with MLP architecture\n",
|
|
" \n",
|
|
" TODO: Implement MLP creation with alternating Dense and activation layers.\n",
|
|
" \n",
|
|
" APPROACH:\n",
|
|
" 1. Start with an empty list of layers\n",
|
|
" 2. Add layers in this pattern:\n",
|
|
" - Dense(input_size → first_hidden_size)\n",
|
|
" - Activation()\n",
|
|
" - Dense(first_hidden_size → second_hidden_size)\n",
|
|
" - Activation()\n",
|
|
" - ...\n",
|
|
" - Dense(last_hidden_size → output_size)\n",
|
|
" - Output_activation()\n",
|
|
" 3. Return Sequential(layers)\n",
|
|
" \n",
|
|
" EXAMPLE:\n",
|
|
" create_mlp(3, [4, 2], 1) creates:\n",
|
|
" Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid\n",
|
|
" \n",
|
|
" HINTS:\n",
|
|
" - Start with layers = []\n",
|
|
" - Track current_size starting with input_size\n",
|
|
" - For each hidden_size: add Dense(current_size, hidden_size), then activation\n",
|
|
" - Finally add Dense(last_hidden_size, output_size), then output_activation\n",
|
|
" - Return Sequential(layers)\n",
|
|
" \"\"\"\n",
|
|
" layers = []\n",
|
|
" current_size = input_size\n",
|
|
" \n",
|
|
" # Add hidden layers with activations\n",
|
|
" for hidden_size in hidden_sizes:\n",
|
|
" layers.append(Dense(current_size, hidden_size))\n",
|
|
" layers.append(activation())\n",
|
|
" current_size = hidden_size\n",
|
|
" \n",
|
|
" # Add output layer with output activation\n",
|
|
" layers.append(Dense(current_size, output_size))\n",
|
|
" layers.append(output_activation())\n",
|
|
" \n",
|
|
" return Sequential(layers)"
|
|
]
|
|
},
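{
"cell_type": "markdown",
"id": "mlp-arith-sketch",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Checking the Arithmetic: Layers and Parameters\n",
"\n",
"A quick sanity check on `create_mlp`: with `H` hidden layers it builds `H + 1` Dense layers, `H` hidden activations, and one output activation, so `len(layers) == 2 * (H + 1)`. Assuming each `Dense(n, m)` holds `n*m` weights plus `m` biases, a hedged sketch of the counts for `create_mlp(3, [4, 2], 1)`:\n",
"\n",
"```python\n",
"mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)\n",
"assert len(mlp.layers) == 2 * (2 + 1)  # 6 layers for H = 2 hidden layers\n",
"\n",
"# Parameter count by hand: Dense(3→4) + Dense(4→2) + Dense(2→1)\n",
"params = (3*4 + 4) + (4*2 + 2) + (2*1 + 1)\n",
"print(params)  # 29 trainable parameters\n",
"```\n",
"\n",
"Counting parameters this way is a good habit: they grow with the product of adjacent layer widths, so wide layers dominate a network's memory footprint."
]
},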
{
"cell_type": "markdown",
"id": "10852ab7",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Unit Test: MLP Creation\n",
"\n",
"Let's test your MLP creation function! This builds complete neural networks with a single function call.\n",
"\n",
"**This is a unit test** - it tests one specific function (create_mlp) in isolation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8a67516",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-mlp-immediate",
"locked": true,
"points": 10,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test MLP creation immediately after implementation\n",
"print(\"🔬 Unit Test: MLP Creation...\")\n",
"\n",
"# Create a simple MLP: 3 → 4 → 2 → 1\n",
"try:\n",
"    mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)\n",
"\n",
"    print(f\"MLP created with {len(mlp.layers)} layers\")\n",
"    print(\"✅ MLP creation successful\")\n",
"\n",
"    # Test the structure - should have 6 layers: Dense, ReLU, Dense, ReLU, Dense, Sigmoid\n",
"    expected_layers = 6  # 3 Dense + 2 ReLU + 1 Sigmoid\n",
"    assert len(mlp.layers) == expected_layers, f\"Expected {expected_layers} layers, got {len(mlp.layers)}\"\n",
"    print(\"✅ MLP has correct number of layers\")\n",
"\n",
"    # Test layer types\n",
"    layer_types = [type(layer).__name__ for layer in mlp.layers]\n",
"    expected_pattern = ['Dense', 'ReLU', 'Dense', 'ReLU', 'Dense', 'Sigmoid']\n",
"    assert layer_types == expected_pattern, f\"Expected pattern {expected_pattern}, got {layer_types}\"\n",
"    print(\"✅ MLP follows correct layer pattern\")\n",
"\n",
"    # Test with sample data\n",
"    x = Tensor([[1.0, 2.0, 3.0]])\n",
"    y = mlp(x)\n",
"    print(f\"MLP input: {x}\")\n",
"    print(f\"MLP output: {y}\")\n",
"    print(f\"MLP output shape: {y.shape}\")\n",
"\n",
"    # Verify the output\n",
"    assert y.shape == (1, 1), f\"Expected shape (1, 1), got {y.shape}\"\n",
"    print(\"✅ MLP produces correct output shape\")\n",
"\n",
"    # Test that sigmoid output is in valid range\n",
"    assert np.all(y.data >= 0) and np.all(y.data <= 1), \"Sigmoid output should be between 0 and 1\"\n",
"    print(\"✅ MLP output is in valid range\")\n",
"\n",
"except Exception as e:\n",
"    print(f\"❌ MLP creation test failed: {e}\")\n",
"    raise\n",
"\n",
"# Test different architectures\n",
"try:\n",
"    # Test shallow network\n",
"    shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)\n",
"    assert len(shallow_net.layers) == 4, f\"Shallow network should have 4 layers, got {len(shallow_net.layers)}\"\n",
"\n",
"    # Test deep network\n",
"    deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)\n",
"    assert len(deep_net.layers) == 8, f\"Deep network should have 8 layers, got {len(deep_net.layers)}\"\n",
"\n",
"    # Test wide network\n",
"    wide_net = create_mlp(input_size=3, hidden_sizes=[10], output_size=1)\n",
"    assert len(wide_net.layers) == 4, f\"Wide network should have 4 layers, got {len(wide_net.layers)}\"\n",
"\n",
"    print(\"✅ Different MLP architectures work correctly\")\n",
"\n",
"except Exception as e:\n",
"    print(f\"❌ MLP architecture test failed: {e}\")\n",
"    raise\n",
"\n",
"# Show the MLP pattern\n",
"print(\"🎯 MLP creation pattern:\")\n",
"print(\"   Input → Dense → Activation → Dense → Activation → ... → Dense → Output_Activation\")\n",
"print(\"   Automatically creates the complete architecture\")\n",
"print(\"   Handles any number of hidden layers\")\n",
"print(\"📈 Progress: Sequential network ✓, MLP creation ✓\")"
]
},
{
"cell_type": "markdown",
"id": "67d76916",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 4: Understanding Network Architectures\n",
"\n",
"### Architecture Patterns\n",
"Different network architectures solve different problems:\n",
"\n",
"#### **Shallow vs Deep Networks**\n",
"```python\n",
"# Shallow: 1 hidden layer\n",
"shallow = create_mlp(10, [20], 1)\n",
"\n",
"# Deep: Many hidden layers\n",
"deep = create_mlp(10, [20, 20, 20], 1)\n",
"```\n",
"\n",
"#### **Narrow vs Wide Networks**\n",
"```python\n",
"# Narrow: Few neurons per layer\n",
"narrow = create_mlp(10, [5, 5], 1)\n",
"\n",
"# Wide: Many neurons per layer\n",
"wide = create_mlp(10, [50], 1)\n",
"```\n",
"\n",
"### Why Architecture Matters\n",
"- **Capacity:** More parameters can learn more complex patterns\n",
"- **Depth:** Enables hierarchical feature learning\n",
"- **Width:** Allows parallel processing of features\n",
"- **Efficiency:** Balance between performance and computation\n",
"\n",
"### Different Activation Functions\n",
" ```python\n",
|
|
"# ReLU networks (most common)\n",
|
|
"relu_net = create_mlp(10, [20], 1, activation=ReLU)\n",
|
|
" \n",
|
|
"# Tanh networks (centered around 0)\n",
|
|
"tanh_net = create_mlp(10, [20], 1, activation=Tanh)\n",
|
|
" \n",
|
|
"# Multi-class classification\n",
|
|
"classifier = create_mlp(10, [20], 3, output_activation=Softmax)\n",
|
|
" ```\n",
|
|
"\n",
|
|
"Let's test different architectures!"
|
|
]
|
|
},
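{
"cell_type": "markdown",
"id": "capacity-sketch",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Sketch: Comparing Capacity\n",
"\n",
"To make the capacity claim concrete, here is a hedged back-of-the-envelope comparison of the three patterns above, counting each `Dense(n, m)` as `n*m + m` parameters:\n",
"\n",
"```python\n",
"# shallow = create_mlp(10, [20], 1):      Dense(10→20) + Dense(20→1)\n",
"print((10*20 + 20) + (20*1 + 1))          # 241 parameters\n",
"\n",
"# deep = create_mlp(10, [20, 20, 20], 1): three hidden layers\n",
"print((10*20 + 20) + 2*(20*20 + 20) + (20*1 + 1))  # 1081 parameters\n",
"\n",
"# wide = create_mlp(10, [50], 1):         one big hidden layer\n",
"print((10*50 + 50) + (50*1 + 1))          # 601 parameters\n",
"```\n",
"\n",
"Depth adds parameters between hidden layers; width adds them at the input and output boundaries. Neither is free, and the right balance depends on the problem."
]
},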
{
"cell_type": "markdown",
"id": "ed72681d",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Unit Test: Architecture Variations\n",
"\n",
"Let's test different network architectures to understand their behavior.\n",
"\n",
"**This is a unit test** - it tests architectural variations in isolation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9daa8d0d",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-architectures",
"locked": true,
"points": 10,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test different architectures\n",
"print(\"🔬 Unit Test: Network Architecture Variations...\")\n",
"\n",
"try:\n",
"    # Test different activation functions\n",
"    relu_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1, activation=ReLU)\n",
"    tanh_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1, activation=Tanh)\n",
"\n",
"    # Test different output activations\n",
"    classifier = create_mlp(input_size=3, hidden_sizes=[4], output_size=3, output_activation=Softmax)\n",
"\n",
"    # Test with sample data\n",
"    x = Tensor([[1.0, 2.0, 3.0]])\n",
"\n",
"    # Test ReLU network\n",
"    y_relu = relu_net(x)\n",
"    assert y_relu.shape == (1, 1), \"ReLU network should work\"\n",
"    print(\"✅ ReLU network works correctly\")\n",
"\n",
"    # Test Tanh network\n",
"    y_tanh = tanh_net(x)\n",
"    assert y_tanh.shape == (1, 1), \"Tanh network should work\"\n",
"    print(\"✅ Tanh network works correctly\")\n",
"\n",
"    # Test multi-class classifier\n",
"    y_multi = classifier(x)\n",
"    assert y_multi.shape == (1, 3), \"Multi-class classifier should work\"\n",
"\n",
"    # Check softmax properties\n",
"    assert abs(np.sum(y_multi.data) - 1.0) < 1e-6, \"Softmax outputs should sum to 1\"\n",
"    print(\"✅ Multi-class classifier with Softmax works correctly\")\n",
"\n",
"    # Test different architectures\n",
"    shallow = create_mlp(input_size=4, hidden_sizes=[5], output_size=1)\n",
"    deep = create_mlp(input_size=4, hidden_sizes=[5, 5, 5], output_size=1)\n",
"    wide = create_mlp(input_size=4, hidden_sizes=[20], output_size=1)\n",
"\n",
"    x_test = Tensor([[1.0, 2.0, 3.0, 4.0]])\n",
"\n",
"    # Test all architectures\n",
"    for name, net in [(\"Shallow\", shallow), (\"Deep\", deep), (\"Wide\", wide)]:\n",
"        y = net(x_test)\n",
"        assert y.shape == (1, 1), f\"{name} network should produce correct shape\"\n",
"        print(f\"✅ {name} network works correctly\")\n",
"\n",
"    print(\"✅ All network architectures work correctly\")\n",
"\n",
"except Exception as e:\n",
"    print(f\"❌ Architecture test failed: {e}\")\n",
"    raise\n",
"\n",
"print(\"🎯 Architecture insights:\")\n",
"print(\"   Different activations create different behaviors\")\n",
"print(\"   Softmax enables multi-class classification\")\n",
"print(\"   Architecture affects network capacity and learning\")\n",
"print(\"📈 Progress: Sequential ✓, MLP creation ✓, Architecture variations ✓\")"
]
},
{
"cell_type": "markdown",
"id": "8df67be5",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 5: Comprehensive Test - Complete Network Applications\n",
"\n",
"### Real-World Network Applications\n",
"Let's test our networks on realistic scenarios:\n",
"\n",
"#### **Classification Problem**\n",
"```python\n",
"# 4 features → 2 classes (binary classification)\n",
"classifier = create_mlp(4, [8, 4], 2, output_activation=Softmax)\n",
"```\n",
"\n",
"#### **Regression Problem**\n",
"```python\n",
"# 3 features → 1 continuous output\n",
"regressor = create_mlp(3, [10, 5], 1, output_activation=lambda: Dense(0, 0)) # Linear output\n",
|
|
"```\n",
|
|
"\n",
|
|
"#### **Deep Learning Pattern**\n",
|
|
"```python\n",
|
|
"# Complex feature learning\n",
|
|
"deep_net = create_mlp(10, [64, 32, 16], 1)\n",
|
|
"```\n",
|
|
"\n",
|
|
"This comprehensive test ensures our networks work for real ML applications!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "011cd928",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": true,
|
|
"grade_id": "test-integration",
|
|
"locked": true,
|
|
"points": 15,
|
|
"schema_version": 3,
|
|
"solution": false,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Comprehensive test - complete network applications\n",
|
|
"print(\"🔬 Comprehensive Test: Complete Network Applications...\")\n",
|
|
"\n",
|
|
"try:\n",
|
|
" # Test 1: Multi-class Classification (Iris-like dataset)\n",
|
|
" print(\"\\n1. Multi-class Classification Test:\")\n",
|
|
" iris_classifier = create_mlp(input_size=4, hidden_sizes=[8, 6], output_size=3, output_activation=Softmax)\n",
|
|
" \n",
|
|
" # Simulate iris features: [sepal_length, sepal_width, petal_length, petal_width]\n",
|
|
" iris_samples = Tensor([\n",
|
|
" [5.1, 3.5, 1.4, 0.2], # Setosa\n",
|
|
" [7.0, 3.2, 4.7, 1.4], # Versicolor\n",
|
|
" [6.3, 3.3, 6.0, 2.5] # Virginica\n",
|
|
" ])\n",
|
|
" \n",
|
|
" iris_predictions = iris_classifier(iris_samples)\n",
|
|
" assert iris_predictions.shape == (3, 3), \"Iris classifier should output 3 classes for 3 samples\"\n",
|
|
" \n",
|
|
" # Check softmax properties\n",
|
|
" row_sums = np.sum(iris_predictions.data, axis=1)\n",
|
|
" assert np.allclose(row_sums, 1.0), \"Each prediction should sum to 1\"\n",
|
|
" print(\"✅ Multi-class classification works correctly\")\n",
|
|
" \n",
|
|
" # Test 2: Regression Task (Housing prices)\n",
|
|
" print(\"\\n2. Regression Task Test:\")\n",
|
|
" # Create a regressor without final activation (linear output)\n",
|
|
" class Identity:\n",
|
|
" def __call__(self, x): return x\n",
|
|
" \n",
|
|
" housing_regressor = create_mlp(input_size=3, hidden_sizes=[10, 5], output_size=1, output_activation=Identity)\n",
|
|
" \n",
|
|
" # Simulate housing features: [size, bedrooms, location_score]\n",
|
|
" housing_samples = Tensor([\n",
|
|
" [2000, 3, 8.5], # Large house, good location\n",
|
|
" [1200, 2, 6.0], # Medium house, ok location\n",
|
|
" [800, 1, 4.0] # Small house, poor location\n",
|
|
" ])\n",
|
|
" \n",
|
|
" housing_predictions = housing_regressor(housing_samples)\n",
|
|
" assert housing_predictions.shape == (3, 1), \"Housing regressor should output 1 value per sample\"\n",
|
|
" print(\"✅ Regression task works correctly\")\n",
|
|
" \n",
|
|
" # Test 3: Deep Network Performance\n",
|
|
" print(\"\\n3. Deep Network Test:\")\n",
|
|
" deep_network = create_mlp(input_size=10, hidden_sizes=[20, 15, 10, 5], output_size=1)\n",
|
|
" \n",
|
|
" # Test with realistic batch size\n",
|
|
" batch_data = Tensor(np.random.randn(32, 10)) # 32 samples, 10 features\n",
|
|
" deep_predictions = deep_network(batch_data)\n",
|
|
" \n",
|
|
" assert deep_predictions.shape == (32, 1), \"Deep network should handle batch processing\"\n",
|
|
" assert not np.any(np.isnan(deep_predictions.data)), \"Deep network should not produce NaN\"\n",
|
|
" print(\"✅ Deep network handles batch processing correctly\")\n",
|
|
" \n",
|
|
" # Test 4: Network Composition\n",
|
|
" print(\"\\n4. Network Composition Test:\")\n",
|
|
" # Create a feature extractor and classifier separately\n",
|
|
" feature_extractor = Sequential([\n",
|
|
" Dense(input_size=10, output_size=5),\n",
|
|
" ReLU(),\n",
|
|
" Dense(input_size=5, output_size=3),\n",
|
|
" ReLU()\n",
|
|
" ])\n",
|
|
" \n",
|
|
" classifier_head = Sequential([\n",
|
|
" Dense(input_size=3, output_size=2),\n",
|
|
" Softmax()\n",
|
|
" ])\n",
|
|
" \n",
|
|
" # Test composition\n",
|
|
" raw_data = Tensor(np.random.randn(5, 10))\n",
|
|
" features = feature_extractor(raw_data)\n",
|
|
" final_predictions = classifier_head(features)\n",
|
|
" \n",
|
|
" assert features.shape == (5, 3), \"Feature extractor should output 3 features\"\n",
|
|
" assert final_predictions.shape == (5, 2), \"Classifier should output 2 classes\"\n",
|
|
" \n",
|
|
" row_sums = np.sum(final_predictions.data, axis=1)\n",
|
|
" assert np.allclose(row_sums, 1.0), \"Composed network predictions should be valid\"\n",
|
|
" print(\"✅ Network composition works correctly\")\n",
|
|
" \n",
|
|
" print(\"\\n🎉 Comprehensive test passed! Your networks work correctly for:\")\n",
|
|
" print(\" • Multi-class classification (Iris flowers)\")\n",
|
|
" print(\" • Regression tasks (housing prices)\")\n",
|
|
" print(\" • Deep learning architectures\")\n",
|
|
" print(\" • Network composition and feature extraction\")\n",
|
|
"\n",
|
|
"except Exception as e:\n",
|
|
" print(f\"❌ Comprehensive test failed: {e}\")\n",
|
|
"\n",
|
|
"print(\"📈 Final Progress: Complete network architectures ready for real ML applications!\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "cb906ad7",
|
|
"metadata": {
|
|
"lines_to_next_cell": 1,
|
|
"nbgrader": {
|
|
"grade": false,
|
|
"grade_id": "networks-compatibility",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": false,
|
|
"task": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#| export\n",
|
|
"class MLP:\n",
|
|
" \"\"\"\n",
|
|
" Multi-Layer Perceptron (MLP) class.\n",
|
|
" \n",
|
|
" A convenient wrapper around Sequential networks for standard MLP architectures.\n",
|
|
" Maintains parameter information and provides a clean interface.\n",
|
|
" \n",
|
|
" Args:\n",
|
|
" input_size: Number of input features\n",
|
|
" hidden_size: Size of the single hidden layer\n",
|
|
" output_size: Number of output features\n",
|
|
" activation: Activation function for hidden layer (default: ReLU)\n",
|
|
" output_activation: Activation function for output layer (default: Sigmoid)\n",
|
|
" \"\"\"\n",
|
|
" \n",
|
|
" def __init__(self, input_size: int, hidden_size: int, output_size: int, \n",
|
|
" activation=ReLU, output_activation=None):\n",
|
|
" self.input_size = input_size\n",
|
|
" self.hidden_size = hidden_size\n",
|
|
" self.output_size = output_size\n",
|
|
" \n",
|
|
" # Build the network layers\n",
|
|
" layers = []\n",
|
|
" \n",
|
|
" # Input to hidden layer\n",
|
|
" layers.append(Dense(input_size, hidden_size))\n",
|
|
" layers.append(activation())\n",
|
|
" \n",
|
|
" # Hidden to output layer\n",
|
|
" layers.append(Dense(hidden_size, output_size))\n",
|
|
" if output_activation is not None:\n",
|
|
" layers.append(output_activation())\n",
|
|
" \n",
|
|
" self.network = Sequential(layers)\n",
|
|
" \n",
|
|
" def forward(self, x):\n",
|
|
" \"\"\"Forward pass through the MLP network.\"\"\"\n",
|
|
" return self.network.forward(x)\n",
|
|
" \n",
|
|
" def __call__(self, x):\n",
|
|
" \"\"\"Make the MLP callable.\"\"\"\n",
|
|
" return self.forward(x)"
|
|
]
|
|
},
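{
"cell_type": "markdown",
"id": "mlp-wrapper-sketch",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Sketch: Using the MLP Wrapper\n",
"\n",
"`MLP` is a thin convenience over `Sequential` for single-hidden-layer networks. A minimal usage sketch, assuming the implementations above:\n",
"\n",
"```python\n",
"model = MLP(input_size=3, hidden_size=4, output_size=2, output_activation=Sigmoid)\n",
"y = model(Tensor([[1.0, 2.0, 3.0]]))\n",
"print(y.shape)                    # (1, 2)\n",
"print(len(model.network.layers))  # 4: Dense, ReLU, Dense, Sigmoid\n",
"```\n",
"\n",
"With `output_activation=None` (the default) the final layer stays linear, which is the usual choice for regression."
]
},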
{
"cell_type": "code",
"execution_count": null,
"id": "7c811b23",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"\n",
|
|
"def test_sequential_networks():\n",
|
|
" \"\"\"Test Sequential network implementation comprehensively.\"\"\"\n",
|
|
" print(\"🔬 Unit Test: Sequential Networks...\")\n",
|
|
" \n",
|
|
" # Test basic Sequential network\n",
|
|
" net = Sequential([\n",
|
|
" Dense(input_size=3, output_size=4),\n",
|
|
" ReLU(),\n",
|
|
" Dense(input_size=4, output_size=2),\n",
|
|
" Sigmoid()\n",
|
|
" ])\n",
|
|
" \n",
|
|
" x = Tensor([[1.0, 2.0, 3.0]])\n",
|
|
" y = net(x)\n",
|
|
" \n",
|
|
" assert y.shape == (1, 2), \"Sequential network should produce correct output shape\"\n",
|
|
" assert np.all(y.data > 0), \"Sigmoid output should be positive\"\n",
|
|
" assert np.all(y.data < 1), \"Sigmoid output should be less than 1\"\n",
|
|
" \n",
|
|
" print(\"✅ Sequential networks work correctly\")\n",
|
|
"\n",
|
|
"def test_mlp_creation():\n",
|
|
" \"\"\"Test MLP creation function comprehensively.\"\"\"\n",
|
|
" print(\"🔬 Unit Test: MLP Creation...\")\n",
|
|
" \n",
|
|
" # Test different MLP architectures\n",
|
|
" shallow = create_mlp(input_size=4, hidden_sizes=[5], output_size=1)\n",
|
|
" deep = create_mlp(input_size=4, hidden_sizes=[8, 6, 4], output_size=2)\n",
|
|
" \n",
|
|
" x = Tensor([[1.0, 2.0, 3.0, 4.0]])\n",
|
|
" \n",
|
|
" # Test shallow network\n",
|
|
" y_shallow = shallow(x)\n",
|
|
" assert y_shallow.shape == (1, 1), \"Shallow MLP should work\"\n",
|
|
" \n",
|
|
" # Test deep network \n",
|
|
" y_deep = deep(x)\n",
|
|
" assert y_deep.shape == (1, 2), \"Deep MLP should work\"\n",
|
|
" \n",
|
|
" print(\"✅ MLP creation works correctly\")\n",
|
|
"\n",
|
|
"def test_network_architectures():\n",
|
|
" \"\"\"Test different network architectures comprehensively.\"\"\"\n",
|
|
" print(\"🔬 Unit Test: Network Architectures...\")\n",
|
|
" \n",
|
|
" # Test different activation functions\n",
|
|
" relu_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1, activation=ReLU)\n",
|
|
" tanh_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1, activation=Tanh)\n",
|
|
" \n",
|
|
" # Test multi-class classifier\n",
|
|
" classifier = create_mlp(input_size=3, hidden_sizes=[4], output_size=3, output_activation=Softmax)\n",
|
|
" \n",
|
|
" x = Tensor([[1.0, 2.0, 3.0]])\n",
|
|
" \n",
|
|
" # Test all architectures\n",
|
|
" y_relu = relu_net(x)\n",
|
|
" y_tanh = tanh_net(x)\n",
|
|
" y_multi = classifier(x)\n",
|
|
" \n",
|
|
" assert y_relu.shape == (1, 1), \"ReLU network should work\"\n",
|
|
" assert y_tanh.shape == (1, 1), \"Tanh network should work\"\n",
|
|
" assert y_multi.shape == (1, 3), \"Multi-class classifier should work\"\n",
|
|
" assert abs(np.sum(y_multi.data) - 1.0) < 1e-6, \"Softmax outputs should sum to 1\"\n",
|
|
" \n",
|
|
" print(\"✅ Network architectures work correctly\")\n",
|
|
"\n",
|
|
"def test_networks():\n",
|
|
" \"\"\"Test network comprehensive testing with real ML scenarios.\"\"\"\n",
|
|
" print(\"🔬 Comprehensive Test: Network Applications...\")\n",
|
|
" \n",
|
|
" # Test multi-class classification\n",
|
|
" iris_classifier = create_mlp(input_size=4, hidden_sizes=[8, 6], output_size=3, output_activation=Softmax)\n",
|
|
" iris_samples = Tensor([[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4], [6.3, 3.3, 6.0, 2.5]])\n",
|
|
" iris_predictions = iris_classifier(iris_samples)\n",
|
|
" \n",
|
|
" assert iris_predictions.shape == (3, 3), \"Iris classifier should work\"\n",
|
|
" row_sums = np.sum(iris_predictions.data, axis=1)\n",
|
|
" assert np.allclose(row_sums, 1.0), \"Predictions should sum to 1\""
|
|
]
},
{
"cell_type": "markdown",
"id": "d8035240",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🧪 Module Testing\n",
"\n",
"Time to test your implementation! This section uses TinyTorch's standardized testing framework to ensure your implementation works correctly.\n",
"\n",
"**This testing section is locked** - it provides consistent feedback across all modules and cannot be modified."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "001981f5",
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "standardized-testing",
"locked": true,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# =============================================================================\n",
"# STANDARDIZED MODULE TESTING - DO NOT MODIFY\n",
"# This cell is locked to ensure consistent testing across all TinyTorch modules\n",
"# =============================================================================\n",
"\n",
"if __name__ == \"__main__\":\n",
"    from tito.tools.testing import run_module_tests_auto\n",
"\n",
"    # Automatically discover and run all tests in this module\n",
"    success = run_module_tests_auto(\"Networks\")"
]
},
{
"cell_type": "markdown",
"id": "5ee20f1b",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🎯 Module Summary: Neural Network Architectures Mastery!\n",
"\n",
"Congratulations! You've successfully implemented complete neural network architectures:\n",
"\n",
"### What You've Accomplished\n",
"✅ **Sequential Networks**: Chained layers for complex transformations  \n",
"✅ **MLP Creation**: Multi-layer perceptrons with flexible architectures  \n",
"✅ **Network Architectures**: Different activation patterns and output types  \n",
"✅ **Integration**: Real-world applications like classification and regression\n",
"\n",
"### Key Concepts You've Learned\n",
"- **Sequential Processing**: How layers chain together for complex functions\n",
"- **MLP Design**: Multi-layer perceptrons as universal function approximators\n",
"- **Architecture Choices**: How depth, width, and activations affect learning\n",
"- **Real Applications**: Classification, regression, and feature extraction\n",
"\n",
"### Next Steps\n",
"1. **Export your code**: `tito package nbdev --export 04_networks`\n",
"2. **Test your implementation**: `tito test 04_networks`\n",
"3. **Build complete models**: Combine with training for full ML pipelines\n",
"4. **Move to Module 5**: Add convolutional layers for image processing!\n",
"\n",
"**Ready for CNNs?** Your network foundations are now ready for specialized architectures!"
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}