TinyTorch/assignments/source/04_networks/networks_dev.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "355dc307",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "# Module 3: Networks - Neural Network Architectures\n",
    "\n",
    "Welcome to the Networks module! This is where we compose layers into complete neural network architectures.\n",
    "\n",
    "## Learning Goals\n",
    "- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`\n",
    "- Build common architectures (MLP, CNN) from layers\n",
    "- Visualize network structure and data flow\n",
    "- See how architecture affects capability\n",
    "- Master forward pass inference (no training yet!)\n",
    "\n",
    "## Build → Use → Understand\n",
    "1. **Build**: Compose layers into complete networks\n",
    "2. **Use**: Create different architectures and run inference\n",
    "3. **Understand**: How architecture design affects network behavior\n",
    "\n",
    "## Module Dependencies\n",
    "This module builds on previous modules:\n",
    "- **tensor** → **activations** → **layers** → **networks**\n",
    "- Clean composition: math functions → building blocks → complete systems"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf724917",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## 📦 Where This Code Lives in the Final Package\n",
    "\n",
    "**Learning Side:** You work in `assignments/source/04_networks/networks_dev.py`  \n",
    "**Building Side:** Code exports to `tinytorch.core.networks`\n",
    "\n",
    "```python\n",
    "# Final package structure:\n",
    "from tinytorch.core.networks import Sequential, MLP\n",
    "from tinytorch.core.layers import Dense, Conv2D\n",
    "from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
    "from tinytorch.core.tensor import Tensor\n",
    "```\n",
    "\n",
    "**Why this matters:**\n",
    "- **Learning:** Focused modules for deep understanding\n",
    "- **Production:** Proper organization like PyTorch's `torch.nn`\n",
    "- **Consistency:** All network architectures live together in `core.networks`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79460d45",
   "metadata": {},
   "outputs": [],
   "source": [
    "#| default_exp core.networks\n",
    "\n",
    "# Setup and imports\n",
    "import numpy as np\n",
    "import sys\n",
    "from typing import List, Union, Optional, Callable\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.patches as patches\n",
    "from matplotlib.patches import FancyBboxPatch, ConnectionPatch\n",
    "import seaborn as sns\n",
    "\n",
    "# Import all the building blocks we need\n",
    "from tinytorch.core.tensor import Tensor\n",
    "from tinytorch.core.layers import Dense\n",
    "from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax\n",
    "\n",
    "print(\"🔥 TinyTorch Networks Module\")\n",
    "print(f\"NumPy version: {np.__version__}\")\n",
    "print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
    "print(\"Ready to build neural network architectures!\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2190e04d",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "import numpy as np\n",
    "import sys\n",
    "from typing import List, Union, Optional, Callable\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.patches as patches\n",
    "from matplotlib.patches import FancyBboxPatch, ConnectionPatch\n",
    "import seaborn as sns\n",
    "\n",
    "# Import our building blocks\n",
    "from tinytorch.core.tensor import Tensor\n",
    "from tinytorch.core.layers import Dense\n",
    "from tinytorch.core.activations import ReLU, Sigmoid, Tanh"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c03a46b9",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def _should_show_plots():\n",
    "    \"\"\"Check if we should show plots (disable during testing)\"\"\"\n",
    "    return 'pytest' not in sys.modules and 'test' not in sys.argv"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58e30d14",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 1: What is a Network?\n",
    "\n",
    "### Definition\n",
    "A **network** is a composition of layers that transforms input data into output predictions. Think of it as a pipeline of transformations:\n",
    "\n",
    "```\n",
    "Input → Layer1 → Layer2 → Layer3 → Output\n",
    "```\n",
    "\n",
    "### Why Networks Matter\n",
    "- **Function composition**: Complex behavior from simple building blocks\n",
    "- **Learnable parameters**: Each layer has weights that can be learned\n",
    "- **Architecture design**: Different layouts solve different problems\n",
    "- **Real-world applications**: Classification, regression, generation, etc.\n",
    "\n",
    "### The Fundamental Insight\n",
    "**Neural networks are just function composition!**\n",
    "- Each layer is a function: `f_i(x)`\n",
    "- The network is: `f(x) = f_n(...f_2(f_1(x)))`\n",
    "- Complex behavior emerges from simple building blocks\n",
    "\n",
    "### Real-World Examples\n",
    "- **MLP (Multi-Layer Perceptron)**: Classic feedforward network\n",
    "- **CNN (Convolutional Neural Network)**: For image processing\n",
    "- **RNN (Recurrent Neural Network)**: For sequential data\n",
    "- **Transformer**: For attention-based processing\n",
    "\n",
    "### Visual Intuition\n",
    "```\n",
    "Input: [1, 2, 3] (3 features)\n",
    "Layer1: [1.4, 2.8] (linear transformation)\n",
    "Layer2: [1.4, 2.8] (nonlinearity)\n",
    "Layer3: [0.7] (final prediction)\n",
    "```\n",
    "\n",
    "### The Math Behind It\n",
    "For a network with layers `f_1, f_2, ..., f_n`:\n",
    "```\n",
    "f(x) = f_n(f_{n-1}(...f_2(f_1(x))))\n",
    "```\n",
    "\n",
    "Each layer transforms the data, and the final output is the composition of all these transformations.\n",
    "\n",
    "Let's start by building the most fundamental network: **Sequential**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8de00b9b",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "class Sequential:\n",
    "    \"\"\"\n",
    "    Sequential Network: Composes layers in sequence\n",
    "    \n",
    "    The most fundamental network architecture.\n",
    "    Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))\n",
    "    \n",
    "    Args:\n",
    "        layers: List of layers to compose\n",
    "        \n",
    "    TODO: Implement the Sequential network with forward pass.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Store the list of layers as an instance variable\n",
    "    2. Implement forward pass that applies each layer in sequence\n",
    "    3. Make the network callable for easy use\n",
    "    \n",
    "    EXAMPLE:\n",
    "    network = Sequential([\n",
    "        Dense(3, 4),\n",
    "        ReLU(),\n",
    "        Dense(4, 2),\n",
    "        Sigmoid()\n",
    "    ])\n",
    "    x = Tensor([[1, 2, 3]])\n",
    "    y = network(x)  # Forward pass through all layers\n",
    "    \n",
    "    HINTS:\n",
    "    - Store layers in self.layers\n",
    "    - Use a for loop to apply each layer in order\n",
    "    - Each layer's output becomes the next layer's input\n",
    "    - Return the final output\n",
    "    \"\"\"\n",
    "    \n",
    "    def __init__(self, layers: List):\n",
    "        \"\"\"\n",
    "        Initialize Sequential network with layers.\n",
    "        \n",
    "        Args:\n",
    "            layers: List of layers to compose in order\n",
    "            \n",
    "        TODO: Store the layers and implement forward pass\n",
    "        \n",
    "        STEP-BY-STEP:\n",
    "        1. Store the layers list as self.layers\n",
    "        2. This creates the network architecture\n",
    "        \n",
    "        EXAMPLE:\n",
    "        Sequential([Dense(3,4), ReLU(), Dense(4,2)])\n",
    "        creates a 3-layer network: Dense → ReLU → Dense\n",
    "        \"\"\"\n",
    "        raise NotImplementedError(\"Student implementation required\")\n",
    "    \n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"\n",
    "        Forward pass through all layers in sequence.\n",
    "        \n",
    "        Args:\n",
    "            x: Input tensor\n",
    "            \n",
    "        Returns:\n",
    "            Output tensor after passing through all layers\n",
    "            \n",
    "        TODO: Implement sequential forward pass through all layers\n",
    "        \n",
    "        STEP-BY-STEP:\n",
    "        1. Start with the input tensor: current = x\n",
    "        2. Loop through each layer in self.layers\n",
    "        3. Apply each layer: current = layer(current)\n",
    "        4. Return the final output\n",
    "        \n",
    "        EXAMPLE:\n",
    "        Input: Tensor([[1, 2, 3]])\n",
    "        Layer1 (Dense): Tensor([[1.4, 2.8]])\n",
    "        Layer2 (ReLU): Tensor([[1.4, 2.8]])\n",
    "        Layer3 (Dense): Tensor([[0.7]])\n",
    "        Output: Tensor([[0.7]])\n",
    "        \n",
    "        HINTS:\n",
    "        - Use a for loop: for layer in self.layers:\n",
    "        - Apply each layer: current = layer(current)\n",
    "        - The output of one layer becomes input to the next\n",
    "        - Return the final result\n",
    "        \"\"\"\n",
    "        raise NotImplementedError(\"Student implementation required\")\n",
    "    \n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make network callable: network(x) same as network.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4e9f65af",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "class Sequential:\n",
    "    \"\"\"\n",
    "    Sequential Network: Composes layers in sequence\n",
    "    \n",
    "    The most fundamental network architecture.\n",
    "    Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))\n",
    "    \"\"\"\n",
    "    \n",
    "    def __init__(self, layers: List):\n",
    "        \"\"\"Initialize Sequential network with layers.\"\"\"\n",
    "        self.layers = layers\n",
    "    \n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Forward pass through all layers in sequence.\"\"\"\n",
    "        # Apply each layer in order\n",
    "        for layer in self.layers:\n",
    "            x = layer(x)\n",
    "        return x\n",
    "    \n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make network callable: network(x) same as network.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88b54128",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Your Sequential Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b814f23",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test the Sequential network\n",
    "print(\"Testing Sequential network...\")\n",
    "\n",
    "try:\n",
    "    # Create a simple 2-layer network: 3 → 4 → 2\n",
    "    network = Sequential([\n",
    "        Dense(input_size=3, output_size=4),\n",
    "        ReLU(),\n",
    "        Dense(input_size=4, output_size=2),\n",
    "        Sigmoid()\n",
    "    ])\n",
    "    \n",
    "    print(f\"✅ Network created with {len(network.layers)} layers\")\n",
    "    \n",
    "    # Test with sample data\n",
    "    x = Tensor([[1.0, 2.0, 3.0]])\n",
    "    print(f\"✅ Input: {x}\")\n",
    "    \n",
    "    # Forward pass\n",
    "    y = network(x)\n",
    "    print(f\"✅ Output: {y}\")\n",
    "    print(f\"✅ Output shape: {y.shape}\")\n",
    "    \n",
    "    # Verify the network works\n",
    "    assert y.shape == (1, 2), f\"❌ Expected shape (1, 2), got {y.shape}\"\n",
    "    assert np.all(y.data >= 0) and np.all(y.data <= 1), \"❌ Sigmoid output should be between 0 and 1\"\n",
    "    print(\"🎉 Sequential network works!\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement the Sequential network above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "28eb9398",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 2: Understanding Network Architecture\n",
    "\n",
    "Now let's explore how different network architectures affect the network's capabilities.\n",
    "\n",
    "### What is Network Architecture?\n",
    "**Architecture** refers to how layers are arranged and connected. It determines:\n",
    "- **Capacity**: How complex patterns the network can learn\n",
    "- **Efficiency**: How many parameters and computations needed\n",
    "- **Specialization**: What types of problems it's good at\n",
    "\n",
    "### Common Architectures\n",
    "\n",
    "#### 1. **MLP (Multi-Layer Perceptron)**\n",
    "```\n",
    "Input → Dense → ReLU → Dense → ReLU → Dense → Output\n",
    "```\n",
    "- **Use case**: General-purpose learning\n",
    "- **Strengths**: Universal approximation, simple to understand\n",
    "- **Weaknesses**: Doesn't exploit spatial structure\n",
    "\n",
    "#### 2. **CNN (Convolutional Neural Network)**\n",
    "```\n",
    "Input → Conv2D → ReLU → Conv2D → ReLU → Dense → Output\n",
    "```\n",
    "- **Use case**: Image processing, spatial data\n",
    "- **Strengths**: Parameter sharing, translation invariance\n",
    "- **Weaknesses**: Fixed spatial structure\n",
    "\n",
    "#### 3. **Deep Network**\n",
    "```\n",
    "Input → Dense → ReLU → Dense → ReLU → Dense → ReLU → Dense → Output\n",
    "```\n",
    "- **Use case**: Complex pattern recognition\n",
    "- **Strengths**: High capacity, can learn complex functions\n",
    "- **Weaknesses**: More parameters, harder to train\n",
    "\n",
    "Let's build some common architectures!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ae4fe584",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int, \n",
    "               activation=ReLU, output_activation=Sigmoid) -> Sequential:\n",
    "    \"\"\"\n",
    "    Create a Multi-Layer Perceptron (MLP) network.\n",
    "    \n",
    "    Args:\n",
    "        input_size: Number of input features\n",
    "        hidden_sizes: List of hidden layer sizes\n",
    "        output_size: Number of output features\n",
    "        activation: Activation function for hidden layers (default: ReLU)\n",
    "        output_activation: Activation function for output layer (default: Sigmoid)\n",
    "        \n",
    "    Returns:\n",
    "        Sequential network with MLP architecture\n",
    "        \n",
    "    TODO: Implement MLP creation with alternating Dense and activation layers.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Start with an empty list of layers\n",
    "    2. Add the first Dense layer: input_size → first hidden size\n",
    "    3. For each hidden layer:\n",
    "       - Add activation function\n",
    "       - Add Dense layer connecting to next hidden size\n",
    "    4. Add final activation function\n",
    "    5. Add final Dense layer: last hidden size → output_size\n",
    "    6. Add output activation function\n",
    "    7. Return Sequential(layers)\n",
    "    \n",
    "    EXAMPLE:\n",
    "    create_mlp(3, [4, 2], 1) creates:\n",
    "    Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid\n",
    "    \n",
    "    HINTS:\n",
    "    - Start with layers = []\n",
    "    - Add Dense layers with appropriate input/output sizes\n",
    "    - Add activation functions between Dense layers\n",
    "    - Don't forget the final output activation\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3df597d8",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int, \n",
    "               activation=ReLU, output_activation=Sigmoid) -> Sequential:\n",
    "    \"\"\"Create a Multi-Layer Perceptron (MLP) network.\"\"\"\n",
    "    layers = []\n",
    "    \n",
    "    # Add first layer\n",
    "    current_size = input_size\n",
    "    for hidden_size in hidden_sizes:\n",
    "        layers.append(Dense(input_size=current_size, output_size=hidden_size))\n",
    "        layers.append(activation())\n",
    "        current_size = hidden_size\n",
    "    \n",
    "    # Add output layer\n",
    "    layers.append(Dense(input_size=current_size, output_size=output_size))\n",
    "    layers.append(output_activation())\n",
    "    \n",
    "    return Sequential(layers)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f053d4a8",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Your MLP Creation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "efec756b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test MLP creation\n",
    "print(\"Testing MLP creation...\")\n",
    "\n",
    "try:\n",
    "    # Create different MLP architectures\n",
    "    mlp1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)\n",
    "    mlp2 = create_mlp(input_size=5, hidden_sizes=[8, 4], output_size=2)\n",
    "    mlp3 = create_mlp(input_size=2, hidden_sizes=[10, 6, 3], output_size=1, activation=Tanh)\n",
    "    \n",
    "    print(f\"✅ MLP1: {len(mlp1.layers)} layers\")\n",
    "    print(f\"✅ MLP2: {len(mlp2.layers)} layers\")\n",
    "    print(f\"✅ MLP3: {len(mlp3.layers)} layers\")\n",
    "    \n",
    "    # Test forward pass\n",
    "    x = Tensor([[1.0, 2.0, 3.0]])\n",
    "    y1 = mlp1(x)\n",
    "    print(f\"✅ MLP1 output: {y1}\")\n",
    "    \n",
    "    x2 = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])\n",
    "    y2 = mlp2(x2)\n",
    "    print(f\"✅ MLP2 output: {y2}\")\n",
    "    \n",
    "    print(\"🎉 MLP creation works!\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement create_mlp above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d1c34b6",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 3: Network Visualization and Analysis\n",
    "\n",
    "Let's create tools to visualize and analyze network architectures. This helps us understand what our networks are doing.\n",
    "\n",
    "### Why Visualization Matters\n",
    "- **Architecture understanding**: See how data flows through the network\n",
    "- **Debugging**: Identify bottlenecks and issues\n",
    "- **Design**: Compare different architectures\n",
    "- **Communication**: Explain networks to others\n",
    "\n",
    "### What We'll Build\n",
    "1. **Architecture visualization**: Show layer connections\n",
    "2. **Data flow visualization**: See how data transforms\n",
    "3. **Network comparison**: Compare different architectures\n",
    "4. **Behavior analysis**: Understand network capabilities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a74a3b28",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def visualize_network_architecture(network: Sequential, title: str = \"Network Architecture\"):\n",
    "    \"\"\"\n",
    "    Visualize the architecture of a Sequential network.\n",
    "    \n",
    "    Args:\n",
    "        network: Sequential network to visualize\n",
    "        title: Title for the plot\n",
    "        \n",
    "    TODO: Create a visualization showing the network structure.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Create a matplotlib figure\n",
    "    2. For each layer, draw a box showing its type and size\n",
    "    3. Connect the boxes with arrows showing data flow\n",
    "    4. Add labels and formatting\n",
    "    \n",
    "    EXAMPLE:\n",
    "    Input → Dense(3→4) → ReLU → Dense(4→2) → Sigmoid → Output\n",
    "    \n",
    "    HINTS:\n",
    "    - Use plt.subplots() to create the figure\n",
    "    - Use plt.text() to add layer labels\n",
    "    - Use plt.arrow() to show connections\n",
    "    - Add proper spacing and formatting\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1274dbc",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def visualize_network_architecture(network: Sequential, title: str = \"Network Architecture\"):\n",
    "    \"\"\"Visualize the architecture of a Sequential network.\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        print(\"📊 Visualization disabled during testing\")\n",
    "        return\n",
    "    \n",
    "    fig, ax = plt.subplots(1, 1, figsize=(12, 6))\n",
    "    \n",
    "    # Calculate positions\n",
    "    num_layers = len(network.layers)\n",
    "    x_positions = np.linspace(0, 10, num_layers + 2)\n",
    "    \n",
    "    # Draw input\n",
    "    ax.text(x_positions[0], 0, 'Input', ha='center', va='center', \n",
    "            bbox=dict(boxstyle='round,pad=0.3', facecolor='lightblue'))\n",
    "    \n",
    "    # Draw layers\n",
    "    for i, layer in enumerate(network.layers):\n",
    "        layer_name = type(layer).__name__\n",
    "        ax.text(x_positions[i+1], 0, layer_name, ha='center', va='center',\n",
    "                bbox=dict(boxstyle='round,pad=0.3', facecolor='lightgreen'))\n",
    "        \n",
    "        # Draw arrow\n",
    "        ax.arrow(x_positions[i], 0, 0.8, 0, head_width=0.1, head_length=0.1, \n",
    "                fc='black', ec='black')\n",
    "    \n",
    "    # Draw output\n",
    "    ax.text(x_positions[-1], 0, 'Output', ha='center', va='center',\n",
    "            bbox=dict(boxstyle='round,pad=0.3', facecolor='lightcoral'))\n",
    "    \n",
    "    ax.set_xlim(-0.5, 10.5)\n",
    "    ax.set_ylim(-0.5, 0.5)\n",
    "    ax.set_title(title)\n",
    "    ax.axis('off')\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "286f403e",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Network Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2630d356",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test network visualization\n",
    "print(\"Testing network visualization...\")\n",
    "\n",
    "try:\n",
    "    # Create a test network\n",
    "    test_network = Sequential([\n",
    "        Dense(input_size=3, output_size=4),\n",
    "        ReLU(),\n",
    "        Dense(input_size=4, output_size=2),\n",
    "        Sigmoid()\n",
    "    ])\n",
    "    \n",
    "    # Visualize the network\n",
    "    if _should_show_plots():\n",
    "        visualize_network_architecture(test_network, \"Test Network Architecture\")\n",
    "        print(\"✅ Network visualization created!\")\n",
    "    else:\n",
    "        print(\"✅ Network visualization skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement visualize_network_architecture above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1b3aaee",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 4: Data Flow Analysis\n",
    "\n",
    "Let's create tools to analyze how data flows through the network. This helps us understand what each layer is doing.\n",
    "\n",
    "### Why Data Flow Analysis Matters\n",
    "- **Debugging**: See where data gets corrupted\n",
    "- **Optimization**: Identify bottlenecks\n",
    "- **Understanding**: Learn what each layer learns\n",
    "- **Design**: Choose appropriate layer sizes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7bc5136d",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = \"Data Flow Through Network\"):\n",
    "    \"\"\"\n",
    "    Visualize how data flows through the network.\n",
    "    \n",
    "    Args:\n",
    "        network: Sequential network to analyze\n",
    "        input_data: Input tensor to trace through the network\n",
    "        title: Title for the plot\n",
    "        \n",
    "    TODO: Create a visualization showing how data transforms through each layer.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Trace the input through each layer\n",
    "    2. Record the output of each layer\n",
    "    3. Create a visualization showing the transformations\n",
    "    4. Add statistics (mean, std, range) for each layer\n",
    "    \n",
    "    EXAMPLE:\n",
    "    Input: [1, 2, 3] → Layer1: [1.4, 2.8] → Layer2: [1.4, 2.8] → Output: [0.7]\n",
    "    \n",
    "    HINTS:\n",
    "    - Use a for loop to apply each layer\n",
    "    - Store intermediate outputs\n",
    "    - Use plt.subplot() to create multiple subplots\n",
    "    - Show statistics for each layer output\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c318ea50",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = \"Data Flow Through Network\"):\n",
    "    \"\"\"Visualize how data flows through the network.\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        print(\"📊 Visualization disabled during testing\")\n",
    "        return\n",
    "    \n",
    "    # Trace data through network\n",
    "    current_data = input_data\n",
    "    layer_outputs = [current_data.data.flatten()]\n",
    "    layer_names = ['Input']\n",
    "    \n",
    "    for layer in network.layers:\n",
    "        current_data = layer(current_data)\n",
    "        layer_outputs.append(current_data.data.flatten())\n",
    "        layer_names.append(type(layer).__name__)\n",
    "    \n",
    "    # Create visualization\n",
    "    fig, axes = plt.subplots(2, len(layer_outputs), figsize=(15, 8))\n",
    "    \n",
    "    for i, (output, name) in enumerate(zip(layer_outputs, layer_names)):\n",
    "        # Histogram\n",
    "        axes[0, i].hist(output, bins=20, alpha=0.7)\n",
    "        axes[0, i].set_title(f'{name}\\nShape: {output.shape}')\n",
    "        axes[0, i].set_xlabel('Value')\n",
    "        axes[0, i].set_ylabel('Frequency')\n",
    "        \n",
    "        # Statistics\n",
    "        stats_text = f'Mean: {np.mean(output):.3f}\\nStd: {np.std(output):.3f}\\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]'\n",
    "        axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes, \n",
    "                        verticalalignment='center', fontsize=10)\n",
    "        axes[1, i].set_title(f'{name} Statistics')\n",
    "        axes[1, i].axis('off')\n",
    "    \n",
    "    plt.suptitle(title)\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bba1f652",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Data Flow Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "af4ed8de",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test data flow visualization\n",
    "print(\"Testing data flow visualization...\")\n",
    "\n",
    "try:\n",
    "    # Create a test network\n",
    "    test_network = Sequential([\n",
    "        Dense(input_size=3, output_size=4),\n",
    "        ReLU(),\n",
    "        Dense(input_size=4, output_size=2),\n",
    "        Sigmoid()\n",
    "    ])\n",
    "    \n",
    "    # Test input\n",
    "    test_input = Tensor([[1.0, 2.0, 3.0]])\n",
    "    \n",
    "    # Visualize data flow\n",
    "    if _should_show_plots():\n",
    "        visualize_data_flow(test_network, test_input, \"Test Network Data Flow\")\n",
    "        print(\"✅ Data flow visualization created!\")\n",
    "    else:\n",
    "        print(\"✅ Data flow visualization skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement visualize_data_flow above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02308b13",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 5: Network Comparison and Analysis\n",
    "\n",
    "Let's create tools to compare different network architectures and understand their capabilities.\n",
    "\n",
    "### Why Network Comparison Matters\n",
    "- **Architecture selection**: Choose the right network for your problem\n",
    "- **Performance analysis**: Understand trade-offs between different designs\n",
    "- **Design insights**: Learn what makes networks effective\n",
    "- **Research**: Compare new architectures to baselines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c3634ab",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def compare_networks(networks: List[Sequential], network_names: List[str], \n",
    "                    input_data: Tensor, title: str = \"Network Comparison\"):\n",
    "    \"\"\"\n",
    "    Compare multiple networks on the same input.\n",
    "    \n",
    "    Args:\n",
    "        networks: List of Sequential networks to compare\n",
    "        network_names: Names for each network\n",
    "        input_data: Input tensor to test all networks\n",
    "        title: Title for the plot\n",
    "        \n",
    "    TODO: Create a comparison visualization showing how different networks process the same input.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Run the same input through each network\n",
    "    2. Collect the outputs and intermediate results\n",
    "    3. Create a visualization comparing the results\n",
    "    4. Show statistics and differences\n",
    "    \n",
    "    EXAMPLE:\n",
    "    Compare MLP vs Deep Network vs Wide Network on same input\n",
    "    \n",
    "    HINTS:\n",
    "    - Use a for loop to test each network\n",
    "    - Store outputs and any relevant statistics\n",
    "    - Use plt.subplot() to create comparison plots\n",
    "    - Show both outputs and intermediate layer results\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce1d5a21",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def compare_networks(networks: List[Sequential], network_names: List[str], \n",
    "                    input_data: Tensor, title: str = \"Network Comparison\"):\n",
    "    \"\"\"Compare multiple networks on the same input.\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        print(\"📊 Visualization disabled during testing\")\n",
    "        return\n",
    "    \n",
    "    # Test all networks\n",
    "    outputs = []\n",
    "    for network in networks:\n",
    "        output = network(input_data)\n",
    "        outputs.append(output.data.flatten())\n",
    "    \n",
    "    # Create comparison plot\n",
    "    fig, axes = plt.subplots(2, len(networks), figsize=(15, 8))\n",
    "    \n",
    "    for i, (output, name) in enumerate(zip(outputs, network_names)):\n",
    "        # Output distribution\n",
    "        axes[0, i].hist(output, bins=20, alpha=0.7)\n",
    "        axes[0, i].set_title(f'{name}\\nOutput Distribution')\n",
    "        axes[0, i].set_xlabel('Value')\n",
    "        axes[0, i].set_ylabel('Frequency')\n",
    "        \n",
    "        # Statistics\n",
    "        stats_text = f'Mean: {np.mean(output):.3f}\\nStd: {np.std(output):.3f}\\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]\\nSize: {len(output)}'\n",
    "        axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes, \n",
    "                        verticalalignment='center', fontsize=10)\n",
    "        axes[1, i].set_title(f'{name} Statistics')\n",
    "        axes[1, i].axis('off')\n",
    "    \n",
    "    plt.suptitle(title)\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d16eb163",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Network Comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab17ac91",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test network comparison\n",
    "print(\"Testing network comparison...\")\n",
    "\n",
    "try:\n",
    "    # Create different networks\n",
    "    network1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)\n",
    "    network2 = create_mlp(input_size=3, hidden_sizes=[8, 4], output_size=1)\n",
    "    network3 = create_mlp(input_size=3, hidden_sizes=[2], output_size=1, activation=Tanh)\n",
    "    \n",
    "    networks = [network1, network2, network3]\n",
    "    names = [\"Small MLP\", \"Deep MLP\", \"Tanh MLP\"]\n",
    "    \n",
    "    # Test input\n",
    "    test_input = Tensor([[1.0, 2.0, 3.0]])\n",
    "    \n",
    "    # Compare networks\n",
    "    if _should_show_plots():\n",
    "        compare_networks(networks, names, test_input, \"Network Architecture Comparison\")\n",
    "        print(\"✅ Network comparison created!\")\n",
    "    else:\n",
    "        print(\"✅ Network comparison skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement compare_networks above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c61fc030",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 6: Practical Network Architectures\n",
    "\n",
    "Now let's create some practical network architectures for common machine learning tasks.\n",
    "\n",
    "### Common Network Types\n",
    "\n",
    "#### 1. **Classification Networks**\n",
    "- **Binary classification**: Output single probability\n",
    "- **Multi-class classification**: Output probability distribution\n",
    "- **Use cases**: Image classification, spam detection, sentiment analysis\n",
    "\n",
    "#### 2. **Regression Networks**\n",
    "- **Single output**: Predict continuous value\n",
    "- **Multiple outputs**: Predict multiple values\n",
    "- **Use cases**: Price prediction, temperature forecasting, demand estimation\n",
    "\n",
    "#### 3. **Feature Extraction Networks**\n",
    "- **Encoder networks**: Compress data into features\n",
    "- **Use cases**: Dimensionality reduction, feature learning, representation learning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f117af1e",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def create_classification_network(input_size: int, num_classes: int, \n",
    "                                hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"\n",
    "    Create a network for classification tasks.\n",
    "    \n",
    "    Args:\n",
    "        input_size: Number of input features\n",
    "        num_classes: Number of output classes\n",
    "        hidden_sizes: List of hidden layer sizes (default: [input_size * 2])\n",
    "        \n",
    "    Returns:\n",
    "        Sequential network for classification\n",
    "        \n",
    "    TODO: Implement classification network creation.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Use default hidden sizes if none provided\n",
    "    2. Create MLP with appropriate architecture\n",
    "    3. Use Sigmoid for binary classification (num_classes=1)\n",
    "    4. Use appropriate activation for multi-class\n",
    "    \n",
    "    EXAMPLE:\n",
    "    create_classification_network(10, 3) creates:\n",
    "    Dense(10→20) → ReLU → Dense(20→3) → Sigmoid\n",
    "    \n",
    "    HINTS:\n",
    "    - Use create_mlp() function\n",
    "    - Choose appropriate output activation based on num_classes\n",
    "    - For binary classification (num_classes=1), use Sigmoid\n",
    "    - For multi-class, you could use Sigmoid or no activation\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "867fa5d4",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def create_classification_network(input_size: int, num_classes: int, \n",
    "                                hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"Create a network for classification tasks.\"\"\"\n",
    "    if hidden_sizes is None:\n",
    "        hidden_sizes = [input_size // 2]  # Use input_size // 2 as default\n",
    "    \n",
    "    # Choose appropriate output activation\n",
    "    output_activation = Sigmoid if num_classes == 1 else Softmax\n",
    "    \n",
    "    return create_mlp(input_size, hidden_sizes, num_classes, \n",
    "                     activation=ReLU, output_activation=output_activation)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8888dc0c",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def create_regression_network(input_size: int, output_size: int = 1,\n",
    "                             hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"\n",
    "    Create a network for regression tasks.\n",
    "    \n",
    "    Args:\n",
    "        input_size: Number of input features\n",
    "        output_size: Number of output values (default: 1)\n",
    "        hidden_sizes: List of hidden layer sizes (default: [input_size * 2])\n",
    "        \n",
    "    Returns:\n",
    "        Sequential network for regression\n",
    "        \n",
    "    TODO: Implement regression network creation.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Use default hidden sizes if none provided\n",
    "    2. Create MLP with appropriate architecture\n",
    "    3. Use no activation on output layer (linear output)\n",
    "    \n",
    "    EXAMPLE:\n",
    "    create_regression_network(5, 1) creates:\n",
    "    Dense(5→10) → ReLU → Dense(10→1) (no activation)\n",
    "    \n",
    "    HINTS:\n",
    "    - Use create_mlp() but with no output activation\n",
    "    - For regression, we want linear outputs (no activation)\n",
    "    - You can pass None or identity function as output_activation\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "052bb51a",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def create_regression_network(input_size: int, output_size: int = 1,\n",
    "                             hidden_sizes: List[int] = None) -> Sequential:\n",
    "    \"\"\"Create a network for regression tasks.\"\"\"\n",
    "    if hidden_sizes is None:\n",
    "        hidden_sizes = [input_size // 2]  # Use input_size // 2 as default\n",
    "    \n",
    "    # Create MLP with Tanh output activation for regression\n",
    "    return create_mlp(input_size, hidden_sizes, output_size, \n",
    "                     activation=ReLU, output_activation=Tanh)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5dd183e8",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Practical Networks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0cf0dc20",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test practical networks\n",
    "print(\"Testing practical networks...\")\n",
    "\n",
    "try:\n",
    "    # Test classification network\n",
    "    class_net = create_classification_network(input_size=5, num_classes=1)\n",
    "    x_class = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])\n",
    "    y_class = class_net(x_class)\n",
    "    print(f\"✅ Classification output: {y_class}\")\n",
    "    print(f\"✅ Output range: [{np.min(y_class.data):.3f}, {np.max(y_class.data):.3f}]\")\n",
    "    \n",
    "    # Test regression network\n",
    "    reg_net = create_regression_network(input_size=3, output_size=1)\n",
    "    x_reg = Tensor([[1.0, 2.0, 3.0]])\n",
    "    y_reg = reg_net(x_reg)\n",
    "    print(f\"✅ Regression output: {y_reg}\")\n",
    "    print(f\"✅ Output range: [{np.min(y_reg.data):.3f}, {np.max(y_reg.data):.3f}]\")\n",
    "    \n",
    "    print(\"🎉 Practical networks work!\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement the network creation functions above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da4b34d4",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 7: Network Behavior Analysis\n",
    "\n",
    "Let's create tools to analyze how networks behave with different inputs and understand their capabilities.\n",
    "\n",
    "### Why Behavior Analysis Matters\n",
    "- **Understanding**: Learn what patterns networks can learn\n",
    "- **Debugging**: Identify when networks fail\n",
    "- **Design**: Choose appropriate architectures\n",
    "- **Validation**: Ensure networks work as expected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9cbf0f3",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "def analyze_network_behavior(network: Sequential, input_data: Tensor, \n",
    "                           title: str = \"Network Behavior Analysis\"):\n",
    "    \"\"\"\n",
    "    Analyze how a network behaves with different inputs.\n",
    "    \n",
    "    Args:\n",
    "        network: Sequential network to analyze\n",
    "        input_data: Input tensor to test\n",
    "        title: Title for the plot\n",
    "        \n",
    "    TODO: Create an analysis showing network behavior and capabilities.\n",
    "    \n",
    "    APPROACH:\n",
    "    1. Test the network with the given input\n",
    "    2. Analyze the output characteristics\n",
    "    3. Test with variations of the input\n",
    "    4. Create visualizations showing behavior patterns\n",
    "    \n",
    "    EXAMPLE:\n",
    "    Test network with original input and noisy versions\n",
    "    Show how output changes with input variations\n",
    "    \n",
    "    HINTS:\n",
    "    - Test the original input\n",
    "    - Create variations (noise, scaling, etc.)\n",
    "    - Compare outputs across variations\n",
    "    - Show statistics and patterns\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"Student implementation required\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f002ab23",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def analyze_network_behavior(network: Sequential, input_data: Tensor, \n",
    "                           title: str = \"Network Behavior Analysis\"):\n",
    "    \"\"\"Analyze how a network behaves with different inputs.\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        print(\"📊 Visualization disabled during testing\")\n",
    "        return\n",
    "    \n",
    "    # Test original input\n",
    "    original_output = network(input_data)\n",
    "    \n",
    "    # Create variations\n",
    "    noise_levels = [0.0, 0.1, 0.2, 0.5]\n",
    "    outputs = []\n",
    "    \n",
    "    for noise in noise_levels:\n",
    "        noisy_input = Tensor(input_data.data + noise * np.random.randn(*input_data.data.shape))\n",
    "        output = network(noisy_input)\n",
    "        outputs.append(output.data.flatten())\n",
    "    \n",
    "    # Create analysis plot\n",
    "    fig, axes = plt.subplots(2, 2, figsize=(12, 10))\n",
    "    \n",
    "    # Original output\n",
    "    axes[0, 0].hist(outputs[0], bins=20, alpha=0.7)\n",
    "    axes[0, 0].set_title('Original Input Output')\n",
    "    axes[0, 0].set_xlabel('Value')\n",
    "    axes[0, 0].set_ylabel('Frequency')\n",
    "    \n",
    "    # Output stability\n",
    "    output_means = [np.mean(out) for out in outputs]\n",
    "    output_stds = [np.std(out) for out in outputs]\n",
    "    axes[0, 1].plot(noise_levels, output_means, 'bo-', label='Mean')\n",
    "    axes[0, 1].fill_between(noise_levels, \n",
    "                           [m-s for m, s in zip(output_means, output_stds)],\n",
    "                           [m+s for m, s in zip(output_means, output_stds)], \n",
    "                           alpha=0.3, label='±1 Std')\n",
    "    axes[0, 1].set_xlabel('Noise Level')\n",
    "    axes[0, 1].set_ylabel('Output Value')\n",
    "    axes[0, 1].set_title('Output Stability')\n",
    "    axes[0, 1].legend()\n",
    "    \n",
    "    # Output distribution comparison\n",
    "    for i, (output, noise) in enumerate(zip(outputs, noise_levels)):\n",
    "        axes[1, 0].hist(output, bins=20, alpha=0.5, label=f'Noise={noise}')\n",
    "    axes[1, 0].set_xlabel('Output Value')\n",
    "    axes[1, 0].set_ylabel('Frequency')\n",
    "    axes[1, 0].set_title('Output Distribution Comparison')\n",
    "    axes[1, 0].legend()\n",
    "    \n",
    "    # Statistics\n",
    "    stats_text = f'Original Mean: {np.mean(outputs[0]):.3f}\\nOriginal Std: {np.std(outputs[0]):.3f}\\nOutput Range: [{np.min(outputs[0]):.3f}, {np.max(outputs[0]):.3f}]'\n",
    "    axes[1, 1].text(0.1, 0.5, stats_text, transform=axes[1, 1].transAxes, \n",
    "                    verticalalignment='center', fontsize=10)\n",
    "    axes[1, 1].set_title('Network Statistics')\n",
    "    axes[1, 1].axis('off')\n",
    "    \n",
    "    plt.suptitle(title)\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58c4d2fe",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### 🧪 Test Network Behavior Analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4241defa",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test network behavior analysis\n",
    "print(\"Testing network behavior analysis...\")\n",
    "\n",
    "try:\n",
    "    # Create a test network\n",
    "    test_network = create_classification_network(input_size=3, num_classes=1)\n",
    "    test_input = Tensor([[1.0, 2.0, 3.0]])\n",
    "    \n",
    "    # Analyze behavior\n",
    "    if _should_show_plots():\n",
    "        analyze_network_behavior(test_network, test_input, \"Test Network Behavior\")\n",
    "        print(\"✅ Network behavior analysis created!\")\n",
    "    else:\n",
    "        print(\"✅ Network behavior analysis skipped during testing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "    print(\"Make sure to implement analyze_network_behavior above!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5e6395d0",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## 🎯 Module Summary\n",
    "\n",
    "Congratulations! You've built the foundation of neural network architectures:\n",
    "\n",
    "### What You've Accomplished\n",
    "✅ **Sequential Networks**: Composing layers into complete architectures  \n",
    "✅ **MLP Creation**: Building multi-layer perceptrons  \n",
    "✅ **Network Visualization**: Understanding architecture and data flow  \n",
    "✅ **Network Comparison**: Analyzing different architectures  \n",
    "✅ **Practical Networks**: Classification and regression networks  \n",
    "✅ **Behavior Analysis**: Understanding network capabilities  \n",
    "\n",
    "### Key Concepts You've Learned\n",
    "- **Networks** are compositions of layers that transform data\n",
    "- **Architecture design** determines network capabilities\n",
    "- **Sequential networks** are the most fundamental building block\n",
    "- **Different architectures** solve different problems\n",
    "- **Visualization tools** help understand network behavior\n",
    "\n",
    "### What's Next\n",
    "In the next modules, you'll build on this foundation:\n",
    "- **Autograd**: Enable automatic differentiation for training\n",
    "- **Training**: Learn parameters using gradients and optimizers\n",
    "- **Loss Functions**: Define objectives for learning\n",
    "- **Applications**: Solve real problems with neural networks\n",
    "\n",
    "### Real-World Connection\n",
    "Your network architectures are now ready to:\n",
    "- Compose layers into complete neural networks\n",
    "- Create specialized architectures for different tasks\n",
    "- Analyze and understand network behavior\n",
    "- Integrate with the rest of the TinyTorch ecosystem\n",
    "\n",
    "**Ready for the next challenge?** Let's move on to automatic differentiation to enable training!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "090bbc0d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Final verification\n",
    "print(\"\\n\" + \"=\"*50)\n",
    "print(\"🎉 NETWORKS MODULE COMPLETE!\")\n",
    "print(\"=\"*50)\n",
    "print(\"✅ Sequential network implementation\")\n",
    "print(\"✅ MLP creation and architecture design\")\n",
    "print(\"✅ Network visualization and analysis\")\n",
    "print(\"✅ Network comparison tools\")\n",
    "print(\"✅ Practical classification and regression networks\")\n",
    "print(\"✅ Network behavior analysis\")\n",
    "print(\"\\n🚀 Ready to enable training with autograd in the next module!\") "
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "main_language": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}