TinyTorch/modules/networks/networks_dev.py
Vijay Janapa Reddi 718f52380e Improve CLI: remove redundant --module flags
- Update test, export, and clean commands to use positional arguments
- Change from 'tito module test --module dataloader' to 'tito module test dataloader'
- Eliminates redundant --module flag within module command group
- Update help text and examples to reflect new syntax
- Maintains backward compatibility with --all flag
- More intuitive and consistent CLI design
2025-07-12 00:56:00 -04:00


# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.17.1
# ---
# %% [markdown]
"""
# Module 3: Networks - Neural Network Architectures
Welcome to the Networks module! This is where we compose layers into complete neural network architectures.
## Learning Goals
- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`
- Build common architectures (MLP, CNN) from layers
- Visualize network structure and data flow
- See how architecture affects capability
- Master forward pass inference (no training yet!)
## Build → Use → Understand
1. **Build**: Compose layers into complete networks
2. **Use**: Create different architectures and run inference
3. **Understand**: How architecture design affects network behavior
## Module Dependencies
This module builds on previous modules:
- **tensor** → **activations** → **layers** → **networks**
- Clean composition: math functions → building blocks → complete systems
"""
# %% [markdown]
"""
## 📦 Where This Code Lives in the Final Package
**Learning Side:** You work in `modules/networks/networks_dev.py`
**Building Side:** Code exports to `tinytorch.core.networks`
```python
# Final package structure:
from tinytorch.core.networks import Sequential, create_mlp
from tinytorch.core.layers import Dense, Conv2D
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
from tinytorch.core.tensor import Tensor
```
**Why this matters:**
- **Learning:** Focused modules for deep understanding
- **Production:** Proper organization like PyTorch's `torch.nn`
- **Consistency:** All network architectures live together in `core.networks`
"""
# %%
#| default_exp core.networks
# Setup and imports
import numpy as np
import sys
from typing import List, Union, Optional, Callable
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.patches import FancyBboxPatch, ConnectionPatch
import seaborn as sns
# Import all the building blocks we need
from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax
print("🔥 TinyTorch Networks Module")
print(f"NumPy version: {np.__version__}")
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
print("Ready to build neural network architectures!")
# %%
#| export
import numpy as np
import sys
from typing import List, Union, Optional, Callable
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.patches import FancyBboxPatch, ConnectionPatch
import seaborn as sns
# Import our building blocks
from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax
# %%
#| hide
#| export
def _should_show_plots():
    """Check if we should show plots (disable during testing)"""
    return 'pytest' not in sys.modules and 'test' not in sys.argv
# %% [markdown]
"""
## Step 1: What is a Network?
### Definition
A **network** is a composition of layers that transforms input data into output predictions. Think of it as a pipeline of transformations:
```
Input → Layer1 → Layer2 → Layer3 → Output
```
### Why Networks Matter
- **Function composition**: Complex behavior from simple building blocks
- **Learnable parameters**: Each layer has weights that can be learned
- **Architecture design**: Different layouts solve different problems
- **Real-world applications**: Classification, regression, generation, etc.
### The Fundamental Insight
**Neural networks are just function composition!**
- Each layer is a function: `f_i(x)`
- The network is: `f(x) = f_n(...f_2(f_1(x)))`
- Complex behavior emerges from simple building blocks
### Real-World Examples
- **MLP (Multi-Layer Perceptron)**: Classic feedforward network
- **CNN (Convolutional Neural Network)**: For image processing
- **RNN (Recurrent Neural Network)**: For sequential data
- **Transformer**: For attention-based processing
### Visual Intuition
```
Input: [1, 2, 3] (3 features)
Layer1: [1.4, 2.8] (linear transformation)
Layer2: [1.4, 2.8] (nonlinearity)
Layer3: [0.7] (final prediction)
```
### The Math Behind It
For a network with layers `f_1, f_2, ..., f_n`:
```
f(x) = f_n(f_{n-1}(...f_2(f_1(x))))
```
Each layer transforms the data, and the final output is the composition of all these transformations.
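The composition rule can be sketched in plain Python before any TinyTorch classes are involved. This is only an illustration: `compose`, `scale`, `relu`, and `total` are hypothetical stand-ins for layers and operate on plain lists, not on `Tensor` objects.

```python
from functools import reduce

def compose(*functions):
    # Return f(x) = f_n(...f_2(f_1(x))): functions are applied left to right.
    def composed(x):
        return reduce(lambda value, fn: fn(value), functions, x)
    return composed

# Hypothetical stand-ins for layers, operating on plain Python lists.
def scale(values):
    return [2.0 * v for v in values]

def relu(values):
    return [max(0.0, v) for v in values]

def total(values):
    return [sum(values)]

network = compose(scale, relu, total)
print(network([1.0, -2.0, 3.0]))  # scale, then relu, then total
```

Each stand-in consumes the previous function's output, which is the same contract a `Sequential` network relies on.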
Let's start by building the most fundamental network: **Sequential**.
"""
# %%
#| export
class Sequential:
    """
    Sequential Network: Composes layers in sequence

    The most fundamental network architecture.
    Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))

    Args:
        layers: List of layers to compose

    TODO: Implement the Sequential network with forward pass.

    APPROACH:
    1. Store the list of layers as an instance variable
    2. Implement forward pass that applies each layer in sequence
    3. Make the network callable for easy use

    EXAMPLE:
        network = Sequential([
            Dense(3, 4),
            ReLU(),
            Dense(4, 2),
            Sigmoid()
        ])
        x = Tensor([[1, 2, 3]])
        y = network(x)  # Forward pass through all layers

    HINTS:
    - Store layers in self.layers
    - Use a for loop to apply each layer in order
    - Each layer's output becomes the next layer's input
    - Return the final output
    """

    def __init__(self, layers: List):
        """
        Initialize Sequential network with layers.

        Args:
            layers: List of layers to compose in order

        TODO: Store the layers and implement forward pass

        STEP-BY-STEP:
        1. Store the layers list as self.layers
        2. This creates the network architecture

        EXAMPLE:
            Sequential([Dense(3,4), ReLU(), Dense(4,2)])
            creates a 3-layer network: Dense → ReLU → Dense
        """
        raise NotImplementedError("Student implementation required")

    def forward(self, x: Tensor) -> Tensor:
        """
        Forward pass through all layers in sequence.

        Args:
            x: Input tensor

        Returns:
            Output tensor after passing through all layers

        TODO: Implement sequential forward pass through all layers

        STEP-BY-STEP:
        1. Start with the input tensor: current = x
        2. Loop through each layer in self.layers
        3. Apply each layer: current = layer(current)
        4. Return the final output

        EXAMPLE:
            Input: Tensor([[1, 2, 3]])
            Layer1 (Dense): Tensor([[1.4, 2.8]])
            Layer2 (ReLU): Tensor([[1.4, 2.8]])
            Layer3 (Dense): Tensor([[0.7]])
            Output: Tensor([[0.7]])

        HINTS:
        - Use a for loop: for layer in self.layers:
        - Apply each layer: current = layer(current)
        - The output of one layer becomes input to the next
        - Return the final result
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        """Make network callable: network(x) same as network.forward(x)"""
        return self.forward(x)
# %%
#| hide
#| export
class Sequential:
    """
    Sequential Network: Composes layers in sequence

    The most fundamental network architecture.
    Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))
    """

    def __init__(self, layers: List):
        """Initialize Sequential network with layers."""
        self.layers = layers

    def forward(self, x: Tensor) -> Tensor:
        """Forward pass through all layers in sequence."""
        # Apply each layer in order
        for layer in self.layers:
            x = layer(x)
        return x

    def __call__(self, x: Tensor) -> Tensor:
        """Make network callable: network(x) same as network.forward(x)"""
        return self.forward(x)
# %% [markdown]
"""
### 🧪 Test Your Sequential Network
"""
# %%
# Test the Sequential network
print("Testing Sequential network...")

try:
    # Create a simple 2-layer network: 3 → 4 → 2
    network = Sequential([
        Dense(input_size=3, output_size=4),
        ReLU(),
        Dense(input_size=4, output_size=2),
        Sigmoid()
    ])
    print(f"✅ Network created with {len(network.layers)} layers")

    # Test with sample data
    x = Tensor([[1.0, 2.0, 3.0]])
    print(f"✅ Input: {x}")

    # Forward pass
    y = network(x)
    print(f"✅ Output: {y}")
    print(f"✅ Output shape: {y.shape}")

    # Verify the network works
    assert y.shape == (1, 2), f"❌ Expected shape (1, 2), got {y.shape}"
    assert np.all(y.data >= 0) and np.all(y.data <= 1), "❌ Sigmoid output should be between 0 and 1"
    print("🎉 Sequential network works!")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement the Sequential network above!")
# %% [markdown]
"""
## Step 2: Understanding Network Architecture
Now let's explore how different network architectures affect the network's capabilities.
### What is Network Architecture?
**Architecture** refers to how layers are arranged and connected. It determines:
- **Capacity**: How complex the patterns it can learn are
- **Efficiency**: How many parameters and computations are needed
- **Specialization**: What types of problems it's good at
### Common Architectures
#### 1. **MLP (Multi-Layer Perceptron)**
```
Input → Dense → ReLU → Dense → ReLU → Dense → Output
```
- **Use case**: General-purpose learning
- **Strengths**: Universal approximation, simple to understand
- **Weaknesses**: Doesn't exploit spatial structure
#### 2. **CNN (Convolutional Neural Network)**
```
Input → Conv2D → ReLU → Conv2D → ReLU → Dense → Output
```
- **Use case**: Image processing, spatial data
- **Strengths**: Parameter sharing, translation equivariance
- **Weaknesses**: Assumes grid-structured input such as images
#### 3. **Deep Network**
```
Input → Dense → ReLU → Dense → ReLU → Dense → ReLU → Dense → Output
```
- **Use case**: Complex pattern recognition
- **Strengths**: High capacity, can learn complex functions
- **Weaknesses**: More parameters, harder to train
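The capacity and efficiency trade-off between these layouts can be made concrete by counting parameters. The sketch below is a rough illustration that assumes each Dense layer holds an `inputs x outputs` weight matrix plus one bias per output unit; `mlp_layer_shapes` and `count_parameters` are hypothetical helpers, not part of TinyTorch.

```python
def mlp_layer_shapes(input_size, hidden_sizes, output_size):
    # Pair consecutive sizes: each pair is one Dense layer (inputs, outputs).
    sizes = [input_size, *hidden_sizes, output_size]
    return list(zip(sizes[:-1], sizes[1:]))

def count_parameters(shapes, use_bias=True):
    # Assumption: a Dense layer stores an (inputs x outputs) weight matrix
    # plus one bias per output unit.
    return sum(n_in * n_out + (n_out if use_bias else 0) for n_in, n_out in shapes)

shallow = mlp_layer_shapes(3, [4], 1)
deep = mlp_layer_shapes(3, [4, 4, 4], 1)
print(shallow, count_parameters(shallow))
print(deep, count_parameters(deep))
```

Under that assumption the deeper layout carries roughly three times as many parameters as the shallow one for the same input and output sizes.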
Let's build some common architectures!
"""
# %%
#| export
def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int,
               activation=ReLU, output_activation=Sigmoid) -> Sequential:
    """
    Create a Multi-Layer Perceptron (MLP) network.

    Args:
        input_size: Number of input features
        hidden_sizes: List of hidden layer sizes
        output_size: Number of output features
        activation: Activation function for hidden layers (default: ReLU)
        output_activation: Activation function for output layer (default: Sigmoid)

    Returns:
        Sequential network with MLP architecture

    TODO: Implement MLP creation with alternating Dense and activation layers.

    APPROACH:
    1. Start with an empty list of layers and track current_size = input_size
    2. For each hidden size:
       - Add a Dense layer: current_size → hidden size
       - Add the hidden activation function
       - Update current_size to the hidden size
    3. Add the final Dense layer: current_size → output_size
    4. Add the output activation function
    5. Return Sequential(layers)

    EXAMPLE:
        create_mlp(3, [4, 2], 1) creates:
        Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid

    HINTS:
    - Start with layers = []
    - Add Dense layers with appropriate input/output sizes
    - Add activation functions between Dense layers
    - Don't forget the final output activation
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int,
               activation=ReLU, output_activation=Sigmoid) -> Sequential:
    """Create a Multi-Layer Perceptron (MLP) network."""
    layers = []

    # Track the input size of the next Dense layer
    current_size = input_size
    for hidden_size in hidden_sizes:
        layers.append(Dense(input_size=current_size, output_size=hidden_size))
        layers.append(activation())
        current_size = hidden_size

    # Add output layer
    layers.append(Dense(input_size=current_size, output_size=output_size))
    layers.append(output_activation())

    return Sequential(layers)
# %% [markdown]
"""
### 🧪 Test Your MLP Creation
"""
# %%
# Test MLP creation
print("Testing MLP creation...")

try:
    # Create different MLP architectures
    mlp1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
    mlp2 = create_mlp(input_size=5, hidden_sizes=[8, 4], output_size=2)
    mlp3 = create_mlp(input_size=2, hidden_sizes=[10, 6, 3], output_size=1, activation=Tanh)

    print(f"✅ MLP1: {len(mlp1.layers)} layers")
    print(f"✅ MLP2: {len(mlp2.layers)} layers")
    print(f"✅ MLP3: {len(mlp3.layers)} layers")

    # Test forward pass
    x = Tensor([[1.0, 2.0, 3.0]])
    y1 = mlp1(x)
    print(f"✅ MLP1 output: {y1}")

    x2 = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])
    y2 = mlp2(x2)
    print(f"✅ MLP2 output: {y2}")

    print("🎉 MLP creation works!")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement create_mlp above!")
# %% [markdown]
"""
## Step 3: Network Visualization and Analysis
Let's create tools to visualize and analyze network architectures. This helps us understand what our networks are doing.
### Why Visualization Matters
- **Architecture understanding**: See how data flows through the network
- **Debugging**: Identify bottlenecks and issues
- **Design**: Compare different architectures
- **Communication**: Explain networks to others
### What We'll Build
1. **Architecture visualization**: Show layer connections
2. **Data flow visualization**: See how data transforms
3. **Network comparison**: Compare different architectures
4. **Behavior analysis**: Understand network capabilities
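
Before drawing anything, a one-line text summary is often enough to check an architecture. The sketch below is a minimal illustration: the `Dense` and `ReLU` classes here are placeholders defined only for this example, and the `input_size` / `output_size` attributes are assumed rather than taken from the TinyTorch layer implementation.

```python
class Dense:
    # Placeholder with the size attributes this sketch assumes.
    def __init__(self, input_size, output_size):
        self.input_size = input_size
        self.output_size = output_size

class ReLU:
    pass

def describe(layers):
    # Build a one-line summary such as: Input → Dense(3→4) → ReLU → Output
    parts = ["Input"]
    for layer in layers:
        name = type(layer).__name__
        if hasattr(layer, "input_size") and hasattr(layer, "output_size"):
            name += f"({layer.input_size}→{layer.output_size})"
        parts.append(name)
    parts.append("Output")
    return " → ".join(parts)

print(describe([Dense(3, 4), ReLU(), Dense(4, 2)]))
```

Layers without size attributes fall back to their class name, so the helper works for activations as well.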
"""
# %%
#| export
def visualize_network_architecture(network: Sequential, title: str = "Network Architecture"):
    """
    Visualize the architecture of a Sequential network.

    Args:
        network: Sequential network to visualize
        title: Title for the plot

    TODO: Create a visualization showing the network structure.

    APPROACH:
    1. Create a matplotlib figure
    2. For each layer, draw a box showing its type and size
    3. Connect the boxes with arrows showing data flow
    4. Add labels and formatting

    EXAMPLE:
        Input → Dense(3→4) → ReLU → Dense(4→2) → Sigmoid → Output

    HINTS:
    - Use plt.subplots() to create the figure
    - Use plt.text() to add layer labels
    - Use plt.arrow() to show connections
    - Add proper spacing and formatting
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def visualize_network_architecture(network: Sequential, title: str = "Network Architecture"):
    """Visualize the architecture of a Sequential network."""
    if not _should_show_plots():
        print("📊 Visualization disabled during testing")
        return

    fig, ax = plt.subplots(1, 1, figsize=(12, 6))

    # Calculate positions: one slot per layer, plus Input and Output
    num_layers = len(network.layers)
    x_positions = np.linspace(0, 10, num_layers + 2)

    # Draw input
    ax.text(x_positions[0], 0, 'Input', ha='center', va='center',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='lightblue'))

    # Draw layers
    for i, layer in enumerate(network.layers):
        layer_name = type(layer).__name__
        ax.text(x_positions[i+1], 0, layer_name, ha='center', va='center',
                bbox=dict(boxstyle='round,pad=0.3', facecolor='lightgreen'))
        # Draw arrow from the previous box toward this layer
        ax.arrow(x_positions[i], 0, 0.8, 0, head_width=0.1, head_length=0.1,
                 fc='black', ec='black')

    # Draw the arrow from the last layer toward the output, then the output box
    ax.arrow(x_positions[-2], 0, 0.8, 0, head_width=0.1, head_length=0.1,
             fc='black', ec='black')
    ax.text(x_positions[-1], 0, 'Output', ha='center', va='center',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='lightcoral'))

    ax.set_xlim(-0.5, 10.5)
    ax.set_ylim(-0.5, 0.5)
    ax.set_title(title)
    ax.axis('off')
    plt.show()
# %% [markdown]
"""
### 🧪 Test Network Visualization
"""
# %%
# Test network visualization
print("Testing network visualization...")

try:
    # Create a test network
    test_network = Sequential([
        Dense(input_size=3, output_size=4),
        ReLU(),
        Dense(input_size=4, output_size=2),
        Sigmoid()
    ])

    # Visualize the network
    if _should_show_plots():
        visualize_network_architecture(test_network, "Test Network Architecture")
        print("✅ Network visualization created!")
    else:
        print("✅ Network visualization skipped during testing")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement visualize_network_architecture above!")
# %% [markdown]
"""
## Step 4: Data Flow Analysis
Let's create tools to analyze how data flows through the network. This helps us understand what each layer is doing.
### Why Data Flow Analysis Matters
- **Debugging**: See where data gets corrupted
- **Optimization**: Identify bottlenecks
- **Understanding**: See what each layer computes
- **Design**: Choose appropriate layer sizes
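
The tracing idea behind data flow analysis can be sketched without plotting: run the input through each layer, keep every intermediate output, and summarize it. In this illustration the layers are plain functions on Python lists and `trace` is a hypothetical helper, not a TinyTorch function.

```python
import statistics

def trace(layers, values):
    # Record (name, output, stats) for the input and after every layer.
    records = []
    current = values
    for name, layer in [("Input", None), *layers]:
        if layer is not None:
            current = layer(current)
        stats = {
            "mean": statistics.fmean(current),
            "std": statistics.pstdev(current),  # population std, like np.std's default
            "min": min(current),
            "max": max(current),
        }
        records.append((name, list(current), stats))
    return records

# Hypothetical stand-in layers: a doubling step followed by a ReLU.
layers = [
    ("Double", lambda v: [2.0 * x for x in v]),
    ("ReLU", lambda v: [max(0.0, x) for x in v]),
]
for name, output, stats in trace(layers, [-1.0, 2.0, 3.0]):
    print(name, output, stats)
```

Comparing the `min` entries before and after the ReLU step shows where negative values are clipped, which is the kind of question this analysis is meant to answer.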
"""
# %%
#| export
def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = "Data Flow Through Network"):
    """
    Visualize how data flows through the network.

    Args:
        network: Sequential network to analyze
        input_data: Input tensor to trace through the network
        title: Title for the plot

    TODO: Create a visualization showing how data transforms through each layer.

    APPROACH:
    1. Trace the input through each layer
    2. Record the output of each layer
    3. Create a visualization showing the transformations
    4. Add statistics (mean, std, range) for each layer

    EXAMPLE:
        Input: [1, 2, 3] → Layer1: [1.4, 2.8] → Layer2: [1.4, 2.8] → Output: [0.7]

    HINTS:
    - Use a for loop to apply each layer
    - Store intermediate outputs
    - Use plt.subplot() to create multiple subplots
    - Show statistics for each layer output
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def visualize_data_flow(network: Sequential, input_data: Tensor, title: str = "Data Flow Through Network"):
    """Visualize how data flows through the network."""
    if not _should_show_plots():
        print("📊 Visualization disabled during testing")
        return

    # Trace data through network
    current_data = input_data
    layer_outputs = [current_data.data.flatten()]
    layer_names = ['Input']
    for layer in network.layers:
        current_data = layer(current_data)
        layer_outputs.append(current_data.data.flatten())
        layer_names.append(type(layer).__name__)

    # Create visualization; squeeze=False keeps axes 2-D even with a single column
    fig, axes = plt.subplots(2, len(layer_outputs), figsize=(15, 8), squeeze=False)
    for i, (output, name) in enumerate(zip(layer_outputs, layer_names)):
        # Histogram
        axes[0, i].hist(output, bins=20, alpha=0.7)
        axes[0, i].set_title(f'{name}\nShape: {output.shape}')
        axes[0, i].set_xlabel('Value')
        axes[0, i].set_ylabel('Frequency')

        # Statistics
        stats_text = f'Mean: {np.mean(output):.3f}\nStd: {np.std(output):.3f}\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]'
        axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes,
                        verticalalignment='center', fontsize=10)
        axes[1, i].set_title(f'{name} Statistics')
        axes[1, i].axis('off')

    plt.suptitle(title)
    plt.tight_layout()
    plt.show()
# %% [markdown]
"""
### 🧪 Test Data Flow Visualization
"""
# %%
# Test data flow visualization
print("Testing data flow visualization...")

try:
    # Create a test network
    test_network = Sequential([
        Dense(input_size=3, output_size=4),
        ReLU(),
        Dense(input_size=4, output_size=2),
        Sigmoid()
    ])

    # Test input
    test_input = Tensor([[1.0, 2.0, 3.0]])

    # Visualize data flow
    if _should_show_plots():
        visualize_data_flow(test_network, test_input, "Test Network Data Flow")
        print("✅ Data flow visualization created!")
    else:
        print("✅ Data flow visualization skipped during testing")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement visualize_data_flow above!")
# %% [markdown]
"""
## Step 5: Network Comparison and Analysis
Let's create tools to compare different network architectures and understand their capabilities.
### Why Network Comparison Matters
- **Architecture selection**: Choose the right network for your problem
- **Performance analysis**: Understand trade-offs between different designs
- **Design insights**: Learn what makes networks effective
- **Research**: Compare new architectures to baselines
"""
# %%
#| export
def compare_networks(networks: List[Sequential], network_names: List[str],
                     input_data: Tensor, title: str = "Network Comparison"):
    """
    Compare multiple networks on the same input.

    Args:
        networks: List of Sequential networks to compare
        network_names: Names for each network
        input_data: Input tensor to test all networks
        title: Title for the plot

    TODO: Create a comparison visualization showing how different networks process the same input.

    APPROACH:
    1. Run the same input through each network
    2. Collect the outputs and intermediate results
    3. Create a visualization comparing the results
    4. Show statistics and differences

    EXAMPLE:
        Compare MLP vs Deep Network vs Wide Network on same input

    HINTS:
    - Use a for loop to test each network
    - Store outputs and any relevant statistics
    - Use plt.subplot() to create comparison plots
    - Show both outputs and intermediate layer results
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def compare_networks(networks: List[Sequential], network_names: List[str],
                     input_data: Tensor, title: str = "Network Comparison"):
    """Compare multiple networks on the same input."""
    if not _should_show_plots():
        print("📊 Visualization disabled during testing")
        return

    # Test all networks
    outputs = []
    for network in networks:
        output = network(input_data)
        outputs.append(output.data.flatten())

    # Create comparison plot; squeeze=False keeps axes 2-D even for a single network
    fig, axes = plt.subplots(2, len(networks), figsize=(15, 8), squeeze=False)
    for i, (output, name) in enumerate(zip(outputs, network_names)):
        # Output distribution
        axes[0, i].hist(output, bins=20, alpha=0.7)
        axes[0, i].set_title(f'{name}\nOutput Distribution')
        axes[0, i].set_xlabel('Value')
        axes[0, i].set_ylabel('Frequency')

        # Statistics
        stats_text = f'Mean: {np.mean(output):.3f}\nStd: {np.std(output):.3f}\nRange: [{np.min(output):.3f}, {np.max(output):.3f}]\nSize: {len(output)}'
        axes[1, i].text(0.1, 0.5, stats_text, transform=axes[1, i].transAxes,
                        verticalalignment='center', fontsize=10)
        axes[1, i].set_title(f'{name} Statistics')
        axes[1, i].axis('off')

    plt.suptitle(title)
    plt.tight_layout()
    plt.show()
# %% [markdown]
"""
### 🧪 Test Network Comparison
"""
# %%
# Test network comparison
print("Testing network comparison...")

try:
    # Create different networks
    network1 = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
    network2 = create_mlp(input_size=3, hidden_sizes=[8, 4], output_size=1)
    network3 = create_mlp(input_size=3, hidden_sizes=[2], output_size=1, activation=Tanh)
    networks = [network1, network2, network3]
    names = ["Small MLP", "Deep MLP", "Tanh MLP"]

    # Test input
    test_input = Tensor([[1.0, 2.0, 3.0]])

    # Compare networks
    if _should_show_plots():
        compare_networks(networks, names, test_input, "Network Architecture Comparison")
        print("✅ Network comparison created!")
    else:
        print("✅ Network comparison skipped during testing")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement compare_networks above!")
# %% [markdown]
"""
## Step 6: Practical Network Architectures
Now let's create some practical network architectures for common machine learning tasks.
### Common Network Types
#### 1. **Classification Networks**
- **Binary classification**: Output single probability
- **Multi-class classification**: Output probability distribution
- **Use cases**: Image classification, spam detection, sentiment analysis
#### 2. **Regression Networks**
- **Single output**: Predict continuous value
- **Multiple outputs**: Predict multiple values
- **Use cases**: Price prediction, temperature forecasting, demand estimation
#### 3. **Feature Extraction Networks**
- **Encoder networks**: Compress data into features
- **Use cases**: Dimensionality reduction, feature learning, representation learning
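
The difference between a single probability and a probability distribution can be sketched with plain-Python versions of sigmoid and softmax. These helpers are illustrative only and are not the TinyTorch `Sigmoid` or `Softmax` classes.

```python
import math

def sigmoid(z):
    # Squash one score into (0, 1): a single probability for binary tasks.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    # Subtract the max score first for numerical stability, then normalize
    # so the outputs are positive and sum to 1.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))
probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))
```

A regression output, by contrast, is usually left as a raw score, because squashing it would restrict the range of values the network can predict.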
"""
# %%
#| export
def create_classification_network(input_size: int, num_classes: int,
                                  hidden_sizes: Optional[List[int]] = None) -> Sequential:
    """
    Create a network for classification tasks.

    Args:
        input_size: Number of input features
        num_classes: Number of output classes
        hidden_sizes: List of hidden layer sizes (default: [input_size // 2])

    Returns:
        Sequential network for classification

    TODO: Implement classification network creation.

    APPROACH:
    1. Use default hidden sizes if none provided
    2. Create MLP with appropriate architecture
    3. Use Sigmoid for binary classification (num_classes=1)
    4. Use Softmax for multi-class classification (num_classes > 1)

    EXAMPLE:
        create_classification_network(10, 3) creates:
        Dense(10→5) → ReLU → Dense(5→3) → Softmax

    HINTS:
    - Use create_mlp() function
    - Choose the output activation based on num_classes
    - For binary classification (num_classes=1), use Sigmoid
    - For multi-class, use Softmax so the outputs form a probability distribution
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def create_classification_network(input_size: int, num_classes: int,
                                  hidden_sizes: Optional[List[int]] = None) -> Sequential:
    """Create a network for classification tasks."""
    if hidden_sizes is None:
        # Default to input_size // 2, but never create a zero-width hidden layer
        hidden_sizes = [max(1, input_size // 2)]

    # Sigmoid for a single probability, Softmax for a distribution over classes
    output_activation = Sigmoid if num_classes == 1 else Softmax

    return create_mlp(input_size, hidden_sizes, num_classes,
                      activation=ReLU, output_activation=output_activation)
# %%
#| export
def create_regression_network(input_size: int, output_size: int = 1,
                              hidden_sizes: Optional[List[int]] = None) -> Sequential:
    """
    Create a network for regression tasks.

    Args:
        input_size: Number of input features
        output_size: Number of output values (default: 1)
        hidden_sizes: List of hidden layer sizes (default: [input_size // 2])

    Returns:
        Sequential network for regression

    TODO: Implement regression network creation.

    APPROACH:
    1. Use default hidden sizes if none provided
    2. Create MLP with appropriate architecture
    3. Use Tanh on the output layer (predictions are bounded to [-1, 1])

    EXAMPLE:
        create_regression_network(5, 1) creates:
        Dense(5→2) → ReLU → Dense(2→1) → Tanh

    HINTS:
    - Use create_mlp() with ReLU hidden activations
    - create_mlp() instantiates output_activation, so pass an activation class rather than None
    - The reference solution uses Tanh, so targets should be scaled to [-1, 1]
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def create_regression_network(input_size: int, output_size: int = 1,
                              hidden_sizes: Optional[List[int]] = None) -> Sequential:
    """Create a network for regression tasks."""
    if hidden_sizes is None:
        # Default to input_size // 2, but never create a zero-width hidden layer
        hidden_sizes = [max(1, input_size // 2)]

    # Create MLP with Tanh output activation (predictions are bounded to [-1, 1])
    return create_mlp(input_size, hidden_sizes, output_size,
                      activation=ReLU, output_activation=Tanh)
# %% [markdown]
"""
### 🧪 Test Practical Networks
"""
# %%
# Test practical networks
print("Testing practical networks...")

try:
    # Test classification network
    class_net = create_classification_network(input_size=5, num_classes=1)
    x_class = Tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])
    y_class = class_net(x_class)
    print(f"✅ Classification output: {y_class}")
    print(f"✅ Output range: [{np.min(y_class.data):.3f}, {np.max(y_class.data):.3f}]")

    # Test regression network
    reg_net = create_regression_network(input_size=3, output_size=1)
    x_reg = Tensor([[1.0, 2.0, 3.0]])
    y_reg = reg_net(x_reg)
    print(f"✅ Regression output: {y_reg}")
    print(f"✅ Output range: [{np.min(y_reg.data):.3f}, {np.max(y_reg.data):.3f}]")

    print("🎉 Practical networks work!")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement the network creation functions above!")
# %% [markdown]
"""
## Step 7: Network Behavior Analysis
Let's create tools to analyze how networks behave with different inputs and understand their capabilities.
### Why Behavior Analysis Matters
- **Understanding**: Learn what patterns networks can learn
- **Debugging**: Identify when networks fail
- **Design**: Choose appropriate architectures
- **Validation**: Ensure networks work as expected
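
One simple behavior probe is noise sensitivity: perturb the input and measure how far the output moves. The sketch below is a minimal plain-Python illustration; `sensitivity`, `tiny_network`, and `WEIGHTS` are hypothetical names, and the fixed-weight tanh unit stands in for a real network.

```python
import math
import random

def sensitivity(network, values, noise_level, trials=100, seed=0):
    # Average absolute change in a scalar output when Gaussian noise with
    # standard deviation noise_level is added to every input feature.
    rng = random.Random(seed)
    baseline = network(values)
    total = 0.0
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, noise_level) for v in values]
        total += abs(network(noisy) - baseline)
    return total / trials

# Hypothetical one-unit network: tanh of a fixed weighted sum.
WEIGHTS = [0.5, -0.25, 0.1]

def tiny_network(values):
    return math.tanh(sum(w * v for w, v in zip(WEIGHTS, values)))

for level in [0.0, 0.1, 0.5]:
    print(level, sensitivity(tiny_network, [1.0, 2.0, 3.0], level))
```

Seeding the generator keeps the measurement repeatable, which makes it easier to compare two architectures on the same perturbations.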
"""
# %%
#| export
def analyze_network_behavior(network: Sequential, input_data: Tensor,
                             title: str = "Network Behavior Analysis"):
    """
    Analyze how a network behaves with different inputs.

    Args:
        network: Sequential network to analyze
        input_data: Input tensor to test
        title: Title for the plot

    TODO: Create an analysis showing network behavior and capabilities.

    APPROACH:
    1. Test the network with the given input
    2. Analyze the output characteristics
    3. Test with variations of the input
    4. Create visualizations showing behavior patterns

    EXAMPLE:
        Test network with original input and noisy versions
        Show how output changes with input variations

    HINTS:
    - Test the original input
    - Create variations (noise, scaling, etc.)
    - Compare outputs across variations
    - Show statistics and patterns
    """
    raise NotImplementedError("Student implementation required")
# %%
#| hide
#| export
def analyze_network_behavior(network: Sequential, input_data: Tensor,
                             title: str = "Network Behavior Analysis"):
    """Analyze how a network behaves with different inputs."""
    if not _should_show_plots():
        print("📊 Visualization disabled during testing")
        return

    # Test original input
    original_output = network(input_data)

    # Create variations
    noise_levels = [0.0, 0.1, 0.2, 0.5]
    outputs = []
    for noise in noise_levels:
        noisy_input = Tensor(input_data.data + noise * np.random.randn(*input_data.data.shape))
        output = network(noisy_input)
        outputs.append(output.data.flatten())

    # Create analysis plot
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))

    # Original output
    axes[0, 0].hist(outputs[0], bins=20, alpha=0.7)
    axes[0, 0].set_title('Original Input Output')
    axes[0, 0].set_xlabel('Value')
    axes[0, 0].set_ylabel('Frequency')

    # Output stability
    output_means = [np.mean(out) for out in outputs]
    output_stds = [np.std(out) for out in outputs]
    axes[0, 1].plot(noise_levels, output_means, 'bo-', label='Mean')
    axes[0, 1].fill_between(noise_levels,
                            [m-s for m, s in zip(output_means, output_stds)],
                            [m+s for m, s in zip(output_means, output_stds)],
                            alpha=0.3, label='±1 Std')
    axes[0, 1].set_xlabel('Noise Level')
    axes[0, 1].set_ylabel('Output Value')
    axes[0, 1].set_title('Output Stability')
    axes[0, 1].legend()

    # Output distribution comparison
    for i, (output, noise) in enumerate(zip(outputs, noise_levels)):
        axes[1, 0].hist(output, bins=20, alpha=0.5, label=f'Noise={noise}')
    axes[1, 0].set_xlabel('Output Value')
    axes[1, 0].set_ylabel('Frequency')
    axes[1, 0].set_title('Output Distribution Comparison')
    axes[1, 0].legend()

    # Statistics
    stats_text = f'Original Mean: {np.mean(outputs[0]):.3f}\nOriginal Std: {np.std(outputs[0]):.3f}\nOutput Range: [{np.min(outputs[0]):.3f}, {np.max(outputs[0]):.3f}]'
    axes[1, 1].text(0.1, 0.5, stats_text, transform=axes[1, 1].transAxes,
                    verticalalignment='center', fontsize=10)
    axes[1, 1].set_title('Network Statistics')
    axes[1, 1].axis('off')

    plt.suptitle(title)
    plt.tight_layout()
    plt.show()
# %% [markdown]
"""
### 🧪 Test Network Behavior Analysis
"""
# %%
# Test network behavior analysis
print("Testing network behavior analysis...")

try:
    # Create a test network
    test_network = create_classification_network(input_size=3, num_classes=1)
    test_input = Tensor([[1.0, 2.0, 3.0]])

    # Analyze behavior
    if _should_show_plots():
        analyze_network_behavior(test_network, test_input, "Test Network Behavior")
        print("✅ Network behavior analysis created!")
    else:
        print("✅ Network behavior analysis skipped during testing")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement analyze_network_behavior above!")
# %% [markdown]
"""
## 🎯 Module Summary
Congratulations! You've built the foundation of neural network architectures:
### What You've Accomplished
✅ **Sequential Networks**: Composing layers into complete architectures
✅ **MLP Creation**: Building multi-layer perceptrons
✅ **Network Visualization**: Understanding architecture and data flow
✅ **Network Comparison**: Analyzing different architectures
✅ **Practical Networks**: Classification and regression networks
✅ **Behavior Analysis**: Understanding network capabilities
### Key Concepts You've Learned
- **Networks** are compositions of layers that transform data
- **Architecture design** determines network capabilities
- **Sequential networks** are the most fundamental building block
- **Different architectures** solve different problems
- **Visualization tools** help understand network behavior
### What's Next
In the next modules, you'll build on this foundation:
- **Autograd**: Enable automatic differentiation for training
- **Training**: Learn parameters using gradients and optimizers
- **Loss Functions**: Define objectives for learning
- **Applications**: Solve real problems with neural networks
### Real-World Connection
Your network architectures are now ready to:
- Compose layers into complete neural networks
- Create specialized architectures for different tasks
- Analyze and understand network behavior
- Integrate with the rest of the TinyTorch ecosystem
**Ready for the next challenge?** Let's move on to automatic differentiation to enable training!
"""
# %%
# Final verification
print("\n" + "="*50)
print("🎉 NETWORKS MODULE COMPLETE!")
print("="*50)
print("✅ Sequential network implementation")
print("✅ MLP creation and architecture design")
print("✅ Network visualization and analysis")
print("✅ Network comparison tools")
print("✅ Practical classification and regression networks")
print("✅ Network behavior analysis")
print("\n🚀 Ready to enable training with autograd in the next module!")