mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-06-01 00:36:46 -05:00
- Add 17 intermediate test points across 6 modules for immediate student feedback - Tensor module: Tests after creation, properties, arithmetic, and operators - Activations module: Tests after each activation function (ReLU, Sigmoid, Tanh, Softmax) - Layers module: Tests after matrix multiplication and Dense layer implementation - Networks module: Tests after Sequential class and MLP creation - CNN module: Tests after convolution, Conv2D layer, and flatten operations - DataLoader module: Tests after Dataset interface and DataLoader class - All tests include visual progress indicators and behavioral explanations - Maintains NBGrader compliance with proper metadata and point allocation - Enables steady forward progress and better debugging for students - 100% test success rate across all modules and integration testing
620 lines
22 KiB
Python
620 lines
22 KiB
Python
# ---
|
|
# jupyter:
|
|
# jupytext:
|
|
# text_representation:
|
|
# extension: .py
|
|
# format_name: percent
|
|
# format_version: '1.3'
|
|
# jupytext_version: 1.17.1
|
|
# ---
|
|
|
|
# %% [markdown]
|
|
"""
|
|
# Module 4: Networks - Neural Network Architectures
|
|
|
|
Welcome to the Networks module! This is where we compose layers into complete neural network architectures.
|
|
|
|
## Learning Goals
|
|
- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`
|
|
- Build the Sequential network architecture for composing layers
|
|
- Create common network patterns like MLPs (Multi-Layer Perceptrons)
|
|
- Visualize network architectures and understand their capabilities
|
|
- Master forward pass inference through complete networks
|
|
|
|
## Build → Use → Understand
|
|
1. **Build**: Sequential networks that compose layers into complete architectures
|
|
2. **Use**: Create different network patterns and run inference
|
|
3. **Understand**: How architecture design affects network behavior and capability
|
|
"""
|
|
|
|
# %% nbgrader={"grade": false, "grade_id": "networks-imports", "locked": false, "schema_version": 3, "solution": false, "task": false}
|
|
#| default_exp core.networks
|
|
|
|
#| export
|
|
import numpy as np
|
|
import sys
|
|
import os
|
|
from typing import List, Union, Optional, Callable
|
|
import matplotlib.pyplot as plt
|
|
import matplotlib.patches as patches
|
|
from matplotlib.patches import FancyBboxPatch, ConnectionPatch
|
|
import seaborn as sns
|
|
|
|
# Import all the building blocks we need - try package first, then local modules
|
|
try:
|
|
from tinytorch.core.tensor import Tensor
|
|
from tinytorch.core.layers import Dense
|
|
from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax
|
|
except ImportError:
|
|
# For development, import from local modules
|
|
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))
|
|
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '02_activations'))
|
|
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '03_layers'))
|
|
from tensor_dev import Tensor
|
|
from activations_dev import ReLU, Sigmoid, Tanh, Softmax
|
|
from layers_dev import Dense
|
|
|
|
# %% nbgrader={"grade": false, "grade_id": "networks-setup", "locked": false, "schema_version": 3, "solution": false, "task": false}
|
|
#| hide
|
|
#| export
|
|
def _should_show_plots():
|
|
"""Check if we should show plots (disable during testing)"""
|
|
# Check multiple conditions that indicate we're in test mode
|
|
is_pytest = (
|
|
'pytest' in sys.modules or
|
|
'test' in sys.argv or
|
|
os.environ.get('PYTEST_CURRENT_TEST') is not None or
|
|
any('test' in arg for arg in sys.argv) or
|
|
any('pytest' in arg for arg in sys.argv)
|
|
)
|
|
|
|
# Show plots in development mode (when not in test mode)
|
|
return not is_pytest
|
|
|
|
# %% nbgrader={"grade": false, "grade_id": "networks-welcome", "locked": false, "schema_version": 3, "solution": false, "task": false}
|
|
print("🔥 TinyTorch Networks Module")
|
|
print(f"NumPy version: {np.__version__}")
|
|
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
|
|
print("Ready to build neural network architectures!")
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## 📦 Where This Code Lives in the Final Package
|
|
|
|
**Learning Side:** You work in `modules/source/04_networks/networks_dev.py`
|
|
**Building Side:** Code exports to `tinytorch.core.networks`
|
|
|
|
```python
|
|
# Final package structure:
|
|
from tinytorch.core.networks import Sequential, MLP # Network architectures!
|
|
from tinytorch.core.layers import Dense, Conv2D # Building blocks
|
|
from tinytorch.core.activations import ReLU, Sigmoid, Tanh # Nonlinearity
|
|
from tinytorch.core.tensor import Tensor # Foundation
|
|
```
|
|
|
|
**Why this matters:**
|
|
- **Learning:** Focused modules for deep understanding
|
|
- **Production:** Proper organization like PyTorch's `torch.nn.Sequential`
|
|
- **Consistency:** All network architectures live together in `core.networks`
|
|
- **Integration:** Works seamlessly with layers, activations, and tensors
|
|
"""
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## 🧠 The Mathematical Foundation of Neural Networks
|
|
|
|
### Function Composition at Scale
|
|
Neural networks are fundamentally about **function composition**:
|
|
|
|
```
|
|
f(x) = f_n(f_{n-1}(...f_2(f_1(x))))
|
|
```
|
|
|
|
Each layer is a function, and the network is the composition of all these functions.
|
|
|
|
### Why Function Composition is Powerful
|
|
- **Modularity**: Each layer has a specific purpose
|
|
- **Composability**: Simple functions combine to create complex behaviors
|
|
- **Universal approximation**: Deep compositions can approximate any function
|
|
- **Hierarchical learning**: Early layers learn simple features, later layers learn complex patterns
|
|
|
|
### The Architecture Design Space
|
|
Different arrangements of layers create different capabilities:
|
|
- **Depth**: More layers → more complex representations
|
|
- **Width**: More neurons per layer → more capacity per layer
|
|
- **Connections**: How layers connect affects information flow
|
|
- **Activation functions**: Add nonlinearity for complex patterns
|
|
|
|
### Connection to Real ML Systems
|
|
Every framework uses sequential composition:
|
|
- **PyTorch**: `torch.nn.Sequential([layer1, layer2, layer3])`
|
|
- **TensorFlow**: `tf.keras.Sequential([layer1, layer2, layer3])`
|
|
- **JAX**: `jax.nn.Sequential([layer1, layer2, layer3])`
|
|
- **TinyTorch**: `tinytorch.core.networks.Sequential([layer1, layer2, layer3])` (what we're building!)
|
|
|
|
### Performance and Design Considerations
|
|
- **Forward pass efficiency**: Sequential computation through layers
|
|
- **Memory management**: Intermediate activations storage
|
|
- **Gradient flow**: How information flows backward (for training)
|
|
- **Architecture search**: Finding optimal network structures
|
|
"""
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## Step 1: What is a Network?
|
|
|
|
### Definition
|
|
A **network** is a composition of layers that transforms input data into output predictions. Think of it as a pipeline of transformations:
|
|
|
|
```
|
|
Input → Layer1 → Layer2 → Layer3 → Output
|
|
```
|
|
|
|
### Why Networks Matter
|
|
- **Function composition**: Complex behavior from simple building blocks
|
|
- **Learnable parameters**: Each layer has weights that can be learned
|
|
- **Architecture design**: Different layouts solve different problems
|
|
- **Real-world applications**: Classification, regression, generation, etc.
|
|
|
|
### The Fundamental Insight
|
|
**Neural networks are just function composition!**
|
|
- Each layer is a function: `f_i(x)`
|
|
- The network is: `f(x) = f_n(...f_2(f_1(x)))`
|
|
- Complex behavior emerges from simple building blocks
|
|
|
|
### Real-World Examples
|
|
- **MLP (Multi-Layer Perceptron)**: Classic feedforward network
|
|
- **CNN (Convolutional Neural Network)**: For image processing
|
|
- **RNN (Recurrent Neural Network)**: For sequential data
|
|
- **Transformer**: For attention-based processing
|
|
|
|
### Visual Intuition
|
|
```
|
|
Input: [1, 2, 3] (3 features)
|
|
Layer1: [1.4, 2.8] (linear transformation)
|
|
Layer2: [1.4, 2.8] (nonlinearity)
|
|
Layer3: [0.7] (final prediction)
|
|
```
|
|
|
|
Let's start by building the most fundamental network: **Sequential**.
|
|
"""
|
|
|
|
# %% nbgrader={"grade": false, "grade_id": "sequential-class", "locked": false, "schema_version": 3, "solution": true, "task": false}
|
|
#| export
|
|
class Sequential:
|
|
"""
|
|
Sequential Network: Composes layers in sequence
|
|
|
|
The most fundamental network architecture.
|
|
Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))
|
|
"""
|
|
|
|
def __init__(self, layers: List):
|
|
"""
|
|
Initialize Sequential network with layers.
|
|
|
|
Args:
|
|
layers: List of layers to compose in order
|
|
|
|
TODO: Store the layers and implement forward pass
|
|
|
|
APPROACH:
|
|
1. Store the layers list as an instance variable
|
|
2. This creates the network architecture ready for forward pass
|
|
|
|
EXAMPLE:
|
|
Sequential([Dense(3,4), ReLU(), Dense(4,2)])
|
|
creates a 3-layer network: Dense → ReLU → Dense
|
|
|
|
HINTS:
|
|
- Store layers in self.layers
|
|
- This is the foundation for all network architectures
|
|
"""
|
|
### BEGIN SOLUTION
|
|
self.layers = layers
|
|
### END SOLUTION
|
|
|
|
def forward(self, x: Tensor) -> Tensor:
|
|
"""
|
|
Forward pass through all layers in sequence.
|
|
|
|
Args:
|
|
x: Input tensor
|
|
|
|
Returns:
|
|
Output tensor after passing through all layers
|
|
|
|
TODO: Implement sequential forward pass through all layers
|
|
|
|
APPROACH:
|
|
1. Start with the input tensor
|
|
2. Apply each layer in sequence
|
|
3. Each layer's output becomes the next layer's input
|
|
4. Return the final output
|
|
|
|
EXAMPLE:
|
|
Input: Tensor([[1, 2, 3]])
|
|
Layer1 (Dense): Tensor([[1.4, 2.8]])
|
|
Layer2 (ReLU): Tensor([[1.4, 2.8]])
|
|
Layer3 (Dense): Tensor([[0.7]])
|
|
Output: Tensor([[0.7]])
|
|
|
|
HINTS:
|
|
- Use a for loop: for layer in self.layers:
|
|
- Apply each layer: x = layer(x)
|
|
- The output of one layer becomes input to the next
|
|
- Return the final result
|
|
"""
|
|
### BEGIN SOLUTION
|
|
# Apply each layer in sequence
|
|
for layer in self.layers:
|
|
x = layer(x)
|
|
return x
|
|
### END SOLUTION
|
|
|
|
def __call__(self, x: Tensor) -> Tensor:
|
|
"""Make network callable: network(x) same as network.forward(x)"""
|
|
return self.forward(x)
|
|
|
|
# %% [markdown]
|
|
"""
|
|
### 🧪 Quick Test: Sequential Network
|
|
|
|
Let's test your Sequential network implementation! This is the foundation of all neural network architectures.
|
|
"""
|
|
|
|
# %% nbgrader={"grade": true, "grade_id": "test-sequential-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
|
|
# Test Sequential network immediately after implementation
|
|
print("🔬 Testing Sequential network...")
|
|
|
|
# Create a simple 2-layer network: 3 → 4 → 2
|
|
try:
|
|
network = Sequential([
|
|
Dense(input_size=3, output_size=4),
|
|
ReLU(),
|
|
Dense(input_size=4, output_size=2),
|
|
Sigmoid()
|
|
])
|
|
|
|
print(f"Network created with {len(network.layers)} layers")
|
|
print("✅ Sequential network creation successful")
|
|
|
|
# Test with sample data
|
|
x = Tensor([[1.0, 2.0, 3.0]])
|
|
print(f"Input: {x}")
|
|
|
|
# Forward pass
|
|
y = network(x)
|
|
print(f"Output: {y}")
|
|
print(f"Output shape: {y.shape}")
|
|
|
|
# Verify the network works
|
|
assert y.shape == (1, 2), f"Expected shape (1, 2), got {y.shape}"
|
|
print("✅ Sequential network produces correct output shape")
|
|
|
|
# Test that sigmoid output is in valid range
|
|
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
|
|
print("✅ Sequential network output is in valid range")
|
|
|
|
# Test that layers are stored correctly
|
|
assert len(network.layers) == 4, f"Expected 4 layers, got {len(network.layers)}"
|
|
print("✅ Sequential network stores layers correctly")
|
|
|
|
except Exception as e:
|
|
print(f"❌ Sequential network test failed: {e}")
|
|
raise
|
|
|
|
# Show the network architecture
|
|
print("🎯 Sequential network behavior:")
|
|
print(" Applies layers in sequence: f(g(h(x)))")
|
|
print(" Input flows through each layer in order")
|
|
print(" Output of layer i becomes input of layer i+1")
|
|
print("📈 Progress: Sequential network ✓")
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## Step 2: Building Multi-Layer Perceptrons (MLPs)
|
|
|
|
### What is an MLP?
|
|
A **Multi-Layer Perceptron** is the classic neural network architecture:
|
|
|
|
```
|
|
Input → Dense → Activation → Dense → Activation → ... → Dense → Output
|
|
```
|
|
|
|
### Why MLPs are Important
|
|
- **Universal approximation**: Can approximate any continuous function
|
|
- **Foundation**: Basis for understanding all neural networks
|
|
- **Versatile**: Works for classification, regression, and more
|
|
- **Simple**: Easy to understand and implement
|
|
|
|
### MLP Architecture Pattern
|
|
```
|
|
create_mlp(3, [4, 2], 1) creates:
|
|
Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid
|
|
```
|
|
|
|
### Real-World Applications
|
|
- **Tabular data**: Customer analytics, financial modeling
|
|
- **Feature learning**: Learning representations from raw data
|
|
- **Classification**: Spam detection, medical diagnosis
|
|
- **Regression**: Price prediction, time series forecasting
|
|
"""
|
|
|
|
# %% nbgrader={"grade": false, "grade_id": "create-mlp", "locked": false, "schema_version": 3, "solution": true, "task": false}
|
|
#| export
|
|
def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int,
|
|
activation=ReLU, output_activation=Sigmoid) -> Sequential:
|
|
"""
|
|
Create a Multi-Layer Perceptron (MLP) network.
|
|
|
|
Args:
|
|
input_size: Number of input features
|
|
hidden_sizes: List of hidden layer sizes
|
|
output_size: Number of output features
|
|
activation: Activation function for hidden layers (default: ReLU)
|
|
output_activation: Activation function for output layer (default: Sigmoid)
|
|
|
|
Returns:
|
|
Sequential network with MLP architecture
|
|
|
|
TODO: Implement MLP creation with alternating Dense and activation layers.
|
|
|
|
APPROACH:
|
|
1. Start with an empty list of layers
|
|
2. Add layers in this pattern:
|
|
- Dense(input_size → first_hidden_size)
|
|
- Activation()
|
|
- Dense(first_hidden_size → second_hidden_size)
|
|
- Activation()
|
|
- ...
|
|
- Dense(last_hidden_size → output_size)
|
|
- Output_activation()
|
|
3. Return Sequential(layers)
|
|
|
|
EXAMPLE:
|
|
create_mlp(3, [4, 2], 1) creates:
|
|
Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid
|
|
|
|
HINTS:
|
|
- Start with layers = []
|
|
- Track current_size starting with input_size
|
|
- For each hidden_size: add Dense(current_size, hidden_size), then activation
|
|
- Finally add Dense(last_hidden_size, output_size), then output_activation
|
|
- Return Sequential(layers)
|
|
"""
|
|
### BEGIN SOLUTION
|
|
layers = []
|
|
current_size = input_size
|
|
|
|
# Add hidden layers with activations
|
|
for hidden_size in hidden_sizes:
|
|
layers.append(Dense(current_size, hidden_size))
|
|
layers.append(activation())
|
|
current_size = hidden_size
|
|
|
|
# Add output layer with output activation
|
|
layers.append(Dense(current_size, output_size))
|
|
layers.append(output_activation())
|
|
|
|
return Sequential(layers)
|
|
### END SOLUTION
|
|
|
|
# %% [markdown]
|
|
"""
|
|
### 🧪 Quick Test: MLP Creation
|
|
|
|
Let's test your MLP creation function! This builds complete neural networks with a single function call.
|
|
"""
|
|
|
|
# %% nbgrader={"grade": true, "grade_id": "test-mlp-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
|
|
# Test MLP creation immediately after implementation
|
|
print("🔬 Testing MLP creation...")
|
|
|
|
# Create a simple MLP: 3 → 4 → 2 → 1
|
|
try:
|
|
mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)
|
|
|
|
print(f"MLP created with {len(mlp.layers)} layers")
|
|
print("✅ MLP creation successful")
|
|
|
|
# Test the structure - should have 6 layers: Dense, ReLU, Dense, ReLU, Dense, Sigmoid
|
|
expected_layers = 6 # 3 Dense + 2 ReLU + 1 Sigmoid
|
|
assert len(mlp.layers) == expected_layers, f"Expected {expected_layers} layers, got {len(mlp.layers)}"
|
|
print("✅ MLP has correct number of layers")
|
|
|
|
# Test with sample data
|
|
x = Tensor([[1.0, 2.0, 3.0]])
|
|
y = mlp(x)
|
|
print(f"MLP input: {x}")
|
|
print(f"MLP output: {y}")
|
|
print(f"MLP output shape: {y.shape}")
|
|
|
|
# Verify the output
|
|
assert y.shape == (1, 1), f"Expected shape (1, 1), got {y.shape}"
|
|
print("✅ MLP produces correct output shape")
|
|
|
|
# Test that sigmoid output is in valid range
|
|
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
|
|
print("✅ MLP output is in valid range")
|
|
|
|
except Exception as e:
|
|
print(f"❌ MLP creation test failed: {e}")
|
|
raise
|
|
|
|
# Test different architectures
|
|
try:
|
|
# Test shallow network
|
|
shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
|
|
assert len(shallow_net.layers) == 4, f"Shallow network should have 4 layers, got {len(shallow_net.layers)}"
|
|
|
|
# Test deep network
|
|
deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)
|
|
assert len(deep_net.layers) == 8, f"Deep network should have 8 layers, got {len(deep_net.layers)}"
|
|
|
|
# Test wide network
|
|
wide_net = create_mlp(input_size=3, hidden_sizes=[10], output_size=1)
|
|
assert len(wide_net.layers) == 4, f"Wide network should have 4 layers, got {len(wide_net.layers)}"
|
|
|
|
print("✅ Different MLP architectures work correctly")
|
|
|
|
except Exception as e:
|
|
print(f"❌ MLP architecture test failed: {e}")
|
|
raise
|
|
|
|
# Show the MLP pattern
|
|
print("🎯 MLP creation pattern:")
|
|
print(" Input → Dense → Activation → Dense → Activation → ... → Dense → Output_Activation")
|
|
print(" Automatically creates the complete architecture")
|
|
print(" Handles any number of hidden layers")
|
|
print("📈 Progress: Sequential network ✓, MLP creation ✓")
|
|
print("🚀 Complete neural networks ready!")
|
|
|
|
# %% [markdown]
|
|
"""
|
|
### 🧪 Test Your Network Implementations
|
|
|
|
Once you implement the functions above, run these cells to test them:
|
|
"""
|
|
|
|
# %% nbgrader={"grade": true, "grade_id": "test-sequential", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
|
|
# Test the Sequential network
|
|
print("Testing Sequential network...")
|
|
|
|
# Create a simple 2-layer network: 3 → 4 → 2
|
|
network = Sequential([
|
|
Dense(input_size=3, output_size=4),
|
|
ReLU(),
|
|
Dense(input_size=4, output_size=2),
|
|
Sigmoid()
|
|
])
|
|
|
|
print(f"Network created with {len(network.layers)} layers")
|
|
|
|
# Test with sample data
|
|
x = Tensor([[1.0, 2.0, 3.0]])
|
|
print(f"Input: {x}")
|
|
|
|
# Forward pass
|
|
y = network(x)
|
|
print(f"Output: {y}")
|
|
print(f"Output shape: {y.shape}")
|
|
|
|
# Verify the network works
|
|
assert y.shape == (1, 2), f"Expected shape (1, 2), got {y.shape}"
|
|
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
|
|
|
|
print("✅ Sequential network tests passed!")
|
|
|
|
# %% nbgrader={"grade": true, "grade_id": "test-mlp", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
|
|
# Test MLP creation
|
|
print("Testing MLP creation...")
|
|
|
|
# Create a simple MLP: 3 → 4 → 2 → 1
|
|
mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)
|
|
|
|
print(f"MLP created with {len(mlp.layers)} layers")
|
|
|
|
# Test the structure
|
|
expected_layers = [
|
|
Dense, # 3 → 4
|
|
ReLU, # activation
|
|
Dense, # 4 → 2
|
|
ReLU, # activation
|
|
Dense, # 2 → 1
|
|
Sigmoid # output activation
|
|
]
|
|
|
|
assert len(mlp.layers) == 6, f"Expected 6 layers, got {len(mlp.layers)}"
|
|
|
|
# Test with sample data
|
|
x = Tensor([[1.0, 2.0, 3.0]])
|
|
y = mlp(x)
|
|
print(f"MLP output: {y}")
|
|
print(f"MLP output shape: {y.shape}")
|
|
|
|
# Verify the output
|
|
assert y.shape == (1, 1), f"Expected shape (1, 1), got {y.shape}"
|
|
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
|
|
|
|
print("✅ MLP creation tests passed!")
|
|
|
|
# %% nbgrader={"grade": true, "grade_id": "test-network-comparison", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
|
|
# Test different network architectures
|
|
print("Testing different network architectures...")
|
|
|
|
# Create networks with different architectures
|
|
shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
|
|
deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)
|
|
wide_net = create_mlp(input_size=3, hidden_sizes=[10], output_size=1)
|
|
|
|
# Test input
|
|
x = Tensor([[1.0, 2.0, 3.0]])
|
|
|
|
# Test all networks
|
|
shallow_out = shallow_net(x)
|
|
deep_out = deep_net(x)
|
|
wide_out = wide_net(x)
|
|
|
|
print(f"Shallow network output: {shallow_out}")
|
|
print(f"Deep network output: {deep_out}")
|
|
print(f"Wide network output: {wide_out}")
|
|
|
|
# Verify all outputs are valid
|
|
for name, output in [("Shallow", shallow_out), ("Deep", deep_out), ("Wide", wide_out)]:
|
|
assert output.shape == (1, 1), f"{name} network output shape should be (1, 1), got {output.shape}"
|
|
assert np.all(output.data >= 0) and np.all(output.data <= 1), f"{name} network output should be between 0 and 1"
|
|
|
|
print("✅ Network architecture comparison tests passed!")
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## 🎯 Module Summary
|
|
|
|
Congratulations! You've successfully implemented complete neural network architectures:
|
|
|
|
### What You've Accomplished
|
|
✅ **Sequential Networks**: The fundamental architecture for composing layers
|
|
✅ **Function Composition**: Understanding how layers combine to create complex behaviors
|
|
✅ **MLP Creation**: Building Multi-Layer Perceptrons with flexible architectures
|
|
✅ **Architecture Patterns**: Creating shallow, deep, and wide networks
|
|
✅ **Forward Pass**: Complete inference through multi-layer networks
|
|
|
|
### Key Concepts You've Learned
|
|
- **Networks are function composition**: Complex behavior from simple building blocks
|
|
- **Sequential architecture**: The foundation of most neural networks
|
|
- **MLP patterns**: Dense → Activation → Dense → Activation → Output
|
|
- **Architecture design**: How depth and width affect network capability
|
|
- **Forward pass**: How data flows through complete networks
|
|
|
|
### Mathematical Foundations
|
|
- **Function composition**: f(x) = f_n(...f_2(f_1(x)))
|
|
- **Universal approximation**: MLPs can approximate any continuous function
|
|
- **Hierarchical learning**: Early layers learn simple features, later layers learn complex patterns
|
|
- **Nonlinearity**: Activation functions enable complex decision boundaries
|
|
|
|
### Real-World Applications
|
|
- **Classification**: Image recognition, spam detection, medical diagnosis
|
|
- **Regression**: Price prediction, time series forecasting
|
|
- **Feature learning**: Extracting meaningful representations from raw data
|
|
- **Transfer learning**: Using pre-trained networks for new tasks
|
|
|
|
### Next Steps
|
|
1. **Export your code**: `tito package nbdev --export 04_networks`
|
|
2. **Test your implementation**: `tito module test 04_networks`
|
|
3. **Use your networks**:
|
|
```python
|
|
from tinytorch.core.networks import Sequential, create_mlp
|
|
from tinytorch.core.layers import Dense
|
|
from tinytorch.core.activations import ReLU
|
|
|
|
# Create custom network
|
|
network = Sequential([Dense(10, 5), ReLU(), Dense(5, 1)])
|
|
|
|
# Create MLP
|
|
mlp = create_mlp(10, [20, 10], 1)
|
|
```
|
|
4. **Move to Module 5**: Start building convolutional networks for images!
|
|
|
|
**Ready for the next challenge?** Let's add convolutional layers for image processing and build CNNs!
|
|
""" |