Files
TinyTorch/modules/source/04_networks/networks_dev.py
Vijay Janapa Reddi e3a1600823 Standardize inline test naming and ensure progressive testing structure
 STANDARDIZED TESTING ARCHITECTURE:
- All inline tests now use consistent 'Unit Test: [Component]' naming
- Progressive testing: small portions tested as students implement
- Consistent print statements with �� Unit Test: format

 PROGRESSIVE TESTING STRUCTURE:
- Tensor Module: Unit Test: Creation → Properties → Arithmetic → Comprehensive
- Activations Module: Unit Test: ReLU → Sigmoid → Tanh → Softmax → Comprehensive
- Layers Module: Unit Test: Matrix Multiplication → Dense Layer → Comprehensive
- Networks Module: Unit Test: Sequential → MLP Creation → Comprehensive
- CNN Module: Unit Test: Convolution → Conv2D → Flatten → Comprehensive
- DataLoader Module: Unit Test: Dataset → DataLoader → Pipeline → Comprehensive
- Autograd Module: Unit Test: Variables → Operations → Chain Rule → Comprehensive

 EDUCATIONAL CONSISTENCY:
- Each unit test focuses on one specific component in isolation
- Immediate feedback after each implementation step
- Clear explanations of what each test validates
- Consistent error messages and success indicators

 TESTING GRANULARITY VERIFIED:
- Unit tests test small, specific functionality
- Comprehensive tests cover edge cases and integration
- All tests follow NBGrader-compliant cell structure
- Proper separation between educational and assessment testing

Total: 25+ individual unit tests across 7 modules with consistent naming and structure
2025-07-12 20:38:26 -04:00

1273 lines
51 KiB
Python

# ---
# jupyter:
# jupytext:
# text_representation:
# extension: .py
# format_name: percent
# format_version: '1.3'
# jupytext_version: 1.17.1
# ---
# %% [markdown]
"""
# Module 4: Networks - Neural Network Architectures
Welcome to the Networks module! This is where we compose layers into complete neural network architectures.
## Learning Goals
- Understand networks as function composition: `f(x) = layer_n(...layer_2(layer_1(x)))`
- Build the Sequential network architecture for composing layers
- Create common network patterns like MLPs (Multi-Layer Perceptrons)
- Visualize network architectures and understand their capabilities
- Master forward pass inference through complete networks
## Build → Use → Understand
1. **Build**: Sequential networks that compose layers into complete architectures
2. **Use**: Create different network patterns and run inference
3. **Understand**: How architecture design affects network behavior and capability
"""
# %% nbgrader={"grade": false, "grade_id": "networks-imports", "locked": false, "schema_version": 3, "solution": false, "task": false}
#| default_exp core.networks
#| export
import numpy as np
import sys
import os
from typing import List, Union, Optional, Callable
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.patches import FancyBboxPatch, ConnectionPatch
import seaborn as sns
# Import all the building blocks we need - try package first, then local modules
try:
from tinytorch.core.tensor import Tensor
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax
except ImportError:
# For development, import from local modules
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '02_activations'))
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '03_layers'))
from tensor_dev import Tensor
from activations_dev import ReLU, Sigmoid, Tanh, Softmax
from layers_dev import Dense
# %% nbgrader={"grade": false, "grade_id": "networks-setup", "locked": false, "schema_version": 3, "solution": false, "task": false}
#| hide
#| export
def _should_show_plots():
"""Check if we should show plots (disable during testing)"""
# Check multiple conditions that indicate we're in test mode
is_pytest = (
'pytest' in sys.modules or
'test' in sys.argv or
os.environ.get('PYTEST_CURRENT_TEST') is not None or
any('test' in arg for arg in sys.argv) or
any('pytest' in arg for arg in sys.argv)
)
# Show plots in development mode (when not in test mode)
return not is_pytest
# %% nbgrader={"grade": false, "grade_id": "networks-welcome", "locked": false, "schema_version": 3, "solution": false, "task": false}
print("🔥 TinyTorch Networks Module")
print(f"NumPy version: {np.__version__}")
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
print("Ready to build neural network architectures!")
# %% [markdown]
"""
## 📦 Where This Code Lives in the Final Package
**Learning Side:** You work in `modules/source/04_networks/networks_dev.py`
**Building Side:** Code exports to `tinytorch.core.networks`
```python
# Final package structure:
from tinytorch.core.networks import Sequential, MLP # Network architectures!
from tinytorch.core.layers import Dense, Conv2D # Building blocks
from tinytorch.core.activations import ReLU, Sigmoid, Tanh # Nonlinearity
from tinytorch.core.tensor import Tensor # Foundation
```
**Why this matters:**
- **Learning:** Focused modules for deep understanding
- **Production:** Proper organization like PyTorch's `torch.nn.Sequential`
- **Consistency:** All network architectures live together in `core.networks`
- **Integration:** Works seamlessly with layers, activations, and tensors
"""
# %% [markdown]
"""
## 🧠 The Mathematical Foundation of Neural Networks
### Function Composition at Scale
Neural networks are fundamentally about **function composition**:
```
f(x) = f_n(f_{n-1}(...f_2(f_1(x))))
```
Each layer is a function, and the network is the composition of all these functions.
### Why Function Composition is Powerful
- **Modularity**: Each layer has a specific purpose
- **Composability**: Simple functions combine to create complex behaviors
- **Universal approximation**: Deep compositions can approximate any function
- **Hierarchical learning**: Early layers learn simple features, later layers learn complex patterns
### The Architecture Design Space
Different arrangements of layers create different capabilities:
- **Depth**: More layers → more complex representations
- **Width**: More neurons per layer → more capacity per layer
- **Connections**: How layers connect affects information flow
- **Activation functions**: Add nonlinearity for complex patterns
### Connection to Real ML Systems
Every framework uses sequential composition:
- **PyTorch**: `torch.nn.Sequential([layer1, layer2, layer3])`
- **TensorFlow**: `tf.keras.Sequential([layer1, layer2, layer3])`
- **JAX**: `jax.nn.Sequential([layer1, layer2, layer3])`
- **TinyTorch**: `tinytorch.core.networks.Sequential([layer1, layer2, layer3])` (what we're building!)
### Performance and Design Considerations
- **Forward pass efficiency**: Sequential computation through layers
- **Memory management**: Intermediate activations storage
- **Gradient flow**: How information flows backward (for training)
- **Architecture search**: Finding optimal network structures
"""
# %% [markdown]
"""
## Step 1: What is a Network?
### Definition
A **network** is a composition of layers that transforms input data into output predictions. Think of it as a pipeline of transformations:
```
Input → Layer1 → Layer2 → Layer3 → Output
```
### Why Networks Matter
- **Function composition**: Complex behavior from simple building blocks
- **Learnable parameters**: Each layer has weights that can be learned
- **Architecture design**: Different layouts solve different problems
- **Real-world applications**: Classification, regression, generation, etc.
### The Fundamental Insight
**Neural networks are just function composition!**
- Each layer is a function: `f_i(x)`
- The network is: `f(x) = f_n(...f_2(f_1(x)))`
- Complex behavior emerges from simple building blocks
### Real-World Examples
- **MLP (Multi-Layer Perceptron)**: Classic feedforward network
- **CNN (Convolutional Neural Network)**: For image processing
- **RNN (Recurrent Neural Network)**: For sequential data
- **Transformer**: For attention-based processing
### Visual Intuition
```
Input: [1, 2, 3] (3 features)
Layer1: [1.4, 2.8] (linear transformation)
Layer2: [1.4, 2.8] (nonlinearity)
Layer3: [0.7] (final prediction)
```
Let's start by building the most fundamental network: **Sequential**.
"""
# %% nbgrader={"grade": false, "grade_id": "sequential-class", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
class Sequential:
"""
Sequential Network: Composes layers in sequence
The most fundamental network architecture.
Applies layers in order: f(x) = layer_n(...layer_2(layer_1(x)))
"""
def __init__(self, layers: List):
"""
Initialize Sequential network with layers.
Args:
layers: List of layers to compose in order
TODO: Store the layers and implement forward pass
APPROACH:
1. Store the layers list as an instance variable
2. This creates the network architecture ready for forward pass
EXAMPLE:
Sequential([Dense(3,4), ReLU(), Dense(4,2)])
creates a 3-layer network: Dense → ReLU → Dense
HINTS:
- Store layers in self.layers
- This is the foundation for all network architectures
"""
### BEGIN SOLUTION
self.layers = layers
### END SOLUTION
def forward(self, x: Tensor) -> Tensor:
"""
Forward pass through all layers in sequence.
Args:
x: Input tensor
Returns:
Output tensor after passing through all layers
TODO: Implement sequential forward pass through all layers
APPROACH:
1. Start with the input tensor
2. Apply each layer in sequence
3. Each layer's output becomes the next layer's input
4. Return the final output
EXAMPLE:
Input: Tensor([[1, 2, 3]])
Layer1 (Dense): Tensor([[1.4, 2.8]])
Layer2 (ReLU): Tensor([[1.4, 2.8]])
Layer3 (Dense): Tensor([[0.7]])
Output: Tensor([[0.7]])
HINTS:
- Use a for loop: for layer in self.layers:
- Apply each layer: x = layer(x)
- The output of one layer becomes input to the next
- Return the final result
"""
### BEGIN SOLUTION
# Apply each layer in sequence
for layer in self.layers:
x = layer(x)
return x
### END SOLUTION
def __call__(self, x: Tensor) -> Tensor:
"""Make network callable: network(x) same as network.forward(x)"""
return self.forward(x)
# %% [markdown]
"""
### 🧪 Unit Test: Sequential Network
Let's test your Sequential network implementation! This is the foundation of all neural network architectures.
**This is a unit test** - it tests one specific class (Sequential network) in isolation.
"""
# %% nbgrader={"grade": true, "grade_id": "test-sequential-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test Sequential network immediately after implementation
print("🔬 Unit Test: Sequential Network...")
# Create a simple 2-layer network: 3 → 4 → 2
try:
network = Sequential([
Dense(input_size=3, output_size=4),
ReLU(),
Dense(input_size=4, output_size=2),
Sigmoid()
])
print(f"Network created with {len(network.layers)} layers")
print("✅ Sequential network creation successful")
# Test with sample data
x = Tensor([[1.0, 2.0, 3.0]])
print(f"Input: {x}")
# Forward pass
y = network(x)
print(f"Output: {y}")
print(f"Output shape: {y.shape}")
# Verify the network works
assert y.shape == (1, 2), f"Expected shape (1, 2), got {y.shape}"
print("✅ Sequential network produces correct output shape")
# Test that sigmoid output is in valid range
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
print("✅ Sequential network output is in valid range")
# Test that layers are stored correctly
assert len(network.layers) == 4, f"Expected 4 layers, got {len(network.layers)}"
print("✅ Sequential network stores layers correctly")
except Exception as e:
print(f"❌ Sequential network test failed: {e}")
raise
# Show the network architecture
print("🎯 Sequential network behavior:")
print(" Applies layers in sequence: f(g(h(x)))")
print(" Input flows through each layer in order")
print(" Output of layer i becomes input of layer i+1")
print("📈 Progress: Sequential network ✓")
# %% [markdown]
"""
## Step 2: Building Multi-Layer Perceptrons (MLPs)
### What is an MLP?
A **Multi-Layer Perceptron** is the classic neural network architecture:
```
Input → Dense → Activation → Dense → Activation → ... → Dense → Output
```
### Why MLPs are Important
- **Universal approximation**: Can approximate any continuous function
- **Foundation**: Basis for understanding all neural networks
- **Versatile**: Works for classification, regression, and more
- **Simple**: Easy to understand and implement
### MLP Architecture Pattern
```
create_mlp(3, [4, 2], 1) creates:
Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid
```
### Real-World Applications
- **Tabular data**: Customer analytics, financial modeling
- **Feature learning**: Learning representations from raw data
- **Classification**: Spam detection, medical diagnosis
- **Regression**: Price prediction, time series forecasting
"""
# %% nbgrader={"grade": false, "grade_id": "create-mlp", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def create_mlp(input_size: int, hidden_sizes: List[int], output_size: int,
activation=ReLU, output_activation=Sigmoid) -> Sequential:
"""
Create a Multi-Layer Perceptron (MLP) network.
Args:
input_size: Number of input features
hidden_sizes: List of hidden layer sizes
output_size: Number of output features
activation: Activation function for hidden layers (default: ReLU)
output_activation: Activation function for output layer (default: Sigmoid)
Returns:
Sequential network with MLP architecture
TODO: Implement MLP creation with alternating Dense and activation layers.
APPROACH:
1. Start with an empty list of layers
2. Add layers in this pattern:
- Dense(input_size → first_hidden_size)
- Activation()
- Dense(first_hidden_size → second_hidden_size)
- Activation()
- ...
- Dense(last_hidden_size → output_size)
- Output_activation()
3. Return Sequential(layers)
EXAMPLE:
create_mlp(3, [4, 2], 1) creates:
Dense(3→4) → ReLU → Dense(4→2) → ReLU → Dense(2→1) → Sigmoid
HINTS:
- Start with layers = []
- Track current_size starting with input_size
- For each hidden_size: add Dense(current_size, hidden_size), then activation
- Finally add Dense(last_hidden_size, output_size), then output_activation
- Return Sequential(layers)
"""
### BEGIN SOLUTION
layers = []
current_size = input_size
# Add hidden layers with activations
for hidden_size in hidden_sizes:
layers.append(Dense(current_size, hidden_size))
layers.append(activation())
current_size = hidden_size
# Add output layer with output activation
layers.append(Dense(current_size, output_size))
layers.append(output_activation())
return Sequential(layers)
### END SOLUTION
# %% [markdown]
"""
### 🧪 Unit Test: MLP Creation
Let's test your MLP creation function! This builds complete neural networks with a single function call.
**This is a unit test** - it tests one specific function (create_mlp) in isolation.
"""
# %% nbgrader={"grade": true, "grade_id": "test-mlp-immediate", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
# Test MLP creation immediately after implementation
print("🔬 Unit Test: MLP Creation...")
# Create a simple MLP: 3 → 4 → 2 → 1
try:
mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)
print(f"MLP created with {len(mlp.layers)} layers")
print("✅ MLP creation successful")
# Test the structure - should have 6 layers: Dense, ReLU, Dense, ReLU, Dense, Sigmoid
expected_layers = 6 # 3 Dense + 2 ReLU + 1 Sigmoid
assert len(mlp.layers) == expected_layers, f"Expected {expected_layers} layers, got {len(mlp.layers)}"
print("✅ MLP has correct number of layers")
# Test with sample data
x = Tensor([[1.0, 2.0, 3.0]])
y = mlp(x)
print(f"MLP input: {x}")
print(f"MLP output: {y}")
print(f"MLP output shape: {y.shape}")
# Verify the output
assert y.shape == (1, 1), f"Expected shape (1, 1), got {y.shape}"
print("✅ MLP produces correct output shape")
# Test that sigmoid output is in valid range
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
print("✅ MLP output is in valid range")
except Exception as e:
print(f"❌ MLP creation test failed: {e}")
raise
# Test different architectures
try:
# Test shallow network
shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
assert len(shallow_net.layers) == 4, f"Shallow network should have 4 layers, got {len(shallow_net.layers)}"
# Test deep network
deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)
assert len(deep_net.layers) == 8, f"Deep network should have 8 layers, got {len(deep_net.layers)}"
# Test wide network
wide_net = create_mlp(input_size=3, hidden_sizes=[10], output_size=1)
assert len(wide_net.layers) == 4, f"Wide network should have 4 layers, got {len(wide_net.layers)}"
print("✅ Different MLP architectures work correctly")
except Exception as e:
print(f"❌ MLP architecture test failed: {e}")
raise
# Show the MLP pattern
print("🎯 MLP creation pattern:")
print(" Input → Dense → Activation → Dense → Activation → ... → Dense → Output_Activation")
print(" Automatically creates the complete architecture")
print(" Handles any number of hidden layers")
print("📈 Progress: Sequential network ✓, MLP creation ✓")
print("🚀 Complete neural networks ready!")
# %% [markdown]
"""
### 🧪 Test Your Network Implementations
Once you implement the functions above, run these cells to test them:
"""
# %% nbgrader={"grade": true, "grade_id": "test-sequential", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test the Sequential network
print("Testing Sequential network...")
# Create a simple 2-layer network: 3 → 4 → 2
network = Sequential([
Dense(input_size=3, output_size=4),
ReLU(),
Dense(input_size=4, output_size=2),
Sigmoid()
])
print(f"Network created with {len(network.layers)} layers")
# Test with sample data
x = Tensor([[1.0, 2.0, 3.0]])
print(f"Input: {x}")
# Forward pass
y = network(x)
print(f"Output: {y}")
print(f"Output shape: {y.shape}")
# Verify the network works
assert y.shape == (1, 2), f"Expected shape (1, 2), got {y.shape}"
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
print("✅ Sequential network tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-mlp", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test MLP creation
print("Testing MLP creation...")
# Create a simple MLP: 3 → 4 → 2 → 1
mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)
print(f"MLP created with {len(mlp.layers)} layers")
# Test the structure
expected_layers = [
Dense, # 3 → 4
ReLU, # activation
Dense, # 4 → 2
ReLU, # activation
Dense, # 2 → 1
Sigmoid # output activation
]
assert len(mlp.layers) == 6, f"Expected 6 layers, got {len(mlp.layers)}"
# Test with sample data
x = Tensor([[1.0, 2.0, 3.0]])
y = mlp(x)
print(f"MLP output: {y}")
print(f"MLP output shape: {y.shape}")
# Verify the output
assert y.shape == (1, 1), f"Expected shape (1, 1), got {y.shape}"
assert np.all(y.data >= 0) and np.all(y.data <= 1), "Sigmoid output should be between 0 and 1"
print("✅ MLP creation tests passed!")
# %% nbgrader={"grade": true, "grade_id": "test-network-comparison", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false}
# Test different network architectures
print("Testing different network architectures...")
# Create networks with different architectures
shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)
wide_net = create_mlp(input_size=3, hidden_sizes=[10], output_size=1)
# Test input
x = Tensor([[1.0, 2.0, 3.0]])
# Test all networks
shallow_out = shallow_net(x)
deep_out = deep_net(x)
wide_out = wide_net(x)
print(f"Shallow network output: {shallow_out}")
print(f"Deep network output: {deep_out}")
print(f"Wide network output: {wide_out}")
# Verify all outputs are valid
for name, output in [("Shallow", shallow_out), ("Deep", deep_out), ("Wide", wide_out)]:
assert output.shape == (1, 1), f"{name} network output shape should be (1, 1), got {output.shape}"
assert np.all(output.data >= 0) and np.all(output.data <= 1), f"{name} network output should be between 0 and 1"
print("✅ Network architecture comparison tests passed!")
# %% [markdown]
"""
## 🎯 Module Summary
Congratulations! You've successfully implemented complete neural network architectures:
### What You've Accomplished
✅ **Sequential Networks**: The fundamental architecture for composing layers
✅ **Function Composition**: Understanding how layers combine to create complex behaviors
✅ **MLP Creation**: Building Multi-Layer Perceptrons with flexible architectures
✅ **Architecture Patterns**: Creating shallow, deep, and wide networks
✅ **Forward Pass**: Complete inference through multi-layer networks
### Key Concepts You've Learned
- **Networks are function composition**: Complex behavior from simple building blocks
- **Sequential architecture**: The foundation of most neural networks
- **MLP patterns**: Dense → Activation → Dense → Activation → Output
- **Architecture design**: How depth and width affect network capability
- **Forward pass**: How data flows through complete networks
### Mathematical Foundations
- **Function composition**: f(x) = f_n(...f_2(f_1(x)))
- **Universal approximation**: MLPs can approximate any continuous function
- **Hierarchical learning**: Early layers learn simple features, later layers learn complex patterns
- **Nonlinearity**: Activation functions enable complex decision boundaries
### Real-World Applications
- **Classification**: Image recognition, spam detection, medical diagnosis
- **Regression**: Price prediction, time series forecasting
- **Feature learning**: Extracting meaningful representations from raw data
- **Transfer learning**: Using pre-trained networks for new tasks
### Next Steps
1. **Export your code**: `tito package nbdev --export 04_networks`
2. **Test your implementation**: `tito module test 04_networks`
3. **Use your networks**:
```python
from tinytorch.core.networks import Sequential, create_mlp
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU
# Create custom network
network = Sequential([Dense(10, 5), ReLU(), Dense(5, 1)])
# Create MLP
mlp = create_mlp(10, [20, 10], 1)
```
4. **Move to Module 5**: Start building convolutional networks for images!
**Ready for the next challenge?** Let's add convolutional layers for image processing and build CNNs!
"""
# %% [markdown]
"""
## 🧪 Comprehensive Testing: Neural Network Architectures
Let's thoroughly test your network implementations to ensure they work correctly in all scenarios.
This comprehensive testing ensures your networks are robust and ready for real ML applications.
"""
# %% nbgrader={"grade": true, "grade_id": "test-networks-comprehensive", "locked": true, "points": 30, "schema_version": 3, "solution": false, "task": false}
def test_networks_comprehensive():
"""Comprehensive test of Sequential networks and MLP creation."""
print("🔬 Testing neural network architectures comprehensively...")
tests_passed = 0
total_tests = 10
# Test 1: Sequential Network Creation and Structure
try:
# Create a simple 2-layer network
network = Sequential([
Dense(input_size=3, output_size=4),
ReLU(),
Dense(input_size=4, output_size=2),
Sigmoid()
])
assert len(network.layers) == 4, f"Expected 4 layers, got {len(network.layers)}"
# Test layer types
assert isinstance(network.layers[0], Dense), "First layer should be Dense"
assert isinstance(network.layers[1], ReLU), "Second layer should be ReLU"
assert isinstance(network.layers[2], Dense), "Third layer should be Dense"
assert isinstance(network.layers[3], Sigmoid), "Fourth layer should be Sigmoid"
print("✅ Sequential network creation and structure")
tests_passed += 1
except Exception as e:
print(f"❌ Sequential network creation failed: {e}")
# Test 2: Sequential Network Forward Pass
try:
network = Sequential([
Dense(input_size=3, output_size=4),
ReLU(),
Dense(input_size=4, output_size=2),
Sigmoid()
])
# Test single sample
x_single = Tensor([[1.0, 2.0, 3.0]])
y_single = network(x_single)
assert y_single.shape == (1, 2), f"Single sample output should be (1, 2), got {y_single.shape}"
assert np.all((y_single.data >= 0) & (y_single.data <= 1)), "Sigmoid output should be in [0,1]"
# Test batch processing
x_batch = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
y_batch = network(x_batch)
assert y_batch.shape == (3, 2), f"Batch output should be (3, 2), got {y_batch.shape}"
assert np.all((y_batch.data >= 0) & (y_batch.data <= 1)), "All batch outputs should be in [0,1]"
print("✅ Sequential network forward pass: single and batch")
tests_passed += 1
except Exception as e:
print(f"❌ Sequential network forward pass failed: {e}")
# Test 3: MLP Creation Basic Functionality
try:
# Create simple MLP: 3 → 4 → 2 → 1
mlp = create_mlp(input_size=3, hidden_sizes=[4, 2], output_size=1)
# Should have 6 layers: Dense, ReLU, Dense, ReLU, Dense, Sigmoid
expected_layers = 6
assert len(mlp.layers) == expected_layers, f"Expected {expected_layers} layers, got {len(mlp.layers)}"
# Test layer pattern
layer_types = [type(layer).__name__ for layer in mlp.layers]
expected_pattern = ['Dense', 'ReLU', 'Dense', 'ReLU', 'Dense', 'Sigmoid']
assert layer_types == expected_pattern, f"Expected pattern {expected_pattern}, got {layer_types}"
# Test forward pass
x = Tensor([[1.0, 2.0, 3.0]])
y = mlp(x)
assert y.shape == (1, 1), f"MLP output should be (1, 1), got {y.shape}"
assert np.all((y.data >= 0) & (y.data <= 1)), "MLP output should be in [0,1]"
print("✅ MLP creation basic functionality")
tests_passed += 1
except Exception as e:
print(f"❌ MLP creation basic failed: {e}")
# Test 4: Different MLP Architectures
try:
# Test shallow network (1 hidden layer)
shallow_net = create_mlp(input_size=3, hidden_sizes=[4], output_size=1)
assert len(shallow_net.layers) == 4, f"Shallow network should have 4 layers, got {len(shallow_net.layers)}"
# Test deep network (3 hidden layers)
deep_net = create_mlp(input_size=3, hidden_sizes=[4, 4, 4], output_size=1)
assert len(deep_net.layers) == 8, f"Deep network should have 8 layers, got {len(deep_net.layers)}"
# Test wide network (1 large hidden layer)
wide_net = create_mlp(input_size=3, hidden_sizes=[20], output_size=1)
assert len(wide_net.layers) == 4, f"Wide network should have 4 layers, got {len(wide_net.layers)}"
# Test very deep network
very_deep_net = create_mlp(input_size=3, hidden_sizes=[5, 5, 5, 5, 5], output_size=1)
assert len(very_deep_net.layers) == 12, f"Very deep network should have 12 layers, got {len(very_deep_net.layers)}"
# Test all networks work
x = Tensor([[1.0, 2.0, 3.0]])
for name, net in [("Shallow", shallow_net), ("Deep", deep_net), ("Wide", wide_net), ("Very Deep", very_deep_net)]:
y = net(x)
assert y.shape == (1, 1), f"{name} network output shape should be (1, 1), got {y.shape}"
assert np.all((y.data >= 0) & (y.data <= 1)), f"{name} network output should be in [0,1]"
print("✅ Different MLP architectures: shallow, deep, wide, very deep")
tests_passed += 1
except Exception as e:
print(f"❌ Different MLP architectures failed: {e}")
# Test 5: MLP with Different Activation Functions
try:
# Test with Tanh activation
mlp_tanh = create_mlp(input_size=3, hidden_sizes=[4], output_size=1, activation=Tanh, output_activation=Sigmoid)
# Check layer types
layer_types = [type(layer).__name__ for layer in mlp_tanh.layers]
expected_pattern = ['Dense', 'Tanh', 'Dense', 'Sigmoid']
assert layer_types == expected_pattern, f"Tanh MLP pattern should be {expected_pattern}, got {layer_types}"
# Test forward pass
x = Tensor([[1.0, 2.0, 3.0]])
y = mlp_tanh(x)
assert y.shape == (1, 1), "Tanh MLP should work correctly"
# Test with different output activation
mlp_tanh_out = create_mlp(input_size=3, hidden_sizes=[4], output_size=3, activation=ReLU, output_activation=Softmax)
y_multi = mlp_tanh_out(x)
assert y_multi.shape == (1, 3), "Multi-output MLP should work"
# Check softmax properties
assert abs(np.sum(y_multi.data) - 1.0) < 1e-6, "Softmax outputs should sum to 1"
print("✅ MLP with different activation functions: Tanh, Softmax")
tests_passed += 1
except Exception as e:
print(f"❌ MLP with different activations failed: {e}")
# Test 6: Network Layer Composition
try:
# Test that network correctly chains layers
network = Sequential([
Dense(input_size=4, output_size=3),
ReLU(),
Dense(input_size=3, output_size=2),
Tanh(),
Dense(input_size=2, output_size=1),
Sigmoid()
])
x = Tensor([[1.0, -1.0, 2.0, -2.0]])
# Manual forward pass to verify composition
h1 = network.layers[0](x) # Dense
h2 = network.layers[1](h1) # ReLU
h3 = network.layers[2](h2) # Dense
h4 = network.layers[3](h3) # Tanh
h5 = network.layers[4](h4) # Dense
h6 = network.layers[5](h5) # Sigmoid
# Compare with network forward pass
y_network = network(x)
assert np.allclose(h6.data, y_network.data), "Manual and network forward pass should match"
# Check intermediate shapes
assert h1.shape == (1, 3), f"h1 shape should be (1, 3), got {h1.shape}"
assert h2.shape == (1, 3), f"h2 shape should be (1, 3), got {h2.shape}"
assert h3.shape == (1, 2), f"h3 shape should be (1, 2), got {h3.shape}"
assert h4.shape == (1, 2), f"h4 shape should be (1, 2), got {h4.shape}"
assert h5.shape == (1, 1), f"h5 shape should be (1, 1), got {h5.shape}"
assert h6.shape == (1, 1), f"h6 shape should be (1, 1), got {h6.shape}"
# Check activation effects
assert np.all(h2.data >= 0), "ReLU should produce non-negative values"
assert np.all((h4.data >= -1) & (h4.data <= 1)), "Tanh should produce values in [-1,1]"
assert np.all((h6.data >= 0) & (h6.data <= 1)), "Sigmoid should produce values in [0,1]"
print("✅ Network layer composition: correct chaining and shapes")
tests_passed += 1
except Exception as e:
print(f"❌ Network layer composition failed: {e}")
# Test 7: Edge Cases and Robustness
try:
# Test with minimal network (1 layer)
minimal_net = Sequential([Dense(input_size=2, output_size=1)])
x_minimal = Tensor([[1.0, 2.0]])
y_minimal = minimal_net(x_minimal)
assert y_minimal.shape == (1, 1), "Minimal network should work"
# Test with single neuron layers
single_neuron_net = create_mlp(input_size=1, hidden_sizes=[1], output_size=1)
x_single = Tensor([[5.0]])
y_single_neuron = single_neuron_net(x_single)
assert y_single_neuron.shape == (1, 1), "Single neuron network should work"
# Test with large batch
large_net = create_mlp(input_size=10, hidden_sizes=[5], output_size=1)
x_large_batch = Tensor(np.random.randn(100, 10))
y_large_batch = large_net(x_large_batch)
assert y_large_batch.shape == (100, 1), "Large batch should work"
assert not np.any(np.isnan(y_large_batch.data)), "Should not produce NaN"
assert not np.any(np.isinf(y_large_batch.data)), "Should not produce Inf"
print("✅ Edge cases: minimal networks, single neurons, large batches")
tests_passed += 1
except Exception as e:
print(f"❌ Edge cases failed: {e}")
# Test 8: Multi-class Classification Networks
try:
# Create multi-class classifier
classifier = create_mlp(input_size=4, hidden_sizes=[8, 6], output_size=3, output_activation=Softmax)
# Test with batch of samples
x_multi = Tensor(np.random.randn(5, 4))
y_multi = classifier(x_multi)
assert y_multi.shape == (5, 3), f"Multi-class output should be (5, 3), got {y_multi.shape}"
# Check softmax properties for each sample
row_sums = np.sum(y_multi.data, axis=1)
assert np.allclose(row_sums, 1.0), "Each sample should have probabilities summing to 1"
assert np.all(y_multi.data > 0), "All probabilities should be positive"
# Test that argmax gives valid class predictions
predictions = np.argmax(y_multi.data, axis=1)
assert np.all((predictions >= 0) & (predictions < 3)), "Predictions should be valid class indices"
print("✅ Multi-class classification: softmax probabilities, valid predictions")
tests_passed += 1
except Exception as e:
print(f"❌ Multi-class classification failed: {e}")
# Test 9: Real ML Scenarios
try:
# Scenario 1: Binary classification (like spam detection)
spam_classifier = create_mlp(input_size=100, hidden_sizes=[50, 20], output_size=1, output_activation=Sigmoid)
# Simulate email features
email_features = Tensor(np.random.randn(10, 100))
spam_probabilities = spam_classifier(email_features)
assert spam_probabilities.shape == (10, 1), "Spam classifier should output probabilities for each email"
assert np.all((spam_probabilities.data >= 0) & (spam_probabilities.data <= 1)), "Should output valid probabilities"
# Scenario 2: Image classification (like MNIST)
mnist_classifier = create_mlp(input_size=784, hidden_sizes=[256, 128], output_size=10, output_activation=Softmax)
# Simulate flattened images
images = Tensor(np.random.randn(32, 784)) # Batch of 32 images
class_probabilities = mnist_classifier(images)
assert class_probabilities.shape == (32, 10), "MNIST classifier should output 10 class probabilities"
# Check softmax properties
batch_sums = np.sum(class_probabilities.data, axis=1)
assert np.allclose(batch_sums, 1.0), "Each image should have class probabilities summing to 1"
# Scenario 3: Regression (like house price prediction)
price_predictor = Sequential([
Dense(input_size=8, output_size=16),
ReLU(),
Dense(input_size=16, output_size=8),
ReLU(),
Dense(input_size=8, output_size=1) # No activation for regression
])
# Simulate house features
house_features = Tensor(np.random.randn(5, 8))
predicted_prices = price_predictor(house_features)
assert predicted_prices.shape == (5, 1), "Price predictor should output one price per house"
print("✅ Real ML scenarios: spam detection, image classification, price prediction")
tests_passed += 1
except Exception as e:
print(f"❌ Real ML scenarios failed: {e}")
# Test 10: Network Comparison and Analysis
try:
# Create networks with same total parameters but different architectures
x_test = Tensor([[1.0, 2.0, 3.0, 4.0]])
# Wide network: 4 → 20 → 1 (parameters: 4*20 + 20 + 20*1 + 1 = 121)
wide_network = create_mlp(input_size=4, hidden_sizes=[20], output_size=1)
# Deep network: 4 → 10 → 10 → 1 (parameters: 4*10 + 10 + 10*10 + 10 + 10*1 + 1 = 171)
deep_network = create_mlp(input_size=4, hidden_sizes=[10, 10], output_size=1)
# Test both networks
wide_output = wide_network(x_test)
deep_output = deep_network(x_test)
assert wide_output.shape == (1, 1), "Wide network should produce correct output"
assert deep_output.shape == (1, 1), "Deep network should produce correct output"
# Both should be valid but potentially different
assert np.all((wide_output.data >= 0) & (wide_output.data <= 1)), "Wide network output should be valid"
assert np.all((deep_output.data >= 0) & (deep_output.data <= 1)), "Deep network output should be valid"
# Test network complexity
def count_parameters(network):
total = 0
for layer in network.layers:
if isinstance(layer, Dense):
total += layer.weights.size
if layer.bias is not None:
total += layer.bias.size
return total
wide_params = count_parameters(wide_network)
deep_params = count_parameters(deep_network)
assert wide_params > 0, "Wide network should have parameters"
assert deep_params > 0, "Deep network should have parameters"
print(f"✅ Network comparison: wide ({wide_params} params) vs deep ({deep_params} params)")
tests_passed += 1
except Exception as e:
print(f"❌ Network comparison failed: {e}")
# Results summary
print(f"\n📊 Networks Module Results: {tests_passed}/{total_tests} tests passed")
if tests_passed == total_tests:
print("🎉 All network tests passed! Your implementations support:")
print(" • Sequential networks: layer composition and chaining")
print(" • MLP creation: flexible multi-layer perceptron architectures")
print(" • Different architectures: shallow, deep, wide networks")
print(" • Multiple activation functions: ReLU, Tanh, Sigmoid, Softmax")
print(" • Multi-class classification: softmax probability distributions")
print(" • Real ML scenarios: spam detection, image classification, regression")
print(" • Network analysis: parameter counting and architecture comparison")
print("📈 Progress: All Network Functionality ✓")
return True
else:
print("⚠️ Some network tests failed. Common issues:")
print(" • Check Sequential class layer composition")
print(" • Verify create_mlp function layer creation pattern")
print(" • Ensure proper activation function integration")
print(" • Test forward pass through complete networks")
print(" • Verify shape handling across all layers")
return False
# Run the comprehensive test
success = test_networks_comprehensive()
# %% [markdown]
"""
### 🧪 Integration Test: Complete Neural Network Applications
Let's test your networks in realistic machine learning applications.
"""
# %% nbgrader={"grade": true, "grade_id": "test-networks-integration", "locked": true, "points": 20, "schema_version": 3, "solution": false, "task": false}
def test_networks_integration():
"""Integration test with complete neural network applications."""
print("🔬 Testing networks in complete ML applications...")
try:
print("🧠 Building complete ML applications with neural networks...")
# Application 1: Iris Classification
print("\n🌸 Application 1: Iris Classification (Multi-class)")
iris_classifier = create_mlp(
input_size=4, # 4 flower measurements
hidden_sizes=[8, 6], # Hidden layers
output_size=3, # 3 iris species
output_activation=Softmax
)
# Simulate iris data
iris_samples = Tensor([
[5.1, 3.5, 1.4, 0.2], # Setosa-like
[7.0, 3.2, 4.7, 1.4], # Versicolor-like
[6.3, 3.3, 6.0, 2.5] # Virginica-like
])
iris_predictions = iris_classifier(iris_samples)
assert iris_predictions.shape == (3, 3), "Should predict 3 classes for 3 samples"
# Check that predictions are valid probabilities
row_sums = np.sum(iris_predictions.data, axis=1)
assert np.allclose(row_sums, 1.0), "Each prediction should sum to 1"
# Get predicted classes
predicted_classes = np.argmax(iris_predictions.data, axis=1)
print(f" Predicted classes: {predicted_classes}")
print(f" Confidence scores: {np.max(iris_predictions.data, axis=1)}")
print("✅ Iris classification: valid multi-class predictions")
# Application 2: Housing Price Prediction
print("\n🏠 Application 2: Housing Price Prediction (Regression)")
price_predictor = Sequential([
Dense(input_size=8, output_size=16), # 8 house features
ReLU(),
Dense(input_size=16, output_size=8),
ReLU(),
Dense(input_size=8, output_size=1) # 1 price output (no activation for regression)
])
# Simulate house features: [size, bedrooms, bathrooms, age, location_score, etc.]
house_data = Tensor([
[2000, 3, 2, 5, 8.5, 1, 0, 1], # Large, new house
[1200, 2, 1, 20, 6.0, 0, 1, 0], # Small, older house
[1800, 3, 2, 10, 7.5, 1, 0, 0] # Medium house
])
predicted_prices = price_predictor(house_data)
assert predicted_prices.shape == (3, 1), "Should predict 1 price for each house"
assert not np.any(np.isnan(predicted_prices.data)), "Prices should not be NaN"
print(f" Predicted prices: {predicted_prices.data.flatten()}")
print("✅ Housing price prediction: valid regression outputs")
# Application 3: Sentiment Analysis
print("\n💭 Application 3: Sentiment Analysis (Binary Classification)")
sentiment_analyzer = create_mlp(
input_size=100, # 100 text features (like TF-IDF)
hidden_sizes=[50, 25], # Deep network for text
output_size=1, # Binary sentiment (positive/negative)
output_activation=Sigmoid
)
# Simulate text features for different reviews
review_features = Tensor(np.random.randn(5, 100)) # 5 reviews
sentiment_scores = sentiment_analyzer(review_features)
assert sentiment_scores.shape == (5, 1), "Should predict sentiment for each review"
assert np.all((sentiment_scores.data >= 0) & (sentiment_scores.data <= 1)), "Sentiment scores should be probabilities"
# Convert to sentiment labels
sentiment_labels = (sentiment_scores.data > 0.5).astype(int)
print(f" Sentiment predictions: {sentiment_labels.flatten()}")
print(f" Confidence scores: {sentiment_scores.data.flatten()}")
print("✅ Sentiment analysis: valid binary classification")
# Application 4: MNIST-like Digit Recognition
print("\n🔢 Application 4: Digit Recognition (Image Classification)")
digit_classifier = create_mlp(
input_size=784, # 28x28 flattened images
hidden_sizes=[256, 128, 64], # Deep network for images
output_size=10, # 10 digits (0-9)
output_activation=Softmax
)
# Simulate flattened digit images
digit_images = Tensor(np.random.randn(8, 784)) # 8 digit images
digit_predictions = digit_classifier(digit_images)
assert digit_predictions.shape == (8, 10), "Should predict 10 classes for each image"
# Check softmax properties
row_sums = np.sum(digit_predictions.data, axis=1)
assert np.allclose(row_sums, 1.0), "Each prediction should sum to 1"
# Get predicted digits
predicted_digits = np.argmax(digit_predictions.data, axis=1)
confidence_scores = np.max(digit_predictions.data, axis=1)
print(f" Predicted digits: {predicted_digits}")
print(f" Confidence scores: {confidence_scores}")
print("✅ Digit recognition: valid multi-class image classification")
# Application 5: Network Architecture Comparison
print("\n📊 Application 5: Architecture Comparison Study")
# Create different architectures for same task
architectures = {
"Shallow": create_mlp(4, [16], 3, output_activation=Softmax),
"Medium": create_mlp(4, [12, 8], 3, output_activation=Softmax),
"Deep": create_mlp(4, [8, 8, 8], 3, output_activation=Softmax),
"Wide": create_mlp(4, [24], 3, output_activation=Softmax)
}
# Test all architectures on same data
test_data = Tensor([[1.0, 2.0, 3.0, 4.0]])
for name, network in architectures.items():
prediction = network(test_data)
assert prediction.shape == (1, 3), f"{name} network should output 3 classes"
assert abs(np.sum(prediction.data) - 1.0) < 1e-6, f"{name} network should output valid probabilities"
# Count parameters
param_count = sum(layer.weights.size + (layer.bias.size if hasattr(layer, 'bias') and layer.bias is not None else 0)
for layer in network.layers if hasattr(layer, 'weights'))
print(f" {name} network: {param_count} parameters, prediction: {prediction.data.flatten()}")
print("✅ Architecture comparison: all networks work with different complexities")
# Application 6: Transfer Learning Simulation
print("\n🔄 Application 6: Transfer Learning Simulation")
# Create "pre-trained" feature extractor
feature_extractor = Sequential([
Dense(input_size=100, output_size=50),
ReLU(),
Dense(input_size=50, output_size=25),
ReLU()
])
# Create task-specific classifier
classifier_head = Sequential([
Dense(input_size=25, output_size=10),
ReLU(),
Dense(input_size=10, output_size=2),
Softmax()
])
# Simulate transfer learning pipeline
raw_data = Tensor(np.random.randn(3, 100))
# Extract features
features = feature_extractor(raw_data)
assert features.shape == (3, 25), "Feature extractor should output 25 features"
# Classify using extracted features
final_predictions = classifier_head(features)
assert final_predictions.shape == (3, 2), "Classifier should output 2 classes"
row_sums = np.sum(final_predictions.data, axis=1)
assert np.allclose(row_sums, 1.0), "Transfer learning predictions should be valid"
print("✅ Transfer learning simulation: modular network composition")
print("\n🎉 Integration test passed! Your networks work correctly in:")
print(" • Multi-class classification (Iris flowers)")
print(" • Regression tasks (housing prices)")
print(" • Binary classification (sentiment analysis)")
print(" • Image classification (digit recognition)")
print(" • Architecture comparison studies")
print(" • Transfer learning scenarios")
print("📈 Progress: Networks ready for real ML applications!")
return True
except Exception as e:
print(f"❌ Integration test failed: {e}")
print("\n💡 This suggests an issue with:")
print(" • Network architecture composition")
print(" • Forward pass through complete networks")
print(" • Shape compatibility between layers")
print(" • Activation function integration")
print(" • Check your Sequential and create_mlp implementations")
return False
# Run the integration test
success = test_networks_integration() and success
# Print final summary
print(f"\n{'='*60}")
print("🎯 NETWORKS MODULE TESTING COMPLETE")
print(f"{'='*60}")
if success:
print("🎉 CONGRATULATIONS! All network tests passed!")
print("\n✅ Your networks module successfully implements:")
print(" • Sequential networks: flexible layer composition")
print(" • MLP creation: automated multi-layer perceptron building")
print(" • Architecture flexibility: shallow, deep, wide networks")
print(" • Multiple activations: ReLU, Tanh, Sigmoid, Softmax")
print(" • Real ML applications: classification, regression, image recognition")
print(" • Network analysis: parameter counting and architecture comparison")
print(" • Transfer learning: modular network composition")
print("\n🚀 You're ready to tackle any neural network architecture!")
print("📈 Final Progress: Networks Module ✓ COMPLETE")
else:
print("⚠️ Some tests failed. Please review the error messages above.")
print("\n🔧 To fix issues:")
print(" 1. Check your Sequential class implementation")
print(" 2. Verify create_mlp function layer creation")
print(" 3. Ensure proper forward pass through all layers")
print(" 4. Test shape compatibility between layers")
print(" 5. Verify activation function integration")
print("\n💪 Keep building! These networks are the foundation of modern AI.")
# %% [markdown]
"""
## 🎯 Module Summary
Congratulations! You've successfully implemented complete neural network architectures:
### What You've Accomplished
✅ **Sequential Networks**: The fundamental architecture for composing layers
✅ **Function Composition**: Understanding how layers combine to create complex behaviors
✅ **MLP Creation**: Building Multi-Layer Perceptrons with flexible architectures
✅ **Architecture Patterns**: Creating shallow, deep, and wide networks
✅ **Forward Pass**: Complete inference through multi-layer networks
### Key Concepts You've Learned
- **Networks are function composition**: Complex behavior from simple building blocks
- **Sequential architecture**: The foundation of most neural networks
- **MLP patterns**: Dense → Activation → Dense → Activation → Output
- **Architecture design**: How depth and width affect network capability
- **Forward pass**: How data flows through complete networks
### Mathematical Foundations
- **Function composition**: f(x) = f_n(...f_2(f_1(x)))
- **Universal approximation**: MLPs can approximate any continuous function
- **Hierarchical learning**: Early layers learn simple features, later layers learn complex patterns
- **Nonlinearity**: Activation functions enable complex decision boundaries
### Real-World Applications
- **Classification**: Image recognition, spam detection, medical diagnosis
- **Regression**: Price prediction, time series forecasting
- **Feature learning**: Extracting meaningful representations from raw data
- **Transfer learning**: Using pre-trained networks for new tasks
### Next Steps
1. **Export your code**: `tito package nbdev --export 04_networks`
2. **Test your implementation**: `tito module test 04_networks`
3. **Use your networks**:
```python
from tinytorch.core.networks import Sequential, create_mlp
from tinytorch.core.layers import Dense
from tinytorch.core.activations import ReLU
# Create custom network
network = Sequential([Dense(10, 5), ReLU(), Dense(5, 1)])
# Create MLP
mlp = create_mlp(10, [20, 10], 1)
```
4. **Move to Module 5**: Start building convolutional networks for images!
**Ready for the next challenge?** Let's add convolutional layers for image processing and build CNNs!
"""