🔥 TinyTorch Activations Module
📊 Module Info
- Difficulty: ⭐⭐ Intermediate
- Time Estimate: 3-4 hours
- Prerequisites: Tensor module
- Next Steps: Layers module
Welcome to the Activations module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.
🎯 Learning Objectives
By the end of this module, you will:
- Understand why activation functions are essential for neural networks
- Implement the three most important activation functions: ReLU, Sigmoid, and Tanh
- Test your functions with various inputs to understand their behavior
- Grasp the mathematical properties that make each function useful
🧠 Why This Module Matters
Without activation functions, neural networks are just linear transformations!
Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!
This module teaches you the mathematical foundations that make deep learning possible.
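To see this concretely, here is a minimal NumPy sketch (independent of TinyTorch, with made-up weights) showing that two stacked linear layers collapse into a single linear layer, while inserting a ReLU between them does not:

import numpy as np

W1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
W2 = np.array([[2.0, 1.0]])
x = np.array([1.0, 3.0])

# Two stacked linear layers are equivalent to one linear layer W2 @ W1.
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))   # True: still just linear

# With ReLU in between, the negative entry of W1 @ x is zeroed out,
# so no single matrix reproduces the mapping for all inputs.
print(W2 @ (W1 @ x), W2 @ np.maximum(0, W1 @ x))    # 2.5 vs 6.5: different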
📚 What You'll Build
1. ReLU (Rectified Linear Unit)
- Formula: f(x) = max(0, x)
- Properties: Simple, sparse, unbounded
- Use case: Hidden layers (most common)
2. Sigmoid
- Formula: f(x) = 1 / (1 + e^(-x))
- Properties: Bounded to (0, 1), smooth, probabilistic
- Use case: Binary classification, gates
3. Tanh (Hyperbolic Tangent)
- Formula: f(x) = tanh(x)
- Properties: Bounded to (-1, 1), zero-centered, smooth
- Use case: Hidden layers, RNNs
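As a quick sanity reference, here is a sketch (plain NumPy, not the TinyTorch Tensor class) evaluating the three formulas at a few points:

import numpy as np

x = np.array([-2.0, 0.0, 2.0])
print(np.maximum(0, x))        # ReLU:    [0.      0.      2.     ]
print(1 / (1 + np.exp(-x)))    # Sigmoid: [~0.1192 0.5     ~0.8808]
print(np.tanh(x))              # Tanh:    [~-0.964 0.      ~0.964 ]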
🚀 Getting Started
Development Workflow
1. Open the development file:
   python bin/tito.py jupyter  # Then open assignments/source/02_activations/activations_dev.py
2. Implement the functions:
   - Start with ReLU (simplest)
   - Move to Sigmoid (numerical stability challenge)
   - Finish with Tanh (symmetry properties)
3. Visualize your functions:
   - Each function has plotting sections
   - See how your implementation transforms inputs
   - Compare all functions side-by-side
4. Test as you go:
   python bin/tito.py test --module activations
5. Export to package:
   python bin/tito.py sync
📊 Visual Learning Features
This module includes comprehensive plotting sections to help you understand:
- Individual Function Plots: See each activation function's curve
- Implementation Comparison: Your implementation vs ideal side-by-side
- Mathematical Explanations: Visual breakdown of function properties
- Error Analysis: Quantitative feedback on implementation accuracy
- Comprehensive Comparison: All functions analyzed together
Enhanced Features:
- 4-Panel Plots: Implementation vs ideal, mathematical definition, properties, error analysis
- Real-time Feedback: Immediate accuracy scores with color-coded status
- Mathematical Insights: Detailed explanations of function properties
- Numerical Stability Testing: Verification with extreme values
- Property Verification: Symmetry, monotonicity, and zero-centering tests
Why enhanced plots matter:
- Visual Debugging: See exactly where your implementation differs
- Quantitative Feedback: Get precise error measurements
- Mathematical Understanding: Connect formulas to visual behavior
- Implementation Confidence: Know immediately if your code is correct
- Learning Reinforcement: Multiple visual perspectives of the same concept
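If you want to reproduce a simple version of these plots yourself, a minimal sketch (assuming matplotlib is available in your environment; it is not required by the module) might look like:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
curves = {
    "ReLU": np.maximum(0, x),
    "Sigmoid": 1 / (1 + np.exp(-x)),
    "Tanh": np.tanh(x),
}

# One panel per activation function, plotted over the same input range.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, (name, y) in zip(axes, curves.items()):
    ax.plot(x, y)
    ax.set_title(name)
    ax.grid(True)
plt.tight_layout()
plt.show()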
Implementation Tips
ReLU Implementation
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.maximum(0, x.data))
Sigmoid Implementation (Numerical Stability)
def forward(self, x: Tensor) -> Tensor:
    # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
    # For x < 0:  sigmoid(x) = exp(x) / (1 + exp(x))
    x_data = x.data
    result = np.zeros_like(x_data)
    positive_mask = x_data >= 0
    result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
    result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
    return Tensor(result)
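To see why the two-branch formulation matters, here is a small check (plain NumPy, outside the Tensor class) comparing the naive formula against the stable one at extreme inputs:

import numpy as np

x = np.array([-1000.0, 0.0, 1000.0])

# Naive formula: exp(1000) overflows for x = -1000 and raises a RuntimeWarning,
# even though the final value happens to round to 0.0.
naive = 1.0 / (1.0 + np.exp(-x))

# Stable two-branch formula (same idea as the forward() above):
# each branch only ever calls exp() on a non-positive argument.
stable = np.zeros_like(x)
pos = x >= 0
stable[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
stable[~pos] = np.exp(x[~pos]) / (1.0 + np.exp(x[~pos]))

print(naive)   # [0.  0.5  1.]  (with an overflow warning)
print(stable)  # [0.  0.5  1.]  (no warning)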
Tanh Implementation
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.tanh(x.data))
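A useful cross-check (an optional sketch, not part of the assignment) is the identity tanh(x) = 2·sigmoid(2x) − 1, which ties your Tanh and Sigmoid implementations together:

import numpy as np

x = np.linspace(-4, 4, 9)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))  # True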
🧪 Testing Your Implementation
Unit Tests
python bin/tito.py test --module activations
Test Coverage:
- ✅ Mathematical correctness
- ✅ Numerical stability
- ✅ Shape preservation
- ✅ Edge cases
- ✅ Function properties
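If you want quick checks of your own beyond the provided suite, here is a small sketch (assuming the imports used in the Manual Testing example below work once you have implemented the classes):

import numpy as np
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh

def test_activation_properties():
    x = Tensor([[-2.0, -0.5, 0.0, 0.5, 2.0]])
    for act in (ReLU(), Sigmoid(), Tanh()):
        out = act(x)
        assert out.data.shape == x.data.shape  # shape preservation
    assert np.all(ReLU()(x).data >= 0)                                 # ReLU is non-negative
    assert np.all((Sigmoid()(x).data > 0) & (Sigmoid()(x).data < 1))   # sigmoid stays in (0, 1)
    assert np.all(np.abs(Tanh()(x).data) < 1)                          # tanh stays in (-1, 1)
    print("All property checks passed!")

test_activation_properties()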
Manual Testing
# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)
📊 Understanding Function Properties
Range Comparison
| Function | Input Range | Output Range | Zero Point |
|---|---|---|---|
| ReLU | (-∞, ∞) | [0, ∞) | f(0) = 0 |
| Sigmoid | (-∞, ∞) | (0, 1) | f(0) = 0.5 |
| Tanh | (-∞, ∞) | (-1, 1) | f(0) = 0 |
Key Properties
- ReLU: Sparse (zeros out negatives), unbounded, simple
- Sigmoid: Probabilistic (0-1 range), smooth, saturating
- Tanh: Zero-centered, symmetric, stronger gradients than sigmoid
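A minimal sketch verifying the table's zero points and the symmetry properties with plain NumPy:

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(np.maximum(0, 0.0), sigmoid(0.0), np.tanh(0.0))  # 0.0 0.5 0.0 (zero points)

x = np.linspace(-3, 3, 7)
print(np.allclose(np.tanh(-x), -np.tanh(x)))     # True: tanh is odd (zero-centered)
print(np.allclose(sigmoid(-x), 1 - sigmoid(x)))  # True: sigmoid is symmetric about 0.5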
🔧 Integration with TinyTorch
After implementation, your activations will be available as:
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
# Use in neural networks
relu = ReLU()
output = relu(input_tensor)
🎯 Common Issues & Solutions
Issue 1: Sigmoid Overflow
Problem: exp() overflow with large inputs
Solution: Use numerically stable implementation (see code above)
Issue 2: Wrong Output Range
Problem: Sigmoid/Tanh outputs outside expected range
Solution: Check your mathematical implementation
Issue 3: Shape Mismatch
Problem: Output shape differs from input shape
Solution: Ensure element-wise operations preserve shape
Issue 4: Import Errors
Problem: Cannot import after implementation
Solution: Run python bin/tito.py sync to export to package
📈 Performance Considerations
- ReLU: Fastest (simple max operation)
- Sigmoid: Moderate (exponential computation)
- Tanh: Moderate (hyperbolic function)
All implementations use NumPy for vectorized operations.
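If you are curious about the relative cost, here is a rough timing sketch (results vary by machine; the numbers are illustrative only):

import timeit
import numpy as np

x = np.random.randn(1_000_000)
for name, fn in [("ReLU", lambda: np.maximum(0, x)),
                 ("Sigmoid", lambda: 1 / (1 + np.exp(-x))),
                 ("Tanh", lambda: np.tanh(x))]:
    t = timeit.timeit(fn, number=100)
    print(f"{name}: {t:.3f}s for 100 runs")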
🚀 What's Next
After mastering activations, you'll use them in:
- Layers Module: Building neural network layers
- Loss Functions: Computing training objectives
- Advanced Architectures: CNNs, RNNs, and more
These functions are the mathematical foundation for everything that follows!
📚 Further Reading
Advanced Topics:
- ReLU variants (Leaky ReLU, ELU, Swish)
- Activation function choice and impact
- Gradient flow and vanishing gradients
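As a taste of the variants listed above, here is a hedged sketch of Leaky ReLU in plain NumPy (the 0.01 slope is a common default, not a TinyTorch requirement):

import numpy as np

def leaky_relu(x: np.ndarray, negative_slope: float = 0.01) -> np.ndarray:
    # Like ReLU, but lets a small signal through for negative inputs.
    return np.where(x >= 0, x, negative_slope * x)

print(leaky_relu(np.array([-2.0, 0.0, 2.0])))  # [-0.02  0.    2.  ]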
🎉 Success Criteria
You've mastered this module when:
- All tests pass (python bin/tito.py test --module activations)
- You understand why each function is useful
- You can explain the mathematical properties
- You can use activations in neural networks
- You appreciate the importance of nonlinearity
Great work! You've built the mathematical foundation of neural networks! 🎉