
🔥 TinyTorch Activations Module

📊 Module Info

  • Difficulty: Intermediate
  • Time Estimate: 3-4 hours
  • Prerequisites: Tensor module
  • Next Steps: Layers module

Welcome to the Activations module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.

🎯 Learning Objectives

By the end of this module, you will:

  1. Understand why activation functions are essential for neural networks
  2. Implement the three most important activation functions: ReLU, Sigmoid, and Tanh
  3. Test your functions with various inputs to understand their behavior
  4. Grasp the mathematical properties that make each function useful

🧠 Why This Module Matters

Without activation functions, neural networks are just linear transformations!

Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!

This module teaches you the mathematical foundations that make deep learning possible.
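
To see why, note that composing two linear maps is itself a linear map: W2(W1 x) = (W2 W1) x. A small NumPy check of this (a standalone sketch, not part of the module code):

import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
x = rng.normal(size=(4,))

# Two stacked linear layers collapse into a single linear layer...
two_layers = W2 @ (W1 @ x)
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))   # True: no extra expressive power

# ...but a nonlinearity in between breaks the collapse.
with_relu = W2 @ np.maximum(0, W1 @ x)
print(np.allclose(with_relu, one_layer))    # False in general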

📚 What You'll Build

1. ReLU (Rectified Linear Unit)

  • Formula: f(x) = max(0, x)
  • Properties: Simple, sparse, unbounded
  • Use case: Hidden layers (most common)

2. Sigmoid

  • Formula: f(x) = 1 / (1 + e^(-x))
  • Properties: Bounded to (0,1), smooth, probabilistic
  • Use case: Binary classification, gates

3. Tanh (Hyperbolic Tangent)

  • Formula: f(x) = tanh(x)
  • Properties: Bounded to (-1,1), zero-centered, smooth
  • Use case: Hidden layers, RNNs
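
Before implementing these as TinyTorch classes, it can help to evaluate the three formulas directly in NumPy on a few sample points (a quick sanity check, independent of the module's API):

import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

print(np.maximum(0, x))           # ReLU:    [0. 0. 0. 1. 2.]
print(1.0 / (1.0 + np.exp(-x)))   # Sigmoid: ≈ [0.119 0.269 0.5 0.731 0.881]
print(np.tanh(x))                 # Tanh:    ≈ [-0.964 -0.762 0. 0.762 0.964]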

🚀 Getting Started

Development Workflow

  1. Open the development file:

    python bin/tito.py jupyter
    # Then open assignments/source/02_activations/activations_dev.py
    
  2. Implement the functions:

    • Start with ReLU (simplest)
    • Move to Sigmoid (numerical stability challenge)
    • Finish with Tanh (symmetry properties)
  3. Visualize your functions:

    • Each function has plotting sections
    • See how your implementation transforms inputs
    • Compare all functions side-by-side
  4. Test as you go:

    python bin/tito.py test --module activations
    
  5. Export to package:

    python bin/tito.py sync
    

📊 Visual Learning Features

This module includes comprehensive plotting sections to help you understand:

  • Individual Function Plots: See each activation function's curve
  • Implementation Comparison: Your implementation vs ideal side-by-side
  • Mathematical Explanations: Visual breakdown of function properties
  • Error Analysis: Quantitative feedback on implementation accuracy
  • Comprehensive Comparison: All functions analyzed together

Enhanced Features:

  • 4-Panel Plots: Implementation vs ideal, mathematical definition, properties, error analysis
  • Real-time Feedback: Immediate accuracy scores with color-coded status
  • Mathematical Insights: Detailed explanations of function properties
  • Numerical Stability Testing: Verification with extreme values
  • Property Verification: Symmetry, monotonicity, and zero-centering tests

Why enhanced plots matter:

  • Visual Debugging: See exactly where your implementation differs
  • Quantitative Feedback: Get precise error measurements
  • Mathematical Understanding: Connect formulas to visual behavior
  • Implementation Confidence: Know immediately if your code is correct
  • Learning Reinforcement: Multiple visual perspectives of the same concept
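
If you want a standalone preview of these curves before working through the dev file's plotting sections, a minimal matplotlib sketch (using the plain NumPy formulas, not your TinyTorch implementation) looks like this:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
curves = {
    "ReLU": np.maximum(0, x),
    "Sigmoid": 1.0 / (1.0 + np.exp(-x)),
    "Tanh": np.tanh(x),
}

# One panel per activation, side by side
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, (name, y) in zip(axes, curves.items()):
    ax.plot(x, y)
    ax.set_title(name)
    ax.axhline(0, color="gray", linewidth=0.5)
    ax.axvline(0, color="gray", linewidth=0.5)
plt.tight_layout()
plt.show()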

Implementation Tips

The snippets below are the forward methods of the corresponding activation classes in activations_dev.py; they assume numpy is imported as np and that Tensor is the class you built in the Tensor module.

ReLU Implementation

def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.maximum(0, x.data))

Sigmoid Implementation (Numerical Stability)

def forward(self, x: Tensor) -> Tensor:
    # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
    # For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
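    # Both branches only exponentiate non-positive values, so exp() never overflows for large |x|.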
    x_data = x.data
    result = np.zeros_like(x_data)
    
    positive_mask = x_data >= 0
    result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
    result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
    
    return Tensor(result)

Tanh Implementation

def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.tanh(x.data))

🧪 Testing Your Implementation

Unit Tests

python bin/tito.py test --module activations

Test Coverage:

  • Mathematical correctness
  • Numerical stability
  • Shape preservation
  • Edge cases
  • Function properties
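
As a rough illustration of what these categories mean (a sketch of the kinds of properties checked, not the actual test suite):

import numpy as np
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh

x = Tensor(np.array([[-5.0, -1.0, 0.0, 1.0, 5.0]]))
extreme = Tensor(np.array([[-1000.0, 1000.0]]))

for act in (ReLU(), Sigmoid(), Tanh()):
    assert act(x).data.shape == x.data.shape        # shape preservation
    assert np.all(np.isfinite(act(extreme).data))   # numerical stability at extreme values

sig = Sigmoid()(x).data
assert np.all((sig > 0) & (sig < 1))                # sigmoid output stays in (0, 1)
assert np.all(np.abs(Tanh()(x).data) < 1)           # tanh output stays in (-1, 1)
print("All property checks passed")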

Manual Testing

# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh

x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])

relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()

print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)

📊 Understanding Function Properties

Range Comparison

Function   Input Range   Output Range   Zero Point
ReLU       (-∞, ∞)       [0, ∞)         f(0) = 0
Sigmoid    (-∞, ∞)       (0, 1)         f(0) = 0.5
Tanh       (-∞, ∞)       (-1, 1)        f(0) = 0

Key Properties

  • ReLU: Sparse (zeros out negatives), unbounded, simple
  • Sigmoid: Probabilistic (0-1 range), smooth, saturating
  • Tanh: Zero-centered, symmetric, stronger gradients than sigmoid
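
A quick way to quantify the "stronger gradients" point (standard derivative identities, stated here for reference):

  • sigmoid'(x) = sigmoid(x) · (1 − sigmoid(x)), which peaks at 0.25 at x = 0
  • tanh'(x) = 1 − tanh(x)², which peaks at 1 at x = 0
  • The two functions are related by tanh(x) = 2 · sigmoid(2x) − 1

So near zero, tanh passes gradients up to four times larger than sigmoid, which is one reason it is often preferred when a bounded, zero-centered activation is needed.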

🔧 Integration with TinyTorch

After implementation, your activations will be available as:

from tinytorch.core.activations import ReLU, Sigmoid, Tanh

# Use in neural networks
relu = ReLU()
output = relu(input_tensor)

🎯 Common Issues & Solutions

Issue 1: Sigmoid Overflow

Problem: exp() overflows for large-magnitude inputs
Solution: Use the numerically stable implementation (see code above)
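
A minimal illustration of the failure mode with plain NumPy (not the module's code):

import numpy as np

x = np.array([-1000.0])

# Naive formula: exp(1000) overflows to inf and NumPy emits a RuntimeWarning.
naive = 1.0 / (1.0 + np.exp(-x))

# Stable formula for x < 0: every exp() argument is non-positive, so no overflow.
stable = np.exp(x) / (1.0 + np.exp(x))
print(naive, stable)   # both evaluate to ≈ 0 here, but only the stable path avoids the warning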

Issue 2: Wrong Output Range

Problem: Sigmoid/Tanh outputs outside expected range
Solution: Check your mathematical implementation

Issue 3: Shape Mismatch

Problem: Output shape differs from input shape
Solution: Ensure element-wise operations preserve shape

Issue 4: Import Errors

Problem: Cannot import after implementation
Solution: Run python bin/tito.py sync to export to the package

📈 Performance Considerations

  • ReLU: Fastest (simple max operation)
  • Sigmoid: Moderate (exponential computation)
  • Tanh: Moderate (hyperbolic function)

All implementations use NumPy for vectorized operations.

🚀 What's Next

After mastering activations, you'll use them in:

  1. Layers Module: Building neural network layers
  2. Loss Functions: Computing training objectives
  3. Advanced Architectures: CNNs, RNNs, and more

These functions are the mathematical foundation for everything that follows!

📚 Further Reading

Advanced Topics:

  • ReLU variants (Leaky ReLU, ELU, Swish)
  • Activation function choice and impact
  • Gradient flow and vanishing gradients

🎉 Success Criteria

You've mastered this module when:

  • All tests pass (python bin/tito.py test --module activations)
  • You understand why each function is useful
  • You can explain the mathematical properties
  • You can use activations in neural networks
  • You appreciate the importance of nonlinearity

Great work! You've built the mathematical foundation of neural networks! 🎉