Files
TinyTorch/modules/source/03_activations/README.md
Vijay Janapa Reddi 8afe207ce5 Renumber modules from 00-13 to 01-14 for natural numbering
 Rename all module directories: 00_setup → 01_setup, etc.
 Update convert_modules.py mappings for new directory names
 Update _toc.yml file paths and titles (1-14 instead of 0-13)
 Regenerate all overview pages with new numbering
 Fix all broken references in usage-paths and intro
 Update chapter references to use natural numbering

Benefits:
- More intuitive course progression starting from 1
- Matches academic course numbering conventions
- Eliminates confusion about 'Module 0' concept
- Cleaner mental model for students and instructors
- All references and links properly updated

Complete transformation: 14 modules now numbered 01-14
2025-07-15 18:51:36 -04:00

252 lines
7.5 KiB
Markdown

# 🔥 TinyTorch Activations Module
## 📊 Module Info
- **Difficulty**: ⭐⭐ Intermediate
- **Time Estimate**: 3-4 hours
- **Prerequisites**: Tensor module
- **Next Steps**: Layers module
Welcome to the **Activations** module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.
## 🎯 Learning Objectives
By the end of this module, you will:
1. **Understand** why activation functions are essential for neural networks
2. **Implement** the three most important activation functions: ReLU, Sigmoid, and Tanh
3. **Test** your functions with various inputs to understand their behavior
4. **Grasp** the mathematical properties that make each function useful
## 🧠 Why This Module Matters
**Without activation functions, neural networks are just linear transformations!**
```
Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!
```
This module teaches you the mathematical foundations that make deep learning possible.
## 📚 What You'll Build
### 1. **ReLU** (Rectified Linear Unit)
- **Formula**: `f(x) = max(0, x)`
- **Properties**: Simple, sparse, unbounded
- **Use case**: Hidden layers (most common)
### 2. **Sigmoid**
- **Formula**: `f(x) = 1 / (1 + e^(-x))`
- **Properties**: Bounded to (0,1), smooth, probabilistic
- **Use case**: Binary classification, gates
### 3. **Tanh** (Hyperbolic Tangent)
- **Formula**: `f(x) = tanh(x)`
- **Properties**: Bounded to (-1,1), zero-centered, smooth
- **Use case**: Hidden layers, RNNs
## 🚀 Getting Started
### Prerequisites
1. **Activate the virtual environment**:
```bash
source bin/activate-tinytorch.sh
```
2. **Start development environment**:
```bash
tito jupyter
```
### Development Workflow
1. **Open the development file**:
```bash
# Then open assignments/source/02_activations/activations_dev.py
```
2. **Implement the functions**:
- Start with ReLU (simplest)
- Move to Sigmoid (numerical stability challenge)
- Finish with Tanh (symmetry properties)
3. **Visualize your functions**:
- Each function has plotting sections
- See how your implementation transforms inputs
- Compare all functions side-by-side
4. **Test as you go**:
```bash
tito test --module activations
```
5. **Export to package**:
```bash
tito sync
```
### 📊 Visual Learning Features
This module includes comprehensive plotting sections to help you understand:
- **Individual Function Plots**: See each activation function's curve
- **Implementation Comparison**: Your implementation vs ideal side-by-side
- **Mathematical Explanations**: Visual breakdown of function properties
- **Error Analysis**: Quantitative feedback on implementation accuracy
- **Comprehensive Comparison**: All functions analyzed together
**Enhanced Features**:
- **4-Panel Plots**: Implementation vs ideal, mathematical definition, properties, error analysis
- **Real-time Feedback**: Immediate accuracy scores with color-coded status
- **Mathematical Insights**: Detailed explanations of function properties
- **Numerical Stability Testing**: Verification with extreme values
- **Property Verification**: Symmetry, monotonicity, and zero-centering tests
**Why enhanced plots matter**:
- **Visual Debugging**: See exactly where your implementation differs
- **Quantitative Feedback**: Get precise error measurements
- **Mathematical Understanding**: Connect formulas to visual behavior
- **Implementation Confidence**: Know immediately if your code is correct
- **Learning Reinforcement**: Multiple visual perspectives of the same concept
### Implementation Tips
#### ReLU Implementation
```python
def forward(self, x: Tensor) -> Tensor:
return Tensor(np.maximum(0, x.data))
```
#### Sigmoid Implementation (Numerical Stability)
```python
def forward(self, x: Tensor) -> Tensor:
# For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
# For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
x_data = x.data
result = np.zeros_like(x_data)
positive_mask = x_data >= 0
result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
return Tensor(result)
```
#### Tanh Implementation
```python
def forward(self, x: Tensor) -> Tensor:
return Tensor(np.tanh(x.data))
```
### Testing Your Implementation
1. **Run the tests**:
```bash
tito test --module activations
```
2. **Export to package**:
```bash
tito sync
```
### Manual Testing
```python
# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)
```
## 📊 Understanding Function Properties
### Range Comparison
| Function | Input Range | Output Range | Zero Point |
|----------|-------------|--------------|------------|
| ReLU | (-∞, ∞) | [0, ∞) | f(0) = 0 |
| Sigmoid | (-∞, ∞) | (0, 1) | f(0) = 0.5 |
| Tanh | (-∞, ∞) | (-1, 1) | f(0) = 0 |
### Key Properties
- **ReLU**: Sparse (zeros out negatives), unbounded, simple
- **Sigmoid**: Probabilistic (0-1 range), smooth, saturating
- **Tanh**: Zero-centered, symmetric, stronger gradients than sigmoid
## 🔧 Integration with TinyTorch
After implementation, your activations will be available as:
```python
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
# Use in neural networks
relu = ReLU()
output = relu(input_tensor)
```
## 🎯 Common Issues & Solutions
### Issue 1: Sigmoid Overflow
**Problem**: `exp()` overflow with large inputs
**Solution**: Use numerically stable implementation (see code above)
### Issue 2: Wrong Output Range
**Problem**: Sigmoid/Tanh outputs outside expected range
**Solution**: Check your mathematical implementation
### Issue 3: Shape Mismatch
**Problem**: Output shape differs from input shape
**Solution**: Ensure element-wise operations preserve shape
### Issue 4: Import Errors
**Problem**: Cannot import after implementation
**Solution**: Run `tito sync` to export to package
## 📈 Performance Considerations
- **ReLU**: Fastest (simple max operation)
- **Sigmoid**: Moderate (exponential computation)
- **Tanh**: Moderate (hyperbolic function)
All implementations use NumPy for vectorized operations.
## 🚀 What's Next
After mastering activations, you'll use them in:
1. **Layers Module**: Building neural network layers
2. **Loss Functions**: Computing training objectives
3. **Advanced Architectures**: CNNs, RNNs, and more
These functions are the mathematical foundation for everything that follows!
## 📚 Further Reading
**Mathematical Background**:
- [Activation Functions in Neural Networks](https://en.wikipedia.org/wiki/Activation_function)
- [Deep Learning Book - Chapter 6](http://www.deeplearningbook.org/)
**Advanced Topics**:
- ReLU variants (Leaky ReLU, ELU, Swish)
- Activation function choice and impact
- Gradient flow and vanishing gradients
## 🎉 Success Criteria
You've mastered this module when:
- [ ] All tests pass (`tito test --module activations`)
- [ ] You understand why each function is useful
- [ ] You can explain the mathematical properties
- [ ] You can use activations in neural networks
- [ ] You appreciate the importance of nonlinearity
**Great work! You've built the mathematical foundation of neural networks!** 🎉