🔥 TinyTorch Activations Module
📊 Module Info
- Difficulty: ⭐⭐ Intermediate
- Time Estimate: 3-4 hours
- Prerequisites: Tensor module
- Next Steps: Layers module
Welcome to the Activations module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.
🎯 Learning Objectives
By the end of this module, you will:
- Understand why activation functions are essential for neural networks
- Implement the three most important activation functions: ReLU, Sigmoid, and Tanh
- Test your functions with various inputs to understand their behavior
- Grasp the mathematical properties that make each function useful
🧠 Why This Module Matters
Without activation functions, neural networks are just linear transformations!
Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!
This module teaches you the mathematical foundations that make deep learning possible.
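To see this concretely, here is a minimal NumPy sketch (independent of TinyTorch, with made-up weights) showing that two stacked linear layers collapse into a single linear layer, while inserting a ReLU between them does not:

import numpy as np

W1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
W2 = np.array([[2.0, 1.0]])
x = np.array([1.0, 3.0])

# Two stacked linear layers are equivalent to one linear layer W2 @ W1.
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))   # True: still just linear

# With ReLU in between, the negative entry of W1 @ x is zeroed out,
# so no single matrix reproduces the mapping for all inputs.
print(W2 @ (W1 @ x), W2 @ np.maximum(0, W1 @ x))    # 2.5 vs 6.5: different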
📚 What You'll Build
1. ReLU (Rectified Linear Unit)
- Formula: f(x) = max(0, x)
- Properties: Simple, sparse, unbounded
- Use case: Hidden layers (most common)
2. Sigmoid
- Formula: f(x) = 1 / (1 + e^(-x))
- Properties: Bounded to (0, 1), smooth, probabilistic
- Use case: Binary classification, gates
3. Tanh (Hyperbolic Tangent)
- Formula: f(x) = tanh(x)
- Properties: Bounded to (-1, 1), zero-centered, smooth
- Use case: Hidden layers, RNNs
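As a quick sanity reference, here is a sketch (plain NumPy, not the TinyTorch Tensor class) evaluating the three formulas at a few points:

import numpy as np

x = np.array([-2.0, 0.0, 2.0])
print(np.maximum(0, x))        # ReLU:    [0.      0.      2.     ]
print(1 / (1 + np.exp(-x)))    # Sigmoid: [~0.1192 0.5     ~0.8808]
print(np.tanh(x))              # Tanh:    [~-0.964 0.      ~0.964 ]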
🚀 Getting Started
Development Workflow
1. Open the development file:
   python bin/tito.py jupyter  # Then open assignments/source/02_activations/activations_dev.py
2. Implement the functions:
   - Start with ReLU (simplest)
   - Move to Sigmoid (numerical stability challenge)
   - Finish with Tanh (symmetry properties)
3. Visualize your functions:
   - Each function has plotting sections
   - See how your implementation transforms inputs
   - Compare all functions side-by-side
4. Test as you go:
   python bin/tito.py test --module activations
5. Export to package:
   python bin/tito.py sync
📊 Visual Learning Features
This module includes comprehensive plotting sections to help you understand:
- Individual Function Plots: See each activation function's curve
- Implementation Comparison: Your implementation vs ideal side-by-side
- Mathematical Explanations: Visual breakdown of function properties
- Error Analysis: Quantitative feedback on implementation accuracy
- Comprehensive Comparison: All functions analyzed together
Enhanced Features:
- 4-Panel Plots: Implementation vs ideal, mathematical definition, properties, error analysis
- Real-time Feedback: Immediate accuracy scores with color-coded status
- Mathematical Insights: Detailed explanations of function properties
- Numerical Stability Testing: Verification with extreme values
- Property Verification: Symmetry, monotonicity, and zero-centering tests
Why enhanced plots matter:
- Visual Debugging: See exactly where your implementation differs
- Quantitative Feedback: Get precise error measurements
- Mathematical Understanding: Connect formulas to visual behavior
- Implementation Confidence: Know immediately if your code is correct
- Learning Reinforcement: Multiple visual perspectives of the same concept
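If you want to reproduce a simple version of these plots yourself, a minimal sketch (assuming matplotlib is available in your environment; it is not required by the module) might look like:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
curves = {
    "ReLU": np.maximum(0, x),
    "Sigmoid": 1 / (1 + np.exp(-x)),
    "Tanh": np.tanh(x),
}

# One panel per activation function, plotted over the same input range.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, (name, y) in zip(axes, curves.items()):
    ax.plot(x, y)
    ax.set_title(name)
    ax.grid(True)
plt.tight_layout()
plt.show()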
Implementation Tips
ReLU Implementation
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.maximum(0, x.data))
Sigmoid Implementation (Numerical Stability)
def forward(self, x: Tensor) -> Tensor:
    # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
    # For x < 0:  sigmoid(x) = exp(x) / (1 + exp(x))
    x_data = x.data
    result = np.zeros_like(x_data)
    positive_mask = x_data >= 0
    result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
    result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
    return Tensor(result)
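To see why the two-branch formulation matters, here is a small check (plain NumPy, outside the Tensor class) comparing the naive formula against the stable one at extreme inputs:

import numpy as np

x = np.array([-1000.0, 0.0, 1000.0])

# Naive formula: exp(1000) overflows for x = -1000 and raises a RuntimeWarning,
# even though the final value happens to round to 0.0.
naive = 1.0 / (1.0 + np.exp(-x))

# Stable two-branch formula (same idea as the forward() above):
# each branch only ever calls exp() on a non-positive argument.
stable = np.zeros_like(x)
pos = x >= 0
stable[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
stable[~pos] = np.exp(x[~pos]) / (1.0 + np.exp(x[~pos]))

print(naive)   # [0.  0.5  1.]  (with an overflow warning)
print(stable)  # [0.  0.5  1.]  (no warning)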
Tanh Implementation
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.tanh(x.data))
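A useful cross-check (an optional sketch, not part of the assignment) is the identity tanh(x) = 2·sigmoid(2x) − 1, which ties your Tanh and Sigmoid implementations together:

import numpy as np

x = np.linspace(-4, 4, 9)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))  # True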
🧪 Testing Your Implementation
Unit Tests
python bin/tito.py test --module activations
Test Coverage:
- ✅ Mathematical correctness
- ✅ Numerical stability
- ✅ Shape preservation
- ✅ Edge cases
- ✅ Function properties
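If you want quick checks of your own beyond the provided suite, here is a small sketch (assuming the imports used in the Manual Testing example below work once you have implemented the classes):

import numpy as np
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh

def test_activation_properties():
    x = Tensor([[-2.0, -0.5, 0.0, 0.5, 2.0]])
    for act in (ReLU(), Sigmoid(), Tanh()):
        out = act(x)
        assert out.data.shape == x.data.shape  # shape preservation
    assert np.all(ReLU()(x).data >= 0)                                 # ReLU is non-negative
    assert np.all((Sigmoid()(x).data > 0) & (Sigmoid()(x).data < 1))   # sigmoid stays in (0, 1)
    assert np.all(np.abs(Tanh()(x).data) < 1)                          # tanh stays in (-1, 1)
    print("All property checks passed!")

test_activation_properties()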
Manual Testing
# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)
📊 Understanding Function Properties
Range Comparison
| Function | Input Range | Output Range | Zero Point |
|---|---|---|---|
| ReLU | (-∞, ∞) | [0, ∞) | f(0) = 0 |
| Sigmoid | (-∞, ∞) | (0, 1) | f(0) = 0.5 |
| Tanh | (-∞, ∞) | (-1, 1) | f(0) = 0 |
Key Properties
- ReLU: Sparse (zeros out negatives), unbounded, simple
- Sigmoid: Probabilistic (0-1 range), smooth, saturating
- Tanh: Zero-centered, symmetric, stronger gradients than sigmoid
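A minimal sketch verifying the table's zero points and the symmetry properties with plain NumPy:

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(np.maximum(0, 0.0), sigmoid(0.0), np.tanh(0.0))  # 0.0 0.5 0.0 (zero points)

x = np.linspace(-3, 3, 7)
print(np.allclose(np.tanh(-x), -np.tanh(x)))     # True: tanh is odd (zero-centered)
print(np.allclose(sigmoid(-x), 1 - sigmoid(x)))  # True: sigmoid is symmetric about 0.5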
🔧 Integration with TinyTorch
After implementation, your activations will be available as:
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
# Use in neural networks
relu = ReLU()
output = relu(input_tensor)
🎯 Common Issues & Solutions
Issue 1: Sigmoid Overflow
Problem: exp() overflow with large inputs
Solution: Use numerically stable implementation (see code above)
Issue 2: Wrong Output Range
Problem: Sigmoid/Tanh outputs outside expected range
Solution: Check your mathematical implementation
Issue 3: Shape Mismatch
Problem: Output shape differs from input shape
Solution: Ensure element-wise operations preserve shape
Issue 4: Import Errors
Problem: Cannot import after implementation
Solution: Run python bin/tito.py sync to export to package
📈 Performance Considerations
- ReLU: Fastest (simple max operation)
- Sigmoid: Moderate (exponential computation)
- Tanh: Moderate (hyperbolic function)
All implementations use NumPy for vectorized operations.
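If you are curious about the relative cost, here is a rough timing sketch (results vary by machine; the numbers are illustrative only):

import timeit
import numpy as np

x = np.random.randn(1_000_000)
for name, fn in [("ReLU", lambda: np.maximum(0, x)),
                 ("Sigmoid", lambda: 1 / (1 + np.exp(-x))),
                 ("Tanh", lambda: np.tanh(x))]:
    t = timeit.timeit(fn, number=100)
    print(f"{name}: {t:.3f}s for 100 runs")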
🚀 What's Next
After mastering activations, you'll use them in:
- Layers Module: Building neural network layers
- Loss Functions: Computing training objectives
- Advanced Architectures: CNNs, RNNs, and more
These functions are the mathematical foundation for everything that follows!
📚 Further Reading
Advanced Topics:
- ReLU variants (Leaky ReLU, ELU, Swish)
- Activation function choice and impact
- Gradient flow and vanishing gradients
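As a taste of the variants listed above, here is a hedged sketch of Leaky ReLU in plain NumPy (the 0.01 slope is a common default, not a TinyTorch requirement):

import numpy as np

def leaky_relu(x: np.ndarray, negative_slope: float = 0.01) -> np.ndarray:
    # Like ReLU, but lets a small signal through for negative inputs.
    return np.where(x >= 0, x, negative_slope * x)

print(leaky_relu(np.array([-2.0, 0.0, 2.0])))  # [-0.02  0.    2.  ]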
🎉 Success Criteria
You've mastered this module when:
- All tests pass (python bin/tito.py test --module activations)
- You understand why each function is useful
- You can explain the mathematical properties
- You can use activations in neural networks
- You appreciate the importance of nonlinearity
Great work! You've built the mathematical foundation of neural networks! 🎉