mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-30 19:17:31 -05:00
feat: Create clean modular architecture with activations → layers separation
Major architectural improvement implementing clean separation of concerns:

✨ NEW: Activations Module
- Complete activations module with ReLU, Sigmoid, Tanh implementations
- Educational NBDev structure with student TODOs + instructor solutions
- Comprehensive testing suite (24 tests) with mathematical correctness validation
- Visual learning features with matplotlib plotting (disabled during testing)
- Clean export to tinytorch.core.activations

🔧 REFACTOR: Layers Module
- Removed duplicate activation function implementations
- Clean import from activations module: 'from tinytorch.core.activations import ReLU, Sigmoid, Tanh'
- Updated documentation to reflect modular architecture
- Preserved all existing functionality while improving code organization

🧪 TESTING: Comprehensive Test Coverage
- All 24 activations tests passing ✅
- All 17 layers tests passing ✅
- Integration tests verify clean architecture works end-to-end
- CLI testing with 'tito test --module' works for both modules

📦 ARCHITECTURE: Clean Dependency Graph
- activations (math functions) → layers (building blocks) → networks (applications)
- Separation of concerns: pure math vs. neural network components
- Reusable components across future modules
- Single source of truth for activation implementations

PEDAGOGY: Enhanced Learning Experience
- Week-sized chunks: students master activations, then build layers
- Clear progression from mathematical foundations to applications
- Real-world software architecture patterns
- Modular design principles in practice

This establishes the foundation for scalable, maintainable ML systems education.
This commit is contained in:
@@ -343,7 +343,7 @@ def cmd_info(args):
def cmd_test(args):
    """Run tests for a specific module."""
    valid_modules = ["setup", "tensor", "layers", "cnn", "data", "training",
    valid_modules = ["setup", "tensor", "activations", "layers", "cnn", "data", "training",
                     "profiling", "compression", "kernels", "benchmarking", "mlops"]

    if args.all:
237  modules/activations/README.md  Normal file
@@ -0,0 +1,237 @@
# 🔥 TinyTorch Activations Module

Welcome to the **Activations** module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.

## 🎯 Learning Objectives

By the end of this module, you will:
1. **Understand** why activation functions are essential for neural networks
2. **Implement** the three most important activation functions: ReLU, Sigmoid, and Tanh
3. **Test** your functions with various inputs to understand their behavior
4. **Grasp** the mathematical properties that make each function useful

## 🧠 Why This Module Matters

**Without activation functions, neural networks are just linear transformations!**

```
Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!
```

This module teaches you the mathematical foundations that make deep learning possible.
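
To see this concretely, here is a minimal standalone NumPy sketch (illustrative only, not part of the module code): two weight matrices composed without an activation collapse into a single linear map, while inserting a ReLU between them breaks that collapse.

```python
import numpy as np

# Two "layers" without an activation: y = W2 @ (W1 @ x)
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
W2 = np.array([[0.5, -1.0], [2.0, 0.0]])
x = np.array([1.0, -1.0])

stacked = W2 @ (W1 @ x)            # two linear transformations in a row...
collapsed = (W2 @ W1) @ x          # ...equal one single linear transformation
print(np.allclose(stacked, collapsed))   # True: stacking bought us nothing

# Insert a ReLU between them and the collapse no longer holds
relu = lambda z: np.maximum(0, z)
nonlinear = W2 @ relu(W1 @ x)
print(np.allclose(nonlinear, collapsed))  # False: nonlinearity adds expressive power
```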

## 📚 What You'll Build

### 1. **ReLU** (Rectified Linear Unit)
- **Formula**: `f(x) = max(0, x)`
- **Properties**: Simple, sparse, unbounded
- **Use case**: Hidden layers (most common)

### 2. **Sigmoid**
- **Formula**: `f(x) = 1 / (1 + e^(-x))`
- **Properties**: Bounded to (0, 1), smooth, probabilistic
- **Use case**: Binary classification, gates

### 3. **Tanh** (Hyperbolic Tangent)
- **Formula**: `f(x) = tanh(x)`
- **Properties**: Bounded to (-1, 1), zero-centered, smooth
- **Use case**: Hidden layers, RNNs
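
Before implementing the Tensor-based classes, it can help to sanity-check the three formulas above with plain NumPy (a standalone sketch, independent of TinyTorch's `Tensor`):

```python
import numpy as np

x = np.array([-2.0, 0.0, 2.0])

relu = np.maximum(0, x)              # [0., 0., 2.]
sigmoid = 1.0 / (1.0 + np.exp(-x))   # [~0.119, 0.5, ~0.881]
tanh = np.tanh(x)                    # [~-0.964, 0., ~0.964]

assert sigmoid[1] == 0.5 and tanh[1] == 0.0  # both cross their midpoint at x = 0
```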

## 🚀 Getting Started

### Development Workflow

1. **Open the development file**:
   ```bash
   python bin/tito.py jupyter
   # Then open modules/activations/activations_dev.py
   ```

2. **Implement the functions**:
   - Start with ReLU (simplest)
   - Move to Sigmoid (numerical stability challenge)
   - Finish with Tanh (symmetry properties)

3. **Visualize your functions**:
   - Each function has plotting sections
   - See how your implementation transforms inputs
   - Compare all functions side-by-side

4. **Test as you go**:
   ```bash
   python bin/tito.py test --module activations
   ```

5. **Export to package**:
   ```bash
   python bin/tito.py sync
   ```

### 📊 Visual Learning Features

This module includes comprehensive plotting sections to help you understand:

- **Individual Function Plots**: See each activation function's curve
- **Implementation Comparison**: Your implementation vs ideal side-by-side
- **Mathematical Explanations**: Visual breakdown of function properties
- **Error Analysis**: Quantitative feedback on implementation accuracy
- **Comprehensive Comparison**: All functions analyzed together

**Enhanced Features**:
- **4-Panel Plots**: Implementation vs ideal, mathematical definition, properties, error analysis
- **Real-time Feedback**: Immediate accuracy scores with color-coded status
- **Mathematical Insights**: Detailed explanations of function properties
- **Numerical Stability Testing**: Verification with extreme values
- **Property Verification**: Symmetry, monotonicity, and zero-centering tests

**Why enhanced plots matter**:
- **Visual Debugging**: See exactly where your implementation differs
- **Quantitative Feedback**: Get precise error measurements
- **Mathematical Understanding**: Connect formulas to visual behavior
- **Implementation Confidence**: Know immediately if your code is correct
- **Learning Reinforcement**: Multiple visual perspectives of the same concept

### Implementation Tips

#### ReLU Implementation
```python
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.maximum(0, x.data))
```

#### Sigmoid Implementation (Numerical Stability)
```python
def forward(self, x: Tensor) -> Tensor:
    # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
    # For x < 0:  sigmoid(x) = exp(x) / (1 + exp(x))
    x_data = x.data
    result = np.zeros_like(x_data)

    positive_mask = x_data >= 0
    result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
    result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))

    return Tensor(result)
```

#### Tanh Implementation
```python
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.tanh(x.data))
```

## 🧪 Testing Your Implementation

### Unit Tests
```bash
python bin/tito.py test --module activations
```

**Test Coverage**:
- ✅ Mathematical correctness
- ✅ Numerical stability
- ✅ Shape preservation
- ✅ Edge cases
- ✅ Function properties

### Manual Testing
```python
# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh

x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])

relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()

print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)
```

## 📊 Understanding Function Properties

### Range Comparison

| Function | Input Range | Output Range | Zero Point |
|----------|-------------|--------------|------------|
| ReLU     | (-∞, ∞)     | [0, ∞)       | f(0) = 0   |
| Sigmoid  | (-∞, ∞)     | (0, 1)       | f(0) = 0.5 |
| Tanh     | (-∞, ∞)     | (-1, 1)      | f(0) = 0   |

### Key Properties
- **ReLU**: Sparse (zeros out negatives), unbounded, simple
- **Sigmoid**: Probabilistic (0-1 range), smooth, saturating
- **Tanh**: Zero-centered, symmetric, stronger gradients than sigmoid

## 🔧 Integration with TinyTorch

After implementation, your activations will be available as:

```python
from tinytorch.core.activations import ReLU, Sigmoid, Tanh

# Use in neural networks
relu = ReLU()
output = relu(input_tensor)
```

## 🎯 Common Issues & Solutions

### Issue 1: Sigmoid Overflow
**Problem**: `exp()` overflow with large inputs
**Solution**: Use numerically stable implementation (see code above)
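
For instance, the naive formula overflows for large negative inputs, while the stable split-by-sign version never exponentiates a large positive number. A small standalone sketch (using `np.where` rather than the boolean masks above, but the same idea):

```python
import numpy as np

x = np.array([-1000.0, 0.0, 1000.0])

# Naive formula: exp(-x) overflows when x is a large negative number
with np.errstate(over="warn"):
    naive = 1.0 / (1.0 + np.exp(-x))   # RuntimeWarning: overflow encountered in exp

# Stable formula: only ever exponentiate non-positive values
e = np.exp(-np.abs(x))
stable = np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

print(naive)    # [0.  0.5  1.] but with an overflow warning along the way
print(stable)   # [0.  0.5  1.] with no warning
```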

### Issue 2: Wrong Output Range
**Problem**: Sigmoid/Tanh outputs outside expected range
**Solution**: Check your mathematical implementation

### Issue 3: Shape Mismatch
**Problem**: Output shape differs from input shape
**Solution**: Ensure element-wise operations preserve shape

### Issue 4: Import Errors
**Problem**: Cannot import after implementation
**Solution**: Run `python bin/tito.py sync` to export to package

## 📈 Performance Considerations

- **ReLU**: Fastest (simple max operation)
- **Sigmoid**: Moderate (exponential computation)
- **Tanh**: Moderate (hyperbolic function)

All implementations use NumPy for vectorized operations.
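
If you want to see these relative costs on your own machine, a quick (unscientific) micro-benchmark on raw NumPy arrays might look like this; exact numbers will vary by hardware:

```python
import timeit
import numpy as np

x = np.random.randn(1_000_000)

for name, fn in [
    ("ReLU", lambda: np.maximum(0, x)),
    ("Sigmoid", lambda: 1.0 / (1.0 + np.exp(-x))),
    ("Tanh", lambda: np.tanh(x)),
]:
    t = timeit.timeit(fn, number=20)
    print(f"{name:8s} {t:.3f}s for 20 runs")
```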

## 🚀 What's Next

After mastering activations, you'll use them in:
1. **Layers Module**: Building neural network layers
2. **Loss Functions**: Computing training objectives
3. **Advanced Architectures**: CNNs, RNNs, and more

These functions are the mathematical foundation for everything that follows!

## 📚 Further Reading

**Mathematical Background**:
- [Activation Functions in Neural Networks](https://en.wikipedia.org/wiki/Activation_function)
- [Deep Learning Book - Chapter 6](http://www.deeplearningbook.org/)

**Advanced Topics**:
- ReLU variants (Leaky ReLU, ELU, Swish)
- Activation function choice and impact
- Gradient flow and vanishing gradients

## 🎉 Success Criteria

You've mastered this module when:
- [ ] All tests pass (`python bin/tito.py test --module activations`)
- [ ] You understand why each function is useful
- [ ] You can explain the mathematical properties
- [ ] You can use activations in neural networks
- [ ] You appreciate the importance of nonlinearity

**Great work! You've built the mathematical foundation of neural networks!** 🎉
1162  modules/activations/activations_dev.py  Normal file
File diff suppressed because it is too large

345  modules/activations/tests/test_activations.py  Normal file
@@ -0,0 +1,345 @@
"""
Test suite for the TinyTorch Activations module.

This test suite validates the mathematical correctness of activation functions:
- ReLU: f(x) = max(0, x)
- Sigmoid: f(x) = 1 / (1 + e^(-x))
- Tanh: f(x) = tanh(x)

Tests focus on:
1. Mathematical correctness
2. Numerical stability
3. Edge cases
4. Shape preservation
5. Type consistency
"""

import pytest
import numpy as np
import math
from tinytorch.core.tensor import Tensor

# Import the activation functions
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
from activations_dev import ReLU, Sigmoid, Tanh


class TestReLU:
    """Test the ReLU activation function."""

    def test_relu_basic_functionality(self):
        """Test basic ReLU behavior: max(0, x)"""
        relu = ReLU()

        # Test mixed positive/negative values
        x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
        y = relu(x)
        expected = np.array([[0.0, 0.0, 0.0, 1.0, 2.0]])

        assert np.allclose(y.data, expected), f"Expected {expected}, got {y.data}"

    def test_relu_all_positive(self):
        """Test ReLU with all positive values (should be unchanged)"""
        relu = ReLU()

        x = Tensor([[1.0, 2.5, 3.7, 10.0]])
        y = relu(x)

        assert np.allclose(y.data, x.data), "ReLU should preserve positive values"

    def test_relu_all_negative(self):
        """Test ReLU with all negative values (should be zeros)"""
        relu = ReLU()

        x = Tensor([[-1.0, -2.5, -3.7, -10.0]])
        y = relu(x)
        expected = np.zeros_like(x.data)

        assert np.allclose(y.data, expected), "ReLU should zero out negative values"

    def test_relu_zero_input(self):
        """Test ReLU with zero input"""
        relu = ReLU()

        x = Tensor([[0.0]])
        y = relu(x)

        assert y.data[0, 0] == 0.0, "ReLU(0) should be 0"

    def test_relu_shape_preservation(self):
        """Test that ReLU preserves tensor shape"""
        relu = ReLU()

        # Test different shapes
        shapes = [(1, 5), (2, 3), (4, 1), (3, 3)]
        for shape in shapes:
            x = Tensor(np.random.randn(*shape))
            y = relu(x)
            assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"

    def test_relu_callable(self):
        """Test that ReLU can be called directly"""
        relu = ReLU()
        x = Tensor([[1.0, -1.0]])

        y1 = relu(x)
        y2 = relu.forward(x)

        assert np.allclose(y1.data, y2.data), "Direct call should match forward method"


class TestSigmoid:
    """Test the Sigmoid activation function."""

    def test_sigmoid_basic_functionality(self):
        """Test basic Sigmoid behavior"""
        sigmoid = Sigmoid()

        # Test known values
        x = Tensor([[0.0]])
        y = sigmoid(x)
        assert abs(y.data[0, 0] - 0.5) < 1e-6, "Sigmoid(0) should be 0.5"

    def test_sigmoid_range(self):
        """Test that Sigmoid outputs are in (0, 1)"""
        sigmoid = Sigmoid()

        # Test wide range of inputs
        x = Tensor([[-10.0, -5.0, -1.0, 0.0, 1.0, 5.0, 10.0]])
        y = sigmoid(x)

        assert np.all(y.data > 0), "Sigmoid outputs should be > 0"
        assert np.all(y.data < 1), "Sigmoid outputs should be < 1"

    def test_sigmoid_numerical_stability(self):
        """Test Sigmoid with extreme values (numerical stability)"""
        sigmoid = Sigmoid()

        # Test extreme values that could cause overflow
        x = Tensor([[-100.0, -50.0, 50.0, 100.0]])
        y = sigmoid(x)

        # Should not contain NaN or inf
        assert not np.any(np.isnan(y.data)), "Sigmoid should not produce NaN"
        assert not np.any(np.isinf(y.data)), "Sigmoid should not produce inf"

        # Should be close to 0 for very negative, close to 1 for very positive
        assert y.data[0, 0] < 1e-10, "Sigmoid(-100) should be very close to 0"
        assert y.data[0, 1] < 1e-10, "Sigmoid(-50) should be very close to 0"
        assert y.data[0, 2] > 1 - 1e-10, "Sigmoid(50) should be very close to 1"
        assert y.data[0, 3] > 1 - 1e-10, "Sigmoid(100) should be very close to 1"

    def test_sigmoid_monotonicity(self):
        """Test that Sigmoid is monotonically increasing"""
        sigmoid = Sigmoid()

        x = Tensor([[-3.0, -1.0, 0.0, 1.0, 3.0]])
        y = sigmoid(x)

        # Check that outputs are increasing
        for i in range(len(y.data[0]) - 1):
            assert y.data[0, i] < y.data[0, i + 1], "Sigmoid should be monotonically increasing"

    def test_sigmoid_shape_preservation(self):
        """Test that Sigmoid preserves tensor shape"""
        sigmoid = Sigmoid()

        shapes = [(1, 5), (2, 3), (4, 1)]
        for shape in shapes:
            x = Tensor(np.random.randn(*shape))
            y = sigmoid(x)
            assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"

    def test_sigmoid_callable(self):
        """Test that Sigmoid can be called directly"""
        sigmoid = Sigmoid()
        x = Tensor([[1.0, -1.0]])

        y1 = sigmoid(x)
        y2 = sigmoid.forward(x)

        assert np.allclose(y1.data, y2.data), "Direct call should match forward method"


class TestTanh:
    """Test the Tanh activation function."""

    def test_tanh_basic_functionality(self):
        """Test basic Tanh behavior"""
        tanh = Tanh()

        # Test known values
        x = Tensor([[0.0]])
        y = tanh(x)
        assert abs(y.data[0, 0] - 0.0) < 1e-6, "Tanh(0) should be 0"

    def test_tanh_range(self):
        """Test that Tanh outputs are in [-1, 1]"""
        tanh = Tanh()

        # Test wide range of inputs
        x = Tensor([[-10.0, -5.0, -1.0, 0.0, 1.0, 5.0, 10.0]])
        y = tanh(x)

        assert np.all(y.data >= -1), "Tanh outputs should be >= -1"
        assert np.all(y.data <= 1), "Tanh outputs should be <= 1"

    def test_tanh_symmetry(self):
        """Test that Tanh is symmetric: tanh(-x) = -tanh(x)"""
        tanh = Tanh()

        x = Tensor([[1.0, 2.0, 3.0]])
        x_neg = Tensor([[-1.0, -2.0, -3.0]])

        y_pos = tanh(x)
        y_neg = tanh(x_neg)

        assert np.allclose(y_neg.data, -y_pos.data), "Tanh should be symmetric"

    def test_tanh_monotonicity(self):
        """Test that Tanh is monotonically increasing"""
        tanh = Tanh()

        x = Tensor([[-3.0, -1.0, 0.0, 1.0, 3.0]])
        y = tanh(x)

        # Check that outputs are increasing
        for i in range(len(y.data[0]) - 1):
            assert y.data[0, i] < y.data[0, i + 1], "Tanh should be monotonically increasing"

    def test_tanh_extreme_values(self):
        """Test Tanh with extreme values"""
        tanh = Tanh()

        x = Tensor([[-100.0, 100.0]])
        y = tanh(x)

        # Should be close to -1 and 1 respectively
        assert abs(y.data[0, 0] - (-1.0)) < 1e-10, "Tanh(-100) should be very close to -1"
        assert abs(y.data[0, 1] - 1.0) < 1e-10, "Tanh(100) should be very close to 1"

    def test_tanh_shape_preservation(self):
        """Test that Tanh preserves tensor shape"""
        tanh = Tanh()

        shapes = [(1, 5), (2, 3), (4, 1)]
        for shape in shapes:
            x = Tensor(np.random.randn(*shape))
            y = tanh(x)
            assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"

    def test_tanh_callable(self):
        """Test that Tanh can be called directly"""
        tanh = Tanh()
        x = Tensor([[1.0, -1.0]])

        y1 = tanh(x)
        y2 = tanh.forward(x)

        assert np.allclose(y1.data, y2.data), "Direct call should match forward method"


class TestActivationComparison:
    """Test interactions and comparisons between activation functions."""

    def test_activation_consistency(self):
        """Test that all activations work with the same input"""
        relu = ReLU()
        sigmoid = Sigmoid()
        tanh = Tanh()

        x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])

        # All should process without error
        y_relu = relu(x)
        y_sigmoid = sigmoid(x)
        y_tanh = tanh(x)

        # All should preserve shape
        assert y_relu.shape == x.shape
        assert y_sigmoid.shape == x.shape
        assert y_tanh.shape == x.shape

    def test_activation_ranges(self):
        """Test that activations have expected output ranges"""
        relu = ReLU()
        sigmoid = Sigmoid()
        tanh = Tanh()

        x = Tensor([[-5.0, -2.0, 0.0, 2.0, 5.0]])

        y_relu = relu(x)
        y_sigmoid = sigmoid(x)
        y_tanh = tanh(x)

        # ReLU: [0, inf)
        assert np.all(y_relu.data >= 0), "ReLU should be non-negative"

        # Sigmoid: (0, 1)
        assert np.all(y_sigmoid.data > 0), "Sigmoid should be positive"
        assert np.all(y_sigmoid.data < 1), "Sigmoid should be less than 1"

        # Tanh: (-1, 1)
        assert np.all(y_tanh.data > -1), "Tanh should be greater than -1"
        assert np.all(y_tanh.data < 1), "Tanh should be less than 1"


# Integration tests with edge cases
class TestActivationEdgeCases:
    """Test edge cases and boundary conditions."""

    def test_zero_tensor(self):
        """Test all activations with zero tensor"""
        relu = ReLU()
        sigmoid = Sigmoid()
        tanh = Tanh()

        x = Tensor([[0.0, 0.0, 0.0]])

        y_relu = relu(x)
        y_sigmoid = sigmoid(x)
        y_tanh = tanh(x)

        assert np.allclose(y_relu.data, [0.0, 0.0, 0.0]), "ReLU(0) should be 0"
        assert np.allclose(y_sigmoid.data, [0.5, 0.5, 0.5]), "Sigmoid(0) should be 0.5"
        assert np.allclose(y_tanh.data, [0.0, 0.0, 0.0]), "Tanh(0) should be 0"

    def test_single_element_tensor(self):
        """Test all activations with single element tensor"""
        relu = ReLU()
        sigmoid = Sigmoid()
        tanh = Tanh()

        x = Tensor([[1.0]])

        y_relu = relu(x)
        y_sigmoid = sigmoid(x)
        y_tanh = tanh(x)

        assert y_relu.shape == (1, 1)
        assert y_sigmoid.shape == (1, 1)
        assert y_tanh.shape == (1, 1)

    def test_large_tensor(self):
        """Test activations with larger tensors"""
        relu = ReLU()
        sigmoid = Sigmoid()
        tanh = Tanh()

        # Create a 10x10 tensor
        x = Tensor(np.random.randn(10, 10))

        y_relu = relu(x)
        y_sigmoid = sigmoid(x)
        y_tanh = tanh(x)

        assert y_relu.shape == (10, 10)
        assert y_sigmoid.shape == (10, 10)
        assert y_tanh.shape == (10, 10)


if __name__ == "__main__":
    # Run tests with pytest
    pytest.main([__file__, "-v"])
@@ -17,15 +17,20 @@ Welcome to the Layers module! This is where neural networks begin. You'll implem
## Learning Goals
- Understand layers as functions that transform tensors: `y = f(x)`
- Implement Dense layers with linear transformations: `y = Wx + b`
- Add activation functions for nonlinearity (ReLU, Sigmoid, Tanh)
- Use activation functions from the activations module for nonlinearity
- See how neural networks are just function composition
- Build intuition before diving into training

## Build → Use → Understand
1. **Build**: Dense layers and activation functions
1. **Build**: Dense layers using activation functions as building blocks
2. **Use**: Transform tensors and see immediate results
3. **Understand**: How neural networks transform information

## Module Dependencies
This module builds on the **activations** module:
- **activations** → **layers** → **networks**
- Clean separation of concerns: math functions → layer building blocks → full networks

## Module → Package Structure
**🎓 Teaching vs. 🔧 Building**:
- **Learning side**: Work in `modules/layers/layers_dev.py`
@@ -51,6 +56,9 @@ import sys
from typing import Union, Optional, Callable
from tinytorch.core.tensor import Tensor

# Import activation functions from the activations module
from tinytorch.core.activations import ReLU, Sigmoid, Tanh

# Import our Tensor class
# sys.path.append('../../')
# from modules.tensor.tensor_dev import Tensor
@@ -203,12 +211,11 @@ try:
    print(f"Input: {x.data}")
    print(f"Output: {y.data}")

    # Test with batch of examples
    # Test with batch
    x_batch = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # Shape: (2, 3)
    y_batch = layer(x_batch)
    print(f"\nBatch input shape: {x_batch.shape}")
    print(f"Batch output shape: {y_batch.shape}")
    print(f"Batch output: {y_batch.data}")

    print("✅ Dense layer working!")

@@ -218,14 +225,20 @@ except Exception as e:

# %% [markdown]
"""
## Step 2: Activation Functions
## Step 2: Activation Functions - Adding Nonlinearity

Dense layers alone can only learn **linear** transformations. But most real-world problems need **nonlinear** transformations.
Now we'll use the activation functions from the **activations** module!

**Activation functions** add nonlinearity:
- **ReLU**: `max(0, x)` - Most common, simple and effective
- **Sigmoid**: `1 / (1 + e^(-x))` - Squashes to (0, 1)
- **Tanh**: `tanh(x)` - Squashes to (-1, 1)
**Clean Architecture**: We import the activation functions rather than redefining them:
```python
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
```

**Why this matters**:
- **Separation of concerns**: Math functions vs. layer building blocks
- **Reusability**: Activations can be used anywhere in the system
- **Maintainability**: One place to update activation implementations
- **Composability**: Clean imports make neural networks easier to build

**Why nonlinearity matters**: Without it, stacking layers is pointless!
```
@@ -234,178 +247,43 @@ Linear → NonLinear → Linear = Can learn complex patterns
```
"""

# %%
#| export
class ReLU:
    """
    ReLU Activation: f(x) = max(0, x)

    The most popular activation function in deep learning.
    Simple, effective, and computationally efficient.

    TODO: Implement ReLU activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply ReLU: f(x) = max(0, x)

        Args:
            x: Input tensor

        Returns:
            Output tensor with ReLU applied element-wise

        TODO: Implement element-wise max(0, x) operation
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        """Make activation callable: relu(x) same as relu.forward(x)"""
        return self.forward(x)

# %%
#| hide
#| export
class ReLU:
    """ReLU Activation: f(x) = max(0, x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply ReLU: f(x) = max(0, x)"""
        return Tensor(np.maximum(0, x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %%
#| export
class Sigmoid:
    """
    Sigmoid Activation: f(x) = 1 / (1 + e^(-x))

    Squashes input to range (0, 1). Often used for binary classification.

    TODO: Implement Sigmoid activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Sigmoid: f(x) = 1 / (1 + e^(-x))

        Args:
            x: Input tensor

        Returns:
            Output tensor with Sigmoid applied element-wise

        TODO: Implement sigmoid function (be careful with numerical stability!)
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %%
#| hide
#| export
class Sigmoid:
    """Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Sigmoid with numerical stability"""
        # Use the numerically stable version to avoid overflow
        # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
        # For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
        x_data = x.data
        result = np.zeros_like(x_data)

        # Stable computation
        positive_mask = x_data >= 0
        result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
        result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))

        return Tensor(result)

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %%
#| export
class Tanh:
    """
    Tanh Activation: f(x) = tanh(x)

    Squashes input to range (-1, 1). Zero-centered output.

    TODO: Implement Tanh activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Tanh: f(x) = tanh(x)

        Args:
            x: Input tensor

        Returns:
            Output tensor with Tanh applied element-wise

        TODO: Implement tanh function
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %%
#| hide
#| export
class Tanh:
    """Tanh Activation: f(x) = tanh(x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Tanh"""
        return Tensor(np.tanh(x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% [markdown]
"""
### 🧪 Test Your Activation Functions
### 🧪 Test Activation Functions from Activations Module

Once you implement the activation functions above, run this cell to test them:
Let's test that we can use the activation functions from the activations module:
"""

# %%
# Test activation functions
# Test activation functions from activations module
try:
    print("=== Testing Activation Functions ===")
    print("=== Testing Activation Functions from Activations Module ===")

    # Test data: mix of positive, negative, and zero
    x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
    print(f"Input: {x.data}")

    # Test ReLU
    # Test ReLU from activations module
    relu = ReLU()
    y_relu = relu(x)
    print(f"ReLU output: {y_relu.data}")

    # Test Sigmoid
    # Test Sigmoid from activations module
    sigmoid = Sigmoid()
    y_sigmoid = sigmoid(x)
    print(f"Sigmoid output: {y_sigmoid.data}")

    # Test Tanh
    # Test Tanh from activations module
    tanh = Tanh()
    y_tanh = tanh(x)
    print(f"Tanh output: {y_tanh.data}")

    print("✅ Activation functions working!")
    print("✅ Activation functions from activations module working!")
    print("🎉 Clean architecture: layers module uses activations module!")

except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement the activation functions above!")
    print("Make sure the activations module is properly exported!")

# %% [markdown]
"""
@@ -418,6 +296,11 @@ Input → Dense → ReLU → Dense → Sigmoid → Output
```

This is a 2-layer neural network that can learn complex nonlinear patterns!

**Notice the clean architecture**:
- Dense layers handle linear transformations
- Activation functions (from activations module) handle nonlinearity
- Composition creates complex behaviors from simple building blocks
"""

# %%
@@ -431,9 +314,9 @@ try:
    # Output: 2 neurons with Sigmoid

    layer1 = Dense(input_size=3, output_size=4)
    activation1 = ReLU()
    activation1 = ReLU()  # From activations module
    layer2 = Dense(input_size=4, output_size=2)
    activation2 = Sigmoid()
    activation2 = Sigmoid()  # From activations module

    print("Network architecture:")
    print(f"  Input: 3 features")
@@ -458,28 +341,36 @@ try:
    print(f"Output values: {output.data}")

    print("\n🎉 Neural network working! You just built your first neural network!")
    print("🏗️ Clean architecture: Dense layers + Activations module = Neural Network")
    print("Notice how the network transforms 3D input into 2D output through learned transformations.")

except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure to implement the layers and activations above!")
    print("Make sure to implement the layers and check activations module!")

# %% [markdown]
"""
## Step 4: Understanding What We Built

Congratulations! You just implemented the fundamental building blocks of neural networks:
Congratulations! You just implemented a clean, modular neural network architecture:

### 🧱 **What You Built**
1. **Dense Layer**: Linear transformation `y = Wx + b`
2. **Activation Functions**: Nonlinear transformations (ReLU, Sigmoid, Tanh)
2. **Activation Functions**: Imported from activations module (ReLU, Sigmoid, Tanh)
3. **Layer Composition**: Chaining layers to build networks

### 🏗️ **Clean Architecture Benefits**
- **Separation of concerns**: Math functions vs. layer building blocks
- **Reusability**: Activations can be used across different modules
- **Maintainability**: One place to update activation implementations
- **Composability**: Clean imports make complex networks easier to build

### 🎯 **Key Insights**
- **Layers are functions**: They transform tensors from one space to another
- **Composition creates complexity**: Simple layers → complex networks
- **Nonlinearity is crucial**: Without it, deep networks are just linear transformations
- **Neural networks are function approximators**: They learn to map inputs to outputs
- **Modular design**: Building blocks can be combined in many ways

### 🚀 **What's Next**
In the next modules, you'll learn:
@@ -498,7 +389,7 @@ Then test your implementation:
python bin/tito.py test --module layers
```

**Great job! You've built the foundation of neural networks!** 🎉
**Great job! You've built a clean, modular foundation for neural networks!** 🎉
"""

# %%
@@ -514,9 +405,9 @@ try:
    # Build a 3-layer network for digit classification
    # 784 → 128 → 64 → 10
    layer1 = Dense(input_size=image_size, output_size=128)
    relu1 = ReLU()
    relu1 = ReLU()  # From activations module
    layer2 = Dense(input_size=128, output_size=64)
    relu2 = ReLU()
    relu2 = ReLU()  # From activations module
    layer3 = Dense(input_size=64, output_size=num_classes)
    softmax = Sigmoid()  # Using Sigmoid as a simple "probability-like" output

@@ -541,8 +432,38 @@ try:
    print(f"  Sample predictions: {predictions.data[0]}")  # First image predictions

    print("\n🎉 You built a neural network that could classify images!")
    print("🏗️ Clean architecture: Dense layers + Activations module = Image Classifier")
    print("With training, this network could learn to recognize handwritten digits!")

except Exception as e:
    print(f"❌ Error: {e}")
    print("Check your layer implementations!")
    print("Check your layer implementations and activations module!")

# %% [markdown]
"""
## 🎓 Module Summary

### What You Learned
1. **Layer Architecture**: Dense layers as linear transformations
2. **Clean Dependencies**: Layers module uses activations module
3. **Function Composition**: Simple building blocks → complex networks
4. **Modular Design**: Separation of concerns for maintainable code

### Key Architectural Insight
```
activations (math functions) → layers (building blocks) → networks (applications)
```

This clean dependency graph makes the system:
- **Understandable**: Each module has a clear purpose
- **Testable**: Each module can be tested independently
- **Reusable**: Components can be used across different contexts
- **Maintainable**: Changes are localized to appropriate modules

### Next Steps
- **Training**: Learn how networks learn from data
- **Advanced Architectures**: CNNs, RNNs, Transformers
- **Applications**: Real-world machine learning problems

**Congratulations on building a clean, modular neural network foundation!** 🚀
"""
@@ -18,7 +18,11 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))

# Import from the module's development file
# Note: This imports the instructor version with full implementation
from layers_dev import Dense, ReLU, Sigmoid, Tanh, Tensor
from layers_dev import Dense, Tensor

# Import activation functions from the activations module
sys.path.insert(0, os.path.join(os.path.dirname(os.path.dirname(__file__)), '..', 'activations'))
from activations_dev import ReLU, Sigmoid, Tanh

def safe_numpy(tensor):
    """Get numpy array from tensor, using .numpy() if available, otherwise .data"""
@@ -5,7 +5,30 @@ d = { 'settings': { 'branch': 'main',
              'doc_host': 'https://tinytorch.github.io',
              'git_url': 'https://github.com/tinytorch/TinyTorch/',
              'lib_path': 'tinytorch'},
  'syms': { 'tinytorch.core.tensor': { 'tinytorch.core.tensor.Tensor': ('tensor/tensor_dev.html#tensor', 'tinytorch/core/tensor.py'),
  'syms': { 'tinytorch.core.activations': {},
            'tinytorch.core.layers': { 'tinytorch.core.layers.Dense': ('layers/layers_dev.html#dense', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Dense.__call__': ('layers/layers_dev.html#dense.__call__', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Dense.__init__': ('layers/layers_dev.html#dense.__init__', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Dense.forward': ('layers/layers_dev.html#dense.forward', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.ReLU': ('layers/layers_dev.html#relu', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.ReLU.__call__': ('layers/layers_dev.html#relu.__call__', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.ReLU.forward': ('layers/layers_dev.html#relu.forward', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Sigmoid': ('layers/layers_dev.html#sigmoid', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Sigmoid.__call__': ('layers/layers_dev.html#sigmoid.__call__', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Sigmoid.forward': ('layers/layers_dev.html#sigmoid.forward', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Tanh': ('layers/layers_dev.html#tanh', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Tanh.__call__': ('layers/layers_dev.html#tanh.__call__', 'tinytorch/core/layers.py'),
                                       'tinytorch.core.layers.Tanh.forward': ('layers/layers_dev.html#tanh.forward', 'tinytorch/core/layers.py')},
            'tinytorch.core.tensor': { 'tinytorch.core.tensor.Tensor': ('tensor/tensor_dev.html#tensor', 'tinytorch/core/tensor.py'),
                                       'tinytorch.core.tensor.Tensor.__init__': ('tensor/tensor_dev.html#tensor.__init__', 'tinytorch/core/tensor.py'),
                                       'tinytorch.core.tensor.Tensor.__repr__': ('tensor/tensor_dev.html#tensor.__repr__',
@@ -22,7 +45,21 @@ d = { 'settings': { 'branch': 'main',
                                       'tinytorch/core/tensor.py'),
                                       'tinytorch.core.tensor._add_utility_methods': ('tensor/tensor_dev.html#_add_utility_methods', 'tinytorch/core/tensor.py')},
            'tinytorch.core.utils': { 'tinytorch.core.utils.SystemInfo': ('setup/setup_dev.html#systeminfo', 'tinytorch/core/utils.py'),
            'tinytorch.core.utils': { 'tinytorch.core.utils.DeveloperProfile': ('setup/setup_dev.html#developerprofile', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile.__init__': ('setup/setup_dev.html#developerprofile.__init__', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile.__str__': ('setup/setup_dev.html#developerprofile.__str__', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile._load_default_flame': ('setup/setup_dev.html#developerprofile._load_default_flame', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile.get_ascii_art': ('setup/setup_dev.html#developerprofile.get_ascii_art', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile.get_full_profile': ('setup/setup_dev.html#developerprofile.get_full_profile', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.DeveloperProfile.get_signature': ('setup/setup_dev.html#developerprofile.get_signature', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.SystemInfo': ('setup/setup_dev.html#systeminfo', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.SystemInfo.__init__': ('setup/setup_dev.html#systeminfo.__init__', 'tinytorch/core/utils.py'),
                                      'tinytorch.core.utils.SystemInfo.__str__': ('setup/setup_dev.html#systeminfo.__str__',
58  tinytorch/core/activations.py  Normal file
@@ -0,0 +1,58 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/activations/activations_dev.py.

# %% auto 0
__all__ = ['ReLU', 'Sigmoid', 'Tanh']

# %% ../../modules/activations/activations_dev.py auto 1
import math
import numpy as np
import matplotlib.pyplot as plt
import os
import sys

# TinyTorch imports
from tinytorch.core.tensor import Tensor

# %% ../../modules/activations/activations_dev.py auto 2
class ReLU:
    """ReLU Activation: f(x) = max(0, x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply ReLU: f(x) = max(0, x)"""
        return Tensor(np.maximum(0, x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/activations/activations_dev.py auto 3
class Sigmoid:
    """Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Sigmoid with numerical stability"""
        # Use the numerically stable version to avoid overflow
        # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
        # For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
        x_data = x.data
        result = np.zeros_like(x_data)

        # Stable computation
        positive_mask = x_data >= 0
        result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
        result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))

        return Tensor(result)

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/activations/activations_dev.py auto 4
class Tanh:
    """Tanh Activation: f(x) = tanh(x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Tanh"""
        return Tensor(np.tanh(x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)
@@ -1,7 +1,7 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/layers/layers_dev.ipynb.

# %% auto 0
__all__ = ['Dense', 'ReLU', 'Sigmoid', 'Tanh']
__all__ = ['Dense']

# %% ../../modules/layers/layers_dev.ipynb 2
import numpy as np
@@ -10,6 +10,9 @@ import sys
from typing import Union, Optional, Callable
from .tensor import Tensor

# Import activation functions from the activations module
from .activations import ReLU, Sigmoid, Tanh

# Import our Tensor class
# sys.path.append('../../')
# from modules.tensor.tensor_dev import Tensor
@@ -109,130 +112,3 @@ class Dense:
    def __call__(self, x: Tensor) -> Tensor:
        """Make layer callable: layer(x) same as layer.forward(x)"""
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 9
class ReLU:
    """
    ReLU Activation: f(x) = max(0, x)

    The most popular activation function in deep learning.
    Simple, effective, and computationally efficient.

    TODO: Implement ReLU activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply ReLU: f(x) = max(0, x)

        Args:
            x: Input tensor

        Returns:
            Output tensor with ReLU applied element-wise

        TODO: Implement element-wise max(0, x) operation
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        """Make activation callable: relu(x) same as relu.forward(x)"""
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 10
class ReLU:
    """ReLU Activation: f(x) = max(0, x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply ReLU: f(x) = max(0, x)"""
        return Tensor(np.maximum(0, x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 11
class Sigmoid:
    """
    Sigmoid Activation: f(x) = 1 / (1 + e^(-x))

    Squashes input to range (0, 1). Often used for binary classification.

    TODO: Implement Sigmoid activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Sigmoid: f(x) = 1 / (1 + e^(-x))

        Args:
            x: Input tensor

        Returns:
            Output tensor with Sigmoid applied element-wise

        TODO: Implement sigmoid function (be careful with numerical stability!)
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 12
class Sigmoid:
    """Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Sigmoid with numerical stability"""
        # Use the numerically stable version to avoid overflow
        # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
        # For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
        x_data = x.data
        result = np.zeros_like(x_data)

        # Stable computation
        positive_mask = x_data >= 0
        result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
        result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))

        return Tensor(result)

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 13
class Tanh:
    """
    Tanh Activation: f(x) = tanh(x)

    Squashes input to range (-1, 1). Zero-centered output.

    TODO: Implement Tanh activation function.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Tanh: f(x) = tanh(x)

        Args:
            x: Input tensor

        Returns:
            Output tensor with Tanh applied element-wise

        TODO: Implement tanh function
        """
        raise NotImplementedError("Student implementation required")

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

# %% ../../modules/layers/layers_dev.ipynb 14
class Tanh:
    """Tanh Activation: f(x) = tanh(x)"""

    def forward(self, x: Tensor) -> Tensor:
        """Apply Tanh"""
        return Tensor(np.tanh(x.data))

    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)
@@ -1,22 +1,98 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/setup/setup_dev.ipynb.

# %% auto 0
__all__ = ['hello_tinytorch', 'add_numbers', 'SystemInfo']
__all__ = ['hello_tinytorch', 'add_numbers', 'SystemInfo', 'DeveloperProfile']

# %% ../../modules/setup/setup_dev.ipynb 3
def hello_tinytorch():
    """A simple hello world function for TinyTorch."""
    return "Hello from TinyTorch! 🔥"
    """
    A simple hello world function for TinyTorch.

    TODO: Implement this function to display TinyTorch ASCII art and welcome message.
    Load the flame art from tinytorch_flame.txt file with graceful fallback.
    """
    raise NotImplementedError("Student implementation required")

def add_numbers(a, b):
    """
    Add two numbers together.

    TODO: Implement addition of two numbers.
    This is the foundation of all mathematical operations in ML.
    """
    raise NotImplementedError("Student implementation required")

# %% ../../modules/setup/setup_dev.ipynb 4
def hello_tinytorch():
    """Display the TinyTorch ASCII art and welcome message."""
    try:
        # Get the directory containing this file
        current_dir = Path(__file__).parent
        art_file = current_dir / "tinytorch_flame.txt"

        if art_file.exists():
            with open(art_file, 'r') as f:
                ascii_art = f.read()
            print(ascii_art)
            print("Tiny🔥Torch")
            print("Build ML Systems from Scratch!")
        else:
            print("🔥 TinyTorch 🔥")
            print("Build ML Systems from Scratch!")
    except NameError:
        # Handle case when running in notebook where __file__ is not defined
        try:
            art_file = Path(os.getcwd()) / "tinytorch_flame.txt"
            if art_file.exists():
                with open(art_file, 'r') as f:
                    ascii_art = f.read()
                print(ascii_art)
                print("Tiny🔥Torch")
                print("Build ML Systems from Scratch!")
            else:
                print("🔥 TinyTorch 🔥")
                print("Build ML Systems from Scratch!")
        except:
            print("🔥 TinyTorch 🔥")
            print("Build ML Systems from Scratch!")

def add_numbers(a, b):
    """Add two numbers together."""
    return a + b

# %% ../../modules/setup/setup_dev.ipynb 6
import sys
import platform

# %% ../../modules/setup/setup_dev.ipynb 8
class SystemInfo:
    """
    Simple system information class.

    TODO: Implement this class to collect and display system information.
    """

    def __init__(self):
        """
        Initialize system information collection.

        TODO: Collect Python version, platform, and machine information.
        """
        raise NotImplementedError("Student implementation required")

    def __str__(self):
        """
        Return human-readable system information.

        TODO: Format system info as a readable string.
        """
        raise NotImplementedError("Student implementation required")

    def is_compatible(self):
        """
        Check if system meets minimum requirements.

        TODO: Check if Python version is >= 3.8
        """
        raise NotImplementedError("Student implementation required")

# %% ../../modules/setup/setup_dev.ipynb 9
class SystemInfo:
    """Simple system information class."""

@@ -32,3 +108,145 @@ class SystemInfo:
        """Check if system meets minimum requirements."""
        return self.python_version >= (3, 8)

# %% ../../modules/setup/setup_dev.ipynb 13
class DeveloperProfile:
    """
    Developer profile for personalizing TinyTorch experience.

    TODO: Implement this class to store and display developer information.
    Default to course instructor but allow students to personalize.
    """

    @staticmethod
    def _load_default_flame():
        """
        Load the default TinyTorch flame ASCII art from file.

        TODO: Implement file loading for tinytorch_flame.txt with fallback.
        """
        raise NotImplementedError("Student implementation required")

    def __init__(self, name="Vijay Janapa Reddi", affiliation="Harvard University",
                 email="vj@eecs.harvard.edu", github_username="profvjreddi", ascii_art=None):
        """
        Initialize developer profile.

        TODO: Store developer information with sensible defaults.
        Students should be able to customize this with their own info and ASCII art.
        """
        raise NotImplementedError("Student implementation required")

    def __str__(self):
        """
        Return formatted developer information.

        TODO: Format developer info as a professional signature with optional ASCII art.
        """
        raise NotImplementedError("Student implementation required")

    def get_signature(self):
        """
        Get a short signature for code headers.

        TODO: Return a concise signature like "Built by Name (@github)"
        """
        raise NotImplementedError("Student implementation required")

    def get_ascii_art(self):
        """
        Get ASCII art for the profile.

        TODO: Return custom ASCII art or default flame loaded from file.
        """
        raise NotImplementedError("Student implementation required")

# %% ../../modules/setup/setup_dev.ipynb 14
class DeveloperProfile:
    """Developer profile for personalizing TinyTorch experience."""

    @staticmethod
    def _load_default_flame():
        """Load the default TinyTorch flame ASCII art from file."""
        try:
            # Try to load from the same directory as this module
            try:
                # Try to get the directory of the current file
                current_dir = os.path.dirname(__file__)
            except NameError:
                # If __file__ is not defined (e.g., in notebook), use current directory
                current_dir = os.getcwd()

            flame_path = os.path.join(current_dir, 'tinytorch_flame.txt')

            with open(flame_path, 'r', encoding='utf-8') as f:
                flame_art = f.read()

            # Add the Tiny🔥Torch text below the flame
            return f"""{flame_art}

Tiny🔥Torch
Build ML Systems from Scratch!
"""
        except (FileNotFoundError, IOError):
            # Fallback to simple flame if file not found
            return """
🔥 TinyTorch Developer 🔥
. . . . . .
. . . . . .
. . . . . . .
. . . . . . . .
. . . . . . . . .
. . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . . .
. . . . . . . . . . . . . .
\\ \\ \\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ / / / / / /
\\ \\ \\ / / / / / /
\\ \\ / / / / / /
\\ / / / / / /
\\/ / / / / /
\\/ / / / /
\\/ / / /
\\/ / /
\\/ /
\\/

Tiny🔥Torch
Build ML Systems from Scratch!
"""

    def __init__(self, name="Vijay Janapa Reddi", affiliation="Harvard University",
                 email="vj@eecs.harvard.edu", github_username="profvjreddi", ascii_art=None):
        self.name = name
        self.affiliation = affiliation
        self.email = email
        self.github_username = github_username
        self.ascii_art = ascii_art or self._load_default_flame()

    def __str__(self):
        return f"👨‍💻 {self.name} | {self.affiliation} | @{self.github_username}"

    def get_signature(self):
        """Get a short signature for code headers."""
        return f"Built by {self.name} (@{self.github_username})"

    def get_ascii_art(self):
        """Get ASCII art for the profile."""
        return self.ascii_art

    def get_full_profile(self):
        """Get complete profile with ASCII art."""
        return f"""{self.ascii_art}

👨‍💻 Developer: {self.name}
🏛️ Affiliation: {self.affiliation}
📧 Email: {self.email}
🐙 GitHub: @{self.github_username}
🔥 Ready to build ML systems from scratch!
"""