feat: Create clean modular architecture with activations → layers separation

🏗️ Major architectural improvement implementing a clean separation of concerns:

✨ NEW: Activations Module
- Complete activations module with ReLU, Sigmoid, Tanh implementations
- Educational NBDev structure with student TODOs + instructor solutions
- Comprehensive testing suite (24 tests) with mathematical correctness validation
- Visual learning features with matplotlib plotting (disabled during testing)
- Clean export to tinytorch.core.activations

🔧 REFACTOR: Layers Module
- Removed duplicate activation function implementations
- Clean import from activations module: 'from tinytorch.core.activations import ReLU, Sigmoid, Tanh'
- Updated documentation to reflect modular architecture
- Preserved all existing functionality while improving code organization

🧪 TESTING: Comprehensive Test Coverage
- All 24 activations tests passing 
- All 17 layers tests passing 
- Integration tests verify clean architecture works end-to-end
- CLI testing with 'tito test --module' works for both modules

📦 ARCHITECTURE: Clean Dependency Graph
- activations (math functions) → layers (building blocks) → networks (applications)
- Separation of concerns: pure math vs. neural network components
- Reusable components across future modules
- Single source of truth for activation implementations

🎓 PEDAGOGY: Enhanced Learning Experience
- Week-sized chunks: students master activations, then build layers
- Clear progression from mathematical foundations to applications
- Real-world software architecture patterns
- Modular design principles in practice

This establishes the foundation for scalable, maintainable ML systems education.
Vijay Janapa Reddi
2025-07-10 21:32:25 -04:00
parent 7da85b3572
commit b47c8ef259
10 changed files with 2161 additions and 303 deletions

View File

@@ -343,7 +343,7 @@ def cmd_info(args):
def cmd_test(args):
    """Run tests for a specific module."""
    valid_modules = ["setup", "tensor", "activations", "layers", "cnn", "data", "training",
                     "profiling", "compression", "kernels", "benchmarking", "mlops"]
    if args.all:

View File

@@ -0,0 +1,237 @@
# 🔥 TinyTorch Activations Module
Welcome to the **Activations** module! This is where you'll implement the mathematical functions that give neural networks their power to learn complex patterns.
## 🎯 Learning Objectives
By the end of this module, you will:
1. **Understand** why activation functions are essential for neural networks
2. **Implement** the three most important activation functions: ReLU, Sigmoid, and Tanh
3. **Test** your functions with various inputs to understand their behavior
4. **Grasp** the mathematical properties that make each function useful
## 🧠 Why This Module Matters
**Without activation functions, neural networks are just linear transformations!**
```
Linear → Linear → Linear = Still just Linear
Linear → Activation → Linear = Can learn complex patterns!
```
This module teaches you the mathematical foundations that make deep learning possible.
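To see why, here is a tiny standalone NumPy sketch (illustrative only, not part of the module code): two stacked linear maps collapse into a single linear map, while adding a ReLU in between does not.
```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Two stacked linear layers are equivalent to one linear layer with weights W2 @ W1
stacked = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(stacked, collapsed))    # True: no extra expressive power

# Inserting a ReLU between the layers breaks the collapse
nonlinear = W2 @ np.maximum(0, W1 @ x)
print(np.allclose(nonlinear, collapsed))  # Generally False: genuinely nonlinear
```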
## 📚 What You'll Build
### 1. **ReLU** (Rectified Linear Unit)
- **Formula**: `f(x) = max(0, x)`
- **Properties**: Simple, sparse, unbounded
- **Use case**: Hidden layers (most common)
### 2. **Sigmoid**
- **Formula**: `f(x) = 1 / (1 + e^(-x))`
- **Properties**: Bounded to (0,1), smooth, probabilistic
- **Use case**: Binary classification, gates
### 3. **Tanh** (Hyperbolic Tangent)
- **Formula**: `f(x) = tanh(x)`
- **Properties**: Bounded to (-1,1), zero-centered, smooth
- **Use case**: Hidden layers, RNNs
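A handy relationship between the two bounded functions (and an easy self-check once both are implemented): tanh is just a shifted, rescaled sigmoid, `tanh(x) = 2 * sigmoid(2x) - 1`. A quick NumPy sanity check:
```python
import numpy as np

x = np.linspace(-4.0, 4.0, 9)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))   # reference sigmoid

# tanh(x) == 2 * sigmoid(2x) - 1 for all x
assert np.allclose(np.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0)
print("tanh(x) = 2*sigmoid(2x) - 1 verified")
```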
## 🚀 Getting Started
### Development Workflow
1. **Open the development file**:
```bash
python bin/tito.py jupyter
# Then open modules/activations/activations_dev.py
```
2. **Implement the functions**:
- Start with ReLU (simplest)
- Move to Sigmoid (numerical stability challenge)
- Finish with Tanh (symmetry properties)
3. **Visualize your functions**:
- Each function has plotting sections
- See how your implementation transforms inputs
- Compare all functions side-by-side
4. **Test as you go**:
```bash
python bin/tito.py test --module activations
```
5. **Export to package**:
```bash
python bin/tito.py sync
```
### 📊 Visual Learning Features
This module includes comprehensive plotting sections to help you understand:
- **Individual Function Plots**: See each activation function's curve
- **Implementation Comparison**: Your implementation vs ideal side-by-side
- **Mathematical Explanations**: Visual breakdown of function properties
- **Error Analysis**: Quantitative feedback on implementation accuracy
- **Comprehensive Comparison**: All functions analyzed together
**Enhanced Features**:
- **4-Panel Plots**: Implementation vs ideal, mathematical definition, properties, error analysis
- **Real-time Feedback**: Immediate accuracy scores with color-coded status
- **Mathematical Insights**: Detailed explanations of function properties
- **Numerical Stability Testing**: Verification with extreme values
- **Property Verification**: Symmetry, monotonicity, and zero-centering tests
**Why enhanced plots matter**:
- **Visual Debugging**: See exactly where your implementation differs
- **Quantitative Feedback**: Get precise error measurements
- **Mathematical Understanding**: Connect formulas to visual behavior
- **Implementation Confidence**: Know immediately if your code is correct
- **Learning Reinforcement**: Multiple visual perspectives of the same concept
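If you want a quick sanity check outside the built-in plotting cells, a minimal hand-rolled comparison against NumPy's reference might look like this (a sketch; it assumes you have already synced your implementation to `tinytorch.core.activations`):
```python
import numpy as np
import matplotlib.pyplot as plt

from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import Tanh

x = np.linspace(-5, 5, 200)
yours = Tanh()(Tensor(x.reshape(1, -1))).data.flatten()
ideal = np.tanh(x)                     # reference implementation

plt.plot(x, ideal, label="np.tanh (reference)")
plt.plot(x, yours, "--", label="your Tanh")
plt.title(f"max abs error: {np.max(np.abs(yours - ideal)):.2e}")
plt.legend()
plt.show()
```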
### Implementation Tips
#### ReLU Implementation
```python
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.maximum(0, x.data))
```
#### Sigmoid Implementation (Numerical Stability)
```python
def forward(self, x: Tensor) -> Tensor:
    # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
    # For x < 0:  sigmoid(x) = exp(x) / (1 + exp(x))
    x_data = x.data
    result = np.zeros_like(x_data)
    positive_mask = x_data >= 0
    result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
    result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
    return Tensor(result)
```
#### Tanh Implementation
```python
def forward(self, x: Tensor) -> Tensor:
    return Tensor(np.tanh(x.data))
```
## 🧪 Testing Your Implementation
### Unit Tests
```bash
python bin/tito.py test --module activations
```
**Test Coverage**:
- ✅ Mathematical correctness
- ✅ Numerical stability
- ✅ Shape preservation
- ✅ Edge cases
- ✅ Function properties
### Manual Testing
```python
# Test all activations
from tinytorch.core.tensor import Tensor
from modules.activations.activations_dev import ReLU, Sigmoid, Tanh
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
print("Input:", x.data)
print("ReLU:", relu(x).data)
print("Sigmoid:", sigmoid(x).data)
print("Tanh:", tanh(x).data)
```
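For reference, the output should look approximately like this (exact spacing depends on your Tensor's `data` dtype and NumPy's print settings):
```
Input: [[-2. -1.  0.  1.  2.]]
ReLU: [[0. 0. 0. 1. 2.]]
Sigmoid: [[0.1192 0.2689 0.5    0.7311 0.8808]]
Tanh: [[-0.964  -0.7616  0.      0.7616  0.964 ]]
```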
## 📊 Understanding Function Properties
### Range Comparison
| Function | Input Range | Output Range | Zero Point |
|----------|-------------|--------------|------------|
| ReLU | (-∞, ∞) | [0, ∞) | f(0) = 0 |
| Sigmoid | (-∞, ∞) | (0, 1) | f(0) = 0.5 |
| Tanh | (-∞, ∞) | (-1, 1) | f(0) = 0 |
### Key Properties
- **ReLU**: Sparse (zeros out negatives), unbounded, simple
- **Sigmoid**: Probabilistic (0-1 range), smooth, saturating
- **Tanh**: Zero-centered, symmetric, stronger gradients than sigmoid
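These properties are easy to verify programmatically; here is a short check against the table above (a sketch, assuming the module has been synced to `tinytorch.core.activations`):
```python
import numpy as np
from tinytorch.core.tensor import Tensor
from tinytorch.core.activations import ReLU, Sigmoid, Tanh

x = Tensor(np.linspace(-10, 10, 101).reshape(1, -1))
relu, sigmoid, tanh = ReLU(), Sigmoid(), Tanh()

assert np.all(relu(x).data >= 0)                                # ReLU range: [0, inf)
assert np.all((sigmoid(x).data > 0) & (sigmoid(x).data < 1))    # Sigmoid range: (0, 1)
assert abs(sigmoid(Tensor([[0.0]])).data[0, 0] - 0.5) < 1e-6    # Sigmoid zero point: f(0) = 0.5
assert np.allclose(tanh(x).data, -tanh(Tensor(-x.data)).data)   # Tanh symmetry: tanh(-x) = -tanh(x)
print("All property checks passed ✅")
```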
## 🔧 Integration with TinyTorch
After implementation, your activations will be available as:
```python
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
# Use in neural networks
relu = ReLU()
output = relu(input_tensor)
```
## 🎯 Common Issues & Solutions
### Issue 1: Sigmoid Overflow
**Problem**: `exp()` overflow with large inputs
**Solution**: Use numerically stable implementation (see code above)
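To see the failure mode directly, here is a small standalone NumPy comparison of a naive sigmoid against the stable formulation (illustrative, independent of the TinyTorch classes):
```python
import numpy as np

def naive_sigmoid(x):
    # exp(-x) overflows for large negative x (e.g. x = -1000)
    return 1.0 / (1.0 + np.exp(-x))

def stable_sigmoid(x):
    # Branch on the sign so the exponent is always non-positive
    out = np.empty_like(x, dtype=float)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    out[~pos] = np.exp(x[~pos]) / (1.0 + np.exp(x[~pos]))
    return out

x = np.array([-1000.0, 0.0, 1000.0])
print(naive_sigmoid(x))   # emits an overflow RuntimeWarning for the -1000 entry
print(stable_sigmoid(x))  # [0.  0.5 1. ] with no warning
```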
### Issue 2: Wrong Output Range
**Problem**: Sigmoid/Tanh outputs outside expected range
**Solution**: Check your mathematical implementation
### Issue 3: Shape Mismatch
**Problem**: Output shape differs from input shape
**Solution**: Ensure element-wise operations preserve shape
### Issue 4: Import Errors
**Problem**: Cannot import after implementation
**Solution**: Run `python bin/tito.py sync` to export to package
## 📈 Performance Considerations
- **ReLU**: Fastest (simple max operation)
- **Sigmoid**: Moderate (exponential computation)
- **Tanh**: Moderate (hyperbolic function)
All implementations use NumPy for vectorized operations.
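For a rough sense of the relative cost (numbers vary by machine; this is only a quick sketch):
```python
import timeit
import numpy as np

x = np.random.randn(1_000_000)

for name, fn in [
    ("ReLU",    lambda: np.maximum(0, x)),
    ("Sigmoid", lambda: 1.0 / (1.0 + np.exp(-x))),
    ("Tanh",    lambda: np.tanh(x)),
]:
    seconds = timeit.timeit(fn, number=100)
    print(f"{name:8s} {seconds:.3f} s for 100 calls on 1M elements")
```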
## 🚀 What's Next
After mastering activations, you'll use them in:
1. **Layers Module**: Building neural network layers
2. **Loss Functions**: Computing training objectives
3. **Advanced Architectures**: CNNs, RNNs, and more
These functions are the mathematical foundation for everything that follows!
## 📚 Further Reading
**Mathematical Background**:
- [Activation Functions in Neural Networks](https://en.wikipedia.org/wiki/Activation_function)
- [Deep Learning Book - Chapter 6](http://www.deeplearningbook.org/)
**Advanced Topics**:
- ReLU variants (Leaky ReLU, ELU, Swish)
- Activation function choice and impact
- Gradient flow and vanishing gradients
## 🎉 Success Criteria
You've mastered this module when:
- [ ] All tests pass (`python bin/tito.py test --module activations`)
- [ ] You understand why each function is useful
- [ ] You can explain the mathematical properties
- [ ] You can use activations in neural networks
- [ ] You appreciate the importance of nonlinearity
**Great work! You've built the mathematical foundation of neural networks!** 🎉

File diff suppressed because it is too large

View File

@@ -0,0 +1,345 @@
"""
Test suite for the TinyTorch Activations module.
This test suite validates the mathematical correctness of activation functions:
- ReLU: f(x) = max(0, x)
- Sigmoid: f(x) = 1 / (1 + e^(-x))
- Tanh: f(x) = tanh(x)
Tests focus on:
1. Mathematical correctness
2. Numerical stability
3. Edge cases
4. Shape preservation
5. Type consistency
"""
import pytest
import numpy as np
import math
from tinytorch.core.tensor import Tensor
# Import the activation functions
import sys
import os
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
from activations_dev import ReLU, Sigmoid, Tanh
class TestReLU:
"""Test the ReLU activation function."""
def test_relu_basic_functionality(self):
"""Test basic ReLU behavior: max(0, x)"""
relu = ReLU()
# Test mixed positive/negative values
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
y = relu(x)
expected = np.array([[0.0, 0.0, 0.0, 1.0, 2.0]])
assert np.allclose(y.data, expected), f"Expected {expected}, got {y.data}"
def test_relu_all_positive(self):
"""Test ReLU with all positive values (should be unchanged)"""
relu = ReLU()
x = Tensor([[1.0, 2.5, 3.7, 10.0]])
y = relu(x)
assert np.allclose(y.data, x.data), "ReLU should preserve positive values"
def test_relu_all_negative(self):
"""Test ReLU with all negative values (should be zeros)"""
relu = ReLU()
x = Tensor([[-1.0, -2.5, -3.7, -10.0]])
y = relu(x)
expected = np.zeros_like(x.data)
assert np.allclose(y.data, expected), "ReLU should zero out negative values"
def test_relu_zero_input(self):
"""Test ReLU with zero input"""
relu = ReLU()
x = Tensor([[0.0]])
y = relu(x)
assert y.data[0, 0] == 0.0, "ReLU(0) should be 0"
def test_relu_shape_preservation(self):
"""Test that ReLU preserves tensor shape"""
relu = ReLU()
# Test different shapes
shapes = [(1, 5), (2, 3), (4, 1), (3, 3)]
for shape in shapes:
x = Tensor(np.random.randn(*shape))
y = relu(x)
assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"
def test_relu_callable(self):
"""Test that ReLU can be called directly"""
relu = ReLU()
x = Tensor([[1.0, -1.0]])
y1 = relu(x)
y2 = relu.forward(x)
assert np.allclose(y1.data, y2.data), "Direct call should match forward method"
class TestSigmoid:
"""Test the Sigmoid activation function."""
def test_sigmoid_basic_functionality(self):
"""Test basic Sigmoid behavior"""
sigmoid = Sigmoid()
# Test known values
x = Tensor([[0.0]])
y = sigmoid(x)
assert abs(y.data[0, 0] - 0.5) < 1e-6, "Sigmoid(0) should be 0.5"
def test_sigmoid_range(self):
"""Test that Sigmoid outputs are in (0, 1)"""
sigmoid = Sigmoid()
# Test wide range of inputs
x = Tensor([[-10.0, -5.0, -1.0, 0.0, 1.0, 5.0, 10.0]])
y = sigmoid(x)
assert np.all(y.data > 0), "Sigmoid outputs should be > 0"
assert np.all(y.data < 1), "Sigmoid outputs should be < 1"
def test_sigmoid_numerical_stability(self):
"""Test Sigmoid with extreme values (numerical stability)"""
sigmoid = Sigmoid()
# Test extreme values that could cause overflow
x = Tensor([[-100.0, -50.0, 50.0, 100.0]])
y = sigmoid(x)
# Should not contain NaN or inf
assert not np.any(np.isnan(y.data)), "Sigmoid should not produce NaN"
assert not np.any(np.isinf(y.data)), "Sigmoid should not produce inf"
# Should be close to 0 for very negative, close to 1 for very positive
assert y.data[0, 0] < 1e-10, "Sigmoid(-100) should be very close to 0"
assert y.data[0, 1] < 1e-10, "Sigmoid(-50) should be very close to 0"
assert y.data[0, 2] > 1 - 1e-10, "Sigmoid(50) should be very close to 1"
assert y.data[0, 3] > 1 - 1e-10, "Sigmoid(100) should be very close to 1"
def test_sigmoid_monotonicity(self):
"""Test that Sigmoid is monotonically increasing"""
sigmoid = Sigmoid()
x = Tensor([[-3.0, -1.0, 0.0, 1.0, 3.0]])
y = sigmoid(x)
# Check that outputs are increasing
for i in range(len(y.data[0]) - 1):
assert y.data[0, i] < y.data[0, i + 1], "Sigmoid should be monotonically increasing"
def test_sigmoid_shape_preservation(self):
"""Test that Sigmoid preserves tensor shape"""
sigmoid = Sigmoid()
shapes = [(1, 5), (2, 3), (4, 1)]
for shape in shapes:
x = Tensor(np.random.randn(*shape))
y = sigmoid(x)
assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"
def test_sigmoid_callable(self):
"""Test that Sigmoid can be called directly"""
sigmoid = Sigmoid()
x = Tensor([[1.0, -1.0]])
y1 = sigmoid(x)
y2 = sigmoid.forward(x)
assert np.allclose(y1.data, y2.data), "Direct call should match forward method"
class TestTanh:
"""Test the Tanh activation function."""
def test_tanh_basic_functionality(self):
"""Test basic Tanh behavior"""
tanh = Tanh()
# Test known values
x = Tensor([[0.0]])
y = tanh(x)
assert abs(y.data[0, 0] - 0.0) < 1e-6, "Tanh(0) should be 0"
def test_tanh_range(self):
"""Test that Tanh outputs are in [-1, 1]"""
tanh = Tanh()
# Test wide range of inputs
x = Tensor([[-10.0, -5.0, -1.0, 0.0, 1.0, 5.0, 10.0]])
y = tanh(x)
assert np.all(y.data >= -1), "Tanh outputs should be >= -1"
assert np.all(y.data <= 1), "Tanh outputs should be <= 1"
def test_tanh_symmetry(self):
"""Test that Tanh is symmetric: tanh(-x) = -tanh(x)"""
tanh = Tanh()
x = Tensor([[1.0, 2.0, 3.0]])
x_neg = Tensor([[-1.0, -2.0, -3.0]])
y_pos = tanh(x)
y_neg = tanh(x_neg)
assert np.allclose(y_neg.data, -y_pos.data), "Tanh should be symmetric"
def test_tanh_monotonicity(self):
"""Test that Tanh is monotonically increasing"""
tanh = Tanh()
x = Tensor([[-3.0, -1.0, 0.0, 1.0, 3.0]])
y = tanh(x)
# Check that outputs are increasing
for i in range(len(y.data[0]) - 1):
assert y.data[0, i] < y.data[0, i + 1], "Tanh should be monotonically increasing"
def test_tanh_extreme_values(self):
"""Test Tanh with extreme values"""
tanh = Tanh()
x = Tensor([[-100.0, 100.0]])
y = tanh(x)
# Should be close to -1 and 1 respectively
assert abs(y.data[0, 0] - (-1.0)) < 1e-10, "Tanh(-100) should be very close to -1"
assert abs(y.data[0, 1] - 1.0) < 1e-10, "Tanh(100) should be very close to 1"
def test_tanh_shape_preservation(self):
"""Test that Tanh preserves tensor shape"""
tanh = Tanh()
shapes = [(1, 5), (2, 3), (4, 1)]
for shape in shapes:
x = Tensor(np.random.randn(*shape))
y = tanh(x)
assert y.shape == x.shape, f"Shape mismatch: expected {x.shape}, got {y.shape}"
def test_tanh_callable(self):
"""Test that Tanh can be called directly"""
tanh = Tanh()
x = Tensor([[1.0, -1.0]])
y1 = tanh(x)
y2 = tanh.forward(x)
assert np.allclose(y1.data, y2.data), "Direct call should match forward method"
class TestActivationComparison:
"""Test interactions and comparisons between activation functions."""
def test_activation_consistency(self):
"""Test that all activations work with the same input"""
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
# All should process without error
y_relu = relu(x)
y_sigmoid = sigmoid(x)
y_tanh = tanh(x)
# All should preserve shape
assert y_relu.shape == x.shape
assert y_sigmoid.shape == x.shape
assert y_tanh.shape == x.shape
def test_activation_ranges(self):
"""Test that activations have expected output ranges"""
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
x = Tensor([[-5.0, -2.0, 0.0, 2.0, 5.0]])
y_relu = relu(x)
y_sigmoid = sigmoid(x)
y_tanh = tanh(x)
# ReLU: [0, inf)
assert np.all(y_relu.data >= 0), "ReLU should be non-negative"
# Sigmoid: (0, 1)
assert np.all(y_sigmoid.data > 0), "Sigmoid should be positive"
assert np.all(y_sigmoid.data < 1), "Sigmoid should be less than 1"
# Tanh: (-1, 1)
assert np.all(y_tanh.data > -1), "Tanh should be greater than -1"
assert np.all(y_tanh.data < 1), "Tanh should be less than 1"
# Integration tests with edge cases
class TestActivationEdgeCases:
"""Test edge cases and boundary conditions."""
def test_zero_tensor(self):
"""Test all activations with zero tensor"""
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
x = Tensor([[0.0, 0.0, 0.0]])
y_relu = relu(x)
y_sigmoid = sigmoid(x)
y_tanh = tanh(x)
assert np.allclose(y_relu.data, [0.0, 0.0, 0.0]), "ReLU(0) should be 0"
assert np.allclose(y_sigmoid.data, [0.5, 0.5, 0.5]), "Sigmoid(0) should be 0.5"
assert np.allclose(y_tanh.data, [0.0, 0.0, 0.0]), "Tanh(0) should be 0"
def test_single_element_tensor(self):
"""Test all activations with single element tensor"""
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
x = Tensor([[1.0]])
y_relu = relu(x)
y_sigmoid = sigmoid(x)
y_tanh = tanh(x)
assert y_relu.shape == (1, 1)
assert y_sigmoid.shape == (1, 1)
assert y_tanh.shape == (1, 1)
def test_large_tensor(self):
"""Test activations with larger tensors"""
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
# Create a 10x10 tensor
x = Tensor(np.random.randn(10, 10))
y_relu = relu(x)
y_sigmoid = sigmoid(x)
y_tanh = tanh(x)
assert y_relu.shape == (10, 10)
assert y_sigmoid.shape == (10, 10)
assert y_tanh.shape == (10, 10)
if __name__ == "__main__":
# Run tests with pytest
pytest.main([__file__, "-v"])

View File

@@ -17,15 +17,20 @@ Welcome to the Layers module! This is where neural networks begin. You'll implem
## Learning Goals
- Understand layers as functions that transform tensors: `y = f(x)`
- Implement Dense layers with linear transformations: `y = Wx + b`
- Use activation functions from the activations module for nonlinearity
- See how neural networks are just function composition
- Build intuition before diving into training
## Build → Use → Understand
1. **Build**: Dense layers using activation functions as building blocks
2. **Use**: Transform tensors and see immediate results
3. **Understand**: How neural networks transform information
## Module Dependencies
This module builds on the **activations** module:
- **activations** → **layers** → **networks**
- Clean separation of concerns: math functions → layer building blocks → full networks
## Module → Package Structure
**🎓 Teaching vs. 🔧 Building**:
- **Learning side**: Work in `modules/layers/layers_dev.py`
@@ -51,6 +56,9 @@ import sys
from typing import Union, Optional, Callable
from tinytorch.core.tensor import Tensor
# Import activation functions from the activations module
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
# Import our Tensor class
# sys.path.append('../../')
# from modules.tensor.tensor_dev import Tensor
@@ -203,12 +211,11 @@ try:
print(f"Input: {x.data}") print(f"Input: {x.data}")
print(f"Output: {y.data}") print(f"Output: {y.data}")
# Test with batch of examples # Test with batch
x_batch = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) # Shape: (2, 3) x_batch = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) # Shape: (2, 3)
y_batch = layer(x_batch) y_batch = layer(x_batch)
print(f"\nBatch input shape: {x_batch.shape}") print(f"\nBatch input shape: {x_batch.shape}")
print(f"Batch output shape: {y_batch.shape}") print(f"Batch output shape: {y_batch.shape}")
print(f"Batch output: {y_batch.data}")
print("✅ Dense layer working!") print("✅ Dense layer working!")
@@ -218,14 +225,20 @@ except Exception as e:
# %% [markdown]
"""
## Step 2: Activation Functions - Adding Nonlinearity
Now we'll use the activation functions from the **activations** module!
**Clean Architecture**: We import the activation functions rather than redefining them:
```python
from tinytorch.core.activations import ReLU, Sigmoid, Tanh
```
**Why this matters**:
- **Separation of concerns**: Math functions vs. layer building blocks
- **Reusability**: Activations can be used anywhere in the system
- **Maintainability**: One place to update activation implementations
- **Composability**: Clean imports make neural networks easier to build
**Why nonlinearity matters**: Without it, stacking layers is pointless!
```
@@ -234,178 +247,43 @@ Linear → NonLinear → Linear = Can learn complex patterns
```
"""
# %%
#| export
class ReLU:
"""
ReLU Activation: f(x) = max(0, x)
The most popular activation function in deep learning.
Simple, effective, and computationally efficient.
TODO: Implement ReLU activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply ReLU: f(x) = max(0, x)
Args:
x: Input tensor
Returns:
Output tensor with ReLU applied element-wise
TODO: Implement element-wise max(0, x) operation
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
"""Make activation callable: relu(x) same as relu.forward(x)"""
return self.forward(x)
# %%
#| hide
#| export
class ReLU:
"""ReLU Activation: f(x) = max(0, x)"""
def forward(self, x: Tensor) -> Tensor:
"""Apply ReLU: f(x) = max(0, x)"""
return Tensor(np.maximum(0, x.data))
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %%
#| export
class Sigmoid:
"""
Sigmoid Activation: f(x) = 1 / (1 + e^(-x))
Squashes input to range (0, 1). Often used for binary classification.
TODO: Implement Sigmoid activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply Sigmoid: f(x) = 1 / (1 + e^(-x))
Args:
x: Input tensor
Returns:
Output tensor with Sigmoid applied element-wise
TODO: Implement sigmoid function (be careful with numerical stability!)
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %%
#| hide
#| export
class Sigmoid:
"""Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""
def forward(self, x: Tensor) -> Tensor:
"""Apply Sigmoid with numerical stability"""
# Use the numerically stable version to avoid overflow
# For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
# For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
x_data = x.data
result = np.zeros_like(x_data)
# Stable computation
positive_mask = x_data >= 0
result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
return Tensor(result)
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %%
#| export
class Tanh:
"""
Tanh Activation: f(x) = tanh(x)
Squashes input to range (-1, 1). Zero-centered output.
TODO: Implement Tanh activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply Tanh: f(x) = tanh(x)
Args:
x: Input tensor
Returns:
Output tensor with Tanh applied element-wise
TODO: Implement tanh function
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %%
#| hide
#| export
class Tanh:
"""Tanh Activation: f(x) = tanh(x)"""
def forward(self, x: Tensor) -> Tensor:
"""Apply Tanh"""
return Tensor(np.tanh(x.data))
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %% [markdown]
"""
### 🧪 Test Activation Functions from Activations Module
Let's test that we can use the activation functions from the activations module:
"""
# %%
# Test activation functions from activations module
try:
print("=== Testing Activation Functions from Activations Module ===")
# Test data: mix of positive, negative, and zero
x = Tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
print(f"Input: {x.data}")
# Test ReLU from activations module
relu = ReLU()
y_relu = relu(x)
print(f"ReLU output: {y_relu.data}")
# Test Sigmoid from activations module
sigmoid = Sigmoid()
y_sigmoid = sigmoid(x)
print(f"Sigmoid output: {y_sigmoid.data}")
# Test Tanh from activations module
tanh = Tanh()
y_tanh = tanh(x)
print(f"Tanh output: {y_tanh.data}")
print("✅ Activation functions from activations module working!")
print("🎉 Clean architecture: layers module uses activations module!")
except Exception as e:
print(f"❌ Error: {e}")
print("Make sure the activations module is properly exported!")
# %% [markdown]
"""
@@ -418,6 +296,11 @@ Input → Dense → ReLU → Dense → Sigmoid → Output
```
This is a 2-layer neural network that can learn complex nonlinear patterns!
**Notice the clean architecture**:
- Dense layers handle linear transformations
- Activation functions (from activations module) handle nonlinearity
- Composition creates complex behaviors from simple building blocks
"""
# %%
@@ -431,9 +314,9 @@ try:
# Output: 2 neurons with Sigmoid
layer1 = Dense(input_size=3, output_size=4)
activation1 = ReLU()  # From activations module
layer2 = Dense(input_size=4, output_size=2)
activation2 = Sigmoid()  # From activations module
print("Network architecture:")
print(f" Input: 3 features")
@@ -458,28 +341,36 @@ try:
print(f"Output values: {output.data}") print(f"Output values: {output.data}")
print("\n🎉 Neural network working! You just built your first neural network!") print("\n🎉 Neural network working! You just built your first neural network!")
print("🏗️ Clean architecture: Dense layers + Activations module = Neural Network")
print("Notice how the network transforms 3D input into 2D output through learned transformations.") print("Notice how the network transforms 3D input into 2D output through learned transformations.")
except Exception as e: except Exception as e:
print(f"❌ Error: {e}") print(f"❌ Error: {e}")
print("Make sure to implement the layers and activations above!") print("Make sure to implement the layers and check activations module!")
# %% [markdown] # %% [markdown]
""" """
## Step 4: Understanding What We Built ## Step 4: Understanding What We Built
Congratulations! You just implemented the fundamental building blocks of neural networks: Congratulations! You just implemented a clean, modular neural network architecture:
### 🧱 **What You Built** ### 🧱 **What You Built**
1. **Dense Layer**: Linear transformation `y = Wx + b` 1. **Dense Layer**: Linear transformation `y = Wx + b`
2. **Activation Functions**: Nonlinear transformations (ReLU, Sigmoid, Tanh) 2. **Activation Functions**: Imported from activations module (ReLU, Sigmoid, Tanh)
3. **Layer Composition**: Chaining layers to build networks 3. **Layer Composition**: Chaining layers to build networks
### 🏗️ **Clean Architecture Benefits**
- **Separation of concerns**: Math functions vs. layer building blocks
- **Reusability**: Activations can be used across different modules
- **Maintainability**: One place to update activation implementations
- **Composability**: Clean imports make complex networks easier to build
### 🎯 **Key Insights** ### 🎯 **Key Insights**
- **Layers are functions**: They transform tensors from one space to another - **Layers are functions**: They transform tensors from one space to another
- **Composition creates complexity**: Simple layers → complex networks - **Composition creates complexity**: Simple layers → complex networks
- **Nonlinearity is crucial**: Without it, deep networks are just linear transformations - **Nonlinearity is crucial**: Without it, deep networks are just linear transformations
- **Neural networks are function approximators**: They learn to map inputs to outputs - **Neural networks are function approximators**: They learn to map inputs to outputs
- **Modular design**: Building blocks can be combined in many ways
### 🚀 **What's Next** ### 🚀 **What's Next**
In the next modules, you'll learn: In the next modules, you'll learn:
@@ -498,7 +389,7 @@ Then test your implementation:
python bin/tito.py test --module layers
```
**Great job! You've built a clean, modular foundation for neural networks!** 🎉
"""
# %%
@@ -514,9 +405,9 @@ try:
# Build a 3-layer network for digit classification
# 784 → 128 → 64 → 10
layer1 = Dense(input_size=image_size, output_size=128)
relu1 = ReLU()  # From activations module
layer2 = Dense(input_size=128, output_size=64)
relu2 = ReLU()  # From activations module
layer3 = Dense(input_size=64, output_size=num_classes)
softmax = Sigmoid()  # Using Sigmoid as a simple "probability-like" output
@@ -541,8 +432,38 @@ try:
print(f" Sample predictions: {predictions.data[0]}") # First image predictions print(f" Sample predictions: {predictions.data[0]}") # First image predictions
print("\n🎉 You built a neural network that could classify images!") print("\n🎉 You built a neural network that could classify images!")
print("🏗️ Clean architecture: Dense layers + Activations module = Image Classifier")
print("With training, this network could learn to recognize handwritten digits!") print("With training, this network could learn to recognize handwritten digits!")
except Exception as e: except Exception as e:
print(f"❌ Error: {e}") print(f"❌ Error: {e}")
print("Check your layer implementations!") print("Check your layer implementations and activations module!")
# %% [markdown]
"""
## 🎓 Module Summary
### What You Learned
1. **Layer Architecture**: Dense layers as linear transformations
2. **Clean Dependencies**: Layers module uses activations module
3. **Function Composition**: Simple building blocks → complex networks
4. **Modular Design**: Separation of concerns for maintainable code
### Key Architectural Insight
```
activations (math functions) → layers (building blocks) → networks (applications)
```
This clean dependency graph makes the system:
- **Understandable**: Each module has a clear purpose
- **Testable**: Each module can be tested independently
- **Reusable**: Components can be used across different contexts
- **Maintainable**: Changes are localized to appropriate modules
### Next Steps
- **Training**: Learn how networks learn from data
- **Advanced Architectures**: CNNs, RNNs, Transformers
- **Applications**: Real-world machine learning problems
**Congratulations on building a clean, modular neural network foundation!** 🚀
"""

View File

@@ -18,7 +18,11 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))
# Import from the module's development file
# Note: This imports the instructor version with full implementation
from layers_dev import Dense, Tensor
# Import activation functions from the activations module
sys.path.insert(0, os.path.join(os.path.dirname(os.path.dirname(__file__)), '..', 'activations'))
from activations_dev import ReLU, Sigmoid, Tanh
def safe_numpy(tensor):
"""Get numpy array from tensor, using .numpy() if available, otherwise .data"""

View File

@@ -5,7 +5,30 @@ d = { 'settings': { 'branch': 'main',
'doc_host': 'https://tinytorch.github.io',
'git_url': 'https://github.com/tinytorch/TinyTorch/',
'lib_path': 'tinytorch'},
'syms': { 'tinytorch.core.activations': {},
'tinytorch.core.layers': { 'tinytorch.core.layers.Dense': ('layers/layers_dev.html#dense', 'tinytorch/core/layers.py'),
'tinytorch.core.layers.Dense.__call__': ( 'layers/layers_dev.html#dense.__call__',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Dense.__init__': ( 'layers/layers_dev.html#dense.__init__',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Dense.forward': ( 'layers/layers_dev.html#dense.forward',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.ReLU': ('layers/layers_dev.html#relu', 'tinytorch/core/layers.py'),
'tinytorch.core.layers.ReLU.__call__': ( 'layers/layers_dev.html#relu.__call__',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.ReLU.forward': ( 'layers/layers_dev.html#relu.forward',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Sigmoid': ('layers/layers_dev.html#sigmoid', 'tinytorch/core/layers.py'),
'tinytorch.core.layers.Sigmoid.__call__': ( 'layers/layers_dev.html#sigmoid.__call__',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Sigmoid.forward': ( 'layers/layers_dev.html#sigmoid.forward',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Tanh': ('layers/layers_dev.html#tanh', 'tinytorch/core/layers.py'),
'tinytorch.core.layers.Tanh.__call__': ( 'layers/layers_dev.html#tanh.__call__',
'tinytorch/core/layers.py'),
'tinytorch.core.layers.Tanh.forward': ( 'layers/layers_dev.html#tanh.forward',
'tinytorch/core/layers.py')},
'tinytorch.core.tensor': { 'tinytorch.core.tensor.Tensor': ('tensor/tensor_dev.html#tensor', 'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__init__': ( 'tensor/tensor_dev.html#tensor.__init__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__repr__': ( 'tensor/tensor_dev.html#tensor.__repr__',
@@ -22,7 +45,21 @@ d = { 'settings': { 'branch': 'main',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor._add_utility_methods': ( 'tensor/tensor_dev.html#_add_utility_methods',
'tinytorch/core/tensor.py')},
'tinytorch.core.utils': { 'tinytorch.core.utils.DeveloperProfile': ( 'setup/setup_dev.html#developerprofile',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile.__init__': ( 'setup/setup_dev.html#developerprofile.__init__',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile.__str__': ( 'setup/setup_dev.html#developerprofile.__str__',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile._load_default_flame': ( 'setup/setup_dev.html#developerprofile._load_default_flame',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile.get_ascii_art': ( 'setup/setup_dev.html#developerprofile.get_ascii_art',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile.get_full_profile': ( 'setup/setup_dev.html#developerprofile.get_full_profile',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.DeveloperProfile.get_signature': ( 'setup/setup_dev.html#developerprofile.get_signature',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.SystemInfo': ('setup/setup_dev.html#systeminfo', 'tinytorch/core/utils.py'),
'tinytorch.core.utils.SystemInfo.__init__': ( 'setup/setup_dev.html#systeminfo.__init__',
'tinytorch/core/utils.py'),
'tinytorch.core.utils.SystemInfo.__str__': ( 'setup/setup_dev.html#systeminfo.__str__',

View File

@@ -0,0 +1,58 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/activations/activations_dev.py.
# %% auto 0
__all__ = ['ReLU', 'Sigmoid', 'Tanh']
# %% ../../modules/activations/activations_dev.py auto 1
import math
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
# TinyTorch imports
from tinytorch.core.tensor import Tensor
# %% ../../modules/activations/activations_dev.py auto 2
class ReLU:
    """ReLU Activation: f(x) = max(0, x)"""
    def forward(self, x: Tensor) -> Tensor:
        """Apply ReLU: f(x) = max(0, x)"""
        return Tensor(np.maximum(0, x.data))
    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)
# %% ../../modules/activations/activations_dev.py auto 3
class Sigmoid:
    """Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""
    def forward(self, x: Tensor) -> Tensor:
        """Apply Sigmoid with numerical stability"""
        # Use the numerically stable version to avoid overflow
        # For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
        # For x < 0:  sigmoid(x) = exp(x) / (1 + exp(x))
        x_data = x.data
        result = np.zeros_like(x_data)
        # Stable computation
        positive_mask = x_data >= 0
        result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
        result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
        return Tensor(result)
    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)
# %% ../../modules/activations/activations_dev.py auto 4
class Tanh:
    """Tanh Activation: f(x) = tanh(x)"""
    def forward(self, x: Tensor) -> Tensor:
        """Apply Tanh"""
        return Tensor(np.tanh(x.data))
    def __call__(self, x: Tensor) -> Tensor:
        return self.forward(x)

View File

@@ -1,7 +1,7 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/layers/layers_dev.ipynb.
# %% auto 0
__all__ = ['Dense']
# %% ../../modules/layers/layers_dev.ipynb 2
import numpy as np
@@ -10,6 +10,9 @@ import sys
from typing import Union, Optional, Callable
from .tensor import Tensor
# Import activation functions from the activations module
from .activations import ReLU, Sigmoid, Tanh
# Import our Tensor class
# sys.path.append('../../')
# from modules.tensor.tensor_dev import Tensor
@@ -109,130 +112,3 @@ class Dense:
def __call__(self, x: Tensor) -> Tensor:
"""Make layer callable: layer(x) same as layer.forward(x)"""
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 9
class ReLU:
"""
ReLU Activation: f(x) = max(0, x)
The most popular activation function in deep learning.
Simple, effective, and computationally efficient.
TODO: Implement ReLU activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply ReLU: f(x) = max(0, x)
Args:
x: Input tensor
Returns:
Output tensor with ReLU applied element-wise
TODO: Implement element-wise max(0, x) operation
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
"""Make activation callable: relu(x) same as relu.forward(x)"""
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 10
class ReLU:
"""ReLU Activation: f(x) = max(0, x)"""
def forward(self, x: Tensor) -> Tensor:
"""Apply ReLU: f(x) = max(0, x)"""
return Tensor(np.maximum(0, x.data))
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 11
class Sigmoid:
"""
Sigmoid Activation: f(x) = 1 / (1 + e^(-x))
Squashes input to range (0, 1). Often used for binary classification.
TODO: Implement Sigmoid activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply Sigmoid: f(x) = 1 / (1 + e^(-x))
Args:
x: Input tensor
Returns:
Output tensor with Sigmoid applied element-wise
TODO: Implement sigmoid function (be careful with numerical stability!)
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 12
class Sigmoid:
"""Sigmoid Activation: f(x) = 1 / (1 + e^(-x))"""
def forward(self, x: Tensor) -> Tensor:
"""Apply Sigmoid with numerical stability"""
# Use the numerically stable version to avoid overflow
# For x >= 0: sigmoid(x) = 1 / (1 + exp(-x))
# For x < 0: sigmoid(x) = exp(x) / (1 + exp(x))
x_data = x.data
result = np.zeros_like(x_data)
# Stable computation
positive_mask = x_data >= 0
result[positive_mask] = 1.0 / (1.0 + np.exp(-x_data[positive_mask]))
result[~positive_mask] = np.exp(x_data[~positive_mask]) / (1.0 + np.exp(x_data[~positive_mask]))
return Tensor(result)
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 13
class Tanh:
"""
Tanh Activation: f(x) = tanh(x)
Squashes input to range (-1, 1). Zero-centered output.
TODO: Implement Tanh activation function.
"""
def forward(self, x: Tensor) -> Tensor:
"""
Apply Tanh: f(x) = tanh(x)
Args:
x: Input tensor
Returns:
Output tensor with Tanh applied element-wise
TODO: Implement tanh function
"""
raise NotImplementedError("Student implementation required")
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)
# %% ../../modules/layers/layers_dev.ipynb 14
class Tanh:
"""Tanh Activation: f(x) = tanh(x)"""
def forward(self, x: Tensor) -> Tensor:
"""Apply Tanh"""
return Tensor(np.tanh(x.data))
def __call__(self, x: Tensor) -> Tensor:
return self.forward(x)

View File

@@ -1,22 +1,98 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/setup/setup_dev.ipynb.
# %% auto 0
__all__ = ['hello_tinytorch', 'add_numbers', 'SystemInfo', 'DeveloperProfile']
# %% ../../modules/setup/setup_dev.ipynb 3
def hello_tinytorch():
"""
A simple hello world function for TinyTorch.
TODO: Implement this function to display TinyTorch ASCII art and welcome message.
Load the flame art from tinytorch_flame.txt file with graceful fallback.
"""
raise NotImplementedError("Student implementation required")
def add_numbers(a, b):
"""
Add two numbers together.
TODO: Implement addition of two numbers.
This is the foundation of all mathematical operations in ML.
"""
raise NotImplementedError("Student implementation required")
# %% ../../modules/setup/setup_dev.ipynb 4
def hello_tinytorch():
"""Display the TinyTorch ASCII art and welcome message."""
try:
# Get the directory containing this file
current_dir = Path(__file__).parent
art_file = current_dir / "tinytorch_flame.txt"
if art_file.exists():
with open(art_file, 'r') as f:
ascii_art = f.read()
print(ascii_art)
print("Tiny🔥Torch")
print("Build ML Systems from Scratch!")
else:
print("🔥 TinyTorch 🔥")
print("Build ML Systems from Scratch!")
except NameError:
# Handle case when running in notebook where __file__ is not defined
try:
art_file = Path(os.getcwd()) / "tinytorch_flame.txt"
if art_file.exists():
with open(art_file, 'r') as f:
ascii_art = f.read()
print(ascii_art)
print("Tiny🔥Torch")
print("Build ML Systems from Scratch!")
else:
print("🔥 TinyTorch 🔥")
print("Build ML Systems from Scratch!")
except:
print("🔥 TinyTorch 🔥")
print("Build ML Systems from Scratch!")
def add_numbers(a, b):
"""Add two numbers together."""
return a + b
# %% ../../modules/setup/setup_dev.ipynb 8
class SystemInfo:
"""
Simple system information class.
TODO: Implement this class to collect and display system information.
"""
def __init__(self):
"""
Initialize system information collection.
TODO: Collect Python version, platform, and machine information.
"""
raise NotImplementedError("Student implementation required")
def __str__(self):
"""
Return human-readable system information.
TODO: Format system info as a readable string.
"""
raise NotImplementedError("Student implementation required")
def is_compatible(self):
"""
Check if system meets minimum requirements.
TODO: Check if Python version is >= 3.8
"""
raise NotImplementedError("Student implementation required")
# %% ../../modules/setup/setup_dev.ipynb 9
import sys
import platform
class SystemInfo:
"""Simple system information class."""
@@ -32,3 +108,145 @@ class SystemInfo:
"""Check if system meets minimum requirements.""" """Check if system meets minimum requirements."""
return self.python_version >= (3, 8) return self.python_version >= (3, 8)
# %% ../../modules/setup/setup_dev.ipynb 13
class DeveloperProfile:
"""
Developer profile for personalizing TinyTorch experience.
TODO: Implement this class to store and display developer information.
Default to course instructor but allow students to personalize.
"""
@staticmethod
def _load_default_flame():
"""
Load the default TinyTorch flame ASCII art from file.
TODO: Implement file loading for tinytorch_flame.txt with fallback.
"""
raise NotImplementedError("Student implementation required")
def __init__(self, name="Vijay Janapa Reddi", affiliation="Harvard University",
email="vj@eecs.harvard.edu", github_username="profvjreddi", ascii_art=None):
"""
Initialize developer profile.
TODO: Store developer information with sensible defaults.
Students should be able to customize this with their own info and ASCII art.
"""
raise NotImplementedError("Student implementation required")
def __str__(self):
"""
Return formatted developer information.
TODO: Format developer info as a professional signature with optional ASCII art.
"""
raise NotImplementedError("Student implementation required")
def get_signature(self):
"""
Get a short signature for code headers.
TODO: Return a concise signature like "Built by Name (@github)"
"""
raise NotImplementedError("Student implementation required")
def get_ascii_art(self):
"""
Get ASCII art for the profile.
TODO: Return custom ASCII art or default flame loaded from file.
"""
raise NotImplementedError("Student implementation required")
# %% ../../modules/setup/setup_dev.ipynb 14
class DeveloperProfile:
"""Developer profile for personalizing TinyTorch experience."""
@staticmethod
def _load_default_flame():
"""Load the default TinyTorch flame ASCII art from file."""
try:
# Try to load from the same directory as this module
try:
# Try to get the directory of the current file
current_dir = os.path.dirname(__file__)
except NameError:
# If __file__ is not defined (e.g., in notebook), use current directory
current_dir = os.getcwd()
flame_path = os.path.join(current_dir, 'tinytorch_flame.txt')
with open(flame_path, 'r', encoding='utf-8') as f:
flame_art = f.read()
# Add the Tiny🔥Torch text below the flame
return f"""{flame_art}
Tiny🔥Torch
Build ML Systems from Scratch!
"""
except (FileNotFoundError, IOError):
# Fallback to simple flame if file not found
return """
🔥 TinyTorch Developer 🔥
. . . . . .
. . . . . .
. . . . . . .
. . . . . . . .
. . . . . . . . .
. . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . . .
. . . . . . . . . . . . . .
\\ \\ \\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ \\ / / / / / /
\\ \\ \\ \\ / / / / / /
\\ \\ \\ / / / / / /
\\ \\ / / / / / /
\\ / / / / / /
\\/ / / / / /
\\/ / / / /
\\/ / / /
\\/ / /
\\/ /
\\/
Tiny🔥Torch
Build ML Systems from Scratch!
"""
def __init__(self, name="Vijay Janapa Reddi", affiliation="Harvard University",
email="vj@eecs.harvard.edu", github_username="profvjreddi", ascii_art=None):
self.name = name
self.affiliation = affiliation
self.email = email
self.github_username = github_username
self.ascii_art = ascii_art or self._load_default_flame()
def __str__(self):
return f"👨‍💻 {self.name} | {self.affiliation} | @{self.github_username}"
def get_signature(self):
"""Get a short signature for code headers."""
return f"Built by {self.name} (@{self.github_username})"
def get_ascii_art(self):
"""Get ASCII art for the profile."""
return self.ascii_art
def get_full_profile(self):
"""Get complete profile with ASCII art."""
return f"""{self.ascii_art}
👨‍💻 Developer: {self.name}
🏛️ Affiliation: {self.affiliation}
📧 Email: {self.email}
🐙 GitHub: @{self.github_username}
🔥 Ready to build ML systems from scratch!
"""