mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-06-03 05:01:14 -05:00
This commit introduces the core building blocks for neural networks, including a naive matrix multiplication implementation and a Dense layer. It provides a foundation for constructing and experimenting with neural networks, emphasizing the concept of layers as tensor transformations and function composition. The module includes thorough testing and performance comparisons to demonstrate the importance of optimized operations.
🧱 Module 2: Layers - Neural Network Building Blocks
Build the fundamental transformations that compose into neural networks
🎯 Learning Objectives
After completing this module, you will:
- Understand layers as functions that transform tensors: y = f(x)
- Implement Dense layers with linear transformations: y = Wx + b
- Add activation functions for nonlinearity (ReLU, Sigmoid, Tanh)
- See how neural networks are just function composition
- Build intuition for neural network architecture before diving into training
🧱 Build → Use → Understand
This module follows the TinyTorch pedagogical framework:
- Build: Dense layers and activation functions from scratch
- Use: Transform tensors and see immediate results
- Understand: How neural networks transform information
📚 What You'll Build
Dense Layer
layer = Dense(input_size=3, output_size=2)
x = Tensor([[1.0, 2.0, 3.0]])
y = layer(x) # Shape: (1, 2)
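One way the Dense call above might work internally is sketched here, with plain NumPy arrays standing in for the TinyTorch Tensor. The class name DenseSketch, its attributes, and the small uniform initialization range are assumptions for illustration, not the module's reference implementation. With inputs stored as rows of shape (batch, input_size), the formula y = Wx + b is computed as x @ W + b.

```python
import numpy as np


class DenseSketch:
    """Hypothetical stand-in for Dense, using NumPy arrays instead of Tensor."""

    def __init__(self, input_size, output_size, seed=0):
        rng = np.random.default_rng(seed)
        # Weights have shape (input_size, output_size) so that x @ W is valid.
        # The (-0.1, 0.1) range is an arbitrary choice for this sketch.
        self.weights = rng.uniform(-0.1, 0.1, (input_size, output_size))
        self.bias = np.zeros(output_size)

    def __call__(self, x):
        # x: (batch, input_size) -> result: (batch, output_size)
        return x @ self.weights + self.bias


layer = DenseSketch(input_size=3, output_size=2)
x = np.array([[1.0, 2.0, 3.0]])
y = layer(x)
print(y.shape)  # (1, 2)
```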
Activation Functions
relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()
x = Tensor([[-1.0, 0.0, 1.0]])
y_relu = relu(x) # [[0.0, 0.0, 1.0]]
y_sigmoid = sigmoid(x) # approx. [[0.27, 0.5, 0.73]]
y_tanh = tanh(x) # approx. [[-0.76, 0.0, 0.76]]
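The values in the comments above can be checked with a few lines of NumPy. This is a minimal sketch that uses plain functions on arrays rather than the ReLU, Sigmoid, and Tanh classes the module asks you to build. The function names are illustrative only.

```python
import numpy as np


def relu(x):
    # Elementwise max(0, x)
    return np.maximum(0.0, x)


def sigmoid(x):
    # Elementwise 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    return np.tanh(x)


x = np.array([[-1.0, 0.0, 1.0]])
print(np.round(relu(x), 2))     # values 0.0, 0.0, 1.0
print(np.round(sigmoid(x), 2))  # values 0.27, 0.5, 0.73
print(np.round(tanh(x), 2))     # values -0.76, 0.0, 0.76
```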
Neural Networks
# 3 → 4 → 2 network
layer1 = Dense(input_size=3, output_size=4)
activation1 = ReLU()
layer2 = Dense(input_size=4, output_size=2)
activation2 = Sigmoid()
# Forward pass
x = Tensor([[1.0, 2.0, 3.0]])
h1 = layer1(x)
h1_activated = activation1(h1)
h2 = layer2(h1_activated)
output = activation2(h2)
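To see how the shapes change at each step of this forward pass, the same 3 → 4 → 2 network can be traced with raw NumPy arrays. The parameter names W1, b1, W2, b2 and the random initialization are assumptions for this sketch. In the module itself these values would live inside the Dense layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for a 3 -> 4 -> 2 network
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

x = np.array([[1.0, 2.0, 3.0]])         # (1, 3)
h1 = x @ W1 + b1                        # (1, 4) first Dense layer
h1_activated = np.maximum(0.0, h1)      # (1, 4) ReLU
h2 = h1_activated @ W2 + b2             # (1, 2) second Dense layer
output = 1.0 / (1.0 + np.exp(-h2))      # (1, 2) Sigmoid

steps = [("x", x), ("h1", h1), ("h1_activated", h1_activated),
         ("h2", h2), ("output", output)]
for name, value in steps:
    print(f"{name}: {value.shape}")
```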
🚀 Getting Started
Prerequisites
- Complete Module 1: Tensor ✅
- Understand basic linear algebra (matrix multiplication)
- Be familiar with Python classes and methods
Quick Start
# Navigate to the layers module
cd modules/layers
# Work in the development notebook
jupyter notebook layers_dev.ipynb
# Or work in the Python file
code layers_dev.py
📖 Module Structure
modules/layers/
├── layers_dev.py # Main development file (work here!)
├── layers_dev.ipynb # Jupyter notebook version
├── tests/
│ └── test_layers.py # Comprehensive tests
├── README.md # This file
└── solutions/ # Reference implementations (if stuck)
🎓 Learning Path
Step 1: Dense Layer (Linear Transformation)
- Understand y = Wx + b
- Implement weight initialization
- Handle matrix multiplication and bias addition
- Test with single examples and batches
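The matrix multiplication and bias addition in this step can be sketched with a naive triple loop, then tried on a single example and on a batch. The function name matmul_naive and the hand-picked weights are assumptions for illustration. The module's own naive implementation may differ.

```python
import numpy as np


def matmul_naive(a, b):
    """Triple-loop matrix multiplication for 2-D arrays."""
    n, k = a.shape
    k2, m = b.shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out


W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])      # (3, 2)
b = np.array([0.5, -0.5])                               # (2,)

single = np.array([[1.0, 2.0, 3.0]])                    # one example
batch = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])    # two examples

y_single = matmul_naive(single, W) + b   # bias broadcasts across rows
y_batch = matmul_naive(batch, W) + b

print(y_single)  # values 4.5, 4.5
print(np.allclose(y_batch, batch @ W + b))  # True: matches NumPy's matmul
```

The loop version is far slower than the @ operator on large matrices, which is why optimized operations matter in practice.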
Step 2: Activation Functions
- Implement ReLU: max(0, x)
- Implement Sigmoid: 1 / (1 + e^(-x))
- Implement Tanh: tanh(x)
- Understand why nonlinearity is crucial
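A short check of why nonlinearity matters: two Dense layers with no activation between them collapse into a single Dense layer, because (xW1 + b1)W2 + b2 = x(W1W2) + (b1W2 + b2). The small hand-picked matrices below are assumptions chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

W1 = np.array([[1.0, -1.0], [0.5, 2.0]])
b1 = np.array([0.0, 1.0])
W2 = np.array([[2.0], [1.0]])
b2 = np.array([-1.0])
x = np.array([[1.0, -2.0], [3.0, 1.0]])

# Two Dense layers with no activation in between
two_layers = (x @ W1 + b1) @ W2 + b2

# The same map expressed as ONE Dense layer with merged parameters
W_merged = W1 @ W2
b_merged = b1 @ W2 + b2
one_layer = x @ W_merged + b_merged
print(np.allclose(two_layers, one_layer))  # True

# With ReLU between the layers, the merged layer no longer matches
with_relu = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))  # False
```

Without an activation, stacking more Dense layers adds parameters but no new expressive power.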
Step 3: Layer Composition
- Chain layers together
- Build complete neural networks
- See how simple layers create complex functions
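Chaining can be written as an explicit composition helper. The sketch below assumes only that each layer is a callable taking and returning an array. The compose function and the toy layers are hypothetical and not part of TinyTorch.

```python
from functools import reduce

import numpy as np


def compose(*layers):
    """Return a function that applies the given layers left to right."""
    def network(x):
        return reduce(lambda value, layer: layer(value), layers, x)
    return network


# Toy layers for illustration; Dense and ReLU instances would work the same way
def double(x):
    return 2.0 * x


def shift(x):
    return x - 3.0


def relu(x):
    return np.maximum(0.0, x)


network = compose(double, shift, relu)
x = np.array([[1.0, 2.0, 3.0]])
print(network(x))  # relu(2x - 3): values 0.0, 1.0, 3.0
```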
Step 4: Real-World Application
- Build an image classification network
- Understand how architecture affects capability
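One rough way to compare architectures is to count trainable parameters. The layer sizes below are assumptions for illustration (784 inputs for a 28×28 grayscale image, 10 output classes), not the network the module prescribes, and parameter count is only a crude proxy for capability.

```python
def dense_param_count(input_size, output_size):
    # One weight per input/output pair, plus one bias per output
    return input_size * output_size + output_size


def network_param_count(layer_sizes):
    """layer_sizes lists the width of every layer, input first."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum(dense_param_count(a, b) for a, b in pairs)


small = [784, 32, 10]    # hypothetical narrow hidden layer
large = [784, 128, 10]   # hypothetical wider hidden layer

print(network_param_count(small))  # 25450
print(network_param_count(large))  # 101770
```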
🧪 Testing Your Implementation
Module-Level Tests
# Run comprehensive tests
python -m pytest tests/test_layers.py -v
# Quick test
python -c "from layers_dev import Dense, ReLU; print('✅ Layers working!')"
Package-Level Tests
# Export to package
python ../../bin/tito.py sync
# Test integration
python ../../bin/tito.py test --module layers
🎯 Key Concepts
Layers as Functions
- Input: Tensor with some shape
- Transformation: Mathematical operation
- Output: Tensor with possibly different shape
Linear vs Nonlinear
- Dense layers: Linear transformations (strictly affine, because of the bias term)
- Activation functions: Nonlinear transformations
- Composition: Linear + Nonlinear = Complex functions
Neural Networks = Function Composition
Input → Dense → ReLU → Dense → Sigmoid → Output
Why This Matters
- Modularity: Build complex networks from simple parts
- Reusability: Same layers work for different problems
- Understanding: Know how each part contributes to the whole
🔍 Common Issues
Import Errors
# Make sure you're in the right directory
import sys
sys.path.append('../../')
from modules.tensor.tensor_dev import Tensor
Shape Mismatches
# Check input/output sizes match
layer1 = Dense(input_size=3, output_size=4)
layer2 = Dense(input_size=4, output_size=2) # 4 matches output of layer1
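A small helper can catch such mismatches before any data flows through the network. This is a sketch only: check_layer_sizes is a hypothetical function, and it takes plain (input_size, output_size) pairs rather than inspecting Dense objects, whose attribute names are not specified here.

```python
def check_layer_sizes(sizes):
    """Raise ValueError if consecutive (input_size, output_size) pairs disagree."""
    for index, (current, following) in enumerate(zip(sizes, sizes[1:])):
        out_size = current[1]
        in_size = following[0]
        if out_size != in_size:
            raise ValueError(
                f"layer {index} outputs {out_size} values, "
                f"but layer {index + 1} expects {in_size}"
            )


check_layer_sizes([(3, 4), (4, 2)])  # passes silently

try:
    check_layer_sizes([(3, 4), (5, 2)])
except ValueError as error:
    print(f"Shape mismatch: {error}")
```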
Gradient Issues (Later)
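# Use proper weight initialization (Xavier/Glorot uniform)
import math
import numpy as np
input_size, output_size = 3, 2 # example sizes
limit = math.sqrt(6.0 / (input_size + output_size))
weights = np.random.uniform(-limit, limit, (input_size, output_size))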
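The snippet above can be wrapped in a function and sanity-checked. A uniform distribution on (-limit, limit) has variance limit² / 3, which with this limit equals 2 / (input_size + output_size). The function name xavier_uniform, the seed parameter, and the 256 × 128 example size are assumptions for this sketch.

```python
import math

import numpy as np


def xavier_uniform(input_size, output_size, seed=None):
    """Sample weights uniformly from (-limit, limit), as in the snippet above."""
    rng = np.random.default_rng(seed)
    limit = math.sqrt(6.0 / (input_size + output_size))
    return rng.uniform(-limit, limit, (input_size, output_size))


input_size, output_size = 256, 128
weights = xavier_uniform(input_size, output_size, seed=0)
limit = math.sqrt(6.0 / (input_size + output_size))  # 0.125 for these sizes

print(weights.shape)                     # (256, 128)
print(np.abs(weights).max() <= limit)    # True: every weight is within the bound
print(weights.var(), 2.0 / (input_size + output_size))  # should be close
```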
🎉 Success Criteria
You've successfully completed this module when:
- ✅ All tests pass (pytest tests/test_layers.py)
- ✅ You can build a 2-layer neural network
- ✅ You understand how layers transform tensors
- ✅ You see the connection between layers and neural networks
- ✅ Package export works (tito test --module layers)
🚀 What's Next
After completing this module, you're ready for:
- Module 3: Networks - Compose layers into common architectures
- Module 4: Training - Learn how networks improve through experience
- Module 5: Applications - Use networks for real problems
🤝 Getting Help
- Check the tests for examples of expected behavior
- Look at the solutions/ directory if you're stuck
- Review the pedagogical principles in docs/pedagogy/
- Remember: Build → Use → Understand!
Great job building the foundation of neural networks! 🎉
This module implements the core insight: neural networks are just function composition of simple building blocks.