mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-06-03 05:01:14 -05:00

Vijay Janapa Reddi 16b5cda4e3 Implements core neural network layers

This commit introduces the core building blocks for neural networks,
including a naive matrix multiplication implementation and a Dense layer.

It provides a foundation for constructing and experimenting with
neural networks, emphasizing the concept of layers as tensor
transformations and function composition.

The module includes thorough testing and performance comparisons
to demonstrate the importance of optimized operations.

2025-07-11 15:07:54 -04:00


🧱 Module 2: Layers - Neural Network Building Blocks

Build the fundamental transformations that compose into neural networks

🎯 Learning Objectives

After completing this module, you will:

Understand layers as functions that transform tensors: y = f(x)
Implement Dense layers with linear transformations: y = Wx + b
Add activation functions for nonlinearity (ReLU, Sigmoid, Tanh)
See how neural networks are just function composition
Build intuition for neural network architecture before diving into training

🧱 Build → Use → Understand

This module follows the TinyTorch pedagogical framework:

Build: Dense layers and activation functions from scratch
Use: Transform tensors and see immediate results
Understand: How neural networks transform information

📚 What You'll Build

Dense Layer

layer = Dense(input_size=3, output_size=2)
x = Tensor([[1.0, 2.0, 3.0]])
y = layer(x)  # Shape: (1, 2)
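
A minimal sketch of what such a layer might look like internally. NumPy arrays stand in for the module's Tensor class (whose API is not shown here), and the small uniform initialization range and the seed argument are placeholder assumptions, not the module's actual choices:

```python
import numpy as np

class Dense:
    """Hypothetical NumPy stand-in for the module's Dense layer."""

    def __init__(self, input_size, output_size, seed=0):
        rng = np.random.default_rng(seed)
        # Weights are stored as (input_size, output_size) so that a batch of
        # row vectors can be multiplied from the left: y = x @ W + b.
        self.weights = rng.uniform(-0.1, 0.1, (input_size, output_size))
        self.bias = np.zeros(output_size)

    def __call__(self, x):
        # Linear transformation followed by a broadcast bias addition.
        return x @ self.weights + self.bias

layer = Dense(input_size=3, output_size=2)
x = np.array([[1.0, 2.0, 3.0]])
y = layer(x)
print(y.shape)  # (1, 2)
```

Defining `__call__` is what lets the layer be used like a function, as in layer(x) above.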

Activation Functions

relu = ReLU()
sigmoid = Sigmoid()
tanh = Tanh()

x = Tensor([[-1.0, 0.0, 1.0]])
y_relu = relu(x)        # [[0.0, 0.0, 1.0]]
y_sigmoid = sigmoid(x)  # approx. [[0.27, 0.5, 0.73]]
y_tanh = tanh(x)        # approx. [[-0.76, 0.0, 0.76]]
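
The values in the comments above can be reproduced with plain NumPy. This is only a sketch: the function names below are illustrative and operate on arrays, whereas the module's ReLU, Sigmoid, and Tanh are classes that operate on Tensor objects:

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x)
    return np.maximum(0.0, x)

def sigmoid(x):
    # Element-wise 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

x = np.array([[-1.0, 0.0, 1.0]])
print(np.round(relu(x), 2))
print(np.round(sigmoid(x), 2))
print(np.round(tanh(x), 2))
```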

Neural Networks

# 3 → 4 → 2 network
layer1 = Dense(input_size=3, output_size=4)
activation1 = ReLU()
layer2 = Dense(input_size=4, output_size=2)
activation2 = Sigmoid()

# Forward pass
x = Tensor([[1.0, 2.0, 3.0]])
h1 = layer1(x)
h1_activated = activation1(h1)
h2 = layer2(h1_activated)
output = activation2(h2)
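
Because each step only feeds its output to the next, the forward pass above can also be written as a loop over a list of callables. The sketch below uses NumPy arrays and lambdas in place of the module's Tensor, Dense, and activation classes; the helper name `forward` and the random weights are assumptions for illustration:

```python
import numpy as np

def forward(x, layers):
    """Apply each layer in order: layers[-1](...layers[0](x))."""
    for layer in layers:
        x = layer(x)
    return x

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

# 3 -> 4 -> 2 network expressed as a list of functions
layers = [
    lambda x: x @ W1 + b1,               # Dense 3 -> 4
    lambda x: np.maximum(0.0, x),        # ReLU
    lambda x: x @ W2 + b2,               # Dense 4 -> 2
    lambda x: 1.0 / (1.0 + np.exp(-x)),  # Sigmoid
]

x = np.array([[1.0, 2.0, 3.0]])
output = forward(x, layers)
print(output.shape)  # (1, 2)
```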

🚀 Getting Started

Prerequisites

Complete Module 1: Tensor ✅
Understand basic linear algebra (matrix multiplication)
Be familiar with Python classes and methods

Quick Start

# Navigate to the layers module
cd modules/layers

# Work in the development notebook
jupyter notebook layers_dev.ipynb

# Or work in the Python file
code layers_dev.py

📖 Module Structure

modules/layers/
├── layers_dev.py           # Main development file (work here!)
├── layers_dev.ipynb        # Jupyter notebook version
├── tests/
│   └── test_layers.py      # Comprehensive tests
├── README.md               # This file
└── solutions/              # Reference implementations (if stuck)

🎓 Learning Path

Step 1: Dense Layer (Linear Transformation)

Understand y = Wx + b (in code, x @ W + b for a batch of row vectors, with W of shape (input_size, output_size))
Implement weight initialization
Handle matrix multiplication and bias addition
Test with single examples and batches
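
One way to check the last point is to confirm that a batch is processed row by row with the same weights. This sketch assumes NumPy arrays rather than the module's Tensor, and `dense_forward` is a hypothetical helper name:

```python
import numpy as np

def dense_forward(x, weights, bias):
    # x: (batch, input_size), weights: (input_size, output_size),
    # bias: (output_size,) and is broadcast across the batch dimension.
    return x @ weights + bias

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))
bias = rng.normal(size=2)

single = np.array([[1.0, 2.0, 3.0]])
batch = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

y_single = dense_forward(single, weights, bias)  # shape (1, 2)
y_batch = dense_forward(batch, weights, bias)    # shape (2, 2)

# The first row of the batch result matches the single-example result.
print(np.allclose(y_batch[0], y_single[0]))  # True
```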

Step 2: Activation Functions

Implement ReLU: max(0, x)
Implement Sigmoid: 1 / (1 + e^(-x))
Implement Tanh: tanh(x)
Understand why nonlinearity is crucial
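
To see why nonlinearity matters, note that two Dense layers with nothing between them collapse into a single linear layer: (x W1 + b1) W2 + b2 = x (W1 W2) + (b1 W2 + b2). The sketch below checks this with small hand-picked NumPy matrices (illustrative values, not taken from the module) and shows that inserting ReLU breaks the equivalence:

```python
import numpy as np

W1, b1 = np.array([[1.0, -1.0], [0.0, 1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0], [1.0]]), np.zeros(1)
x = np.array([[1.0, 0.0], [-1.0, 0.0]])

# Two stacked linear layers ...
two_layers = (x @ W1 + b1) @ W2 + b2
# ... equal one linear layer with merged weights and bias.
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = x @ W + b
print(np.allclose(two_layers, one_layer))  # True

# With ReLU in between, no single linear layer reproduces the result here.
with_relu = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))   # False
```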

Step 3: Layer Composition

Chain layers together
Build complete neural networks
See how simple layers create complex functions

Step 4: Real-World Application

Build an image classification network
Understand how architecture affects capability

🧪 Testing Your Implementation

Module-Level Tests

# Run comprehensive tests
python -m pytest tests/test_layers.py -v

# Quick test
python -c "from layers_dev import Dense, ReLU; print('✅ Layers working!')"

Package-Level Tests

# Export to package
python ../../bin/tito.py sync

# Test integration
python ../../bin/tito.py test --module layers

🎯 Key Concepts

Layers as Functions

Input: Tensor with some shape
Transformation: Mathematical operation
Output: Tensor with possibly different shape

Linear vs Nonlinear

Dense layers: Linear transformations
Activation functions: Nonlinear transformations
Composition: Linear + Nonlinear = Complex functions

Neural Networks = Function Composition

Input → Dense → ReLU → Dense → Sigmoid → Output

Why This Matters

Modularity: Build complex networks from simple parts
Reusability: Same layers work for different problems
Understanding: Know how each part contributes to the whole

🔍 Common Issues

Import Errors

# Add the repository root to the import path (assumes you are running from modules/layers)
import sys
sys.path.append('../../')
from modules.tensor.tensor_dev import Tensor

Shape Mismatches

# Check input/output sizes match
layer1 = Dense(input_size=3, output_size=4)
layer2 = Dense(input_size=4, output_size=2)  # 4 matches output of layer1
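
A small helper can catch such mismatches before any data flows through the network. This is a hypothetical utility, not part of the module; it takes plain (input_size, output_size) pairs rather than Dense objects:

```python
def check_layer_sizes(sizes):
    """Raise ValueError if consecutive (input_size, output_size) pairs do not chain."""
    for i in range(len(sizes) - 1):
        out_size = sizes[i][1]
        next_in = sizes[i + 1][0]
        if out_size != next_in:
            raise ValueError(
                f"layer {i} outputs {out_size} features "
                f"but layer {i + 1} expects {next_in}"
            )

check_layer_sizes([(3, 4), (4, 2)])  # passes silently
```

Calling it with [(3, 4), (5, 2)] raises a ValueError that names the offending pair of layers.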

Gradient Issues (Later)

import math
import numpy as np

# Use proper weight initialization (Xavier/Glorot uniform)
limit = math.sqrt(6.0 / (input_size + output_size))
weights = np.random.uniform(-limit, limit, (input_size, output_size))
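
The reason for this particular limit: a uniform distribution on (-limit, limit) has variance limit^2 / 3, so the formula above gives weights with variance 2 / (input_size + output_size). The sketch below checks this empirically; the layer sizes and the seed are arbitrary values chosen for illustration:

```python
import math
import numpy as np

input_size, output_size = 256, 128  # arbitrary example sizes
limit = math.sqrt(6.0 / (input_size + output_size))

rng = np.random.default_rng(0)
weights = rng.uniform(-limit, limit, (input_size, output_size))

expected_var = 2.0 / (input_size + output_size)
# The sample variance should be close to the theoretical value.
print(weights.var(), expected_var)
```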

🎉 Success Criteria

You've successfully completed this module when:

✅ All tests pass (pytest tests/test_layers.py)
✅ You can build a 2-layer neural network
✅ You understand how layers transform tensors
✅ You see the connection between layers and neural networks
✅ Package export works (tito test --module layers)

🚀 What's Next

After completing this module, you're ready for:

Module 3: Networks - Compose layers into common architectures
Module 4: Training - Learn how networks improve through experience
Module 5: Applications - Use networks for real problems

🤝 Getting Help

Check the tests for examples of expected behavior
Look at the solutions/ directory if you're stuck
Review the pedagogical principles in docs/pedagogy/
Remember: Build → Use → Understand!

Great job building the foundation of neural networks! 🎉

This module implements the core insight: neural networks are just compositions of simple building blocks.