# Module 01: Tensor Foundation
## Overview
Build the foundational Tensor class that powers all machine learning operations in TinyTorch.
## Time Estimate
**2-3 hours**
## Difficulty
⭐⭐☆☆☆ (Beginner)
## Prerequisites
- **Python basics**: Variables, functions, classes, operators
- **NumPy fundamentals**: Array creation, indexing, basic operations
- **Linear algebra**: Matrix multiplication concept, vectors vs matrices
## Learning Outcomes
By completing this module, you will be able to:
1. **Implement a complete Tensor class** with arithmetic operations (+, -, *, /), matrix multiplication, and shape manipulation that mirrors PyTorch's design patterns
2. **Understand tensor broadcasting semantics** and how automatic shape alignment enables efficient batch processing across different dimensional data
3. **Design classes with dormant features** that activate in future modules, learning PyTorch's evolution from Variable to unified Tensor with built-in autograd
4. **Analyze memory layout and cache behavior** to understand why certain operations (row-wise access) are significantly faster than others (column-wise access)
5. **Build production-ready APIs** with proper error handling, clear error messages, and input validation that guides users toward correct usage
## Key Concepts
### Tensors: The Universal ML Data Structure
Tensors are multi-dimensional arrays that serve as the fundamental data structure in machine learning:
- **0D (scalar)**: Single number (e.g., loss value)
- **1D (vector)**: List of numbers (e.g., bias terms)
- **2D (matrix)**: Grid of numbers (e.g., weight matrices, images)
- **3D+**: Higher dimensions (e.g., batches of images, sequence data)
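For concreteness, here is how these ranks look as NumPy arrays (the `shape` attribute reports the dimensions):
```python
import numpy as np

loss = np.array(3.14)                 # 0D scalar, shape ()
bias = np.array([0.1, 0.2, 0.3])      # 1D vector, shape (3,)
weights = np.array([[1, 2], [3, 4]])  # 2D matrix, shape (2, 2)
batch = np.zeros((32, 28, 28))        # 3D: batch of 32 grayscale 28×28 images

print(loss.shape, bias.shape, weights.shape, batch.shape)
# () (3,) (2, 2) (32, 28, 28)
```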
### Broadcasting: Automatic Shape Alignment
NumPy-style broadcasting automatically aligns tensors of different shapes for operations:
```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])  # Shape: (2, 2)
vector = np.array([10, 20])          # Shape: (2,)
result = matrix + vector             # Broadcasting: (2,2) + (2,) → (2,2)
# Result: [[11, 22], [13, 24]]
```
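Broadcasting aligns shapes from the rightmost dimension, so tensors of different shapes can combine without explicit copies. A quick NumPy illustration:
```python
import numpy as np

col = np.arange(3).reshape(3, 1)  # shape (3, 1)
row = np.arange(4).reshape(1, 4)  # shape (1, 4)
grid = col + row                  # shapes align right-to-left: (3,1) + (1,4) → (3,4)
print(grid.shape)                 # (3, 4)
```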
### Memory Layout and Cache Effects
Understanding row-major (C-style) storage explains why sequential access is faster:
- **Row-wise access**: Sequential memory, excellent cache locality (~2-3× faster)
- **Column-wise access**: Strided memory, poor cache locality
- **Real impact**: Same O(n) algorithm, dramatically different wall-clock time
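You can observe this effect directly. The sketch below traverses the same matrix by rows and by columns; exact timings vary by machine and matrix size, but the contiguous row slices consistently win:
```python
import numpy as np
import timeit

a = np.random.rand(2000, 2000)  # NumPy defaults to row-major (C-style) storage

def sum_by_rows(m):
    # Each m[i, :] is a contiguous slice — sequential memory access
    return sum(m[i, :].sum() for i in range(m.shape[0]))

def sum_by_cols(m):
    # Each m[:, j] is a strided slice — jumps across rows in memory
    return sum(m[:, j].sum() for j in range(m.shape[1]))

print("rows:", timeit.timeit(lambda: sum_by_rows(a), number=5))
print("cols:", timeit.timeit(lambda: sum_by_cols(a), number=5))
```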
### Dormant Gradient Features
Our Tensor includes gradient tracking attributes (`requires_grad`, `grad`, `backward()`) from the start, but they remain inactive until Module 05. This design:
- Maintains consistent API throughout the course (no Variable vs Tensor confusion)
- Follows PyTorch's unified Tensor design (Variable was merged into Tensor in PyTorch 0.4)
- Enables progressive disclosure of complexity
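A minimal sketch of what this dormant design can look like (the attribute names follow the list above; the scaffold in `tensor_dev.py` is the reference):
```python
import numpy as np

class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float32)
        # Dormant until Module 05: stored now so the API never has to change
        self.requires_grad = requires_grad
        self.grad = None  # backward() will populate this in a later module
```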
## Module Structure
1. **Introduction**: What is a Tensor? (Concept + ML context)
2. **Foundations**: Mathematical Background (Broadcasting, memory layout)
3. **Implementation**: Building Tensor class with immediate unit testing
4. **Integration**: Neural network layer simulation
5. **Systems Analysis**: Memory layout and cache performance
6. **Module Test**: Comprehensive validation
## What You'll Build
```python
# Your complete Tensor class will support:
x = Tensor([[1, 2, 3], [4, 5, 6]])
y = Tensor([[7, 8, 9], [10, 11, 12]])

# Arithmetic operations with broadcasting
z = x + y                              # Element-wise addition
scaled = x * 2                         # Scalar broadcasting
normalized = (x - x.mean()) / x.std()  # Chaining operations

# Matrix operations
W = Tensor([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
output = x.matmul(W)                   # Matrix multiplication: (2,3) @ (3,2) → (2,2)

# Shape manipulation
reshaped = x.reshape(3, 2)             # (2,3) → (3,2)
transposed = x.transpose()             # (2,3) → (3,2) with data rearrangement

# Reduction operations
total = x.sum()                        # Sum all elements
col_means = x.mean(axis=0)             # Average per column
```
## Connection to Production ML
This module teaches patterns used in production frameworks:
- **PyTorch's Tensor class**: Same API design with unified gradients
- **NumPy broadcasting**: Industry-standard automatic shape alignment
- **Memory efficiency**: Row-major storage, cache-aware algorithms
- **Error handling**: Clear messages that guide users toward solutions
## Files in This Module
- `tensor_dev.py`: Your working implementation (Jupyter notebook format)
- `test_tensor.py`: Comprehensive test suite (run with pytest)
- `README.md`: This file
## Next Steps
After completing this module:
**→ Module 02: Activations**
- Build activation functions (ReLU, Sigmoid, GELU)
- Learn how nonlinearity enables neural networks to learn complex patterns
- Understand vanishing/exploding gradients through activation analysis
Your Tensor class becomes the foundation that all future modules build upon!
## Common Pitfalls to Avoid
1. **Matrix multiplication vs element-wise multiplication** (see the sketch after this list)
- Use `.matmul()` or `@` for matrix multiplication (dot product)
- Use `*` for element-wise multiplication (Hadamard product)
2. **Shape compatibility in broadcasting**
- Inner dimensions must match for matmul: (M,K) @ (K,N) ✓
- Broadcasting aligns from rightmost dimension
- Clear error messages help debug shape mismatches
3. **Reshape vs transpose confusion**
- Reshape: Same memory layout, different interpretation (fast, O(1))
- Transpose: Data rearrangement in memory (slower, O(n))
4. **Gradient features are dormant**
- `requires_grad`, `grad`, `backward()` exist but don't function yet
- They activate in Module 05 - ignore them for now
- Don't try to implement gradients manually
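A short NumPy sketch of pitfalls 1 and 2 (the same rules apply to the Tensor class you build):
```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3)
W = np.array([[1, 0],
              [0, 1],
              [1, 1]])     # shape (3, 2)

print(x @ W)  # matrix multiplication: (2,3) @ (3,2) → (2,2)
print(x * x)  # element-wise (Hadamard) product: stays (2,3)

try:
    x @ x     # inner dimensions don't match: (2,3) @ (2,3)
except ValueError as e:
    print(e)  # the error message names the incompatible shapes
```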
## Resources
- **NumPy documentation**: https://numpy.org/doc/stable/
- **PyTorch Tensor API**: https://pytorch.org/docs/stable/tensors.html
- **Broadcasting semantics**: https://numpy.org/doc/stable/user/basics.broadcasting.html
## Getting Help
If you're stuck:
1. Read the error messages carefully - they include hints
2. Check the ASCII diagrams in `tensor_dev.py` for visual explanations
3. Run unit tests individually to isolate issues
4. Review the module integration test for end-to-end examples
Happy learning! 🚀