# Module 01: Tensor Foundation

## Overview

Build the foundational Tensor class that powers all machine learning operations in TinyTorch.

## Time Estimate

2-3 hours

## Difficulty

⭐⭐☆☆☆ (Beginner)

## Prerequisites

- Python basics: variables, functions, classes, operators
- NumPy fundamentals: array creation, indexing, basic operations
- Linear algebra: matrix multiplication, vectors vs. matrices
## Learning Outcomes

By completing this module, you will be able to:

- Implement a complete Tensor class with arithmetic operations (`+`, `-`, `*`, `/`), matrix multiplication, and shape manipulation that mirrors PyTorch's design patterns
- Understand tensor broadcasting semantics and how automatic shape alignment enables efficient batch processing across data of different dimensions
- Design classes with dormant features that activate in future modules, following PyTorch's evolution from `Variable` to a unified Tensor with built-in autograd
- Analyze memory layout and cache behavior to understand why some operations (row-wise access) are significantly faster than others (column-wise access)
- Build production-ready APIs with proper error handling, clear error messages, and input validation that guides users toward correct usage
## Key Concepts

### Tensors: The Universal ML Data Structure

Tensors are multi-dimensional arrays that serve as the fundamental data structure in machine learning:

- **0D (scalar):** a single number (e.g., a loss value)
- **1D (vector):** a list of numbers (e.g., bias terms)
- **2D (matrix):** a grid of numbers (e.g., weight matrices, images)
- **3D+:** higher dimensions (e.g., batches of images, sequence data)
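These four cases are easy to see directly in NumPy; the values below are purely illustrative:

```python
import numpy as np

loss = np.array(0.25)             # 0D scalar: shape ()
bias = np.array([0.1, 0.2, 0.3])  # 1D vector: shape (3,)
weights = np.array([[1.0, 2.0],
                    [3.0, 4.0]])  # 2D matrix: shape (2, 2)
batch = np.zeros((32, 28, 28))    # 3D: a batch of 32 images, 28x28 each

for t in (loss, bias, weights, batch):
    print(t.ndim, t.shape)
```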
### Broadcasting: Automatic Shape Alignment

NumPy-style broadcasting automatically aligns tensors of different shapes for element-wise operations:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])  # Shape: (2, 2)
vector = np.array([10, 20])          # Shape: (2,)
result = matrix + vector             # Broadcasting: (2,2) + (2,) → (2,2)
# Result: [[11, 22], [13, 24]]
```
### Memory Layout and Cache Effects

Understanding row-major (C-style) storage explains why sequential access is faster:

- **Row-wise access:** sequential memory, excellent cache locality (~2-3× faster)
- **Column-wise access:** strided memory, poor cache locality
- **Real impact:** the same O(n) algorithm can show dramatically different wall-clock time
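You can measure this yourself. The sketch below is a rough NumPy benchmark, not part of the module's code; exact timings and speedups depend on your hardware and matrix size:

```python
import time
import numpy as np

a = np.random.rand(4096, 4096)  # NumPy arrays are row-major (C-order) by default

def row_sum(m):
    # Row-wise: each m[i, :] is contiguous in memory (cache-friendly)
    return sum(m[i, :].sum() for i in range(m.shape[0]))

def col_sum(m):
    # Column-wise: each m[:, j] strides a full row's worth of bytes per element
    return sum(m[:, j].sum() for j in range(m.shape[1]))

for fn in (row_sum, col_sum):
    start = time.perf_counter()
    fn(a)
    print(f"{fn.__name__}: {time.perf_counter() - start:.3f}s")
```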
### Dormant Gradient Features

Our Tensor includes gradient-tracking attributes (`requires_grad`, `grad`, `backward()`) from the start, but they remain inactive until Module 05. This design:

- Maintains a consistent API throughout the course (no `Variable` vs. `Tensor` confusion)
- Follows modern PyTorch's unified Tensor design
- Enables progressive disclosure of complexity
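As a rough sketch of what "dormant" looks like in code (the attribute names match the module's API, but the placeholder bodies here are illustrative, not the assignment solution):

```python
import numpy as np

class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float32)
        # Dormant autograd hooks: present from day one, inert until Module 05
        self.requires_grad = requires_grad
        self.grad = None  # will hold a gradient tensor once autograd activates

    def backward(self):
        # Placeholder: the real reverse-mode traversal arrives in Module 05
        raise NotImplementedError("Autograd activates in Module 05")
```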
## Module Structure

- **Introduction:** What is a tensor? (concept + ML context)
- **Foundations:** Mathematical background (broadcasting, memory layout)
- **Implementation:** Building the Tensor class with immediate unit testing
- **Integration:** Neural network layer simulation
- **Systems Analysis:** Memory layout and cache performance
- **Module Test:** Comprehensive validation
## What You'll Build

```python
# Your complete Tensor class will support:
x = Tensor([[1, 2, 3], [4, 5, 6]])
y = Tensor([[7, 8, 9], [10, 11, 12]])

# Arithmetic operations with broadcasting
z = x + y                              # Element-wise addition
scaled = x * 2                         # Scalar broadcasting
normalized = (x - x.mean()) / x.std()  # Chaining operations

# Matrix operations
W = Tensor([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
output = x.matmul(W)                   # Matrix multiplication: (2,3) @ (3,2) → (2,2)

# Shape manipulation
reshaped = x.reshape(3, 2)             # (2,3) → (3,2)
transposed = x.transpose()             # (2,3) → (3,2) with data rearrangement

# Reduction operations
total = x.sum()                        # Sum of all elements
col_means = x.mean(axis=0)             # Average per column
```
## Connection to Production ML

This module teaches patterns used in production frameworks:

- **PyTorch's Tensor class:** the same API design, with unified gradients
- **NumPy broadcasting:** industry-standard automatic shape alignment
- **Memory efficiency:** row-major storage, cache-aware algorithms
- **Error handling:** clear messages that guide users toward solutions
## Files in This Module

- `tensor_dev.py`: your working implementation (Jupyter notebook format)
- `test_tensor.py`: comprehensive test suite (run with pytest)
- `README.md`: this file
## Next Steps

After completing this module:

**→ Module 02: Activations**

- Build activation functions (ReLU, Sigmoid, GELU)
- Learn how nonlinearity enables neural networks to learn complex patterns
- Understand vanishing/exploding gradients through activation analysis

Your Tensor class becomes the foundation that all future modules build upon!
## Common Pitfalls to Avoid

- **Matrix multiplication vs. element-wise multiplication**
  - Use `.matmul()` or `@` for matrix multiplication (dot product)
  - Use `*` for element-wise multiplication (Hadamard product)
- **Shape compatibility in broadcasting** (see the sketch after this list)
  - Inner dimensions must match for matmul: (M,K) @ (K,N) ✓
  - Broadcasting aligns shapes from the rightmost dimension
  - Clear error messages help debug shape mismatches
- **Reshape vs. transpose confusion**
  - Reshape: same memory layout, different interpretation (fast, O(1))
  - Transpose: data rearrangement in memory (slower, O(n))
- **Gradient features are dormant**
  - `requires_grad`, `grad`, and `backward()` exist but don't function yet
  - They activate in Module 05; ignore them for now
  - Don't try to implement gradients manually
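The shape-compatibility pitfall deserves a concrete example. Below is a rough sketch of the kind of shape guard and error message the module encourages; `checked_matmul` is a hypothetical standalone helper using NumPy, not part of the module's API:

```python
import numpy as np

def checked_matmul(a, b):
    # Validate inner dimensions before multiplying: (M, K) @ (K, N)
    if a.shape[-1] != b.shape[0]:
        raise ValueError(
            f"matmul shape mismatch: {a.shape} @ {b.shape}. "
            f"Inner dimensions must match ({a.shape[-1]} != {b.shape[0]})."
        )
    return a @ b

x = np.ones((2, 3))
print(checked_matmul(x, np.ones((3, 2))).shape)  # (2, 2): inner dims match
checked_matmul(x, np.ones((2, 2)))               # raises a readable ValueError
```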
## Resources

- NumPy documentation: https://numpy.org/doc/stable/
- PyTorch Tensor API: https://pytorch.org/docs/stable/tensors.html
- Broadcasting semantics: https://numpy.org/doc/stable/user/basics.broadcasting.html
## Getting Help

If you're stuck:

- Read the error messages carefully; they include hints
- Check the ASCII diagrams in `tensor_dev.py` for visual explanations
- Run unit tests individually to isolate issues
- Review the module integration test for end-to-end examples
Happy learning! 🚀