# Module 01: Tensor Foundation

## Overview

Build the foundational Tensor class that powers all machine learning operations in TinyTorch.

## Time Estimate

**2-3 hours**

## Difficulty

⭐⭐☆☆☆ (Beginner)

## Prerequisites

- **Python basics**: Variables, functions, classes, operators
- **NumPy fundamentals**: Array creation, indexing, basic operations
- **Linear algebra**: Matrix multiplication concept, vectors vs matrices

## Learning Outcomes

By completing this module, you will be able to:

1. **Implement a complete Tensor class** with arithmetic operations (+, -, *, /), matrix multiplication, and shape manipulation that mirrors PyTorch's design patterns
2. **Understand tensor broadcasting semantics** and how automatic shape alignment enables efficient batch processing across data of different shapes
3. **Design classes with dormant features** that activate in future modules, mirroring PyTorch's evolution from Variable to a unified Tensor with built-in autograd
4. **Analyze memory layout and cache behavior** to understand why some operations (row-wise access) are significantly faster than others (column-wise access)
5. **Build production-ready APIs** with proper error handling, clear error messages, and input validation that guides users toward correct usage

## Key Concepts

### Tensors: The Universal ML Data Structure

Tensors are multi-dimensional arrays that serve as the fundamental data structure in machine learning:

- **0D (scalar)**: A single number (e.g., a loss value)
- **1D (vector)**: A list of numbers (e.g., bias terms)
- **2D (matrix)**: A grid of numbers (e.g., weight matrices, grayscale images)
- **3D+**: Higher dimensions (e.g., batches of images, sequence data)

### Broadcasting: Automatic Shape Alignment

NumPy-style broadcasting automatically aligns tensors of different shapes for element-wise operations:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])  # Shape: (2, 2)
vector = np.array([10, 20])          # Shape: (2,)
result = matrix + vector             # Broadcasting: (2,2) + (2,) → (2,2)
# Result: [[11, 22], [13, 24]]
```

### Memory Layout and Cache Effects

Understanding row-major (C-style) storage explains why sequential access is faster:

- **Row-wise access**: Sequential memory, excellent cache locality (roughly 2-3× faster)
- **Column-wise access**: Strided memory, poor cache locality
- **Real impact**: The same O(n) algorithm can have dramatically different wall-clock time

A short timing sketch you can run yourself appears just before the Module Structure overview below.

### Dormant Gradient Features

Our Tensor includes gradient-tracking attributes (`requires_grad`, `grad`, `backward()`) from the start, but they remain inactive until Module 05. This design:

- Maintains a consistent API throughout the course (no Variable vs Tensor confusion)
- Follows PyTorch's modern unified Tensor design
- Enables progressive disclosure of complexity

A minimal sketch of this pattern follows.
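To make the dormant-feature idea concrete, here is a minimal sketch of what such a class can look like. This is only an illustration of the design pattern, not the module's reference implementation; your class in `tensor_dev.py` will be more complete, and details may differ.

```python
import numpy as np

class Tensor:
    """Illustrative sketch only; the module's actual class has more methods."""

    def __init__(self, data, requires_grad=False):
        self.data = np.asarray(data, dtype=np.float32)
        # Dormant gradient features: stored from day one, activated in Module 05.
        self.requires_grad = requires_grad
        self.grad = None

    @property
    def shape(self):
        return self.data.shape

    def __add__(self, other):
        other = other.data if isinstance(other, Tensor) else other
        return Tensor(self.data + other)       # NumPy performs the broadcasting

    def __mul__(self, other):
        other = other.data if isinstance(other, Tensor) else other
        return Tensor(self.data * other)       # element-wise (Hadamard) product

    def matmul(self, other):
        return Tensor(self.data @ other.data)  # true matrix multiplication

    def backward(self):
        # Dormant: raises until autograd arrives in Module 05.
        raise NotImplementedError("Autograd is activated in Module 05")

    def __repr__(self):
        return f"Tensor(shape={self.shape}, requires_grad={self.requires_grad})"


x = Tensor([[1, 2], [3, 4]])
print(x + Tensor([10, 20]))                     # broadcasting: (2,2) + (2,) → (2,2)
print(x.matmul(Tensor([[1, 0], [0, 1]])).shape) # (2,2) @ (2,2) → (2, 2)
```

Keeping `requires_grad` and `grad` around from the beginning means the constructor signature never has to change once autograd lands later in the course.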
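As for the roughly 2-3× claim in the memory-layout discussion above, here is a small timing sketch you can run with plain NumPy. The exact ratio depends on your machine, array size, and NumPy build, so treat the numbers as indicative rather than exact:

```python
import time
import numpy as np

a = np.random.rand(4000, 4000)   # row-major (C-order) by default

def best_time(fn, repeats=3):
    """Best wall-clock time of fn() over a few repeats."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times)

# Row-wise: each row is contiguous in memory, so access is sequential.
row_time = best_time(lambda: [a[i, :].sum() for i in range(a.shape[0])])

# Column-wise: elements of a column are far apart (strided), hurting the cache.
col_time = best_time(lambda: [a[:, j].sum() for j in range(a.shape[1])])

print(f"row-wise sums:    {row_time:.3f} s")
print(f"column-wise sums: {col_time:.3f} s ({col_time / row_time:.1f}x slower)")
```

Both loops perform the same O(n) amount of arithmetic; only the memory access pattern differs.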
## Module Structure

1. **Introduction**: What is a Tensor? (concept + ML context)
2. **Foundations**: Mathematical background (broadcasting, memory layout)
3. **Implementation**: Building the Tensor class with immediate unit testing
4. **Integration**: Neural network layer simulation
5. **Systems Analysis**: Memory layout and cache performance
6. **Module Test**: Comprehensive validation

## What You'll Build

```python
# Your complete Tensor class will support:
x = Tensor([[1, 2, 3], [4, 5, 6]])
y = Tensor([[7, 8, 9], [10, 11, 12]])

# Arithmetic operations with broadcasting
z = x + y                              # Element-wise addition
scaled = x * 2                         # Scalar broadcasting
normalized = (x - x.mean()) / x.std()  # Chaining operations

# Matrix operations
W = Tensor([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
output = x.matmul(W)                   # Matrix multiplication: (2,3) @ (3,2) → (2,2)

# Shape manipulation
reshaped = x.reshape(3, 2)             # (2,3) → (3,2)
transposed = x.transpose()             # (2,3) → (3,2) with data rearrangement

# Reduction operations
total = x.sum()                        # Sum of all elements
col_means = x.mean(axis=0)             # Average per column
```

## Connection to Production ML

This module teaches patterns used in production frameworks:

- **PyTorch's Tensor class**: The same API design with unified gradients
- **NumPy broadcasting**: Industry-standard automatic shape alignment
- **Memory efficiency**: Row-major storage, cache-aware algorithms
- **Error handling**: Clear messages that guide users toward solutions

## Files in This Module

- `tensor_dev.py`: Your working implementation (Jupyter notebook format)
- `test_tensor.py`: Comprehensive test suite (run with pytest)
- `README.md`: This file

## Next Steps

After completing this module:

**→ Module 02: Activations**

- Build activation functions (ReLU, Sigmoid, GELU)
- Learn how nonlinearity enables neural networks to learn complex patterns
- Understand vanishing/exploding gradients through activation analysis

Your Tensor class becomes the foundation that all future modules build upon!

## Common Pitfalls to Avoid

1. **Matrix multiplication vs element-wise multiplication**
   - Use `.matmul()` or `@` for matrix multiplication (dot product)
   - Use `*` for element-wise multiplication (Hadamard product)
2. **Shape compatibility in broadcasting**
   - Inner dimensions must match for matmul: (M,K) @ (K,N) ✓
   - Broadcasting aligns shapes starting from the rightmost dimension
   - Clear error messages help debug shape mismatches
3. **Reshape vs transpose confusion**
   - Reshape: same memory layout, different interpretation (fast, O(1))
   - Transpose: data rearrangement in memory (slower, O(n))
4. **Gradient features are dormant**
   - `requires_grad`, `grad`, and `backward()` exist but don't function yet
   - They activate in Module 05 - ignore them for now
   - Don't try to implement gradients manually

A short self-check script covering these pitfalls appears at the end of this README.

## Resources

- **NumPy documentation**: https://numpy.org/doc/stable/
- **PyTorch Tensor API**: https://pytorch.org/docs/stable/tensors.html
- **Broadcasting semantics**: https://numpy.org/doc/stable/user/basics.broadcasting.html

## Getting Help

If you're stuck:

1. Read the error messages carefully - they include hints
2. Check the ASCII diagrams in `tensor_dev.py` for visual explanations
3. Run unit tests individually to isolate issues
4. Review the module integration test for end-to-end examples

Happy learning! 🚀
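As promised in the Common Pitfalls section, here is a short self-check you can run once your class takes shape. It assumes the API shown in "What You'll Build", a NumPy-style `.shape` tuple, and that `Tensor` is importable from `tensor_dev.py`; adjust names and paths to match your own implementation.

```python
from tensor_dev import Tensor   # assumption: your class lives in tensor_dev.py

x = Tensor([[1, 2, 3], [4, 5, 6]])                  # shape (2, 3)
W = Tensor([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])    # shape (3, 2)

# Pitfall 1: matmul vs element-wise multiplication
assert x.matmul(W).shape == (2, 2)                  # (2,3) @ (3,2) → (2,2)
assert (x * 2).shape == (2, 3)                      # scalar broadcast, element-wise

# Pitfall 2: broadcasting aligns from the rightmost dimension
assert (x + Tensor([10, 20, 30])).shape == (2, 3)   # (2,3) + (3,) → (2,3)

# Pitfall 3: reshape reinterprets the data; transpose rearranges it
assert x.reshape(3, 2).shape == (3, 2)
assert x.transpose().shape == (3, 2)

print("Self-check passed: the common pitfalls are handled.")
```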