mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-03-12 03:33:33 -05:00
790 lines
29 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"id": "e37ae542",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"# Module 1: Tensor - Core Data Structure\n",
"\n",
"Welcome to the Tensor module! This is where TinyTorch really begins. You'll implement the fundamental data structure that powers all ML systems.\n",
"\n",
"## Learning Goals\n",
"- Understand tensors as N-dimensional arrays with ML-specific operations\n",
"- Implement a complete Tensor class with arithmetic operations\n",
"- Handle shape management, data types, and memory layout\n",
"- Build the foundation for neural networks and automatic differentiation\n",
"- Master the NBGrader workflow with comprehensive testing\n",
"\n",
"## Build → Use → Understand\n",
"1. **Build**: Create the Tensor class with core operations\n",
"2. **Use**: Perform tensor arithmetic and transformations\n",
"3. **Understand**: How tensors form the foundation of ML systems"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af571489",
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "tensor-imports",
"locked": false,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"#| default_exp core.tensor\n",
"\n",
"#| export\n",
"import numpy as np\n",
"import sys\n",
"from typing import Union, List, Tuple, Optional, Any"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16eb7a23",
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "tensor-setup",
"locked": false,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"print(\"🔥 TinyTorch Tensor Module\")\n",
"print(f\"NumPy version: {np.__version__}\")\n",
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
"print(\"Ready to build tensors!\")"
]
},
{
"cell_type": "markdown",
"id": "79347f07",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 📦 Where This Code Lives in the Final Package\n",
"\n",
"**Learning Side:** You work in `modules/source/01_tensor/tensor_dev.py`  \n",
"**Building Side:** Code exports to `tinytorch.core.tensor`\n",
"\n",
"```python\n",
"# Final package structure:\n",
"from tinytorch.core.tensor import Tensor  # The foundation of everything!\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
"from tinytorch.core.layers import Dense, Conv2D\n",
"```\n",
"\n",
"**Why this matters:**\n",
"- **Learning:** Focused modules for deep understanding\n",
"- **Production:** Proper organization like PyTorch's `torch.Tensor`\n",
"- **Consistency:** All tensor operations live together in `core.tensor`\n",
"- **Foundation:** Every other module depends on Tensor"
]
},
{
"cell_type": "markdown",
"id": "0fb9e8f5",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 1: What is a Tensor?\n",
"\n",
"### Definition\n",
"A **tensor** is an N-dimensional array with ML-specific operations. Think of it as a container that can hold data in multiple dimensions:\n",
"\n",
"- **Scalar** (0D): A single number - `5.0`\n",
"- **Vector** (1D): A list of numbers - `[1, 2, 3]`\n",
"- **Matrix** (2D): A 2D array - `[[1, 2], [3, 4]]`\n",
"- **Higher dimensions**: 3D, 4D, etc. for images, video, batches\n",
"\n",
"### Why Tensors Matter in ML\n",
"Tensors are the foundation of all machine learning because:\n",
"- **Neural networks** process tensors (images, text, audio)\n",
"- **Batch processing** requires multiple samples at once\n",
"- **GPU acceleration** works efficiently with tensors\n",
"- **Automatic differentiation** needs structured data\n",
"\n",
"### Real-World Examples\n",
"- **Image**: 3D tensor `(height, width, channels)` - `(224, 224, 3)` for RGB images\n",
"- **Batch of images**: 4D tensor `(batch_size, height, width, channels)` - `(32, 224, 224, 3)`\n",
"- **Text**: 2D tensor `(sequence_length, embedding_dim)` - `(100, 768)` for BERT embeddings\n",
"- **Audio**: 2D tensor `(time_steps, features)` - `(16000, 1)` for 1 second of audio\n",
"\n",
"### Why Not Just Use NumPy?\n",
"We will use NumPy internally, but our Tensor class adds:\n",
"- **ML-specific operations** (later: gradients, GPU support)\n",
"- **Consistent API** for neural networks\n",
"- **Type safety** and error checking\n",
"- **Integration** with the rest of TinyTorch\n",
"\n",
"Let's start building!"
]
},
{
"cell_type": "markdown",
"id": "211f7216",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🧠 The Mathematical Foundation\n",
"\n",
"### Linear Algebra Refresher\n",
"Tensors are generalizations of scalars, vectors, and matrices:\n",
"\n",
"```\n",
"Scalar (0D): 5\n",
"Vector (1D): [1, 2, 3]\n",
"Matrix (2D): [[1, 2], [3, 4]]\n",
"Tensor (3D): [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]\n",
"```\n",
"\n",
"### Why This Matters for Neural Networks\n",
"- **Forward Pass**: Matrix multiplication between layers\n",
"- **Batch Processing**: Multiple samples processed simultaneously\n",
"- **Convolutions**: 3D operations on image data\n",
"- **Gradients**: Derivatives computed across all dimensions\n",
"\n",
"### Connection to Real ML Systems\n",
"Every major ML framework uses tensors:\n",
"- **PyTorch**: `torch.Tensor`\n",
"- **TensorFlow**: `tf.Tensor`\n",
"- **JAX**: `jax.numpy.ndarray`\n",
"- **TinyTorch**: `tinytorch.core.tensor.Tensor` (what we're building!)\n",
"\n",
"### Performance Considerations\n",
"- **Memory Layout**: Contiguous arrays for cache efficiency\n",
"- **Vectorization**: SIMD operations for speed\n",
"- **Broadcasting**: Efficient operations on different shapes\n",
"- **Type Consistency**: Avoiding unnecessary conversions"
]
},
{
"cell_type": "markdown",
"id": "3b5dc139",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 2: The Tensor Class Foundation\n",
"\n",
"### Core Concept\n",
"Our Tensor class wraps NumPy arrays with ML-specific functionality. It needs to:\n",
"- Handle different input types (scalars, lists, numpy arrays)\n",
"- Provide consistent shape and type information\n",
"- Support arithmetic operations\n",
"- Maintain compatibility with the rest of TinyTorch\n",
"\n",
"### Design Principles\n",
"- **Simplicity**: Easy to create and use\n",
"- **Consistency**: Predictable behavior across operations\n",
"- **Performance**: Efficient NumPy backend\n",
"- **Extensibility**: Ready for future features (gradients, GPU)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f5368e89",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
"grade": false,
"grade_id": "tensor-class",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"#| export\n",
"class Tensor:\n",
"    \"\"\"\n",
"    TinyTorch Tensor: N-dimensional array with ML operations.\n",
"\n",
"    The fundamental data structure for all TinyTorch operations.\n",
"    Wraps NumPy arrays with ML-specific functionality.\n",
"    \"\"\"\n",
"\n",
"    def __init__(self, data: Union[int, float, List, np.ndarray], dtype: Optional[str] = None):\n",
"        \"\"\"\n",
"        Create a new tensor from data.\n",
"\n",
"        Args:\n",
"            data: Input data (scalar, list, or numpy array)\n",
"            dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
"\n",
"        TODO: Implement tensor creation with proper type handling.\n",
"\n",
"        STEP-BY-STEP:\n",
"        1. Check if data is a scalar (int/float) - convert to numpy array\n",
"        2. Check if data is a list - convert to numpy array\n",
"        3. Check if data is already a numpy array - use as-is\n",
"        4. Apply dtype conversion if specified\n",
"        5. Store the result in self._data\n",
"\n",
"        EXAMPLE:\n",
"        Tensor(5) → stores np.array(5)\n",
"        Tensor([1, 2, 3]) → stores np.array([1, 2, 3])\n",
"        Tensor(np.array([1, 2, 3])) → stores the array directly\n",
"\n",
"        HINTS:\n",
"        - Use isinstance() to check data types\n",
"        - Use np.array() for conversion\n",
"        - Handle dtype parameter for type conversion\n",
"        - Store the array in self._data\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        # Convert input to numpy array\n",
"        if isinstance(data, (int, float, np.number)):\n",
"            # Handle Python and NumPy scalars\n",
"            if dtype is None:\n",
"                # Auto-detect type: int for integers, float32 for floats\n",
"                if isinstance(data, int) or (isinstance(data, np.number) and np.issubdtype(type(data), np.integer)):\n",
"                    dtype = 'int32'\n",
"                else:\n",
"                    dtype = 'float32'\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        elif isinstance(data, list):\n",
"            # Let NumPy auto-detect type, then convert if needed\n",
"            temp_array = np.array(data)\n",
"            if dtype is None:\n",
"                # Use NumPy's auto-detected type, but prefer float32 for floats\n",
"                if temp_array.dtype == np.float64:\n",
"                    dtype = 'float32'\n",
"                else:\n",
"                    dtype = str(temp_array.dtype)\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        elif isinstance(data, np.ndarray):\n",
"            # Already a numpy array\n",
"            if dtype is None:\n",
"                # Keep existing dtype, but prefer float32 for float64\n",
"                if data.dtype == np.float64:\n",
"                    dtype = 'float32'\n",
"                else:\n",
"                    dtype = str(data.dtype)\n",
"            self._data = data.astype(dtype) if dtype != data.dtype else data.copy()\n",
"        else:\n",
"            # Try to convert unknown types\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def data(self) -> np.ndarray:\n",
"        \"\"\"\n",
"        Access underlying numpy array.\n",
"\n",
"        TODO: Return the stored numpy array.\n",
"\n",
"        HINT: Return self._data (the array you stored in __init__)\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def shape(self) -> Tuple[int, ...]:\n",
"        \"\"\"\n",
"        Get tensor shape.\n",
"\n",
"        TODO: Return the shape of the stored numpy array.\n",
"\n",
"        HINT: Use .shape attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).shape should return (3,)\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.shape\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def size(self) -> int:\n",
"        \"\"\"\n",
"        Get total number of elements.\n",
"\n",
"        TODO: Return the total number of elements in the tensor.\n",
"\n",
"        HINT: Use .size attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).size should return 3\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.size\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def dtype(self) -> np.dtype:\n",
"        \"\"\"\n",
"        Get data type as numpy dtype.\n",
"\n",
"        TODO: Return the data type of the stored numpy array.\n",
"\n",
"        HINT: Use .dtype attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).dtype should return dtype('int32') or dtype('int64'), depending on the platform\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.dtype\n",
"        ### END SOLUTION\n",
"\n",
"    def __repr__(self) -> str:\n",
"        \"\"\"\n",
"        String representation.\n",
"\n",
"        TODO: Create a clear string representation of the tensor.\n",
"\n",
"        APPROACH:\n",
"        1. Convert the numpy array to a list for readable output\n",
"        2. Include the shape and dtype information\n",
"        3. Format: \"Tensor([data], shape=shape, dtype=dtype)\"\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2, 3]) → \"Tensor([1, 2, 3], shape=(3,), dtype=int32)\"\n",
"\n",
"        HINTS:\n",
"        - Use .tolist() to convert numpy array to list\n",
"        - Include shape and dtype information\n",
"        - Keep format consistent and readable\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return f\"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})\"\n",
"        ### END SOLUTION\n",
"\n",
"    def add(self, other: 'Tensor') -> 'Tensor':\n",
"        \"\"\"\n",
"        Add two tensors element-wise.\n",
"\n",
"        TODO: Implement tensor addition.\n",
"\n",
"        APPROACH:\n",
"        1. Add the numpy arrays using +\n",
"        2. Return a new Tensor with the result\n",
"        3. Handle broadcasting automatically\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
"\n",
"        HINTS:\n",
"        - Use self._data + other._data\n",
"        - Return Tensor(result)\n",
"        - NumPy handles broadcasting automatically\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        result = self._data + other._data\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def multiply(self, other: 'Tensor') -> 'Tensor':\n",
"        \"\"\"\n",
"        Multiply two tensors element-wise.\n",
"\n",
"        TODO: Implement tensor multiplication.\n",
"\n",
"        APPROACH:\n",
"        1. Multiply the numpy arrays using *\n",
"        2. Return a new Tensor with the result\n",
"        3. Handle broadcasting automatically\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
"\n",
"        HINTS:\n",
"        - Use self._data * other._data\n",
"        - Return Tensor(result)\n",
"        - This is element-wise, not matrix multiplication\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        result = self._data * other._data\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Addition operator: tensor + other\n",
"\n",
"        TODO: Implement + operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. If other is a Tensor, use tensor addition\n",
"        2. If other is a scalar, convert to Tensor first\n",
"        3. Return the result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
"        Tensor([1, 2]) + 5 → Tensor([6, 7])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            return self.add(other)\n",
"        else:\n",
"            return self.add(Tensor(other))\n",
"        ### END SOLUTION\n",
"\n",
"    def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Multiplication operator: tensor * other\n",
"\n",
"        TODO: Implement * operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. If other is a Tensor, use tensor multiplication\n",
"        2. If other is a scalar, convert to Tensor first\n",
"        3. Return the result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
"        Tensor([1, 2]) * 3 → Tensor([3, 6])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            return self.multiply(other)\n",
"        else:\n",
"            return self.multiply(Tensor(other))\n",
"        ### END SOLUTION\n",
"\n",
"    def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Subtraction operator: tensor - other\n",
"\n",
"        TODO: Implement - operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. Convert other to Tensor if needed\n",
"        2. Subtract using numpy arrays\n",
"        3. Return new Tensor with result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([5, 6]) - Tensor([1, 2]) → Tensor([4, 4])\n",
"        Tensor([5, 6]) - 1 → Tensor([4, 5])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            result = self._data - other._data\n",
"        else:\n",
"            result = self._data - other\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Division operator: tensor / other\n",
"\n",
"        TODO: Implement / operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. Convert other to Tensor if needed\n",
"        2. Divide using numpy arrays\n",
"        3. Return new Tensor with result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([6, 8]) / Tensor([2, 4]) → Tensor([3, 2])\n",
"        Tensor([6, 8]) / 2 → Tensor([3, 4])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            result = self._data / other._data\n",
"        else:\n",
"            result = self._data / other\n",
"        return Tensor(result)\n",
"        ### END SOLUTION"
]
},
{
"cell_type": "markdown",
"id": "cebcc1d6",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 3: Tensor Arithmetic Operations\n",
"\n",
"### Why Arithmetic Matters\n",
"Tensor arithmetic is the foundation of all neural network operations:\n",
"- **Forward pass**: Matrix multiplications and additions\n",
"- **Activation functions**: Element-wise operations\n",
"- **Loss computation**: Differences and squares\n",
"- **Gradient computation**: Chain rule applications\n",
"\n",
"### Operations We'll Implement\n",
"- **Addition**: Element-wise addition of tensors\n",
"- **Multiplication**: Element-wise multiplication\n",
"- **Python operators**: `+`, `-`, `*`, `/` for natural syntax\n",
"- **Broadcasting**: Handle different shapes automatically"
]
},
{
"cell_type": "markdown",
"id": "5afc47f3",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Arithmetic Methods\n",
"\n",
"The arithmetic methods (`add`, `multiply`) are now part of the Tensor class above. Let's test them!"
]
},
{
"cell_type": "markdown",
"id": "04dc4fac",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 4: Python Operator Overloading\n",
"\n",
"### Why Operator Overloading?\n",
"Python's magic methods allow us to use natural syntax:\n",
"- `a + b` instead of `a.add(b)`\n",
"- `a * b` instead of `a.multiply(b)`\n",
"- `a - b` for subtraction\n",
"- `a / b` for division\n",
"\n",
"This makes tensor operations feel natural and readable."
]
},
{
"cell_type": "markdown",
"id": "35ae8a76",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### Operator Methods\n",
"\n",
"The operator methods (`__add__`, `__mul__`, `__sub__`, `__truediv__`) are now part of the Tensor class above. This enables natural syntax like `a + b` and `a * b`."
]
},
{
"cell_type": "markdown",
"id": "1a00809c",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Tensor Implementation\n",
"\n",
"Once you implement the Tensor class above, run these cells to test your implementation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ac88fbc",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-creation",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor creation and properties\n",
"print(\"Testing tensor creation...\")\n",
"\n",
"# Test scalar creation\n",
"scalar = Tensor(5.0)\n",
"assert scalar.shape == (), f\"Scalar shape should be (), got {scalar.shape}\"\n",
"assert scalar.size == 1, f\"Scalar size should be 1, got {scalar.size}\"\n",
"assert scalar.data.item() == 5.0, f\"Scalar value should be 5.0, got {scalar.data.item()}\"\n",
"\n",
"# Test vector creation\n",
"vector = Tensor([1, 2, 3])\n",
"assert vector.shape == (3,), f\"Vector shape should be (3,), got {vector.shape}\"\n",
"assert vector.size == 3, f\"Vector size should be 3, got {vector.size}\"\n",
"assert np.array_equal(vector.data, np.array([1, 2, 3])), \"Vector data mismatch\"\n",
"\n",
"# Test matrix creation\n",
"matrix = Tensor([[1, 2], [3, 4]])\n",
"assert matrix.shape == (2, 2), f\"Matrix shape should be (2, 2), got {matrix.shape}\"\n",
"assert matrix.size == 4, f\"Matrix size should be 4, got {matrix.size}\"\n",
"assert np.array_equal(matrix.data, np.array([[1, 2], [3, 4]])), \"Matrix data mismatch\"\n",
"\n",
"# Test dtype handling\n",
"float_tensor = Tensor([1.0, 2.0, 3.0])\n",
"assert float_tensor.dtype == np.float32, f\"Float tensor dtype should be float32, got {float_tensor.dtype}\"\n",
"\n",
"int_tensor = Tensor([1, 2, 3])\n",
"# Note: NumPy may default to int64 on some systems, so we check for integer types\n",
"assert int_tensor.dtype in [np.int32, np.int64], f\"Int tensor dtype should be int32 or int64, got {int_tensor.dtype}\"\n",
"\n",
"print(\"✅ Tensor creation tests passed!\")\n",
"print(f\"✅ Scalar: {scalar}\")\n",
"print(f\"✅ Vector: {vector}\")\n",
"print(f\"✅ Matrix: {matrix}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "edc7519d",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-arithmetic",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor arithmetic operations\n",
"print(\"Testing tensor arithmetic...\")\n",
"\n",
"# Test addition\n",
"a = Tensor([1, 2, 3])\n",
"b = Tensor([4, 5, 6])\n",
"c = a + b\n",
"expected = np.array([5, 7, 9])\n",
"assert np.array_equal(c.data, expected), f\"Addition failed: expected {expected}, got {c.data}\"\n",
"\n",
"# Test multiplication\n",
"d = a * b\n",
"expected = np.array([4, 10, 18])\n",
"assert np.array_equal(d.data, expected), f\"Multiplication failed: expected {expected}, got {d.data}\"\n",
"\n",
"# Test subtraction\n",
"e = b - a\n",
"expected = np.array([3, 3, 3])\n",
"assert np.array_equal(e.data, expected), f\"Subtraction failed: expected {expected}, got {e.data}\"\n",
"\n",
"# Test division\n",
"f = b / a\n",
"expected = np.array([4.0, 2.5, 2.0])\n",
"assert np.allclose(f.data, expected), f\"Division failed: expected {expected}, got {f.data}\"\n",
"\n",
"# Test scalar operations\n",
"g = a + 10\n",
"expected = np.array([11, 12, 13])\n",
"assert np.array_equal(g.data, expected), f\"Scalar addition failed: expected {expected}, got {g.data}\"\n",
"\n",
"h = a * 2\n",
"expected = np.array([2, 4, 6])\n",
"assert np.array_equal(h.data, expected), f\"Scalar multiplication failed: expected {expected}, got {h.data}\"\n",
"\n",
"print(\"✅ Tensor arithmetic tests passed!\")\n",
"print(f\"✅ Addition: {a} + {b} = {c}\")\n",
"print(f\"✅ Multiplication: {a} * {b} = {d}\")\n",
"print(f\"✅ Subtraction: {b} - {a} = {e}\")\n",
"print(f\"✅ Division: {b} / {a} = {f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba87775f",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-broadcasting",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor broadcasting\n",
"print(\"Testing tensor broadcasting...\")\n",
"\n",
"# Test scalar broadcasting\n",
"matrix = Tensor([[1, 2], [3, 4]])\n",
"scalar = Tensor(10)\n",
"result = matrix + scalar\n",
"expected = np.array([[11, 12], [13, 14]])\n",
"assert np.array_equal(result.data, expected), f\"Scalar broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"# Test vector broadcasting\n",
"vector = Tensor([1, 2])\n",
"result = matrix + vector\n",
"expected = np.array([[2, 4], [4, 6]])\n",
"assert np.array_equal(result.data, expected), f\"Vector broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"# Test different shapes\n",
"a = Tensor([[1], [2], [3]])  # (3, 1)\n",
"b = Tensor([10, 20])  # (2,)\n",
"result = a + b\n",
"expected = np.array([[11, 21], [12, 22], [13, 23]])\n",
"assert np.array_equal(result.data, expected), f\"Shape broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"print(\"✅ Tensor broadcasting tests passed!\")\n",
"print(f\"✅ Matrix + Scalar: {matrix} + {scalar} = {matrix + scalar}\")\n",
"print(f\"✅ Broadcasting works correctly!\")"
]
},
{
"cell_type": "markdown",
"id": "8ac93d30",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🎯 Module Summary\n",
"\n",
"Congratulations! You've successfully implemented the core Tensor class for TinyTorch:\n",
"\n",
"### What You've Accomplished\n",
"✅ **Tensor Creation**: Handle scalars, vectors, matrices, and higher-dimensional arrays  \n",
"✅ **Data Types**: Proper dtype handling with auto-detection and conversion  \n",
"✅ **Properties**: Shape, size, dtype, and data access  \n",
"✅ **Arithmetic**: Addition, multiplication, subtraction, division  \n",
"✅ **Operators**: Natural Python syntax with `+`, `-`, `*`, `/`  \n",
"✅ **Broadcasting**: Automatic shape compatibility like NumPy  \n",
"\n",
"### Key Concepts You've Learned\n",
"- **Tensors** are the fundamental data structure for ML systems\n",
"- **NumPy backend** provides efficient computation with an ML-friendly API\n",
"- **Operator overloading** makes tensor operations feel natural\n",
"- **Broadcasting** enables flexible operations between different shapes\n",
"- **Type safety** ensures consistent behavior across operations\n",
"\n",
"### Next Steps\n",
"1. **Export your code**: `tito package nbdev --export 01_tensor`\n",
"2. **Test your implementation**: `tito module test 01_tensor`\n",
"3. **Use your tensors**:\n",
"   ```python\n",
"   from tinytorch.core.tensor import Tensor\n",
"   t = Tensor([1, 2, 3])\n",
"   print(t + 5)  # Your tensor in action!\n",
"   ```\n",
"4. **Move to Module 2**: Start building activation functions!\n",
"\n",
"**Ready for the next challenge?** Let's add the mathematical functions that make neural networks powerful!"
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
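
Appendix: the notebook builds the Tensor solution piece by piece across graded cells. For a quick sanity check outside Jupyter, the same behavior can be sketched as one self-contained script. This `Tensor` is a condensed stand-in mirroring the solution cells above (not the exported `tinytorch.core.tensor` class), with the dtype policy simplified to "prefer float32 over NumPy's float64 default":

```python
import numpy as np


class Tensor:
    """Minimal sketch of the notebook's Tensor: a NumPy wrapper with operators."""

    def __init__(self, data, dtype=None):
        arr = np.array(data)
        if dtype is None:
            # Prefer float32 over NumPy's float64 default, as in the notebook.
            dtype = "float32" if arr.dtype == np.float64 else str(arr.dtype)
        self._data = arr.astype(dtype)

    @property
    def data(self):
        return self._data

    @property
    def shape(self):
        return self._data.shape

    def __repr__(self):
        return f"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self._data.dtype})"

    def _coerce(self, other):
        # Wrap plain scalars/lists so Tensor op scalar works like Tensor op Tensor.
        return other if isinstance(other, Tensor) else Tensor(other)

    def __add__(self, other):
        return Tensor(self._data + self._coerce(other)._data)

    def __mul__(self, other):
        return Tensor(self._data * self._coerce(other)._data)

    def __sub__(self, other):
        return Tensor(self._data - self._coerce(other)._data)

    def __truediv__(self, other):
        return Tensor(self._data / self._coerce(other)._data)


# Element-wise arithmetic and NumPy-style broadcasting come along for free:
a = Tensor([1, 2, 3])
b = Tensor([4, 5, 6])
print(a + b)                  # element-wise sum
print(a * 2)                  # scalar broadcast
m = Tensor([[1, 2], [3, 4]])
print(m + Tensor([10, 20]))   # (2, 2) + (2,) row broadcast
```

This mirrors what the graded cells check: element-wise `+`, `-`, `*`, `/`, scalar coercion, and broadcasting delegated to NumPy.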