mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-03-12 01:45:57 -05:00
feat: Enhanced tensor and activations modules with comprehensive educational content
- Added package structure documentation explaining modules/source/ vs tinytorch.core
- Enhanced mathematical foundations with linear algebra refresher and Universal Approximation Theorem
- Added real-world applications for each activation function (ReLU, Sigmoid, Tanh, Softmax)
- Included mathematical properties, derivatives, ranges, and computational costs
- Added performance considerations and numerical stability explanations
- Connected to production ML systems (PyTorch, TensorFlow, JAX equivalents)
- Implemented streamlined 'tito export' command with automatic .py → .ipynb conversion
- All functionality preserved: scripts run correctly, tests pass, package integration works
- Ready to continue with remaining modules (layers, networks, cnn, dataloader)
This commit is contained in:
789
modules/source/01_tensor/tensor_dev.ipynb
Normal file
@@ -0,0 +1,789 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e37ae542",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"# Module 1: Tensor - Core Data Structure\n",
"\n",
"Welcome to the Tensor module! This is where TinyTorch really begins. You'll implement the fundamental data structure that powers all ML systems.\n",
"\n",
"## Learning Goals\n",
"- Understand tensors as N-dimensional arrays with ML-specific operations\n",
"- Implement a complete Tensor class with arithmetic operations\n",
"- Handle shape management, data types, and memory layout\n",
"- Build the foundation for neural networks and automatic differentiation\n",
"- Master the NBGrader workflow with comprehensive testing\n",
"\n",
"## Build → Use → Understand\n",
"1. **Build**: Create the Tensor class with core operations\n",
"2. **Use**: Perform tensor arithmetic and transformations\n",
"3. **Understand**: How tensors form the foundation of ML systems"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af571489",
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "tensor-imports",
"locked": false,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"#| default_exp core.tensor\n",
"\n",
"#| export\n",
"import numpy as np\n",
"import sys\n",
"from typing import Union, List, Tuple, Optional, Any"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16eb7a23",
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "tensor-setup",
"locked": false,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"print(\"🔥 TinyTorch Tensor Module\")\n",
"print(f\"NumPy version: {np.__version__}\")\n",
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
"print(\"Ready to build tensors!\")"
]
},
{
"cell_type": "markdown",
"id": "79347f07",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 📦 Where This Code Lives in the Final Package\n",
"\n",
"**Learning Side:** You work in `modules/source/01_tensor/tensor_dev.py`  \n",
"**Building Side:** Code exports to `tinytorch.core.tensor`\n",
"\n",
"```python\n",
"# Final package structure:\n",
"from tinytorch.core.tensor import Tensor  # The foundation of everything!\n",
"from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
"from tinytorch.core.layers import Dense, Conv2D\n",
"```\n",
"\n",
"**Why this matters:**\n",
"- **Learning:** Focused modules for deep understanding\n",
"- **Production:** Proper organization like PyTorch's `torch.Tensor`\n",
"- **Consistency:** All tensor operations live together in `core.tensor`\n",
"- **Foundation:** Every other module depends on Tensor"
]
},
{
"cell_type": "markdown",
"id": "0fb9e8f5",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 1: What is a Tensor?\n",
"\n",
"### Definition\n",
"A **tensor** is an N-dimensional array with ML-specific operations. Think of it as a container that can hold data in multiple dimensions:\n",
"\n",
"- **Scalar** (0D): A single number - `5.0`\n",
"- **Vector** (1D): A list of numbers - `[1, 2, 3]`\n",
"- **Matrix** (2D): A 2D array - `[[1, 2], [3, 4]]`\n",
"- **Higher dimensions**: 3D, 4D, etc. for images, video, batches\n",
"\n",
"### Why Tensors Matter in ML\n",
"Tensors are the foundation of all machine learning because:\n",
"- **Neural networks** process tensors (images, text, audio)\n",
"- **Batch processing** requires multiple samples at once\n",
"- **GPU acceleration** works efficiently with tensors\n",
"- **Automatic differentiation** needs structured data\n",
"\n",
"### Real-World Examples\n",
"- **Image**: 3D tensor `(height, width, channels)` - `(224, 224, 3)` for RGB images\n",
"- **Batch of images**: 4D tensor `(batch_size, height, width, channels)` - `(32, 224, 224, 3)`\n",
"- **Text**: 2D tensor `(sequence_length, embedding_dim)` - `(100, 768)` for BERT embeddings\n",
"- **Audio**: 2D tensor `(time_steps, features)` - `(16000, 1)` for 1 second of audio\n",
"\n",
"### Why Not Just Use NumPy?\n",
"We will use NumPy internally, but our Tensor class adds:\n",
"- **ML-specific operations** (later: gradients, GPU support)\n",
"- **Consistent API** for neural networks\n",
"- **Type safety** and error checking\n",
"- **Integration** with the rest of TinyTorch\n",
"\n",
"Let's start building!"
]
},
{
"cell_type": "markdown",
"id": "211f7216",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🧠 The Mathematical Foundation\n",
"\n",
"### Linear Algebra Refresher\n",
"Tensors are generalizations of scalars, vectors, and matrices:\n",
"\n",
"```\n",
"Scalar (0D): 5\n",
"Vector (1D): [1, 2, 3]\n",
"Matrix (2D): [[1, 2], [3, 4]]\n",
"Tensor (3D): [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]\n",
"```\n",
"\n",
"### Why This Matters for Neural Networks\n",
"- **Forward Pass**: Matrix multiplication between layers\n",
"- **Batch Processing**: Multiple samples processed simultaneously\n",
"- **Convolutions**: 3D operations on image data\n",
"- **Gradients**: Derivatives computed across all dimensions\n",
"\n",
"### Connection to Real ML Systems\n",
"Every major ML framework uses tensors:\n",
"- **PyTorch**: `torch.Tensor`\n",
"- **TensorFlow**: `tf.Tensor`\n",
"- **JAX**: `jax.numpy.ndarray`\n",
"- **TinyTorch**: `tinytorch.core.tensor.Tensor` (what we're building!)\n",
"\n",
"### Performance Considerations\n",
"- **Memory Layout**: Contiguous arrays for cache efficiency\n",
"- **Vectorization**: SIMD operations for speed\n",
"- **Broadcasting**: Efficient operations on different shapes\n",
"- **Type Consistency**: Avoiding unnecessary conversions"
]
},
{
"cell_type": "markdown",
"id": "3b5dc139",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
},
"source": [
"## Step 2: The Tensor Class Foundation\n",
"\n",
"### Core Concept\n",
"Our Tensor class wraps NumPy arrays with ML-specific functionality. It needs to:\n",
"- Handle different input types (scalars, lists, numpy arrays)\n",
"- Provide consistent shape and type information\n",
"- Support arithmetic operations\n",
"- Maintain compatibility with the rest of TinyTorch\n",
"\n",
"### Design Principles\n",
"- **Simplicity**: Easy to create and use\n",
"- **Consistency**: Predictable behavior across operations\n",
"- **Performance**: Efficient NumPy backend\n",
"- **Extensibility**: Ready for future features (gradients, GPU)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f5368e89",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
"grade": false,
"grade_id": "tensor-class",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"#| export\n",
"class Tensor:\n",
"    \"\"\"\n",
"    TinyTorch Tensor: N-dimensional array with ML operations.\n",
"\n",
"    The fundamental data structure for all TinyTorch operations.\n",
"    Wraps NumPy arrays with ML-specific functionality.\n",
"    \"\"\"\n",
"\n",
"    def __init__(self, data: Union[int, float, List, np.ndarray], dtype: Optional[str] = None):\n",
"        \"\"\"\n",
"        Create a new tensor from data.\n",
"\n",
"        Args:\n",
"            data: Input data (scalar, list, or numpy array)\n",
"            dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
"\n",
"        TODO: Implement tensor creation with proper type handling.\n",
"\n",
"        STEP-BY-STEP:\n",
"        1. Check if data is a scalar (int/float) - convert to numpy array\n",
"        2. Check if data is a list - convert to numpy array\n",
"        3. Check if data is already a numpy array - use as-is\n",
"        4. Apply dtype conversion if specified\n",
"        5. Store the result in self._data\n",
"\n",
"        EXAMPLE:\n",
"        Tensor(5) → stores np.array(5)\n",
"        Tensor([1, 2, 3]) → stores np.array([1, 2, 3])\n",
"        Tensor(np.array([1, 2, 3])) → stores the array directly\n",
"\n",
"        HINTS:\n",
"        - Use isinstance() to check data types\n",
"        - Use np.array() for conversion\n",
"        - Handle dtype parameter for type conversion\n",
"        - Store the array in self._data\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        # Convert input to numpy array\n",
"        if isinstance(data, (int, float, np.number)):\n",
"            # Handle Python and NumPy scalars\n",
"            if dtype is None:\n",
"                # Auto-detect type: int32 for integers, float32 for floats\n",
"                if isinstance(data, int) or (isinstance(data, np.number) and np.issubdtype(type(data), np.integer)):\n",
"                    dtype = 'int32'\n",
"                else:\n",
"                    dtype = 'float32'\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        elif isinstance(data, list):\n",
"            # Let NumPy auto-detect type, then convert if needed\n",
"            temp_array = np.array(data)\n",
"            if dtype is None:\n",
"                # Use NumPy's auto-detected type, but prefer float32 for floats\n",
"                if temp_array.dtype == np.float64:\n",
"                    dtype = 'float32'\n",
"                else:\n",
"                    dtype = str(temp_array.dtype)\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        elif isinstance(data, np.ndarray):\n",
"            # Already a numpy array\n",
"            if dtype is None:\n",
"                # Keep existing dtype, but prefer float32 for float64\n",
"                if data.dtype == np.float64:\n",
"                    dtype = 'float32'\n",
"                else:\n",
"                    dtype = str(data.dtype)\n",
"            self._data = data.astype(dtype) if dtype != data.dtype else data.copy()\n",
"        else:\n",
"            # Try to convert unknown types\n",
"            self._data = np.array(data, dtype=dtype)\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def data(self) -> np.ndarray:\n",
"        \"\"\"\n",
"        Access underlying numpy array.\n",
"\n",
"        TODO: Return the stored numpy array.\n",
"\n",
"        HINT: Return self._data (the array you stored in __init__)\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def shape(self) -> Tuple[int, ...]:\n",
"        \"\"\"\n",
"        Get tensor shape.\n",
"\n",
"        TODO: Return the shape of the stored numpy array.\n",
"\n",
"        HINT: Use .shape attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).shape should return (3,)\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.shape\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def size(self) -> int:\n",
"        \"\"\"\n",
"        Get total number of elements.\n",
"\n",
"        TODO: Return the total number of elements in the tensor.\n",
"\n",
"        HINT: Use .size attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).size should return 3\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.size\n",
"        ### END SOLUTION\n",
"\n",
"    @property\n",
"    def dtype(self) -> np.dtype:\n",
"        \"\"\"\n",
"        Get data type as numpy dtype.\n",
"\n",
"        TODO: Return the data type of the stored numpy array.\n",
"\n",
"        HINT: Use .dtype attribute of the numpy array\n",
"        EXAMPLE: Tensor([1, 2, 3]).dtype should return an integer dtype (int32 or int64, depending on platform)\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return self._data.dtype\n",
"        ### END SOLUTION\n",
"\n",
"    def __repr__(self) -> str:\n",
"        \"\"\"\n",
"        String representation.\n",
"\n",
"        TODO: Create a clear string representation of the tensor.\n",
"\n",
"        APPROACH:\n",
"        1. Convert the numpy array to a list for readable output\n",
"        2. Include the shape and dtype information\n",
"        3. Format: \"Tensor([data], shape=shape, dtype=dtype)\"\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2, 3]) → \"Tensor([1, 2, 3], shape=(3,), dtype=int32)\"\n",
"\n",
"        HINTS:\n",
"        - Use .tolist() to convert numpy array to list\n",
"        - Include shape and dtype information\n",
"        - Keep format consistent and readable\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        return f\"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})\"\n",
"        ### END SOLUTION\n",
"\n",
"    def add(self, other: 'Tensor') -> 'Tensor':\n",
"        \"\"\"\n",
"        Add two tensors element-wise.\n",
"\n",
"        TODO: Implement tensor addition.\n",
"\n",
"        APPROACH:\n",
"        1. Add the numpy arrays using +\n",
"        2. Return a new Tensor with the result\n",
"        3. Handle broadcasting automatically\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
"\n",
"        HINTS:\n",
"        - Use self._data + other._data\n",
"        - Return Tensor(result)\n",
"        - NumPy handles broadcasting automatically\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        result = self._data + other._data\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def multiply(self, other: 'Tensor') -> 'Tensor':\n",
"        \"\"\"\n",
"        Multiply two tensors element-wise.\n",
"\n",
"        TODO: Implement tensor multiplication.\n",
"\n",
"        APPROACH:\n",
"        1. Multiply the numpy arrays using *\n",
"        2. Return a new Tensor with the result\n",
"        3. Handle broadcasting automatically\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
"\n",
"        HINTS:\n",
"        - Use self._data * other._data\n",
"        - Return Tensor(result)\n",
"        - This is element-wise, not matrix multiplication\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        result = self._data * other._data\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Addition operator: tensor + other\n",
"\n",
"        TODO: Implement + operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. If other is a Tensor, use tensor addition\n",
"        2. If other is a scalar, convert to Tensor first\n",
"        3. Return the result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
"        Tensor([1, 2]) + 5 → Tensor([6, 7])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            return self.add(other)\n",
"        else:\n",
"            return self.add(Tensor(other))\n",
"        ### END SOLUTION\n",
"\n",
"    def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Multiplication operator: tensor * other\n",
"\n",
"        TODO: Implement * operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. If other is a Tensor, use tensor multiplication\n",
"        2. If other is a scalar, convert to Tensor first\n",
"        3. Return the result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
"        Tensor([1, 2]) * 3 → Tensor([3, 6])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            return self.multiply(other)\n",
"        else:\n",
"            return self.multiply(Tensor(other))\n",
"        ### END SOLUTION\n",
"\n",
"    def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Subtraction operator: tensor - other\n",
"\n",
"        TODO: Implement - operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. Convert other to Tensor if needed\n",
"        2. Subtract using numpy arrays\n",
"        3. Return new Tensor with result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([5, 6]) - Tensor([1, 2]) → Tensor([4, 4])\n",
"        Tensor([5, 6]) - 1 → Tensor([4, 5])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            result = self._data - other._data\n",
"        else:\n",
"            result = self._data - other\n",
"        return Tensor(result)\n",
"        ### END SOLUTION\n",
"\n",
"    def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
"        \"\"\"\n",
"        Division operator: tensor / other\n",
"\n",
"        TODO: Implement / operator for tensors.\n",
"\n",
"        APPROACH:\n",
"        1. Convert other to Tensor if needed\n",
"        2. Divide using numpy arrays\n",
"        3. Return new Tensor with result\n",
"\n",
"        EXAMPLE:\n",
"        Tensor([6, 8]) / Tensor([2, 4]) → Tensor([3, 2])\n",
"        Tensor([6, 8]) / 2 → Tensor([3, 4])\n",
"        \"\"\"\n",
"        ### BEGIN SOLUTION\n",
"        if isinstance(other, Tensor):\n",
"            result = self._data / other._data\n",
"        else:\n",
"            result = self._data / other\n",
"        return Tensor(result)\n",
"        ### END SOLUTION"
]
},
{
"cell_type": "markdown",
"id": "cebcc1d6",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 3: Tensor Arithmetic Operations\n",
"\n",
"### Why Arithmetic Matters\n",
"Tensor arithmetic is the foundation of all neural network operations:\n",
"- **Forward pass**: Matrix multiplications and additions\n",
"- **Activation functions**: Element-wise operations\n",
"- **Loss computation**: Differences and squares\n",
"- **Gradient computation**: Chain rule applications\n",
"\n",
"### Operations We'll Implement\n",
"- **Addition**: Element-wise addition of tensors\n",
"- **Multiplication**: Element-wise multiplication\n",
"- **Python operators**: `+`, `-`, `*`, `/` for natural syntax\n",
"- **Broadcasting**: Handle different shapes automatically"
]
},
{
"cell_type": "markdown",
"id": "5afc47f3",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 3 (continued): Tensor Arithmetic Methods\n",
"\n",
"The arithmetic methods are now part of the Tensor class above. Let's test them!"
]
},
{
"cell_type": "markdown",
"id": "04dc4fac",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 4: Python Operator Overloading\n",
"\n",
"### Why Operator Overloading?\n",
"Python's magic methods allow us to use natural syntax:\n",
"- `a + b` instead of `a.add(b)`\n",
"- `a * b` instead of `a.multiply(b)`\n",
"- `a - b` for subtraction\n",
"- `a / b` for division\n",
"\n",
"This makes tensor operations feel natural and readable."
]
},
{
"cell_type": "markdown",
"id": "35ae8a76",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## Step 4 (continued): Operator Overloading\n",
"\n",
"The operator methods (`__add__`, `__mul__`, `__sub__`, `__truediv__`) are now part of the Tensor class above. This enables natural syntax like `a + b` and `a * b`."
]
},
{
"cell_type": "markdown",
"id": "1a00809c",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Tensor Implementation\n",
"\n",
"Once you implement the Tensor class above, run these cells to test your implementation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ac88fbc",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-creation",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor creation and properties\n",
"print(\"Testing tensor creation...\")\n",
"\n",
"# Test scalar creation\n",
"scalar = Tensor(5.0)\n",
"assert scalar.shape == (), f\"Scalar shape should be (), got {scalar.shape}\"\n",
"assert scalar.size == 1, f\"Scalar size should be 1, got {scalar.size}\"\n",
"assert scalar.data.item() == 5.0, f\"Scalar value should be 5.0, got {scalar.data.item()}\"\n",
"\n",
"# Test vector creation\n",
"vector = Tensor([1, 2, 3])\n",
"assert vector.shape == (3,), f\"Vector shape should be (3,), got {vector.shape}\"\n",
"assert vector.size == 3, f\"Vector size should be 3, got {vector.size}\"\n",
"assert np.array_equal(vector.data, np.array([1, 2, 3])), \"Vector data mismatch\"\n",
"\n",
"# Test matrix creation\n",
"matrix = Tensor([[1, 2], [3, 4]])\n",
"assert matrix.shape == (2, 2), f\"Matrix shape should be (2, 2), got {matrix.shape}\"\n",
"assert matrix.size == 4, f\"Matrix size should be 4, got {matrix.size}\"\n",
"assert np.array_equal(matrix.data, np.array([[1, 2], [3, 4]])), \"Matrix data mismatch\"\n",
"\n",
"# Test dtype handling\n",
"float_tensor = Tensor([1.0, 2.0, 3.0])\n",
"assert float_tensor.dtype == np.float32, f\"Float tensor dtype should be float32, got {float_tensor.dtype}\"\n",
"\n",
"int_tensor = Tensor([1, 2, 3])\n",
"# Note: NumPy may default to int64 on some systems, so we check for integer types\n",
"assert int_tensor.dtype in [np.int32, np.int64], f\"Int tensor dtype should be int32 or int64, got {int_tensor.dtype}\"\n",
"\n",
"print(\"✅ Tensor creation tests passed!\")\n",
"print(f\"✅ Scalar: {scalar}\")\n",
"print(f\"✅ Vector: {vector}\")\n",
"print(f\"✅ Matrix: {matrix}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "edc7519d",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-arithmetic",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor arithmetic operations\n",
"print(\"Testing tensor arithmetic...\")\n",
"\n",
"# Test addition\n",
"a = Tensor([1, 2, 3])\n",
"b = Tensor([4, 5, 6])\n",
"c = a + b\n",
"expected = np.array([5, 7, 9])\n",
"assert np.array_equal(c.data, expected), f\"Addition failed: expected {expected}, got {c.data}\"\n",
"\n",
"# Test multiplication\n",
"d = a * b\n",
"expected = np.array([4, 10, 18])\n",
"assert np.array_equal(d.data, expected), f\"Multiplication failed: expected {expected}, got {d.data}\"\n",
"\n",
"# Test subtraction\n",
"e = b - a\n",
"expected = np.array([3, 3, 3])\n",
"assert np.array_equal(e.data, expected), f\"Subtraction failed: expected {expected}, got {e.data}\"\n",
"\n",
"# Test division\n",
"f = b / a\n",
"expected = np.array([4.0, 2.5, 2.0])\n",
"assert np.allclose(f.data, expected), f\"Division failed: expected {expected}, got {f.data}\"\n",
"\n",
"# Test scalar operations\n",
"g = a + 10\n",
"expected = np.array([11, 12, 13])\n",
"assert np.array_equal(g.data, expected), f\"Scalar addition failed: expected {expected}, got {g.data}\"\n",
"\n",
"h = a * 2\n",
"expected = np.array([2, 4, 6])\n",
"assert np.array_equal(h.data, expected), f\"Scalar multiplication failed: expected {expected}, got {h.data}\"\n",
"\n",
"print(\"✅ Tensor arithmetic tests passed!\")\n",
"print(f\"✅ Addition: {a} + {b} = {c}\")\n",
"print(f\"✅ Multiplication: {a} * {b} = {d}\")\n",
"print(f\"✅ Subtraction: {b} - {a} = {e}\")\n",
"print(f\"✅ Division: {b} / {a} = {f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba87775f",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tensor-broadcasting",
"locked": true,
"points": 25,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test tensor broadcasting\n",
"print(\"Testing tensor broadcasting...\")\n",
"\n",
"# Test scalar broadcasting\n",
"matrix = Tensor([[1, 2], [3, 4]])\n",
"scalar = Tensor(10)\n",
"result = matrix + scalar\n",
"expected = np.array([[11, 12], [13, 14]])\n",
"assert np.array_equal(result.data, expected), f\"Scalar broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"# Test vector broadcasting\n",
"vector = Tensor([1, 2])\n",
"result = matrix + vector\n",
"expected = np.array([[2, 4], [4, 6]])\n",
"assert np.array_equal(result.data, expected), f\"Vector broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"# Test different shapes\n",
"a = Tensor([[1], [2], [3]])  # (3, 1)\n",
"b = Tensor([10, 20])  # (2,)\n",
"result = a + b\n",
"expected = np.array([[11, 21], [12, 22], [13, 23]])\n",
"assert np.array_equal(result.data, expected), f\"Shape broadcasting failed: expected {expected}, got {result.data}\"\n",
"\n",
"print(\"✅ Tensor broadcasting tests passed!\")\n",
"print(f\"✅ Matrix + Scalar: {matrix} + {scalar} = {result}\")\n",
"print(f\"✅ Broadcasting works correctly!\")"
]
},
{
"cell_type": "markdown",
"id": "8ac93d30",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🎯 Module Summary\n",
"\n",
"Congratulations! You've successfully implemented the core Tensor class for TinyTorch:\n",
"\n",
"### What You've Accomplished\n",
"✅ **Tensor Creation**: Handle scalars, vectors, matrices, and higher-dimensional arrays  \n",
"✅ **Data Types**: Proper dtype handling with auto-detection and conversion  \n",
"✅ **Properties**: Shape, size, dtype, and data access  \n",
"✅ **Arithmetic**: Addition, multiplication, subtraction, division  \n",
"✅ **Operators**: Natural Python syntax with `+`, `-`, `*`, `/`  \n",
"✅ **Broadcasting**: Automatic shape compatibility like NumPy  \n",
"\n",
"### Key Concepts You've Learned\n",
"- **Tensors** are the fundamental data structure for ML systems\n",
"- **NumPy backend** provides efficient computation with an ML-friendly API\n",
"- **Operator overloading** makes tensor operations feel natural\n",
"- **Broadcasting** enables flexible operations between different shapes\n",
"- **Type safety** ensures consistent behavior across operations\n",
"\n",
"### Next Steps\n",
"1. **Export your code**: `tito package nbdev --export 01_tensor`\n",
"2. **Test your implementation**: `tito module test 01_tensor`\n",
"3. **Use your tensors**:\n",
"   ```python\n",
"   from tinytorch.core.tensor import Tensor\n",
"   t = Tensor([1, 2, 3])\n",
"   print(t + 5)  # Your tensor in action!\n",
"   ```\n",
"4. **Move to Module 2**: Start building activation functions!\n",
"\n",
"**Ready for the next challenge?** Let's add the mathematical functions that make neural networks powerful!"
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
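The notebook above builds the `Tensor` class in graded steps. As an illustration of the behavior its tests check (creation, operator overloading, and NumPy broadcasting), here is a minimal standalone sketch; it condenses the notebook's solution and omits the dtype auto-detection details, so treat it as an assumption-laden summary rather than the exact exported class:

```python
import numpy as np

class Tensor:
    """Minimal sketch of the notebook's Tensor: a NumPy array wrapper."""

    def __init__(self, data, dtype=None):
        # Wrap scalars, lists, and arrays in a NumPy array
        self._data = np.array(data, dtype=dtype)

    @property
    def data(self):
        return self._data

    @property
    def shape(self):
        return self._data.shape

    def _coerce(self, other):
        # Accept either a Tensor or a raw scalar/list
        return other if isinstance(other, Tensor) else Tensor(other)

    def __add__(self, other):
        # NumPy broadcasting handles mismatched shapes
        return Tensor(self._data + self._coerce(other)._data)

    def __mul__(self, other):
        # Element-wise product, not matrix multiplication
        return Tensor(self._data * self._coerce(other)._data)

a = Tensor([1, 2, 3])
print((a + 5).data.tolist())                  # → [6, 7, 8]
print((a * Tensor([4, 5, 6])).data.tolist())  # → [4, 10, 18]
```

The scalar case works because `_coerce` wraps `5` as a 0-d array, which NumPy then broadcasts across the vector, mirroring the notebook's broadcasting test cell.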
File diff suppressed because it is too large
@@ -1,408 +0,0 @@
# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#     jupytext_version: 1.17.1
# ---

# %% [markdown]
"""
# Module 1: Tensor - Enhanced with nbgrader Support

This is an enhanced version of the tensor module that demonstrates dual-purpose content creation:
- **Self-learning**: Rich educational content with guided implementation
- **Auto-grading**: nbgrader-compatible assignments with hidden tests

## Dual System Benefits

1. **Single Source**: One file generates both learning and assignment materials
2. **Consistent Quality**: Same instructor solutions in both contexts
3. **Flexible Assessment**: Choose between self-paced learning or formal grading
4. **Scalable**: Handle large courses with automated feedback

## How It Works

- **TinyTorch markers**: `#| exercise_start/end` for educational content
- **nbgrader markers**: `### BEGIN/END SOLUTION` for auto-grading
- **Hidden tests**: `### BEGIN/END HIDDEN TESTS` for automatic verification
- **Dual generation**: One command creates both student notebooks and assignments
"""

# %%
#| default_exp core.tensor

# %%
#| export
import numpy as np
from typing import Union, List, Tuple, Optional

# %% [markdown]
"""
## Enhanced Tensor Class

This implementation shows how to create dual-purpose educational content:

### For Self-Learning Students
- Rich explanations and step-by-step guidance
- Detailed hints and examples
- Progressive difficulty with scaffolding

### For Formal Assessment
- Auto-graded with hidden tests
- Immediate feedback on correctness
- Partial credit for complex methods
"""

# %%
#| export
class Tensor:
    """
    TinyTorch Tensor: N-dimensional array with ML operations.

    This enhanced version demonstrates dual-purpose educational content
    suitable for both self-learning and formal assessment.
    """

    def __init__(self, data: Union[int, float, List, np.ndarray], dtype: Optional[str] = None):
        """
        Create a new tensor from data.

        Args:
            data: Input data (scalar, list, or numpy array)
            dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.
        """
        #| exercise_start
        #| hint: Use np.array() to convert input data to numpy array
        #| solution_test: tensor.shape should match input shape
        #| difficulty: easy

        ### BEGIN SOLUTION
        # Convert input to numpy array
        if isinstance(data, (int, float)):
            self._data = np.array(data)
        elif isinstance(data, list):
            self._data = np.array(data)
        elif isinstance(data, np.ndarray):
            self._data = data.copy()
        else:
            self._data = np.array(data)

        # Apply dtype conversion if specified
        if dtype is not None:
            self._data = self._data.astype(dtype)
        ### END SOLUTION

        #| exercise_end

    @property
    def data(self) -> np.ndarray:
        """Access underlying numpy array."""
        #| exercise_start
        #| hint: Return the stored numpy array (_data attribute)
        #| solution_test: tensor.data should return numpy array
        #| difficulty: easy

        ### BEGIN SOLUTION
        return self._data
        ### END SOLUTION

        #| exercise_end

    @property
    def shape(self) -> Tuple[int, ...]:
        """Get tensor shape."""
        #| exercise_start
        #| hint: Use the .shape attribute of the numpy array
        #| solution_test: tensor.shape should return tuple of dimensions
        #| difficulty: easy

        ### BEGIN SOLUTION
        return self._data.shape
        ### END SOLUTION

        #| exercise_end

    @property
    def size(self) -> int:
        """Get total number of elements."""
        #| exercise_start
        #| hint: Use the .size attribute of the numpy array
        #| solution_test: tensor.size should return total element count
        #| difficulty: easy

        ### BEGIN SOLUTION
        return self._data.size
        ### END SOLUTION

        #| exercise_end

    @property
    def dtype(self) -> np.dtype:
        """Get data type as numpy dtype."""
        #| exercise_start
        #| hint: Use the .dtype attribute of the numpy array
        #| solution_test: tensor.dtype should return numpy dtype
        #| difficulty: easy

        ### BEGIN SOLUTION
        return self._data.dtype
        ### END SOLUTION

        #| exercise_end

    def __repr__(self) -> str:
        """String representation of the tensor."""
        #| exercise_start
        #| hint: Format as "Tensor([data], shape=shape, dtype=dtype)"
        #| solution_test: repr should include data, shape, and dtype
        #| difficulty: medium

        ### BEGIN SOLUTION
        data_str = self._data.tolist()
        return f"Tensor({data_str}, shape={self.shape}, dtype={self.dtype})"
        ### END SOLUTION

        #| exercise_end

    def add(self, other: 'Tensor') -> 'Tensor':
        """
        Add two tensors element-wise.

        Args:
            other: Another tensor to add

        Returns:
            New tensor with element-wise sum
        """
        #| exercise_start
        #| hint: Use numpy's + operator for element-wise addition
        #| solution_test: result should be new Tensor with correct values
        #| difficulty: medium

        ### BEGIN SOLUTION
        result_data = self._data + other._data
        return Tensor(result_data)
        ### END SOLUTION

        #| exercise_end

    def multiply(self, other: 'Tensor') -> 'Tensor':
        """
        Multiply two tensors element-wise.

        Args:
            other: Another tensor to multiply

        Returns:
            New tensor with element-wise product
        """
        #| exercise_start
        #| hint: Use numpy's * operator for element-wise multiplication
        #| solution_test: result should be new Tensor with correct values
        #| difficulty: medium

        ### BEGIN SOLUTION
        result_data = self._data * other._data
        return Tensor(result_data)
        ### END SOLUTION

        #| exercise_end

    def matmul(self, other: 'Tensor') -> 'Tensor':
        """
        Matrix multiplication of two tensors.

        Args:
            other: Another tensor for matrix multiplication

        Returns:
            New tensor with matrix product

        Raises:
            ValueError: If shapes are incompatible for matrix multiplication
        """
        #| exercise_start
        #| hint: Use np.dot() for matrix multiplication, check shapes first
        #| solution_test: result should handle shape validation and matrix multiplication
        #| difficulty: hard

        ### BEGIN SOLUTION
        # Check shape compatibility
        if len(self.shape) != 2 or len(other.shape) != 2:
            raise ValueError("Matrix multiplication requires 2D tensors")

        if self.shape[1] != other.shape[0]:
            raise ValueError(f"Cannot multiply shapes {self.shape} and {other.shape}")

        result_data = np.dot(self._data, other._data)
        return Tensor(result_data)
        ### END SOLUTION

        #| exercise_end

# %% [markdown]
"""
## Hidden Tests for Auto-Grading

These tests are hidden from students but used for automatic grading.
They provide comprehensive coverage and immediate feedback.
"""

# %%
### BEGIN HIDDEN TESTS
def test_tensor_creation_basic():
    """Test basic tensor creation (2 points)"""
    t = Tensor([1, 2, 3])
    assert t.shape == (3,)
    assert t.data.tolist() == [1, 2, 3]
    assert t.size == 3

def test_tensor_creation_scalar():
    """Test scalar tensor creation (2 points)"""
    t = Tensor(5)
    assert t.shape == ()
    assert t.data.item() == 5
    assert t.size == 1

def test_tensor_creation_2d():
    """Test 2D tensor creation (2 points)"""
    t = Tensor([[1, 2], [3, 4]])
    assert t.shape == (2, 2)
    assert t.data.tolist() == [[1, 2], [3, 4]]
    assert t.size == 4

def test_tensor_dtype():
    """Test dtype handling (2 points)"""
    t = Tensor([1, 2, 3], dtype='float32')
    assert t.dtype == np.float32
    assert t.data.dtype == np.float32

def test_tensor_properties():
    """Test tensor properties (2 points)"""
    t = Tensor([[1, 2, 3], [4, 5, 6]])
    assert t.shape == (2, 3)
    assert t.size == 6
    assert isinstance(t.data, np.ndarray)

def test_tensor_repr():
    """Test string representation (2 points)"""
    t = Tensor([1, 2, 3])
    repr_str = repr(t)
    assert "Tensor" in repr_str
    assert "shape" in repr_str
    assert "dtype" in repr_str

def test_tensor_add():
    """Test tensor addition (3 points)"""
    t1 = Tensor([1, 2, 3])
    t2 = Tensor([4, 5, 6])
    result = t1.add(t2)
    assert result.data.tolist() == [5, 7, 9]
    assert result.shape == (3,)

def test_tensor_multiply():
    """Test tensor multiplication (3 points)"""
    t1 = Tensor([1, 2, 3])
    t2 = Tensor([4, 5, 6])
    result = t1.multiply(t2)
    assert result.data.tolist() == [4, 10, 18]
    assert result.shape == (3,)

def test_tensor_matmul():
    """Test matrix multiplication (4 points)"""
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[5, 6], [7, 8]])
    result = t1.matmul(t2)
    expected = [[19, 22], [43, 50]]
    assert result.data.tolist() == expected
    assert result.shape == (2, 2)

def test_tensor_matmul_error():
    """Test matrix multiplication error handling (2 points)"""
    t1 = Tensor([[1, 2, 3]])  # Shape (1, 3)
    t2 = Tensor([[4, 5]])  # Shape (1, 2)

    try:
        t1.matmul(t2)
        assert False, "Should have raised ValueError"
    except ValueError as e:
        assert "Cannot multiply shapes" in str(e)

def test_tensor_immutability():
    """Test that operations create new tensors (2 points)"""
    t1 = Tensor([1, 2, 3])
    t2 = Tensor([4, 5, 6])
    original_data = t1.data.copy()

    result = t1.add(t2)

    # Original tensor should be unchanged
    assert np.array_equal(t1.data, original_data)
    # Result should be a different object
    assert result is not t1
    assert result.data is not t1.data

### END HIDDEN TESTS

# %% [markdown]
"""
## Usage Examples

### Self-Learning Mode
Students work through the educational content step by step:

```python
# Create tensors
t1 = Tensor([1, 2, 3])
t2 = Tensor([4, 5, 6])

# Basic operations
result = t1.add(t2)
print(f"Addition: {result}")

# Matrix operations
matrix1 = Tensor([[1, 2], [3, 4]])
matrix2 = Tensor([[5, 6], [7, 8]])
product = matrix1.matmul(matrix2)
print(f"Matrix multiplication: {product}")
```

### Assignment Mode
Students submit implementations that are automatically graded:

1. **Immediate feedback**: Know if implementation is correct
2. **Partial credit**: Earn points for each working method
3. **Hidden tests**: Comprehensive coverage beyond visible examples
4. **Error handling**: Points for proper edge case handling

### Benefits of Dual System

1. **Single source**: One implementation serves both purposes
2. **Consistent quality**: Same instructor solutions everywhere
3. **Flexible assessment**: Choose the right tool for each situation
4. **Scalable**: Handle large courses with automated feedback

This approach transforms TinyTorch from a learning framework into a complete course management solution.
"""

# %%
# Test the implementation
if __name__ == "__main__":
    # Basic testing
    t1 = Tensor([1, 2, 3])
    t2 = Tensor([4, 5, 6])

    print(f"t1: {t1}")
    print(f"t2: {t2}")
    print(f"t1 + t2: {t1.add(t2)}")
    print(f"t1 * t2: {t1.multiply(t2)}")

    # Matrix multiplication
    m1 = Tensor([[1, 2], [3, 4]])
    m2 = Tensor([[5, 6], [7, 8]])
    print(f"Matrix multiplication: {m1.matmul(m2)}")

    print("✅ Enhanced tensor module working!")
894
modules/source/02_activations/activations_dev.ipynb
Normal file
@@ -0,0 +1,894 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "720f94f1",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "# Module 2: Activations - Nonlinearity in Neural Networks\n",
    "\n",
    "Welcome to the Activations module! This is where neural networks get their power through nonlinearity.\n",
    "\n",
    "## Learning Goals\n",
    "- Understand why activation functions are essential for neural networks\n",
    "- Implement the four most important activation functions: ReLU, Sigmoid, Tanh, and Softmax\n",
    "- Visualize how activations transform data and enable complex learning\n",
    "- See how activations work with layers to build powerful networks\n",
    "- Master the NBGrader workflow with comprehensive testing\n",
    "\n",
    "## Build → Use → Understand\n",
    "1. **Build**: Activation functions that add nonlinearity\n",
    "2. **Use**: Transform tensors and see immediate results\n",
    "3. **Understand**: How nonlinearity enables complex pattern learning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c0ecb71",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "activations-imports",
     "locked": false,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| default_exp core.activations\n",
    "\n",
    "#| export\n",
    "import math\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import os\n",
    "import sys\n",
    "from typing import Union, List\n",
    "\n",
    "# Import our Tensor class - try from package first, then from local module\n",
    "try:\n",
    "    from tinytorch.core.tensor import Tensor\n",
    "except ImportError:\n",
    "    # For development, import from local tensor module\n",
    "    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))\n",
    "    from tensor_dev import Tensor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd3c4277",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "activations-setup",
     "locked": false,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def _should_show_plots():\n",
    "    \"\"\"Check if we should show plots (disable during testing)\"\"\"\n",
    "    # Check multiple conditions that indicate we're in test mode\n",
    "    is_pytest = (\n",
    "        'pytest' in sys.modules or\n",
    "        'test' in sys.argv or\n",
    "        os.environ.get('PYTEST_CURRENT_TEST') is not None or\n",
    "        any('test' in arg for arg in sys.argv) or\n",
    "        any('pytest' in arg for arg in sys.argv)\n",
    "    )\n",
    "\n",
    "    # Show plots in development mode (when not in test mode)\n",
    "    return not is_pytest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d08aa85",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "activations-visualization",
     "locked": false,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| hide\n",
    "#| export\n",
    "def visualize_activation_function(activation_fn, name: str, x_range: tuple = (-5, 5), num_points: int = 100):\n",
    "    \"\"\"Visualize an activation function's behavior\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        return\n",
    "\n",
    "    try:\n",
    "        # Generate input values\n",
    "        x_vals = np.linspace(x_range[0], x_range[1], num_points)\n",
    "\n",
    "        # Apply activation function\n",
    "        y_vals = []\n",
    "        for x in x_vals:\n",
    "            input_tensor = Tensor([[x]])\n",
    "            output = activation_fn(input_tensor)\n",
    "            y_vals.append(output.data.item())\n",
    "\n",
    "        # Create plot\n",
    "        plt.figure(figsize=(10, 6))\n",
    "        plt.plot(x_vals, y_vals, 'b-', linewidth=2, label=f'{name} Activation')\n",
    "        plt.grid(True, alpha=0.3)\n",
    "        plt.xlabel('Input (x)')\n",
    "        plt.ylabel(f'{name}(x)')\n",
    "        plt.title(f'{name} Activation Function')\n",
    "        plt.legend()\n",
    "        plt.show()\n",
    "\n",
    "    except ImportError:\n",
    "        print(\" 📊 Matplotlib not available - skipping visualization\")\n",
    "    except Exception as e:\n",
    "        print(f\" ⚠️ Visualization error: {e}\")\n",
    "\n",
    "def visualize_activation_on_data(activation_fn, name: str, data: Tensor):\n",
    "    \"\"\"Show activation function applied to sample data\"\"\"\n",
    "    if not _should_show_plots():\n",
    "        return\n",
    "\n",
    "    try:\n",
    "        output = activation_fn(data)\n",
    "        print(f\" 📊 {name} Example:\")\n",
    "        print(f\" Input: {data.data.flatten()}\")\n",
    "        print(f\" Output: {output.data.flatten()}\")\n",
    "        print(f\" Range: [{output.data.min():.3f}, {output.data.max():.3f}]\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\" ⚠️ Data visualization error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a29b0c94",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## Step 1: What is an Activation Function?\n",
    "\n",
    "### Definition\n",
    "An **activation function** is a mathematical function that adds nonlinearity to neural networks. It transforms the output of a layer before passing it to the next layer.\n",
    "\n",
    "### Why Activation Functions Matter\n",
    "**Without activation functions, neural networks are just linear transformations!**\n",
    "\n",
    "```\n",
    "Linear → Linear → Linear = Still Linear\n",
    "```\n",
    "\n",
    "No matter how many layers you stack, without activation functions, you can only learn linear relationships. Activation functions introduce the nonlinearity that allows neural networks to:\n",
    "- Learn complex patterns\n",
    "- Approximate any continuous function\n",
    "- Solve non-linear problems\n",
    "\n",
    "### Visual Analogy\n",
    "Think of activation functions as **decision makers** at each neuron:\n",
    "- **ReLU**: \"If positive, pass it through; if negative, block it\"\n",
    "- **Sigmoid**: \"Squash everything between 0 and 1\"\n",
    "- **Tanh**: \"Squash everything between -1 and 1\"\n",
    "- **Softmax**: \"Convert to probabilities that sum to 1\"\n",
    "\n",
    "### Connection to Previous Modules\n",
    "In Module 1 (Tensor), we learned how to store and manipulate data. Now we add the nonlinear functions that make neural networks powerful."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b3cce52",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 2: ReLU - The Workhorse of Deep Learning\n",
    "\n",
    "### What is ReLU?\n",
    "**ReLU (Rectified Linear Unit)** is the most popular activation function in deep learning.\n",
    "\n",
    "**Mathematical Definition:**\n",
    "```\n",
    "f(x) = max(0, x)\n",
    "```\n",
    "\n",
    "**In Plain English:**\n",
    "- If input is positive → pass it through unchanged\n",
    "- If input is negative → output zero\n",
    "\n",
    "### Why ReLU is Popular\n",
    "1. **Simple**: Easy to compute and understand\n",
    "2. **Fast**: No expensive operations (no exponentials)\n",
    "3. **Sparse**: Outputs many zeros, creating sparse representations\n",
    "4. **Gradient-friendly**: Gradient is either 0 or 1 (no vanishing gradient for positive inputs)\n",
    "\n",
    "### Real-World Analogy\n",
    "ReLU is like a **one-way valve** - it only lets positive \"pressure\" through, blocking negative values completely.\n",
    "\n",
    "### When to Use ReLU\n",
    "- **Hidden layers** in most neural networks\n",
    "- **Convolutional layers** in image processing\n",
    "- **When you want sparse activations**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4300f9b3",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "relu-class",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "class ReLU:\n",
    "    \"\"\"\n",
    "    ReLU Activation Function: f(x) = max(0, x)\n",
    "\n",
    "    The most popular activation function in deep learning.\n",
    "    Simple, fast, and effective for most applications.\n",
    "    \"\"\"\n",
    "\n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"\n",
    "        Apply ReLU activation: f(x) = max(0, x)\n",
    "\n",
    "        TODO: Implement ReLU activation\n",
    "\n",
    "        APPROACH:\n",
    "        1. For each element in the input tensor, apply max(0, element)\n",
    "        2. Return a new Tensor with the results\n",
    "\n",
    "        EXAMPLE:\n",
    "        Input: Tensor([[-1, 0, 1, 2, -3]])\n",
    "        Expected: Tensor([[0, 0, 1, 2, 0]])\n",
    "\n",
    "        HINTS:\n",
    "        - Use np.maximum(0, x.data) for element-wise max\n",
    "        - Remember to return a new Tensor object\n",
    "        - The shape should remain the same as input\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result = np.maximum(0, x.data)\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make the class callable: relu(x) instead of relu.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "533c471b",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 3: Sigmoid - The Smooth Squasher\n",
    "\n",
    "### What is Sigmoid?\n",
    "**Sigmoid** is a smooth S-shaped function that squashes inputs to the range (0, 1).\n",
    "\n",
    "**Mathematical Definition:**\n",
    "```\n",
    "f(x) = 1 / (1 + e^(-x))\n",
    "```\n",
    "\n",
    "**Properties:**\n",
    "- **Range**: (0, 1) - never exactly 0 or 1\n",
    "- **Smooth**: Differentiable everywhere\n",
    "- **Monotonic**: Always increasing\n",
    "- **Centered**: Around 0.5\n",
    "\n",
    "### Why Sigmoid is Useful\n",
    "1. **Probabilistic**: Output can be interpreted as probabilities\n",
    "2. **Bounded**: Output is always between 0 and 1\n",
    "3. **Smooth**: Good for gradient-based optimization\n",
    "4. **Historical**: Was the standard before ReLU\n",
    "\n",
    "### Real-World Analogy\n",
    "Sigmoid is like a **soft switch** - it gradually turns on as input increases, unlike ReLU's hard cutoff.\n",
    "\n",
    "### When to Use Sigmoid\n",
    "- **Binary classification** (output layer)\n",
    "- **Gates** in LSTM/GRU networks\n",
    "- **When you need probabilistic outputs**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cbe9f91c",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "sigmoid-class",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "class Sigmoid:\n",
    "    \"\"\"\n",
    "    Sigmoid Activation Function: f(x) = 1 / (1 + e^(-x))\n",
    "\n",
    "    Smooth S-shaped function that squashes inputs to (0, 1).\n",
    "    Useful for binary classification and probabilistic outputs.\n",
    "    \"\"\"\n",
    "\n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"\n",
    "        Apply Sigmoid activation: f(x) = 1 / (1 + e^(-x))\n",
    "\n",
    "        TODO: Implement Sigmoid activation with numerical stability\n",
    "\n",
    "        APPROACH:\n",
    "        1. Clip input values to prevent overflow (e.g., between -500 and 500)\n",
    "        2. Apply the sigmoid formula: 1 / (1 + exp(-x))\n",
    "        3. Return a new Tensor with the results\n",
    "\n",
    "        EXAMPLE:\n",
    "        Input: Tensor([[-2, 0, 2]])\n",
    "        Expected: Tensor([[0.119, 0.5, 0.881]]) (approximately)\n",
    "\n",
    "        HINTS:\n",
    "        - Use np.clip(x.data, -500, 500) for numerical stability\n",
    "        - Use np.exp() for the exponential function\n",
    "        - Be careful with very large/small inputs to avoid overflow\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        # Clip for numerical stability\n",
    "        clipped = np.clip(x.data, -500, 500)\n",
    "        result = 1 / (1 + np.exp(-clipped))\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make the class callable: sigmoid(x) instead of sigmoid.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "67dc777f",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 4: Tanh - The Zero-Centered Squasher\n",
    "\n",
    "### What is Tanh?\n",
    "**Tanh (Hyperbolic Tangent)** is similar to Sigmoid but centered around zero.\n",
    "\n",
    "**Mathematical Definition:**\n",
    "```\n",
    "f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))\n",
    "```\n",
    "\n",
    "**Properties:**\n",
    "- **Range**: (-1, 1) - symmetric around zero\n",
    "- **Zero-centered**: Output averages to zero\n",
    "- **Smooth**: Differentiable everywhere\n",
    "- **Stronger gradients**: Than sigmoid in some regions\n",
    "\n",
    "### Why Tanh is Useful\n",
    "1. **Zero-centered**: Better for training (gradients don't all have same sign)\n",
    "2. **Symmetric**: Treats positive and negative inputs equally\n",
    "3. **Stronger gradients**: Can help with training dynamics\n",
    "4. **Bounded**: Output is always between -1 and 1\n",
    "\n",
    "### Real-World Analogy\n",
    "Tanh is like a **balanced scale** - it can tip positive or negative, with zero as the neutral point.\n",
    "\n",
    "### When to Use Tanh\n",
    "- **Hidden layers** (alternative to ReLU)\n",
    "- **RNNs** (traditional choice)\n",
    "- **When you need zero-centered outputs**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e982bfbd",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "tanh-class",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "class Tanh:\n",
    "    \"\"\"\n",
    "    Tanh Activation Function: f(x) = tanh(x)\n",
    "\n",
    "    Zero-centered S-shaped function that squashes inputs to (-1, 1).\n",
    "    Better than sigmoid for hidden layers due to zero-centered outputs.\n",
    "    \"\"\"\n",
    "\n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"\n",
    "        Apply Tanh activation: f(x) = tanh(x)\n",
    "\n",
    "        TODO: Implement Tanh activation\n",
    "\n",
    "        APPROACH:\n",
    "        1. Use NumPy's tanh function for numerical stability\n",
    "        2. Apply to the tensor data\n",
    "        3. Return a new Tensor with the results\n",
    "\n",
    "        EXAMPLE:\n",
    "        Input: Tensor([[-2, 0, 2]])\n",
    "        Expected: Tensor([[-0.964, 0.0, 0.964]]) (approximately)\n",
    "\n",
    "        HINTS:\n",
    "        - Use np.tanh(x.data) - NumPy handles the math\n",
    "        - Much simpler than implementing the formula manually\n",
    "        - NumPy's tanh is numerically stable\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result = np.tanh(x.data)\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make the class callable: tanh(x) instead of tanh.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "726ae88b",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Step 5: Softmax - The Probability Converter\n",
    "\n",
    "### What is Softmax?\n",
    "**Softmax** converts a vector of numbers into a probability distribution.\n",
    "\n",
    "**Mathematical Definition:**\n",
    "```\n",
    "f(x_i) = e^(x_i) / Σ(e^(x_j)) for all j\n",
    "```\n",
    "\n",
    "**Properties:**\n",
    "- **Probabilities**: All outputs sum to 1\n",
    "- **Non-negative**: All outputs are ≥ 0\n",
    "- **Differentiable**: Smooth everywhere\n",
    "- **Competitive**: Amplifies differences between inputs\n",
    "\n",
    "### Why Softmax is Essential\n",
    "1. **Multi-class classification**: Converts logits to probabilities\n",
    "2. **Attention mechanisms**: Focuses on important elements\n",
    "3. **Interpretable**: Output can be understood as confidence\n",
    "4. **Competitive**: Emphasizes the largest input\n",
    "\n",
    "### Real-World Analogy\n",
    "Softmax is like **dividing a pie** - it takes any set of numbers and converts them into slices that sum to 100%.\n",
    "\n",
    "### When to Use Softmax\n",
    "- **Multi-class classification** (output layer)\n",
    "- **Attention mechanisms** in transformers\n",
    "- **When you need probability distributions**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a99d93cc",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "softmax-class",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "#| export\n",
    "class Softmax:\n",
    "    \"\"\"\n",
    "    Softmax Activation Function: f(x_i) = e^(x_i) / Σ(e^(x_j))\n",
    "\n",
    "    Converts a vector of numbers into a probability distribution.\n",
    "    Essential for multi-class classification and attention mechanisms.\n",
    "    \"\"\"\n",
    "\n",
    "    def forward(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"\n",
    "        Apply Softmax activation: f(x_i) = e^(x_i) / Σ(e^(x_j))\n",
    "\n",
    "        TODO: Implement Softmax activation with numerical stability\n",
    "\n",
    "        APPROACH:\n",
    "        1. Subtract max value from inputs for numerical stability\n",
    "        2. Compute exponentials: e^(x_i - max)\n",
    "        3. Divide by sum of exponentials\n",
    "        4. Return a new Tensor with the results\n",
    "\n",
    "        EXAMPLE:\n",
    "        Input: Tensor([[1, 2, 3]])\n",
    "        Expected: Tensor([[0.09, 0.24, 0.67]]) (approximately, sums to 1)\n",
    "\n",
    "        HINTS:\n",
    "        - Use np.max(x.data, axis=-1, keepdims=True) for stability\n",
    "        - Use np.exp() for exponentials\n",
    "        - Use np.sum() for the denominator\n",
    "        - Make sure the result sums to 1 along the last axis\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        # Subtract max for numerical stability\n",
    "        x_max = np.max(x.data, axis=-1, keepdims=True)\n",
    "        x_shifted = x.data - x_max\n",
    "\n",
    "        # Compute softmax\n",
    "        exp_x = np.exp(x_shifted)\n",
    "        sum_exp = np.sum(exp_x, axis=-1, keepdims=True)\n",
    "        result = exp_x / sum_exp\n",
    "\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __call__(self, x: Tensor) -> Tensor:\n",
    "        \"\"\"Make the class callable: softmax(x) instead of softmax.forward(x)\"\"\"\n",
    "        return self.forward(x)"
   ]
  },
{
"cell_type": "markdown",
"id": "d37cb352",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"### 🧪 Test Your Activation Functions\n",
"\n",
"Once you implement the activation functions above, run these cells to test them:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "067e766c",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-relu",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test ReLU activation\n",
"print(\"Testing ReLU activation...\")\n",
"\n",
"relu = ReLU()\n",
"\n",
"# Test basic functionality\n",
"input_tensor = Tensor([[-2, -1, 0, 1, 2]])\n",
"output = relu(input_tensor)\n",
"expected = np.array([[0, 0, 0, 1, 2]])\n",
"assert np.array_equal(output.data, expected), f\"ReLU failed: expected {expected}, got {output.data}\"\n",
"\n",
"# Test with matrix\n",
"matrix_input = Tensor([[-1, 2], [3, -4]])\n",
"matrix_output = relu(matrix_input)\n",
"expected_matrix = np.array([[0, 2], [3, 0]])\n",
"assert np.array_equal(matrix_output.data, expected_matrix), f\"ReLU matrix failed: expected {expected_matrix}, got {matrix_output.data}\"\n",
"\n",
"# Test shape preservation\n",
"assert output.shape == input_tensor.shape, f\"ReLU should preserve shape: input {input_tensor.shape}, output {output.shape}\"\n",
"\n",
"print(\"✅ ReLU tests passed!\")\n",
"print(f\"✅ ReLU({input_tensor.data.flatten()}) = {output.data.flatten()}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e01b7261",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-sigmoid",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test Sigmoid activation\n",
"print(\"Testing Sigmoid activation...\")\n",
"\n",
"sigmoid = Sigmoid()\n",
"\n",
"# Test basic functionality\n",
"input_tensor = Tensor([[0]])\n",
"output = sigmoid(input_tensor)\n",
"expected_value = 0.5\n",
"assert abs(output.data.item() - expected_value) < 1e-6, f\"Sigmoid(0) should be 0.5, got {output.data.item()}\"\n",
"\n",
"# Test range bounds (allowing for floating-point precision at extremes)\n",
"large_input = Tensor([[100]])\n",
"large_output = sigmoid(large_input)\n",
"assert 0 < large_output.data.item() <= 1, f\"Sigmoid output should be in (0,1], got {large_output.data.item()}\"\n",
"\n",
"small_input = Tensor([[-100]])\n",
"small_output = sigmoid(small_input)\n",
"assert 0 <= small_output.data.item() < 1, f\"Sigmoid output should be in [0,1), got {small_output.data.item()}\"\n",
"\n",
"# Test with multiple values\n",
"multi_input = Tensor([[-2, 0, 2]])\n",
"multi_output = sigmoid(multi_input)\n",
"assert multi_output.shape == multi_input.shape, \"Sigmoid should preserve shape\"\n",
"assert np.all((multi_output.data > 0) & (multi_output.data < 1)), \"All sigmoid outputs should be in (0,1)\"\n",
"\n",
"print(\"✅ Sigmoid tests passed!\")\n",
"print(f\"✅ Sigmoid({multi_input.data.flatten()}) = {multi_output.data.flatten()}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ca2fa6f",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-tanh",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test Tanh activation\n",
"print(\"Testing Tanh activation...\")\n",
"\n",
"tanh = Tanh()\n",
"\n",
"# Test basic functionality\n",
"input_tensor = Tensor([[0]])\n",
"output = tanh(input_tensor)\n",
"expected_value = 0.0\n",
"assert abs(output.data.item() - expected_value) < 1e-6, f\"Tanh(0) should be 0.0, got {output.data.item()}\"\n",
"\n",
"# Test range bounds (allowing for floating-point precision at extremes)\n",
"large_input = Tensor([[100]])\n",
"large_output = tanh(large_input)\n",
"assert -1 <= large_output.data.item() <= 1, f\"Tanh output should be in [-1,1], got {large_output.data.item()}\"\n",
"\n",
"small_input = Tensor([[-100]])\n",
"small_output = tanh(small_input)\n",
"assert -1 <= small_output.data.item() <= 1, f\"Tanh output should be in [-1,1], got {small_output.data.item()}\"\n",
"\n",
"# Test symmetry: tanh(-x) = -tanh(x)\n",
"test_input = Tensor([[2]])\n",
"pos_output = tanh(test_input)\n",
"neg_input = Tensor([[-2]])\n",
"neg_output = tanh(neg_input)\n",
"assert abs(pos_output.data.item() + neg_output.data.item()) < 1e-6, \"Tanh should be symmetric: tanh(-x) = -tanh(x)\"\n",
"\n",
"print(\"✅ Tanh tests passed!\")\n",
"print(f\"✅ Tanh(±2) = ±{abs(pos_output.data.item()):.3f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "50795506",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-softmax",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test Softmax activation\n",
"print(\"Testing Softmax activation...\")\n",
"\n",
"softmax = Softmax()\n",
"\n",
"# Test basic functionality\n",
"input_tensor = Tensor([[1, 2, 3]])\n",
"output = softmax(input_tensor)\n",
"\n",
"# Check that outputs sum to 1\n",
"sum_output = np.sum(output.data)\n",
"assert abs(sum_output - 1.0) < 1e-6, f\"Softmax outputs should sum to 1, got {sum_output}\"\n",
"\n",
"# Check that all outputs are positive\n",
"assert np.all(output.data > 0), \"All softmax outputs should be positive\"\n",
"\n",
"# Check that larger inputs give larger outputs\n",
"assert output.data[0, 2] > output.data[0, 1] > output.data[0, 0], \"Softmax should preserve order\"\n",
"\n",
"# Test with matrix (multiple rows)\n",
"matrix_input = Tensor([[1, 2], [3, 4]])\n",
"matrix_output = softmax(matrix_input)\n",
"row_sums = np.sum(matrix_output.data, axis=1)\n",
"assert np.allclose(row_sums, 1.0), f\"Each row should sum to 1, got {row_sums}\"\n",
"\n",
"print(\"✅ Softmax tests passed!\")\n",
"print(f\"✅ Softmax({input_tensor.data.flatten()}) = {output.data.flatten()}\")\n",
"print(f\"✅ Sum = {np.sum(output.data):.6f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c8dfc085",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "test-activation-integration",
"locked": true,
"points": 20,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# Test activation function integration\n",
"print(\"Testing activation function integration...\")\n",
"\n",
"# Create test data\n",
"test_data = Tensor([[-2, -1, 0, 1, 2]])\n",
"\n",
"# Test all activations\n",
"relu = ReLU()\n",
"sigmoid = Sigmoid()\n",
"tanh = Tanh()\n",
"softmax = Softmax()\n",
"\n",
"# Apply all activations\n",
"relu_out = relu(test_data)\n",
"sigmoid_out = sigmoid(test_data)\n",
"tanh_out = tanh(test_data)\n",
"softmax_out = softmax(test_data)\n",
"\n",
"# Check shapes are preserved\n",
"assert relu_out.shape == test_data.shape, \"ReLU should preserve shape\"\n",
"assert sigmoid_out.shape == test_data.shape, \"Sigmoid should preserve shape\"\n",
"assert tanh_out.shape == test_data.shape, \"Tanh should preserve shape\"\n",
"assert softmax_out.shape == test_data.shape, \"Softmax should preserve shape\"\n",
"\n",
"# Check ranges (allowing for floating-point precision at extremes)\n",
"assert np.all(relu_out.data >= 0), \"ReLU outputs should be non-negative\"\n",
"assert np.all((sigmoid_out.data >= 0) & (sigmoid_out.data <= 1)), \"Sigmoid outputs should be in [0,1]\"\n",
"assert np.all((tanh_out.data >= -1) & (tanh_out.data <= 1)), \"Tanh outputs should be in [-1,1]\"\n",
"assert np.all(softmax_out.data > 0), \"Softmax outputs should be positive\"\n",
"\n",
"# Test chaining (composition)\n",
"chained = relu(sigmoid(test_data))\n",
"assert chained.shape == test_data.shape, \"Chained activations should preserve shape\"\n",
"\n",
"print(\"✅ Activation integration tests passed!\")\n",
"print(f\"✅ All activation functions work correctly\")\n",
"print(f\"✅ Input: {test_data.data.flatten()}\")\n",
"print(f\"✅ ReLU: {relu_out.data.flatten()}\")\n",
"print(f\"✅ Sigmoid: {sigmoid_out.data.flatten()}\")\n",
"print(f\"✅ Tanh: {tanh_out.data.flatten()}\")\n",
"print(f\"✅ Softmax: {softmax_out.data.flatten()}\")"
]
},
{
"cell_type": "markdown",
"id": "fa5f40bb",
"metadata": {
"cell_marker": "\"\"\""
},
"source": [
"## 🎯 Module Summary\n",
"\n",
"Congratulations! You've successfully implemented the core activation functions for TinyTorch:\n",
"\n",
"### What You've Accomplished\n",
"✅ **ReLU**: The workhorse activation for hidden layers  \n",
"✅ **Sigmoid**: Smooth probabilistic outputs for binary classification  \n",
"✅ **Tanh**: Zero-centered activation for better training dynamics  \n",
"✅ **Softmax**: Probability distributions for multi-class classification  \n",
"✅ **Integration**: All functions work together and preserve tensor shapes  \n",
"\n",
"### Key Concepts You've Learned\n",
"- **Nonlinearity** is essential for neural networks to learn complex patterns\n",
"- **ReLU** is simple, fast, and effective for most hidden layers\n",
"- **Sigmoid** squashes outputs to (0,1) for probabilistic interpretation\n",
"- **Tanh** is zero-centered and often better than sigmoid for hidden layers\n",
"- **Softmax** converts logits to probability distributions\n",
"- **Numerical stability** is crucial for functions with exponentials\n",
"\n",
"### Next Steps\n",
"1. **Export your code**: `tito package nbdev --export 02_activations`\n",
"2. **Test your implementation**: `tito module test 02_activations`\n",
"3. **Use your activations**: \n",
"   ```python\n",
"   from tinytorch.core.activations import ReLU, Sigmoid, Tanh, Softmax\n",
"   from tinytorch.core.tensor import Tensor\n",
"   \n",
"   relu = ReLU()\n",
"   x = Tensor([[-1, 0, 1, 2]])\n",
"   y = relu(x)  # Your activation in action!\n",
"   ```\n",
"4. **Move to Module 3**: Start building neural network layers!\n",
"\n",
"**Ready for the next challenge?** Let's combine tensors and activations to build the fundamental building blocks of neural networks!"
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
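The Sigmoid cell above hints at clipping inputs to [-500, 500] before exponentiating. As a quick NumPy-only sketch of that recipe (independent of the Tensor class built in this module; `stable_sigmoid` is an illustrative name, not part of TinyTorch):

```python
import numpy as np

def stable_sigmoid(x: np.ndarray) -> np.ndarray:
    # Clip extreme inputs so np.exp(-x) cannot overflow; sigmoid already
    # saturates to 0 or 1 long before |x| = 500, so accuracy is unaffected.
    clipped = np.clip(x, -500, 500)
    return 1 / (1 + np.exp(-clipped))

x = np.array([[-1000.0, -2.0, 0.0, 2.0, 1000.0]])
y = stable_sigmoid(x)  # no overflow warning, outputs stay in [0, 1]
```

Without the clip, `np.exp(1000.0)` overflows to `inf` and NumPy emits a runtime warning; with it, the extreme entries simply saturate toward 0 and 1.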
@@ -5,7 +5,62 @@ d = { 'settings': { 'branch': 'main',
'doc_host': 'https://tinytorch.github.io',
'git_url': 'https://github.com/tinytorch/TinyTorch/',
'lib_path': 'tinytorch'},
'syms': { 'tinytorch.core.setup': { 'tinytorch.core.setup.personal_info': ( '00_setup/setup_dev.html#personal_info',
'syms': { 'tinytorch.core.activations': { 'tinytorch.core.activations.ReLU': ( '02_activations/activations_dev.html#relu',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.ReLU.__call__': ( '02_activations/activations_dev.html#relu.__call__',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.ReLU.forward': ( '02_activations/activations_dev.html#relu.forward',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Sigmoid': ( '02_activations/activations_dev.html#sigmoid',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Sigmoid.__call__': ( '02_activations/activations_dev.html#sigmoid.__call__',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Sigmoid.forward': ( '02_activations/activations_dev.html#sigmoid.forward',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Softmax': ( '02_activations/activations_dev.html#softmax',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Softmax.__call__': ( '02_activations/activations_dev.html#softmax.__call__',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Softmax.forward': ( '02_activations/activations_dev.html#softmax.forward',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Tanh': ( '02_activations/activations_dev.html#tanh',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Tanh.__call__': ( '02_activations/activations_dev.html#tanh.__call__',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.Tanh.forward': ( '02_activations/activations_dev.html#tanh.forward',
'tinytorch/core/activations.py'),
'tinytorch.core.activations._should_show_plots': ( '02_activations/activations_dev.html#_should_show_plots',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.visualize_activation_function': ( '02_activations/activations_dev.html#visualize_activation_function',
'tinytorch/core/activations.py'),
'tinytorch.core.activations.visualize_activation_on_data': ( '02_activations/activations_dev.html#visualize_activation_on_data',
'tinytorch/core/activations.py')},
'tinytorch.core.setup': { 'tinytorch.core.setup.personal_info': ( '00_setup/setup_dev.html#personal_info',
'tinytorch/core/setup.py'),
'tinytorch.core.setup.system_info': ( '00_setup/setup_dev.html#system_info',
'tinytorch/core/setup.py')}}}
'tinytorch/core/setup.py')},
'tinytorch.core.tensor': { 'tinytorch.core.tensor.Tensor': ('01_tensor/tensor_dev.html#tensor', 'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__add__': ( '01_tensor/tensor_dev.html#tensor.__add__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__init__': ( '01_tensor/tensor_dev.html#tensor.__init__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__mul__': ( '01_tensor/tensor_dev.html#tensor.__mul__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__repr__': ( '01_tensor/tensor_dev.html#tensor.__repr__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__sub__': ( '01_tensor/tensor_dev.html#tensor.__sub__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.__truediv__': ( '01_tensor/tensor_dev.html#tensor.__truediv__',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.add': ( '01_tensor/tensor_dev.html#tensor.add',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.data': ( '01_tensor/tensor_dev.html#tensor.data',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.dtype': ( '01_tensor/tensor_dev.html#tensor.dtype',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.multiply': ( '01_tensor/tensor_dev.html#tensor.multiply',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.shape': ( '01_tensor/tensor_dev.html#tensor.shape',
'tinytorch/core/tensor.py'),
'tinytorch.core.tensor.Tensor.size': ( '01_tensor/tensor_dev.html#tensor.size',
'tinytorch/core/tensor.py')}}}
246
tinytorch/core/activations.py
Normal file
@@ -0,0 +1,246 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/source/02_activations/activations_dev.ipynb.

# %% auto 0
__all__ = ['visualize_activation_function', 'visualize_activation_on_data', 'ReLU', 'Sigmoid', 'Tanh', 'Softmax']

# %% ../../modules/source/02_activations/activations_dev.ipynb 1
import math
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
from typing import Union, List

# Import our Tensor class - try from package first, then from local module
try:
    from tinytorch.core.tensor import Tensor
except ImportError:
    # For development, import from local tensor module
    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '01_tensor'))
    from tensor_dev import Tensor

# %% ../../modules/source/02_activations/activations_dev.ipynb 2
def _should_show_plots():
    """Check if we should show plots (disable during testing)"""
    # Check multiple conditions that indicate we're in test mode
    is_pytest = (
        'pytest' in sys.modules or
        'test' in sys.argv or
        os.environ.get('PYTEST_CURRENT_TEST') is not None or
        any('test' in arg for arg in sys.argv) or
        any('pytest' in arg for arg in sys.argv)
    )

    # Show plots in development mode (when not in test mode)
    return not is_pytest

# %% ../../modules/source/02_activations/activations_dev.ipynb 3
def visualize_activation_function(activation_fn, name: str, x_range: tuple = (-5, 5), num_points: int = 100):
    """Visualize an activation function's behavior"""
    if not _should_show_plots():
        return

    try:
        # Generate input values
        x_vals = np.linspace(x_range[0], x_range[1], num_points)

        # Apply activation function
        y_vals = []
        for x in x_vals:
            input_tensor = Tensor([[x]])
            output = activation_fn(input_tensor)
            y_vals.append(output.data.item())

        # Create plot
        plt.figure(figsize=(10, 6))
        plt.plot(x_vals, y_vals, 'b-', linewidth=2, label=f'{name} Activation')
        plt.grid(True, alpha=0.3)
        plt.xlabel('Input (x)')
        plt.ylabel(f'{name}(x)')
        plt.title(f'{name} Activation Function')
        plt.legend()
        plt.show()

    except ImportError:
        print(" 📊 Matplotlib not available - skipping visualization")
    except Exception as e:
        print(f" ⚠️ Visualization error: {e}")

def visualize_activation_on_data(activation_fn, name: str, data: Tensor):
    """Show activation function applied to sample data"""
    if not _should_show_plots():
        return

    try:
        output = activation_fn(data)
        print(f" 📊 {name} Example:")
        print(f" Input: {data.data.flatten()}")
        print(f" Output: {output.data.flatten()}")
        print(f" Range: [{output.data.min():.3f}, {output.data.max():.3f}]")

    except Exception as e:
        print(f" ⚠️ Data visualization error: {e}")

# %% ../../modules/source/02_activations/activations_dev.ipynb 6
class ReLU:
    """
    ReLU Activation Function: f(x) = max(0, x)

    The most popular activation function in deep learning.
    Simple, fast, and effective for most applications.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply ReLU activation: f(x) = max(0, x)

        TODO: Implement ReLU activation

        APPROACH:
        1. For each element in the input tensor, apply max(0, element)
        2. Return a new Tensor with the results

        EXAMPLE:
        Input: Tensor([[-1, 0, 1, 2, -3]])
        Expected: Tensor([[0, 0, 1, 2, 0]])

        HINTS:
        - Use np.maximum(0, x.data) for element-wise max
        - Remember to return a new Tensor object
        - The shape should remain the same as input
        """
        ### BEGIN SOLUTION
        result = np.maximum(0, x.data)
        return Tensor(result)
        ### END SOLUTION

    def __call__(self, x: Tensor) -> Tensor:
        """Make the class callable: relu(x) instead of relu.forward(x)"""
        return self.forward(x)

# %% ../../modules/source/02_activations/activations_dev.ipynb 8
class Sigmoid:
    """
    Sigmoid Activation Function: f(x) = 1 / (1 + e^(-x))

    Smooth S-shaped function that squashes inputs to (0, 1).
    Useful for binary classification and probabilistic outputs.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Sigmoid activation: f(x) = 1 / (1 + e^(-x))

        TODO: Implement Sigmoid activation with numerical stability

        APPROACH:
        1. Clip input values to prevent overflow (e.g., between -500 and 500)
        2. Apply the sigmoid formula: 1 / (1 + exp(-x))
        3. Return a new Tensor with the results

        EXAMPLE:
        Input: Tensor([[-2, 0, 2]])
        Expected: Tensor([[0.119, 0.5, 0.881]]) (approximately)

        HINTS:
        - Use np.clip(x.data, -500, 500) for numerical stability
        - Use np.exp() for the exponential function
        - Be careful with very large/small inputs to avoid overflow
        """
        ### BEGIN SOLUTION
        # Clip for numerical stability
        clipped = np.clip(x.data, -500, 500)
        result = 1 / (1 + np.exp(-clipped))
        return Tensor(result)
        ### END SOLUTION

    def __call__(self, x: Tensor) -> Tensor:
        """Make the class callable: sigmoid(x) instead of sigmoid.forward(x)"""
        return self.forward(x)

# %% ../../modules/source/02_activations/activations_dev.ipynb 10
class Tanh:
    """
    Tanh Activation Function: f(x) = tanh(x)

    Zero-centered S-shaped function that squashes inputs to (-1, 1).
    Better than sigmoid for hidden layers due to zero-centered outputs.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Tanh activation: f(x) = tanh(x)

        TODO: Implement Tanh activation

        APPROACH:
        1. Use NumPy's tanh function for numerical stability
        2. Apply to the tensor data
        3. Return a new Tensor with the results

        EXAMPLE:
        Input: Tensor([[-2, 0, 2]])
        Expected: Tensor([[-0.964, 0.0, 0.964]]) (approximately)

        HINTS:
        - Use np.tanh(x.data) - NumPy handles the math
        - Much simpler than implementing the formula manually
        - NumPy's tanh is numerically stable
        """
        ### BEGIN SOLUTION
        result = np.tanh(x.data)
        return Tensor(result)
        ### END SOLUTION

    def __call__(self, x: Tensor) -> Tensor:
        """Make the class callable: tanh(x) instead of tanh.forward(x)"""
        return self.forward(x)

# %% ../../modules/source/02_activations/activations_dev.ipynb 12
class Softmax:
    """
    Softmax Activation Function: f(x_i) = e^(x_i) / Σ(e^(x_j))

    Converts a vector of numbers into a probability distribution.
    Essential for multi-class classification and attention mechanisms.
    """

    def forward(self, x: Tensor) -> Tensor:
        """
        Apply Softmax activation: f(x_i) = e^(x_i) / Σ(e^(x_j))

        TODO: Implement Softmax activation with numerical stability

        APPROACH:
        1. Subtract max value from inputs for numerical stability
        2. Compute exponentials: e^(x_i - max)
        3. Divide by sum of exponentials
        4. Return a new Tensor with the results

        EXAMPLE:
        Input: Tensor([[1, 2, 3]])
        Expected: Tensor([[0.09, 0.24, 0.67]]) (approximately, sums to 1)

        HINTS:
        - Use np.max(x.data, axis=-1, keepdims=True) for stability
        - Use np.exp() for exponentials
        - Use np.sum() for the denominator
        - Make sure the result sums to 1 along the last axis
        """
        ### BEGIN SOLUTION
        # Subtract max for numerical stability
        x_max = np.max(x.data, axis=-1, keepdims=True)
        x_shifted = x.data - x_max

        # Compute softmax
        exp_x = np.exp(x_shifted)
        sum_exp = np.sum(exp_x, axis=-1, keepdims=True)
        result = exp_x / sum_exp

        return Tensor(result)
        ### END SOLUTION

    def __call__(self, x: Tensor) -> Tensor:
        """Make the class callable: softmax(x) instead of softmax.forward(x)"""
        return self.forward(x)
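The exported `Softmax.forward` above relies on the shift-invariance identity softmax(x) = softmax(x - c). A minimal NumPy-only sketch of why the max-subtraction matters (`stable_softmax` is an illustrative name, not part of the exported module):

```python
import numpy as np

def stable_softmax(x: np.ndarray) -> np.ndarray:
    # Subtracting the per-row max leaves the result unchanged but keeps
    # every exponent <= 0, so np.exp can never overflow.
    x_max = np.max(x, axis=-1, keepdims=True)
    exp_x = np.exp(x - x_max)
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

# Without the shift, np.exp(1003.0) would overflow to inf and the
# second row would come back as NaN.
logits = np.array([[1.0, 2.0, 3.0], [1001.0, 1002.0, 1003.0]])
probs = stable_softmax(logits)
```

Both rows yield the same distribution, since they differ only by a constant shift of 1000.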
297
tinytorch/core/tensor.py
Normal file
@@ -0,0 +1,297 @@
# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/source/01_tensor/tensor_dev.ipynb.
|
||||
|
||||
# %% auto 0
|
||||
__all__ = ['Tensor']
|
||||
|
||||
# %% ../../modules/source/01_tensor/tensor_dev.ipynb 1
|
||||
import numpy as np
|
||||
import sys
|
||||
from typing import Union, List, Tuple, Optional, Any
|
||||
|
||||
# %% ../../modules/source/01_tensor/tensor_dev.ipynb 7
|
||||
class Tensor:
|
||||
"""
|
||||
TinyTorch Tensor: N-dimensional array with ML operations.
|
||||
|
||||
The fundamental data structure for all TinyTorch operations.
|
||||
Wraps NumPy arrays with ML-specific functionality.
|
||||
"""
|
||||
|
||||
def __init__(self, data: Union[int, float, List, np.ndarray], dtype: Optional[str] = None):
|
||||
"""
|
||||
Create a new tensor from data.
|
||||
|
||||
Args:
|
||||
data: Input data (scalar, list, or numpy array)
|
||||
dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.
|
||||
|
||||
TODO: Implement tensor creation with proper type handling.
|
||||
|
||||
STEP-BY-STEP:
|
||||
1. Check if data is a scalar (int/float) - convert to numpy array
|
||||
2. Check if data is a list - convert to numpy array
|
||||
3. Check if data is already a numpy array - use as-is
|
||||
4. Apply dtype conversion if specified
|
||||
5. Store the result in self._data
|
||||
|
||||
EXAMPLE:
|
||||
Tensor(5) → stores np.array(5)
|
||||
Tensor([1, 2, 3]) → stores np.array([1, 2, 3])
|
||||
Tensor(np.array([1, 2, 3])) → stores the array directly
|
||||
|
||||
HINTS:
|
||||
- Use isinstance() to check data types
|
||||
- Use np.array() for conversion
|
||||
- Handle dtype parameter for type conversion
|
||||
- Store the array in self._data
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
# Convert input to numpy array
|
||||
if isinstance(data, (int, float, np.number)):
|
||||
# Handle Python and NumPy scalars
|
||||
if dtype is None:
|
||||
# Auto-detect type: int for integers, float32 for floats
|
||||
if isinstance(data, int) or (isinstance(data, np.number) and np.issubdtype(type(data), np.integer)):
|
||||
dtype = 'int32'
|
||||
else:
|
||||
dtype = 'float32'
|
||||
self._data = np.array(data, dtype=dtype)
|
||||
elif isinstance(data, list):
|
||||
# Let NumPy auto-detect type, then convert if needed
|
||||
temp_array = np.array(data)
|
||||
if dtype is None:
|
||||
# Use NumPy's auto-detected type, but prefer float32 for floats
|
||||
if temp_array.dtype == np.float64:
|
||||
dtype = 'float32'
|
||||
else:
|
||||
dtype = str(temp_array.dtype)
|
||||
self._data = np.array(data, dtype=dtype)
|
||||
elif isinstance(data, np.ndarray):
|
||||
# Already a numpy array
|
||||
if dtype is None:
|
||||
# Keep existing dtype, but prefer float32 for float64
|
||||
if data.dtype == np.float64:
|
||||
dtype = 'float32'
|
||||
else:
|
||||
dtype = str(data.dtype)
|
||||
self._data = data.astype(dtype) if dtype != data.dtype else data.copy()
|
||||
else:
|
||||
# Try to convert unknown types
|
||||
self._data = np.array(data, dtype=dtype)
|
||||
### END SOLUTION
|
||||
|
||||
@property
|
||||
def data(self) -> np.ndarray:
|
||||
"""
|
||||
Access underlying numpy array.
|
||||
|
||||
TODO: Return the stored numpy array.
|
||||
|
||||
HINT: Return self._data (the array you stored in __init__)
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
return self._data
|
||||
### END SOLUTION
|
||||
|
||||
@property
|
||||
def shape(self) -> Tuple[int, ...]:
|
||||
"""
|
||||
Get tensor shape.
|
||||
|
||||
TODO: Return the shape of the stored numpy array.
|
||||
|
||||
HINT: Use .shape attribute of the numpy array
|
||||
EXAMPLE: Tensor([1, 2, 3]).shape should return (3,)
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
return self._data.shape
|
||||
### END SOLUTION
|
||||
|
||||
@property
|
||||
def size(self) -> int:
|
||||
"""
|
||||
Get total number of elements.
|
||||
|
||||
TODO: Return the total number of elements in the tensor.
|
||||
|
||||
HINT: Use .size attribute of the numpy array
|
||||
EXAMPLE: Tensor([1, 2, 3]).size should return 3
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
return self._data.size
|
||||
### END SOLUTION
|
||||
|
||||
@property
|
||||
def dtype(self) -> np.dtype:
|
||||
"""
|
||||
Get data type as numpy dtype.
|
||||
|
||||
TODO: Return the data type of the stored numpy array.
|
||||
|
||||
HINT: Use .dtype attribute of the numpy array
|
||||
EXAMPLE: Tensor([1, 2, 3]).dtype should return dtype('int32')
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
return self._data.dtype
|
||||
### END SOLUTION
|
||||
|
||||
def __repr__(self) -> str:
|
||||
"""
|
||||
String representation.
|
||||
|
||||
TODO: Create a clear string representation of the tensor.
|
||||
|
||||
APPROACH:
|
||||
1. Convert the numpy array to a list for readable output
|
||||
2. Include the shape and dtype information
|
||||
3. Format: "Tensor([data], shape=shape, dtype=dtype)"
|
||||
|
||||
EXAMPLE:
|
||||
Tensor([1, 2, 3]) → "Tensor([1, 2, 3], shape=(3,), dtype=int32)"
|
||||
|
||||
HINTS:
|
||||
- Use .tolist() to convert numpy array to list
|
||||
- Include shape and dtype information
|
||||
- Keep format consistent and readable
|
||||
"""
|
||||
### BEGIN SOLUTION
|
||||
return f"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})"
|
||||
### END SOLUTION
|
||||
|
||||
    def add(self, other: 'Tensor') -> 'Tensor':
        """
        Add two tensors element-wise.

        TODO: Implement tensor addition.

        APPROACH:
        1. Add the numpy arrays using +
        2. Return a new Tensor with the result
        3. Handle broadcasting automatically

        EXAMPLE:
        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])

        HINTS:
        - Use self._data + other._data
        - Return Tensor(result)
        - NumPy handles broadcasting automatically
        """
        ### BEGIN SOLUTION
        result = self._data + other._data
        return Tensor(result)
        ### END SOLUTION

    def multiply(self, other: 'Tensor') -> 'Tensor':
        """
        Multiply two tensors element-wise.

        TODO: Implement tensor multiplication.

        APPROACH:
        1. Multiply the numpy arrays using *
        2. Return a new Tensor with the result
        3. Handle broadcasting automatically

        EXAMPLE:
        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])

        HINTS:
        - Use self._data * other._data
        - Return Tensor(result)
        - This is element-wise, not matrix multiplication
        """
        ### BEGIN SOLUTION
        result = self._data * other._data
        return Tensor(result)
        ### END SOLUTION

    def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':
        """
        Addition operator: tensor + other

        TODO: Implement the + operator for tensors.

        APPROACH:
        1. If other is a Tensor, use tensor addition
        2. If other is a scalar, convert it to a Tensor first
        3. Return the result

        EXAMPLE:
        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])
        Tensor([1, 2]) + 5 → Tensor([6, 7])
        """
        ### BEGIN SOLUTION
        if isinstance(other, Tensor):
            return self.add(other)
        else:
            return self.add(Tensor(other))
        ### END SOLUTION

    def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':
        """
        Multiplication operator: tensor * other

        TODO: Implement the * operator for tensors.

        APPROACH:
        1. If other is a Tensor, use tensor multiplication
        2. If other is a scalar, convert it to a Tensor first
        3. Return the result

        EXAMPLE:
        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])
        Tensor([1, 2]) * 3 → Tensor([3, 6])
        """
        ### BEGIN SOLUTION
        if isinstance(other, Tensor):
            return self.multiply(other)
        else:
            return self.multiply(Tensor(other))
        ### END SOLUTION

    def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':
        """
        Subtraction operator: tensor - other

        TODO: Implement the - operator for tensors.

        APPROACH:
        1. Convert other to a Tensor if needed
        2. Subtract using the numpy arrays
        3. Return a new Tensor with the result

        EXAMPLE:
        Tensor([5, 6]) - Tensor([1, 2]) → Tensor([4, 4])
        Tensor([5, 6]) - 1 → Tensor([4, 5])
        """
        ### BEGIN SOLUTION
        if isinstance(other, Tensor):
            result = self._data - other._data
        else:
            result = self._data - other
        return Tensor(result)
        ### END SOLUTION

    def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':
        """
        Division operator: tensor / other

        TODO: Implement the / operator for tensors.

        APPROACH:
        1. Convert other to a Tensor if needed
        2. Divide using the numpy arrays
        3. Return a new Tensor with the result

        EXAMPLE (true division always produces floats):
        Tensor([6, 8]) / Tensor([2, 4]) → Tensor([3.0, 2.0])
        Tensor([6, 8]) / 2 → Tensor([3.0, 4.0])
        """
        ### BEGIN SOLUTION
        if isinstance(other, Tensor):
            result = self._data / other._data
        else:
            result = self._data / other
        return Tensor(result)
        ### END SOLUTION
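With the operator methods in place, the whole API can be exercised end to end. The sketch below is self-contained: it declares a minimal stand-in `Tensor` (just `__init__`, `__add__`, and `__mul__`, wrapping its argument in `np.array` as the class above does) so the expected behavior can be checked in isolation; in the notebook itself you would use the full class built above directly.

```python
import numpy as np

# Minimal stand-in for the Tensor class built in this module,
# kept to the two operators exercised below.
class Tensor:
    def __init__(self, data):
        self._data = np.array(data)

    def __add__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        return Tensor(self._data + other._data)  # NumPy broadcasts as needed

    def __mul__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        return Tensor(self._data * other._data)  # element-wise, not matmul

a = Tensor([1, 2])
b = Tensor([3, 4])
print((a + b)._data.tolist())  # [4, 6]
print((a * 3)._data.tolist())  # [3, 6]  (scalar is promoted to a Tensor)
```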
@@ -113,6 +113,45 @@ class ExportCommand(BaseCommand):
 
         console.print(Panel(exports_text, title="Export Summary", border_style="bright_green"))
 
+    def _convert_py_to_notebook(self, module_path: Path) -> bool:
+        """Convert .py dev file to .ipynb using Jupytext."""
+        module_name = module_path.name
+        short_name = module_name[3:] if module_name.startswith(tuple(f"{i:02d}_" for i in range(100))) else module_name
+
+        dev_file = module_path / f"{short_name}_dev.py"
+        if not dev_file.exists():
+            return False
+
+        notebook_file = module_path / f"{short_name}_dev.ipynb"
+
+        # Check if notebook is newer than .py file
+        if notebook_file.exists():
+            py_mtime = dev_file.stat().st_mtime
+            nb_mtime = notebook_file.stat().st_mtime
+            if nb_mtime > py_mtime:
+                return True  # Notebook is up to date
+
+        try:
+            result = subprocess.run(
+                ["jupytext", "--to", "ipynb", str(dev_file)],
+                capture_output=True, text=True, cwd=module_path,
+            )
+            return result.returncode == 0
+        except FileNotFoundError:
+            return False
+
+    def _convert_all_modules(self) -> list:
+        """Convert all modules' .py files to .ipynb files."""
+        modules = self._discover_modules()
+        converted = []
+
+        for module_name in modules:
+            module_path = Path(f"modules/source/{module_name}")
+            if self._convert_py_to_notebook(module_path):
+                converted.append(module_name)
+
+        return converted
+
     def run(self, args: Namespace) -> int:
         console = self.console
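The freshness check in `_convert_py_to_notebook` above is an ordinary mtime comparison: rebuild only when the output is missing or older than its source. A self-contained sketch of the same pattern (the file names here are illustrative, not from the repository):

```python
import tempfile
import time
from pathlib import Path

def needs_rebuild(src: Path, out: Path) -> bool:
    """Rebuild when the output is missing or strictly older than the source."""
    if not out.exists():
        return True
    return src.stat().st_mtime > out.stat().st_mtime

with tempfile.TemporaryDirectory() as d:
    src = Path(d) / "tensor_dev.py"
    out = Path(d) / "tensor_dev.ipynb"
    src.write_text("# source")
    print(needs_rebuild(src, out))  # True: the notebook does not exist yet
    time.sleep(0.01)
    out.write_text("{}")
    print(needs_rebuild(src, out))  # False: the notebook is at least as new
```

One design note: the strict `>` means equal timestamps count as up to date, which matters on filesystems with coarse mtime resolution.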
@@ -136,17 +175,35 @@ class ExportCommand(BaseCommand):
                 return 1
 
             console.print(Panel(f"🔄 Exporting Module: {args.module}",
-                                title="nbdev Export", border_style="bright_cyan"))
+                                title="Complete Export Workflow", border_style="bright_cyan"))
+
+            # Step 1: Convert .py to .ipynb
+            console.print(f"📝 Converting {args.module} Python file to notebook...")
+            if not self._convert_py_to_notebook(module_path):
+                console.print(Panel("[red]❌ Failed to convert .py file to notebook. Is jupytext installed?[/red]",
+                                    title="Conversion Error", border_style="red"))
+                return 1
 
             console.print(f"🔄 Exporting {args.module} notebook to tinytorch package...")
 
-            # Use nbdev_export with --path for specific module
+            # Step 2: Use nbdev_export with --path for specific module
             cmd = ["nbdev_export", "--path", str(module_path)]
         elif hasattr(args, 'all') and args.all:
-            console.print(Panel("🔄 Exporting All Notebooks to Package",
-                                title="nbdev Export", border_style="bright_cyan"))
+            console.print(Panel("🔄 Exporting All Modules to Package",
+                                title="Complete Export Workflow", border_style="bright_cyan"))
+
+            # Step 1: Convert all .py files to .ipynb
+            console.print("📝 Converting all Python files to notebooks...")
+            converted = self._convert_all_modules()
+            if not converted:
+                console.print(Panel("[red]❌ No modules converted. Check if jupytext is installed and .py files exist.[/red]",
+                                    title="Conversion Error", border_style="red"))
+                return 1
+
+            console.print(f"✅ Converted {len(converted)} modules: {', '.join(converted)}")
+            console.print("🔄 Exporting all notebook code to tinytorch package...")
 
-            # Use nbdev_export for all modules
+            # Step 2: Use nbdev_export for all modules
             cmd = ["nbdev_export"]
         else:
             console.print(Panel("[red]❌ Must specify either a module name or --all[/red]",
@@ -59,6 +59,8 @@ class TinyTorchCLI:
             'module': ModuleCommand,
             'package': PackageCommand,
             'nbgrader': NBGraderCommand,
+            # Convenience commands
+            'export': ExportCommand,
         }
 
     def create_parser(self) -> argparse.ArgumentParser:
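The registry touched in the hunk above is a plain dict-dispatch pattern: each command name maps to a command class whose `run` method receives the parsed arguments. A minimal sketch of that pattern with a hypothetical `HelloCommand` (not part of the tito CLI):

```python
import argparse

# Hypothetical command class, mirroring the registry's shape
class HelloCommand:
    def run(self, args) -> int:
        print(f"hello {args.name}")
        return 0  # exit code

COMMANDS = {'hello': HelloCommand}

parser = argparse.ArgumentParser()
sub = parser.add_subparsers(dest='command')
hello_parser = sub.add_parser('hello')
hello_parser.add_argument('name')

# Parse an explicit argv list, then dispatch through the registry
args = parser.parse_args(['hello', 'world'])
exit_code = COMMANDS[args.command]().run(args)  # prints "hello world"
```

Adding a new command is then a one-line change to the dict, which is why the diff above can register `export` without touching the dispatch logic.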
@@ -77,7 +79,8 @@ Command Groups:
 Examples:
   tito system info                 Show system information
   tito module status --metadata    Module status with metadata
-  tito package export              Export notebooks to package
+  tito export 01_tensor            Export specific module to package
+  tito export --all                Export all modules to package
   tito nbgrader generate setup     Generate assignment from setup module
 """
         )
@@ -174,7 +177,8 @@ Examples:
             "[bold]Quick Start:[/bold]\n"
             "  [dim]tito system info[/dim] - Show system information\n"
             "  [dim]tito module status --metadata[/dim] - Module status with metadata\n"
-            "  [dim]tito package export[/dim] - Export notebooks to package\n"
+            "  [dim]tito export 01_tensor[/dim] - Export specific module to package\n"
+            "  [dim]tito export --all[/dim] - Export all modules to package\n"
             "  [dim]tito nbgrader generate setup[/dim] - Generate assignment from setup module\n\n"
             "[bold]Get Help:[/bold]\n"
             "  [dim]tito system[/dim] - Show system subcommands\n"