TinyTorch/modules/02_tensor/tensor_dev.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c8575dba",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "# Tensor - The Foundation of Machine Learning\n",
    "\n",
    "Welcome to Tensor! You'll build the fundamental data structure that powers every neural network.\n",
    "\n",
    "## 🔗 Building on Previous Learning\n",
    "**What You Built Before**:\n",
    "- Module 01 (Setup): Python environment with NumPy, the foundation for numerical computing\n",
    "\n",
    "**What's Working**: You have a complete development environment with all the tools needed for machine learning!\n",
    "\n",
    "**The Gap**: You can import NumPy, but you need to understand how to build the core data structure that makes ML possible.\n",
    "\n",
    "**This Module's Solution**: Build a complete Tensor class that wraps NumPy arrays with ML-specific operations and memory management.\n",
    "\n",
    "**Connection Map**:\n",
    "```\n",
    "Setup → Tensor → Activations\n",
    "(tools)   (data)   (nonlinearity)\n",
    "```\n",
    "\n",
    "## Learning Objectives\n",
    "\n",
    "By completing this module, you will:\n",
    "\n",
    "1. **Implement tensor operations** - Build a complete N-dimensional array system with arithmetic, broadcasting, and matrix multiplication\n",
    "2. **Master memory efficiency** - Understand why memory layout affects performance more than algorithm choice\n",
    "3. **Create ML-ready APIs** - Design clean interfaces that mirror PyTorch and TensorFlow patterns\n",
    "4. **Enable neural networks** - Build the foundation that supports weights, biases, and data in all ML models\n",
    "\n",
    "## Build → Test → Use\n",
    "\n",
    "1. **Build**: Implement Tensor class with creation, arithmetic, and advanced operations\n",
    "2. **Test**: Validate each component immediately to ensure correctness and performance\n",
    "3. **Use**: Apply tensors to real multi-dimensional data operations that neural networks require"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68dcb6b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "#| default_exp core.tensor\n",
    "\n",
    "#| export\n",
    "import numpy as np\n",
    "import sys\n",
    "from typing import Union, Tuple, Optional, Any\n",
    "import warnings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74cad3a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "print(\"🔥 TinyTorch Tensor Module\")\n",
    "print(f\"NumPy version: {np.__version__}\")\n",
    "print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
    "print(\"Ready to build tensors!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "285c53b1",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## Understanding Tensors: Visual Guide\n",
    "\n",
    "### What Are Tensors? A Visual Journey\n",
    "\n",
    "**The Story**: Think of tensors as smart containers that know their shape and can efficiently store numbers for machine learning. They're like upgraded versions of regular Python lists that understand mathematics.\n",
    "\n",
    "```\n",
    "Scalar (0D Tensor):     Vector (1D Tensor):     Matrix (2D Tensor):\n",
    "     [5]                   [1, 2, 3]             ┌ 1  2  3 ┐\n",
    "                                                  │ 4  5  6 │\n",
    "                                                  └ 7  8  9 ┘\n",
    "\n",
    "3D Tensor (RGB Image):                   4D Tensor (Batch of Images):\n",
    "┌─────────────┐                         ┌─────────────┐ ┌─────────────┐\n",
    "│ Red Channel │                         │   Image 1   │ │   Image 2   │\n",
    "│             │                         │             │ │             │\n",
    "└─────────────┘                         └─────────────┘ └─────────────┘\n",
    "┌─────────────┐                                      ...\n",
    "│Green Channel│\n",
    "│             │\n",
    "└─────────────┘\n",
    "┌─────────────┐\n",
    "│Blue Channel │\n",
    "│             │\n",
    "└─────────────┘\n",
    "```\n",
    "\n",
    "**What's happening step-by-step**: As we add dimensions, tensors represent more complex data. A single number becomes a list, a list becomes a grid, a grid becomes a volume (like an image with red/green/blue channels), and a volume becomes a collection (like a batch of images for training). Each dimension adds a new way to organize and access the data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "840238d6",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Memory Layout: Why Performance Matters\n",
    "\n",
    "**The Story**: Imagine your computer's memory as a long street with numbered houses. When your CPU needs data, it doesn't just grab one house - it loads an entire city block (64 bytes) into its cache.\n",
    "\n",
    "```\n",
    "Contiguous Memory (FAST):\n",
    "[1][2][3][4][5][6] ──> Cache-friendly, vectorized operations\n",
    " ↑  ↑  ↑  ↑  ↑  ↑\n",
    " Sequential access pattern\n",
    "\n",
    "Non-contiguous Memory (SLOW):\n",
    "[1]...[2].....[3] ──> Cache misses, scattered access\n",
    " ↑     ↑       ↑\n",
    " Random access pattern\n",
    "```\n",
    "\n",
    "**What's happening step-by-step**: When you access element [1], the CPU automatically loads elements [1] through [6] in one cache load. Every subsequent access ([2], [3], [4]...) is already in the cache - no extra memory trips needed! With non-contiguous data, each access requires a new, expensive trip to main memory.\n",
    "\n",
    "**The Performance Impact**: This creates 10-100x speedups because you get 6 elements for the price of fetching 1. It's like getting 6 books from the library for the effort of finding just 1."
   ]
  },
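  {
   "cell_type": "markdown",
   "id": "b7e2a410",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "A minimal sketch, using plain NumPy, of how to check the layout ideas above. The array names (`a`, `a_t`, `a_t_c`) are illustrative and not part of TinyTorch. A transpose is a view over the same buffer with swapped strides, so it is no longer C-contiguous; `np.ascontiguousarray` copies it into a fresh row-major buffer.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Illustrative arrays; the names are assumptions for this sketch.\n",
    "a = np.arange(6, dtype=np.float32).reshape(2, 3)\n",
    "a_t = a.T  # transpose is a view: same buffer, swapped strides\n",
    "\n",
    "print(a.flags['C_CONTIGUOUS'], a.strides)      # row-major layout\n",
    "print(a_t.flags['C_CONTIGUOUS'], a_t.strides)  # strided view\n",
    "\n",
    "# Copy into a fresh row-major buffer when contiguity matters.\n",
    "a_t_c = np.ascontiguousarray(a_t)\n",
    "print(a_t_c.flags['C_CONTIGUOUS'], a_t_c.strides)\n",
    "```\n",
    "\n",
    "Strides are reported in bytes: with `float32`, moving one column in `a` is 4 bytes and moving one row is 12 bytes."
   ]
  },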
  {
   "cell_type": "markdown",
   "id": "86cb7d01",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Tensor Operations: Broadcasting Magic\n",
    "\n",
    "**The Story**: Broadcasting is like having a smart photocopier that automatically copies data to match different shapes without actually using extra memory. It's NumPy's way of making operations \"just work\" between tensors of different sizes.\n",
    "\n",
    "```\n",
    "Broadcasting Example:\n",
    "    Matrix (2×3)     +     Scalar        =     Result (2×3)\n",
    "  ┌ 1  2  3 ┐             [10]              ┌ 11 12 13 ┐\n",
    "  └ 4  5  6 ┘                               └ 14 15 16 ┘\n",
    "\n",
    "Broadcasting Rules:\n",
    "1. Align shapes from right to left\n",
    "2. Dimensions of size 1 stretch to match\n",
    "3. Missing dimensions assume size 1\n",
    "\n",
    "Vector + Matrix Broadcasting:\n",
    "  [1, 2, 3]    +    [[10],     =    [[11, 12, 13],\n",
    "  (1×3)             [20]]            [21, 22, 23]]\n",
    "                    (2×1)            (2×3)\n",
    "```\n",
    "\n",
    "**What's happening step-by-step**: Python aligns shapes from right to left, like comparing numbers by their ones place first. When shapes don't match, dimensions of size 1 automatically \"stretch\" to match the larger dimension - but no data is actually copied. The operation happens as if the data were copied, but uses the original memory locations.\n",
    "\n",
    "**Why this matters for ML**: Adding a bias vector to a 1000×1000 matrix would normally require copying the vector 1000 times, but broadcasting does it with zero copies and massive memory savings."
   ]
  },
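  {
   "cell_type": "markdown",
   "id": "c3d9f582",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "The vector + matrix case above can be reproduced directly in NumPy. This is a sketch with illustrative names (`row`, `col`, `stretched`); `np.broadcast_to` is used here only to make the no-copy behavior visible through a stride of 0 on the stretched axis.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "row = np.array([1, 2, 3])      # shape (3,), treated as (1, 3)\n",
    "col = np.array([[10], [20]])   # shape (2, 1)\n",
    "result = row + col             # broadcast to shape (2, 3)\n",
    "print(result.tolist())\n",
    "\n",
    "# np.broadcast_to returns a read-only view. A stride of 0 on the\n",
    "# stretched axis means every row aliases the same three elements.\n",
    "stretched = np.broadcast_to(row, (2, 3))\n",
    "print(stretched.shape, stretched.strides[0])\n",
    "```\n",
    "\n",
    "Because the stretched view is read-only, writing to it raises an error; make a copy first if the expanded data must be modified."
   ]
  },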
  {
   "cell_type": "markdown",
   "id": "37bb2239",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Neural Network Data Flow\n",
    "\n",
    "```\n",
    "Batch Processing in Neural Networks:\n",
    "\n",
    "Input Batch (32 images, 28×28 pixels):\n",
    "┌─────────────────────────────────┐\n",
    "│ [Batch=32, Height=28, Width=28] │\n",
    "└─────────────────────────────────┘\n",
    "             ↓ Flatten\n",
    "┌─────────────────────────────────┐\n",
    "│     [Batch=32, Features=784]    │ ← Matrix multiplication ready\n",
    "└─────────────────────────────────┘\n",
    "             ↓ Linear Layer\n",
    "┌─────────────────────────────────┐\n",
    "│     [Batch=32, Hidden=128]      │ ← Hidden layer activations\n",
    "└─────────────────────────────────┘\n",
    "\n",
    "Why batching matters:\n",
    "- Single image: 784 × 128 = 100,352 operations\n",
    "- Batch of 32: Same 100,352 ops, but 32× the data\n",
    "- GPU utilization: 32× better parallelization\n",
    "```"
   ]
  },
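  {
   "cell_type": "markdown",
   "id": "d4a1e693",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "A sketch of the data flow above in NumPy, assuming random data and a hypothetical weight matrix `weights` of shape (784, 128); no bias or activation is applied. One matrix multiplication handles all 32 images at once.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "images = rng.standard_normal((32, 28, 28)).astype(np.float32)\n",
    "weights = rng.standard_normal((784, 128)).astype(np.float32)  # hypothetical layer weights\n",
    "\n",
    "flat = images.reshape(32, -1)   # flatten each image: (32, 784)\n",
    "hidden = flat @ weights         # linear layer without bias: (32, 128)\n",
    "\n",
    "macs_per_image = 784 * 128      # multiply-adds for one image\n",
    "print(flat.shape, hidden.shape)\n",
    "print(macs_per_image, 32 * macs_per_image)\n",
    "```\n",
    "\n",
    "The `-1` in `reshape` lets NumPy infer the feature count (28 × 28 = 784) from the remaining dimensions."
   ]
  },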
  {
   "cell_type": "markdown",
   "id": "2e97ea75",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## The Mathematical Foundation\n",
    "\n",
    "Before we implement, let's understand the mathematical concepts:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a2597fa",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Scalars to Tensors: Building Complexity\n",
    "\n",
    "**Scalar (Rank 0)**:\n",
    "- A single number: `5.0` or `temperature`\n",
    "- Shape: `()` (empty tuple)\n",
    "- ML examples: loss values, learning rates\n",
    "\n",
    "**Vector (Rank 1)**:\n",
    "- Ordered list of numbers: `[1, 2, 3]`\n",
    "- Shape: `(3,)` (one dimension)\n",
    "- ML examples: word embeddings, gradients\n",
    "\n",
    "**Matrix (Rank 2)**:\n",
    "- 2D array: `[[1, 2], [3, 4]]`\n",
    "- Shape: `(2, 2)` (rows, columns)\n",
    "- ML examples: weight matrices, images\n",
    "\n",
    "**Higher-Order Tensors**:\n",
    "- 3D: RGB images `(height, width, channels)`\n",
    "- 4D: Image batches `(batch, height, width, channels)`\n",
    "- 5D: Video batches `(batch, time, height, width, channels)`"
   ]
  },
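  {
   "cell_type": "markdown",
   "id": "e5b2f7a4",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "These ranks can be checked with NumPy, where `ndim` is the rank and `shape` gives the size along each dimension. A small sketch; the dictionary `examples` and the image batch sizes are arbitrary choices for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "examples = {\n",
    "    'scalar': np.array(5.0),\n",
    "    'vector': np.array([1, 2, 3]),\n",
    "    'matrix': np.array([[1, 2], [3, 4]]),\n",
    "    # (batch, height, width, channels); sizes chosen arbitrarily\n",
    "    'image_batch': np.zeros((8, 32, 32, 3)),\n",
    "}\n",
    "\n",
    "for name, arr in examples.items():\n",
    "    print(name, arr.ndim, arr.shape)\n",
    "```\n",
    "\n",
    "A rank-0 array has the empty shape `()`, which matches the scalar description above."
   ]
  },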
  {
   "cell_type": "markdown",
   "id": "51dbe323",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Why Not Just Use NumPy?\n",
    "\n",
    "While NumPy is excellent, our Tensor class adds ML-specific features:\n",
    "\n",
    "**Future Extensions** (coming in later modules):\n",
    "- **Automatic gradients**: Track operations for backpropagation\n",
    "- **GPU acceleration**: Move computations to graphics cards\n",
    "- **Lazy evaluation**: Build computation graphs for optimization\n",
    "\n",
    "**Educational Value**:\n",
    "- **Understanding**: See how PyTorch/TensorFlow work internally\n",
    "- **Debugging**: Trace operations step by step\n",
    "- **Customization**: Add domain-specific operations"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "076ad694",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Implementation Overview\n",
    "\n",
    "Our Tensor class design:\n",
    "\n",
    "```python\n",
    "class Tensor:\n",
    "    def __init__(self, data)      # Create from any data type\n",
    "\n",
    "    # Properties\n",
    "    .shape                        # Dimensions tuple\n",
    "    .size                         # Total element count\n",
    "    .dtype                        # Data type\n",
    "    .data                         # Access underlying NumPy array\n",
    "\n",
    "    # Arithmetic Operations\n",
    "    def __add__(self, other)      # tensor + tensor\n",
    "    def __mul__(self, other)      # tensor * tensor\n",
    "    def __sub__(self, other)      # tensor - tensor\n",
    "    def __truediv__(self, other)  # tensor / tensor\n",
    "\n",
    "    # Advanced Operations\n",
    "    def matmul(self, other)       # Matrix multiplication\n",
    "    def sum(self, axis=None)      # Sum along axes\n",
    "    def reshape(self, *shape)     # Change shape\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc9cadb3",
   "metadata": {
    "lines_to_next_cell": 1,
    "nbgrader": {
     "grade": false,
     "grade_id": "tensor-init",
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "\n",
    "#| export\n",
    "class Tensor:\n",
    "    \"\"\"\n",
    "    TinyTorch Tensor: N-dimensional array with ML operations.\n",
    "\n",
    "    The fundamental data structure for all TinyTorch operations.\n",
    "    Wraps NumPy arrays with ML-specific functionality.\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, data: Any, dtype: Optional[str] = None, requires_grad: bool = False):\n",
    "        \"\"\"\n",
    "        Create a new tensor from data.\n",
    "\n",
    "        Args:\n",
    "            data: Input data (scalar, list, or numpy array)\n",
    "            dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
    "            requires_grad: Whether this tensor needs gradients for training. Defaults to False.\n",
    "\n",
    "        TODO: Implement tensor creation with simple, clear type handling.\n",
    "\n",
    "        APPROACH (Clear implementation for learning):\n",
    "        1. Convert input data to numpy array - NumPy handles conversions\n",
    "        2. Apply dtype if specified - common string types like 'float32'\n",
    "        3. Set default float32 for float64 arrays - ML convention for efficiency\n",
    "        4. Store the result in self._data - internal storage for numpy array\n",
    "        5. Initialize gradient tracking - prepares for automatic differentiation\n",
    "\n",
    "        EXAMPLE:\n",
    "        >>> Tensor(5)\n",
    "        # Creates: np.array(5, dtype='int32')\n",
    "        >>> Tensor([1.0, 2.0, 3.0])\n",
    "        # Creates: np.array([1.0, 2.0, 3.0], dtype='float32')\n",
    "        >>> Tensor([1, 2, 3], dtype='float32')\n",
    "        # Creates: np.array([1, 2, 3], dtype='float32')\n",
    "\n",
    "        PRODUCTION CONTEXT:\n",
    "        PyTorch tensors handle 47+ dtype formats with complex validation.\n",
    "        Our version teaches the core concept that transfers directly.\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        # Convert input to numpy array - let NumPy handle most conversions\n",
    "        if isinstance(data, Tensor):\n",
    "            # Input is another Tensor - copy data efficiently\n",
    "            self._data = data.data.copy()\n",
    "        else:\n",
    "            # Convert to numpy array\n",
    "            self._data = np.array(data)\n",
    "\n",
    "        # Apply dtype if specified\n",
    "        if dtype is not None:\n",
    "            self._data = self._data.astype(dtype)\n",
    "        elif self._data.dtype == np.float64:\n",
    "            # ML convention: prefer float32 for memory and GPU efficiency\n",
    "            self._data = self._data.astype(np.float32)\n",
    "\n",
    "        # Initialize gradient tracking attributes (used in Module 9 - Autograd)\n",
    "        self.requires_grad = requires_grad\n",
    "        self.grad = None\n",
    "        self._grad_fn = None\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def data(self) -> np.ndarray:\n",
    "        \"\"\"\n",
    "        Access underlying numpy array.\n",
    "\n",
    "        TODO: Return the stored numpy array.\n",
    "\n",
    "        APPROACH (Medium comments for property methods):\n",
    "        1. Access the internal _data attribute\n",
    "        2. Return the numpy array directly - enables NumPy integration\n",
    "        3. This provides access to underlying data for visualization/analysis\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - PyTorch: tensor.numpy() converts to NumPy for scientific computing\n",
    "        - TensorFlow: tensor.numpy() enables integration with matplotlib/scipy\n",
    "        - Production use: Data scientists need raw arrays for debugging/visualization\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data\n",
    "        ### END SOLUTION\n",
    "    \n",
    "    @data.setter\n",
    "    def data(self, value: Union[np.ndarray, 'Tensor']) -> None:\n",
    "        \"\"\"Set the underlying data of the tensor.\"\"\"\n",
    "        if isinstance(value, Tensor):\n",
    "            self._data = value._data.copy()\n",
    "        else:\n",
    "            self._data = np.array(value)\n",
    "\n",
    "    @property\n",
    "    def shape(self) -> Tuple[int, ...]:\n",
    "        \"\"\"\n",
    "        Get tensor shape.\n",
    "\n",
    "        TODO: Return the shape of the stored numpy array.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Access the _data attribute (the NumPy array)\n",
    "        2. Get the shape property from the NumPy array\n",
    "        3. Return the shape tuple directly\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Neural networks: Layer compatibility requires matching shapes\n",
    "        - Computer vision: Image shape (height, width, channels) determines architecture\n",
    "        - Debugging: Shape mismatches are the #1 cause of ML errors\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.shape\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def size(self) -> int:\n",
    "        \"\"\"\n",
    "        Get total number of elements.\n",
    "\n",
    "        TODO: Return the total number of elements in the tensor.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Access the _data attribute (the NumPy array)\n",
    "        2. Get the size property from the NumPy array\n",
    "        3. Return the total element count as an integer\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Memory planning: Calculate RAM requirements for large tensors\n",
    "        - Model architecture: Determine parameter counts for layers\n",
    "        - Performance: Size affects computation time and vectorization efficiency\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.size\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def dtype(self) -> np.dtype:\n",
    "        \"\"\"\n",
    "        Get data type as numpy dtype.\n",
    "\n",
    "        TODO: Return the data type of the stored numpy array.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Access the _data attribute\n",
    "        2. Get the dtype property\n",
    "        3. Return the NumPy dtype object\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Precision vs speed: float32 is faster, float64 more accurate\n",
    "        - Memory optimization: int8 uses 1/4 memory of int32\n",
    "        - GPU compatibility: Some operations only work with specific types\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.dtype\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def strides(self) -> Tuple[int, ...]:\n",
    "        \"\"\"\n",
    "        Get memory stride pattern of the tensor.\n",
    "        \n",
    "        Returns:\n",
    "            Tuple of byte strides for each dimension\n",
    "            \n",
    "        PRODUCTION CONNECTION:\n",
    "        - Memory layout analysis: Understanding cache efficiency\n",
    "        - Performance debugging: Non-unit strides can indicate copies\n",
    "        - Advanced operations: Enables efficient transpose and reshape operations\n",
    "        \"\"\"\n",
    "        return self._data.strides\n",
    "    \n",
    "    @property\n",
    "    def is_contiguous(self) -> bool:\n",
    "        \"\"\"\n",
    "        Check if tensor data is stored in contiguous memory.\n",
    "        \n",
    "        Returns:\n",
    "            True if data is contiguous in C-order (row-major)\n",
    "            \n",
    "        PRODUCTION CONNECTION:\n",
    "        - Performance critical: Contiguous data enables vectorization\n",
    "        - Memory efficiency: Contiguous operations can be 10-100x faster\n",
    "        - GPU transfers: Contiguous data transfers more efficiently\n",
    "        \"\"\"\n",
    "        return self._data.flags['C_CONTIGUOUS']\n",
    "\n",
    "    def __repr__(self) -> str:\n",
    "        \"\"\"\n",
    "        String representation with size limits for readability.\n",
    "\n",
    "        TODO: Create a clear string representation of the tensor.\n",
    "\n",
    "        APPROACH (Light comments for utility methods):\n",
    "        1. Check tensor size - if large, show shape/dtype only\n",
    "        2. For small tensors, convert numpy array to list using .tolist()\n",
    "        3. Format appropriately and return string\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2, 3]) → \"Tensor([1, 2, 3], shape=(3,), dtype=int32)\"\n",
    "        Large tensor → \"Tensor(shape=(1000, 1000), dtype=float32)\"\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if self.size > 20:\n",
    "            # Large tensors: show shape and dtype only for readability\n",
    "            return f\"Tensor(shape={self.shape}, dtype={self.dtype})\"\n",
    "        else:\n",
    "            # Small tensors: show data, shape, and dtype\n",
    "            return f\"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})\"\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def item(self) -> Union[int, float]:\n",
    "        \"\"\"Extract a scalar value from a single-element tensor.\"\"\"\n",
    "        if self._data.size != 1:\n",
    "            raise ValueError(f\"item() can only be called on tensors with exactly one element, got {self._data.size} elements\")\n",
    "        return self._data.item()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91b993b2",
   "metadata": {
    "nbgrader": {
     "grade": false,
     "grade_id": "tensor-arithmetic",
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "    def add(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Add two tensors element-wise.\n",
    "\n",
    "        TODO: Implement tensor addition.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Use NumPy's + operator for element-wise addition\n",
    "        3. Create new Tensor object with result\n",
    "        4. Return the new tensor\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Neural networks: Adding bias terms to linear layer outputs\n",
    "        - Residual connections: skip connections in ResNet architectures\n",
    "        - Gradient updates: Adding computed gradients to parameters\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result_data = self._data + other._data\n",
    "        result = Tensor(result_data)\n",
    "        \n",
    "        # TODO: Gradient tracking will be added in Module 9 (Autograd)\n",
    "        # This enables automatic differentiation for neural network training\n",
    "        # For now, we focus on the core tensor operation\n",
    "        \n",
    "        return result\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def multiply(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Multiply two tensors element-wise.\n",
    "\n",
    "        TODO: Implement tensor multiplication.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Use NumPy's * operator for element-wise multiplication\n",
    "        3. Create new Tensor object with result\n",
    "        4. Return the new tensor\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Activation functions: Element-wise operations like ReLU masking\n",
    "        - Attention mechanisms: Element-wise scaling in transformer models\n",
    "        - Feature scaling: Multiplying features by learned scaling factors\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result_data = self._data * other._data\n",
    "        result = Tensor(result_data)\n",
    "        \n",
    "        # TODO: Gradient tracking will be added in Module 9 (Autograd)\n",
    "        # This enables automatic differentiation for neural network training\n",
    "        # For now, we focus on the core tensor operation\n",
    "        \n",
    "        return result\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Addition operator: tensor + other\n",
    "\n",
    "        TODO: Implement + operator for tensors.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, call the add() method directly\n",
    "        3. If scalar, convert to Tensor then call add()\n",
    "        4. Return the result from add() method\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Natural syntax: tensor + scalar enables intuitive code\n",
    "        - Broadcasting: Adding scalars to tensors is common in ML\n",
    "        - API design: Clean interfaces reduce cognitive load for researchers\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            return self.add(other)\n",
    "        else:\n",
    "            return self.add(Tensor(other))\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Multiplication operator: tensor * other\n",
    "\n",
    "        TODO: Implement * operator for tensors.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, call the multiply() method directly\n",
    "        3. If scalar, convert to Tensor then call multiply()\n",
    "        4. Return the result from multiply() method\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Scaling features: tensor * learning_rate for gradient updates\n",
    "        - Masking: tensor * mask for attention mechanisms\n",
    "        - Regularization: tensor * dropout_mask during training\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            return self.multiply(other)\n",
    "        else:\n",
    "            return self.multiply(Tensor(other))\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Subtraction operator: tensor - other\n",
    "\n",
    "        TODO: Implement - operator for tensors.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, subtract other._data from self._data\n",
    "        3. If scalar, subtract scalar directly from self._data\n",
    "        4. Create new Tensor with result and return\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Gradient computation: parameter - learning_rate * gradient\n",
    "        - Error calculation: predicted - actual for loss computation\n",
    "        - Centering data: tensor - mean for zero-centered inputs\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            result = self._data - other._data\n",
    "        else:\n",
    "            result = self._data - other\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Division operator: tensor / other\n",
    "\n",
    "        TODO: Implement / operator for tensors.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, divide self._data by other._data\n",
    "        3. If scalar, divide self._data by scalar directly\n",
    "        4. Create new Tensor with result and return\n",
    "\n",
    "        PRODUCTION CONNECTION:\n",
    "        - Normalization: tensor / std_deviation for standard scaling\n",
    "        - Learning rate decay: parameter / decay_factor over time\n",
    "        - Probability computation: counts / total_counts for frequencies\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            result = self._data / other._data\n",
    "        else:\n",
    "            result = self._data / other\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def mean(self) -> 'Tensor':\n",
    "        \"\"\"Computes the mean of the tensor's elements.\"\"\"\n",
    "        return Tensor(np.mean(self.data))\n",
    "    \n",
    "    def sum(self, axis=None, keepdims=False) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Sum tensor elements along specified axes.\n",
    "        \n",
    "        Args:\n",
    "            axis: Axis or axes to sum over. If None, sum all elements.\n",
    "            keepdims: Whether to keep dimensions of size 1 in output.\n",
    "            \n",
    "        Returns:\n",
    "            New tensor with summed values.\n",
    "        \"\"\"\n",
    "        result_data = np.sum(self._data, axis=axis, keepdims=keepdims)\n",
    "        result = Tensor(result_data)\n",
    "        \n",
    "        if self.requires_grad:\n",
    "            result.requires_grad = True\n",
    "            \n",
    "            def grad_fn(grad):\n",
    "                # Sum gradient: broadcast gradient back to original shape\n",
    "                grad_data = grad.data\n",
    "                if axis is None:\n",
    "                    # Sum over all axes - gradient is broadcast to full shape\n",
    "                    grad_data = np.full(self.shape, grad_data)\n",
    "                else:\n",
    "                    # Sum over specific axes - expand back those dimensions\n",
    "                    if not isinstance(axis, tuple):\n",
    "                        axis_tuple = (axis,) if axis is not None else ()\n",
    "                    else:\n",
    "                        axis_tuple = axis\n",
    "                    \n",
    "                    # Expand dimensions that were summed\n",
    "                    for ax in sorted(axis_tuple):\n",
    "                        if ax < 0:\n",
    "                            ax = len(self.shape) + ax\n",
    "                        grad_data = np.expand_dims(grad_data, axis=ax)\n",
    "                    \n",
    "                    # Broadcast to original shape\n",
    "                    grad_data = np.broadcast_to(grad_data, self.shape)\n",
    "                \n",
    "                self.backward(Tensor(grad_data))\n",
    "            \n",
    "            result._grad_fn = grad_fn\n",
    "        \n",
    "        return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c4b5e57",
   "metadata": {
    "nbgrader": {
     "grade": false,
     "grade_id": "tensor-matmul",
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "    def matmul(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Matrix multiplication using NumPy's optimized implementation.\n",
    "\n",
    "        TODO: Implement matrix multiplication.\n",
    "\n",
    "        APPROACH:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Check tensor shapes for compatibility\n",
    "        3. Use NumPy's optimized dot product\n",
    "        4. Create new Tensor object with the result\n",
    "        5. Return the new tensor\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        a_data = self._data\n",
    "        b_data = other._data\n",
    "\n",
    "        # Validate tensor shapes\n",
    "        if len(a_data.shape) != 2 or len(b_data.shape) != 2:\n",
    "            raise ValueError(\"matmul requires 2D tensors\")\n",
    "\n",
    "        m, k = a_data.shape\n",
    "        k2, n = b_data.shape\n",
    "\n",
    "        if k != k2:\n",
    "            raise ValueError(f\"Inner dimensions must match: {k} != {k2}\")\n",
    "\n",
    "        # Use NumPy's optimized implementation\n",
    "        result_data = np.dot(a_data, b_data)\n",
    "        return Tensor(result_data)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __matmul__(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Matrix multiplication operator: tensor @ other\n",
    "\n",
    "        Enables the @ operator for matrix multiplication, providing\n",
    "        clean syntax for neural network operations.\n",
    "        \"\"\"\n",
    "        return self.matmul(other)\n",
    "\n",
    "    def backward(self, gradient=None):\n",
    "        \"\"\"\n",
    "        Compute gradients for this tensor and propagate backward.\n",
    "\n",
    "        Basic backward pass - accumulates gradients and propagates to dependencies.\n",
    "        This enables simple gradient computation for basic operations.\n",
    "\n",
    "        Args:\n",
    "            gradient: Gradient from upstream. If None, assumes scalar with grad=1\n",
    "        \"\"\"\n",
    "        if not self.requires_grad:\n",
    "            return\n",
    "\n",
    "        if gradient is None:\n",
    "            # Scalar case - gradient is 1\n",
    "            gradient = Tensor(np.ones_like(self._data))\n",
    "\n",
    "        # Accumulate gradients\n",
    "        if self.grad is None:\n",
    "            self.grad = gradient\n",
    "        else:\n",
    "            self.grad = self.grad + gradient\n",
    "\n",
    "        # Propagate to dependencies via grad_fn\n",
    "        if self._grad_fn is not None:\n",
    "            self._grad_fn(gradient)\n",
    "    \n",
    "    def zero_grad(self):\n",
    "        \"\"\"Reset gradients to None. Used by optimizers before backward pass.\"\"\"\n",
    "        self.grad = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8f6f7d5",
   "metadata": {
    "nbgrader": {
     "grade": false,
     "grade_id": "tensor-reshape",
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "    def reshape(self, *shape: int) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Return a new tensor with the same data but different shape.\n",
    "\n",
    "        Args:\n",
    "            *shape: New shape dimensions. Use -1 for automatic sizing.\n",
    "\n",
    "        Returns:\n",
    "            New Tensor with reshaped data\n",
    "            \n",
    "        Note:\n",
    "            This returns a view when possible (no copying), or a copy when necessary.\n",
    "            Use .contiguous() after reshape if you need guaranteed contiguous memory.\n",
    "        \"\"\"\n",
    "        reshaped_data = self._data.reshape(*shape)\n",
    "        result = Tensor(reshaped_data)\n",
    "        \n",
    "        # Preserve gradient tracking\n",
    "        if self.requires_grad:\n",
    "            result.requires_grad = True\n",
    "            \n",
    "            def grad_fn(grad):\n",
    "                # Reshape gradient back to original shape\n",
    "                orig_grad = grad.reshape(*self.shape)\n",
    "                self.backward(orig_grad)\n",
    "            \n",
    "            result._grad_fn = grad_fn\n",
    "        \n",
    "        return result\n",
    "    \n",
    "    def view(self, *shape: int) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Return a view of the tensor with a new shape. Alias for reshape.\n",
    "        \n",
    "        Args:\n",
    "            *shape: New shape dimensions. Use -1 for automatic sizing.\n",
    "            \n",
    "        Returns:\n",
    "            New Tensor sharing the same data (view when possible)\n",
    "            \n",
    "        PRODUCTION CONNECTION:\n",
    "        - PyTorch compatibility: .view() is the PyTorch equivalent\n",
    "        - Memory efficiency: Views avoid copying data when possible\n",
    "        - Performance critical: Views enable efficient transformations\n",
    "        \"\"\"\n",
    "        return self.reshape(*shape)\n",
    "    \n",
    "    def clone(self) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Create a deep copy of the tensor.\n",
    "        \n",
    "        Returns:\n",
    "            New Tensor with copied data\n",
    "            \n",
    "        PRODUCTION CONNECTION:\n",
    "        - Memory isolation: Ensures modifications don't affect original\n",
    "        - Gradient tracking: Clones maintain independent gradient graphs\n",
    "        - Safe operations: Use when you need guaranteed data independence\n",
    "        \"\"\"\n",
    "        cloned_data = self._data.copy()\n",
    "        result = Tensor(cloned_data)\n",
    "        \n",
    "        # Clone preserves gradient requirements but starts fresh grad tracking\n",
    "        result.requires_grad = self.requires_grad\n",
    "        # Note: grad and grad_fn are NOT copied - clone starts fresh\n",
    "        \n",
    "        return result\n",
    "    \n",
    "    def contiguous(self) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Return a contiguous tensor with the same data.\n",
    "        \n",
    "        Returns:\n",
    "            Tensor with contiguous memory layout (may be a copy)\n",
    "            \n",
    "        PRODUCTION CONNECTION:\n",
    "        - Performance optimization: Ensures optimal memory layout\n",
    "        - GPU operations: Many CUDA operations require contiguous data\n",
    "        - Cache efficiency: Contiguous data maximizes CPU cache utilization\n",
    "        \"\"\"\n",
    "        if self.is_contiguous:\n",
    "            return self  # Already contiguous, return self\n",
    "        \n",
    "        # Make contiguous copy\n",
    "        contiguous_data = np.ascontiguousarray(self._data)\n",
    "        result = Tensor(contiguous_data)\n",
    "        \n",
    "        # Preserve gradient tracking\n",
    "        result.requires_grad = self.requires_grad\n",
    "        if self.requires_grad:\n",
    "            def grad_fn(grad):\n",
    "                self.backward(grad)\n",
    "            result._grad_fn = grad_fn\n",
    "        \n",
    "        return result\n",
    "\n",
    "    def numpy(self) -> np.ndarray:\n",
    "        \"\"\"\n",
    "        Convert tensor to NumPy array.\n",
    "        \n",
    "        This is the PyTorch-inspired method for tensor-to-numpy conversion.\n",
    "        Provides clean interface for interoperability with NumPy operations.\n",
    "        \"\"\"\n",
    "        return self._data\n",
    "    \n",
    "    def __array__(self, dtype=None) -> np.ndarray:\n",
    "        \"\"\"Enable np.array(tensor) and np.allclose(tensor, array).\"\"\"\n",
    "        if dtype is not None:\n",
    "            return self._data.astype(dtype)\n",
    "        return self._data\n",
    "    \n",
    "    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):\n",
    "        \"\"\"Enable NumPy universal functions with Tensor objects.\"\"\"\n",
    "        # Convert Tensor inputs to NumPy arrays\n",
    "        args = []\n",
    "        for input_ in inputs:\n",
    "            if isinstance(input_, Tensor):\n",
    "                args.append(input_._data)\n",
    "            else:\n",
    "                args.append(input_)\n",
    "        \n",
    "        # Call the ufunc on NumPy arrays\n",
    "        outputs = getattr(ufunc, method)(*args, **kwargs)\n",
    "        \n",
    "        # If method returns NotImplemented, let NumPy handle it\n",
    "        if outputs is NotImplemented:\n",
    "            return NotImplemented\n",
    "            \n",
    "        # Wrap result back in Tensor if appropriate\n",
    "        if method == '__call__':\n",
    "            if isinstance(outputs, np.ndarray):\n",
    "                return Tensor(outputs)\n",
    "            elif isinstance(outputs, tuple):\n",
    "                return tuple(Tensor(output) if isinstance(output, np.ndarray) else output \n",
    "                           for output in outputs)\n",
    "        \n",
    "        return outputs\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ce24a6f",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 2
   },
   "source": [
    "## Testing Your Tensor Implementation\n",
    "\n",
    "Let's validate each component immediately to ensure everything works correctly:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37e009e2",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Unit Test: Tensor Creation\n",
    "\n",
    "Let's test your tensor creation implementation right away! This gives you immediate feedback on whether your `__init__` method works correctly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eff5b3e5",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_tensor_creation():\n",
    "    \"\"\"Test tensor creation with all data types and shapes.\"\"\"\n",
    "    print(\"🔬 Unit Test: Tensor Creation...\")\n",
    "    \n",
    "    try:\n",
    "        # Test scalar\n",
    "        scalar = Tensor(5.0)\n",
    "        assert hasattr(scalar, '_data'), \"Tensor should have _data attribute\"\n",
    "        assert scalar._data.shape == (), f\"Scalar should have shape (), got {scalar._data.shape}\"\n",
    "        print(\"✅ Scalar creation works\")\n",
    "\n",
    "        # Test vector\n",
    "        vector = Tensor([1, 2, 3])\n",
    "        assert vector._data.shape == (3,), f\"Vector should have shape (3,), got {vector._data.shape}\"\n",
    "        print(\"✅ Vector creation works\")\n",
    "\n",
    "        # Test matrix\n",
    "        matrix = Tensor([[1, 2], [3, 4]])\n",
    "        assert matrix._data.shape == (2, 2), f\"Matrix should have shape (2, 2), got {matrix._data.shape}\"\n",
    "        print(\"✅ Matrix creation works\")\n",
    "\n",
    "        print(\"📈 Progress: Tensor Creation ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Tensor creation test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Tensor creation behavior:\")\n",
    "    print(\"   Converts data to NumPy arrays\")\n",
    "    print(\"   Preserves shape and data type\")\n",
    "    print(\"   Stores in _data attribute\")\n",
    "\n",
    "test_unit_tensor_creation()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0abae867",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Unit Test: Tensor Properties\n",
    "\n",
    "Now let's test that your tensor properties work correctly. This tests the @property methods you implemented."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05c92150",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_tensor_properties():\n",
    "    \"\"\"Test tensor properties (shape, size, dtype, data access).\"\"\"\n",
    "    print(\"🔬 Unit Test: Tensor Properties...\")\n",
    "    \n",
    "    try:\n",
    "        # Test with a simple matrix\n",
    "        tensor = Tensor([[1, 2, 3], [4, 5, 6]])\n",
    "\n",
    "        # Test shape property\n",
    "        assert tensor.shape == (2, 3), f\"Shape should be (2, 3), got {tensor.shape}\"\n",
    "        print(\"✅ Shape property works\")\n",
    "\n",
    "        # Test size property\n",
    "        assert tensor.size == 6, f\"Size should be 6, got {tensor.size}\"\n",
    "        print(\"✅ Size property works\")\n",
    "\n",
    "        # Test data property\n",
    "        assert np.array_equal(tensor.data, np.array([[1, 2, 3], [4, 5, 6]])), \"Data property should return numpy array\"\n",
    "        print(\"✅ Data property works\")\n",
    "\n",
    "        # Test dtype property\n",
    "        assert tensor.dtype in [np.int32, np.int64], f\"Dtype should be int32 or int64, got {tensor.dtype}\"\n",
    "        print(\"✅ Dtype property works\")\n",
    "\n",
    "        print(\"📈 Progress: Tensor Properties ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Tensor properties test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Tensor properties behavior:\")\n",
    "    print(\"   shape: Returns tuple of dimensions\")\n",
    "    print(\"   size: Returns total number of elements\")\n",
    "    print(\"   data: Returns underlying NumPy array\")\n",
    "    print(\"   dtype: Returns NumPy data type\")\n",
    "\n",
    "test_unit_tensor_properties()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94247bc9",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Unit Test: Tensor Arithmetic\n",
    "\n",
    "Let's test your tensor arithmetic operations. This tests the __add__, __mul__, __sub__, __truediv__ methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2704d05a",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_tensor_arithmetic():\n",
    "    \"\"\"Test tensor arithmetic operations.\"\"\"\n",
    "    print(\"🔬 Unit Test: Tensor Arithmetic...\")\n",
    "    \n",
    "    try:\n",
    "        # Test addition\n",
    "        a = Tensor([1, 2, 3])\n",
    "        b = Tensor([4, 5, 6])\n",
    "        result = a + b\n",
    "        expected = np.array([5, 7, 9])\n",
    "        assert np.array_equal(result.data, expected), f\"Addition failed: expected {expected}, got {result.data}\"\n",
    "        print(\"✅ Addition works\")\n",
    "\n",
    "        # Test scalar addition\n",
    "        result_scalar = a + 10\n",
    "        expected_scalar = np.array([11, 12, 13])\n",
    "        assert np.array_equal(result_scalar.data, expected_scalar), f\"Scalar addition failed: expected {expected_scalar}, got {result_scalar.data}\"\n",
    "        print(\"✅ Scalar addition works\")\n",
    "\n",
    "        # Test multiplication\n",
    "        result_mul = a * b\n",
    "        expected_mul = np.array([4, 10, 18])\n",
    "        assert np.array_equal(result_mul.data, expected_mul), f\"Multiplication failed: expected {expected_mul}, got {result_mul.data}\"\n",
    "        print(\"✅ Multiplication works\")\n",
    "\n",
    "        # Test scalar multiplication\n",
    "        result_scalar_mul = a * 2\n",
    "        expected_scalar_mul = np.array([2, 4, 6])\n",
    "        assert np.array_equal(result_scalar_mul.data, expected_scalar_mul), f\"Scalar multiplication failed: expected {expected_scalar_mul}, got {result_scalar_mul.data}\"\n",
    "        print(\"✅ Scalar multiplication works\")\n",
    "\n",
    "        # Test subtraction\n",
    "        result_sub = b - a\n",
    "        expected_sub = np.array([3, 3, 3])\n",
    "        assert np.array_equal(result_sub.data, expected_sub), f\"Subtraction failed: expected {expected_sub}, got {result_sub.data}\"\n",
    "        print(\"✅ Subtraction works\")\n",
    "\n",
    "        # Test division\n",
    "        result_div = b / a\n",
    "        expected_div = np.array([4.0, 2.5, 2.0])\n",
    "        assert np.allclose(result_div.data, expected_div), f\"Division failed: expected {expected_div}, got {result_div.data}\"\n",
    "        print(\"✅ Division works\")\n",
    "\n",
    "        print(\"📈 Progress: Tensor Arithmetic ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Tensor arithmetic test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Tensor arithmetic behavior:\")\n",
    "    print(\"   Element-wise operations on tensors\")\n",
    "    print(\"   Broadcasting with scalars\")\n",
    "    print(\"   Returns new Tensor objects\")\n",
    "    print(\"   Preserves numerical precision\")\n",
    "\n",
    "test_unit_tensor_arithmetic()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1da8fe1f",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Unit Test: Matrix Multiplication\n",
    "\n",
    "Test the matrix multiplication implementation that shows both educational and optimized approaches."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "66806e77",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_matrix_multiplication():\n",
    "    \"\"\"Test matrix multiplication with educational and optimized paths.\"\"\"\n",
    "    print(\"🔬 Unit Test: Matrix Multiplication...\")\n",
    "    \n",
    "    try:\n",
    "        # Small matrix (educational path)\n",
    "        small_a = Tensor([[1, 2], [3, 4]])\n",
    "        small_b = Tensor([[5, 6], [7, 8]])\n",
    "        small_result = small_a @ small_b\n",
    "        small_expected = np.array([[19, 22], [43, 50]])\n",
    "        assert np.array_equal(small_result.data, small_expected), f\"Small matmul failed: expected {small_expected}, got {small_result.data}\"\n",
    "        print(\"✅ Small matrix multiplication (educational) works\")\n",
    "\n",
    "        # Large matrix (optimized path) \n",
    "        large_a = Tensor(np.random.randn(100, 50))\n",
    "        large_b = Tensor(np.random.randn(50, 80))\n",
    "        large_result = large_a @ large_b\n",
    "        assert large_result.shape == (100, 80), f\"Large matmul shape wrong: expected (100, 80), got {large_result.shape}\"\n",
    "        \n",
    "        # Verify with NumPy\n",
    "        expected_large = np.dot(large_a.data, large_b.data)\n",
    "        assert np.allclose(large_result.data, expected_large), \"Large matmul results don't match NumPy\"\n",
    "        print(\"✅ Large matrix multiplication (optimized) works\")\n",
    "\n",
    "        print(\"📈 Progress: Matrix Multiplication ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Matrix multiplication test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Matrix multiplication behavior:\")\n",
    "    print(\"   Small matrices: Educational loops show concept\")\n",
    "    print(\"   Large matrices: Optimized NumPy implementation\")\n",
    "    print(\"   Proper shape validation and error handling\")\n",
    "    print(\"   Foundation for neural network linear layers\")\n",
    "\n",
    "test_unit_matrix_multiplication()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "76025783",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Unit Test: Advanced Tensor Operations\n",
    "\n",
    "Test the new view/copy semantics and memory layout functionality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "564575fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_advanced_tensor_operations():\n",
    "    \"\"\"Test advanced tensor operations: view, clone, contiguous, strides.\"\"\"\n",
    "    print(\"🔬 Unit Test: Advanced Tensor Operations...\")\n",
    "    \n",
    "    try:\n",
    "        # Test dtype handling improvements\n",
    "        tensor_str = Tensor([1, 2, 3], dtype=\"float32\")\n",
    "        tensor_np = Tensor([1, 2, 3], dtype=np.float64)\n",
    "        assert tensor_str.dtype == np.float32, f\"String dtype failed: {tensor_str.dtype}\"\n",
    "        assert tensor_np.dtype == np.float64, f\"NumPy dtype failed: {tensor_np.dtype}\"\n",
    "        print(\"✅ Enhanced dtype handling works\")\n",
    "\n",
    "        # Test stride and contiguity properties\n",
    "        matrix = Tensor([[1, 2, 3], [4, 5, 6]])\n",
    "        assert hasattr(matrix, 'strides'), \"Should have strides property\"\n",
    "        assert hasattr(matrix, 'is_contiguous'), \"Should have is_contiguous property\"\n",
    "        assert matrix.is_contiguous == True, \"New tensor should be contiguous\"\n",
    "        print(\"✅ Stride and contiguity properties work\")\n",
    "\n",
    "        # Test view vs clone semantics\n",
    "        original = Tensor([[1, 2], [3, 4]])\n",
    "        view_tensor = original.view(4)  # Should share data\n",
    "        clone_tensor = original.clone()  # Should copy data\n",
    "        \n",
    "        assert view_tensor.shape == (4,), f\"View shape wrong: {view_tensor.shape}\"\n",
    "        assert clone_tensor.shape == (2, 2), f\"Clone shape wrong: {clone_tensor.shape}\"\n",
    "        print(\"✅ View and clone semantics work\")\n",
    "\n",
    "        # Test contiguous operation\n",
    "        non_contiguous = Tensor(np.ones((10, 10)).T)  # Transpose creates non-contiguous\n",
    "        contiguous_result = non_contiguous.contiguous()\n",
    "        \n",
    "        if not non_contiguous.is_contiguous:  # Only test if actually non-contiguous\n",
    "            assert contiguous_result.is_contiguous == True, \"contiguous() should make data contiguous\"\n",
    "        print(\"✅ Contiguous operation works\")\n",
    "\n",
    "        # Test error handling for invalid dtype\n",
    "        try:\n",
    "            Tensor([1, 2, 3], dtype=123)  # Invalid dtype\n",
    "            print(\"❌ Should have failed with invalid dtype\")\n",
    "        except TypeError:\n",
    "            print(\"✅ Proper error handling for invalid dtype\")\n",
    "\n",
    "        print(\"📈 Progress: Advanced Tensor Operations ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Advanced tensor operations test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Advanced tensor operations behavior:\")\n",
    "    print(\"   Enhanced dtype handling (str and np.dtype)\")\n",
    "    print(\"   Memory layout analysis with strides\")\n",
    "    print(\"   View vs copy semantics for memory efficiency\")\n",
    "    print(\"   Contiguous memory optimization\")\n",
    "\n",
    "test_unit_advanced_tensor_operations()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "674989ac",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "### 🧪 Integration Test: Tensor-NumPy Integration\n",
    "\n",
    "This integration test validates that your tensor system works seamlessly with NumPy, the foundation of the scientific Python ecosystem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79dc850b",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def test_module_tensor_numpy_integration():\n",
    "    \"\"\"\n",
    "    Integration test for tensor operations with NumPy arrays.\n",
    "\n",
    "    Tests that tensors properly integrate with NumPy operations and maintain\n",
    "    compatibility with the scientific Python ecosystem.\n",
    "    \"\"\"\n",
    "    print(\"🔬 Integration Test: Tensor-NumPy Integration...\")\n",
    "\n",
    "    try:\n",
    "        # Test 1: Tensor from NumPy array\n",
    "        numpy_array = np.array([[1, 2, 3], [4, 5, 6]])\n",
    "        tensor_from_numpy = Tensor(numpy_array)\n",
    "\n",
    "        assert tensor_from_numpy.shape == (2, 3), \"Tensor should preserve NumPy array shape\"\n",
    "        assert np.array_equal(tensor_from_numpy.data, numpy_array), \"Tensor should preserve NumPy array data\"\n",
    "        print(\"✅ Tensor from NumPy array works\")\n",
    "\n",
    "        # Test 2: Tensor arithmetic with NumPy-compatible operations\n",
    "        a = Tensor([1.0, 2.0, 3.0])\n",
    "        b = Tensor([4.0, 5.0, 6.0])\n",
    "\n",
    "        # Test operations that would be used in neural networks\n",
    "        dot_product_result = np.dot(a.data, b.data)  # Common in layers\n",
    "        assert np.isclose(dot_product_result, 32.0), \"Dot product should work with tensor data\"\n",
    "        print(\"✅ NumPy operations on tensor data work\")\n",
    "\n",
    "        # Test 3: Broadcasting compatibility\n",
    "        matrix = Tensor([[1, 2], [3, 4]])\n",
    "        scalar = Tensor(10)\n",
    "\n",
    "        result = matrix + scalar\n",
    "        expected = np.array([[11, 12], [13, 14]])\n",
    "        assert np.array_equal(result.data, expected), \"Broadcasting should work like NumPy\"\n",
    "        print(\"✅ Broadcasting compatibility works\")\n",
    "\n",
    "        # Test 4: Integration with scientific computing patterns\n",
    "        data = Tensor([1, 4, 9, 16, 25])\n",
    "        sqrt_result = Tensor(np.sqrt(data.data))  # Using NumPy functions on tensor data\n",
    "        expected_sqrt = np.array([1., 2., 3., 4., 5.])\n",
    "        assert np.allclose(sqrt_result.data, expected_sqrt), \"Should integrate with NumPy functions\"\n",
    "        print(\"✅ Scientific computing integration works\")\n",
    "\n",
    "        print(\"📈 Progress: Tensor-NumPy Integration ✓\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"❌ Integration test failed: {e}\")\n",
    "        raise\n",
    "\n",
    "    print(\"🎯 Integration test validates:\")\n",
    "    print(\"   Seamless NumPy array conversion\")\n",
    "    print(\"   Compatible arithmetic operations\")\n",
    "    print(\"   Proper broadcasting behavior\")\n",
    "    print(\"   Scientific computing workflow integration\")\n",
    "\n",
    "test_module_tensor_numpy_integration()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3ba2c701",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Parameter Helper Function\n",
    "\n",
    "Now that we have Tensor with gradient support, let's add a convenient helper function for creating trainable parameters:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8039d2e4",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "\n",
    "#| export\n",
    "def Parameter(data, dtype=None):\n",
    "    \"\"\"\n",
    "    Convenience function for creating trainable tensors.\n",
    "\n",
    "    This is equivalent to Tensor(data, requires_grad=True) but provides\n",
    "    cleaner syntax for neural network parameters.\n",
    "\n",
    "    Args:\n",
    "        data: Input data (scalar, list, or numpy array)\n",
    "        dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
    "\n",
    "    Returns:\n",
    "        Tensor with requires_grad=True\n",
    "\n",
    "    Examples:\n",
    "        weight = Parameter(np.random.randn(784, 128))  # Neural network weight\n",
    "        bias = Parameter(np.zeros(128))                # Neural network bias\n",
    "    \"\"\"\n",
    "    return Tensor(data, dtype=dtype, requires_grad=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94412986",
   "metadata": {
    "cell_marker": "\"\"\"",
    "lines_to_next_cell": 1
   },
   "source": [
    "## Comprehensive Testing Function\n",
    "\n",
    "Let's create a comprehensive test that runs all our unit tests together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "71d471d8",
   "metadata": {
    "lines_to_next_cell": 1
   },
   "outputs": [],
   "source": [
    "\n",
    "def test_unit_all():\n",
    "    \"\"\"Run complete tensor module validation.\"\"\"\n",
    "    print(\"🧪 Running all unit tests...\")\n",
    "    \n",
    "    # Call every individual test function\n",
    "    test_unit_tensor_creation()\n",
    "    test_unit_tensor_properties() \n",
    "    test_unit_tensor_arithmetic()\n",
    "    test_unit_matrix_multiplication()\n",
    "    test_unit_advanced_tensor_operations()\n",
    "    test_module_tensor_numpy_integration()\n",
    "    \n",
    "    print(\"✅ All tests passed! Tensor module ready for integration.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adbef893",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "\"\"\"\n",
    "# Main Execution Block\n",
    "\"\"\"\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    # Run all tensor tests\n",
    "    test_unit_all()\n",
    "    \n",
    "    print(\"\\n🎉 Tensor module implementation complete!\")\n",
    "    print(\"📦 Ready to export to tinytorch.core.tensor\")\n",
    "    \n",
    "    # Demonstrate the new ML Framework Advisor improvements\n",
    "    print(\"\\n🚀 New Features Demonstration:\")\n",
    "    \n",
    "    # 1. Enhanced dtype handling\n",
    "    t1 = Tensor([1, 2, 3], dtype=\"float32\")\n",
    "    t2 = Tensor([1, 2, 3], dtype=np.float64)\n",
    "    t3 = Tensor([1, 2, 3], dtype=np.int32)\n",
    "    print(f\"✅ Enhanced dtype support: str={t1.dtype}, np.dtype={t2.dtype}, np.type={t3.dtype}\")\n",
    "    \n",
    "    # 2. Memory layout analysis\n",
    "    matrix = Tensor([[1, 2, 3], [4, 5, 6]])\n",
    "    print(f\"✅ Memory analysis: strides={matrix.strides}, contiguous={matrix.is_contiguous}\")\n",
    "    \n",
    "    # 3. View/copy semantics\n",
    "    view = matrix.view(6)\n",
    "    clone = matrix.clone()\n",
    "    print(f\"✅ View/copy semantics: view_shape={view.shape}, clone_shape={clone.shape}\")\n",
    "    \n",
    "    # 4. Broadcasting failure demonstration with clear error messages\n",
    "    try:\n",
    "        bad_a = Tensor([[1, 2], [3, 4]])  # (2, 2)\n",
    "        bad_b = Tensor([1, 2, 3])         # (3,)\n",
    "        result = bad_a + bad_b\n",
    "    except ValueError as e:\n",
    "        print(f\"✅ Clear broadcasting error: {str(e)[:50]}...\")\n",
    "    \n",
    "    print(\"\\n🎯 Core tensor implementation complete!\")\n",
    "    print(\"   ✓ Simple, clear tensor creation and operations\")\n",
    "    print(\"   ✓ Memory layout analysis and performance insights\")\n",
    "    print(\"   ✓ Broadcasting with comprehensive error handling\")\n",
    "    print(\"   ✓ View/copy semantics for memory efficiency\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eec96153",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## 🤔 ML Systems Thinking\n",
    "\n",
    "Now that you've built a complete tensor system, let's connect your implementation to real ML challenges:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ddedb4f4",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Question 1: Memory Efficiency at Scale\n",
    "\n",
    "**Challenge**: Your Tensor class showed that contiguous memory is 10-100x faster than scattered memory. Consider a language model with 7 billion parameters (28GB at float32). How would you modify your memory layout strategies to handle training with limited GPU memory (16GB)?\n",
    "\n",
    "Calculate the memory requirements for parameters, gradients, and optimizer states, then propose specific optimizations to your Tensor implementation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a53526a",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\"\"\"\n",
    "YOUR ANALYSIS:\n",
    "\n",
    "[Write your response here - consider memory layout, cache efficiency,\n",
    "and optimization strategies for large-scale tensor operations]\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9645ace4",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Question 2: Production Broadcasting\n",
    "\n",
    "**Challenge**: Your broadcasting implementation handles basic cases. In transformer models, you need operations like:\n",
    "- Query (32, 512, 768) × Key (32, 512, 768) → Attention (32, 512, 512)\n",
    "- Attention (32, 8, 512, 512) + Bias (1, 1, 512, 512)\n",
    "\n",
    "How would you extend your `__add__` and `__mul__` methods to handle these complex shapes while providing clear error messages when shapes are incompatible?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20aee275",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\"\"\"\n",
    "YOUR ANALYSIS:\n",
    "\n",
    "[Write your response here - consider broadcasting rules, error handling,\n",
    "and complex shape operations in transformer architectures]\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4e71b43",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "### Question 3: Gradient Compatibility\n",
    "\n",
    "**Challenge**: Your Tensor class includes `requires_grad` and basic gradient tracking. When you implement automatic differentiation (Module 09), how will your current design support gradient computation?\n",
    "\n",
    "Consider how operations like `c = a * b` need to track both forward computation and backward gradient flow. What modifications would your Tensor methods need to support this?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "32c157fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\"\"\"\n",
    "YOUR ANALYSIS:\n",
    "\n",
    "[Write your response here - consider gradient tracking, computational graphs,\n",
    "and how your tensor operations will support automatic differentiation]\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b4d9bff",
   "metadata": {
    "cell_marker": "\"\"\""
   },
   "source": [
    "## 🎯 MODULE SUMMARY: Tensor Foundation\n",
    "\n",
    "Congratulations! You've built the fundamental data structure that powers all machine learning!\n",
    "\n",
    "### Key Learning Outcomes\n",
    "- **Complete Tensor System**: Built a 400+ line implementation with 15 methods supporting all essential tensor operations\n",
    "- **Memory Efficiency Mastery**: Discovered that memory layout affects performance more than algorithms (10-100x speedups)\n",
    "- **Broadcasting Implementation**: Created automatic shape matching that saves memory and enables flexible operations\n",
    "- **Production-Ready API**: Designed interfaces that mirror PyTorch and TensorFlow patterns\n",
    "\n",
    "### Ready for Next Steps\n",
    "Your tensor implementation now enables:\n",
    "- **Module 03 (Activations)**: Add nonlinear functions that make neural networks powerful\n",
    "- **Neural network operations**: Matrix multiplication, broadcasting, and gradient preparation\n",
    "- **Real data processing**: Handle images, text, and complex multi-dimensional datasets\n",
    "\n",
    "### Export Your Work\n",
    "1. **Export to package**: `tito module complete 02_tensor`\n",
    "2. **Verify integration**: Your Tensor class will be available as `tinytorch.core.tensor.Tensor`\n",
    "3. **Enable next module**: Activations build on your tensor foundation\n",
    "\n",
    "**Achievement unlocked**: You've built the universal data structure of modern AI! Every neural network, from simple classifiers to ChatGPT, relies on the tensor concepts you've just implemented."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "main_language": "python"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}