TinyTorch/modules/02_tensor/tensor_dev.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "789dd4d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# # Tensor - Core Data Structure and Memory Management\n",
    "# \n",
    "# Welcome to the Tensor module! You'll implement the fundamental data structure that powers all neural networks and understand why memory layout determines performance.\n",
    "# \n",
    "# ## Learning Goals\n",
    "# - Systems understanding: How tensor memory layout affects cache performance and computational efficiency\n",
    "# - Core implementation skill: Build a complete Tensor class with shape management and arithmetic operations\n",
    "# - Pattern recognition: Understand how tensors abstract N-dimensional data for ML algorithms\n",
    "# - Framework connection: See how your implementation mirrors PyTorch's tensor design and memory model\n",
    "# - Performance insight: Learn why contiguous memory layout and vectorized operations are critical for ML performance\n",
    "# \n",
    "# ## Build → Use → Reflect\n",
    "# 1. **Build**: Complete Tensor class with shape management, broadcasting, and vectorized operations\n",
    "# 2. **Use**: Perform tensor arithmetic and transformations on real multi-dimensional data\n",
    "# 3. **Reflect**: Why does tensor memory layout become the performance bottleneck in large neural networks?\n",
    "# \n",
    "# ## What You'll Achieve\n",
    "# By the end of this module, you'll understand:\n",
    "# - Deep technical understanding of how N-dimensional arrays are stored and manipulated in memory\n",
    "# - Practical capability to build efficient tensor operations that form the foundation of neural networks\n",
    "# - Systems insight into why memory access patterns determine whether ML operations run fast or slow\n",
    "# - Performance consideration of when tensor operations trigger expensive memory copies vs efficient in-place updates\n",
    "# - Connection to production ML systems and how PyTorch optimizes tensor storage for GPU acceleration\n",
    "# \n",
    "# ## Systems Reality Check\n",
    "# 💡 **Production Context**: PyTorch tensors automatically choose optimal memory layouts and can seamlessly move between CPU and GPU - your implementation reveals these design decisions\n",
    "# ⚡ **Performance Note**: Non-contiguous tensors can be 10-100x slower than contiguous ones - memory layout is often more important than algorithm choice in ML systems"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e0449c6a",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "#| default_exp core.tensor\n",
    "\n",
    "#| export\n",
    "import numpy as np\n",
    "import sys\n",
    "from typing import Union, Tuple, Optional, Any"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "63c51f79",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "print(\"🔥 TinyTorch Tensor Module\")\n",
    "print(f\"NumPy version: {np.__version__}\")\n",
    "print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
    "print(\"Ready to build tensors!\")\n",
    "\n",
    "\n",
    "# ## Where This Code Lives in the Final Package\n",
    "# \n",
    "# **Learning Side:** You work in `modules/source/02_tensor/tensor_dev.py`  \n",
    "# **Building Side:** Code exports to `tinytorch.core.tensor`\n",
    "# \n",
    "# ```python\n",
    "# # Final package structure:\n",
    "# from tinytorch.core.tensor import Tensor  # The foundation of everything!\n",
    "# from tinytorch.core.activations import ReLU, Sigmoid, Tanh\n",
    "# from tinytorch.core.layers import Dense, Conv2D\n",
    "# ```\n",
    "# \n",
    "# **Why this matters:**\n",
    "# - **Learning:** Focused modules for deep understanding\n",
    "# - **Production:** Proper organization like PyTorch's `torch.Tensor`\n",
    "# - **Consistency:** All tensor operations live together in `core.tensor`\n",
    "# - **Foundation:** Every other module depends on Tensor\n",
    "\n",
    "# ## Mathematical Foundation: From Scalars to Tensors\n",
    "# \n",
    "# Understanding tensors requires building from mathematical fundamentals:\n",
    "# \n",
    "# ### Scalars (Rank 0)\n",
    "# - **Definition**: A single number with no direction\n",
    "# - **Examples**: Temperature (25°C), mass (5.2 kg), probability (0.7)\n",
    "# - **Operations**: Addition, multiplication, comparison\n",
    "# - **ML Context**: Loss values, learning rates, regularization parameters\n",
    "# \n",
    "# ### Vectors (Rank 1)\n",
    "# - **Definition**: An ordered list of numbers with direction and magnitude\n",
    "# - **Examples**: Position [x, y, z], RGB color [255, 128, 0], word embedding [0.1, -0.5, 0.8]\n",
    "# - **Operations**: Dot product, cross product, norm calculation\n",
    "# - **ML Context**: Feature vectors, gradients, model parameters\n",
    "# \n",
    "# ### Matrices (Rank 2)\n",
    "# - **Definition**: A 2D array organizing data in rows and columns\n",
    "# - **Examples**: Image (height × width), weight matrix (input × output), covariance matrix\n",
    "# - **Operations**: Matrix multiplication, transpose, inverse, eigendecomposition\n",
    "# - **ML Context**: Linear layer weights, attention matrices, batch data\n",
    "# \n",
    "# ### Higher-Order Tensors (Rank 3+)\n",
    "# - **Definition**: Multi-dimensional arrays extending matrices\n",
    "# - **Examples**: \n",
    "#   - **3D**: Video frames (time × height × width), RGB images (height × width × channels)\n",
    "#   - **4D**: Image batches (batch × height × width × channels)\n",
    "#   - **5D**: Video batches (batch × time × height × width × channels)\n",
    "# - **Operations**: Tensor products, contractions, decompositions\n",
    "# - **ML Context**: Convolutional features, RNN states, transformer attention\n",
    "\n",
    "# ## Why Tensors Matter in ML: The Computational Foundation\n",
    "# \n",
    "# ### Unified Data Representation\n",
    "# Tensors provide a consistent way to represent all ML data:\n",
    "# ```python\n",
    "# # All of these are tensors with different shapes\n",
    "# scalar_loss = Tensor(0.5)              # Shape: ()\n",
    "# feature_vector = Tensor([1, 2, 3])      # Shape: (3,)\n",
    "# weight_matrix = Tensor([[1, 2], [3, 4]]) # Shape: (2, 2)\n",
    "# image_batch = Tensor(np.random.rand(32, 224, 224, 3)) # Shape: (32, 224, 224, 3)\n",
    "# ```\n",
    "# \n",
    "# ### Efficient Batch Processing\n",
    "# ML systems process multiple samples simultaneously:\n",
    "# ```python\n",
    "# # Instead of processing one image at a time:\n",
    "# for image in images:\n",
    "#     result = model(image)  # Slow: 1000 separate operations\n",
    "# \n",
    "# # Process entire batch at once:\n",
    "# batch_result = model(image_batch)  # Fast: 1 vectorized operation\n",
    "# ```\n",
    "# \n",
    "# ### Hardware Acceleration\n",
    "# Modern hardware (GPUs, TPUs) excels at tensor operations:\n",
    "# - **Parallel processing**: Multiple operations simultaneously\n",
    "# - **Vectorization**: SIMD (Single Instruction, Multiple Data) operations\n",
    "# - **Memory optimization**: Contiguous memory layout for cache efficiency\n",
    "# \n",
    "# ### Automatic Differentiation\n",
    "# Tensors enable gradient computation through computational graphs:\n",
    "# ```python\n",
    "# # Each tensor operation creates a node in the computation graph\n",
    "# x = Tensor([1, 2, 3])\n",
    "# y = x * 2          # Node: multiplication\n",
    "# z = y + 1          # Node: addition\n",
    "# loss = z.sum()     # Node: summation\n",
    "# # Gradients flow backward through this graph\n",
    "# ```\n",
    "\n",
    "# ## Real-World Examples: Tensors in Action\n",
    "# \n",
    "# ### Computer Vision\n",
    "# - **Grayscale image**: 2D tensor `(height, width)` - `(28, 28)` for MNIST\n",
    "# - **Color image**: 3D tensor `(height, width, channels)` - `(224, 224, 3)` for RGB\n",
    "# - **Image batch**: 4D tensor `(batch, height, width, channels)` - `(32, 224, 224, 3)`\n",
    "# - **Video**: 5D tensor `(batch, time, height, width, channels)`\n",
    "# \n",
    "# ### Natural Language Processing\n",
    "# - **Word embedding**: 1D tensor `(embedding_dim,)` - `(300,)` for Word2Vec\n",
    "# - **Sentence**: 2D tensor `(sequence_length, embedding_dim)` - `(50, 768)` for BERT\n",
    "# - **Batch of sentences**: 3D tensor `(batch, sequence_length, embedding_dim)`\n",
    "# \n",
    "# ### Audio Processing\n",
    "# - **Audio signal**: 1D tensor `(time_steps,)` - `(16000,)` for 1 second at 16kHz\n",
    "# - **Spectrogram**: 2D tensor `(time_frames, frequency_bins)`\n",
    "# - **Batch of audio**: 3D tensor `(batch, time_steps, features)`\n",
    "# \n",
    "# ### Time Series\n",
    "# - **Single series**: 2D tensor `(time_steps, features)`\n",
    "# - **Multiple series**: 3D tensor `(batch, time_steps, features)`\n",
    "# - **Multivariate forecasting**: 4D tensor `(batch, time_steps, features, predictions)`\n",
    "\n",
    "# ## Why Not Just Use NumPy?\n",
    "# \n",
    "# While we use NumPy internally, our Tensor class adds ML-specific functionality:\n",
    "# \n",
    "# ### ML-Specific Operations\n",
    "# - **Gradient tracking**: For automatic differentiation (coming in Module 7)\n",
    "# - **GPU support**: For hardware acceleration (future extension)\n",
    "# - **Broadcasting semantics**: ML-friendly dimension handling\n",
    "# \n",
    "# ### Consistent API\n",
    "# - **Type safety**: Predictable behavior across operations\n",
    "# - **Error checking**: Clear error messages for debugging\n",
    "# - **Integration**: Seamless work with other TinyTorch components\n",
    "# \n",
    "# ### Educational Value\n",
    "# - **Conceptual clarity**: Understand what tensors really are\n",
    "# - **Implementation insight**: See how frameworks work internally\n",
    "# - **Debugging skills**: Trace through tensor operations step by step\n",
    "# \n",
    "# ### Extensibility\n",
    "# - **Future features**: Ready for gradients, GPU, distributed computing\n",
    "# - **Customization**: Add domain-specific operations\n",
    "# - **Optimization**: Profile and optimize specific use cases\n",
    "\n",
    "# ## Performance Considerations: Building Efficient Tensors\n",
    "# \n",
    "# ### Memory Layout\n",
    "# - **Contiguous arrays**: Better cache locality and performance\n",
    "# - **Data types**: `float32` vs `float64` trade-offs\n",
    "# - **Memory sharing**: Avoid unnecessary copies\n",
    "# \n",
    "# ### Vectorization\n",
    "# - **SIMD operations**: Single Instruction, Multiple Data\n",
    "# - **Broadcasting**: Efficient operations on different shapes\n",
    "# - **Batch operations**: Process multiple samples simultaneously\n",
    "# \n",
    "# ### Numerical Stability\n",
    "# - **Precision**: Balancing speed and accuracy\n",
    "# - **Overflow/underflow**: Handling extreme values\n",
    "# - **Gradient flow**: Maintaining numerical stability for training\n",
    "\n",
    "# # CONCEPT\n",
    "# Tensors are N-dimensional arrays that carry data through neural networks.\n",
    "# Think NumPy arrays with ML superpowers - same math, more capabilities.\n",
    "\n",
    "# # CODE STRUCTURE\n",
    "# ```python\n",
    "# class Tensor:\n",
    "#     def __init__(self, data):     # Create from any data type\n",
    "#     def __add__(self, other):     # Enable tensor + tensor\n",
    "#     def __mul__(self, other):     # Enable tensor * tensor\n",
    "#     # Properties: .shape, .size, .dtype, .data\n",
    "# ```\n",
    "\n",
    "# # CONNECTIONS\n",
    "# - torch.Tensor (PyTorch) - same concept, production optimized\n",
    "# - tf.Tensor (TensorFlow) - distributed computing focus\n",
    "# - np.ndarray (NumPy) - we wrap this with ML operations\n",
    "\n",
    "# # CONSTRAINTS\n",
    "# - Handle broadcasting (auto-shape matching for operations)\n",
    "# - Support multiple data types (float32, int32, etc.)\n",
    "# - Efficient memory usage (copy only when necessary)\n",
    "# - Natural math notation (tensor + tensor should just work)\n",
    "\n",
    "# # CONTEXT\n",
    "# Every ML operation flows through tensors:\n",
    "# - Neural networks: All computations operate on tensors\n",
    "# - Training: Gradients flow through tensor operations  \n",
    "# - Hardware: GPUs optimized for tensor math\n",
    "# - Production: Millions of tensor ops per second in real systems\n",
    "# \n",
    "# **You're building the universal language of machine learning.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21e134e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "#| export\n",
    "class Tensor:\n",
    "    \"\"\"\n",
    "    TinyTorch Tensor: N-dimensional array with ML operations.\n",
    "\n",
    "    The fundamental data structure for all TinyTorch operations.\n",
    "    Wraps NumPy arrays with ML-specific functionality.\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, data: Any, dtype: Optional[str] = None, requires_grad: bool = False):\n",
    "        \"\"\"\n",
    "        Create a new tensor from data.\n",
    "\n",
    "        Args:\n",
    "            data: Input data (scalar, list, or numpy array)\n",
    "            dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
    "            requires_grad: Whether this tensor needs gradients for training. Defaults to False.\n",
    "\n",
    "        TODO: Implement tensor creation with proper type handling.\n",
    "\n",
    "        STEP-BY-STEP:\n",
    "        1. Check if data is a scalar (int/float) - convert to numpy array\n",
    "        2. Check if data is a list - convert to numpy array  \n",
    "        3. Check if data is already a numpy array - use as-is\n",
    "        4. Apply dtype conversion if specified\n",
    "        5. Store the result in self._data\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor(5) → stores np.array(5)\n",
    "        Tensor([1, 2, 3]) → stores np.array([1, 2, 3])\n",
    "        Tensor(np.array([1, 2, 3])) → stores the array directly\n",
    "\n",
    "        HINTS:\n",
    "        - Use isinstance() to check data types\n",
    "        - Use np.array() for conversion\n",
    "        - Handle dtype parameter for type conversion\n",
    "        - Store the array in self._data\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        # Convert input to numpy array\n",
    "        if isinstance(data, (int, float, np.number)):\n",
    "            # Handle Python and NumPy scalars\n",
    "            if dtype is None:\n",
    "                # Auto-detect type: int for integers, float32 for floats\n",
    "                if isinstance(data, int) or (isinstance(data, np.number) and np.issubdtype(type(data), np.integer)):\n",
    "                    dtype = 'int32'\n",
    "                else:\n",
    "                    dtype = 'float32'\n",
    "            self._data = np.array(data, dtype=dtype)\n",
    "        elif isinstance(data, list):\n",
    "            # Let NumPy auto-detect type, then convert if needed\n",
    "            temp_array = np.array(data)\n",
    "            if dtype is None:\n",
    "                # Use NumPy's auto-detected type, but prefer float32 for floats\n",
    "                if temp_array.dtype == np.float64:\n",
    "                    dtype = 'float32'\n",
    "                else:\n",
    "                    dtype = str(temp_array.dtype)\n",
    "            self._data = np.array(data, dtype=dtype)\n",
    "        elif isinstance(data, np.ndarray):\n",
    "            # Already a numpy array\n",
    "            if dtype is None:\n",
    "                # Keep existing dtype, but prefer float32 for float64\n",
    "                if data.dtype == np.float64:\n",
    "                    dtype = 'float32'\n",
    "                else:\n",
    "                    dtype = str(data.dtype)\n",
    "            self._data = data.astype(dtype) if dtype != data.dtype else data.copy()\n",
    "        elif isinstance(data, Tensor):\n",
    "            # Input is another Tensor - extract its data\n",
    "            if dtype is None:\n",
    "                # Keep existing dtype, but prefer float32 for float64\n",
    "                if data.data.dtype == np.float64:\n",
    "                    dtype = 'float32'\n",
    "                else:\n",
    "                    dtype = str(data.data.dtype)\n",
    "            self._data = data.data.astype(dtype) if dtype != str(data.data.dtype) else data.data.copy()\n",
    "        else:\n",
    "            # Try to convert unknown types\n",
    "            self._data = np.array(data, dtype=dtype)\n",
    "\n",
    "        # Initialize gradient tracking attributes\n",
    "        self.requires_grad = requires_grad\n",
    "        self.grad = None if requires_grad else None\n",
    "        self._grad_fn = None\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def data(self) -> np.ndarray:\n",
    "        \"\"\"\n",
    "        Access underlying numpy array.\n",
    "\n",
    "        TODO: Return the stored numpy array.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Access the internal _data attribute\n",
    "        2. Return the numpy array directly\n",
    "        3. This provides access to underlying data for NumPy operations\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - PyTorch: tensor.numpy() converts to NumPy for visualization/analysis\n",
    "        - TensorFlow: tensor.numpy() enables integration with scientific Python\n",
    "        - Production: Data scientists need to access raw arrays for debugging\n",
    "        - Performance: Direct access avoids copying for read-only operations\n",
    "\n",
    "        HINT: Return self._data (the array you stored in __init__)\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data\n",
    "        ### END SOLUTION\n",
    "    \n",
    "    @data.setter\n",
    "    def data(self, value: Union[np.ndarray, 'Tensor']) -> None:\n",
    "        \"\"\"\n",
    "        Set the underlying data of the tensor.\n",
    "        \n",
    "        Args:\n",
    "            value: New data (numpy array or Tensor)\n",
    "        \"\"\"\n",
    "        if isinstance(value, Tensor):\n",
    "            self._data = value._data.copy()\n",
    "        else:\n",
    "            self._data = np.array(value)\n",
    "\n",
    "    @property\n",
    "    def shape(self) -> Tuple[int, ...]:\n",
    "        \"\"\"\n",
    "        Get tensor shape.\n",
    "\n",
    "        TODO: Return the shape of the stored numpy array.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Access the _data attribute (the NumPy array)\n",
    "        2. Get the shape property from the NumPy array\n",
    "        3. Return the shape tuple directly\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Neural networks: Layer compatibility requires matching shapes\n",
    "        - Computer vision: Image shape (height, width, channels) determines architecture\n",
    "        - NLP: Sequence length and vocabulary size affect model design\n",
    "        - Debugging: Shape mismatches are the #1 cause of ML errors\n",
    "\n",
    "        HINT: Use .shape attribute of the numpy array\n",
    "        EXAMPLE: Tensor([1, 2, 3]).shape should return (3,)\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.shape\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def size(self) -> int:\n",
    "        \"\"\"\n",
    "        Get total number of elements.\n",
    "\n",
    "        TODO: Return the total number of elements in the tensor.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Access the _data attribute (the NumPy array)\n",
    "        2. Get the size property from the NumPy array\n",
    "        3. Return the total element count as an integer\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Memory planning: Calculate RAM requirements for large tensors\n",
    "        - Model architecture: Determine parameter counts for layers\n",
    "        - Performance optimization: Size affects computation time\n",
    "        - Batch processing: Total elements determines vectorization efficiency\n",
    "\n",
    "        HINT: Use .size attribute of the numpy array\n",
    "        EXAMPLE: Tensor([1, 2, 3]).size should return 3\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.size\n",
    "        ### END SOLUTION\n",
    "\n",
    "    @property\n",
    "    def dtype(self) -> np.dtype:\n",
    "        \"\"\"\n",
    "        Get data type as numpy dtype.\n",
    "\n",
    "        TODO: Return the data type of the stored numpy array.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Access the _data attribute (the NumPy array)\n",
    "        2. Get the dtype property from the NumPy array\n",
    "        3. Return the NumPy dtype object directly\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Precision vs speed: float32 is faster, float64 more accurate\n",
    "        - Memory optimization: int8 uses 1/4 memory of int32\n",
    "        - GPU compatibility: Some operations only work with specific types\n",
    "        - Model deployment: Mobile/edge devices prefer smaller data types\n",
    "\n",
    "        HINT: Use .dtype attribute of the numpy array\n",
    "        EXAMPLE: Tensor([1, 2, 3]).dtype should return dtype('int32')\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return self._data.dtype\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __repr__(self) -> str:\n",
    "        \"\"\"\n",
    "        String representation.\n",
    "\n",
    "        TODO: Create a clear string representation of the tensor.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Convert the numpy array to a list using .tolist()\n",
    "        2. Get shape and dtype information from properties\n",
    "        3. Format as \"Tensor([data], shape=shape, dtype=dtype)\"\n",
    "        4. Return the formatted string\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Debugging: Clear tensor representation speeds debugging\n",
    "        - Jupyter notebooks: Good __repr__ improves data exploration\n",
    "        - Logging: Production systems log tensor info for monitoring\n",
    "        - Education: Students understand tensors better with clear output\n",
    "\n",
    "        APPROACH:\n",
    "        1. Convert the numpy array to a list for readable output\n",
    "        2. Include the shape and dtype information\n",
    "        3. Format: \"Tensor([data], shape=shape, dtype=dtype)\"\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2, 3]) → \"Tensor([1, 2, 3], shape=(3,), dtype=int32)\"\n",
    "\n",
    "        HINTS:\n",
    "        - Use .tolist() to convert numpy array to list\n",
    "        - Include shape and dtype information\n",
    "        - Keep format consistent and readable\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        return f\"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})\"\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def add(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Add two tensors element-wise.\n",
    "\n",
    "        TODO: Implement tensor addition.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Use NumPy's + operator for element-wise addition\n",
    "        3. Create a new Tensor object with the result\n",
    "        4. Return the new tensor\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Neural networks: Adding bias terms to linear layer outputs\n",
    "        - Residual connections: skip connections in ResNet architectures\n",
    "        - Gradient updates: Adding computed gradients to parameters\n",
    "        - Ensemble methods: Combining predictions from multiple models\n",
    "\n",
    "        APPROACH:\n",
    "        1. Add the numpy arrays using +\n",
    "        2. Return a new Tensor with the result\n",
    "        3. Handle broadcasting automatically\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
    "\n",
    "        HINTS:\n",
    "        - Use self._data + other._data\n",
    "        - Return Tensor(result)\n",
    "        - NumPy handles broadcasting automatically\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result = self._data + other._data\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def multiply(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Multiply two tensors element-wise.\n",
    "\n",
    "        TODO: Implement tensor multiplication.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Use NumPy's * operator for element-wise multiplication\n",
    "        3. Create a new Tensor object with the result\n",
    "        4. Return the new tensor\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Activation functions: Element-wise operations like ReLU masking\n",
    "        - Attention mechanisms: Element-wise scaling in transformer models\n",
    "        - Feature scaling: Multiplying features by learned scaling factors\n",
    "        - Gating: Element-wise gating in LSTM and GRU cells\n",
    "\n",
    "        APPROACH:\n",
    "        1. Multiply the numpy arrays using *\n",
    "        2. Return a new Tensor with the result\n",
    "        3. Handle broadcasting automatically\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
    "\n",
    "        HINTS:\n",
    "        - Use self._data * other._data\n",
    "        - Return Tensor(result)\n",
    "        - This is element-wise, not matrix multiplication\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result = self._data * other._data\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Addition operator: tensor + other\n",
    "\n",
    "        TODO: Implement + operator for tensors.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, call the add() method directly\n",
    "        3. If scalar, convert to Tensor then call add()\n",
    "        4. Return the result from add() method\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Natural syntax: tensor + scalar enables intuitive code\n",
    "        - Broadcasting: Adding scalars to tensors is common in ML\n",
    "        - Operator overloading: Python's magic methods enable math-like syntax\n",
    "        - API design: Clean interfaces reduce cognitive load for researchers\n",
    "\n",
    "        APPROACH:\n",
    "        1. If other is a Tensor, use tensor addition\n",
    "        2. If other is a scalar, convert to Tensor first\n",
    "        3. Return the result\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])\n",
    "        Tensor([1, 2]) + 5 → Tensor([6, 7])\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            return self.add(other)\n",
    "        else:\n",
    "            return self.add(Tensor(other))\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Multiplication operator: tensor * other\n",
    "\n",
    "        TODO: Implement * operator for tensors.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, call the multiply() method directly\n",
    "        3. If scalar, convert to Tensor then call multiply()\n",
    "        4. Return the result from multiply() method\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Scaling features: tensor * learning_rate for gradient updates\n",
    "        - Masking: tensor * mask for attention mechanisms\n",
    "        - Regularization: tensor * dropout_mask during training\n",
    "        - Normalization: tensor * scale_factor in batch normalization\n",
    "\n",
    "        APPROACH:\n",
    "        1. If other is a Tensor, use tensor multiplication\n",
    "        2. If other is a scalar, convert to Tensor first\n",
    "        3. Return the result\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])\n",
    "        Tensor([1, 2]) * 3 → Tensor([3, 6])\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            return self.multiply(other)\n",
    "        else:\n",
    "            return self.multiply(Tensor(other))\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Subtraction operator: tensor - other\n",
    "\n",
    "        TODO: Implement - operator for tensors.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, subtract other._data from self._data\n",
    "        3. If scalar, subtract scalar directly from self._data\n",
    "        4. Create new Tensor with result and return\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Gradient computation: parameter - learning_rate * gradient\n",
    "        - Residual connections: output - skip_connection in some architectures\n",
    "        - Error calculation: predicted - actual for loss computation\n",
    "        - Centering data: tensor - mean for zero-centered inputs\n",
    "\n",
    "        APPROACH:\n",
    "        1. Convert other to Tensor if needed\n",
    "        2. Subtract using numpy arrays\n",
    "        3. Return new Tensor with result\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([5, 6]) - Tensor([1, 2]) → Tensor([4, 4])\n",
    "        Tensor([5, 6]) - 1 → Tensor([4, 5])\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            result = self._data - other._data\n",
    "        else:\n",
    "            result = self._data - other\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Division operator: tensor / other\n",
    "\n",
    "        TODO: Implement / operator for tensors.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Check if other is a Tensor object\n",
    "        2. If Tensor, divide self._data by other._data\n",
    "        3. If scalar, divide self._data by scalar directly\n",
    "        4. Create new Tensor with result and return\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Normalization: tensor / std_deviation for standard scaling\n",
    "        - Learning rate decay: parameter / decay_factor over time\n",
    "        - Probability computation: counts / total_counts for frequencies\n",
    "        - Temperature scaling: logits / temperature in softmax functions\n",
    "\n",
    "        APPROACH:\n",
    "        1. Convert other to Tensor if needed\n",
    "        2. Divide using numpy arrays\n",
    "        3. Return new Tensor with result\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([6, 8]) / Tensor([2, 4]) → Tensor([3, 2])\n",
    "        Tensor([6, 8]) / 2 → Tensor([3, 4])\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        if isinstance(other, Tensor):\n",
    "            result = self._data / other._data\n",
    "        else:\n",
    "            result = self._data / other\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def mean(self) -> 'Tensor':\n",
    "        \"\"\"Computes the mean of the tensor's elements.\"\"\"\n",
    "        return Tensor(np.mean(self.data))\n",
    "\n",
    "    def matmul(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Perform matrix multiplication between two tensors.\n",
    "\n",
    "        TODO: Implement matrix multiplication.\n",
    "\n",
    "        STEP-BY-STEP IMPLEMENTATION:\n",
    "        1. Extract numpy arrays from both tensors\n",
    "        2. Use np.matmul() for proper matrix multiplication\n",
    "        3. Create new Tensor object with the result\n",
    "        4. Return the new tensor\n",
    "\n",
    "        LEARNING CONNECTIONS:\n",
    "        Real-world relevance:\n",
    "        - Linear layers: input @ weight matrices in neural networks\n",
    "        - Transformer attention: Q @ K^T for attention scores\n",
    "        - CNN convolutions: Implemented as matrix multiplications\n",
    "        - Batch processing: Matrix ops enable parallel computation\n",
    "\n",
    "        APPROACH:\n",
    "        1. Use np.matmul() to perform matrix multiplication\n",
    "        2. Return a new Tensor with the result\n",
    "        3. Handle broadcasting automatically\n",
    "\n",
    "        EXAMPLE:\n",
    "        Tensor([[1, 2], [3, 4]]) @ Tensor([[5, 6], [7, 8]]) → Tensor([[19, 22], [43, 50]])\n",
    "\n",
    "        HINTS:\n",
    "        - Use np.matmul(self._data, other._data)\n",
    "        - Return Tensor(result)\n",
    "        - This is matrix multiplication, not element-wise multiplication\n",
    "        \"\"\"\n",
    "        ### BEGIN SOLUTION\n",
    "        result = np.matmul(self._data, other._data)\n",
    "        return Tensor(result)\n",
    "        ### END SOLUTION\n",
    "\n",
    "    def __matmul__(self, other: 'Tensor') -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Matrix multiplication operator: tensor @ other\n",
    "\n",
    "        Enables the @ operator for matrix multiplication, providing\n",
    "        clean syntax for neural network operations.\n",
    "        \"\"\"\n",
    "        return self.matmul(other)\n",
    "\n",
    "    def backward(self, gradient=None):\n",
    "        \"\"\"\n",
    "        Compute gradients for this tensor and propagate backward.\n",
    "\n",
    "        This is a stub for now - full implementation in Module 09 (Autograd).\n",
    "        For now, just accumulates gradients if requires_grad=True.\n",
    "\n",
    "        Args:\n",
    "            gradient: Gradient from upstream. If None, assumes scalar with grad=1\n",
    "        \"\"\"\n",
    "        if not self.requires_grad:\n",
    "            return\n",
    "\n",
    "        if gradient is None:\n",
    "            # Scalar case - gradient is 1\n",
    "            gradient = Tensor(np.ones_like(self._data))\n",
    "\n",
    "        # Accumulate gradients\n",
    "        if self.grad is None:\n",
    "            self.grad = gradient\n",
    "        else:\n",
    "            self.grad = self.grad + gradient\n",
    "    \n",
    "    def zero_grad(self):\n",
    "        \"\"\"\n",
    "        Reset gradients to None. Used by optimizers before backward pass.\n",
    "        \n",
    "        This method is called by optimizers to clear gradients before\n",
    "        computing new ones, preventing gradient accumulation across batches.\n",
    "        \"\"\"\n",
    "        self.grad = None\n",
    "\n",
    "    def reshape(self, *shape: int) -> 'Tensor':\n",
    "        \"\"\"\n",
    "        Return a new tensor with the same data but different shape.\n",
    "\n",
    "        Args:\n",
    "            *shape: New shape dimensions. Use -1 for automatic sizing.\n",
    "\n",
    "        Returns:\n",
    "            New Tensor with reshaped data\n",
    "\n",
    "        Example:\n",
    "            tensor.reshape(2, -1)  # Reshape to 2 rows, auto columns\n",
    "            tensor.reshape(4, 3)   # Reshape to 4x3 matrix\n",
    "        \"\"\"\n",
    "        reshaped_data = self._data.reshape(*shape)\n",
    "        return Tensor(reshaped_data)\n",
    "\n",
    "\n",
    "# # Testing Your Implementation\n",
    "# \n",
    "# Now let's test our tensor implementation with comprehensive tests that validate all functionality.\n",
    "\n",
    "# ### 🧪 Unit Test: Tensor Creation\n",
    "# \n",
    "# Let's test your tensor creation implementation right away! This gives you immediate feedback on whether your `__init__` method works correctly.\n",
    "# \n",
    "# **This is a unit test** - it tests one specific function (tensor creation) in isolation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "217cb51e",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Test tensor creation immediately after implementation\n",
    "print(\"🔬 Unit Test: Tensor Creation...\")\n",
    "\n",
    "# Test basic tensor creation\n",
    "try:\n",
    "    # Test scalar\n",
    "    scalar = Tensor(5.0)\n",
    "    assert hasattr(scalar, '_data'), \"Tensor should have _data attribute\"\n",
    "    assert scalar._data.shape == (), f\"Scalar should have shape (), got {scalar._data.shape}\"\n",
    "    print(\"✅ Scalar creation works\")\n",
    "\n",
    "    # Test vector\n",
    "    vector = Tensor([1, 2, 3])\n",
    "    assert vector._data.shape == (3,), f\"Vector should have shape (3,), got {vector._data.shape}\"\n",
    "    print(\"✅ Vector creation works\")\n",
    "\n",
    "    # Test matrix\n",
    "    matrix = Tensor([[1, 2], [3, 4]])\n",
    "    assert matrix._data.shape == (2, 2), f\"Matrix should have shape (2, 2), got {matrix._data.shape}\"\n",
    "    print(\"✅ Matrix creation works\")\n",
    "\n",
    "    print(\"📈 Progress: Tensor Creation ✓\")\n",
    "\n",
    "except Exception as e:\n",
    "    print(f\"❌ Tensor creation test failed: {e}\")\n",
    "    raise\n",
    "\n",
    "print(\"🎯 Tensor creation behavior:\")\n",
    "print(\"   Converts data to NumPy arrays\")\n",
    "print(\"   Preserves shape and data type\")\n",
    "print(\"   Stores in _data attribute\")\n",
    "\n",
    "\n",
    "# ### 🧪 Unit Test: Tensor Properties\n",
    "# \n",
    "# Now let's test that your tensor properties work correctly. This tests the @property methods you implemented.\n",
    "# \n",
    "# **This is a unit test** - it tests specific properties (shape, size, dtype, data) in isolation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7bd87245",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Test tensor properties immediately after implementation\n",
    "print(\"🔬 Unit Test: Tensor Properties...\")\n",
    "\n",
    "# Test properties with simple examples\n",
    "try:\n",
    "    # Test with a simple matrix\n",
    "    tensor = Tensor([[1, 2, 3], [4, 5, 6]])\n",
    "\n",
    "    # Test shape property\n",
    "    assert tensor.shape == (2, 3), f\"Shape should be (2, 3), got {tensor.shape}\"\n",
    "    print(\"✅ Shape property works\")\n",
    "\n",
    "    # Test size property\n",
    "    assert tensor.size == 6, f\"Size should be 6, got {tensor.size}\"\n",
    "    print(\"✅ Size property works\")\n",
    "\n",
    "    # Test data property\n",
    "    assert np.array_equal(tensor.data, np.array([[1, 2, 3], [4, 5, 6]])), \"Data property should return numpy array\"\n",
    "    print(\"✅ Data property works\")\n",
    "\n",
    "    # Test dtype property\n",
    "    assert tensor.dtype in [np.int32, np.int64], f\"Dtype should be int32 or int64, got {tensor.dtype}\"\n",
    "    print(\"✅ Dtype property works\")\n",
    "\n",
    "    print(\"📈 Progress: Tensor Properties ✓\")\n",
    "\n",
    "except Exception as e:\n",
    "    print(f\"❌ Tensor properties test failed: {e}\")\n",
    "    raise\n",
    "\n",
    "print(\"🎯 Tensor properties behavior:\")\n",
    "print(\"   shape: Returns tuple of dimensions\")\n",
    "print(\"   size: Returns total number of elements\")\n",
    "print(\"   data: Returns underlying NumPy array\")\n",
    "print(\"   dtype: Returns NumPy data type\")\n",
    "\n",
    "\n",
    "# ### 🧪 Unit Test: Tensor Arithmetic\n",
    "# \n",
    "# Let's test your tensor arithmetic operations. This tests the __add__, __mul__, __sub__, __truediv__ methods.\n",
    "# \n",
    "# **This is a unit test** - it tests specific arithmetic operations in isolation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dfd5f714",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Test tensor arithmetic immediately after implementation\n",
    "print(\"🔬 Unit Test: Tensor Arithmetic...\")\n",
    "\n",
    "# Test basic arithmetic with simple examples\n",
    "try:\n",
    "    # Test addition\n",
    "    a = Tensor([1, 2, 3])\n",
    "    b = Tensor([4, 5, 6])\n",
    "    result = a + b\n",
    "    expected = np.array([5, 7, 9])\n",
    "    assert np.array_equal(result.data, expected), f\"Addition failed: expected {expected}, got {result.data}\"\n",
    "    print(\"✅ Addition works\")\n",
    "\n",
    "    # Test scalar addition\n",
    "    result_scalar = a + 10\n",
    "    expected_scalar = np.array([11, 12, 13])\n",
    "    assert np.array_equal(result_scalar.data, expected_scalar), f\"Scalar addition failed: expected {expected_scalar}, got {result_scalar.data}\"\n",
    "    print(\"✅ Scalar addition works\")\n",
    "\n",
    "    # Test multiplication\n",
    "    result_mul = a * b\n",
    "    expected_mul = np.array([4, 10, 18])\n",
    "    assert np.array_equal(result_mul.data, expected_mul), f\"Multiplication failed: expected {expected_mul}, got {result_mul.data}\"\n",
    "    print(\"✅ Multiplication works\")\n",
    "\n",
    "    # Test scalar multiplication\n",
    "    result_scalar_mul = a * 2\n",
    "    expected_scalar_mul = np.array([2, 4, 6])\n",
    "    assert np.array_equal(result_scalar_mul.data, expected_scalar_mul), f\"Scalar multiplication failed: expected {expected_scalar_mul}, got {result_scalar_mul.data}\"\n",
    "    print(\"✅ Scalar multiplication works\")\n",
    "\n",
    "    print(\"📈 Progress: Tensor Arithmetic ✓\")\n",
    "\n",
    "except Exception as e:\n",
    "    print(f\"❌ Tensor arithmetic test failed: {e}\")\n",
    "    raise\n",
    "\n",
    "print(\"🎯 Tensor arithmetic behavior:\")\n",
    "print(\"   Element-wise operations on tensors\")\n",
    "print(\"   Broadcasting with scalars\")\n",
    "print(\"   Returns new Tensor objects\")\n",
    "\n",
    "\n",
    "# ### 🔬 Comprehensive Tests\n",
    "# \n",
    "# Now let's run comprehensive tests that validate all tensor functionality together. These tests ensure your implementation is production-ready.\n",
    "# \n",
    "# **These are comprehensive tests** - they test multiple features and edge cases to ensure robustness."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b062d2c7",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "def test_unit_tensor_creation():\n",
    "    \"\"\"Comprehensive test of tensor creation with all data types and shapes.\"\"\"\n",
    "    print(\"🔬 Testing comprehensive tensor creation...\")\n",
    "\n",
    "    # Test scalar creation\n",
    "    scalar_int = Tensor(42)\n",
    "    assert scalar_int.shape == ()\n",
    "\n",
    "    # Test vector creation\n",
    "    vector_int = Tensor([1, 2, 3])\n",
    "    assert vector_int.shape == (3,)\n",
    "\n",
    "    # Test matrix creation\n",
    "    matrix_2x2 = Tensor([[1, 2], [3, 4]])\n",
    "    assert matrix_2x2.shape == (2, 2)\n",
    "    print(\"✅ Tensor creation tests passed!\")\n",
    "\n",
    "# Test function defined (called in main block)\n",
    "\n",
    "\n",
    "# ### Unit Test: Tensor Properties\n",
    "# \n",
    "# This test validates your tensor property methods (shape, size, dtype, data), ensuring they correctly reflect the tensor's dimensional structure and data characteristics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48d82065",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "def test_unit_tensor_properties():\n",
    "    \"\"\"Comprehensive test of tensor properties (shape, size, dtype, data access).\"\"\"\n",
    "    print(\"🔬 Testing comprehensive tensor properties...\")\n",
    "\n",
    "    tensor = Tensor([[1, 2, 3], [4, 5, 6]])\n",
    "\n",
    "    # Test shape property\n",
    "    assert tensor.shape == (2, 3)\n",
    "\n",
    "    # Test size property\n",
    "    assert tensor.size == 6\n",
    "\n",
    "    # Test data property\n",
    "    assert np.array_equal(tensor.data, np.array([[1, 2, 3], [4, 5, 6]]))\n",
    "\n",
    "    # Test dtype property\n",
    "    assert tensor.dtype in [np.int32, np.int64]\n",
    "    print(\"✅ Tensor properties tests passed!\")\n",
    "\n",
    "# Test function defined (called in main block)\n",
    "\n",
    "\n",
    "# ### 🧪 Unit Test: Tensor Arithmetic Operations\n",
    "# \n",
    "# Now let's test all your arithmetic operations working together! This comprehensive test validates that addition, subtraction, multiplication, and division all work correctly with your tensor implementation.\n",
    "# \n",
    "# **What This Tests:**\n",
    "# - Element-wise addition, subtraction, multiplication, division\n",
    "# - Proper NumPy array handling in arithmetic\n",
    "# - Result correctness across different operations\n",
    "# \n",
    "# **Why This Matters:**\n",
    "# - Arithmetic operations are the foundation of all neural network computations\n",
    "# - These operations must be fast and mathematically correct\n",
    "# - Your implementation should match NumPy's behavior exactly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9646cbfa",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "def test_unit_tensor_arithmetic():\n",
    "    \"\"\"Comprehensive test of tensor arithmetic operations.\"\"\"\n",
    "    print(\"🔬 Testing comprehensive tensor arithmetic...\")\n",
    "\n",
    "    a = Tensor([1, 2, 3])\n",
    "    b = Tensor([4, 5, 6])\n",
    "\n",
    "    # Test addition\n",
    "    c = a + b\n",
    "    expected = np.array([5, 7, 9])\n",
    "    assert np.array_equal(c.data, expected)\n",
    "\n",
    "    # Test multiplication\n",
    "    d = a * b\n",
    "    expected = np.array([4, 10, 18])\n",
    "    assert np.array_equal(d.data, expected)\n",
    "\n",
    "    # Test subtraction\n",
    "    e = b - a\n",
    "    expected = np.array([3, 3, 3])\n",
    "    assert np.array_equal(e.data, expected)\n",
    "\n",
    "    # Test division\n",
    "    f = b / a\n",
    "    expected = np.array([4.0, 2.5, 2.0])\n",
    "    assert np.allclose(f.data, expected)\n",
    "    print(\"✅ Tensor arithmetic tests passed!\")\n",
    "\n",
    "# Test function defined (called in main block)\n",
    "\n",
    "\n",
    "# ### 🧪 Integration Test: Tensor-NumPy Integration\n",
    "# \n",
    "# This integration test validates that your tensor system works seamlessly with NumPy, the foundation of the scientific Python ecosystem.\n",
    "# \n",
    "# **What This Tests:**\n",
    "# - Creating tensors from NumPy arrays\n",
    "# - Converting tensors back to NumPy arrays  \n",
    "# - Mixed operations between tensors and NumPy\n",
    "# - Data type preservation and consistency\n",
    "# \n",
    "# **Why This Matters:**\n",
    "# - Real ML systems must integrate with NumPy seamlessly\n",
    "# - Data scientists expect tensors to work with existing NumPy code\n",
    "# - Performance optimizations often involve NumPy operations\n",
    "# - This compatibility is what makes PyTorch and TensorFlow so powerful\n",
    "# \n",
    "# **Real-World Connection:**\n",
    "# - PyTorch tensors have `.numpy()` and `torch.from_numpy()` methods\n",
    "# - TensorFlow has similar NumPy integration\n",
    "# - This test ensures your tensors work in real data science workflows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2f396666",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "def test_module_tensor_numpy_integration():\n",
    "    \"\"\"\n",
    "    Integration test for tensor operations with NumPy arrays.\n",
    "\n",
    "    Tests that tensors properly integrate with NumPy operations and maintain\n",
    "    compatibility with the scientific Python ecosystem.\n",
    "    \"\"\"\n",
    "    print(\"🔬 Running Integration Test: Tensor-NumPy Integration...\")\n",
    "\n",
    "    # Test 1: Tensor from NumPy array\n",
    "    numpy_array = np.array([[1, 2, 3], [4, 5, 6]])\n",
    "    tensor_from_numpy = Tensor(numpy_array)\n",
    "\n",
    "    assert tensor_from_numpy.shape == (2, 3), \"Tensor should preserve NumPy array shape\"\n",
    "    assert np.array_equal(tensor_from_numpy.data, numpy_array), \"Tensor should preserve NumPy array data\"\n",
    "\n",
    "    # Test 2: Tensor arithmetic with NumPy-compatible operations\n",
    "    a = Tensor([1.0, 2.0, 3.0])\n",
    "    b = Tensor([4.0, 5.0, 6.0])\n",
    "\n",
    "    # Test operations that would be used in neural networks\n",
    "    dot_product_result = np.dot(a.data, b.data)  # Common in layers\n",
    "    assert np.isclose(dot_product_result, 32.0), \"Dot product should work with tensor data\"\n",
    "\n",
    "    # Test 3: Broadcasting compatibility\n",
    "    matrix = Tensor([[1, 2], [3, 4]])\n",
    "    scalar = Tensor(10)\n",
    "\n",
    "    result = matrix + scalar\n",
    "    expected = np.array([[11, 12], [13, 14]])\n",
    "    assert np.array_equal(result.data, expected), \"Broadcasting should work like NumPy\"\n",
    "\n",
    "    # Test 4: Integration with scientific computing patterns\n",
    "    data = Tensor([1, 4, 9, 16, 25])\n",
    "    sqrt_result = Tensor(np.sqrt(data.data))  # Using NumPy functions on tensor data\n",
    "    expected_sqrt = np.array([1., 2., 3., 4., 5.])\n",
    "    assert np.allclose(sqrt_result.data, expected_sqrt), \"Should integrate with NumPy functions\"\n",
    "\n",
    "    print(\"✅ Integration Test Passed: Tensor-NumPy integration works correctly.\")\n",
    "\n",
    "# Test function defined (called in main block)\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    # Run all tensor tests\n",
    "    test_unit_tensor_creation()\n",
    "    test_unit_tensor_properties()\n",
    "    test_unit_tensor_arithmetic()\n",
    "    test_module_tensor_numpy_integration()\n",
    "\n",
    "    print(\"All tests passed!\")\n",
    "    print(\"Tensor module complete!\")\n",
    "\n",
    "\n",
    "# ## 🤔 ML Systems Thinking: Interactive Questions\n",
    "# \n",
    "# Now that you've built a working tensor system, let's connect this foundational work to broader ML systems challenges. These questions help you think critically about how tensor operations scale to production ML environments.\n",
    "# \n",
    "# Take time to reflect thoughtfully on each question - your insights will help you understand how the tensor concepts you've implemented connect to real-world ML systems engineering.\n",
    "\n",
    "# ### Question 1: Memory Layout and Cache Efficiency\n",
    "# \n",
    "# **Context**: Your tensor implementation wraps NumPy arrays and creates new tensors for each operation. In production ML systems, tensor operations happen millions of times per second, making memory layout and cache efficiency critical for performance.\n",
    "# \n",
    "# **Reflection Question**: Design a memory-efficient tensor system for training large neural networks (billions of parameters). How would you balance memory layout optimization with cache efficiency? Consider scenarios where you need to process massive image batches (1000+ images) while maintaining memory locality for CPU cache optimization. What trade-offs would you make between memory copying and in-place operations?\n",
    "# \n",
    "# Think about: contiguous memory layout, cache line utilization, memory fragmentation, and the difference between row-major vs column-major storage in different computational contexts.\n",
    "# \n",
    "# *Target length: 150-300 words*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1911ed11",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\"\"\"\n",
    "YOUR REFLECTION ON MEMORY LAYOUT AND CACHE EFFICIENCY:\n",
    "\n",
    "TODO: Replace this text with your thoughtful response about memory-efficient tensor system design.\n",
    "\n",
    "Consider addressing:\n",
    "- How would you optimize memory layout for large batch processing?\n",
    "- What strategies would you use to minimize cache misses during tensor operations?\n",
    "- How would you handle the trade-off between memory copying and in-place operations?\n",
    "- What role does contiguous memory layout play in computational efficiency?\n",
    "- How would different storage patterns (row-major vs column-major) affect performance?\n",
    "\n",
    "Write a practical design connecting your tensor implementation to real memory optimization challenges.\n",
    "\n",
    "GRADING RUBRIC (Instructor Use):\n",
    "- Demonstrates understanding of memory layout impact on performance (3 points)\n",
    "- Addresses cache efficiency and locality concerns appropriately (3 points)\n",
    "- Shows practical knowledge of memory optimization strategies (2 points)\n",
    "- Demonstrates systems thinking about large-scale tensor operations (2 points)\n",
    "- Clear technical reasoning and practical considerations (bonus points for innovative approaches)\n",
    "\"\"\"\n",
    "\n",
    "### BEGIN SOLUTION\n",
    "# Student response area - instructor will replace this section during grading setup\n",
    "# This is a manually graded question requiring technical analysis of memory optimization\n",
    "# Students should demonstrate understanding of cache efficiency and memory layout optimization\n",
    "### END SOLUTION\n",
    "\n",
    "\n",
    "# ### Question 2: Hardware Abstraction and Multi-Platform Deployment\n",
    "# \n",
    "# **Context**: Your tensor class currently operates on CPU through NumPy. Production ML systems must run efficiently across diverse hardware: development laptops (CPU), training clusters (GPU), mobile devices (ARM processors), and edge devices (specialized AI chips).\n",
    "# \n",
    "# **Reflection Question**: Architect a hardware-abstraction layer for your tensor system that enables the same tensor operations to run optimally across CPU, GPU, and specialized AI accelerators. How would you handle the complexity of different memory models, precision requirements, and computational paradigms while maintaining a simple user interface? Consider the challenges of automatic device placement and memory management across heterogeneous hardware.\n",
    "# \n",
    "# Think about: device-specific optimizations, memory transfer costs, precision trade-offs, and automatic kernel selection for different hardware architectures.\n",
    "# \n",
    "# *Target length: 150-300 words*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a58b9e34",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\"\"\"\n",
    "YOUR REFLECTION ON HARDWARE ABSTRACTION AND MULTI-PLATFORM DEPLOYMENT:\n",
    "\n",
    "TODO: Replace this text with your thoughtful response about hardware abstraction design.\n",
    "\n",
    "Consider addressing:\n",
    "- How would you design an abstraction layer that works across CPU, GPU, and AI accelerators?\n",
    "- What strategies would you use for automatic device placement and memory management?\n",
    "- How would you handle different precision requirements across hardware platforms?\n",
    "- What role would kernel selection and optimization play in your design?\n",
    "- How would you minimize memory transfer costs between different compute devices?\n",
    "\n",
    "Write an architectural analysis connecting your tensor foundation to real hardware deployment challenges.\n",
    "\n",
    "GRADING RUBRIC (Instructor Use):\n",
    "- Shows understanding of multi-platform hardware challenges (3 points)\n",
    "- Designs practical abstraction layer for device management (3 points)\n",
    "- Addresses precision and optimization considerations (2 points)\n",
    "- Demonstrates systems thinking about hardware-software interfaces (2 points)\n",
    "- Clear architectural reasoning with practical insights (bonus points for comprehensive understanding)\n",
    "\"\"\"\n",
    "\n",
    "### BEGIN SOLUTION\n",
    "# Student response area - instructor will replace this section during grading setup\n",
    "# This is a manually graded question requiring understanding of hardware abstraction challenges\n",
    "# Students should demonstrate knowledge of multi-platform deployment and device optimization\n",
    "### END SOLUTION\n",
    "\n",
    "\n",
    "# ### Question 3: Computational Graph Integration and Automatic Differentiation\n",
    "# \n",
    "# **Context**: Your tensor performs operations immediately (eager execution). Modern deep learning frameworks build computational graphs to track operations for automatic differentiation, enabling gradient-based optimization that powers neural network training.\n",
    "# \n",
    "# **Reflection Question**: Extend your tensor design to support computational graph construction for automatic differentiation. How would you modify your tensor operations to build a graph of dependencies while maintaining performance for both training (graph construction) and inference (optimized execution)? Consider the challenge of supporting both eager execution for debugging and graph mode for production deployment.\n",
    "# \n",
    "# Think about: operation tracking, gradient flow, memory management for large graphs, and the trade-offs between flexibility and performance in different execution modes.\n",
    "# \n",
    "# *Target length: 150-300 words*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20290df0",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\"\"\"\n",
    "YOUR REFLECTION ON COMPUTATIONAL GRAPH INTEGRATION:\n",
    "\n",
    "TODO: Replace this text with your thoughtful response about computational graph design.\n",
    "\n",
    "Consider addressing:\n",
    "- How would you modify your tensor class to support computational graph construction?\n",
    "- What strategies would you use to balance eager execution with graph-based optimization?\n",
    "- How would you handle gradient flow and automatic differentiation in your design?\n",
    "- What memory management challenges arise with large computational graphs?\n",
    "- How would you support both debugging-friendly and production-optimized execution modes?\n",
    "\n",
    "Write a design analysis connecting your tensor operations to automatic differentiation and training systems.\n",
    "\n",
    "GRADING RUBRIC (Instructor Use):\n",
    "- Understands computational graph concepts and gradient tracking (3 points)\n",
    "- Designs practical approach to eager vs graph execution modes (3 points)\n",
    "- Addresses memory management and performance considerations (2 points)\n",
    "- Shows systems thinking about training vs inference requirements (2 points)\n",
    "- Clear design reasoning with automatic differentiation insights (bonus points for deep understanding)\n",
    "\"\"\"\n",
    "\n",
    "### BEGIN SOLUTION\n",
    "# Student response area - instructor will replace this section during grading setup\n",
    "# This is a manually graded question requiring understanding of computational graphs and automatic differentiation\n",
    "# Students should demonstrate knowledge of how tensor operations enable gradient computation\n",
    "### END SOLUTION\n",
    "\n",
    "\n",
    "# ## Parameter Helper Function\n",
    "# \n",
    "# Now that we have Tensor with gradient support, let's add a convenient helper function for creating trainable parameters:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d05174e",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "#| export\n",
    "def Parameter(data, dtype=None):\n",
    "    \"\"\"\n",
    "    Convenience function for creating trainable tensors.\n",
    "\n",
    "    This is equivalent to Tensor(data, requires_grad=True) but provides\n",
    "    cleaner syntax for neural network parameters.\n",
    "\n",
    "    Args:\n",
    "        data: Input data (scalar, list, or numpy array)\n",
    "        dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.\n",
    "\n",
    "    Returns:\n",
    "        Tensor with requires_grad=True\n",
    "\n",
    "    Examples:\n",
    "        weight = Parameter(np.random.randn(784, 128))  # Neural network weight\n",
    "        bias = Parameter(np.zeros(128))                # Neural network bias\n",
    "    \"\"\"\n",
    "    return Tensor(data, dtype=dtype, requires_grad=True)\n",
    "\n",
    "\n",
    "# # MODULE SUMMARY: Tensor Foundation\n",
    "# \n",
    "# Congratulations! You've successfully implemented the fundamental data structure that powers all machine learning:\n",
    "# \n",
    "# ## What You've Built\n",
    "# - **Tensor Class**: N-dimensional array wrapper with professional interfaces\n",
    "# - **Core Operations**: Creation, property access, and arithmetic operations\n",
    "# - **Shape Management**: Automatic shape tracking and validation\n",
    "# - **Data Types**: Proper NumPy integration and type handling\n",
    "# - **Foundation**: The building block for all subsequent TinyTorch modules\n",
    "# \n",
    "# ## Key Learning Outcomes\n",
    "# - **Understanding**: How tensors work as the foundation of machine learning\n",
    "# - **Implementation**: Built tensor operations from scratch\n",
    "# - **Professional patterns**: Clean APIs, proper error handling, comprehensive testing\n",
    "# - **Real-world connection**: Understanding PyTorch/TensorFlow tensor foundations\n",
    "# - **Systems thinking**: Building reliable, reusable components\n",
    "# \n",
    "# ## Mathematical Foundations Mastered\n",
    "# - **N-dimensional arrays**: Shape, size, and dimensionality concepts\n",
    "# - **Element-wise operations**: Addition, subtraction, multiplication, division\n",
    "# - **Broadcasting**: Understanding how operations work with different shapes\n",
    "# - **Memory management**: Efficient data storage and access patterns\n",
    "# \n",
    "# ## Professional Skills Developed\n",
    "# - **API design**: Clean, intuitive interfaces for tensor operations\n",
    "# - **Error handling**: Graceful handling of invalid operations and edge cases\n",
    "# - **Testing methodology**: Comprehensive validation of tensor functionality\n",
    "# - **Documentation**: Clear, educational documentation with examples\n",
    "# \n",
    "# ## Ready for Advanced Applications\n",
    "# Your tensor implementation now enables:\n",
    "# - **Neural Networks**: Foundation for all layer implementations\n",
    "# - **Automatic Differentiation**: Gradient computation through computational graphs\n",
    "# - **Complex Models**: CNNs, RNNs, Transformers - all built on tensors\n",
    "# - **Real Applications**: Training models on real datasets\n",
    "# \n",
    "# ## Connection to Real ML Systems\n",
    "# Your implementation mirrors production systems:\n",
    "# - **PyTorch**: `torch.Tensor` provides identical functionality\n",
    "# - **TensorFlow**: `tf.Tensor` implements similar concepts\n",
    "# - **NumPy**: `numpy.ndarray` serves as the foundation\n",
    "# - **Industry Standard**: Every major ML framework uses these exact principles\n",
    "# \n",
    "# ## The Power of Tensors\n",
    "# You've built the fundamental data structure of modern AI:\n",
    "# - **Universality**: Tensors represent all data: images, text, audio, video\n",
    "# - **Efficiency**: Vectorized operations enable fast computation\n",
    "# - **Scalability**: Handles everything from single numbers to massive matrices\n",
    "# - **Flexibility**: Foundation for any mathematical operation\n",
    "# \n",
    "# ## What's Next\n",
    "# Your tensor implementation is the foundation for:\n",
    "# - **Activations**: Nonlinear functions that enable complex learning\n",
    "# - **Layers**: Linear transformations and neural network building blocks\n",
    "# - **Networks**: Composing layers into powerful architectures\n",
    "# - **Training**: Optimizing networks to solve real problems\n",
    "# \n",
    "# **Next Module**: Activation functions - adding the nonlinearity that makes neural networks powerful!\n",
    "# \n",
    "# You've built the foundation of modern AI. Now let's add the mathematical functions that enable machines to learn complex patterns!"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "encoding": "# coding: utf-8",
   "executable": "/usr/bin/env python",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}