TinyTorch/tinytorch/core/tensor.py
Vijay Janapa Reddi cdbdba0b35 Comprehensive TinyTorch framework evaluation and analysis
Assessment Results:
- 75% real implementation vs 25% educational scaffolding
- Working end-to-end training on CIFAR-10 dataset
- Comprehensive architecture coverage (MLPs, CNNs, Attention)
- Production-oriented features (MLOps, profiling, compression)
- Professional development workflow with CLI tools

Key Findings:
- Students build functional ML framework from scratch
- Real datasets and meaningful evaluation capabilities
- Progressive complexity through 16-module structure
- Systems engineering principles throughout
- Ready for serious ML systems education

Gaps Identified:
- GPU acceleration and distributed training
- Advanced optimizers and model serialization
- Some memory optimization opportunities

Recommendation: Excellent foundation for ML systems engineering education
2025-09-16 22:41:07 -04:00


# AUTOGENERATED! DO NOT EDIT! File to edit: ../../modules/source/02_tensor/tensor_dev.ipynb.
# %% auto 0
__all__ = ['Tensor']
# %% ../../modules/source/02_tensor/tensor_dev.ipynb 1
import numpy as np
import sys
from typing import Union, Tuple, Optional, Any
# %% ../../modules/source/02_tensor/tensor_dev.ipynb 14
class Tensor:
"""
TinyTorch Tensor: N-dimensional array with ML operations.
The fundamental data structure for all TinyTorch operations.
Wraps NumPy arrays with ML-specific functionality.
"""
def __init__(self, data: Any, dtype: Optional[str] = None):
"""
Create a new tensor from data.
Args:
data: Input data (scalar, list, or numpy array)
dtype: Data type ('float32', 'int32', etc.). Defaults to auto-detect.
TODO: Implement tensor creation with proper type handling.
STEP-BY-STEP:
1. Check if data is a scalar (int/float) - convert to numpy array
2. Check if data is a list - convert to numpy array
3. Check if data is already a numpy array - copy it (float64 becomes float32 by default)
4. Apply dtype conversion if specified
5. Store the result in self._data
EXAMPLE:
Tensor(5) → stores np.array(5)
Tensor([1, 2, 3]) → stores np.array([1, 2, 3])
Tensor(np.array([1, 2, 3])) → stores a copy of the array, not the caller's object
HINTS:
- Use isinstance() to check data types
- Use np.array() for conversion
- Handle dtype parameter for type conversion
- Store the array in self._data
"""
### BEGIN SOLUTION
# Convert input to numpy array
if isinstance(data, (int, float, np.number)):
    # Handle Python and NumPy scalars
    if dtype is None:
        # Auto-detect type: int for integers, float32 for floats
        if isinstance(data, int) or (isinstance(data, np.number) and np.issubdtype(type(data), np.integer)):
            dtype = 'int32'
        else:
            dtype = 'float32'
    self._data = np.array(data, dtype=dtype)
elif isinstance(data, list):
    # Let NumPy auto-detect type, then convert if needed
    temp_array = np.array(data)
    if dtype is None:
        # Use NumPy's auto-detected type, but prefer float32 for floats
        if temp_array.dtype == np.float64:
            dtype = 'float32'
        else:
            dtype = str(temp_array.dtype)
    self._data = np.array(data, dtype=dtype)
elif isinstance(data, np.ndarray):
    # Already a numpy array
    if dtype is None:
        # Keep existing dtype, but prefer float32 for float64
        if data.dtype == np.float64:
            dtype = 'float32'
        else:
            dtype = str(data.dtype)
    self._data = data.astype(dtype) if dtype != data.dtype else data.copy()
else:
    # Try to convert unknown types
    self._data = np.array(data, dtype=dtype)
### END SOLUTION
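# The dtype defaults chosen in __init__ above are easier to check in isolation.
# The sketch below is a hypothetical helper (default_dtype is not part of
# TinyTorch) that mirrors those branches, assuming dtype=None. It suggests that
# inputs outside the scalar/list/ndarray branches, such as tuples, keep
# NumPy's float64 instead of being stored as float32.

```python
import numpy as np


def default_dtype(data):
    """Hypothetical helper: name of the dtype __init__ would pick when dtype is None."""
    if isinstance(data, (int, float, np.number)):
        # Scalars: integers become int32, every other number becomes float32
        is_integer = isinstance(data, int) or (
            isinstance(data, np.number) and np.issubdtype(type(data), np.integer)
        )
        return 'int32' if is_integer else 'float32'
    if isinstance(data, (list, np.ndarray)):
        # Lists and arrays: keep NumPy's inferred dtype, except float64 -> float32
        inferred = np.asarray(data).dtype
        return 'float32' if inferred == np.float64 else str(inferred)
    # Anything else (for example a tuple) is passed to np.array() unchanged
    return str(np.array(data).dtype)


print(default_dtype(5))           # scalar integer
print(default_dtype([1.0, 2.0]))  # list of floats
print(default_dtype((1.0, 2.0)))  # tuple of floats takes the fallback branch
```

# Integer lists keep NumPy's platform default integer type, so the result for
# [1, 2, 3] can differ between platforms and NumPy versions.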
@property
def data(self) -> np.ndarray:
"""
Access underlying numpy array.
TODO: Return the stored numpy array.
STEP-BY-STEP IMPLEMENTATION:
1. Access the internal _data attribute
2. Return the numpy array directly
3. This provides access to underlying data for NumPy operations
LEARNING CONNECTIONS:
Real-world relevance:
- PyTorch: tensor.numpy() converts to NumPy for visualization/analysis
- TensorFlow: tensor.numpy() enables integration with scientific Python
- Production: Data scientists need to access raw arrays for debugging
- Performance: Direct access avoids copying for read-only operations
HINT: Return self._data (the array you stored in __init__)
"""
### BEGIN SOLUTION
return self._data
### END SOLUTION
@property
def shape(self) -> Tuple[int, ...]:
"""
Get tensor shape.
TODO: Return the shape of the stored numpy array.
STEP-BY-STEP IMPLEMENTATION:
1. Access the _data attribute (the NumPy array)
2. Get the shape property from the NumPy array
3. Return the shape tuple directly
LEARNING CONNECTIONS:
Real-world relevance:
- Neural networks: Layer compatibility requires matching shapes
- Computer vision: Image shape (height, width, channels) determines architecture
- NLP: Sequence length and vocabulary size affect model design
- Debugging: Shape mismatches are among the most common causes of ML errors
HINT: Use .shape attribute of the numpy array
EXAMPLE: Tensor([1, 2, 3]).shape should return (3,)
"""
### BEGIN SOLUTION
return self._data.shape
### END SOLUTION
@property
def size(self) -> int:
"""
Get total number of elements.
TODO: Return the total number of elements in the tensor.
STEP-BY-STEP IMPLEMENTATION:
1. Access the _data attribute (the NumPy array)
2. Get the size property from the NumPy array
3. Return the total element count as an integer
LEARNING CONNECTIONS:
Real-world relevance:
- Memory planning: Calculate RAM requirements for large tensors
- Model architecture: Determine parameter counts for layers
- Performance optimization: Size affects computation time
- Batch processing: Total elements determines vectorization efficiency
HINT: Use .size attribute of the numpy array
EXAMPLE: Tensor([1, 2, 3]).size should return 3
"""
### BEGIN SOLUTION
return self._data.size
### END SOLUTION
@property
def dtype(self) -> np.dtype:
"""
Get data type as numpy dtype.
TODO: Return the data type of the stored numpy array.
STEP-BY-STEP IMPLEMENTATION:
1. Access the _data attribute (the NumPy array)
2. Get the dtype property from the NumPy array
3. Return the NumPy dtype object directly
LEARNING CONNECTIONS:
Real-world relevance:
- Precision vs speed: float32 is faster, float64 more accurate
- Memory optimization: int8 uses 1/4 memory of int32
- GPU compatibility: Some operations only work with specific types
- Model deployment: Mobile/edge devices prefer smaller data types
HINT: Use .dtype attribute of the numpy array
EXAMPLE: Tensor([1, 2, 3]).dtype returns NumPy's default integer dtype (dtype('int64') on most platforms); Tensor(5).dtype returns dtype('int32')
"""
### BEGIN SOLUTION
return self._data.dtype
### END SOLUTION
def __repr__(self) -> str:
"""
String representation.
TODO: Create a clear string representation of the tensor.
STEP-BY-STEP IMPLEMENTATION:
1. Convert the numpy array to a list using .tolist()
2. Get shape and dtype information from properties
3. Format as "Tensor([data], shape=shape, dtype=dtype)"
4. Return the formatted string
LEARNING CONNECTIONS:
Real-world relevance:
- Debugging: Clear tensor representation speeds debugging
- Jupyter notebooks: Good __repr__ improves data exploration
- Logging: Production systems log tensor info for monitoring
- Education: Students understand tensors better with clear output
APPROACH:
1. Convert the numpy array to a list for readable output
2. Include the shape and dtype information
3. Format: "Tensor([data], shape=shape, dtype=dtype)"
EXAMPLE:
Tensor([1, 2, 3]) → "Tensor([1, 2, 3], shape=(3,), dtype=int64)" (int32 where NumPy's default integer is 32-bit)
HINTS:
- Use .tolist() to convert numpy array to list
- Include shape and dtype information
- Keep format consistent and readable
"""
### BEGIN SOLUTION
return f"Tensor({self._data.tolist()}, shape={self.shape}, dtype={self.dtype})"
### END SOLUTION
def add(self, other: 'Tensor') -> 'Tensor':
"""
Add two tensors element-wise.
TODO: Implement tensor addition.
STEP-BY-STEP IMPLEMENTATION:
1. Extract numpy arrays from both tensors
2. Use NumPy's + operator for element-wise addition
3. Create a new Tensor object with the result
4. Return the new tensor
LEARNING CONNECTIONS:
Real-world relevance:
- Neural networks: Adding bias terms to linear layer outputs
- Residual connections: skip connections in ResNet architectures
- Gradient updates: Adding computed gradients to parameters
- Ensemble methods: Combining predictions from multiple models
APPROACH:
1. Add the numpy arrays using +
2. Return a new Tensor with the result
3. Handle broadcasting automatically
EXAMPLE:
Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])
HINTS:
- Use self._data + other._data
- Return Tensor(result)
- NumPy handles broadcasting automatically
"""
### BEGIN SOLUTION
result = self._data + other._data
return Tensor(result)
### END SOLUTION
def multiply(self, other: 'Tensor') -> 'Tensor':
"""
Multiply two tensors element-wise.
TODO: Implement tensor multiplication.
STEP-BY-STEP IMPLEMENTATION:
1. Extract numpy arrays from both tensors
2. Use NumPy's * operator for element-wise multiplication
3. Create a new Tensor object with the result
4. Return the new tensor
LEARNING CONNECTIONS:
Real-world relevance:
- Activation functions: Element-wise operations like ReLU masking
- Attention mechanisms: Element-wise scaling in transformer models
- Feature scaling: Multiplying features by learned scaling factors
- Gating: Element-wise gating in LSTM and GRU cells
APPROACH:
1. Multiply the numpy arrays using *
2. Return a new Tensor with the result
3. Handle broadcasting automatically
EXAMPLE:
Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])
HINTS:
- Use self._data * other._data
- Return Tensor(result)
- This is element-wise, not matrix multiplication
"""
### BEGIN SOLUTION
result = self._data * other._data
return Tensor(result)
### END SOLUTION
def __add__(self, other: Union['Tensor', int, float]) -> 'Tensor':
"""
Addition operator: tensor + other
TODO: Implement + operator for tensors.
STEP-BY-STEP IMPLEMENTATION:
1. Check if other is a Tensor object
2. If Tensor, call the add() method directly
3. If scalar, convert to Tensor then call add()
4. Return the result from add() method
LEARNING CONNECTIONS:
Real-world relevance:
- Natural syntax: tensor + scalar enables intuitive code
- Broadcasting: Adding scalars to tensors is common in ML
- Operator overloading: Python's magic methods enable math-like syntax
- API design: Clean interfaces reduce cognitive load for researchers
APPROACH:
1. If other is a Tensor, use tensor addition
2. If other is a scalar, convert to Tensor first
3. Return the result
EXAMPLE:
Tensor([1, 2]) + Tensor([3, 4]) → Tensor([4, 6])
Tensor([1, 2]) + 5 → Tensor([6, 7])
"""
### BEGIN SOLUTION
if isinstance(other, Tensor):
    return self.add(other)
else:
    return self.add(Tensor(other))
### END SOLUTION
def __mul__(self, other: Union['Tensor', int, float]) -> 'Tensor':
"""
Multiplication operator: tensor * other
TODO: Implement * operator for tensors.
STEP-BY-STEP IMPLEMENTATION:
1. Check if other is a Tensor object
2. If Tensor, call the multiply() method directly
3. If scalar, convert to Tensor then call multiply()
4. Return the result from multiply() method
LEARNING CONNECTIONS:
Real-world relevance:
- Scaling features: tensor * learning_rate for gradient updates
- Masking: tensor * mask for attention mechanisms
- Regularization: tensor * dropout_mask during training
- Normalization: tensor * scale_factor in batch normalization
APPROACH:
1. If other is a Tensor, use tensor multiplication
2. If other is a scalar, convert to Tensor first
3. Return the result
EXAMPLE:
Tensor([1, 2]) * Tensor([3, 4]) → Tensor([3, 8])
Tensor([1, 2]) * 3 → Tensor([3, 6])
"""
### BEGIN SOLUTION
if isinstance(other, Tensor):
    return self.multiply(other)
else:
    return self.multiply(Tensor(other))
### END SOLUTION
def __sub__(self, other: Union['Tensor', int, float]) -> 'Tensor':
"""
Subtraction operator: tensor - other
TODO: Implement - operator for tensors.
STEP-BY-STEP IMPLEMENTATION:
1. Check if other is a Tensor object
2. If Tensor, subtract other._data from self._data
3. If scalar, subtract scalar directly from self._data
4. Create new Tensor with result and return
LEARNING CONNECTIONS:
Real-world relevance:
- Gradient computation: parameter - learning_rate * gradient
- Residual connections: output - skip_connection in some architectures
- Error calculation: predicted - actual for loss computation
- Centering data: tensor - mean for zero-centered inputs
APPROACH:
1. Use other._data if other is a Tensor, otherwise use the scalar as-is
2. Subtract using numpy arrays
3. Return new Tensor with result
EXAMPLE:
Tensor([5, 6]) - Tensor([1, 2]) → Tensor([4, 4])
Tensor([5, 6]) - 1 → Tensor([4, 5])
"""
### BEGIN SOLUTION
if isinstance(other, Tensor):
    result = self._data - other._data
else:
    result = self._data - other
return Tensor(result)
### END SOLUTION
def __truediv__(self, other: Union['Tensor', int, float]) -> 'Tensor':
"""
Division operator: tensor / other
TODO: Implement / operator for tensors.
STEP-BY-STEP IMPLEMENTATION:
1. Check if other is a Tensor object
2. If Tensor, divide self._data by other._data
3. If scalar, divide self._data by scalar directly
4. Create new Tensor with result and return
LEARNING CONNECTIONS:
Real-world relevance:
- Normalization: tensor / std_deviation for standard scaling
- Learning rate decay: parameter / decay_factor over time
- Probability computation: counts / total_counts for frequencies
- Temperature scaling: logits / temperature in softmax functions
APPROACH:
1. Use other._data if other is a Tensor, otherwise use the scalar as-is
2. Divide using numpy arrays
3. Return new Tensor with result
EXAMPLE:
Tensor([6, 8]) / Tensor([2, 4]) → Tensor([3.0, 2.0])
Tensor([6, 8]) / 2 → Tensor([3.0, 4.0])
"""
### BEGIN SOLUTION
if isinstance(other, Tensor):
    result = self._data / other._data
else:
    result = self._data / other
return Tensor(result)
### END SOLUTION
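# The subtraction and division operators above lean on two NumPy behaviours:
# true division always yields floats, and broadcasting stretches a smaller
# operand. A minimal sketch on plain NumPy arrays (not on Tensor itself),
# where the float32 cast imitates what Tensor.__init__ does with a float64
# result:

```python
import numpy as np

a = np.array([6, 8])
b = np.array([2, 4])

# True division of integer arrays produces a float64 array
quotient = a / b

# Tensor.__init__ prefers float32, so a float64 result would be stored as float32
stored = quotient.astype('float32') if quotient.dtype == np.float64 else quotient

# Broadcasting: the 1-D array is subtracted from every row of the 2-D array
centered = np.array([[1, 2], [3, 4]]) - np.array([1, 2])

print(quotient, quotient.dtype)
print(stored.dtype)
print(centered)
```

# Integer division therefore returns float values, even when every quotient is whole.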
def mean(self) -> 'Tensor':
"""Computes the mean of the tensor's elements."""
return Tensor(np.mean(self.data))
def matmul(self, other: 'Tensor') -> 'Tensor':
"""
Perform matrix multiplication between two tensors.
TODO: Implement matrix multiplication.
STEP-BY-STEP IMPLEMENTATION:
1. Extract numpy arrays from both tensors
2. Use np.matmul() for proper matrix multiplication
3. Create new Tensor object with the result
4. Return the new tensor
LEARNING CONNECTIONS:
Real-world relevance:
- Linear layers: input @ weight matrices in neural networks
- Transformer attention: Q @ K^T for attention scores
- CNN convolutions: Often lowered to matrix multiplications (for example via im2col)
- Batch processing: Matrix ops enable parallel computation
APPROACH:
1. Use np.matmul() to perform matrix multiplication
2. Return a new Tensor with the result
3. Handle broadcasting automatically
EXAMPLE:
Tensor([[1, 2], [3, 4]]) @ Tensor([[5, 6], [7, 8]]) → Tensor([[19, 22], [43, 50]])
HINTS:
- Use np.matmul(self._data, other._data)
- Return Tensor(result)
- This is matrix multiplication, not element-wise multiplication
"""
### BEGIN SOLUTION
result = np.matmul(self._data, other._data)
return Tensor(result)
### END SOLUTION
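# The difference between matmul() and multiply() is easiest to see side by side.
# A minimal sketch using plain NumPy arrays with the values from the docstring
# example above; the shape check at the end uses arbitrary illustrative sizes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Matrix product: each entry is a row of a dotted with a column of b
matrix_product = np.matmul(a, b)

# Element-wise product: entries at the same position are multiplied
elementwise = a * b

# Inner dimensions must match: (2, 3) times (3, 4) gives (2, 4)
out_shape = np.matmul(np.ones((2, 3)), np.ones((3, 4))).shape

print(matrix_product)
print(elementwise)
print(out_shape)
```

# Mismatched inner dimensions, such as (2, 3) with (2, 3), make np.matmul raise a ValueError.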