Mirror of https://github.com/MLSysBook/TinyTorch.git, synced 2026-05-01 04:58:11 -05:00
# Module Improvement Guide: From Poor to Excellent Structure

## Example: Transforming the 01_tensor Module

This guide shows how to transform the tensor module from its current structure to the **explain → code → test → repeat** pattern exemplified by `setup_dev.py`.

## Current Problem Structure

```python
# Current tensor_dev.py structure (POOR)
# Lines 1-300: all explanations
# Lines 300-700: all implementations
# Lines 700-1536: all tests at the end

class Tensor:
    def __init__(self):
        raise NotImplementedError("Student implementation required")

    def add(self):
        raise NotImplementedError("Student implementation required")

    def multiply(self):
        raise NotImplementedError("Student implementation required")

# Much later...
def test_tensor_creation_comprehensive():
    # Tests everything at once
    pass

def test_tensor_arithmetic_comprehensive():
    # Tests everything at once
    pass
```

## Improved Structure (EXCELLENT)

```python
# Improved tensor_dev.py structure (EXCELLENT)
# Following: Explain → Code → Test → Repeat

import numpy as np
from typing import List, Optional, Tuple, Union

# %% [markdown]
"""
## Step 1: What is a Tensor?

### Definition
A **tensor** is an N-dimensional array with ML-specific operations.

### Why Tensors Matter
- **Foundation**: Every ML framework uses tensors
- **Efficiency**: Vectorized operations are faster than Python loops
- **Flexibility**: The same operations work on scalars, vectors, and matrices

### Real-World Examples
```python
# Scalar (0D): A single number
temperature = Tensor(25.0)

# Vector (1D): A list of numbers
rgb_color = Tensor([255, 128, 0])

# Matrix (2D): Image pixels
image = Tensor([[100, 150], [200, 250]])
```

Let's build this step by step!
"""

# %% [markdown]
"""
## Step 1A: Tensor Creation

### The Foundation Operation
Creating tensors is the first thing you'll do in any ML system. Our Tensor class needs to:
1. Accept various input types (lists, NumPy arrays, scalars)
2. Store data efficiently
3. Track shape and type information
"""

# %% nbgrader={"grade": false, "grade_id": "tensor-creation", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
class Tensor:
    def __init__(self, data: Union[int, float, List, np.ndarray], dtype: Optional[str] = None):
        """
        Create a tensor from various input types.

        TODO: Implement tensor creation with proper data handling.

        STEP-BY-STEP IMPLEMENTATION:
        1. Convert the input data to a NumPy array using np.array()
        2. Handle dtype conversion if specified
        3. Store the NumPy array in self._data
        4. Validate that the data is numeric (not strings, objects, etc.)

        EXAMPLE USAGE:
        ```python
        # From scalar
        t1 = Tensor(5.0)

        # From list
        t2 = Tensor([1, 2, 3])

        # From nested list (matrix)
        t3 = Tensor([[1, 2], [3, 4]])
        ```

        IMPLEMENTATION HINTS:
        - Use np.array(data) to convert the input
        - Check the dtype parameter: if provided, use np.array(data, dtype=dtype)
        - Validate: ensure the data is numeric (int, float, complex)
        - Store the result in self._data for internal use

        LEARNING CONNECTIONS:
        - This is like torch.tensor() in PyTorch
        - Similar to tf.constant() in TensorFlow
        - Foundation for all other tensor operations
        """
        ### BEGIN SOLUTION
        if dtype is not None:
            self._data = np.array(data, dtype=dtype)
        else:
            self._data = np.array(data)

        # Validate numeric data
        if not np.issubdtype(self._data.dtype, np.number):
            raise ValueError(f"Tensor data must be numeric, got {self._data.dtype}")
        ### END SOLUTION

# %% [markdown]
"""
### 🧪 Test Your Tensor Creation

Once you implement the `__init__` method above, run this cell to test it:
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-creation", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
def test_tensor_creation():
    """Test tensor creation with various input types."""
    print("Testing tensor creation...")

    # Test scalar creation
    t1 = Tensor(5.0)
    assert t1._data.shape == (), "Scalar tensor should have empty shape"
    assert t1._data.item() == 5.0, "Scalar value should be 5.0"

    # Test list creation
    t2 = Tensor([1, 2, 3])
    assert t2._data.shape == (3,), "1D tensor should have shape (3,)"
    assert np.array_equal(t2._data, [1, 2, 3]), "1D tensor values should match"

    # Test matrix creation
    t3 = Tensor([[1, 2], [3, 4]])
    assert t3._data.shape == (2, 2), "2D tensor should have shape (2, 2)"
    assert np.array_equal(t3._data, [[1, 2], [3, 4]]), "2D tensor values should match"

    # Test dtype specification
    t4 = Tensor([1, 2, 3], dtype='float32')
    assert t4._data.dtype == np.float32, "Specified dtype should be respected"

    print("✅ Tensor creation tests passed!")
    print("✅ Created tensors: scalar, vector, matrix")
    print("✅ Handled data types correctly")

# Run the test
test_tensor_creation()

# %% [markdown]
"""
## Step 1B: Tensor Properties

### Essential Information Access
Every tensor needs to provide basic information about itself:
- **Shape**: The dimensions of the tensor
- **Size**: The total number of elements
- **Data access**: The underlying data

### Why Properties Matter
- **Debugging**: Quickly see tensor dimensions
- **Validation**: Check compatibility before operations
- **Integration**: Interface with other libraries
"""

# %% nbgrader={"grade": false, "grade_id": "tensor-properties", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
# Note: these properties belong inside the Tensor class; they are shown as a
# separate cell here to keep the explain → code → test rhythm visible.
@property
def data(self) -> np.ndarray:
    """
    Get the underlying NumPy array data.

    TODO: Implement data property access.

    STEP-BY-STEP IMPLEMENTATION:
    1. Return self._data directly
    2. This gives users access to the NumPy array

    EXAMPLE USAGE:
    ```python
    t = Tensor([[1, 2], [3, 4]])
    print(t.data)  # [[1 2]
                   #  [3 4]]
    ```

    IMPLEMENTATION HINTS:
    - Simple property: just return self._data
    - No validation needed here
    - This is like tensor.numpy() in PyTorch
    """
    ### BEGIN SOLUTION
    return self._data
    ### END SOLUTION

@property
def shape(self) -> Tuple[int, ...]:
    """
    Get the shape (dimensions) of the tensor.

    TODO: Implement the shape property.

    STEP-BY-STEP IMPLEMENTATION:
    1. Return self._data.shape
    2. This gives the dimensions as a tuple

    EXAMPLE USAGE:
    ```python
    t = Tensor([[1, 2], [3, 4]])
    print(t.shape)  # (2, 2)
    ```

    IMPLEMENTATION HINTS:
    - NumPy arrays have a .shape attribute
    - Return self._data.shape
    - This is like tensor.shape in PyTorch
    """
    ### BEGIN SOLUTION
    return self._data.shape
    ### END SOLUTION

@property
def size(self) -> int:
    """
    Get the total number of elements in the tensor.

    TODO: Implement the size property.

    STEP-BY-STEP IMPLEMENTATION:
    1. Return self._data.size
    2. This gives the total element count across all dimensions

    EXAMPLE USAGE:
    ```python
    t = Tensor([[1, 2], [3, 4]])
    print(t.size)  # 4 (2 × 2 = 4 elements)
    ```

    IMPLEMENTATION HINTS:
    - NumPy arrays have a .size attribute
    - Return self._data.size
    - This is like tensor.numel() in PyTorch
    """
    ### BEGIN SOLUTION
    return self._data.size
    ### END SOLUTION

# %% [markdown]
"""
### 🧪 Test Your Tensor Properties

Once you implement the properties above, run this cell to test them:
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-properties", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
def test_tensor_properties():
    """Test tensor properties: data, shape, size."""
    print("Testing tensor properties...")

    # Test scalar properties
    t1 = Tensor(5.0)
    assert t1.shape == (), "Scalar shape should be the empty tuple"
    assert t1.size == 1, "Scalar size should be 1"
    assert t1.data.item() == 5.0, "Scalar data should be accessible"

    # Test vector properties
    t2 = Tensor([1, 2, 3, 4])
    assert t2.shape == (4,), "Vector shape should be (4,)"
    assert t2.size == 4, "Vector size should be 4"
    assert np.array_equal(t2.data, [1, 2, 3, 4]), "Vector data should match"

    # Test matrix properties
    t3 = Tensor([[1, 2, 3], [4, 5, 6]])
    assert t3.shape == (2, 3), "Matrix shape should be (2, 3)"
    assert t3.size == 6, "Matrix size should be 6"
    assert np.array_equal(t3.data, [[1, 2, 3], [4, 5, 6]]), "Matrix data should match"

    print("✅ Tensor properties tests passed!")
    print("✅ Shape, size, and data access working correctly")

# Run the test
test_tensor_properties()

# %% [markdown]
"""
## Step 2: Tensor Arithmetic

### The Heart of ML: Mathematical Operations
Now we implement the core mathematical operations that make ML possible:
- **Addition**: Element-wise addition of tensors
- **Multiplication**: Element-wise multiplication
- **Subtraction**: Element-wise subtraction
- **Division**: Element-wise division

### Why Arithmetic Matters
- **Neural networks**: Every layer uses tensor arithmetic
- **Optimization**: Gradient updates are arithmetic on parameters
- **Data processing**: Normalization, scaling, transformations
"""

# %% [markdown]
"""
## Step 2A: Tensor Addition

### The Foundation Operation
Addition is the most basic and important tensor operation:
- **Element-wise**: Each element adds to the corresponding element
- **Broadcasting**: Smaller tensors can be added to larger ones
- **Commutative**: a + b = b + a
"""

# %% nbgrader={"grade": false, "grade_id": "tensor-addition", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def add(self, other: 'Tensor') -> 'Tensor':
    """
    Add two tensors element-wise.

    TODO: Implement tensor addition.

    STEP-BY-STEP IMPLEMENTATION:
    1. Get the NumPy data from both tensors
    2. Use NumPy's + operator for element-wise addition
    3. Create a new Tensor with the result
    4. Return the new tensor

    EXAMPLE USAGE:
    ```python
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[5, 6], [7, 8]])
    result = t1.add(t2)
    print(result.data)  # [[6, 8], [10, 12]]
    ```

    IMPLEMENTATION HINTS:
    - Use self._data + other._data for NumPy addition
    - Wrap the result in a new Tensor: return Tensor(result)
    - NumPy handles broadcasting automatically
    - This is like torch.add() in PyTorch

    LEARNING CONNECTIONS:
    - This is used in every neural network layer
    - Gradient descent updates parameters with a signed step:
      params = params - learning_rate * gradients
    - Data preprocessing: adding bias, normalization
    """
    ### BEGIN SOLUTION
    result = self._data + other._data
    return Tensor(result)
    ### END SOLUTION

# %% [markdown]
"""
### 🧪 Test Your Tensor Addition

Once you implement the `add` method above, run this cell to test it:
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-addition", "locked": true, "points": 15, "schema_version": 3, "solution": false, "task": false}
def test_tensor_addition():
    """Test tensor addition with various shapes."""
    print("Testing tensor addition...")

    # Test same-shape addition
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[5, 6], [7, 8]])
    result = t1.add(t2)
    expected = np.array([[6, 8], [10, 12]])
    assert np.array_equal(result.data, expected), "Same-shape addition failed"

    # Test scalar addition (broadcasting)
    t3 = Tensor([[1, 2], [3, 4]])
    t4 = Tensor(10)
    result = t3.add(t4)
    expected = np.array([[11, 12], [13, 14]])
    assert np.array_equal(result.data, expected), "Scalar addition failed"

    # Test vector addition (broadcasting)
    t5 = Tensor([[1, 2], [3, 4]])
    t6 = Tensor([10, 20])
    result = t5.add(t6)
    expected = np.array([[11, 22], [13, 24]])
    assert np.array_equal(result.data, expected), "Vector addition failed"

    print("✅ Tensor addition tests passed!")
    print("✅ Same-shape, scalar, and vector addition working")

# Run the test
test_tensor_addition()

# %% [markdown]
"""
## Step 2B: Tensor Multiplication

### Scaling and Element-wise Products
Multiplication is crucial for scaling values and computing element-wise products:
- **Element-wise**: Each element multiplies the corresponding element
- **Broadcasting**: Works across different shapes
- **Commutative**: a * b = b * a
"""

# %% nbgrader={"grade": false, "grade_id": "tensor-multiplication", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def multiply(self, other: 'Tensor') -> 'Tensor':
    """
    Multiply two tensors element-wise.

    TODO: Implement tensor multiplication.

    STEP-BY-STEP IMPLEMENTATION:
    1. Get the NumPy data from both tensors
    2. Use NumPy's * operator for element-wise multiplication
    3. Create a new Tensor with the result
    4. Return the new tensor

    EXAMPLE USAGE:
    ```python
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[2, 3], [4, 5]])
    result = t1.multiply(t2)
    print(result.data)  # [[2, 6], [12, 20]]
    ```

    IMPLEMENTATION HINTS:
    - Use self._data * other._data for NumPy multiplication
    - Wrap the result in a new Tensor: return Tensor(result)
    - NumPy handles broadcasting automatically
    - This is like torch.mul() in PyTorch

    LEARNING CONNECTIONS:
    - Activation functions: ReLU masks
    - Attention mechanisms: attention weights * values
    - Scaling: learning_rate * gradients
    """
    ### BEGIN SOLUTION
    result = self._data * other._data
    return Tensor(result)
    ### END SOLUTION

# %% [markdown]
"""
### 🧪 Test Your Tensor Multiplication

Once you implement the `multiply` method above, run this cell to test it:
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-multiplication", "locked": true, "points": 15, "schema_version": 3, "solution": false, "task": false}
def test_tensor_multiplication():
    """Test tensor multiplication with various shapes."""
    print("Testing tensor multiplication...")

    # Test same-shape multiplication
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[2, 3], [4, 5]])
    result = t1.multiply(t2)
    expected = np.array([[2, 6], [12, 20]])
    assert np.array_equal(result.data, expected), "Same-shape multiplication failed"

    # Test scalar multiplication (broadcasting)
    t3 = Tensor([[1, 2], [3, 4]])
    t4 = Tensor(2)
    result = t3.multiply(t4)
    expected = np.array([[2, 4], [6, 8]])
    assert np.array_equal(result.data, expected), "Scalar multiplication failed"

    # Test vector multiplication (broadcasting)
    t5 = Tensor([[1, 2], [3, 4]])
    t6 = Tensor([2, 3])
    result = t5.multiply(t6)
    expected = np.array([[2, 6], [6, 12]])
    assert np.array_equal(result.data, expected), "Vector multiplication failed"

    print("✅ Tensor multiplication tests passed!")
    print("✅ Same-shape, scalar, and vector multiplication working")

# Run the test
test_tensor_multiplication()

# %% [markdown]
"""
## 🎯 Step 3: Integration Test

### Putting It All Together
Now let's test that all our tensor operations work together in realistic scenarios:
"""

# %% nbgrader={"grade": true, "grade_id": "test-tensor-integration", "locked": true, "points": 20, "schema_version": 3, "solution": false, "task": false}
def test_tensor_integration():
    """Test complete tensor functionality together."""
    print("Testing tensor integration...")

    # Create test tensors
    t1 = Tensor([[1, 2], [3, 4]])
    t2 = Tensor([[2, 1], [1, 2]])
    scalar = Tensor(0.5)

    # Test chained operations: (t1 + t2) * 0.5
    result = t1.add(t2).multiply(scalar)
    expected = np.array([[1.5, 1.5], [2.0, 3.0]])
    assert np.array_equal(result.data, expected), "Chained operations failed"

    # Test properties after operations
    assert result.shape == (2, 2), "Result shape should be (2, 2)"
    assert result.size == 4, "Result size should be 4"

    # Test with different shapes (broadcasting)
    t3 = Tensor([1, 2, 3])
    t4 = Tensor([[1], [2], [3]])
    result = t3.add(t4)
    assert result.shape == (3, 3), "Broadcasting result should be (3, 3)"

    print("✅ Tensor integration tests passed!")
    print("✅ All tensor operations work together correctly")
    print("✅ Ready to build neural networks!")

# Run the integration test
test_tensor_integration()

# %% [markdown]
"""
## 🎯 Module Summary: Tensor Mastery Achieved!

Congratulations! You've successfully implemented the core Tensor class with:

### ✅ What You've Built
- **Tensor creation**: Handle various input types (scalars, lists, arrays)
- **Properties**: Access shape, size, and data efficiently
- **Arithmetic**: Add and multiply tensors with broadcasting support
- **Integration**: Operations that work together seamlessly

### ✅ Key Learning Outcomes
- **Understanding**: Tensors as the foundation of ML systems
- **Implementation**: Tensor operations built from scratch
- **Testing**: Comprehensive validation at each step
- **Integration**: Chained operations for complex computations

### ✅ Ready for Next Steps
Your tensor implementation is now ready to power:
- **Activations**: ReLU, Sigmoid, and Tanh will operate on your tensors
- **Layers**: Dense layers will use tensor arithmetic
- **Networks**: Complete neural networks built on your foundation

**Next Module**: Activations - adding nonlinearity to enable complex learning!
"""
```
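
Several of the hints and tests in the example lean on NumPy handling broadcasting automatically. As a quick standalone sanity check of the rules the example depends on (plain NumPy, no Tensor class required):

```python
import numpy as np

# A (2, 2) matrix broadcast against a scalar and against a (2,) vector:
# NumPy virtually expands the smaller operand along the missing dimensions.
matrix = np.array([[1, 2], [3, 4]])

print(matrix + 10)                # scalar broadcast: every element + 10
print(matrix * np.array([2, 3]))  # row broadcast: columns scaled by 2 and 3

# Shapes align right-to-left: a (3,) row plus a (3, 1) column expands to
# (3, 3), which is exactly what the integration test asserts.
row = np.arange(1, 4)             # shape (3,)
col = row.reshape(3, 1)           # shape (3, 1)
print((row + col).shape)          # (3, 3)
```

Because the Tensor methods delegate to `self._data + other._data` and `self._data * other._data`, they inherit these semantics for free.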

## Key Improvements Demonstrated

### 1. **Progressive Structure**
- Each concept is explained, implemented, and tested before moving on
- Students get immediate feedback after each step
- No overwhelming stretch of code without validation

### 2. **Rich Scaffolding**
- Every TODO has step-by-step implementation guidance
- Example usage shows exactly what the function should do
- Implementation hints provide specific technical guidance
- Learning connections show how concepts fit together

### 3. **Immediate Testing**
- Each function is tested immediately after implementation
- Tests provide clear success messages and name specific achievements
- Integration tests show how concepts work together

### 4. **Educational Flow**
- Concepts build logically from simple to complex
- Real-world motivation comes before technical implementation
- Visual examples and concrete cases come before abstract theory

## Implementation Steps for Other Modules

1. **Identify natural breakpoints** in the current module
2. **Reorganize** into Step 1, Step 2, etc., each with its own explanation
3. **Add rich TODO blocks** with step-by-step guidance
4. **Insert immediate testing** after each major concept
5. **Add success messages** and progress indicators
6. **Include learning connections** between concepts
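
Taken together, these steps reduce to one repeating cell pattern. A minimal skeleton in the same jupytext percent format (the cell ids, point value, and function names below are placeholders, not part of any real module):

```python
# %% [markdown]
"""
## Step N: <Concept Name>
Explain the concept first: definition, why it matters, a concrete example.
"""

# %% nbgrader={"grade": false, "grade_id": "step-n", "locked": false, "schema_version": 3, "solution": true, "task": false}
#| export
def step_n_function():
    """
    TODO: step-by-step guidance, example usage, hints, learning connections.
    """
    ### BEGIN SOLUTION
    return 42  # reference implementation goes here
    ### END SOLUTION

# %% [markdown]
"""
### 🧪 Test Your Step N
Once you implement the function above, run this cell to test it:
"""

# %% nbgrader={"grade": true, "grade_id": "test-step-n", "locked": true, "points": 10, "schema_version": 3, "solution": false, "task": false}
def test_step_n_function():
    """Immediate feedback on the step just completed."""
    assert step_n_function() == 42, "step_n_function should return 42"
    print("✅ Step N tests passed!")

# Run the test right away
test_step_n_function()
```

Repeat the unit once per concept; each repetition keeps the solution cell editable and the test cell locked and point-bearing.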

This transformation turns modules from reference material into guided learning experiences that maximize student success through immediate feedback and clear progression.