# TinyTorch NBGrader Style Guide
## Purpose
This guide establishes the standard format for all NBGrader solution blocks across TinyTorch modules to ensure consistency and maximize educational value.
## Standard Solution Block Format
````python
def function_name(self, parameters):
    """
    Brief function description (1-2 sentences).

    Args:
        param1: Parameter description
        param2: Parameter description

    Returns:
        Return type and description

    TODO: Implement [specific task] with [key requirements].

    STEP-BY-STEP IMPLEMENTATION:
    1. [Action verb] [specific task] - [brief explanation]
    2. [Action verb] [specific task] - [brief explanation]
    3. [Action verb] [specific task] - [brief explanation]
    4. [Action verb] [specific task] - [brief explanation]

    EXAMPLE USAGE:
    ```python
    # Realistic example with clear input/output
    input_data = ClassName(example_data)
    result = function_name(input_data, parameters)
    print(result)  # Expected: [specific output]
    ```

    IMPLEMENTATION HINTS:
    - Use [specific function/method] for [specific purpose]
    - Handle [edge case] by [specific approach]
    - Remember to [critical requirement]
    - Common error: [specific mistake to avoid]

    LEARNING CONNECTIONS:
    - This is equivalent to [PyTorch/TensorFlow function]
    - Used in [real-world application/system]
    - Foundation for [advanced concept]
    - Enables [specific capability]
    """
    ### BEGIN SOLUTION
    # Implementation code (typically 3-10 lines)
    # Focus on clarity and correctness
    # Follow the steps outlined above
    ### END SOLUTION
````
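When the assignment version is generated, nbgrader removes everything between the solution markers. With nbgrader's default `ClearSolutions` settings, students see a stub in place of the solution (the exact stub text is configurable), roughly:

```python
def function_name(self, parameters):
    """
    ... same docstring as above ...
    """
    # YOUR CODE HERE
    raise NotImplementedError()
```

Because all guidance lives in the docstring, outside the solution markers, students keep the full scaffolding after stripping.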
## Required Sections
### 1. TODO
- Purpose: Clear task description
- Format:
  ```
  TODO: Implement [specific task] with [key requirements].
  ```
- Example:
  ```
  TODO: Implement forward pass for ReLU activation with proper handling of negative values.
  ```
### 2. STEP-BY-STEP IMPLEMENTATION
- Purpose: Guide implementation approach
- Format: Numbered list with action verbs
- Guidelines:
  - Start each step with an action verb (Create, Calculate, Apply, Return)
  - Include brief explanation after dash
  - Keep to 3-5 steps for later modules, 5-7 for early modules
- Example:
  ```
  1. Check input dimensions - ensure tensor is valid
  2. Apply element-wise maximum - compare with zero
  3. Return activated tensor - maintain original shape
  ```
### 3. EXAMPLE USAGE
- Purpose: Demonstrate correct usage
- Format: Python code block with comments
- Must Include:
  - Realistic input data
  - Function call with proper parameters
  - Expected output with comment
- Example:
  ```python
  # Create sample input
  x = Tensor([[-1, 0, 2], [3, -4, 5]])
  relu = ReLU()
  output = relu(x)
  print(output)  # Expected: [[0, 0, 2], [3, 0, 5]]
  ```
### 4. IMPLEMENTATION HINTS
- Purpose: Technical guidance and common pitfalls
- Format: Bulleted list
- Should Include:
  - Specific functions/methods to use
  - Edge cases to handle
  - Common errors to avoid
  - Performance considerations (for later modules)
- Example:
  ```
  - Use np.maximum() for element-wise comparison
  - Handle None inputs gracefully
  - Remember to preserve input shape
  - Common error: forgetting to handle batch dimensions
  ```
### 5. LEARNING CONNECTIONS
- Purpose: Connect to real-world ML systems
- Format: Bulleted list
- Should Include:
  - Framework equivalents (PyTorch/TensorFlow)
  - Real-world applications
  - Connection to other modules
  - Why this implementation matters
- Example:
  ```
  - This is equivalent to torch.nn.ReLU() in PyTorch
  - Used in every modern neural network architecture
  - Foundation for understanding gradient flow
  - Enables training deep networks without vanishing gradients
  ```
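Put together, the ReLU fragments running through the five section examples above compose into one complete solution block. A minimal sketch of that composition, assuming numpy is imported as `np` and a `Tensor` class with a `.data` array attribute (an interface the examples imply but do not define):

````python
def forward(self, x):
    """
    Apply ReLU activation element-wise.

    Args:
        x: Input tensor of any shape

    Returns:
        Tensor of same shape with negative values replaced by zero

    TODO: Implement forward pass for ReLU activation with proper handling of negative values.

    STEP-BY-STEP IMPLEMENTATION:
    1. Check input dimensions - ensure tensor is valid
    2. Apply element-wise maximum - compare with zero
    3. Return activated tensor - maintain original shape

    EXAMPLE USAGE:
    ```python
    x = Tensor([[-1, 0, 2], [3, -4, 5]])
    relu = ReLU()
    output = relu(x)
    print(output)  # Expected: [[0, 0, 2], [3, 0, 5]]
    ```

    IMPLEMENTATION HINTS:
    - Use np.maximum() for element-wise comparison
    - Handle None inputs gracefully
    - Remember to preserve input shape
    - Common error: forgetting to handle batch dimensions

    LEARNING CONNECTIONS:
    - This is equivalent to torch.nn.ReLU() in PyTorch
    - Used in every modern neural network architecture
    - Foundation for understanding gradient flow
    - Enables training deep networks without vanishing gradients
    """
    ### BEGIN SOLUTION
    if x is None:
        raise ValueError("ReLU requires a Tensor input, got None")  # handle None gracefully
    # np.maximum broadcasts 0 across the array, preserving the input shape
    return Tensor(np.maximum(x.data, 0))
    ### END SOLUTION
````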
## Optional Enhancement Sections
### VISUAL STEP-BY-STEP (Early modules)
- When to Use: Complex mathematical operations or data flow
- Format: ASCII diagrams or visual explanations
- Example:
  ```
  Input:  [1, -2, 3, -4, 5]
             ↓ ReLU
  Output: [1, 0, 3, 0, 5]
  ```
### DEBUGGING HINTS (When helpful)
- When to Use: Functions with common implementation errors
- Format: Specific debugging strategies
- Example:
  ```
  - Print shapes at each step to verify dimensions
  - Check for NaN values after operations
  - Verify gradient flow in backward pass
  ```
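In code, the first two strategies might look like the sketch below (the input and operation are illustrative placeholders, not part of any TinyTorch API):

```python
import numpy as np

x = np.array([[1.0, -2.0], [3.0, -4.0]])
output = np.maximum(x, 0)  # stand-in for the operation being debugged

# Print shapes at each step to verify dimensions
print(f"input: {x.shape} -> output: {output.shape}")

# Check for NaN values after operations
assert not np.isnan(output).any(), "NaN detected after operation"
```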
### MATHEMATICAL FOUNDATION (Math-heavy modules)
- When to Use: Complex mathematical operations
- Format: LaTeX-style equations with explanations
- Example:
  ```
  Softmax formula: softmax(x_i) = exp(x_i) / Σ(exp(x_j))
  ```
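When overflow protection matters, the same section can state the max-subtraction identity that the softmax example at the end of this guide implements (softmax is invariant to shifting every input by the same constant):

```
softmax(x_i) = exp(x_i - max(x)) / Σ(exp(x_j - max(x)))
```

Every exponent is then at most zero, so exp() cannot overflow.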
## Module-Specific Guidelines
### Early Modules (01-07): Foundation & Architecture
- More detailed STEP-BY-STEP (5-7 steps)
- Include VISUAL STEP-BY-STEP where helpful
- Focus on educational clarity
- Simpler EXAMPLE USAGE
### Middle Modules (08-11): Training
- Balance detail with conciseness (4-5 steps)
- Include gradient flow considerations
- Real dataset examples
- Performance hints become important
### Later Modules (12-16): Production
- Concise STEP-BY-STEP (3-5 steps)
- Production-focused IMPLEMENTATION HINTS
- Complex, real-world EXAMPLE USAGE
- Strong emphasis on LEARNING CONNECTIONS to industry
## Quality Checklist
Before finalizing any solution block, verify:
- TODO clearly states the task
- STEP-BY-STEP has numbered action steps
- EXAMPLE USAGE has realistic code with expected output
- IMPLEMENTATION HINTS cover key technical points
- LEARNING CONNECTIONS link to real ML systems
- Solution code follows the outlined steps
- All code is tested and working (see the pytest sketch after this checklist)
- Docstring has proper Args/Returns sections
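One lightweight way to enforce "all code is tested and working" is to mirror each EXAMPLE USAGE block in a pytest case. A minimal sketch, assuming the standalone softmax function from the example at the end of this guide (the import path is hypothetical):

```python
import numpy as np
# from tinytorch.activations import softmax  # hypothetical import path


def test_softmax_example_usage():
    """Keep the docstring's EXAMPLE USAGE and # Expected: comments honest."""
    logits = np.array([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
    probs = softmax(logits)

    # Rows must sum to 1, matching "# Expected: [1.0, 1.0]"
    assert np.allclose(probs.sum(axis=1), [1.0, 1.0])
    # First row matches "# Expected: [0.659, 0.242, 0.099] (approx)"
    assert np.allclose(probs[0], [0.659, 0.242, 0.099], atol=1e-3)
```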
## Common Mistakes to Avoid
- Inconsistent section names: Always use exact section headers
- Missing expected output: Every example needs an `# Expected:` comment
- Too vague TODOs: Be specific about requirements
- Untested examples: All example code must actually work
- Missing Learning Connections: Always connect to real-world ML
## Example: Well-Formatted Solution Block
````python
def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """
    Apply softmax activation function along specified axis.

    Args:
        x: Input array of any shape
        axis: Axis along which to apply softmax (default: -1)

    Returns:
        Array with same shape as input with softmax applied

    TODO: Implement numerically stable softmax with overflow protection.

    STEP-BY-STEP IMPLEMENTATION:
    1. Subtract maximum value - prevent overflow in exponential
    2. Compute exponentials - apply exp() to shifted values
    3. Sum exponentials - calculate normalization factor
    4. Divide by sum - normalize to get probabilities

    EXAMPLE USAGE:
    ```python
    logits = np.array([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
    probs = softmax(logits)
    print(probs.sum(axis=1))  # Expected: [1.0, 1.0]
    print(probs[0])  # Expected: [0.659, 0.242, 0.099] (approx)
    ```

    IMPLEMENTATION HINTS:
    - Use x.max(axis=axis, keepdims=True) for stable computation
    - Apply np.exp() after shifting by maximum
    - Use keepdims=True to maintain broadcasting shape
    - Common error: forgetting to handle arbitrary axis parameter

    LEARNING CONNECTIONS:
    - This is equivalent to torch.nn.functional.softmax() in PyTorch
    - Critical for multi-class classification in final layers
    - Used in attention mechanisms for weight normalization
    - Foundation for cross-entropy loss computation
    """
    ### BEGIN SOLUTION
    x_max = x.max(axis=axis, keepdims=True)  # shift for numerical stability
    x_shifted = x - x_max
    exp_x = np.exp(x_shifted)
    sum_exp = exp_x.sum(axis=axis, keepdims=True)
    return exp_x / sum_exp
    ### END SOLUTION
````
## Enforcement
- All new modules MUST follow this style guide
- Existing modules should be updated when modified
- Use this guide for code reviews
- Include compliance in module testing
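Compliance can also be checked mechanically. A minimal sketch of such a check (the helper is illustrative, not an existing tito command):

```python
REQUIRED_SECTIONS = [
    "TODO:",
    "STEP-BY-STEP IMPLEMENTATION:",
    "EXAMPLE USAGE:",
    "IMPLEMENTATION HINTS:",
    "LEARNING CONNECTIONS:",
]


def missing_sections(docstring: str) -> list[str]:
    """Return the required section headers absent from a solution docstring."""
    return [s for s in REQUIRED_SECTIONS if s not in docstring]


# Example: flag a function whose docstring skips a required section
missing = missing_sections(softmax.__doc__ or "")
assert not missing, f"Missing sections: {missing}"
```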
Last Updated: [Current Date] Version: 1.0