Add NBGrader style guide and compliance checker

- Created comprehensive NBGRADER_STYLE_GUIDE.md with standard format
- Defined required sections: TODO, STEP-BY-STEP, EXAMPLE USAGE, HINTS, CONNECTIONS
- Added check_compliance.py script to audit all modules
- Identified 8/17 modules fully compliant, 9 need updates
- Established clear quality standards for educational content
Vijay Janapa Reddi
2025-09-16 16:03:42 -04:00
parent cedebf9d2c
commit ef36508701
2 changed files with 344 additions and 0 deletions

256
NBGRADER_STYLE_GUIDE.md Normal file

@@ -0,0 +1,256 @@
# TinyTorch NBGrader Style Guide
## Purpose
This guide establishes the standard format for all NBGrader solution blocks across TinyTorch modules to ensure consistency and maximize educational value.
## Standard Solution Block Format
```python
def function_name(self, parameters):
    """
    Brief function description (1-2 sentences).

    Args:
        param1: Parameter description
        param2: Parameter description

    Returns:
        Return type and description

    TODO: Implement [specific task] with [key requirements].

    STEP-BY-STEP IMPLEMENTATION:
    1. [Action verb] [specific task] - [brief explanation]
    2. [Action verb] [specific task] - [brief explanation]
    3. [Action verb] [specific task] - [brief explanation]
    4. [Action verb] [specific task] - [brief explanation]

    EXAMPLE USAGE:
    ```python
    # Realistic example with clear input/output
    input_data = ClassName(example_data)
    result = function_name(input_data, parameters)
    print(result)  # Expected: [specific output]
    ```

    IMPLEMENTATION HINTS:
    - Use [specific function/method] for [specific purpose]
    - Handle [edge case] by [specific approach]
    - Remember to [critical requirement]
    - Common error: [specific mistake to avoid]

    LEARNING CONNECTIONS:
    - This is equivalent to [PyTorch/TensorFlow function]
    - Used in [real-world application/system]
    - Foundation for [advanced concept]
    - Enables [specific capability]
    """
    ### BEGIN SOLUTION
    # Implementation code (typically 3-10 lines)
    # Focus on clarity and correctness
    # Follow the steps outlined above
    ### END SOLUTION
```
## Required Sections
### 1. TODO
- **Purpose**: Clear task description
- **Format**: `TODO: Implement [specific task] with [key requirements].`
- **Example**: `TODO: Implement forward pass for ReLU activation with proper handling of negative values.`
### 2. STEP-BY-STEP IMPLEMENTATION
- **Purpose**: Guide implementation approach
- **Format**: Numbered list with action verbs
- **Guidelines**:
- Start each step with an action verb (Create, Calculate, Apply, Return)
- Include brief explanation after dash
- Keep to 3-5 steps for later modules, 5-7 for early modules
- **Example**:
```
1. Check input dimensions - ensure tensor is valid
2. Apply element-wise maximum - compare with zero
3. Return activated tensor - maintain original shape
```
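
The format rules above lend themselves to a mechanical check. The following is a minimal sketch, not part of this commit: `check_steps`, `STEP_PATTERN`, and `ACTION_VERBS` are hypothetical names, the verb set is an illustrative subset, and each step is assumed to sit on one line in the form `N. Verb task - explanation`.

```python
import re

# Illustrative subset of accepted action verbs (an assumption, not defined by the guide).
ACTION_VERBS = {"Create", "Calculate", "Apply", "Return", "Check",
                "Compute", "Subtract", "Sum", "Divide"}

# One step per line: "<number>. <Verb> <task> - <explanation>"
STEP_PATTERN = re.compile(r"^(\d+)\.\s+(\w+)\b.*\s-\s+\S")


def check_steps(block: str) -> list[str]:
    """Return a list of problems found in a STEP-BY-STEP block (empty if none)."""
    problems = []
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    for expected_number, line in enumerate(lines, start=1):
        match = STEP_PATTERN.match(line)
        if match is None:
            problems.append(f"step {expected_number}: not in 'N. Verb task - explanation' form")
            continue
        number, verb = int(match.group(1)), match.group(2)
        if number != expected_number:
            problems.append(f"step {expected_number}: numbered {number}")
        if verb not in ACTION_VERBS:
            problems.append(f"step {expected_number}: '{verb}' is not a known action verb")
    return problems


steps = """
1. Check input dimensions - ensure tensor is valid
2. Apply element-wise maximum - compare with zero
3. Return activated tensor - maintain original shape
"""
print(check_steps(steps))  # []
```

A fixed verb list is deliberately strict; a real checker might only warn on unknown verbs rather than fail.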
### 3. EXAMPLE USAGE
- **Purpose**: Demonstrate correct usage
- **Format**: Python code block with comments
- **Must Include**:
- Realistic input data
- Function call with proper parameters
- Expected output with comment
- **Example**:
```python
# Create sample input
x = Tensor([[-1, 0, 2], [3, -4, 5]])
relu = ReLU()
output = relu(x)
print(output) # Expected: [[0, 0, 2], [3, 0, 5]]
```
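
The example above presupposes `Tensor` and `ReLU` classes. To show that the stated expected output holds, here is a minimal stand-in; both classes are simplified assumptions made for illustration and are not TinyTorch's actual implementations.

```python
import numpy as np


class Tensor:
    """Hypothetical minimal tensor: a thin wrapper around a NumPy array."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __repr__(self):
        return f"Tensor({self.data.tolist()})"


class ReLU:
    """Hypothetical ReLU layer operating on the stand-in Tensor above."""

    def __call__(self, x: Tensor) -> Tensor:
        # Element-wise max(x, 0); the output keeps the input's shape.
        return Tensor(np.maximum(x.data, 0))


x = Tensor([[-1, 0, 2], [3, -4, 5]])
output = ReLU()(x)
print(output.data.tolist())  # [[0, 0, 2], [3, 0, 5]]
```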
### 4. IMPLEMENTATION HINTS
- **Purpose**: Technical guidance and common pitfalls
- **Format**: Bulleted list
- **Should Include**:
- Specific functions/methods to use
- Edge cases to handle
- Common errors to avoid
- Performance considerations (for later modules)
- **Example**:
```
- Use np.maximum() for element-wise comparison
- Handle None inputs gracefully
- Remember to preserve input shape
- Common error: forgetting to handle batch dimensions
```
### 5. LEARNING CONNECTIONS
- **Purpose**: Connect to real-world ML systems
- **Format**: Bulleted list
- **Should Include**:
- Framework equivalents (PyTorch/TensorFlow)
- Real-world applications
- Connection to other modules
- Why this implementation matters
- **Example**:
```
- This is equivalent to torch.nn.ReLU() in PyTorch
- Used in many modern neural network architectures
- Foundation for understanding gradient flow
- Helps mitigate vanishing gradients when training deep networks
```
## Optional Enhancement Sections
### VISUAL STEP-BY-STEP (Early modules)
- **When to Use**: Complex mathematical operations or data flow
- **Format**: ASCII diagrams or visual explanations
- **Example**:
```
Input: [1, -2, 3, -4, 5]
↓ ReLU
Output: [1, 0, 3, 0, 5]
```
### DEBUGGING HINTS (When helpful)
- **When to Use**: Functions with common implementation errors
- **Format**: Specific debugging strategies
- **Example**:
```
- Print shapes at each step to verify dimensions
- Check for NaN values after operations
- Verify gradient flow in backward pass
```
### MATHEMATICAL FOUNDATION (Math-heavy modules)
- **When to Use**: Complex mathematical operations
- **Format**: LaTeX-style equations with explanations
- **Example**:
```
Softmax formula: softmax(x)_i = exp(x_i) / Σ_j exp(x_j)
```
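
For modules that want an actual LaTeX rendering, the softmax formula can be written as below, together with the shift-invariance identity that justifies subtracting a constant before exponentiating. This is a standard derivation added for illustration; it holds for any constant $c$, and a common choice is $c = \max_j x_j$, which makes every exponent non-positive so that `exp` cannot overflow.

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{e^{-c}\, e^{x_i}}{e^{-c} \sum_j e^{x_j}}
  = \frac{e^{x_i - c}}{\sum_j e^{x_j - c}}
```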
## Module-Specific Guidelines
### Early Modules (01-07): Foundation & Architecture
- More detailed STEP-BY-STEP (5-7 steps)
- Include VISUAL STEP-BY-STEP where helpful
- Focus on educational clarity
- Simpler EXAMPLE USAGE
### Middle Modules (08-11): Training
- Balance detail with conciseness (4-5 steps)
- Include gradient flow considerations
- Real dataset examples
- Performance hints become important
### Later Modules (12-16): Production
- Concise STEP-BY-STEP (3-5 steps)
- Production-focused IMPLEMENTATION HINTS
- Complex, real-world EXAMPLE USAGE
- Strong emphasis on LEARNING CONNECTIONS to industry
## Quality Checklist
Before finalizing any solution block, verify:
- [ ] TODO clearly states the task
- [ ] STEP-BY-STEP has numbered action steps
- [ ] EXAMPLE USAGE has realistic code with expected output
- [ ] IMPLEMENTATION HINTS cover key technical points
- [ ] LEARNING CONNECTIONS link to real ML systems
- [ ] Solution code follows the outlined steps
- [ ] All code is tested and working
- [ ] Docstring has proper Args/Returns sections
## Common Mistakes to Avoid
1. **Inconsistent section names**: Always use exact section headers
2. **Missing expected output**: Every example needs `# Expected:` comment
3. **Vague TODOs**: Be specific about requirements
4. **Untested examples**: All example code must actually work
5. **Missing Learning Connections**: Always connect to real-world ML
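
Several of these mistakes can be detected automatically on a single docstring. The sketch below is a hypothetical helper (`find_mistakes` and `REQUIRED_HEADERS` are illustrative names, not part of this commit). It assumes exact header matching, which is stricter than the substring checks in `check_compliance.py`, and it only looks for `# Expected:` between the EXAMPLE USAGE and IMPLEMENTATION HINTS headers.

```python
REQUIRED_HEADERS = (
    "TODO:",
    "STEP-BY-STEP IMPLEMENTATION:",
    "EXAMPLE USAGE:",
    "IMPLEMENTATION HINTS:",
    "LEARNING CONNECTIONS:",
)


def find_mistakes(docstring: str) -> list[str]:
    """Return human-readable descriptions of style-guide violations."""
    # Mistakes 1 and 5: a required header is absent or spelled differently.
    mistakes = [f"missing section header: {h}" for h in REQUIRED_HEADERS
                if h not in docstring]
    # Mistake 2: an EXAMPLE USAGE section without an '# Expected:' comment.
    if "EXAMPLE USAGE:" in docstring:
        after_header = docstring.split("EXAMPLE USAGE:", 1)[1]
        example = after_header.split("IMPLEMENTATION HINTS:", 1)[0]
        if "# Expected:" not in example:
            mistakes.append("EXAMPLE USAGE has no '# Expected:' comment")
    return mistakes


doc = "TODO: x\nEXAMPLE USAGE:\nprint(1)\nHINTS:\n- use np.maximum()\n"
for mistake in find_mistakes(doc):
    print(mistake)
```

Vague TODOs and untested examples (mistakes 3 and 4) still need human review or actual execution of the example code; string matching cannot catch them.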
## Example: Well-Formatted Solution Block
```python
def softmax(self, x: np.ndarray, axis: int = -1) -> np.ndarray:
    """
    Apply softmax activation function along specified axis.

    Args:
        x: Input array of any shape
        axis: Axis along which to apply softmax (default: -1)

    Returns:
        Array with same shape as input with softmax applied

    TODO: Implement numerically stable softmax with overflow protection.

    STEP-BY-STEP IMPLEMENTATION:
    1. Subtract maximum value - prevent overflow in exponential
    2. Compute exponentials - apply exp() to shifted values
    3. Sum exponentials - calculate normalization factor
    4. Divide by sum - normalize to get probabilities

    EXAMPLE USAGE:
    ```python
    logits = np.array([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
    probs = softmax(logits)
    print(probs.sum(axis=1))  # Expected: [1.0, 1.0]
    print(probs[0])  # Expected: [0.659, 0.242, 0.099] (approx)
    ```

    IMPLEMENTATION HINTS:
    - Use x.max(axis=axis, keepdims=True) for stable computation
    - Apply np.exp() after shifting by maximum
    - Use keepdims=True to maintain broadcasting shape
    - Common error: forgetting to handle arbitrary axis parameter

    LEARNING CONNECTIONS:
    - This is equivalent to torch.nn.functional.softmax() in PyTorch
    - Critical for multi-class classification in final layers
    - Used in attention mechanisms for weight normalization
    - Foundation for cross-entropy loss computation
    """
    ### BEGIN SOLUTION
    x_max = x.max(axis=axis, keepdims=True)
    x_shifted = x - x_max
    exp_x = np.exp(x_shifted)
    sum_exp = exp_x.sum(axis=axis, keepdims=True)
    return exp_x / sum_exp
    ### END SOLUTION
```
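
To confirm the expected values quoted in the docstring, the solution can be exercised as a free function. This is a minimal sketch: `softmax` here drops the `self` parameter for standalone use, and the large-logit input is an added illustration of the overflow protection rather than something taken from the guide.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Standalone copy of the solution body above (no `self`)."""
    x_shifted = x - x.max(axis=axis, keepdims=True)
    exp_x = np.exp(x_shifted)
    return exp_x / exp_x.sum(axis=axis, keepdims=True)


logits = np.array([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
probs = softmax(logits)
print(np.round(probs[0], 3))  # [0.659 0.242 0.099]

# A naive np.exp(x) would overflow on logits this large; the shifted version stays finite.
big = np.array([1000.0, 1001.0, 1002.0])
print(np.isfinite(softmax(big)).all())  # True
```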
## Enforcement
1. All new modules MUST follow this style guide
2. Existing modules should be updated when modified
3. Use this guide for code reviews
4. Include compliance in module testing
---
*Last Updated: [Current Date]*
*Version: 1.0*

88
check_compliance.py Normal file

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""Check NBGrader style guide compliance across all modules."""
import os
import re
from pathlib import Path
def analyze_module_compliance(filepath):
    """Return file-level style-guide compliance flags for one module source file."""
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    # Count solution blocks
    solution_blocks = len(re.findall(r'### BEGIN SOLUTION', content))

    # Check for required sections (anywhere in the file, not per solution block)
    has_todo = 'TODO:' in content
    has_step_by_step = 'STEP-BY-STEP IMPLEMENTATION:' in content
    has_example_usage = 'EXAMPLE USAGE:' in content or 'EXAMPLE:' in content
    has_hints = 'IMPLEMENTATION HINTS:' in content or 'HINTS:' in content
    has_connections = 'LEARNING CONNECTIONS:' in content or 'LEARNING CONNECTION:' in content

    # Check for alternative patterns (older style)
    has_approach = 'APPROACH:' in content
    has_your_code_here = 'YOUR CODE HERE' in content
    has_raise_notimpl = 'raise NotImplementedError' in content

    compliance_score = sum([has_todo, has_step_by_step, has_example_usage, has_hints, has_connections])

    return {
        'solution_blocks': solution_blocks,
        'compliance_score': compliance_score,
        'has_todo': has_todo,
        'has_step_by_step': has_step_by_step,
        'has_example_usage': has_example_usage,
        'has_hints': has_hints,
        'has_connections': has_connections,
        'has_old_patterns': has_approach or has_your_code_here or has_raise_notimpl
    }
# Analyze all modules
modules_dir = Path('modules/source')
results = {}
for module_dir in sorted(modules_dir.iterdir()):
    if module_dir.is_dir() and module_dir.name != 'utils':
        py_files = list(module_dir.glob('*_dev.py'))
        if py_files:
            module_file = py_files[0]
            results[module_dir.name] = analyze_module_compliance(module_file)
# Report results
print('=== NBGrader Style Guide Compliance Report ===\n')
print('Module | Blocks | Score | TODO | STEP | EXAM | HINT | CONN | Old? |')
print('-' * 78)
for module_name in sorted(results.keys()):
    r = results[module_name]
    status_emoji = '✅' if r['compliance_score'] == 5 else '⚠️' if r['compliance_score'] >= 3 else '❌'
    print(f"{module_name:16} | {r['solution_blocks']:6} | {status_emoji} {r['compliance_score']}/5 | "
          f"{'✅' if r['has_todo'] else '❌':^4} | "
          f"{'✅' if r['has_step_by_step'] else '❌':^4} | "
          f"{'✅' if r['has_example_usage'] else '❌':^4} | "
          f"{'✅' if r['has_hints'] else '❌':^4} | "
          f"{'✅' if r['has_connections'] else '❌':^4} | "
          f"{'⚠️' if r['has_old_patterns'] else '':^4} |")
# Summary
fully_compliant = sum(1 for r in results.values() if r['compliance_score'] == 5)
needs_update = sum(1 for r in results.values() if r['compliance_score'] < 5)
has_old_patterns = sum(1 for r in results.values() if r['has_old_patterns'])
print('\n=== Summary ===')
print(f'Fully Compliant: {fully_compliant}/{len(results)}')
print(f'Needs Update: {needs_update}/{len(results)}')
print(f'Has Old Patterns: {has_old_patterns}/{len(results)}')
# List modules needing updates
print('\n=== Modules Needing Updates ===')
for module_name, r in sorted(results.items()):
    if r['compliance_score'] < 5:
        missing = []
        if not r['has_todo']: missing.append('TODO')
        if not r['has_step_by_step']: missing.append('STEP-BY-STEP')
        if not r['has_example_usage']: missing.append('EXAMPLE USAGE')
        if not r['has_hints']: missing.append('HINTS')
        if not r['has_connections']: missing.append('CONNECTIONS')
        print(f"{module_name}: Missing {', '.join(missing)}")