mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-27 15:17:31 -05:00
- Updated pedagogical principles with refined engagement patterns: - Build → Use → Reflect (design & systems thinking) - Build → Use → Analyze (technical depth & debugging) - Build → Use → Optimize (systems iteration & performance) - Added pattern selection guide for module developers - Updated development workflow to choose pattern first - Created specific module assignments for each pattern - Enhanced quick reference with pattern-specific activities This evolution moves beyond passive 'understanding' to active, specific engagement that matches professional ML engineering skills.
318 lines
10 KiB
Markdown
318 lines
10 KiB
Markdown
# 📖 TinyTorch Module Development Guide
|
|
|
|
**Complete methodology for creating educational modules with real-world ML engineering practices.**
|
|
|
|
## 🎯 Philosophy
|
|
|
|
**"Build → Use → Understand → Repeat"** with real data and immediate feedback.
|
|
|
|
Create complete, working implementations that automatically generate student exercise versions while maintaining production-quality exports.
|
|
|
|
## 🔑 Core Principles
|
|
|
|
### **Real Data, Real Systems**
|
|
- **Use production datasets**: No mock/fake data - students work with CIFAR-10, not synthetic data
|
|
- **Show progress feedback**: Downloads, training need visual progress indicators
|
|
- **Cache for efficiency**: Download once, use repeatedly
|
|
- **Real-world scale**: Use actual dataset sizes, not toy examples
|
|
|
|
### **Immediate Visual Feedback**
|
|
- **Visual confirmation**: Students see their code working (images, plots, results)
|
|
- **Development vs. Export separation**: Rich feedback in `_dev.py`, clean exports to package
|
|
- **Progress indicators**: Status messages, progress bars for long operations
|
|
- **Real-time validation**: Students can verify each step immediately
|
|
|
|
### **Educational Excellence**
|
|
- **Progressive complexity**: Easy → Medium → Hard with clear difficulty indicators
|
|
- **Comprehensive guidance**: TODO sections with approach, examples, hints, systems thinking
|
|
- **Real-world connections**: Connect every concept to production ML engineering
|
|
- **Immediate testing**: Test each component with real inputs as you build
|
|
|
|
## 🏗️ Development Workflow
|
|
|
|
### Step 1: Choose the Learning Pattern
|
|
- **Select engagement pattern**: Reflect, Analyze, or Optimize?
|
|
- **Use the Pattern Selection Guide** from [Pedagogical Principles](../pedagogy/pedagogical-principles.md):
|
|
- **Build → Use → Reflect**: Early modules, design decisions, systems thinking
|
|
- **Build → Use → Analyze**: Middle modules, technical depth, performance
|
|
- **Build → Use → Optimize**: Advanced modules, iteration, production focus
|
|
- **Document your choice** with clear rationale
|
|
|
|
### Step 2: Plan the Learning Journey
|
|
- **Define learning objectives**: What should students implement vs. receive?
|
|
- **Choose real data**: What production dataset will they use?
|
|
- **Design progression**: How does complexity build through the module?
|
|
- **Map to production**: How does this connect to real ML systems?
|
|
- **Design pattern-specific activities**: Questions, exercises, or challenges
|
|
|
|
### Step 3: Write Complete Implementation
|
|
Create `modules/{module}/{module}_dev.py` with NBDev structure:
|
|
|
|
```python
|
|
# ---
|
|
# jupyter:
|
|
# jupytext:
|
|
# text_representation:
|
|
# extension: .py
|
|
# format_name: percent
|
|
# format_version: '1.3'
|
|
# jupytext_version: 1.17.1
|
|
# ---
|
|
|
|
# %% [markdown]
|
|
"""
|
|
# Module: {Title} - {Purpose}
|
|
|
|
## 🎯 Learning Pattern: Build → Use → [Pattern]
|
|
|
|
**Pattern Choice**: [Reflect/Analyze/Optimize]
|
|
**Rationale**: [Why this pattern fits the learning objectives]
|
|
|
|
**Key Activities**:
|
|
- [Pattern-specific activity 1]
|
|
- [Pattern-specific activity 2]
|
|
- [Pattern-specific activity 3]
|
|
|
|
## Learning Objectives
|
|
- ✅ Build {core_concept} from scratch
|
|
- ✅ Use it with real data ({dataset_name})
|
|
- ✅ [Engage] through {pattern_specific_activities}
|
|
- ✅ Connect to production ML systems
|
|
|
|
## What You'll Build
|
|
{description_of_what_students_build}
|
|
"""
|
|
|
|
# %%
|
|
#| default_exp core.{module}
|
|
import numpy as np
|
|
import matplotlib.pyplot as plt
|
|
from typing import Union, List, Optional
|
|
|
|
# %%
|
|
#| export
|
|
class MainClass:
|
|
"""
|
|
{Description of the class}
|
|
|
|
TODO: {What students need to implement}
|
|
|
|
APPROACH:
|
|
1. {Step 1 with specific guidance}
|
|
2. {Step 2 with specific guidance}
|
|
3. {Step 3 with specific guidance}
|
|
|
|
EXAMPLE:
|
|
Input: {concrete_example}
|
|
Expected: {expected_output}
|
|
|
|
HINTS:
|
|
- {Helpful hint about approach}
|
|
- {Systems thinking hint}
|
|
- {Real-world connection}
|
|
"""
|
|
def __init__(self, params):
|
|
raise NotImplementedError("Student implementation required")
|
|
|
|
# %%
|
|
#| hide
|
|
#| export
|
|
class MainClass:
|
|
"""Complete implementation (hidden from students)."""
|
|
def __init__(self, params):
|
|
# Actual working implementation
|
|
pass
|
|
|
|
# %% [markdown]
|
|
"""
|
|
## 🧪 Test Your Implementation
|
|
"""
|
|
|
|
# %%
|
|
# Test with real data
|
|
try:
|
|
# Test student implementation
|
|
result = MainClass(real_data_example)
|
|
print(f"✅ Success: {result}")
|
|
except NotImplementedError:
|
|
print("⚠️ Implement the class above first!")
|
|
|
|
# Visual feedback (development only - not exported)
|
|
def show_results(data):
|
|
"""Show visual confirmation of working code."""
|
|
plt.figure(figsize=(10, 6))
|
|
# Visualization code
|
|
plt.show()
|
|
|
|
if _should_show_plots():
|
|
show_results(real_data)
|
|
```
|
|
|
|
### Step 4: Create Tests with Real Data
|
|
Create `modules/{module}/tests/test_{module}.py`:
|
|
|
|
```python
|
|
import pytest
|
|
import numpy as np
|
|
from {module}_dev import MainClass
|
|
|
|
def test_with_real_data():
|
|
"""Test with actual production data."""
|
|
# Use real datasets, not mocks
|
|
real_data = load_real_dataset()
|
|
|
|
instance = MainClass(real_data)
|
|
result = instance.process()
|
|
|
|
# Test real properties
|
|
assert result.shape == expected_real_shape
|
|
assert result.dtype == expected_real_dtype
|
|
# Test with actual data characteristics
|
|
```
|
|
|
|
### Step 5: Convert and Export
|
|
```bash
|
|
# Convert to notebook
|
|
python bin/tito.py notebooks --module {module}
|
|
|
|
# Export to package
|
|
python bin/tito.py sync --module {module}
|
|
|
|
# Test everything
|
|
python bin/tito.py test --module {module}
|
|
```
|
|
|
|
## 🏷️ NBDev Directives
|
|
|
|
### Core Directives
|
|
- `#| default_exp core.{module}` - Sets export destination
|
|
- `#| export` - Marks code for export to package
|
|
- `#| hide` + `#| export` - Hidden implementation (instructor solution)
|
|
- `# %% [markdown]` - Markdown cells for explanations
|
|
- `# %%` - Code cells
|
|
|
|
### Educational Structure
|
|
- **Concept explanation** → **Implementation guidance** → **Hidden solution** → **Testing** → **Visual feedback**
|
|
|
|
## 🎨 Difficulty System
|
|
|
|
- **🟢 Easy (5-10 min)**: Constructor, properties, basic operations
|
|
- **🟡 Medium (10-20 min)**: Conditional logic, data processing, error handling
|
|
- **🔴 Hard (20+ min)**: Complex algorithms, system integration, optimization
|
|
|
|
## 📋 Implementation Guidelines
|
|
|
|
### Students Implement (Core Learning)
|
|
- **Main functionality**: Core algorithms and data structures
|
|
- **Data processing**: Loading, preprocessing, batching
|
|
- **Error handling**: Input validation, type checking
|
|
- **Basic operations**: Mathematical operations, transformations
|
|
|
|
### Students Receive (Focus on Learning Goals)
|
|
- **Complex setup**: Download progress bars, caching systems
|
|
- **Utility functions**: Visualization, debugging helpers
|
|
- **Advanced features**: Optimization, GPU support
|
|
- **Infrastructure**: Test frameworks, import management
|
|
|
|
### TODO Guidance Quality
|
|
```python
|
|
"""
|
|
TODO: {Clear, specific task}
|
|
|
|
APPROACH:
|
|
1. {Concrete first step}
|
|
2. {Concrete second step}
|
|
3. {Concrete third step}
|
|
|
|
EXAMPLE:
|
|
Input: {actual_data_example}
|
|
Expected: {concrete_expected_output}
|
|
|
|
HINTS:
|
|
- {Helpful guidance without giving code}
|
|
- {Systems thinking consideration}
|
|
- {Real-world connection}
|
|
|
|
SYSTEMS THINKING:
|
|
- {Performance consideration}
|
|
- {Scalability question}
|
|
- {User experience aspect}
|
|
"""
|
|
```
|
|
|
|
## 🗂️ Module Structure
|
|
|
|
```
|
|
modules/{module}/
|
|
├── {module}_dev.py # 🔧 Complete implementation
|
|
├── {module}_dev.ipynb # 📓 Generated notebook
|
|
├── tests/
|
|
│ └── test_{module}.py # 🧪 Real data tests
|
|
├── README.md # 📖 Module guide
|
|
└── data/ # 📊 Cached datasets (if needed)
|
|
```
|
|
|
|
## ✅ Quality Standards
|
|
|
|
### Before Release
|
|
- [ ] Uses real data, not synthetic/mock data
|
|
- [ ] Includes progress feedback for long operations
|
|
- [ ] Visual feedback functions (development only)
|
|
- [ ] Tests use actual datasets at realistic scales
|
|
- [ ] TODO guidance includes systems thinking
|
|
- [ ] Clean separation between development and exports
|
|
- [ ] Follows "Build → Use → Understand" progression
|
|
|
|
### Integration Requirements
|
|
- [ ] Exports correctly to `tinytorch.core.{module}`
|
|
- [ ] No circular dependencies
|
|
- [ ] Consistent with existing module patterns
|
|
- [ ] Compatible with TinyTorch CLI tools
|
|
|
|
## 💡 Best Practices
|
|
|
|
### Development Process
|
|
1. **Start with real data**: Choose production dataset first
|
|
2. **Write complete implementation**: Get it working before adding markers
|
|
3. **Add rich feedback**: Visual confirmation, progress indicators
|
|
4. **Test the student path**: Follow your own TODO guidance
|
|
5. **Optimize user experience**: Consider performance, caching, error messages
|
|
|
|
### Systems Thinking
|
|
- **Performance**: How does this scale with larger datasets?
|
|
- **Caching**: How do we avoid repeated expensive operations?
|
|
- **User Experience**: How do students know the code is working?
|
|
- **Production Relevance**: How does this connect to real ML systems?
|
|
|
|
### Educational Design
|
|
- **Immediate gratification**: Students see results quickly
|
|
- **Progressive complexity**: Build understanding step by step
|
|
- **Real-world connections**: Connect every concept to production
|
|
- **Visual confirmation**: Students see their code working
|
|
|
|
## 🔄 Continuous Improvement
|
|
|
|
After teaching with a module:
|
|
1. **Monitor student experience**: Where do they get stuck?
|
|
2. **Improve guidance**: Better TODO instructions, clearer hints
|
|
3. **Enhance feedback**: More visual confirmation, better progress indicators
|
|
4. **Optimize performance**: Faster data loading, better caching
|
|
5. **Update documentation**: Share learnings with other developers
|
|
|
|
## 🎯 Success Metrics
|
|
|
|
**Students should be able to:**
|
|
- Explain what they built in simple terms
|
|
- Modify code to solve related problems
|
|
- Connect module concepts to real ML systems
|
|
- Debug issues by understanding the system
|
|
|
|
**Modules should achieve:**
|
|
- High student engagement and completion rates
|
|
- Smooth progression to next modules
|
|
- Real-world relevance and production quality
|
|
- Consistent patterns across the curriculum
|
|
|
|
---
|
|
|
|
**Remember**: We're teaching ML systems engineering, not just algorithms. Every module should reflect real-world practices and challenges while maintaining the "Build → Use → Understand" educational cycle. |