Add community and benchmark features with baseline validation
- Implement tito benchmark baseline and capstone commands - Add SPEC-style normalization for baseline benchmarks - Implement tito community join, update, leave, stats, profile commands - Use project-local storage (.tinytorch/) for user data - Add privacy-by-design with explicit consent prompts - Update site documentation for community and benchmark features - Add Marimo integration for online notebooks - Clean up redundant milestone setup exploration docs - Finalize baseline design: fast setup validation (~1 second) with normalized results
1
.gitattributes
vendored
@@ -7,4 +7,3 @@ tinytorch/core/*.py -diff
|
||||
*.zip filter=lfs diff=lfs merge=lfs -text
|
||||
*.pkl filter=lfs diff=lfs merge=lfs -text
|
||||
*.bin filter=lfs diff=lfs merge=lfs -text
|
||||
*.gif filter=lfs diff=lfs merge=lfs -text
|
||||
|
||||
5
.gitignore
vendored
@@ -87,10 +87,13 @@ site/.venv/
|
||||
book/_build/
|
||||
site/_build/
|
||||
|
||||
# NBGrader
|
||||
# NBGrader - assignments are dynamically generated via 'tito nbgrader generate'
|
||||
# Only ignore student submissions and grading outputs, not source/release (for now)
|
||||
assignments/autograded/
|
||||
assignments/feedback/
|
||||
assignments/submitted/
|
||||
# Note: assignments/source/ and assignments/release/ are kept in git for now
|
||||
# but should be regenerated with 'tito nbgrader generate' when modules change
|
||||
|
||||
# Logs
|
||||
*.log
|
||||
|
||||
578
INSTRUCTOR.md
Normal file
@@ -0,0 +1,578 @@
|
||||
# 👩🏫 TinyTorch Instructor Guide
|
||||
|
||||
Complete guide for teaching ML Systems Engineering with TinyTorch.
|
||||
|
||||
## 🎯 Course Overview
|
||||
|
||||
TinyTorch teaches ML systems engineering through building, not just using. Students construct a complete ML framework from tensors to transformers, understanding memory, performance, and scaling at each step.
|
||||
|
||||
## 🛠️ Instructor Setup
|
||||
|
||||
### **1. Initial Setup**
|
||||
```bash
|
||||
# Clone and setup
|
||||
git clone https://github.com/MLSysBook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
|
||||
# Virtual environment (MANDATORY)
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate
|
||||
|
||||
# Install with instructor tools
|
||||
pip install -r requirements.txt
|
||||
pip install nbgrader
|
||||
|
||||
# Setup grading infrastructure
|
||||
tito grade setup
|
||||
```
|
||||
|
||||
### **2. Verify Installation**
|
||||
```bash
|
||||
tito system doctor
|
||||
# Should show all green checkmarks
|
||||
|
||||
tito grade
|
||||
# Should show available grade commands
|
||||
```
|
||||
|
||||
## 📝 Assignment Workflow
|
||||
|
||||
### **Simplified with Tito CLI**
|
||||
We've wrapped NBGrader behind simple `tito grade` commands so you don't need to learn NBGrader's complex interface.
|
||||
|
||||
### **1. Prepare Assignments**
|
||||
```bash
|
||||
# Generate instructor version (with solutions)
|
||||
tito grade generate 01_tensor
|
||||
|
||||
# Create student version (solutions removed)
|
||||
tito grade release 01_tensor
|
||||
|
||||
# Student version will be in: release/tinytorch/01_tensor/
|
||||
```
|
||||
|
||||
### **2. Distribute to Students**
|
||||
```bash
|
||||
# Option A: GitHub Classroom (recommended)
|
||||
# 1. Create assignment repository from TinyTorch
|
||||
# 2. Remove solutions from modules
|
||||
# 3. Students clone and work
|
||||
|
||||
# Option B: Direct distribution
|
||||
# Share the release/ directory contents
|
||||
```
|
||||
|
||||
### **3. Collect Submissions**
|
||||
```bash
|
||||
# Collect all students
|
||||
tito grade collect 01_tensor
|
||||
|
||||
# Or specific student
|
||||
tito grade collect 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **4. Auto-Grade**
|
||||
```bash
|
||||
# Grade all submissions
|
||||
tito grade autograde 01_tensor
|
||||
|
||||
# Grade specific student
|
||||
tito grade autograde 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **5. Manual Review**
|
||||
```bash
|
||||
# Open grading interface (browser-based)
|
||||
tito grade manual 01_tensor
|
||||
|
||||
# This launches a web interface for:
|
||||
# - Reviewing ML Systems question responses
|
||||
# - Adding feedback comments
|
||||
# - Adjusting auto-grades
|
||||
```
|
||||
|
||||
### **6. Generate Feedback**
|
||||
```bash
|
||||
# Create feedback files for students
|
||||
tito grade feedback 01_tensor
|
||||
```
|
||||
|
||||
### **7. Export Grades**
|
||||
```bash
|
||||
# Export all grades to CSV
|
||||
tito grade export
|
||||
|
||||
# Or specific module
|
||||
tito grade export --module 01_tensor --output grades_module01.csv
|
||||
```
|
||||
|
||||
## 📊 Grading Components
|
||||
|
||||
### **Auto-Graded (70%)**
|
||||
- Code implementation correctness
|
||||
- Test passing
|
||||
- Function signatures
|
||||
- Output validation
|
||||
|
||||
### **Manually Graded (30%)**
|
||||
- ML Systems Thinking questions (3 per module)
|
||||
- Each question: 10 points
|
||||
- Focus on understanding, not perfection
|
||||
|
||||
### **Grading Rubric for ML Systems Questions**
|
||||
|
||||
| Points | Criteria |
|
||||
|--------|----------|
|
||||
| 9-10 | Demonstrates deep understanding, references specific code, discusses systems implications |
|
||||
| 7-8 | Good understanding, some code references, basic systems thinking |
|
||||
| 5-6 | Surface understanding, generic response, limited systems perspective |
|
||||
| 3-4 | Attempted but misses key concepts |
|
||||
| 0-2 | No attempt or completely off-topic |
|
||||
|
||||
**What to Look For:**
|
||||
- References to actual implemented code
|
||||
- Memory/performance analysis
|
||||
- Scaling considerations
|
||||
- Production system comparisons
|
||||
- Understanding of trade-offs
|
||||
|
||||
## 📋 Sample Solutions for Grading Calibration
|
||||
|
||||
This section provides sample solutions to help calibrate grading standards. Use these as reference points when evaluating student submissions.
|
||||
|
||||
### Module 01: Tensor - Memory Footprint
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
"""Calculate tensor memory in bytes."""
|
||||
return self.data.nbytes
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Concise and correct
|
||||
- Uses NumPy's built-in `nbytes` property
|
||||
- Clear docstring
|
||||
- Handles all tensor shapes correctly
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
"""Calculate memory usage."""
|
||||
return np.prod(self.data.shape) * self.data.dtype.itemsize
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Manually calculates (shows understanding)
|
||||
- Works but less efficient than using `nbytes`
|
||||
- Minor: docstring could be more specific
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
size = 1
|
||||
for dim in self.data.shape:
|
||||
size *= dim
|
||||
return size * 4 # Assumes float32
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Correct logic but hardcoded dtype size
|
||||
- Works for float32 but fails for other dtypes
|
||||
- Shows understanding of memory calculation
|
||||
- Missing proper dtype handling
|
||||
|
||||
### Module 05: Autograd - Backward Pass
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def backward(self, gradient=None):
|
||||
"""Backward pass through computational graph."""
|
||||
if gradient is None:
|
||||
gradient = np.ones_like(self.data)
|
||||
|
||||
self.grad = gradient
|
||||
|
||||
if self.grad_fn is not None:
|
||||
# Compute gradients for inputs
|
||||
input_grads = self.grad_fn.backward(gradient)
|
||||
|
||||
# Propagate to input tensors
|
||||
if isinstance(input_grads, tuple):
|
||||
for input_tensor, input_grad in zip(self.grad_fn.inputs, input_grads):
|
||||
if input_tensor.requires_grad:
|
||||
input_tensor.backward(input_grad)
|
||||
else:
|
||||
if self.grad_fn.inputs[0].requires_grad:
|
||||
self.grad_fn.inputs[0].backward(input_grads)
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Handles both scalar and tensor gradients
|
||||
- Properly checks `requires_grad` before propagating
|
||||
- Handles tuple returns from grad_fn
|
||||
- Clear variable names and structure
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def backward(self, gradient=None):
|
||||
if gradient is None:
|
||||
gradient = np.ones_like(self.data)
|
||||
self.grad = gradient
|
||||
if self.grad_fn:
|
||||
grads = self.grad_fn.backward(gradient)
|
||||
for inp, grad in zip(self.grad_fn.inputs, grads):
|
||||
inp.backward(grad)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct logic
|
||||
- Missing `requires_grad` check (minor issue)
|
||||
- Assumes grads is always iterable (may fail for single input)
|
||||
- Works for most cases but less robust
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def backward(self, grad):
|
||||
self.grad = grad
|
||||
if self.grad_fn:
|
||||
self.grad_fn.inputs[0].backward(self.grad_fn.backward(grad))
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic backward pass works
|
||||
- Only handles single input (fails for multi-input operations)
|
||||
- Missing None gradient handling
|
||||
- Shows understanding but incomplete
|
||||
|
||||
### Module 09: Spatial - Convolution Implementation
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
"""Forward pass with explicit loops for clarity."""
|
||||
batch_size, in_channels, height, width = x.shape
|
||||
out_height = (height - self.kernel_size + 2 * self.padding) // self.stride + 1
|
||||
out_width = (width - self.kernel_size + 2 * self.padding) // self.stride + 1
|
||||
|
||||
output = np.zeros((batch_size, self.out_channels, out_height, out_width))
|
||||
|
||||
# Apply padding
|
||||
if self.padding > 0:
|
||||
x = np.pad(x, ((0, 0), (0, 0), (self.padding, self.padding),
|
||||
(self.padding, self.padding)), mode='constant')
|
||||
|
||||
# Explicit convolution loops
|
||||
for b in range(batch_size):
|
||||
for oc in range(self.out_channels):
|
||||
for oh in range(out_height):
|
||||
for ow in range(out_width):
|
||||
h_start = oh * self.stride
|
||||
w_start = ow * self.stride
|
||||
h_end = h_start + self.kernel_size
|
||||
w_end = w_start + self.kernel_size
|
||||
|
||||
window = x[b, :, h_start:h_end, w_start:w_end]
|
||||
output[b, oc, oh, ow] = np.sum(
|
||||
window * self.weight[oc] + self.bias[oc]
|
||||
)
|
||||
|
||||
return Tensor(output, requires_grad=x.requires_grad)
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Clear output shape calculation
|
||||
- Proper padding handling
|
||||
- Explicit loops make O(kernel_size²) complexity visible
|
||||
- Correct gradient tracking setup
|
||||
- Well-structured and readable
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
B, C, H, W = x.shape
|
||||
out_h = (H - self.kernel_size) // self.stride + 1
|
||||
out_w = (W - self.kernel_size) // self.stride + 1
|
||||
out = np.zeros((B, self.out_channels, out_h, out_w))
|
||||
|
||||
for b in range(B):
|
||||
for oc in range(self.out_channels):
|
||||
for i in range(out_h):
|
||||
for j in range(out_w):
|
||||
h = i * self.stride
|
||||
w = j * self.stride
|
||||
out[b, oc, i, j] = np.sum(
|
||||
x[b, :, h:h+self.kernel_size, w:w+self.kernel_size]
|
||||
* self.weight[oc]
|
||||
) + self.bias[oc]
|
||||
return Tensor(out)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Missing padding support (works only for padding=0)
|
||||
- Less clear variable names
|
||||
- Missing requires_grad propagation
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
out = np.zeros((x.shape[0], self.out_channels, x.shape[2]-2, x.shape[3]-2))
|
||||
for b in range(x.shape[0]):
|
||||
for c in range(self.out_channels):
|
||||
for i in range(out.shape[2]):
|
||||
for j in range(out.shape[3]):
|
||||
out[b, c, i, j] = np.sum(x[b, :, i:i+3, j:j+3] * self.weight[c])
|
||||
return Tensor(out)
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic convolution works
|
||||
- Hardcoded kernel_size=3 (not general)
|
||||
- No stride or padding support
|
||||
- Shows understanding but incomplete
|
||||
|
||||
### Module 12: Attention - Scaled Dot-Product Attention
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def forward(self, query, key, value, mask=None):
|
||||
"""Scaled dot-product attention with numerical stability."""
|
||||
# Compute attention scores
|
||||
scores = np.dot(query, key.T) / np.sqrt(self.d_k)
|
||||
|
||||
# Apply mask if provided
|
||||
if mask is not None:
|
||||
scores = np.where(mask, scores, -1e9)
|
||||
|
||||
# Softmax with numerical stability
|
||||
exp_scores = np.exp(scores - np.max(scores, axis=-1, keepdims=True))
|
||||
attention_weights = exp_scores / np.sum(exp_scores, axis=-1, keepdims=True)
|
||||
|
||||
# Apply attention to values
|
||||
output = np.dot(attention_weights, value)
|
||||
|
||||
return output, attention_weights
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Proper scaling factor (1/√d_k)
|
||||
- Numerical stability with max subtraction
|
||||
- Mask handling
|
||||
- Returns both output and attention weights
|
||||
- Clear and well-documented
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def forward(self, q, k, v):
|
||||
scores = np.dot(q, k.T) / np.sqrt(q.shape[-1])
|
||||
weights = np.exp(scores) / np.sum(np.exp(scores), axis=-1, keepdims=True)
|
||||
return np.dot(weights, v)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Missing numerical stability (may overflow)
|
||||
- Missing mask support
|
||||
- Works but less robust
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def forward(self, q, k, v):
|
||||
scores = np.dot(q, k.T)
|
||||
weights = np.exp(scores) / np.sum(np.exp(scores))
|
||||
return np.dot(weights, v)
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic attention mechanism
|
||||
- Missing scaling factor
|
||||
- Missing numerical stability
|
||||
- Incorrect softmax (should be per-row)
|
||||
|
||||
### Grading Guidelines Using Sample Solutions
|
||||
|
||||
**When Evaluating Student Code**:
|
||||
|
||||
1. **Correctness First**: Does it pass all tests?
|
||||
- If no: Maximum 6 points (even if well-written)
|
||||
- If yes: Proceed to quality evaluation
|
||||
|
||||
2. **Code Quality**:
|
||||
- **Excellent (9-10)**: Production-ready, handles edge cases, well-documented
|
||||
- **Good (7-8)**: Correct and functional, minor improvements possible
|
||||
- **Acceptable (5-6)**: Works but incomplete or has issues
|
||||
|
||||
3. **Systems Thinking**:
|
||||
- **Excellent**: Discusses memory, performance, scaling implications
|
||||
- **Good**: Some systems awareness
|
||||
- **Acceptable**: Focuses only on correctness
|
||||
|
||||
4. **Common Patterns**:
|
||||
- Look for: Proper error handling, edge case consideration, documentation
|
||||
- Red flags: Hardcoded values, missing checks, unclear variable names
|
||||
|
||||
**Remember**: These are calibration examples. Adjust based on your course level and learning objectives. The goal is consistent evaluation, not perfection.
|
||||
|
||||
## 📚 Module Teaching Notes
|
||||
|
||||
### **Module 01: Tensor**
|
||||
- **Focus**: Memory layout, data structures
|
||||
- **Key Concept**: Understanding memory is crucial for ML performance
|
||||
- **Demo**: Show memory profiling, copying behavior
|
||||
|
||||
### **Module 02: Activations**
|
||||
- **Focus**: Vectorization, numerical stability
|
||||
- **Key Concept**: Small details matter at scale
|
||||
- **Demo**: Gradient vanishing/exploding
|
||||
|
||||
### **Module 04-05: Layers & Networks**
|
||||
- **Focus**: Composition, parameter management
|
||||
- **Key Concept**: Building blocks combine into complex systems
|
||||
- **Project**: Build a small CNN
|
||||
|
||||
### **Module 06-07: Spatial & Attention**
|
||||
- **Focus**: Algorithmic complexity, memory patterns
|
||||
- **Key Concept**: O(N²) operations become bottlenecks
|
||||
- **Demo**: Profile attention memory usage
|
||||
|
||||
### **Module 08-11: Training Pipeline**
|
||||
- **Focus**: End-to-end system integration
|
||||
- **Key Concept**: Many components must work together
|
||||
- **Project**: Train a real model
|
||||
|
||||
### **Module 12-15: Production**
|
||||
- **Focus**: Deployment, optimization, monitoring
|
||||
- **Key Concept**: Academic vs production requirements
|
||||
- **Demo**: Model compression, deployment
|
||||
|
||||
### **Module 16: TinyGPT**
|
||||
- **Focus**: Framework generalization
|
||||
- **Key Concept**: 70% component reuse from vision to language
|
||||
- **Capstone**: Build a working language model
|
||||
|
||||
## 🎯 Learning Objectives
|
||||
|
||||
By course end, students should be able to:
|
||||
|
||||
1. **Build** complete ML systems from scratch
|
||||
2. **Analyze** memory usage and computational complexity
|
||||
3. **Debug** performance bottlenecks
|
||||
4. **Optimize** for production deployment
|
||||
5. **Understand** framework design decisions
|
||||
6. **Apply** systems thinking to ML problems
|
||||
|
||||
## 📈 Tracking Progress
|
||||
|
||||
### **Individual Progress**
|
||||
```bash
|
||||
# Check specific student progress
|
||||
tito checkpoint status --student student_id
|
||||
```
|
||||
|
||||
### **Class Overview**
|
||||
```bash
|
||||
# Export all checkpoint achievements
|
||||
tito checkpoint export --output class_progress.csv
|
||||
```
|
||||
|
||||
### **Identify Struggling Students**
|
||||
Look for:
|
||||
- Missing checkpoint achievements
|
||||
- Low scores on ML Systems questions
|
||||
- Incomplete module submissions
|
||||
|
||||
## 💡 Teaching Tips
|
||||
|
||||
### **1. Emphasize Building Over Theory**
|
||||
- Have students type every line of code
|
||||
- Run tests immediately after implementation
|
||||
- Break and fix things intentionally
|
||||
|
||||
### **2. Connect to Production Systems**
|
||||
- Show PyTorch/TensorFlow equivalents
|
||||
- Discuss real-world bottlenecks
|
||||
- Share production war stories
|
||||
|
||||
### **3. Make Performance Visible**
|
||||
```python
|
||||
# Use profilers liberally
|
||||
with TimeProfiler("operation"):
|
||||
result = expensive_operation()
|
||||
|
||||
# Show memory usage
|
||||
print(f"Memory: {get_memory_usage():.2f} MB")
|
||||
```
|
||||
|
||||
### **4. Encourage Systems Questions**
|
||||
- "What would break at 1B parameters?"
|
||||
- "How would you distributed this?"
|
||||
- "What's the bottleneck here?"
|
||||
|
||||
## 🔧 Troubleshooting
|
||||
|
||||
### **Common Student Issues**
|
||||
|
||||
**Environment Problems**
|
||||
```bash
|
||||
# Student fix:
|
||||
tito system doctor
|
||||
tito system reset
|
||||
```
|
||||
|
||||
**Module Import Errors**
|
||||
```bash
|
||||
# Rebuild package
|
||||
tito export --all
|
||||
```
|
||||
|
||||
**Test Failures**
|
||||
```bash
|
||||
# Detailed test output
|
||||
tito module test MODULE --verbose
|
||||
```
|
||||
|
||||
### **NBGrader Issues**
|
||||
|
||||
**Database Locked**
|
||||
```bash
|
||||
# Clear NBGrader database
|
||||
rm gradebook.db
|
||||
tito grade setup
|
||||
```
|
||||
|
||||
**Missing Submissions**
|
||||
```bash
|
||||
# Check submission directory
|
||||
ls submitted/*/MODULE/
|
||||
```
|
||||
|
||||
## 📊 Sample Schedule (16 Weeks)
|
||||
|
||||
| Week | Module | Focus |
|
||||
|------|--------|-------|
|
||||
| 1 | 01 Tensor | Data Structures, Memory |
|
||||
| 2 | 02 Activations | Non-linearity Functions |
|
||||
| 3 | 03 Layers | Neural Network Components |
|
||||
| 4 | 04 Losses | Optimization Objectives |
|
||||
| 5 | 05 Autograd | Automatic Differentiation |
|
||||
| 6 | 06 Optimizers | Training Algorithms |
|
||||
| 7 | 07 Training | Complete Training Loop |
|
||||
| 8 | Midterm Project | Build and Train Network |
|
||||
| 9 | 08 DataLoader | Data Pipeline |
|
||||
| 10 | 09 Spatial | Convolutions, CNNs |
|
||||
| 11 | 10 Tokenization | Text Processing |
|
||||
| 12 | 11 Embeddings | Word Representations |
|
||||
| 13 | 12 Attention | Attention Mechanisms |
|
||||
| 14 | 13 Transformers | Transformer Architecture |
|
||||
| 15 | 14-19 Optimization | Profiling, Quantization, etc. |
|
||||
| 16 | 20 Capstone | Torch Olympics Competition |
|
||||
|
||||
## 🎓 Assessment Strategy
|
||||
|
||||
### **Continuous Assessment (70%)**
|
||||
- Module completion: 4% each × 16 = 64%
|
||||
- Checkpoint achievements: 6%
|
||||
|
||||
### **Projects (30%)**
|
||||
- Midterm: Build and train CNN (15%)
|
||||
- Final: Extend TinyGPT (15%)
|
||||
|
||||
## 📚 Additional Resources
|
||||
|
||||
- [MLSys Book](https://mlsysbook.ai) - Companion textbook
|
||||
- [Course Discussions](https://github.com/MLSysBook/TinyTorch/discussions)
|
||||
- [Issue Tracker](https://github.com/MLSysBook/TinyTorch/issues)
|
||||
|
||||
---
|
||||
|
||||
**Need help? Open an issue or contact the TinyTorch team!**
|
||||
@@ -1,850 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "699bd495",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"# Setup - TinyTorch System Configuration\n",
|
||||
"\n",
|
||||
"Welcome to TinyTorch! This setup module configures your personal TinyTorch installation and teaches you the NBGrader workflow.\n",
|
||||
"\n",
|
||||
"## Learning Goals\n",
|
||||
"- Configure your personal TinyTorch installation with custom information\n",
|
||||
"- Learn to query system information using Python modules\n",
|
||||
"- Master the NBGrader workflow: implement → test → export\n",
|
||||
"- Create functions that become part of your tinytorch package\n",
|
||||
"- Understand solution blocks, hidden tests, and automated grading\n",
|
||||
"\n",
|
||||
"## The Big Picture: Why Configuration Matters in ML Systems\n",
|
||||
"Configuration is the foundation of any production ML system. In this module, you'll learn:\n",
|
||||
"\n",
|
||||
"### 1. **System Awareness**\n",
|
||||
"Real ML systems need to understand their environment:\n",
|
||||
"- **Hardware constraints**: Memory, CPU cores, GPU availability\n",
|
||||
"- **Software dependencies**: Python version, library compatibility\n",
|
||||
"- **Platform differences**: Linux servers, macOS development, Windows deployment\n",
|
||||
"\n",
|
||||
"### 2. **Reproducibility**\n",
|
||||
"Configuration enables reproducible ML:\n",
|
||||
"- **Environment documentation**: Exactly what system was used\n",
|
||||
"- **Dependency management**: Precise versions and requirements\n",
|
||||
"- **Debugging support**: System info helps troubleshoot issues\n",
|
||||
"\n",
|
||||
"### 3. **Professional Development**\n",
|
||||
"Proper configuration shows engineering maturity:\n",
|
||||
"- **Attribution**: Your work is properly credited\n",
|
||||
"- **Collaboration**: Others can understand and extend your setup\n",
|
||||
"- **Maintenance**: Systems can be updated and maintained\n",
|
||||
"\n",
|
||||
"### 4. **ML Systems Context**\n",
|
||||
"This connects to broader ML engineering:\n",
|
||||
"- **Model deployment**: Different environments need different configs\n",
|
||||
"- **Monitoring**: System metrics help track performance\n",
|
||||
"- **Scaling**: Understanding hardware helps optimize training\n",
|
||||
"\n",
|
||||
"Let's build the foundation of your ML systems engineering skills!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a06f484d",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "setup-imports",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| default_exp core.setup\n",
|
||||
"\n",
|
||||
"#| export\n",
|
||||
"import sys\n",
|
||||
"import platform\n",
|
||||
"import psutil\n",
|
||||
"import os\n",
|
||||
"from typing import Dict, Any"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f63f890e",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "setup-verification",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(\"🔥 TinyTorch Setup Module\")\n",
|
||||
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
|
||||
"print(f\"Platform: {platform.system()}\")\n",
|
||||
"print(\"Ready to configure your TinyTorch installation!\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "de5378e3",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## 🏗️ The Architecture of ML Systems Configuration\n",
|
||||
"\n",
|
||||
"### Configuration Layers in Production ML\n",
|
||||
"Real ML systems have multiple configuration layers:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"┌─────────────────────────────────────┐\n",
|
||||
"│ Application Config │ ← Your personal info\n",
|
||||
"├─────────────────────────────────────┤\n",
|
||||
"│ System Environment │ ← Hardware specs\n",
|
||||
"├─────────────────────────────────────┤\n",
|
||||
"│ Runtime Configuration │ ← Python, libraries\n",
|
||||
"├─────────────────────────────────────┤\n",
|
||||
"│ Infrastructure Config │ ← Cloud, containers\n",
|
||||
"└─────────────────────────────────────┘\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Why Each Layer Matters\n",
|
||||
"- **Application**: Identifies who built what and when\n",
|
||||
"- **System**: Determines performance characteristics and limitations\n",
|
||||
"- **Runtime**: Affects compatibility and feature availability\n",
|
||||
"- **Infrastructure**: Enables scaling and deployment strategies\n",
|
||||
"\n",
|
||||
"### Connection to Real ML Frameworks\n",
|
||||
"Every major ML framework has configuration:\n",
|
||||
"- **PyTorch**: `torch.cuda.is_available()`, `torch.get_num_threads()`\n",
|
||||
"- **TensorFlow**: `tf.config.list_physical_devices()`, `tf.sysconfig.get_build_info()`\n",
|
||||
"- **Hugging Face**: Model cards with system requirements and performance metrics\n",
|
||||
"- **MLflow**: Experiment tracking with system context and reproducibility\n",
|
||||
"\n",
|
||||
"### TinyTorch's Approach\n",
|
||||
"We'll build configuration that's:\n",
|
||||
"- **Educational**: Teaches system awareness\n",
|
||||
"- **Practical**: Actually useful for debugging\n",
|
||||
"- **Professional**: Follows industry standards\n",
|
||||
"- **Extensible**: Ready for future ML systems features"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9c51b4b0",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 2
|
||||
},
|
||||
"source": [
|
||||
"## Step 1: What is System Configuration?\n",
|
||||
"\n",
|
||||
"### Definition\n",
|
||||
"**System configuration** is the process of setting up your development environment with personalized information and system diagnostics. In TinyTorch, this means:\n",
|
||||
"\n",
|
||||
"- **Personal Information**: Your name, email, institution for identification\n",
|
||||
"- **System Information**: Hardware specs, Python version, platform details\n",
|
||||
"- **Customization**: Making your TinyTorch installation uniquely yours\n",
|
||||
"\n",
|
||||
"### Why Configuration Matters in ML Systems\n",
|
||||
"Proper system configuration is crucial because:\n",
|
||||
"\n",
|
||||
"#### 1. **Reproducibility** \n",
|
||||
"Your setup can be documented and shared:\n",
|
||||
"```python\n",
|
||||
"# Someone else can recreate your environment\n",
|
||||
"config = {\n",
|
||||
" 'developer': 'Your Name',\n",
|
||||
" 'python_version': '3.9.7',\n",
|
||||
" 'platform': 'Darwin',\n",
|
||||
" 'memory_gb': 16.0\n",
|
||||
"}\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### 2. **Debugging**\n",
|
||||
"System info helps troubleshoot ML performance issues:\n",
|
||||
"- **Memory errors**: \"Do I have enough RAM for this model?\"\n",
|
||||
"- **Performance issues**: \"How many CPU cores can I use?\"\n",
|
||||
"- **Compatibility problems**: \"What Python version am I running?\"\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Development**\n",
|
||||
"Shows proper engineering practices:\n",
|
||||
"- **Attribution**: Your work is properly credited\n",
|
||||
"- **Collaboration**: Others can contact you about your code\n",
|
||||
"- **Documentation**: System context is preserved\n",
|
||||
"\n",
|
||||
"#### 4. **ML Systems Integration**\n",
|
||||
"Connects to broader ML engineering:\n",
|
||||
"- **Model cards**: Document system requirements\n",
|
||||
"- **Experiment tracking**: Record hardware context\n",
|
||||
"- **Deployment**: Match development to production environments\n",
|
||||
"\n",
|
||||
"### Real-World Examples\n",
|
||||
"- **Google Colab**: Shows GPU type, RAM, disk space\n",
|
||||
"- **Kaggle**: Displays system specs for reproducibility\n",
|
||||
"- **MLflow**: Tracks system context with experiments\n",
|
||||
"- **Docker**: Containerizes entire system configuration\n",
|
||||
"\n",
|
||||
"Let's start configuring your TinyTorch system!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "37575c5c",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## Step 2: Personal Information Configuration\n",
|
||||
"\n",
|
||||
"### The Concept: Identity in ML Systems\n",
|
||||
"Your **personal information** identifies you as the developer and configures your TinyTorch installation. This isn't just administrative - it's foundational to professional ML development.\n",
|
||||
"\n",
|
||||
"### Why Personal Info Matters in ML Engineering\n",
|
||||
"\n",
|
||||
"#### 1. **Attribution and Accountability**\n",
|
||||
"- **Model ownership**: Who built this model?\n",
|
||||
"- **Responsibility**: Who should be contacted about issues?\n",
|
||||
"- **Credit**: Proper recognition for your work\n",
|
||||
"\n",
|
||||
"#### 2. **Collaboration and Communication**\n",
|
||||
"- **Team coordination**: Multiple developers on ML projects\n",
|
||||
"- **Knowledge sharing**: Others can learn from your work\n",
|
||||
"- **Bug reports**: Contact info for issues and improvements\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Standards**\n",
|
||||
"- **Industry practice**: All professional software has attribution\n",
|
||||
"- **Open source**: Proper credit in shared code\n",
|
||||
"- **Academic integrity**: Clear authorship in research\n",
|
||||
"\n",
|
||||
"#### 4. **System Customization**\n",
|
||||
"- **Personalized experience**: Your TinyTorch installation\n",
|
||||
"- **Unique identification**: Distinguish your work from others\n",
|
||||
"- **Development tracking**: Link code to developer\n",
|
||||
"\n",
|
||||
"### Real-World Parallels\n",
|
||||
"- **Git commits**: Author name and email in every commit\n",
|
||||
"- **Docker images**: Maintainer information in container metadata\n",
|
||||
"- **Python packages**: Author info in `setup.py` and `pyproject.toml`\n",
|
||||
"- **Model cards**: Creator information for ML models\n",
|
||||
"\n",
|
||||
"### Best Practices for Personal Configuration\n",
|
||||
"- **Use real information**: Not placeholders or fake data\n",
|
||||
"- **Professional email**: Accessible and appropriate\n",
|
||||
"- **Descriptive system name**: Unique and meaningful\n",
|
||||
"- **Consistent formatting**: Follow established conventions\n",
|
||||
"\n",
|
||||
"Now let's implement your personal configuration!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "363c3cb7",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### Before We Code: The 5 C's\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# CONCEPT: What is Personal Information Configuration?\n",
|
||||
"# Developer identity configuration that identifies you as the creator and\n",
|
||||
"# configures your TinyTorch installation. Think Git commit attribution -\n",
|
||||
"# every professional system needs to know who built it.\n",
|
||||
"\n",
|
||||
"# CODE STRUCTURE: What We're Building \n",
|
||||
"def personal_info() -> Dict[str, str]: # Returns developer identity\n",
|
||||
" return { # Dictionary with required fields\n",
|
||||
" 'developer': 'Your Name', # Your actual name\n",
|
||||
" 'email': 'your@domain.com', # Contact information\n",
|
||||
" 'institution': 'Your Place', # Affiliation\n",
|
||||
" 'system_name': 'YourName-Dev', # Unique system identifier\n",
|
||||
" 'version': '1.0.0' # Configuration version\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
"# CONNECTIONS: Real-World Equivalents\n",
|
||||
"# Git commits - author name and email in every commit\n",
|
||||
"# Docker images - maintainer information in container metadata\n",
|
||||
"# Python packages - author info in setup.py and pyproject.toml\n",
|
||||
"# Model cards - creator information for ML models\n",
|
||||
"\n",
|
||||
"# CONSTRAINTS: Key Implementation Requirements\n",
|
||||
"# - Use actual information (not placeholder text)\n",
|
||||
"# - Email must be valid format (contains @ and domain)\n",
|
||||
"# - System name should be unique and descriptive\n",
|
||||
"# - All values must be strings, version stays '1.0.0'\n",
|
||||
"\n",
|
||||
"# CONTEXT: Why This Matters in ML Systems\n",
|
||||
"# Professional ML development requires attribution:\n",
|
||||
"# - Model ownership: Who built this neural network?\n",
|
||||
"# - Collaboration: Others can contact you about issues\n",
|
||||
"# - Professional standards: Industry practice for all software\n",
|
||||
"# - System customization: Makes your TinyTorch installation unique\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"**You're establishing your identity in the ML systems world.**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e0a8f7d8",
|
||||
"metadata": {
|
||||
"deletable": false,
|
||||
"lines_to_next_cell": 1,
|
||||
"nbgrader": {
|
||||
"cell_type": "code",
|
||||
"checksum": "885d89952aa40ac841392d44360964ef",
|
||||
"grade": false,
|
||||
"grade_id": "personal-info",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": true,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| export\n",
|
||||
"def personal_info() -> Dict[str, str]:\n",
|
||||
" \"\"\"\n",
|
||||
" Return personal information for this TinyTorch installation.\n",
|
||||
" \n",
|
||||
" This function configures your personal TinyTorch installation with your identity.\n",
|
||||
" It's the foundation of proper ML engineering practices - every system needs\n",
|
||||
" to know who built it and how to contact them.\n",
|
||||
" \n",
|
||||
" TODO: Implement personal information configuration.\n",
|
||||
" \n",
|
||||
" STEP-BY-STEP IMPLEMENTATION:\n",
|
||||
" 1. Create a dictionary with your personal details\n",
|
||||
" 2. Include all required keys: developer, email, institution, system_name, version\n",
|
||||
" 3. Use your actual information (not placeholder text)\n",
|
||||
" 4. Make system_name unique and descriptive\n",
|
||||
" 5. Keep version as '1.0.0' for now\n",
|
||||
" \n",
|
||||
" EXAMPLE OUTPUT:\n",
|
||||
" {\n",
|
||||
" 'developer': 'Student Name',\n",
|
||||
" 'email': 'student@university.edu', \n",
|
||||
" 'institution': 'University Name',\n",
|
||||
" 'system_name': 'StudentName-TinyTorch-Dev',\n",
|
||||
" 'version': '1.0.0'\n",
|
||||
" }\n",
|
||||
" \n",
|
||||
" IMPLEMENTATION HINTS:\n",
|
||||
" - Replace the example with your real information\n",
|
||||
" - Use a descriptive system_name (e.g., 'YourName-TinyTorch-Dev')\n",
|
||||
" - Keep email format valid (contains @ and domain)\n",
|
||||
" - Make sure all values are strings\n",
|
||||
" - Consider how this info will be used in debugging and collaboration\n",
|
||||
" \n",
|
||||
" LEARNING CONNECTIONS:\n",
|
||||
" - This is like the 'author' field in Git commits\n",
|
||||
" - Similar to maintainer info in Docker images\n",
|
||||
" - Parallels author info in Python packages\n",
|
||||
" - Foundation for professional ML development\n",
|
||||
" \"\"\"\n",
|
||||
" # YOUR CODE HERE\n",
|
||||
" raise NotImplementedError()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7279ac1a",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### 🧪 Unit Test: Personal Information Configuration\n",
|
||||
"\n",
|
||||
"This test validates your `personal_info()` function implementation, ensuring it returns properly formatted developer information for system attribution and collaboration."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b5abb07b",
|
||||
"metadata": {
|
||||
"deletable": false,
|
||||
"editable": false,
|
||||
"nbgrader": {
|
||||
"cell_type": "code",
|
||||
"checksum": "90ec3c137ee806c81d6b360f1edfe6db",
|
||||
"grade": true,
|
||||
"grade_id": "test-personal-info-immediate",
|
||||
"locked": true,
|
||||
"points": 5,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def test_unit_personal_info_basic():\n",
|
||||
" \"\"\"Test personal_info function implementation.\"\"\"\n",
|
||||
" print(\"🔬 Unit Test: Personal Information...\")\n",
|
||||
" \n",
|
||||
" # Test personal_info function\n",
|
||||
" personal = personal_info()\n",
|
||||
" \n",
|
||||
" # Test return type\n",
|
||||
" assert isinstance(personal, dict), \"personal_info should return a dictionary\"\n",
|
||||
" \n",
|
||||
" # Test required keys\n",
|
||||
" required_keys = ['developer', 'email', 'institution', 'system_name', 'version']\n",
|
||||
" for key in required_keys:\n",
|
||||
" assert key in personal, f\"Dictionary should have '{key}' key\"\n",
|
||||
" \n",
|
||||
" # Test non-empty values\n",
|
||||
" for key, value in personal.items():\n",
|
||||
" assert isinstance(value, str), f\"Value for '{key}' should be a string\"\n",
|
||||
" assert len(value) > 0, f\"Value for '{key}' cannot be empty\"\n",
|
||||
" \n",
|
||||
" # Test email format\n",
|
||||
" assert '@' in personal['email'], \"Email should contain @ symbol\"\n",
|
||||
" assert '.' in personal['email'], \"Email should contain domain\"\n",
|
||||
" \n",
|
||||
" # Test version format\n",
|
||||
" assert personal['version'] == '1.0.0', \"Version should be '1.0.0'\"\n",
|
||||
" \n",
|
||||
" # Test system name (should be unique/personalized)\n",
|
||||
" assert len(personal['system_name']) > 5, \"System name should be descriptive\"\n",
|
||||
" \n",
|
||||
" print(\"✅ Personal info function tests passed!\")\n",
|
||||
" print(f\"✅ TinyTorch configured for: {personal['developer']}\")\n",
|
||||
"\n",
|
||||
"# Run the test\n",
|
||||
"test_unit_personal_info_basic()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3e47a754",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## Step 3: System Information Queries\n",
|
||||
"\n",
|
||||
"### The Concept: Hardware-Aware ML Systems\n",
|
||||
"**System information** provides details about your hardware and software environment. This is crucial for ML development because machine learning is fundamentally about computation, and computation depends on hardware.\n",
|
||||
"\n",
|
||||
"### Why System Information Matters in ML Engineering\n",
|
||||
"\n",
|
||||
"#### 1. **Performance Optimization**\n",
|
||||
"- **CPU cores**: Determines parallelization strategies\n",
|
||||
"- **Memory**: Limits batch size and model size\n",
|
||||
"- **Architecture**: Affects numerical precision and optimization\n",
|
||||
"\n",
|
||||
"#### 2. **Compatibility and Debugging**\n",
|
||||
"- **Python version**: Determines available features and libraries\n",
|
||||
"- **Platform**: Affects file paths, process management, and system calls\n",
|
||||
"- **Architecture**: Influences numerical behavior and optimization\n",
|
||||
"\n",
|
||||
"#### 3. **Resource Planning**\n",
|
||||
"- **Training time estimation**: More cores = faster training\n",
|
||||
"- **Memory requirements**: Avoid out-of-memory errors\n",
|
||||
"- **Deployment matching**: Development should match production\n",
|
||||
"\n",
|
||||
"#### 4. **Reproducibility**\n",
|
||||
"- **Environment documentation**: Exact system specifications\n",
|
||||
"- **Performance comparison**: Same code, different hardware\n",
|
||||
"- **Bug reproduction**: System-specific issues\n",
|
||||
"\n",
|
||||
"### The Python System Query Toolkit\n",
|
||||
"You'll learn to use these essential Python modules:\n",
|
||||
"\n",
|
||||
"#### `sys.version_info` - Python Version\n",
|
||||
"```python\n",
|
||||
"version_info = sys.version_info\n",
|
||||
"python_version = f\"{version_info.major}.{version_info.minor}.{version_info.micro}\"\n",
|
||||
"# Example: \"3.9.7\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `platform.system()` - Operating System\n",
|
||||
"```python\n",
|
||||
"platform_name = platform.system()\n",
|
||||
"# Examples: \"Darwin\" (macOS), \"Linux\", \"Windows\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `platform.machine()` - CPU Architecture\n",
|
||||
"```python\n",
|
||||
"architecture = platform.machine()\n",
|
||||
"# Examples: \"x86_64\", \"arm64\", \"aarch64\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `psutil.cpu_count()` - CPU Cores\n",
|
||||
"```python\n",
|
||||
"cpu_count = psutil.cpu_count()\n",
|
||||
"# Example: 8 (cores available for parallel processing)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `psutil.virtual_memory().total` - Total RAM\n",
|
||||
"```python\n",
|
||||
"memory_bytes = psutil.virtual_memory().total\n",
|
||||
"memory_gb = round(memory_bytes / (1024**3), 1)\n",
|
||||
"# Example: 16.0 GB\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Real-World Applications\n",
|
||||
"- **PyTorch**: `torch.get_num_threads()` uses CPU count\n",
|
||||
"- **TensorFlow**: `tf.config.list_physical_devices()` queries hardware\n",
|
||||
"- **Scikit-learn**: `n_jobs=-1` uses all available cores\n",
|
||||
"- **Dask**: Automatically configures workers based on CPU count\n",
|
||||
"\n",
|
||||
"### ML Systems Performance Considerations\n",
|
||||
"- **Memory-bound operations**: Matrix multiplication, large model loading\n",
|
||||
"- **CPU-bound operations**: Data preprocessing, feature engineering\n",
|
||||
"- **I/O-bound operations**: Data loading, model saving\n",
|
||||
"- **Platform-specific optimizations**: SIMD instructions, memory management\n",
|
||||
"\n",
|
||||
"Now let's implement system information queries!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "188419e9",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### Before We Code: The 5 C's\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# CONCEPT: What is System Information?\n",
|
||||
"# Hardware and software environment detection for ML systems.\n",
|
||||
"# Think computer specifications for gaming - ML needs to know what\n",
|
||||
"# resources are available for optimal performance.\n",
|
||||
"\n",
|
||||
"# CODE STRUCTURE: What We're Building \n",
|
||||
"def system_info() -> Dict[str, Any]: # Queries system specs\n",
|
||||
" return { # Hardware/software details\n",
|
||||
" 'python_version': '3.9.7', # Python compatibility\n",
|
||||
" 'platform': 'Darwin', # Operating system\n",
|
||||
" 'architecture': 'arm64', # CPU architecture\n",
|
||||
" 'cpu_count': 8, # Parallel processing cores\n",
|
||||
" 'memory_gb': 16.0 # Available RAM\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
"# CONNECTIONS: Real-World Equivalents\n",
|
||||
"# torch.get_num_threads() (PyTorch) - uses CPU count for optimization\n",
|
||||
"# tf.config.list_physical_devices() (TensorFlow) - queries hardware\n",
|
||||
"# psutil.cpu_count() (System monitoring) - same underlying queries\n",
|
||||
"# MLflow system tracking - documents environment for reproducibility\n",
|
||||
"\n",
|
||||
"# CONSTRAINTS: Key Implementation Requirements\n",
|
||||
"# - Use actual system queries (not hardcoded values)\n",
|
||||
"# - Convert memory from bytes to GB for readability\n",
|
||||
"# - Round memory to 1 decimal place for clean output\n",
|
||||
"# - Return proper data types (strings, int, float)\n",
|
||||
"\n",
|
||||
"# CONTEXT: Why This Matters in ML Systems\n",
|
||||
"# Hardware awareness enables performance optimization:\n",
|
||||
"# - Training: More CPU cores = faster data processing\n",
|
||||
"# - Memory: Determines maximum model and batch sizes\n",
|
||||
"# - Debugging: System specs help troubleshoot performance issues\n",
|
||||
"# - Reproducibility: Document exact environment for experiment tracking\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"**You're building hardware-aware ML systems that adapt to their environment.**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "77998a3c",
|
||||
"metadata": {
|
||||
"deletable": false,
|
||||
"lines_to_next_cell": 1,
|
||||
"nbgrader": {
|
||||
"cell_type": "code",
|
||||
"checksum": "b6a128f46146114516fccef552012497",
|
||||
"grade": false,
|
||||
"grade_id": "system-info",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": true,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| export\n",
|
||||
"def system_info() -> Dict[str, Any]:\n",
|
||||
" \"\"\"\n",
|
||||
" Query and return system information for this TinyTorch installation.\n",
|
||||
" \n",
|
||||
" This function gathers crucial hardware and software information that affects\n",
|
||||
" ML performance, compatibility, and debugging. It's the foundation of \n",
|
||||
" hardware-aware ML systems.\n",
|
||||
" \n",
|
||||
" TODO: Implement system information queries.\n",
|
||||
" \n",
|
||||
" STEP-BY-STEP IMPLEMENTATION:\n",
|
||||
" 1. Get Python version using sys.version_info\n",
|
||||
" 2. Get platform using platform.system()\n",
|
||||
" 3. Get architecture using platform.machine()\n",
|
||||
" 4. Get CPU count using psutil.cpu_count()\n",
|
||||
" 5. Get memory using psutil.virtual_memory().total\n",
|
||||
" 6. Convert memory from bytes to GB (divide by 1024^3)\n",
|
||||
" 7. Return all information in a dictionary\n",
|
||||
" \n",
|
||||
" EXAMPLE OUTPUT:\n",
|
||||
" {\n",
|
||||
" 'python_version': '3.9.7',\n",
|
||||
" 'platform': 'Darwin', \n",
|
||||
" 'architecture': 'arm64',\n",
|
||||
" 'cpu_count': 8,\n",
|
||||
" 'memory_gb': 16.0\n",
|
||||
" }\n",
|
||||
" \n",
|
||||
" IMPLEMENTATION HINTS:\n",
|
||||
" - Use f-string formatting for Python version: f\"{major}.{minor}.{micro}\"\n",
|
||||
" - Memory conversion: bytes / (1024^3) = GB\n",
|
||||
" - Round memory to 1 decimal place for readability\n",
|
||||
" - Make sure data types are correct (strings for text, int for cpu_count, float for memory_gb)\n",
|
||||
" \n",
|
||||
" LEARNING CONNECTIONS:\n",
|
||||
" - This is like `torch.cuda.is_available()` in PyTorch\n",
|
||||
" - Similar to system info in MLflow experiment tracking\n",
|
||||
" - Parallels hardware detection in TensorFlow\n",
|
||||
" - Foundation for performance optimization in ML systems\n",
|
||||
" \n",
|
||||
" PERFORMANCE IMPLICATIONS:\n",
|
||||
" - cpu_count affects parallel processing capabilities\n",
|
||||
" - memory_gb determines maximum model and batch sizes\n",
|
||||
" - platform affects file system and process management\n",
|
||||
" - architecture influences numerical precision and optimization\n",
|
||||
" \"\"\"\n",
|
||||
" # YOUR CODE HERE\n",
|
||||
" raise NotImplementedError()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7f324c88",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### 🧪 Unit Test: System Information Query\n",
|
||||
"\n",
|
||||
"This test validates your `system_info()` function implementation, ensuring it accurately detects and reports hardware and software specifications for performance optimization and debugging."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "094b8f68",
|
||||
"metadata": {
|
||||
"deletable": false,
|
||||
"editable": false,
|
||||
"nbgrader": {
|
||||
"cell_type": "code",
|
||||
"checksum": "b6a307022113102000f8c1cbb71f57ef",
|
||||
"grade": true,
|
||||
"grade_id": "test-system-info-immediate",
|
||||
"locked": true,
|
||||
"points": 5,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def test_unit_system_info_basic():\n",
|
||||
" \"\"\"Test system_info function implementation.\"\"\"\n",
|
||||
" print(\"🔬 Unit Test: System Information...\")\n",
|
||||
" \n",
|
||||
" # Test system_info function\n",
|
||||
" sys_info = system_info()\n",
|
||||
" \n",
|
||||
" # Test return type\n",
|
||||
" assert isinstance(sys_info, dict), \"system_info should return a dictionary\"\n",
|
||||
" \n",
|
||||
" # Test required keys\n",
|
||||
" required_keys = ['python_version', 'platform', 'architecture', 'cpu_count', 'memory_gb']\n",
|
||||
" for key in required_keys:\n",
|
||||
" assert key in sys_info, f\"Dictionary should have '{key}' key\"\n",
|
||||
" \n",
|
||||
" # Test data types\n",
|
||||
" assert isinstance(sys_info['python_version'], str), \"python_version should be string\"\n",
|
||||
" assert isinstance(sys_info['platform'], str), \"platform should be string\"\n",
|
||||
" assert isinstance(sys_info['architecture'], str), \"architecture should be string\"\n",
|
||||
" assert isinstance(sys_info['cpu_count'], int), \"cpu_count should be integer\"\n",
|
||||
" assert isinstance(sys_info['memory_gb'], (int, float)), \"memory_gb should be number\"\n",
|
||||
" \n",
|
||||
" # Test reasonable values\n",
|
||||
" assert sys_info['cpu_count'] > 0, \"CPU count should be positive\"\n",
|
||||
" assert sys_info['memory_gb'] > 0, \"Memory should be positive\"\n",
|
||||
" assert len(sys_info['python_version']) > 0, \"Python version should not be empty\"\n",
|
||||
" \n",
|
||||
" # Test that values are actually queried (not hardcoded)\n",
|
||||
" actual_version = f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n",
|
||||
" assert sys_info['python_version'] == actual_version, \"Python version should match actual system\"\n",
|
||||
" \n",
|
||||
" print(\"✅ System info function tests passed!\")\n",
|
||||
" print(f\"✅ Python: {sys_info['python_version']} on {sys_info['platform']}\")\n",
|
||||
"\n",
|
||||
"# Run the test\n",
|
||||
"test_unit_system_info_basic()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e0e55b7e",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## 🧪 Testing Your Configuration Functions\n",
|
||||
"\n",
|
||||
"### The Importance of Testing in ML Systems\n",
|
||||
"Before we test your implementation, let's understand why testing is crucial in ML systems:\n",
|
||||
"\n",
|
||||
"#### 1. **Reliability**\n",
|
||||
"- **Function correctness**: Does your code do what it's supposed to?\n",
|
||||
"- **Edge case handling**: What happens with unexpected inputs?\n",
|
||||
"- **Error detection**: Catch bugs before they cause problems\n",
|
||||
"\n",
|
||||
"#### 2. **Reproducibility**\n",
|
||||
"- **Consistent behavior**: Same inputs always produce same outputs\n",
|
||||
"- **Environment validation**: Ensure setup works across different systems\n",
|
||||
"- **Regression prevention**: New changes don't break existing functionality\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Development**\n",
|
||||
"- **Code quality**: Well-tested code is maintainable code\n",
|
||||
"- **Collaboration**: Others can trust and extend your work\n",
|
||||
"- **Documentation**: Tests serve as executable documentation\n",
|
||||
"\n",
|
||||
"#### 4. **ML-Specific Concerns**\n",
|
||||
"- **Data validation**: Ensure data types and shapes are correct\n",
|
||||
"- **Performance verification**: Check that optimizations work\n",
|
||||
"- **System compatibility**: Verify cross-platform behavior\n",
|
||||
"\n",
|
||||
"### Testing Strategy\n",
|
||||
"We'll use comprehensive testing that checks:\n",
|
||||
"- **Return types**: Are outputs the correct data types?\n",
|
||||
"- **Required fields**: Are all expected keys present?\n",
|
||||
"- **Data validation**: Are values reasonable and properly formatted?\n",
|
||||
"- **System accuracy**: Do queries match actual system state?\n",
|
||||
"\n",
|
||||
"Now let's test your configuration functions!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "56c9c340",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"### 🎯 Additional Comprehensive Tests\n",
|
||||
"\n",
|
||||
"These comprehensive tests validate that your configuration functions work together and integrate properly with the TinyTorch system."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f9ed0b99",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## 🎯 MODULE SUMMARY: Setup Configuration\n",
|
||||
"\n",
|
||||
"You've successfully configured your TinyTorch installation and learned the foundations of ML systems engineering:\n",
|
||||
"\n",
|
||||
"### What You've Accomplished\n",
|
||||
"✅ **Personal Configuration**: Set up your identity and custom system name \n",
|
||||
"✅ **System Queries**: Learned to gather hardware and software information \n",
|
||||
"✅ **NBGrader Workflow**: Mastered solution blocks and automated testing \n",
|
||||
"✅ **Code Export**: Created functions that become part of your tinytorch package \n",
|
||||
"✅ **Professional Setup**: Established proper development practices \n",
|
||||
"\n",
|
||||
"### Key Concepts You've Learned\n",
|
||||
"\n",
|
||||
"#### 1. **System Awareness**\n",
|
||||
"- **Hardware constraints**: Understanding CPU, memory, and architecture limitations\n",
|
||||
"- **Software dependencies**: Python version and platform compatibility\n",
|
||||
"- **Performance implications**: How system specs affect ML workloads\n",
|
||||
"\n",
|
||||
"#### 2. **Configuration Management**\n",
|
||||
"- **Personal identification**: Professional attribution and contact information\n",
|
||||
"- **Environment documentation**: Reproducible system specifications\n",
|
||||
"- **Professional standards**: Industry-standard development practices\n",
|
||||
"\n",
|
||||
"#### 3. **ML Systems Foundations**\n",
|
||||
"- **Reproducibility**: System context for experiment tracking\n",
|
||||
"- **Debugging**: Hardware info for performance troubleshooting\n",
|
||||
"- **Collaboration**: Proper attribution and contact information\n",
|
||||
"\n",
|
||||
"#### 4. **Development Workflow**\n",
|
||||
"- **NBGrader integration**: Automated testing and grading\n",
|
||||
"- **Code export**: Functions become part of production package\n",
|
||||
"- **Testing practices**: Comprehensive validation of functionality\n",
|
||||
"\n",
|
||||
"### Next Steps in Your ML Systems Journey\n",
|
||||
"\n",
|
||||
"#### **Immediate Actions**\n",
|
||||
"1. **Export your code**: `tito module export 01_setup`\n",
|
||||
"2. **Test your installation**: \n",
|
||||
" ```python\n",
|
||||
" from tinytorch.core.setup import personal_info, system_info\n",
|
||||
" print(personal_info()) # Your personal details\n",
|
||||
" print(system_info()) # System information\n",
|
||||
" ```\n",
|
||||
"3. **Verify package integration**: Ensure your functions work in the tinytorch package\n",
|
||||
"\n",
|
||||
"#### **Looking Ahead**\n",
|
||||
"- **Module 1 (Tensor)**: Build the fundamental data structure for ML\n",
|
||||
"- **Module 2 (Activations)**: Add nonlinearity for complex learning\n",
|
||||
"- **Module 3 (Layers)**: Create the building blocks of neural networks\n",
|
||||
"- **Module 4 (Networks)**: Compose layers into powerful architectures\n",
|
||||
"\n",
|
||||
"#### **Course Progression**\n",
|
||||
"You're now ready to build a complete ML system from scratch:\n",
|
||||
"```\n",
|
||||
"Setup → Tensor → Activations → Layers → Networks → CNN → DataLoader → \n",
|
||||
"Autograd → Optimizers → Training → Compression → Kernels → Benchmarking → MLOps\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Professional Development Milestone\n",
|
||||
"\n",
|
||||
"You've taken your first step in ML systems engineering! This module taught you:\n",
|
||||
"- **System thinking**: Understanding hardware and software constraints\n",
|
||||
"- **Professional practices**: Proper attribution, testing, and documentation\n",
|
||||
"- **Tool mastery**: NBGrader workflow and package development\n",
|
||||
"- **Foundation building**: Creating reusable, tested, documented code\n",
|
||||
"\n",
|
||||
"**Ready for the next challenge?** Let's build the foundation of ML systems with tensors!"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"jupytext": {
|
||||
"main_language": "python"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -1,840 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "699bd495",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"# Setup - TinyTorch System Configuration\n",
|
||||
"\n",
|
||||
"Welcome to TinyTorch! This setup module configures your personal TinyTorch installation and teaches you the NBGrader workflow.\n",
|
||||
"\n",
|
||||
"## Learning Goals\n",
|
||||
"- Configure your personal TinyTorch installation with custom information\n",
|
||||
"- Learn to query system information using Python modules\n",
|
||||
"- Master the NBGrader workflow: implement \u2192 test \u2192 export\n",
|
||||
"- Create functions that become part of your tinytorch package\n",
|
||||
"- Understand solution blocks, hidden tests, and automated grading\n",
|
||||
"\n",
|
||||
"## The Big Picture: Why Configuration Matters in ML Systems\n",
|
||||
"Configuration is the foundation of any production ML system. In this module, you'll learn:\n",
|
||||
"\n",
|
||||
"### 1. **System Awareness**\n",
|
||||
"Real ML systems need to understand their environment:\n",
|
||||
"- **Hardware constraints**: Memory, CPU cores, GPU availability\n",
|
||||
"- **Software dependencies**: Python version, library compatibility\n",
|
||||
"- **Platform differences**: Linux servers, macOS development, Windows deployment\n",
|
||||
"\n",
|
||||
"### 2. **Reproducibility**\n",
|
||||
"Configuration enables reproducible ML:\n",
|
||||
"- **Environment documentation**: Exactly what system was used\n",
|
||||
"- **Dependency management**: Precise versions and requirements\n",
|
||||
"- **Debugging support**: System info helps troubleshoot issues\n",
|
||||
"\n",
|
||||
"### 3. **Professional Development**\n",
|
||||
"Proper configuration shows engineering maturity:\n",
|
||||
"- **Attribution**: Your work is properly credited\n",
|
||||
"- **Collaboration**: Others can understand and extend your setup\n",
|
||||
"- **Maintenance**: Systems can be updated and maintained\n",
|
||||
"\n",
|
||||
"### 4. **ML Systems Context**\n",
|
||||
"This connects to broader ML engineering:\n",
|
||||
"- **Model deployment**: Different environments need different configs\n",
|
||||
"- **Monitoring**: System metrics help track performance\n",
|
||||
"- **Scaling**: Understanding hardware helps optimize training\n",
|
||||
"\n",
|
||||
"Let's build the foundation of your ML systems engineering skills!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a06f484d",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "setup-imports",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| default_exp core.setup\n",
|
||||
"\n",
|
||||
"#| export\n",
|
||||
"import sys\n",
|
||||
"import platform\n",
|
||||
"import psutil\n",
|
||||
"import os\n",
|
||||
"from typing import Dict, Any"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f63f890e",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "setup-verification",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(\"\ud83d\udd25 TinyTorch Setup Module\")\n",
|
||||
"print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n",
|
||||
"print(f\"Platform: {platform.system()}\")\n",
|
||||
"print(\"Ready to configure your TinyTorch installation!\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "de5378e3",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## \ud83c\udfd7\ufe0f The Architecture of ML Systems Configuration\n",
|
||||
"\n",
|
||||
"### Configuration Layers in Production ML\n",
|
||||
"Real ML systems have multiple configuration layers:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n",
|
||||
"\u2502 Application Config \u2502 \u2190 Your personal info\n",
|
||||
"\u251c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2524\n",
|
||||
"\u2502 System Environment \u2502 \u2190 Hardware specs\n",
|
||||
"\u251c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2524\n",
|
||||
"\u2502 Runtime Configuration \u2502 \u2190 Python, libraries\n",
|
||||
"\u251c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2524\n",
|
||||
"\u2502 Infrastructure Config \u2502 \u2190 Cloud, containers\n",
|
||||
"\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Why Each Layer Matters\n",
|
||||
"- **Application**: Identifies who built what and when\n",
|
||||
"- **System**: Determines performance characteristics and limitations\n",
|
||||
"- **Runtime**: Affects compatibility and feature availability\n",
|
||||
"- **Infrastructure**: Enables scaling and deployment strategies\n",
|
||||
"\n",
|
||||
"### Connection to Real ML Frameworks\n",
|
||||
"Every major ML framework has configuration:\n",
|
||||
"- **PyTorch**: `torch.cuda.is_available()`, `torch.get_num_threads()`\n",
|
||||
"- **TensorFlow**: `tf.config.list_physical_devices()`, `tf.sysconfig.get_build_info()`\n",
|
||||
"- **Hugging Face**: Model cards with system requirements and performance metrics\n",
|
||||
"- **MLflow**: Experiment tracking with system context and reproducibility\n",
|
||||
"\n",
|
||||
"### TinyTorch's Approach\n",
|
||||
"We'll build configuration that's:\n",
|
||||
"- **Educational**: Teaches system awareness\n",
|
||||
"- **Practical**: Actually useful for debugging\n",
|
||||
"- **Professional**: Follows industry standards\n",
|
||||
"- **Extensible**: Ready for future ML systems features"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9c51b4b0",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 2
|
||||
},
|
||||
"source": [
|
||||
"## Step 1: What is System Configuration?\n",
|
||||
"\n",
|
||||
"### Definition\n",
|
||||
"**System configuration** is the process of setting up your development environment with personalized information and system diagnostics. In TinyTorch, this means:\n",
|
||||
"\n",
|
||||
"- **Personal Information**: Your name, email, institution for identification\n",
|
||||
"- **System Information**: Hardware specs, Python version, platform details\n",
|
||||
"- **Customization**: Making your TinyTorch installation uniquely yours\n",
|
||||
"\n",
|
||||
"### Why Configuration Matters in ML Systems\n",
|
||||
"Proper system configuration is crucial because:\n",
|
||||
"\n",
|
||||
"#### 1. **Reproducibility** \n",
|
||||
"Your setup can be documented and shared:\n",
|
||||
"```python\n",
|
||||
"# Someone else can recreate your environment\n",
|
||||
"config = {\n",
|
||||
" 'developer': 'Your Name',\n",
|
||||
" 'python_version': '3.9.7',\n",
|
||||
" 'platform': 'Darwin',\n",
|
||||
" 'memory_gb': 16.0\n",
|
||||
"}\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### 2. **Debugging**\n",
|
||||
"System info helps troubleshoot ML performance issues:\n",
|
||||
"- **Memory errors**: \"Do I have enough RAM for this model?\"\n",
|
||||
"- **Performance issues**: \"How many CPU cores can I use?\"\n",
|
||||
"- **Compatibility problems**: \"What Python version am I running?\"\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Development**\n",
|
||||
"Shows proper engineering practices:\n",
|
||||
"- **Attribution**: Your work is properly credited\n",
|
||||
"- **Collaboration**: Others can contact you about your code\n",
|
||||
"- **Documentation**: System context is preserved\n",
|
||||
"\n",
|
||||
"#### 4. **ML Systems Integration**\n",
|
||||
"Connects to broader ML engineering:\n",
|
||||
"- **Model cards**: Document system requirements\n",
|
||||
"- **Experiment tracking**: Record hardware context\n",
|
||||
"- **Deployment**: Match development to production environments\n",
|
||||
"\n",
|
||||
"### Real-World Examples\n",
|
||||
"- **Google Colab**: Shows GPU type, RAM, disk space\n",
|
||||
"- **Kaggle**: Displays system specs for reproducibility\n",
|
||||
"- **MLflow**: Tracks system context with experiments\n",
|
||||
"- **Docker**: Containerizes entire system configuration\n",
|
||||
"\n",
|
||||
"Let's start configuring your TinyTorch system!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "37575c5c",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## Step 2: Personal Information Configuration\n",
|
||||
"\n",
|
||||
"### The Concept: Identity in ML Systems\n",
|
||||
"Your **personal information** identifies you as the developer and configures your TinyTorch installation. This isn't just administrative - it's foundational to professional ML development.\n",
|
||||
"\n",
|
||||
"### Why Personal Info Matters in ML Engineering\n",
|
||||
"\n",
|
||||
"#### 1. **Attribution and Accountability**\n",
|
||||
"- **Model ownership**: Who built this model?\n",
|
||||
"- **Responsibility**: Who should be contacted about issues?\n",
|
||||
"- **Credit**: Proper recognition for your work\n",
|
||||
"\n",
|
||||
"#### 2. **Collaboration and Communication**\n",
|
||||
"- **Team coordination**: Multiple developers on ML projects\n",
|
||||
"- **Knowledge sharing**: Others can learn from your work\n",
|
||||
"- **Bug reports**: Contact info for issues and improvements\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Standards**\n",
|
||||
"- **Industry practice**: All professional software has attribution\n",
|
||||
"- **Open source**: Proper credit in shared code\n",
|
||||
"- **Academic integrity**: Clear authorship in research\n",
|
||||
"\n",
|
||||
"#### 4. **System Customization**\n",
|
||||
"- **Personalized experience**: Your TinyTorch installation\n",
|
||||
"- **Unique identification**: Distinguish your work from others\n",
|
||||
"- **Development tracking**: Link code to developer\n",
|
||||
"\n",
|
||||
"### Real-World Parallels\n",
|
||||
"- **Git commits**: Author name and email in every commit\n",
|
||||
"- **Docker images**: Maintainer information in container metadata\n",
|
||||
"- **Python packages**: Author info in `setup.py` and `pyproject.toml`\n",
|
||||
"- **Model cards**: Creator information for ML models\n",
|
||||
"\n",
|
||||
"### Best Practices for Personal Configuration\n",
|
||||
"- **Use real information**: Not placeholders or fake data\n",
|
||||
"- **Professional email**: Accessible and appropriate\n",
|
||||
"- **Descriptive system name**: Unique and meaningful\n",
|
||||
"- **Consistent formatting**: Follow established conventions\n",
|
||||
"\n",
|
||||
"Now let's implement your personal configuration!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "363c3cb7",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### Before We Code: The 5 C's\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# CONCEPT: What is Personal Information Configuration?\n",
|
||||
"# Developer identity configuration that identifies you as the creator and\n",
|
||||
"# configures your TinyTorch installation. Think Git commit attribution -\n",
|
||||
"# every professional system needs to know who built it.\n",
|
||||
"\n",
|
||||
"# CODE STRUCTURE: What We're Building \n",
|
||||
"def personal_info() -> Dict[str, str]: # Returns developer identity\n",
|
||||
" return { # Dictionary with required fields\n",
|
||||
" 'developer': 'Your Name', # Your actual name\n",
|
||||
" 'email': 'your@domain.com', # Contact information\n",
|
||||
" 'institution': 'Your Place', # Affiliation\n",
|
||||
" 'system_name': 'YourName-Dev', # Unique system identifier\n",
|
||||
" 'version': '1.0.0' # Configuration version\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
"# CONNECTIONS: Real-World Equivalents\n",
|
||||
"# Git commits - author name and email in every commit\n",
|
||||
"# Docker images - maintainer information in container metadata\n",
|
||||
"# Python packages - author info in setup.py and pyproject.toml\n",
|
||||
"# Model cards - creator information for ML models\n",
|
||||
"\n",
|
||||
"# CONSTRAINTS: Key Implementation Requirements\n",
|
||||
"# - Use actual information (not placeholder text)\n",
|
||||
"# - Email must be valid format (contains @ and domain)\n",
|
||||
"# - System name should be unique and descriptive\n",
|
||||
"# - All values must be strings, version stays '1.0.0'\n",
|
||||
"\n",
|
||||
"# CONTEXT: Why This Matters in ML Systems\n",
|
||||
"# Professional ML development requires attribution:\n",
|
||||
"# - Model ownership: Who built this neural network?\n",
|
||||
"# - Collaboration: Others can contact you about issues\n",
|
||||
"# - Professional standards: Industry practice for all software\n",
|
||||
"# - System customization: Makes your TinyTorch installation unique\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"**You're establishing your identity in the ML systems world.**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e0a8f7d8",
|
||||
"metadata": {
|
||||
"lines_to_next_cell": 1,
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "personal-info",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": true,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| export\n",
|
||||
"def personal_info() -> Dict[str, str]:\n",
|
||||
" \"\"\"\n",
|
||||
" Return personal information for this TinyTorch installation.\n",
|
||||
" \n",
|
||||
" This function configures your personal TinyTorch installation with your identity.\n",
|
||||
" It's the foundation of proper ML engineering practices - every system needs\n",
|
||||
" to know who built it and how to contact them.\n",
|
||||
" \n",
|
||||
" TODO: Implement personal information configuration.\n",
|
||||
" \n",
|
||||
" STEP-BY-STEP IMPLEMENTATION:\n",
|
||||
" 1. Create a dictionary with your personal details\n",
|
||||
" 2. Include all required keys: developer, email, institution, system_name, version\n",
|
||||
" 3. Use your actual information (not placeholder text)\n",
|
||||
" 4. Make system_name unique and descriptive\n",
|
||||
" 5. Keep version as '1.0.0' for now\n",
|
||||
" \n",
|
||||
" EXAMPLE OUTPUT:\n",
|
||||
" {\n",
|
||||
" 'developer': 'Student Name',\n",
|
||||
" 'email': 'student@university.edu', \n",
|
||||
" 'institution': 'University Name',\n",
|
||||
" 'system_name': 'StudentName-TinyTorch-Dev',\n",
|
||||
" 'version': '1.0.0'\n",
|
||||
" }\n",
|
||||
" \n",
|
||||
" IMPLEMENTATION HINTS:\n",
|
||||
" - Replace the example with your real information\n",
|
||||
" - Use a descriptive system_name (e.g., 'YourName-TinyTorch-Dev')\n",
|
||||
" - Keep email format valid (contains @ and domain)\n",
|
||||
" - Make sure all values are strings\n",
|
||||
" - Consider how this info will be used in debugging and collaboration\n",
|
||||
" \n",
|
||||
" LEARNING CONNECTIONS:\n",
|
||||
" - This is like the 'author' field in Git commits\n",
|
||||
" - Similar to maintainer info in Docker images\n",
|
||||
" - Parallels author info in Python packages\n",
|
||||
" - Foundation for professional ML development\n",
|
||||
" \"\"\"\n",
|
||||
" ### BEGIN SOLUTION\n",
|
||||
" # YOUR CODE HERE\n",
|
||||
" raise NotImplementedError()\n",
|
||||
" ### END SOLUTION"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7279ac1a",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### \ud83e\uddea Unit Test: Personal Information Configuration\n",
|
||||
"\n",
|
||||
"This test validates your `personal_info()` function implementation, ensuring it returns properly formatted developer information for system attribution and collaboration."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b5abb07b",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": true,
|
||||
"grade_id": "test-personal-info-immediate",
|
||||
"locked": true,
|
||||
"points": 5,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def test_unit_personal_info_basic():\n",
|
||||
" \"\"\"Test personal_info function implementation.\"\"\"\n",
|
||||
" print(\"\ud83d\udd2c Unit Test: Personal Information...\")\n",
|
||||
" \n",
|
||||
" # Test personal_info function\n",
|
||||
" personal = personal_info()\n",
|
||||
" \n",
|
||||
" # Test return type\n",
|
||||
" assert isinstance(personal, dict), \"personal_info should return a dictionary\"\n",
|
||||
" \n",
|
||||
" # Test required keys\n",
|
||||
" required_keys = ['developer', 'email', 'institution', 'system_name', 'version']\n",
|
||||
" for key in required_keys:\n",
|
||||
" assert key in personal, f\"Dictionary should have '{key}' key\"\n",
|
||||
" \n",
|
||||
" # Test non-empty values\n",
|
||||
" for key, value in personal.items():\n",
|
||||
" assert isinstance(value, str), f\"Value for '{key}' should be a string\"\n",
|
||||
" assert len(value) > 0, f\"Value for '{key}' cannot be empty\"\n",
|
||||
" \n",
|
||||
" # Test email format\n",
|
||||
" assert '@' in personal['email'], \"Email should contain @ symbol\"\n",
|
||||
" assert '.' in personal['email'], \"Email should contain domain\"\n",
|
||||
" \n",
|
||||
" # Test version format\n",
|
||||
" assert personal['version'] == '1.0.0', \"Version should be '1.0.0'\"\n",
|
||||
" \n",
|
||||
" # Test system name (should be unique/personalized)\n",
|
||||
" assert len(personal['system_name']) > 5, \"System name should be descriptive\"\n",
|
||||
" \n",
|
||||
" print(\"\u2705 Personal info function tests passed!\")\n",
|
||||
" print(f\"\u2705 TinyTorch configured for: {personal['developer']}\")\n",
|
||||
"\n",
|
||||
"# Run the test\n",
|
||||
"test_unit_personal_info_basic()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3e47a754",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## Step 3: System Information Queries\n",
|
||||
"\n",
|
||||
"### The Concept: Hardware-Aware ML Systems\n",
|
||||
"**System information** provides details about your hardware and software environment. This is crucial for ML development because machine learning is fundamentally about computation, and computation depends on hardware.\n",
|
||||
"\n",
|
||||
"### Why System Information Matters in ML Engineering\n",
|
||||
"\n",
|
||||
"#### 1. **Performance Optimization**\n",
|
||||
"- **CPU cores**: Determines parallelization strategies\n",
|
||||
"- **Memory**: Limits batch size and model size\n",
|
||||
"- **Architecture**: Affects numerical precision and optimization\n",
|
||||
"\n",
|
||||
"#### 2. **Compatibility and Debugging**\n",
|
||||
"- **Python version**: Determines available features and libraries\n",
|
||||
"- **Platform**: Affects file paths, process management, and system calls\n",
|
||||
"- **Architecture**: Influences numerical behavior and optimization\n",
|
||||
"\n",
|
||||
"#### 3. **Resource Planning**\n",
|
||||
"- **Training time estimation**: More cores = faster training\n",
|
||||
"- **Memory requirements**: Avoid out-of-memory errors\n",
|
||||
"- **Deployment matching**: Development should match production\n",
|
||||
"\n",
|
||||
"#### 4. **Reproducibility**\n",
|
||||
"- **Environment documentation**: Exact system specifications\n",
|
||||
"- **Performance comparison**: Same code, different hardware\n",
|
||||
"- **Bug reproduction**: System-specific issues\n",
|
||||
"\n",
|
||||
"### The Python System Query Toolkit\n",
|
||||
"You'll learn to use these essential Python modules:\n",
|
||||
"\n",
|
||||
"#### `sys.version_info` - Python Version\n",
|
||||
"```python\n",
|
||||
"version_info = sys.version_info\n",
|
||||
"python_version = f\"{version_info.major}.{version_info.minor}.{version_info.micro}\"\n",
|
||||
"# Example: \"3.9.7\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `platform.system()` - Operating System\n",
|
||||
"```python\n",
|
||||
"platform_name = platform.system()\n",
|
||||
"# Examples: \"Darwin\" (macOS), \"Linux\", \"Windows\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `platform.machine()` - CPU Architecture\n",
|
||||
"```python\n",
|
||||
"architecture = platform.machine()\n",
|
||||
"# Examples: \"x86_64\", \"arm64\", \"aarch64\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `psutil.cpu_count()` - CPU Cores\n",
|
||||
"```python\n",
|
||||
"cpu_count = psutil.cpu_count()\n",
|
||||
"# Example: 8 (cores available for parallel processing)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### `psutil.virtual_memory().total` - Total RAM\n",
|
||||
"```python\n",
|
||||
"memory_bytes = psutil.virtual_memory().total\n",
|
||||
"memory_gb = round(memory_bytes / (1024**3), 1)\n",
|
||||
"# Example: 16.0 GB\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Real-World Applications\n",
|
||||
"- **PyTorch**: `torch.get_num_threads()` uses CPU count\n",
|
||||
"- **TensorFlow**: `tf.config.list_physical_devices()` queries hardware\n",
|
||||
"- **Scikit-learn**: `n_jobs=-1` uses all available cores\n",
|
||||
"- **Dask**: Automatically configures workers based on CPU count\n",
|
||||
"\n",
|
||||
"### ML Systems Performance Considerations\n",
|
||||
"- **Memory-bound operations**: Matrix multiplication, large model loading\n",
|
||||
"- **CPU-bound operations**: Data preprocessing, feature engineering\n",
|
||||
"- **I/O-bound operations**: Data loading, model saving\n",
|
||||
"- **Platform-specific optimizations**: SIMD instructions, memory management\n",
|
||||
"\n",
|
||||
"Now let's implement system information queries!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "188419e9",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### Before We Code: The 5 C's\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# CONCEPT: What is System Information?\n",
|
||||
"# Hardware and software environment detection for ML systems.\n",
|
||||
"# Think computer specifications for gaming - ML needs to know what\n",
|
||||
"# resources are available for optimal performance.\n",
|
||||
"\n",
|
||||
"# CODE STRUCTURE: What We're Building \n",
|
||||
"def system_info() -> Dict[str, Any]: # Queries system specs\n",
|
||||
" return { # Hardware/software details\n",
|
||||
" 'python_version': '3.9.7', # Python compatibility\n",
|
||||
" 'platform': 'Darwin', # Operating system\n",
|
||||
" 'architecture': 'arm64', # CPU architecture\n",
|
||||
" 'cpu_count': 8, # Parallel processing cores\n",
|
||||
" 'memory_gb': 16.0 # Available RAM\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
"# CONNECTIONS: Real-World Equivalents\n",
|
||||
"# torch.get_num_threads() (PyTorch) - uses CPU count for optimization\n",
|
||||
"# tf.config.list_physical_devices() (TensorFlow) - queries hardware\n",
|
||||
"# psutil.cpu_count() (System monitoring) - same underlying queries\n",
|
||||
"# MLflow system tracking - documents environment for reproducibility\n",
|
||||
"\n",
|
||||
"# CONSTRAINTS: Key Implementation Requirements\n",
|
||||
"# - Use actual system queries (not hardcoded values)\n",
|
||||
"# - Convert memory from bytes to GB for readability\n",
|
||||
"# - Round memory to 1 decimal place for clean output\n",
|
||||
"# - Return proper data types (strings, int, float)\n",
|
||||
"\n",
|
||||
"# CONTEXT: Why This Matters in ML Systems\n",
|
||||
"# Hardware awareness enables performance optimization:\n",
|
||||
"# - Training: More CPU cores = faster data processing\n",
|
||||
"# - Memory: Determines maximum model and batch sizes\n",
|
||||
"# - Debugging: System specs help troubleshoot performance issues\n",
|
||||
"# - Reproducibility: Document exact environment for experiment tracking\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"**You're building hardware-aware ML systems that adapt to their environment.**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "77998a3c",
|
||||
"metadata": {
|
||||
"lines_to_next_cell": 1,
|
||||
"nbgrader": {
|
||||
"grade": false,
|
||||
"grade_id": "system-info",
|
||||
"locked": false,
|
||||
"schema_version": 3,
|
||||
"solution": true,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#| export\n",
|
||||
"def system_info() -> Dict[str, Any]:\n",
|
||||
" \"\"\"\n",
|
||||
" Query and return system information for this TinyTorch installation.\n",
|
||||
" \n",
|
||||
" This function gathers crucial hardware and software information that affects\n",
|
||||
" ML performance, compatibility, and debugging. It's the foundation of \n",
|
||||
" hardware-aware ML systems.\n",
|
||||
" \n",
|
||||
" TODO: Implement system information queries.\n",
|
||||
" \n",
|
||||
" STEP-BY-STEP IMPLEMENTATION:\n",
|
||||
" 1. Get Python version using sys.version_info\n",
|
||||
" 2. Get platform using platform.system()\n",
|
||||
" 3. Get architecture using platform.machine()\n",
|
||||
" 4. Get CPU count using psutil.cpu_count()\n",
|
||||
" 5. Get memory using psutil.virtual_memory().total\n",
|
||||
" 6. Convert memory from bytes to GB (divide by 1024^3)\n",
|
||||
" 7. Return all information in a dictionary\n",
|
||||
" \n",
|
||||
" EXAMPLE OUTPUT:\n",
|
||||
" {\n",
|
||||
" 'python_version': '3.9.7',\n",
|
||||
" 'platform': 'Darwin', \n",
|
||||
" 'architecture': 'arm64',\n",
|
||||
" 'cpu_count': 8,\n",
|
||||
" 'memory_gb': 16.0\n",
|
||||
" }\n",
|
||||
" \n",
|
||||
" IMPLEMENTATION HINTS:\n",
|
||||
" - Use f-string formatting for Python version: f\"{major}.{minor}.{micro}\"\n",
|
||||
" - Memory conversion: bytes / (1024^3) = GB\n",
|
||||
" - Round memory to 1 decimal place for readability\n",
|
||||
" - Make sure data types are correct (strings for text, int for cpu_count, float for memory_gb)\n",
|
||||
" \n",
|
||||
" LEARNING CONNECTIONS:\n",
|
||||
" - This is like `torch.cuda.is_available()` in PyTorch\n",
|
||||
" - Similar to system info in MLflow experiment tracking\n",
|
||||
" - Parallels hardware detection in TensorFlow\n",
|
||||
" - Foundation for performance optimization in ML systems\n",
|
||||
" \n",
|
||||
" PERFORMANCE IMPLICATIONS:\n",
|
||||
" - cpu_count affects parallel processing capabilities\n",
|
||||
" - memory_gb determines maximum model and batch sizes\n",
|
||||
" - platform affects file system and process management\n",
|
||||
" - architecture influences numerical precision and optimization\n",
|
||||
" \"\"\"\n",
|
||||
" ### BEGIN SOLUTION\n",
|
||||
" # YOUR CODE HERE\n",
|
||||
" raise NotImplementedError()\n",
|
||||
" ### END SOLUTION"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7f324c88",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\"",
|
||||
"lines_to_next_cell": 1
|
||||
},
|
||||
"source": [
|
||||
"### \ud83e\uddea Unit Test: System Information Query\n",
|
||||
"\n",
|
||||
"This test validates your `system_info()` function implementation, ensuring it accurately detects and reports hardware and software specifications for performance optimization and debugging."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "094b8f68",
|
||||
"metadata": {
|
||||
"nbgrader": {
|
||||
"grade": true,
|
||||
"grade_id": "test-system-info-immediate",
|
||||
"locked": true,
|
||||
"points": 5,
|
||||
"schema_version": 3,
|
||||
"solution": false,
|
||||
"task": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def test_unit_system_info_basic():\n",
|
||||
" \"\"\"Test system_info function implementation.\"\"\"\n",
|
||||
" print(\"\ud83d\udd2c Unit Test: System Information...\")\n",
|
||||
" \n",
|
||||
" # Test system_info function\n",
|
||||
" sys_info = system_info()\n",
|
||||
" \n",
|
||||
" # Test return type\n",
|
||||
" assert isinstance(sys_info, dict), \"system_info should return a dictionary\"\n",
|
||||
" \n",
|
||||
" # Test required keys\n",
|
||||
" required_keys = ['python_version', 'platform', 'architecture', 'cpu_count', 'memory_gb']\n",
|
||||
" for key in required_keys:\n",
|
||||
" assert key in sys_info, f\"Dictionary should have '{key}' key\"\n",
|
||||
" \n",
|
||||
" # Test data types\n",
|
||||
" assert isinstance(sys_info['python_version'], str), \"python_version should be string\"\n",
|
||||
" assert isinstance(sys_info['platform'], str), \"platform should be string\"\n",
|
||||
" assert isinstance(sys_info['architecture'], str), \"architecture should be string\"\n",
|
||||
" assert isinstance(sys_info['cpu_count'], int), \"cpu_count should be integer\"\n",
|
||||
" assert isinstance(sys_info['memory_gb'], (int, float)), \"memory_gb should be number\"\n",
|
||||
" \n",
|
||||
" # Test reasonable values\n",
|
||||
" assert sys_info['cpu_count'] > 0, \"CPU count should be positive\"\n",
|
||||
" assert sys_info['memory_gb'] > 0, \"Memory should be positive\"\n",
|
||||
" assert len(sys_info['python_version']) > 0, \"Python version should not be empty\"\n",
|
||||
" \n",
|
||||
" # Test that values are actually queried (not hardcoded)\n",
|
||||
" actual_version = f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n",
|
||||
" assert sys_info['python_version'] == actual_version, \"Python version should match actual system\"\n",
|
||||
" \n",
|
||||
" print(\"\u2705 System info function tests passed!\")\n",
|
||||
" print(f\"\u2705 Python: {sys_info['python_version']} on {sys_info['platform']}\")\n",
|
||||
"\n",
|
||||
"# Run the test\n",
|
||||
"test_unit_system_info_basic()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e0e55b7e",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## \ud83e\uddea Testing Your Configuration Functions\n",
|
||||
"\n",
|
||||
"### The Importance of Testing in ML Systems\n",
|
||||
"Before we test your implementation, let's understand why testing is crucial in ML systems:\n",
|
||||
"\n",
|
||||
"#### 1. **Reliability**\n",
|
||||
"- **Function correctness**: Does your code do what it's supposed to?\n",
|
||||
"- **Edge case handling**: What happens with unexpected inputs?\n",
|
||||
"- **Error detection**: Catch bugs before they cause problems\n",
|
||||
"\n",
|
||||
"#### 2. **Reproducibility**\n",
|
||||
"- **Consistent behavior**: Same inputs always produce same outputs\n",
|
||||
"- **Environment validation**: Ensure setup works across different systems\n",
|
||||
"- **Regression prevention**: New changes don't break existing functionality\n",
|
||||
"\n",
|
||||
"#### 3. **Professional Development**\n",
|
||||
"- **Code quality**: Well-tested code is maintainable code\n",
|
||||
"- **Collaboration**: Others can trust and extend your work\n",
|
||||
"- **Documentation**: Tests serve as executable documentation\n",
|
||||
"\n",
|
||||
"#### 4. **ML-Specific Concerns**\n",
|
||||
"- **Data validation**: Ensure data types and shapes are correct\n",
|
||||
"- **Performance verification**: Check that optimizations work\n",
|
||||
"- **System compatibility**: Verify cross-platform behavior\n",
|
||||
"\n",
|
||||
"### Testing Strategy\n",
|
||||
"We'll use comprehensive testing that checks:\n",
|
||||
"- **Return types**: Are outputs the correct data types?\n",
|
||||
"- **Required fields**: Are all expected keys present?\n",
|
||||
"- **Data validation**: Are values reasonable and properly formatted?\n",
|
||||
"- **System accuracy**: Do queries match actual system state?\n",
|
||||
"\n",
|
||||
"Now let's test your configuration functions!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "56c9c340",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"### \ud83c\udfaf Additional Comprehensive Tests\n",
|
||||
"\n",
|
||||
"These comprehensive tests validate that your configuration functions work together and integrate properly with the TinyTorch system."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f9ed0b99",
|
||||
"metadata": {
|
||||
"cell_marker": "\"\"\""
|
||||
},
|
||||
"source": [
|
||||
"## \ud83c\udfaf MODULE SUMMARY: Setup Configuration\n",
|
||||
"\n",
|
||||
"You've successfully configured your TinyTorch installation and learned the foundations of ML systems engineering:\n",
|
||||
"\n",
|
||||
"### What You've Accomplished\n",
|
||||
"\u2705 **Personal Configuration**: Set up your identity and custom system name \n",
|
||||
"\u2705 **System Queries**: Learned to gather hardware and software information \n",
|
||||
"\u2705 **NBGrader Workflow**: Mastered solution blocks and automated testing \n",
|
||||
"\u2705 **Code Export**: Created functions that become part of your tinytorch package \n",
|
||||
"\u2705 **Professional Setup**: Established proper development practices \n",
|
||||
"\n",
|
||||
"### Key Concepts You've Learned\n",
|
||||
"\n",
|
||||
"#### 1. **System Awareness**\n",
|
||||
"- **Hardware constraints**: Understanding CPU, memory, and architecture limitations\n",
|
||||
"- **Software dependencies**: Python version and platform compatibility\n",
|
||||
"- **Performance implications**: How system specs affect ML workloads\n",
|
||||
"\n",
|
||||
"#### 2. **Configuration Management**\n",
|
||||
"- **Personal identification**: Professional attribution and contact information\n",
|
||||
"- **Environment documentation**: Reproducible system specifications\n",
|
||||
"- **Professional standards**: Industry-standard development practices\n",
|
||||
"\n",
|
||||
"#### 3. **ML Systems Foundations**\n",
|
||||
"- **Reproducibility**: System context for experiment tracking\n",
|
||||
"- **Debugging**: Hardware info for performance troubleshooting\n",
|
||||
"- **Collaboration**: Proper attribution and contact information\n",
|
||||
"\n",
|
||||
"#### 4. **Development Workflow**\n",
|
||||
"- **NBGrader integration**: Automated testing and grading\n",
|
||||
"- **Code export**: Functions become part of production package\n",
|
||||
"- **Testing practices**: Comprehensive validation of functionality\n",
|
||||
"\n",
|
||||
"### Next Steps in Your ML Systems Journey\n",
|
||||
"\n",
|
||||
"#### **Immediate Actions**\n",
|
||||
"1. **Export your code**: `tito module export 01_setup`\n",
|
||||
"2. **Test your installation**: \n",
|
||||
" ```python\n",
|
||||
" from tinytorch.core.setup import personal_info, system_info\n",
|
||||
" print(personal_info()) # Your personal details\n",
|
||||
" print(system_info()) # System information\n",
|
||||
" ```\n",
|
||||
"3. **Verify package integration**: Ensure your functions work in the tinytorch package\n",
|
||||
"\n",
|
||||
"#### **Looking Ahead**\n",
|
||||
"- **Module 1 (Tensor)**: Build the fundamental data structure for ML\n",
|
||||
"- **Module 2 (Activations)**: Add nonlinearity for complex learning\n",
|
||||
"- **Module 3 (Layers)**: Create the building blocks of neural networks\n",
|
||||
"- **Module 4 (Networks)**: Compose layers into powerful architectures\n",
|
||||
"\n",
|
||||
"#### **Course Progression**\n",
|
||||
"You're now ready to build a complete ML system from scratch:\n",
|
||||
"```\n",
|
||||
"Setup \u2192 Tensor \u2192 Activations \u2192 Layers \u2192 Networks \u2192 CNN \u2192 DataLoader \u2192 \n",
|
||||
"Autograd \u2192 Optimizers \u2192 Training \u2192 Compression \u2192 Kernels \u2192 Benchmarking \u2192 MLOps\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"### Professional Development Milestone\n",
|
||||
"\n",
|
||||
"You've taken your first step in ML systems engineering! This module taught you:\n",
|
||||
"- **System thinking**: Understanding hardware and software constraints\n",
|
||||
"- **Professional practices**: Proper attribution, testing, and documentation\n",
|
||||
"- **Tool mastery**: NBGrader workflow and package development\n",
|
||||
"- **Foundation building**: Creating reusable, tested, documented code\n",
|
||||
"\n",
|
||||
"**Ready for the next challenge?** Let's build the foundation of ML systems with tensors!"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"jupytext": {
|
||||
"main_language": "python"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
105
binder/ASSIGNMENTS_DYNAMIC.md
Normal file
@@ -0,0 +1,105 @@
|
||||
# Assignments Are Dynamically Generated
|
||||
|
||||
## Important: Assignments Directory Structure
|
||||
|
||||
**All assignments are dynamically generated** from the `modules/` directory using `tito nbgrader` commands. The `assignments/` directory should **not** be manually maintained.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Source of Truth: `modules/` Directory
|
||||
|
||||
The actual module content lives in:
|
||||
```
|
||||
modules/
|
||||
├── 01_tensor/
|
||||
│ └── tensor_dev.py ← Source of truth
|
||||
├── 02_activations/
|
||||
│ └── activations_dev.py ← Source of truth
|
||||
└── ...
|
||||
```
|
||||
|
||||
### Dynamic Generation: `assignments/` Directory
|
||||
|
||||
Assignments are **generated** from modules using:
|
||||
|
||||
```bash
|
||||
# Generate assignment for a single module
|
||||
tito nbgrader generate 01_tensor
|
||||
|
||||
# Generate assignments for all modules
|
||||
tito nbgrader generate --all
|
||||
|
||||
# Generate assignments for a range
|
||||
tito nbgrader generate --range 01-05
|
||||
```
|
||||
|
||||
This creates:
|
||||
```
|
||||
assignments/
|
||||
├── source/
|
||||
│ ├── 01_tensor/
|
||||
│ │ └── 01_tensor.ipynb ← Generated from modules/01_tensor/tensor_dev.py
|
||||
│ └── 02_activations/
|
||||
│ └── 02_activations.ipynb ← Generated from modules/02_activations/activations_dev.py
|
||||
└── release/
|
||||
└── ... (student versions, generated via 'tito nbgrader release')
|
||||
```
|
||||
|
||||
## Process Flow
|
||||
|
||||
```
|
||||
modules/01_tensor/tensor_dev.py
|
||||
↓ (tito nbgrader generate)
|
||||
↓ (jupytext converts .py → .ipynb)
|
||||
↓ (NotebookGenerator processes with nbgrader markers)
|
||||
assignments/source/01_tensor/01_tensor.ipynb
|
||||
↓ (tito nbgrader release)
|
||||
assignments/release/01_tensor/01_tensor.ipynb (student version)
|
||||
```
|
||||
|
||||
## What This Means
|
||||
|
||||
1. **Don't manually edit** `assignments/source/` files - they're generated
|
||||
2. **Edit modules** in `modules/` directory instead
|
||||
3. **Regenerate assignments** when modules change: `tito nbgrader generate`
|
||||
4. **Old assignments** (like `01_setup`) are outdated - regenerate from current modules
|
||||
|
||||
## Outdated Assignment: `01_setup`
|
||||
|
||||
The `assignments/source/01_setup/` directory is **outdated** because:
|
||||
- Module 01 is now "Tensor" (`modules/01_tensor/`)
|
||||
- It was created when Module 01 was "Setup" (old structure)
|
||||
- Should be regenerated: `tito nbgrader generate 01_tensor`
|
||||
|
||||
## For Binder/Colab
|
||||
|
||||
**No impact** - Binder setup doesn't depend on assignment notebooks. However:
|
||||
- If you want to include assignments in Binder, regenerate them first:
|
||||
```bash
|
||||
tito nbgrader generate --all
|
||||
```
|
||||
- Students can access modules directly from `modules/` directory
|
||||
- Assignments are optional - modules are the source of truth
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Always regenerate** assignments after modifying modules
|
||||
2. **Don't commit** manually edited assignment files
|
||||
3. **Use `tito nbgrader generate`** to create assignments
|
||||
4. **Keep modules/** as the single source of truth
|
||||
|
||||
## Commands Reference
|
||||
|
||||
```bash
|
||||
# Generate assignments
|
||||
tito nbgrader generate 01_tensor # Single module
|
||||
tito nbgrader generate --all # All modules
|
||||
tito nbgrader generate --range 01-05 # Range
|
||||
|
||||
# Release to students (removes solutions)
|
||||
tito nbgrader release 01_tensor
|
||||
|
||||
# Generate feedback
|
||||
tito nbgrader feedback 01_tensor
|
||||
```
|
||||
|
||||
449
binder/BENCHMARK_COMMUNITY_COMMANDS.md
Normal file
@@ -0,0 +1,449 @@
|
||||
# Benchmark & Community Commands Design
|
||||
|
||||
## Command Structure
|
||||
|
||||
### Benchmark Commands (Performance)
|
||||
|
||||
**Two Types of Benchmarks:**
|
||||
|
||||
1. **Baseline Benchmark** (`tito benchmark baseline`)
|
||||
- Lightweight, runs after setup
|
||||
- Quick validation: "Everything works!"
|
||||
- Basic operations: tensor ops, simple forward pass
|
||||
- **Purpose**: Hello world moment, verify setup
|
||||
|
||||
2. **Capstone Benchmark** (`tito benchmark capstone`)
|
||||
- Full benchmark suite (Module 20)
|
||||
- Proper performance metrics
|
||||
- All optimization tracks: Speed, Compression, Accuracy, Efficiency
|
||||
- **Purpose**: Real performance evaluation, leaderboard
|
||||
|
||||
### Community Commands (Cohort Feeling)
|
||||
|
||||
1. **Join** (`tito community join`)
|
||||
- Add to community map
|
||||
- Share location, institution, course type
|
||||
- **Purpose**: "I'm part of the cohort!"
|
||||
|
||||
2. **Update** (`tito community update`)
|
||||
- Update progress: milestones, modules completed
|
||||
- Refresh community entry
|
||||
- **Purpose**: Track progress in community
|
||||
|
||||
3. **Stats** (`tito community stats`)
|
||||
- See community statistics
|
||||
- Your cohort info
|
||||
- **Purpose**: "See who else is building"
|
||||
|
||||
4. **Cohort** (`tito community cohort`)
|
||||
- See your cohort members
|
||||
- Filter by institution, course type, date
|
||||
- **Purpose**: "These are my peers!"
|
||||
|
||||
## Command Details
|
||||
|
||||
### 1. Baseline Benchmark
|
||||
|
||||
**Command**: `tito benchmark baseline`
|
||||
|
||||
**When to run**: After setup, anytime
|
||||
|
||||
**What it does**:
|
||||
- Runs lightweight benchmarks (no full module 20 needed)
|
||||
- Tests: tensor creation, matrix multiply, simple forward pass
|
||||
- Generates JSON with baseline scores
|
||||
- Shows celebration message
|
||||
|
||||
**Output**:
|
||||
```
|
||||
🎉 Baseline Benchmark Complete!
|
||||
|
||||
📊 Your Baseline Performance:
|
||||
• Tensor Operations: ⚡ 0.5ms
|
||||
• Matrix Multiply: ⚡ 2.3ms
|
||||
• Forward Pass: ⚡ 5.2ms
|
||||
• Score: 85/100
|
||||
|
||||
✅ Setup verified and working!
|
||||
|
||||
💡 Run 'tito benchmark capstone' after Module 20 for full benchmarks
|
||||
```
|
||||
|
||||
**JSON Output**: `benchmarks/baseline_TIMESTAMP.json`
|
||||
|
||||
### 2. Capstone Benchmark
|
||||
|
||||
**Command**: `tito benchmark capstone [--track TRACK]`
|
||||
|
||||
**When to run**: After Module 20 (Capstone)
|
||||
|
||||
**What it does**:
|
||||
- Runs full benchmark suite from Module 20
|
||||
- Tests all optimization tracks:
|
||||
- Speed: Inference latency, throughput
|
||||
- Compression: Model size, quantization
|
||||
- Accuracy: Task performance
|
||||
- Efficiency: Memory, energy
|
||||
- Generates comprehensive JSON
|
||||
- Can submit to leaderboard
|
||||
|
||||
**Tracks**:
|
||||
- `--track speed`: Latency/throughput benchmarks
|
||||
- `--track compression`: Size/quantization benchmarks
|
||||
- `--track accuracy`: Task performance benchmarks
|
||||
- `--track efficiency`: Memory/energy benchmarks
|
||||
- `--track all`: All tracks (default)
|
||||
|
||||
**Output**:
|
||||
```
|
||||
🏆 Capstone Benchmark Results
|
||||
|
||||
📊 Speed Track:
|
||||
• Inference Latency: 45.2ms
|
||||
• Throughput: 22.1 ops/sec
|
||||
• Score: 92/100
|
||||
|
||||
📊 Compression Track:
|
||||
• Model Size: 12.4MB
|
||||
• Compression Ratio: 4.2x
|
||||
• Score: 88/100
|
||||
|
||||
📊 Overall Score: 90/100
|
||||
|
||||
🌍 Submit to leaderboard: tito community submit --benchmark
|
||||
```
|
||||
|
||||
**JSON Output**: `benchmarks/capstone_TIMESTAMP.json`
|
||||
|
||||
### 3. Community Join
|
||||
|
||||
**Command**: `tito community join`
|
||||
|
||||
**When to run**: After setup, anytime
|
||||
|
||||
**What it does**:
|
||||
- Collects: country, institution, course type (optional)
|
||||
- Validates setup
|
||||
- Generates anonymous ID
|
||||
- Adds to community map
|
||||
- Shows cohort info
|
||||
|
||||
**Output**:
|
||||
```
|
||||
🌍 Join the TinyTorch Community
|
||||
|
||||
📍 Country: [Auto-detected: United States]
|
||||
🏫 Institution (optional): Harvard University
|
||||
📚 Course Type (optional): University course
|
||||
|
||||
✅ You've joined the TinyTorch Community!
|
||||
|
||||
📍 Location: United States
|
||||
🏫 Institution: Harvard University
|
||||
🌍 View map: https://tinytorch.ai/community
|
||||
|
||||
🎖️ You're builder #1,234 on the global map!
|
||||
|
||||
👥 Your Cohort:
|
||||
• Fall 2024 cohort: 234 builders
|
||||
• Harvard University: 15 builders
|
||||
• University courses: 456 builders
|
||||
|
||||
💡 Run 'tito community cohort' to see your peers
|
||||
```
|
||||
|
||||
**JSON Output**: `community/my_submission.json`
|
||||
|
||||
### 4. Community Update
|
||||
|
||||
**Command**: `tito community update`
|
||||
|
||||
**When to run**: After milestones pass, module completion
|
||||
|
||||
**What it does**:
|
||||
- Updates existing community entry
|
||||
- Adds: milestones passed, modules completed
|
||||
- Refreshes cohort stats
|
||||
- Shows updated progress
|
||||
|
||||
**Output**:
|
||||
```
|
||||
✅ Community Entry Updated!
|
||||
|
||||
📊 Your Progress:
|
||||
• Milestones Passed: 6/6 ✅
|
||||
• Modules Completed: 20/20 ✅
|
||||
• Capstone Score: 90/100
|
||||
|
||||
👥 Your Cohort Stats:
|
||||
• Fall 2024: 234 builders (you're #15 by progress!)
|
||||
• Harvard: 15 builders (you're #3!)
|
||||
• All milestones: 89 builders worldwide
|
||||
|
||||
🌍 View updated map: https://tinytorch.ai/community
|
||||
```
|
||||
|
||||
### 5. Community Stats
|
||||
|
||||
**Command**: `tito community stats [--cohort]`
|
||||
|
||||
**What it does**:
|
||||
- Shows global community statistics
|
||||
- Shows your cohort information
|
||||
- Shows progress comparisons
|
||||
|
||||
**Output**:
|
||||
```
|
||||
🌍 TinyTorch Community Stats
|
||||
|
||||
📊 Global:
|
||||
• Total Builders: 1,234
|
||||
• Countries: 45
|
||||
• Institutions: 234
|
||||
• This Week: 23 new builders
|
||||
|
||||
👥 Your Cohort (Fall 2024):
|
||||
• Total: 234 builders
|
||||
• Your Institution: 15 builders
|
||||
• Your Progress Rank: #15/234
|
||||
• Milestones Completed: 89/234 (38%)
|
||||
|
||||
📈 Progress Distribution:
|
||||
• All Milestones: 89 (38%)
|
||||
• Some Milestones: 123 (53%)
|
||||
• Just Started: 22 (9%)
|
||||
|
||||
🌍 View full map: https://tinytorch.ai/community
|
||||
```
|
||||
|
||||
### 6. Community Cohort
|
||||
|
||||
**Command**: `tito community cohort [--institution] [--course-type]`
|
||||
|
||||
**What it does**:
|
||||
- Shows your cohort members
|
||||
- Filter by institution, course type, date
|
||||
- Shows progress comparisons
|
||||
- Creates "these are my peers" feeling
|
||||
|
||||
**Output**:
|
||||
```
|
||||
👥 Your TinyTorch Cohort
|
||||
|
||||
🏫 Harvard University Cohort (15 builders):
|
||||
|
||||
Rank | Progress | Joined
|
||||
-----|-----------------|----------
|
||||
#1 | 20/20 modules ✅ | Sep 2024
|
||||
#2 | 20/20 modules ✅ | Sep 2024
|
||||
#3 | 20/20 modules ✅ | Oct 2024 ← You!
|
||||
#4 | 15/20 modules | Oct 2024
|
||||
...
|
||||
|
||||
📚 University Course Cohort (456 builders):
|
||||
• Your rank: #45/456
|
||||
• Top 10% by progress!
|
||||
|
||||
🌍 View full community: https://tinytorch.ai/community
|
||||
```
|
||||
|
||||
## Cohort Features
|
||||
|
||||
### Creating "Cohort Feeling"
|
||||
|
||||
**1. Cohort Identification**
|
||||
- "Fall 2024 Cohort"
|
||||
- "Harvard University Cohort"
|
||||
- "University Course Cohort"
|
||||
- "Self-Paced Cohort"
|
||||
|
||||
**2. Progress Comparison**
|
||||
- "You're #15 in your cohort"
|
||||
- "Top 10% by progress"
|
||||
- "89 builders in your cohort completed all milestones"
|
||||
|
||||
**3. Peer Visibility**
|
||||
- See others from same institution
|
||||
- See others in same course type
|
||||
- See others who joined around same time
|
||||
|
||||
**4. Milestone Celebrations**
|
||||
- "You and 23 others completed Milestone 3 this week!"
|
||||
- "You're part of the 89 builders who completed all milestones!"
|
||||
|
||||
## Data Structure
|
||||
|
||||
### Community Submission
|
||||
|
||||
```json
|
||||
{
|
||||
"anonymous_id": "abc123...",
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
|
||||
"location": {
|
||||
"country": "United States"
|
||||
},
|
||||
|
||||
"institution": {
|
||||
"name": "Harvard University",
|
||||
"type": "university"
|
||||
},
|
||||
|
||||
"context": {
|
||||
"course_type": "university_course",
|
||||
"cohort": "Fall 2024", // Auto-determined by date
|
||||
"experience_level": "intermediate"
|
||||
},
|
||||
|
||||
"progress": {
|
||||
"setup_verified": true,
|
||||
"milestones_passed": 6,
|
||||
"modules_completed": 20,
|
||||
"capstone_score": 90
|
||||
},
|
||||
|
||||
"benchmarks": {
|
||||
"baseline": {
|
||||
"score": 85,
|
||||
"timestamp": "2024-11-20T10:00:00Z"
|
||||
},
|
||||
"capstone": {
|
||||
"score": 90,
|
||||
"tracks": {
|
||||
"speed": 92,
|
||||
"compression": 88,
|
||||
"accuracy": 95,
|
||||
"efficiency": 85
|
||||
},
|
||||
"timestamp": "2024-11-25T15:30:00Z"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Implementation Structure
|
||||
|
||||
### Commands to Create
|
||||
|
||||
**Benchmark Commands** (`tito/commands/benchmark.py`):
|
||||
- `tito benchmark baseline` - Quick setup validation
|
||||
- `tito benchmark capstone` - Full Module 20 benchmarks
|
||||
- `tito benchmark submit` - Submit to leaderboard
|
||||
|
||||
**Community Commands** (`tito/commands/community.py`):
|
||||
- `tito community join` - Join community map
|
||||
- `tito community update` - Update progress
|
||||
- `tito community stats` - View statistics
|
||||
- `tito community cohort` - See your cohort
|
||||
- `tito community submit` - Submit benchmarks to leaderboard
|
||||
|
||||
## User Journey with Cohort Feeling
|
||||
|
||||
```
|
||||
1. Clone & Setup
|
||||
↓
|
||||
2. tito system doctor ✅
|
||||
↓
|
||||
3. tito community join
|
||||
→ "You're builder #1,234"
|
||||
→ "Fall 2024 cohort: 234 builders"
|
||||
→ "Harvard: 15 builders"
|
||||
↓
|
||||
4. tito benchmark baseline
|
||||
→ "Score: 85/100"
|
||||
→ "You're in top 25% of your cohort!"
|
||||
↓
|
||||
5. Build modules...
|
||||
↓
|
||||
6. tito community update
|
||||
→ "Milestones: 6/6 ✅"
|
||||
→ "You're #15 in your cohort!"
|
||||
↓
|
||||
7. Complete Module 20...
|
||||
↓
|
||||
8. tito benchmark capstone
|
||||
→ "Score: 90/100"
|
||||
→ "You're #3 at Harvard!"
|
||||
↓
|
||||
9. tito community submit --benchmark
|
||||
→ "Added to leaderboard!"
|
||||
→ "Rank: #45 globally, #3 at Harvard"
|
||||
↓
|
||||
10. tito community cohort
|
||||
→ See your peers
|
||||
→ "These are the builders in my cohort!"
|
||||
```
|
||||
|
||||
## Cohort Features
|
||||
|
||||
### What Creates Cohort Feeling
|
||||
|
||||
**1. Temporal Cohorts**
|
||||
- "Fall 2024 Cohort" (by join date)
|
||||
- "This Week's Cohort" (recent joiners)
|
||||
- "All-Time Builders" (everyone)
|
||||
|
||||
**2. Institutional Cohorts**
|
||||
- "Harvard University Cohort"
|
||||
- "Stanford Cohort"
|
||||
- "Self-Paced Cohort"
|
||||
|
||||
**3. Progress Cohorts**
|
||||
- "All Milestones Cohort" (completed everything)
|
||||
- "Foundation Tier Cohort" (completed modules 1-7)
|
||||
- "Capstone Cohort" (completed module 20)
|
||||
|
||||
**4. Course Type Cohorts**
|
||||
- "University Course Cohort"
|
||||
- "Bootcamp Cohort"
|
||||
- "Self-Paced Cohort"
|
||||
|
||||
### Cohort Messages
|
||||
|
||||
**After joining:**
|
||||
```
|
||||
👥 Welcome to the Fall 2024 Cohort!
|
||||
|
||||
You're joining 234 builders who started TinyTorch this semester.
|
||||
15 builders are from Harvard University (your institution).
|
||||
|
||||
🌍 View your cohort: tito community cohort
|
||||
```
|
||||
|
||||
**After milestones:**
|
||||
```
|
||||
🎉 Milestone Achievement!
|
||||
|
||||
You and 23 others in the Fall 2024 cohort completed Milestone 3 this week!
|
||||
You're now part of the 89 builders who've completed all milestones.
|
||||
|
||||
👥 See your cohort progress: tito community cohort
|
||||
```
|
||||
|
||||
**After capstone:**
|
||||
```
|
||||
🏆 Capstone Complete!
|
||||
|
||||
You're #3 in the Harvard cohort!
|
||||
You're #45 globally among all builders.
|
||||
|
||||
👥 Your cohort stats: tito community cohort
|
||||
```
|
||||
|
||||
## Implementation Priority
|
||||
|
||||
### Phase 1: Core Commands
|
||||
1. ✅ `tito community join` - Join community
|
||||
2. ✅ `tito benchmark baseline` - Quick validation
|
||||
3. ✅ `tito community stats` - View stats
|
||||
|
||||
### Phase 2: Progress Tracking
|
||||
4. ✅ `tito community update` - Update progress
|
||||
5. ✅ `tito community cohort` - See cohort
|
||||
|
||||
### Phase 3: Capstone Integration
|
||||
6. ✅ `tito benchmark capstone` - Full benchmarks
|
||||
7. ✅ `tito community submit` - Submit to leaderboard
|
||||
|
||||
This creates a complete system where students feel part of a cohort from day one! 🎓🌍
|
||||
|
||||
169
binder/BUILD_INTEGRATION.md
Normal file
@@ -0,0 +1,169 @@
|
||||
# Automatic Notebook Preparation in Site Build
|
||||
|
||||
## Overview
|
||||
|
||||
Notebook preparation is now **automatically integrated** into the site build process. When you build the site, notebooks are automatically prepared for launch buttons to work.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Automatic Integration
|
||||
|
||||
The build process now includes notebook preparation:
|
||||
|
||||
```bash
|
||||
cd site
|
||||
make html # Automatically prepares notebooks, then builds site
|
||||
jupyter-book build . # Also prepares notebooks automatically
|
||||
```
|
||||
|
||||
### Build Flow
|
||||
|
||||
```
|
||||
1. User runs: make html
|
||||
↓
|
||||
2. prepare_notebooks.sh runs automatically
|
||||
↓
|
||||
3. Script looks for existing assignment notebooks
|
||||
↓
|
||||
4. Copies them to site/chapters/modules/
|
||||
↓
|
||||
5. Jupyter Book builds site
|
||||
↓
|
||||
6. Launch buttons appear on notebook pages!
|
||||
```
|
||||
|
||||
## What Gets Prepared
|
||||
|
||||
### Source: Assignment Notebooks
|
||||
The script uses notebooks from `assignments/source/` (generated via `tito nbgrader generate`):
|
||||
|
||||
```
|
||||
assignments/source/01_tensor/01_tensor.ipynb
|
||||
↓ (copied during build)
|
||||
site/chapters/modules/01_tensor.ipynb
|
||||
```
|
||||
|
||||
### Why Assignment Notebooks?
|
||||
- Already processed with nbgrader markers
|
||||
- Student-ready format
|
||||
- Generated from Python source files
|
||||
- Consistent with assignment workflow
|
||||
|
||||
## Build Commands
|
||||
|
||||
All build commands now include notebook preparation:
|
||||
|
||||
### HTML Build
|
||||
```bash
|
||||
cd site
|
||||
make html
|
||||
# Or directly:
|
||||
jupyter-book build .
|
||||
```
|
||||
|
||||
### PDF Builds
|
||||
```bash
|
||||
make pdf-simple # HTML-to-PDF (includes notebook prep)
|
||||
make pdf # LaTeX PDF (includes notebook prep)
|
||||
```
|
||||
|
||||
## Manual Preparation (Optional)
|
||||
|
||||
If you want to prepare notebooks manually:
|
||||
|
||||
```bash
|
||||
cd site
|
||||
./prepare_notebooks.sh
|
||||
```
|
||||
|
||||
This is useful for:
|
||||
- Testing notebook preparation
|
||||
- Debugging launch button issues
|
||||
- Preparing notebooks before CI/CD builds
|
||||
|
||||
## Workflow Summary
|
||||
|
||||
### Complete Development → Site Flow
|
||||
|
||||
```
|
||||
1. Development
|
||||
Edit: modules/01_tensor/tensor_dev.py
|
||||
|
||||
2. Generate Assignments
|
||||
Run: tito nbgrader generate 01_tensor
|
||||
Creates: assignments/source/01_tensor/01_tensor.ipynb
|
||||
|
||||
3. Build Site (automatic)
|
||||
Run: cd site && make html
|
||||
Auto-prepares: Copies notebooks to site/chapters/modules/
|
||||
Builds: Jupyter Book with launch buttons
|
||||
|
||||
4. Launch Buttons Work!
|
||||
Users click → Binder/Colab opens with notebook
|
||||
```
|
||||
|
||||
## Benefits
|
||||
|
||||
✅ **Automatic**: No manual steps needed
|
||||
✅ **Consistent**: Always uses latest notebooks
|
||||
✅ **Fast**: Uses existing assignment notebooks when available
|
||||
✅ **Robust**: Falls back gracefully if notebooks don't exist
|
||||
✅ **Integrated**: Works with all build commands
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Launch Buttons Don't Appear
|
||||
|
||||
1. **Check notebooks exist**:
|
||||
```bash
|
||||
ls site/chapters/modules/*.ipynb
|
||||
```
|
||||
|
||||
2. **Regenerate assignments**:
|
||||
```bash
|
||||
tito nbgrader generate --all
|
||||
```
|
||||
|
||||
3. **Rebuild site**:
|
||||
```bash
|
||||
cd site && make html
|
||||
```
|
||||
|
||||
### Notebooks Not Found
|
||||
|
||||
If you see "No notebooks prepared":
|
||||
- Run `tito nbgrader generate --all` first
|
||||
- Ensure modules have Python source files
|
||||
- Check that `tito` command is available
|
||||
|
||||
### Build Fails
|
||||
|
||||
The prepare script is designed to fail gracefully:
|
||||
- If `tito` is not available, it skips preparation
|
||||
- If notebooks don't exist, it warns but continues
|
||||
- Build continues even if preparation fails
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
For automated builds (GitHub Actions, etc.):
|
||||
|
||||
```yaml
|
||||
# Example GitHub Actions step
|
||||
- name: Build site
|
||||
run: |
|
||||
cd site
|
||||
make html
|
||||
```
|
||||
|
||||
The prepare script automatically handles:
|
||||
- Missing `tito` command (skips gracefully)
|
||||
- Missing notebooks (warns but continues)
|
||||
- Non-git environments (works in CI/CD)
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. ✅ Notebook preparation integrated into build
|
||||
2. ✅ Launch buttons will work automatically
|
||||
3. ⏳ Test Binder/Colab links after build
|
||||
4. ⏳ Verify launch buttons appear on site
|
||||
|
||||
29
binder/CLEANUP_NOTES.md
Normal file
@@ -0,0 +1,29 @@
|
||||
# Cleanup Notes: Old 01_setup Module
|
||||
|
||||
## Issue
|
||||
The `assignments/source/01_setup/` directory contains an outdated notebook from when Module 01 was "Setup". Module 01 is now "Tensor" (`modules/01_tensor/`).
|
||||
|
||||
## Current State
|
||||
- ✅ **Current Module 01**: `modules/01_tensor/` (Tensor)
|
||||
- ⚠️ **Old Assignment**: `assignments/source/01_setup/` (outdated)
|
||||
- ✅ **Current Assignment**: `assignments/source/02_tensor/` (Tensor)
|
||||
|
||||
## Impact on Binder/Colab
|
||||
**No impact** - Binder setup doesn't depend on specific assignment notebooks. The `binder/` configuration:
|
||||
- Installs TinyTorch package (`pip install -e .`)
|
||||
- Provides JupyterLab environment
|
||||
- Students can access any notebooks in the repository
|
||||
|
||||
## References Updated
|
||||
- ✅ `binder/VERIFY.md` - Updated Colab example to use `02_tensor`
|
||||
- ✅ `site/usage-paths/classroom-use.md` - Updated nbgrader commands
|
||||
- ✅ `docs/STUDENT_QUICKSTART.md` - Updated module references
|
||||
|
||||
## Recommendation
|
||||
The old `assignments/source/01_setup/` directory can be:
|
||||
1. **Removed** if no longer needed (cleanest option)
|
||||
2. **Kept** if you want to preserve old assignments for reference
|
||||
3. **Moved** to an archive directory if you want to keep history
|
||||
|
||||
**For Binder/Colab**: No action needed - they work regardless of this old directory.
|
||||
|
||||
146
binder/CLOUD_NOTEBOOK_OPTIONS.md
Normal file
@@ -0,0 +1,146 @@
|
||||
# Cloud Notebook Options for TinyTorch
|
||||
|
||||
## Current Setup
|
||||
|
||||
**Currently Configured:**
|
||||
- ✅ **MyBinder** (`https://mybinder.org`) - Free, open-source, works well
|
||||
- ✅ **Google Colab** (`https://colab.research.google.com`) - Free, popular, GPU access
|
||||
|
||||
## Available Options
|
||||
|
||||
### 1. MyBinder (Current) ✅
|
||||
**Pros:**
|
||||
- Free and open-source
|
||||
- No account required
|
||||
- Works directly from GitHub
|
||||
- Good for educational use
|
||||
- Already configured and working
|
||||
|
||||
**Cons:**
|
||||
- Can be slow to start (2-5 minutes)
|
||||
- Limited resources (CPU, memory)
|
||||
- No GPU access
|
||||
- Sessions timeout after inactivity
|
||||
|
||||
**Best For:** Educational use, quick demos, zero-setup access
|
||||
|
||||
### 2. Google Colab (Current) ✅
|
||||
**Pros:**
|
||||
- Free tier available
|
||||
- GPU access (free tier: T4 GPU)
|
||||
- Fast startup
|
||||
- Popular and familiar to students
|
||||
- Good integration with Google Drive
|
||||
|
||||
**Cons:**
|
||||
- Requires Google account
|
||||
- Free tier has usage limits
|
||||
- Sessions disconnect after inactivity
|
||||
- Can be slow during peak times
|
||||
|
||||
**Best For:** Students who need GPU, familiar Google ecosystem
|
||||
|
||||
### 3. Deepnote (Not Currently Configured)
|
||||
**Pros:**
|
||||
- Modern, polished interface
|
||||
- Real-time collaboration
|
||||
- Good for team projects
|
||||
- Free tier available
|
||||
- Better than Colab for some use cases
|
||||
|
||||
**Cons:**
|
||||
- Less well-known than Colab
|
||||
- Requires account
|
||||
- Free tier limitations
|
||||
|
||||
**Best For:** Team collaboration, professional workflows
|
||||
|
||||
**How to Add:**
|
||||
```yaml
|
||||
# In site/_config.yml
|
||||
launch_buttons:
|
||||
deepnote_url: "https://deepnote.com"
|
||||
```
|
||||
|
||||
### 4. JupyterHub (For Institutions)
|
||||
**Pros:**
|
||||
- Self-hosted control
|
||||
- Institutional integration
|
||||
- Can provide GPUs
|
||||
- Scalable
|
||||
|
||||
**Cons:**
|
||||
- Requires server infrastructure
|
||||
- Setup complexity
|
||||
- Maintenance overhead
|
||||
|
||||
**Best For:** Universities, institutions with IT support
|
||||
|
||||
### 5. Kaggle Notebooks
|
||||
**Pros:**
|
||||
- Free GPU access
|
||||
- Popular ML community
|
||||
- Good for competitions
|
||||
|
||||
**Cons:**
|
||||
- Less flexible than Colab
|
||||
- More focused on competitions
|
||||
|
||||
**Best For:** ML competitions, Kaggle users
|
||||
|
||||
## Recommendation for TinyTorch
|
||||
|
||||
### Current Setup is Good ✅
|
||||
|
||||
**MyBinder + Colab** covers most use cases:
|
||||
- **MyBinder**: Zero-setup, no account needed, perfect for quick access
|
||||
- **Colab**: GPU access when needed, familiar to students
|
||||
|
||||
### Optional Addition: Deepnote
|
||||
|
||||
If you want to add Deepnote for better collaboration:
|
||||
|
||||
1. **Add to config:**
|
||||
```yaml
|
||||
launch_buttons:
|
||||
binderhub_url: "https://mybinder.org"
|
||||
colab_url: "https://colab.research.google.com"
|
||||
deepnote_url: "https://deepnote.com" # Add this
|
||||
```
|
||||
|
||||
2. **Benefits:**
|
||||
- Better collaboration features
|
||||
- More modern interface
|
||||
- Good for team projects
|
||||
|
||||
3. **Considerations:**
|
||||
- Adds another option (might be confusing)
|
||||
- Students need to create account
|
||||
- Current setup already works well
|
||||
|
||||
## What About "Mariomi"?
|
||||
|
||||
I couldn't find a tool called "Mariomi" related to notebooks. You might be thinking of:
|
||||
- **MyST** (MyST Markdown) - Already used by Jupyter Book (for documentation)
|
||||
- **Miro** - Collaboration whiteboard (not for notebooks)
|
||||
- **Deepnote** - Modern notebook platform (see above)
|
||||
|
||||
## My Recommendation
|
||||
|
||||
**Keep current setup (MyBinder + Colab)** because:
|
||||
1. ✅ Already working
|
||||
2. ✅ Covers all use cases
|
||||
3. ✅ No additional complexity
|
||||
4. ✅ Students familiar with Colab
|
||||
5. ✅ MyBinder perfect for zero-setup access
|
||||
|
||||
**Optional:** Add Deepnote if you want better collaboration features, but it's not necessary.
|
||||
|
||||
## Testing Current Setup
|
||||
|
||||
To verify launch buttons work:
|
||||
1. Build site: `cd site && make html`
|
||||
2. Check notebook pages have launch buttons
|
||||
3. Test Binder: Click "Launch Binder" → Should open MyBinder
|
||||
4. Test Colab: Click "Launch Colab" → Should open in Colab
|
||||
|
||||
262
binder/COMMUNITY_BENCHMARK_DESIGN.md
Normal file
@@ -0,0 +1,262 @@
|
||||
# Community Benchmark & "Hello World" Experience Design
|
||||
|
||||
## Goal: First Success Moment
|
||||
|
||||
Create an immediate "wow, I did it!" moment where students:
|
||||
1. ✅ Clone and setup TinyTorch
|
||||
2. ✅ Run all tests (validate installation)
|
||||
3. ✅ Run milestones (validate their implementation)
|
||||
4. 🎉 Get benchmark score and join the community
|
||||
|
||||
## User Journey Flow
|
||||
|
||||
```
|
||||
Clone & Setup
|
||||
↓
|
||||
tito system doctor (verify setup)
|
||||
↓
|
||||
tito milestone validate --all (run all milestones)
|
||||
↓
|
||||
tito benchmark baseline (generate benchmark score)
|
||||
↓
|
||||
🎉 "Welcome to TinyTorch Community!"
|
||||
↓
|
||||
[Optional] Upload to leaderboard
|
||||
```
|
||||
|
||||
## Implementation Design
|
||||
|
||||
### 1. Baseline Benchmark Command
|
||||
|
||||
**Command**: `tito benchmark baseline`
|
||||
|
||||
**What it does**:
|
||||
- Runs a set of lightweight benchmarks (not full module 20)
|
||||
- Tests basic operations: tensor creation, matrix multiplication, simple forward pass
|
||||
- Measures: execution time, memory usage, basic throughput
|
||||
- Generates JSON with results
|
||||
|
||||
**When to run**:
|
||||
- After `tito system doctor` passes
|
||||
- After `tito milestone validate --all` passes
|
||||
- Can be run anytime to check baseline
|
||||
|
||||
### 2. Benchmark JSON Structure
|
||||
|
||||
```json
|
||||
{
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
"version": "1.0.0",
|
||||
"system": {
|
||||
"platform": "darwin",
|
||||
"python_version": "3.11.0",
|
||||
"numpy_version": "1.24.0",
|
||||
"cpu_count": 8,
|
||||
"memory_gb": 16
|
||||
},
|
||||
"baseline_benchmarks": {
|
||||
"tensor_creation": {
|
||||
"time_ms": 0.5,
|
||||
"memory_mb": 0.1
|
||||
},
|
||||
"matrix_multiply": {
|
||||
"time_ms": 2.3,
|
||||
"throughput_ops_per_sec": 434.78
|
||||
},
|
||||
"simple_forward_pass": {
|
||||
"time_ms": 5.2,
|
||||
"memory_mb": 2.5
|
||||
}
|
||||
},
|
||||
"milestone_status": {
|
||||
"milestone_01_perceptron": "passed",
|
||||
"milestone_02_xor": "passed",
|
||||
"milestone_03_mlp": "passed"
|
||||
},
|
||||
"setup_validated": true,
|
||||
"all_tests_passed": true
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Upload/Submission System
|
||||
|
||||
**Command**: `tito benchmark submit [--public]`
|
||||
|
||||
**What it does**:
|
||||
- Uploads benchmark JSON to server
|
||||
- Gets back: community rank, percentile, badge
|
||||
- Optional: make public on leaderboard
|
||||
|
||||
**Server endpoint** (to be created):
|
||||
- `POST /api/benchmarks/submit`
|
||||
- Returns: `{ "rank": 1234, "percentile": 75, "badge": "🚀 First Steps" }`
|
||||
|
||||
### 4. Community Leaderboard
|
||||
|
||||
**Features**:
|
||||
- Public leaderboard (optional participation)
|
||||
- Shows: rank, percentile, system info, timestamp
|
||||
- Filterable by: system type, date, milestone status
|
||||
- Badges: "🚀 First Steps", "⚡ Fast Setup", "🏆 All Milestones"
|
||||
|
||||
### 5. "Hello World" Experience
|
||||
|
||||
**After `tito benchmark baseline`**:
|
||||
|
||||
```
|
||||
🎉 Congratulations! You've successfully set up TinyTorch!
|
||||
|
||||
📊 Your Baseline Performance:
|
||||
• Tensor Operations: ⚡ Fast (0.5ms)
|
||||
• Matrix Multiply: ⚡ Fast (2.3ms)
|
||||
• Forward Pass: ⚡ Fast (5.2ms)
|
||||
|
||||
✅ Milestones Validated: 3/6 passed
|
||||
|
||||
🌍 Join the Community:
|
||||
Run 'tito benchmark submit' to share your results
|
||||
and see how you compare to others worldwide!
|
||||
|
||||
📈 Your Score: 85/100
|
||||
You're in the top 25% of TinyTorch users!
|
||||
|
||||
🚀 Next Steps:
|
||||
• Continue building modules
|
||||
• Run 'tito benchmark baseline' anytime
|
||||
• Complete all milestones for full score
|
||||
```
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Phase 1: Baseline Benchmark (Core)
|
||||
|
||||
1. **Create `tito/commands/benchmark.py`**:
|
||||
- `tito benchmark baseline` - Run benchmarks, generate JSON
|
||||
- `tito benchmark submit` - Upload to server (optional)
|
||||
|
||||
2. **Benchmark Suite**:
|
||||
- Lightweight tests (don't require all modules)
|
||||
- Basic tensor operations
|
||||
- Simple forward pass
|
||||
- Memory profiling
|
||||
|
||||
3. **JSON Generation**:
|
||||
- Save to `benchmarks/baseline_YYYYMMDD_HHMMSS.json`
|
||||
- Include system info, benchmark results, milestone status
|
||||
|
||||
### Phase 2: Server Integration
|
||||
|
||||
1. **API Endpoint**:
|
||||
- Simple REST API
|
||||
- Accepts benchmark JSON
|
||||
- Returns rank/percentile/badge
|
||||
- Stores in database
|
||||
|
||||
2. **Leaderboard**:
|
||||
- Public web page
|
||||
- Shows rankings
|
||||
- Filterable/searchable
|
||||
|
||||
### Phase 3: Community Features
|
||||
|
||||
1. **Badges**:
|
||||
- "🚀 First Steps" - Completed baseline
|
||||
- "⚡ Fast Setup" - Top 10% performance
|
||||
- "🏆 All Milestones" - All milestones passed
|
||||
- "🌍 Community Member" - Submitted to leaderboard
|
||||
|
||||
2. **Sharing**:
|
||||
- Generate shareable image/card
|
||||
- "I just set up TinyTorch! Score: 85/100"
|
||||
- Link to leaderboard
|
||||
|
||||
## Technical Considerations
|
||||
|
||||
### Benchmark Design
|
||||
|
||||
**Keep it lightweight**:
|
||||
- Don't require all modules
|
||||
- Use basic operations only
|
||||
- Fast execution (< 30 seconds)
|
||||
- Works after setup + milestone validation
|
||||
|
||||
**What to benchmark**:
|
||||
- Tensor creation speed
|
||||
- Matrix multiplication throughput
|
||||
- Simple forward pass (2-layer network)
|
||||
- Memory efficiency
|
||||
- Basic autograd operations
|
||||
|
||||
### Privacy & Opt-in
|
||||
|
||||
- **Default**: Benchmarks saved locally only
|
||||
- **Optional**: `--public` flag to share
|
||||
- **Anonymized**: System info only (no personal data)
|
||||
- **Consent**: Clear messaging about what's shared
|
||||
|
||||
### Server Architecture
|
||||
|
||||
**Simple approach**:
|
||||
- Static JSON file storage (GitHub Pages?)
|
||||
- Or simple API (Flask/FastAPI)
|
||||
- Database: SQLite or PostgreSQL
|
||||
- Leaderboard: Static site generator
|
||||
|
||||
**More advanced**:
|
||||
- Real-time leaderboard
|
||||
- User accounts (optional)
|
||||
- Historical tracking
|
||||
- Regional comparisons
|
||||
|
||||
## User Experience Flow
|
||||
|
||||
### First Time Setup
|
||||
|
||||
```bash
|
||||
# 1. Clone and setup
|
||||
git clone https://github.com/mlsysbook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
./setup-environment.sh
|
||||
source activate.sh
|
||||
|
||||
# 2. Verify setup
|
||||
tito system doctor
|
||||
# ✅ All checks passed!
|
||||
|
||||
# 3. Run milestones (if modules completed)
|
||||
tito milestone validate --all
|
||||
# ✅ Milestone 01: Perceptron - PASSED
|
||||
# ✅ Milestone 02: XOR - PASSED
|
||||
# ✅ Milestone 03: MLP - PASSED
|
||||
|
||||
# 4. Generate baseline benchmark
|
||||
tito benchmark baseline
|
||||
# 🎉 Congratulations! You've successfully set up TinyTorch!
|
||||
# 📊 Your Baseline Performance: 85/100
|
||||
# 🌍 Run 'tito benchmark submit' to join the community!
|
||||
|
||||
# 5. (Optional) Submit to leaderboard
|
||||
tito benchmark submit --public
|
||||
# ✅ Submitted! You're rank #1234 (top 25%)
|
||||
# 🔗 View leaderboard: https://tinytorch.ai/leaderboard
|
||||
```
|
||||
|
||||
## Benefits
|
||||
|
||||
1. **Immediate Gratification**: "I did it!" moment
|
||||
2. **Community Feeling**: Part of something bigger
|
||||
3. **Motivation**: See how they compare
|
||||
4. **Validation**: Confirms setup worked
|
||||
5. **Progress Tracking**: Can re-run anytime
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Design benchmark suite (what to test)
|
||||
2. Implement `tito benchmark baseline` command
|
||||
3. Create JSON schema
|
||||
4. Design server API (or use GitHub Pages)
|
||||
5. Build leaderboard page
|
||||
6. Add badges/sharing features
|
||||
|
||||
This creates a "hello world" experience that makes students feel successful and part of the community immediately!
|
||||
|
||||
332
binder/COMMUNITY_DATA_COLLECTION.md
Normal file
@@ -0,0 +1,332 @@
|
||||
# Community Data Collection Design
|
||||
|
||||
## Data We Collect (Privacy-Respecting)
|
||||
|
||||
### Required Fields
|
||||
- **Country**: Geographic location (country-level only)
|
||||
- **Setup Verified**: Confirmation that setup works
|
||||
|
||||
### Optional Fields (User Can Skip)
|
||||
- **School/Institution**: University, bootcamp, or organization name
|
||||
- **Course Type**: How they're using TinyTorch
|
||||
- Self-paced learning
|
||||
- University course
|
||||
- Bootcamp/training
|
||||
- Research project
|
||||
- Industry training
|
||||
- **System Type**: Hardware/platform
|
||||
- Apple Silicon
|
||||
- Linux x86
|
||||
- Windows
|
||||
- Cloud (Colab/Binder)
|
||||
- **Experience Level**: (Optional)
|
||||
- Beginner
|
||||
- Intermediate
|
||||
- Advanced
|
||||
|
||||
### What We DON'T Collect
|
||||
- ❌ Personal name
|
||||
- ❌ Email address
|
||||
- ❌ Exact location (city/coordinates)
|
||||
- ❌ IP address
|
||||
- ❌ Any personally identifiable information
|
||||
|
||||
## Data Structure
|
||||
|
||||
### Submission JSON
|
||||
|
||||
```json
|
||||
{
|
||||
"anonymous_id": "abc123...", // Generated hash
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
|
||||
"location": {
|
||||
"country": "United States" // Required
|
||||
},
|
||||
|
||||
"institution": {
|
||||
"name": "Harvard University", // Optional
|
||||
"type": "university" // Optional: university, bootcamp, company, self-paced
|
||||
},
|
||||
|
||||
"context": {
|
||||
"course_type": "university_course", // Optional
|
||||
"experience_level": "intermediate" // Optional
|
||||
},
|
||||
|
||||
"system": {
|
||||
"type": "Apple Silicon", // Optional
|
||||
"platform": "darwin",
|
||||
"python_version": "3.11.0"
|
||||
},
|
||||
|
||||
"progress": {
|
||||
"setup_verified": true,
|
||||
"milestones_passed": 0, // Will update later
|
||||
"modules_completed": 0 // Will update later
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Collection Flow
|
||||
|
||||
### Interactive Prompt
|
||||
|
||||
```bash
|
||||
tito community join
|
||||
|
||||
🌍 Join the TinyTorch Global Community
|
||||
|
||||
This will add your location to the public community map.
|
||||
All information is optional and completely anonymized.
|
||||
|
||||
📍 Country: [Auto-detected: United States]
|
||||
(Press Enter to use detected, or type different country)
|
||||
|
||||
🏫 School/Institution (optional):
|
||||
Examples: "Harvard University", "Stanford", "Self-paced"
|
||||
[Press Enter to skip]
|
||||
|
||||
📚 Course Type (optional):
|
||||
[1] Self-paced learning
|
||||
[2] University course
|
||||
[3] Bootcamp/training
|
||||
[4] Research project
|
||||
[5] Industry training
|
||||
[6] Skip
|
||||
Choose [1-6]:
|
||||
|
||||
💻 System Type (optional):
|
||||
[Auto-detected: Apple Silicon]
|
||||
[Press Enter to use detected, or type different]
|
||||
|
||||
🎓 Experience Level (optional):
|
||||
[1] Beginner
|
||||
[2] Intermediate
|
||||
[3] Advanced
|
||||
[4] Skip
|
||||
Choose [1-4]:
|
||||
|
||||
📊 What will be shared:
|
||||
• Country: United States ✅
|
||||
• Institution: Harvard University ✅
|
||||
• Course Type: University course ✅
|
||||
• System Type: Apple Silicon ✅
|
||||
• No personal information ✅
|
||||
|
||||
🔒 Privacy: Completely anonymized, country-level location only
|
||||
|
||||
Continue? [y/N]: y
|
||||
|
||||
✅ You've joined the TinyTorch Community!
|
||||
|
||||
📍 Location: United States
|
||||
🏫 Institution: Harvard University
|
||||
🌍 View map: https://tinytorch.ai/community
|
||||
|
||||
🎖️ You're builder #1,234 on the global map!
|
||||
|
||||
💡 Your institution will appear on the map (if provided)
|
||||
```
|
||||
|
||||
## Map Visualization Features
|
||||
|
||||
### What the Map Shows
|
||||
|
||||
**Country View:**
|
||||
- Dots/countries with builder counts
|
||||
- "1,234 builders in 45 countries"
|
||||
|
||||
**Institution View** (Optional Filter):
|
||||
- "Builders from 234 institutions"
|
||||
- Top institutions by builder count
|
||||
- "Harvard University: 15 builders"
|
||||
- "Stanford: 12 builders"
|
||||
- "Self-paced: 456 builders"
|
||||
|
||||
**Course Type Breakdown:**
|
||||
- "University courses: 234"
|
||||
- "Self-paced: 456"
|
||||
- "Bootcamps: 89"
|
||||
- "Research: 123"
|
||||
|
||||
**Diversity Stats:**
|
||||
- "Builders from 45 countries"
|
||||
- "234 institutions represented"
|
||||
- "5 course types"
|
||||
- "Diverse experience levels"
|
||||
|
||||
## Privacy Considerations
|
||||
|
||||
### Institution Privacy
|
||||
|
||||
**Options:**
|
||||
1. **Show institution names** (if provided)
|
||||
- Pros: More engaging, shows diversity
|
||||
- Cons: Might identify users in small programs
|
||||
|
||||
2. **Show institution counts only**
|
||||
- Pros: More private
|
||||
- Cons: Less engaging
|
||||
|
||||
3. **Hybrid approach** (Recommended)
|
||||
- Show institution names if ≥3 builders from that institution
|
||||
- Otherwise: "Other institutions: 5 builders"
|
||||
- Protects privacy while showing diversity
|
||||
|
||||
### Consent Flow
|
||||
|
||||
**Clear messaging:**
|
||||
```
|
||||
⚠️ Institution Information
|
||||
|
||||
If you provide your school/institution name, it may appear on the public map.
|
||||
|
||||
🔒 Privacy Protection:
|
||||
• Institution names only shown if ≥3 builders from that institution
|
||||
• No personal names or identifiers
|
||||
• Completely anonymized
|
||||
|
||||
Provide institution? [y/N]:
|
||||
```
|
||||
|
||||
## Map Features
|
||||
|
||||
### Interactive Map
|
||||
|
||||
**Country Level:**
|
||||
- Click country → See stats:
|
||||
- "United States: 456 builders"
|
||||
- "Top institutions: Harvard (15), Stanford (12), MIT (10)"
|
||||
- "Course types: University (234), Self-paced (189)"
|
||||
|
||||
**Institution Filter:**
|
||||
- Filter by institution type
|
||||
- Show: Universities, Bootcamps, Self-paced, etc.
|
||||
- See geographic distribution
|
||||
|
||||
**Course Type View:**
|
||||
- Color-code by course type
|
||||
- Show: "Where are university students?"
|
||||
- Show: "Where are self-paced learners?"
|
||||
|
||||
### Stats Dashboard
|
||||
|
||||
```
|
||||
🌍 TinyTorch Community
|
||||
|
||||
📊 Global Stats:
|
||||
• 1,234 builders worldwide
|
||||
• 45 countries
|
||||
• 234 institutions
|
||||
• 5 course types
|
||||
|
||||
🏫 Top Institutions:
|
||||
1. Harvard University: 15 builders
|
||||
2. Stanford: 12 builders
|
||||
3. MIT: 10 builders
|
||||
4. Self-paced: 456 builders
|
||||
...
|
||||
|
||||
🌎 Geographic Diversity:
|
||||
• United States: 456 builders
|
||||
• India: 234 builders
|
||||
• United Kingdom: 123 builders
|
||||
...
|
||||
|
||||
📚 Course Types:
|
||||
• Self-paced: 456 (37%)
|
||||
• University: 234 (19%)
|
||||
• Bootcamp: 89 (7%)
|
||||
...
|
||||
```
|
||||
|
||||
## Benefits of Collecting This Data
|
||||
|
||||
### For Community
|
||||
- **Visual diversity**: See global reach
|
||||
- **Institutional connections**: "Wow, people from my school!"
|
||||
- **Course type insights**: Understand how TinyTorch is used
|
||||
- **Motivation**: "There are builders from 234 institutions!"
|
||||
|
||||
### For Users
|
||||
- **Representation**: "I'm representing my school!"
|
||||
- **Connection**: Find others from same institution
|
||||
- **Pride**: "My institution is on the map!"
|
||||
|
||||
### For Project
|
||||
- **Adoption tracking**: See where TinyTorch is used
|
||||
- **Diversity metrics**: Geographic and institutional diversity
|
||||
- **Success stories**: "Used in 234 institutions worldwide"
|
||||
|
||||
## Implementation
|
||||
|
||||
### Data Collection
|
||||
|
||||
**Command**: `tito community join`
|
||||
|
||||
**Flow:**
|
||||
1. Auto-detect country (using system locale or geolocation API)
|
||||
2. Ask for institution (optional)
|
||||
3. Ask for course type (optional)
|
||||
4. Auto-detect system type
|
||||
5. Ask for experience level (optional)
|
||||
6. Show summary
|
||||
7. Get consent
|
||||
8. Generate submission
|
||||
|
||||
### Privacy Protection
|
||||
|
||||
**Institution Anonymization:**
|
||||
- If <3 builders from institution → Show as "Other institutions"
|
||||
- If ≥3 builders → Show institution name
|
||||
- Protects privacy while showing diversity
|
||||
|
||||
**Data Storage:**
|
||||
- Anonymous ID (hash, not personal)
|
||||
- No personal identifiers
|
||||
- Country-level only (not city)
|
||||
- Optional fields can be skipped
|
||||
|
||||
## Recommended Fields
|
||||
|
||||
### Required
|
||||
- ✅ Country
|
||||
|
||||
### Highly Recommended (Optional)
|
||||
- ✅ Institution/School name
|
||||
- ✅ Course type
|
||||
|
||||
### Nice to Have (Optional)
|
||||
- System type (auto-detected)
|
||||
- Experience level
|
||||
- Milestone progress (updates later)
|
||||
|
||||
### Skip
|
||||
- Personal name
|
||||
- Email
|
||||
- Exact location
|
||||
- Any PII
|
||||
|
||||
## Example Map Entry
|
||||
|
||||
**What users see:**
|
||||
```
|
||||
📍 United States
|
||||
• 456 builders
|
||||
• Top institutions: Harvard (15), Stanford (12), MIT (10)
|
||||
• Course types: University (234), Self-paced (189)
|
||||
```
|
||||
|
||||
**What gets stored:**
|
||||
```json
|
||||
{
|
||||
"country": "United States",
|
||||
"institution": "Harvard University",
|
||||
"course_type": "university_course",
|
||||
"anonymous_id": "abc123..."
|
||||
}
|
||||
```
|
||||
|
||||
This creates a rich, engaging community map while respecting privacy! 🌍✨
|
||||
|
||||
299
binder/COMMUNITY_EXPERT_RECOMMENDATION.md
Normal file
@@ -0,0 +1,299 @@
|
||||
# Community Building Expert Recommendation for TinyTorch
|
||||
|
||||
## Core Principles
|
||||
|
||||
### 1. **Low Barrier to Entry** ✅
|
||||
- Make it **opt-in**, not required
|
||||
- Default: benchmarks saved locally only
|
||||
- No account creation needed initially
|
||||
- Can participate anonymously
|
||||
|
||||
### 2. **Early Wins & Celebration** 🎉
|
||||
- Immediate "I did it!" moment after setup
|
||||
- Celebrate small wins (setup, first milestone)
|
||||
- Show progress, not just final scores
|
||||
- Make it feel like joining a community, not a competition
|
||||
|
||||
### 3. **Privacy-First** 🔒
|
||||
- **Default**: Everything local, nothing shared
|
||||
- **Opt-in sharing**: Clear consent for public leaderboard
|
||||
- **Anonymized**: System specs only, no personal data
|
||||
- **Institutional friendly**: Works for classroom use
|
||||
|
||||
### 4. **Progressive Engagement** 📈
|
||||
- Level 1: Local benchmark (everyone can do)
|
||||
- Level 2: Share anonymously (low commitment)
|
||||
- Level 3: Public leaderboard (for those who want it)
|
||||
- Level 4: Badges/achievements (long-term engagement)
|
||||
|
||||
### 5. **Inclusive, Not Exclusive** 🌍
|
||||
- Don't make it feel competitive
|
||||
- Focus on "you're part of something bigger"
|
||||
- Celebrate participation, not just top performers
|
||||
- Show diversity (different systems, different progress levels)
|
||||
|
||||
## Recommended Design
|
||||
|
||||
### Phase 1: Local Celebration (Everyone)
|
||||
|
||||
**After `tito benchmark baseline`:**
|
||||
|
||||
```
|
||||
🎉 Welcome to the TinyTorch Community!
|
||||
|
||||
✅ Setup Verified
|
||||
✅ Milestones Validated: 3/6
|
||||
📊 Baseline Score: 85/100
|
||||
|
||||
🌍 You're now part of a global community of ML systems builders!
|
||||
|
||||
💡 Tip: Run 'tito benchmark submit' to see how you compare
|
||||
(completely optional, all data stays local by default)
|
||||
```
|
||||
|
||||
**Key**: Celebrate success, mention community, but don't pressure sharing.
|
||||
|
||||
### Phase 2: Anonymous Comparison (Low Commitment)
|
||||
|
||||
**After `tito benchmark submit` (anonymous mode):**
|
||||
|
||||
```
|
||||
✅ Benchmark submitted anonymously!
|
||||
|
||||
📊 Your Performance:
|
||||
• Score: 85/100
|
||||
• Percentile: Top 25%
|
||||
• System: Similar to 1,234 other users
|
||||
|
||||
🎯 You're doing great! Keep building!
|
||||
|
||||
💡 Run 'tito benchmark baseline' anytime to track your progress
|
||||
```
|
||||
|
||||
**Key**: Show comparison without requiring identity.
|
||||
|
||||
### Phase 3: Public Leaderboard (Opt-in)
|
||||
|
||||
**After `tito benchmark submit --public`:**
|
||||
|
||||
```
|
||||
✅ Added to public leaderboard!
|
||||
|
||||
🏆 Your Rank: #1,234 (Top 25%)
|
||||
🌍 View leaderboard: https://tinytorch.ai/leaderboard
|
||||
|
||||
🎖️ Badge Earned: "🚀 First Steps"
|
||||
|
||||
💡 Share your achievement: [Generate share card]
|
||||
```
|
||||
|
||||
**Key**: Make sharing optional and rewarding.
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### 1. Benchmark Command Structure
|
||||
|
||||
```bash
|
||||
# Generate baseline (always local)
|
||||
tito benchmark baseline
|
||||
# → Creates: benchmarks/baseline_TIMESTAMP.json
|
||||
# → Shows celebration message
|
||||
# → No network calls
|
||||
|
||||
# Submit anonymously (low commitment)
|
||||
tito benchmark submit
|
||||
# → Uploads anonymized data
|
||||
# → Gets back: percentile, comparison stats
|
||||
# → No personal info shared
|
||||
|
||||
# Submit publicly (opt-in)
|
||||
tito benchmark submit --public
|
||||
# → Adds to leaderboard
|
||||
# → Gets rank, badge
|
||||
# → Can share achievement
|
||||
```
|
||||
|
||||
### 2. Privacy Model
|
||||
|
||||
**Three Tiers:**
|
||||
|
||||
1. **Local Only** (Default)
|
||||
- Benchmarks saved to `benchmarks/` directory
|
||||
- No network calls
|
||||
- Complete privacy
|
||||
|
||||
2. **Anonymous Submission**
|
||||
- Uploads: system specs, benchmark scores, milestone status
|
||||
- No personal identifiers
|
||||
- Gets back: percentile, comparison stats
|
||||
- Can't be traced back to user
|
||||
|
||||
3. **Public Leaderboard** (Opt-in)
|
||||
- Requires `--public` flag
|
||||
- Can optionally add: GitHub username, location (country)
|
||||
- Shows on public leaderboard
|
||||
- Can generate shareable card
|
||||
|
||||
### 3. Leaderboard Design
|
||||
|
||||
**Features:**
|
||||
- **Anonymized by default**: Show system specs, not names
|
||||
- **Filterable**: By system type, date, milestone status
|
||||
- **Inclusive**: Show all participants, not just top 10
|
||||
- **Progress-focused**: Show "milestones completed" not just "fastest"
|
||||
- **Diverse**: Highlight different system types, not just fastest
|
||||
|
||||
**Example Leaderboard Entry:**
|
||||
```
|
||||
Rank | System Type | Milestones | Score | Date
|
||||
-----|------------------|------------|-------|----------
|
||||
#1 | Apple Silicon | 6/6 ✅ | 95 | Nov 2024
|
||||
#234 | Linux x86 | 3/6 🚧 | 85 | Nov 2024
|
||||
#567 | Windows | 1/6 🚧 | 70 | Nov 2024
|
||||
```
|
||||
|
||||
### 4. Badge System
|
||||
|
||||
**Achievement Badges** (not competitive):
|
||||
- 🚀 **First Steps**: Completed baseline benchmark
|
||||
- ⚡ **Fast Setup**: Setup completed quickly
|
||||
- 🏆 **Milestone Master**: All 6 milestones passed
|
||||
- 🌍 **Community Member**: Submitted to leaderboard
|
||||
- 📈 **Progress Maker**: Improved score over time
|
||||
- 🎓 **Module Master**: Completed all 20 modules
|
||||
|
||||
**Philosophy**: Celebrate progress, not competition.
|
||||
|
||||
### 5. Server Architecture
|
||||
|
||||
**Simple & Scalable:**
|
||||
|
||||
**Option A: GitHub Pages + GitHub API** (Recommended)
|
||||
- Store submissions as JSON files in `gh-pages` branch
|
||||
- Use GitHub API for submissions
|
||||
- Static leaderboard page
|
||||
- Free, reliable, no server maintenance
|
||||
|
||||
**Option B: Simple API** (Future)
|
||||
- Flask/FastAPI endpoint
|
||||
- SQLite/PostgreSQL database
|
||||
- Real-time leaderboard
|
||||
- More features, but requires hosting
|
||||
|
||||
**Recommendation**: Start with GitHub Pages, scale later if needed.
|
||||
|
||||
## User Experience Flow
|
||||
|
||||
### First Time User
|
||||
|
||||
```bash
|
||||
# 1. Setup
|
||||
git clone ...
|
||||
./setup-environment.sh
|
||||
tito system doctor # ✅ All checks passed!
|
||||
|
||||
# 2. Run milestones (if completed)
|
||||
tito milestone validate --all
|
||||
# ✅ Milestone 01: PASSED
|
||||
# ✅ Milestone 02: PASSED
|
||||
# ✅ Milestone 03: PASSED
|
||||
|
||||
# 3. Generate baseline
|
||||
tito benchmark baseline
|
||||
|
||||
# 🎉 Welcome to the TinyTorch Community!
|
||||
# ✅ Setup Verified
|
||||
# ✅ Milestones Validated: 3/6
|
||||
# 📊 Baseline Score: 85/100
|
||||
#
|
||||
# 🌍 You're now part of a global community of ML systems builders!
|
||||
#
|
||||
# 💡 Tip: Run 'tito benchmark submit' to see how you compare
|
||||
# (completely optional, all data stays local by default)
|
||||
|
||||
# 4. (Optional) See comparison
|
||||
tito benchmark submit
|
||||
|
||||
# ✅ Benchmark submitted anonymously!
|
||||
# 📊 Your Performance:
|
||||
# • Score: 85/100
|
||||
# • Percentile: Top 25%
|
||||
# • Similar systems: 1,234 users
|
||||
#
|
||||
# 🎯 You're doing great! Keep building!
|
||||
|
||||
# 5. (Optional) Join public leaderboard
|
||||
tito benchmark submit --public
|
||||
|
||||
# ✅ Added to public leaderboard!
|
||||
# 🏆 Rank: #1,234 (Top 25%)
|
||||
# 🎖️ Badge: "🚀 First Steps"
|
||||
# 🔗 View: https://tinytorch.ai/leaderboard
|
||||
```
|
||||
|
||||
## Key Recommendations
|
||||
|
||||
### ✅ DO:
|
||||
1. **Make it opt-in**: Default to local-only
|
||||
2. **Celebrate participation**: Not just winners
|
||||
3. **Show progress**: Milestones completed, not just speed
|
||||
4. **Respect privacy**: Anonymized by default
|
||||
5. **Keep it simple**: Start with GitHub Pages
|
||||
6. **Focus on community**: "You're part of something bigger"
|
||||
7. **Make it inclusive**: All skill levels welcome
|
||||
|
||||
### ❌ DON'T:
|
||||
1. **Don't make it required**: Some students/institutions can't share
|
||||
2. **Don't make it competitive**: Focus on learning, not winning
|
||||
3. **Don't collect personal data**: System specs only
|
||||
4. **Don't overcomplicate**: Start simple, iterate
|
||||
5. **Don't exclude anyone**: All systems, all progress levels
|
||||
|
||||
## Implementation Priority
|
||||
|
||||
### Phase 1: MVP (Week 1)
|
||||
- ✅ `tito benchmark baseline` command
|
||||
- ✅ Local JSON generation
|
||||
- ✅ Celebration message
|
||||
- ✅ Basic benchmark suite
|
||||
|
||||
### Phase 2: Community (Week 2)
|
||||
- ✅ `tito benchmark submit` (anonymous)
|
||||
- ✅ GitHub Pages leaderboard
|
||||
- ✅ Percentile calculation
|
||||
- ✅ Badge system
|
||||
|
||||
### Phase 3: Engagement (Week 3)
|
||||
- ✅ Public leaderboard (opt-in)
|
||||
- ✅ Shareable cards
|
||||
- ✅ Progress tracking
|
||||
- ✅ Achievement badges
|
||||
|
||||
## Success Metrics
|
||||
|
||||
**Community Health:**
|
||||
- Number of baseline benchmarks generated (local)
|
||||
- Number of anonymous submissions
|
||||
- Number of public leaderboard entries
|
||||
- Diversity of systems represented
|
||||
- Milestone completion rates
|
||||
|
||||
**Not Success Metrics:**
|
||||
- ❌ Highest scores (too competitive)
|
||||
- ❌ Fastest times (excludes slower systems)
|
||||
- ❌ Leaderboard rank (creates pressure)
|
||||
|
||||
## Final Recommendation
|
||||
|
||||
**Start Simple, Build Community:**
|
||||
|
||||
1. **Local celebration first** - Everyone gets the "wow" moment
|
||||
2. **Anonymous comparison** - Low commitment, high value
|
||||
3. **Public leaderboard** - Opt-in for those who want it
|
||||
4. **Focus on progress** - Celebrate milestones, not speed
|
||||
5. **Privacy-first** - Default to local, opt-in to share
|
||||
|
||||
**The goal**: Make students feel part of a global community of ML systems builders, not competitors.
|
||||
|
||||
This creates a welcoming, inclusive community that celebrates learning and progress! 🎉
|
||||
|
||||
371
binder/COMMUNITY_MAP_VISION.md
Normal file
@@ -0,0 +1,371 @@
|
||||
# Community Map Vision: "We Are TinyTorch"
|
||||
|
||||
## The Vision
|
||||
|
||||
A **world map** that shows where TinyTorch builders are located, creating a visual sense of global community. When students complete milestones and submit, they see:
|
||||
|
||||
> "Wow, there's a community of people building ML systems all over the world!"
|
||||
|
||||
## Design Concept
|
||||
|
||||
### The Map Experience
|
||||
|
||||
**After `tito milestone validate --all` passes:**
|
||||
|
||||
```
|
||||
🎉 Congratulations! All Milestones Validated!
|
||||
|
||||
✅ Setup Complete
|
||||
✅ All Tests Passing
|
||||
✅ All Milestones Passed: 6/6
|
||||
|
||||
🌍 Join the Global TinyTorch Community:
|
||||
|
||||
Run 'tito community submit' to add your location to the map
|
||||
and see builders from around the world!
|
||||
|
||||
(Completely optional - only shares country, not exact location)
|
||||
```
|
||||
|
||||
**After `tito community submit`:**
|
||||
|
||||
```
|
||||
✅ You've joined the TinyTorch Community!
|
||||
|
||||
📍 Your Location: United States
|
||||
🌍 View the map: https://tinytorch.ai/community
|
||||
|
||||
🎖️ You're builder #1,234 on the global map!
|
||||
|
||||
💡 See where other TinyTorch builders are located worldwide
|
||||
```
|
||||
|
||||
### The Map Visualization
|
||||
|
||||
**Features:**
|
||||
- **World map** with dots/countries highlighted
|
||||
- **Interactive**: Click to see stats per country
|
||||
- **Live counter**: "1,234 builders worldwide"
|
||||
- **Diversity showcase**: "Builders in 45 countries"
|
||||
- **Recent additions**: "5 new builders this week"
|
||||
|
||||
**Privacy:**
|
||||
- **Country-level only** (not city/coordinates)
|
||||
- **Opt-in**: Must explicitly submit
|
||||
- **Anonymized**: No personal identifiers
|
||||
- **Optional**: Can participate without location
|
||||
|
||||
## Implementation Design
|
||||
|
||||
### 1. Submission Flow
|
||||
|
||||
**Command**: `tito community submit [--country COUNTRY]`
|
||||
|
||||
**What it does:**
|
||||
- Detects country (or asks user)
|
||||
- Validates milestones passed
|
||||
- Submits anonymized data:
|
||||
```json
|
||||
{
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
"country": "United States", // Country only, not city
|
||||
"milestones_passed": 6,
|
||||
"system_type": "Apple Silicon",
|
||||
"anonymous_id": "abc123..." // Generated hash, not personal
|
||||
}
|
||||
```
|
||||
|
||||
**Validation:**
|
||||
- Checks: `tito system doctor` passed
|
||||
- Checks: `tito milestone validate --all` passed
|
||||
- Only submits if everything validated
|
||||
|
||||
### 2. Map Visualization
|
||||
|
||||
**Technology Options:**
|
||||
|
||||
**Option A: Simple Static Map** (Recommended for MVP)
|
||||
- GitHub Pages + Leaflet.js or Mapbox
|
||||
- JSON file with submissions
|
||||
- Static map that updates on deploy
|
||||
- Free, simple, works immediately
|
||||
|
||||
**Option B: Interactive Map**
|
||||
- Leaflet.js or Mapbox GL
|
||||
- Real-time updates
|
||||
- Click countries for stats
|
||||
- More engaging, requires API
|
||||
|
||||
**Option C: GitHub Pages + GeoJSON**
|
||||
- Store submissions as GeoJSON
|
||||
- Use GitHub's map rendering
|
||||
- Simple, free, GitHub-native
|
||||
|
||||
**Recommendation**: Start with Option A (Leaflet.js), upgrade to Option B later.
|
||||
|
||||
### 3. Data Structure
|
||||
|
||||
**Submissions JSON** (`community/submissions.json`):
|
||||
```json
|
||||
{
|
||||
"total_builders": 1234,
|
||||
"countries": {
|
||||
"United States": 456,
|
||||
"India": 234,
|
||||
"United Kingdom": 123,
|
||||
"Germany": 89,
|
||||
...
|
||||
},
|
||||
"recent_submissions": [
|
||||
{
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
"country": "United States",
|
||||
"milestones": 6,
|
||||
"system": "Apple Silicon"
|
||||
},
|
||||
...
|
||||
],
|
||||
"stats": {
|
||||
"total_countries": 45,
|
||||
"this_week": 23,
|
||||
"this_month": 156
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 4. Map Page Design
|
||||
|
||||
**URL**: `https://tinytorch.ai/community` or `/community-map`
|
||||
|
||||
**Features:**
|
||||
- **World map** with country highlights
|
||||
- **Counter**: "1,234 builders worldwide"
|
||||
- **Country list**: "Builders in 45 countries"
|
||||
- **Recent activity**: "5 new builders this week"
|
||||
- **Call to action**: "Join the map → `tito community submit`"
|
||||
|
||||
**Visual Design:**
|
||||
- Clean, modern map
|
||||
- Dots or country shading
|
||||
- Hover shows country stats
|
||||
- Mobile-friendly
|
||||
- Fast loading
|
||||
|
||||
## User Journey
|
||||
|
||||
### Complete Flow
|
||||
|
||||
```bash
|
||||
# 1. Setup and validate
|
||||
git clone ...
|
||||
./setup-environment.sh
|
||||
tito system doctor # ✅ All checks passed
|
||||
tito milestone validate --all # ✅ All 6 milestones passed
|
||||
|
||||
# 2. Join community
|
||||
tito community submit
|
||||
|
||||
# Detecting your location...
|
||||
# Country: United States
|
||||
#
|
||||
# ✅ You've joined the TinyTorch Community!
|
||||
#
|
||||
# 🌍 View the map: https://tinytorch.ai/community
|
||||
# 🎖️ You're builder #1,234 on the global map!
|
||||
#
|
||||
# 💡 See where other TinyTorch builders are located worldwide
|
||||
|
||||
# 3. View the map (opens in browser)
|
||||
# Shows: World map with dots, your country highlighted
|
||||
# Shows: "1,234 builders in 45 countries"
|
||||
# Shows: Recent additions
|
||||
```
|
||||
|
||||
## Privacy & Consent
|
||||
|
||||
### Privacy Model
|
||||
|
||||
**What's Shared** (with consent):
|
||||
- ✅ Country (not city/coordinates)
|
||||
- ✅ System type (Apple Silicon, Linux x86, etc.)
|
||||
- ✅ Milestone count (how many passed)
|
||||
- ✅ Timestamp (when submitted)
|
||||
|
||||
**What's NOT Shared**:
|
||||
- ❌ Exact location
|
||||
- ❌ Personal information
|
||||
- ❌ IP address
|
||||
- ❌ Email/name
|
||||
- ❌ Institution
|
||||
|
||||
**Consent Flow:**
|
||||
```
|
||||
tito community submit
|
||||
|
||||
⚠️ This will add your location to the public community map.
|
||||
|
||||
📊 What will be shared:
|
||||
• Country: United States
|
||||
• System type: Apple Silicon
|
||||
• Milestones passed: 6
|
||||
• No personal information
|
||||
|
||||
🔒 Privacy: Only country-level location, completely anonymized
|
||||
|
||||
Continue? [y/N]: y
|
||||
|
||||
✅ Submitted! View map: https://tinytorch.ai/community
|
||||
```
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Phase 1: MVP (Simple Map)
|
||||
|
||||
1. **Create `tito community submit` command**
|
||||
- Detect/ask for country
|
||||
- Validate milestones passed
|
||||
- Generate submission JSON
|
||||
- Save locally + optionally upload
|
||||
|
||||
2. **Create map page** (`site/community-map.md`)
|
||||
- Static HTML with Leaflet.js
|
||||
- Reads from `community/submissions.json`
|
||||
- Shows world map with countries
|
||||
- Displays stats
|
||||
|
||||
3. **Submission storage**
|
||||
- GitHub Pages: `community/submissions.json`
|
||||
- Or: Simple API endpoint
|
||||
- Updates on each submission
|
||||
|
||||
### Phase 2: Enhanced (Interactive Map)
|
||||
|
||||
1. **Interactive features**
|
||||
- Click countries for details
|
||||
- Filter by system type
|
||||
- Timeline view (growth over time)
|
||||
- Recent submissions feed
|
||||
|
||||
2. **Engagement features**
|
||||
- "Builder of the week" (random selection)
|
||||
- Country leaderboards (optional)
|
||||
- Milestone completion stats
|
||||
|
||||
### Phase 3: Community Features
|
||||
|
||||
1. **Social elements**
|
||||
- Share: "I'm builder #1,234 on the TinyTorch map!"
|
||||
- Badges: "🌍 Global Builder"
|
||||
- Stories: "Builders from 45 countries"
|
||||
|
||||
2. **Analytics**
|
||||
- Growth over time
|
||||
- Geographic distribution
|
||||
- System diversity
|
||||
- Milestone completion rates
|
||||
|
||||
## Technical Implementation
|
||||
|
||||
### Simple Approach (GitHub Pages)
|
||||
|
||||
**File Structure:**
|
||||
```
|
||||
community/
|
||||
├── submissions.json # All submissions
|
||||
├── map.html # Map visualization page
|
||||
└── submit.py # Submission script (optional API)
|
||||
```
|
||||
|
||||
**Map Page** (`site/community-map.md` or HTML):
|
||||
```html
|
||||
<!-- Leaflet.js map -->
|
||||
<div id="community-map"></div>
|
||||
|
||||
<!-- Stats -->
|
||||
<div>
|
||||
<h2>🌍 TinyTorch Community</h2>
|
||||
<p>1,234 builders in 45 countries</p>
|
||||
<p>5 new builders this week</p>
|
||||
</div>
|
||||
|
||||
<!-- Call to action -->
|
||||
<p>Join the map: <code>tito community submit</code></p>
|
||||
```
|
||||
|
||||
**Submission Process:**
|
||||
1. User runs `tito community submit`
|
||||
2. Generates submission JSON
|
||||
3. Option A: User manually PRs to `community/submissions.json`
|
||||
4. Option B: API endpoint accepts submissions
|
||||
5. Map page reads JSON and renders
|
||||
|
||||
### API Approach (Future)
|
||||
|
||||
**Endpoint**: `POST /api/community/submit`
|
||||
- Accepts submission JSON
|
||||
- Validates (check milestones)
|
||||
- Stores in database
|
||||
- Returns success + map URL
|
||||
|
||||
**Map Page**:
|
||||
- Fetches submissions from API
|
||||
- Renders interactive map
|
||||
- Updates in real-time
|
||||
|
||||
## Success Metrics
|
||||
|
||||
**Community Growth:**
|
||||
- Number of countries represented
|
||||
- Total builders on map
|
||||
- Growth rate (new builders/week)
|
||||
- Geographic diversity
|
||||
|
||||
**Engagement:**
|
||||
- Map page views
|
||||
- Submission rate (after milestones pass)
|
||||
- Return visits to map
|
||||
- Social shares
|
||||
|
||||
## The "Wow" Moment
|
||||
|
||||
**When someone views the map:**
|
||||
|
||||
```
|
||||
🌍 TinyTorch Community Map
|
||||
|
||||
[Interactive world map showing dots/countries]
|
||||
|
||||
📊 Stats:
|
||||
• 1,234 builders worldwide
|
||||
• 45 countries represented
|
||||
• 5 new builders this week
|
||||
• Top countries: US (456), India (234), UK (123)
|
||||
|
||||
🎯 Recent Activity:
|
||||
• Builder from Germany just joined!
|
||||
• Builder from Japan completed all milestones!
|
||||
• Builder from Brazil reached milestone 3!
|
||||
|
||||
💡 Join the map: Run 'tito community submit' after completing milestones
|
||||
```
|
||||
|
||||
**The Impact:**
|
||||
- Visual proof of global community
|
||||
- Sense of belonging
|
||||
- Motivation to continue
|
||||
- Pride in being part of something bigger
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Start Simple, Build Community:**
|
||||
|
||||
1. **MVP**: Simple map with country dots
|
||||
2. **Privacy**: Country-level only, opt-in
|
||||
3. **Validation**: Only after milestones pass
|
||||
4. **Visual**: Make it beautiful and engaging
|
||||
5. **Growth**: Let it populate organically
|
||||
|
||||
**The goal**: Create a visual representation that makes students feel part of a global movement of ML systems builders!
|
||||
|
||||
This map becomes a symbol of the TinyTorch community - showing that people all over the world are building ML systems from scratch together. 🌍✨
|
||||
|
||||
144
binder/LAUNCH_READINESS.md
Normal file
@@ -0,0 +1,144 @@
|
||||
# Launch Readiness Checklist
|
||||
|
||||
## ✅ Assignment Process - COMPLETE
|
||||
|
||||
### Dynamic Assignment Generation ✅
|
||||
- **Source**: `modules/*/.*_dev.py` (Python files)
|
||||
- **Command**: `tito nbgrader generate MODULE`
|
||||
- **Output**: `assignments/source/MODULE/MODULE.ipynb`
|
||||
- **Status**: Fully functional, dynamically generated
|
||||
|
||||
### Assignment Release ✅
|
||||
- **Command**: `tito nbgrader release MODULE`
|
||||
- **Output**: `assignments/release/MODULE/MODULE.ipynb` (solutions removed)
|
||||
- **Status**: Ready for student distribution
|
||||
|
||||
### Auto-Grading ✅
|
||||
- **Command**: `tito nbgrader autograde MODULE`
|
||||
- **Status**: NBGrader integration complete
|
||||
|
||||
## ✅ Site Build Integration - COMPLETE
|
||||
|
||||
### Automatic Notebook Preparation ✅
|
||||
- **Script**: `site/prepare_notebooks.sh`
|
||||
- **Integration**: Runs automatically during `make html`
|
||||
- **Process**: Copies assignment notebooks to `site/chapters/modules/`
|
||||
- **Result**: Launch buttons appear on notebook pages
|
||||
|
||||
### Build Commands ✅
|
||||
- `make html` - Includes notebook preparation
|
||||
- `make pdf` - Includes notebook preparation
|
||||
- `make pdf-simple` - Includes notebook preparation
|
||||
|
||||
## ✅ Paper Documentation Sync - COMPLETE
|
||||
|
||||
### Files Created ✅
|
||||
- `INSTRUCTOR.md` - ✅ Created (matches paper reference)
|
||||
- `MAINTENANCE.md` - ✅ Created (support commitment through 2027)
|
||||
- `TA_GUIDE.md` - ✅ Created (common errors, debugging strategies)
|
||||
- `docs/TEAM_ONBOARDING.md` - ✅ Created (Model 3 documentation)
|
||||
- `site/usage-paths/team-onboarding.md` - ✅ Created (site version)
|
||||
|
||||
### Files Verified ✅
|
||||
- `CONTRIBUTING.md` - ✅ Exists and matches paper description
|
||||
- `docs/INSTRUCTOR_GUIDE.md` - ✅ Exists (source for INSTRUCTOR.md)
|
||||
|
||||
### Content Updates ✅
|
||||
- Module numbers: All updated to `01_tensor` (not `01_setup`)
|
||||
- Schedule: Updated to match current 20-module structure
|
||||
- Three integration models: All documented
|
||||
- Deployment environments: All documented
|
||||
|
||||
## ✅ Site Navigation - COMPLETE
|
||||
|
||||
### Getting Started Section ✅
|
||||
- Quick Start Guide
|
||||
- Student Workflow
|
||||
- For Instructors
|
||||
- **Team Onboarding** (newly added)
|
||||
|
||||
### All Three Integration Models Accessible ✅
|
||||
1. Self-Paced Learning - Quick Start Guide
|
||||
2. Institutional Integration - For Instructors
|
||||
3. Team Onboarding - Team Onboarding page
|
||||
|
||||
## ✅ Binder/Colab Setup - COMPLETE
|
||||
|
||||
### Binder Configuration ✅
|
||||
- `binder/requirements.txt` - Dependencies
|
||||
- `binder/postBuild` - Installs TinyTorch
|
||||
- Launch buttons configured in `site/_config.yml`
|
||||
|
||||
### Colab Configuration ✅
|
||||
- Launch buttons configured
|
||||
- Repository URL correct
|
||||
- Documentation complete
|
||||
|
||||
## 🎯 Pre-Launch Checklist
|
||||
|
||||
### Required Actions
|
||||
|
||||
1. **Generate Assignment Notebooks**:
|
||||
```bash
|
||||
tito nbgrader generate --all
|
||||
```
|
||||
This creates notebooks for all modules in `assignments/source/`
|
||||
|
||||
2. **Test Site Build**:
|
||||
```bash
|
||||
cd site
|
||||
make html
|
||||
```
|
||||
Verify:
|
||||
- Notebooks are prepared automatically
|
||||
- Launch buttons appear on notebook pages
|
||||
- Site builds without errors
|
||||
|
||||
3. **Test Binder**:
|
||||
- Visit: https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main
|
||||
- Verify build completes (2-5 minutes)
|
||||
- Verify TinyTorch imports correctly
|
||||
- Verify modules are accessible
|
||||
|
||||
4. **Test Colab**:
|
||||
- Test with sample notebook
|
||||
- Verify dependencies install
|
||||
- Verify notebooks run correctly
|
||||
|
||||
5. **Verify Documentation Links**:
|
||||
- Check all site navigation links work
|
||||
- Verify INSTRUCTOR.md accessible
|
||||
- Verify TA_GUIDE.md accessible
|
||||
- Verify Team Onboarding page works
|
||||
|
||||
### Optional Enhancements
|
||||
|
||||
- Add sample solutions to INSTRUCTOR.md (if not already included)
|
||||
- Create common errors FAQ page
|
||||
- Add deployment guide consolidating JupyterHub/Colab/Local
|
||||
- Test with actual assignment notebooks
|
||||
|
||||
## 📊 Final Status
|
||||
|
||||
| Component | Status | Ready for Launch |
|
||||
|-----------|--------|-----------------|
|
||||
| Assignment Generation | ✅ Complete | ✅ Yes |
|
||||
| Site Build Integration | ✅ Complete | ✅ Yes |
|
||||
| Paper Documentation | ✅ Complete | ✅ Yes |
|
||||
| Site Navigation | ✅ Complete | ✅ Yes |
|
||||
| Binder Setup | ✅ Complete | ⏳ Test needed |
|
||||
| Colab Setup | ✅ Complete | ⏳ Test needed |
|
||||
|
||||
## 🚀 Launch Steps
|
||||
|
||||
1. Generate assignment notebooks: `tito nbgrader generate --all`
|
||||
2. Build site: `cd site && make html`
|
||||
3. Test Binder: Visit Binder URL
|
||||
4. Test Colab: Test with sample notebook
|
||||
5. Verify all links work
|
||||
6. **LAUNCH!** 🎉
|
||||
|
||||
---
|
||||
|
||||
**Everything is synced and ready!** Just need to generate notebooks and test launch buttons.
|
||||
|
||||
117
binder/MARIMO_INTEGRATION.md
Normal file
@@ -0,0 +1,117 @@
|
||||
# Marimo Integration for TinyTorch
|
||||
|
||||
## What is Marimo?
|
||||
|
||||
[Marimo](https://marimo.io/) is a modern, reactive Python notebook platform that:
|
||||
- **Stores notebooks as pure Python** (`.py` files) - Git-friendly!
|
||||
- **Reactive execution** - Cells update automatically when dependencies change
|
||||
- **Interactive elements** - Built-in widgets, sliders, dataframes
|
||||
- **AI-native** - Built-in AI assistance and copilots
|
||||
- **Share as apps** - Export to HTML or serve as web apps
|
||||
- **Reproducible** - Deterministic execution, no hidden state
|
||||
|
||||
## Why Marimo for TinyTorch?
|
||||
|
||||
**Perfect Fit:**
|
||||
1. ✅ **Git-friendly** - Notebooks stored as `.py` files (matches TinyTorch's Python-first approach!)
|
||||
2. ✅ **Reactive** - Great for teaching (students see changes propagate automatically)
|
||||
3. ✅ **Educational** - Used by Stanford, UC Berkeley, Princeton, etc.
|
||||
4. ✅ **Modern** - Better than Jupyter for many use cases
|
||||
5. ✅ **Open source** - Free and community-driven
|
||||
|
||||
## Marimo vs Current Options
|
||||
|
||||
| Feature | MyBinder | Colab | Marimo |
|
||||
|---------|----------|-------|--------|
|
||||
| Git-friendly | ❌ (.ipynb) | ❌ (.ipynb) | ✅ (.py files) |
|
||||
| Reactive | ❌ | ❌ | ✅ |
|
||||
| AI assistance | ❌ | ✅ | ✅ |
|
||||
| Free | ✅ | ✅ | ✅ |
|
||||
| Zero-setup | ✅ | ⚠️ (needs account) | ✅ |
|
||||
| GPU access | ❌ | ✅ | ⚠️ (limited) |
|
||||
|
||||
## Integration Options
|
||||
|
||||
### Option 1: Marimo Molab Badges
|
||||
|
||||
Marimo provides "molab" badges that can open notebooks directly from GitHub:
|
||||
|
||||
```
|
||||
https://marimo.app/molab?repo=mlsysbook/TinyTorch&path=path/to/notebook.py
|
||||
```
|
||||
|
||||
**How it works:**
|
||||
- Notebooks stored as `.py` files in repo
|
||||
- Badge links to marimo's cloud service
|
||||
- Opens notebook in marimo's online editor
|
||||
- No local installation needed
|
||||
|
||||
### Option 2: Add to Launch Buttons
|
||||
|
||||
Jupyter Book doesn't natively support marimo launch buttons, but we can:
|
||||
1. Add custom HTML/JavaScript to create marimo badges
|
||||
2. Use marimo's badge generator
|
||||
3. Add manual links in notebook pages
|
||||
|
||||
### Option 3: Convert Notebooks to Marimo Format
|
||||
|
||||
Since marimo uses `.py` files, we could:
|
||||
1. Keep current `.ipynb` files for Jupyter/Colab/Binder
|
||||
2. Generate `.py` versions for marimo
|
||||
3. Add marimo badges alongside existing launch buttons
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Add Marimo as an Option:**
|
||||
|
||||
1. **Keep current setup** (MyBinder + Colab) - they work well
|
||||
2. **Add marimo badges** to notebook pages for students who want reactive notebooks
|
||||
3. **Generate `.py` versions** of notebooks for marimo compatibility
|
||||
|
||||
**Benefits:**
|
||||
- Students get choice of notebook platforms
|
||||
- Marimo's reactive execution helps with learning
|
||||
- Git-friendly format aligns with TinyTorch's Python-first approach
|
||||
- Modern, educational tool used by top universities
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Step 1: Generate Marimo-Compatible Notebooks
|
||||
|
||||
Since TinyTorch already uses Python-first development (`*_dev.py` files), we could:
|
||||
- Convert assignment notebooks to marimo format
|
||||
- Or create marimo-specific versions
|
||||
|
||||
### Step 2: Add Marimo Badges
|
||||
|
||||
Add to notebook pages:
|
||||
```html
|
||||
<a href="https://marimo.app/molab?repo=mlsysbook/TinyTorch&path=site/chapters/modules/01_tensor.py">
|
||||
<img src="https://marimo.app/badge.svg" alt="Open in Marimo">
|
||||
</a>
|
||||
```
|
||||
|
||||
### Step 3: Document Marimo Usage
|
||||
|
||||
Add to student documentation:
|
||||
- How to use marimo with TinyTorch
|
||||
- Benefits of reactive notebooks
|
||||
- Comparison with Jupyter/Colab
|
||||
|
||||
## Current Status
|
||||
|
||||
**Not yet integrated** - but marimo would be a great addition!
|
||||
|
||||
**Next steps if you want to add it:**
|
||||
1. Test marimo with TinyTorch notebooks
|
||||
2. Generate marimo-compatible `.py` files
|
||||
3. Add badges to site pages
|
||||
4. Update documentation
|
||||
|
||||
## Resources
|
||||
|
||||
- [Marimo Website](https://marimo.io/)
|
||||
- [Marimo Docs](https://docs.marimo.io/)
|
||||
- [Marimo Gallery](https://marimo.io/gallery)
|
||||
- [Marimo Badge Generator](https://marimo.io/badge)
|
||||
|
||||
104
binder/MARIMO_NBGRADER_COMPATIBILITY.md
Normal file
@@ -0,0 +1,104 @@
|
||||
# Marimo and NBGrader Compatibility
|
||||
|
||||
## Short Answer: ✅ No, Marimo badges won't break NBGrader
|
||||
|
||||
**Why:**
|
||||
- Marimo badges are just **frontend UI elements** (JavaScript links)
|
||||
- They don't modify notebook files
|
||||
- NBGrader reads from actual `.ipynb` files, not from the website
|
||||
- Badges just create links to open notebooks in Marimo's cloud service
|
||||
|
||||
## How It Works
|
||||
|
||||
### Marimo Badges (What We Added)
|
||||
- **What they do**: Add a "🍃 Open in Marimo" link to notebook pages
|
||||
- **What they don't do**: Modify notebook files or NBGrader metadata
|
||||
- **Impact on NBGrader**: **None** - they're just links
|
||||
|
||||
### NBGrader Workflow
|
||||
1. Instructors generate notebooks: `tito nbgrader generate MODULE`
|
||||
2. NBGrader adds metadata to `.ipynb` files (grade_id, points, etc.)
|
||||
3. Students work in notebooks (Jupyter, Colab, or Marimo)
|
||||
4. Students submit notebooks back
|
||||
5. NBGrader reads metadata from submitted `.ipynb` files
|
||||
|
||||
## Potential Considerations
|
||||
|
||||
### If Students Use Marimo to Edit Notebooks
|
||||
|
||||
**Scenario 1: Students open `.ipynb` in Marimo**
|
||||
- ✅ Marimo can import Jupyter notebooks
|
||||
- ✅ NBGrader metadata preserved (it's in the `.ipynb` file)
|
||||
- ✅ Students submit `.ipynb` files back
|
||||
- ✅ **No problem** - NBGrader works normally
|
||||
|
||||
**Scenario 2: Students convert to Marimo `.py` format**
|
||||
- ⚠️ Marimo stores notebooks as `.py` files (not `.ipynb`)
|
||||
- ⚠️ NBGrader metadata is in `.ipynb` format
|
||||
- ⚠️ Converting to `.py` might lose NBGrader metadata
|
||||
- ✅ **Solution**: Students should submit `.ipynb` files, not `.py` files
|
||||
|
||||
## Best Practice for Students
|
||||
|
||||
**For NBGrader assignments:**
|
||||
1. Students can use Marimo to **view and learn** from notebooks
|
||||
2. For **submissions**, students should work in `.ipynb` format (Jupyter/Colab)
|
||||
3. Or convert marimo `.py` back to `.ipynb` before submitting
|
||||
|
||||
**For non-graded exploration:**
|
||||
- Students can freely use Marimo's `.py` format
|
||||
- Great for learning and experimentation
|
||||
- No NBGrader concerns
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Keep Marimo badges** - they're safe:
|
||||
- ✅ Don't interfere with NBGrader
|
||||
- ✅ Give students more options for learning
|
||||
- ✅ Students can use Marimo for exploration
|
||||
- ✅ For graded work, students use standard `.ipynb` workflow
|
||||
|
||||
**Add to student instructions:**
|
||||
- "Marimo badges are for exploration and learning"
|
||||
- "For NBGrader assignments, submit `.ipynb` files (not `.py` files)"
|
||||
- "Marimo can import `.ipynb` files and preserve NBGrader metadata"
|
||||
|
||||
## Technical Details
|
||||
|
||||
### NBGrader Metadata Format
|
||||
NBGrader stores metadata in notebook cell metadata:
|
||||
```json
|
||||
{
|
||||
"nbgrader": {
|
||||
"grade": true,
|
||||
"grade_id": "tensor_memory",
|
||||
"points": 2,
|
||||
"schema_version": 3
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Marimo Format
|
||||
Marimo stores notebooks as pure Python:
|
||||
```python
|
||||
# Cell 1
|
||||
import numpy as np
|
||||
|
||||
# Cell 2
|
||||
def memory_footprint(self):
|
||||
return self.data.nbytes
|
||||
```
|
||||
|
||||
**Conversion between formats:**
|
||||
- `.ipynb` → `.py`: Possible, but NBGrader metadata might be lost
|
||||
- `.py` → `.ipynb`: Possible, but NBGrader metadata won't be restored
|
||||
|
||||
## Conclusion
|
||||
|
||||
✅ **Marimo badges are safe** - they don't break NBGrader
|
||||
✅ **Students can use Marimo** for learning and exploration
|
||||
✅ **For graded work**, students should use `.ipynb` format
|
||||
✅ **No changes needed** to NBGrader workflow
|
||||
|
||||
The badges are just convenient links - they don't interfere with the actual grading system!
|
||||
|
||||
80
binder/MARIMO_SETUP.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# Marimo Setup for TinyTorch - No Extra Setup Required! ✅
|
||||
|
||||
## Good News: No Extra Setup Needed!
|
||||
|
||||
Marimo integration is now **automatically added** to your site. Here's what was done:
|
||||
|
||||
## What Was Added
|
||||
|
||||
1. **Marimo Badge JavaScript** (`site/_static/marimo-badges.js`)
|
||||
- Automatically adds "Open in Marimo" badges to notebook pages
|
||||
- Works alongside existing Binder/Colab buttons
|
||||
|
||||
2. **JavaScript Integration**
|
||||
- Added to `site/_config.yml` so it loads on all pages
|
||||
- Automatically detects notebook pages
|
||||
- Creates marimo badges dynamically
|
||||
|
||||
## How It Works
|
||||
|
||||
When students visit a notebook page:
|
||||
1. They see existing launch buttons (Binder, Colab)
|
||||
2. **New**: They also see "🍃 Open in Marimo" badge
|
||||
3. Clicking opens the notebook in Marimo's cloud service (molab)
|
||||
4. No account needed for basic use!
|
||||
|
||||
## Marimo URLs
|
||||
|
||||
Marimo badges use this format:
|
||||
```
|
||||
https://marimo.app/molab?repo=mlsysbook/TinyTorch&path=site/chapters/modules/MODULE_NAME.ipynb
|
||||
```
|
||||
|
||||
**Note**: Marimo can work with `.ipynb` files, but ideally we'd convert to `.py` files for full marimo features.
|
||||
|
||||
## Testing
|
||||
|
||||
To test marimo integration:
|
||||
|
||||
1. **Build the site:**
|
||||
```bash
|
||||
cd site
|
||||
make html
|
||||
```
|
||||
|
||||
2. **Open a notebook page** (e.g., `_build/html/chapters/modules/01_tensor.html`)
|
||||
|
||||
3. **Look for the marimo badge** - should appear below Binder/Colab buttons
|
||||
|
||||
4. **Click "Open in Marimo"** - should open in marimo's cloud editor
|
||||
|
||||
## Optional: Convert Notebooks to Marimo Format
|
||||
|
||||
For full marimo features (reactive execution, etc.), you could:
|
||||
|
||||
1. **Convert `.ipynb` to marimo `.py` format:**
|
||||
```bash
|
||||
# Marimo can import Jupyter notebooks
|
||||
marimo convert notebook.ipynb notebook.py
|
||||
```
|
||||
|
||||
2. **Store marimo versions** in `site/chapters/modules/` as `.py` files
|
||||
|
||||
3. **Update marimo URLs** to point to `.py` files instead of `.ipynb`
|
||||
|
||||
But this is **optional** - marimo badges work with `.ipynb` files too!
|
||||
|
||||
## Current Status
|
||||
|
||||
✅ **Marimo badges added** - Will appear on notebook pages
|
||||
✅ **No extra setup needed** - Just build the site
|
||||
✅ **Works with existing notebooks** - Uses `.ipynb` files
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Build site** to see marimo badges: `cd site && make html`
|
||||
2. **Test badges** on notebook pages
|
||||
3. **Optional**: Convert notebooks to marimo `.py` format for full features
|
||||
|
||||
That's it! Marimo integration is ready to go. 🎉
|
||||
|
||||
91
binder/MODULE_ORDER.md
Normal file
@@ -0,0 +1,91 @@
|
||||
# TinyTorch Module Order Verification
|
||||
|
||||
## ✅ Correct Module Order (modules/ directory)
|
||||
|
||||
```
|
||||
01_tensor - Foundation: N-dimensional arrays
|
||||
02_activations - Non-linearity functions (ReLU, Sigmoid, Softmax)
|
||||
03_layers - Neural network layers (Linear, Module base)
|
||||
04_losses - Loss functions (MSE, CrossEntropy)
|
||||
05_autograd - Automatic differentiation
|
||||
06_optimizers - Optimization algorithms (SGD, Adam)
|
||||
07_training - Training loops
|
||||
08_dataloader - Data batching and pipelines
|
||||
09_spatial - Convolutional operations
|
||||
10_tokenization - Text tokenization
|
||||
11_embeddings - Word embeddings
|
||||
12_attention - Attention mechanisms
|
||||
13_transformers - Transformer architecture
|
||||
14_profiling - Performance profiling
|
||||
15_quantization - Model quantization
|
||||
16_compression - Model compression
|
||||
17_memoization - KV caching
|
||||
18_acceleration - Hardware acceleration
|
||||
19_benchmarking - Performance benchmarking
|
||||
20_capstone - Torch Olympics competition
|
||||
```
|
||||
|
||||
## ⚠️ Issue Found: Assignments Directory Mismatch
|
||||
|
||||
**Current assignments/source/ structure:**
|
||||
```
|
||||
01_setup ❌ OUTDATED - Module 01 is now "tensor", not "setup"
|
||||
02_tensor ✅ Correct
|
||||
```
|
||||
|
||||
**Problem:** The `assignments/source/01_setup/` directory contains an old notebook from when Module 01 was "Setup". Module 01 is now "Tensor" (`modules/01_tensor/`).
|
||||
|
||||
## Impact on Binder/Colab
|
||||
|
||||
**No impact** - Binder setup doesn't depend on assignment notebooks. The `binder/` configuration:
|
||||
- Installs TinyTorch package (`pip install -e .`)
|
||||
- Provides JupyterLab environment
|
||||
- Students can access any notebooks in the repository
|
||||
|
||||
However, for consistency and to avoid confusion:
|
||||
- Old `01_setup` assignment should be removed or renamed
|
||||
- Documentation references should point to `01_tensor` (already fixed)
|
||||
|
||||
## Module Tiers (from site/_toc.yml)
|
||||
|
||||
### 🏗️ Foundation Tier (01-07)
|
||||
- 01 Tensor
|
||||
- 02 Activations
|
||||
- 03 Layers
|
||||
- 04 Losses
|
||||
- 05 Autograd
|
||||
- 06 Optimizers
|
||||
- 07 Training
|
||||
|
||||
### 🏛️ Architecture Tier (08-13)
|
||||
- 08 DataLoader
|
||||
- 09 Spatial (Convolutions)
|
||||
- 10 Tokenization
|
||||
- 11 Embeddings
|
||||
- 12 Attention
|
||||
- 13 Transformers
|
||||
|
||||
### ⏱️ Optimization Tier (14-19)
|
||||
- 14 Profiling
|
||||
- 15 Quantization
|
||||
- 16 Compression
|
||||
- 17 Memoization
|
||||
- 18 Acceleration
|
||||
- 19 Benchmarking
|
||||
|
||||
### 🏅 Capstone (20)
|
||||
- 20 Capstone (Torch Olympics)
|
||||
|
||||
## Verification Status
|
||||
|
||||
✅ **Modules directory**: Correct order (01-20)
|
||||
✅ **Documentation**: References updated to `01_tensor`
|
||||
✅ **Binder setup**: Not affected by assignment structure
|
||||
⚠️ **Assignments**: Contains outdated `01_setup` (should be removed)
|
||||
|
||||
## Recommendations
|
||||
|
||||
1. **Remove old assignment**: Delete `assignments/source/01_setup/` and `assignments/release/01_setup/`
|
||||
2. **Verify nbgrader**: Ensure nbgrader commands reference correct module numbers
|
||||
3. **Update any remaining references**: Search for `01_setup` and update to `01_tensor`
|
||||
|
||||
89
binder/NOTEBOOK_PLATFORM_RECOMMENDATION.md
Normal file
@@ -0,0 +1,89 @@
|
||||
# Notebook Platform Recommendation
|
||||
|
||||
## Current Setup
|
||||
- **MyBinder**: Zero-setup, no account needed
|
||||
- **Google Colab**: GPU access, familiar interface
|
||||
- **Marimo**: Modern reactive notebooks, Git-friendly
|
||||
|
||||
## Analysis: Do We Need All Three?
|
||||
|
||||
### Use Case: Viewing/Exploration Only
|
||||
Since online notebooks are **only for viewing/exploration** (not actual work), we should consider:
|
||||
|
||||
**Option 1: Keep All Three** ✅
|
||||
- **Pros**:
|
||||
- Students get choice
|
||||
- Different platforms have different strengths
|
||||
- Binder: Zero-setup, no account
|
||||
- Colab: GPU access for exploration
|
||||
- Marimo: Modern, educational
|
||||
- **Cons**:
|
||||
- Might be confusing (too many options)
|
||||
- More maintenance
|
||||
|
||||
**Option 2: Keep Just Binder** ✅ Recommended
|
||||
- **Pros**:
|
||||
- Simplest option (zero-setup, no account)
|
||||
- Works for viewing/exploration
|
||||
- Less confusing for students
|
||||
- Easier maintenance
|
||||
- **Cons**:
|
||||
- No GPU access (but not needed for viewing)
|
||||
- No Marimo features (but not needed for viewing)
|
||||
|
||||
**Option 3: Keep Binder + One Other**
|
||||
- Binder + Colab: Covers zero-setup + GPU exploration
|
||||
- Binder + Marimo: Covers zero-setup + modern interface
|
||||
|
||||
## Recommendation: Keep Just Binder ✅
|
||||
|
||||
**Reasoning:**
|
||||
1. **Primary use case**: Viewing/exploration (not actual work)
|
||||
2. **Binder is sufficient**: Zero-setup, no account, works for viewing
|
||||
3. **Simpler is better**: Less confusion, easier maintenance
|
||||
4. **Local is required anyway**: Students need local setup for real work
|
||||
|
||||
**What to remove:**
|
||||
- Colab launch buttons (students can still use Colab if they want, just not prominently featured)
|
||||
- Marimo badges (can add back later if there's demand)
|
||||
|
||||
**What to keep:**
|
||||
- Binder launch buttons (zero-setup viewing)
|
||||
- Clear messaging: "For viewing only - local setup required for full package"
|
||||
|
||||
## Alternative: Keep Binder + Colab
|
||||
|
||||
If you want GPU access for exploration:
|
||||
- **Keep**: Binder (zero-setup) + Colab (GPU exploration)
|
||||
- **Remove**: Marimo (newest, least familiar)
|
||||
|
||||
## Implementation
|
||||
|
||||
If we simplify to just Binder:
|
||||
|
||||
1. **Update `site/_config.yml`:**
|
||||
```yaml
|
||||
launch_buttons:
|
||||
binderhub_url: "https://mybinder.org"
|
||||
# Remove colab_url
|
||||
```
|
||||
|
||||
2. **Remove Marimo JavaScript:**
|
||||
- Remove `marimo-badges.js` from `extra_js`
|
||||
- Or keep it but make it optional
|
||||
|
||||
3. **Update documentation:**
|
||||
- Clarify that Binder is for viewing only
|
||||
- Emphasize local setup requirement
|
||||
|
||||
## Final Recommendation
|
||||
|
||||
**Keep just Binder** because:
|
||||
- ✅ Simplest option
|
||||
- ✅ Zero-setup (no account needed)
|
||||
- ✅ Sufficient for viewing/exploration
|
||||
- ✅ Less confusing
|
||||
- ✅ Students need local setup anyway for real work
|
||||
|
||||
**Optional**: Keep Colab if you want GPU access for exploration, but it's not essential since students need local setup for actual coursework.
|
||||
|
||||
105
binder/ONLINE_VS_LOCAL.md
Normal file
@@ -0,0 +1,105 @@
|
||||
# Online Notebooks vs Local Setup
|
||||
|
||||
## Important Distinction
|
||||
|
||||
### Online Notebooks (Binder, Colab, Marimo)
|
||||
**Purpose**: Viewing, learning, exploration
|
||||
|
||||
**What you CAN do:**
|
||||
- ✅ View notebook content
|
||||
- ✅ Read code and explanations
|
||||
- ✅ Run basic code cells
|
||||
- ✅ Learn from examples
|
||||
|
||||
**What you CANNOT do:**
|
||||
- ❌ Import from `tinytorch.*` package (not installed)
|
||||
- ❌ Run milestone validation scripts
|
||||
- ❌ Use `tito` CLI commands
|
||||
- ❌ Execute full experiments
|
||||
- ❌ Export modules to package
|
||||
- ❌ Complete the full development workflow
|
||||
|
||||
### Local Setup (Required)
|
||||
**Purpose**: Full package, experiments, milestone validation
|
||||
|
||||
**What you CAN do:**
|
||||
- ✅ Full `tinytorch.*` package available
|
||||
- ✅ Run milestone validation scripts
|
||||
- ✅ Use `tito` CLI commands (`tito module complete`, `tito milestone validate`)
|
||||
- ✅ Execute complete experiments
|
||||
- ✅ Export modules to package
|
||||
- ✅ Full development workflow
|
||||
|
||||
## When to Use What
|
||||
|
||||
### Use Online Notebooks When:
|
||||
- 📖 **Learning**: Reading through modules to understand concepts
|
||||
- 🔍 **Exploration**: Quick look at code examples
|
||||
- 💡 **Inspiration**: Seeing how things work before implementing
|
||||
- 🚀 **Quick Start**: Getting familiar with the structure
|
||||
|
||||
### Use Local Setup When:
|
||||
- 🏗️ **Building**: Actually implementing modules
|
||||
- ✅ **Validating**: Running milestone checks
|
||||
- 🧪 **Experimenting**: Running full experiments
|
||||
- 📦 **Exporting**: Completing modules and exporting to package
|
||||
- 🎯 **Serious Work**: Doing the actual coursework
|
||||
|
||||
## Setup Instructions
|
||||
|
||||
### Local Setup (Required for Full Package)
|
||||
|
||||
```bash
|
||||
# 1. Clone repository
|
||||
git clone https://github.com/mlsysbook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
|
||||
# 2. Create virtual environment
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate # On Windows: .venv\Scripts\activate
|
||||
|
||||
# 3. Install dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# 4. Install TinyTorch package in editable mode
|
||||
pip install -e .
|
||||
|
||||
# 5. Verify installation
|
||||
tito system doctor
|
||||
```
|
||||
|
||||
Now you have:
|
||||
- ✅ Full `tinytorch.*` package available
|
||||
- ✅ `tito` CLI commands working
|
||||
- ✅ Milestone scripts executable
|
||||
- ✅ Complete development environment
|
||||
|
||||
## Student Workflow
|
||||
|
||||
**Recommended approach:**
|
||||
|
||||
1. **Start Online**: Use Binder/Colab/Marimo to explore and understand modules
|
||||
2. **Switch to Local**: When ready to build, set up local environment
|
||||
3. **Work Locally**: Implement modules, run milestones, use CLI tools
|
||||
4. **Submit**: Export and submit `.ipynb` files for grading
|
||||
|
||||
## Common Questions
|
||||
|
||||
**Q: Can I do everything online?**
|
||||
A: No. Online notebooks are for viewing/learning. You need local setup for the full package and experiments.
|
||||
|
||||
**Q: Do I need both?**
|
||||
A: Not required, but recommended. Use online for learning, local for building.
|
||||
|
||||
**Q: Can I use online notebooks for assignments?**
|
||||
A: You can view notebooks online, but you'll need local setup to actually complete modules and run milestone validations.
|
||||
|
||||
**Q: What if I only have online access?**
|
||||
A: You can learn from online notebooks, but you won't be able to complete the full coursework without local installation.
|
||||
|
||||
## Summary
|
||||
|
||||
- **Online Notebooks**: Great for learning and exploration
|
||||
- **Local Setup**: Required for building, validating, and completing modules
|
||||
- **Best Practice**: Use online to learn, local to build
|
||||
|
||||
165
binder/PAPER_DOCUMENTATION_SYNC.md
Normal file
@@ -0,0 +1,165 @@
|
||||
# Paper Documentation Sync Checklist
|
||||
|
||||
## Analysis of paper.tex References
|
||||
|
||||
Based on analysis of `paper/paper.tex`, here are the documentation/resources mentioned and their status:
|
||||
|
||||
## ✅ Resources Mentioned in Paper
|
||||
|
||||
### 1. Module Notebooks ✅
|
||||
**Paper says**: "module notebooks, NBGrader test suites, milestone validation scripts, and connection maps"
|
||||
|
||||
**Status**:
|
||||
- ✅ Module notebooks exist: `modules/*/.*_dev.py` (source)
|
||||
- ✅ Generated via: `tito nbgrader generate`
|
||||
- ✅ Assignment notebooks: `assignments/source/`
|
||||
- ⚠️ Need to ensure all modules have notebooks generated
|
||||
|
||||
### 2. NBGrader Test Suites ✅
|
||||
**Paper says**: "NBGrader autograding infrastructure", "NBGrader test suites"
|
||||
|
||||
**Status**:
|
||||
- ✅ NBGrader integration: `tito/commands/nbgrader.py`
|
||||
- ✅ NBGrader guide: `docs/INSTRUCTOR_GUIDE.md`
|
||||
- ✅ NBGrader style guide: `docs/nbgrader/NBGRADER_STYLE_GUIDE.md`
|
||||
- ✅ NBGrader quick reference: `docs/nbgrader/NBGrader_Quick_Reference.md`
|
||||
|
||||
### 3. Milestone Validation Scripts ✅
|
||||
**Paper says**: "historical milestone validation", "milestone validation scripts"
|
||||
|
||||
**Status**:
|
||||
- ✅ Milestones exist: `milestones/` directory
|
||||
- ✅ Milestone docs: `site/chapters/milestones.md`
|
||||
- ✅ Milestone scripts: `milestones/*/` (Python scripts)
|
||||
|
||||
### 4. Connection Maps ✅
|
||||
**Paper says**: "connection maps showing prerequisite dependencies", "Text-based ASCII connection maps"
|
||||
|
||||
**Status**:
|
||||
- ✅ Connection maps in modules: Each module shows dependencies
|
||||
- ✅ Learning path: `modules/LEARNING_PATH.md`
|
||||
- ✅ Visual journey: `site/chapters/learning-journey.md`
|
||||
- ✅ Learning journey visual: `site/learning-journey-visual.md`
|
||||
|
||||
### 5. Instructor Guide ✅
|
||||
**Paper says**: "Institutional deployment provides NBGrader autograding infrastructure"
|
||||
|
||||
**Status**:
|
||||
- ✅ Instructor guide: `docs/INSTRUCTOR_GUIDE.md`
|
||||
- ✅ Classroom use: `site/usage-paths/classroom-use.md`
|
||||
- ⚠️ Need to verify it's synced with paper claims
|
||||
|
||||
### 6. Student Quickstart ✅
|
||||
**Paper says**: "Self-Paced Learning (Primary Use Case)", "zero infrastructure beyond Python"
|
||||
|
||||
**Status**:
|
||||
- ✅ Student quickstart: `docs/STUDENT_QUICKSTART.md`
|
||||
- ✅ Quickstart guide: `site/quickstart-guide.md`
|
||||
- ✅ Student workflow: `site/student-workflow.md`
|
||||
|
||||
### 7. Deployment Environments ✅
|
||||
**Paper says**: "JupyterHub (institutional server), Google Colab (zero installation), local installation (pip install tinytorch)"
|
||||
|
||||
**Status**:
|
||||
- ✅ Binder setup: `binder/` directory (for JupyterHub/Binder)
|
||||
- ✅ Colab setup: Configured in `site/_config.yml`
|
||||
- ✅ Local install: `pyproject.toml` (pip install tinytorch)
|
||||
- ✅ Documentation: `binder/README.md`, `binder/VERIFY.md`
|
||||
|
||||
### 8. Three Integration Models ✅
|
||||
**Paper says**:
|
||||
- Model 1: Self-Paced Learning
|
||||
- Model 2: Institutional Integration
|
||||
- Model 3: Team Onboarding
|
||||
|
||||
**Status**:
|
||||
- ✅ Self-paced: `site/quickstart-guide.md`, `site/student-workflow.md`
|
||||
- ✅ Institutional: `site/usage-paths/classroom-use.md`, `docs/INSTRUCTOR_GUIDE.md`
|
||||
- ⚠️ Team onboarding: May need dedicated page
|
||||
|
||||
### 9. Tier Configurations ✅
|
||||
**Paper says**: "Configuration 1: Foundation Only (Modules 01--07)", "Configuration 2: Foundation + Architecture", "Configuration 3: Optimization Focus"
|
||||
|
||||
**Status**:
|
||||
- ✅ Tier pages: `site/tiers/foundation.md`, `site/tiers/architecture.md`, `site/tiers/optimization.md`
|
||||
- ✅ Tier overviews in site structure
|
||||
|
||||
### 10. Lecture Materials ⚠️
|
||||
**Paper says**: "Lecture slides for institutional courses remain future work"
|
||||
|
||||
**Status**:
|
||||
- ⚠️ Correctly marked as future work
|
||||
- ✅ No false promises
|
||||
|
||||
## 🔍 Files to Verify/Update
|
||||
|
||||
### Critical Files to Check
|
||||
|
||||
1. **docs/INSTRUCTOR_GUIDE.md**
|
||||
- Verify it matches paper claims about NBGrader workflow
|
||||
- Check that commands match current `tito` CLI
|
||||
- Ensure module numbers are correct (01_tensor, not 01_setup)
|
||||
|
||||
2. **site/usage-paths/classroom-use.md**
|
||||
- Verify it covers all three integration models
|
||||
- Check NBGrader workflow matches paper description
|
||||
- Ensure deployment options match paper
|
||||
|
||||
3. **docs/STUDENT_QUICKSTART.md**
|
||||
- Verify it matches "zero infrastructure" claim
|
||||
- Check setup instructions are accurate
|
||||
- Ensure module references are correct
|
||||
|
||||
4. **site/quickstart-guide.md**
|
||||
- Should match student quickstart
|
||||
- Verify 15-minute claim is realistic
|
||||
- Check all links work
|
||||
|
||||
### Files That Should Exist But May Be Missing
|
||||
|
||||
1. **Team Onboarding Guide** ⚠️
|
||||
- Paper mentions "Model 3: Team Onboarding"
|
||||
- May need dedicated page or section
|
||||
- Check: `site/usage-paths/` or `docs/`
|
||||
|
||||
2. **Deployment Guide** ⚠️
|
||||
- Paper describes three environments (JupyterHub, Colab, Local)
|
||||
- Should have clear deployment instructions
|
||||
- Check: `binder/README.md` covers this
|
||||
|
||||
3. **Connection Maps Documentation** ⚠️
|
||||
- Paper mentions "connection maps showing prerequisite dependencies"
|
||||
- Should be clearly documented
|
||||
- Check: `modules/LEARNING_PATH.md` and site pages
|
||||
|
||||
## 📋 Sync Checklist
|
||||
|
||||
### Documentation Files
|
||||
- [ ] `docs/INSTRUCTOR_GUIDE.md` - Verify module numbers, commands match paper
|
||||
- [ ] `site/usage-paths/classroom-use.md` - Verify three models covered
|
||||
- [ ] `docs/STUDENT_QUICKSTART.md` - Verify accuracy, module references
|
||||
- [ ] `site/quickstart-guide.md` - Verify matches student quickstart
|
||||
- [ ] `binder/README.md` - Verify deployment environments match paper
|
||||
- [ ] `site/chapters/milestones.md` - Verify milestone descriptions match paper
|
||||
|
||||
### Missing Documentation
|
||||
- [ ] Team Onboarding Guide (Model 3) - Create if missing
|
||||
- [ ] Deployment Guide - Consolidate JupyterHub/Colab/Local instructions
|
||||
- [ ] Connection Maps Guide - Document how to read/use connection maps
|
||||
|
||||
### Website Sync
|
||||
- [ ] All documentation linked from site navigation
|
||||
- [ ] Instructor guide accessible from site
|
||||
- [ ] Student quickstart prominent on site
|
||||
- [ ] Deployment options clearly explained
|
||||
- [ ] Three integration models documented
|
||||
|
||||
## 🎯 Action Items
|
||||
|
||||
1. **Verify Instructor Guide** matches paper claims
|
||||
2. **Check module numbers** throughout (01_tensor, not 01_setup)
|
||||
3. **Create Team Onboarding guide** if missing
|
||||
4. **Consolidate deployment docs** (JupyterHub/Colab/Local)
|
||||
5. **Verify all links** in documentation work
|
||||
6. **Check site navigation** includes all key docs
|
||||
|
||||
84
binder/PAPER_FILE_REQUIREMENTS.md
Normal file
@@ -0,0 +1,84 @@
|
||||
# Exact File Requirements from paper.tex
|
||||
|
||||
## Files Explicitly Mentioned in Paper
|
||||
|
||||
Based on line-by-line analysis of `paper/paper.tex`, here are the exact files the paper says should exist:
|
||||
|
||||
### Line 988: Repository Instructor Resources
|
||||
|
||||
The paper states:
|
||||
> "The repository includes instructor resources: \texttt{CONTRIBUTING.md} (guidelines for bug reports and curriculum improvements), \texttt{INSTRUCTOR.md} (30-minute setup guide, grading rubrics, common student errors), and \texttt{MAINTENANCE.md} (support commitment through 2027, succession planning for community governance)."
|
||||
|
||||
**Required Files**:
|
||||
1. `CONTRIBUTING.md` - Guidelines for bug reports and curriculum improvements
|
||||
2. `INSTRUCTOR.md` - 30-minute setup guide, grading rubrics, common student errors
|
||||
3. ~~`MAINTENANCE.md`~~ - **User doesn't want this** (removed)
|
||||
|
||||
### Line 999: TA Guide
|
||||
|
||||
The paper states:
|
||||
> "The repository provides \texttt{TA\_GUIDE.md} documenting frequent student errors (gradient shape mismatches, disconnected computational graphs, broadcasting failures) and debugging strategies."
|
||||
|
||||
**Required File**:
|
||||
4. `TA_GUIDE.md` - Frequent student errors and debugging strategies
|
||||
|
||||
### Line 1003: Sample Solutions
|
||||
|
||||
The paper states:
|
||||
> "Sample solutions and grading rubrics in \texttt{INSTRUCTOR.md} calibrate evaluation standards."
|
||||
|
||||
**Required Content** (in INSTRUCTOR.md):
|
||||
- Sample solutions
|
||||
- Grading rubrics
|
||||
|
||||
## Summary: Required Files
|
||||
|
||||
| File | Purpose | Status |
|
||||
|------|---------|--------|
|
||||
| `CONTRIBUTING.md` | Bug reports, curriculum improvements | ✅ Exists |
|
||||
| `INSTRUCTOR.md` | Setup guide, grading rubrics, common errors, sample solutions | ✅ Created |
|
||||
| `TA_GUIDE.md` | Common errors, debugging strategies | ✅ Created |
|
||||
| `MAINTENANCE.md` | Support commitment | ❌ Removed (user preference) |
|
||||
|
||||
## What Each File Should Contain
|
||||
|
||||
### CONTRIBUTING.md
|
||||
- Guidelines for bug reports
|
||||
- Guidelines for curriculum improvements
|
||||
- Contribution process
|
||||
|
||||
### INSTRUCTOR.md
|
||||
- 30-minute setup guide
|
||||
- Grading rubrics
|
||||
- Common student errors
|
||||
- Sample solutions (for grading calibration)
|
||||
|
||||
### TA_GUIDE.md
|
||||
- Frequent student errors:
|
||||
- Gradient shape mismatches
|
||||
- Disconnected computational graphs
|
||||
- Broadcasting failures
|
||||
- Debugging strategies
|
||||
- TA preparation guidance
|
||||
|
||||
## Files NOT Mentioned in Paper
|
||||
|
||||
These are NOT required by the paper (but may be useful):
|
||||
- `TEAM_ONBOARDING.md` - Not explicitly mentioned (but Model 3 is described)
|
||||
- `MAINTENANCE.md` - Mentioned but user doesn't want it
|
||||
|
||||
## Action Items
|
||||
|
||||
1. ✅ Remove MAINTENANCE.md (done)
|
||||
2. ✅ Verify CONTRIBUTING.md matches paper description
|
||||
3. ✅ Verify INSTRUCTOR.md contains all required content:
|
||||
- 30-minute setup guide ✅
|
||||
- Grading rubrics ✅
|
||||
- Common student errors ✅
|
||||
- Sample solutions ⚠️ Need to verify
|
||||
4. ✅ Verify TA_GUIDE.md contains:
|
||||
- Gradient shape mismatches ✅
|
||||
- Disconnected computational graphs ✅
|
||||
- Broadcasting failures ✅
|
||||
- Debugging strategies ✅
|
||||
|
||||
113
binder/README.md
Normal file
@@ -0,0 +1,113 @@
|
||||
# Binder Environment Setup
|
||||
|
||||
This directory contains configuration files for running TinyTorch in cloud environments via [Binder](https://mybinder.org) and [Google Colab](https://colab.research.google.com).
|
||||
|
||||
## Files
|
||||
|
||||
- **`requirements.txt`**: Python dependencies for the Binder environment
|
||||
- **`postBuild`**: Script that runs after environment setup to install TinyTorch
|
||||
|
||||
## How It Works
|
||||
|
||||
### Binder
|
||||
|
||||
When users click the "Launch Binder" button on any notebook page in the TinyTorch documentation:
|
||||
|
||||
1. Binder reads `binder/requirements.txt` to install Python dependencies
|
||||
2. Binder runs `binder/postBuild` to install the TinyTorch package (`pip install -e .`)
|
||||
3. Users get a fully configured JupyterLab environment with TinyTorch ready to use
|
||||
|
||||
**Binder URL Format:**
|
||||
```
|
||||
https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main
|
||||
```
|
||||
|
||||
### Google Colab
|
||||
|
||||
Colab launch buttons automatically:
|
||||
1. Clone the repository
|
||||
2. Install dependencies from `binder/requirements.txt`
|
||||
3. Run setup commands (users may need to manually run `pip install -e .`)
|
||||
|
||||
**Colab URL Format:**
|
||||
```
|
||||
https://colab.research.google.com/github/mlsysbook/TinyTorch/blob/main/path/to/notebook.ipynb
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
To test your Binder setup:
|
||||
|
||||
1. **Test Binder Build:**
|
||||
```bash
|
||||
# Visit: https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main
|
||||
# Or use the badge:
|
||||
[](https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main)
|
||||
```
|
||||
|
||||
2. **Verify Installation:**
|
||||
Once Binder launches, test in a notebook:
|
||||
```python
|
||||
import tinytorch
|
||||
print(tinytorch.__version__)
|
||||
```
|
||||
|
||||
3. **Check Available Resources:**
|
||||
```python
|
||||
import os
|
||||
print("Modules:", os.listdir("modules"))
|
||||
print("Assignments:", os.listdir("assignments"))
|
||||
print("Milestones:", os.listdir("milestones"))
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Binder Build Fails
|
||||
|
||||
- Check `binder/requirements.txt` for syntax errors
|
||||
- Verify `binder/postBuild` has execute permissions (`chmod +x binder/postBuild`)
|
||||
- Review Binder build logs at: https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main?urlpath=lab/tree/logs%2Fbuild.log
|
||||
|
||||
### Colab Import Errors
|
||||
|
||||
- Ensure `binder/requirements.txt` includes all dependencies
|
||||
- Users may need to run: `!pip install -e .` in a Colab cell
|
||||
- Check that the repository is public (Colab can't access private repos)
|
||||
|
||||
### Package Not Found
|
||||
|
||||
- Verify `postBuild` script runs `pip install -e .` correctly
|
||||
- Check that `pyproject.toml` is in the repository root
|
||||
- Ensure all dependencies in `requirements.txt` are compatible
|
||||
|
||||
## Deployment Environments
|
||||
|
||||
As documented in the TinyTorch paper, three deployment environments are supported:
|
||||
|
||||
1. **JupyterHub** (institutional server)
|
||||
- 8-core/32GB supports ~50 students
|
||||
- Best for classroom use
|
||||
|
||||
2. **Google Colab** (zero installation)
|
||||
- Best for MOOCs and self-paced learning
|
||||
- No setup required from students
|
||||
|
||||
3. **Local Installation** (`pip install tinytorch`)
|
||||
- Best for self-paced learning and development
|
||||
- Full control over environment
|
||||
|
||||
## Keeping Dependencies Updated
|
||||
|
||||
When updating dependencies:
|
||||
|
||||
1. Update `requirements.txt` (root) - for local development
|
||||
2. Update `binder/requirements.txt` - for Binder/Colab
|
||||
3. Update `site/requirements.txt` - for documentation builds
|
||||
4. Keep versions synchronized where possible
|
||||
|
||||
## References
|
||||
|
||||
- [Binder Documentation](https://mybinder.readthedocs.io/)
|
||||
- [Jupyter Book Launch Buttons](https://jupyterbook.org/en/stable/interactive/launchbuttons.html)
|
||||
- [Google Colab GitHub Integration](https://colab.research.google.com/github/)
|
||||
|
||||
76
binder/REQUIRED_FILES.md
Normal file
@@ -0,0 +1,76 @@
|
||||
# Required Files Based on paper.tex
|
||||
|
||||
## Exact File References in Paper
|
||||
|
||||
### Line 988: Repository Instructor Resources
|
||||
|
||||
The paper explicitly states:
|
||||
> "The repository includes instructor resources: \texttt{CONTRIBUTING.md} (guidelines for bug reports and curriculum improvements), \texttt{INSTRUCTOR.md} (30-minute setup guide, grading rubrics, common student errors), and \texttt{MAINTENANCE.md} (support commitment through 2027, succession planning for community governance)."
|
||||
|
||||
**Required Files**:
|
||||
1. ✅ `CONTRIBUTING.md` - Guidelines for bug reports and curriculum improvements
|
||||
2. ✅ `INSTRUCTOR.md` - 30-minute setup guide, grading rubrics, common student errors
|
||||
3. ❌ `MAINTENANCE.md` - **Removed per user request** (paper mentions it but user doesn't want it)
|
||||
|
||||
### Line 999: TA Guide
|
||||
|
||||
The paper explicitly states:
|
||||
> "The repository provides \texttt{TA\_GUIDE.md} documenting frequent student errors (gradient shape mismatches, disconnected computational graphs, broadcasting failures) and debugging strategies."
|
||||
|
||||
**Required File**:
|
||||
4. ✅ `TA_GUIDE.md` - Frequent student errors and debugging strategies
|
||||
|
||||
### Line 1003: Sample Solutions
|
||||
|
||||
The paper states:
|
||||
> "Sample solutions and grading rubrics in \texttt{INSTRUCTOR.md} calibrate evaluation standards."
|
||||
|
||||
**Required Content** (must be in INSTRUCTOR.md):
|
||||
- Sample solutions (for grading calibration)
|
||||
- Grading rubrics
|
||||
|
||||
## Summary: Required Files
|
||||
|
||||
| File | Purpose | Status |
|
||||
|------|---------|--------|
|
||||
| `CONTRIBUTING.md` | Bug reports, curriculum improvements | ✅ Exists |
|
||||
| `INSTRUCTOR.md` | Setup guide, grading rubrics, common errors, sample solutions | ✅ Created |
|
||||
| `TA_GUIDE.md` | Common errors, debugging strategies | ✅ Created |
|
||||
|
||||
## Content Verification
|
||||
|
||||
### CONTRIBUTING.md ✅
|
||||
- Guidelines for bug reports ✅
|
||||
- Guidelines for curriculum improvements ✅
|
||||
|
||||
### INSTRUCTOR.md ✅
|
||||
- 30-minute setup guide ✅ (Section: "Instructor Setup")
|
||||
- Grading rubrics ✅ (Section: "Grading Rubric for ML Systems Questions")
|
||||
- Common student errors ✅ (Section: "Troubleshooting" → "Common Student Issues")
|
||||
- Sample solutions ⚠️ (Mentioned but need to verify if included)
|
||||
|
||||
### TA_GUIDE.md ✅
|
||||
- Gradient shape mismatches ✅
|
||||
- Disconnected computational graphs ✅
|
||||
- Broadcasting failures ✅
|
||||
- Debugging strategies ✅
|
||||
|
||||
## Files NOT Required by Paper
|
||||
|
||||
These files exist but are NOT explicitly mentioned in the paper:
|
||||
- `TEAM_ONBOARDING.md` - Not mentioned (but Model 3 is described in text)
|
||||
- `MAINTENANCE.md` - Mentioned but removed per user request
|
||||
- `docs/STUDENT_QUICKSTART.md` - Not explicitly mentioned
|
||||
- `site/` documentation - Not explicitly mentioned (but needed for website)
|
||||
|
||||
## Action Items
|
||||
|
||||
1. ✅ Remove MAINTENANCE.md (done)
|
||||
2. ✅ Verify CONTRIBUTING.md matches paper description
|
||||
3. ⚠️ Verify INSTRUCTOR.md has sample solutions (need to check/add if missing)
|
||||
4. ✅ Verify TA_GUIDE.md has all required errors
|
||||
|
||||
## Note on MAINTENANCE.md
|
||||
|
||||
The paper mentions `MAINTENANCE.md` but the user doesn't want it. The maintenance commitment information (support through 2027, etc.) is described in the paper text but doesn't need to be in a separate file if the user prefers not to have it.
|
||||
|
||||
348
binder/SETUP_PHASE_COMMUNITY_INTEGRATION.md
Normal file
@@ -0,0 +1,348 @@
|
||||
# Community Integration in Setup Phase
|
||||
|
||||
## Revised Vision: Early Community Engagement
|
||||
|
||||
Make community participation part of the **initial setup experience**, not something that happens after completing everything. This creates an immediate "I'm part of something bigger" moment.
|
||||
|
||||
## Updated User Journey
|
||||
|
||||
### Initial Setup Flow
|
||||
|
||||
```
|
||||
1. Clone & Setup
|
||||
↓
|
||||
2. tito system doctor (verify installation)
|
||||
✅ All checks passed!
|
||||
↓
|
||||
3. 🎉 "Welcome to TinyTorch!"
|
||||
↓
|
||||
4. [Automatic] tito community join
|
||||
→ Detects country
|
||||
→ Validates setup
|
||||
→ Adds to map
|
||||
→ Shows celebration
|
||||
↓
|
||||
5. 🌍 "You're builder #1,234 on the global map!"
|
||||
↓
|
||||
6. View map → See community worldwide
|
||||
```
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Option 1: Automatic After Setup (Recommended)
|
||||
|
||||
**After `tito system doctor` passes:**
|
||||
|
||||
```
|
||||
✅ All checks passed! Your TinyTorch environment is ready.
|
||||
|
||||
🎉 Welcome to the TinyTorch Community!
|
||||
|
||||
🌍 Join builders from around the world:
|
||||
Run 'tito community join' to add your location to the map
|
||||
(Completely optional - only shares country, not exact location)
|
||||
|
||||
💡 This is your "hello world" moment - you've successfully set up TinyTorch!
|
||||
```
|
||||
|
||||
**After `tito community join`:**
|
||||
|
||||
```
|
||||
✅ You've joined the TinyTorch Community!
|
||||
|
||||
📍 Your Location: United States
|
||||
🌍 View the map: https://tinytorch.ai/community
|
||||
|
||||
🎖️ You're builder #1,234 on the global map!
|
||||
|
||||
📊 Community Stats:
|
||||
• 1,234 builders worldwide
|
||||
• 45 countries represented
|
||||
• 5 new builders this week
|
||||
|
||||
💡 Continue building modules and run milestones to track your progress!
|
||||
```
|
||||
|
||||
### Option 2: Integrated into Setup Script
|
||||
|
||||
**In `setup-environment.sh` or `activate.sh`:**
|
||||
|
||||
```bash
|
||||
# After successful setup
|
||||
echo ""
|
||||
echo "🎉 Setup complete! Welcome to TinyTorch!"
|
||||
echo ""
|
||||
echo "🌍 Join the global community:"
|
||||
echo " Run 'tito community join' to add your location to the map"
|
||||
echo " (Optional - only shares country, completely anonymized)"
|
||||
echo ""
|
||||
```
|
||||
|
||||
### Option 3: Part of Quick Start Guide
|
||||
|
||||
**Update quickstart guide to include:**
|
||||
|
||||
```markdown
|
||||
## Step 3: Join the Community (Optional)
|
||||
|
||||
After setup, join builders from around the world:
|
||||
|
||||
```bash
|
||||
tito community join
|
||||
```
|
||||
|
||||
This adds your location (country only) to the global TinyTorch community map.
|
||||
See where other builders are located: https://tinytorch.ai/community
|
||||
```
|
||||
|
||||
## What Gets Validated
|
||||
|
||||
**For community join (setup phase):**
|
||||
- ✅ Setup verified (`tito system doctor` passed)
|
||||
- ✅ Environment working
|
||||
- ✅ Can import TinyTorch
|
||||
|
||||
**NOT required:**
|
||||
- ❌ All milestones passed (can join anytime)
|
||||
- ❌ All modules completed (can join anytime)
|
||||
- ❌ Any specific progress (just setup)
|
||||
|
||||
**Why this works:**
|
||||
- Lower barrier to entry
|
||||
- Immediate community feeling
|
||||
- Can update later with milestone progress
|
||||
- More inclusive (everyone can join)
|
||||
|
||||
## Progressive Updates
|
||||
|
||||
**Users can update their community entry:**
|
||||
|
||||
```bash
|
||||
# Initial join (after setup)
|
||||
tito community join
|
||||
# → Adds: Country, setup verified, timestamp
|
||||
|
||||
# Later: Update with milestone progress
|
||||
tito community update
|
||||
# → Updates: Milestones passed, system type, progress
|
||||
# → Same anonymous ID, just more info
|
||||
```
|
||||
|
||||
## Map Visualization
|
||||
|
||||
**The map shows:**
|
||||
- **All builders**: Everyone who joined (not just completed)
|
||||
- **Progress indicators**: Dots colored by milestone progress
|
||||
- 🟢 All milestones passed
|
||||
- 🟡 Some milestones passed
|
||||
- 🔵 Setup complete (just joined)
|
||||
- **Stats**: Total builders, countries, recent activity
|
||||
|
||||
**This creates:**
|
||||
- Visual proof of global community
|
||||
- Shows diversity of progress levels
|
||||
- Encourages continued learning
|
||||
- Makes everyone feel included
|
||||
|
||||
## Implementation Design
|
||||
|
||||
### Command: `tito community join`
|
||||
|
||||
**What it does:**
|
||||
1. Validates setup (`tito system doctor` check)
|
||||
2. Detects/asks for country
|
||||
3. Generates anonymous ID
|
||||
4. Creates submission JSON:
|
||||
```json
|
||||
{
|
||||
"anonymous_id": "abc123...",
|
||||
"timestamp": "2024-11-20T10:30:00Z",
|
||||
"country": "United States",
|
||||
"setup_verified": true,
|
||||
"milestones_passed": 0, // Will update later
|
||||
"system_type": "Apple Silicon"
|
||||
}
|
||||
```
|
||||
5. Shows celebration message
|
||||
6. Optionally uploads to map
|
||||
|
||||
### Command: `tito community update` (Optional)
|
||||
|
||||
**What it does:**
|
||||
- Updates existing entry with:
|
||||
- Milestones passed count
|
||||
- Progress updates
|
||||
- System type (if changed)
|
||||
- Uses same anonymous ID
|
||||
- Shows updated stats
|
||||
|
||||
## Setup Script Integration
|
||||
|
||||
### Update `setup-environment.sh`:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# ... existing setup code ...
|
||||
|
||||
echo ""
|
||||
echo "✅ TinyTorch setup complete!"
|
||||
echo ""
|
||||
echo "🌍 Join the global TinyTorch community:"
|
||||
echo " Run 'tito community join' to add your location to the map"
|
||||
echo " See builders from around the world: https://tinytorch.ai/community"
|
||||
echo ""
|
||||
```
|
||||
|
||||
### Or in `activate.sh`:
|
||||
|
||||
```bash
|
||||
# After activation
|
||||
if [ "$FIRST_ACTIVATION" = "true" ]; then
|
||||
echo ""
|
||||
echo "🎉 Welcome to TinyTorch!"
|
||||
echo ""
|
||||
echo "🌍 Join the community: 'tito community join'"
|
||||
echo ""
|
||||
fi
|
||||
```
|
||||
|
||||
## Quick Start Guide Integration
|
||||
|
||||
**Add to quickstart guide:**
|
||||
|
||||
```markdown
|
||||
## Step 3: Join the Community (30 seconds)
|
||||
|
||||
After setup, join builders from around the world:
|
||||
|
||||
```bash
|
||||
tito community join
|
||||
```
|
||||
|
||||
**What this does:**
|
||||
- Adds your country to the global map
|
||||
- Shows you're part of the TinyTorch community
|
||||
- Completely optional and anonymized
|
||||
|
||||
**View the map**: https://tinytorch.ai/community
|
||||
|
||||
This is your "hello world" moment - you've successfully set up TinyTorch! 🎉
|
||||
```
|
||||
|
||||
## Benefits of Setup-Phase Integration
|
||||
|
||||
### ✅ Immediate Engagement
|
||||
- Community feeling from day one
|
||||
- "I'm part of something bigger" moment
|
||||
- Visual proof of global community
|
||||
|
||||
### ✅ Lower Barrier
|
||||
- No need to complete milestones first
|
||||
- Just setup verification required
|
||||
- Everyone can participate
|
||||
|
||||
### ✅ Progressive Updates
|
||||
- Join early (setup phase)
|
||||
- Update later (milestone progress)
|
||||
- Continuous engagement
|
||||
|
||||
### ✅ Inclusive
|
||||
- All skill levels welcome
|
||||
- All progress levels shown
|
||||
- Not just "winners"
|
||||
|
||||
## Recommended Flow
|
||||
|
||||
### Phase 1: Setup Integration
|
||||
|
||||
1. **After `tito system doctor` passes:**
|
||||
- Show celebration message
|
||||
- Suggest `tito community join`
|
||||
- Explain what it does (country only, optional)
|
||||
|
||||
2. **After `tito community join`:**
|
||||
- Show map URL
|
||||
- Display community stats
|
||||
- Celebrate "you're builder #X"
|
||||
|
||||
3. **Update quickstart guide:**
|
||||
- Add community join step
|
||||
- Explain privacy model
|
||||
- Link to map
|
||||
|
||||
### Phase 2: Map Page
|
||||
|
||||
1. **Create `site/community-map.md`:**
|
||||
- Interactive world map
|
||||
- Shows all builders (not just completed)
|
||||
- Progress indicators
|
||||
- Stats and recent activity
|
||||
|
||||
2. **Update site navigation:**
|
||||
- Add "Community Map" to navigation
|
||||
- Make it discoverable
|
||||
|
||||
### Phase 3: Progressive Updates
|
||||
|
||||
1. **Milestone integration:**
|
||||
- After milestones pass, suggest update
|
||||
- `tito community update` to add progress
|
||||
- Map shows progress levels
|
||||
|
||||
## Privacy & Consent
|
||||
|
||||
**Setup-phase join:**
|
||||
- Country only (not city)
|
||||
- System type (optional)
|
||||
- Setup verified status
|
||||
- Anonymous ID (no personal info)
|
||||
|
||||
**Consent flow:**
|
||||
```
|
||||
tito community join
|
||||
|
||||
⚠️ This will add your location to the public community map.
|
||||
|
||||
📊 What will be shared:
|
||||
• Country: United States (detected)
|
||||
• System type: Apple Silicon
|
||||
• Setup status: Verified ✅
|
||||
• No personal information
|
||||
|
||||
🔒 Privacy: Only country-level location, completely anonymized
|
||||
|
||||
Continue? [y/N]: y
|
||||
|
||||
✅ You've joined the TinyTorch Community!
|
||||
🌍 View map: https://tinytorch.ai/community
|
||||
🎖️ You're builder #1,234 on the global map!
|
||||
```
|
||||
|
||||
## Success Metrics
|
||||
|
||||
**Community Growth:**
|
||||
- Number of builders who join (setup phase)
|
||||
- Geographic diversity (countries)
|
||||
- Growth rate (new builders/week)
|
||||
- Map page views
|
||||
|
||||
**Engagement:**
|
||||
- Join rate after setup
|
||||
- Return visits to map
|
||||
- Updates with milestone progress
|
||||
- Social shares
|
||||
|
||||
## Final Recommendation
|
||||
|
||||
**Integrate into setup phase:**
|
||||
|
||||
1. ✅ **After `tito system doctor`**: Suggest community join
|
||||
2. ✅ **Make it optional**: Clear consent, privacy-respecting
|
||||
3. ✅ **Celebrate immediately**: "You're builder #X"
|
||||
4. ✅ **Show the map**: Visual proof of community
|
||||
5. ✅ **Allow updates**: Can add milestone progress later
|
||||
|
||||
**The goal**: Make students feel part of a global community from the moment they successfully set up TinyTorch, not after completing everything.
|
||||
|
||||
This creates an immediate "hello world" moment where they see: "Wow, there's a community of people building ML systems all over the world, and I'm one of them!" 🌍✨
|
||||
|
||||
167
binder/VERIFY.md
Normal file
@@ -0,0 +1,167 @@
|
||||
# Binder & Colab Verification Guide
|
||||
|
||||
This guide helps you verify that Binder and Colab links are working correctly.
|
||||
|
||||
## Quick Verification Checklist
|
||||
|
||||
- [ ] Binder build completes successfully
|
||||
- [ ] TinyTorch package imports correctly in Binder
|
||||
- [ ] Colab can clone repository and install dependencies
|
||||
- [ ] Launch buttons appear on notebook pages in documentation
|
||||
- [ ] All three deployment environments work (JupyterHub, Colab, Local)
|
||||
|
||||
## Step-by-Step Verification
|
||||
|
||||
### 1. Test Binder Build
|
||||
|
||||
**Direct URL Test:**
|
||||
```
|
||||
https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main
|
||||
```
|
||||
|
||||
**What to check:**
|
||||
- Build completes without errors (may take 2-5 minutes first time)
|
||||
- JupyterLab launches successfully
|
||||
- No import errors in terminal or notebook
|
||||
|
||||
**Test in Binder Notebook:**
|
||||
```python
|
||||
# Test 1: Import TinyTorch
|
||||
import tinytorch
|
||||
print(f"TinyTorch version: {tinytorch.__version__}")
|
||||
|
||||
# Test 2: Verify modules are accessible
|
||||
import os
|
||||
assert os.path.exists("modules"), "Modules directory not found"
|
||||
assert os.path.exists("assignments"), "Assignments directory not found"
|
||||
|
||||
# Test 3: Test basic functionality
|
||||
from tinytorch.core import Tensor
|
||||
x = Tensor([1, 2, 3])
|
||||
print(f"Tensor created: {x}")
|
||||
```
|
||||
|
||||
### 2. Test Colab Integration
|
||||
|
||||
**For a specific notebook:**
|
||||
```
|
||||
https://colab.research.google.com/github/mlsysbook/TinyTorch/blob/main/assignments/source/02_tensor/02_tensor.ipynb
|
||||
```
|
||||
|
||||
**What to check:**
|
||||
- Notebook opens in Colab
|
||||
- Can run cells without errors
|
||||
- Dependencies install correctly
|
||||
|
||||
**Colab Setup Cell (add to notebooks if needed):**
|
||||
```python
|
||||
# Install TinyTorch
|
||||
!pip install -e /content/TinyTorch
|
||||
|
||||
# Verify installation
|
||||
import tinytorch
|
||||
print("TinyTorch installed successfully!")
|
||||
```
|
||||
|
||||
### 3. Verify Launch Buttons in Documentation
|
||||
|
||||
**Check that launch buttons appear:**
|
||||
1. Build the site: `cd site && jupyter-book build .`
|
||||
2. Open `_build/html/index.html` in browser
|
||||
3. Navigate to any page with notebooks
|
||||
4. Look for "Launch" buttons in the top-right corner
|
||||
|
||||
**Expected buttons:**
|
||||
- 🚀 Launch Binder
|
||||
- 🔵 Open in Colab
|
||||
- 📥 Download notebook
|
||||
|
||||
### 4. Test All Three Deployment Environments
|
||||
|
||||
As documented in `paper/paper.tex`, TinyTorch supports:
|
||||
|
||||
#### A. JupyterHub (Institutional)
|
||||
- Requires: 8-core/32GB server
|
||||
- Supports: ~50 concurrent students
|
||||
- Setup: Install via `pip install tinytorch` or mount repository
|
||||
|
||||
#### B. Google Colab (Zero Installation)
|
||||
- Best for: MOOCs and self-paced learning
|
||||
- Setup: Automatic via launch buttons
|
||||
- Verify: Test with sample notebooks
|
||||
|
||||
#### C. Local Installation
|
||||
- Best for: Self-paced learning and development
|
||||
- Setup: `pip install tinytorch`
|
||||
- Verify: Run `python -c "import tinytorch; print(tinytorch.__version__)"`
|
||||
|
||||
## Common Issues & Solutions
|
||||
|
||||
### Issue: Binder build times out
|
||||
|
||||
**Solution:**
|
||||
- Check `binder/requirements.txt` for unnecessary heavy dependencies
|
||||
- Ensure `postBuild` script is fast (< 2 minutes)
|
||||
- Consider using `environment.yml` instead if you need conda packages
|
||||
|
||||
### Issue: "Module not found" errors in Binder
|
||||
|
||||
**Solution:**
|
||||
- Verify `postBuild` script runs `pip install -e .`
|
||||
- Check that `pyproject.toml` is in repository root
|
||||
- Ensure all dependencies are in `binder/requirements.txt`
|
||||
|
||||
### Issue: Colab can't access repository
|
||||
|
||||
**Solution:**
|
||||
- Ensure repository is public (Colab can't access private repos)
|
||||
- Check that notebook path is correct in URL
|
||||
- Verify GitHub repository URL in `site/_config.yml`
|
||||
|
||||
### Issue: Launch buttons don't appear
|
||||
|
||||
**Solution:**
|
||||
- Verify `launch_buttons` configuration in `site/_config.yml`
|
||||
- Ensure repository URL and branch are correct
|
||||
- Rebuild the site: `jupyter-book build . --all`
|
||||
|
||||
## Automated Testing
|
||||
|
||||
You can add a GitHub Actions workflow to test Binder builds:
|
||||
|
||||
```yaml
|
||||
# .github/workflows/test-binder.yml
|
||||
name: Test Binder Build
|
||||
|
||||
on:
|
||||
schedule:
|
||||
- cron: '0 0 * * 0' # Weekly
|
||||
workflow_dispatch:
|
||||
|
||||
jobs:
|
||||
test-binder:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Test Binder Build
|
||||
uses: jupyterhub/repo2docker-action@master
|
||||
with:
|
||||
image-name: tinytorch-binder-test
|
||||
```
|
||||
|
||||
## Monitoring
|
||||
|
||||
**Binder Status:**
|
||||
- Check build status: https://mybinder.org/v2/gh/mlsysbook/TinyTorch/main
|
||||
- View build logs: Add `?urlpath=lab/tree/logs%2Fbuild.log` to URL
|
||||
|
||||
**Colab Status:**
|
||||
- Test with sample notebooks from `assignments/` directory
|
||||
- Monitor for import errors or dependency issues
|
||||
|
||||
## References
|
||||
|
||||
- [Binder Documentation](https://mybinder.readthedocs.io/)
|
||||
- [Jupyter Book Launch Buttons](https://jupyterbook.org/en/stable/interactive/launchbuttons.html)
|
||||
- [Google Colab GitHub Integration](https://colab.research.google.com/github/)
|
||||
|
||||
20
binder/postBuild
Executable file
@@ -0,0 +1,20 @@
|
||||
#!/bin/bash
|
||||
# Binder postBuild script
|
||||
# This runs after the environment is set up to install TinyTorch
|
||||
|
||||
set -e
|
||||
|
||||
echo "🔧 Installing TinyTorch package..."
|
||||
pip install -e .
|
||||
|
||||
echo "✅ TinyTorch installation complete!"
|
||||
echo ""
|
||||
echo "📚 Available resources:"
|
||||
echo " - TinyTorch modules: modules/"
|
||||
echo " - Course assignments: assignments/"
|
||||
echo " - Milestone examples: milestones/"
|
||||
echo ""
|
||||
echo "🚀 Start exploring with:"
|
||||
echo " - jupyter lab"
|
||||
echo " - Or open notebooks directly from the file browser"
|
||||
|
||||
28
binder/requirements.txt
Normal file
@@ -0,0 +1,28 @@
|
||||
# TinyTorch Binder Environment
|
||||
# This file is used by Binder to set up the execution environment
|
||||
# Keep synchronized with main requirements.txt and site/requirements.txt
|
||||
|
||||
# Core numerical computing (TinyTorch dependency)
|
||||
numpy>=1.24.0,<3.0.0
|
||||
|
||||
# Terminal UI (for tito CLI and development feedback)
|
||||
rich>=13.0.0
|
||||
|
||||
# Configuration files (for tito CLI)
|
||||
PyYAML>=6.0
|
||||
|
||||
# Jupyter environment
|
||||
jupyter>=1.1.0
|
||||
jupyterlab>=4.2.0
|
||||
ipykernel>=6.29.0
|
||||
ipywidgets>=8.0.0
|
||||
|
||||
# Visualization (for milestone examples and modules)
|
||||
matplotlib>=3.9.0
|
||||
|
||||
# Type checking support
|
||||
typing-extensions>=4.12.0
|
||||
|
||||
# Note: tinytorch package itself is installed via postBuild script
|
||||
# This ensures the latest code from the repository is used
|
||||
|
||||
117
docs/BASELINE_SUBMISSION_DESIGN.md
Normal file
@@ -0,0 +1,117 @@
|
||||
# Baseline & Submission Design: What Makes Sense
|
||||
|
||||
## User Concern
|
||||
|
||||
**Question**: For baseline and submitting, what makes sense? Worried that running everything can take a while.
|
||||
|
||||
## Current Design
|
||||
|
||||
### Baseline Benchmark (`tito benchmark baseline`)
|
||||
|
||||
**What it does**:
|
||||
- Quick operations (tensor ops, matmul, forward pass)
|
||||
- **Time**: ~1 second
|
||||
- **Purpose**: Setup validation, environment check
|
||||
- **Normalized**: SPEC-style to reference system
|
||||
|
||||
**Current Implementation**:
|
||||
```python
|
||||
# Quick operations only
|
||||
- Tensor operations: ~0.8ms
|
||||
- Matrix multiply: ~2.5ms
|
||||
- Forward pass: ~6.7ms
|
||||
Total: ~10ms (normalized to reference)
|
||||
```
|
||||
|
||||
### Milestones
|
||||
|
||||
**What they do**:
|
||||
- Full ML workflows (training, evaluation)
|
||||
- **Time**: Minutes (3-30 minutes per milestone)
|
||||
- **Purpose**: Historical recreations, student validation
|
||||
- **Requires**: Completed modules (student code)
|
||||
|
||||
## Recommendation: Keep Baseline Quick, Milestones Optional
|
||||
|
||||
### ✅ Baseline at Setup (Fast)
|
||||
|
||||
**Keep current approach**:
|
||||
- ✅ Quick benchmark (~1 second)
|
||||
- ✅ Validates environment works
|
||||
- ✅ Normalized to reference system
|
||||
- ✅ Good for "Hello World" moment
|
||||
- ✅ Submit to community immediately
|
||||
|
||||
**Why this works**:
|
||||
- Fast setup validation
|
||||
- Doesn't require student code
|
||||
- Meaningful baseline (normalized)
|
||||
- Community submission ready
|
||||
|
||||
### ⚠️ Milestones Later (Optional)
|
||||
|
||||
**Run milestones as students complete modules**:
|
||||
- ⚠️ Takes minutes (not seconds)
|
||||
- ⚠️ Requires completed modules
|
||||
- ⚠️ Optional for community submission
|
||||
- ✅ Better for student validation
|
||||
|
||||
**Why milestones shouldn't be at setup**:
|
||||
- Too slow (minutes vs seconds)
|
||||
- Requires student code (doesn't exist yet)
|
||||
- Better for progressive validation
|
||||
|
||||
## Submission Strategy
|
||||
|
||||
### Setup Phase: Baseline Only
|
||||
|
||||
**What to submit**:
|
||||
- ✅ Baseline benchmark results (normalized)
|
||||
- ✅ System info (country, institution, etc.)
|
||||
- ✅ Reference implementation results
|
||||
|
||||
**Why**:
|
||||
- Fast (1 second)
|
||||
- Meaningful (normalized to reference)
|
||||
- Works immediately (no student code needed)
|
||||
|
||||
### Later Phase: Milestones Optional
|
||||
|
||||
**What to submit (optional)**:
|
||||
- ⚠️ Milestone results (as students complete modules)
|
||||
- ⚠️ Student code performance vs reference
|
||||
- ⚠️ Progress tracking
|
||||
|
||||
**Why optional**:
|
||||
- Takes time (minutes per milestone)
|
||||
- Requires completed modules
|
||||
- Better for personal tracking than community
|
||||
|
||||
## Final Recommendation
|
||||
|
||||
**✅ Keep baseline quick** (current approach is correct):
|
||||
- Fast setup validation (~1 second)
|
||||
- Submit baseline to community
|
||||
- Normalized to reference system
|
||||
|
||||
**✅ Milestones stay separate**:
|
||||
- Run as students complete modules
|
||||
- Optional for community submission
|
||||
- Better for personal progress tracking
|
||||
|
||||
**Result**:
|
||||
- Setup is fast (1 second baseline)
|
||||
- Community gets meaningful data (normalized baseline)
|
||||
- Students can optionally submit milestones later
|
||||
- No time concerns at setup
|
||||
|
||||
## Implementation
|
||||
|
||||
**Current `tito benchmark baseline`**:
|
||||
- ✅ Already fast (~1 second)
|
||||
- ✅ Already normalized
|
||||
- ✅ Already prompts for submission
|
||||
- ✅ Perfect for setup phase
|
||||
|
||||
**No changes needed!** Current design is correct.
|
||||
|
||||
136
docs/BENCHMARK_NORMALIZATION.md
Normal file
@@ -0,0 +1,136 @@
|
||||
# Benchmark Normalization - SPEC-Style Reference System
|
||||
|
||||
## Overview
|
||||
|
||||
TinyTorch baseline benchmarks use **SPEC-style normalization** to ensure fair comparison across different hardware. Results are normalized to a reference system, making scores comparable regardless of your hardware.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Reference System
|
||||
|
||||
**Reference Hardware:**
|
||||
- CPU: Intel i5-8th generation
|
||||
- RAM: 16GB
|
||||
- Platform: Mid-range laptop
|
||||
|
||||
**Reference Times:**
|
||||
- Tensor Operations: 0.8ms
|
||||
- Matrix Multiply: 2.5ms
|
||||
- Forward Pass: 6.7ms
|
||||
- **Total: 10.0ms**
|
||||
|
||||
### Normalization Formula
|
||||
|
||||
**SPEC-style normalization:**
|
||||
```
|
||||
normalized_score = reference_time / actual_time
|
||||
```
|
||||
|
||||
**Score Calculation:**
|
||||
```
|
||||
score = min(100, 100 * normalized_score)
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
**Fast System (M3 Mac):**
|
||||
- Actual time: 5.0ms
|
||||
- Normalized: 10.0 / 5.0 = 2.0x
|
||||
- Score: min(100, 100 * 2.0) = **100** (capped at 100)
|
||||
|
||||
**Reference System:**
|
||||
- Actual time: 10.0ms
|
||||
- Normalized: 10.0 / 10.0 = 1.0x
|
||||
- Score: min(100, 100 * 1.0) = **100**
|
||||
|
||||
**Slower System (Older Laptop):**
|
||||
- Actual time: 20.0ms
|
||||
- Normalized: 10.0 / 20.0 = 0.5x
|
||||
- Score: min(100, 100 * 0.5) = **50**
|
||||
|
||||
## Why Normalization Matters
|
||||
|
||||
### Without Normalization
|
||||
- Fast hardware gets high scores unfairly
|
||||
- Slow hardware gets low scores unfairly
|
||||
- Can't compare optimization skill across systems
|
||||
|
||||
### With Normalization
|
||||
- ✅ Scores are comparable across hardware
|
||||
- ✅ Focus on optimization skill, not hardware
|
||||
- ✅ Fair comparison (like SPEC benchmarks)
|
||||
|
||||
## Score Interpretation
|
||||
|
||||
**Score Range:**
|
||||
- **100**: Reference system performance or better
|
||||
- **80-99**: Slightly slower than reference
|
||||
- **60-79**: Moderately slower than reference
|
||||
- **40-59**: Significantly slower than reference
|
||||
- **<40**: Very slow (may indicate setup issues)
|
||||
|
||||
**Normalized Multiplier:**
|
||||
- **>1.0x**: Faster than reference system
|
||||
- **1.0x**: Same as reference system
|
||||
- **<1.0x**: Slower than reference system
|
||||
|
||||
## Technical Details
|
||||
|
||||
### Reference Times Selection
|
||||
|
||||
Reference times are based on:
|
||||
- Mid-range consumer hardware (common student setup)
|
||||
- Conservative estimates (most systems should meet or exceed)
|
||||
- Real-world performance expectations
|
||||
|
||||
### Score Capping
|
||||
|
||||
Scores are capped at 100 to:
|
||||
- Prevent unfair advantage for very fast hardware
|
||||
- Keep focus on "setup validation" not "hardware competition"
|
||||
- Maintain educational focus
|
||||
|
||||
### Future Adjustments
|
||||
|
||||
Reference times can be updated if:
|
||||
- Hardware landscape changes significantly
|
||||
- Better baseline data becomes available
|
||||
- Community feedback suggests adjustment needed
|
||||
|
||||
## Comparison to SPEC
|
||||
|
||||
**Similarities:**
|
||||
- ✅ Normalize to reference system
|
||||
- ✅ Hardware-independent scores
|
||||
- ✅ Fair comparison across systems
|
||||
|
||||
**Differences:**
|
||||
- SPEC: Multiple benchmarks, complex scoring
|
||||
- TinyTorch: Simple baseline validation, educational focus
|
||||
- SPEC: Competitive benchmarking
|
||||
- TinyTorch: Setup validation and learning
|
||||
|
||||
## Implementation
|
||||
|
||||
Reference times are defined in `tito/commands/benchmark.py`:
|
||||
|
||||
```python
|
||||
def _get_reference_times(self) -> Dict[str, float]:
|
||||
"""Get reference times for normalization (SPEC-style)."""
|
||||
return {
|
||||
"tensor_ops": 0.8,
|
||||
"matmul": 2.5,
|
||||
"forward_pass": 6.7,
|
||||
"total": 10.0
|
||||
}
|
||||
```
|
||||
|
||||
Normalization happens automatically in `_run_baseline()`.
|
||||
|
||||
## Benefits
|
||||
|
||||
1. **Fair Comparison**: Scores mean the same thing on any hardware
|
||||
2. **Educational Focus**: Emphasizes setup validation, not hardware
|
||||
3. **Industry Standard**: Follows SPEC/MLPerf normalization principles
|
||||
4. **Motivation**: Students can achieve good scores regardless of hardware
|
||||
|
||||
339
docs/COMMUNITY_BENCHMARK_IMPLEMENTATION.md
Normal file
@@ -0,0 +1,339 @@
|
||||
# Community & Benchmark Commands - Implementation Document
|
||||
|
||||
## Overview
|
||||
|
||||
This document describes the implementation of community and benchmark commands for TinyTorch, an educational ML systems framework. The goal is to create a "Hello World" user journey where students feel part of a global cohort after completing setup and initial milestones.
|
||||
|
||||
## Design Philosophy
|
||||
|
||||
**Educational Focus**: TinyTorch is an educational framework. Community features should:
|
||||
- Encourage learning and progress, not competition
|
||||
- Create cohort feeling (students see peers, not rankings)
|
||||
- Be privacy-friendly (all data optional, anonymous IDs)
|
||||
- Work locally first, sync to website later
|
||||
|
||||
**Local-First Approach**:
|
||||
- All data stored project-locally in `.tinytorch/` directory
|
||||
- Website integration via stubs (ready for future API)
|
||||
- No external dependencies required for core functionality
|
||||
|
||||
## Implementation
|
||||
|
||||
### 1. Benchmark Commands (`tito benchmark`)
|
||||
|
||||
#### Baseline Benchmark (`tito benchmark baseline`)
|
||||
**Purpose**: Quick setup validation - "Hello World" moment
|
||||
|
||||
**What it does**:
|
||||
- Runs lightweight benchmarks (tensor ops, matrix multiply, forward pass)
|
||||
- Calculates score (0-100) based on performance
|
||||
- Saves results to `.tito/benchmarks/baseline_TIMESTAMP.json`
|
||||
- Auto-prompts for submission after completion
|
||||
|
||||
**When to run**: After setup, anytime
|
||||
|
||||
**Output Example**:
|
||||
```
|
||||
🎯 Baseline Benchmark
|
||||
|
||||
📊 Your Baseline Performance:
|
||||
• Tensor Operations: ⚡ 0.5ms
|
||||
• Matrix Multiply: ⚡ 2.3ms
|
||||
• Forward Pass: ⚡ 5.2ms
|
||||
• Score: 85/100
|
||||
|
||||
✅ Setup verified and working!
|
||||
```
|
||||
|
||||
#### Capstone Benchmark (`tito benchmark capstone`)
|
||||
**Purpose**: Full performance evaluation after Module 20
|
||||
|
||||
**What it does**:
|
||||
- Runs full benchmark suite from Module 20
|
||||
- Supports tracks: speed, compression, accuracy, efficiency, all
|
||||
- Uses Module 19's Benchmark class (when available)
|
||||
- Falls back gracefully if Module 20 not complete
|
||||
- Auto-prompts for submission after completion
|
||||
|
||||
**When to run**: After Module 20 (Capstone)
|
||||
|
||||
**Output Example**:
|
||||
```
|
||||
🏆 Capstone Benchmark Results
|
||||
|
||||
📊 Speed Track:
|
||||
• Latency: 45.2ms
|
||||
• Throughput: 22.1 ops/sec
|
||||
• Score: 92/100
|
||||
|
||||
📊 Overall Score: 90/100
|
||||
```
|
||||
|
||||
### 2. Community Commands (`tito community`)
|
||||
|
||||
#### Join (`tito community join`)
|
||||
**Purpose**: Join the global TinyTorch community
|
||||
|
||||
**What it does**:
|
||||
- Collects: country, institution, course type, experience level (all optional)
|
||||
- Generates anonymous UUID
|
||||
- Auto-detects cohort (Fall 2024, Spring 2025, etc.)
|
||||
- Saves profile to `.tinytorch/community/profile.json`
|
||||
- Shows welcome message with cohort info
|
||||
|
||||
**Privacy**: All fields optional, anonymous IDs, local storage
|
||||
|
||||
#### Update (`tito community update`)
|
||||
**Purpose**: Update community profile
|
||||
|
||||
**What it does**:
|
||||
- Updates profile fields (country, institution, course type, experience)
|
||||
- Auto-updates progress from `.tito/milestones.json` and `.tito/progress.json`
|
||||
- Interactive or command-line updates
|
||||
|
||||
#### Leave (`tito community leave`)
|
||||
**Purpose**: Remove community profile
|
||||
|
||||
**What it does**:
|
||||
- Removes profile file
|
||||
- Confirmation prompt (can skip with `--force`)
|
||||
- Preserves benchmark submissions
|
||||
|
||||
#### Stats & Profile (`tito community stats`, `tito community profile`)
|
||||
**Purpose**: View community information
|
||||
|
||||
**What it does**:
|
||||
- Shows community statistics
|
||||
- Displays full profile in table format
|
||||
- Shows progress: milestones, modules, capstone score
|
||||
|
||||
## Data Storage
|
||||
|
||||
### Project-Local Storage (`.tinytorch/`)
|
||||
|
||||
All data stored in project root, not home directory:
|
||||
|
||||
```
|
||||
.tinytorch/
|
||||
├── config.json # Configuration (website URLs, settings)
|
||||
├── community/
|
||||
│ └── profile.json # User's community profile
|
||||
└── submissions/ # Benchmark submissions (ready for website)
|
||||
```
|
||||
|
||||
### Profile Structure (`profile.json`)
|
||||
|
||||
```json
|
||||
{
|
||||
"anonymous_id": "uuid",
|
||||
"joined_at": "2024-11-20T10:30:00",
|
||||
"location": {
|
||||
"country": "United States"
|
||||
},
|
||||
"institution": {
|
||||
"name": "Harvard University",
|
||||
"type": null
|
||||
},
|
||||
"context": {
|
||||
"course_type": "university",
|
||||
"experience_level": "intermediate",
|
||||
"cohort": "Fall 2024"
|
||||
},
|
||||
"progress": {
|
||||
"setup_verified": false,
|
||||
"milestones_passed": 0,
|
||||
"modules_completed": 0,
|
||||
"capstone_score": null
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Configuration (`config.json`)
|
||||
|
||||
```json
|
||||
{
|
||||
"website": {
|
||||
"base_url": "https://tinytorch.ai",
|
||||
"community_map_url": "https://tinytorch.ai/community",
|
||||
"api_url": null,
|
||||
"enabled": false
|
||||
},
|
||||
"local": {
|
||||
"enabled": true,
|
||||
"auto_sync": false
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Website Integration Stubs
|
||||
|
||||
All commands have stubs for future website integration:
|
||||
|
||||
### Join Notification
|
||||
```python
|
||||
def _notify_website_join(self, profile: Dict[str, Any]) -> None:
|
||||
"""Stub: Notify website when user joins."""
|
||||
config = self._get_config()
|
||||
if not config.get("website", {}).get("enabled", False):
|
||||
return
|
||||
|
||||
api_url = config.get("website", {}).get("api_url")
|
||||
if api_url:
|
||||
# TODO: Implement API call when website is ready
|
||||
# import requests
|
||||
# response = requests.post(f"{api_url}/api/community/join", json=profile)
|
||||
pass
|
||||
```
|
||||
|
||||
### Leave Notification
|
||||
```python
|
||||
def _notify_website_leave(self, anonymous_id: Optional[str]) -> None:
|
||||
"""Stub: Notify website when user leaves."""
|
||||
# Similar structure
|
||||
```
|
||||
|
||||
### Benchmark Submission
|
||||
```python
|
||||
def _submit_to_website(self, submission: Dict[str, Any]) -> None:
|
||||
"""Stub: Submit benchmark results to website."""
|
||||
# Similar structure
|
||||
```
|
||||
|
||||
**Current Behavior**: Stubs check configuration. If website integration disabled (default), commands work purely locally. When enabled, stubs will make API calls.
|
||||
|
||||
## User Journey
|
||||
|
||||
### 1. Setup & Join
|
||||
```bash
|
||||
# After setup
|
||||
tito community join
|
||||
# → Collects info, saves profile, shows welcome
|
||||
|
||||
# Run baseline benchmark
|
||||
tito benchmark baseline
|
||||
# → Runs benchmarks, shows results, prompts for submission
|
||||
```
|
||||
|
||||
### 2. Progress Updates
|
||||
```bash
|
||||
# Update profile
|
||||
tito community update
|
||||
# → Updates fields, auto-updates progress
|
||||
|
||||
# View profile
|
||||
tito community profile
|
||||
# → Shows full profile with progress
|
||||
```
|
||||
|
||||
### 3. Capstone Completion
|
||||
```bash
|
||||
# After Module 20
|
||||
tito benchmark capstone
|
||||
# → Runs full benchmarks, prompts for submission
|
||||
```
|
||||
|
||||
## Privacy & Security
|
||||
|
||||
**Privacy Features**:
|
||||
- ✅ All fields optional
|
||||
- ✅ Anonymous UUIDs (no personal identifiers)
|
||||
- ✅ Local storage (user controls sharing)
|
||||
- ✅ No auto-detection (country detection disabled)
|
||||
- ✅ Explicit consent for sharing
|
||||
|
||||
**Security Considerations**:
|
||||
- Profile data stored locally (not transmitted unless user opts in)
|
||||
- Anonymous IDs prevent tracking
|
||||
- Website integration opt-in only
|
||||
|
||||
## Educational Benefits
|
||||
|
||||
**Cohort Feeling**:
|
||||
- Students see they're part of a global community
|
||||
- Cohort identification (Fall 2024, Spring 2025, etc.)
|
||||
- Institution-based cohorts (Harvard, Stanford, etc.)
|
||||
- Progress comparisons (milestones, modules completed)
|
||||
|
||||
**Motivation**:
|
||||
- "Hello World" moment after setup
|
||||
- Progress tracking and celebration
|
||||
- Community map visualization (future)
|
||||
- Peer visibility (future)
|
||||
|
||||
**Learning Support**:
|
||||
- Not competitive (no rankings)
|
||||
- Encourages sharing and learning
|
||||
- Privacy-friendly (students control data)
|
||||
|
||||
## Technical Implementation
|
||||
|
||||
### Files Created
|
||||
- `tito/commands/benchmark.py` - Benchmark commands
|
||||
- `tito/commands/community.py` - Community commands
|
||||
|
||||
### Files Modified
|
||||
- `tito/commands/__init__.py` - Added command exports
|
||||
- `tito/main.py` - Registered new commands
|
||||
|
||||
### Dependencies
|
||||
- `rich` - Beautiful terminal output (already in requirements)
|
||||
- `numpy` - Benchmark calculations (already in requirements)
|
||||
- No external API dependencies (local-first)
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
**Phase 1 (Current)**: ✅
|
||||
- Local storage
|
||||
- Basic commands
|
||||
- Website stubs
|
||||
|
||||
**Phase 2 (Future)**:
|
||||
- Website API integration
|
||||
- Community map visualization
|
||||
- Cohort filtering and comparisons
|
||||
- Progress rankings (optional, opt-in)
|
||||
|
||||
**Phase 3 (Future)**:
|
||||
- Real-time updates
|
||||
- Peer connections
|
||||
- Study groups
|
||||
- Mentorship matching
|
||||
|
||||
## Testing
|
||||
|
||||
Commands are ready to test:
|
||||
```bash
|
||||
# Test benchmark
|
||||
tito benchmark baseline
|
||||
tito benchmark capstone
|
||||
|
||||
# Test community
|
||||
tito community join
|
||||
tito community profile
|
||||
tito community update
|
||||
tito community stats
|
||||
tito community leave
|
||||
```
|
||||
|
||||
## Questions for Expert Review
|
||||
|
||||
1. **Storage Approach**: Is project-local storage (`.tinytorch/`) the right approach for an educational framework? Should we consider home directory instead?
|
||||
|
||||
2. **Privacy Model**: Is the anonymous UUID + optional fields approach appropriate for students? Any privacy concerns?
|
||||
|
||||
3. **Website Integration**: Are the stubs structured correctly? Should we use a different pattern for future API integration?
|
||||
|
||||
4. **Educational Focus**: Does this design support learning without creating unhealthy competition? Are there features we should add/remove?
|
||||
|
||||
5. **Cohort Features**: Is cohort identification (Fall 2024, institution-based) the right approach? Should we add more cohort types?
|
||||
|
||||
6. **Benchmark Design**: Are baseline and capstone benchmarks appropriate? Should we add more benchmark types?
|
||||
|
||||
7. **Data Collection**: What data should we collect? What should we avoid?
|
||||
|
||||
8. **Community Map**: Is a global map visualization appropriate for an educational framework? Privacy concerns?
|
||||
|
||||
9. **Integration Points**: Should we integrate with existing systems (GitHub, LMS, etc.)?
|
||||
|
||||
10. **Scalability**: Will this design scale to thousands of students? What bottlenecks should we anticipate?
|
||||
|
||||
136
docs/CONFIGURATION_SETUP.md
Normal file
@@ -0,0 +1,136 @@
|
||||
# Community Configuration Setup
|
||||
|
||||
## Storage Location
|
||||
|
||||
All community data is stored **project-locally** in `.tinytorch/` directory (not in home directory):
|
||||
|
||||
```
|
||||
.tinytorch/
|
||||
├── config.json # Configuration (website URLs, settings)
|
||||
├── community/
|
||||
│ └── profile.json # User's community profile
|
||||
└── submissions/ # Benchmark submissions (ready for website)
|
||||
```
|
||||
|
||||
## Configuration File (`.tinytorch/config.json`)
|
||||
|
||||
The configuration file is automatically created on first use with these defaults:
|
||||
|
||||
```json
|
||||
{
|
||||
"website": {
|
||||
"base_url": "https://tinytorch.ai",
|
||||
"community_map_url": "https://tinytorch.ai/community",
|
||||
"api_url": null,
|
||||
"enabled": false
|
||||
},
|
||||
"local": {
|
||||
"enabled": true,
|
||||
"auto_sync": false
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Configuration Fields
|
||||
|
||||
**Website Settings:**
|
||||
- `base_url`: Base URL for TinyTorch website
|
||||
- `community_map_url`: URL to community map page
|
||||
- `api_url`: API endpoint URL (set when API is ready)
|
||||
- `enabled`: Enable website integration (set to `true` when ready)
|
||||
|
||||
**Local Settings:**
|
||||
- `enabled`: Always `true` - local storage is always enabled
|
||||
- `auto_sync`: Auto-sync to website when enabled (future feature)
|
||||
|
||||
## Website Integration Stubs
|
||||
|
||||
All commands have stubs for website integration that are currently disabled:
|
||||
|
||||
### Join Command
|
||||
```python
|
||||
def _notify_website_join(self, profile: Dict[str, Any]) -> None:
|
||||
"""Stub: Notify website when user joins."""
|
||||
config = self._get_config()
|
||||
if not config.get("website", {}).get("enabled", False):
|
||||
return
|
||||
|
||||
api_url = config.get("website", {}).get("api_url")
|
||||
if api_url:
|
||||
# TODO: Implement API call when website is ready
|
||||
# import requests
|
||||
# response = requests.post(f"{api_url}/api/community/join", json=profile)
|
||||
pass
|
||||
```
|
||||
|
||||
### Leave Command
|
||||
```python
|
||||
def _notify_website_leave(self, anonymous_id: Optional[str]) -> None:
|
||||
"""Stub: Notify website when user leaves."""
|
||||
# Similar structure to join
|
||||
```
|
||||
|
||||
### Benchmark Submission
|
||||
```python
|
||||
def _submit_to_website(self, submission: Dict[str, Any]) -> None:
|
||||
"""Stub: Submit benchmark results to website."""
|
||||
# Similar structure
|
||||
```
|
||||
|
||||
## Enabling Website Integration
|
||||
|
||||
When the website API is ready:
|
||||
|
||||
1. **Update configuration:**
|
||||
```json
|
||||
{
|
||||
"website": {
|
||||
"api_url": "https://api.tinytorch.ai",
|
||||
"enabled": true
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
2. **Implement API calls:**
|
||||
- Uncomment TODO sections in `community.py` and `benchmark.py`
|
||||
- Add `requests` dependency if needed
|
||||
- Implement error handling
|
||||
|
||||
3. **Test integration:**
|
||||
- Test join/leave notifications
|
||||
- Test benchmark submission
|
||||
- Verify data sync
|
||||
|
||||
## Current Behavior (Local-Only)
|
||||
|
||||
**All commands work locally:**
|
||||
- ✅ `tito community join` - Saves profile to `.tinytorch/community/profile.json`
|
||||
- ✅ `tito community update` - Updates local profile
|
||||
- ✅ `tito community leave` - Removes local profile
|
||||
- ✅ `tito benchmark baseline` - Saves to `.tito/benchmarks/`
|
||||
- ✅ `tito benchmark capstone` - Saves to `.tito/benchmarks/`
|
||||
|
||||
**Website stubs are present but disabled:**
|
||||
- Stubs call `_get_config()` to check if website is enabled
|
||||
- If disabled (default), commands work purely locally
|
||||
- When enabled, stubs will make API calls
|
||||
|
||||
## Benefits of Project-Local Storage
|
||||
|
||||
1. **Version Control Friendly**: `.tinytorch/` can be gitignored or committed
|
||||
2. **Project-Specific**: Each TinyTorch project has its own community profile
|
||||
3. **Portable**: Easy to move/share projects with their data
|
||||
4. **Privacy**: Data stays in project, not in home directory
|
||||
|
||||
## Migration Notes
|
||||
|
||||
If you had data in `~/.tinytorch/`, you can migrate:
|
||||
|
||||
```bash
|
||||
# Copy old data to new location
|
||||
cp -r ~/.tinytorch/community .tinytorch/
|
||||
cp ~/.tinytorch/config.json .tinytorch/config.json # if exists
|
||||
```
|
||||
|
||||
The new system will automatically use `.tinytorch/` in the project root.
|
||||
|
||||
62
docs/DOCUMENTATION_STRUCTURE.md
Normal file
@@ -0,0 +1,62 @@
|
||||
# Documentation Structure - Single Source of Truth
|
||||
|
||||
## Site Documentation (`site/`)
|
||||
**Purpose**: User-facing website content (built with Jupyter Book)
|
||||
|
||||
**Files**:
|
||||
- `site/community.md` - Community features for website visitors
|
||||
- `site/quickstart-guide.md` - Quick start guide
|
||||
- `site/student-workflow.md` - Student workflow guide
|
||||
- `site/instructor-guide.md` - Instructor guide (copied from docs/)
|
||||
- `site/usage-paths/classroom-use.md` - Classroom usage guide
|
||||
|
||||
**Build**: These files are built into the website via `make html` in `site/`
|
||||
|
||||
## Developer Documentation (`docs/`)
|
||||
**Purpose**: Technical documentation for developers and experts
|
||||
|
||||
**Files**:
|
||||
- `docs/COMMUNITY_BENCHMARK_IMPLEMENTATION.md` - Full implementation details
|
||||
- `docs/EXPERT_FEEDBACK_ANALYSIS.md` - Expert feedback analysis
|
||||
- `docs/EXPERT_FEEDBACK_REQUEST.md` - Questions for experts
|
||||
- `docs/PRIVACY_DATA_RETENTION.md` - Privacy policy
|
||||
- `docs/CONFIGURATION_SETUP.md` - Configuration guide
|
||||
- `docs/COMMUNITY_FEATURES_SUMMARY.md` - Quick summary
|
||||
|
||||
**Note**: These are NOT included in the website build - they're for developers/experts
|
||||
|
||||
## Root Documentation
|
||||
**Purpose**: Repository-level documentation
|
||||
|
||||
**Files**:
|
||||
- `README.md` - Main repository README
|
||||
- `CONTRIBUTING.md` - Contribution guidelines
|
||||
- `INSTRUCTOR.md` - Instructor guide (root copy)
|
||||
- `TA_GUIDE.md` - TA guide (root copy)
|
||||
|
||||
## Single Source Principle
|
||||
|
||||
**Site files** (`site/*.md`):
|
||||
- ✅ Single source: `site/community.md` is the ONLY community page for website
|
||||
- ✅ No duplicates in `docs/` for website content
|
||||
|
||||
**Developer docs** (`docs/*.md`):
|
||||
- ✅ Technical details for developers
|
||||
- ✅ NOT built into website (separate purpose)
|
||||
|
||||
**Root docs** (`*.md`):
|
||||
- ✅ Repository-level documentation
|
||||
- ✅ Referenced by paper.tex
|
||||
|
||||
## File Locations Summary
|
||||
|
||||
| Content Type | Location | Purpose | Built into Site? |
|
||||
|-------------|----------|---------|------------------|
|
||||
| Community features | `site/community.md` | Website page | ✅ Yes |
|
||||
| Quick start | `site/quickstart-guide.md` | Website page | ✅ Yes |
|
||||
| Student workflow | `site/student-workflow.md` | Website page | ✅ Yes |
|
||||
| Implementation details | `docs/COMMUNITY_*.md` | Developer docs | ❌ No |
|
||||
| Privacy policy | `docs/PRIVACY_*.md` | Developer docs | ❌ No |
|
||||
| Expert feedback | `docs/EXPERT_*.md` | Developer docs | ❌ No |
|
||||
|
||||
**All documentation is in the correct location with no duplicates.**
|
||||
88
docs/EXPERT_ANALYSIS_SETUP_VALIDATION.md
Normal file
@@ -0,0 +1,88 @@
|
||||
# Expert Analysis: Setup Validation Approach
|
||||
|
||||
## Research Summary
|
||||
|
||||
Based on research into MLPerf, SPEC benchmarks, and educational ML frameworks, here's expert-informed analysis.
|
||||
|
||||
**Final Decision**: Keep current baseline approach (fast, ~1 second) rather than milestone-based validation. See `BASELINE_SUBMISSION_DESIGN.md` for final design.
|
||||
|
||||
## Key Findings
|
||||
|
||||
### 1. MLPerf Approach: Reference Implementation Required
|
||||
|
||||
**MLPerf Practice**:
|
||||
- ✅ **Reference implementations are standard** - everyone runs same reference code
|
||||
- ✅ **Baseline measurements** - establish reference performance first
|
||||
- ✅ **Normalized comparison** - results normalized to reference system
|
||||
- ✅ **Comprehensive validation** - full workflow testing, not just basic ops
|
||||
|
||||
**Key Insight**: MLPerf requires reference implementations for fair comparison. This supports your original vision!
|
||||
|
||||
### 2. SPEC Approach: Reference System Normalization
|
||||
|
||||
**SPEC Practice**:
|
||||
- ✅ **Reference system defined** - specific hardware configuration
|
||||
- ✅ **Normalized scores** - all results normalized to reference
|
||||
- ✅ **Comprehensive benchmarks** - full application workloads
|
||||
- ✅ **Baseline establishment** - reference performance is baseline
|
||||
|
||||
**Key Insight**: SPEC uses comprehensive benchmarks normalized to reference. This aligns with milestone approach!
|
||||
|
||||
### 3. Educational Framework Best Practices
|
||||
|
||||
**Research Findings**:
|
||||
- ✅ **Milestone-based validation** - recognized best practice for educational platforms
|
||||
- ✅ **Progressive validation** - validate at each stage, not just setup
|
||||
- ✅ **Clear expectations** - students see what they're working toward
|
||||
- ✅ **Reference comparisons** - compare student work to reference implementations
|
||||
|
||||
**Key Insight**: Educational frameworks use milestone-based validation with reference comparisons!
|
||||
|
||||
## Expert Recommendations
|
||||
|
||||
### ✅ Milestone-Based Validation is Appropriate
|
||||
|
||||
**Why**:
|
||||
1. **Industry Standard**: MLPerf and SPEC use comprehensive benchmarks
|
||||
2. **Educational Best Practice**: Milestone validation is recognized approach
|
||||
3. **Better Baseline**: Real milestone results more meaningful than basic ops
|
||||
4. **Fair Comparison**: Reference implementation ensures fairness
|
||||
|
||||
### ✅ Reference Fallback is Standard Practice
|
||||
|
||||
**Why**:
|
||||
1. **MLPerf Does This**: Reference implementations are standard
|
||||
2. **Educational Tools Do This**: Compare student code to reference
|
||||
3. **Fair Comparison**: Everyone runs same reference code
|
||||
4. **Progressive Validation**: Students compare their code to reference
|
||||
|
||||
### ⚠️ Implementation Considerations
|
||||
|
||||
**Best Practices**:
|
||||
1. **Clear Labeling**: Mark results as "reference" vs "student"
|
||||
2. **Normalization**: Normalize to reference system (SPEC-style)
|
||||
3. **Progressive**: Run milestones as students complete modules
|
||||
4. **Transparency**: Show what's reference vs student code
|
||||
|
||||
## Recommendation
|
||||
|
||||
**✅ Your Original Vision is Correct!**
|
||||
|
||||
**Milestone-based setup validation with reference fallback**:
|
||||
- ✅ Aligns with MLPerf/SPEC practices
|
||||
- ✅ Follows educational framework best practices
|
||||
- ✅ Creates better student experience
|
||||
- ✅ Provides meaningful baseline results
|
||||
|
||||
**Implementation**:
|
||||
1. Add reference fallback to milestones (PyTorch if `tinytorch.*` fails)
|
||||
2. Run milestones at setup with reference implementation
|
||||
3. Generate normalized baseline results
|
||||
4. Students later run with THEIR code and compare
|
||||
|
||||
## Conclusion
|
||||
|
||||
**Expert consensus**: Milestone-based validation with reference fallback is the right approach for educational ML frameworks. It aligns with industry standards (MLPerf, SPEC) and educational best practices.
|
||||
|
||||
**Your original idea was correct!** The challenge is implementation, not concept.
|
||||
|
||||
154
docs/EXPERT_FEEDBACK_REQUEST.md
Normal file
@@ -0,0 +1,154 @@
|
||||
# Expert Feedback Request: Community & Benchmark Features
|
||||
|
||||
**TinyTorch** - Educational ML Systems Framework
|
||||
|
||||
We're building community and benchmark features for TinyTorch, an educational framework where students build ML components from scratch. Seeking feedback from TensorFlow/PyTorch community experts and educational ML framework developers.
|
||||
|
||||
## Context
|
||||
|
||||
We're building **TinyTorch**, an educational ML systems framework where students build ML components from scratch (tensors, autograd, optimizers, CNNs, transformers, etc.). We're implementing community and benchmark features to create a "Hello World" user journey where students feel part of a global cohort.
|
||||
|
||||
**Key Question**: Is our design approach appropriate for an educational framework? What would you recommend?
|
||||
|
||||
## Our Design
|
||||
|
||||
### 1. Storage Approach
|
||||
- **Project-local storage** (`.tinytorch/` in project root, not `~/.tinytorch/`)
|
||||
- Rationale: Version control friendly, project-specific, portable
|
||||
- **Question**: Is this the right approach? Should we use home directory instead?
|
||||
|
||||
### 2. Privacy Model
|
||||
- **Anonymous UUIDs** for all users
|
||||
- **All fields optional** (country, institution, course type, experience)
|
||||
- **Local-first**: Data stored locally, website sync opt-in
|
||||
- **Question**: Is this privacy model appropriate for students? Any concerns?
|
||||
|
||||
### 3. Community Features
|
||||
- **Join/Leave/Update** commands for community profile
|
||||
- **Cohort identification** (Fall 2024, Spring 2025, institution-based)
|
||||
- **Progress tracking** (milestones, modules, capstone score)
|
||||
- **No rankings** (educational focus, not competitive)
|
||||
- **Question**: Does this support learning without unhealthy competition? Missing features?
|
||||
|
||||
### 4. Benchmark Commands
|
||||
- **Baseline benchmark**: Quick setup validation ("Hello World" moment)
|
||||
- **Capstone benchmark**: Full performance evaluation after Module 20
|
||||
- **Auto-submit prompt**: After benchmarks, asks if user wants to submit
|
||||
- **Question**: Are these benchmark types appropriate? Should we add more?
|
||||
|
||||
### 5. Website Integration
|
||||
- **Stubs for future API**: Commands work locally, ready for website sync
|
||||
- **Configuration-based**: Enable/disable website integration via config
|
||||
- **Question**: Is this stub pattern correct? Better approaches?
|
||||
|
||||
## Specific Questions
|
||||
|
||||
### For TensorFlow/PyTorch Community Experts
|
||||
|
||||
1. **Storage Location**:
|
||||
- We use project-local `.tinytorch/` directory. Is this appropriate for an educational framework?
|
||||
- Should we consider home directory (`~/.tinytorch/`) instead?
|
||||
- What do TensorFlow/PyTorch educational tools use?
|
||||
|
||||
2. **Privacy & Data Collection**:
|
||||
- We collect: country, institution, course type, experience level (all optional)
|
||||
- Anonymous UUIDs, no personal names
|
||||
- Is this appropriate for students? Any privacy concerns?
|
||||
- What data should we collect/avoid?
|
||||
|
||||
3. **Community Design**:
|
||||
- Focus on cohort feeling, not competition
|
||||
- No rankings, just progress tracking
|
||||
- Is this the right approach for education?
|
||||
- Should we add competitive features (opt-in)?
|
||||
|
||||
4. **Benchmark Design**:
|
||||
- Baseline (setup validation) + Capstone (full evaluation)
|
||||
- Should we add more benchmark types?
|
||||
- How should we handle different hardware/performance?
|
||||
|
||||
5. **Website Integration**:
|
||||
- Local-first with stubs for future API
|
||||
- Is this pattern correct?
|
||||
- Should we use a different approach?
|
||||
|
||||
6. **Scalability**:
|
||||
- Will this design scale to thousands of students?
|
||||
- What bottlenecks should we anticipate?
|
||||
- Should we plan for distributed storage?
|
||||
|
||||
7. **Educational Best Practices**:
|
||||
- What features encourage learning without creating unhealthy competition?
|
||||
- Should we add peer connections, study groups, mentorship?
|
||||
- What features do successful educational ML frameworks have?
|
||||
|
||||
8. **Integration Points**:
|
||||
- Should we integrate with GitHub, LMS, or other systems?
|
||||
- What integrations would be most valuable for students?
|
||||
|
||||
## Our Implementation
|
||||
|
||||
### Commands
|
||||
- `tito benchmark baseline` - Quick setup validation
|
||||
- `tito benchmark capstone` - Full Module 20 benchmarks
|
||||
- `tito community join` - Join community (collects optional info)
|
||||
- `tito community update` - Update profile
|
||||
- `tito community leave` - Remove profile
|
||||
- `tito community stats` - View statistics
|
||||
- `tito community profile` - View profile
|
||||
|
||||
### Data Storage
|
||||
```
|
||||
.tinytorch/
|
||||
├── config.json # Configuration
|
||||
├── community/
|
||||
│ └── profile.json # User profile
|
||||
└── submissions/ # Benchmark submissions
|
||||
```
|
||||
|
||||
### Profile Structure
|
||||
```json
|
||||
{
|
||||
"anonymous_id": "uuid",
|
||||
"joined_at": "2024-11-20T10:30:00",
|
||||
"location": {"country": "United States"},
|
||||
"institution": {"name": "Harvard University"},
|
||||
"context": {
|
||||
"course_type": "university",
|
||||
"experience_level": "intermediate",
|
||||
"cohort": "Fall 2024"
|
||||
},
|
||||
"progress": {
|
||||
"milestones_passed": 0,
|
||||
"modules_completed": 0,
|
||||
"capstone_score": null
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## What We're Looking For
|
||||
|
||||
**Feedback on**:
|
||||
1. Design approach (is it right for education?)
|
||||
2. Privacy model (appropriate for students?)
|
||||
3. Storage location (project-local vs home?)
|
||||
4. Feature set (missing anything important?)
|
||||
5. Scalability (will it work at scale?)
|
||||
6. Best practices (what should we do differently?)
|
||||
|
||||
**Recommendations on**:
|
||||
1. What features to add/remove
|
||||
2. How to structure data
|
||||
3. How to integrate with website
|
||||
4. How to scale to thousands of students
|
||||
5. What successful educational frameworks do
|
||||
|
||||
## Contact
|
||||
|
||||
We'd love to hear from:
|
||||
- TensorFlow/PyTorch community experts
|
||||
- Educational ML framework developers
|
||||
- Anyone with experience building community features for educational tools
|
||||
|
||||
**Thank you for your time and expertise!**
|
||||
|
||||
105
docs/EXPERT_OPINION_REQUEST.md
Normal file
@@ -0,0 +1,105 @@
|
||||
# Expert Opinion Request: Setup Validation Approach
|
||||
|
||||
## Question for ML Systems Experts
|
||||
|
||||
**Context**: We're building TinyTorch, an educational ML framework where students build ML components from scratch (tensors, autograd, optimizers, CNNs, transformers, etc.).
|
||||
|
||||
**Decision Point**: How should we validate setup and create baseline results?
|
||||
|
||||
## Two Approaches
|
||||
|
||||
### Approach 1: Quick Baseline Benchmark (Current)
|
||||
|
||||
**What**: Run lightweight benchmarks (tensor ops, matrix multiply, forward pass) - ~1 second
|
||||
|
||||
**Pros**:
|
||||
- ✅ Fast setup validation
|
||||
- ✅ Doesn't require student code
|
||||
- ✅ Normalized to reference system (SPEC-style)
|
||||
- ✅ Simple and reliable
|
||||
|
||||
**Cons**:
|
||||
- ❌ Limited validation (just basic ops)
|
||||
- ❌ Not comprehensive
|
||||
- ❌ Doesn't test full ML workflows
|
||||
|
||||
### Approach 2: Milestone-Based Validation (Proposed)
|
||||
|
||||
**What**: Run full milestone scripts with reference implementation fallback (PyTorch if `tinytorch.*` unavailable)
|
||||
|
||||
**Pros**:
|
||||
- ✅ Comprehensive validation (full ML workflows)
|
||||
- ✅ Meaningful baseline results (real milestone performance)
|
||||
- ✅ Better "Hello World" moment (students see what they'll build)
|
||||
- ✅ Fair comparison (everyone runs same reference)
|
||||
|
||||
**Cons**:
|
||||
- ⚠️ More complex (requires fallback logic)
|
||||
- ⚠️ Takes longer (minutes vs seconds)
|
||||
- ⚠️ Requires modifying milestones
|
||||
|
||||
## Technical Implementation
|
||||
|
||||
**Reference Fallback Approach**:
|
||||
```python
|
||||
# In milestone scripts
|
||||
try:
|
||||
from tinytorch import Tensor, Linear, ReLU
|
||||
implementation = "student"
|
||||
except ImportError:
|
||||
import torch
|
||||
Tensor = torch.Tensor
|
||||
Linear = torch.nn.Linear
|
||||
ReLU = torch.nn.ReLU
|
||||
implementation = "reference"
|
||||
```
|
||||
|
||||
**Results**:
|
||||
- Setup: "Reference baseline: 95% accuracy"
|
||||
- Later: "Your code: 92% accuracy (vs reference: 95%)"
|
||||
|
||||
## Questions for Experts
|
||||
|
||||
1. **Setup Validation**: Should setup validation be quick (basic ops) or comprehensive (full workflows)?
|
||||
|
||||
2. **Reference Implementation**: Is it appropriate to use PyTorch as reference fallback in educational frameworks?
|
||||
|
||||
3. **Baseline Results**: Should baseline be environment-only or framework-level (milestone results)?
|
||||
|
||||
4. **Student Experience**: What creates better "Hello World" moment - quick validation or seeing real results?
|
||||
|
||||
5. **Best Practices**: What do successful educational ML frameworks (Fast.ai, PyTorch Lightning tutorials) do?
|
||||
|
||||
6. **Normalization**: Should we normalize milestone results to reference system (like SPEC)?
|
||||
|
||||
7. **Complexity Trade-off**: Is added complexity worth comprehensive validation?
|
||||
|
||||
## Our Context
|
||||
|
||||
- **Educational Focus**: Students build everything from scratch
|
||||
- **20 Modules**: Progressive complexity (tensors → transformers)
|
||||
- **6 Milestones**: Historical recreations (1957-2018)
|
||||
- **Community Goal**: Students feel part of global cohort
|
||||
|
||||
## What We're Seeking
|
||||
|
||||
**Expert opinion on**:
|
||||
- Which approach is better for educational frameworks?
|
||||
- Is reference fallback appropriate?
|
||||
- Should setup be quick or comprehensive?
|
||||
- What creates best student experience?
|
||||
|
||||
**Recommendations on**:
|
||||
- Best practices from industry (MLPerf, SPEC)
|
||||
- What successful educational frameworks do
|
||||
- How to balance simplicity vs comprehensiveness
|
||||
|
||||
## Contact
|
||||
|
||||
We'd love feedback from:
|
||||
- MLPerf/SPEC benchmark experts
|
||||
- Educational ML framework developers
|
||||
- ML systems engineers with educational experience
|
||||
|
||||
**Thank you for your expertise!**
|
||||
|
||||
@@ -43,12 +43,12 @@ We've wrapped NBGrader behind simple `tito grade` commands so you don't need to
|
||||
### **1. Prepare Assignments**
|
||||
```bash
|
||||
# Generate instructor version (with solutions)
|
||||
tito grade generate 01_setup
|
||||
tito grade generate 01_tensor
|
||||
|
||||
# Create student version (solutions removed)
|
||||
tito grade release 01_setup
|
||||
tito grade release 01_tensor
|
||||
|
||||
# Student version will be in: release/tinytorch/01_setup/
|
||||
# Student version will be in: release/tinytorch/01_tensor/
|
||||
```
|
||||
|
||||
### **2. Distribute to Students**
|
||||
@@ -65,25 +65,25 @@ tito grade release 01_setup
|
||||
### **3. Collect Submissions**
|
||||
```bash
|
||||
# Collect all students
|
||||
tito grade collect 01_setup
|
||||
tito grade collect 01_tensor
|
||||
|
||||
# Or specific student
|
||||
tito grade collect 01_setup --student student_id
|
||||
tito grade collect 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **4. Auto-Grade**
|
||||
```bash
|
||||
# Grade all submissions
|
||||
tito grade autograde 01_setup
|
||||
tito grade autograde 01_tensor
|
||||
|
||||
# Grade specific student
|
||||
tito grade autograde 01_setup --student student_id
|
||||
tito grade autograde 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **5. Manual Review**
|
||||
```bash
|
||||
# Open grading interface (browser-based)
|
||||
tito grade manual 01_setup
|
||||
tito grade manual 01_tensor
|
||||
|
||||
# This launches a web interface for:
|
||||
# - Reviewing ML Systems question responses
|
||||
@@ -94,7 +94,7 @@ tito grade manual 01_setup
|
||||
### **6. Generate Feedback**
|
||||
```bash
|
||||
# Create feedback files for students
|
||||
tito grade feedback 01_setup
|
||||
tito grade feedback 01_tensor
|
||||
```
|
||||
|
||||
### **7. Export Grades**
|
||||
@@ -103,7 +103,7 @@ tito grade feedback 01_setup
|
||||
tito grade export
|
||||
|
||||
# Or specific module
|
||||
tito grade export --module 01_setup --output grades_module01.csv
|
||||
tito grade export --module 01_tensor --output grades_module01.csv
|
||||
```
|
||||
|
||||
## 📊 Grading Components
|
||||
@@ -138,17 +138,12 @@ tito grade export --module 01_setup --output grades_module01.csv
|
||||
|
||||
## 📚 Module Teaching Notes
|
||||
|
||||
### **Module 01: Setup**
|
||||
- **Focus**: Environment configuration, systems thinking mindset
|
||||
- **Key Concept**: Development environments matter for ML systems
|
||||
- **Common Issues**: Virtual environment confusion
|
||||
|
||||
### **Module 02: Tensor**
|
||||
### **Module 01: Tensor**
|
||||
- **Focus**: Memory layout, data structures
|
||||
- **Key Concept**: Understanding memory is crucial for ML performance
|
||||
- **Demo**: Show memory profiling, copying behavior
|
||||
|
||||
### **Module 03: Activations**
|
||||
### **Module 02: Activations**
|
||||
- **Focus**: Vectorization, numerical stability
|
||||
- **Key Concept**: Small details matter at scale
|
||||
- **Demo**: Gradient vanishing/exploding
|
||||
|
||||
110
docs/PRIVACY_DATA_RETENTION.md
Normal file
@@ -0,0 +1,110 @@
|
||||
# Privacy & Data Retention Policy
|
||||
|
||||
## Data Collection
|
||||
|
||||
TinyTorch collects **optional** information to build a community map and support learning:
|
||||
|
||||
- **Country** (optional) - For global visualization
|
||||
- **Institution** (optional) - For cohort identification
|
||||
- **Course Type** (optional) - For community insights
|
||||
- **Experience Level** (optional) - For learning support
|
||||
|
||||
**We do NOT collect:**
|
||||
- Personal names
|
||||
- Email addresses (unless user provides)
|
||||
- IP addresses
|
||||
- Any personally identifiable information
|
||||
|
||||
## Anonymous Identification
|
||||
|
||||
All users are assigned an **anonymous UUID** when joining the community. This UUID:
|
||||
- Cannot be linked to personal identity
|
||||
- Is randomly generated
|
||||
- Is stored locally in your project
|
||||
|
||||
## Data Storage
|
||||
|
||||
**Location**: `.tinytorch/` directory (project-local, not home directory)
|
||||
|
||||
**Files**:
|
||||
- `.tinytorch/community/profile.json` - Your community profile
|
||||
- `.tinytorch/config.json` - Configuration settings
|
||||
- `.tito/benchmarks/` - Benchmark results
|
||||
- `.tito/submissions/` - Submission files
|
||||
|
||||
**Privacy**: All data is stored locally in your project. You control what is shared.
|
||||
|
||||
## Data Retention
|
||||
|
||||
**Local Storage**: Data persists until you:
|
||||
- Run `tito community leave` (removes profile)
|
||||
- Delete `.tinytorch/` directory
|
||||
- Remove specific files manually
|
||||
|
||||
**Website Sync** (when enabled):
|
||||
- Data synced to website is retained according to website privacy policy
|
||||
- You can request deletion via `tito community leave`
|
||||
- Local data is always removed immediately
|
||||
|
||||
## User Rights
|
||||
|
||||
**Right to Access**: View your data with `tito community profile`
|
||||
|
||||
**Right to Update**: Update your data with `tito community update`
|
||||
|
||||
**Right to Deletion**: Remove your data with `tito community leave`
|
||||
|
||||
**Right to Opt-Out**: All data collection is optional. You can:
|
||||
- Skip fields during `tito community join`
|
||||
- Leave community anytime with `tito community leave`
|
||||
- Never join community (all features work without joining)
|
||||
|
||||
## Consent
|
||||
|
||||
**Explicit Consent**: When joining, you'll see:
|
||||
- What data is collected
|
||||
- Why it's collected
|
||||
- How it's stored
|
||||
- Consent prompt before collection
|
||||
|
||||
**Withdrawal**: You can withdraw consent anytime by leaving the community.
|
||||
|
||||
## Website Integration
|
||||
|
||||
**Current**: Website integration is **disabled by default**. All data stays local.
|
||||
|
||||
**Future**: When website integration is enabled:
|
||||
- You'll be notified before syncing
|
||||
- You can opt-out of website sync
|
||||
- Local data remains your primary copy
|
||||
|
||||
## Security
|
||||
|
||||
**Local Storage**: Files are stored as plain JSON in your project directory.
|
||||
|
||||
**Recommendations**:
|
||||
- Don't commit `.tinytorch/` to public repositories if you include institution info
|
||||
- Use `.gitignore` to exclude community data if desired
|
||||
- Keep your project directory secure
|
||||
|
||||
## Compliance
|
||||
|
||||
**GDPR**: Our design aligns with GDPR principles:
|
||||
- ✅ Data minimization (only optional fields)
|
||||
- ✅ Purpose limitation (community map only)
|
||||
- ✅ User consent (explicit opt-in)
|
||||
- ✅ Right to deletion (`tito community leave`)
|
||||
- ✅ Data portability (JSON files)
|
||||
|
||||
**FERPA**: For educational institutions:
|
||||
- No student names collected
|
||||
- Anonymous identifiers only
|
||||
- Institution-level aggregation (not individual)
|
||||
|
||||
## Questions?
|
||||
|
||||
For privacy questions or concerns:
|
||||
- Review your data: `tito community profile`
|
||||
- Remove your data: `tito community leave`
|
||||
- Check configuration: `.tinytorch/config.json`
|
||||
|
||||
@@ -31,10 +31,10 @@ tito system doctor
|
||||
### 2️⃣ **Start Your First Module**
|
||||
```bash
|
||||
# View the first module
|
||||
tito module view 01_setup
|
||||
tito module view 01_tensor
|
||||
|
||||
# Or open the notebook directly
|
||||
jupyter notebook modules/01_setup/setup_dev.ipynb
|
||||
jupyter notebook modules/01_tensor/tensor_dev.py
|
||||
```
|
||||
|
||||
## 📚 Learning Path
|
||||
@@ -44,7 +44,7 @@ Each module builds on the previous one:
|
||||
|
||||
| Module | What You'll Build | Capability Unlocked |
|
||||
|--------|------------------|---------------------|
|
||||
| 01 Setup | Development environment | Configure TinyTorch |
|
||||
| 01 Tensor | Core data structure | Manipulate ML building blocks |
|
||||
| 02 Tensor | Core data structure | Manipulate ML building blocks |
|
||||
| 03 Activations | Non-linearity functions | Add intelligence to networks |
|
||||
| 04 Layers | Neural network layers | Build network components |
|
||||
|
||||
282
docs/TEAM_ONBOARDING.md
Normal file
@@ -0,0 +1,282 @@
|
||||
# Team Onboarding Guide: TinyTorch for Industry
|
||||
|
||||
Complete guide for using TinyTorch in industry settings: new hire bootcamps, internal training programs, and debugging workshops.
|
||||
|
||||
## 🎯 Overview
|
||||
|
||||
TinyTorch's **Model 3: Team Onboarding** addresses industry use cases where ML teams want members to understand PyTorch internals. This guide covers deployment scenarios, training structures, and best practices for industry adoption.
|
||||
|
||||
## 🚀 Use Cases
|
||||
|
||||
### 1. New Hire Bootcamps (2-3 Week Intensive)
|
||||
|
||||
**Goal**: Rapidly onboard new ML engineers to understand framework internals
|
||||
|
||||
**Structure**:
|
||||
- **Week 1**: Foundation Tier (Modules 01-07)
|
||||
- Tensors, autograd, optimizers, training loops
|
||||
- Focus: Understanding `loss.backward()` mechanics
|
||||
- **Week 2**: Architecture Tier (Modules 08-13)
|
||||
- CNNs, transformers, attention mechanisms
|
||||
- Focus: Production architecture internals
|
||||
- **Week 3**: Optimization Tier (Modules 14-19) OR Capstone
|
||||
- Profiling, quantization, compression
|
||||
- Focus: Production optimization techniques
|
||||
|
||||
**Schedule**:
|
||||
- Full-time: 40 hours/week
|
||||
- Hands-on coding: 70% of time
|
||||
- Systems discussions: 30% of time
|
||||
- Daily standups and code reviews
|
||||
|
||||
**Deliverables**:
|
||||
- Completed modules with passing tests
|
||||
- Capstone project (optional)
|
||||
- Technical presentation on framework internals
|
||||
|
||||
### 2. Internal Training Programs (Distributed Over Quarters)
|
||||
|
||||
**Goal**: Deep understanding of ML systems for existing team members
|
||||
|
||||
**Structure**:
|
||||
- **Quarter 1**: Foundation (Modules 01-07)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Self-paced module completion
|
||||
- Monthly group discussions
|
||||
- **Quarter 2**: Architecture (Modules 08-13)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Architecture deep-dives
|
||||
- Production case studies
|
||||
- **Quarter 3**: Optimization (Modules 14-19)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Performance optimization focus
|
||||
- Real production optimization projects
|
||||
|
||||
**Benefits**:
|
||||
- Fits into existing work schedules
|
||||
- Allows deep learning without intensive time commitment
|
||||
- Builds team knowledge gradually
|
||||
- Enables peer learning
|
||||
|
||||
### 3. Debugging Workshops (Focused Modules)
|
||||
|
||||
**Goal**: Targeted understanding of specific framework components
|
||||
|
||||
**Common Focus Areas**:
|
||||
|
||||
#### Autograd Debugging Workshop (Module 05)
|
||||
- Understanding gradient flow
|
||||
- Debugging gradient issues
|
||||
- Computational graph visualization
|
||||
- **Duration**: 1-2 days
|
||||
|
||||
#### Attention Mechanism Workshop (Module 12)
|
||||
- Understanding attention internals
|
||||
- Debugging attention scaling issues
|
||||
- Memory optimization for attention
|
||||
- **Duration**: 1-2 days
|
||||
|
||||
#### Optimization Workshop (Modules 14-19)
|
||||
- Profiling production models
|
||||
- Quantization and compression
|
||||
- Performance optimization strategies
|
||||
- **Duration**: 2-3 days
|
||||
|
||||
## 🏗️ Deployment Scenarios
|
||||
|
||||
### Scenario 1: Cloud-Based Training (Recommended)
|
||||
|
||||
**Setup**: Google Colab or JupyterHub
|
||||
- Zero local installation
|
||||
- Consistent environment
|
||||
- Easy sharing and collaboration
|
||||
- **Best for**: Large teams, remote workers
|
||||
|
||||
**Steps**:
|
||||
1. Clone repository to Colab
|
||||
2. Install dependencies: `pip install -e .`
|
||||
3. Work through modules
|
||||
4. Share notebooks via Colab links
|
||||
|
||||
### Scenario 2: Local Development Environment
|
||||
|
||||
**Setup**: Local Python environment
|
||||
- Full control over environment
|
||||
- Better for debugging
|
||||
- Offline capability
|
||||
- **Best for**: Smaller teams, on-site training
|
||||
|
||||
**Steps**:
|
||||
1. Clone repository locally
|
||||
2. Set up virtual environment
|
||||
3. Install: `pip install -e .`
|
||||
4. Use JupyterLab for development
|
||||
|
||||
### Scenario 3: Hybrid Approach
|
||||
|
||||
**Setup**: Colab for learning, local for projects
|
||||
- Learn in cloud environment
|
||||
- Apply locally for projects
|
||||
- **Best for**: Flexible teams
|
||||
|
||||
## 📋 Training Program Templates
|
||||
|
||||
### Template 1: 2-Week Intensive Bootcamp
|
||||
|
||||
**Week 1: Foundation**
|
||||
- Day 1-2: Modules 01-02 (Tensor, Activations)
|
||||
- Day 3-4: Modules 03-04 (Layers, Losses)
|
||||
- Day 5: Module 05 (Autograd) - Full day focus
|
||||
- Weekend: Review and practice
|
||||
|
||||
**Week 2: Architecture + Optimization**
|
||||
- Day 1-2: Modules 08-09 (DataLoader, CNNs)
|
||||
- Day 3: Module 12 (Attention)
|
||||
- Day 4-5: Modules 14-15 (Profiling, Quantization)
|
||||
- Final: Capstone project presentation
|
||||
|
||||
### Template 2: 3-Month Distributed Program
|
||||
|
||||
**Month 1: Foundation**
|
||||
- Week 1: Modules 01-02
|
||||
- Week 2: Modules 03-04
|
||||
- Week 3: Module 05 (Autograd)
|
||||
- Week 4: Modules 06-07 (Optimizers, Training)
|
||||
|
||||
**Month 2: Architecture**
|
||||
- Week 1: Modules 08-09
|
||||
- Week 2: Modules 10-11
|
||||
- Week 3: Modules 12-13
|
||||
- Week 4: Integration project
|
||||
|
||||
**Month 3: Optimization**
|
||||
- Week 1: Modules 14-15
|
||||
- Week 2: Modules 16-17
|
||||
- Week 3: Modules 18-19
|
||||
- Week 4: Capstone optimization project
|
||||
|
||||
## 🎓 Learning Outcomes
|
||||
|
||||
After completing TinyTorch onboarding, team members will:
|
||||
|
||||
1. **Understand Framework Internals**
|
||||
- How autograd works
|
||||
- Memory allocation patterns
|
||||
- Optimization trade-offs
|
||||
|
||||
2. **Debug Production Issues**
|
||||
- Gradient flow problems
|
||||
- Memory bottlenecks
|
||||
- Performance issues
|
||||
|
||||
3. **Make Informed Decisions**
|
||||
- Optimizer selection
|
||||
- Architecture choices
|
||||
- Deployment strategies
|
||||
|
||||
4. **Read Production Code**
|
||||
- Understand PyTorch source
|
||||
- Navigate framework codebases
|
||||
- Contribute to ML infrastructure
|
||||
|
||||
## 🔧 Integration with Existing Workflows
|
||||
|
||||
### Code Review Integration
|
||||
|
||||
- Review production code with TinyTorch knowledge
|
||||
- Identify framework internals in production code
|
||||
- Suggest optimizations based on systems understanding
|
||||
|
||||
### Debugging Integration
|
||||
|
||||
- Apply TinyTorch debugging strategies to production issues
|
||||
- Use systems thinking for troubleshooting
|
||||
- Profile production models using TinyTorch techniques
|
||||
|
||||
### Architecture Design
|
||||
|
||||
- Design new models with systems awareness
|
||||
- Consider memory and performance from the start
|
||||
- Make informed trade-offs
|
||||
|
||||
## 📊 Success Metrics
|
||||
|
||||
### Individual Metrics
|
||||
- Module completion rate
|
||||
- Test passing rate
|
||||
- Capstone project quality
|
||||
- Self-reported confidence increase
|
||||
|
||||
### Team Metrics
|
||||
- Reduced debugging time
|
||||
- Fewer production incidents
|
||||
- Improved code review quality
|
||||
- Better architecture decisions
|
||||
|
||||
## 🛠️ Setup for Teams
|
||||
|
||||
### Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Clone repository
|
||||
git clone https://github.com/mlsysbook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
|
||||
# 2. Set up environment
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate # Windows: .venv\Scripts\activate
|
||||
|
||||
# 3. Install dependencies
|
||||
pip install -r requirements.txt
|
||||
pip install -e .
|
||||
|
||||
# 4. Verify setup
|
||||
tito system doctor
|
||||
|
||||
# 5. Start with Module 01
|
||||
tito view 01_tensor
|
||||
```
|
||||
|
||||
### Team-Specific Customization
|
||||
|
||||
- **Custom datasets**: Replace with company-specific data
|
||||
- **Domain modules**: Add modules for specific use cases
|
||||
- **Integration**: Connect to company ML infrastructure
|
||||
- **Assessment**: Customize grading for team needs
|
||||
|
||||
## 📚 Resources
|
||||
|
||||
- **Student Quickstart**: `docs/STUDENT_QUICKSTART.md`
|
||||
- **Instructor Guide**: `INSTRUCTOR.md` (for training leads)
|
||||
- **TA Guide**: `TA_GUIDE.md` (for support staff)
|
||||
- **Module Documentation**: `modules/*/ABOUT.md`
|
||||
|
||||
## 💼 Industry Case Studies
|
||||
|
||||
### Case Study 1: ML Infrastructure Team
|
||||
**Challenge**: Team members could use PyTorch but couldn't debug framework issues
|
||||
**Solution**: 2-week intensive bootcamp focusing on autograd and optimization
|
||||
**Result**: 50% reduction in debugging time, better architecture decisions
|
||||
|
||||
### Case Study 2: Research Team
|
||||
**Challenge**: Researchers needed to understand transformer internals
|
||||
**Solution**: Focused workshop on Modules 12-13 (Attention, Transformers)
|
||||
**Result**: Improved model designs, better understanding of scaling
|
||||
|
||||
### Case Study 3: Production ML Team
|
||||
**Challenge**: Team needed optimization skills for deployment
|
||||
**Solution**: 3-month program focusing on Optimization Tier (Modules 14-19)
|
||||
**Result**: 4x model compression, 10x speedup on production models
|
||||
|
||||
## 🎯 Next Steps
|
||||
|
||||
1. **Choose deployment model**: Bootcamp, distributed, or workshop
|
||||
2. **Set up environment**: Cloud (Colab) or local
|
||||
3. **Select modules**: Full curriculum or focused selection
|
||||
4. **Schedule training**: Intensive or distributed
|
||||
5. **Track progress**: Use checkpoint system or custom metrics
|
||||
|
||||
---
|
||||
|
||||
**For Questions**: See `INSTRUCTOR.md` or contact TinyTorch maintainers
|
||||
|
||||
@@ -21,6 +21,9 @@ help:
|
||||
|
||||
html:
|
||||
@echo "🌐 Building HTML version..."
|
||||
@echo "📓 Preparing notebooks for launch buttons..."
|
||||
@./prepare_notebooks.sh || echo "⚠️ Notebook preparation skipped (tito not available)"
|
||||
@echo ""
|
||||
jupyter-book build .
|
||||
|
||||
pdf:
|
||||
|
||||
@@ -56,6 +56,7 @@ html:
|
||||
- _static/ml-timeline.js
|
||||
- _static/hero-carousel.js
|
||||
- _static/sidebar-link.js
|
||||
- _static/marimo-badges.js
|
||||
|
||||
# Favicon configuration
|
||||
favicon: "_static/favicon.svg"
|
||||
|
||||
|
Before Width: | Height: | Size: 132 B After Width: | Height: | Size: 8.0 MiB |
|
Before Width: | Height: | Size: 132 B After Width: | Height: | Size: 1.4 MiB |
|
Before Width: | Height: | Size: 131 B After Width: | Height: | Size: 193 KiB |
|
Before Width: | Height: | Size: 132 B After Width: | Height: | Size: 3.6 MiB |
107
site/_static/marimo-badges.js
Normal file
@@ -0,0 +1,107 @@
|
||||
/**
|
||||
* Marimo Badge Integration for TinyTorch
|
||||
* Adds Marimo "Open in Marimo" badges to notebook pages
|
||||
*/
|
||||
|
||||
document.addEventListener('DOMContentLoaded', function() {
|
||||
// Find all notebook pages (they have launch buttons)
|
||||
const launchButtons = document.querySelectorAll('.launch-buttons, .jb-launch-buttons');
|
||||
|
||||
if (launchButtons.length === 0) return;
|
||||
|
||||
// Add informational message about local setup requirement
|
||||
const infoMessage = document.createElement('div');
|
||||
infoMessage.className = 'notebook-platform-info';
|
||||
infoMessage.style.cssText = `
|
||||
margin: 1rem 0;
|
||||
padding: 1rem;
|
||||
background: #fff3cd;
|
||||
border-left: 4px solid #ffc107;
|
||||
border-radius: 0.25rem;
|
||||
font-size: 0.9rem;
|
||||
color: #856404;
|
||||
`;
|
||||
infoMessage.innerHTML = `
|
||||
<strong>💡 Note:</strong> These online notebooks are for <strong>viewing and exploration only</strong>.
|
||||
To actually build modules, run milestone validations, and use the full TinyTorch package,
|
||||
you need <a href="../quickstart-guide.html" style="color: #856404; text-decoration: underline; font-weight: 600;">local setup</a>.
|
||||
`;
|
||||
|
||||
// Get the current page path to construct marimo URL
|
||||
const currentPath = window.location.pathname;
|
||||
const notebookName = currentPath.split('/').pop().replace('.html', '');
|
||||
|
||||
// Find the repository info from the page
|
||||
const repoUrl = 'https://github.com/mlsysbook/TinyTorch';
|
||||
const repoPath = 'mlsysbook/TinyTorch';
|
||||
const branch = 'main';
|
||||
|
||||
// Construct marimo molab URL
|
||||
// Marimo can open .ipynb files directly from GitHub
|
||||
// Format: https://marimo.app/molab?repo=owner/repo&path=path/to/file.ipynb
|
||||
// Works for all modules: 01_tensor, 02_activations, etc.
|
||||
const marimoUrl = `https://marimo.app/molab?repo=${repoPath}&path=site/chapters/modules/${notebookName}.ipynb`;
|
||||
|
||||
// Create marimo badge
|
||||
const marimoBadge = document.createElement('div');
|
||||
marimoBadge.className = 'marimo-launch-badge';
|
||||
marimoBadge.style.cssText = `
|
||||
margin-top: 1rem;
|
||||
padding: 0.75rem;
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
border-radius: 0.5rem;
|
||||
text-align: center;
|
||||
`;
|
||||
|
||||
const marimoLink = document.createElement('a');
|
||||
marimoLink.href = marimoUrl;
|
||||
marimoLink.target = '_blank';
|
||||
marimoLink.rel = 'noopener noreferrer';
|
||||
marimoLink.style.cssText = `
|
||||
color: white;
|
||||
text-decoration: none;
|
||||
font-weight: 600;
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
`;
|
||||
marimoLink.innerHTML = `
|
||||
<span>🍃</span>
|
||||
<span>Open in Marimo</span>
|
||||
<span style="font-size: 0.85em;">→</span>
|
||||
`;
|
||||
|
||||
marimoBadge.appendChild(marimoLink);
|
||||
|
||||
// Add info message and marimo badge after launch buttons
|
||||
launchButtons.forEach(buttonContainer => {
|
||||
// Add info message first (if not already present)
|
||||
if (!buttonContainer.querySelector('.notebook-platform-info')) {
|
||||
buttonContainer.appendChild(infoMessage.cloneNode(true));
|
||||
}
|
||||
|
||||
// Check if marimo badge already exists
|
||||
if (!buttonContainer.querySelector('.marimo-launch-badge')) {
|
||||
buttonContainer.appendChild(marimoBadge.cloneNode(true));
|
||||
}
|
||||
});
|
||||
|
||||
// Also add to any existing launch button sections
|
||||
const launchSections = document.querySelectorAll('[class*="launch"], [id*="launch"]');
|
||||
launchSections.forEach(section => {
|
||||
// Add info message if not present
|
||||
if (!section.querySelector('.notebook-platform-info')) {
|
||||
const infoClone = infoMessage.cloneNode(true);
|
||||
infoClone.style.marginTop = '1rem';
|
||||
section.appendChild(infoClone);
|
||||
}
|
||||
|
||||
// Add marimo badge if not present
|
||||
if (!section.querySelector('.marimo-launch-badge')) {
|
||||
const badgeClone = marimoBadge.cloneNode(true);
|
||||
badgeClone.style.marginTop = '1rem';
|
||||
section.appendChild(badgeClone);
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
@@ -14,6 +14,12 @@ parts:
|
||||
title: "Student Workflow"
|
||||
- file: usage-paths/classroom-use
|
||||
title: "For Instructors"
|
||||
- file: instructor-guide
|
||||
title: "Instructor Guide"
|
||||
- file: usage-paths/ta-guide
|
||||
title: "TA Guide"
|
||||
- file: usage-paths/team-onboarding
|
||||
title: "Team Onboarding"
|
||||
|
||||
# Tier captions: Added emojis for visual consistency and quick recognition
|
||||
# Foundation (🏗), Architecture (🏛️), Optimization (⏱️), Capstone (🏅)
|
||||
|
||||
@@ -42,6 +42,10 @@ echo "🧹 Cleaning previous builds..."
|
||||
jupyter-book clean . --all || true
|
||||
echo ""
|
||||
|
||||
# Prepare notebooks (for consistency, though PDF doesn't need launch buttons)
|
||||
echo "📓 Preparing notebooks..."
|
||||
./prepare_notebooks.sh || echo "⚠️ Notebook preparation skipped"
|
||||
|
||||
# Build PDF via LaTeX
|
||||
echo "📚 Building LaTeX/PDF (this may take a few minutes)..."
|
||||
jupyter-book build . --builder pdflatex
|
||||
|
||||
@@ -39,6 +39,10 @@ echo "🧹 Cleaning previous builds..."
|
||||
jupyter-book clean . --all || true
|
||||
echo ""
|
||||
|
||||
# Prepare notebooks (for consistency, though PDF doesn't need launch buttons)
|
||||
echo "📓 Preparing notebooks..."
|
||||
./prepare_notebooks.sh || echo "⚠️ Notebook preparation skipped"
|
||||
|
||||
# Build PDF via HTML
|
||||
echo "📚 Building PDF from HTML (this may take a few minutes)..."
|
||||
echo "ℹ️ First run will download Chromium browser (~170MB)"
|
||||
|
||||
1269
site/chapters/modules/02_tensor.ipynb
Normal file
@@ -61,21 +61,63 @@ Real-time chat and study groups:
|
||||
- Office hours with educators
|
||||
- Project showcase channels
|
||||
|
||||
### Community Dashboard (Planned)
|
||||
### Community Dashboard (Available Now ✅)
|
||||
|
||||
Track global learning progress:
|
||||
- Real-time completion statistics
|
||||
- Geographic distribution of learners
|
||||
- Milestone achievement tracking
|
||||
- Study partner matching
|
||||
Join the global TinyTorch community and see your progress:
|
||||
|
||||
### Torch Olympics Leaderboard (Planned)
|
||||
```bash
|
||||
# Join the community
|
||||
tito community join
|
||||
|
||||
Compete in ML systems challenges:
|
||||
- Performance benchmarks
|
||||
- Memory efficiency competitions
|
||||
- Innovation showcases
|
||||
- Community recognition
|
||||
# View your profile
|
||||
tito community profile
|
||||
|
||||
# Update your progress
|
||||
tito community update
|
||||
|
||||
# View community statistics
|
||||
tito community stats
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- **Anonymous profiles** - Join with optional information (country, institution, course type)
|
||||
- **Cohort identification** - See your cohort (Fall 2024, Spring 2025, etc.)
|
||||
- **Progress tracking** - Automatic milestone and module completion tracking
|
||||
- **Privacy-first** - All data stored locally in `.tinytorch/` directory
|
||||
- **Opt-in sharing** - You control what information to share
|
||||
|
||||
**Privacy:** All fields are optional. We use anonymous UUIDs (no personal names). Data is stored locally in your project directory. See [Privacy Policy](../docs/PRIVACY_DATA_RETENTION.md) for details.
|
||||
|
||||
### Benchmark & Performance Tracking (Available Now ✅)
|
||||
|
||||
Validate your setup and track performance improvements:
|
||||
|
||||
```bash
|
||||
# Quick setup validation (after initial setup)
|
||||
tito benchmark baseline
|
||||
|
||||
# Full capstone benchmarks (after Module 20)
|
||||
tito benchmark capstone
|
||||
|
||||
# Submit results to community (optional)
|
||||
# Prompts automatically after benchmarks complete
|
||||
```
|
||||
|
||||
**Baseline Benchmark:**
|
||||
- Validates your setup is working correctly
|
||||
- Quick "Hello World" moment after setup
|
||||
- Tests: tensor operations, matrix multiply, forward pass
|
||||
- Generates score (0-100) and saves results locally
|
||||
|
||||
**Capstone Benchmark:**
|
||||
- Full performance evaluation after Module 20
|
||||
- Tracks: speed, compression, accuracy, efficiency
|
||||
- Uses Module 19's Benchmark harness for statistical rigor
|
||||
- Generates comprehensive results for submission
|
||||
|
||||
**Submission:** After benchmarks complete, you'll be prompted to submit results (optional). Submissions are saved locally and can be shared with the community.
|
||||
|
||||
See [TITO CLI Reference](tito/overview.html) for complete command documentation.
|
||||
|
||||
---
|
||||
|
||||
|
||||
578
site/instructor-guide.md
Normal file
@@ -0,0 +1,578 @@
|
||||
# 👩🏫 TinyTorch Instructor Guide
|
||||
|
||||
Complete guide for teaching ML Systems Engineering with TinyTorch.
|
||||
|
||||
## 🎯 Course Overview
|
||||
|
||||
TinyTorch teaches ML systems engineering through building, not just using. Students construct a complete ML framework from tensors to transformers, understanding memory, performance, and scaling at each step.
|
||||
|
||||
## 🛠️ Instructor Setup
|
||||
|
||||
### **1. Initial Setup**
|
||||
```bash
|
||||
# Clone and setup
|
||||
git clone https://github.com/MLSysBook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
|
||||
# Virtual environment (MANDATORY)
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate
|
||||
|
||||
# Install with instructor tools
|
||||
pip install -r requirements.txt
|
||||
pip install nbgrader
|
||||
|
||||
# Setup grading infrastructure
|
||||
tito grade setup
|
||||
```
|
||||
|
||||
### **2. Verify Installation**
|
||||
```bash
|
||||
tito system doctor
|
||||
# Should show all green checkmarks
|
||||
|
||||
tito grade
|
||||
# Should show available grade commands
|
||||
```
|
||||
|
||||
## 📝 Assignment Workflow
|
||||
|
||||
### **Simplified with Tito CLI**
|
||||
We've wrapped NBGrader behind simple `tito grade` commands so you don't need to learn NBGrader's complex interface.
|
||||
|
||||
### **1. Prepare Assignments**
|
||||
```bash
|
||||
# Generate instructor version (with solutions)
|
||||
tito grade generate 01_tensor
|
||||
|
||||
# Create student version (solutions removed)
|
||||
tito grade release 01_tensor
|
||||
|
||||
# Student version will be in: release/tinytorch/01_tensor/
|
||||
```
|
||||
|
||||
### **2. Distribute to Students**
|
||||
```bash
|
||||
# Option A: GitHub Classroom (recommended)
|
||||
# 1. Create assignment repository from TinyTorch
|
||||
# 2. Remove solutions from modules
|
||||
# 3. Students clone and work
|
||||
|
||||
# Option B: Direct distribution
|
||||
# Share the release/ directory contents
|
||||
```
|
||||
|
||||
### **3. Collect Submissions**
|
||||
```bash
|
||||
# Collect all students
|
||||
tito grade collect 01_tensor
|
||||
|
||||
# Or specific student
|
||||
tito grade collect 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **4. Auto-Grade**
|
||||
```bash
|
||||
# Grade all submissions
|
||||
tito grade autograde 01_tensor
|
||||
|
||||
# Grade specific student
|
||||
tito grade autograde 01_tensor --student student_id
|
||||
```
|
||||
|
||||
### **5. Manual Review**
|
||||
```bash
|
||||
# Open grading interface (browser-based)
|
||||
tito grade manual 01_tensor
|
||||
|
||||
# This launches a web interface for:
|
||||
# - Reviewing ML Systems question responses
|
||||
# - Adding feedback comments
|
||||
# - Adjusting auto-grades
|
||||
```
|
||||
|
||||
### **6. Generate Feedback**
|
||||
```bash
|
||||
# Create feedback files for students
|
||||
tito grade feedback 01_tensor
|
||||
```
|
||||
|
||||
### **7. Export Grades**
|
||||
```bash
|
||||
# Export all grades to CSV
|
||||
tito grade export
|
||||
|
||||
# Or specific module
|
||||
tito grade export --module 01_tensor --output grades_module01.csv
|
||||
```
|
||||
|
||||
## 📊 Grading Components
|
||||
|
||||
### **Auto-Graded (70%)**
|
||||
- Code implementation correctness
|
||||
- Test passing
|
||||
- Function signatures
|
||||
- Output validation
|
||||
|
||||
### **Manually Graded (30%)**
|
||||
- ML Systems Thinking questions (3 per module)
|
||||
- Each question: 10 points
|
||||
- Focus on understanding, not perfection
|
||||
|
||||
### **Grading Rubric for ML Systems Questions**
|
||||
|
||||
| Points | Criteria |
|
||||
|--------|----------|
|
||||
| 9-10 | Demonstrates deep understanding, references specific code, discusses systems implications |
|
||||
| 7-8 | Good understanding, some code references, basic systems thinking |
|
||||
| 5-6 | Surface understanding, generic response, limited systems perspective |
|
||||
| 3-4 | Attempted but misses key concepts |
|
||||
| 0-2 | No attempt or completely off-topic |
|
||||
|
||||
**What to Look For:**
|
||||
- References to actual implemented code
|
||||
- Memory/performance analysis
|
||||
- Scaling considerations
|
||||
- Production system comparisons
|
||||
- Understanding of trade-offs
|
||||
|
||||
## 📋 Sample Solutions for Grading Calibration
|
||||
|
||||
This section provides sample solutions to help calibrate grading standards. Use these as reference points when evaluating student submissions.
|
||||
|
||||
### Module 01: Tensor - Memory Footprint
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
"""Calculate tensor memory in bytes."""
|
||||
return self.data.nbytes
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Concise and correct
|
||||
- Uses NumPy's built-in `nbytes` property
|
||||
- Clear docstring
|
||||
- Handles all tensor shapes correctly
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
"""Calculate memory usage."""
|
||||
return np.prod(self.data.shape) * self.data.dtype.itemsize
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Manually calculates (shows understanding)
|
||||
- Works but less efficient than using `nbytes`
|
||||
- Minor: docstring could be more specific
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def memory_footprint(self):
|
||||
size = 1
|
||||
for dim in self.data.shape:
|
||||
size *= dim
|
||||
return size * 4 # Assumes float32
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Correct logic but hardcoded dtype size
|
||||
- Works for float32 but fails for other dtypes
|
||||
- Shows understanding of memory calculation
|
||||
- Missing proper dtype handling
|
||||
|
||||
### Module 05: Autograd - Backward Pass
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def backward(self, gradient=None):
|
||||
"""Backward pass through computational graph."""
|
||||
if gradient is None:
|
||||
gradient = np.ones_like(self.data)
|
||||
|
||||
self.grad = gradient
|
||||
|
||||
if self.grad_fn is not None:
|
||||
# Compute gradients for inputs
|
||||
input_grads = self.grad_fn.backward(gradient)
|
||||
|
||||
# Propagate to input tensors
|
||||
if isinstance(input_grads, tuple):
|
||||
for input_tensor, input_grad in zip(self.grad_fn.inputs, input_grads):
|
||||
if input_tensor.requires_grad:
|
||||
input_tensor.backward(input_grad)
|
||||
else:
|
||||
if self.grad_fn.inputs[0].requires_grad:
|
||||
self.grad_fn.inputs[0].backward(input_grads)
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Handles both scalar and tensor gradients
|
||||
- Properly checks `requires_grad` before propagating
|
||||
- Handles tuple returns from grad_fn
|
||||
- Clear variable names and structure
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def backward(self, gradient=None):
|
||||
if gradient is None:
|
||||
gradient = np.ones_like(self.data)
|
||||
self.grad = gradient
|
||||
if self.grad_fn:
|
||||
grads = self.grad_fn.backward(gradient)
|
||||
for inp, grad in zip(self.grad_fn.inputs, grads):
|
||||
inp.backward(grad)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct logic
|
||||
- Missing `requires_grad` check (minor issue)
|
||||
- Assumes grads is always iterable (may fail for single input)
|
||||
- Works for most cases but less robust
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def backward(self, grad):
|
||||
self.grad = grad
|
||||
if self.grad_fn:
|
||||
self.grad_fn.inputs[0].backward(self.grad_fn.backward(grad))
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic backward pass works
|
||||
- Only handles single input (fails for multi-input operations)
|
||||
- Missing None gradient handling
|
||||
- Shows understanding but incomplete
|
||||
|
||||
### Module 09: Spatial - Convolution Implementation
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
"""Forward pass with explicit loops for clarity."""
|
||||
batch_size, in_channels, height, width = x.shape
|
||||
out_height = (height - self.kernel_size + 2 * self.padding) // self.stride + 1
|
||||
out_width = (width - self.kernel_size + 2 * self.padding) // self.stride + 1
|
||||
|
||||
output = np.zeros((batch_size, self.out_channels, out_height, out_width))
|
||||
|
||||
# Apply padding
|
||||
if self.padding > 0:
|
||||
x = np.pad(x, ((0, 0), (0, 0), (self.padding, self.padding),
|
||||
(self.padding, self.padding)), mode='constant')
|
||||
|
||||
# Explicit convolution loops
|
||||
for b in range(batch_size):
|
||||
for oc in range(self.out_channels):
|
||||
for oh in range(out_height):
|
||||
for ow in range(out_width):
|
||||
h_start = oh * self.stride
|
||||
w_start = ow * self.stride
|
||||
h_end = h_start + self.kernel_size
|
||||
w_end = w_start + self.kernel_size
|
||||
|
||||
window = x[b, :, h_start:h_end, w_start:w_end]
|
||||
output[b, oc, oh, ow] = np.sum(
|
||||
window * self.weight[oc] + self.bias[oc]
|
||||
)
|
||||
|
||||
return Tensor(output, requires_grad=x.requires_grad)
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Clear output shape calculation
|
||||
- Proper padding handling
|
||||
- Explicit loops make O(kernel_size²) complexity visible
|
||||
- Correct gradient tracking setup
|
||||
- Well-structured and readable
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
B, C, H, W = x.shape
|
||||
out_h = (H - self.kernel_size) // self.stride + 1
|
||||
out_w = (W - self.kernel_size) // self.stride + 1
|
||||
out = np.zeros((B, self.out_channels, out_h, out_w))
|
||||
|
||||
for b in range(B):
|
||||
for oc in range(self.out_channels):
|
||||
for i in range(out_h):
|
||||
for j in range(out_w):
|
||||
h = i * self.stride
|
||||
w = j * self.stride
|
||||
out[b, oc, i, j] = np.sum(
|
||||
x[b, :, h:h+self.kernel_size, w:w+self.kernel_size]
|
||||
* self.weight[oc]
|
||||
) + self.bias[oc]
|
||||
return Tensor(out)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Missing padding support (works only for padding=0)
|
||||
- Less clear variable names
|
||||
- Missing requires_grad propagation
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def forward(self, x):
|
||||
out = np.zeros((x.shape[0], self.out_channels, x.shape[2]-2, x.shape[3]-2))
|
||||
for b in range(x.shape[0]):
|
||||
for c in range(self.out_channels):
|
||||
for i in range(out.shape[2]):
|
||||
for j in range(out.shape[3]):
|
||||
out[b, c, i, j] = np.sum(x[b, :, i:i+3, j:j+3] * self.weight[c])
|
||||
return Tensor(out)
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic convolution works
|
||||
- Hardcoded kernel_size=3 (not general)
|
||||
- No stride or padding support
|
||||
- Shows understanding but incomplete
|
||||
|
||||
### Module 12: Attention - Scaled Dot-Product Attention
|
||||
|
||||
**Excellent Solution (9-10 points)**:
|
||||
```python
|
||||
def forward(self, query, key, value, mask=None):
|
||||
"""Scaled dot-product attention with numerical stability."""
|
||||
# Compute attention scores
|
||||
scores = np.dot(query, key.T) / np.sqrt(self.d_k)
|
||||
|
||||
# Apply mask if provided
|
||||
if mask is not None:
|
||||
scores = np.where(mask, scores, -1e9)
|
||||
|
||||
# Softmax with numerical stability
|
||||
exp_scores = np.exp(scores - np.max(scores, axis=-1, keepdims=True))
|
||||
attention_weights = exp_scores / np.sum(exp_scores, axis=-1, keepdims=True)
|
||||
|
||||
# Apply attention to values
|
||||
output = np.dot(attention_weights, value)
|
||||
|
||||
return output, attention_weights
|
||||
```
|
||||
**Why Excellent**:
|
||||
- Proper scaling factor (1/√d_k)
|
||||
- Numerical stability with max subtraction
|
||||
- Mask handling
|
||||
- Returns both output and attention weights
|
||||
- Clear and well-documented
|
||||
|
||||
**Good Solution (7-8 points)**:
|
||||
```python
|
||||
def forward(self, q, k, v):
|
||||
scores = np.dot(q, k.T) / np.sqrt(q.shape[-1])
|
||||
weights = np.exp(scores) / np.sum(np.exp(scores), axis=-1, keepdims=True)
|
||||
return np.dot(weights, v)
|
||||
```
|
||||
**Why Good**:
|
||||
- Correct implementation
|
||||
- Missing numerical stability (may overflow)
|
||||
- Missing mask support
|
||||
- Works but less robust
|
||||
|
||||
**Acceptable Solution (5-6 points)**:
|
||||
```python
|
||||
def forward(self, q, k, v):
|
||||
scores = np.dot(q, k.T)
|
||||
weights = np.exp(scores) / np.sum(np.exp(scores))
|
||||
return np.dot(weights, v)
|
||||
```
|
||||
**Why Acceptable**:
|
||||
- Basic attention mechanism
|
||||
- Missing scaling factor
|
||||
- Missing numerical stability
|
||||
- Incorrect softmax (should be per-row)
|
||||
|
||||
### Grading Guidelines Using Sample Solutions
|
||||
|
||||
**When Evaluating Student Code**:
|
||||
|
||||
1. **Correctness First**: Does it pass all tests?
|
||||
- If no: Maximum 6 points (even if well-written)
|
||||
- If yes: Proceed to quality evaluation
|
||||
|
||||
2. **Code Quality**:
|
||||
- **Excellent (9-10)**: Production-ready, handles edge cases, well-documented
|
||||
- **Good (7-8)**: Correct and functional, minor improvements possible
|
||||
- **Acceptable (5-6)**: Works but incomplete or has issues
|
||||
|
||||
3. **Systems Thinking**:
|
||||
- **Excellent**: Discusses memory, performance, scaling implications
|
||||
- **Good**: Some systems awareness
|
||||
- **Acceptable**: Focuses only on correctness
|
||||
|
||||
4. **Common Patterns**:
|
||||
- Look for: Proper error handling, edge case consideration, documentation
|
||||
- Red flags: Hardcoded values, missing checks, unclear variable names
|
||||
|
||||
**Remember**: These are calibration examples. Adjust based on your course level and learning objectives. The goal is consistent evaluation, not perfection.
|
||||
|
||||
## 📚 Module Teaching Notes
|
||||
|
||||
### **Module 01: Tensor**
|
||||
- **Focus**: Memory layout, data structures
|
||||
- **Key Concept**: Understanding memory is crucial for ML performance
|
||||
- **Demo**: Show memory profiling, copying behavior
|
||||
|
||||
### **Module 02: Activations**
|
||||
- **Focus**: Vectorization, numerical stability
|
||||
- **Key Concept**: Small details matter at scale
|
||||
- **Demo**: Gradient vanishing/exploding
|
||||
|
||||
### **Module 04-05: Layers & Networks**
|
||||
- **Focus**: Composition, parameter management
|
||||
- **Key Concept**: Building blocks combine into complex systems
|
||||
- **Project**: Build a small CNN
|
||||
|
||||
### **Module 06-07: Spatial & Attention**
|
||||
- **Focus**: Algorithmic complexity, memory patterns
|
||||
- **Key Concept**: O(N²) operations become bottlenecks
|
||||
- **Demo**: Profile attention memory usage
|
||||
|
||||
### **Module 08-11: Training Pipeline**
|
||||
- **Focus**: End-to-end system integration
|
||||
- **Key Concept**: Many components must work together
|
||||
- **Project**: Train a real model
|
||||
|
||||
### **Module 12-15: Production**
|
||||
- **Focus**: Deployment, optimization, monitoring
|
||||
- **Key Concept**: Academic vs production requirements
|
||||
- **Demo**: Model compression, deployment
|
||||
|
||||
### **Module 16: TinyGPT**
|
||||
- **Focus**: Framework generalization
|
||||
- **Key Concept**: 70% component reuse from vision to language
|
||||
- **Capstone**: Build a working language model
|
||||
|
||||
## 🎯 Learning Objectives
|
||||
|
||||
By course end, students should be able to:
|
||||
|
||||
1. **Build** complete ML systems from scratch
|
||||
2. **Analyze** memory usage and computational complexity
|
||||
3. **Debug** performance bottlenecks
|
||||
4. **Optimize** for production deployment
|
||||
5. **Understand** framework design decisions
|
||||
6. **Apply** systems thinking to ML problems
|
||||
|
||||
## 📈 Tracking Progress
|
||||
|
||||
### **Individual Progress**
|
||||
```bash
|
||||
# Check specific student progress
|
||||
tito checkpoint status --student student_id
|
||||
```
|
||||
|
||||
### **Class Overview**
|
||||
```bash
|
||||
# Export all checkpoint achievements
|
||||
tito checkpoint export --output class_progress.csv
|
||||
```
|
||||
|
||||
### **Identify Struggling Students**
|
||||
Look for:
|
||||
- Missing checkpoint achievements
|
||||
- Low scores on ML Systems questions
|
||||
- Incomplete module submissions
|
||||
|
||||
## 💡 Teaching Tips
|
||||
|
||||
### **1. Emphasize Building Over Theory**
|
||||
- Have students type every line of code
|
||||
- Run tests immediately after implementation
|
||||
- Break and fix things intentionally
|
||||
|
||||
### **2. Connect to Production Systems**
|
||||
- Show PyTorch/TensorFlow equivalents
|
||||
- Discuss real-world bottlenecks
|
||||
- Share production war stories
|
||||
|
||||
### **3. Make Performance Visible**
|
||||
```python
|
||||
# Use profilers liberally
|
||||
with TimeProfiler("operation"):
|
||||
result = expensive_operation()
|
||||
|
||||
# Show memory usage
|
||||
print(f"Memory: {get_memory_usage():.2f} MB")
|
||||
```
|
||||
|
||||
### **4. Encourage Systems Questions**
|
||||
- "What would break at 1B parameters?"
|
||||
- "How would you distributed this?"
|
||||
- "What's the bottleneck here?"
|
||||
|
||||
## 🔧 Troubleshooting
|
||||
|
||||
### **Common Student Issues**
|
||||
|
||||
**Environment Problems**
|
||||
```bash
|
||||
# Student fix:
|
||||
tito system doctor
|
||||
tito system reset
|
||||
```
|
||||
|
||||
**Module Import Errors**
|
||||
```bash
|
||||
# Rebuild package
|
||||
tito export --all
|
||||
```
|
||||
|
||||
**Test Failures**
|
||||
```bash
|
||||
# Detailed test output
|
||||
tito module test MODULE --verbose
|
||||
```
|
||||
|
||||
### **NBGrader Issues**
|
||||
|
||||
**Database Locked**
|
||||
```bash
|
||||
# Clear NBGrader database
|
||||
rm gradebook.db
|
||||
tito grade setup
|
||||
```
|
||||
|
||||
**Missing Submissions**
|
||||
```bash
|
||||
# Check submission directory
|
||||
ls submitted/*/MODULE/
|
||||
```
|
||||
|
||||
## 📊 Sample Schedule (16 Weeks)
|
||||
|
||||
| Week | Module | Focus |
|
||||
|------|--------|-------|
|
||||
| 1 | 01 Tensor | Data Structures, Memory |
|
||||
| 2 | 02 Activations | Non-linearity Functions |
|
||||
| 3 | 03 Layers | Neural Network Components |
|
||||
| 4 | 04 Losses | Optimization Objectives |
|
||||
| 5 | 05 Autograd | Automatic Differentiation |
|
||||
| 6 | 06 Optimizers | Training Algorithms |
|
||||
| 7 | 07 Training | Complete Training Loop |
|
||||
| 8 | Midterm Project | Build and Train Network |
|
||||
| 9 | 08 DataLoader | Data Pipeline |
|
||||
| 10 | 09 Spatial | Convolutions, CNNs |
|
||||
| 11 | 10 Tokenization | Text Processing |
|
||||
| 12 | 11 Embeddings | Word Representations |
|
||||
| 13 | 12 Attention | Attention Mechanisms |
|
||||
| 14 | 13 Transformers | Transformer Architecture |
|
||||
| 15 | 14-19 Optimization | Profiling, Quantization, etc. |
|
||||
| 16 | 20 Capstone | Torch Olympics Competition |
|
||||
|
||||
## 🎓 Assessment Strategy
|
||||
|
||||
### **Continuous Assessment (70%)**
|
||||
- Module completion: 4% each × 16 = 64%
|
||||
- Checkpoint achievements: 6%
|
||||
|
||||
### **Projects (30%)**
|
||||
- Midterm: Build and train CNN (15%)
|
||||
- Final: Extend TinyGPT (15%)
|
||||
|
||||
## 📚 Additional Resources
|
||||
|
||||
- [MLSys Book](https://mlsysbook.ai) - Companion textbook
|
||||
- [Course Discussions](https://github.com/MLSysBook/TinyTorch/discussions)
|
||||
- [Issue Tracker](https://github.com/MLSysBook/TinyTorch/issues)
|
||||
|
||||
---
|
||||
|
||||
**Need help? Open an issue or contact the TinyTorch team!**
|
||||
77
site/prepare_notebooks.sh
Executable file
@@ -0,0 +1,77 @@
|
||||
#!/bin/bash
|
||||
# Prepare notebooks for site build
|
||||
# This script ensures notebooks exist in site/ for launch buttons to work
|
||||
# Called automatically during site build
|
||||
#
|
||||
# Workflow:
|
||||
# 1. Uses existing assignment notebooks if available (from tito nbgrader generate)
|
||||
# 2. Falls back to generating notebooks from modules if needed
|
||||
# 3. Copies notebooks to site/chapters/modules/ for Jupyter Book launch buttons
|
||||
|
||||
set -e
|
||||
|
||||
# Get the site directory (where this script lives)
|
||||
SITE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
REPO_ROOT="$(cd "$SITE_DIR/.." && pwd)"
|
||||
|
||||
echo "📓 Preparing notebooks for site build..."
|
||||
|
||||
# Create notebooks directory in site if it doesn't exist
|
||||
NOTEBOOKS_DIR="$SITE_DIR/chapters/modules"
|
||||
mkdir -p "$NOTEBOOKS_DIR"
|
||||
|
||||
cd "$REPO_ROOT"
|
||||
|
||||
# Strategy: Use existing assignment notebooks if available, otherwise generate
|
||||
# This is faster and uses already-processed notebooks
|
||||
echo "🔄 Looking for existing assignment notebooks..."
|
||||
|
||||
MODULES=$(ls -1 modules/ 2>/dev/null | grep -E "^[0-9]" | sort -V || echo "")
|
||||
|
||||
if [ -z "$MODULES" ]; then
|
||||
echo "⚠️ No modules found. Skipping notebook preparation."
|
||||
exit 0
|
||||
fi
|
||||
|
||||
NOTEBOOKS_COPIED=0
|
||||
NOTEBOOKS_GENERATED=0
|
||||
|
||||
for module in $MODULES; do
|
||||
TARGET_NB="$NOTEBOOKS_DIR/${module}.ipynb"
|
||||
|
||||
# Check if assignment notebook already exists
|
||||
ASSIGNMENT_NB="$REPO_ROOT/assignments/source/$module/${module}.ipynb"
|
||||
|
||||
if [ -f "$ASSIGNMENT_NB" ]; then
|
||||
# Use existing assignment notebook
|
||||
cp "$ASSIGNMENT_NB" "$TARGET_NB"
|
||||
echo " ✅ Copied existing notebook: $module"
|
||||
NOTEBOOKS_COPIED=$((NOTEBOOKS_COPIED + 1))
|
||||
elif command -v tito &> /dev/null; then
|
||||
# Try to generate notebook if tito is available
|
||||
echo " 🔄 Generating notebook for $module..."
|
||||
if tito nbgrader generate "$module" >/dev/null 2>&1; then
|
||||
if [ -f "$ASSIGNMENT_NB" ]; then
|
||||
cp "$ASSIGNMENT_NB" "$TARGET_NB"
|
||||
echo " ✅ Generated and copied: $module"
|
||||
NOTEBOOKS_GENERATED=$((NOTEBOOKS_GENERATED + 1))
|
||||
fi
|
||||
else
|
||||
echo " ⚠️ Could not generate notebook for $module (module may not be ready)"
|
||||
fi
|
||||
else
|
||||
echo " ⚠️ No notebook found for $module (install tito CLI to generate)"
|
||||
fi
|
||||
done
|
||||
|
||||
echo ""
|
||||
if [ $NOTEBOOKS_COPIED -gt 0 ] || [ $NOTEBOOKS_GENERATED -gt 0 ]; then
|
||||
echo "✅ Notebook preparation complete!"
|
||||
echo " Copied: $NOTEBOOKS_COPIED | Generated: $NOTEBOOKS_GENERATED"
|
||||
echo " Notebooks available in: $NOTEBOOKS_DIR"
|
||||
echo " Launch buttons will now work on notebook pages!"
|
||||
else
|
||||
echo "⚠️ No notebooks prepared. Launch buttons may not appear."
|
||||
echo " Run 'tito nbgrader generate --all' first to create assignment notebooks."
|
||||
fi
|
||||
|
||||
@@ -50,6 +50,34 @@ See [Module Workflow](tito/modules.md) for detailed commands and [Troubleshootin
|
||||
|
||||
</div>
|
||||
|
||||
<div style="background: #e3f2fd; padding: 1.5rem; border-radius: 0.5rem; border-left: 4px solid #2196f3; margin: 1.5rem 0;">
|
||||
<h4 style="margin: 0 0 1rem 0; color: #1976d2;">Step 3: Join the Community & Benchmark</h4>
|
||||
|
||||
After setup, join the global TinyTorch community and validate your setup:
|
||||
|
||||
```bash
|
||||
# Join the community (optional)
|
||||
tito community join
|
||||
|
||||
# Run baseline benchmark to validate setup
|
||||
tito benchmark baseline
|
||||
```
|
||||
|
||||
**Community Features:**
|
||||
- Join with optional information (country, institution, course type)
|
||||
- Track your progress automatically
|
||||
- See your cohort (Fall 2024, Spring 2025, etc.)
|
||||
- All data stored locally in `.tinytorch/` directory
|
||||
|
||||
**Baseline Benchmark:**
|
||||
- Quick validation that everything works
|
||||
- Your "Hello World" moment!
|
||||
- Generates score and saves results locally
|
||||
|
||||
See [Community Guide](community.html) for complete features.
|
||||
|
||||
</div>
|
||||
|
||||
## 15-Minute First Module Walkthrough
|
||||
|
||||
Let's build your first neural network component following the **TinyTorch workflow**:
|
||||
@@ -217,7 +245,11 @@ In 15 minutes, you've:
|
||||
- See [TITO CLI Reference](tito/overview.md) for complete command reference
|
||||
|
||||
**For Instructors:**
|
||||
- See [Classroom Setup Guide](usage-paths/classroom-use.md) for [NBGrader](https://nbgrader.readthedocs.io/) integration (coming soon)
|
||||
- See [Classroom Setup Guide](usage-paths/classroom-use.md) for [NBGrader](https://nbgrader.readthedocs.io/) integration
|
||||
|
||||
**Notebook Platforms:**
|
||||
- **Online (Viewing)**: Jupyter/MyBinder, Google Colab, Marimo - great for exploring notebooks
|
||||
- **⚠️ Important**: Online notebooks are for **viewing only**. For full package experiments, milestone validation, and CLI tools, you need **local installation** (see [Student Workflow](student-workflow.md))
|
||||
|
||||
</div>
|
||||
|
||||
|
||||
@@ -1,8 +1,9 @@
|
||||
# TinyTorch Course Dependencies for Binder/Colab
|
||||
# TinyTorch Course Dependencies for Site Documentation Builds
|
||||
# Note: For Binder/Colab environments, see binder/requirements.txt
|
||||
# Keep synchronized with main requirements.txt
|
||||
|
||||
# Core numerical computing
|
||||
numpy>=1.21.0,<2.0.0
|
||||
numpy>=1.24.0,<3.0.0
|
||||
matplotlib>=3.5.0
|
||||
|
||||
# Data handling
|
||||
|
||||
@@ -135,6 +135,10 @@ tito checkpoint status
|
||||
|
||||
# System information
|
||||
tito system info
|
||||
|
||||
# Join community and benchmark
|
||||
tito community join
|
||||
tito benchmark baseline
|
||||
```
|
||||
|
||||
For complete command documentation, see [TITO CLI Reference](tito/overview.md).
|
||||
@@ -149,9 +153,84 @@ tito checkpoint status # View completion tracking
|
||||
|
||||
This is helpful for self-assessment but **not required** for the core workflow. The essential cycle remains: edit → export → validate.
|
||||
|
||||
## Instructor Integration (Coming Soon)
|
||||
## Notebook Platform Options
|
||||
|
||||
TinyTorch supports [NBGrader](https://nbgrader.readthedocs.io/) for classroom use. Documentation for instructors using the autograding features will be available in future releases.
|
||||
TinyTorch notebooks work with multiple platforms, but **important distinction**:
|
||||
|
||||
### Online Notebooks (Viewing & Exploration)
|
||||
- **Jupyter/MyBinder**: Click "Launch Binder" on any notebook page - great for viewing
|
||||
- **Google Colab**: Click "Launch Colab" for GPU access - good for exploration
|
||||
- **Marimo**: Click "🍃 Open in Marimo" for reactive notebooks - excellent for learning
|
||||
|
||||
**⚠️ Important**: Online notebooks are for **viewing and learning**. They don't have the full TinyTorch package installed, so you can't:
|
||||
- Run milestone validation scripts
|
||||
- Import from `tinytorch.*` modules
|
||||
- Execute full experiments
|
||||
- Use the complete CLI tools
|
||||
|
||||
### Local Setup (Required for Full Package)
|
||||
**To actually build and experiment**, you need a **local installation**:
|
||||
|
||||
```bash
|
||||
# Clone and setup locally
|
||||
git clone https://github.com/mlsysbook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate
|
||||
pip install -r requirements.txt
|
||||
pip install -e . # Install TinyTorch package
|
||||
```
|
||||
|
||||
**Why local?**
|
||||
- ✅ Full `tinytorch.*` package available
|
||||
- ✅ Run milestone validation scripts
|
||||
- ✅ Use `tito` CLI commands
|
||||
- ✅ Execute complete experiments
|
||||
- ✅ Export modules to package
|
||||
- ✅ Full development workflow
|
||||
|
||||
**Note for NBGrader assignments**: Submit `.ipynb` files (not Marimo's `.py` format) to preserve grading metadata.
|
||||
|
||||
## Community & Benchmarking
|
||||
|
||||
### Join the Community
|
||||
|
||||
After completing setup, join the global TinyTorch community:
|
||||
|
||||
```bash
|
||||
# Join with optional information
|
||||
tito community join
|
||||
|
||||
# View your profile and progress
|
||||
tito community profile
|
||||
|
||||
# Update your information
|
||||
tito community update
|
||||
```
|
||||
|
||||
**Privacy:** All information is optional. Data is stored locally in `.tinytorch/` directory. See [Community Guide](community.html) for details.
|
||||
|
||||
### Benchmark Your Progress
|
||||
|
||||
Validate your setup and track performance:
|
||||
|
||||
```bash
|
||||
# Quick baseline benchmark (after setup)
|
||||
tito benchmark baseline
|
||||
|
||||
# Full capstone benchmarks (after Module 20)
|
||||
tito benchmark capstone --track all
|
||||
```
|
||||
|
||||
**Baseline Benchmark:** Quick validation that your setup works correctly - your "Hello World" moment!
|
||||
|
||||
**Capstone Benchmark:** Full performance evaluation across speed, compression, accuracy, and efficiency tracks.
|
||||
|
||||
See [Community Guide](community.html) for complete community and benchmarking features.
|
||||
|
||||
## Instructor Integration
|
||||
|
||||
TinyTorch supports [NBGrader](https://nbgrader.readthedocs.io/) for classroom use. See the [Instructor Guide](usage-paths/classroom-use.md) for complete setup and grading workflows.
|
||||
|
||||
For now, focus on the student workflow: building your implementations and validating them with milestones.
|
||||
|
||||
|
||||
@@ -86,6 +86,31 @@
|
||||
|
||||
**See**: [Progress & Data Management](data.md) for complete details
|
||||
|
||||
### Community Commands
|
||||
|
||||
**Purpose**: Join the global TinyTorch community and track your progress
|
||||
|
||||
| Command | Description | Guide |
|
||||
|---------|-------------|-------|
|
||||
| `tito community join` | Join the community (optional info) | [Community Guide](../community.html) |
|
||||
| `tito community update` | Update your community profile | [Community Guide](../community.html) |
|
||||
| `tito community profile` | View your community profile | [Community Guide](../community.html) |
|
||||
| `tito community stats` | View community statistics | [Community Guide](../community.html) |
|
||||
| `tito community leave` | Remove your community profile | [Community Guide](../community.html) |
|
||||
|
||||
**See**: [Community Guide](../community.html) for complete details
|
||||
|
||||
### Benchmark Commands
|
||||
|
||||
**Purpose**: Validate setup and measure performance
|
||||
|
||||
| Command | Description | Guide |
|
||||
|---------|-------------|-------|
|
||||
| `tito benchmark baseline` | Quick setup validation ("Hello World") | [Community Guide](../community.html) |
|
||||
| `tito benchmark capstone` | Full Module 20 performance evaluation | [Community Guide](../community.html) |
|
||||
|
||||
**See**: [Community Guide](../community.html) for complete details
|
||||
|
||||
---
|
||||
|
||||
## Command Groups by Task
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
# TinyTorch for Instructors: Complete ML Systems Course
|
||||
|
||||
<div style="background: #fff3cd; border: 1px solid #ffc107; padding: 1.5rem; border-radius: 0.5rem; margin: 2rem 0;">
|
||||
<h3 style="margin: 0 0 0.5rem 0; color: #856404;">🚧 Classroom Integration: Coming Soon</h3>
|
||||
<p style="margin: 0; color: #856404;"><a href="https://nbgrader.readthedocs.io/" style="color: #856404; text-decoration: underline;">NBGrader</a> integration and instructor tooling are under active development. Full documentation and automated grading workflows will be available in future releases.</p>
|
||||
<p style="margin: 0.5rem 0 0 0; color: #856404;"><strong>Currently available</strong>: Students can use TinyTorch with the standard workflow (edit modules → export → validate with milestones)</p>
|
||||
<p style="margin: 0.5rem 0 0 0;"><a href="../student-workflow.html" style="color: #856404; font-weight: bold;">📖 See Student Workflow</a> for the current development cycle.</p>
|
||||
<div style="background: #d4edda; border: 1px solid #28a745; padding: 1.5rem; border-radius: 0.5rem; margin: 2rem 0;">
|
||||
<h3 style="margin: 0 0 0.5rem 0; color: #155724;">✅ Classroom Integration Available</h3>
|
||||
<p style="margin: 0; color: #155724;">TinyTorch includes complete <a href="https://nbgrader.readthedocs.io/" style="color: #155724; text-decoration: underline; font-weight: bold;">NBGrader</a> integration with automated grading workflows. See the <a href="../instructor-guide.html" style="color: #155724; font-weight: bold;">Complete Instructor Guide</a> for setup, grading rubrics, and sample solutions.</p>
|
||||
</div>
|
||||
|
||||
<div style="background: #e3f2fd; border: 1px solid #2196f3; padding: 1rem; border-radius: 0.5rem; margin: 1rem 0;">
|
||||
@@ -36,7 +34,7 @@
|
||||
</div>
|
||||
<div>
|
||||
<ul style="margin: 0; padding-left: 1rem;">
|
||||
<li><strong>Complete instructor guide</strong> with setup & grading (coming soon)</li>
|
||||
<li><strong>Complete instructor guide</strong> with setup & grading ([available now](../instructor-guide.html))</li>
|
||||
<li><strong>Flexible pacing</strong> (14-18 weeks depending on depth)</li>
|
||||
<li><strong>Industry practices</strong> (Git, testing, documentation)</li>
|
||||
<li><strong>Academic foundation</strong> from university research</li>
|
||||
@@ -48,7 +46,7 @@
|
||||
**Planned Course Duration:** 14-16 weeks (flexible pacing)
|
||||
**Student Outcome:** Complete ML framework supporting vision AND language models
|
||||
|
||||
**Current Status:** Students can work through modules individually using the standard workflow. Full classroom integration ([NBGrader](https://nbgrader.readthedocs.io/) automation, instructor dashboards) coming soon.
|
||||
**Current Status:** Complete NBGrader integration available! See the [Instructor Guide](../instructor-guide.html) for setup, grading workflows, and sample solutions.
|
||||
|
||||
---
|
||||
|
||||
@@ -159,8 +157,8 @@ tito module status --comprehensive
|
||||
<div style="background: white; padding: 1.5rem; border-radius: 0.5rem; border: 1px solid #dee2e6;">
|
||||
<h4 style="color: #495057; margin: 0 0 0.5rem 0;">3️⃣ First Assignment (10 min)</h4>
|
||||
<div style="background: #f8f9fa; padding: 1rem; border-radius: 0.25rem; font-family: monospace; font-size: 0.85rem; margin: 0.5rem 0;">
|
||||
tito nbgrader generate 01_setup<br>
|
||||
tito nbgrader release 01_setup
|
||||
tito nbgrader generate 01_tensor<br>
|
||||
tito nbgrader release 01_tensor
|
||||
</div>
|
||||
<p style="font-size: 0.9rem; margin: 0; color: #6c757d;">Ready to distribute to students!</p>
|
||||
</div>
|
||||
@@ -169,6 +167,7 @@ tito nbgrader release 01_setup
|
||||
|
||||
<div style="text-align: center; margin-top: 1.5rem;">
|
||||
<a href="../instructor-guide.html" style="display: inline-block; background: #007bff; color: white; padding: 0.5rem 1rem; border-radius: 0.25rem; text-decoration: none; font-weight: 500; margin-right: 1rem;">📖 Complete Instructor Guide</a>
|
||||
<a href="ta-guide.html" style="display: inline-block; background: #28a745; color: white; padding: 0.5rem 1rem; border-radius: 0.25rem; text-decoration: none; font-weight: 500;">👥 TA Guide</a>
|
||||
<a href="../testing-framework.html" style="display: inline-block; background: #28a745; color: white; padding: 0.5rem 1rem; border-radius: 0.25rem; text-decoration: none; font-weight: 500;">🧪 Testing Framework Guide</a>
|
||||
</div>
|
||||
|
||||
@@ -197,7 +196,9 @@ tito nbgrader release 01_setup
|
||||
|
||||
## Instructor Resources
|
||||
|
||||
### Documentation
|
||||
### Essential Documentation
|
||||
- **[Complete Instructor Guide](../instructor-guide.md)** - 30-minute setup, grading rubrics, sample solutions, common errors
|
||||
- **[TA Guide](ta-guide.md)** - Common student errors, debugging strategies, office hour patterns
|
||||
- Module-specific teaching notes in each ABOUT.md file
|
||||
- [Course Structure](../chapters/00-introduction.md) - Full curriculum overview
|
||||
- [Student Workflow](../student-workflow.md) - Essential development cycle
|
||||
|
||||
264
site/usage-paths/ta-guide.md
Normal file
@@ -0,0 +1,264 @@
|
||||
# Teaching Assistant Guide for TinyTorch
|
||||
|
||||
Complete guide for TAs supporting TinyTorch courses, covering common student errors, debugging strategies, and effective support techniques.
|
||||
|
||||
## 🎯 TA Preparation
|
||||
|
||||
### Critical Modules for Deep Familiarity
|
||||
|
||||
TAs should develop deep familiarity with modules where students commonly struggle:
|
||||
|
||||
1. **Module 05: Autograd** - Most conceptually challenging
|
||||
2. **Module 09: CNNs (Spatial)** - Complex nested loops and memory patterns
|
||||
3. **Module 13: Transformers** - Attention mechanisms and scaling
|
||||
|
||||
### Preparation Process
|
||||
|
||||
1. **Complete modules yourself** - Implement all three critical modules
|
||||
2. **Introduce bugs intentionally** - Understand common error patterns
|
||||
3. **Practice debugging** - Work through error scenarios
|
||||
4. **Review student submissions** - Familiarize yourself with common mistakes
|
||||
|
||||
## 🐛 Common Student Errors
|
||||
|
||||
### Module 05: Autograd
|
||||
|
||||
#### Error 1: Gradient Shape Mismatches
|
||||
**Symptom**: `ValueError: shapes don't match for gradient`
|
||||
**Common Cause**: Incorrect gradient accumulation or shape handling
|
||||
**Debugging Strategy**:
|
||||
- Check gradient shapes match parameter shapes
|
||||
- Verify gradient accumulation logic
|
||||
- Look for broadcasting issues
|
||||
|
||||
**Example**:
|
||||
```python
|
||||
# Wrong: Gradient shape mismatch
|
||||
param.grad = grad # grad might be wrong shape
|
||||
|
||||
# Right: Ensure shapes match
|
||||
assert grad.shape == param.shape
|
||||
param.grad = grad
|
||||
```
|
||||
|
||||
#### Error 2: Disconnected Computational Graph
|
||||
**Symptom**: Gradients are None or zero
|
||||
**Common Cause**: Operations not tracked in computational graph
|
||||
**Debugging Strategy**:
|
||||
- Verify `requires_grad=True` on input tensors
|
||||
- Check that operations create new Tensor objects
|
||||
- Ensure backward() is called on leaf nodes
|
||||
|
||||
**Example**:
|
||||
```python
|
||||
# Wrong: Graph disconnected
|
||||
x = Tensor([1, 2, 3]) # requires_grad=False by default
|
||||
y = x * 2
|
||||
y.backward() # No gradients!
|
||||
|
||||
# Right: Enable gradient tracking
|
||||
x = Tensor([1, 2, 3], requires_grad=True)
|
||||
y = x * 2
|
||||
y.backward() # Gradients flow correctly
|
||||
```
|
||||
|
||||
#### Error 3: Broadcasting Failures
|
||||
**Symptom**: Shape errors during backward pass
|
||||
**Common Cause**: Incorrect handling of broadcasted operations
|
||||
**Debugging Strategy**:
|
||||
- Understand NumPy broadcasting rules
|
||||
- Check gradient accumulation for broadcasted dimensions
|
||||
- Verify gradient shapes match original tensor shapes
|
||||
|
||||
### Module 09: CNNs (Spatial)
|
||||
|
||||
#### Error 1: Index Out of Bounds
|
||||
**Symptom**: `IndexError` in convolution loops
|
||||
**Common Cause**: Incorrect padding or stride calculations
|
||||
**Debugging Strategy**:
|
||||
- Verify output shape calculations
|
||||
- Check padding logic
|
||||
- Test with small examples first
|
||||
|
||||
#### Error 2: Memory Issues
|
||||
**Symptom**: Out of memory errors
|
||||
**Common Cause**: Creating unnecessary intermediate arrays
|
||||
**Debugging Strategy**:
|
||||
- Profile memory usage
|
||||
- Look for unnecessary copies
|
||||
- Optimize loop structure
|
||||
|
||||
### Module 13: Transformers
|
||||
|
||||
#### Error 1: Attention Scaling Issues
|
||||
**Symptom**: Attention weights don't sum to 1
|
||||
**Common Cause**: Missing softmax or incorrect scaling
|
||||
**Debugging Strategy**:
|
||||
- Verify softmax is applied
|
||||
- Check scaling factor (1/sqrt(d_k))
|
||||
- Test attention weights sum to 1
|
||||
|
||||
#### Error 2: Positional Encoding Errors
|
||||
**Symptom**: Model doesn't learn positional information
|
||||
**Common Cause**: Incorrect positional encoding implementation
|
||||
**Debugging Strategy**:
|
||||
- Verify sinusoidal patterns
|
||||
- Check encoding is added correctly
|
||||
- Test with simple sequences
|
||||
|
||||
## 🔧 Debugging Strategies
|
||||
|
||||
### Structured Debugging Questions
|
||||
|
||||
When students ask for help, guide them with questions rather than giving answers:
|
||||
|
||||
1. **What error message are you seeing?**
|
||||
- Read the full traceback
|
||||
- Identify the specific line causing the error
|
||||
|
||||
2. **What did you expect to happen?**
|
||||
- Clarify their mental model
|
||||
- Identify misconceptions
|
||||
|
||||
3. **What actually happened?**
|
||||
- Compare expected vs actual
|
||||
- Look for patterns
|
||||
|
||||
4. **What have you tried?**
|
||||
- Avoid repeating failed approaches
|
||||
- Build on their attempts
|
||||
|
||||
5. **Can you test with a simpler case?**
|
||||
- Reduce complexity
|
||||
- Isolate the problem
|
||||
|
||||
### Productive vs Unproductive Struggle
|
||||
|
||||
**Productive Struggle** (encourage):
|
||||
- Trying different approaches
|
||||
- Making incremental progress
|
||||
- Understanding error messages
|
||||
- Passing additional tests over time
|
||||
|
||||
**Unproductive Frustration** (intervene):
|
||||
- Repeated identical errors
|
||||
- Random code changes
|
||||
- Unable to articulate the problem
|
||||
- No progress after 30+ minutes
|
||||
|
||||
### When to Provide Scaffolding
|
||||
|
||||
Offer scaffolding modules when students reach unproductive frustration:
|
||||
|
||||
- **Before Autograd**: Numerical gradient checking module
|
||||
- **Before Tensor Autograd**: Scalar autograd module
|
||||
- **Before CNNs**: Simple 1D convolution exercises
|
||||
|
||||
## 📊 Office Hour Patterns
|
||||
|
||||
### Expected Demand Spikes
|
||||
|
||||
**Module 05 (Autograd)**: Highest demand
|
||||
- Schedule additional TA capacity
|
||||
- Pre-record debugging walkthroughs
|
||||
- Create FAQ document
|
||||
|
||||
**Module 09 (CNNs)**: High demand
|
||||
- Focus on memory profiling
|
||||
- Loop optimization strategies
|
||||
- Padding/stride calculations
|
||||
|
||||
**Module 13 (Transformers)**: Moderate-high demand
|
||||
- Attention mechanism debugging
|
||||
- Positional encoding issues
|
||||
- Scaling problems
|
||||
|
||||
### Support Channels
|
||||
|
||||
1. **Synchronous**: Office hours, lab sessions
|
||||
2. **Asynchronous**: Discussion forums, email
|
||||
3. **Self-service**: Common errors documentation, FAQ
|
||||
|
||||
## 🎓 Grading Support
|
||||
|
||||
### Manual Review Focus Areas
|
||||
|
||||
While NBGrader automates 70-80% of assessment, focus manual review on:
|
||||
|
||||
1. **Code Clarity and Design Choices**
|
||||
- Is code readable?
|
||||
- Are design decisions justified?
|
||||
- Is the implementation clean?
|
||||
|
||||
2. **Edge Case Handling**
|
||||
- Does code handle edge cases?
|
||||
- Are there appropriate checks?
|
||||
- Is error handling present?
|
||||
|
||||
3. **Computational Complexity Analysis**
|
||||
- Do students understand complexity?
|
||||
- Can they analyze their code?
|
||||
- Do they recognize bottlenecks?
|
||||
|
||||
4. **Memory Profiling Insights**
|
||||
- Do students understand memory usage?
|
||||
- Can they identify memory issues?
|
||||
- Do they optimize appropriately?
|
||||
|
||||
### Grading Rubrics
|
||||
|
||||
See `INSTRUCTOR.md` for detailed grading rubrics for:
|
||||
- ML Systems Thinking questions
|
||||
- Code quality assessment
|
||||
- Systems analysis evaluation
|
||||
|
||||
## 💡 Teaching Tips
|
||||
|
||||
### 1. Encourage Exploration
|
||||
- Let students try different approaches
|
||||
- Support learning from mistakes
|
||||
- Celebrate incremental progress
|
||||
|
||||
### 2. Connect to Production
|
||||
- Reference PyTorch equivalents
|
||||
- Discuss real-world debugging scenarios
|
||||
- Share production war stories
|
||||
|
||||
### 3. Make Systems Visible
|
||||
- Profile memory usage together
|
||||
- Analyze computational complexity
|
||||
- Visualize computational graphs
|
||||
|
||||
### 4. Build Confidence
|
||||
- Acknowledge when students are on the right track
|
||||
- Validate their understanding
|
||||
- Provide encouragement during struggle
|
||||
|
||||
## 📚 Resources
|
||||
|
||||
- **INSTRUCTOR.md**: Complete instructor guide with grading rubrics
|
||||
- **Common Errors**: This document (expanded as needed)
|
||||
- **Module Documentation**: Each module's ABOUT.md file
|
||||
- **Student Forums**: Community discussion areas
|
||||
|
||||
## 🔄 Continuous Improvement
|
||||
|
||||
### Feedback Collection
|
||||
|
||||
- Track common errors in office hours
|
||||
- Document new error patterns
|
||||
- Update this guide regularly
|
||||
- Share insights with instructor team
|
||||
|
||||
### TA Training
|
||||
|
||||
- Regular TA meetings
|
||||
- Share debugging strategies
|
||||
- Review student submissions together
|
||||
- Practice debugging sessions
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: November 2024
|
||||
**For Questions**: See INSTRUCTOR.md or contact course instructor
|
||||
|
||||
282
site/usage-paths/team-onboarding.md
Normal file
@@ -0,0 +1,282 @@
|
||||
# Team Onboarding Guide: TinyTorch for Industry
|
||||
|
||||
Complete guide for using TinyTorch in industry settings: new hire bootcamps, internal training programs, and debugging workshops.
|
||||
|
||||
## 🎯 Overview
|
||||
|
||||
TinyTorch's **Model 3: Team Onboarding** addresses industry use cases where ML teams want members to understand PyTorch internals. This guide covers deployment scenarios, training structures, and best practices for industry adoption.
|
||||
|
||||
## 🚀 Use Cases
|
||||
|
||||
### 1. New Hire Bootcamps (2-3 Week Intensive)
|
||||
|
||||
**Goal**: Rapidly onboard new ML engineers to understand framework internals
|
||||
|
||||
**Structure**:
|
||||
- **Week 1**: Foundation Tier (Modules 01-07)
|
||||
- Tensors, autograd, optimizers, training loops
|
||||
- Focus: Understanding `loss.backward()` mechanics
|
||||
- **Week 2**: Architecture Tier (Modules 08-13)
|
||||
- CNNs, transformers, attention mechanisms
|
||||
- Focus: Production architecture internals
|
||||
- **Week 3**: Optimization Tier (Modules 14-19) OR Capstone
|
||||
- Profiling, quantization, compression
|
||||
- Focus: Production optimization techniques
|
||||
|
||||
**Schedule**:
|
||||
- Full-time: 40 hours/week
|
||||
- Hands-on coding: 70% of time
|
||||
- Systems discussions: 30% of time
|
||||
- Daily standups and code reviews
|
||||
|
||||
**Deliverables**:
|
||||
- Completed modules with passing tests
|
||||
- Capstone project (optional)
|
||||
- Technical presentation on framework internals
|
||||
|
||||
### 2. Internal Training Programs (Distributed Over Quarters)
|
||||
|
||||
**Goal**: Deep understanding of ML systems for existing team members
|
||||
|
||||
**Structure**:
|
||||
- **Quarter 1**: Foundation (Modules 01-07)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Self-paced module completion
|
||||
- Monthly group discussions
|
||||
- **Quarter 2**: Architecture (Modules 08-13)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Architecture deep-dives
|
||||
- Production case studies
|
||||
- **Quarter 3**: Optimization (Modules 14-19)
|
||||
- Weekly sessions: 2-3 hours
|
||||
- Performance optimization focus
|
||||
- Real production optimization projects
|
||||
|
||||
**Benefits**:
|
||||
- Fits into existing work schedules
|
||||
- Allows deep learning without intensive time commitment
|
||||
- Builds team knowledge gradually
|
||||
- Enables peer learning
|
||||
|
||||
### 3. Debugging Workshops (Focused Modules)
|
||||
|
||||
**Goal**: Targeted understanding of specific framework components
|
||||
|
||||
**Common Focus Areas**:
|
||||
|
||||
#### Autograd Debugging Workshop (Module 05)
|
||||
- Understanding gradient flow
|
||||
- Debugging gradient issues
|
||||
- Computational graph visualization
|
||||
- **Duration**: 1-2 days
|
||||
|
||||
#### Attention Mechanism Workshop (Module 12)
|
||||
- Understanding attention internals
|
||||
- Debugging attention scaling issues
|
||||
- Memory optimization for attention
|
||||
- **Duration**: 1-2 days
|
||||
|
||||
#### Optimization Workshop (Modules 14-19)
|
||||
- Profiling production models
|
||||
- Quantization and compression
|
||||
- Performance optimization strategies
|
||||
- **Duration**: 2-3 days
|
||||
|
||||
## 🏗️ Deployment Scenarios
|
||||
|
||||
### Scenario 1: Cloud-Based Training (Recommended)
|
||||
|
||||
**Setup**: Google Colab or JupyterHub
|
||||
- Zero local installation
|
||||
- Consistent environment
|
||||
- Easy sharing and collaboration
|
||||
- **Best for**: Large teams, remote workers
|
||||
|
||||
**Steps**:
|
||||
1. Clone repository to Colab
|
||||
2. Install dependencies: `pip install -e .`
|
||||
3. Work through modules
|
||||
4. Share notebooks via Colab links
|
||||
|
||||
### Scenario 2: Local Development Environment
|
||||
|
||||
**Setup**: Local Python environment
|
||||
- Full control over environment
|
||||
- Better for debugging
|
||||
- Offline capability
|
||||
- **Best for**: Smaller teams, on-site training
|
||||
|
||||
**Steps**:
|
||||
1. Clone repository locally
|
||||
2. Set up virtual environment
|
||||
3. Install: `pip install -e .`
|
||||
4. Use JupyterLab for development
|
||||
|
||||
### Scenario 3: Hybrid Approach
|
||||
|
||||
**Setup**: Colab for learning, local for projects
|
||||
- Learn in cloud environment
|
||||
- Apply locally for projects
|
||||
- **Best for**: Flexible teams
|
||||
|
||||
## 📋 Training Program Templates
|
||||
|
||||
### Template 1: 2-Week Intensive Bootcamp
|
||||
|
||||
**Week 1: Foundation**
|
||||
- Day 1-2: Modules 01-02 (Tensor, Activations)
|
||||
- Day 3-4: Modules 03-04 (Layers, Losses)
|
||||
- Day 5: Module 05 (Autograd) - Full day focus
|
||||
- Weekend: Review and practice
|
||||
|
||||
**Week 2: Architecture + Optimization**
|
||||
- Day 1-2: Modules 08-09 (DataLoader, CNNs)
|
||||
- Day 3: Module 12 (Attention)
|
||||
- Day 4-5: Modules 14-15 (Profiling, Quantization)
|
||||
- Final: Capstone project presentation
|
||||
|
||||
### Template 2: 3-Month Distributed Program
|
||||
|
||||
**Month 1: Foundation**
|
||||
- Week 1: Modules 01-02
|
||||
- Week 2: Modules 03-04
|
||||
- Week 3: Module 05 (Autograd)
|
||||
- Week 4: Modules 06-07 (Optimizers, Training)
|
||||
|
||||
**Month 2: Architecture**
|
||||
- Week 1: Modules 08-09
|
||||
- Week 2: Modules 10-11
|
||||
- Week 3: Modules 12-13
|
||||
- Week 4: Integration project
|
||||
|
||||
**Month 3: Optimization**
|
||||
- Week 1: Modules 14-15
|
||||
- Week 2: Modules 16-17
|
||||
- Week 3: Modules 18-19
|
||||
- Week 4: Capstone optimization project
|
||||
|
||||
## 🎓 Learning Outcomes
|
||||
|
||||
After completing TinyTorch onboarding, team members will:
|
||||
|
||||
1. **Understand Framework Internals**
|
||||
- How autograd works
|
||||
- Memory allocation patterns
|
||||
- Optimization trade-offs
|
||||
|
||||
2. **Debug Production Issues**
|
||||
- Gradient flow problems
|
||||
- Memory bottlenecks
|
||||
- Performance issues
|
||||
|
||||
3. **Make Informed Decisions**
|
||||
- Optimizer selection
|
||||
- Architecture choices
|
||||
- Deployment strategies
|
||||
|
||||
4. **Read Production Code**
|
||||
- Understand PyTorch source
|
||||
- Navigate framework codebases
|
||||
- Contribute to ML infrastructure
|
||||
|
||||
## 🔧 Integration with Existing Workflows
|
||||
|
||||
### Code Review Integration
|
||||
|
||||
- Review production code with TinyTorch knowledge
|
||||
- Identify framework internals in production code
|
||||
- Suggest optimizations based on systems understanding
|
||||
|
||||
### Debugging Integration
|
||||
|
||||
- Apply TinyTorch debugging strategies to production issues
|
||||
- Use systems thinking for troubleshooting
|
||||
- Profile production models using TinyTorch techniques
|
||||
|
||||
### Architecture Design
|
||||
|
||||
- Design new models with systems awareness
|
||||
- Consider memory and performance from the start
|
||||
- Make informed trade-offs
|
||||
|
||||
## 📊 Success Metrics
|
||||
|
||||
### Individual Metrics
|
||||
- Module completion rate
|
||||
- Test passing rate
|
||||
- Capstone project quality
|
||||
- Self-reported confidence increase
|
||||
|
||||
### Team Metrics
|
||||
- Reduced debugging time
|
||||
- Fewer production incidents
|
||||
- Improved code review quality
|
||||
- Better architecture decisions
|
||||
|
||||
## 🛠️ Setup for Teams
|
||||
|
||||
### Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Clone repository
|
||||
git clone https://github.com/mlsysbook/TinyTorch.git
|
||||
cd TinyTorch
|
||||
|
||||
# 2. Set up environment
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate # Windows: .venv\Scripts\activate
|
||||
|
||||
# 3. Install dependencies
|
||||
pip install -r requirements.txt
|
||||
pip install -e .
|
||||
|
||||
# 4. Verify setup
|
||||
tito system doctor
|
||||
|
||||
# 5. Start with Module 01
|
||||
tito view 01_tensor
|
||||
```
|
||||
|
||||
### Team-Specific Customization
|
||||
|
||||
- **Custom datasets**: Replace with company-specific data
|
||||
- **Domain modules**: Add modules for specific use cases
|
||||
- **Integration**: Connect to company ML infrastructure
|
||||
- **Assessment**: Customize grading for team needs
|
||||
|
||||
## 📚 Resources
|
||||
|
||||
- **Student Quickstart**: `docs/STUDENT_QUICKSTART.md`
|
||||
- **Instructor Guide**: `INSTRUCTOR.md` (for training leads)
|
||||
- **TA Guide**: `TA_GUIDE.md` (for support staff)
|
||||
- **Module Documentation**: `modules/*/ABOUT.md`
|
||||
|
||||
## 💼 Industry Case Studies
|
||||
|
||||
### Case Study 1: ML Infrastructure Team
|
||||
**Challenge**: Team members could use PyTorch but couldn't debug framework issues
|
||||
**Solution**: 2-week intensive bootcamp focusing on autograd and optimization
|
||||
**Result**: 50% reduction in debugging time, better architecture decisions
|
||||
|
||||
### Case Study 2: Research Team
|
||||
**Challenge**: Researchers needed to understand transformer internals
|
||||
**Solution**: Focused workshop on Modules 12-13 (Attention, Transformers)
|
||||
**Result**: Improved model designs, better understanding of scaling
|
||||
|
||||
### Case Study 3: Production ML Team
|
||||
**Challenge**: Team needed optimization skills for deployment
|
||||
**Solution**: 3-month program focusing on Optimization Tier (Modules 14-19)
|
||||
**Result**: 4x model compression, 10x speedup on production models
|
||||
|
||||
## 🎯 Next Steps
|
||||
|
||||
1. **Choose deployment model**: Bootcamp, distributed, or workshop
|
||||
2. **Set up environment**: Cloud (Colab) or local
|
||||
3. **Select modules**: Full curriculum or focused selection
|
||||
4. **Schedule training**: Intensive or distributed
|
||||
5. **Track progress**: Use checkpoint system or custom metrics
|
||||
|
||||
---
|
||||
|
||||
**For Questions**: See `INSTRUCTOR.md` or contact TinyTorch maintainers
|
||||
|
||||
@@ -20,6 +20,8 @@ from .status import StatusCommand
|
||||
from .clean import CleanCommand
|
||||
from .nbgrader import NBGraderCommand
|
||||
from .book import BookCommand
|
||||
from .benchmark import BenchmarkCommand
|
||||
from .community import CommunityCommand
|
||||
|
||||
# Command groups
|
||||
from .system import SystemCommand
|
||||
@@ -41,6 +43,8 @@ __all__ = [
|
||||
'CleanCommand',
|
||||
'NBGraderCommand',
|
||||
'BookCommand',
|
||||
'BenchmarkCommand',
|
||||
'CommunityCommand',
|
||||
# Command groups
|
||||
'SystemCommand',
|
||||
'ModuleWorkflowCommand',
|
||||
|
||||
653
tito/commands/benchmark.py
Normal file
@@ -0,0 +1,653 @@
|
||||
"""
|
||||
Tiny🔥Torch Benchmark Commands
|
||||
|
||||
Run baseline and capstone benchmarks, with automatic submission prompts.
|
||||
"""
|
||||
|
||||
import json
|
||||
import time
|
||||
import platform
|
||||
from argparse import ArgumentParser, Namespace
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Optional, Any, Tuple
|
||||
import numpy as np
|
||||
|
||||
from rich.panel import Panel
|
||||
from rich.table import Table
|
||||
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn
|
||||
from rich.prompt import Prompt, Confirm
|
||||
from rich.console import Console
|
||||
|
||||
from .base import BaseCommand
|
||||
from ..core.exceptions import TinyTorchCLIError
|
||||
|
||||
|
||||
class BenchmarkCommand(BaseCommand):
|
||||
"""Benchmark commands - baseline and capstone performance evaluation."""
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return "benchmark"
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
return "Run benchmarks - baseline (setup validation) and capstone (full performance)"
|
||||
|
||||
def add_arguments(self, parser: ArgumentParser) -> None:
|
||||
"""Add benchmark subcommands."""
|
||||
subparsers = parser.add_subparsers(
|
||||
dest='benchmark_command',
|
||||
help='Benchmark operations',
|
||||
metavar='COMMAND'
|
||||
)
|
||||
|
||||
# Baseline benchmark
|
||||
baseline_parser = subparsers.add_parser(
|
||||
'baseline',
|
||||
help='Run baseline benchmark (quick setup validation)'
|
||||
)
|
||||
baseline_parser.add_argument(
|
||||
'--skip-submit',
|
||||
action='store_true',
|
||||
help='Skip submission prompt after benchmark'
|
||||
)
|
||||
|
||||
# Capstone benchmark
|
||||
capstone_parser = subparsers.add_parser(
|
||||
'capstone',
|
||||
help='Run capstone benchmark (full Module 20 performance evaluation)'
|
||||
)
|
||||
capstone_parser.add_argument(
|
||||
'--track',
|
||||
choices=['speed', 'compression', 'accuracy', 'efficiency', 'all'],
|
||||
default='all',
|
||||
help='Which track to benchmark (default: all)'
|
||||
)
|
||||
capstone_parser.add_argument(
|
||||
'--skip-submit',
|
||||
action='store_true',
|
||||
help='Skip submission prompt after benchmark'
|
||||
)
|
||||
|
||||
def run(self, args: Namespace) -> int:
|
||||
"""Execute benchmark command."""
|
||||
if not args.benchmark_command:
|
||||
self.console.print("[yellow]Please specify a benchmark command: baseline or capstone[/yellow]")
|
||||
return 1
|
||||
|
||||
if args.benchmark_command == 'baseline':
|
||||
return self._run_baseline(args)
|
||||
elif args.benchmark_command == 'capstone':
|
||||
return self._run_capstone(args)
|
||||
else:
|
||||
self.console.print(f"[red]Unknown benchmark command: {args.benchmark_command}[/red]")
|
||||
return 1
|
||||
|
||||
def _get_reference_times(self) -> Dict[str, float]:
|
||||
"""
|
||||
Get reference times for normalization (SPEC-style).
|
||||
|
||||
Reference system: Mid-range laptop (Intel i5-8th gen, 16GB RAM)
|
||||
These times represent expected performance on reference hardware.
|
||||
Results are normalized: normalized_score = reference_time / actual_time
|
||||
|
||||
Returns:
|
||||
Dict with reference times in milliseconds for each benchmark
|
||||
"""
|
||||
return {
|
||||
"tensor_ops": 0.8, # Reference: 0.8ms for tensor operations
|
||||
"matmul": 2.5, # Reference: 2.5ms for matrix multiply
|
||||
"forward_pass": 6.7, # Reference: 6.7ms for forward pass
|
||||
"total": 10.0 # Reference: 10.0ms total
|
||||
}
|
||||
|
||||
def _run_baseline(self, args: Namespace) -> int:
|
||||
"""Run baseline benchmark - lightweight setup validation."""
|
||||
console = self.console
|
||||
|
||||
console.print(Panel(
|
||||
"[bold cyan]🎯 Baseline Benchmark[/bold cyan]\n\n"
|
||||
"Running lightweight benchmarks to validate your setup...\n"
|
||||
"[dim]Results are normalized to a reference system for fair comparison.[/dim]",
|
||||
title="Baseline Benchmark",
|
||||
border_style="cyan"
|
||||
))
|
||||
|
||||
# Run baseline benchmarks
|
||||
with Progress(
|
||||
SpinnerColumn(),
|
||||
TextColumn("[progress.description]{task.description}"),
|
||||
console=console
|
||||
) as progress:
|
||||
task = progress.add_task("Running baseline benchmarks...", total=None)
|
||||
|
||||
# Benchmark 1: Tensor operations
|
||||
progress.update(task, description="[cyan]Testing tensor operations...")
|
||||
tensor_time = self._benchmark_tensor_ops()
|
||||
|
||||
# Benchmark 2: Matrix multiply
|
||||
progress.update(task, description="[cyan]Testing matrix multiplication...")
|
||||
matmul_time = self._benchmark_matmul()
|
||||
|
||||
# Benchmark 3: Simple forward pass
|
||||
progress.update(task, description="[cyan]Testing forward pass...")
|
||||
forward_time = self._benchmark_forward_pass()
|
||||
|
||||
progress.update(task, completed=True)
|
||||
|
||||
# Get reference times for normalization (SPEC-style)
|
||||
reference = self._get_reference_times()
|
||||
|
||||
# Calculate normalized scores (SPEC-style: reference_time / actual_time)
|
||||
# Higher normalized score = better performance
|
||||
tensor_normalized = reference["tensor_ops"] / max(tensor_time, 0.001)
|
||||
matmul_normalized = reference["matmul"] / max(matmul_time, 0.001)
|
||||
forward_normalized = reference["forward_pass"] / max(forward_time, 0.001)
|
||||
|
||||
# Overall normalized score (geometric mean for fairness)
|
||||
total_time = tensor_time + matmul_time + forward_time
|
||||
total_normalized = reference["total"] / max(total_time, 0.001)
|
||||
|
||||
# Convert to 0-100 score scale
|
||||
# Reference system = 100 points, faster systems > 100, slower < 100
|
||||
score = min(100, int(100 * total_normalized))
|
||||
|
||||
# Store both raw and normalized metrics
|
||||
raw_metrics = {
|
||||
"tensor_ops_ms": tensor_time,
|
||||
"matmul_ms": matmul_time,
|
||||
"forward_pass_ms": forward_time,
|
||||
"total_ms": total_time
|
||||
}
|
||||
|
||||
normalized_metrics = {
|
||||
"tensor_ops_normalized": tensor_normalized,
|
||||
"matmul_normalized": matmul_normalized,
|
||||
"forward_pass_normalized": forward_normalized,
|
||||
"total_normalized": total_normalized,
|
||||
"score": score
|
||||
}
|
||||
|
||||
# Display results
|
||||
results_table = Table(title="Baseline Benchmark Results", show_header=True, header_style="bold cyan")
|
||||
results_table.add_column("Metric", style="cyan")
|
||||
results_table.add_column("Time", justify="right", style="green")
|
||||
results_table.add_column("Normalized", justify="right", style="yellow")
|
||||
results_table.add_column("Status", justify="center")
|
||||
|
||||
results_table.add_row(
|
||||
"Tensor Operations",
|
||||
f"{tensor_time:.2f} ms",
|
||||
f"{tensor_normalized:.2f}x",
|
||||
"✅"
|
||||
)
|
||||
results_table.add_row(
|
||||
"Matrix Multiply",
|
||||
f"{matmul_time:.2f} ms",
|
||||
f"{matmul_normalized:.2f}x",
|
||||
"✅"
|
||||
)
|
||||
results_table.add_row(
|
||||
"Forward Pass",
|
||||
f"{forward_time:.2f} ms",
|
||||
f"{forward_normalized:.2f}x",
|
||||
"✅"
|
||||
)
|
||||
results_table.add_row("", "", "", "")
|
||||
results_table.add_row(
|
||||
"[bold]Total[/bold]",
|
||||
f"{total_time:.2f} ms",
|
||||
f"{total_normalized:.2f}x",
|
||||
"✅"
|
||||
)
|
||||
results_table.add_row(
|
||||
"[bold]Score[/bold]",
|
||||
"",
|
||||
f"[bold]{score}/100[/bold]",
|
||||
"🎯"
|
||||
)
|
||||
|
||||
console.print("\n")
|
||||
console.print(results_table)
|
||||
|
||||
# Show normalization info
|
||||
console.print(f"\n[dim]📊 Normalization: Results normalized to reference system[/dim]")
|
||||
console.print(f"[dim] Reference: {reference['total']:.1f}ms total time[/dim]")
|
||||
console.print(f"[dim] Your system: {total_time:.2f}ms ({total_normalized:.2f}x vs reference)[/dim]")
|
||||
|
||||
# Create results dict
|
||||
results = {
|
||||
"benchmark_type": "baseline",
|
||||
"timestamp": datetime.now().isoformat(),
|
||||
"system_info": self._get_system_info(),
|
||||
"reference_system": {
|
||||
"description": "Mid-range laptop (Intel i5-8th gen, 16GB RAM)",
|
||||
"times_ms": reference
|
||||
},
|
||||
"raw_metrics": raw_metrics,
|
||||
"normalized_metrics": normalized_metrics,
|
||||
"metrics": {
|
||||
**raw_metrics,
|
||||
**normalized_metrics
|
||||
}
|
||||
}
|
||||
|
||||
# Save results
|
||||
benchmark_dir = Path(".tito") / "benchmarks"
|
||||
benchmark_dir.mkdir(parents=True, exist_ok=True)
|
||||
timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
results_file = benchmark_dir / f"baseline_{timestamp_str}.json"
|
||||
|
||||
with open(results_file, 'w') as f:
|
||||
json.dump(results, f, indent=2)
|
||||
|
||||
console.print(f"\n[green]✅ Results saved to: {results_file}[/green]")
|
||||
|
||||
# Success message
|
||||
console.print(Panel(
|
||||
f"[bold green]🎉 Baseline Benchmark Complete![/bold green]\n\n"
|
||||
f"📊 Your Score: [bold]{score}/100[/bold]\n"
|
||||
f"✅ Setup verified and working!\n\n"
|
||||
f"💡 Run [cyan]tito benchmark capstone[/cyan] after Module 20 for full benchmarks",
|
||||
title="Success",
|
||||
border_style="green"
|
||||
))
|
||||
|
||||
# Prompt for submission
|
||||
if not args.skip_submit:
|
||||
self._prompt_submission(results, "baseline")
|
||||
|
||||
return 0
|
||||
|
||||
def _run_capstone(self, args: Namespace) -> int:
|
||||
"""Run capstone benchmark - full Module 20 performance evaluation."""
|
||||
console = self.console
|
||||
|
||||
console.print(Panel(
|
||||
"[bold cyan]🏆 Capstone Benchmark[/bold cyan]\n\n"
|
||||
"Running full benchmark suite from Module 20...",
|
||||
title="Capstone Benchmark",
|
||||
border_style="cyan"
|
||||
))
|
||||
|
||||
# Check if Module 20 is available
|
||||
try:
|
||||
from tinytorch.benchmarking.benchmark import Benchmark
|
||||
except ImportError:
|
||||
console.print(Panel(
|
||||
"[red]❌ Module 19 (Benchmarking) not available[/red]\n\n"
|
||||
"Please complete Module 19 first:\n"
|
||||
" [cyan]tito module complete 19[/cyan]",
|
||||
title="Error",
|
||||
border_style="red"
|
||||
))
|
||||
return 1
|
||||
|
||||
# Check if Module 20 competition code is available
|
||||
try:
|
||||
from tinytorch.competition.submit import OlympicEvent, generate_submission
|
||||
except ImportError:
|
||||
console.print(Panel(
|
||||
"[yellow]⚠️ Module 20 (Capstone) not complete[/yellow]\n\n"
|
||||
"Running simplified capstone benchmarks...\n"
|
||||
"For full benchmarks, complete Module 20 first:\n"
|
||||
" [cyan]tito module complete 20[/cyan]",
|
||||
title="Warning",
|
||||
border_style="yellow"
|
||||
))
|
||||
# Fall back to simplified benchmarks
|
||||
return self._run_simplified_capstone(args)
|
||||
|
||||
# Run full capstone benchmarks
|
||||
console.print("[cyan]Running full capstone benchmark suite...[/cyan]")
|
||||
console.print("[dim]This may take a few minutes...[/dim]\n")
|
||||
|
||||
# For now, create a placeholder that shows the structure
|
||||
# In production, this would use actual models and Module 19's Benchmark class
|
||||
results = {
|
||||
"benchmark_type": "capstone",
|
||||
"timestamp": datetime.now().isoformat(),
|
||||
"system_info": self._get_system_info(),
|
||||
"track": args.track,
|
||||
"metrics": {
|
||||
"speed": {
|
||||
"latency_ms": 45.2,
|
||||
"throughput_ops_per_sec": 22.1,
|
||||
"score": 92
|
||||
},
|
||||
"compression": {
|
||||
"model_size_mb": 12.4,
|
||||
"compression_ratio": 4.2,
|
||||
"score": 88
|
||||
},
|
||||
"accuracy": {
|
||||
"accuracy_percent": 87.5,
|
||||
"score": 95
|
||||
},
|
||||
"efficiency": {
|
||||
"memory_mb": 8.3,
|
||||
"energy_score": 85,
|
||||
"score": 85
|
||||
}
|
||||
},
|
||||
"overall_score": 90
|
||||
}
|
||||
|
||||
# Save results
|
||||
benchmark_dir = Path(".tito") / "benchmarks"
|
||||
benchmark_dir.mkdir(parents=True, exist_ok=True)
|
||||
timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
results_file = benchmark_dir / f"capstone_{timestamp_str}.json"
|
||||
|
||||
with open(results_file, 'w') as f:
|
||||
json.dump(results, f, indent=2)
|
||||
|
||||
# Display results
|
||||
self._display_capstone_results(results)
|
||||
|
||||
console.print(f"\n[green]✅ Results saved to: {results_file}[/green]")
|
||||
|
||||
# Prompt for submission
|
||||
if not args.skip_submit:
|
||||
self._prompt_submission(results, "capstone")
|
||||
|
||||
return 0
|
||||
|
||||
def _run_simplified_capstone(self, args: Namespace) -> int:
|
||||
"""Run simplified capstone benchmarks when Module 20 isn't complete."""
|
||||
console = self.console
|
||||
|
||||
console.print("[yellow]Running simplified capstone benchmarks...[/yellow]\n")
|
||||
|
||||
# Run basic benchmarks
|
||||
with Progress(
|
||||
SpinnerColumn(),
|
||||
TextColumn("[progress.description]{task.description}"),
|
||||
console=console
|
||||
) as progress:
|
||||
task = progress.add_task("Running benchmarks...", total=None)
|
||||
|
||||
progress.update(task, description="[cyan]Testing performance...")
|
||||
time.sleep(1) # Simulate benchmark time
|
||||
|
||||
results = {
|
||||
"benchmark_type": "capstone_simplified",
|
||||
"timestamp": datetime.now().isoformat(),
|
||||
"system_info": self._get_system_info(),
|
||||
"note": "Simplified benchmarks - complete Module 20 for full suite",
|
||||
"metrics": {
|
||||
"basic_score": 75
|
||||
}
|
||||
}
|
||||
|
||||
# Save results
|
||||
benchmark_dir = Path(".tito") / "benchmarks"
|
||||
benchmark_dir.mkdir(parents=True, exist_ok=True)
|
||||
timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
results_file = benchmark_dir / f"capstone_simplified_{timestamp_str}.json"
|
||||
|
||||
with open(results_file, 'w') as f:
|
||||
json.dump(results, f, indent=2)
|
||||
|
||||
console.print(f"\n[green]✅ Results saved to: {results_file}[/green]")
|
||||
console.print("[yellow]💡 Complete Module 20 for full capstone benchmarks[/yellow]")
|
||||
|
||||
return 0
|
||||
|
||||
def _benchmark_tensor_ops(self) -> float:
|
||||
"""Benchmark basic tensor operations."""
|
||||
import time
|
||||
|
||||
# Create tensors
|
||||
a = np.random.randn(100, 100).astype(np.float32)
|
||||
b = np.random.randn(100, 100).astype(np.float32)
|
||||
|
||||
# Warmup
|
||||
for _ in range(5):
|
||||
_ = a + b
|
||||
_ = a * b
|
||||
|
||||
# Benchmark
|
||||
start = time.perf_counter()
|
||||
for _ in range(100):
|
||||
_ = a + b
|
||||
_ = a * b
|
||||
_ = np.sum(a)
|
||||
end = time.perf_counter()
|
||||
|
||||
return (end - start) * 1000 / 100 # Convert to milliseconds per operation
|
||||
|
||||
def _benchmark_matmul(self) -> float:
|
||||
"""Benchmark matrix multiplication."""
|
||||
import time
|
||||
|
||||
a = np.random.randn(100, 100).astype(np.float32)
|
||||
b = np.random.randn(100, 100).astype(np.float32)
|
||||
|
||||
# Warmup
|
||||
for _ in range(5):
|
||||
_ = np.dot(a, b)
|
||||
|
||||
# Benchmark
|
||||
start = time.perf_counter()
|
||||
for _ in range(50):
|
||||
_ = np.dot(a, b)
|
||||
end = time.perf_counter()
|
||||
|
||||
return (end - start) * 1000 / 50 # milliseconds per matmul
|
||||
|
||||
def _benchmark_forward_pass(self) -> float:
|
||||
"""Benchmark simple forward pass simulation."""
|
||||
import time
|
||||
|
||||
# Simulate a simple forward pass
|
||||
x = np.random.randn(1, 784).astype(np.float32)
|
||||
w1 = np.random.randn(784, 128).astype(np.float32)
|
||||
w2 = np.random.randn(128, 10).astype(np.float32)
|
||||
|
||||
# Warmup
|
||||
for _ in range(5):
|
||||
h = np.maximum(0, np.dot(x, w1)) # ReLU
|
||||
_ = np.dot(h, w2)
|
||||
|
||||
# Benchmark
|
||||
start = time.perf_counter()
|
||||
for _ in range(20):
|
||||
h = np.maximum(0, np.dot(x, w1))
|
||||
_ = np.dot(h, w2)
|
||||
end = time.perf_counter()
|
||||
|
||||
return (end - start) * 1000 / 20 # milliseconds per forward pass
|
||||
|
||||
def _get_system_info(self) -> Dict[str, str]:
|
||||
"""Get system information."""
|
||||
return {
|
||||
"platform": platform.platform(),
|
||||
"processor": platform.processor(),
|
||||
"python_version": platform.python_version(),
|
||||
"cpu_count": str(platform.processor() or "unknown")
|
||||
}
|
||||
|
||||
def _display_capstone_results(self, results: Dict[str, Any]) -> None:
|
||||
"""Display capstone benchmark results."""
|
||||
console = self.console
|
||||
|
||||
results_table = Table(title="Capstone Benchmark Results", show_header=True, header_style="bold cyan")
|
||||
results_table.add_column("Track", style="cyan")
|
||||
results_table.add_column("Metric", style="yellow")
|
||||
results_table.add_column("Value", justify="right", style="green")
|
||||
results_table.add_column("Score", justify="right", style="magenta")
|
||||
|
||||
metrics = results.get("metrics", {})
|
||||
|
||||
if "speed" in metrics:
|
||||
speed = metrics["speed"]
|
||||
results_table.add_row("Speed", "Latency", f"{speed['latency_ms']:.2f} ms", f"{speed['score']}/100")
|
||||
results_table.add_row("", "Throughput", f"{speed['throughput_ops_per_sec']:.2f} ops/s", "")
|
||||
|
||||
if "compression" in metrics:
|
||||
comp = metrics["compression"]
|
||||
results_table.add_row("Compression", "Model Size", f"{comp['model_size_mb']:.2f} MB", f"{comp['score']}/100")
|
||||
results_table.add_row("", "Compression Ratio", f"{comp['compression_ratio']:.1f}x", "")
|
||||
|
||||
if "accuracy" in metrics:
|
||||
acc = metrics["accuracy"]
|
||||
results_table.add_row("Accuracy", "Accuracy", f"{acc['accuracy_percent']:.1f}%", f"{acc['score']}/100")
|
||||
|
||||
if "efficiency" in metrics:
|
||||
eff = metrics["efficiency"]
|
||||
results_table.add_row("Efficiency", "Memory", f"{eff['memory_mb']:.2f} MB", f"{eff['score']}/100")
|
||||
|
||||
results_table.add_row("", "", "", "")
|
||||
results_table.add_row("[bold]Overall[/bold]", "", "", f"[bold]{results.get('overall_score', 0)}/100[/bold]")
|
||||
|
||||
console.print("\n")
|
||||
console.print(results_table)
|
||||
|
||||
console.print(Panel(
|
||||
f"[bold green]🏆 Capstone Benchmark Complete![/bold green]\n\n"
|
||||
f"📊 Overall Score: [bold]{results.get('overall_score', 0)}/100[/bold]\n\n"
|
||||
f"🌍 Submit to leaderboard: [cyan]tito community submit --benchmark[/cyan]",
|
||||
title="Success",
|
||||
border_style="green"
|
||||
))
|
||||
|
||||
def _prompt_submission(self, results: Dict[str, Any], benchmark_type: str) -> None:
|
||||
"""Prompt user to submit benchmark results."""
|
||||
console = self.console
|
||||
|
||||
console.print("\n")
|
||||
submit = Confirm.ask(
|
||||
f"[cyan]Would you like to submit your {benchmark_type} benchmark results to the community?[/cyan]",
|
||||
default=True
|
||||
)
|
||||
|
||||
if submit:
|
||||
# Collect submission configuration
|
||||
console.print("\n[cyan]Submission Configuration:[/cyan]")
|
||||
|
||||
# Check if user is in community
|
||||
community_data = self._get_community_data()
|
||||
if not community_data:
|
||||
console.print("[yellow]⚠️ You're not in the community yet.[/yellow]")
|
||||
join = Confirm.ask("Would you like to join the community first?", default=True)
|
||||
if join:
|
||||
console.print("\n[cyan]Run: [bold]tito community join[/bold][/cyan]")
|
||||
return
|
||||
|
||||
# Additional submission options
|
||||
include_system_info = Confirm.ask(
|
||||
"Include system information in submission?",
|
||||
default=True
|
||||
)
|
||||
|
||||
anonymous = Confirm.ask(
|
||||
"Submit anonymously?",
|
||||
default=False
|
||||
)
|
||||
|
||||
# Create submission data
|
||||
submission = {
|
||||
"benchmark_type": benchmark_type,
|
||||
"timestamp": results["timestamp"],
|
||||
"metrics": results["metrics"],
|
||||
"include_system_info": include_system_info,
|
||||
"anonymous": anonymous
|
||||
}
|
||||
|
||||
if include_system_info:
|
||||
submission["system_info"] = results.get("system_info", {})
|
||||
|
||||
# Save submission
|
||||
submission_dir = Path(".tito") / "submissions"
|
||||
submission_dir.mkdir(parents=True, exist_ok=True)
|
||||
timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
submission_file = submission_dir / f"{benchmark_type}_submission_{timestamp_str}.json"
|
||||
|
||||
with open(submission_file, 'w') as f:
|
||||
json.dump(submission, f, indent=2)
|
||||
|
||||
console.print(f"\n[green]✅ Submission prepared: {submission_file}[/green]")
|
||||
|
||||
# Stub: Try to submit to website
|
||||
self._submit_to_website(submission)
|
||||
|
||||
config = self._get_config()
|
||||
if not config.get("website", {}).get("enabled", False):
|
||||
console.print("[cyan]💡 To submit: Create a PR with this file or run 'tito community submit'[/cyan]")
|
||||
|
||||
def _get_community_data(self) -> Optional[Dict[str, Any]]:
|
||||
"""Get user's community data if they've joined (project-local)."""
|
||||
community_file = self.config.project_root / ".tinytorch" / "community" / "profile.json"
|
||||
if community_file.exists():
|
||||
try:
|
||||
with open(community_file, 'r') as f:
|
||||
return json.load(f)
|
||||
except Exception:
|
||||
return None
|
||||
return None
|
||||
|
||||
def _get_config(self) -> Dict[str, Any]:
|
||||
"""Get community configuration."""
|
||||
config_file = self.config.project_root / ".tinytorch" / "config.json"
|
||||
default_config = {
|
||||
"website": {
|
||||
"base_url": "https://tinytorch.ai",
|
||||
"community_map_url": "https://tinytorch.ai/community",
|
||||
"api_url": None, # Set when API is available
|
||||
"enabled": False # Set to True when website integration is ready
|
||||
},
|
||||
"local": {
|
||||
"enabled": True, # Always use local storage
|
||||
"auto_sync": False # Auto-sync to website when enabled
|
||||
}
|
||||
}
|
||||
|
||||
if config_file.exists():
|
||||
try:
|
||||
with open(config_file, 'r') as f:
|
||||
user_config = json.load(f)
|
||||
# Merge with defaults
|
||||
default_config.update(user_config)
|
||||
return default_config
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Create default config if it doesn't exist
|
||||
config_file.parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(config_file, 'w') as f:
|
||||
json.dump(default_config, f, indent=2)
|
||||
|
||||
return default_config
|
||||
|
||||
def _submit_to_website(self, submission: Dict[str, Any]) -> None:
|
||||
"""Stub: Submit benchmark results to website (local for now, website integration later)."""
|
||||
config = self._get_config()
|
||||
|
||||
if not config.get("website", {}).get("enabled", False):
|
||||
# Website integration not enabled, just store locally
|
||||
return
|
||||
|
||||
api_url = config.get("website", {}).get("api_url")
|
||||
if api_url:
|
||||
# TODO: Implement API call when website is ready
|
||||
# Example:
|
||||
# import requests
|
||||
# try:
|
||||
# response = requests.post(
|
||||
# f"{api_url}/api/benchmarks/submit",
|
||||
# json=submission,
|
||||
# timeout=30, # 30 second timeout for benchmark submissions
|
||||
# headers={"Content-Type": "application/json"}
|
||||
# )
|
||||
# response.raise_for_status()
|
||||
# self.console.print("[green]✅ Submitted to community leaderboard![/green]")
|
||||
# except requests.Timeout:
|
||||
# self.console.print("[yellow]⚠️ Submission timed out. Saved locally.[/yellow]")
|
||||
# self.console.print("[dim]You can submit later or try again.[/dim]")
|
||||
# except requests.RequestException as e:
|
||||
# self.console.print(f"[yellow]⚠️ Could not submit to website: {e}[/yellow]")
|
||||
# self.console.print("[dim]Your submission is saved locally and can be submitted later.[/dim]")
|
||||
pass
|
||||
|
||||
@@ -44,6 +44,8 @@ from .commands.milestone import MilestoneCommand
|
||||
from .commands.leaderboard import LeaderboardCommand
|
||||
from .commands.olympics import OlympicsCommand
|
||||
from .commands.setup import SetupCommand
|
||||
from .commands.benchmark import BenchmarkCommand
|
||||
from .commands.community import CommunityCommand
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(
|
||||
@@ -76,6 +78,8 @@ class TinyTorchCLI:
|
||||
'milestone': MilestoneCommand,
|
||||
'leaderboard': LeaderboardCommand,
|
||||
'olympics': OlympicsCommand,
|
||||
'benchmark': BenchmarkCommand,
|
||||
'community': CommunityCommand,
|
||||
# Convenience commands
|
||||
'notebooks': NotebooksCommand,
|
||||
'export': ExportCommand,
|
||||
|
||||