mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-04-28 01:17:32 -05:00

Vijay Janapa Reddi 403d4c2f4c Add .tito/backups and docs/_build to gitignore

2025-11-28 14:59:51 +01:00

👥 For Instructors & TAs

Complete guide for teaching ML Systems Engineering with TinyTorch

📋 Quick Course Assessment

Duration: 14-16 weeks (flexible pacing)
Prerequisites: Python + basic linear algebra
Student Outcome: Complete ML framework supporting vision AND language models
Grading: 70% auto-graded (NBGrader), 30% manual (systems thinking)

For Instructors: Course Setup

30-Minute Initial Setup

Step 1: Environment Setup (10 minutes)

# Clone repository
git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt
pip install nbgrader

# Verify installation
tito system health

Step 2: Initialize Grading (10 minutes)

# Setup NBGrader integration
tito grade setup

# Verify grading commands
tito grade --help

Step 3: Prepare First Assignment (10 minutes)

# Generate instructor version (with solutions)
tito grade generate 01_tensor

# Create student version (solutions removed)
tito grade release 01_tensor

# Student assignments ready in: release/01_tensor/

Assignment Workflow

TinyTorch wraps NBGrader behind simple tito grade commands:

1. Prepare Assignments

# Generate instructor version with solutions
tito grade generate MODULE_NAME

# Create student version (auto-removes solutions)
tito grade release MODULE_NAME

2. Distribute to Students

Option A: GitHub Classroom (recommended)
- Create assignment repository from TinyTorch template
- Students clone and work in their repos
- Automatic submission via GitHub
Option B: Direct Distribution
- Share release/ directory contents
- Students download and submit via LMS

3. Collect Submissions

# Collect all students
tito grade collect MODULE_NAME

# Or specific student
tito grade collect MODULE_NAME --student student_id

4. Auto-Grade

# Grade all submissions
tito grade autograde MODULE_NAME

# Grade specific student
tito grade autograde MODULE_NAME --student student_id

5. Manual Review

# Open browser-based grading interface
tito grade manual MODULE_NAME

6. Generate Feedback

# Create feedback files for students
tito grade feedback MODULE_NAME

7. Export Grades

# Export all grades to CSV
tito grade export

# Or specific module
tito grade export --module MODULE_NAME --output grades.csv

Grading Components

Auto-Graded (70%)

Code implementation correctness
Test passing
Function signatures
Output validation
Edge case handling

Manually Graded (30%)

ML Systems Thinking questions (3 per module)
Each question: 10 points
Focus on understanding, not perfection

Grading Rubric for Systems Thinking Questions

Points	Criteria
9-10	Deep understanding, specific code references, discusses systems implications (memory, scaling, trade-offs)
7-8	Good understanding, some code references, basic systems thinking
5-6	Surface understanding, generic response, limited systems perspective
3-4	Attempted but misses key concepts
0-2	No attempt or completely off-topic

What to Look For:

References to actual implemented code
Memory/performance analysis
Scaling considerations
Production system comparisons (PyTorch, TensorFlow)
Understanding of trade-offs

Sample 16-Week Schedule

Week	Module	Focus	Teaching Notes
1	01 Tensor	Data Structures, Memory	Demo: memory profiling, copying behavior
2	02 Activations	Non-linearity, Stability	Demo: gradient vanishing/exploding
3	03 Layers	Neural Components	Demo: forward/backward passes
4	04 Losses	Optimization Objectives	Demo: loss landscapes
5	05 Autograd	Auto Differentiation	⚠️ Most challenging - allocate extra TA hours
6	06 Optimizers	Training Algorithms	Demo: optimizer comparisons
7	07 Training	Complete Training Loop	Milestone: Train first network!
8	Midterm Project	Build and Train Network	Assessment: End-to-end system
9	08 DataLoader	Data Pipeline	Demo: batching, shuffling
10	09 Spatial	Convolutions, CNNs	⚠️ High demand - O(N²) complexity
11	10 Tokenization	Text Processing	Demo: vocabulary building
12	11 Embeddings	Word Representations	Demo: embedding similarity
13	12 Attention	Attention Mechanisms	⚠️ Moderate-high demand
14	13 Transformers	Transformer Architecture	Milestone: Text generation!
15	14-19 Optimization	Profiling, Quantization	Focus on production trade-offs
16	20 Capstone	Torch Olympics	Final Competition

Critical Modules (Extra TA Support Needed)

Module 05: Autograd - Most conceptually challenging
- Pre-record debugging walkthroughs
- Create FAQ document
- Schedule additional office hours
Module 09: Spatial (CNNs) - Complex nested loops
- Focus on memory profiling
- Loop optimization strategies
- Padding/stride calculations
Module 12: Attention - Attention mechanisms
- Scaling factor importance
- Numerical stability
- Positional encoding issues

Module-Specific Teaching Notes

Module 01: Tensor

Key Concept: Memory layout is crucial for ML performance
Demo: Show memory_footprint(), compare copying vs views
Watch For: Students hardcoding float32 instead of using dtype
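
One way to stage the copy-versus-view demo is sketched below. It uses plain NumPy as a stand-in for the student's Tensor class (an assumption; the module's own `memory_footprint()` is not shown here), and `zeros_like_input` is a hypothetical helper that illustrates respecting the caller's dtype instead of hardcoding float32.

```python
import numpy as np

# NumPy stands in for the student's Tensor class (assumption for this demo).
a = np.zeros((1024, 1024), dtype=np.float32)

view = a[:512]          # basic slicing returns a view: no new buffer
copy = a[:512].copy()   # explicit copy: allocates a second buffer

print(a.nbytes)                    # 1024 * 1024 elements * 4 bytes each
print(np.shares_memory(a, view))   # the view aliases the original buffer
print(np.shares_memory(a, copy))   # the copy does not


def zeros_like_input(x: np.ndarray) -> np.ndarray:
    """Hypothetical helper: allocate with the caller's dtype, not float32."""
    return np.zeros(x.shape, dtype=x.dtype)


half = zeros_like_input(np.ones(4, dtype=np.float16))
print(half.dtype, half.nbytes)     # float16 uses 2 bytes per element
```

Asking students to predict each printed value before running the cell tends to surface misconceptions about aliasing early.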

Module 05: Autograd

Key Concept: Computational graphs enable deep learning
Demo: Visualize computational graphs, show gradient flow
Watch For: Gradient shape mismatches, disconnected graphs
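
For the whiteboard portion of this demo, a scalar-only graph is often enough to show gradient flow. The sketch below is illustrative and is not TinyTorch's autograd API: the `Value` class, its attributes, and the `detached` example are all hypothetical names chosen for the demo.

```python
class Value:
    """Scalar node in a computational graph (illustrative, not TinyTorch's API)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents           # nodes this value was computed from
        self._backward = lambda: None     # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1; accumulate with +=
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topological order guarantees a node's grad is complete before it is used.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(2.0)
y = Value(3.0)
z = x * y + x            # x is used twice, so its gradient accumulates
z.backward()
print(x.grad, y.grad)    # dz/dx = y + 1, dz/dy = x

# A "disconnected graph": built from raw floats, so no parents are recorded
# and calling backward() on it would never reach x or y.
detached = Value(x.data * y.data)
print(detached._parents)
```

The `detached` value mirrors the most common student bug in this module: computing on the underlying data instead of on tracked objects.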

Module 09: Spatial (CNNs)

Key Concept: O(N²) operations become bottlenecks
Demo: Profile convolution memory usage
Watch For: Index out of bounds, missing padding

Module 12: Attention

Key Concept: Attention is compute-intensive but powerful
Demo: Profile attention with different sequence lengths
Watch For: Missing scaling factor (1/√d_k), softmax errors

Module 20: Capstone

Key Concept: Production requires optimization across ALL components
Project: Torch Olympics Competition (4 tracks: Speed, Compression, Accuracy, Efficiency)

Assessment Strategy

Continuous Assessment (70%)

Module completion: 4% each × 16 modules = 64%
Checkpoint achievements: 6%

Projects (30%)

Midterm: Build and train CNN on CIFAR-10 (15%)
Final: Torch Olympics Competition (15%)
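
The weights above can be combined as in the following sketch. The function name and the choice of fractions in [0, 1] as inputs are assumptions made for illustration; the exports produced by `tito` may use a different layout.

```python
def course_grade(module_scores, checkpoint_score, midterm, final):
    """Combine the stated weights into a percentage out of 100.

    All inputs are fractions in [0, 1] (an assumption for this sketch).
    """
    if len(module_scores) != 16:
        raise ValueError("expected 16 module scores (4% each)")
    modules = sum(4.0 * score for score in module_scores)   # up to 64%
    checkpoints = 6.0 * checkpoint_score                     # up to 6%
    projects = 15.0 * midterm + 15.0 * final                 # up to 30%
    return modules + checkpoints + projects


# A student with full marks everywhere except half credit on every module.
print(course_grade([0.5] * 16, 1.0, 1.0, 1.0))
```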

Tracking Student Progress

# Check specific student
tito checkpoint status --student student_id

# Export class progress
tito checkpoint export --output class_progress.csv

# View module completion rates
tito module status --comprehensive

Identify Struggling Students:

Missing checkpoint achievements
Low scores on systems thinking questions
Incomplete module submissions
Late milestone completions

For Teaching Assistants: Student Support

TA Preparation

Develop Deep Familiarity With:

Module 05: Autograd - Most student questions
Module 09: CNNs - Complex implementation
Module 13: Transformers - Advanced concepts

Preparation Process:

Complete all three critical modules yourself
Introduce bugs intentionally
Practice debugging scenarios
Review past student submissions

Common Student Errors

Module 05: Autograd

Error 1: Gradient Shape Mismatches

Symptom: ValueError: shapes don't match for gradient
Cause: Incorrect gradient accumulation
Debug: Check gradient shapes match parameter shapes

Error 2: Disconnected Computational Graph

Symptom: Gradients are None or zero
Cause: Operations not tracked
Debug: Verify requires_grad=True, check graph construction

Error 3: Broadcasting Failures

Symptom: Shape errors during backward pass
Cause: Incorrect handling of broadcasted operations
Debug: Check gradient accumulation for broadcasted dims
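
When walking a student through a broadcasting bug, it can help to show the reduction explicitly: the gradient that reaches an operand must be summed over every axis that broadcasting expanded. The sketch below uses NumPy, and `unbroadcast` is a hypothetical helper name, not a function from the TinyTorch codebase.

```python
import numpy as np


def unbroadcast(grad: np.ndarray, shape: tuple) -> np.ndarray:
    """Reduce an upstream gradient back to the shape of a broadcast operand."""
    # Sum away leading axes that broadcasting prepended.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum (keeping the axis) over dimensions that were size 1 in the operand.
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad


# Forward: (4, 3) + (3,) broadcasts the bias across 4 rows.
b = np.ones((3,))
upstream = np.ones((4, 3))              # gradient of the loss w.r.t. (x + b)
grad_b = unbroadcast(upstream, b.shape)
print(grad_b.shape, grad_b)             # matches b.shape; each entry sums 4 rows

# Operand of shape (4, 1) broadcast across 3 columns.
grad_col = unbroadcast(upstream, (4, 1))
print(grad_col.shape)
```

The check in the Debug line then becomes a one-line assertion: the reduced gradient's shape must equal the parameter's shape.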

Module 09: CNNs (Spatial)

Error 1: Index Out of Bounds

Symptom: IndexError in convolution loops
Cause: Incorrect padding/stride calculations
Debug: Verify output shape calculations
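
Most index errors here trace back to the output-size formula, floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p, and stride s. A small checker like the one below can be handed to students; the function name is hypothetical, and it assumes no dilation.

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial axis (no dilation assumed)."""
    out = (n + 2 * padding - kernel) // stride + 1
    if out <= 0:
        raise ValueError(
            f"kernel {kernel} does not fit input {n} with padding {padding}"
        )
    return out


print(conv_output_size(32, 3))                        # no padding shrinks the map
print(conv_output_size(32, 3, padding=1))             # padding 1 preserves size for k=3
print(conv_output_size(32, 3, stride=2, padding=1))   # stride 2 roughly halves it
```

Loop bounds in the student's convolution should iterate over exactly this many output positions per axis.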

Error 2: Memory Issues

Symptom: Out of memory errors
Cause: Creating unnecessary intermediate arrays
Debug: Profile memory, look for unnecessary copies

Module 13: Transformers

Error 1: Attention Scaling Issues

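Symptom: Attention weights don't sum to 1, or collapse to nearly one-hot rows
Cause: Missing softmax or softmax over the wrong axis (rows don't sum to 1); missing 1/√d_k scaling (softmax saturates)
Debug: Verify softmax runs over the key axis, check scaling factor (1/√d_k)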
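
A reference implementation to compare against can make both checks concrete. The sketch below uses NumPy and single-head, unmasked attention; the function names are illustrative and are not taken from the TinyTorch modules.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtracting the row max keeps exp() from overflowing.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v):
    """Single-head attention without masking (illustrative sketch)."""
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)   # (seq_q, seq_k)
    weights = softmax(scores, axis=-1)                   # normalize over keys
    return weights @ v, weights


rng = np.random.default_rng(0)
q = rng.standard_normal((5, 64))
k = rng.standard_normal((5, 64))
v = rng.standard_normal((5, 64))

out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.shape)
print(np.allclose(weights.sum(axis=-1), 1.0))   # every row should sum to 1
```

The score matrix has one entry per query-key pair, so its size grows quadratically with sequence length, which is worth pointing out when profiling.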

Error 2: Positional Encoding Errors

Symptom: Model doesn't learn positional information
Cause: Incorrect implementation
Debug: Verify sinusoidal patterns
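
To verify the sinusoidal pattern, students can compare their output against a short reference. The sketch below assumes the formulation from the original Transformer paper (base 10000, sine on even dimensions, cosine on odd dimensions); if the module specifies something different, adjust accordingly. The function name is hypothetical.

```python
import numpy as np


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Fixed sinusoidal encoding; base 10000 is assumed, d_model must be even."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for paired sin/cos dimensions")
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even indices 2i
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                     # even dimensions
    pe[:, 1::2] = np.cos(angles)                     # odd dimensions
    return pe


pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
print(pe.shape)
print(pe[0, :4])   # position 0: sin terms are 0, cos terms are 1
```

Quick sanity checks: all values lie in [-1, 1], position 0 alternates 0 and 1, and low dimensions oscillate faster than high ones.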

Debugging Strategy

Guide students with questions, not answers:

"What error message are you seeing?" - Read full traceback
"What did you expect to happen?" - Clarify mental model
"What actually happened?" - Compare expected vs actual
"What have you tried?" - Avoid repeating failed approaches
"Can you test with a simpler case?" - Reduce complexity

Productive vs Unproductive Struggle

Productive Struggle (encourage):

Trying different approaches
Making incremental progress
Understanding error messages
Passing more tests over time

Unproductive Frustration (intervene):

Repeated identical errors
Random code changes
Unable to articulate the problem
No progress after 30+ minutes

Office Hour Patterns

Expected Demand Spikes:

Weeks 5-6 (Module 05: Autograd): Highest demand
- Schedule 2× TA capacity
- Pre-record debugging walkthroughs
- Create FAQ document
Week 10 (Module 09: CNNs): High demand
- Focus on memory profiling
- Loop optimization
- Padding/stride help
Weeks 13-14 (Module 12: Attention, Module 13: Transformers): Moderate-high
- Attention debugging
- Scaling problems
- Architecture questions

Manual Review Focus

Auto-grading covers 70% of the grade; focus manual review on:

Code Quality
- Readability
- Design choices
- Documentation
Edge Case Handling
- Appropriate checks
- Error handling
- Boundary conditions
Systems Thinking
- Memory analysis
- Performance understanding
- Scaling awareness

Teaching Tips

Encourage Exploration - Let students try different approaches
Connect to Production - Reference PyTorch equivalents
Make Systems Visible - Profile memory, analyze complexity together
Build Confidence - Acknowledge progress and validate understanding

Troubleshooting Common Issues

Environment Problems

# Student fix:
tito system health
tito system reset

Module Import Errors

# Rebuild package
tito module complete N

Test Failures

# Detailed test output
tito module test N --verbose

NBGrader Issues

Database Locked

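# Clear and reinitialize (removing gradebook.db discards recorded grades, so back it up first)
cp gradebook.db gradebook.db.bak
rm gradebook.db
tito grade setup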

Missing Submissions

# Check submission directory
ls submitted/*/MODULE/

Additional Resources

Complete Course Structure - Full curriculum overview
Student Getting Started - Send this to students
CLI Documentation - Detailed command reference
Troubleshooting Guide - Common issues and solutions
GitHub Discussions - Community support
Issue Tracker - Report bugs

Contact & Support

Need help?

Open an issue on GitHub
Join the discussions forum
Email: support@mlsysbook.ai (if available)

Contributing:

Sample solutions welcome
Teaching material improvements
Bug fixes and enhancements

✅ You're Ready to Teach!

With NBGrader integration, automated grading, and comprehensive teaching materials, you have everything needed to run a successful ML systems course.
