# Team Onboarding Guide: TinyTorch for Industry
Complete guide for using TinyTorch in industry settings: new hire bootcamps, internal training programs, and debugging workshops.
## 🎯 Overview

TinyTorch's **Model 3: Team Onboarding** addresses industry use cases where ML teams want members to understand PyTorch internals. This guide covers deployment scenarios, training structures, and best practices for industry adoption.
## 🚀 Use Cases

### 1. New Hire Bootcamps (2-3 Week Intensive)

**Goal:** Rapidly onboard new ML engineers to understand framework internals

**Structure:**
- Week 1: Foundation Tier (Modules 01-07)
  - Tensors, autograd, optimizers, training loops
  - Focus: understanding `loss.backward()` mechanics (see the sketch after this list)
- Week 2: Architecture Tier (Modules 08-13)
  - CNNs, transformers, attention mechanisms
  - Focus: production architecture internals
- Week 3: Optimization Tier (Modules 14-19) OR Capstone
  - Profiling, quantization, compression
  - Focus: production optimization techniques
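The Week 1 focus on `loss.backward()` can be made concrete before trainees build their own version. Here is a minimal sketch using plain PyTorch (the reference framework, not TinyTorch's own API) of the mechanics Module 05 reimplements:

```python
import torch

# Tiny computational graph: loss = (w*x + b - target)^2
x = torch.tensor(2.0)
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

loss = (w * x + b - 10.0) ** 2  # forward pass records the graph
loss.backward()                 # reverse-mode sweep fills .grad

print(w.grad)  # dloss/dw = 2*(w*x + b - 10)*x = -12
print(b.grad)  # dloss/db = 2*(w*x + b - 10)   = -6
```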
**Schedule:**
- Full-time: 40 hours/week
- Hands-on coding: 70% of time
- Systems discussions: 30% of time
- Daily standups and code reviews

**Deliverables:**
- Completed modules with passing tests
- Capstone project (optional)
- Technical presentation on framework internals
### 2. Internal Training Programs (Distributed Over Quarters)

**Goal:** Deep understanding of ML systems for existing team members

**Structure:**
- Quarter 1: Foundation (Modules 01-07)
  - Weekly sessions: 2-3 hours
  - Self-paced module completion
  - Monthly group discussions
- Quarter 2: Architecture (Modules 08-13)
  - Weekly sessions: 2-3 hours
  - Architecture deep-dives
  - Production case studies
- Quarter 3: Optimization (Modules 14-19)
  - Weekly sessions: 2-3 hours
  - Performance optimization focus
  - Real production optimization projects

**Benefits:**
- Fits into existing work schedules
- Allows deep learning without an intensive time commitment
- Builds team knowledge gradually
- Enables peer learning
### 3. Debugging Workshops (Focused Modules)

**Goal:** Targeted understanding of specific framework components

**Common Focus Areas:**

#### Autograd Debugging Workshop (Module 05)
- Understanding gradient flow
- Debugging gradient issues
- Computational graph visualization
- Duration: 1-2 days
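For a flavor of the workshop exercises, here is a small gradient-flow report in plain PyTorch (a sketch; TinyTorch's own API may differ, but the inspection pattern carries over):

```python
import torch
import torch.nn as nn

def report_grad_flow(model: nn.Module) -> None:
    """Print per-parameter gradient norms after a backward pass.

    A gradient of None means the parameter never entered the graph;
    a norm near zero suggests vanishing gradients.
    """
    for name, param in model.named_parameters():
        if param.grad is None:
            print(f"{name}: NO GRADIENT (detached from graph?)")
        else:
            print(f"{name}: grad norm = {param.grad.norm().item():.3e}")

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss = model(torch.randn(16, 4)).mean()
loss.backward()
report_grad_flow(model)
```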
#### Attention Mechanism Workshop (Module 12)
- Understanding attention internals
- Debugging attention scaling issues
- Memory optimization for attention
- Duration: 1-2 days
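The scaling issue this workshop targets is visible in a few lines of NumPy. A sketch of scaled dot-product attention; the 1/sqrt(d_k) factor is what keeps the softmax logits from growing with head dimension (without it, attention saturates toward one-hot and gradients shrink):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (batch, seq, seq)
    # numerically stable softmax over the last axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

Q = np.random.randn(2, 5, 64)  # (batch, seq, d_k)
K = np.random.randn(2, 5, 64)
V = np.random.randn(2, 5, 64)
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 5, 64)
```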
#### Optimization Workshop (Modules 14-19)
- Profiling production models
- Quantization and compression
- Performance optimization strategies
- Duration: 2-3 days
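As a taste of the quantization material, here is a minimal symmetric int8 quantizer in NumPy (a sketch of the general technique, not TinyTorch's Module 15 implementation):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(w - q.astype(np.float32) * scale).max()
print(f"float32 -> int8 is 4x smaller; max abs error {err:.4f}")
```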
## 🏗️ Deployment Scenarios

### Scenario 1: Cloud-Based Training (Recommended)

**Setup:** Google Colab or JupyterHub
- Zero local installation
- Consistent environment
- Easy sharing and collaboration
- Best for: large teams, remote workers

**Steps:**
- Clone repository to Colab
- Install dependencies: `pip install -e .`
- Work through modules
- Share notebooks via Colab links
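A first Colab cell for these steps might look like the following (the repository URL matches the Quick Start below; `!` and `%cd` are standard notebook escapes, not TinyTorch-specific):

```
# In a Colab notebook cell: ! runs a shell command, %cd changes the working dir
!git clone https://github.com/mlsysbook/TinyTorch.git
%cd TinyTorch
!pip install -e .
```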
### Scenario 2: Local Development Environment

**Setup:** Local Python environment
- Full control over environment
- Better for debugging
- Offline capability
- Best for: smaller teams, on-site training

**Steps:**
- Clone repository locally
- Set up virtual environment
- Install: `pip install -e .`
- Use JupyterLab for development
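After the environment setup (full commands in the Quick Start below), launching JupyterLab is the only extra step. This assumes JupyterLab is not already pinned in `requirements.txt`:

```bash
pip install jupyterlab   # skip if requirements.txt already includes it
jupyter lab              # serves at http://localhost:8888 by default
```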
### Scenario 3: Hybrid Approach

**Setup:** Colab for learning, local for projects
- Learn in the cloud environment
- Apply locally for projects
- Best for: flexible teams
## 📋 Training Program Templates

### Template 1: 2-Week Intensive Bootcamp

**Week 1: Foundation**
- Day 1-2: Modules 01-02 (Tensor, Activations)
- Day 3-4: Modules 03-04 (Layers, Losses)
- Day 5: Module 05 (Autograd), full-day focus
- Weekend: Review and practice

**Week 2: Architecture + Optimization**
- Day 1-2: Modules 08-09 (DataLoader, CNNs)
- Day 3: Module 12 (Attention)
- Day 4-5: Modules 14-15 (Profiling, Quantization)
- Final: Capstone project presentation
### Template 2: 3-Month Distributed Program

**Month 1: Foundation**
- Week 1: Modules 01-02
- Week 2: Modules 03-04
- Week 3: Module 05 (Autograd)
- Week 4: Modules 06-07 (Optimizers, Training)

**Month 2: Architecture**
- Week 1: Modules 08-09
- Week 2: Modules 10-11
- Week 3: Modules 12-13
- Week 4: Integration project

**Month 3: Optimization**
- Week 1: Modules 14-15
- Week 2: Modules 16-17
- Week 3: Modules 18-19
- Week 4: Capstone optimization project
## 🎓 Learning Outcomes

After completing TinyTorch onboarding, team members will:

- **Understand Framework Internals**
  - How autograd works
  - Memory allocation patterns
  - Optimization trade-offs
- **Debug Production Issues**
  - Gradient flow problems
  - Memory bottlenecks
  - Performance issues
- **Make Informed Decisions**
  - Optimizer selection
  - Architecture choices
  - Deployment strategies
- **Read Production Code**
  - Understand PyTorch source
  - Navigate framework codebases
  - Contribute to ML infrastructure
## 🔧 Integration with Existing Workflows

### Code Review Integration
- Review production code with TinyTorch knowledge
- Identify framework internals in production code
- Suggest optimizations based on systems understanding

### Debugging Integration
- Apply TinyTorch debugging strategies to production issues
- Use systems thinking for troubleshooting
- Profile production models using TinyTorch techniques

### Architecture Design
- Design new models with systems awareness
- Consider memory and performance from the start
- Make informed trade-offs
## 📊 Success Metrics

### Individual Metrics
- Module completion rate
- Test passing rate
- Capstone project quality
- Self-reported confidence increase

### Team Metrics
- Reduced debugging time
- Fewer production incidents
- Improved code review quality
- Better architecture decisions
## 🛠️ Setup for Teams

### Quick Start

```bash
# 1. Clone repository
git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch

# 2. Set up environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt
pip install -e .

# 4. Verify setup
tito system doctor

# 5. Start with Module 01
tito view 01_tensor
```
### Team-Specific Customization
- **Custom datasets:** Replace with company-specific data (see the sketch after this list)
- **Domain modules:** Add modules for specific use cases
- **Integration:** Connect to company ML infrastructure
- **Assessment:** Customize grading for team needs
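For the custom-datasets point above, most map-style data loaders only need `__len__` and `__getitem__`. A hypothetical sketch (the `CompanyLogDataset` name and CSV layout are invented; adapt to the Dataset interface built in Module 08):

```python
import numpy as np

class CompanyLogDataset:
    """Hypothetical example: wrap company CSV logs in the map-style
    (__len__ / __getitem__) interface most DataLoaders expect."""

    def __init__(self, path: str):
        # Assumed layout: last column is the label, the rest are features
        data = np.loadtxt(path, delimiter=",", skiprows=1)
        self.features = data[:, :-1].astype(np.float32)
        self.labels = data[:, -1].astype(np.int64)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]
```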
## 📚 Resources

- Student Quickstart: `docs/STUDENT_QUICKSTART.md`
- Instructor Guide: `INSTRUCTOR.md` (for training leads)
- TA Guide: `TA_GUIDE.md` (for support staff)
- Module Documentation: `modules/*/ABOUT.md`
## 💼 Industry Case Studies

### Case Study 1: ML Infrastructure Team
- **Challenge:** Team members could use PyTorch but couldn't debug framework issues
- **Solution:** 2-week intensive bootcamp focusing on autograd and optimization
- **Result:** 50% reduction in debugging time, better architecture decisions

### Case Study 2: Research Team
- **Challenge:** Researchers needed to understand transformer internals
- **Solution:** Focused workshop on Modules 12-13 (Attention, Transformers)
- **Result:** Improved model designs, better understanding of scaling

### Case Study 3: Production ML Team
- **Challenge:** Team needed optimization skills for deployment
- **Solution:** 3-month program focusing on the Optimization Tier (Modules 14-19)
- **Result:** 4x model compression, 10x speedup on production models
## 🎯 Next Steps

- **Choose a deployment model:** bootcamp, distributed, or workshop
- **Set up the environment:** cloud (Colab) or local
- **Select modules:** full curriculum or a focused selection
- **Schedule training:** intensive or distributed
- **Track progress:** use the checkpoint system or custom metrics
**For questions:** see `INSTRUCTOR.md` or contact the TinyTorch maintainers.