mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-05-02 07:58:13 -05:00
docs: Major cleanup - 46 → 12 essential docs
MASSIVE DOCUMENTATION CLEANUP:
- Reduced from 46 docs to 12 essential files
- Archived 34 outdated planning and analysis documents

✅ KEPT (Essential for current operations):
- STUDENT_QUICKSTART.md - Student onboarding
- INSTRUCTOR_GUIDE.md - Instructor setup
- cifar10-training-guide.md - North star achievement
- tinytorch-assumptions.md - Complexity framework (NEW)
- tinytorch-textbook-alignment.md - Academic alignment
- NBGrader integration docs (3 files)
- Development standards (3 files)
- docs/README.md - Navigation guide (NEW)

🗑️ ARCHIVED (Completed/outdated planning):
- All optimization-modules-* planning docs
- All milestone-* system docs
- All tutorial-master-plan and analysis docs
- Module reordering and structure analysis
- Agent setup and workflow case studies

RESULT: Clean, focused documentation structure. Only active, current docs remain - easy to find what you need!
docs/archive/2024-cleanup/optimization-modules-tutorial-plan.md (new file, 276 lines)
@@ -0,0 +1,276 @@

# TinyTorch Optimization Modules Tutorial Plan

## Modules 15-20: From Manual Optimization to Automatic Systems

## Overview: The Complete Optimization Journey

Students progress from manual optimization techniques to building intelligent systems that optimize automatically, culminating in a competition where their AutoML systems compete.

```
Manual Optimization (15-18) → Automatic Optimization (19) → Competition (20)
```

---

## Module 15: Acceleration - Speed Optimization

### **Connection from Module 14**
"Your transformer works but generates text slowly. Let's make it 10-100x faster!"

### **What Students Build**
- Transform educational loops into optimized operations
- Cache-friendly blocked algorithms
- NumPy vectorization integration
- Transparent backend dispatch system

### **Key Learning Outcomes**
- Understand why educational loops are slow (cache misses, no vectorization)
- Build blocked matrix multiplication for cache efficiency (see the sketch after this list)
- Learn when to use optimized libraries vs custom code
- Create backend systems for transparent optimization
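
As a concrete illustration of the blocking idea, here is a minimal sketch; the function name and default block size are assumptions for illustration, not the module's actual API:

```python
import numpy as np

def blocked_matmul(A, B, block_size=64):
    """Cache-friendly matmul: work on small tiles that fit in cache.

    Illustrative sketch; block_size would be tuned to the CPU's cache.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, block_size):
        for j in range(0, m, block_size):
            for p in range(0, k, block_size):
                # Each tile product touches a small working set that
                # stays resident in cache across the inner loops.
                C[i:i+block_size, j:j+block_size] += (
                    A[i:i+block_size, p:p+block_size]
                    @ B[p:p+block_size, j:j+block_size]
                )
    return C
```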

### **Module Structure Change**
- **NEW**: Show `OptimizedBackend` class upfront as the goal (a minimal sketch follows this list)
- Students see where they're heading before learning the steps
- "Here's the elegant solution, now let's understand how to build it"

### **Performance Impact**: 10-100x speedup on matrix operations

---

## Module 16: Memory - Memory Optimization

### **Connection from Module 15**
"Operations are faster, but transformers still recompute everything. Let's be smarter with memory!"

### **What Students Build**
- `KVCache` class for transformer attention states (see the sketch after this list)
- Incremental attention computation (process only new tokens)
- Memory profiling and analysis tools
- Cache management strategies
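
As a hedged illustration of the caching idea, a `KVCache` along these lines stores key/value projections and appends one token at a time; the constructor arguments and method names are assumptions, not the module's confirmed interface:

```python
import numpy as np

class KVCache:
    """Cache attention keys/values so each generation step computes
    projections only for the newest token (O(N) work per step, rather
    than recomputing all N tokens, i.e. O(N^2) over a sequence).

    Illustrative sketch; the real module's API may differ.
    """
    def __init__(self, max_seq_len, num_heads, head_dim):
        self.keys = np.zeros((max_seq_len, num_heads, head_dim), dtype=np.float32)
        self.values = np.zeros((max_seq_len, num_heads, head_dim), dtype=np.float32)
        self.length = 0  # number of tokens cached so far

    def append(self, k, v):
        # k, v: (num_heads, head_dim) projections for the new token only
        self.keys[self.length] = k
        self.values[self.length] = v
        self.length += 1

    def get(self):
        # All cached keys/values up to the current position
        return self.keys[:self.length], self.values[:self.length]
```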

### **Key Learning Outcomes**
- Memory vs computation tradeoffs
- Understanding the O(N²) → O(N) optimization for sequences
- Production caching patterns (GPT, LLaMA)
- When caching helps vs hurts performance

### **Performance Impact**: 50x speedup in autoregressive generation

---

## Module 17: Quantization - Precision Optimization

### **Connection from Module 16**
"Memory usage is optimized, but models are still huge. Let's use fewer bits!"

### **What Students Build**
- `Quantizer` class for FP32→INT8 conversion (see the sketch after this list)
- Calibration techniques for maintaining accuracy
- Quantized operations (matmul, conv2d)
- Model size analysis tools
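
A minimal sketch of symmetric post-training quantization, assuming a simple per-tensor scale; the method names and calibration strategy are illustrative assumptions:

```python
import numpy as np

class Quantizer:
    """Symmetric per-tensor FP32 -> INT8 quantization.

    Sketch of the general technique; the module's actual
    calibration may be more sophisticated.
    """
    def calibrate(self, weights):
        # Pick a scale so the largest magnitude maps to 127
        self.scale = np.abs(weights).max() / 127.0
        return self.scale

    def quantize(self, weights):
        q = np.round(weights / self.scale)
        return np.clip(q, -127, 127).astype(np.int8)

    def dequantize(self, q_weights):
        # Approximate recovery of the FP32 values (lossy)
        return q_weights.astype(np.float32) * self.scale

# Usage: quantize a weight matrix and check the round-trip error
w = np.random.randn(256, 256).astype(np.float32)
quant = Quantizer()
quant.calibrate(w)
w_q = quant.quantize(w)
print("max round-trip error:", np.abs(quant.dequantize(w_q) - w).max())
```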

### **Key Learning Outcomes**
- Numerical precision vs accuracy tradeoffs
- Post-training quantization techniques
- Hardware acceleration through reduced precision
- When to use INT8 vs FP16 vs FP32

### **Performance Impact**: 4x model size reduction, 2-4x inference speedup

---

## Module 18: Compression - Structural Optimization

### **Connection from Module 17**
"We're using fewer bits, but can we remove weights entirely?"

### **What Students Build**
- `MagnitudePruner` for weight removal (see the sketch after this list)
- `StructuredPruner` for channel/filter removal
- Basic knowledge distillation
- Sparsity visualization tools
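
A hedged sketch of magnitude-based pruning: zero out the smallest-magnitude weights until a target sparsity is reached. The class name matches the plan above, but the constructor and method signatures are assumptions:

```python
import numpy as np

class MagnitudePruner:
    """Unstructured pruning: zero the smallest |weight| values.

    Sketch of the core idea; the real module likely adds masks,
    fine-tuning hooks, and layer-wise schedules.
    """
    def __init__(self, sparsity=0.9):
        self.sparsity = sparsity  # fraction of weights to remove

    def prune(self, weights):
        # Threshold: magnitude below which weights are dropped
        threshold = np.quantile(np.abs(weights), self.sparsity)
        mask = np.abs(weights) > threshold
        return weights * mask, mask

# Usage: 90% sparsity on a random weight matrix
w = np.random.randn(128, 128).astype(np.float32)
pruned, mask = MagnitudePruner(sparsity=0.9).prune(w)
print(f"actual sparsity: {1 - mask.mean():.2%}")
```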

### **Key Learning Outcomes**
- Structured vs unstructured pruning
- Magnitude-based pruning strategies
- Knowledge distillation basics
- Sparsity patterns and hardware efficiency

### **Performance Impact**: 90% sparsity with <5% accuracy loss

---

## Module 19: AutoTuning - Automatic Optimization

### **Connection from Module 18**
"We have all these optimization techniques. Let's build systems that apply them automatically!"

### **What Students Build**
```python
class AutoTuner:
    def auto_optimize(self, model, constraints):
        """
        Automatically decide:
        - Which optimizations to apply
        - In what order
        - With what parameters
        - For what deployment target
        """
        pass

    def hyperparameter_search(self, model, data, budget):
        """Smart hyperparameter tuning (not random)"""
        pass

    def optimization_pipeline(self, model, target_hardware):
        """Build optimal pipeline for specific hardware"""
        pass

    def adaptive_training(self, model, data):
        """Training that adapts based on progress"""
        pass
```

### **Key Learning Outcomes**
- Automated optimization strategy selection
- Constraint-based optimization (memory, latency, accuracy)
- Hardware-aware optimization pipelines
- Smart search strategies (Bayesian optimization basics)
- Data-efficient training (curriculum learning, active learning)

### **Student Experience**
"I built a system that takes any model and automatically optimizes it for any deployment target!"

### **Scope Balance** (Not Too Complex)
- Focus on **rule-based automation** (if mobile → aggressive quantization); a sketch follows this list
- Simple **grid search** with smart pruning (not full Bayesian optimization)
- Basic **hardware detection** (CPU vs GPU vs Mobile)
- **Pre-built optimization recipes** that students can combine
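
To make the rule-based scope concrete, here is a minimal sketch of how deployment-target rules might pick an optimization recipe; the target names, recipe contents, and thresholds are assumptions for illustration:

```python
# Pre-built recipes keyed by deployment target (illustrative values)
RECIPES = {
    "mobile": {"quantize": "int8", "prune_sparsity": 0.9, "kv_cache": True},
    "server_cpu": {"quantize": "fp16", "prune_sparsity": 0.5, "kv_cache": True},
    "server_gpu": {"quantize": None, "prune_sparsity": 0.0, "kv_cache": True},
}

def pick_recipe(target_hardware, memory_budget_mb):
    """Rule-based selection: no search, just if/else over constraints."""
    recipe = dict(RECIPES.get(target_hardware, RECIPES["server_cpu"]))
    if memory_budget_mb < 100:
        # Tight memory -> be aggressive regardless of target
        recipe["quantize"] = "int8"
        recipe["prune_sparsity"] = max(recipe["prune_sparsity"], 0.9)
    return recipe

print(pick_recipe("mobile", memory_budget_mb=50))
```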

---

## Module 20: Competition - AutoML Olympics

### **Connection from Module 19**
"You've built AutoTuning systems. Time to compete!"

### **What Students Build**
- Complete end-to-end optimized ML systems
- Submission package for competition platform
- Performance analysis reports
- Innovation documentation

### **Competition Categories**
1. **Speed Challenge**: Fastest to reach target accuracy
2. **Size Challenge**: Best accuracy under size constraints
3. **Efficiency Challenge**: Best accuracy/resource tradeoff
4. **Innovation Challenge**: Most creative optimization approach

### **Platform Concept**
```python
class CompetitionSubmission:
    def __init__(self, team_name):
        self.team_name = team_name  # identifies the entry on the leaderboard
        self.model = self.build_model()
        self.auto_tuner = self.build_autotuner()
        self.optimized = self.auto_tuner.optimize(self.model)

    def evaluate(self, test_data):
        """Automated evaluation on hidden test set"""
        return {
            'accuracy': self.measure_accuracy(test_data),
            'latency': self.measure_latency(),
            'memory': self.measure_memory(),
            'model_size': self.measure_size()
        }
```

### **Leaderboard System**
- Real-time rankings across multiple metrics
- Automated testing on standardized hardware
- Public showcase of techniques used
- Innovation bonus for novel approaches

---

## Implementation Timeline

### **Week 1: Foundation**
- Create placeholder directories for modules 16-20
- Restructure Module 15 with OptimizedBackend upfront
- Begin drafting Module 16 (Memory)

### **Week 2: Parallel Development**
- Modules 16-18 developed in parallel by different agents
- PyTorch expert reviews all three simultaneously
- Integration testing between modules

### **Week 3: AutoTuning Development**
- Module 19 development with appropriate scope
- Integration with all previous optimization modules
- Testing of automatic optimization pipelines

### **Week 4: Competition Platform**
- Module 20 competition framework
- Leaderboard system design
- Submission and evaluation pipeline

---

## Directory Structure

```
modules/
├── 15_acceleration/        [EXISTS - needs restructuring]
├── 16_memory/              [TO CREATE]
│   ├── memory_dev.py
│   ├── module.yaml
│   └── README.md
├── 17_quantization/        [TO CREATE]
│   ├── quantization_dev.py
│   ├── module.yaml
│   └── README.md
├── 18_compression/         [EXISTS - needs development]
│   ├── compression_dev.py
│   ├── module.yaml
│   └── README.md
├── 19_autotuning/          [TO CREATE]
│   ├── autotuning_dev.py
│   ├── module.yaml
│   └── README.md
└── 20_competition/         [TO CREATE]
    ├── competition_dev.py
    ├── module.yaml
    └── README.md
```

---

## Success Metrics

### **Educational Success**
- Students understand when/why to apply each optimization
- Can build automated optimization systems
- Understand tradeoffs and constraints
- Ready for production ML engineering roles

### **Technical Success**
- All optimizations integrate seamlessly
- AutoTuner successfully combines techniques
- Competition platform handles submissions
- Measurable performance improvements achieved

### **Engagement Success**
- Students excited about optimization
- Active competition participation
- Innovative approaches developed
- Community sharing of techniques

---

## Next Steps

1. **Get PyTorch expert validation** on AutoTuning scope
2. **Create placeholder directories** for new modules
3. **Begin parallel development** of modules 16-18
4. **Design competition platform** architecture
5. **Update master roadmap** with final structure