# TinyTorch Learning Journey
**From Zero to Transformer: A 20-Module Adventure**

```
┌─────────────────────────────────────────────────────────────────────┐
│                     🎯 YOUR LEARNING DESTINATION                     │
│                                                                     │
│  Start:  "What's a tensor?"                                         │
│     ↓                                                               │
│  Finish: "I built a transformer from scratch using only NumPy!"     │
│                                                                     │
│  🏆 North Star Achievement: Train CNNs on CIFAR-10 to 75%+ accuracy │
└─────────────────────────────────────────────────────────────────────┘
```

## Overview: 4 Phases, 20 Modules, 6 Milestones

**Total Time**: 60-80 hours (3-4 weeks at 20 hrs/week)
**Prerequisites**: Python, NumPy basics, basic linear algebra
**Tools**: Just Python + NumPy + Jupyter notebooks

---

## Phase 1: FOUNDATION (Modules 01-04)
**Goal**: Build the fundamental data structures and operations
**Time**: 10-12 hours | **Difficulty**: ⭐⭐ Beginner-friendly

```
┌──────────┐      ┌──────────────┐      ┌─────────┐      ┌─────────┐
│    01    │─────▶│      02      │─────▶│   03    │─────▶│   04    │
│  Tensor  │      │ Activations  │      │ Layers  │      │ Losses  │
│          │      │              │      │         │      │         │
│ • Shape  │      │ • ReLU       │      │ • Linear│      │ • MSE   │
│ • Data   │      │ • Sigmoid    │      │ • Module│      │ • Cross │
│ • Ops    │      │ • Softmax    │      │ • Params│      │  Entropy│
└──────────┘      └──────────────┘      └─────────┘      └─────────┘
  2-3 hrs            1.5-2 hrs          2-3 hrs          2-3 hrs
   ⭐⭐                 ⭐⭐              ⭐⭐⭐            ⭐⭐⭐
```

### Module Details

**Module 01: Tensor** (2-3 hours, ⭐⭐)
- Build the foundation: n-dimensional arrays with operations
- Implement: shape, reshape, indexing, broadcasting
- Operations: add, multiply, matmul, transpose
- Why it matters: Everything in ML is tensor operations
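
To make these bullets concrete, here is a minimal sketch of the kind of wrapper the module builds toward; the class shape and method names are illustrative, not the module's exact API:

```python
import numpy as np

class Tensor:
    """Thin wrapper over a NumPy array (illustrative, not the module's exact API)."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)

    @property
    def shape(self):
        return self.data.shape

    def reshape(self, *shape):
        return Tensor(self.data.reshape(*shape))

    def __add__(self, other):
        # Broadcasting comes along for free via NumPy.
        return Tensor(self.data + other.data)

    def matmul(self, other):
        return Tensor(self.data @ other.data)

a = Tensor([[1.0, 2.0], [3.0, 4.0]])
b = Tensor([[5.0], [6.0]])
print(a.matmul(b).shape)  # (2, 1)
```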

**Module 02: Activations** (1.5-2 hours, ⭐⭐)
- Add non-linearity: ReLU, Sigmoid, Softmax
- Understand: Why neural networks need activations
- Implement: Forward passes for each activation
- Why it matters: Without activations, networks are just linear algebra
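
All three activations are a few lines of NumPy. A minimal sketch (the module's own signatures may differ):

```python
import numpy as np

def relu(x):
    # Zero out negatives; the workhorse nonlinearity of modern nets.
    return np.maximum(0, x)

def sigmoid(x):
    # Squash to (0, 1); useful for binary outputs.
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    # Subtract the max first for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))  # probabilities summing to 1
```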

**Module 03: Layers** (2-3 hours, ⭐⭐⭐)
- Build neural network components: Linear layers
- Implement: nn.Module system, Parameter class
- Create: Weight initialization, layer composition
- Why it matters: Foundation for all network architectures
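
A bare-bones sketch of the core layer, assuming a simple `weight`/`bias` layout and naive random initialization; the module wraps this in a fuller Module/Parameter system:

```python
import numpy as np

class Linear:
    """y = x @ W + b (sketch; the module covers smarter init schemes too)."""

    def __init__(self, in_features, out_features):
        # Small random init; enough to demonstrate the forward pass.
        self.weight = np.random.randn(in_features, out_features) * 0.01
        self.bias = np.zeros(out_features)

    def forward(self, x):
        return x @ self.weight + self.bias

layer = Linear(4, 2)
out = layer.forward(np.ones((3, 4)))   # batch of 3 → shape (3, 2)
```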

**Module 04: Losses** (2-3 hours, ⭐⭐⭐)
- Measure performance: MSE and CrossEntropy
- Understand: How to quantify model errors
- Implement: Loss calculation and aggregation
- Why it matters: Without loss, we can't train networks
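
Both losses fit in a handful of lines. This sketch includes the max-subtraction stability trick; names are illustrative:

```python
import numpy as np

def mse(pred, target):
    # Mean squared error over all elements.
    return np.mean((pred - target) ** 2)

def cross_entropy(logits, labels):
    # Softmax + negative log-likelihood, with max subtraction for stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(labels)), labels])

logits = np.array([[2.0, 0.5], [0.1, 1.5]])
labels = np.array([0, 1])
print(cross_entropy(logits, labels))
```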

### Milestone Checkpoint 1: 1957 Perceptron
**Unlock After**: Module 04
```
🏆 CHECKPOINT: Train Rosenblatt's Original Perceptron
├─ Dataset: Linearly separable binary classification
├─ Architecture: Single layer, no hidden units
├─ Achievement: First trainable neural network in history!
└─ Test: Can your implementation learn AND/OR logic?
```

---

## Phase 2: TRAINING SYSTEMS (Modules 05-08)
**Goal**: Make your networks learn from data
**Time**: 14-18 hours | **Difficulty**: ⭐⭐⭐ Core ML concepts

```
┌───────────┐      ┌────────────┐      ┌──────────┐      ┌────────────┐
│    05     │─────▶│     06     │─────▶│    07    │─────▶│     08     │
│ Autograd  │      │ Optimizers │      │ Training │      │ DataLoader │
│           │      │            │      │          │      │            │
│ • Graph   │      │ • SGD      │      │ • Loops  │      │ • Batching │
│ • Forward │      │ • Momentum │      │ • Epochs │      │ • Shuffling│
│ • Backward│      │ • Adam     │      │ • Eval   │      │ • Pipeline │
└───────────┘      └────────────┘      └──────────┘      └────────────┘
   3-4 hrs            3-4 hrs            4-5 hrs            3-4 hrs
   ⭐⭐⭐⭐           ⭐⭐⭐⭐           ⭐⭐⭐⭐             ⭐⭐⭐
      │                  │                  │                  │
      └──────────────────┴──────────────────┴──────────────────┘
                   ALL BUILD ON TENSOR (Module 01)
```

### Module Details

**Module 05: Autograd** (3-4 hours, ⭐⭐⭐⭐) **CRITICAL MODULE**
- Implement automatic differentiation: The magic of modern ML
- Build: Computational graph, gradient tracking
- Implement: backward() for all operations
- Why it matters: This IS machine learning - without gradients, no training
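
To show the core idea, here is a toy scalar autograd node with a single operation; the module generalizes this to full tensors, many ops, and arbitrary graphs:

```python
class Value:
    """Toy scalar autograd node (the module does this for full tensors)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            # Chain rule: d(out)/d(self) = other.data, and symmetrically for other.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Seed the output gradient, then walk back through the graph
        # (a simple stack walk; fine for this chain, general DAGs need topo order).
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            node._backward()
            stack.extend(node._parents)

x, y = Value(3.0), Value(4.0)
z = x * y
z.backward()
print(x.grad, y.grad)   # 4.0 3.0
```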

**Module 06: Optimizers** (3-4 hours, ⭐⭐⭐⭐)
- Update weights intelligently: SGD, Momentum, Adam
- Understand: Learning rates, momentum, adaptive methods
- Implement: Parameter updates, state management
- Why it matters: How networks actually improve over time
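
A sketch of SGD with momentum, assuming parameter objects that expose `.data` and `.grad`; the `Param` helper here is hypothetical:

```python
import numpy as np

class SGD:
    """SGD with optional momentum (sketch, not the module's exact API)."""

    def __init__(self, params, lr=0.01, momentum=0.0):
        self.params = list(params)
        self.lr = lr
        self.momentum = momentum
        self.velocity = [np.zeros_like(p.data) for p in self.params]

    def step(self):
        for p, v in zip(self.params, self.velocity):
            # v ← μ·v − lr·grad, then w ← w + v (in-place array updates).
            v *= self.momentum
            v -= self.lr * p.grad
            p.data += v

    def zero_grad(self):
        for p in self.params:
            p.grad = np.zeros_like(p.data)

class Param:
    def __init__(self, data):
        self.data = data
        self.grad = np.zeros_like(data)

w = Param(np.random.randn(3, 3))
opt = SGD([w], lr=0.1, momentum=0.9)
w.grad[:] = 1.0          # pretend backprop filled this in
opt.step()               # w.data moves against the gradient
```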

**Module 07: Training** (4-5 hours, ⭐⭐⭐⭐) **CRITICAL MODULE**
- Complete training loops: The full ML pipeline
- Implement: Epochs, batches, forward/backward passes
- Add: Metrics tracking, model evaluation
- Why it matters: This is where everything comes together
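
Every trainer shares one skeleton. Here it is as a self-contained miniature that fits y = 2x with a single weight, with a hand-computed gradient standing in for Module 05's autograd:

```python
import numpy as np

X = np.random.randn(256, 1)
y = 2.0 * X
w, lr = 0.0, 0.1

for epoch in range(5):
    losses = []
    for start in range(0, len(X), 32):          # mini-batches (Module 08's job)
        xb, yb = X[start:start + 32], y[start:start + 32]
        pred = w * xb                            # forward pass
        loss = np.mean((pred - yb) ** 2)         # MSE (Module 04)
        grad = np.mean(2 * (pred - yb) * xb)     # backward pass (Module 05 automates this)
        w -= lr * grad                           # optimizer step (Module 06)
        losses.append(loss)
    print(f"epoch {epoch}: loss {np.mean(losses):.4f}, w {w:.3f}")   # w → 2.0
```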

**Module 08: DataLoader** (3-4 hours, ⭐⭐⭐)
- Efficient data handling: Batching, shuffling, pipelines
- Implement: Batch creation, data iteration
- Optimize: Memory efficiency, preprocessing
- Why it matters: Real ML needs to handle millions of examples
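
A generator is enough to sketch the batching logic; the module's actual DataLoader adds more, but the core looks like this:

```python
import numpy as np

def data_loader(X, y, batch_size=32, shuffle=True):
    # Yield (inputs, labels) mini-batches; a generator keeps memory use flat.
    idx = np.arange(len(X))
    if shuffle:
        np.random.shuffle(idx)
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

X, y = np.random.randn(100, 4), np.random.randint(0, 2, size=100)
for xb, yb in data_loader(X, y, batch_size=32):
    print(xb.shape)  # (32, 4) for full batches, smaller for the last one
```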

### Milestone Checkpoint 2: 1969 XOR Crisis & Solution
**Unlock After**: Module 07
```
🏆 CHECKPOINT: Solve the Problem That Nearly Killed AI
├─ Dataset: XOR (the "impossible" problem for single-layer networks)
├─ Architecture: Multi-layer perceptron with hidden units
├─ Achievement: Prove Minsky wrong - MLPs can learn XOR!
└─ Test: 100% accuracy on XOR with your backpropagation
```

### Milestone Checkpoint 3: 1986 MLP Revival
**Unlock After**: Module 08
```
🏆 CHECKPOINT: Recognize Handwritten Digits (MNIST)
├─ Dataset: MNIST (60,000 handwritten digits)
├─ Architecture: 2-3 layer MLP with ReLU activations
├─ Achievement: 95%+ accuracy on real computer vision!
└─ Test: Your network recognizes digits you draw yourself
```

---

## Phase 3: ADVANCED ARCHITECTURES (Modules 09-13)
**Goal**: Build modern CV and NLP architectures
**Time**: 20-25 hours | **Difficulty**: ⭐⭐⭐⭐ Advanced concepts

```
┌──────────┐      ┌───────────────┐      ┌─────────────┐
│    09    │─────▶│      10       │─────▶│     11      │
│ Spatial  │      │ Tokenization  │      │ Embeddings  │
│          │      │               │      │             │
│ • Conv2d │      │ • BPE         │      │ • Token Emb │
│ • Pool2d │      │ • Vocab       │      │ • Position  │
│ • CNNs   │      │ • Encoding    │      │ • Learned   │
└──────────┘      └───────────────┘      └─────────────┘
  5-6 hrs            4-5 hrs               3-4 hrs
 ⭐⭐⭐⭐⭐           ⭐⭐⭐⭐              ⭐⭐⭐⭐
     │                  │                     │
     │                  └──────────┬─────────┘
     │                             ▼
     │                          ┌──────────┐      ┌──────────────┐
     │                          │    12    │─────▶│      13      │
     │                          │Attention │      │ Transformers │
     │                          │          │      │              │
     │                          │ • Q,K,V  │      │ • Encoder    │
     │                          │ • Multi  │      │ • Decoder    │
     │                          │   -Head  │      │ • Complete   │
     │                          └──────────┘      └──────────────┘
     │                            4-5 hrs             6-8 hrs
     │                          ⭐⭐⭐⭐⭐           ⭐⭐⭐⭐⭐
     │                               │                   │
     └───────────────────────────────┴───────────────────┘
                  ALL USE AUTOGRAD (Module 05)
```

### Module Details

**Module 09: Spatial Operations** (5-6 hours, ⭐⭐⭐⭐⭐) **CRITICAL MODULE**
- Convolutional Neural Networks: Modern computer vision
- Implement: Conv2d (with 6 nested loops!), MaxPool2d
- Understand: Why CNNs revolutionized image processing
- Why it matters: The foundation of modern computer vision
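
Here is the naive version with all six loops spelled out, as promised above; shapes and names are illustrative, and the module later shows how to speed this up:

```python
import numpy as np

def conv2d(x, kernels):
    """Naive convolution: x is (C_in, H, W), kernels is (C_out, C_in, kH, kW)."""
    c_out, c_in, kh, kw = kernels.shape
    _, h, w = x.shape
    out = np.zeros((c_out, h - kh + 1, w - kw + 1))
    for co in range(c_out):                   # each output channel
        for i in range(out.shape[1]):         # each output row
            for j in range(out.shape[2]):     # each output column
                for ci in range(c_in):        # each input channel
                    for di in range(kh):      # kernel rows
                        for dj in range(kw):  # kernel columns
                            out[co, i, j] += x[ci, i + di, j + dj] * kernels[co, ci, di, dj]
    return out

x = np.random.randn(3, 8, 8)          # RGB-like input
k = np.random.randn(4, 3, 3, 3)       # 4 filters of size 3x3
print(conv2d(x, k).shape)             # (4, 6, 6)
```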

**Module 10: Tokenization** (4-5 hours, ⭐⭐⭐⭐)
- Text preprocessing: From strings to numbers
- Implement: Byte-Pair Encoding (BPE), vocabulary building
- Understand: How transformers see language
- Why it matters: Can't process text without tokenization
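
One BPE merge step fits in a short function. This sketch works on a character list and fuses the most frequent adjacent pair; a real tokenizer repeats this over a large corpus while recording the merges as its vocabulary:

```python
from collections import Counter

def bpe_step(tokens):
    # One BPE merge: find the most frequent adjacent pair, fuse it into one symbol.
    pairs = Counter(zip(tokens, tokens[1:]))
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
            merged.append(a + b)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("low lower lowest")
for _ in range(3):                 # repeat to grow the vocabulary
    tokens = bpe_step(tokens)
print(tokens)                      # frequent substrings like "low" fuse into single tokens
```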

**Module 11: Embeddings** (3-4 hours, ⭐⭐⭐⭐)
- Convert tokens to vectors: Token and positional embeddings
- Implement: Embedding lookup, sinusoidal position encoding
- Understand: How models represent meaning
- Why it matters: Foundation for all language models
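
A sketch of both pieces: a learned lookup table plus sinusoidal position encodings (sizes are illustrative):

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    # Each position gets a unique pattern of sines/cosines at different frequencies.
    pos = np.arange(seq_len)[:, None]
    dim = np.arange(0, d_model, 2)[None, :]
    angle = pos / (10000 ** (dim / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

vocab_size, d_model = 100, 16
embedding_table = np.random.randn(vocab_size, d_model) * 0.02  # learned in training
token_ids = np.array([5, 42, 7])
x = embedding_table[token_ids] + sinusoidal_positions(3, d_model)  # tokens + positions
```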

**Module 12: Attention** (4-5 hours, ⭐⭐⭐⭐⭐) **CRITICAL MODULE**
- The transformer revolution: Multi-head self-attention
- Implement: Q, K, V projections, scaled dot-product attention
- Understand: Why attention changed everything
- Why it matters: The core of GPT, BERT, and all modern LLMs
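
The mechanism itself is compact. A single-head, unmasked sketch of the standard softmax(QK^T / sqrt(d_k))·V formula in NumPy:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # scores: how much each query attends to each key, scaled by sqrt(d_k).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

seq_len, d_k = 4, 8
Q = K = V = np.random.randn(seq_len, d_k)    # self-attention: all three from the same input
out = scaled_dot_product_attention(Q, K, V)  # (4, 8)
```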

**Module 13: Transformers** (6-8 hours, ⭐⭐⭐⭐⭐) **CRITICAL MODULE**
- Complete transformer architecture: GPT-style models
- Implement: Encoder/decoder blocks, layer norm, residuals
- Build: Full transformer from components
- Why it matters: You're building GPT from scratch!
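
A sketch of one block's residual wiring; `attention` and `mlp` are assumed callables from the previous modules, and the pre-norm placement shown is one of two common choices:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's features to zero mean, unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def block(x, attention, mlp):
    # Residual connections around attention and MLP: x + sublayer(norm(x)).
    x = x + attention(layer_norm(x))
    x = x + mlp(layer_norm(x))
    return x

x = np.random.randn(4, 16)                               # (seq_len, d_model)
out = block(x, attention=lambda h: h, mlp=lambda h: h)   # identity stand-ins just to run
```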

### Milestone Checkpoint 4: 1998 CNN Revolution
**Unlock After**: Module 09
```
🏆 CHECKPOINT: CIFAR-10 Image Classification (North Star!)
├─ Dataset: CIFAR-10 (50,000 color images, 10 classes)
├─ Architecture: LeNet-inspired CNN with Conv2d + MaxPool
├─ Achievement: 75%+ accuracy on real-world images!
├─ Test: Classify airplanes, cars, birds, cats, etc.
└─ Impact: This is where your framework becomes REAL
```

### Milestone Checkpoint 5: 2017 Transformer Era
**Unlock After**: Module 13
```
🏆 CHECKPOINT: Build a Language Model
├─ Dataset: Text corpus (Shakespeare, WikiText, etc.)
├─ Architecture: GPT-style decoder with multi-head attention
├─ Achievement: Generate coherent text character-by-character
├─ Test: Your model completes sentences meaningfully
└─ Impact: You've built the architecture behind ChatGPT!
```

---

## Phase 4: PRODUCTION SYSTEMS (Modules 14-20)
**Goal**: Optimize and deploy ML systems at scale
**Time**: 18-22 hours | **Difficulty**: ⭐⭐⭐⭐⭐ Systems engineering

```
┌──────────┐      ┌──────────────┐      ┌──────────────┐
│    14    │─────▶│      15      │─────▶│      16      │
│Profiling │      │ Quantization │      │ Compression  │
│          │      │              │      │              │
│ • Time   │      │ • INT8       │      │ • Pruning    │
│ • Memory │      │ • Calibrate  │      │ • Distill    │
│ • FLOPs  │      │ • Compress   │      │ • Sparse     │
└──────────┘      └──────────────┘      └──────────────┘
  3-4 hrs            5-6 hrs              4-5 hrs
  ⭐⭐⭐⭐          ⭐⭐⭐⭐⭐            ⭐⭐⭐⭐⭐

     ▼                  ▼                   ▼

┌───────────┐      ┌──────────────┐      ┌──────────┐      ┌──────────┐
│    17     │─────▶│      18      │─────▶│    19    │─────▶│    20    │
│Memoization│      │ Acceleration │      │Benchmark │      │ Capstone │
│           │      │              │      │          │      │          │
│ • KV-Cache│      │ • Vectorize  │      │ • Compare│      │ • Full   │
│ • Reuse   │      │ • Hardware   │      │ • Report │      │   System │
│ • Speedup │      │ • Parallel   │      │ • Analyze│      │ • Deploy │
└───────────┘      └──────────────┘      └──────────┘      └──────────┘
   3-4 hrs            3-4 hrs             3-4 hrs            4-6 hrs
  ⭐⭐⭐⭐            ⭐⭐⭐⭐            ⭐⭐⭐⭐           ⭐⭐⭐⭐⭐
```

### Module Details

**Module 14: Profiling** (3-4 hours, ⭐⭐⭐⭐)
- Measure everything: Time, memory, FLOPs
- Implement: Profiling decorators, bottleneck analysis
- Understand: Where computation actually happens
- Why it matters: Can't optimize what you don't measure
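
A timing decorator captures the core idea; this sketch uses `time.perf_counter`, while memory and FLOP accounting need extra machinery the module adds:

```python
import time
from functools import wraps

import numpy as np

def timed(fn):
    # Decorator reporting wall-clock time per call (a sketch of the module's idea).
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        print(f"{fn.__name__}: {(time.perf_counter() - start) * 1000:.2f} ms")
        return result
    return wrapper

@timed
def matmul_demo(n=256):
    a = np.random.randn(n, n)
    return a @ a

matmul_demo()
```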

**Module 15: Quantization** (5-6 hours, ⭐⭐⭐⭐⭐)
- Compress models: Float32 → INT8
- Implement: Quantization, calibration, dequantization
- Achieve: 4× smaller models, faster inference
- Why it matters: Deploy models on edge devices
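
Symmetric linear quantization in miniature; the scale here comes from a simple max calibration, which the module refines with real calibration data:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric linear quantization: map [-max|w|, +max|w|] onto [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).max()
print(q.nbytes / w.nbytes, err)   # 0.25 → the promised 4× smaller, with small error
```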

**Module 16: Compression** (4-5 hours, ⭐⭐⭐⭐⭐)
- Shrink models: Pruning and distillation
- Implement: Weight pruning, knowledge distillation
- Achieve: 10× smaller models with minimal accuracy loss
- Why it matters: Mobile ML and resource-constrained deployment
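
Magnitude pruning is the simplest of these techniques; a sketch:

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    # Zero out the smallest-magnitude weights; keep the top (1 - sparsity) fraction.
    threshold = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) >= threshold
    return w * mask, mask

w = np.random.randn(128, 128)
pruned, mask = magnitude_prune(w, sparsity=0.9)
print(mask.mean())   # ≈ 0.1 of weights survive
```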

**Module 17: Memoization** (3-4 hours, ⭐⭐⭐⭐)
- Cache computations: KV-cache for transformers
- Implement: Memoization decorators, cache management
- Optimize: 10-100× speedup for inference
- Why it matters: How production LLMs run efficiently
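
A sketch of the KV-cache idea: keep the keys and values of already-processed tokens so each decoding step only computes projections for the newest token (shapes illustrative):

```python
import numpy as np

class KVCache:
    """Per-layer KV-cache sketch: append-only storage of past keys and values."""

    def __init__(self):
        self.keys = None
        self.values = None

    def append(self, k_new, v_new):
        # k_new, v_new: (1, d_k) for the single newest token.
        if self.keys is None:
            self.keys, self.values = k_new, v_new
        else:
            self.keys = np.concatenate([self.keys, k_new], axis=0)
            self.values = np.concatenate([self.values, v_new], axis=0)
        return self.keys, self.values

cache = KVCache()
for _ in range(5):                       # five decoding steps
    k, v = np.random.randn(1, 8), np.random.randn(1, 8)
    K, V = cache.append(k, v)            # attention now reuses all cached K, V
print(K.shape)                           # (5, 8)
```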

**Module 18: Acceleration** (3-4 hours, ⭐⭐⭐⭐)
- Hardware optimization: Vectorization, parallelization
- Implement: NumPy tricks, batch processing
- Achieve: 10-100× speedups
- Why it matters: Production systems need speed
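
The same reduction two ways. Exact speedups vary by machine, but the gap between a Python-level loop and one vectorized call is the whole lesson:

```python
import time

import numpy as np

x = np.random.randn(1_000_000)

start = time.perf_counter()
slow = sum(v * v for v in x)             # Python loop: one interpreter hop per element
t_loop = time.perf_counter() - start

start = time.perf_counter()
fast = float(x @ x)                      # vectorized: one call into optimized C/BLAS
t_vec = time.perf_counter() - start

print(f"loop {t_loop:.3f}s vs vectorized {t_vec:.5f}s, ~{t_loop / t_vec:.0f}x")
```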

**Module 19: Benchmarking** (3-4 hours, ⭐⭐⭐⭐)
- Compare implementations: Rigorous performance testing
- Implement: Benchmark suite, statistical analysis
- Report: Scientific measurements
- Why it matters: Engineering decisions need data
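
A sketch of the measurement discipline: warm up, repeat many times, and report spread instead of a single noisy number:

```python
import time

import numpy as np

def benchmark(fn, repeats=30, warmup=3):
    # Warm up (caches, allocator), then report mean ± std over many runs.
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return np.mean(times), np.std(times)

a = np.random.randn(512, 512)
mean, std = benchmark(lambda: a @ a)
print(f"{mean * 1000:.2f} ms ± {std * 1000:.2f} ms")
```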

**Module 20: Capstone** (4-6 hours, ⭐⭐⭐⭐⭐) **FINAL PROJECT**
- Build complete system: End-to-end ML pipeline
- Integrate: All 19 modules into production-ready system
- Deploy: Real application with optimization
- Why it matters: This is your portfolio piece!

### Milestone Checkpoint 6: 2024 Systems Age
**Unlock After**: Module 20
```
🏆 FINAL CHECKPOINT: Production-Optimized ML System
├─ Challenge: Take any milestone and make it production-ready
├─ Requirements:
│   ├─ 10× faster inference (profiling + acceleration)
│   ├─ 4× smaller model (quantization + compression)
│   ├─ <100ms latency (memoization + optimization)
│   └─ Rigorous benchmarks (statistical significance)
├─ Achievement: You're now an ML systems engineer!
└─ Test: Deploy your system, measure everything, compare to PyTorch
```

---

## Dependency Map: How Modules Connect

```
CORE FOUNDATION
├─ Module 01 (Tensor)
│   ├─▶ Module 02 (Activations)
│   ├─▶ Module 03 (Layers)
│   ├─▶ Module 04 (Losses)
│   └─▶ Module 08 (DataLoader)
│
TRAINING ENGINE
├─ Module 05 (Autograd) ← Enhances Module 01
│   ├─▶ Module 06 (Optimizers)
│   └─▶ Module 07 (Training)
│
COMPUTER VISION BRANCH
├─ Module 09 (Spatial) ← Uses 01,02,03,05
│   └─▶ Module 20 (Capstone)
│
NLP BRANCH
├─ Module 10 (Tokenization) ← Uses 01
│   ├─▶ Module 11 (Embeddings)
│   └─▶ Module 12 (Attention) ← Uses 01,03,05,11
│        └─▶ Module 13 (Transformers) ← Uses 02,11,12
│
OPTIMIZATION BRANCH
├─ Module 14 (Profiling) ← Measures any module
│   ├─▶ Module 15 (Quantization) ← Compresses any module
│   ├─▶ Module 16 (Compression) ← Shrinks any module
│   ├─▶ Module 17 (Memoization) ← Optimizes 12,13
│   ├─▶ Module 18 (Acceleration) ← Speeds up any module
│   └─▶ Module 19 (Benchmarking) ← Measures optimizations
│        └─▶ Module 20 (Capstone)
```

---

## Time Estimates by Experience Level

```
┌──────────────────┬──────────┬──────────┬──────────┬──────────┐
│ Experience Level │ Phase 1  │ Phase 2  │ Phase 3  │ Phase 4  │
├──────────────────┼──────────┼──────────┼──────────┼──────────┤
│ Beginner         │ 12-15h   │ 18-22h   │ 25-30h   │ 22-26h   │
│ (New to ML)      │          │          │          │          │
├──────────────────┼──────────┼──────────┼──────────┼──────────┤
│ Intermediate     │ 10-12h   │ 14-18h   │ 20-25h   │ 18-22h   │
│ (Used PyTorch)   │          │          │          │          │
├──────────────────┼──────────┼──────────┼──────────┼──────────┤
│ Advanced         │ 8-10h    │ 12-15h   │ 18-22h   │ 16-20h   │
│ (Built models)   │          │          │          │          │
└──────────────────┴──────────┴──────────┴──────────┴──────────┘

Total Time: 60-80 hours (Intermediate) | 3-4 weeks at 20 hrs/week
```

---

## Difficulty Ratings Explained

```
⭐⭐      │ Beginner-friendly
         │ - Follow clear instructions
         │ - Build intuition for concepts
         │ - ~2 hours per module
         │
⭐⭐⭐    │ Core ML concepts
         │ - Implement fundamental algorithms
         │ - Connect multiple concepts
         │ - ~3 hours per module
         │
⭐⭐⭐⭐   │ Advanced implementation
         │ - Complex algorithms
         │ - Systems thinking required
         │ - ~4 hours per module
         │
⭐⭐⭐⭐⭐  │ Expert-level systems
         │ - Multi-layered complexity
         │ - Production considerations
         │ - ~5-6 hours per module
```

---

## Suggested Learning Paths

### Fast Track (Core ML Only) - 40 hours
Focus on the essentials to build and train networks:
```
01 → 02 → 03 → 04 → 05 → 06 → 07 → 08 → 09
(Tensor through Spatial for CNNs)

Milestones: Perceptron → XOR → MNIST → CIFAR-10
```

### NLP Focus - 55 hours
Core + Language models:
```
01 → 02 → 03 → 04 → 05 → 06 → 07 → 08
                                   ↓
                    10 → 11 → 12 → 13
(Add Tokenization through Transformers)

Milestones: All ML history + Transformer Era
```

### Systems Engineering Path - Full 75 hours
Everything + optimization:
```
Complete all 20 modules
(Tensor → Transformers → Optimization → Capstone)

Milestones: All 6 checkpoints + Production Systems
```

---

## Success Metrics: What "Done" Looks Like

```
✅ Module Complete When:
├─ All unit tests pass (test_unit_* functions)
├─ Module integration test passes (test_module())
├─ You can explain the algorithm to someone else
└─ Code matches PyTorch API (but implemented from scratch)

✅ Phase Complete When:
├─ All modules in phase pass tests
├─ Milestone checkpoint achieved
└─ You understand connections between modules

✅ Course Complete When:
├─ All 20 modules implemented
├─ All 6 milestones achieved
├─ Capstone project deployed
└─ You can confidently say: "I built a transformer from scratch!"
```

---

## Common Questions

**Q: Do I need to complete modules in order?**
A: YES! Each module builds on previous ones. Module 05 (Autograd) enhances Module 01 (Tensor), and Module 12 (Attention) uses Modules 01, 03, 05, and 11. The dependency chain is strict.

**Q: Can I skip modules?**
A: Modules 01-08 are REQUIRED. Modules 09-13 split into CV (09) and NLP (10-13) tracks - you can choose one. Modules 14-20 are optimization - recommended but optional for core understanding.

**Q: How do I know if I'm ready for the next module?**
A: Run `test_module()` - if all tests pass, you're ready! Each module has comprehensive integration tests.

**Q: What if I get stuck?**
A: Each module has reference solutions, detailed scaffolding, and clear error messages. Plus, milestone checkpoints validate your progress.

**Q: How is this different from online courses?**
A: You BUILD everything from scratch. No black boxes. No "just import PyTorch." You implement every line of a production ML framework.

---

## Your Journey Starts Now

```
┌─────────────────────────────────────────────┐
│  📍 YOU ARE HERE                            │
│                                             │
│  Next Step: cd modules/01_tensor/           │
│             jupyter notebook tensor_dev.py  │
│                                             │
│  First Goal: Understand what a tensor is    │
│  First Win: Implement your first matmul     │
│  First Checkpoint: Train a perceptron       │
│                                             │
│  🎯 Final Destination (60-80 hours ahead):  │
│     "I built a transformer from scratch!"   │
└─────────────────────────────────────────────┘
```

**Remember**: Every expert was once a beginner. Every line of PyTorch was written by someone who understood these fundamentals. Now it's your turn.

**Ready to start building?**

```bash
cd modules/01_tensor
jupyter notebook tensor_dev.py
```

Let's build something amazing! 🚀