diff --git a/site/_toc.yml b/site/_toc.yml
index 293fa3c2..8ab45130 100644
--- a/site/_toc.yml
+++ b/site/_toc.yml
@@ -17,8 +17,12 @@ parts:
chapters:
- file: tito-essentials
title: "Essential Commands"
+ - file: student-workflow
+ title: "Student Workflow"
- file: learning-progress
title: "Track Your Progress"
+ - file: datasets
+ title: "Datasets Guide"
- caption: 🧭 Course Orientation
chapters:
@@ -26,8 +30,12 @@ parts:
title: "Course Structure"
- file: chapters/learning-journey
title: "Learning Journey"
+ - file: learning-journey-visual
+ title: "Visual Learning Map"
- file: chapters/milestones
title: "Historical Milestones"
+ - file: faq
+ title: "FAQ"
- caption: 🏗️ Foundation Tier (01-07)
chapters:
diff --git a/site/learning-journey-visual.md b/site/learning-journey-visual.md
new file mode 100644
index 00000000..126be2b3
--- /dev/null
+++ b/site/learning-journey-visual.md
@@ -0,0 +1,491 @@
+# Visual Learning Journey
+
+*The TinyTorch Learning Journey: a visual roadmap from tensors to transformers*
+
+**Purpose**: Visualize the learning progression, module dependencies, and milestone achievements in TinyTorch.
+
+---
+
+## The Complete Learning Flow
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e3f2fd','primaryTextColor':'#1976d2','primaryBorderColor':'#2196f3','lineColor':'#2196f3','secondaryColor':'#fff3e0','tertiaryColor':'#f3e5f5'}}}%%
+
+flowchart TB
+ Start([Start: Setup Environment]) --> M01[Module 01: Tensor]
+
+ subgraph Foundation["🏗️ Foundation Tier (Modules 01-07)"]
+ M01 --> M02[Module 02: Activations]
+ M02 --> M03[Module 03: Layers]
+ M03 --> M04[Module 04: Losses]
+ M04 --> M05[Module 05: Autograd]
+ M05 --> M06[Module 06: Optimizers]
+ M06 --> M07[Module 07: Training]
+ end
+
+ M07 --> MS01{{"🏆 M01: 1957 Perceptron"}}
+ M07 --> MS02{{"🏆 M02: 1969 XOR"}}
+
+ MS02 --> M08[Module 08: DataLoader]
+
+ M08 --> MS03{{"🏆 M03: 1986 MLP<br/>95%+ MNIST"}}
+
+ subgraph Architecture["🏛️ Architecture Tier (Modules 08-13)"]
+ M08 --> M09[Module 09: Spatial/CNNs]
+ M08 --> M10[Module 10: Tokenization]
+
+ M09 --> MS04{{"🏆 M04: 1998 CNN<br/>75%+ CIFAR-10"}}
+
+ M10 --> M11[Module 11: Embeddings]
+ M11 --> M12[Module 12: Attention]
+ M12 --> M13[Module 13: Transformers]
+ end
+
+ M13 --> MS05{{"🏆 M05: 2017 Transformers<br/>Text Generation"}}
+
+ subgraph Optimization["⚡ Optimization Tier (Modules 14-20)"]
+ MS05 --> M14[Module 14: Profiling]
+ M14 --> M15[Module 15: Quantization]
+ M14 --> M16[Module 16: Compression]
+ M14 --> M17[Module 17: Memoization]
+ M15 --> M18[Module 18: Acceleration]
+ M16 --> M18
+ M17 --> M18
+ M18 --> M19[Module 19: Benchmarking]
+ M19 --> M20[Module 20: Competition]
+ end
+
+ M20 --> MS06{{"🏆 M06: 2024 MLPerf<br/>Production System"}}
+ MS06 --> Complete([🎓 Complete!<br/>ML Systems Engineer])
+
+ style M01 fill:#e3f2fd,stroke:#2196f3,stroke-width:2px
+ style M05 fill:#fff3e0,stroke:#f57c00,stroke-width:3px
+ style M07 fill:#fff3e0,stroke:#f57c00,stroke-width:3px
+ style M09 fill:#f3e5f5,stroke:#9c27b0,stroke-width:2px
+ style M13 fill:#f3e5f5,stroke:#9c27b0,stroke-width:3px
+ style M20 fill:#fce4ec,stroke:#e91e63,stroke-width:3px
+ style MS01 fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style MS02 fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style MS03 fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style MS04 fill:#fff9c4,stroke:#fbc02d,stroke-width:3px
+ style MS05 fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style MS06 fill:#ffccbc,stroke:#ff5722,stroke-width:3px
+ style Complete fill:#b2dfdb,stroke:#009688,stroke-width:4px
+```
+
+**Legend:**
+- 🟦 Blue: Foundation modules
+- 🟧 Orange highlights: Critical modules (Autograd, Training)
+- 🟪 Purple: Advanced architecture modules
+- 🟩 Green: Milestone achievements
+- 🟨 Yellow: North Star milestone (CIFAR-10)
+- 🟥 Red: Capstone
+
+---
+
+## Module Dependencies
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e8f5e9','primaryTextColor':'#2e7d32','primaryBorderColor':'#4caf50'}}}%%
+
+graph LR
+ subgraph Core["Core Foundation"]
+ T[01 Tensor] --> A[02 Activations]
+ T --> L[03 Layers]
+ T --> Lo[04 Losses]
+ T --> D[08 DataLoader]
+ end
+
+ subgraph Training["Training Engine"]
+ T -.enhances.-> AG[05 Autograd]
+ AG --> O[06 Optimizers]
+ L --> O
+ O --> TR[07 Training]
+ Lo --> TR
+ end
+
+ subgraph Vision["Computer Vision"]
+ T --> S[09 Spatial]
+ A --> S
+ L --> S
+ AG --> S
+ end
+
+ subgraph Language["NLP Pipeline"]
+ T --> TK[10 Tokenization]
+ TK --> E[11 Embeddings]
+ T --> E
+ E --> AT[12 Attention]
+ L --> AT
+ AG --> AT
+ AT --> TF[13 Transformers]
+ A --> TF
+ E --> TF
+ end
+
+ subgraph Opt["Optimization"]
+ P[14 Profiling] --> Q[15 Quantization]
+ P --> C[16 Compression]
+ P --> M[17 Memoization]
+ Q --> AC[18 Acceleration]
+ C --> AC
+ M --> AC
+ AC --> B[19 Benchmarking]
+ B --> CP[20 Competition]
+ end
+
+ TR --> S
+ TR --> TF
+ S -.optimized by.-> Opt
+ TF -.optimized by.-> Opt
+
+ style T fill:#ffeb3b,stroke:#f57c00,stroke-width:4px
+ style AG fill:#ff9800,stroke:#e65100,stroke-width:4px
+ style TR fill:#ff9800,stroke:#e65100,stroke-width:4px
+ style S fill:#9c27b0,stroke:#4a148c,stroke-width:3px
+ style TF fill:#9c27b0,stroke:#4a148c,stroke-width:3px
+ style CP fill:#f44336,stroke:#b71c1c,stroke-width:3px
+```
+
+**Key Dependencies:**
+- **Tensor (Module 01)**: Foundation for everything - all modules depend on it
+- **Autograd (Module 05)**: Enhances Tensor, enables all learning
+- **Training (Module 07)**: Orchestrates the complete learning pipeline
+- **Vision & Language**: Parallel tracks that converge at optimization
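+
The "enhances" edge from Tensor to Autograd is the heart of the course: once every operation records how its output was produced, gradients flow backward automatically. Here is a minimal scalar sketch of that idea; `Value` and its methods are illustrative only, not TinyTorch's actual Module 05 API.

```python
# Minimal scalar autograd sketch. Each Value remembers which Values
# produced it and how to push its gradient back to them; backward()
# replays those closures in reverse topological order (the chain rule).
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backprop = lambda: None  # set by the op that created this node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backprop():
            self.grad += out.grad
            other.grad += out.grad
        out._backprop = backprop
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backprop():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backprop = backprop
        return out

    def backward(self):
        # Topological order guarantees each node's grad is complete
        # before it is pushed to its parents.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backprop()

# z = x*y + x, so dz/dx = y + 1 and dz/dy = x
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # prints: 4.0 2.0
```

This is why Autograd sits on the critical path: Optimizers only need `.grad` fields to exist, and Training only needs `backward()` to fill them.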
+
+---
+
+## Three-Tier Structure
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'fontSize':'16px'}}}%%
+
+timeline
+ title TinyTorch Three-Tier Learning Journey
+
+ section 🏗️ Foundation
+ Module 01 : Tensor
+ Module 02 : Activations
+ Module 03 : Layers
+ Module 04 : Losses
+ Module 05 : Autograd
+ Module 06 : Optimizers
+ Module 07 : Training
+
+ section 🏛️ Architecture
+ Module 08 : DataLoader
+ Module 09 : Spatial (CNNs)
+ Module 10 : Tokenization
+ Module 11 : Embeddings
+ Module 12 : Attention
+ Module 13 : Transformers
+
+ section ⚡ Optimization
+ Module 14 : Profiling
+ Module 15 : Quantization
+ Module 16 : Compression
+ Module 17 : Memoization
+ Module 18 : Acceleration
+ Module 19 : Benchmarking
+ Module 20 : Competition
+```
+
+---
+
+## Historical Milestones Timeline
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#fff9c4','primaryTextColor':'#f57c00','primaryBorderColor':'#fbc02d'}}}%%
+
+gantt
+ title ML History Recreation Journey
+ dateFormat YYYY
+ axisFormat %Y
+
+ section Milestones
+ M01 1957 Perceptron :milestone, 1957, 0d
+ M02 1969 XOR Solution :milestone, 1969, 0d
+ M03 1986 MLP Revival :milestone, 1986, 0d
+ M04 1998 CNN Revolution :milestone, 1998, 0d
+ M05 2017 Transformer Era :milestone, 2017, 0d
+ M06 2024 Systems Age :milestone, 2024, 0d
+
+ section Your Progress
+ Foundation (M01-07) :active, 1957, 1969
+ Architecture (M08-13) :1969, 2017
+ Optimization (M14-20) :2017, 2024
+```
+
+**Journey Through ML History**: As you complete modules, you unlock milestones that recreate 67 years of machine learning breakthroughs using YOUR implementations.
+
+---
+
+## Student Learning Paths
+
+```{mermaid}
+%%{init: {'theme':'base'}}%%
+
+flowchart TD
+ Start([Choose Your Path]) --> Decision{Learning Goal?}
+
+ Decision -->|"Fast: Understand ML"| Fast["🚀 Fast Track<br/>(6-8 weeks)<br/>Modules 01-09"]
+ Decision -->|"Deep: Build Everything"| Complete["🎯 Complete Builder<br/>(14-18 weeks)<br/>All 20 Modules"]
+ Decision -->|"Focus: Specific Skills"| Focused["🔍 Focused Explorer<br/>(8-12 weeks)<br/>Choose Tiers"]
+
+ Fast --> F1[Foundation<br/>01-07]
+ F1 --> F2[DataLoader<br/>08]
+ F2 --> F3[Spatial/CNNs<br/>09]
+ F3 --> FResult["✅ Can build & train<br/>neural networks<br/>75%+ CIFAR-10"]
+
+ Complete --> C1[Foundation<br/>01-07]
+ C1 --> C2[Architecture<br/>08-13]
+ C2 --> C3[Optimization<br/>14-20]
+ C3 --> CResult["🏆 ML Systems<br/>Engineer<br/>Production-ready"]
+
+ Focused --> Choice{Focus Area?}
+ Choice -->|Vision| FV[Foundation +<br/>Spatial 09]
+ Choice -->|Language| FL[Foundation +<br/>NLP 10-13]
+ Choice -->|Production| FO[Foundation +<br/>Optimization 14-20]
+
+ FV --> FVResult["✅ Computer<br/>Vision Expert"]
+ FL --> FLResult["✅ NLP/LLM<br/>Specialist"]
+ FO --> FOResult["✅ ML Optimization<br/>Engineer"]
+
+ style Fast fill:#e3f2fd,stroke:#2196f3
+ style Complete fill:#f3e5f5,stroke:#9c27b0
+ style Focused fill:#fff3e0,stroke:#f57c00
+ style FResult fill:#c8e6c9,stroke:#4caf50
+ style CResult fill:#fff9c4,stroke:#fbc02d,stroke-width:3px
+ style FVResult fill:#c8e6c9,stroke:#4caf50
+ style FLResult fill:#c8e6c9,stroke:#4caf50
+ style FOResult fill:#c8e6c9,stroke:#4caf50
+```
+
+---
+
+## Capability Progression
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e1f5fe','primaryTextColor':'#01579b'}}}%%
+
+graph TB
+ subgraph L1["Level 1: Foundation"]
+ C1["Can create tensors<br/>and perform operations"]
+ C2["Can build neural<br/>network layers"]
+ C3["Can implement<br/>backpropagation"]
+ C1 --> C2 --> C3
+ end
+
+ subgraph L2["Level 2: Training"]
+ C4["Can train networks<br/>on datasets"]
+ C5["Can achieve 95%+<br/>on MNIST"]
+ C3 --> C4 --> C5
+ end
+
+ subgraph L3["Level 3: Architectures"]
+ C6["Can build CNNs<br/>for vision"]
+ C7["Can build transformers<br/>for language"]
+ C8["Can achieve 75%+<br/>on CIFAR-10"]
+ C5 --> C6 --> C8
+ C5 --> C7
+ end
+
+ subgraph L4["Level 4: Production"]
+ C9["Can profile and<br/>optimize models"]
+ C10["Can compress 4×<br/>and speedup 10×"]
+ C11["Can deploy production<br/>ML systems"]
+ C8 --> C9
+ C7 --> C9
+ C9 --> C10 --> C11
+ end
+
+ C11 --> Master["🎓 ML Systems<br/>Mastery"]
+
+ style C1 fill:#e3f2fd
+ style C3 fill:#fff3e0
+ style C5 fill:#f3e5f5
+ style C8 fill:#fff9c4,stroke:#fbc02d,stroke-width:3px
+ style C11 fill:#ffccbc
+ style Master fill:#c8e6c9,stroke:#4caf50,stroke-width:4px
+```
+
+**Each level builds concrete, measurable capabilities** - not just "completed a module" but "can build production CNNs achieving 75%+ accuracy."
+
+---
+
+## Workflow Cycle
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#f0f4c3'}}}%%
+
+graph LR
+ Edit["📝 Edit Modules<br/>modules/source/XX_name/"] --> Export["⚙️ Export to Package<br/>tito module complete XX"]
+ Export --> Validate["✅ Validate with Milestones<br/>milestones/0X_*/script.py"]
+ Validate --> Check{Tests Pass?}
+ Check -->|Yes| Next["➡️ Next Module"]
+ Check -->|No| Debug["🔍 Debug & Fix"]
+ Debug --> Edit
+ Next --> Edit
+
+ Validate -.optional.-> Progress["📊 Track Progress<br/>tito checkpoint status"]
+
+ style Edit fill:#e3f2fd,stroke:#2196f3,stroke-width:2px
+ style Export fill:#fff3e0,stroke:#f57c00,stroke-width:2px
+ style Validate fill:#f3e5f5,stroke:#9c27b0,stroke-width:2px
+ style Next fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style Debug fill:#ffcdd2,stroke:#f44336,stroke-width:2px
+ style Progress fill:#f5f5f5,stroke:#9e9e9e,stroke-width:1px,stroke-dasharray: 5 5
+```
+
+**The essential three-step cycle**: Edit → Export → Validate
+
+**📖 See [Student Workflow](student-workflow.html)** for detailed workflow guide.
+
+---
+
+## Dataset Strategy
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e8eaf6'}}}%%
+
+flowchart TB
+ Start([Start Learning]) --> Ship["📦 Shipped Datasets<br/>(~350 KB in repo)"]
+
+ Ship --> TD["TinyDigits<br/>1,200 samples<br/>8×8 images<br/>310 KB"]
+ Ship --> TT["TinyTalks<br/>350 Q&A pairs<br/>Character-level<br/>40 KB"]
+
+ TD --> M03["Milestone 03<br/>MLP on TinyDigits<br/>⚡ Fast iteration"]
+ TT --> M05["Milestone 05<br/>Transformers on TinyTalks<br/>⚡ Instant training"]
+
+ M03 --> Scale{Scale Up?}
+ M05 --> Scale
+
+ Scale -->|Yes| Download["⬇️ Downloaded Datasets<br/>(Auto-download when needed)"]
+
+ Download --> MNIST["MNIST<br/>70K samples<br/>28×28 images<br/>10 MB"]
+ Download --> CIFAR["CIFAR-10<br/>60K samples<br/>32×32 RGB<br/>170 MB"]
+
+ MNIST --> M03B["Milestone 03<br/>MLP on MNIST<br/>🎯 95%+ accuracy"]
+ CIFAR --> M04["Milestone 04<br/>CNN on CIFAR-10<br/>🏆 75%+ accuracy"]
+
+ style Ship fill:#c8e6c9,stroke:#4caf50,stroke-width:2px
+ style TD fill:#e3f2fd,stroke:#2196f3
+ style TT fill:#e3f2fd,stroke:#2196f3
+ style Download fill:#fff3e0,stroke:#f57c00,stroke-width:2px
+ style MNIST fill:#f3e5f5,stroke:#9c27b0
+ style CIFAR fill:#fff9c4,stroke:#fbc02d,stroke-width:3px
+ style M04 fill:#ffccbc,stroke:#ff5722,stroke-width:3px
+```
+
+**Strategy**: Start small (shipped datasets), iterate fast, then validate on benchmarks (downloaded datasets).
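+
The same strategy in code: debug the training loop on a tiny dataset, then rerun the identical code at benchmark scale. This sketch uses random synthetic data with a planted rule in place of the real datasets (only the shapes match), and `train_softmax` is an illustrative stand-in, not TinyTorch's API:

```python
import numpy as np

def train_softmax(images, labels, epochs=5, lr=0.5):
    """One-layer softmax classifier: just enough model to exercise a data pipeline."""
    n, d = images.shape
    k = int(labels.max()) + 1
    W = np.zeros((d, k))
    onehot = np.eye(k)[labels]
    for _ in range(epochs):
        logits = images @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * images.T @ (p - onehot) / n  # cross-entropy gradient step
    return ((images @ W).argmax(axis=1) == labels).mean()

rng = np.random.default_rng(0)

# Stage 1: TinyDigits-shaped data (1,200 samples, 8x8 = 64 features): near-instant runs.
tiny_x = rng.normal(size=(1200, 64))
tiny_y = (tiny_x[:, 0] > 0).astype(int)
print("tiny accuracy:", train_softmax(tiny_x, tiny_y))

# Stage 2: the same code on MNIST-test-shaped data (10K samples, 28x28 = 784 features),
# run only once stage 1 looks right.
big_x = rng.normal(size=(10000, 784))
big_y = (big_x[:, 0] > 0).astype(int)
print("big accuracy:", train_softmax(big_x, big_y))
```

Because the training function never changes between stages, every bug you find on the tiny set is one you never pay benchmark-scale time to discover.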
+
+**📖 See [Datasets Guide](datasets.html)** for complete dataset documentation.
+
+---
+
+## Success Metrics
+
+```{mermaid}
+%%{init: {'theme':'base'}}%%
+
+mindmap
+ root((TinyTorch
Success))
+ Technical Skills
+ Build tensors from scratch
+ Implement autograd engine
+ Train real neural networks
+ Achieve 75%+ CIFAR-10
+ Optimize for production
+
+ Understanding
+ Know how PyTorch works internally
+ Understand gradient flow
+ Debug ML issues from first principles
+ Profile and optimize bottlenecks
+
+ Career Impact
+ ML Systems Engineer role-ready
+ Can implement novel architectures
+ Production deployment skills
+ Portfolio project (capstone)
+
+ Milestones Achieved
+ 6 historical ML breakthroughs
+ Recreated 67 years of ML history
+ 95%+ MNIST accuracy
+ 75%+ CIFAR-10 accuracy
+```
+
+---
+
+## Time Investment vs. Outcomes
+
+```{mermaid}
+%%{init: {'theme':'base'}}%%
+
+quadrantChart
+ title Learning Paths: Time vs. Depth
+ x-axis "Time Investment (weeks)"
+ y-axis "ML Systems Mastery"
+ quadrant-1 "Complete Mastery"
+ quadrant-2 "Deep Understanding"
+ quadrant-3 "Quick Learning"
+ quadrant-4 "Focused Skills"
+
+ "Fast Track (6-8w)": [0.35, 0.5]
+ "Focused Vision (8w)": [0.45, 0.6]
+ "Focused NLP (10w)": [0.55, 0.65]
+ "Complete Builder (14-18w)": [0.85, 0.95]
+ "Foundation Only (4w)": [0.25, 0.35]
+```
+
+**Quadrants:**
+- **Bottom-left (Quick Learning)**: Foundation tier - understand basics in 4 weeks
+- **Top-left (Deep Understanding)**: Fast track - build & train networks in 6-8 weeks
+- **Bottom-right (Focused Skills)**: Specialized paths - vision or NLP focus
+- **Top-right (Complete Mastery)**: Full course - ML systems engineer in 14-18 weeks
+
+---
+
+## Module Difficulty Progression
+
+```{mermaid}
+%%{init: {'theme':'base', 'themeVariables': { 'xyChart': {'backgroundColor': 'transparent'}}}}%%
+xychart-beta
+ title "Difficulty Curve Across 20 Modules"
+ x-axis [M01, M02, M03, M04, M05, M06, M07, M08, M09, M10, M11, M12, M13, M14, M15, M16, M17, M18, M19, M20]
+ y-axis "Difficulty (1-5 stars)" 0 --> 5
+ line [2, 2, 3, 3, 4, 4, 4, 3, 5, 4, 4, 5, 5, 4, 5, 5, 4, 4, 4, 5]
+```
+
+**Key observations:**
+- **Gentle start**: Modules 01-02 are beginner-friendly
+- **First challenge**: Module 05 (Autograd) - the critical breakthrough
+- **Sustained difficulty**: Modules 09, 12, 13, 15-16 are advanced (⭐⭐⭐⭐⭐)
+- **Capstone peak**: Module 20 integrates everything
+
+---
+
+## Ready to Start?
+
+Head to **[Essential Commands](tito-essentials.html)** to set up your environment, then begin with Module 01: Tensor.
+
+---
+
+## Related Pages
+
+- **📖 [Introduction](intro.html)** - What is TinyTorch and why build from scratch
+- **📖 [Student Workflow](student-workflow.html)** - The essential edit → export → validate cycle
+- **📖 [Three-Tier Structure](chapters/00-introduction.html)** - Detailed tier breakdown
+- **📖 [Milestones](chapters/milestones.html)** - Journey through ML history
+- **📖 [FAQ](faq.html)** - Common questions answered