Mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-04-29 00:59:07 -05:00)
docs(tinytorch): update TOC and tier documentation for module reordering

- Update _toc.yml: Foundation (01-08), Architecture (09-13)
- Update _toc_pdf.yml: same tier ranges
- Update foundation.md: add DataLoader as module 05, renumber autograd/optimizers/training
- Update architecture.md: remove DataLoader, start with Convolutions at 09
- Update all Mermaid diagrams and tier references
_toc.yml:

@@ -17,7 +17,7 @@ parts:
         title: "Quick Start"
 
   # Foundation Tier - Collapsible section
-  - caption: 🏗 Foundation Tier (01-07)
+  - caption: 🏗 Foundation Tier (01-08)
     chapters:
       - file: tiers/foundation
         title: "📖 Tier Overview"
@@ -29,20 +29,20 @@ parts:
         title: "03. Layers"
       - file: modules/04_losses_ABOUT
         title: "04. Losses"
-      - file: modules/05_autograd_ABOUT
-        title: "05. Autograd"
-      - file: modules/06_optimizers_ABOUT
-        title: "06. Optimizers"
-      - file: modules/07_training_ABOUT
-        title: "07. Training"
+      - file: modules/05_dataloader_ABOUT
+        title: "05. DataLoader"
+      - file: modules/06_autograd_ABOUT
+        title: "06. Autograd"
+      - file: modules/07_optimizers_ABOUT
+        title: "07. Optimizers"
+      - file: modules/08_training_ABOUT
+        title: "08. Training"
 
   # Architecture Tier - Collapsible section
-  - caption: 🏛️ Architecture Tier (08-13)
+  - caption: 🏛️ Architecture Tier (09-13)
     chapters:
       - file: tiers/architecture
         title: "📖 Tier Overview"
-      - file: modules/08_dataloader_ABOUT
-        title: "08. DataLoader"
       - file: modules/09_convolutions_ABOUT
         title: "09. Convolutions"
       - file: modules/10_tokenization_ABOUT
_toc_pdf.yml:

@@ -19,7 +19,7 @@ parts:
         title: "Quick Start"
 
   # Foundation Tier
-  - caption: Foundation (Modules 01-07)
+  - caption: Foundation (Modules 01-08)
     numbered: true
     chapters:
       - file: modules/01_tensor_ABOUT
@@ -30,19 +30,19 @@ parts:
         title: "03. Layers"
       - file: modules/04_losses_ABOUT
         title: "04. Losses"
-      - file: modules/05_autograd_ABOUT
-        title: "05. Autograd"
-      - file: modules/06_optimizers_ABOUT
-        title: "06. Optimizers"
-      - file: modules/07_training_ABOUT
-        title: "07. Training"
+      - file: modules/05_dataloader_ABOUT
+        title: "05. DataLoader"
+      - file: modules/06_autograd_ABOUT
+        title: "06. Autograd"
+      - file: modules/07_optimizers_ABOUT
+        title: "07. Optimizers"
+      - file: modules/08_training_ABOUT
+        title: "08. Training"
 
   # Architecture Tier
-  - caption: Architecture (Modules 08-13)
+  - caption: Architecture (Modules 09-13)
     numbered: true
     chapters:
-      - file: modules/08_dataloader_ABOUT
-        title: "08. DataLoader"
       - file: modules/09_convolutions_ABOUT
         title: "09. Convolutions"
      - file: modules/10_tokenization_ABOUT
architecture.md:

@@ -1,14 +1,13 @@
-# Architecture Tier (Modules 08-13)
+# Architecture Tier (Modules 09-13)
 
 **Build modern neural architectures—from computer vision to language models.**
 
 
 ## What You'll Learn
 
-The Architecture tier teaches you how to build the neural network architectures that power modern AI. You'll implement CNNs for computer vision, transformers for language understanding, and the data loading infrastructure needed to train on real datasets.
+The Architecture tier teaches you how to build the neural network architectures that power modern AI. You'll implement CNNs for computer vision and transformers for language understanding, building on the foundational training infrastructure from the previous tier.
 
 **By the end of this tier, you'll understand:**
-- How data loaders efficiently feed training data to models
 - Why convolutional layers are essential for computer vision
 - How attention mechanisms enable transformers to understand sequences
 - What embeddings do to represent discrete tokens as continuous vectors
@@ -19,14 +18,11 @@ The Architecture tier teaches you how to build the neural network architectures
 
 ```{mermaid}
 :align: center
-:caption: "**Architecture Module Flow.** Two parallel tracks branch from Foundation: vision (DataLoader, Convolutions) and language (Tokenization through Transformers)."
+:caption: "**Architecture Module Flow.** Two parallel tracks branch from Foundation: vision (Convolutions) and language (Tokenization through Transformers)."
 graph TB
-    F[ Foundation<br/>Tensor, Autograd, Training]
+    F[ Foundation<br/>Tensor, DataLoader, Autograd, Training]
 
-    F --> M08[08. DataLoader<br/>Efficient data pipelines]
     F --> M09[09. Convolutions<br/>Conv2d + Pooling]
-
-    M08 --> M09
     M09 --> VISION[ Computer Vision<br/>CNNs unlock spatial intelligence]
 
     F --> M10[10. Tokenization<br/>Text → integers]
@@ -37,7 +33,6 @@ graph TB
     M13 --> LLM[ Language Models<br/>Transformers generate text]
 
     style F fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
-    style M08 fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px
     style M09 fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px
     style M10 fill:#e1bee7,stroke:#6a1b9a,stroke-width:3px
     style M11 fill:#e1bee7,stroke:#6a1b9a,stroke-width:3px
@@ -50,17 +45,6 @@ graph TB
 
 ## Module Details
 
-### 08. DataLoader - Efficient Data Pipelines
-
-**What it is**: Infrastructure for loading, batching, and shuffling training data efficiently.
-
-**Why it matters**: Real ML systems train on datasets that don't fit in memory. DataLoaders handle batching, shuffling, and parallel data loading—essential for efficient training.
-
-**What you'll build**: A DataLoader that supports batching, shuffling, and dataset iteration with proper memory management.
-
-**Systems focus**: Memory efficiency, batching strategies, I/O optimization
-
-
 ### 09. Convolutions - Convolutional Neural Networks
 
 **What it is**: Conv2d (convolutional layers) and pooling operations for processing images.
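For orientation, the Conv2d operation this section describes reduces to a small loop: slide a kernel over the image and take a dot product at each position. Below is a minimal sketch of a single-channel 2D convolution in NumPy; `conv2d_naive` and its signature are illustrative, not TinyTorch's actual API:

```python
import numpy as np

def conv2d_naive(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1) and
    return the feature map of dot products at each position."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is the elementwise product of the
            # kernel and the patch under it, summed up.
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

# A 3x3 edge-detection kernel applied to a random 8x8 "image".
feature_map = conv2d_naive(np.random.rand(8, 8),
                           np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]]))
print(feature_map.shape)  # (6, 6)
```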
@@ -124,7 +108,7 @@ graph TB
 
 ```{mermaid}
 :align: center
-:caption: "**Architecture Tier Milestones.** After completing modules 08-13, you unlock computer vision (1998 CNN) and language understanding (2017 Transformer) breakthroughs."
+:caption: "**Architecture Tier Milestones.** After completing modules 09-13, you unlock computer vision (1998 CNN) and language understanding (2017 Transformer) breakthroughs."
 timeline
     title Historical Achievements Unlocked
     1998 : CNN Revolution : 75%+ accuracy on CIFAR-10 with spatial intelligence
@@ -142,8 +126,8 @@ After completing the Architecture tier, you'll be able to:
 ## Prerequisites
 
 **Required**:
-- ** Foundation Tier** (Modules 01-07) completed
-- Understanding of tensors, autograd, and training loops
+- ** Foundation Tier** (Modules 01-08) completed
+- Understanding of tensors, data loaders, autograd, and training loops
 - Basic understanding of images (height, width, channels)
 - Basic understanding of text/language concepts
@@ -199,8 +183,8 @@ python 01_vaswani_generation.py # Text generation with YOUR transformer
 
 The Architecture tier splits into two parallel paths that can be learned in any order:
 
-**Vision Track (Modules 08-09)**:
-- DataLoader → Convolutions (Conv2d + Pooling)
+**Vision Track (Module 09)**:
+- Convolutions (Conv2d + Pooling)
 - Enables computer vision applications
 - Culminates in CNN milestone
@@ -209,7 +193,7 @@ The Architecture tier splits into two parallel paths that can be learned in any
 - Enables natural language processing
 - Culminates in Transformer milestone
 
-**Recommendation**: Complete both tracks in order (08→09→10→11→12→13), but you can prioritize the track that interests you more.
+**Recommendation**: Complete both tracks in order (09→10→11→12→13), but you can prioritize the track that interests you more.
 
 
 ## Next Steps
@@ -217,8 +201,8 @@ The Architecture tier splits into two parallel paths that can be learned in any
 **Ready to build modern architectures?**
 
 ```bash
-# Start the Architecture tier
-tito module start 08_dataloader
+# Start the Architecture tier with vision
+tito module start 09_convolutions
 
 # Or jump to language models
 tito module start 10_tokenization
@@ -226,7 +210,7 @@ tito module start 10_tokenization
 
 **Or explore other tiers:**
 
-- **[ Foundation Tier](foundation)** (Modules 01-07): Mathematical foundations
+- **[ Foundation Tier](foundation)** (Modules 01-08): Mathematical foundations
 - **[ Optimization Tier](optimization)** (Modules 14-19): Production-ready performance
 - **[ Torch Olympics](olympics)** (Module 20): Compete in ML systems challenges
foundation.md:

@@ -1,15 +1,16 @@
-# Foundation Tier (Modules 01-07)
+# Foundation Tier (Modules 01-08)
 
 **Build the mathematical core that makes neural networks learn.**
 
 
 ## What You'll Learn
 
-The Foundation tier teaches you how to build a complete learning system from scratch. Starting with basic tensor operations, you'll construct the mathematical infrastructure that powers every modern ML framework—automatic differentiation, gradient-based optimization, and training loops.
+The Foundation tier teaches you how to build a complete learning system from scratch. Starting with basic tensor operations, you'll construct the mathematical infrastructure that powers every modern ML framework—data loading, automatic differentiation, gradient-based optimization, and training loops.
 
 **By the end of this tier, you'll understand:**
 - How tensors represent and transform data in neural networks
 - Why activation functions enable non-linear learning
+- How data loaders efficiently feed training data to models
 - How backpropagation computes gradients automatically
 - What optimizers do to make training converge
 - How training loops orchestrate the entire learning process
@@ -19,18 +20,18 @@ The Foundation tier teaches you how to build a complete learning system from scr
 
 ```{mermaid}
 :align: center
-:caption: "**Foundation Module Dependencies.** Tensors and activations feed into layers, which connect to losses and autograd, enabling optimizers and ultimately training loops."
+:caption: "**Foundation Module Dependencies.** Tensors and activations feed into layers, which connect to losses and dataloader, then autograd, enabling optimizers and ultimately training loops."
 graph TB
     M01[01. Tensor<br/>Multidimensional arrays] --> M03[03. Layers<br/>Linear transformations]
     M02[02. Activations<br/>Non-linear functions] --> M03
 
     M03 --> M04[04. Losses<br/>Measure prediction quality]
-    M03 --> M05[05. Autograd<br/>Automatic differentiation]
+    M04 --> M05[05. DataLoader<br/>Efficient data pipelines]
+    M05 --> M06[06. Autograd<br/>Automatic differentiation]
 
-    M04 --> M06[06. Optimizers<br/>Gradient-based updates]
-    M05 --> M06
+    M06 --> M07[07. Optimizers<br/>Gradient-based updates]
 
-    M06 --> M07[07. Training<br/>Complete learning loop]
+    M07 --> M08[08. Training<br/>Complete learning loop]
 
     style M01 fill:#e3f2fd,stroke:#1976d2,stroke-width:3px
     style M02 fill:#e3f2fd,stroke:#1976d2,stroke-width:3px
@@ -38,7 +39,8 @@ graph TB
     style M04 fill:#90caf9,stroke:#1565c0,stroke-width:3px
     style M05 fill:#90caf9,stroke:#1565c0,stroke-width:3px
     style M06 fill:#64b5f6,stroke:#0d47a1,stroke-width:3px
-    style M07 fill:#42a5f5,stroke:#0d47a1,stroke-width:4px
+    style M07 fill:#64b5f6,stroke:#0d47a1,stroke-width:3px
+    style M08 fill:#42a5f5,stroke:#0d47a1,stroke-width:4px
 ```
 
@@ -88,7 +90,18 @@ graph TB
 **Systems focus**: Numerical stability (log-sum-exp trick), reduction strategies
 
 
-### 05. Autograd - The Gradient Revolution
+### 05. DataLoader - Efficient Data Pipelines
+
+**What it is**: Infrastructure for loading, batching, and shuffling training data efficiently.
+
+**Why it matters**: Real ML systems train on datasets that don't fit in memory. DataLoaders handle batching, shuffling, and parallel data loading, which are essential for efficient training.
+
+**What you'll build**: A DataLoader that supports batching, shuffling, and dataset iteration with proper memory management.
+
+**Systems focus**: Memory efficiency, batching strategies, I/O optimization
+
+
+### 06. Autograd - The Gradient Revolution
 
 **What it is**: Automatic differentiation system that computes gradients through computation graphs.
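A minimal sketch of the batching-and-shuffling pattern the new module 05 section describes; the class name and constructor signature are illustrative, not TinyTorch's actual interface:

```python
import random

class DataLoader:
    """Minimal batching/shuffling iterator over an indexable dataset.
    Illustrative only; real loaders add prefetching and parallel I/O."""

    def __init__(self, dataset, batch_size=32, shuffle=True):
        self.dataset = dataset          # anything supporting len() and []
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        indices = list(range(len(self.dataset)))
        if self.shuffle:
            random.shuffle(indices)     # reshuffle every epoch
        # Yield one batch at a time so the full dataset never has to
        # be materialized in memory at once.
        for start in range(0, len(indices), self.batch_size):
            batch_idx = indices[start:start + self.batch_size]
            yield [self.dataset[i] for i in batch_idx]

for batch in DataLoader(list(range(10)), batch_size=4):
    print(batch)  # three batches of sizes 4, 4, 2 in shuffled order
```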
@@ -99,7 +112,7 @@ graph TB
 **Systems focus**: Computational graphs, topological sorting, gradient accumulation
 
 
-### 06. Optimizers - Learning from Gradients
+### 07. Optimizers - Learning from Gradients
 
 **What it is**: Algorithms that update parameters using gradients (SGD, Adam, RMSprop).
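For reference, the kind of update rule this renumbered Optimizers section covers fits in a few lines. This is a sketch of SGD with momentum in NumPy; `sgd_momentum_step` is an illustrative name, not TinyTorch's optimizer API:

```python
import numpy as np

def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD+momentum update: v <- mu*v - lr*g, then p <- p + v."""
    velocity = momentum * velocity - lr * grad  # momentum buffer smooths updates
    return param + velocity, velocity

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.0.
x, v = np.array(5.0), np.array(0.0)
for _ in range(100):
    x, v = sgd_momentum_step(x, 2 * x, v, lr=0.1)
print(x)  # oscillates toward the minimum at 0
```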
@@ -110,7 +123,7 @@ graph TB
 **Systems focus**: Update rules, momentum buffers, numerical stability
 
 
-### 07. Training - Orchestrating the Learning Process
+### 08. Training - Orchestrating the Learning Process
 
 **What it is**: The training loop that ties everything together—forward pass, loss computation, backpropagation, parameter updates.
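The four-step cycle named here (forward, loss, backward, update) is easiest to see as a skeleton. The sketch below assumes hypothetical `model`, `loss_fn`, `optimizer`, and `loader` objects with PyTorch-style methods; it is not TinyTorch's actual training API:

```python
def train(model, loss_fn, optimizer, loader, epochs=10):
    """Skeleton of the canonical training loop (hypothetical API)."""
    for epoch in range(epochs):
        for inputs, targets in loader:
            predictions = model(inputs)           # 1. forward pass
            loss = loss_fn(predictions, targets)  # 2. loss computation
            optimizer.zero_grad()                 # clear stale gradients
            loss.backward()                       # 3. backpropagation
            optimizer.step()                      # 4. parameter update
```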
@@ -125,7 +138,7 @@ graph TB
 
 ```{mermaid}
 :align: center
-:caption: "**Foundation Tier Milestones.** After completing modules 01-07, you unlock three historical achievements spanning three decades of neural network breakthroughs."
+:caption: "**Foundation Tier Milestones.** After completing modules 01-08, you unlock three historical achievements spanning three decades of neural network breakthroughs."
 timeline
     title Historical Achievements Unlocked
     1957 : Perceptron : Binary classification with gradient descent
@@ -187,7 +200,7 @@ tito module start 01_tensor
 
 **Or explore other tiers:**
 
-- **[ Architecture Tier](architecture)** (Modules 08-13): CNNs, transformers, attention
+- **[ Architecture Tier](architecture)** (Modules 09-13): CNNs, transformers, attention
 - **[ Optimization Tier](optimization)** (Modules 14-19): Production-ready performance
 - **[ Torch Olympics](olympics)** (Module 20): Compete in ML systems challenges