docs: update module references in README and guides
- Update README.md module structure (14→Profiling, 15→Memoization)
- Fix tier descriptions (10-13 Architecture, 14-19 Optimization)
- Update Module 13 next steps to reference Module 15
- Fix Module 15 prerequisite reference to Module 14
- Correct cifar10-training-guide module numbers
README.md (58 changes)
```diff
@@ -14,7 +14,9 @@
-> 🚧 **Work in Progress** - Actively developing TinyTorch for Spring 2025! All 20 core modules (01-20) are implemented but still being debugged and tested. Core foundation modules (01-09) are stable. Transformer and optimization modules (10-20) are functional but undergoing refinement. Join us in building the future of ML systems education.
+> 📢 **December 2024 Release** - TinyTorch is ready for community review! All 20 modules (Tensor → Transformers → Optimization → Capstone) are implemented with complete solutions. **Seeking feedback on pedagogy, implementation quality, and learning progression.** Student version tooling exists but is untested. This release focuses on validating the educational content before classroom deployment.
+>
+> 🎯 **For Reviewers**: Read the [📚 Jupyter Book](https://mlsysbook.github.io/TinyTorch/) to evaluate pedagogy. Clone the repo to run implementations. See [STUDENT_VERSION_TOOLING.md](STUDENT_VERSION_TOOLING.md) for classroom deployment plans.
 
 ## 📖 Table of Contents
 
 - [Why TinyTorch?](#why-tinytorch)
```
```diff
@@ -71,8 +73,8 @@ TinyTorch/
 │   │   ├── 11_embeddings/     # Module 11: Token & positional embeddings
 │   │   ├── 12_attention/      # Module 12: Multi-head attention
 │   │   ├── 13_transformers/   # Module 13: Complete transformer blocks
-│   │   ├── 14_kvcaching/      # Module 14: KV-cache optimization
-│   │   ├── 15_profiling/      # Module 15: Performance analysis
+│   │   ├── 14_profiling/      # Module 14: Performance analysis
+│   │   ├── 15_memoization/    # Module 15: KV-cache/memoization
 │   │   ├── 16_acceleration/   # Module 16: Hardware optimization
 │   │   ├── 17_quantization/   # Module 17: Model compression
 │   │   ├── 18_compression/    # Module 18: Pruning & distillation
```
```diff
@@ -178,7 +180,7 @@ Build transformers that generate text
 | 11 | Embeddings | Token embeddings + positional encoding | **Embedding tables** (vocab × dim parameters), lookup performance |
 | 12 | Attention | Multi-head attention mechanisms | **O(N²) scaling**, memory bottlenecks, attention optimization |
 | 13 | Transformers | Complete transformer blocks | **Layer scaling**, memory requirements, architectural trade-offs |
-| 14 | KV-Caching | Inference optimization for transformers | **Memory vs compute trade-offs**, cache management, generation efficiency |
+| 14 | Profiling | Performance analysis + bottleneck detection | **Memory profiling**, FLOP counting, **Amdahl's Law**, performance measurement |
 
 **Milestone Achievement**: TinyGPT language generation with optimized inference
```
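A quick worked example of the **Amdahl's Law** entry above (illustrative numbers, not from the diff): if a component accounts for fraction p of total runtime and you speed it up by a factor s, the end-to-end speedup is 1 / ((1 - p) + p / s). Making attention 4× faster when it is 80% of runtime yields only 1 / (0.2 + 0.8/4) = 2.5× overall, which is why Module 14 measures before Modules 15-19 optimize.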
```diff
@@ -189,7 +191,7 @@ Profile, optimize, and benchmark ML systems
 
 | Module | Topic | What You Build | ML Systems Learning |
 |--------|-------|----------------|-------------------|
-| 15 | Profiling | Performance analysis + bottleneck detection | **Memory profiling**, FLOP counting, **Amdahl's Law**, performance measurement |
+| 15 | Memoization | Computational reuse via KV-caching | **Memory vs compute trade-offs**, cache management, generation efficiency |
 | 16 | Acceleration | Hardware optimization + cache-friendly algorithms | **Cache hierarchies**, memory access patterns, **vectorization vs loops** |
 | 17 | Quantization | Model compression + precision reduction | **Precision trade-offs** (FP32→INT8), memory reduction, accuracy preservation |
 | 18 | Compression | Pruning + knowledge distillation | **Sparsity patterns**, parameter reduction, **compression ratios** |
```
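For context on the **FP32→INT8** trade-off in the Quantization row: this is standard affine quantization. A minimal NumPy sketch of the idea (a generic illustration; the function names are hypothetical, not Module 17's API):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Affine quantization: map the FP32 value range onto 256 INT8 levels."""
    scale = (w.max() - w.min()) / 255.0
    zero_point = np.round(-w.min() / scale) - 128
    q = np.clip(np.round(w / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """FP32 approximation; the residual is the accuracy cost of the compression."""
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s, z = quantize_int8(w)          # int8 storage: 1 byte/param vs 4 for FP32
print("max abs error:", float(np.abs(dequantize(q, s, z) - w).max()))
```

Storing `q` plus the two scalars `scale` and `zero_point` is the entire 4× memory win; the printed error is the precision trade-off the table refers to.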
```diff
@@ -242,8 +244,8 @@ tito checkpoint timeline
 - **01-02**: Foundation (Tensor, Activations)
 - **03-07**: Core Networks (Layers, Losses, Autograd, Optimizers, Training)
 - **08-09**: Computer Vision (DataLoader, Spatial ops - unlocks CIFAR-10 @ 75%+)
-- **10-14**: Language Models (Tokenization, Embeddings, Attention, Transformers, KV-Caching)
-- **15-19**: System Optimization (Profiling, Acceleration, Quantization, Compression, Benchmarking)
+- **10-13**: Language Models (Tokenization, Embeddings, Attention, Transformers)
+- **14-19**: System Optimization (Profiling, Memoization, Quantization, Compression, Acceleration, Benchmarking)
 - **20**: Capstone (Complete end-to-end ML systems)
 
 Each module asks: **"Can I build this capability from scratch?"** with hands-on validation.
```
```diff
@@ -389,8 +391,8 @@ pytest tests/
 - ✅ **20 modules implemented** (01 Tensor → 20 Capstone) - all code exists
 - ✅ **6 historical milestones** (1957 Perceptron → 2024 Systems Age)
 - ✅ **Foundation modules stable** (01-09): Tensor through Spatial operations
-- 🚧 **Transformer modules functional** (10-14): Tokenization through KV-Caching - undergoing testing
-- 🚧 **Optimization modules functional** (15-20): Profiling through Capstone - undergoing testing
+- 🚧 **Transformer modules functional** (10-13): Tokenization through Transformers - undergoing testing
+- 🚧 **Optimization modules functional** (14-20): Profiling through Capstone - undergoing testing
 - ✅ **KISS principle design** for clear, maintainable code
 - ✅ **Essential-only features**: Focus on what's used in production ML systems
 - 🎯 **Target: Spring 2025** - Active debugging and refinement in progress
```
```diff
@@ -437,6 +439,44 @@ tito benchmark submit --event cnn_marathon
 
 📊 **View Leaderboard**: [TinyMLPerf Competition](https://mlsysbook.github.io/TinyTorch/leaderboard.html) | Future: `tinytorch.org/leaderboard`
 
+## Academic Integrity & Solutions Philosophy
+
+### Why Solutions Are Public
+
+TinyTorch releases complete implementations publicly to support:
+- **Transparent peer review** of educational materials
+- **Instructor evaluation** before course adoption
+- **Open-source community** contribution and improvement
+- **Real-world learning** from production-quality code
+
+### For Students: Learning > Copying
+
+**TinyTorch's pedagogy makes copying solutions ineffective:**
+
+1. **Progressive Complexity**: Module 05 (Autograd) requires deep understanding of Modules 01-04. You cannot fake building automatic differentiation by copying code you don't understand.
+
+2. **Integration Requirements**: Each module builds on previous work. Superficial copying breaks down as complexity compounds.
+
+3. **Systems Thinking**: The learning goal is understanding memory management, computational graphs, and performance trade-offs—not just getting tests to pass.
+
+4. **Self-Correcting**: Students who copy without understanding fail subsequent modules. The system naturally identifies shallow work.
+
+### For Instructors: Pedagogy Over Secrecy
+
+Modern ML education accepts that solutions are findable (Chegg, Course Hero, Discord). Defense comes through:
+
+**✅ Progressive module dependencies** (can't fake understanding)
+**✅ Changed parameters/datasets** each semester
+**✅ Competitive benchmarking** (reveals true optimization skill)
+**✅ Honor codes** (trust students to learn honestly)
+**✅ Focus on journey** (building > having built)
+
+See [STUDENT_VERSION_TOOLING.md](STUDENT_VERSION_TOOLING.md) for classroom deployment strategies.
+
+### Honor Code
+
+> "I understand that TinyTorch solutions are public for educational transparency. I commit to building my own understanding by struggling with implementations, not copying code. I recognize that copying teaches nothing and that subsequent modules will expose shallow understanding. I choose to learn."
+
 ## Contributing
 
 We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
```
```diff
@@ -465,7 +465,7 @@ This module implements patterns from:
 
 ## What's Next?
 
-In **Module 14: KV Caching** (Performance Tier), you'll optimize transformers for production:
+In **Module 15: Memoization** (Optimization Tier), you'll optimize transformers for production:
 
 - Cache key and value matrices to avoid recomputation
 - Reduce inference latency by 10-100× for long sequences
```
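The two caching bullets above are the whole trick: each generation step computes key/value rows for the new token only and reuses everything already cached, so per-step attention drops from O(t²·d) recomputation to O(t·d). A minimal single-head NumPy sketch (illustrative; `KVCache` is a hypothetical name, not the module's actual API):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class KVCache:
    """Grow-only cache of past key/value rows for one attention head."""
    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def append(self, k, v):
        # Each step computes K/V for the *new* token only; earlier rows
        # are reused from the cache instead of being recomputed.
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

    def attend(self, q):
        # q: (dim,) query for the current token; attends over all cached steps.
        scores = self.keys @ q / np.sqrt(q.shape[-1])   # (t,)
        return softmax(scores) @ self.values            # (dim,)

cache = KVCache(dim=64)
for step in range(5):                # toy generation loop
    k = np.random.randn(1, 64)       # K/V row for the newly generated token
    v = np.random.randn(1, 64)
    q = np.random.randn(64)
    cache.append(k, v)
    out = cache.attend(q)            # context vector for this step
```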
```diff
@@ -432,7 +432,7 @@ This module implements patterns from:
 
 ## What's Next?
 
-In **Module 15: Profiling**, you'll measure where time goes in your transformer:
+In **Module 14: Profiling**, you measured where time goes in your transformer. Now you'll fix the bottleneck:
 
 - Profile attention, feedforward, and embedding operations
 - Identify computational bottlenecks beyond caching
```
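Measuring "where time goes" per component, as in the bullets above, needs little more than accumulated wall-clock timers around each operation. A plain-Python sketch with stand-in workloads (hypothetical; the profiling module presumably wraps real tensor ops):

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(label, totals):
    """Accumulate wall-clock time per labeled region."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[label] = totals.get(label, 0.0) + time.perf_counter() - start

totals = {}
for _ in range(100):                     # pretend generation steps
    with timer("attention", totals):
        sum(i * i for i in range(20_000))    # stand-in for attention work
    with timer("feedforward", totals):
        sum(i * i for i in range(10_000))    # stand-in for FFN work

for label, t in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{label:12s} {t:.3f}s")       # largest total = the bottleneck
```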
```diff
@@ -6,9 +6,9 @@ This guide walks you through training a CNN on CIFAR-10 using your TinyTorch imp
 
 ## Prerequisites
 Complete these modules first:
 - ✅ Module 08: DataLoader (for CIFAR-10 loading)
-- ✅ Module 11: Training (for model checkpointing)
-- ✅ Module 06: Spatial (for CNN layers)
-- ✅ Module 10: Optimizers (for Adam optimizer)
+- ✅ Module 07: Training (for model checkpointing)
+- ✅ Module 09: Convolutional Networks (for CNN layers)
+- ✅ Module 06: Optimizers (for Adam optimizer)
 
 ## Step 1: Load CIFAR-10 Data
```
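For readers starting Step 1: the CIFAR-10 "python version" batches are pickled dicts with `b"data"` (10000 × 3072 uint8 rows, channel-major RGB) and `b"labels"`. A stand-alone loading sketch (path hypothetical; the guide's Module 08 DataLoader presumably wraps something similar):

```python
import pickle
import numpy as np

def load_cifar10_batch(path):
    """One CIFAR-10 python-version batch: 10,000 images, 32x32x3 uint8."""
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    data = batch[b"data"].reshape(-1, 3, 32, 32)   # NCHW, uint8 in [0, 255]
    labels = np.array(batch[b"labels"])
    return data.astype(np.float32) / 255.0, labels

# Hypothetical local path to the extracted cifar-10-batches-py directory.
X, y = load_cifar10_batch("cifar-10-batches-py/data_batch_1")
print(X.shape, y.shape)   # (10000, 3, 32, 32) (10000,)
```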