docs: update module references in README and guides

- Update README.md module structure (14→Profiling, 15→Memoization)
- Fix tier descriptions (10-13 Architecture, 14-19 Optimization)
- Update Module 13 next steps to reference Module 15
- Fix Module 15 prerequisite reference to Module 14
- Correct cifar10-training-guide module numbers
Author: Vijay Janapa Reddi
Date: 2025-11-09 12:42:27 -05:00
parent 5eaffe8501
commit 757d50b717
4 changed files with 54 additions and 14 deletions


@@ -14,7 +14,9 @@
![GitHub Stars](https://img.shields.io/github/stars/MLSysBook/TinyTorch?style=social)
![Contributors](https://img.shields.io/github/contributors/MLSysBook/TinyTorch)
-> 🚧 **Work in Progress** - Actively developing TinyTorch for Spring 2025! All 20 core modules (01-20) are implemented but still being debugged and tested. Core foundation modules (01-09) are stable. Transformer and optimization modules (10-20) are functional but undergoing refinement. Join us in building the future of ML systems education.
+> 📢 **December 2024 Release** - TinyTorch is ready for community review! All 20 modules (Tensor → Transformers → Optimization → Capstone) are implemented with complete solutions. **Seeking feedback on pedagogy, implementation quality, and learning progression.** Student version tooling exists but is untested. This release focuses on validating the educational content before classroom deployment.
+>
+> 🎯 **For Reviewers**: Read the [📚 Jupyter Book](https://mlsysbook.github.io/TinyTorch/) to evaluate pedagogy. Clone the repo to run implementations. See [STUDENT_VERSION_TOOLING.md](STUDENT_VERSION_TOOLING.md) for classroom deployment plans.
## 📖 Table of Contents
- [Why TinyTorch?](#why-tinytorch)
@@ -71,8 +73,8 @@ TinyTorch/
│ │ ├── 11_embeddings/ # Module 11: Token & positional embeddings
│ │ ├── 12_attention/ # Module 12: Multi-head attention
│ │ ├── 13_transformers/ # Module 13: Complete transformer blocks
-│ │ ├── 14_kvcaching/ # Module 14: KV-cache optimization
-│ │ ├── 15_profiling/ # Module 15: Performance analysis
+│ │ ├── 14_profiling/ # Module 14: Performance analysis
+│ │ ├── 15_memoization/ # Module 15: KV-cache/memoization
│ │ ├── 16_acceleration/ # Module 16: Hardware optimization
│ │ ├── 17_quantization/ # Module 17: Model compression
│ │ ├── 18_compression/ # Module 18: Pruning & distillation
@@ -178,7 +180,7 @@ Build transformers that generate text
| 11 | Embeddings | Token embeddings + positional encoding | **Embedding tables** (vocab × dim parameters), lookup performance |
| 12 | Attention | Multi-head attention mechanisms | **O(N²) scaling**, memory bottlenecks, attention optimization |
| 13 | Transformers | Complete transformer blocks | **Layer scaling**, memory requirements, architectural trade-offs |
-| 14 | KV-Caching | Inference optimization for transformers | **Memory vs compute trade-offs**, cache management, generation efficiency |
+| 14 | Profiling | Performance analysis + bottleneck detection | **Memory profiling**, FLOP counting, **Amdahl's Law**, performance measurement |
**Milestone Achievement**: TinyGPT language generation with optimized inference
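The **O(N²) scaling** called out in the attention row above can be seen in a minimal NumPy sketch. This is illustrative only, not the TinyTorch implementation; the function and variable names here are hypothetical:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention; the (N, N) score matrix is the O(N^2) cost."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # (N, N): quadratic in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

N, d = 128, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
# The output keeps shape (N, d); the hidden cost is the N x N score matrix,
# whose memory quadruples every time the sequence length doubles.
```

Doubling `N` quadruples both the score-matrix memory and the matmul FLOPs, which is exactly the bottleneck the later optimization modules target.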
@@ -189,7 +191,7 @@ Profile, optimize, and benchmark ML systems
| Module | Topic | What You Build | ML Systems Learning |
|--------|-------|----------------|-------------------|
-| 15 | Profiling | Performance analysis + bottleneck detection | **Memory profiling**, FLOP counting, **Amdahl's Law**, performance measurement |
+| 15 | Memoization | Computational reuse via KV-caching | **Memory vs compute trade-offs**, cache management, generation efficiency |
| 16 | Acceleration | Hardware optimization + cache-friendly algorithms | **Cache hierarchies**, memory access patterns, **vectorization vs loops** |
| 17 | Quantization | Model compression + precision reduction | **Precision trade-offs** (FP32→INT8), memory reduction, accuracy preservation |
| 18 | Compression | Pruning + knowledge distillation | **Sparsity patterns**, parameter reduction, **compression ratios** |
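The **FP32→INT8** trade-off in the quantization row above can be illustrated with generic symmetric per-tensor quantization. This is a sketch of the technique, not TinyTorch's actual API:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: map max |x| onto the INT8 range [-127, 127]."""
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an FP32 approximation of the original tensor."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal(4096).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
# Storage drops 4x (1 byte vs 4 bytes per weight); the reconstruction error
# is bounded by the quantization step size `scale`.
```

Symmetric per-tensor scaling is the simplest scheme; production systems often use per-channel scales or asymmetric zero-points to preserve more accuracy.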
@@ -242,8 +244,8 @@ tito checkpoint timeline
- **01-02**: Foundation (Tensor, Activations)
- **03-07**: Core Networks (Layers, Losses, Autograd, Optimizers, Training)
- **08-09**: Computer Vision (DataLoader, Spatial ops - unlocks CIFAR-10 @ 75%+)
-- **10-14**: Language Models (Tokenization, Embeddings, Attention, Transformers, KV-Caching)
-- **15-19**: System Optimization (Profiling, Acceleration, Quantization, Compression, Benchmarking)
+- **10-13**: Language Models (Tokenization, Embeddings, Attention, Transformers)
+- **14-19**: System Optimization (Profiling, Memoization, Quantization, Compression, Acceleration, Benchmarking)
- **20**: Capstone (Complete end-to-end ML systems)
Each module asks: **"Can I build this capability from scratch?"** with hands-on validation.
@@ -389,8 +391,8 @@ pytest tests/
- **20 modules implemented** (01 Tensor → 20 Capstone) - all code exists
- **6 historical milestones** (1957 Perceptron → 2024 Systems Age)
- **Foundation modules stable** (01-09): Tensor through Spatial operations
-- 🚧 **Transformer modules functional** (10-14): Tokenization through KV-Caching - undergoing testing
-- 🚧 **Optimization modules functional** (15-20): Profiling through Capstone - undergoing testing
+- 🚧 **Transformer modules functional** (10-13): Tokenization through Transformers - undergoing testing
+- 🚧 **Optimization modules functional** (14-20): Profiling through Capstone - undergoing testing
- **KISS principle design** for clear, maintainable code
- **Essential-only features**: Focus on what's used in production ML systems
- 🎯 **Target: Spring 2025** - Active debugging and refinement in progress
@@ -437,6 +439,44 @@ tito benchmark submit --event cnn_marathon
📊 **View Leaderboard**: [TinyMLPerf Competition](https://mlsysbook.github.io/TinyTorch/leaderboard.html) | Future: `tinytorch.org/leaderboard`
+## Academic Integrity & Solutions Philosophy
+### Why Solutions Are Public
+TinyTorch releases complete implementations publicly to support:
+- **Transparent peer review** of educational materials
+- **Instructor evaluation** before course adoption
+- **Open-source community** contribution and improvement
+- **Real-world learning** from production-quality code
+### For Students: Learning > Copying
+**TinyTorch's pedagogy makes copying solutions ineffective:**
+1. **Progressive Complexity**: Module 05 (Autograd) requires deep understanding of Modules 01-04. You cannot fake building automatic differentiation by copying code you don't understand.
+2. **Integration Requirements**: Each module builds on previous work. Superficial copying breaks down as complexity compounds.
+3. **Systems Thinking**: The learning goal is understanding memory management, computational graphs, and performance trade-offs—not just getting tests to pass.
+4. **Self-Correcting**: Students who copy without understanding fail subsequent modules. The system naturally identifies shallow work.
+### For Instructors: Pedagogy Over Secrecy
+Modern ML education accepts that solutions are findable (Chegg, Course Hero, Discord). Defense comes through:
+**✅ Progressive module dependencies** (can't fake understanding)
+**✅ Changed parameters/datasets** each semester
+**✅ Competitive benchmarking** (reveals true optimization skill)
+**✅ Honor codes** (trust students to learn honestly)
+**✅ Focus on journey** (building > having built)
+See [STUDENT_VERSION_TOOLING.md](STUDENT_VERSION_TOOLING.md) for classroom deployment strategies.
+### Honor Code
+> "I understand that TinyTorch solutions are public for educational transparency. I commit to building my own understanding by struggling with implementations, not copying code. I recognize that copying teaches nothing and that subsequent modules will expose shallow understanding. I choose to learn."
## Contributing
We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.


@@ -465,7 +465,7 @@ This module implements patterns from:
## What's Next?
-In **Module 14: KV Caching** (Performance Tier), you'll optimize transformers for production:
+In **Module 15: Memoization** (Optimization Tier), you'll optimize transformers for production:
- Cache key and value matrices to avoid recomputation
- Reduce inference latency by 10-100× for long sequences
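The reuse described in the bullets above can be sketched generically: each decode step appends one new key/value row instead of recomputing projections for every past token. The class name and shapes here are hypothetical, not the module's actual API:

```python
import numpy as np

class KVCache:
    """Append-only per-head cache of key/value rows across decode steps."""
    def __init__(self, d_head):
        self.keys = np.empty((0, d_head), dtype=np.float32)
        self.values = np.empty((0, d_head), dtype=np.float32)

    def append(self, k, v):
        # One new row per generated token; past K/V are never recomputed,
        # trading extra memory for less compute per step.
        self.keys = np.vstack([self.keys, k[None, :]])
        self.values = np.vstack([self.values, v[None, :]])

cache = KVCache(d_head=8)
rng = np.random.default_rng(0)
for _ in range(5):
    cache.append(rng.standard_normal(8).astype(np.float32),
                 rng.standard_normal(8).astype(np.float32))
# After 5 decode steps the cache holds 5 rows; per-step work grew linearly
# with sequence length instead of quadratically.
```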


@@ -432,7 +432,7 @@ This module implements patterns from:
## What's Next?
-In **Module 15: Profiling**, you'll measure where time goes in your transformer:
+In **Module 14: Profiling**, you measured where time goes in your transformer. Now you'll fix the bottleneck:
- Profile attention, feedforward, and embedding operations
- Identify computational bottlenecks beyond caching
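The kind of measurement described in the bullets above can be sketched with only the standard library (a minimal timing harness; the module's real profiler API may differ, and the labels here are stand-ins):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Record wall-clock seconds for a labeled code region into `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

results = {}
with timed("feedforward", results):
    total = sum(i * i for i in range(200_000))   # stand-in workload
with timed("attention", results):
    total += sum(i * i for i in range(50_000))   # smaller stand-in workload
# Amdahl's Law: optimize the region that dominates total time first.
bottleneck = max(results, key=results.get)
```

Measuring before optimizing is the whole point: speeding up a region that takes 5% of the runtime can never buy more than a 5% overall improvement.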


@@ -6,9 +6,9 @@ This guide walks you through training a CNN on CIFAR-10 using your TinyTorch imp
## Prerequisites
Complete these modules first:
- ✅ Module 08: DataLoader (for CIFAR-10 loading)
-- ✅ Module 11: Training (for model checkpointing)
-- ✅ Module 06: Spatial (for CNN layers)
-- ✅ Module 10: Optimizers (for Adam optimizer)
+- ✅ Module 07: Training (for model checkpointing)
+- ✅ Module 09: Convolutional Networks (for CNN layers)
+- ✅ Module 06: Optimizers (for Adam optimizer)
## Step 1: Load CIFAR-10 Data