mirror of https://github.com/MLSysBook/TinyTorch.git (synced 2026-04-30 22:56:55 -05:00)
- Removed 01_setup module (archived to archive/setup_module)
- Renumbered all modules: tensor is now 01, activations is 02, etc.
- Added tito setup command for environment setup and package installation
- Added numeric shortcuts (tito 01, tito 02, etc.) for quick module access
- Fixed view command to find dev files correctly
- Updated module dependencies and references
- Improved user experience: immediate ML learning instead of boring setup
24 lines
759 B
YAML
description: "Memory optimization through KV caching for transformer inference. Students\
|
|
\ learn to\ntransform O(N\xB2) attention complexity into O(N) for autoregressive\
|
|
\ generation, achieving\ndramatic speedups in transformer inference.\n"
|
|
difficulty: advanced
|
|
estimated_hours: 8-10
|
|
exports:
|
|
- tinytorch.optimizations.caching
|
|
learning_objectives:
|
|
- Understand attention memory complexity
|
|
- Implement KV caching for transformers
|
|
- Build incremental computation patterns
|
|
- Optimize autoregressive generation
|
|
name: Caching
|
|
number: 18
|
|
prerequisites:
|
|
- Module 14: Transformers
|
|
- Module 17: Compression
|
|
skills_developed:
|
|
- KV caching implementation
|
|
- Memory-computation tradeoffs
|
|
- Incremental computation
|
|
- Production inference patterns
|
|
type: optimization
|
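For context on what this module builds: below is a minimal sketch of the KV-caching idea the description refers to, using plain NumPy and a single attention head. The names `KVCache` and `attend_with_cache` are hypothetical illustrations, not the actual `tinytorch.optimizations.caching` API.

```python
# Minimal sketch of KV caching for one attention head (NumPy only).
# KVCache and attend_with_cache are hypothetical names for illustration;
# they are not the actual tinytorch.optimizations.caching API.
import numpy as np

class KVCache:
    """Accumulates key/value rows across autoregressive decoding steps."""

    def __init__(self, d_model):
        self.keys = np.empty((0, d_model))    # shape (t, d); grows one row per step
        self.values = np.empty((0, d_model))

    def append(self, k, v):
        # k, v have shape (1, d): projections for the newest token only.
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])
        return self.keys, self.values

def attend_with_cache(q, k_new, v_new, cache):
    """One decoding step: the new query attends over all cached keys/values.

    Cost is O(t) at step t, versus O(t²) if attention over the whole
    prefix were recomputed from scratch at every step.
    """
    K, V = cache.append(k_new, v_new)
    scores = (q @ K.T) / np.sqrt(q.shape[-1])   # (1, t): new query vs. all keys
    weights = np.exp(scores - scores.max())     # numerically stable softmax
    weights /= weights.sum()
    return weights @ V                          # (1, d) attention output

# Decode a few tokens; past K/V rows are never re-projected.
d_model = 8
cache = KVCache(d_model)
rng = np.random.default_rng(0)
for _ in range(5):
    q, k, v = (rng.normal(size=(1, d_model)) for _ in range(3))
    out = attend_with_cache(q, k, v, cache)
print(out.shape)  # (1, 8)
```

The design point the module builds toward: each decoding step appends one key/value row and attends only the single new query against the cache, which is how O(N²) per-step attention becomes O(N) during autoregressive generation.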