name: Caching
number: 16
type: optimization
difficulty: advanced
estimated_hours: 8-12
description: |
  Memory optimization through caching, focusing on KV caching for transformer
  inference. Students learn how to reuse computations across time steps in
  autoregressive generation.
learning_objectives:
  - Understand memory vs. computation tradeoffs
  - Implement KV caching for transformer inference
  - Learn incremental computation patterns
  - Optimize autoregressive generation speed
prerequisites:
  - "Module 14: Transformers"
  - "Module 15: Acceleration"
skills_developed:
  - Memory optimization techniques
  - Incremental computation strategies
  - Transformer inference optimization
  - Cache management patterns
exports:
  - tinytorch.optimizations.caching