mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-05-27 19:45:52 -05:00
Module 14 now provides enable_kv_cache(model) - following same pattern as enable_autograd() from Module 05.

Key innovation: students ADD capabilities in new modules WITHOUT modifying old ones!

Implementation:
- enable_kv_cache(model): Patches model attention layers with caching
- disable_kv_cache(model): Restores original attention behavior
- Non-invasive: Modules 12-13 unchanged, Module 14 enhances them
- Educational: Teaches composition over modification

Architecture Pattern:
1. Module 14 wraps each TransformerBlock attention layer
2. Stores original forward methods before patching
3. Creates cache infrastructure for model architecture
4. Can enable/disable without breaking model

Systems Engineering Lesson:
Forward-only learning: New modules ADD features, never BREAK old ones
- Module 12 (Attention): Core implementation
- Module 13 (Transformers): Uses Module 12
- Module 14 (KV Caching): ENHANCES Module 12 without changing it

Milestone Integration:
- TinyGPT.generate() now uses enable_kv_cache() when use_cache=True
- Cache automatically created for model architecture
- Clean fallback if Module 14 not available
- Educational notes explain concept vs production implementation

Module now: 1005 lines (805 + 200 integration code)
Tests: All pass (12/12 including new integration tests)
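
Illustrative sketch of the patch/restore pattern described above. This is not the Module 14 code. ToyAttention, ToyBlock, ToyModel and the attribute names blocks, attention, _original_forward and _kv_cache are hypothetical stand-ins, and the wrapper only records each layer's output to show the patching mechanics. It does not implement real key/value reuse.

```python
import types


class ToyAttention:
    """Hypothetical stand-in for an attention layer (not the Module 12 class)."""

    def forward(self, x):
        return [v * 2 for v in x]  # placeholder computation


class ToyBlock:
    """Hypothetical stand-in for a TransformerBlock."""

    def __init__(self):
        self.attention = ToyAttention()


class ToyModel:
    def __init__(self, num_layers=2):
        self.blocks = [ToyBlock() for _ in range(num_layers)]


def enable_kv_cache(model):
    """Wrap each block's attention.forward without editing the layer class."""
    existing = getattr(model, "_kv_cache", None)
    if existing is not None:
        return existing  # already enabled; do not patch twice

    # Cache infrastructure sized to the model: one slot per layer.
    cache = {"layers": [[] for _ in model.blocks]}

    for idx, block in enumerate(model.blocks):
        attn = block.attention
        # Store the original bound method before patching.
        attn._original_forward = attn.forward

        def cached_forward(self, x, _idx=idx):
            out = self._original_forward(x)
            cache["layers"][_idx].append(out)  # toy: record, do not reuse
            return out

        # Instance-level patch; the ToyAttention class itself is untouched.
        attn.forward = types.MethodType(cached_forward, attn)

    model._kv_cache = cache
    return cache


def disable_kv_cache(model):
    """Restore the original forward methods saved by enable_kv_cache."""
    for block in model.blocks:
        attn = block.attention
        if hasattr(attn, "_original_forward"):
            attn.forward = attn._original_forward
            del attn._original_forward
    model._kv_cache = None


# Usage: enable, run a forward pass, then restore the original behavior.
demo = ToyModel(num_layers=2)
demo_cache = enable_kv_cache(demo)
demo.blocks[0].attention.forward([1, 2, 3])
disable_kv_cache(demo)
```

Patching instances rather than the class keeps the earlier modules' source unchanged, which is the "ADD, never BREAK" point made above. The trade-off is that the wrapper has to save the original method so it can be restored later.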