TinyTorch

mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-05-27 19:45:52 -05:00

Vijay Janapa Reddi 0ba1a210a8 Implement non-invasive KV cache integration (enable_kv_cache)

Module 14 now provides enable_kv_cache(model), following the same pattern
as enable_autograd() from Module 05. Key innovation: students ADD
capabilities in new modules WITHOUT modifying old ones!

Implementation:
- enable_kv_cache(model): Patches model attention layers with caching
- disable_kv_cache(model): Restores original attention behavior
- Non-invasive: Modules 12-13 unchanged, Module 14 enhances them
- Educational: Teaches composition over modification

Architecture Pattern:
1. Module 14 wraps each TransformerBlock attention layer
2. Stores original forward methods before patching
3. Creates cache infrastructure for model architecture
4. Can enable/disable without breaking model
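
The four steps above can be sketched roughly as follows. This is a minimal illustration of the patching pattern, not the Module 14 code: ToyAttention, ToyBlock, ToyModel, the _original_forward attribute, and the list-based cache are all hypothetical stand-ins, and the "cache" only records outputs where a real one would store keys and values.

```python
# Hypothetical stand-ins; the real TinyTorch classes are not shown in this commit.
class ToyAttention:
    def forward(self, x):
        return [2 * v for v in x]  # placeholder for the real attention math


class ToyBlock:
    def __init__(self):
        self.attention = ToyAttention()


class ToyModel:
    def __init__(self, num_layers=2):
        self.blocks = [ToyBlock() for _ in range(num_layers)]


def enable_kv_cache(model):
    """Wrap each block's attention forward and keep the original for later."""
    if getattr(model, "_kv_cache", None) is not None:
        return model._kv_cache  # already enabled, nothing to patch

    # Step 3: build cache storage sized to the model (one slot per layer).
    cache = {i: [] for i in range(len(model.blocks))}

    for i, block in enumerate(model.blocks):
        attn = block.attention
        # Step 2: remember the original (bound) forward before patching.
        attn._original_forward = attn.forward

        # Default arguments bind the per-layer values at definition time.
        def cached_forward(x, _orig=attn._original_forward, _slot=cache[i]):
            out = _orig(x)
            _slot.append(out)  # a real cache would store keys/values here
            return out

        # Step 1: wrap the layer by shadowing forward on the instance only.
        attn.forward = cached_forward

    model._kv_cache = cache
    return cache


def disable_kv_cache(model):
    """Step 4: undo the patch so the class-level forward is used again."""
    for block in model.blocks:
        attn = block.attention
        if hasattr(attn, "_original_forward"):
            del attn.forward  # drop the instance-level override
            del attn._original_forward
    model._kv_cache = None
```

Because the wrapper is attached to each instance and the class itself is never edited, the attention module's source stays untouched, which is the point of the non-invasive design described here.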

Systems Engineering Lesson:
Forward-only learning: New modules ADD features, never BREAK old ones
- Module 12 (Attention): Core implementation
- Module 13 (Transformers): Uses Module 12
- Module 14 (KV Caching): ENHANCES Module 12 without changing it

Milestone Integration:
- TinyGPT.generate() now uses enable_kv_cache() when use_cache=True
- Cache automatically created for model architecture
- Clean fallback if Module 14 is not available
- Educational notes explain concept vs production implementation
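
One way the optional-cache fallback could look is sketched below. It is an assumption-laden illustration, not TinyGPT.generate() itself: the module name tinytorch_kv_cache_hypothetical, the helper load_enable_kv_cache, and CountingModel are invented for the example, and the real import path is not given in this commit.

```python
import importlib


def load_enable_kv_cache(module_name="tinytorch_kv_cache_hypothetical"):
    """Return enable_kv_cache if the caching module is importable, else None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # Module 14 not installed: caller falls back to no cache
    return getattr(module, "enable_kv_cache", None)


class CountingModel:
    """Hypothetical toy model: the next token is the last token plus one."""

    def next_token(self, tokens):
        return tokens[-1] + 1


def generate(model, prompt, max_new_tokens=3, use_cache=True):
    # Only try to enable caching when the caller asked for it.
    enable = load_enable_kv_cache() if use_cache else None
    cache_active = enable is not None
    if cache_active:
        enable(model)

    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model.next_token(tokens))
    return tokens, cache_active
```

With this shape, generation yields the same tokens whether or not the caching module is present. Only the second return value, which reports whether the cache was switched on, differs.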

Module now: 1005 lines (805 + 200 integration code)
Tests: All pass (12/12 including new integration tests)

2025-11-05 18:19:52 -05:00

01_tensor

fix(module-01): Fix batched matmul and transpose grad preservation

2025-10-27 20:28:53 -04:00

02_activations

fix(module-02): Rewrite Softmax to use Tensor operations

2025-10-27 20:29:35 -04:00

03_layers

fix(module-03): Rewrite Dropout to use Tensor operations

2025-10-27 20:29:43 -04:00

04_losses

feat: Add Milestone 04 (CNN Revolution 1998) + Clean spatial imports