mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-03-26 16:32:34 -05:00
Module 14 fix:
- Updated cached_forward() to accept mask parameter (x, mask=None)
- Attention forward calls with 2 args: forward(x, mask)
- Now properly passes both arguments through to the original forward

Integration test (test_kv_cache_milestone.py):
- Tests generation WITHOUT cache (baseline)
- Tests generation WITH cache enabled
- Verifies cache infrastructure works without breaking the model
- Documents the current implementation (architecture demo)
- Shows that full speedup requires deeper attention integration

Test results:
✅ Without cache: 139.3 tok/s
✅ With cache: 142.5 tok/s (similar throughput is expected with pass-through)
✅ Cache infrastructure successfully integrated
✅ Model continues to work with caching enabled

Educational value: Students learn the PATTERN of non-invasive optimization
through composition and monkey-patching, which matters more than absolute
speedup numbers for this module.
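The non-invasive pattern described above can be sketched as follows. This is an illustrative mock, not TinyTorch's actual code: the names Attention, enable_cache, and _kv_cache are assumptions, and the "attention" math is a stand-in; the point is the composition/monkey-patching shape of cached_forward(x, mask=None) delegating both arguments to the original forward.

```python
class Attention:
    """Stand-in for a real attention layer (hypothetical, for illustration)."""

    def forward(self, x, mask=None):
        # Placeholder for real attention math.
        return [xi * 2 for xi in x]


def enable_cache(layer):
    """Wrap layer.forward with a pass-through cache via monkey-patching.

    The original class is untouched; only this instance is patched.
    """
    original_forward = layer.forward  # bound method, captured in the closure
    layer._kv_cache = []

    def cached_forward(x, mask=None):
        # A real KV cache would store keys/values here; this demo just
        # records the call, then passes BOTH arguments through.
        layer._kv_cache.append(x)
        return original_forward(x, mask)

    layer.forward = cached_forward  # instance attribute shadows the method
    return layer


attn = enable_cache(Attention())
out = attn.forward([1, 2, 3])
assert out == [2, 4, 6]          # model still works with caching enabled
assert len(attn._kv_cache) == 1  # cache infrastructure recorded the call
```

Because cached_forward only records and delegates, throughput stays roughly the same as the uncached baseline, matching the test results above; an actual speedup requires the cache to be consulted inside the attention computation itself.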