mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-04-28 17:37:31 -05:00
Module 14 now provides a TRUE O(n²) → O(n) transformation with measurable speedup!

Implementation:
- cached_forward() now computes K,V only for the NEW token
- Stores K,V in the cache, retrieves the full history for attention
- Uses numpy operations directly for efficiency
- Detects single-token (generation) vs full-sequence (training) input
- First token handled via the original path (cache initialization)

Results (test_kv_cache_milestone.py):
✅ WITHOUT cache: 118.2 tok/s (baseline)
✅ WITH cache: 705.6 tok/s (optimized)
✅ SPEEDUP: 6x on a tiny model (2 layers, embed_dim=32)

For longer sequences, a 10-15x+ speedup is expected.

Milestone integration (vaswani_chatgpt.py):
- Resets the cache at the start of each generation
- Populates the cache with the prompt tokens
- Processes only the new token when the cache is enabled
- Calls cache.advance() after each token
- Seamless fallback to standard generation

Gradient safety:
✅ Training (seq_len>1): uses the original path (full gradients)
✅ Generation (seq_len=1): uses the cache path (inference only)
✅ No gradient tracking in cache operations (uses .data)

This is how production LLMs work! Students learn real ML systems engineering.
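A minimal numpy sketch of the cached_forward idea described above, for a single attention head. The `KVCache` class and function signature here are illustrative stand-ins, not the module's actual API: per generated token, only that token's K,V are computed and appended, while attention reads the full cached history, so each step costs O(n) instead of recomputing O(n²) worth of K,V.

```python
import numpy as np

class KVCache:
    """Illustrative per-layer cache holding all past keys/values (hypothetical API)."""
    def __init__(self):
        self.k = None  # (t, d) keys for all tokens seen so far
        self.v = None  # (t, d) values for all tokens seen so far

    def append(self, k_new, v_new):
        # Store the new token's K,V alongside the cached history.
        self.k = k_new if self.k is None else np.vstack([self.k, k_new])
        self.v = v_new if self.v is None else np.vstack([self.v, v_new])
        return self.k, self.v

def cached_forward(x, Wq, Wk, Wv, cache):
    """Single-head attention for ONE new token (seq_len == 1).

    Only the new token's K,V are computed; the cache supplies the
    full history for the attention weights.
    """
    q = x @ Wq                                   # (1, d) query for the new token only
    k_all, v_all = cache.append(x @ Wk, x @ Wv)  # cached history + new K,V
    scores = q @ k_all.T / np.sqrt(q.shape[-1])  # (1, t) attention scores
    weights = np.exp(scores - scores.max())      # numerically stable softmax
    weights /= weights.sum()
    return weights @ v_all                       # (1, d) attention output

rng = np.random.default_rng(0)
d = 32
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
cache = KVCache()
outs = [cached_forward(rng.standard_normal((1, d)), Wq, Wk, Wv, cache)
        for _ in range(5)]
print(cache.k.shape)  # (5, 32): cache holds K for all 5 tokens
```

Note there is no autograd anywhere in this path, matching the gradient-safety rule above: the cached path is inference-only, while full-sequence (seq_len>1) training input goes through the original forward.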
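The milestone integration steps above (reset, populate with the prompt, then feed one new token per step) can be sketched as the loop below. `ToyModel` and its `reset_cache`/`forward_one` methods are hypothetical stand-ins for the real model in vaswani_chatgpt.py, used only to show the control flow of cached generation.

```python
import numpy as np

class ToyModel:
    """Stand-in model with a hypothetical cached single-token API."""
    def __init__(self, vocab=10):
        self.vocab = vocab
        self.cache = []  # stands in for the per-layer KV caches

    def reset_cache(self):
        self.cache = []

    def forward_one(self, token_id):
        # Cached path: process only the new token, append it to the cache.
        self.cache.append(token_id)
        # Toy logits: a deterministic function of the running history.
        return np.eye(self.vocab)[sum(self.cache) % self.vocab]

def generate(model, prompt_ids, max_new_tokens):
    model.reset_cache()                 # reset at the start of each generation
    for t in prompt_ids:                # populate the cache with prompt tokens
        logits = model.forward_one(t)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):     # each step processes ONE new token
        next_id = int(np.argmax(logits))
        ids.append(next_id)
        logits = model.forward_one(next_id)
    return ids

model = ToyModel()
out = generate(model, [1, 2, 3], max_new_tokens=4)
print(len(out))          # 7: prompt (3) + 4 new tokens
print(len(model.cache))  # 7: every token entered the cache exactly once
```

The key property to check is the last line: each token reaches the model exactly once, which is what turns per-token cost from O(n²) into O(n).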