ollama/llm
Jesse Gross 638faeac54 mlxrunner: Report actual memory usage from runner
The MLX runner previously reported a static VRAM estimate that was
computed at load time and consisted only of the weights. This is
strictly less than the actual memory usage, as it does not include
the KV cache or compute graph.
2026-02-27 17:29:47 -08:00