Files
Jesse Gross 4d5ff25724 mlxrunner: Report actual memory usage from runner
The MLX runner previously reported a static VRAM estimate that was
computed at load time and consisted only of the weights. This is
strictly less than the actual memory usage, as it does not include
the KV cache or compute graph.
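
The gap between the two numbers can be illustrated with a minimal sketch. It is not the MLX runner's code: the memoryUsage type, its field names, and the byte counts are all hypothetical, chosen only to show why a weights-only estimate undercounts.

```go
package main

import "fmt"

// memoryUsage is a hypothetical breakdown of runner memory in bytes.
// The type and field names are illustrative assumptions, not the runner's API.
type memoryUsage struct {
	Weights      uint64 // model weights, known at load time
	KVCache      uint64 // allocated as context is processed
	ComputeGraph uint64 // scratch buffers used during evaluation
}

// staticEstimate mirrors the old behavior described above: weights only.
func staticEstimate(m memoryUsage) uint64 {
	return m.Weights
}

// actualUsage sums every component, so it exceeds the static estimate
// whenever the KV cache or compute graph is non-empty.
func actualUsage(m memoryUsage) uint64 {
	return m.Weights + m.KVCache + m.ComputeGraph
}

func main() {
	// Made-up sizes: 4 GiB of weights, 512 MiB of KV cache, 256 MiB of graph.
	m := memoryUsage{Weights: 4 << 30, KVCache: 512 << 20, ComputeGraph: 256 << 20}
	fmt.Printf("static estimate: %d MiB\n", staticEstimate(m)>>20)
	fmt.Printf("actual usage:    %d MiB\n", actualUsage(m)>>20)
}
```

In practice the actual figure would presumably be read from the allocator at runtime rather than summed from fixed fields; the sum here only stands in for that measurement.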
2026-02-25 15:06:37 -08:00