ollama-ollama

mirror of https://github.com/ollama/ollama.git synced 2026-03-11 20:23:55 -05:00

Files

Jesse Gross 4e57d2094e mlxrunner: Simplify pipeline memory and cache management

Particularly in error cases, it can be difficult to ensure that
all pinned memory is unpinned, MLX buffers are released and cache
state is consistent. This encapsulates those pieces and sets up
proper deferrals so that this happens automatically on exit.
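
The pattern this commit message describes, bundling cleanup into one object and deferring it, can be illustrated with a minimal Go sketch. This is not the code from the commit: `tracker`, `step`, `begin`, `close`, and `run` are hypothetical names, plain counters stand in for pinned memory, MLX buffers, and cache state, and rolling the cache back on failure is an assumed way of keeping it consistent.

```go
package main

import (
	"errors"
	"fmt"
)

// tracker is a hypothetical stand-in for global runner state:
// counters replace real pinned memory, MLX buffers, and the cache.
type tracker struct {
	pinned   int
	buffers  int
	cacheLen int
}

// step (hypothetical) records everything one pipeline run acquires,
// so a single deferred close can undo all of it.
type step struct {
	t             *tracker
	savedCacheLen int
	pinned        int
	buffers       int
	committed     bool
}

// begin snapshots the cache length so it can be restored on failure.
func begin(t *tracker) *step {
	return &step{t: t, savedCacheLen: t.cacheLen}
}

func (s *step) pin()    { s.t.pinned++; s.pinned++ }
func (s *step) alloc()  { s.t.buffers++; s.buffers++ }
func (s *step) commit() { s.committed = true }

// close releases what this step acquired. If the step was never
// committed, it also rolls the cache back to the snapshot (an assumption).
func (s *step) close() {
	s.t.pinned -= s.pinned
	s.t.buffers -= s.buffers
	if !s.committed {
		s.t.cacheLen = s.savedCacheLen
	}
}

// run simulates one pipeline pass. The deferred close fires on every
// return path, including the early error return.
func run(t *tracker, fail bool) error {
	s := begin(t)
	defer s.close()

	s.pin()
	s.alloc()
	t.cacheLen += 4 // tentatively extend the cache

	if fail {
		return errors.New("simulated failure")
	}
	s.commit()
	return nil
}

func main() {
	t := &tracker{}
	fmt.Println(run(t, true), *t)
	fmt.Println(run(t, false), *t)
}
```

Because `close` is deferred immediately after `begin`, no error path added later can skip the cleanup, which is the property the commit message is after.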

2026-02-25 14:00:42 -08:00

cache

mlxrunner: Fix duplicate log prefixes and reduce log noise

2026-02-23 14:09:20 -08:00

mlx

update mlx-c bindings to 0.5.0 (#14380)

2026-02-23 16:44:29 -08:00

model

mlx: don't default to affine quantization for unquantized models

2026-02-23 15:03:53 -08:00

sample

…