[PR #14522] [CLOSED] sampling penalties for MLX #25250

Closed
opened 2026-04-19 18:06:05 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14522
Author: @pdevine
Created: 3/1/2026
Status: Closed

Base: main ← Head: pdevine/sampling-penalties


📝 Commits (5)

📊 Changes

24 files changed (+3291 additions, -118 deletions)


📝 llm/server.go (+6 -4)
📝 server/routes.go (+49 -16)
📝 x/create/create.go (+23 -4)
📝 x/create/create_test.go (+42 -0)
📝 x/mlxrunner/cache.go (+94 -22)
📝 x/mlxrunner/cache/cache.go (+18 -0)
➕ x/mlxrunner/cache/recurrent.go (+220 -0)
📝 x/mlxrunner/client.go (+45 -10)
➕ x/mlxrunner/client_test.go (+167 -0)
📝 x/mlxrunner/imports.go (+2 -0)
➕ x/mlxrunner/mlx/gated_delta_metal.go (+275 -0)
📝 x/mlxrunner/mlx/mlx.go (+1 -1)
📝 x/mlxrunner/mlx/ops.go (+24 -0)
📝 x/mlxrunner/mlx/ops_extra.go (+73 -1)
📝 x/mlxrunner/pipeline.go (+37 -17)
📝 x/mlxrunner/runner.go (+10 -5)
📝 x/mlxrunner/sample/sample.go (+169 -34)
➕ x/mlxrunner/sample/sample_test.go (+104 -0)
📝 x/mlxrunner/server.go (+87 -4)
➕ x/mlxrunner/server_test.go (+172 -0)

...and 4 more files

📄 Description

This change adds more sampling parameters for the mlxrunner.

Based on #14417


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Commits:

- [`a6c1aa4`](https://github.com/ollama/ollama/commit/a6c1aa4da5c3dcf3555fb08cb0c62db4b348551d) smaller recurrent cache
- [`1a23c1a`](https://github.com/ollama/ollama/commit/1a23c1a8101917ce642b38945c091cc2cb822ec5) add qwen3.5
- [`560626f`](https://github.com/ollama/ollama/commit/560626fb437446afedce01703646103ee9ada311) cleanup
- [`dd49753`](https://github.com/ollama/ollama/commit/dd497534c450c57125f318e744592652bedda9d5) allow think/nothink in mlxrunner
- [`67ce53b`](https://github.com/ollama/ollama/commit/67ce53b9b5c08ed4a8543c5253920803816ca3ef) wip sampling
GiteaMirror added the pull-request label 2026-04-19 18:06:05 -05:00
Reference: github-starred/ollama#25250