[PR #15784] sample: implement repeat, frequency, and presence penalties in Go sampler #62004

Open
opened 2026-04-29 16:57:40 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15784
Author: @42euge
Created: 4/24/2026
Status: 🔄 Open

Base: mainHead: fix-go-sampler-repeat-penalty


📝 Commits (2)

  • 61844d9 sample: implement repeat, frequency, and presence penalties in Go sampler
  • a19534d sample: add integration and edge case tests for repeat penalty

📊 Changes

6 files changed (+224 additions, -23 deletions)

View changed files

📝 runner/ollamarunner/runner.go (+4 -0)
📝 sample/samplers.go (+40 -13)
📝 sample/samplers_benchmark_test.go (+4 -4)
📝 sample/samplers_test.go (+54 -6)
📝 sample/transforms.go (+32 -0)
📝 sample/transforms_test.go (+90 -0)

📄 Description

Summary

The Go-native sampler (sample/samplers.go) used by models on the ollamarunner path accepts repeat_penalty, frequency_penalty, and presence_penalty via the API but silently ignores them. This PR implements the missing penalties, matching llama.cpp's sampling_repetition_penalties behavior.

Changes:

  • Add repeatPenalize() transform to sample/transforms.go — penalizes logits for recently generated tokens
  • Add penalty fields and token history ring buffer to the Sampler struct
  • Wire repeat_penalty, repeat_last_n, frequency_penalty, and presence_penalty from API options through to the sampler in runner/ollamarunner/runner.go
  • Add unit tests for the transform and integration tests for the full sampling pipeline

No API changes neededapi/types.go already defines these fields with defaults (repeat_penalty: 1.1, repeat_last_n: 64).

Impact

Most visible on audio transcription with Gemma 4 e4b, where the lack of repeat penalty caused severe repetition loops:

Sentence type Before (no penalty) After (penalty working)
Long paragraph (25s) 84.7% WER 30.6% WER
Technical jargon 60.6% WER 33.3% WER
Simple/short sentences 0.0% WER 0.0% WER

Test plan

  • go test ./sample/ — all existing + new tests pass
  • TestRepeatPenalize — verifies transform math (positive/negative logits, frequency, presence, combined, edge cases)
  • TestRepeatPenaltyIntegration — verifies penalty changes greedy token selection, history accumulates, ring buffer caps at repeatLastN
  • Benchmarked against a 7-sentence voice corpus with Gemma 4 e4b audio transcription
  • Verified no regression on models using the llamarunner path

Fixes #15783
Related: #9278

🤖 Generated with Claude Code


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/ollama/ollama/pull/15784 **Author:** [@42euge](https://github.com/42euge) **Created:** 4/24/2026 **Status:** 🔄 Open **Base:** `main` ← **Head:** `fix-go-sampler-repeat-penalty` --- ### 📝 Commits (2) - [`61844d9`](https://github.com/ollama/ollama/commit/61844d9f277090deaec57cc3b7eb21638f13595d) sample: implement repeat, frequency, and presence penalties in Go sampler - [`a19534d`](https://github.com/ollama/ollama/commit/a19534d3da0d106fa14a3cfe70e222e06ab0de8f) sample: add integration and edge case tests for repeat penalty ### 📊 Changes **6 files changed** (+224 additions, -23 deletions) <details> <summary>View changed files</summary> 📝 `runner/ollamarunner/runner.go` (+4 -0) 📝 `sample/samplers.go` (+40 -13) 📝 `sample/samplers_benchmark_test.go` (+4 -4) 📝 `sample/samplers_test.go` (+54 -6) 📝 `sample/transforms.go` (+32 -0) 📝 `sample/transforms_test.go` (+90 -0) </details> ### 📄 Description ## Summary The Go-native sampler (`sample/samplers.go`) used by models on the `ollamarunner` path accepts `repeat_penalty`, `frequency_penalty`, and `presence_penalty` via the API but silently ignores them. This PR implements the missing penalties, matching llama.cpp's `sampling_repetition_penalties` behavior. **Changes:** - Add `repeatPenalize()` transform to `sample/transforms.go` — penalizes logits for recently generated tokens - Add penalty fields and token history ring buffer to the `Sampler` struct - Wire `repeat_penalty`, `repeat_last_n`, `frequency_penalty`, and `presence_penalty` from API options through to the sampler in `runner/ollamarunner/runner.go` - Add unit tests for the transform and integration tests for the full sampling pipeline **No API changes needed** — `api/types.go` already defines these fields with defaults (`repeat_penalty: 1.1`, `repeat_last_n: 64`). ## Impact Most visible on audio transcription with Gemma 4 e4b, where the lack of repeat penalty caused severe repetition loops: | Sentence type | Before (no penalty) | After (penalty working) | |--------------|---------------------|------------------------| | Long paragraph (25s) | 84.7% WER | 30.6% WER | | Technical jargon | 60.6% WER | 33.3% WER | | Simple/short sentences | 0.0% WER | 0.0% WER | ## Test plan - [x] `go test ./sample/` — all existing + new tests pass - [x] `TestRepeatPenalize` — verifies transform math (positive/negative logits, frequency, presence, combined, edge cases) - [x] `TestRepeatPenaltyIntegration` — verifies penalty changes greedy token selection, history accumulates, ring buffer caps at `repeatLastN` - [x] Benchmarked against a 7-sentence voice corpus with Gemma 4 e4b audio transcription - [x] Verified no regression on models using the `llamarunner` path Fixes #15783 Related: #9278 🤖 Generated with [Claude Code](https://claude.com/claude-code) --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-04-29 16:57:40 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#62004