[PR #15542] sample: reduce greedy and top-k sampler allocations #61892

Open
opened 2026-04-29 16:53:01 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15542
Author: @vonbai
Created: 4/13/2026
Status: 🔄 Open

Base: mainHead: goalx/cert-evolve-sampler


📝 Commits (3)

  • 58c7080 sample: avoid greedy sampler allocation
  • b5cd722 sample: avoid top-k heap allocations
  • 958e624 sample: preserve top-k tie ordering

📊 Changes

4 files changed (+77 additions, -30 deletions)

View changed files

📝 sample/samplers.go (+10 -0)
📝 sample/samplers_test.go (+20 -0)
📝 sample/transforms.go (+27 -30)
📝 sample/transforms_test.go (+20 -0)

📄 Description

Summary

This reduces sampler overhead in two focused paths:

  • add a no-allocation greedy fast path when temperature is zero and no grammar sampler is active
  • replace positive top-k sampling's generic container/heap path with an in-place typed min-heap over the existing token slice
  • add regression coverage for greedy allocations and top-k tie ordering

The top-p/full-sort path is intentionally left unchanged to avoid changing broader sampling semantics.

Validation

  • go test ./sample
  • go test ./sample -run '^$' -bench 'Benchmark(GreedySampler|WeightedSampler|Transforms)' -benchmem -count=1
  • git diff --check upstream/main..HEAD

Benchmark signals from the replayed current-upstream branch:

  • BenchmarkWeightedSampler/ConfigGreedy: 83574 ns/op, 0 B/op, 0 allocs/op
  • BenchmarkGreedySampler/Size_100000: 81878 ns/op, 0 B/op, 0 allocs/op
  • BenchmarkWeightedSampler/ConfigTopK: 548624 ns/op, 1024023 B/op, 1 alloc/op
  • BenchmarkTransforms/TopK: 28393 ns/op, 0 B/op, 0 allocs/op

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/ollama/ollama/pull/15542 **Author:** [@vonbai](https://github.com/vonbai) **Created:** 4/13/2026 **Status:** 🔄 Open **Base:** `main` ← **Head:** `goalx/cert-evolve-sampler` --- ### 📝 Commits (3) - [`58c7080`](https://github.com/ollama/ollama/commit/58c7080eb8165643e5f046ac6712b95810fcf54a) sample: avoid greedy sampler allocation - [`b5cd722`](https://github.com/ollama/ollama/commit/b5cd722d75110023d78b85648759ccb7c467a793) sample: avoid top-k heap allocations - [`958e624`](https://github.com/ollama/ollama/commit/958e624ac616db6f270dcab5b0ec2f51968f3560) sample: preserve top-k tie ordering ### 📊 Changes **4 files changed** (+77 additions, -30 deletions) <details> <summary>View changed files</summary> 📝 `sample/samplers.go` (+10 -0) 📝 `sample/samplers_test.go` (+20 -0) 📝 `sample/transforms.go` (+27 -30) 📝 `sample/transforms_test.go` (+20 -0) </details> ### 📄 Description ## Summary This reduces sampler overhead in two focused paths: - add a no-allocation greedy fast path when temperature is zero and no grammar sampler is active - replace positive top-k sampling's generic `container/heap` path with an in-place typed min-heap over the existing token slice - add regression coverage for greedy allocations and top-k tie ordering The top-p/full-sort path is intentionally left unchanged to avoid changing broader sampling semantics. ## Validation - `go test ./sample` - `go test ./sample -run '^$' -bench 'Benchmark(GreedySampler|WeightedSampler|Transforms)' -benchmem -count=1` - `git diff --check upstream/main..HEAD` Benchmark signals from the replayed current-upstream branch: - `BenchmarkWeightedSampler/ConfigGreedy`: `83574 ns/op`, `0 B/op`, `0 allocs/op` - `BenchmarkGreedySampler/Size_100000`: `81878 ns/op`, `0 B/op`, `0 allocs/op` - `BenchmarkWeightedSampler/ConfigTopK`: `548624 ns/op`, `1024023 B/op`, `1 alloc/op` - `BenchmarkTransforms/TopK`: `28393 ns/op`, `0 B/op`, `0 allocs/op` --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-04-29 16:53:01 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#61892