[GH-ISSUE #1344] Beam search (best of) for completion API #26462

Open
opened 2026-04-22 02:45:40 -05:00 by GiteaMirror · 3 comments

Originally created by @aparatext on GitHub (Dec 1, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1344

Beam search is a decoding strategy that maximizes the probability of the entire completion, not just the next token.

While it can be ignored for simpler uses, any form of reasoning, especially with a tiny model, requires beam search to backtrack from incorrect steps.

llama.cpp already supports beam search, so it should be trivial to implement.
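To make the request concrete, here is a minimal sketch of what beam search does, independent of any particular engine. The `next_token_logprobs` callback and the toy model below are hypothetical stand-ins for a real model's next-token distribution; the point is that keeping several candidate prefixes lets the search back out of a locally best first token, which greedy decoding (beam width 1) cannot do.

```python
import heapq
import math

def beam_search(next_token_logprobs, beam_width=3, max_len=5, eos="<eos>"):
    """Minimal beam search sketch.

    next_token_logprobs(seq) is assumed to return a dict mapping candidate
    tokens to log-probabilities for the given prefix. Each step keeps the
    beam_width highest-scoring partial sequences; the best complete sequence
    by total log-probability wins.
    """
    beams = [(0.0, [])]  # (cumulative log-prob, token sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq and seq[-1] == eos:
                candidates.append((score, seq))  # already finished
                continue
            for tok, lp in next_token_logprobs(seq).items():
                candidates.append((score + lp, seq + [tok]))
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return max(beams, key=lambda b: b[0])

# Toy model where the greedy first choice is a dead end:
# "a" is more likely at step one (0.6 vs 0.4), but the "b" branch has the
# higher-probability completion overall (0.4 * 0.9 > 0.6 * 0.3).
def toy(seq):
    if not seq:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    if seq == ["a"]:
        return {"x": math.log(0.3)}
    if seq == ["b"]:
        return {"y": math.log(0.9)}
    return {"<eos>": 0.0}
```

With `beam_width=2` the search recovers the globally better sequence `["b", "y", "<eos>"]`, while `beam_width=1` degenerates to greedy decoding and commits to `["a", "x", "<eos>"]` — which is exactly the backtracking behavior the issue asks for.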

GiteaMirror added the feature request label 2026-04-22 02:45:40 -05:00

@Probst1nator commented on GitHub (Nov 21, 2024):

Is this still active? If this won't ever be natively supported, please say so; then I'll find a workaround for my projects.


@carl-krikorian commented on GitHub (Feb 23, 2025):

I am also very interested in this. Any idea if this will be implemented, or where I can find it?


@mjspeck commented on GitHub (Mar 13, 2025):

Also very interested in seeing this implemented. It seems crucial for exactly the application Ollama is designed around (running small LLMs on local hardware).


Reference: github-starred/ollama#26462