[PR #7368] [CLOSED] runner.go: Use stable llama.cpp sampling interface #17670

Closed
opened 2026-04-16 06:10:26 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7368
Author: @jessegross
Created: 10/25/2024
Status: Closed

Base: main ← Head: jessegross/sample


📝 Commits (1)

  • c14f348 runner.go: Use stable llama.cpp sampling interface

📊 Changes

15 files changed (+121 additions, -30235 deletions)

View changed files

llama/base64.hpp (+0 -392)
llama/common.cpp (+0 -2092)
llama/common.h (+0 -581)
llama/json-schema-to-grammar.cpp (+0 -1071)
llama/json-schema-to-grammar.h (+0 -34)
llama/json.hpp (+0 -24766)
📝 llama/llama.go (+113 -35)
llama/log.cpp (+0 -427)
llama/log.h (+0 -118)
📝 llama/make/Makefile.sync (+2 -15)
📝 llama/runner/runner.go (+6 -1)
llama/sampling.cpp (+0 -484)
llama/sampling.h (+0 -109)
llama/sampling_ext.cpp (+0 -56)
llama/sampling_ext.h (+0 -54)

📄 Description

Currently, sampling goes through an internal interface meant for the llama.cpp examples, which tends to change from release to release. This is the only such interface used for text models, though llava and clip are also used for image processing.

This switches to use the stable interfaces, reducing the amount of work needed for future llama.cpp bumps. It also significantly reduces the amount of code that we need to vendor (much of it is unused but is a dependency).

The sampling logic is unchanged for the parameters that we support and is done at the CGo layer. However, if there is a benefit to reconfiguring it in the future, we can expose the primitives to native Go code.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:10:27 -05:00

Reference: github-starred/ollama#17670