[PR #5716] [CLOSED] Require cached prompt to match certain percentage. #74193

Closed
opened 2026-05-05 06:09:48 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5716
Author: @rasodu
Created: 7/16/2024
Status: Closed

Base: main ← Head: fix-cache-issue


📝 Commits (1)

  • e615cd8 Require cached prompt to match certain percentage.

📊 Changes

1 file changed (+7 additions, -2 deletions)

View changed files

📝 llm/ext_server/server.cpp (+7 -2)

📄 Description

We have three options:

  1. Any number of characters are matched
    • This is the current logic. The problem is that even a match on just the beginning of the prompt, e.g. "<|user|>", leads to selecting that slot. As a result, every request ends up using the first slot.
  2. All characters are matched
    • We could require the entire prompt to match. But I use Open WebUI, which lets you edit the last prompt; with a full-match requirement, even editing only the last prompt would re-evaluate the entire prompt.
  3. Partial match
    • This is what I have implemented in this PR. It fixes the bug we are facing while still allowing the prompt to change somewhat without triggering a full re-evaluation. The PR requires 60% of the cached prompt to match, but the percentage could be any value. Essentially, we require re-evaluation whenever the prompt matches less than 60% of the cached prompt.

Let me know if you have any other ideas to fix this issue.
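For illustration, option 3 can be sketched as below. This is a minimal standalone approximation, not the actual diff to `llm/ext_server/server.cpp` (which operates on token sequences during slot selection); the function names and the character-level comparison here are hypothetical.

```cpp
#include <algorithm>
#include <string>

// Length of the shared prefix between two strings.
static size_t common_prefix_len(const std::string &a, const std::string &b) {
    size_t n = std::min(a.size(), b.size());
    size_t i = 0;
    while (i < n && a[i] == b[i]) i++;
    return i;
}

// Option 3 (partial match): treat a slot's cache as usable only when the
// shared prefix covers at least `threshold` of the cached prompt.
// With threshold = 0.60, a prompt that shares only the "<|user|>" header
// with the cache no longer counts as a hit, fixing the "everything lands
// in slot 1" bug, while small edits near the end of a long prompt still
// reuse the cache.
static bool cache_hit(const std::string &cached, const std::string &prompt,
                      double threshold = 0.60) {
    if (cached.empty()) return false;
    size_t shared = common_prefix_len(cached, prompt);
    return static_cast<double>(shared) >=
           threshold * static_cast<double>(cached.size());
}
```

Under option 1, any nonzero shared prefix would count as a hit; under option 2, only `shared == cached.size()` would. The threshold interpolates between the two.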


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 06:09:48 -05:00

Reference: github-starred/ollama#74193