[PR #1963] [MERGED] trim chat prompt based on llm context size #21279

Closed
opened 2026-04-19 15:33:30 -05:00 by GiteaMirror (Owner) · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/1963
Author: @BruceMacD
Created: 1/12/2024
Status: Merged
Merged: 1/30/2024
Merged by: @BruceMacD

Base: main ← Head: brucemacd/template-token-smart


📝 Commits (8)

  • 04ec4f3 trim chat prompt based on llm context size
  • 87eeb4e Update images_test.go
  • c930fe8 refactor contextLimitPrompt
  • 94c21b5 maintain system message in chat history
  • 28e0293 lint fix
  • 171e22b fix lint
  • 4f6f68d formatting
  • b8608ff refactor

📊 Changes

4 files changed (+440 additions, -57 deletions)

📝 server/images.go (+24 -27)
📝 server/images_test.go (+56 -28)
📝 server/routes.go (+105 -2)
📝 server/routes_test.go (+255 -0)

📄 Description

When trimming the input chat prompt, we need to keep the prompt template in its expected format. Without this, once the maximum context length is reached the prompt is trimmed without accounting for the model template, which can result in unexpected behavior from the model.
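To make the failure mode concrete, here is a small, illustrative sketch; the Llama-2-style template string and the word-based "tokenizer" are made up for this example, not taken from this PR:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Illustrative only: a Llama-2-style chat prompt already rendered to a
	// flat string by the model's template (not ollama's actual code).
	rendered := "[INST] <<SYS>>\nYou are helpful.\n<</SYS>>\n\nfirst question [/INST] first answer [INST] second question [/INST]"

	// Naive trimming: keep only the last few "tokens" (whitespace-split
	// words here, for simplicity) to fit a small context window.
	words := strings.Fields(rendered)
	fmt.Println(strings.Join(words[len(words)-8:], " "))
	// Prints: question [/INST] first answer [INST] second question [/INST]
	// The system block and the opening [INST] are gone, so the model sees
	// a prompt shape its template never produces.
}
```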

  • update the ChatPrompt function to return the list of prompt variables, so the calling function can assemble them into the final prompt
  • create the final prompt based on the loaded LLM's context window size, preserving the prompt template's formatting and keeping the system message in the first message of the new context window (a rough sketch of this windowing follows)
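
As a rough sketch of that message-level windowing idea; the Message type, countTokens parameter, and trimToContext helper here are hypothetical stand-ins, not the actual code this PR adds to server/routes.go:

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for the server's chat message type.
type Message struct {
	Role    string // "system", "user", or "assistant"
	Content string
}

// trimToContext keeps the newest messages that fit within numCtx tokens,
// always retaining a leading system message so the rendered prompt stays
// well-formed once the model template is applied.
func trimToContext(msgs []Message, numCtx int, countTokens func(string) int) []Message {
	var system []Message
	rest := msgs
	if len(msgs) > 0 && msgs[0].Role == "system" {
		system, rest = msgs[:1], msgs[1:]
	}

	// Reserve room for the system message up front.
	budget := numCtx
	for _, m := range system {
		budget -= countTokens(m.Content)
	}

	// Walk backwards from the newest message, keeping whole messages while
	// they fit; partial messages are never kept.
	keep := len(rest)
	for i := len(rest) - 1; i >= 0; i-- {
		cost := countTokens(rest[i].Content)
		if cost > budget {
			break
		}
		budget -= cost
		keep = i
	}

	return append(system, rest[keep:]...)
}

func main() {
	// A crude word-count "tokenizer" keeps the example self-contained.
	countWords := func(s string) int { return len(strings.Fields(s)) }

	msgs := []Message{
		{Role: "system", Content: "You are helpful."},
		{Role: "user", Content: "first question"},
		{Role: "assistant", Content: "first answer"},
		{Role: "user", Content: "second question"},
	}

	// With an 8-token window the oldest user turn is dropped, but the
	// system message survives at the front.
	fmt.Println(trimToContext(msgs, 8, countWords))
}
```

Dropping whole messages from the oldest end, rather than cutting tokens out of the rendered string, is what keeps the template's turn structure and the system message intact.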

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 15:33:30 -05:00

Reference: github-starred/ollama#21279