[PR #11794] [MERGED] server: Reduce gpt-oss context length for small VRAM GPUs #18893

Closed
opened 2026-04-16 06:50:57 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/11794
Author: @jessegross
Created: 8/7/2025
Status: Merged
Merged: 8/7/2025
Merged by: @jessegross

Base: main ← Head: jessegross/vram


📝 Commits (1)

  • 2d03782 server: Reduce gpt-oss context length for small VRAM GPUs

📊 Changes

1 file changed (+20 additions, -4 deletions)


📝 server/routes.go (+20 -4)

📄 Description

gpt-oss works best with a context length of at least 8k. However, on GPUs with a limited amount of VRAM, this larger context carries a significant performance hit. In these cases, we switch to the Ollama default of 4k.
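
The selection logic described above can be sketched roughly as follows. This is a minimal illustration and not the code from `server/routes.go`: the function name `chooseNumCtx`, the family string `"gptoss"`, and especially the 16 GiB cutoff in `smallVRAMBytes` are assumptions made for the example, since the description does not state the actual threshold or identifiers.

```go
package main

import "fmt"

const (
	defaultNumCtx = 4096 // Ollama default of 4k, per the PR description
	gptOSSNumCtx  = 8192 // gpt-oss "works best with at least 8k"

	// Hypothetical cutoff for "limited VRAM". The PR description does not
	// give the real value; 16 GiB is only a placeholder for this sketch.
	smallVRAMBytes uint64 = 16 << 30
)

// chooseNumCtx returns the context length to use when the user has not
// requested one. The family name "gptoss" is an assumed identifier.
func chooseNumCtx(family string, totalVRAM uint64) int {
	if family == "gptoss" && totalVRAM >= smallVRAMBytes {
		// Enough VRAM: use the larger context that gpt-oss prefers.
		return gptOSSNumCtx
	}
	// Other models, or gpt-oss on a small GPU: keep the default.
	return defaultNumCtx
}

func main() {
	fmt.Println(chooseNumCtx("gptoss", 8<<30))  // small GPU
	fmt.Println(chooseNumCtx("gptoss", 24<<30)) // large GPU
	fmt.Println(chooseNumCtx("llama", 24<<30))  // non-gpt-oss model
}
```

In this sketch the fallback only applies to the automatically chosen value; how an explicitly requested context length interacts with it is not covered by the description and is left out here.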


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:50:57 -05:00
Reference: github-starred/ollama#18893