[PR #13946] [MERGED] Use tiered VRAM-based default context length #14453

Closed
opened 2026-04-13 00:54:35 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13946
Author: @jessegross
Created: 1/28/2026
Status: Merged
Merged: 2/2/2026
Merged by: @jessegross

Base: main ← Head: jessegross/context


📝 Commits (2)

  • 05359f3 server: fix ollama ps showing configured instead of actual context length
  • 0cab1b6 server: use tiered VRAM-based default context length

📊 Changes

10 files changed (+170 additions, -26 deletions)

View changed files

📝 envconfig/config.go (+2 -2)
📝 envconfig/config_test.go (+1 -1)
📝 llm/server.go (+5 -0)
📝 server/routes.go (+22 -23)
📝 server/routes_debug_test.go (+2 -0)
📝 server/routes_generate_renderer_test.go (+2 -0)
📝 server/routes_generate_test.go (+3 -0)
➕ server/routes_options_test.go (+127 -0)
📝 server/sched_test.go (+1 -0)
📝 x/imagegen/server.go (+5 -0)

📄 Description

Replace the binary low-VRAM mode with tiered VRAM thresholds that set the default context length for all models:

  • < 24 GiB VRAM: 4k context
  • 24-48 GiB VRAM: 32k context
  • >= 48 GiB VRAM: 256k context

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


Reference: github-starred/ollama#14453