[PR #6504] [CLOSED] openai: increase context window when max_tokens is provided #22670

Closed
opened 2026-04-19 16:28:27 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6504
Author: @jmorganca
Created: 8/25/2024
Status: Closed

Base: main ← Head: jmorganca/openai-context


📝 Commits (3)

  • 9899f18 openai: increase context window when max_tokens is provided
  • dc04f41 fix linter issues
  • 5a67f93 fix tests

📊 Changes

2 files changed (+137 additions, -120 deletions)

View changed files

📝 openai/openai.go (+6 -1)
📝 openai/openai_test.go (+131 -119)

📄 Description

Previously, /v1/chat/completions requests were limited to 2048 tokens. This PR extends the context length by setting num_ctx to max_tokens when it is larger than the default context window of 2048 tokens. It also includes a minor cleanup of the OpenAI compatibility unit tests.

Note: this doesn't solve the case of having a large context window while limiting the number of tokens to a small number. This will be solved in a future change where num_ctx will be set automatically based on available VRAM and compute.

Fixes https://github.com/ollama/ollama/issues/6286 and https://github.com/ollama/ollama/issues/5356
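The rule the description states can be sketched as a small helper. This is a minimal illustration, not the PR's actual code: the function name `contextForRequest` and the standalone `defaultContext` constant are assumptions for the example; the real change lives inside the request-mapping logic in `openai/openai.go`.

```go
package main

import "fmt"

// defaultContext mirrors the default context window of 2048 tokens
// mentioned in the PR description.
const defaultContext = 2048

// contextForRequest is a hypothetical helper showing the PR's rule:
// grow num_ctx to max_tokens when the caller asks for more tokens
// than the default window, otherwise keep the default.
func contextForRequest(maxTokens int) int {
	if maxTokens > defaultContext {
		return maxTokens
	}
	return defaultContext
}

func main() {
	fmt.Println(contextForRequest(4096)) // larger max_tokens widens num_ctx
	fmt.Println(contextForRequest(512))  // small max_tokens keeps the default
}
```

Note that, as the description says, this does not handle the inverse case (large window, small completion limit); that is deferred to a later change.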


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 16:28:27 -05:00

Reference: github-starred/ollama#22670