[GH-ISSUE #9440] Allow change of the context window from Python #6155

Closed
opened 2026-04-12 17:30:18 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @mmb78 on GitHub (Mar 1, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9440

It would be great if one could also change the context window length parameter "num_ctx" from Python when using the OpenAI library, so that it can be set per request rather than globally for the ollama server.

GiteaMirror added the feature request label 2026-04-12 17:30:18 -05:00
@rick-github commented on GitHub (Mar 1, 2025):

Changing the size of the context in the API will cause a model reload. The best approach is to set the context length to the maximum required and just let the clients use whatever part of the buffer they need. For OpenAI calls, you can either set `num_ctx` on a per-model basis in the [Modelfile](https://github.com/ollama/ollama/blob/main/docs/openai.md#setting-the-context-size), or set [`OLLAMA_CONTEXT_LENGTH`](https://github.com/ollama/ollama/blob/bebb6823c03df34404753e42a41e7be8049d3146/envconfig/config.go#L258C1-L259C1) on a server-wide basis.
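A minimal sketch of the per-model route, assuming a local model tagged `llama3.2` (the tag and derived name are illustrative):

```
# Modelfile: bake a larger context window into a derived model
FROM llama3.2
PARAMETER num_ctx 4096
```

Then `ollama create llama3.2-4k -f Modelfile` builds the variant. Alternatively, starting the server with `OLLAMA_CONTEXT_LENGTH=4096 ollama serve` raises the default context length for all models.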

@mmb78 commented on GitHub (Mar 1, 2025):

Well... this seems to be possible:

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "options": {
    "num_ctx": 4096
  }
}'
```

It would be nice if I could do that from Python.
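The same request can in fact be sent from Python with only the standard library; this is a sketch of the curl call above, assuming an ollama server at `localhost:11434`:

```python
import json
import urllib.request

# Same payload as the curl example: per-request context window via "options".
payload = {
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "options": {"num_ctx": 4096},
}

def generate(payload, url="http://localhost:11434/api/generate"):
    """POST the payload to the native ollama /api/generate endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# generate(payload)  # uncomment with a running ollama server
```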

@rick-github commented on GitHub (Mar 1, 2025):

You can do it from Python for the ollama API. You can't do it for the OpenAI API.
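For the ollama-API route, a sketch assuming the official `ollama` Python package (`pip install ollama`), which passes `options` through to the native API per request; the import is guarded so the snippet degrades gracefully if the package is absent:

```python
# Per-request options for the native ollama API (not the OpenAI-compat endpoint).
options = {"num_ctx": 4096}

try:
    import ollama  # third-party; may not be installed
except ImportError:
    ollama = None

if ollama is not None:
    # Requires a running ollama server; uncomment to execute.
    # response = ollama.generate(
    #     model="llama3.2",
    #     prompt="Why is the sky blue?",
    #     options=options,
    # )
    pass
```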

@lemassykoi commented on GitHub (Mar 4, 2025):

> You can do it from Python for the ollama API. You can't do it for the OpenAI API.

This is possible with ChatOllama from langchain_ollama too.
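A sketch of the LangChain route, assuming `langchain-ollama` is installed; `ChatOllama` exposes `num_ctx` as a constructor parameter, applied per model instance (import guarded in case the package is absent):

```python
try:
    from langchain_ollama import ChatOllama  # third-party; may not be installed
except ImportError:
    ChatOllama = None

if ChatOllama is not None:
    # Constructing the model does not contact the server.
    llm = ChatOllama(model="llama3.2", num_ctx=4096)
    # llm.invoke("Why is the sky blue?")  # needs a running ollama server
```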


Reference: github-starred/ollama#6155