[GH-ISSUE #14173] Ollama 0.15.6 ignores requested context size. #71299

Closed
opened 2026-05-05 01:09:40 -05:00 by GiteaMirror · 5 comments

Originally created by @luisbrandao on GitHub (Feb 9, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14173

What is the issue?

Because of the new default size, Ollama ignores the context size specified in the request and uses the default instead.

If the context size is defined in the request, it should NEVER use the default.

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug and needs more info labels 2026-05-05 01:09:40 -05:00

@rick-github commented on GitHub (Feb 9, 2026):

Ollama should not be ignoring the context size. Can you provide an example and server logs? It works here:

```console
$ ollama -v
ollama version is 0.15.6
$ curl -s localhost:11434/api/generate -d '{
  "model":"qwen2.5:0.5b",
  "prompt":"hello",
  "options":{"num_ctx":12345},
  "stream":false
}' | jq -r .response
Hello! How can I assist you today?
$ ollama ps
NAME            ID              SIZE      PROCESSOR    CONTEXT    UNTIL
qwen2.5:0.5b    a8b0c5157701    926 MB    100% GPU     12345      Forever
```
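
For anyone reproducing this: on a standard Linux install (systemd service named `ollama`, as set up by the official install script), the server logs requested above can be captured with journalctl, and the model-load lines should report the effective context size:

```console
$ journalctl -u ollama --no-pager -n 100   # recent server log lines; check the context size reported at model load
```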

@Protozua commented on GitHub (Feb 10, 2026):

I am seeing the same issue; here is my configuration file:

```ini
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
#ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_HOST=0.0.0.0"
Environment="CUDA_VISIBLE_DEVICES=0"
Environment="OLLAMA_NUM_PARALLEL=1"
Environemnt="OLLAMA_CONTEXT_LENGTH=4096"

[Install]
WantedBy=default.target
```

But the OLLAMA_CONTEXT_LENGTH variable is ignored:

```console
NAME               ID              SIZE     PROCESSOR    CONTEXT    UNTIL
deepseek-r1:14b    c333b7232bdb    17 GB    100% GPU     32768      59 minutes from now
```

@rick-github commented on GitHub (Feb 10, 2026):

The variable name is misspelled; `Environemnt` should be `Environment`:

```diff
--- ollama.service.orig	2026-02-10 18:36:40.761187475 +0100
+++ ollama.service	2026-02-10 18:36:53.249136880 +0100
@@ -12,7 +12,7 @@
 Environment="OLLAMA_HOST=0.0.0.0"
 Environment="CUDA_VISIBLE_DEVICES=0"
 Environment="OLLAMA_NUM_PARALLEL=1"
-Environemnt="OLLAMA_CONTEXT_LENGTH=4096"
+Environment="OLLAMA_CONTEXT_LENGTH=4096"
 
 
 [Install]
```

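After correcting the spelling, systemd has to reload the unit and restart the service before the variable takes effect; the result can then be checked with standard systemd tooling (assuming the default service name `ollama`):

```console
$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama
$ systemctl show ollama -p Environment   # should now include OLLAMA_CONTEXT_LENGTH=4096
```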

@luisbrandao commented on GitHub (Feb 12, 2026):

I also have OLLAMA_CONTEXT_LENGTH set on the server.
So I can't set my own default_if_unspecified anymore?

It should be hierarchical:
Global_default -> Local_default -> request_specified.

@rick-github commented on GitHub (Feb 12, 2026):

It is hierarchical:

`OLLAMA_CONTEXT_LENGTH` -> `num_ctx` in the Modelfile -> `num_ctx` in the request.

If `OLLAMA_CONTEXT_LENGTH` is not set, a default is chosen. Prior to 0.15.6, the default was 4096 (8192 for a few special cases) across the board. Now the default is 4096 with less than 24 GB of VRAM, 32k for 24-48 GB, and 256k for more than 48 GB.

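For the middle tier of that hierarchy, a per-model default can be baked in with a Modelfile (standard Modelfile syntax; the base model and the name `qwen-8k` are just examples):

```console
$ cat Modelfile
FROM qwen2.5:0.5b
PARAMETER num_ctx 8192
$ ollama create qwen-8k -f Modelfile
```

A `num_ctx` passed in the request `options` still overrides this per-model value, and both override `OLLAMA_CONTEXT_LENGTH`.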