[GH-ISSUE #10995] context length is doubled #69309

Closed
opened 2026-05-04 17:45:24 -05:00 by GiteaMirror · 2 comments

Originally created by @nicho2 on GitHub (Jun 6, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10995

What is the issue?

The context length is doubled:

I send "num_ctx": 45000, but the Ollama log shows runner.num_ctx=90000.

Is this expected?

Relevant log output

I send:

POST /api/chat HTTP/1.1
Host: 10.2.142.77:11434
User-Agent: python-requests/2.32.3
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Type: application/json
Content-Length: 345384

{"model": "llama3.3:latest", "stream": true, "options": {"temperature": 0.1, "num_ctx": 45000}, "messages": [{"role": "system", "content": "........."}, {"role": "user", "content": "...."}]}


In the Ollama log, I see (note runner.num_ctx=90000):

time=2025-06-06T07:07:42.955Z level=DEBUG source=sched.go:361 msg="after processing request finished event" runner.name=registry.ollama.ai/library/llama3.3:latest runner.inference=cuda runner.devices=3 runner.size="105.1 GiB" runner.vram="105.1 GiB" runner.parallel=2 runner.pid=5251 runner.model=/root/.ollama/models/blobs/sha256-4824460d29f2058aaf6e1118a63a7a197a09bed509f0e7d4e2efb1ee273b447d runner.num_ctx=90000 refCount=0

time=2025-06-06T07:07:43.074Z level=DEBUG source=ggml.go:155 msg="key not found" key=general.alignment default=32

time=2025-06-06T07:07:43.075Z level=DEBUG source=sched.go:615 msg="evaluating already loaded" model=/root/.ollama/models/blobs/sha256-4824460d29f2058aaf6e1118a63a7a197a09bed509f0e7d4e2efb1ee273b447d

time=2025-06-06T07:07:43.241Z level=DEBUG source=server.go:729 msg="completion request" images=0 prompt=333991 format=""

time=2025-06-06T07:07:43.430Z level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=37575 prompt=37177 used=114 remaining=37063
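
The two fields that matter in the scheduler line above are runner.parallel=2 and runner.num_ctx=90000. As a minimal sketch, the following Python pulls them out of the log line and divides one by the other. It assumes that runner.num_ctx is the total context shared by all parallel slots, which is consistent with the log but not confirmed by it. The helper name `parse_log_fields` is hypothetical, and the log line is an abbreviated copy with unrelated fields dropped.

```python
import shlex


def parse_log_fields(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip any token that is not of the form key=value
            fields[key] = value
    return fields


# Abbreviated copy of the sched.go line from the report (unrelated fields dropped).
LOG_LINE = (
    'time=2025-06-06T07:07:42.955Z level=DEBUG source=sched.go:361 '
    'msg="after processing request finished event" '
    'runner.size="105.1 GiB" runner.parallel=2 runner.num_ctx=90000 refCount=0'
)

fields = parse_log_fields(LOG_LINE)
parallel = int(fields["runner.parallel"])
total_ctx = int(fields["runner.num_ctx"])

# Assumption: the logged value covers every parallel slot, so divide to get one slot.
per_slot_ctx = total_ctx // parallel
print(parallel, total_ctx, per_slot_ctx)
```

Under that assumption, the per-slot value matches the num_ctx sent in the request.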

OS

Docker

GPU

Nvidia

CPU

Intel

Ollama version

0.9.0

GiteaMirror added the bug label 2026-05-04 17:45:24 -05:00

@rick-github commented on GitHub (Jun 6, 2025):

Set OLLAMA_NUM_PARALLEL=1 in the server environment.
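
The log in the report shows runner.parallel=2, which suggests why this setting helps. A minimal sketch of the arithmetic, assuming the runner sizes its context as the requested num_ctx multiplied by the number of parallel slots (the function name `runner_num_ctx` is hypothetical and is not part of Ollama):

```python
def runner_num_ctx(requested_num_ctx: int, num_parallel: int) -> int:
    """Total context the runner would allocate under the assumed rule:
    one full num_ctx window per parallel slot."""
    return requested_num_ctx * num_parallel


# Values from the report: num_ctx=45000 in the request, runner.parallel=2 in the log.
print(runner_num_ctx(45000, 2))  # matches the logged runner.num_ctx=90000

# With OLLAMA_NUM_PARALLEL=1 the same request would map to a single slot.
print(runner_num_ctx(45000, 1))
```

Since the report lists Docker as the OS, the variable would need to be set on the container that runs the server, for example with `-e OLLAMA_NUM_PARALLEL=1` on `docker run`.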


@pdevine commented on GitHub (Jun 10, 2025):

Going to close this as answered. (Thanks @rick-github !)


Reference: github-starred/ollama#69309