[GH-ISSUE #12855] num_ctx for Qwen3-VL models doesn't work. #34280

Closed
opened 2026-04-22 17:43:18 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @y-tor on GitHub (Oct 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12855

What is the issue?

The `num_ctx` parameter in the Modelfile is ignored for Qwen3-VL models.

The context size always remains at 8192.

I tried different Qwen3-VL models, both the thinking and instruct variants, at 2B, 8B, etc. The result is the same.
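For reference, a minimal Modelfile that sets the parameter in question looks like this (model name and value are illustrative, not taken from the report):

```
FROM qwen3-vl:8b
PARAMETER num_ctx 32768
```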

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

0.12.7

GiteaMirror added the bug label 2026-04-22 17:43:18 -05:00

@rick-github commented on GitHub (Oct 30, 2025):

qwen3vl (and gpt-oss) perform better with at least 8k context, so the ollama server [sets a floor](https://github.com/ollama/ollama/blob/0a2d92081bb6b6b2d3eab5908fce08cfcf736e1d/server/routes.go#L143).

<!-- gh-comment-id:3467276917 -->

Reference: github-starred/ollama#34280