[GH-ISSUE #8382] OpenWebUI not following Ollama config parameters? #15103

Closed
opened 2026-04-19 21:23:51 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @alpilotx on GitHub (Jan 7, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8382

After some experiments and observations, I have realized that OpenWebUI seems to "decide" on its own which parameters to use to start the Ollama server, and does not seem to follow the settings I try to dial in!

Here is an example.
I set a few parameters in the model settings:
![Model settings](https://github.com/user-attachments/assets/52811939-d2df-405e-ab20-966d0fe66d25)
Like:

  • Context Length: 20000
  • use_mmap: enabled
  • use_mlock: enabled
  • num_threads: 64

I then also set these in the chat's model settings:
![Chat model settings](https://github.com/user-attachments/assets/ea434a29-a089-4904-bf44-a03f97a7efea)

But when I fire off my chat (with a knowledge set attached), I can see, while the Ollama process runs, that it was started with parameters like these:
![Running ollama_llama_server process](https://github.com/user-attachments/assets/8ec7916b-e393-44e8-a638-4f182959372d)
Or in text:

```
/usr/lib/ollama/runners/cpu_avx2/ollama_llama_server runner --model /root/.ollama/models/blobs/sha256-ac3d1ba8aa77755dab3806d9024e9c385ea0d5b412d6bdf9157f8a4a7e9fc0d9 --ctx-size 80000 --batch-size 512 --threads 64 --no-mmap --parallel 4 --port 34773
```

So it effectively ignored almost all of my settings; only the thread count seems to have been "respected" (or happened to be set correctly by chance, as I have 64 CPU cores).

Am I maybe overlooking some basics of how OpenWebUI interacts with Ollama, or is something missing / not working in the configuration?

Additional info: OpenWebUI and Ollama each run in their own Docker container.
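
For comparison, here is a minimal sketch of how I can pass these options to Ollama directly through its REST API, bypassing OpenWebUI entirely (the host, port, and model name are just placeholders for my setup; `num_ctx`, `num_thread`, `use_mmap`, and `use_mlock` are the option names from Ollama's Modelfile/API documentation):

```sh
# Send a one-off generation request straight to the Ollama container,
# with the same options I dialed in via the OpenWebUI model settings.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Hello",
  "options": {
    "num_ctx": 20000,
    "num_thread": 64,
    "use_mmap": true,
    "use_mlock": true
  }
}'
```

Watching the `ollama_llama_server` command line while this request runs should show which options actually make it through, which at least isolates whether the overrides come from OpenWebUI or from Ollama itself. Checking the Ollama container's environment (e.g. `docker exec <container> env | grep OLLAMA`) may also reveal variables such as `OLLAMA_NUM_PARALLEL` that influence the runner flags.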
