[GH-ISSUE #11942] multiple requests via openwebui #54442

Closed
opened 2026-04-29 05:57:18 -05:00 by GiteaMirror · 4 comments

Originally created by @wertrigone on GitHub (Aug 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11942

Hello. I run Ollama with the oss 20b model on a separate server, with the Ollama host set to 0.0.0.0, and I have connected it to Open WebUI. When I start 2 or 4 chats at the same time, they are answered only one at a time. How can I fix this?

GiteaMirror added the feature request label 2026-04-29 05:57:18 -05:00

@rick-github commented on GitHub (Aug 17, 2025):

Set [`OLLAMA_NUM_PARALLEL`](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-does-ollama-handle-concurrent-requests) in the server environment.

@wertrigone commented on GitHub (Aug 17, 2025):

> Set [`OLLAMA_NUM_PARALLEL`](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-does-ollama-handle-concurrent-requests) in the server environment.

Like this?

```
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
Environment="OLLAMA_HOST=0.0.0.0"
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_NUM_PARALLEL=4"

[Install]
WantedBy=default.target
```
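One way to check whether the setting took effect (after reloading systemd and restarting the service) is to send several prompts at once and compare timings. The following is a minimal sketch, not something from the thread: it assumes the server listens on the default port 11434, that the model tag is `gpt-oss:20b`, and it uses Ollama's `/api/generate` endpoint with streaming disabled. The host, the model tag, and the helper names `build_payload`, `send_prompt`, and `time_batch` are all placeholders to adjust.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumptions: default Ollama port and a guessed model tag; change both as needed.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gpt-oss:20b"


def build_payload(prompt, model=MODEL):
    """Return the JSON body for a non-streaming /api/generate request."""
    return {"model": model, "prompt": prompt, "stream": False}


def send_prompt(prompt):
    """POST one prompt and return the wall-clock seconds until the reply arrived."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # wait for the full (non-streamed) answer
    return time.perf_counter() - start


def time_batch(prompts, send=send_prompt):
    """Send all prompts at the same time; return (total_seconds, per_request_seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        per_request = list(pool.map(send, prompts))
    return time.perf_counter() - start, per_request
```

Calling `time_batch(["Count to 3.", "Count to 4.", "Count to 5.", "Count to 6."])` against a live server returns the total time and the time each request took. If the requests are queued one by one, the per-request times tend to grow stepwise because each one includes the wait for the previous answers; if they run in parallel, the times should be closer together. Treat this as a rough indication only, since prompt length and server load also affect the numbers.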

@rick-github commented on GitHub (Aug 17, 2025):

Looks right. Note that this will quadruple the amount of KV cache allocated, which can cause layers to spill to system RAM and result in slower inference.

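To make the "quadruple" remark concrete, here is a rough back-of-the-envelope sketch of how KV cache size scales with the number of parallel slots. It uses the common estimate of one key and one value vector per layer, per KV head, per token, and it assumes each slot gets its own full context. The function name `kv_cache_bytes` and the model dimensions below are hypothetical and chosen only for illustration; they are not taken from the thread, and models that use sliding-window attention or a quantized cache will need less.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   num_parallel, bytes_per_value=2):
    """Estimate KV cache size in bytes for num_parallel slots of context_len tokens."""
    # Factor 2: one key vector and one value vector per layer and KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_len * num_parallel


# Hypothetical dimensions: 24 layers, 8 KV heads of size 64, 8192-token context, fp16.
single = kv_cache_bytes(24, 8, 64, 8192, num_parallel=1)
four = kv_cache_bytes(24, 8, 64, 8192, num_parallel=4)

print(f"1 slot:  {single / 2**20:.0f} MiB")  # 384 MiB
print(f"4 slots: {four / 2**20:.0f} MiB")    # 1536 MiB
```

Whether the larger cache still fits in VRAM next to the model weights depends on the GPU; if it does not, some layers end up in system RAM, which is the slowdown described above.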

@pdevine commented on GitHub (Aug 19, 2025):

Going to close this as answered (thank you @rick-github !)


Reference: github-starred/ollama#54442