issue: Unexpected new requests #4986

Closed
opened 2025-11-11 16:08:55 -06:00 by GiteaMirror · 4 comments
Owner

Originally created by @spaceater on GitHub (Apr 28, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Git Clone

Open WebUI Version

v0.6.5

Ollama Version (if applicable)

No response

Operating System

Ubuntu 24

Browser (if applicable)

edge

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

After a chat completes, there should be no requests still running in the backend.

Actual Behavior

If the chat is the first message of a conversation, then once the response is completed, Open WebUI immediately posts a new request to the backend, and the response to this unexpected request is never shown in Open WebUI.

Steps to Reproduce

I use sglang as the backend and start the Open WebUI server on port 999 with `OFFLINE_MODE=True ENABLE_OLLAMA_API=False ENABLE_OPENAI_API=True open-webui serve --port 999`.

Logs & Screenshots

Here are the Open WebUI logs showing the bug:

2025-04-28 15:13:01.944 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "POST /api/v1/chats/new HTTP/1.1" 200 - {}
2025-04-28 15:13:02.004 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "GET /api/v1/chats/?page=2 HTTP/1.1" 200 - {}
2025-04-28 15:13:02.077 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "POST /api/v1/chats/9c740f65-e1d0-4cf4-b845-938dc888d1c5 HTTP/1.1" 200 - {}
2025-04-28 15:13:02.133 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "GET /api/v1/chats/?page=1 HTTP/1.1" 200 - {}
2025-04-28 15:13:02.200 | INFO     | open_webui.routers.openai:get_all_models:389 - get_all_models() - {}
2025-04-28 15:13:02.273 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "POST /api/chat/completions HTTP/1.1" 200 - {}
2025-04-28 15:13:02.335 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "GET /api/v1/chats/?page=1 HTTP/1.1" 200 - {}
2025-04-28 15:13:52.156 | INFO     | open_webui.routers.openai:get_all_models:389 - get_all_models() - {}
2025-04-28 15:13:52.209 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "POST /api/chat/completed HTTP/1.1" 200 - {}
2025-04-28 15:13:52.272 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "POST /api/v1/chats/9c740f65-e1d0-4cf4-b845-938dc888d1c5 HTTP/1.1" 200 - {}
2025-04-28 15:13:52.325 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 127.0.0.1:0 - "GET /api/v1/chats/?page=1 HTTP/1.1" 200 - {}

Here are the sglang logs:

[2025-04-28 15:12:36] INFO:     Started server process [77268]
[2025-04-28 15:12:36] INFO:     Waiting for application startup.
[2025-04-28 15:12:36] INFO:     Application startup complete.
[2025-04-28 15:12:36] INFO:     Uvicorn running on http://127.0.0.1:998 (Press CTRL+C to quit)
[2025-04-28 15:12:37] INFO:     127.0.0.1:46746 - "GET /get_model_info HTTP/1.1" 200 OK
[2025-04-28 15:12:37 TP0] Prefill batch. #new-seq: 1, #new-token: 7, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-28 15:12:51] INFO:     127.0.0.1:46750 - "POST /generate HTTP/1.1" 200 OK
[2025-04-28 15:12:51] The server is fired up and ready to roll!
[2025-04-28 15:12:55] INFO:     127.0.0.1:38858 - "GET /v1/models HTTP/1.1" 200 OK
[2025-04-28 15:13:02] INFO:     127.0.0.1:38874 - "GET /v1/models HTTP/1.1" 200 OK
[2025-04-28 15:13:02] INFO:     127.0.0.1:38876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
[2025-04-28 15:13:02 TP0] Prefill batch. #new-seq: 1, #new-token: 6, #cached-token: 1, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-28 15:13:21 TP0] Decode batch. #running-req: 1, #token: 40, token usage: 0.00, gen throughput (token/s): 0.90, largest-len: 0, #queue-req: 0,
[2025-04-28 15:13:40 TP0] Decode batch. #running-req: 1, #token: 80, token usage: 0.00, gen throughput (token/s): 2.06, largest-len: 0, #queue-req: 0,
[2025-04-28 15:13:52] INFO:     127.0.0.1:35026 - "GET /v1/models HTTP/1.1" 200 OK
[2025-04-28 15:13:52 TP0] Prefill batch. #new-seq: 1, #new-token: 308, #cached-token: 2, token usage: 0.00, #running-req: 0, #queue-req: 0,
[2025-04-28 15:14:15 TP0] Decode batch. #running-req: 1, #token: 326, token usage: 0.00, gen throughput (token/s): 1.16, largest-len: 0, #queue-req: 0,
[2025-04-28 15:14:34 TP0] Decode batch. #running-req: 1, #token: 366, token usage: 0.00, gen throughput (token/s): 2.02, largest-len: 0, #queue-req: 0,
[2025-04-28 15:14:54 TP0] Decode batch. #running-req: 1, #token: 406, token usage: 0.00, gen throughput (token/s): 2.02, largest-len: 0, #queue-req: 0,

Additional Information

Does anyone know what `POST /api/v1/chats/9c740f65-e1d0-4cf4-b845-938dc888d1c5` does? I can't find it in the documentation. I strongly suspect the bug is caused by it, because when I use temporary chat mode everything is fine, and this POST request does not appear in the logs.

GiteaMirror added the bug label 2025-11-11 16:08:55 -06:00

@Classic298 commented on GitHub (Apr 28, 2025):

There is no bug.

After the response is received, Open WebUI makes another request to the "completed" endpoint to call all filter outlets. It also needs to generate the chat's title and tags when the chat is created, which of course requires an API call.
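The extra backend call described above can be pictured as an ordinary OpenAI-compatible chat completion that the UI consumes internally. A minimal sketch of such a title-generation request follows; the payload shape is the standard chat completions format, while the prompt text, helper name, and model name are entirely hypothetical placeholders (Open WebUI's real task prompts differ):

```python
import json

# Hypothetical sketch of the kind of follow-up completion Open WebUI issues
# after a chat finishes, to generate a title for the newly created chat.
def build_title_request(user_message: str, assistant_reply: str, model: str) -> dict:
    prompt = (
        "Create a short title for the following conversation:\n"
        f"USER: {user_message}\n"
        f"ASSISTANT: {assistant_reply}"
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # The title response is consumed internally, not streamed to the chat view,
        # which is why the user never sees these tokens in the UI.
        "stream": False,
    }

req = build_title_request(
    "What is sglang?", "sglang is an LLM serving framework.", "my-model"
)
print(json.dumps(req, indent=2))
```

Note that the prompt embeds both the user message and the full assistant reply, which is why the follow-up request's input grows with the length of the original conversation.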


@spaceater commented on GitHub (Apr 28, 2025):

> There is no bug.
>
> After the response is received, Open WebUI makes another request to the "completed" endpoint to call all filter outlets. It also needs to generate the chat's title and tags when the chat is created, which of course requires an API call.

I know that it creates titles, but it takes a very long time.
And according to the sglang logs, the extra request generates over 400 tokens, but Open WebUI never received those tokens.


@Classic298 commented on GitHub (Apr 28, 2025):

@spaceater that's the input tokens: your message, the AI's response, and the title/tag generation prompt. This is a normal amount.
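This can be checked against the sglang logs above: the second prefill reports 308 new + 2 cached input tokens, and the last decode line reports 406 tokens total, so only roughly a hundred tokens were actually generated. A quick back-of-the-envelope check (the numbers are taken from the logs; treating `#token` as prompt plus generated tokens is an assumption about sglang's counter):

```python
# Values from the sglang log lines above (the 15:13:52 prefill and the
# 15:14:54 decode batch).
prefill_new = 308        # "#new-token: 308"
prefill_cached = 2       # "#cached-token: 2"
last_decode_total = 406  # "#token: 406"

# Assuming "#token" counts prompt + generated tokens for the running request:
input_tokens = prefill_new + prefill_cached
generated_tokens = last_decode_total - input_tokens

print(input_tokens)      # 310
print(generated_tokens)  # 96
```

Under this reading, most of the "over 400 tokens" are the prompt (the original message, the AI's response, and the title/tag instructions), not wasted generation.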


@spaceater commented on GitHub (Apr 30, 2025):

Okay, I get it now.
The issue is solved :)

Reference: github-starred/open-webui#4986