Mirror of https://github.com/open-webui/open-webui.git (synced 2026-03-10 07:43:10 -05:00)
issue: upstream headers are dropped for non-streaming chat completions request #5674
Originally created by @Simon-Stone on GitHub (Jul 1, 2025).
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
v0.6.15
Ollama Version (if applicable)
No response
Operating System
Any
Browser (if applicable)
Any
Confirmation
Expected Behavior
When making requests against /api/chat/completions, headers from the upstream APIs (OpenAI, Anthropic, LiteLLM) should be forwarded.
Actual Behavior
Headers are currently only passed through from the upstream for streaming requests, but not for non-streamed requests because of this bit of code:
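The snippet referenced here did not survive the mirror. As a minimal sketch of the pattern being described (assumed shape, not the actual open-webui code): the streaming branch builds its response from the upstream response object, headers included, while the else branch returns only the decoded body.

```python
# Hypothetical sketch (NOT the actual open-webui code) of the pattern the
# issue describes: only the streaming branch copies upstream headers through.

class UpstreamResponse:
    """Stand-in for the HTTP response object from the upstream API."""
    def __init__(self, headers, body):
        self.headers = headers
        self.body = body

    def json(self):
        return self.body

def proxy_chat_completion(upstream, stream):
    if stream:
        # Streaming branch: the proxy response is built from the upstream
        # response and explicitly carries its headers through.
        return {"body": upstream.body, "headers": dict(upstream.headers)}
    else:
        # Non-streaming branch: only the decoded JSON body is returned,
        # so headers like x-ratelimit-* are silently dropped.
        return {"body": upstream.json(), "headers": {}}

upstream = UpstreamResponse(
    {"x-ratelimit-remaining-requests": "99"},
    {"choices": [{"message": {"content": "hi"}}]},
)
assert proxy_chat_completion(upstream, stream=True)["headers"]      # kept
assert not proxy_chat_completion(upstream, stream=False)["headers"] # dropped
```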
The else branch only returns the actual payload of the response, but not the headers.
Steps to Reproduce
This can be confirmed with the following two curl commands:
versus
The first call returns all headers from the upstream API, the second does not.
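The curl commands themselves were lost in the mirror. A hypothetical equivalent in Python (the base URL, token, and model name below are placeholders for a local instance, not values from the original report) would send the same request twice, toggling only the stream flag, and compare the two response.headers:

```python
# Placeholder values for a hypothetical local Open WebUI instance.
BASE = "http://localhost:8080/api/chat/completions"
TOKEN = "sk-..."  # placeholder API key

def build_payload(stream):
    # Identical request either way; only the `stream` flag differs.
    return {
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": stream,
    }

# Against a running instance one would then do (requires `requests`):
#   r = requests.post(BASE, json=build_payload(True),
#                     headers={"Authorization": f"Bearer {TOKEN}"})
#   print(dict(r.headers))  # streaming: upstream headers forwarded
#   r = requests.post(BASE, json=build_payload(False),
#                     headers={"Authorization": f"Bearer {TOKEN}"})
#   print(dict(r.headers))  # non-streaming: upstream headers missing

# The two payloads differ only in the stream flag.
assert build_payload(True)["stream"] and not build_payload(False)["stream"]
```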
Logs & Screenshots
No relevant logs
Additional Information
No response
@jackthgu commented on GitHub (Jul 2, 2025):
Hello, @Simon-Stone
Thank you for your thoughtful check.
I have verified everything, and confirmed that when stream is set to false, the endpoint does not return information such as tokens.
While these details could certainly be useful internally, I'm wondering in what situations they would actually be needed at the endpoint itself.
If you know of any good use cases, we’d love to hear them.
Thank you!
@rgaricano commented on GitHub (Jul 2, 2025):
I think this is intentional behavior:
if it is an SSE endpoint (Server-Sent Events, continuous data streams), the message headers have to be checked for status, etc. in each streamed message;
if not, the headers are on the response.
@Simon-Stone commented on GitHub (Jul 2, 2025):
There are all sorts of scenarios where having access to the response headers might be useful.
Two examples that come to mind would be rate limits, which are communicated in the headers by Anthropic and OpenAI (and I'm sure others, too) and response cost, which is included as a header by LiteLLM.
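For illustration, a tiny helper that picks out the kinds of headers Simon mentions. The header names below are assumptions based on the providers' public documentation, not anything open-webui defines:

```python
# Hedged example: upstream headers a client might want forwarded.
# Header names are assumptions from provider docs, not open-webui code.
def summarize_upstream_headers(headers):
    interesting = {
        "x-ratelimit-remaining-requests",          # OpenAI rate limiting
        "anthropic-ratelimit-requests-remaining",  # Anthropic rate limiting
        "x-litellm-response-cost",                 # LiteLLM per-response cost
    }
    # Header names are case-insensitive, so normalize before matching.
    return {k.lower(): v for k, v in headers.items() if k.lower() in interesting}

hdrs = {
    "Content-Type": "application/json",
    "x-ratelimit-remaining-requests": "58",
    "x-litellm-response-cost": "0.00021",
}
assert summarize_upstream_headers(hdrs) == {
    "x-ratelimit-remaining-requests": "58",
    "x-litellm-response-cost": "0.00021",
}
```

None of this information is reachable by the caller today for non-streaming requests, since only the JSON body is returned.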
@rgaricano commented on GitHub (Jul 2, 2025):
But if it is not streaming, the function returns a response r.json(), which I suppose has the headers in it, doesn't it?
de018f0912/backend/open_webui/routers/openai.py (L862)
@Simon-Stone commented on GitHub (Jul 2, 2025):
No, that only returns the response body decoded as JSON. If the headers were included there, we would see them in the second, non-streaming curl command.
@rgaricano commented on GitHub (Jul 2, 2025):
sure??
@Simon-Stone commented on GitHub (Jul 2, 2025):
I believe those are the request headers. Take a look at response.headers in your example. Those are the headers returned by the server, and those are not included in the JSON.
@rgaricano commented on GitHub (Jul 2, 2025):
That is the result/response (a JSON object):
and this is result.json() (another JSON object, i.e. the JSON-formatted content)
@Simon-Stone commented on GitHub (Jul 2, 2025):
Not sure what you mean. Take a look at this:
Outputs:
The second part is the response headers, the first part is the request headers (as seen by the server, which may be affected by load balancers and such).
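The snippet and its output did not survive the mirror, but the distinction Simon draws can be reproduced locally without any external service. In this sketch (server and header names are made up for the demo), a stand-in echo server returns the request headers inside the JSON body, while the response headers live only on the response object:

```python
# Local demo of "request headers in the echoed JSON body" versus
# "response headers on the response object". All names are illustrative.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Echo the *request* headers back in the JSON body.
        body = json.dumps({"request_headers": dict(self.headers)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("X-Upstream-Cost", "0.0002")  # made-up response header
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/") as r:
    payload = json.load(r)              # body: the request headers, echoed
    response_headers = dict(r.headers)  # the actual response headers

server.shutdown()

# The made-up cost header exists only on the response object,
# never inside the JSON body.
assert "X-Upstream-Cost" not in payload["request_headers"]
assert response_headers["X-Upstream-Cost"] == "0.0002"
```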
@rgaricano commented on GitHub (Jul 2, 2025):
Yes, you are right:
@rgaricano commented on GitHub (Jul 2, 2025):
Then Simon, what do you propose?
Return just r (session.request) and modify the conversion?
I was looking at these; I didn't do a deep search, just curious:
(for reference)
session & requests:
de018f0912/backend/open_webui/routers/openai.py (L837-L887)
59ba21bdf8/backend/open_webui/utils/chat.py (L262-L284)
response conversion:
(add an iterator in convert_response_ollama_to_openai similar to the one in convert_streaming_response_ollama_to_openai below, and reassign the object content?)
59ba21bdf8/backend/open_webui/utils/response.py (L103-L130)
request object & methods:
https://www.w3schools.com/python/module_requests.asp
https://fastapi.tiangolo.com/reference/parameters/
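One possible shape for such a fix, sketched here as an assumption rather than what any actual PR does: decode the body as before, but copy a filtered set of upstream headers onto the outgoing response. The helper name and exclusion list are illustrative:

```python
# Sketch of a possible fix (assumed, not the actual open-webui change):
# forward upstream headers on non-streaming responses, minus transport
# headers that must not be copied verbatim.
EXCLUDED = {"content-length", "transfer-encoding", "connection", "content-encoding"}

def forwardable_headers(upstream_headers):
    """Return the subset of upstream headers safe to re-emit downstream."""
    return {k: v for k, v in upstream_headers.items() if k.lower() not in EXCLUDED}

# In a FastAPI router this could then look roughly like:
#   return JSONResponse(content=await r.json(),
#                       headers=forwardable_headers(r.headers))

hdrs = forwardable_headers({
    "Content-Length": "123",                   # stripped: set by the framework
    "x-ratelimit-remaining-requests": "41",    # forwarded
})
assert hdrs == {"x-ratelimit-remaining-requests": "41"}
```

The filtering matters because headers like Content-Length describe the upstream wire format, which no longer matches the re-encoded body.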
@Simon-Stone commented on GitHub (Jul 2, 2025):
I've opened a PR regarding this. Feedback much appreciated!
@tjbck commented on GitHub (Sep 6, 2025):
Won't be added; this would require a major refactor of all the openai, ollama, and function routers and would break too many existing Functions.