[GH-ISSUE #23204] issue: Anthropic direct connection - Prompt caching not supported #19918
Originally created by @Lyhtande on GitHub (Mar 29, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23204
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.8.12
Ollama Version (if applicable)
No response
Operating System
Ubuntu 22.04.5 LTS
Browser (if applicable)
No response
Confirmation
Expected Behavior
The native Anthropic integration should support Anthropic's prompt caching mechanism, so that repeated prompt prefixes can be cached and API costs reduced accordingly.
Actual Behavior
When using Anthropic Claude models via the direct connection in Open WebUI, prompt caching is not being utilized. This results in significantly higher API costs, especially with large context models like Claude Sonnet and Opus.
Anthropic supports prompt caching via cache_control blocks in the API request body. However, the current native integration does not implement this feature.
Steps to Reproduce
Logs & Screenshots
Additional Information
No response
@Classic298 commented on GitHub (Mar 29, 2026):
You can do it via an advanced parameter, adding cache_control there, or via a filter.
Both are easy options.
And this is not only for direct connections but for global ones too.
But yeah, not an issue, you just need to configure it.
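For illustration, a minimal inlet filter of that kind might look like the sketch below. It assumes Open WebUI's Filter plugin convention (an inlet() hook that receives and returns the outgoing request body) and simply injects a top-level cache_control key; as the rest of the thread shows, whether that key survives depends on how the request is translated downstream.

```python
# Hypothetical sketch of an inlet filter that adds cache_control to the
# outgoing request body. Assumes Open WebUI's Filter plugin convention:
# a Filter class whose inlet() receives the request body as a dict and
# returns the (possibly modified) dict.


class Filter:
    def inlet(self, body: dict, __user__: dict | None = None) -> dict:
        # Ask Anthropic to create/reuse an ephemeral prompt-cache entry.
        body["cache_control"] = {"type": "ephemeral"}
        return body
```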
@Lyhtande commented on GitHub (Mar 29, 2026):
@Classic298 Thanks for the suggestions! I tried both approaches: custom params and an inlet filter setting cache_control: {"type": "ephemeral"} at the top level. In both cases, Cache Read and Cache Write remain 0 tokens in the Anthropic console. It seems Open WebUI strips unknown keys when transforming the request to Anthropic's format. Do you have a specific implementation in mind that actually works?
What I tried:
Filter:
Advanced Param: cache_control, Value: {"type": "ephemeral"}
This results in:
@Classic298 commented on GitHub (Mar 29, 2026):
You're right, I read the Anthropic docs again.
The reason this doesn't work (and can't work via filters or advanced params either) is that Open WebUI communicates with Anthropic through their OpenAI-compatible /v1/chat/completions endpoint, and that endpoint doesn't have caching support, even if you add the parameter (and you did add it correctly).
Prompt caching (cache_control) is a feature of Anthropic's native Messages API only — the OpenAI-compatible endpoint doesn't support it.
Open WebUI doesn't have outgoing support for Anthropic's native Messages API format — it only has an inbound /api/v1/messages endpoint for compatibility when using Open WebUI as an LLM proxy. Supporting prompt caching would require adding native Anthropic Messages API support on the outgoing request side, which would be a feature request rather than a bug.
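For illustration, the difference in request shape is roughly as follows (a sketch with placeholder values; the model name is just an example):

```python
# OpenAI-compatible /v1/chat/completions body (what the connection sends today).
# The schema has no cache_control field, so an injected top-level key is ignored.
openai_style_body = {
    "model": "claude-sonnet-4-20250514",  # example model id
    "messages": [{"role": "user", "content": "Hello"}],
    # "cache_control": {"type": "ephemeral"},  # not part of this schema
}

# Native Anthropic /v1/messages body, where prompt caching is supported:
# cache_control is attached to individual content blocks (e.g. the system prompt).
messages_api_body = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "Long, reusable system prompt...",
            "cache_control": {"type": "ephemeral"},  # cache this prefix
        }
    ],
    "messages": [{"role": "user", "content": "Hello"}],
}
```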
What you CAN DO is add a pipe that implements Anthropic as a provider, use that for your models, and get caching that way (a rough sketch follows the reference link below).
reference: https://openwebui.com/posts/anthropic_60984ebf
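A pipe along those lines could look roughly like the sketch below. This is not the linked implementation: it assumes Open WebUI's Pipe plugin convention (a Pipe class with Valves and a pipe() method receiving the request body) and the official anthropic Python SDK, and it marks the system prompt as cacheable; streaming and error handling are omitted.

```python
from pydantic import BaseModel, Field
import anthropic


class Pipe:
    """Sketch of a pipe that calls Anthropic's native Messages API directly,
    so cache_control is preserved. Assumes Open WebUI's Pipe plugin convention
    and the official `anthropic` Python SDK; streaming and error handling are
    omitted for brevity."""

    class Valves(BaseModel):
        ANTHROPIC_API_KEY: str = Field(default="")
        MODEL: str = Field(default="claude-sonnet-4-20250514")  # example id

    def __init__(self):
        self.valves = self.Valves()

    def pipe(self, body: dict) -> str:
        client = anthropic.Anthropic(api_key=self.valves.ANTHROPIC_API_KEY)

        # Split out system messages and mark them as cacheable prefixes.
        system_blocks, messages = [], []
        for m in body.get("messages", []):
            if m["role"] == "system":
                system_blocks.append(
                    {
                        "type": "text",
                        "text": m["content"],
                        "cache_control": {"type": "ephemeral"},
                    }
                )
            else:
                messages.append({"role": m["role"], "content": m["content"]})

        kwargs = {"system": system_blocks} if system_blocks else {}
        response = client.messages.create(
            model=self.valves.MODEL,
            max_tokens=body.get("max_tokens") or 1024,
            messages=messages,
            **kwargs,
        )
        # Return the first text block of the reply.
        return response.content[0].text
```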
@Lyhtande commented on GitHub (Mar 29, 2026):
@Classic298 Thanks for the detailed explanation! I'll go with a custom pipe for now. Could you please convert this to a feature request? Native caching support via the Anthropic Messages API would be a great addition.
@Classic298 commented on GitHub (Mar 29, 2026):
@Lyhtande there have been dozens of feature requests (many of them duplicates) in the past about native Anthropic Messages support. The answer was and is: no. No native support for Messages will be added, as it doesn't fit Open WebUI's stance: providers should support universal or de-facto universal API standards and not invent their own (e.g. Messages, Google's "interactions", and many other examples).
And if it wasn't clear: there is no Anthropic Messages API support, and therefore no prompt caching, because Anthropic only supports it via that API and not via chat completions.
@Classic298 commented on GitHub (Mar 29, 2026):
https://docs.openwebui.com/faq#q-why-doesnt-open-webui-natively-support-provider-xs-proprietary-api
@Lyhtande commented on GitHub (Mar 29, 2026):
@Classic298 understood, and thanks for the clarification. To be fair though – in your earlier comment you yourself described this as 'a feature request rather than a bug', which is why I asked for the label change. Anyway, good to know this is a deliberate design decision. I'll stick with my custom pipe.
@Classic298 commented on GitHub (Mar 29, 2026):
Yeah maybe my wording wasn't clear
I meant to say it WOULD be a feature request and not a bug.
That's it haha
I didn't mean to imply you should ask for Messages API support, but that's on me. Now that I read my sentence again, one can indeed read an implication to request Messages API support.