Enh: call outlet hook from the backend #1288

Open
opened 2025-11-11 14:41:54 -06:00 by GiteaMirror · 14 comments
Owner

Originally created by @frederikschubert on GitHub (Jun 17, 2024).

Is your feature request related to a problem? Please describe.
We are using Open WebUI as a general solution to manage access to LLMs and RAG applications in our company. Besides the Open WebUI web application, we are using the [continue](https://docs.openwebui.com/tutorial/continue-dev/) plugin with the following configuration:

"models": [
    {
      "model": "gpt-4o",
      "title": "GPT-4o",
      "apiKey": "sk-...,
      "completionOptions": {},
      "apiBase": "https://openwebui/openai",
      "provider": "openai",
      "requestOptions": {
        "headers": {
            "Content-Type": "application/json"
        }
      }
    },
...

Additionally, we are using pipelines such as the [langfuse filter](https://github.com/open-webui/pipelines/blob/main/examples/filters/langfuse_filter_pipeline.py) to track usage. When called via the continue plugin, only the `inlet` is called. This leaves the response text empty, as it is set in the `outlet` filter.

Describe the solution you'd like
The `inlet` as well as the `outlet` [filter](https://github.com/open-webui/open-webui/blob/main/backend/main.py#L1033) of the registered pipelines should be called when Open WebUI is used as an API proxy.

Describe alternatives you've considered
The langfuse filter could be integrated into a general litellm setup, but I think that applying the inlets and outlets of pipelines uniformly to all chat interactions is a good feature in general.


@Ronan035 commented on GitHub (Sep 30, 2024):

Same issue with version 0.3.30 and current WebUI Pipelines docker image.
I use Continue and Open WebUI as a proxy to an external Ollama instance. The outlets are 'visible' via the Open WebUI chat, but not for calls made by Continue via the /ollama/v1 endpoint. Only inlets are taken into account.
(Open WebUI is a fantastic tool! Huge huge thanks for your work!)


@Ronan035 commented on GitHub (Oct 17, 2024):

OK, it's not a bug (https://github.com/open-webui/open-webui/discussions/4460). But it would be very convenient to have a chat API call that takes outlet filters into account. The use of tools often does not allow implementing the Open WebUI client workflow.


@ADD-Carlos-Zamora commented on GitHub (Oct 17, 2024):

As @Ronan035 said, thanks for your amazing job, Open WebUI team!

At my company, we are facing the same problem. We tried to create a Pipe to intercept the call and process the response produced by Ollama, but we have not been able to get it executed :-/

How is this issue going? Is it being considered?


@sir3mat commented on GitHub (Nov 5, 2024):

Hi, I have seen some strange behaviour when using RAG and pipelines.

The workflow is as follows:

1. Documents are loaded and fed into OpenWebUI.
2. OpenWebUI encodes the documents, splits them into chunks, and stores them.
3. The /inlet endpoint on the pipeline is called.
4. OpenWebUI then performs the RAG operation.
5. The pipeline's /pipe endpoint is triggered, displaying the response on OpenWebUI.
6. Finally, the client calls the /outlet endpoint.

Questions on the implementation:

- Why do the /inlet, /pipe, and /outlet endpoints each have different request bodies?
- Why is the process openwebuiClient->inlet->openwebuiClient->pipe->openwebuiClient->outlet->openwebuiClient?

If I implement RAG in my custom pipeline's pipe logic, the RAG of OpenWebUI is still executed.
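The round trips described above can be sketched as plain calls. Every name below (`call_pipeline`, `run_rag`, `chat_round_trip`) is a hypothetical stand-in to illustrate the observed order, not actual Open WebUI internals:

```python
# Sketch of the observed call sequence: the client drives each hop, which is
# why every step routes back through the Open WebUI client.
trace = []

def call_pipeline(endpoint: str, body: dict) -> dict:
    # Stand-in for the client calling a pipeline endpoint (/inlet, /pipe, /outlet).
    trace.append(endpoint)
    return body

def run_rag(body: dict) -> dict:
    # Stand-in for OpenWebUI's own RAG step, which runs even if the
    # custom pipeline also implements RAG inside its pipe logic.
    trace.append("rag")
    return body

def chat_round_trip(body: dict) -> dict:
    body = call_pipeline("inlet", body)
    body = run_rag(body)
    body = call_pipeline("pipe", body)
    return call_pipeline("outlet", body)
```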


@rkconsulting commented on GitHub (Dec 8, 2024):

stumbling across this as well.. we definitely need to be able to get at the outlet from the backend for various retrieval purposes. thanks oi team!


@Seniorsimo commented on GitHub (Dec 20, 2024):

I'm having the same problem using the continue plugin with a simple filter to count token usage as a metric. When using the model from the UI, both the `inlet` filter and the `outlet` are called, but using the same model from the chat in the continue plugin results in a call to the `/api/chat/completions` endpoint, and the `outlet` filter is not called.


@DmitriyAlergant commented on GitHub (Mar 10, 2025):

Hi @tjbck, your recent improvements to Filters, including the .stream() method, work for both UI and API consumption, so things have already moved in the right direction.

Would you support a PR that moves the outlet() filter calls from /chat/completed into /chat/completions? Specifically into the process_chat_response() function. It will be somewhat tricky to do correctly for all non-streaming and streaming request avenues and types, including tool calls, code interpreter, tasks, etc. That function is somewhat complicated. But it should be possible.

Would you accept such a PR (if it passes the tests), or is it something that you are looking to do yourself any time soon?


@arunbugkiller commented on GitHub (Mar 12, 2025):

@DmitriyAlergant, @Seniorsimo @frederikschubert @rkconsulting @sir3mat
I am facing an issue where the filter is executed on the backend, but sometimes the results are shown on the front-end and sometimes they are not. Any idea how this can be resolved?


@yz342 commented on GitHub (Apr 25, 2025):

A workaround would be to inject the connection to Open WebUI through a [pipeline](https://github.com/open-webui/pipelines/tree/main/examples/pipelines/providers) and add an observability tool like [openlit](https://openlit.io/blogs/openlit-openwebui) there.


@DmitriyAlergant commented on GitHub (Apr 27, 2025):

I actually made an attempt at starting this as a PR (moving the "outlet" call from /chat/completed to /chat/completions). Unfortunately the stream processing code pathway is very complicated (also with embedded tool calls, etc.); it would be a nightmare to test, so I gave up.

What can be done though (relatively easily) is to introduce a new filter method, "batch_outlet()", which would:

  1. Only apply to non-streaming requests (much easier to code and test)
  2. Be invoked reliably from /chat/completions regardless of who is calling it, whether the app itself or an API user
  3. Not break compatibility with existing filters that may have relied on the legacy behavior (calling of "outlet" from the app only, both for streaming and batch requests, etc.)

With a combination of stream() and batch_outlet() filters, one will be able to achieve reliable output filtering for all requests, batch and streaming, from the app and the API, while existing community filters may continue relying on the legacy outlet() method until it is sunset.

@tjbck would you agree to this approach? It's not motivating to work on a PR if we don't know upfront whether you like the design in general.


@DmitriyAlergant commented on GitHub (Apr 27, 2025):

@yz342 in https://github.com/open-webui/open-webui/discussions/8722#discussioncomment-12193783 people are saying that the behavior is the same for pipelines as well... outlet() is not called for API usage, being an "Open WebUI exclusive feature".


@MotherEarth-AI commented on GitHub (Jun 12, 2025):

Same problem here ;(


@bennfocus commented on GitHub (Jun 20, 2025):

Same problem


@Jirubizu commented on GitHub (Aug 4, 2025):

Has there been any more light shed on this? I would really like a proper deployment with langfuse, but not being able to monitor which user sends API requests is a bit worrying.


Reference: github-starred/open-webui#1288