Allow filter outlets to trigger continuation automatically #1787

Closed
opened 2025-11-11 14:53:16 -06:00 by GiteaMirror · 3 comments
Owner

Originally created by @rndmcnlly on GitHub (Aug 15, 2024).

Originally assigned to: @tjbck on GitHub.

Is your feature request related to a problem? Please describe.

A model cannot react to the result of a filter outlet's effect until the user's next query. This inhibits the development of certain useful interaction patterns sketched below.

Describe the solution you'd like

In a filter's outlet, I'd like to be able to say `body["continue"] = True`. This should have a similar effect to the user pressing the existing "Continue Response" button in the UI. Rather than appending content to the previous message, however, it should start the flow for generating a new message. It should be possible to chain this process several times (e.g., the continued response, upon being examined by a filter outlet, gets the continue flag set again).

(I have some ideas for safety features that would break potential infinite auto-continue loops, but perhaps we can save those until problems arise in practice. The UI's existing "Stop" button should be sufficient for most cases.)
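To make the proposed interface concrete, here is a hypothetical sketch. The `body["continue"]` flag does not exist today, and the `AUTO_CONTINUE` trigger condition is purely illustrative; only the filter-with-an-`outlet` shape follows OpenWebUI's existing convention.

```python
# Hypothetical sketch of the proposed flag; `body["continue"]` is not an
# existing OpenWebUI field, and the trigger condition here is illustrative.
class Filter:
    def outlet(self, body: dict) -> dict:
        messages = body.get("messages", [])
        last = messages[-1]["content"] if messages else ""
        if "AUTO_CONTINUE" in last:
            # Ask the backend to start generating a new assistant message,
            # as if the user had pressed "Continue Response".
            body["continue"] = True
        return body
```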

Describe alternatives you've considered

Here's my workaround within the current OWUI design: my filter outlet programmatically inserts text like (1) "Say anything to continue the conversation." or (2) "Press the 'Continue Response' button to continue the conversation." The first option requires the user to make up some useless response (usually "okay", "sure", "k", "get on with it!"), and the second requires them to switch from typing to UI button clicking and back. Beyond disrupting the conversational flow, this solution isn't responsive to the conversation context. For example, if the conversation had recently been proceeding in Spanish, it is jarring to read the English text injected by my Python code.

Compared to this workaround, an auto-continue mechanism would allow the assistant to remain in-character and avoid requiring the user to take a distracting action just to kick the conversation along.
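For comparison, the workaround described above amounts to something like this minimal sketch. The class shape follows OpenWebUI's filter convention; the appended wording is the fixed English text from option (1), which is exactly what makes it jarring in non-English conversations.

```python
# Minimal sketch of the current workaround: the outlet appends a fixed
# English prompt to the assistant's reply, asking the user to act.
class WorkaroundFilter:
    def outlet(self, body: dict) -> dict:
        messages = body.get("messages", [])
        if messages:
            messages[-1]["content"] += (
                "\n\n(Say anything to continue the conversation.)"
            )
        return body
```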

Additional context
Here are some use-cases for this feature:

  • Content moderation: Suppose my filter outlet has detected some bad model behavior in the last chat completion. I'd like to give the model a chance to react to my programmatic censoring and try again without requiring a user action to proceed.
  • Multi-step code execution: Suppose my assistant is given code execution abilities via some combination of actions and filters. If code execution yields some trivial error message, I want to give the assistant a chance to revise the code and try again immediately. Likewise, if the execution succeeds, the assistant might want to run some additional code as an immediate follow-up.
  • Server-sent events: This one is a stretch, but I think it could be a big deal for supporting non-traditional conversation flows. Here's a toy example to illustrate the mechanism:
    • User sends message "Tell me a joke every 60 seconds." (or some other request that involves the assistant deciding when to respond rather than only reacting to messages synchronously)
    • Assistant replies "No problem. I'll be watching the clock! Press the stop button if you need to interrupt me. AUTO_CONTINUE_WITH_DELAY=60".
    • When this message has finished streaming to the user, a filter outlet detects the "AUTO_CONTINUE" text and decides to set `body["continue"] = True`.
    • This quickly triggers response continuation, similar to if the user had explicitly replied or pressed a button.
    • In some filter inlet logic, we use `asyncio.sleep(60)` or similar to stall the response flow. It could be based on any awaitable expression, not just time.
    • When the sleep is complete, the filter inlet can inject text like "(60 seconds have now passed)" into `body["messages"]`.
    • The model can react to the passage of time by generating a joke and resetting the continuation mechanism again.
    • After the user gets tired of scheduled jokes, they can hit Stop and say "Thanks, I'm done with the jokes now."
    • Naturally, the assistant's normal reply will not trigger another continuation cycle, so we are then back to the traditional reactive interaction model.
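The joke-scheduling walkthrough above can be sketched as a single filter. This is a hedged sketch: the `inlet`/`outlet` method shapes follow OpenWebUI conventions, but the `body["continue"]` flag is the proposed mechanism, not an existing API, and the `AUTO_CONTINUE_WITH_DELAY` marker is the toy protocol from the example.

```python
import asyncio
import re


# Sketch of the timed auto-continue flow: the outlet requests another turn
# (via the proposed `body["continue"]` flag), and the inlet of that follow-up
# turn sleeps for the requested delay before letting generation proceed.
class TimedContinueFilter:
    async def inlet(self, body: dict) -> dict:
        messages = body.get("messages", [])
        last = messages[-1]["content"] if messages else ""
        match = re.search(r"AUTO_CONTINUE_WITH_DELAY=(\d+)", last)
        if match:
            delay = int(match.group(1))
            await asyncio.sleep(delay)  # any awaitable could stall here, not just time
            # Tell the model that the scheduled time has passed.
            messages.append(
                {"role": "user", "content": f"({delay} seconds have now passed)"}
            )
        return body

    def outlet(self, body: dict) -> dict:
        messages = body.get("messages", [])
        last = messages[-1]["content"] if messages else ""
        if "AUTO_CONTINUE_WITH_DELAY" in last:
            body["continue"] = True  # proposed flag: schedule another turn
        return body
```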

I prototyped a mechanism somewhat like this in the ChatGPT interface a few months ago. I created a small local browser extension (based on Tampermonkey) that would inject a user message like "The time is now {time}." every 10 seconds. By default, ChatGPT replies something like "Thanks for the update. How can I help you?" However, if the context of the conversation so far included messages like "Remind me to check on the pie in the oven in 5 minutes." then some of these replies would look like "It looks like 5 minutes have passed. You should check the pie now." This was fun to play with, but the polling-based design was expensive (many messages sent when no action was needed) and imprecise (only 10-second timing granularity). The design I've sketched in this feature request could be extremely precise (continuing whenever an `await` completes) and efficient (no extra messages during idle times).

tl;dr: An auto-continue mechanism triggered by setting `body["continue"] = True` in a filter's outlet could be extremely powerful.


@tjbck commented on GitHub (Aug 15, 2024):

This would be a great addition for `__event_emitter__`, will be added soon.


@rndmcnlly commented on GitHub (Aug 15, 2024):

Having the behavior triggered via `__event_emitter__` sounds like an excellent design!

Concretely, I'm imagining you'd use it like

```py
await __event_emitter__({
    "type": "continue"
})
# returns as soon as the continue message is sent out to the UI,
# not waiting for the continuation to take place (which might never happen)
```

I had an idea like that at an earlier stage, but I dropped it because I (mistakenly) thought it would be cleaner to handle auto-continuation entirely in the backend. Here's my new reasoning in favor of the `__event_emitter__` design.

  • Generality: Events can already be emitted by many types of script, allowing auto-continue to be used more flexibly.
  • Composability: This would play nicely with other events like `{"type": "status", ..., "done": false}` to let users understand that the system is working even though they haven't typed any new messages yet.
  • Safety: Infinite continuation loops can be broken just by closing the chat window. To resume a broken cycle, the user just presses the existing Continue button or the like.

Implementation note: If several different scripts each try to trigger a continuation in the same conversation cycle, those should probably be de-duplicated / considered idempotent. An alternate message design like `{"type": "automatically_continue_this_message", "enabled": true}` might make this clearer (i.e., setting it to true many times in a row only triggers one continuation).
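That de-duplication could be sketched as follows; all names here are hypothetical, and this only illustrates the idempotent bookkeeping, not where it would live in the backend.

```python
# Hypothetical backend-side bookkeeping: however many scripts emit the
# continue event within one conversation cycle, only a single continuation
# is scheduled.
class ContinuationState:
    def __init__(self) -> None:
        self.pending = False

    def handle_event(self, event: dict) -> None:
        # Emitting the event repeatedly is idempotent: the flag just stays set.
        if (
            event.get("type") == "automatically_continue_this_message"
            and event.get("enabled")
        ):
            self.pending = True

    def consume(self) -> bool:
        # Called once per cycle: reports whether to trigger a continuation
        # and resets the flag for the next cycle.
        pending, self.pending = self.pending, False
        return pending
```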


@tjbck commented on GitHub (Aug 15, 2024):

Added to dev, here's how you would use this:

```py
await __event_emitter__(
    {
        "type": "action",
        "data": {"action": "continue"},
    }
)
```

Let me know if you encounter any issues!


Reference: github-starred/open-webui#1787