mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-12 10:04:14 -05:00
Close connection to ollama when stop button is pressed #171
Originally created by @robertvazan on GitHub (Jan 12, 2024).
Originally assigned to: @tjbck on GitHub.
Bug Report
Description
Bug Summary:
The stop button (#48) does not really work, because the WebUI backend keeps streaming the response from ollama. This also causes #444, and the same problem contributes to poor UX in #452. The WebUI backend should close its connection to ollama to stop generation.
Steps to Reproduce:
Expected Behavior:
CPU usage drops immediately, and a new question can be asked right away.
Actual Behavior:
While the client indeed stops receiving tokens immediately, ollama apparently continues to generate the whole response in the background, which blocks other chat requests. The only way to stop the generation is to restart the ollama process.
Reproduction Details
Confirmation:
Additional Information
The root cause of the issue is that the WebUI backend fails to close its connection to ollama. My experiments with the ollama API show that closing the connection cancels generation immediately, with no further CPU usage. Ollama's built-in CLI client apparently does this when you press Ctrl+C.
As I understand it, there's a connection chain:
WebUI frontend <-> WebUI backend <-> ollama
The frontend-backend connection is closed/cancelled properly, judging by browser console messages, but the backend-ollama connection apparently stays open.
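The cancellation described above has to propagate along the whole chain: when the frontend drops its connection, the backend's relay must close its own stream to ollama, otherwise generation keeps running. A minimal asyncio sketch of that propagation (simulated in-process, not Open WebUI's actual code; all names here are hypothetical):

```python
import asyncio

async def ollama_stream(state):
    """Stand-in for ollama's /api/generate streaming response: keeps
    producing tokens until the reader closes the connection."""
    try:
        while True:
            state["generated"] += 1
            yield "token"
            await asyncio.sleep(0)
    finally:
        # Runs when the backend closes its side of the stream; in real
        # ollama this is what cancels generation and frees the CPU.
        state["generation_stopped"] = True

async def backend_relay(state):
    """Stand-in for the WebUI backend: streams tokens from ollama to the
    frontend.  If this task is cancelled (frontend disconnect) without
    closing the upstream stream, ollama keeps generating in the background."""
    upstream = ollama_stream(state)
    try:
        async for token in upstream:
            pass  # a real relay would forward each token to the frontend
    finally:
        await upstream.aclose()  # close the backend->ollama connection too

async def main():
    state = {"generated": 0, "generation_stopped": False}
    relay = asyncio.create_task(backend_relay(state))
    await asyncio.sleep(0.01)
    relay.cancel()  # simulate the frontend connection dropping (stop button)
    try:
        await relay
    except asyncio.CancelledError:
        pass
    return state

state = asyncio.run(main())
print(state["generation_stopped"])  # True: upstream generation was cancelled
```

The key point is the `finally: await upstream.aclose()` in the relay: cancellation of the downstream connection alone is not enough; the backend must explicitly close the upstream stream.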
@jukofyork commented on GitHub (Jan 12, 2024):
Yeah, this can be very problematic if you are using a model that sometimes goes wrong and gives out an infinitely repeating response. I had to ssh into the host machine and do a kill -9 on the ollama process to get it working again.
@MarvinJWendt commented on GitHub (Jan 15, 2024):
I would also like to have a restart button in the WebUI. Sometimes when Ollama hangs, I need to do the same.
@tjbck commented on GitHub (Jan 18, 2024):
Might be relevant: https://github.com/tiangolo/fastapi/discussions/8805
@robertvazan commented on GitHub (Jan 18, 2024):
@tjbck This mostly works. Thanks! One issue though: why does it take 15 seconds for ollama to go quiet after cancelling the response in the UI? My experiments at the API level show that ollama CPU usage should drop in under 3 seconds after the connection is closed.
@robertvazan commented on GitHub (Jan 18, 2024):
PS: It takes much longer with larger models. Do you have a buffer somewhere that needs to process a certain number of tokens before it can close the connection?
@robertvazan commented on GitHub (Feb 12, 2024):
This is broken again. Ollama WebUI keeps streaming the rest of the response after I press the stop button.
@tjbck commented on GitHub (Feb 12, 2024):
@robertvazan Hmm, AFAIK there haven't been any changes from the webui side. Could you verify that the issue is from the webui? Thanks!
@robertvazan commented on GitHub (Feb 12, 2024):
@tjbck After some testing, I see that it is only broken under some circumstances. The last fix worked, but it did not cover all cases. Repro steps: