[GH-ISSUE #1483] feat: direct llama.cpp integration #28044
Originally created by @tjbck on GitHub (Apr 10, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1483
Originally assigned to: @tjbck on GitHub.
@jukofyork commented on GitHub (Apr 10, 2024):
Just a quick follow-up to say it seems to work fine:
I ran the llama.cpp server (eg: `./server --port 8081 ...`), set the base URL to `http://127.0.0.1:8081/v1`, and set the API Key to something non-blank (eg: `none`) in OpenWebUI settings, and it seems to be calling the OAI-like API endpoint on the llama.cpp server fine. It wasn't that clear that I needed to add the `/v1` to the URL and ensure the API Key was not blank though (I had to find that by trial and error).
The only difference I can see is there is no little "information" icon like there was with Ollama models, but it does seem to be calling the OAI-like API endpoint to get these stats:
I'll report back if I can see any other major differences, but otherwise 👍
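As a concrete illustration of the setup described above, here is a minimal sketch of a request against the llama.cpp server's OpenAI-compatible endpoint. The port (`8081`) and the dummy key (`none`) are taken from the comment; the placeholder model name is an assumption, since the server answers for whichever model it was launched with:

```python
import requests

# Assumes the llama.cpp server was started with something like: ./server --port 8081 ...
BASE_URL = "http://127.0.0.1:8081/v1"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    # Any non-blank key works unless the server was started with --api-key
    headers={"Authorization": "Bearer none"},
    json={
        "model": "default",  # placeholder; the server uses whatever model it loaded
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

This is essentially the same `/chat/completions` call Open WebUI ends up making once the base URL and non-blank key above are configured.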
@jukofyork commented on GitHub (Apr 12, 2024):
I've used this quite a bit with the llama.cpp server now, and the only problem I've come across is that pressing the stop button doesn't actually disconnect/stop the generation. This was a problem with the Ollama server too and was fixed, AFAIK:
https://github.com/open-webui/open-webui/issues/1166
https://github.com/open-webui/open-webui/issues/1170
It would be helpful if this could be added to the OpenAI API code too, as otherwise the only way currently to stop runaway LLMs is to Control-C the running server and restart it.
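Not Open WebUI's implementation, just an illustrative sketch of the behaviour being requested here: when the client aborts a streaming request mid-generation, the HTTP connection is closed and the backend can notice the disconnect and stop decoding. The endpoint, port, and model name are the same assumptions as in the sketch above, and whether generation actually stops depends on the server detecting the dropped connection:

```python
import json
import requests

# Start a streaming chat completion against the assumed llama.cpp endpoint.
resp = requests.post(
    "http://127.0.0.1:8081/v1/chat/completions",
    headers={"Authorization": "Bearer none"},
    json={
        "model": "default",  # placeholder model name
        "messages": [{"role": "user", "content": "Write a very long story."}],
        "stream": True,
    },
    stream=True,
)

for i, line in enumerate(resp.iter_lines()):
    if not line or not line.startswith(b"data: "):
        continue
    if line == b"data: [DONE]":
        break
    chunk = json.loads(line[len(b"data: "):])
    print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
    if i > 20:
        # Simulate the user pressing "stop": closing the response drops the
        # connection, which is what allows the backend to abort generation.
        resp.close()
        break
```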
@jukofyork commented on GitHub (Apr 12, 2024):
Another thing that might be helpful would be an option to hide the "Modelfiles" and "Prompts" menu items on the left, as these can't be used with the OpenAI API and just add clutter.
@tjbck commented on GitHub (Apr 14, 2024):
@jukofyork I'll start working on this feature after #665; we should strive to keep all the core features.
@DenisSergeevitch commented on GitHub (Apr 26, 2024):
Small update: Stop generation button is still an issue
@justinh-rahb commented on GitHub (Apr 26, 2024):
@DenisSergeevitch that is unrelated to the issue being discussed here. Let's keep discussion of the stop generation function here:
@tjbck commented on GitHub (Jun 13, 2024):
Related: https://github.com/open-webui/open-webui/issues/1166
#1568
@jukofyork @DenisSergeevitch @SN4K3D @0x7CFE
Correct me if I'm wrong, but the stop generation button not actually stopping is only an issue when running LLMs with Ollama on CPU only, and the vast majority of us have zero issues terminating the response with the stop button. Could anyone confirm this with the latest version? I appreciate it!
@SN4K3D commented on GitHub (Jul 24, 2024):
I can confirm the issue is with Ollama/llama.cpp running LLMs on CPU only. Today I tried the latest version and the stop button works; it stops all the threads Ollama launched.
Thanks all for your work, it's appreciated.