[GH-ISSUE #18133] issue: 0.6.33 - Focused Retrieval mode doesn't work #18507

Closed
opened 2026-04-20 00:44:16 -05:00 by GiteaMirror · 35 comments
Owner

Originally created by @frenzybiscuit on GitHub (Oct 8, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18133

Originally assigned to: @tjbck on GitHub.

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

0.6.33

Ollama Version (if applicable)

No response

Operating System

Debian 12

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Works as expected

Actual Behavior

Focused Retrieval mode doesn't work. Instead, it will use full context on all files in a knowledgebase.

I can CONFIRM:

A) URLs load fine (which, IIRC, requires RAG to work)
B) Files get uploaded fine to the knowledgebase
C) The RAG backend gets hit when the files get uploaded, and it works.

This happens WITH and WITHOUT a reranker.

Steps to Reproduce

Install LiteLLM + VLLM and use bge-large-en-v1.5 for the embedding model.

Set the Documents settings as shown below (again, it happens both WITH and WITHOUT the reranker). Top-K is set to 10.

When you insert a knowledgebase into the chat, it will retrieve -all- documents in full context mode and use 20k+ context in chat.

![Image](https://github.com/user-attachments/assets/a3fc6dae-fb75-4432-992a-231e2c4c4406)

Logs & Screenshots

See above for screenshot

Additional Information

Also, large knowledgebases (700+ documents) don't work at all on retrieval.

GiteaMirror added the bug label 2026-04-20 00:44:16 -05:00

@frenzybiscuit commented on GitHub (Oct 8, 2025):

Also, I've removed LiteLLM from the equation and have the same issue.


@REDWGioBrusca commented on GitHub (Oct 8, 2025):

I'm also having this issue. Downgrading back to 0.6.22 fixes it. I wasn't getting any errors in my logs either, it just wasn't running the focused retrieval.


@tjbck commented on GitHub (Oct 8, 2025):

@silentoplayz could you confirm here?


@silentoplayz commented on GitHub (Oct 8, 2025):

@frenzybiscuit Could you share a fuller screenshot of your Document settings for Open WebUI with LiteLLM removed from the equation?

Edit: Note that I do not use VLLM or LiteLLM myself, so my chances of reproducing this issue may be slim to none.


@frenzybiscuit commented on GitHub (Oct 8, 2025):

> @frenzybiscuit Could you share a fuller screenshot of your Document settings for Open WebUI with LiteLLM removed from the equation?

This is what it looks like when I use VLLM directly for embedding (reranker is irrelevant, as the issue happens with/without):

![Image](https://github.com/user-attachments/assets/0f05ee9a-6d57-43de-94ca-66e0949ff7c4)

@frenzybiscuit commented on GitHub (Oct 8, 2025):

This is what it looks like on vLLM's end (yes, the IP changed in this screenshot) during file upload to the knowledgebase.

There is -no activity- on vLLM when retrieving from the knowledgebase.

Image

@silentoplayz commented on GitHub (Oct 8, 2025):

I believe I may have reproduced the issue without much work? I need your confirmation on this though. I would assume it shouldn't be retrieving all 115 documents in the knowledge collection if Full Context Mode is toggled off in the Documents admin settings for RAG. @frenzybiscuit

![Image](https://github.com/user-attachments/assets/cf4a9c40-c7bd-47b5-a0c5-358ec3fa9d27)
![Image](https://github.com/user-attachments/assets/49c599ae-c8da-4ceb-b190-dfb896a0e4c6)

@frenzybiscuit commented on GitHub (Oct 8, 2025):

I made a video but can't upload it. If you are on Discord, ping me and I'll send it to you.

Basically, the problem is this:

  1. Using RAG retrieval does not work.

  2. It forces full context mode (despite not being selected) and loads -all- the knowledgebase into context, regardless of Top-K in the settings.

  3. The embedding server doesn't receive any requests from OWUI when this happens.


@frenzybiscuit commented on GitHub (Oct 8, 2025):

I'm aware this is being looked into, but I wanted to share the embedding command for llama.cpp so you can test it (since most of you likely use Ollama). This worked on the previous OWUI version; it has the same problem now.

./llama.cpp*/build/bin/llama-server --host iphere --port 5000 -m ~/models/bge-large-en-v1.5.f16.gguf --embedding --pooling cls -ub 8192 --no-mmap --flash-attn on --api-key here --cont-batching --parallel 5
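For anyone sanity-checking a server started this way: llama-server with --embedding exposes an OpenAI-compatible /v1/embeddings route, so you can verify the server responds independently of Open WebUI. A minimal sketch of the request payload and response parsing (the host, port, and API key are the placeholders from the command above):

```python
import json


def build_embedding_request(texts, model="bge-large-en-v1.5"):
    """Build an OpenAI-compatible /v1/embeddings payload.

    POST this as JSON to http://<host>:5000/v1/embeddings with an
    `Authorization: Bearer <api-key>` header matching --api-key above.
    """
    return {"model": model, "input": list(texts)}


def parse_embeddings(response_body):
    """Extract vectors from a /v1/embeddings JSON response, in input order."""
    data = json.loads(response_body)
    return [item["embedding"]
            for item in sorted(data["data"], key=lambda d: d["index"])]
```

If a direct POST like this succeeds while chat retrieval produces no traffic on the server, the problem is on the Open WebUI side rather than the embedding backend.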


@PawelAnt commented on GitHub (Oct 8, 2025):

> I made a video, but can't upload it. If you are on Discord, ping me and I'll send it to you.
>
> Basically the problem is this:
>
> 1. Using RAG retrieval does not work
> 2. It forces full context mode (despite not being selected) and loads -all- the knowledgebase into context, regardless of Top-K in the settings.
> 3. The embedding server doesn't receive any requests from OWUI when this happens.

I confirm this: even when only one small document is loaded into the Qdrant knowledge base, the question exceeds the context window with the prompt itself.


@theepicsaxguy commented on GitHub (Oct 8, 2025):

Exact same issue for me. It seems to force full context mode even though that is disabled.


@Classic298 commented on GitHub (Oct 8, 2025):

Can reproduce


@dotmobo commented on GitHub (Oct 8, 2025):

+1, same problem for me with Qdrant, hybrid search, and an external embedding and reranking engine.


@AbdullahMPrograms commented on GitHub (Oct 8, 2025):

A similar issue exists for web search; search no longer queries results, instead returning a large amount of tokens (50k+) to the model. In 0.6.32, with the exact same settings, it would return ~5k tokens.

0.6.33:
![Image](https://github.com/user-attachments/assets/8fb04852-5ea2-4716-88f4-a5d78bd3fc28)

(47.8k tokens in prompt)

0.6.32:
![Image](https://github.com/user-attachments/assets/6d2f233c-dbb9-414e-963c-68874284dd26)

(6.2k tokens in prompt)

This is a regeneration of the same search prompt with the same settings across versions, but the behaviour is consistent with all web searches.

llama-server command:
./llama-server -m /home/victis/LLM/Models/unsloth/embeddinggemma-300m-GGUF/embeddinggemma-300M-Q8_0.gguf --embeddings -c 2048 -ngl 999 --flash-attn on

Document settings:
![Image](https://github.com/user-attachments/assets/40e734be-f84c-4925-817b-cfb46aa2d732)


@Matten1887 commented on GitHub (Oct 8, 2025):

I've also been running into this issue since the last update.


@cableman commented on GitHub (Oct 8, 2025):

Using Qdrant as the vector store, it loads all documents in the model's knowledge base regardless of the "question" asked! It did not do that in 0.6.32.


@jamesottera commented on GitHub (Oct 9, 2025):

Having this same issue with the Postgres vector DB. This is causing a HUGE context increase and massive cost growth.


@arruga commented on GitHub (Oct 9, 2025):

Same issue here with v0.6.33 on Ubuntu 22.04. It didn't happen with v0.6.32. I have 3674 source files across 4 knowledge collections. When I ask something in the chat, all 3674 sources are retrieved (there is a temporary message indicating sources_other). I'm using the default ChromaDB. This happens with and without hybrid search, with "full context mode" off and Top-K = 10.
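For context on what Top-K = 10 should mean in these reports: a focused retrieval pass scores chunks against the query embedding and keeps only the k best matches, rather than handing every source to the model. A minimal sketch of that selection step (illustrative only, not Open WebUI's actual implementation):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def top_k_chunks(query_vec, chunk_vecs, k=10):
    """Return indices of the k chunks most similar to the query.

    This is the behavior focused retrieval should exhibit; the bug reported
    here behaves as if k were unbounded, returning every chunk instead.
    """
    scored = sorted(enumerate(chunk_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]
```

With 3674 sources and Top-K = 10, a working pass would surface at most 10 chunks; retrieving all 3674 is the full-context behavior being reported.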


@mahenning commented on GitHub (Oct 9, 2025):

I PR'ed a fix: https://github.com/open-webui/open-webui/pull/18182.


@alpilotx commented on GitHub (Oct 9, 2025):

> I PR'ed a fix: #18182.

Indeed, it seems to fix it (I just quickly applied the changed file in a test env, and retrieval now seems to be back to "normal" again, not returning all files).


@jamesottera commented on GitHub (Oct 9, 2025):

Given the seriousness of this issue causing huge context increases that have a large financial impact, can this be merged and rolled out on a release today?

Otherwise, is there any breaking change in 0.6.33 that would cause issues with databases if we rolled our servers back to 0.6.32? I was hesitant to do that due to any possible migrations that may have run.


@dotmobo commented on GitHub (Oct 9, 2025):

@jamesottera FYI, I rolled back to 0.6.32 and don't have any issues.


@silentoplayz commented on GitHub (Oct 9, 2025):

Testing is wanted on the dev branch to see if the issue reported here has been solved or not! c4832fdb70


@Classic298 commented on GitHub (Oct 9, 2025):

Might be fixed on dev > c4832fdb70


@theepicsaxguy commented on GitHub (Oct 10, 2025):

> Testing is wanted on the dev branch to see if the issue reported here has been solved or not! c4832fd

I have tested ghcr.io/open-webui/open-webui:git-c4832fd-slim and confirm it is fixed.


@silentoplayz commented on GitHub (Oct 10, 2025):

Testing from a contributor internally has also revealed that this issue has most likely been fixed/solved. I'll close this issue, but feel free to add a comment (whoever may see this) if you're still having issues!


@deliciousbob commented on GitHub (Oct 13, 2025):

I've just tested ghcr.io/open-webui/open-webui:git-c4832fd-slim and I can also confirm that retrieval is working normally again. Instead of 413 sources, it retrieved 10 sources, as limited by reranking. Thank you for the fix!


@nlamarque42 commented on GitHub (Oct 13, 2025):

Big issue indeed. We are speedrunning the 1T Tokens of Appreciation OpenAI Trophy with this one.


@jamesottera commented on GitHub (Oct 13, 2025):

Given the severity of this issue from a performance and cost standpoint, and given that it was confirmed fixed 3 days ago, can this please be merged to main ASAP? This is a regression, not a feature request or a minor bug.


@ypsilonkah commented on GitHub (Oct 16, 2025):

Switching to ghcr.io/open-webui/open-webui:git-c4832fd-slim reverted all settings changes and chats from the last 14 or so days. Or is it just on my side here?


@mahenning commented on GitHub (Oct 16, 2025):

As @jamesottera and others already wrote, I urge a release of this fix as fast as possible. Make 0.6.34 contain only this fix if you have to, or at least mark the affected version as yanked on PyPI, as was done for e.g. vLLM v0.2.1. I don't understand why a fix that was pushed only 1 day later still isn't on main after over a week.


@Classic298 commented on GitHub (Oct 16, 2025):

I can't answer your questions precisely, but 0.6.34 is probably close to releasing.

Either way, if you struggle with this issue, downgrading is a valid option.


@czar commented on GitHub (Oct 16, 2025):

> Either way, if you struggle with this issue, downgrading is a valid option.

I'm using Watchtower for updates (via docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui) and now I'm trying to downgrade, but I'm not quite sure how to do it. Any help with downgrading my open-webui to before this bug was present would be awesome. Thanks!


@mahenning commented on GitHub (Oct 16, 2025):

@czar Watchtower only updates the image/container by watching the tag for changes. If you used the docker run command from the main GitHub page, the image used is ghcr.io/open-webui/open-webui:main. Change the open-webui:main part to e.g. open-webui:0.6.32 to pin the image to the version before the latest (0.6.33) and recreate the container.

@Classic298 I know that downgrading is a valid option, but most people using Open WebUI won't become aware of this issue until after burning through a few 100k tokens with an external API. I just wanted to voice that I'm unhappy that the fix is already on dev but not published. There have been 2+ releases on the same day in the past.


@jamesottera commented on GitHub (Oct 16, 2025):

Agreed with @mahenning. We are the few who noticed this and have been vocal about having it addressed. There are countless others who are not aware of it, and I'm sure there are many users of this repo running Docker on the main branch with no version pinning.

It should not be assumed that every user monitors every release; some may be surprised by unexpected bills in the thousands of dollars.

This isn't a trivial bug; it is one with great financial impact, and for some companies using this project it may be the type of thing that moves them away from it. This isn't only being used by casual home users.

Downgrading (if you are aware of the issue) is an option, and it is what my team did, but downgrading sometimes is not possible if an update ran database migrations, as rolling back would break the deployment.

It is understood there is some kind of release cycle with other changes that need testing, but this single commit could be cherry-picked.

Reference: github-starred/open-webui#18507