mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-12 01:54:38 -05:00
[PR #772] [MERGED] feat: choose embedding model when using docker #7253
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/772
Author: @jannikstdl
Created: 2/17/2024
Status: ✅ Merged
Merged: 2/19/2024
Merged by: @tjbck
Base: main ← Head: choose-embedding-model

📝 Commits (7)
- 1846c1e choose embedding model when using docker
- bc3dd34 collection query fix
- 4b88e7e Merge branch 'main' into choose-embedding-model
- 0cb0358 refac: more descriptive var names
- acf9990 storing vectordb in project cache folder + device types
- ab104d5 refac
- 7c127c3 feat: dynamic embedding model load

📊 Changes
4 files changed (+87 additions, -17 deletions)

Changed files:
- 📝 Dockerfile (+19 -4)
- 📝 backend/apps/audio/main.py (+1 -1)
- 📝 backend/apps/rag/main.py (+61 -11)
- 📝 backend/config.py (+6 -1)

📄 Description
Changes
- all-MiniLM-L6-v2 has low performance nowadays and supports only English.
- all-MiniLM-L6-v2 is still the default, please read the important info down below!
- Added intfloat/multilingual-e5-large, which is one of the most powerful embedding models available. (You can also use the instruct version, intfloat/multilingual-e5-large-instruct, which is smaller and almost better than the latest from OpenAI, "text-embedding-3-large".)
- The embedding model is now loaded dynamically in main.py.

Improvements
In my local testing (in German), the outputs with intfloat/multilingual-e5-large as the embedding model are far more accurate with respect to the question given to the LLM. The load times of the RAG were also much shorter; I don't know why, or whether that is the case for you too. Maybe you can test this and give feedback.
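A minimal sketch of the env-driven model selection described above. The variable name SENTENCE_TRANSFORMER_EMBED_MODEL is quoted from the PR diff; the helper function below is purely illustrative, not the actual open-webui code:

```python
# Sketch of the env-driven embedding model selection this PR introduces.
# SENTENCE_TRANSFORMER_EMBED_MODEL is the variable name quoted in the PR;
# resolve_embed_model() is an illustrative helper, not the real code.
def resolve_embed_model(env: dict) -> str:
    # Fall back to the previous hard-coded default when no override is set
    return env.get("SENTENCE_TRANSFORMER_EMBED_MODEL", "all-MiniLM-L6-v2")

print(resolve_embed_model({}))  # -> all-MiniLM-L6-v2
print(resolve_embed_model(
    {"SENTENCE_TRANSFORMER_EMBED_MODEL": "intfloat/multilingual-e5-large"}
))  # -> intfloat/multilingual-e5-large
```

The resolved name is then handed to the SentenceTransformer embedding function as before, so an unset env keeps the old behavior exactly.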
Open Points
- The vector DB is now stored under /app/backend/data/cache; for now, the model dir is still ~/.cache/torch/sentence_transformers, which is not in the PVC. I didn't find a parameter to change the location here:
  sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(model_name=SENTENCE_TRANSFORMER_EMBED_MODEL)
  Instead of giving the model name as a string to be downloaded, you can also pass a path via the model_name param. Maybe this would be the way to go.
- Device type: cpu (our standard for both) or cuda, which can lead to better performance when using NVIDIA GPUs. We could also set an ENV to change that to the user's specific needs.

Important Information
If you have some documents stored under the /documents route, changing the embedding model will cause the backend to not be able to read the files. So if you have a lot of docs being used for RAG, leave it at the default or re-embed your files. I also mentioned this in a Dockerfile comment.
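The reason switching models breaks existing documents: vectors stored in the collection were produced by all-MiniLM-L6-v2 (384 dimensions), while intfloat/multilingual-e5-large emits 1024-dimensional vectors, so a query vector no longer lives in the same space as the stored ones. The check below is illustrative only:

```python
# Illustrative only: published embedding sizes are 384 dims for
# all-MiniLM-L6-v2 and 1024 dims for intfloat/multilingual-e5-large.
def needs_reembedding(stored_dim: int, query_dim: int) -> bool:
    # Similarity search only works when both sides share one vector space
    return stored_dim != query_dim

print(needs_reembedding(384, 1024))  # -> True: re-embed or keep the default
```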
@tjbck I don't know if this snippet in the Dockerfile was ever used:

```dockerfile
# wget embedding model weight from alpine (does not exist from slim-buster)
RUN wget "https://chroma-onnx-models.s3.amazonaws.com/all-MiniLM-L6-v2/onnx.tar.gz" -O - | \
    tar -xzf - -C /app
```

But with this update, all-MiniLM-L6-v2 is declared in the ENV, so it is obsolete imo. If that is the case, deleting this RUN statement would save some build time + image size.
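The two open points above (model cache location and device type) could be wired through env vars along these lines. SENTENCE_TRANSFORMERS_HOME is the cache-location variable the sentence-transformers library actually honors; RAG_EMBEDDING_DEVICE is a hypothetical name invented only for this sketch:

```python
# Sketch of resolving the open points via env vars.
# SENTENCE_TRANSFORMERS_HOME: real env var honored by sentence-transformers
# for its download cache. RAG_EMBEDDING_DEVICE: hypothetical, for this sketch.
def resolve_runtime_opts(env: dict) -> tuple:
    cache_dir = env.get("SENTENCE_TRANSFORMERS_HOME", "/app/backend/data/cache")
    device = env.get("RAG_EMBEDDING_DEVICE", "cpu")  # "cuda" for NVIDIA GPUs
    return cache_dir, device

print(resolve_runtime_opts({}))  # -> ('/app/backend/data/cache', 'cpu')
```

Pointing the cache into /app/backend/data/cache would keep the downloaded model weights inside the PVC, addressing the first open point.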
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.