Concurrent users using RAG issues #4070

Closed
opened 2025-11-11 15:45:33 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @galvanoid on GitHub (Feb 23, 2025).

This is the scenario:

A user is creating a collection of documents (for example 100 documents), which may take a few hours.

Meanwhile, another user queries a model using their own collections.

While that ingestion is running, the second user's answers are inconsistent.

If the first user stops their ingestion process, the second user's questions are answered correctly again.

Could there be a problem when two users simultaneously use the same embedding model (llama embedding models)?

Latest Open WebUI installed on Ubuntu Server 24.04, running in Docker with a local llama installation.
2x NVIDIA RTX 3060 cards.
Embedding model: Nomic or Arctic Embed.
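For reference, the scenario can be exercised in isolation with a small concurrency harness. The sketch below is hypothetical: the real HTTP call to the local embedding server is replaced by a deterministic stub (`embed`) so the script is self-contained, since the point is only to check that an "ingestion" thread and a "query" thread running at the same time each get back the result for their own input.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def embed(text: str) -> list[float]:
    """Deterministic stand-in for a real embedding call.

    In the real setup this would be an HTTP request to the local
    embedding server; a hash keeps the sketch runnable offline.
    """
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]


def ingest(docs: list[str]) -> list[list[float]]:
    # First user: embed a large batch of documents.
    return [embed(d) for d in docs]


def query(questions: list[str]) -> list[list[float]]:
    # Second user: embed queries while ingestion is running.
    return [embed(q) for q in questions]


if __name__ == "__main__":
    docs = [f"document {i}" for i in range(100)]
    questions = ["what is RAG?"] * 5

    with ThreadPoolExecutor(max_workers=2) as pool:
        ingest_future = pool.submit(ingest, docs)
        query_future = pool.submit(query, questions)

    # Consistency check: every embedding produced under concurrency
    # must match the embedding of the same text computed in isolation.
    baseline = embed("what is RAG?")
    assert all(vec == baseline for vec in query_future.result())
    assert len(ingest_future.result()) == 100
    print("embeddings consistent under concurrency")
```

With the stub, the check always passes; pointing `embed` at the real embedding endpoint instead would show whether concurrent requests there return mismatched vectors.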


Reference: github-starred/open-webui#4070