recreate rag collection instead of falling back to stale version #409

Closed
opened 2025-11-11 14:20:34 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @fbirlik on GitHub (Mar 4, 2024).

Bug Report

Description

Bug Summary:
store_data_in_vector_db is used to store web pages after retrieval. Currently, when a page is fetched again, the new version is retrieved and split into chunks, but because a collection for that page already exists, the new data is dropped. Queries are then executed against the previous version instead of the latest one.
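
The failure mode can be modeled with a minimal sketch. It uses an in-memory dictionary as a stand-in for the vector database; `FakeVectorDB`, `CollectionExistsError`, `store_chunks_stale`, and the collection name are hypothetical and are not the project's actual code.

```python
class CollectionExistsError(Exception):
    """Raised when a collection with the same name already exists."""


class FakeVectorDB:
    """In-memory stand-in for a vector store (hypothetical, for illustration)."""

    def __init__(self):
        self.collections = {}

    def create_collection(self, name):
        if name in self.collections:
            raise CollectionExistsError(name)
        self.collections[name] = []
        return self.collections[name]


def store_chunks_stale(db, name, chunks):
    """Models the reported behavior: on a name clash the new chunks are dropped."""
    try:
        collection = db.create_collection(name)
    except CollectionExistsError:
        # The earlier collection is kept, so later queries see the old content.
        return False
    collection.extend(chunks)
    return True


db = FakeVectorDB()
store_chunks_stale(db, "page", ["version 1"])
stored = store_chunks_stale(db, "page", ["version 2"])
print(stored, db.collections["page"])
```

In this model the second call reports that nothing was stored, and the collection still holds only the chunks from the first fetch.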

Steps to Reproduce:

  • Open a new chat, fetch a web page using the '#' prefix, and query its content (preferably a page whose content the tester can change).
  • Wait until the content changes (or update it manually).
  • In a new chat, query the same page using the '#' prefix.
  • The query results are based on the first fetch, even though the second fetch completed.

Expected Behavior:

  • If a page is fetched again, the latest version should be available for queries.

Actual Behavior:

  • If a page is fetched again, the first retrieved version remains in use for all subsequent queries.
Author
Owner

@fbirlik commented on GitHub (Mar 4, 2024):

Pull request #1029 deletes the existing collection if an earlier version exists.
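
A minimal sketch of the delete-and-recreate approach described in the comment above, using a plain dictionary in place of a vector database. The function name `store_chunks_fresh` and the dictionary layout are assumptions for illustration only; the actual change lives in the pull request.

```python
def store_chunks_fresh(collections, name, chunks):
    """Replace any earlier collection of the same name with the new chunks."""
    # Drop the stale collection, if any, before storing the new version.
    collections.pop(name, None)
    collections[name] = list(chunks)
    return collections[name]


collections = {}
store_chunks_fresh(collections, "page", ["version 1"])
store_chunks_fresh(collections, "page", ["version 2"])
print(collections["page"])
```

With this approach a repeated fetch always leaves only the most recent chunks in the collection, at the cost of re-storing the page on every fetch.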


Reference: github-starred/open-webui#409