issue: Alternatives to using UUID4 to generate hashes #4969

Closed
opened 2025-11-11 16:08:23 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @vinsdragonis on GitHub (Apr 27, 2025).

Check Existing Issues

  • [x] I have searched the existing issues and discussions.
  • [x] I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.5

Ollama Version (if applicable)

0.6.5

Operating System

Windows 11

Browser (if applicable)

Chrome 100.0

Confirmation

  • [x] I have read and followed all instructions in README.md.
  • [x] I am using the latest version of both Open WebUI and Ollama.
  • [x] I have included the browser console logs.
  • [ ] I have included the Docker container logs.
  • [x] I have listed steps to reproduce the bug in detail.

Expected Behavior

  • When a document with the same name and content is uploaded again, the hashed ID of the previously created collection (if one exists) should be reused for updates.
  • No new collection should be created if a collection for that file already exists.

Actual Behavior

  • A new hash is produced every time any file is uploaded (even the same file, unchanged), and a new collection is created.

Steps to Reproduce

  1. Click on "Upload file" and upload any file (PDF, docx, etc.)
  2. Once uploaded, you may give any prompt if needed
  3. Upload the same file again (even unchanged)
  4. Observe in the browser console logs, the Docker logs, or any dashboard tool for your vector database that a new hash ID is generated on every upload, whether the file is the same or different.
  5. This can lead to "Collection explosion" or "Index explosion"

Logs & Screenshots

N/A

Additional Information

  • This seems to stem from the use of UUID4 while generating ID(s) for uploaded files.
  • The better option would be using UUID3 or UUID5 for this.
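
To illustrate the difference behind this suggestion, here is a minimal sketch in Python using only the standard library. It is not Open WebUI's actual code: the namespace value and the `deterministic_file_id` helper are hypothetical names made up for this example, and deriving the ID from the file name plus a SHA-256 digest of the content is an assumption about what "same name and content" would mean in practice.

```python
import hashlib
import uuid

# Hypothetical namespace for this illustration only; any fixed UUID would do.
FILE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/uploaded-files")


def deterministic_file_id(filename: str, content: bytes) -> str:
    """Derive a name-based (version 5) UUID from a file's name and content."""
    # Hash the content first so the UUID input stays short for large files.
    digest = hashlib.sha256(content).hexdigest()
    return str(uuid.uuid5(FILE_NAMESPACE, f"{filename}:{digest}"))


# Same name and same bytes give the same ID on every call.
first = deterministic_file_id("report.pdf", b"same bytes")
second = deterministic_file_id("report.pdf", b"same bytes")

# Changing the content (or the name) gives a different ID.
changed = deterministic_file_id("report.pdf", b"different bytes")

# uuid4 is random, so two calls practically never match.
random_a = str(uuid.uuid4())
random_b = str(uuid.uuid4())

print(first == second)       # identical inputs
print(first == changed)      # different content
print(random_a == random_b)  # two random IDs
```

With an ID like this, a re-upload of an unchanged file would map to the same identifier, which is what would allow an existing collection to be looked up and reused. Whether that is desirable depends on the application, since it also means two users uploading identical files would get the same ID unless something user-specific is mixed into the name.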
GiteaMirror added the bug label 2025-11-11 16:08:23 -06:00
Author
Owner

@tjbck commented on GitHub (Apr 27, 2025):

Intended behaviour and this is not a real issue.

Reference: github-starred/open-webui#4969