mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[GH-ISSUE #20846] issue: Bug with pagination for search_files_by_id. Leads to duplicates being returned or files missing #57975
Originally created by @thomasmhofmann on GitHub (Jan 21, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20846
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.7.2
Ollama Version (if applicable)
No response
Operating System
Linux from official container image
Browser (if applicable)
No response
Confirmation
Expected Behavior
Bug Report: Non-Deterministic Pagination in Knowledge Base File Listing
Summary
The knowledge base file listing API (GET /api/v1/knowledge/{id}/files) returns non-deterministic results when paginating through files that share the same updated_at timestamp. This causes files to appear multiple times across pages or not appear at all.
Environment
GET /api/v1/knowledge/{id}/files?page={N}
Expected Behavior
Each file should appear exactly once across all paginated results. The total count should match the number of unique files returned.
Actual Behavior
Example with 208 files in knowledge base:
total: 208
lb-1603-v2.0.md (id: ab2efd9f-86a5-4aaa-8a02-9377f32da4d3) appears on both page 1 and page 2
lb-448-v1.0.md (id: 1a1f1963-62ce-4404-b892-7fea3c5ca8da) never appears on any page
Database verification shows both files exist and are linked to the knowledge base:
Result:
Both files have identical updated_at timestamps.
Root Cause
File: backend/open_webui/models/knowledge.py
Method: Knowledges.search_files_by_id()
Line: 463
The query sorts by File.updated_at without a secondary sort key. When multiple files have the same updated_at value, the database returns them in an undefined order. This order can vary between pagination requests, causing files to be duplicated across pages or skipped entirely.
This is a well-known database pagination anti-pattern when sorting by non-unique columns.
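The failure mode can be illustrated without a database. The following is a minimal Python sketch, not Open WebUI code: the rows, the paginate helper, and the page size of 2 are all hypothetical. It models two page requests in which a tie on updated_at is resolved in two different, equally valid orders.

```python
# Hypothetical rows: (id, updated_at). "b" and "c" share the same timestamp.
ROWS = [("a", 300), ("b", 200), ("c", 200), ("d", 100)]


def paginate(ordered_rows, page, page_size=2):
    """Return one 1-indexed page of an already ordered result, like LIMIT/OFFSET."""
    start = (page - 1) * page_size
    return ordered_rows[start:start + page_size]


# Both orderings satisfy "ORDER BY updated_at DESC"; they differ only in the tie.
order_seen_by_request_1 = [("a", 300), ("b", 200), ("c", 200), ("d", 100)]
order_seen_by_request_2 = [("a", 300), ("c", 200), ("b", 200), ("d", 100)]

page_1 = paginate(order_seen_by_request_1, page=1)
page_2 = paginate(order_seen_by_request_2, page=2)

# Compare what the client received against what actually exists.
returned_ids = [row_id for row_id, _ in page_1 + page_2]
duplicated = sorted({i for i in returned_ids if returned_ids.count(i) > 1})
missing = sorted({row_id for row_id, _ in ROWS} - set(returned_ids))

print("duplicated:", duplicated)
print("missing:", missing)
```

In this sketch one tied row is returned on both pages and the other is never returned, which matches the pattern reported above for lb-1603-v2.0.md and lb-448-v1.0.md.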
Steps to Reproduce
Have multiple files with identical updated_at timestamps (e.g., bulk upload)
GET /api/v1/knowledge/{id}/files?page=1
GET /api/v1/knowledge/{id}/files?page=2
Logs & Screenshots
Results in:
API result is missing lb-448-v1.0.md but lb-1603-v2.0.md appears twice.
Additional Information
Proposed Fix
Add a secondary sort on a unique column (e.g., File.id) to ensure deterministic ordering.
This should be applied to all order_by clauses in the method.
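The effect of such a tie-breaker can be sketched with Python's built-in sqlite3 module. This is an illustration only, not the code from knowledge.py: the table name, column types, row count, and page size are assumptions. Every row is given the same updated_at value to mimic a bulk upload, and the unique id column is added as a secondary sort key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id TEXT PRIMARY KEY, updated_at INTEGER NOT NULL)")

# Simulate a bulk upload: every file gets the same updated_at value.
conn.executemany(
    "INSERT INTO file (id, updated_at) VALUES (?, ?)",
    [(f"file-{n:03d}", 1700000000) for n in range(10)],
)


def fetch_page(page, page_size=3):
    """Fetch one 1-indexed page, breaking updated_at ties on the unique id column."""
    rows = conn.execute(
        "SELECT id FROM file ORDER BY updated_at DESC, id ASC LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    )
    return [row[0] for row in rows]


all_ids = []
for page in range(1, 5):  # 10 rows at 3 per page gives 4 pages
    all_ids.extend(fetch_page(page))

print(len(all_ids), len(set(all_ids)))
```

Because the combined sort key (updated_at, id) is unique, the row order is fully determined and each id lands on exactly one page. In SQLAlchemy the same idea would usually be written by passing a second expression to order_by, for example File.id.asc() after the updated_at expression, though the exact query in the method is not shown here.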
Impact
Client-Side Workaround
Until the Open WebUI API is fixed, clients can work around this issue by specifying a sort order on a unique column using the filter parameters:
API Request:
Filter Parameters:
order_by=name: Sort by filename (unique in most cases)
direction=asc: Ascending order for consistency
This ensures deterministic ordering across paginated requests, preventing duplicates and missing files.
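As a rough client-side sketch of this workaround in Python (rather than the Java of the original report): the page, order_by, and direction parameters come from the description above, while the "items" and "total" response keys, the fetch_json callable, and the max_pages guard are assumptions made for illustration. Since filenames are only unique in most cases, the sketch also de-duplicates by id as a safety net.

```python
from urllib.parse import urlencode


def files_url(base_url, knowledge_id, page):
    """Build the listing URL with an explicit sort order on the file name."""
    query = urlencode({"page": page, "order_by": "name", "direction": "asc"})
    return f"{base_url}/api/v1/knowledge/{knowledge_id}/files?{query}"


def list_all_files(fetch_json, base_url, knowledge_id, max_pages=1000):
    """Collect every file once, de-duplicating by id.

    fetch_json is a caller-supplied (hypothetical) function that takes a URL and
    returns the decoded JSON body; the "items" and "total" keys are assumed.
    """
    seen = {}
    for page in range(1, max_pages + 1):
        body = fetch_json(files_url(base_url, knowledge_id, page))
        items = body.get("items", [])
        if not items:
            break  # ran past the last page
        for item in items:
            seen.setdefault(item["id"], item)  # keep the first copy of each id
        if len(seen) >= body.get("total", float("inf")):
            break  # every reported file has been collected
    return list(seen.values())
```

A caller would pass in whatever HTTP client it already uses as fetch_json, including any authentication the deployment requires.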
Implementation Example (Java with JAX-RS):
Limitations:
Additional Notes
This bug was introduced when pagination was added to the knowledge base file listing endpoint. Before pagination, the entire result set was returned in one query, so the non-deterministic ordering wasn't visible.
Commit that caused the issue:
94a8439105
@owui-terminator[bot] commented on GitHub (Jan 21, 2026):
🔍 Similar Issues Found
I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:
#20641 issue: Web Search and Builtin Tools permissions break search
by HenkieTenkie62 • Jan 13, 2026 • bug
#19264 issue: Uploaded file hash remains in database even when OCR fails, causing false duplicate detection
by flefevre • Nov 18, 2025 • bug
#20552 issue: Retrieval: list index out of range
by outis151 • Jan 10, 2026 • bug
#20595 issue: "search_web" tool executed even when "Web Search" control disabled
by SlavikCA • Jan 11, 2026 • bug
#19429 issue: user list wrong count and less than 30 items per page
by destination-one • Nov 24, 2025 • bug
💡 Tips:
This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.
@tjbck commented on GitHub (Jan 21, 2026):
Addressed in dev!