Avoid loading the full Chat row (including the potentially large `chat`
JSON column) just to read tag IDs from `meta.tags`. Issue a narrow
SELECT on `Chat.meta` instead, which is much cheaper for chats with
large message histories.
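The difference can be sketched as follows. This is a minimal illustration assuming SQLAlchemy 2.0-style ORM models; the `Chat` model and column names here are simplified stand-ins, not the real schema.

```python
from sqlalchemy import JSON, Column, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Chat(Base):
    __tablename__ = "chat"
    id = Column(String, primary_key=True)
    chat = Column(JSON)  # potentially large message history
    meta = Column(JSON)  # small: tags, pinned flag, etc.

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Chat(id="c1",
                     chat={"messages": ["..."] * 1000},
                     meta={"tags": ["work", "draft"]}))
    session.commit()

    # Before: session.get(Chat, "c1") loads the whole row, including
    # the large `chat` column, just to read meta["tags"].
    # After: a narrow SELECT fetches only the small `meta` column.
    meta = session.execute(
        select(Chat.meta).where(Chat.id == "c1")
    ).scalar_one()
    tags = meta.get("tags", [])
```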
Co-authored-by: Claude <noreply@anthropic.com>
Convert chat_message_file_ids from list to set so the membership test
in the comprehension is O(1) instead of O(m), turning the dedupe from
O(n*m) into O(n+m). Also replace the redundant set([...]) with a set
comprehension.
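The shape of the change, sketched with plain string IDs (the real element type may differ):

```python
def dedupe_file_ids(candidate_ids: list[str],
                    chat_message_file_ids: list[str]) -> list[str]:
    # A set gives O(1) membership tests, so the dedupe is O(n + m)
    # instead of O(n * m); the set comprehension also replaces the
    # redundant set([...]) construction.
    seen = {fid for fid in chat_message_file_ids}
    return [fid for fid in candidate_ids if fid not in seen]

deduped = dedupe_file_ids(["a", "b", "c"], ["b"])
```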
https://claude.ai/code/session_01Le3NnqNhcgaJvFrDZGqmwe
Co-authored-by: Claude <noreply@anthropic.com>
Pass the request-scoped AsyncSession into Models.get_model_by_id so the
endpoint no longer opens a fresh DB session on every call, avoiding an
extra connection acquisition per profile image request.
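The session-reuse pattern, as a self-contained sketch. The session type and `get_model_by_id` body here are stand-ins (a dict plays the role of the AsyncSession); only the pass-through shape matters.

```python
import asyncio
from contextlib import asynccontextmanager

SESSIONS_OPENED = 0

@asynccontextmanager
async def new_session():
    # Stand-in for acquiring a fresh AsyncSession/connection from the pool.
    global SESSIONS_OPENED
    SESSIONS_OPENED += 1
    yield {"m1": {"id": "m1", "profile_image_url": "/static/m1.png"}}

async def get_model_by_id(model_id, db=None):
    # Reuse the caller's request-scoped session when one is passed in;
    # only open a new session for callers that have none.
    if db is not None:
        return db.get(model_id)
    async with new_session() as db:
        return db.get(model_id)

async def handle_profile_image_request():
    async with new_session() as request_db:  # the request-scoped session
        return await get_model_by_id("m1", db=request_db)

model = asyncio.run(handle_profile_image_request())
```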
Co-authored-by: Claude <noreply@anthropic.com>
* perf(users): drop redundant get_user_by_id refetch in session-user endpoints
Five /user/* handlers refetched the user row via Users.get_user_by_id(user.id)
immediately after receiving an identical UserModel from Depends(get_verified_user).
Since get_verified_user already populated the user within the same request
microseconds earlier, the refetch is pure overhead. The dead else branches
(unreachable — get_verified_user raises 401 on missing user) are removed as
a natural consequence.
Affected endpoints:
- GET /user/settings
- GET /user/status
- POST /user/status/update
- GET /user/info
- POST /user/info/update
Eliminates one SELECT per request to each of these endpoints with no behavioral
change.
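The before/after shape, reduced to a runnable sketch (the SELECT counter stands in for the database round trip; handler and dependency names are simplified):

```python
SELECTS = 0

class UserModel:
    def __init__(self, uid, settings):
        self.id = uid
        self.settings = settings

def get_user_by_id(uid):
    global SELECTS
    SELECTS += 1  # stands in for one SELECT on the user table
    return UserModel(uid, {"theme": "dark"})

def get_verified_user():
    # Dependency: resolves the user once per request and raises 401 if
    # missing, so handlers never receive None.
    return get_user_by_id("u1")

def user_settings_before(user):
    user = get_user_by_id(user.id)  # redundant refetch of the identical row
    return user.settings

def user_settings_after(user):
    return user.settings  # trust the dependency-injected UserModel

user = get_verified_user()
settings = user_settings_after(user)
```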
* fix(users): preserve USER_NOT_FOUND error on status update failure
update_user_status_by_id returns None when the target user is missing or
the update raises. The previous commit removed the pre-update existence
gate (get_user_by_id) and returned the update result directly, which
turned not-found/failure cases into 200 OK with a null body instead of
the expected 400 USER_NOT_FOUND.
Guard the update result explicitly to preserve the original API contract,
matching the equivalent pattern already applied in /user/info/update.
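The guard, in miniature. `LookupError` stands in for `HTTPException(400, "USER_NOT_FOUND")` and the dict for the user table; the point is that a `None` update result must become an error, not a 200:

```python
def update_user_status_by_id(uid, status, db):
    # Returns None when the target user is missing (or, in the real
    # code, when the UPDATE raises).
    row = db.get(uid)
    if row is None:
        return None
    row["status"] = status
    return row

def update_status_endpoint(uid, status, db):
    updated = update_user_status_by_id(uid, status, db)
    if updated is None:
        # Preserve the original contract: 400 USER_NOT_FOUND,
        # not 200 OK with a null body.
        raise LookupError("USER_NOT_FOUND")
    return updated

db = {"u1": {"status": "away"}}
result = update_status_endpoint("u1", "online", db)
try:
    update_status_endpoint("ghost", "online", db)
    missing_rejected = False
except LookupError:
    missing_rejected = True
```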
* docs(users): note lost-update tradeoff on /user/info/update
Make the concurrency tradeoff explicit: merging against the auth-time
snapshot slightly widens the lost-update window compared to the previous
pre-merge refetch, but the refetch only narrowed (did not eliminate) that
window. Real safety requires row locking or a version column.
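A version-column check could look roughly like this. This in-memory sketch only illustrates the compare-and-bump logic; in a real database the check and increment must happen atomically in one `UPDATE ... WHERE version = :expected` statement, not as two Python steps.

```python
def update_info_versioned(db, uid, new_info, expected_version):
    # Optimistic locking: apply the merge only if the version still
    # matches what the caller read; otherwise signal a conflict so the
    # caller re-reads and retries, instead of silently losing an update.
    row = db.get(uid)
    if row is None or row["version"] != expected_version:
        return None
    row["info"] = {**row["info"], **new_info}
    row["version"] += 1
    return row

db = {"u1": {"info": {"bio": "old"}, "version": 3}}
ok = update_info_versioned(db, "u1", {"bio": "new"}, expected_version=3)
stale = update_info_versioned(db, "u1", {"bio": "lost"}, expected_version=3)
```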
---------
Co-authored-by: Claude <noreply@anthropic.com>
Add configurable reranker batch size (env var RAG_RERANKING_BATCH_SIZE,
default 32) following the same pattern as RAG_EMBEDDING_BATCH_SIZE.
- config.py: PersistentConfig for RAG_RERANKING_BATCH_SIZE
- main.py: import, state init, pass to get_reranking_function
- colbert.py: accept batch_size param in predict() (was hardcoded 32)
- utils.py: get_reranking_function passes batch_size at call time
- retrieval.py: expose in config GET/POST endpoints and ConfigForm
- Documents.svelte: add Reranking Batch Size input in admin settings
Closes #23730
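The batching itself reduces to this pattern (function and variable names here are illustrative; the real code wires the value through PersistentConfig and app state rather than reading the env var at call time):

```python
import os

# Same pattern as RAG_EMBEDDING_BATCH_SIZE: env-configurable, default 32.
RAG_RERANKING_BATCH_SIZE = int(os.environ.get("RAG_RERANKING_BATCH_SIZE", "32"))

def predict(pairs, score_fn, batch_size=RAG_RERANKING_BATCH_SIZE):
    # Score query/document pairs in configurable batches instead of a
    # hardcoded 32, so memory use can be tuned to the reranker hardware.
    scores = []
    for start in range(0, len(pairs), batch_size):
        scores.extend(score_fn(pairs[start:start + batch_size]))
    return scores

scores = predict(list(range(70)), lambda batch: [x * 2 for x in batch])
```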
Loader.load() dispatches to the underlying langchain document loaders
(PyMuPDF, Unstructured, python-docx, Tika, …) which are all
synchronous and CPU/IO-bound. process_file() awaited it directly on
the event loop, so parsing a non-trivial PDF/DOCX would freeze the
entire FastAPI app for the duration of the parse — which is what users
experience as "the server hangs whenever I upload a file."
Add an `aload()` async wrapper on Loader that runs the sync load on a
worker thread via asyncio.to_thread, and update process_file() to
await it. The sync API is preserved so existing callers that already
run inside run_in_threadpool (e.g. save_docs_to_vector_db) are
unaffected.
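The wrapper is small; a minimal sketch (with `time.sleep` standing in for the blocking parse):

```python
import asyncio
import time

class Loader:
    def load(self):
        # Synchronous, CPU/IO-bound parse (PyMuPDF, Unstructured, ...).
        time.sleep(0.05)  # stands in for parsing a large PDF/DOCX
        return [{"page_content": "parsed text"}]

    async def aload(self):
        # Run the sync load on a worker thread so the event loop stays
        # responsive for the duration of the parse.
        return await asyncio.to_thread(self.load)

async def process_file():
    docs = await Loader().aload()  # no longer blocks the event loop
    return docs

docs = asyncio.run(process_file())
```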
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
Co-authored-by: Claude <noreply@anthropic.com>
* fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient
The vector DB backends (Chroma, pgvector, Qdrant, Milvus, Pinecone,
Weaviate, …) are uniformly synchronous and their methods perform
blocking network or disk I/O. Multiple async route handlers and helpers
were calling them directly on the event loop — file processing,
memories, knowledge bases, hybrid search bookkeeping — so a single
upsert/delete/search would freeze every other in-flight request for the
duration of the call.
Introduce `AsyncVectorDBClient`, a thin async facade that wraps the
existing sync client and dispatches each method through
`asyncio.to_thread`. It mirrors `VectorDBBase` exactly and forwards
*args/**kwargs so backend-specific extra parameters keep working.
Update every async-context call site (routers/retrieval, routers/files,
routers/memories, routers/knowledge, retrieval/utils,
tools/builtin) to await `ASYNC_VECTOR_DB_CLIENT` instead of calling the
sync client directly. Two helpers that were sync-only also acquire
async siblings or are awaited via `asyncio.to_thread` at their async
call site (`remove_knowledge_base_metadata_embedding`,
`get_all_items_from_collections`, `query_doc`).
The original sync `VECTOR_DB_CLIENT` is unchanged, so callers that
already run inside `run_in_threadpool` (e.g. `save_docs_to_vector_db`
and the sync `query_doc`/`get_doc` helpers) are unaffected.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
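The facade pattern, reduced to two methods with a toy in-memory backend (the real client exposes the full `VectorDBBase` surface):

```python
import asyncio

class VectorDBClient:
    # Sync backend: in reality these methods do blocking network/disk I/O.
    def __init__(self):
        self.store = {}

    def upsert(self, collection_name, items):
        self.store.setdefault(collection_name, []).extend(items)

    def search(self, collection_name, query, limit=10):
        return self.store.get(collection_name, [])[:limit]

class AsyncVectorDBClient:
    """Thin async facade: each call is dispatched through asyncio.to_thread,
    forwarding *args/**kwargs so backend-specific parameters keep working."""
    def __init__(self, sync_client):
        self.sync = sync_client  # sync client stays available unchanged

    async def upsert(self, *args, **kwargs):
        return await asyncio.to_thread(self.sync.upsert, *args, **kwargs)

    async def search(self, *args, **kwargs):
        return await asyncio.to_thread(self.sync.search, *args, **kwargs)

VECTOR_DB_CLIENT = VectorDBClient()
ASYNC_VECTOR_DB_CLIENT = AsyncVectorDBClient(VECTOR_DB_CLIENT)

async def main():
    await ASYNC_VECTOR_DB_CLIENT.upsert("docs", [{"id": 1}])
    return await ASYNC_VECTOR_DB_CLIENT.search("docs", query="x", limit=5)

results = asyncio.run(main())
```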
* fix(retrieval): restore explicit AsyncVectorDBClient signatures matching VectorDBBase
Per PR review: the original *args/**kwargs forwarding lost type
safety and IDE/static-analysis support. Restore explicit signatures
that mirror VectorDBBase exactly, so:
* Bad kwargs fail at the facade boundary instead of inside the
worker thread (where the resulting TypeError tends to be
swallowed by surrounding `try/except`).
* IDE autocomplete and static analysis work as expected.
* The stated intent ("mirror VectorDBBase exactly") now holds at
the API contract level, not just behaviourally.
While doing this, surface a pre-existing bug in
`delete_entries_from_collection` that the stricter typing flagged:
the call passed `metadata={'hash': hash}` which is not a parameter
on `VectorDBBase.delete` nor any backend. The TypeError raised
inside the sync delete was silently swallowed by `except Exception`
so the endpoint always reported `{'status': False}` for every
request instead of actually deleting matching vectors. Replace with
`filter=...` to do what the endpoint name promises.
The review's other note (no concurrency/backpressure on the shared
default threadpool) is intentionally not addressed here:
asyncio.to_thread on the shared executor is the right primitive for
this use case; per-domain bounded executors would add lifecycle
complexity disproportionate to the problem and the loop is no
longer blocked, which was the actual bug.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
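What the explicit mirror buys, shown on a single `delete` method (signatures simplified; the real base class has more parameters):

```python
import asyncio
from typing import Optional

class VectorDBBase:
    def delete(self, collection_name: str,
               ids: Optional[list[str]] = None,
               filter: Optional[dict] = None) -> None: ...

class SyncClient(VectorDBBase):
    def __init__(self):
        self.calls = []

    def delete(self, collection_name, ids=None, filter=None):
        self.calls.append((collection_name, ids, filter))

class AsyncVectorDBClient:
    def __init__(self, sync_client):
        self.sync = sync_client

    # Explicit signature mirroring VectorDBBase: a bad kwarg such as
    # metadata=... fails immediately at the facade boundary with a
    # TypeError, instead of inside the worker thread where surrounding
    # try/except tends to swallow it.
    async def delete(self, collection_name: str,
                     ids: Optional[list[str]] = None,
                     filter: Optional[dict] = None) -> None:
        return await asyncio.to_thread(
            self.sync.delete, collection_name, ids=ids, filter=filter
        )

async def main():
    client = AsyncVectorDBClient(SyncClient())
    await client.delete("docs", filter={"hash": "abc"})
    try:
        await client.delete("docs", metadata={"hash": "abc"})  # the old bug
    except TypeError:
        return "rejected at facade boundary"

outcome = asyncio.run(main())
```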
* fix(retrieval): parallelize hybrid-search collection prefetch; document async facade contracts
Address PR review findings:
1. Hybrid-search prefetch was sequential
`query_collection_with_hybrid_search` previously awaited
`ASYNC_VECTOR_DB_CLIENT.get(name)` once per collection in a for
loop. Each call already off-loaded to a worker thread, but
awaiting them serially meant total prefetch latency scaled
linearly with the number of collections. Run them concurrently
with `asyncio.gather` so multi-collection queries actually
benefit from the threadpool. Per-collection exception handling
is preserved by wrapping each fetch in a small helper that
logs and returns `(name, None)` on failure, so a single bad
collection cannot poison the whole gather.
2. Document the thread-safety expectation explicitly
The facade now formally states what was always implicit: the
sync `VECTOR_DB_CLIENT` is shared across worker threads, so the
underlying backend driver must be thread-safe. This is not a
new exposure — `save_docs_to_vector_db` already called the sync
client from `run_in_threadpool`. Adding a global lock here
would defeat the responsiveness the facade exists to provide;
backends that cannot tolerate concurrent access should grow
their own internal serialization.
3. Document the API-surface choice and `.sync` escape hatch
The strict `VectorDBBase` mirror was a deliberate choice (the
previous `*args/**kwargs` revision let a `metadata=` typo
silently break an endpoint). Document it, and call out the
`.sync` escape hatch with an example for callers that genuinely
need a backend-specific parameter not on `VectorDBBase`.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
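The concurrent-prefetch shape, with a fake client standing in for `ASYNC_VECTOR_DB_CLIENT` (names simplified):

```python
import asyncio

async def fetch_collection(client, name):
    # Wrap each fetch so one bad collection returns (name, None)
    # instead of poisoning the whole gather.
    try:
        return name, await client.get(name)
    except Exception:
        return name, None

async def prefetch(client, collection_names):
    # Concurrent prefetch: total latency is roughly the max of the
    # per-collection fetches, not their sum.
    results = await asyncio.gather(
        *(fetch_collection(client, n) for n in collection_names)
    )
    return dict(results)

class FakeClient:
    async def get(self, name):
        if name == "bad":
            raise RuntimeError("backend error")
        await asyncio.sleep(0.01)
        return {"name": name}

collections = asyncio.run(prefetch(FakeClient(), ["a", "bad", "b"]))
```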
* fix(retrieval): guard /delete against null file.hash and let HTTPException reach the client
Address PR review finding on the `metadata=` → `filter=` change in
`delete_entries_from_collection`.
The new `filter={'hash': hash}` query was correct for files that
have a hash, but did not handle `file.hash is None` (unprocessed,
failed, or legacy records). The match semantics of a null filter
value are backend-dependent — some ignore the key entirely, some
treat it as "metadata field absent" and match every such row — so
issuing the query risked deleting unrelated entries.
* Reject `hash is None` up front with a 400 explaining the file
has no hash to target.
* Narrow the surrounding `except Exception` so it no longer
swallows `HTTPException`. Without this fix the new 400 (and the
pre-existing 404 for missing files) would be silently re-shaped
into `{'status': False}` and the caller could not distinguish a
bad-request input from a backend error.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
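Both fixes together, in a self-contained sketch (the `HTTPException` class and fake client stand in for FastAPI and the real vector client):

```python
class HTTPException(Exception):
    # Stand-in for fastapi.HTTPException.
    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail

class FakeVectorClient:
    def __init__(self):
        self.last_filter = None

    def delete(self, collection_name, filter=None):
        self.last_filter = filter

def delete_entries_from_collection(file, client):
    try:
        if file is None:
            raise HTTPException(404, "file not found")
        if file.get("hash") is None:
            # Null-filter semantics are backend-dependent: some backends
            # ignore the key, others match every row lacking the field.
            # Refuse up front rather than risk deleting unrelated entries.
            raise HTTPException(400, "file has no hash to target")
        client.delete("docs", filter={"hash": file["hash"]})
        return {"status": True}
    except HTTPException:
        raise  # let 400/404 reach the client instead of {'status': False}
    except Exception:
        return {"status": False}

client = FakeVectorClient()
ok = delete_entries_from_collection({"hash": "h1"}, client)
try:
    delete_entries_from_collection({"hash": None}, client)
    rejected = None
except HTTPException as e:
    rejected = e.status_code
```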
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix(middleware): replace BaseHTTPMiddleware HTTP middlewares with pure ASGI implementations
Starlette's BaseHTTPMiddleware (and the @app.middleware('http')
decorator that uses it) wraps the downstream app in an anyio task
group whose cancel scope tears down the inner task on every exit —
client disconnect, response complete, or any outer middleware bailing.
That CancelledError gets injected into whatever the inner task was
awaiting, so DB queries, embedding calls, and other long awaits get
killed mid-flight. Under aiosqlite the cleanup path then logs a
multi-page `terminate_force_close() not implemented` traceback at
ERROR for every cancelled DB call.
Open WebUI had four such middlewares stacked
(`commit_session_after_request`, `check_url`, `inspect_websocket`,
`RedirectMiddleware`) so a single cancellation would compound through
all four.
Move the four middlewares to a new `open_webui.utils.asgi_middleware`
module as plain ASGI classes (`__call__(scope, receive, send)`):
* `CommitSessionMiddleware` — was `commit_session_after_request`; now
  also rolls back if commit fails before releasing the connection.
* `AuthTokenMiddleware` — was `check_url`; sets request.state token and
  enable_api_keys, and stamps X-Process-Time via a wrapped send.
* `WebsocketUpgradeGuardMiddleware` — was `inspect_websocket`; rejects
  /ws/socket.io HTTP requests that claim transport=websocket without a
  proper Upgrade/Connection header.
* `RedirectMiddleware` — was the BaseHTTPMiddleware subclass; same
  /watch + share-target rewrites.
Pure ASGI does not introduce a cancel scope around the downstream app,
so client disconnects propagate via `receive()` (the way ASGI was
designed) instead of being injected as CancelledError. Middleware
ordering is preserved.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
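The pure-ASGI shape, illustrated on the redirect case (the `/rewritten` target and inner app are hypothetical; the point is that `__call__(scope, receive, send)` wraps the downstream app without any task group or cancel scope):

```python
import asyncio

class RedirectMiddleware:
    """Pure ASGI middleware: no cancel scope around the downstream app,
    so client disconnects propagate via receive() as ASGI intends."""
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope["path"] == "/watch":
            scope = dict(scope)
            scope["path"] = "/rewritten"  # hypothetical rewrite target
        await self.app(scope, receive, send)

async def inner_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": scope["path"].encode()})

async def main():
    app = RedirectMiddleware(inner_app)
    sent = []

    async def send(msg):
        sent.append(msg)

    async def receive():
        return {"type": "http.request", "body": b""}

    await app({"type": "http", "path": "/watch"}, receive, send)
    return sent

messages = asyncio.run(main())
```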
* fix(middleware): CommitSessionMiddleware — rollback on downstream error, never commit failed requests
The first cut put commit() in a finally block, which meant that even
when a downstream handler raised, the middleware would still commit
whatever partial sync writes that handler had made before the
failure. That regressed the previous BaseHTTPMiddleware semantics
where commit only ran on the success path.
Restructure the failure handling:
* Downstream raised → rollback any pending sync work, release the
connection, re-raise so the outer error middleware turns it into
an error response. We never commit a request that did not complete.
* Downstream returned → commit. On commit failure, log loudly,
rollback, and re-raise. ScopedSession.remove() always runs in
finally so the connection cannot leak.
Document the inherent pure-ASGI limitation explicitly: by the time
`await self.app(...)` returns the response messages have already
been emitted, so a commit failure can no longer change what the
client sees on the wire. Buffering the response to gate it on commit
success would break streaming responses (chat completions, SSE) which
are core to Open WebUI; the trade-off is intentional. Routes that
need commit-before-send must manage the sync session explicitly.
Also drop unused `typing` imports flagged by review.
https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8
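The restructured control flow, with a fake session standing in for the SQLAlchemy scoped session:

```python
import asyncio

class FakeSession:
    def __init__(self):
        self.committed = False
        self.rolled_back = False
        self.removed = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def remove(self):
        self.removed = True

class CommitSessionMiddleware:
    def __init__(self, app, session):
        self.app = app
        self.session = session

    async def __call__(self, scope, receive, send):
        try:
            await self.app(scope, receive, send)
        except Exception:
            # Downstream raised: never commit a request that did not
            # complete; re-raise for the outer error middleware.
            self.session.rollback()
            raise
        else:
            try:
                self.session.commit()
            except Exception:
                self.session.rollback()  # log loudly in the real code
                raise
        finally:
            self.session.remove()  # always runs: the connection cannot leak

async def failing_app(scope, receive, send):
    raise RuntimeError("handler failed")

async def ok_app(scope, receive, send):
    pass

async def main():
    failed = FakeSession()
    try:
        await CommitSessionMiddleware(failing_app, failed)({"type": "http"}, None, None)
    except RuntimeError:
        pass
    succeeded = FakeSession()
    await CommitSessionMiddleware(ok_app, succeeded)({"type": "http"}, None, None)
    return failed, succeeded

failed, succeeded = asyncio.run(main())
```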
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: drop extra='allow' on FolderForm and FolderUpdateForm
These request models were configured to accept arbitrary extra fields,
which were then merged into the folder row via form_data.model_dump().
In insert_new_folder the server-assigned user_id is placed before the
form spread, so a client-supplied user_id in the request body would
override it and the folder would be persisted against another account.
Strictly typed inputs are the correct shape for these endpoints — the
client has no legitimate reason to send fields beyond the declared
ones, and dropping extra='allow' closes the mass-assignment sink at
the validation layer instead of relying on every callsite to merge
fields in the right order.
* fix: reject unknown fields on FolderForm and FolderUpdateForm
Address review feedback: dropping extra='allow' fell back to Pydantic
v2's default extra='ignore', which only silently drops unknown fields
instead of rejecting them. The intent for these request models is a
strict input contract — fail fast when a client sends anything the
server does not expect — so explicitly set extra='forbid'. This also
makes the hardening visible in the form definition rather than implicit
in the default.
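The contract difference in Pydantic v2, sketched on a simplified `FolderForm` (field set reduced to `name` for illustration):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class FolderForm(BaseModel):
    # extra='forbid' rejects unknown fields outright. The Pydantic v2
    # default, extra='ignore', would only drop them silently, and the
    # old extra='allow' let a client-supplied user_id survive into the
    # model_dump() merge.
    model_config = ConfigDict(extra="forbid")
    name: str

ok = FolderForm(name="Research")
try:
    FolderForm(name="Research", user_id="someone-else")
    forbidden = False
except ValidationError:
    forbidden = True
```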