Commit Graph

3994 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
e695d854f2 refac 2026-04-17 13:10:06 +09:00
Timothy Jaeryang Baek
34d569d564 refac 2026-04-17 13:04:07 +09:00
Timothy Jaeryang Baek
e709d6812f refac 2026-04-17 12:55:56 +09:00
Timothy Jaeryang Baek
3dd8255816 refac 2026-04-17 12:37:44 +09:00
Classic298
32cfb5788a perf(chats): select only meta column in get_chat_tags_by_id_and_user_id (#23798)
Avoid loading the full Chat row (including the potentially large `chat`
JSON column) just to read tag IDs from `meta.tags`. Issue a narrow
SELECT on `Chat.meta` instead, which is much cheaper for chats with
large message histories.

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-17 12:35:58 +09:00
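The narrow-SELECT idea above can be sketched with stdlib sqlite3 in place of the project's SQLAlchemy models; the table/column names (`chat`, `meta`) mirror the commit, while the `get_chat_tags` helper is an illustrative stand-in, not the real function.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE chat (id TEXT PRIMARY KEY, user_id TEXT, chat TEXT, meta TEXT)")
big_history = json.dumps({"messages": ["x" * 100] * 1000})  # large chat JSON column
con.execute(
    "INSERT INTO chat VALUES (?, ?, ?, ?)",
    ("c1", "u1", big_history, json.dumps({"tags": ["work", "ideas"]})),
)

def get_chat_tags(con, chat_id, user_id):
    # Fetch only the small meta column, not the whole row with the
    # potentially huge chat JSON.
    row = con.execute(
        "SELECT meta FROM chat WHERE id = ? AND user_id = ?", (chat_id, user_id)
    ).fetchone()
    return json.loads(row[0]).get("tags", []) if row else []

print(get_chat_tags(con, "c1", "u1"))  # ['work', 'ideas']
```

The saving is proportional to the size of the unread `chat` column, which for long conversations dwarfs the few bytes of tag metadata actually needed.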
Classic298
2dca850cee perf: use set for O(1) lookup in insert_chat_files dedupe (#23800)
Convert chat_message_file_ids from list to set so the membership test
in the comprehension is O(1) instead of O(m), turning the dedupe from
O(n*m) into O(n+m). Also replace the redundant set([...]) with a set
comprehension.

https://claude.ai/code/session_01Le3NnqNhcgaJvFrDZGqmwe

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-17 12:28:34 +09:00
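A minimal sketch of the set-based dedupe described above; the names (`dedupe_new_files`, `uploaded_files`, `chat_message_file_ids`) are illustrative stand-ins for the commit's actual variables.

```python
def dedupe_new_files(uploaded_files, chat_message_file_ids):
    # Build a set once: each membership test becomes O(1) instead of
    # O(m), so the whole dedupe is O(n + m) rather than O(n * m).
    seen = {fid for fid in chat_message_file_ids}  # set comprehension, not set([...])
    return [f for f in uploaded_files if f["id"] not in seen]

new = dedupe_new_files(
    [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    ["b"],
)
print([f["id"] for f in new])  # ['a', 'c']
```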
Timothy Jaeryang Baek
349ea4ea9e refac 2026-04-17 12:25:43 +09:00
Timothy Jaeryang Baek
7e453de4f7 refac 2026-04-17 11:54:19 +09:00
Timothy Jaeryang Baek
f1ef09ddc8 refac 2026-04-17 11:48:17 +09:00
Timothy Jaeryang Baek
3332878321 refac 2026-04-17 11:29:59 +09:00
Classic298
e396af3cc8 perf: reuse request db session in get_model_profile_image (#23796)
Pass the request-scoped AsyncSession into Models.get_model_by_id so the
endpoint no longer opens a fresh DB session on every call, avoiding an
extra connection acquisition per profile image request.

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-17 11:28:19 +09:00
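The session-reuse pattern can be illustrated with a toy stand-in for the request-scoped AsyncSession; `get_model_by_id` here mirrors the commit's helper in shape only, and the session counter exists purely to make the saving visible.

```python
class Session:
    # Toy stand-in for an AsyncSession; counts how many are opened.
    opened = 0
    def __init__(self):
        Session.opened += 1

def get_model_by_id(model_id, db=None):
    # Reuse the caller's request-scoped session when one is passed in;
    # only open a fresh session as a fallback for callers without one.
    db = db if db is not None else Session()
    return {"id": model_id, "db": db}

request_db = Session()                    # one session per request
get_model_by_id("m1", db=request_db)      # no new session
get_model_by_id("m1", db=request_db)      # still no new session
print(Session.opened)  # 1
```

Without the `db` parameter, every profile-image request would pay an extra connection acquisition just to read one model row.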
Timothy Jaeryang Baek
7cc7b367dc refac 2026-04-17 11:27:58 +09:00
Classic298
5eae0a5cdd perf(users): drop redundant get_user_by_id refetch in session-user endpoints (#23794)
* perf(users): drop redundant get_user_by_id refetch in session-user endpoints

Five /user/* handlers refetched the user row via Users.get_user_by_id(user.id)
immediately after receiving an identical UserModel from Depends(get_verified_user).
Since get_verified_user had already loaded the same user moments earlier in the
same request, the refetch is pure overhead. The dead else branches
(unreachable — get_verified_user raises 401 on missing user) are removed as
a natural consequence.

Affected endpoints:
- GET  /user/settings
- GET  /user/status
- POST /user/status/update
- GET  /user/info
- POST /user/info/update

Eliminates one SELECT per request to each of these endpoints with no behavioral
change.

* fix(users): preserve USER_NOT_FOUND error on status update failure

update_user_status_by_id returns None when the target user is missing or
the update raises. The previous commit removed the pre-update existence
gate (get_user_by_id) and returned the update result directly, which
turned not-found/failure cases into 200 OK with a null body instead of
the expected 400 USER_NOT_FOUND.

Guard the update result explicitly to preserve the original API contract,
matching the equivalent pattern already applied in /user/info/update.

* docs(users): note lost-update tradeoff on /user/info/update

Make the concurrency tradeoff explicit: merging against the auth-time
snapshot slightly widens the lost-update window compared to the previous
pre-merge refetch, but the refetch only narrowed (did not eliminate) that
window. Real safety requires row locking or a version column.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-17 11:23:08 +09:00
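The guard from the second bullet can be sketched as follows; the `HTTPException`-style class and the helper names are stand-ins for Open WebUI's real code, showing only the None-check that preserves the 400 contract.

```python
USER_NOT_FOUND = "User not found"

class HTTPException(Exception):
    def __init__(self, status_code, detail):
        self.status_code, self.detail = status_code, detail

def update_user_status_by_id(user_id, status, users):
    user = users.get(user_id)
    if user is None:
        return None               # missing user (or failed update) -> None
    user["status"] = status
    return user

def update_status_endpoint(user_id, status, users):
    updated = update_user_status_by_id(user_id, status, users)
    if updated is None:
        # Without this guard the endpoint would answer 200 with a null
        # body instead of the expected 400 USER_NOT_FOUND.
        raise HTTPException(400, USER_NOT_FOUND)
    return updated

users = {"u1": {"id": "u1", "status": "away"}}
print(update_status_endpoint("u1", "online", users)["status"])  # online
```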
Timothy Jaeryang Baek
43e5905c13 refac 2026-04-17 11:18:48 +09:00
Timothy Jaeryang Baek
128cf41fce refac 2026-04-17 11:12:42 +09:00
Timothy Jaeryang Baek
398718d505 refac 2026-04-17 10:44:29 +09:00
Timothy Jaeryang Baek
2e52ad8ff2 refac: shared chat 2026-04-17 10:16:32 +09:00
Timothy Jaeryang Baek
4d2f189810 feat: add RAG_RERANKING_BATCH_SIZE configuration option
Add configurable reranker batch size (env var RAG_RERANKING_BATCH_SIZE,
default 32) following the same pattern as RAG_EMBEDDING_BATCH_SIZE.

- config.py: PersistentConfig for RAG_RERANKING_BATCH_SIZE
- main.py: import, state init, pass to get_reranking_function
- colbert.py: accept batch_size param in predict() (was hardcoded 32)
- utils.py: get_reranking_function passes batch_size at call time
- retrieval.py: expose in config GET/POST endpoints and ConfigForm
- Documents.svelte: add Reranking Batch Size input in admin settings

Closes #23730
2026-04-17 08:35:45 +09:00
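The env-var-with-default pattern the commit follows can be sketched without the Open WebUI-specific PersistentConfig machinery; only the `RAG_RERANKING_BATCH_SIZE` name and the default of 32 come from the commit, and the batching logic here is a stub in place of real reranker scoring.

```python
import os

RAG_RERANKING_BATCH_SIZE = int(os.environ.get("RAG_RERANKING_BATCH_SIZE", "32"))

def predict(pairs, batch_size=RAG_RERANKING_BATCH_SIZE):
    # Split query/document pairs into configurable batches instead of a
    # hardcoded 32 (the scoring itself is omitted in this sketch).
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]

print(len(predict(list(range(70)))))  # 3 batches with the default of 32
```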
Timothy Jaeryang Baek
70a6a24f14 refac 2026-04-15 10:37:59 -07:00
Timothy Jaeryang Baek
2f9e326dba refac 2026-04-15 10:26:47 -07:00
Timothy Jaeryang Baek
5944eda0ff refac 2026-04-15 10:17:40 -07:00
Timothy Jaeryang Baek
5dae600ce7 chore: format 2026-04-14 17:27:31 -05:00
Timothy Jaeryang Baek
ecd74f220c refac 2026-04-14 17:22:54 -05:00
Timothy Jaeryang Baek
8bd23b9145 refac 2026-04-14 16:47:43 -05:00
Timothy Jaeryang Baek
a4ed16999e refac 2026-04-14 16:08:14 -05:00
Classic298
a3ea7bf043 fix(retrieval): offload Loader.load to a worker thread so file uploads stop blocking the event loop (#23705)
Loader.load() dispatches to the underlying langchain document loaders
(PyMuPDF, Unstructured, python-docx, Tika, …) which are all
synchronous and CPU/IO-bound. process_file() awaited it directly on
the event loop, so parsing a non-trivial PDF/DOCX would freeze the
entire FastAPI app for the duration of the parse — which is what users
experience as "the server hangs whenever I upload a file."

Add an `aload()` async wrapper on Loader that runs the sync load on a
worker thread via asyncio.to_thread, and update process_file() to
await it. The sync API is preserved so existing callers that already
run inside run_in_threadpool (e.g. save_docs_to_vector_db) are
unaffected.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:55:46 -05:00
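The `aload()` wrapper described above boils down to one `asyncio.to_thread` call; `Loader` here is a toy stand-in for the real class, with `time.sleep` standing in for a blocking PDF/DOCX parse.

```python
import asyncio
import time

class Loader:
    def load(self, path):
        time.sleep(0.05)          # stand-in for a blocking parse
        return [f"doc from {path}"]

    async def aload(self, path):
        # Run the sync parse on a worker thread; awaiting this does not
        # block other coroutines on the event loop, and the sync load()
        # API is preserved for callers already inside run_in_threadpool.
        return await asyncio.to_thread(self.load, path)

async def main():
    docs, _ = await asyncio.gather(
        Loader().aload("report.pdf"),
        asyncio.sleep(0),         # other requests keep running meanwhile
    )
    return docs

print(asyncio.run(main()))  # ['doc from report.pdf']
```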
Timothy Jaeryang Baek
4866bec0f2 refac 2026-04-14 10:55:11 -05:00
Classic298
804f9f3153 fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient (#23706)
* fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient

The vector DB backends (Chroma, pgvector, Qdrant, Milvus, Pinecone,
Weaviate, …) are uniformly synchronous and their methods perform
blocking network or disk I/O. Multiple async route handlers and helpers
were calling them directly on the event loop — file processing,
memories, knowledge bases, hybrid search bookkeeping — so a single
upsert/delete/search would freeze every other in-flight request for the
duration of the call.

Introduce `AsyncVectorDBClient`, a thin async facade that wraps the
existing sync client and dispatches each method through
`asyncio.to_thread`. It mirrors `VectorDBBase` exactly and forwards
*args/**kwargs so backend-specific extra parameters keep working.

Update every async-context call site (routers/retrieval, routers/files,
routers/memories, routers/knowledge, retrieval/utils,
tools/builtin) to await `ASYNC_VECTOR_DB_CLIENT` instead of calling the
sync client directly. Two helpers that were sync-only also acquire
async siblings or are awaited via `asyncio.to_thread` at their async
call site (`remove_knowledge_base_metadata_embedding`,
`get_all_items_from_collections`, `query_doc`).

The original sync `VECTOR_DB_CLIENT` is unchanged, so callers that
already run inside `run_in_threadpool` (e.g. `save_docs_to_vector_db`
and the sync `query_doc`/`get_doc` helpers) are unaffected.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): restore explicit AsyncVectorDBClient signatures matching VectorDBBase

Per PR review: the original *args/**kwargs forwarding lost type
safety and IDE/static-analysis support. Restore explicit signatures
that mirror VectorDBBase exactly, so:

  * Bad kwargs fail at the facade boundary instead of inside the
    worker thread (where the resulting TypeError tends to be
    swallowed by surrounding `try/except`).
  * IDE autocomplete and static analysis work as expected.
  * The stated intent ("mirror VectorDBBase exactly") now holds at
    the API contract level, not just behaviourally.

While doing this, surface a pre-existing bug in
`delete_entries_from_collection` that the stricter typing flagged:
the call passed `metadata={'hash': hash}` which is not a parameter
on `VectorDBBase.delete` nor any backend. The TypeError raised
inside the sync delete was silently swallowed by `except Exception`
so the endpoint always reported `{'status': False}` for every
request instead of actually deleting matching vectors. Replace with
`filter=...` to do what the endpoint name promises.

The thorough review's other note (no concurrency/backpressure on
the shared default threadpool) is intentionally not addressed here:
asyncio.to_thread on the shared executor is the right primitive for
this use case; per-domain bounded executors would add lifecycle
complexity disproportionate to the problem and the loop is no
longer blocked, which was the actual bug.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): parallelize hybrid-search collection prefetch; document async facade contracts

Address PR review findings:

1. Hybrid-search prefetch was sequential
   `query_collection_with_hybrid_search` previously awaited
   `ASYNC_VECTOR_DB_CLIENT.get(name)` once per collection in a for
   loop. Each call already off-loaded to a worker thread, but
   awaiting them serially meant total prefetch latency scaled
   linearly with the number of collections. Run them concurrently
   with `asyncio.gather` so multi-collection queries actually
   benefit from the threadpool. Per-collection exception handling
   is preserved by wrapping each fetch in a small helper that
   logs and returns `(name, None)` on failure, so a single bad
   collection cannot poison the whole gather.

2. Document the thread-safety expectation explicitly
   The facade now formally states what was always implicit: the
   sync `VECTOR_DB_CLIENT` is shared across worker threads, so the
   underlying backend driver must be thread-safe. This is not a
   new exposure — `save_docs_to_vector_db` already called the sync
   client from `run_in_threadpool`. Adding a global lock here
   would defeat the responsiveness the facade exists to provide;
   backends that cannot tolerate concurrent access should grow
   their own internal serialization.

3. Document the API-surface choice and `.sync` escape hatch
   The strict `VectorDBBase` mirror was a deliberate choice (the
   previous `*args/**kwargs` revision let a `metadata=` typo
   silently break an endpoint). Document it, and call out the
   `.sync` escape hatch with an example for callers that genuinely
   need a backend-specific parameter not on `VectorDBBase`.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): guard /delete against null file.hash and let HTTPException reach the client

Address PR review finding on the `metadata=` → `filter=` change in
`delete_entries_from_collection`.

The new `filter={'hash': hash}` query was correct for files that
have a hash, but did not handle `file.hash is None` (unprocessed,
failed, or legacy records). The match semantics of a null filter
value are backend-dependent — some ignore the key entirely, some
treat it as "metadata field absent" and match every such row — so
issuing the query risked deleting unrelated entries.

  * Reject `hash is None` up front with a 400 explaining the file
    has no hash to target.

  * Narrow the surrounding `except Exception` so it no longer
    swallows `HTTPException`. Without this fix the new 400 (and the
    pre-existing 404 for missing files) would be silently re-shaped
    into `{'status': False}` and the caller could not distinguish a
    bad-request input from a backend error.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:50:18 -05:00
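Two of the PR's patterns combine into one short sketch: a facade whose explicit signature mirrors the sync API while dispatching through `asyncio.to_thread`, and a `gather`-based prefetch whose per-item helper keeps one bad collection from poisoning the batch. `SyncVectorDB` and its data are toy stand-ins for the real backends.

```python
import asyncio

class SyncVectorDB:
    def __init__(self, data):
        self.data = data
    def get(self, collection_name):          # blocking network/disk I/O
        if collection_name not in self.data:
            raise KeyError(collection_name)
        return self.data[collection_name]

class AsyncVectorDBClient:
    def __init__(self, sync_client):
        self.sync = sync_client              # the .sync escape hatch
    async def get(self, collection_name):
        # Explicit signature mirroring the sync API (no *args/**kwargs),
        # so bad kwargs fail at the facade boundary; the call itself is
        # dispatched to a worker thread to keep the loop unblocked.
        return await asyncio.to_thread(self.sync.get, collection_name)

async def prefetch(client, names):
    async def one(name):
        try:
            return name, await client.get(name)
        except Exception:
            return name, None                # log-and-continue per collection
    # Concurrent, not serial: total latency no longer scales linearly
    # with the number of collections.
    return dict(await asyncio.gather(*(one(n) for n in names)))

client = AsyncVectorDBClient(SyncVectorDB({"kb1": [1, 2], "kb2": [3]}))
print(asyncio.run(prefetch(client, ["kb1", "missing", "kb2"])))
```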
Classic298
ee28032fb9 fix(middleware): replace BaseHTTPMiddleware HTTP middlewares with pure ASGI implementations (#23709)
* fix(middleware): replace BaseHTTPMiddleware HTTP middlewares with pure ASGI implementations

Starlette's BaseHTTPMiddleware (and the @app.middleware('http')
decorator that uses it) wraps the downstream app in an anyio task
group whose cancel scope tears down the inner task on every exit —
client disconnect, response complete, or any outer middleware bailing.
That CancelledError gets injected into whatever the inner task was
awaiting, so DB queries, embedding calls, and other long awaits get
killed mid-flight. Under aiosqlite the cleanup path then logs a
multi-page `terminate_force_close() not implemented` traceback at
ERROR for every cancelled DB call.

Open WebUI had four such middlewares stacked
(`commit_session_after_request`, `check_url`, `inspect_websocket`,
`RedirectMiddleware`) so a single cancellation would compound through
all four.

Move the four middlewares to a new `open_webui.utils.asgi_middleware`
module as plain ASGI classes (`__call__(scope, receive, send)`):

  * `CommitSessionMiddleware`   — was `commit_session_after_request`;
                                  now also rolls back if commit fails
                                  before releasing the connection.
  * `AuthTokenMiddleware`       — was `check_url`; sets request.state
                                  token + enable_api_keys + stamps
                                  X-Process-Time via a wrapped send.
  * `WebsocketUpgradeGuardMiddleware`
                                — was `inspect_websocket`; rejects
                                  /ws/socket.io HTTP requests that
                                  claim transport=websocket without a
                                  proper Upgrade/Connection header.
  * `RedirectMiddleware`        — was the BaseHTTPMiddleware subclass;
                                  same /watch + share-target rewrites.

Pure ASGI does not introduce a cancel scope around the downstream app,
so client disconnects propagate via `receive()` (the way ASGI was
designed) instead of being injected as CancelledError. Middleware
ordering is preserved.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(middleware): CommitSessionMiddleware — rollback on downstream error, never commit failed requests

The first cut put commit() in a finally block, which meant that even
when a downstream handler raised, the middleware would still commit
whatever partial sync writes that handler had made before the
failure. That regressed the previous BaseHTTPMiddleware semantics
where commit only ran on the success path.

Restructure the failure handling:

* Downstream raised → rollback any pending sync work, release the
  connection, re-raise so the outer error middleware turns it into
  an error response. We never commit a request that did not complete.
* Downstream returned → commit. On commit failure, log loudly,
  rollback, and re-raise. ScopedSession.remove() always runs in
  finally so the connection cannot leak.

Document the inherent pure-ASGI limitation explicitly: by the time
`await self.app(...)` returns the response messages have already
been emitted, so a commit failure can no longer change what the
client sees on the wire. Buffering the response to gate it on commit
success would break streaming responses (chat completions, SSE) which
are core to Open WebUI; the trade-off is intentional. Routes that
need commit-before-send must manage the sync session explicitly.

Also drop unused `typing` imports flagged by review.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:47:48 -05:00
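A pure ASGI middleware in the shape described above is just a class with `__call__(scope, receive, send)` and no task group around the downstream app; the header stamping mirrors the X-Process-Time idea from `AuthTokenMiddleware`, while the inner app and driver loop are toys for illustration.

```python
import asyncio
import time

class ProcessTimeMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            return await self.app(scope, receive, send)
        start = time.monotonic()

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                elapsed = f"{time.monotonic() - start:.6f}".encode()
                headers.append((b"x-process-time", elapsed))
                message = {**message, "headers": headers}
            await send(message)

        # No task group / cancel scope here: client disconnects surface
        # via receive() as ASGI intends, instead of being injected into
        # the downstream app as CancelledError.
        await self.app(scope, receive, send_wrapper)

async def app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})

async def main():
    sent = []
    async def send(message):
        sent.append(message)
    await ProcessTimeMiddleware(app)({"type": "http"}, None, send)
    return sent

messages = asyncio.run(main())
print([name for name, _ in messages[0]["headers"]])  # [b'x-process-time']
```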
Timothy Jaeryang Baek
37658fd541 refac 2026-04-14 01:17:39 -05:00
Timothy Jaeryang Baek
cced77b584 refac 2026-04-14 00:07:50 -05:00
Timothy Jaeryang Baek
45e49d33e5 refac 2026-04-13 21:52:19 -05:00
Timothy Jaeryang Baek
cf4218e688 refac 2026-04-13 21:29:03 -05:00
Algorithm5838
33a4d1b412 fix: image url to base64 conversion (#23685) 2026-04-13 19:15:22 -05:00
Timothy Jaeryang Baek
c767bcaa73 refac 2026-04-13 18:20:46 -05:00
Timothy Jaeryang Baek
715cf9797a refac 2026-04-13 16:25:44 -05:00
Classic298
8979987eed fix: drop extra='allow' on FolderForm and FolderUpdateForm (#23648)
* fix: drop extra='allow' on FolderForm and FolderUpdateForm

These request models were configured to accept arbitrary extra fields,
which were then merged into the folder row via form_data.model_dump().
In insert_new_folder the server-assigned user_id is placed before the
form spread, so a client-supplied user_id in the request body would
override it and the folder would be persisted against another account.

Strictly typed inputs are the correct shape for these endpoints — the
client has no legitimate reason to send fields beyond the declared
ones, and dropping extra='allow' closes the mass-assignment sink at
the validation layer instead of relying on every callsite to merge
fields in the right order.

* fix: reject unknown fields on FolderForm and FolderUpdateForm

Address review feedback: dropping extra='allow' fell back to Pydantic
v2's default extra='ignore', which only silently drops unknown fields
instead of rejecting them. The intent for these request models is a
strict input contract — fail fast when a client sends anything the
server does not expect — so explicitly set extra='forbid'. This also
makes the hardening visible in the form definition rather than implicit
in the default.
2026-04-13 16:14:00 -05:00
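The strict-input contract described above can be sketched with Pydantic v2; the commit does not show FolderForm's real fields, so `name` here is an assumed field for illustration only.

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class FolderForm(BaseModel):
    # 'forbid' rejects unknown fields outright. The default 'ignore'
    # would only silently drop them, and 'allow' would let them be
    # merged into the row via model_dump() -- the mass-assignment sink.
    model_config = ConfigDict(extra="forbid")
    name: str

FolderForm(name="inbox")                        # ok
try:
    FolderForm(name="inbox", user_id="victim")  # rejected at validation
except ValidationError as e:
    print(e.errors()[0]["type"])                # extra_forbidden
```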
Timothy Jaeryang Baek
8dba798cce refac 2026-04-13 16:03:36 -05:00
Timothy Jaeryang Baek
611fe0c8a9 refac 2026-04-13 15:14:55 -05:00
Timothy Jaeryang Baek
31406caa79 refac 2026-04-13 15:13:14 -05:00
Timothy Jaeryang Baek
9c64d84ad9 refac 2026-04-13 15:03:22 -05:00
Timothy Jaeryang Baek
40f5b3d135 refac 2026-04-13 14:51:09 -05:00
Timothy Jaeryang Baek
869cf9e848 refac 2026-04-13 14:33:23 -05:00
Timothy Jaeryang Baek
2ddcb30b9a refac 2026-04-13 14:29:27 -05:00
Timothy Jaeryang Baek
96265cf042 refac 2026-04-13 14:19:15 -05:00
Timothy Jaeryang Baek
050c4b97a9 refac 2026-04-13 14:13:03 -05:00
Timothy Jaeryang Baek
d0188f3fe1 refac 2026-04-13 14:08:58 -05:00
Timothy Jaeryang Baek
8936721414 refac 2026-04-13 13:44:44 -05:00
Timothy Jaeryang Baek
d1a0fbe292 refac 2026-04-13 13:36:54 -05:00
Timothy Jaeryang Baek
22cfb3c673 refac 2026-04-13 13:26:13 -05:00