104 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
4866bec0f2 refac 2026-04-14 10:55:11 -05:00
Classic298
804f9f3153 fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient (#23706)
* fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient

The vector DB backends (Chroma, pgvector, Qdrant, Milvus, Pinecone,
Weaviate, …) are uniformly synchronous and their methods perform
blocking network or disk I/O. Multiple async route handlers and helpers
were calling them directly on the event loop — file processing,
memories, knowledge bases, hybrid search bookkeeping — so a single
upsert/delete/search would freeze every other in-flight request for the
duration of the call.

Introduce `AsyncVectorDBClient`, a thin async facade that wraps the
existing sync client and dispatches each method through
`asyncio.to_thread`. It mirrors `VectorDBBase` exactly and forwards
*args/**kwargs so backend-specific extra parameters keep working.
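
A minimal sketch of the facade pattern described above. It assumes a
hypothetical `SyncVectorClient` stand-in with a single `search` method;
the real `VectorDBBase` surface and the actual `AsyncVectorDBClient`
are not shown in this log, so the names below are illustrative only.

```python
import asyncio
import threading
import time


class SyncVectorClient:
    """Hypothetical stand-in for a synchronous vector DB backend."""

    def search(self, collection_name, limit=3):
        time.sleep(0.05)  # simulate blocking network or disk I/O
        return {
            "collection": collection_name,
            "limit": limit,
            "thread": threading.current_thread().name,
        }


class AsyncVectorClient:
    """Thin async facade: each call is dispatched via asyncio.to_thread."""

    def __init__(self, sync_client):
        self.sync = sync_client  # keep the wrapped sync client reachable

    async def search(self, *args, **kwargs):
        # The blocking call runs in a worker thread, not on the event loop.
        return await asyncio.to_thread(self.sync.search, *args, **kwargs)


async def demo():
    client = AsyncVectorClient(SyncVectorClient())
    loop_thread = threading.current_thread().name
    result = await client.search("docs", limit=5)
    return loop_thread, result
```

Running `asyncio.run(demo())` returns the event loop's thread name and
the search result; the result's `thread` field names the worker thread
that executed the blocking call.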

Update every async-context call site (routers/retrieval, routers/files,
routers/memories, routers/knowledge, retrieval/utils,
tools/builtin) to await `ASYNC_VECTOR_DB_CLIENT` instead of calling the
sync client directly. Helpers that were sync-only either acquire
async siblings or are awaited via `asyncio.to_thread` at their async
call site (`remove_knowledge_base_metadata_embedding`,
`get_all_items_from_collections`, `query_doc`).

The original sync `VECTOR_DB_CLIENT` is unchanged, so callers that
already run inside `run_in_threadpool` (e.g. `save_docs_to_vector_db`
and the sync `query_doc`/`get_doc` helpers) are unaffected.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): restore explicit AsyncVectorDBClient signatures matching VectorDBBase

Per PR review: the original *args/**kwargs forwarding lost type
safety and IDE/static-analysis support. Restore explicit signatures
that mirror VectorDBBase exactly, so:

  * Bad kwargs fail at the facade boundary instead of inside the
    worker thread (where the resulting TypeError tends to be
    swallowed by surrounding `try/except`).
  * IDE autocomplete and static analysis work as expected.
  * The stated intent ("mirror VectorDBBase exactly") now holds at
    the API contract level, not just behaviourally.

While doing this, surface a pre-existing bug in
`delete_entries_from_collection` that the stricter typing flagged:
the call passed `metadata={'hash': hash}`, which is not a parameter
on `VectorDBBase.delete` or on any backend. The TypeError raised
inside the sync delete was silently swallowed by `except Exception`,
so the endpoint reported `{'status': False}` for every request
instead of actually deleting matching vectors. Replace with
`filter=...` to do what the endpoint name promises.
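
The failure mode can be reproduced in isolation. This is a sketch
under assumptions: `delete` is a hypothetical function with an
explicit signature standing in for the backend method, and
`delete_entries` only imitates the shape of the endpoint.

```python
def delete(collection_name, ids=None, filter=None):
    """Hypothetical sync delete; note there is no `metadata` parameter."""
    return True


def delete_entries(collection_name, file_hash, use_filter):
    try:
        if use_filter:
            delete(collection_name, filter={"hash": file_hash})
        else:
            # Unknown keyword: Python raises TypeError before delete runs.
            delete(collection_name, metadata={"hash": file_hash})
        return {"status": True}
    except Exception:
        # The broad handler hides the TypeError from the caller.
        return {"status": False}
```

With `use_filter=False` the call never reaches the backend, yet the
caller only ever sees `{'status': False}`.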

The review's other note (no concurrency limit or backpressure on
the shared default threadpool) is intentionally not addressed here.
`asyncio.to_thread` on the shared executor is the right primitive for
this use case. Per-domain bounded executors would add lifecycle
complexity disproportionate to the problem, and the loop is no
longer blocked, which was the actual bug.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): parallelize hybrid-search collection prefetch; document async facade contracts

Address PR review findings:

1. Hybrid-search prefetch was sequential
   `query_collection_with_hybrid_search` previously awaited
   `ASYNC_VECTOR_DB_CLIENT.get(name)` once per collection in a for
   loop. Each call already off-loaded to a worker thread, but
   awaiting them serially meant total prefetch latency scaled
   linearly with the number of collections. Run them concurrently
   with `asyncio.gather` so multi-collection queries actually
   benefit from the threadpool. Per-collection exception handling
   is preserved by wrapping each fetch in a small helper that
   logs and returns `(name, None)` on failure, so a single bad
   collection cannot poison the whole gather.

2. Document the thread-safety expectation explicitly
   The facade now formally states what was always implicit: the
   sync `VECTOR_DB_CLIENT` is shared across worker threads, so the
   underlying backend driver must be thread-safe. This is not a
   new exposure — `save_docs_to_vector_db` already called the sync
   client from `run_in_threadpool`. Adding a global lock here
   would defeat the responsiveness the facade exists to provide;
   backends that cannot tolerate concurrent access should grow
   their own internal serialization.

3. Document the API-surface choice and `.sync` escape hatch
   The strict `VectorDBBase` mirror was a deliberate choice (the
   previous `*args/**kwargs` revision let a `metadata=` typo
   silently break an endpoint). Document it, and call out the
   `.sync` escape hatch with an example for callers that genuinely
   need a backend-specific parameter not on `VectorDBBase`.
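
The concurrent prefetch from item 1 can be sketched as follows. The
names `sync_get`, `fetch_collection`, and `prefetch` are hypothetical
and chosen for illustration; the actual helper in
`query_collection_with_hybrid_search` is not shown in this log.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


def sync_get(name):
    """Hypothetical blocking fetch; fails for one name to show isolation."""
    if name == "broken":
        raise RuntimeError("backend error")
    return {"name": name, "documents": []}


async def fetch_collection(name):
    # Per-collection error handling: log and return (name, None) on failure
    # so one bad collection does not fail the whole gather.
    try:
        return name, await asyncio.to_thread(sync_get, name)
    except Exception as e:
        log.warning("Failed to fetch collection %s: %s", name, e)
        return name, None


async def prefetch(names):
    # All fetches run concurrently; gather preserves the input order.
    pairs = await asyncio.gather(*(fetch_collection(n) for n in names))
    return dict(pairs)
```

Because each helper catches its own exception, `asyncio.gather` never
sees a failure and `return_exceptions` is not needed.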

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): guard /delete against null file.hash and let HTTPException reach the client

Address PR review finding on the `metadata=` → `filter=` change in
`delete_entries_from_collection`.

The new `filter={'hash': hash}` query was correct for files that
have a hash, but did not handle `file.hash is None` (unprocessed,
failed, or legacy records). The match semantics of a null filter
value are backend-dependent — some ignore the key entirely, some
treat it as "metadata field absent" and match every such row — so
issuing the query risked deleting unrelated entries.

  * Reject `hash is None` up front with a 400 explaining the file
    has no hash to target.

  * Narrow the surrounding `except Exception` so it no longer
    swallows `HTTPException`. Without this fix the new 400 (and the
    pre-existing 404 for missing files) would be silently re-shaped
    into `{'status': False}` and the caller could not distinguish a
    bad-request input from a backend error.
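
A minimal sketch of the two changes above. To stay runnable without
FastAPI it defines a local `HTTPException` stand-in; `delete_entries`,
the `file` dict, and `delete_fn` are hypothetical and only imitate the
endpoint's control flow.

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException so the sketch has no dependencies."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def delete_entries(file, delete_fn):
    try:
        if file is None:
            raise HTTPException(404, "File not found")
        if file.get("hash") is None:
            # Reject up front: a null filter value has backend-dependent
            # match semantics and could delete unrelated entries.
            raise HTTPException(400, "File has no hash to target")
        delete_fn(filter={"hash": file["hash"]})
        return {"status": True}
    except HTTPException:
        raise  # let the 400/404 reach the client unchanged
    except Exception:
        return {"status": False}  # backend errors keep the old shape
```

The `except HTTPException: raise` clause must come before the broad
handler, since Python tries `except` clauses in order.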

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:50:18 -05:00
Timothy Jaeryang Baek
8172c7e3d5 refac 2026-04-12 19:08:30 -05:00
Timothy Jaeryang Baek
25898116ea chore: format 2026-04-12 18:12:59 -05:00
Timothy Jaeryang Baek
de27a12151 refac 2026-04-12 14:39:23 -05:00
Timothy Jaeryang Baek
27169124f2 refac: async db 2026-04-12 14:22:11 -05:00
Timothy Jaeryang Baek
6acaaea59a refac 2026-04-11 15:23:37 -06:00
Timothy Jaeryang Baek
8c2afb8157 refac 2026-04-02 17:58:11 -05:00
Timothy Jaeryang Baek
ade617efa8 refac 2026-03-24 04:49:48 -05:00
Timothy Jaeryang Baek
2e165926de refac 2026-03-22 06:40:39 -05:00
Timothy Jaeryang Baek
de3317e26b refac 2026-03-17 17:58:01 -05:00
Timothy Jaeryang Baek
f9756de693 refac 2026-03-15 17:35:06 -05:00
Timothy Jaeryang Baek
0bfacca0a0 refac 2026-03-08 18:30:16 -05:00
Timothy Jaeryang Baek
352391fa76 chore: format 2026-03-08 18:14:09 -05:00
Timothy Jaeryang Baek
259d5ca596 refac 2026-03-01 13:49:36 -06:00
Timothy Jaeryang Baek
9044abf3bb chore: format 2026-02-23 01:40:53 -06:00
Timothy Jaeryang Baek
f651809001 refac 2026-02-22 17:05:39 -06:00
Timothy Jaeryang Baek
631e30e22d refac 2026-02-21 15:35:34 -06:00
G30
8c713a171d fix(backend): catch 404 http exceptions before generalized exception block in files router (#21687) 2026-02-21 14:48:51 -06:00
Classic298
c5c31ab769 fix: respect BYPASS_ADMIN_ACCESS_CONTROL in file list/search endpoints (#21595) 2026-02-19 16:36:48 -06:00
Timothy Jaeryang Baek
f7406ff576 refac 2026-02-09 13:28:14 -06:00
Timothy Jaeryang Baek
f9ab66f51a refac
Co-Authored-By: Hsienz <55347238+hsienz@users.noreply.github.com>
2026-01-30 00:46:42 +04:00
Timothy Jaeryang Baek
93ed4ae2cd enh: files data controls 2026-01-29 19:50:06 +04:00
Timothy Jaeryang Baek
409f565f09 refac 2026-01-17 21:41:48 +04:00
Classic298
81510e9d8f fix(files): prevent connection pool exhaustion in file status streaming (#20547)
Refactored the file processing status streaming endpoint to avoid holding
a database connection for the entire stream duration (up to 2 hours).
Changes:
- Each status poll now creates its own short-lived database session instead
  of capturing the request's session in the generator closure
- Increased poll interval from 0.5s to 1s, halving database queries with
  negligible UX impact
This prevents a single file status stream from blocking a connection pool
slot for hours, which could contribute to pool exhaustion under load.
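
The per-poll session pattern can be sketched as below. `FakeDB` and
`status_stream` are hypothetical names used for illustration; the real
endpoint and its session factory are not shown in this log.

```python
import asyncio
from contextlib import contextmanager


class FakeDB:
    """Hypothetical stand-in for the database; tracks open sessions."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.open_sessions = 0

    @contextmanager
    def session(self):
        self.open_sessions += 1
        try:
            yield self
        finally:
            self.open_sessions -= 1

    def next_status(self):
        # Return successive statuses, repeating the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


async def status_stream(db, poll_interval=1.0):
    while True:
        # The session lives only for this poll, not for the whole stream.
        with db.session() as session:
            status = session.next_status()
        yield status, db.open_sessions
        if status == "completed":
            break
        await asyncio.sleep(poll_interval)


async def collect(db):
    return [item async for item in status_stream(db, poll_interval=0.01)]
```

Each yielded pair carries the number of sessions open at yield time,
which stays at zero because the session is closed before the generator
suspends.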
2026-01-10 15:23:48 +04:00
Timothy Jaeryang Baek
b377e5ff4c chore: format 2026-01-09 02:46:04 +04:00
Timothy Jaeryang Baek
a9a979fb3d refac: files search
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2026-01-08 03:08:11 +04:00
Timothy Jaeryang Baek
b1d0f00d8c refac/enh: db session sharing 2025-12-29 00:21:18 +04:00
Timothy Jaeryang Baek
4ab917c74b fix/refac: stt default content type 2025-12-22 09:45:55 +04:00
Timothy Jaeryang Baek
45e3237756 fix/refac: shared chat files behaviour 2025-12-21 23:29:54 +04:00
Classic298
823b9a6dd9 chore/perf: Remove old SRC level log env vars with no impact (#20045)
* Update openai.py

* Update env.py

* Merge pull request open-webui#19030 from open-webui/dev (#119)

---------

Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-20 08:16:14 -05:00
Shirasawa
99c820d607 fix: fixed the issue of mismatched spaces in audio MIME types (#17771) 2025-12-10 23:59:10 -05:00
Timothy Jaeryang Baek
ceae3d48e6 enh/refac: kb pagination 2025-12-10 23:19:19 -05:00
Timothy Jaeryang Baek
d1d42128e5 refac/fix: channel files 2025-12-10 15:53:45 -05:00
Timothy Jaeryang Baek
c15201620d refac: kb files 2025-12-10 15:48:27 -05:00
Timothy Jaeryang Baek
22f1b764a7 refac/perf: channel image upload behaviour 2025-12-03 19:06:02 -05:00
Timothy Jaeryang Baek
e301d1962e refac/perf: has_access_to_file optimization 2025-12-02 11:11:17 -05:00
Timothy Jaeryang Baek
d19023288e feat/enh: kb files db migration 2025-12-02 10:53:32 -05:00
Classic298
485896753d feat: Add user header information for TTS/STT requests (#93) (#19323)
Resolves #19312

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-20 16:43:22 -05:00
Timothy Jaeryang Baek
f524a6a8e7 refac/fix: kb image upload handling 2025-10-28 00:34:53 -07:00
Timothy Jaeryang Baek
60f62c2f59 refac 2025-09-17 11:28:04 -05:00
Timothy Jaeryang Baek
c9282135c4 refac 2025-09-07 02:02:21 +04:00
Timothy Jaeryang Baek
c61698efcf enh: process_in_background query param for file upload endpoint 2025-08-25 18:18:52 +04:00
Timothy Jaeryang Baek
37a3de0703 fix 2025-08-22 17:19:57 +04:00
Timothy Jaeryang Baek
4451f86eb0 refac 2025-08-21 01:22:32 +04:00
Timothy Jaeryang Baek
5e1f4fa0ff feat: async file upload 2025-08-20 00:36:13 +04:00
Timothy Jaeryang Baek
575db66295 feat: save temporary chats 2025-08-19 02:37:18 +04:00
expruc
30a079cba8 added handler for deleting files from the vector DB upon file deletion 2025-07-18 18:40:29 +03:00
Timothy Jaeryang Baek
6186bbf337 refac/fix: stt supported type 2025-06-18 14:01:14 +04:00
Timothy Jaeryang Baek
7a1afa9c66 feat: custom stt content type
Co-Authored-By: Bryan Berns <berns@uwalumni.com>
2025-06-16 16:13:40 +04:00