Commit Graph

140 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
5dae600ce7 chore: format 2026-04-14 17:27:31 -05:00
Classic298
804f9f3153 fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient (#23706)
* fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient

The vector DB backends (Chroma, pgvector, Qdrant, Milvus, Pinecone,
Weaviate, …) are uniformly synchronous and their methods perform
blocking network or disk I/O. Multiple async route handlers and helpers
were calling them directly on the event loop — file processing,
memories, knowledge bases, hybrid search bookkeeping — so a single
upsert/delete/search would freeze every other in-flight request for the
duration of the call.

Introduce `AsyncVectorDBClient`, a thin async facade that wraps the
existing sync client and dispatches each method through
`asyncio.to_thread`. It mirrors `VectorDBBase` exactly and forwards
*args/**kwargs so backend-specific extra parameters keep working.

Update every async-context call site (routers/retrieval, routers/files,
routers/memories, routers/knowledge, retrieval/utils,
tools/builtin) to await `ASYNC_VECTOR_DB_CLIENT` instead of calling the
sync client directly. Two helpers that were sync-only also acquire
async siblings or are awaited via `asyncio.to_thread` at their async
call site (`remove_knowledge_base_metadata_embedding`,
`get_all_items_from_collections`, `query_doc`).

The original sync `VECTOR_DB_CLIENT` is unchanged, so callers that
already run inside `run_in_threadpool` (e.g. `save_docs_to_vector_db`
and the sync `query_doc`/`get_doc` helpers) are unaffected.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): restore explicit AsyncVectorDBClient signatures matching VectorDBBase

Per PR review: the original *args/**kwargs forwarding lost type
safety and IDE/static-analysis support. Restore explicit signatures
that mirror VectorDBBase exactly, so:

  * Bad kwargs fail at the facade boundary instead of inside the
    worker thread (where the resulting TypeError tends to be
    swallowed by surrounding `try/except`).
  * IDE autocomplete and static analysis work as expected.
  * The stated intent ("mirror VectorDBBase exactly") now holds at
    the API contract level, not just behaviourally.
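
A minimal sketch of the facade pattern described above, under stated assumptions: `SyncClient` and `AsyncFacade` are hypothetical stand-ins rather than the project's actual classes, and only a single `search` method is mirrored. It shows an explicit signature being forwarded through `asyncio.to_thread`, so that a misspelled keyword fails at the facade call rather than inside the worker thread.

```python
import asyncio
import threading
import time


class SyncClient:
    """Hypothetical stand-in for a blocking vector DB client."""

    def search(self, collection_name: str, limit: int = 10) -> dict:
        time.sleep(0.05)  # simulates blocking network or disk I/O
        return {
            "collection": collection_name,
            "limit": limit,
            "thread": threading.current_thread().name,
        }


class AsyncFacade:
    """Hypothetical async wrapper that runs each call in a worker thread."""

    def __init__(self, sync_client: SyncClient) -> None:
        # Kept public so callers can still reach the wrapped sync client.
        self.sync = sync_client

    async def search(self, collection_name: str, limit: int = 10) -> dict:
        # Explicit parameters: an unknown keyword raises TypeError here,
        # before any work is handed to the thread pool.
        return await asyncio.to_thread(
            self.sync.search, collection_name, limit=limit
        )


async def demo() -> dict:
    facade = AsyncFacade(SyncClient())
    return await facade.search("docs", limit=3)


result = asyncio.run(demo())
```

Because the blocking call runs off the event loop, other coroutines can make progress while `search` waits on I/O.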

While doing this, surface a pre-existing bug in
`delete_entries_from_collection` that the stricter typing flagged:
the call passed `metadata={'hash': hash}` which is not a parameter
on `VectorDBBase.delete` nor any backend. The TypeError raised
inside the sync delete was silently swallowed by `except Exception`
so the endpoint always reported `{'status': False}` for every
request instead of actually deleting matching vectors. Replace with
`filter=...` to do what the endpoint name promises.

The thorough review's other note (no concurrency/backpressure on
the shared default threadpool) is intentionally not addressed here:
asyncio.to_thread on the shared executor is the right primitive for
this use case; per-domain bounded executors would add lifecycle
complexity disproportionate to the problem and the loop is no
longer blocked, which was the actual bug.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): parallelize hybrid-search collection prefetch; document async facade contracts

Address PR review findings:

1. Hybrid-search prefetch was sequential
   `query_collection_with_hybrid_search` previously awaited
   `ASYNC_VECTOR_DB_CLIENT.get(name)` once per collection in a for
   loop. Each call already off-loaded to a worker thread, but
   awaiting them serially meant total prefetch latency scaled
   linearly with the number of collections. Run them concurrently
   with `asyncio.gather` so multi-collection queries actually
   benefit from the threadpool. Per-collection exception handling
   is preserved by wrapping each fetch in a small helper that
   logs and returns `(name, None)` on failure, so a single bad
   collection cannot poison the whole gather.

2. Document the thread-safety expectation explicitly
   The facade now formally states what was always implicit: the
   sync `VECTOR_DB_CLIENT` is shared across worker threads, so the
   underlying backend driver must be thread-safe. This is not a
   new exposure — `save_docs_to_vector_db` already called the sync
   client from `run_in_threadpool`. Adding a global lock here
   would defeat the responsiveness the facade exists to provide;
   backends that cannot tolerate concurrent access should grow
   their own internal serialization.

3. Document the API-surface choice and `.sync` escape hatch
   The strict `VectorDBBase` mirror was a deliberate choice (the
   previous `*args/**kwargs` revision let a `metadata=` typo
   silently break an endpoint). Document it, and call out the
   `.sync` escape hatch with an example for callers that genuinely
   need a backend-specific parameter not on `VectorDBBase`.
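
The concurrent prefetch from item 1 might look roughly like the following sketch. `blocking_get`, `fetch_one`, and `prefetch` are hypothetical names, and the failing `"broken"` collection is invented for illustration; the only point is that each fetch is wrapped so one failure yields `(name, None)` instead of aborting the whole `asyncio.gather`.

```python
import asyncio
import logging
from typing import Optional

log = logging.getLogger(__name__)


def blocking_get(name: str) -> dict:
    """Hypothetical blocking fetch; fails for one collection."""
    if name == "broken":
        raise RuntimeError("backend unavailable")
    return {"name": name, "documents": [f"{name}-doc"]}


async def fetch_one(name: str) -> tuple[str, Optional[dict]]:
    # Each fetch handles its own errors, so gather never sees an exception.
    try:
        return name, await asyncio.to_thread(blocking_get, name)
    except Exception as e:
        log.warning("Failed to fetch collection %s: %s", name, e)
        return name, None


async def prefetch(names: list[str]) -> dict[str, Optional[dict]]:
    # All fetches are started together instead of being awaited one by one.
    pairs = await asyncio.gather(*(fetch_one(n) for n in names))
    return dict(pairs)


results = asyncio.run(prefetch(["a", "broken", "b"]))
```

`asyncio.gather` returns results in the order the awaitables were passed, so the mapping keeps the caller's collection order.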

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): guard /delete against null file.hash and let HTTPException reach the client

Address PR review finding on the `metadata=` → `filter=` change in
`delete_entries_from_collection`.

The new `filter={'hash': hash}` query was correct for files that
have a hash, but did not handle `file.hash is None` (unprocessed,
failed, or legacy records). The match semantics of a null filter
value are backend-dependent — some ignore the key entirely, some
treat it as "metadata field absent" and match every such row — so
issuing the query risked deleting unrelated entries.

  * Reject `hash is None` up front with a 400 explaining the file
    has no hash to target.

  * Narrow the surrounding `except Exception` so it no longer
    swallows `HTTPException`. Without this fix the new 400 (and the
    pre-existing 404 for missing files) would be silently re-shaped
    into `{'status': False}` and the caller could not distinguish a
    bad-request input from a backend error.
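
The two changes above can be sketched as follows. This is an illustration only: `HTTPException` here is a local stand-in for the web framework's exception type, and `delete_entries` and `delete_fn` are hypothetical names, not the endpoint's real code.

```python
from typing import Callable, Optional


class HTTPException(Exception):
    """Local stand-in for the web framework's HTTP error type."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def delete_entries(file_hash: Optional[str], delete_fn: Callable[..., None]) -> dict:
    try:
        if file_hash is None:
            # Reject up front: a null filter value has backend-dependent
            # match semantics and could delete unrelated entries.
            raise HTTPException(400, "File has no hash to target")
        delete_fn(filter={"hash": file_hash})
        return {"status": True}
    except HTTPException:
        raise  # let client-facing errors reach the caller unchanged
    except Exception:
        return {"status": False}  # backend failure
```

With the `except HTTPException: raise` clause in place, a caller can tell a bad request (400) apart from a backend error (`{'status': False}`).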

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:50:18 -05:00
Timothy Jaeryang Baek
8dba798cce refac 2026-04-13 16:03:36 -05:00
Timothy Jaeryang Baek
de3317e26b refac 2026-03-17 17:58:01 -05:00
Timothy Jaeryang Baek
fcf7208352 refac 2026-03-17 17:56:15 -05:00
Ethan T.
a229f9ea42 fix: replace bare except with except Exception (#22473)
Replace bare except clauses with except Exception to follow Python best practices and avoid catching unexpected system exceptions like KeyboardInterrupt and SystemExit.
2026-03-15 17:48:23 -05:00
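
As a small illustration of the commit above (the function names are hypothetical): `except Exception` handles ordinary errors but lets `SystemExit` and `KeyboardInterrupt` propagate, because both derive from `BaseException` rather than `Exception`.

```python
def run_guarded(fn) -> str:
    """Run fn, handling ordinary errors but not interpreter-exit signals."""
    try:
        fn()
    except Exception as e:  # does not catch KeyboardInterrupt or SystemExit
        return f"handled {type(e).__name__}"
    return "ok"


def fail() -> None:
    raise ValueError("bad input")


def quit_now() -> None:
    raise SystemExit(1)
```

Calling `run_guarded(fail)` returns a string, while `run_guarded(quit_now)` lets the `SystemExit` escape; a bare `except:` would have swallowed it.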
Timothy Jaeryang Baek
c6a1469fad refac 2026-03-08 19:05:15 -05:00
Timothy Jaeryang Baek
61366cbcda refac 2026-03-08 18:57:20 -05:00
Timothy Jaeryang Baek
352391fa76 chore: format 2026-03-08 18:14:09 -05:00
Code with love
265d1b2824 Add support for mariadb-vector as backing vector DB (#21931) 2026-03-08 17:13:14 -05:00
Classic298
97a3b1528d Update utils.py (#21105) 2026-02-13 13:37:12 -06:00
Timothy Jaeryang Baek
f376d4f378 chore: format 2026-02-11 16:24:11 -06:00
Varun Chawla
9b1fd86aa7 fix: use keyword argument for IndicesClient.refresh() for opensearch-py 3.x (#21248)
In opensearch-py >= 3.0.0, IndicesClient.refresh() no longer accepts the
index name as a positional argument. This causes a TypeError when
uploading documents to knowledge bases with OpenSearch backend.

Changes positional arguments to keyword arguments (index=...) in all
three refresh() calls in the OpenSearch vector DB client.

Fixes #20649
2026-02-09 16:16:44 -06:00
rohithshenoy
9d642f6354 Add support for connecting to self-hosted Weaviate deployments using connect_to_custom instead of connect_to_local, which better suits cases where HTTP and gRPC are hosted on different ingresses. (#20620)
Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: joaoback <156559121+joaoback@users.noreply.github.com>
Co-authored-by: rohithshenoyg@gmail.com <rohithshenoyg@gmail.com>
2026-01-17 21:48:52 +04:00
Timothy Jaeryang Baek
5990c51ab5 chore: format 2026-01-09 22:27:53 +04:00
Timothy Jaeryang Baek
3c986adeda enh: kb metadata search 2026-01-09 22:21:00 +04:00
Timothy Jaeryang Baek
b1d0f00d8c refac/enh: db session sharing 2025-12-29 00:21:18 +04:00
Dechao Sun
25db8225f8 openWebUI supports openGauss vector store (#20179) 2025-12-26 18:32:05 +04:00
Classic298
823b9a6dd9 chore/perf: Remove old SRC level log env vars with no impact (#20045)
* Update openai.py

* Update env.py

* Merge pull request open-webui#19030 from open-webui/dev (#119)

Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: Claude <noreply@anthropic.com>

---------

Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-20 08:16:14 -05:00
Timothy Jaeryang Baek
9a65ed2260 chore: format 2025-12-02 16:06:57 -05:00
Classic298
b29fdc2a0c Update milvus_multitenancy.py (#19695) 2025-12-02 15:38:06 -05:00
Classic298
12f237ff80 fix: Update milvus.py (#19602)
* Update milvus.py

* Update milvus.py

* Update milvus.py

* Update milvus.py

* Update milvus.py

---------

Co-authored-by: Tim Baek <tim@openwebui.com>
2025-12-02 15:30:31 -05:00
Classic298
0a14196afb Update milvus_multitenancy.py (#19680) 2025-12-02 03:57:14 -05:00
Timothy Jaeryang Baek
48d1e67e79 chore: format 2025-11-23 20:15:52 -05:00
Diwakar
b8728064d8 feat: add support for Weaviate vector database (#14747) 2025-11-20 19:23:46 -05:00
Timothy Jaeryang Baek
a1d09eae95 chore: format 2025-11-19 03:23:33 -05:00
Seth Argyle
720af637e6 fix: Use get_index() instead of list_indexes() in has_collection() to… (#19238)
* fix: Use get_index() instead of list_indexes() in has_collection() to handle pagination

Fixes #19233

  Replace list_indexes() pagination scan with direct get_index() lookup
  in has_collection() method. The previous implementation only checked
  the first ~1,000 indexes due to unhandled pagination, causing RAG
  queries to fail for indexes beyond the first page.

  Benefits:
  - Handles buckets with any number of indexes (no pagination needed)
  - ~8x faster (0.19s vs 1.53s in testing)
  - Proper exception handling for ResourceNotFoundException
  - Scales to millions of indexes

* Update s3vector.py

Unneeded exception handling removed to match original OWUI code
2025-11-19 00:19:10 -05:00
lazariv
6cdb13d5cb feat: pgvector hnsw index type (#19158)
* Adding hnsw index type for pgvector, allowing vector dimensions larger than 2000

* remove some variable assignments

* Make USE_HALFVEC variable configurable

* Simplify USE_HALFVEC handling

* Raise runtime error if the index requires a rebuild

---------

Co-authored-by: Moritz <moritz.mueller2@tu-dresden.de>
2025-11-18 04:14:43 -05:00
Timothy Jaeryang Baek
fcc2bb5a05 refac: oracle23ai 2025-10-14 18:22:48 -05:00
Timothy Jaeryang Baek
e7fa86aa26 chore: format 2025-09-29 00:58:21 -05:00
Tim Jaeryang Baek
2d94b8e905 Merge pull request #17837 from Classic298/milvus-multitenancy
feat: Implement Milvus multitenancy // breaking: set Milvus multitenancy as standard option (just like Qdrant already is)
2025-09-29 00:29:35 -05:00
Timothy Jaeryang Baek
118549caf3 enh/fix: filter content metadata 2025-09-28 20:17:27 -05:00
Classic298
b1e63639cd ADD FAT WARNING - QDRANT 2025-09-28 21:17:07 +02:00
Classic298
0e99c43495 ADD FAT WARNING 2025-09-28 21:16:02 +02:00
Classic298
01d4a8ab7a Update factory.py 2025-09-28 11:06:29 +02:00
Classic298
8dc43f9e3a Create milvus_multitenancy.py 2025-09-28 11:05:15 +02:00
Tim Jaeryang Baek
f8a3ed2d18 Merge pull request #17770 from Classic298/feat-milvus-diskann-support
feat: Add DISKANN index type support for Milvus
2025-09-26 14:23:53 -05:00
google-labs-jules[bot]
123dbf152e feat: Add DISKANN index type support for Milvus
This commit introduces support for the DISKANN index type in the Milvus vector database integration.

Changes include:
- Added `MILVUS_DISKANN_MAX_DEGREE` and `MILVUS_DISKANN_SEARCH_LIST_SIZE` configuration variables.
- Updated the Milvus client to recognize and configure the DISKANN index type during collection creation.
2025-09-26 06:54:06 +00:00
google-labs-jules[bot]
e7ccaf6e78 Fix: Milvus error caused by the limit defaulting to None
The pymilvus library expects -1 for unlimited queries, but the code was passing None, which caused a TypeError. This commit changes the default value of the limit parameter in the query method from None to -1. It also updates the call site in the get method to pass -1 instead of None and updates the type hint and a comment to reflect this change.
2025-09-26 06:39:54 +00:00
Timothy Jaeryang Baek
c2b4976c82 enh: PGVECTOR_CREATE_EXTENSION env var 2025-08-31 23:58:18 +04:00
Timothy Jaeryang Baek
1a15a62b73 chore: format 2025-08-21 04:47:28 +04:00
Tim Jaeryang Baek
7452b87877 Merge pull request #16741 from 0xThresh/s3vector-support
fix: batch S3 vectors in groups of 500 to comply with API limitations
2025-08-20 13:25:42 +04:00
James W.
45d9a720b9 Merge branch 'open-webui:main' into s3vector-support 2025-08-19 22:06:16 -06:00
0xThresh.eth
7fcc545672 fix: batch S3 vectors in groups of 500 to comply with API limitations 2025-08-19 22:05:47 -06:00
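
A minimal sketch of the batching idea in the commit above. The 500-item limit comes from the commit message; `chunked`, `insert_in_batches`, and the `put_batch` callback are hypothetical names standing in for whatever call actually writes the vectors.

```python
from typing import Callable, Iterator, List

BATCH_SIZE = 500  # per-request limit cited in the commit message


def chunked(items: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_in_batches(
    items: List[dict], put_batch: Callable[[List[dict]], None]
) -> int:
    """Send items in groups and return the number of requests made."""
    calls = 0
    for batch in chunked(items):
        put_batch(batch)
        calls += 1
    return calls
```

For 1,201 items this issues three requests of 500, 500, and 201 items.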
Tim Jaeryang Baek
0b59aa940e Merge pull request #16606 from Rain6435/fix/azure-postgresql-pgvector-permissions
fix: resolve Azure PostgreSQL pgvector extension permission issue
2025-08-15 00:59:04 +04:00
Rain6435
a1e62ab422 fix: Formatting 2025-08-14 01:50:57 -04:00
Rain6435
1a42e96a3b fix: resolve Azure PostgreSQL pgvector extension permission issue
Replace direct CREATE EXTENSION commands with conditional checks to avoid
  permission errors on Azure PostgreSQL Flexible Server where only
  azure_pg_admin members can create extensions.

  - Check pg_extension table before attempting to create vector extension
  - Apply same fix to pgcrypto extension for consistency
  - Allows following least privilege principle for database users

  Fixes #12453
2025-08-14 01:45:02 -04:00
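
The conditional check described in the commit above could be sketched like this. It is an assumption-laden illustration: `execute` is a hypothetical callable that runs a SQL statement with parameters and returns the resulting rows, and `ensure_extension` and `ALLOWED_EXTENSIONS` are invented names. Only the `pg_extension` catalog and its `extname` column are standard PostgreSQL.

```python
from typing import Callable, Sequence

# Extension names are interpolated into SQL below, so restrict them to
# a fixed allowlist (the two extensions named in the commit).
ALLOWED_EXTENSIONS = ("vector", "pgcrypto")


def ensure_extension(
    execute: Callable[[str, tuple], Sequence], name: str
) -> bool:
    """Create the extension only if pg_extension does not already list it.

    Returns True when a CREATE EXTENSION statement was issued.
    """
    if name not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unexpected extension: {name}")
    rows = execute("SELECT 1 FROM pg_extension WHERE extname = %s", (name,))
    if rows:
        return False  # already installed; no elevated privilege needed
    execute(f"CREATE EXTENSION IF NOT EXISTS {name}", ())
    return True
```

When an administrator has already installed the extension, a low-privilege database user only ever runs the `SELECT`, which avoids the permission error.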
Timothy Jaeryang Baek
ad98d4300b refac/fix: milvus query logic 2025-08-14 03:18:38 +04:00
Timothy Jaeryang Baek
890691319f fix: s3vector import issue 2025-08-11 16:23:08 +04:00
Timothy Jaeryang Baek
21094ca88b fix: pinecone insert issue 2025-08-11 16:22:58 +04:00