58 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
46d73c9dcd refac 2026-04-21 13:46:39 +09:00
Timothy Jaeryang Baek
a05a769938 refac 2026-04-19 23:42:09 +09:00
Timothy Jaeryang Baek
37eba1c5a6 chore: format 2026-04-19 22:45:54 +09:00
Timothy Jaeryang Baek
98627e42b4 refac 2026-04-19 22:13:47 +09:00
Timothy Jaeryang Baek
e709d6812f refac 2026-04-17 12:55:56 +09:00
Classic298
804f9f3153 fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient (#23706)
* fix(retrieval): offload sync VECTOR_DB_CLIENT calls in async paths via AsyncVectorDBClient

The vector DB backends (Chroma, pgvector, Qdrant, Milvus, Pinecone,
Weaviate, …) are uniformly synchronous and their methods perform
blocking network or disk I/O. Multiple async route handlers and helpers
were calling them directly on the event loop — file processing,
memories, knowledge bases, hybrid search bookkeeping — so a single
upsert/delete/search would freeze every other in-flight request for the
duration of the call.

Introduce `AsyncVectorDBClient`, a thin async facade that wraps the
existing sync client and dispatches each method through
`asyncio.to_thread`. It mirrors `VectorDBBase` exactly and forwards
*args/**kwargs so backend-specific extra parameters keep working.
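
A minimal sketch of the facade idea described above, not the project's actual code. `FakeSyncClient` and `AsyncFacade` are hypothetical names, and only one method is shown; the real facade is said to mirror every `VectorDBBase` method.

```python
import asyncio
import threading


class FakeSyncClient:
    """Hypothetical stand-in for a blocking vector DB client."""

    def search(self, collection_name, vectors, limit=10):
        # Report the executing thread so the caller can see where the call ran.
        return {
            "collection": collection_name,
            "limit": limit,
            "thread": threading.get_ident(),
        }


class AsyncFacade:
    """Illustrative facade: runs the sync method in a worker thread."""

    def __init__(self, sync_client):
        self._sync = sync_client

    async def search(self, *args, **kwargs):
        # asyncio.to_thread hands the blocking call to the default executor,
        # so the event loop stays free while it runs.
        return await asyncio.to_thread(self._sync.search, *args, **kwargs)


async def main():
    client = AsyncFacade(FakeSyncClient())
    result = await client.search("docs", [[0.1, 0.2]], limit=3)
    return result, threading.get_ident()


result, loop_thread = asyncio.run(main())
```

The thread id returned by the worker differs from the event-loop thread's id, which is the whole point: the blocking call no longer runs on the loop.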

Update every async-context call site (routers/retrieval, routers/files,
routers/memories, routers/knowledge, retrieval/utils,
tools/builtin) to await `ASYNC_VECTOR_DB_CLIENT` instead of calling the
sync client directly. Helpers that were sync-only either gain async
siblings or are awaited via `asyncio.to_thread` at their async call
site (`remove_knowledge_base_metadata_embedding`,
`get_all_items_from_collections`, `query_doc`).

The original sync `VECTOR_DB_CLIENT` is unchanged, so callers that
already run inside `run_in_threadpool` (e.g. `save_docs_to_vector_db`
and the sync `query_doc`/`get_doc` helpers) are unaffected.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): restore explicit AsyncVectorDBClient signatures matching VectorDBBase

Per PR review: the original *args/**kwargs forwarding lost type
safety and IDE/static-analysis support. Restore explicit signatures
that mirror VectorDBBase exactly, so:

  * Bad kwargs fail at the facade boundary instead of inside the
    worker thread (where the resulting TypeError tends to be
    swallowed by surrounding `try/except`).
  * IDE autocomplete and static analysis work as expected.
  * The stated intent ("mirror VectorDBBase exactly") now holds at
    the API contract level, not just behaviourally.

While doing this, surface a pre-existing bug in
`delete_entries_from_collection` that the stricter typing flagged:
the call passed `metadata={'hash': hash}`, which is not a parameter
of `VectorDBBase.delete` or of any backend. The TypeError raised
inside the sync delete was silently swallowed by `except Exception`,
so the endpoint reported `{'status': False}` for every request
instead of actually deleting matching vectors. Replace with
`filter=...` to do what the endpoint name promises.
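
The difference between the two facade styles can be sketched as follows. All class and function names here (`SyncClient`, `LooseFacade`, `StrictFacade`, `endpoint`) are hypothetical and only approximate the shape of the real code.

```python
import asyncio


class SyncClient:
    """Hypothetical sync client; note there is no `metadata` parameter."""

    def delete(self, collection_name, ids=None, filter=None):
        return "deleted"


class LooseFacade:
    def __init__(self, sync):
        self._sync = sync

    async def delete(self, *args, **kwargs):
        # Any keyword is accepted here; a bad one only fails in the worker.
        return await asyncio.to_thread(self._sync.delete, *args, **kwargs)


class StrictFacade:
    def __init__(self, sync):
        self._sync = sync

    async def delete(self, collection_name, ids=None, filter=None):
        return await asyncio.to_thread(
            self._sync.delete, collection_name, ids=ids, filter=filter
        )


async def endpoint(facade, **kwargs):
    # A broad `except Exception` hides the TypeError from the caller.
    try:
        await facade.delete("files", **kwargs)
        return {"status": True}
    except Exception:
        return {"status": False}


def fails_at_call_time(facade):
    """True if the bad keyword is rejected before any coroutine exists."""
    try:
        coro = facade.delete("files", metadata={"hash": "abc"})
    except TypeError:
        return True
    coro.close()  # never awaited; close it to avoid a RuntimeWarning
    return False


swallowed = asyncio.run(endpoint(LooseFacade(SyncClient()), metadata={"hash": "abc"}))
fixed = asyncio.run(endpoint(StrictFacade(SyncClient()), filter={"hash": "abc"}))
```

With the loose facade the bad keyword surfaces only inside the worker thread and is turned into `{'status': False}`. With explicit signatures the same call is rejected as soon as it is made, where linters and type checkers can also see it.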

The review's other note (no concurrency limit or backpressure on
the shared default threadpool) is intentionally not addressed here:
`asyncio.to_thread` on the shared executor is the right primitive for
this use case. Per-domain bounded executors would add lifecycle
complexity disproportionate to the problem, and the event loop is no
longer blocked, which was the actual bug.

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): parallelize hybrid-search collection prefetch; document async facade contracts

Address PR review findings:

1. Hybrid-search prefetch was sequential
   `query_collection_with_hybrid_search` previously awaited
   `ASYNC_VECTOR_DB_CLIENT.get(name)` once per collection in a for
   loop. Each call already off-loaded to a worker thread, but
   awaiting them serially meant total prefetch latency scaled
   linearly with the number of collections. Run them concurrently
   with `asyncio.gather` so multi-collection queries actually
   benefit from the threadpool. Per-collection exception handling
   is preserved by wrapping each fetch in a small helper that
   logs and returns `(name, None)` on failure, so a single bad
   collection cannot poison the whole gather.

2. Document the thread-safety expectation explicitly
   The facade now formally states what was always implicit: the
   sync `VECTOR_DB_CLIENT` is shared across worker threads, so the
   underlying backend driver must be thread-safe. This is not a
   new exposure — `save_docs_to_vector_db` already called the sync
   client from `run_in_threadpool`. Adding a global lock here
   would defeat the responsiveness the facade exists to provide;
   backends that cannot tolerate concurrent access should grow
   their own internal serialization.

3. Document the API-surface choice and `.sync` escape hatch
   The strict `VectorDBBase` mirror was a deliberate choice (the
   previous `*args/**kwargs` revision let a `metadata=` typo
   silently break an endpoint). Document it, and call out the
   `.sync` escape hatch with an example for callers that genuinely
   need a backend-specific parameter not on `VectorDBBase`.
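
The concurrent prefetch from item 1 can be sketched roughly like this. `blocking_get`, `fetch_collection`, and `prefetch` are hypothetical names, and the "broken" collection exists only to exercise the failure path.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


def blocking_get(name):
    """Hypothetical blocking fetch standing in for the sync client's get()."""
    if name == "broken":
        raise RuntimeError("backend unavailable")
    return {"name": name, "documents": [f"{name}-doc"]}


async def fetch_collection(name):
    # Catch errors per collection so one failure cannot abort the gather.
    try:
        return name, await asyncio.to_thread(blocking_get, name)
    except Exception as exc:
        log.warning("Failed to fetch collection %s: %s", name, exc)
        return name, None


async def prefetch(names):
    # All fetches are in flight at once; gather keeps the input order.
    pairs = await asyncio.gather(*(fetch_collection(n) for n in names))
    return dict(pairs)


results = asyncio.run(prefetch(["a", "broken", "b"]))
```

Because each helper returns `(name, None)` instead of raising, `asyncio.gather` needs no `return_exceptions=True`, and callers can skip collections whose value is `None`.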

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

* fix(retrieval): guard /delete against null file.hash and let HTTPException reach the client

Address PR review finding on the `metadata=` → `filter=` change in
`delete_entries_from_collection`.

The new `filter={'hash': hash}` query was correct for files that
have a hash, but did not handle `file.hash is None` (unprocessed,
failed, or legacy records). The match semantics of a null filter
value are backend-dependent — some ignore the key entirely, some
treat it as "metadata field absent" and match every such row — so
issuing the query risked deleting unrelated entries.

  * Reject `hash is None` up front with a 400 explaining the file
    has no hash to target.

  * Narrow the surrounding `except Exception` so it no longer
    swallows `HTTPException`. Without this fix the new 400 (and the
    pre-existing 404 for missing files) would be silently re-shaped
    into `{'status': False}` and the caller could not distinguish a
    bad-request input from a backend error.
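
A rough sketch of the two changes above, under stated assumptions: `HTTPException` here is a local stand-in for the web framework's exception type, and `delete_entries` and `delete_fn` are hypothetical names rather than the endpoint's real signature.

```python
class HTTPException(Exception):
    """Local stand-in for the framework's HTTP error type."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def delete_entries(file_hash, delete_fn):
    try:
        if file_hash is None:
            # Refuse to issue a filter whose null semantics vary by backend.
            raise HTTPException(400, "File has no hash to target")
        delete_fn(filter={"hash": file_hash})
        return {"status": True}
    except HTTPException:
        # Let HTTP errors reach the client unchanged.
        raise
    except Exception:
        # Only genuine backend failures are reported as a failed status.
        return {"status": False}


calls = []
ok = delete_entries("abc", lambda **kw: calls.append(kw))

try:
    delete_entries(None, lambda **kw: calls.append(kw))
    rejected_status = None
except HTTPException as exc:
    rejected_status = exc.status_code


def failing_backend(**kw):
    raise RuntimeError("backend down")


backend_error = delete_entries("abc", failing_backend)
```

The `except HTTPException: raise` clause must come before the broad handler; otherwise the 400 would again be reshaped into `{'status': False}`.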

https://claude.ai/code/session_01JSr4NZSskEUQvoJnavVXh8

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-14 10:50:18 -05:00
Timothy Jaeryang Baek
20544d412e chore: format 2026-04-12 22:11:10 -05:00
Timothy Jaeryang Baek
a359262616 refac 2026-04-12 18:48:06 -05:00
Timothy Jaeryang Baek
de27a12151 refac 2026-04-12 14:39:23 -05:00
Timothy Jaeryang Baek
27169124f2 refac: async db 2026-04-12 14:22:11 -05:00
Timothy Jaeryang Baek
674695918e refac 2026-04-11 16:44:12 -06:00
Classic298
e7e006e781 fix: use admin-configured WEB_SEARCH_RESULT_COUNT as default (#23488)
The built-in search_web tool hardcoded count=5 as the default,
ignoring the admin-configured WEB_SEARCH_RESULT_COUNT setting.
When the LLM did not specify a count, the tool always returned 5
results regardless of admin configuration.

Now the tool defaults to the admin-configured value when the LLM
omits the count parameter, while still capping LLM-requested
values at the admin maximum to prevent abuse.
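
The resulting precedence can be sketched as a small helper. `resolve_count` is a hypothetical name; the actual tool presumably inlines this logic.

```python
def resolve_count(requested, admin_count):
    """Pick the number of web search results to return.

    requested:   count supplied by the LLM, or None if omitted
    admin_count: admin-configured WEB_SEARCH_RESULT_COUNT
    """
    if requested is None:
        # No explicit request: fall back to the admin default.
        return admin_count
    # Explicit request: honour it, but never exceed the admin maximum.
    return min(requested, admin_count)
```

For example, with an admin value of 8, an omitted count yields 8, a request for 3 yields 3, and a request for 50 is capped at 8.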

Closes #23485
2026-04-08 13:13:44 -07:00
Timothy Jaeryang Baek
65ee771fd0 refac 2026-04-02 01:40:50 -05:00
Timothy Jaeryang Baek
c8ef5a4f38 chore: format 2026-04-01 04:36:02 -05:00
Timothy Jaeryang Baek
b1e8c7d2aa refac 2026-03-29 21:25:39 -05:00
Timothy Jaeryang Baek
b794d61626 refac 2026-03-29 21:15:20 -05:00
Timothy Jaeryang Baek
4777f4fa32 refac 2026-03-29 19:48:23 -05:00
Timothy Jaeryang Baek
1b1d85fe2e refac 2026-03-29 19:42:10 -05:00
Timothy Jaeryang Baek
bcb71bb520 feat: tasks 2026-03-29 18:01:04 -05:00
Timothy Jaeryang Baek
f7e07f3ca1 chore: format 2026-03-24 06:07:20 -05:00
Timothy Jaeryang Baek
0f0ba7dadd refac 2026-03-23 16:56:50 -05:00
Timothy Jaeryang Baek
5d7766e1b6 refac 2026-03-23 16:46:54 -05:00
Timothy Jaeryang Baek
9a2c60d595 refac 2026-03-21 17:12:33 -05:00
Timothy Jaeryang Baek
de3317e26b refac 2026-03-17 17:58:01 -05:00
Timothy Jaeryang Baek
b171b0216b refac 2026-03-17 17:54:59 -05:00
Steve-Li-1998
7ea1e9cbd0 fix: Prefer model-provided web search result count over admin default (#22577)
* Prefer model-provided web search result count over admin default

Update `search_web` to prioritize the model-provided `count` parameter before falling back to the admin-configured `WEB_SEARCH_RESULT_COUNT`, and finally defaulting to 5.

Changes:
- Set `count` default to `None` instead of `5`.
- Adjust fallback order to: model-provided `count` → admin-configured value → `5`.
- Update comment to reflect the new precedence logic.

This ensures explicit model requests for result count are respected while preserving sensible defaults.

* Enforce maximum web search result count from config

Update `search_web` to cap the model-provided `count` parameter at the admin-configured `WEB_SEARCH_RESULT_COUNT` to prevent excessive result requests.

Changes:
- Set default `count` parameter to `5`.
- Replace fallback logic with enforcement logic that limits `count` to the configured maximum.
- Update comment to reflect that the result count is now capped to prevent abuse.

This ensures web search requests cannot exceed the configured limit while maintaining a sensible default.
2026-03-11 15:34:24 -05:00
Timothy Jaeryang Baek
35bc831077 refac 2026-03-07 18:18:02 -06:00
Timothy Jaeryang Baek
d4faa5a5ea refac 2026-03-07 17:13:19 -06:00
Classic298
b9c0a9c3bf enh: prevent models from always using internal knowledge base search first (#22264)
Some models always search the internal knowledge base first before turning to the web search tool.
2026-03-07 16:16:43 -06:00
Classic298
65fbbf5e35 fix: grant file access for knowledge attached to shared workspace models (#22151) 2026-03-02 18:08:49 -05:00
Timothy Jaeryang Baek
259d5ca596 refac 2026-03-01 13:49:36 -06:00
Timothy Jaeryang Baek
f872a178bc refac 2026-02-19 14:06:24 -06:00
Timothy Jaeryang Baek
05b8768fb9 refac 2026-02-17 00:48:49 -06:00
Classic298
d01b1d4880 enh: apply admin default to builtin web search (#21373) 2026-02-13 13:32:48 -06:00
Timothy Jaeryang Baek
f376d4f378 chore: format 2026-02-11 16:24:11 -06:00
Timothy Jaeryang Baek
c2207887b3 feat: skills backend 2026-02-11 14:00:34 -06:00
Tim Baek
48a0abb40f Merge pull request #21277 from open-webui/acl
refac: acl
2026-02-09 13:34:36 -06:00
Timothy Jaeryang Baek
f7406ff576 refac 2026-02-09 13:28:14 -06:00
Tim Baek
b1737040a7 refac 2026-02-06 22:25:18 +04:00
Classic298
f751c0b46c Update builtin.py (#21115) 2026-02-05 15:14:58 -05:00
Timothy Jaeryang Baek
683438b418 refac 2026-01-27 21:37:20 +04:00
Timothy Jaeryang Baek
1a4bdd2b30 refac 2026-01-22 14:59:15 +04:00
Classic298
1c1f72f05c Update builtin.py (#20705) 2026-01-16 00:15:02 +04:00
EntropyYue
1d343aeae4 enh: Make builtin search web tools asynchronous (#20630)
Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: joaoback <156559121+joaoback@users.noreply.github.com>
2026-01-15 10:46:00 +04:00
Classic298
af584b46f4 feat: code-interpreter native (#20592)
* code-interpreter native

* Update tools.py

* Update builtin.py

* Update builtin.py

* Update tools.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py

* Update builtin.py
2026-01-12 00:18:41 +04:00
Timothy Jaeryang Baek
5990c51ab5 chore: format 2026-01-09 22:27:53 +04:00
Timothy Jaeryang Baek
3c986adeda enh: kb metadata search 2026-01-09 22:21:00 +04:00
Timothy Jaeryang Baek
ffbd6ec7f2 refac 2026-01-09 03:03:25 +04:00
Timothy Jaeryang Baek
700349064d chore: format 2026-01-08 01:55:56 +04:00
Tim Baek
35d385e9cc refac 2026-01-07 10:21:05 -05:00