Commit Graph

128 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
56c5bc1d34 refac 2026-04-20 08:36:24 +09:00
Timothy Jaeryang Baek
2ddcb30b9a refac 2026-04-13 14:29:27 -05:00
Timothy Jaeryang Baek
d0188f3fe1 refac 2026-04-13 14:08:58 -05:00
Timothy Jaeryang Baek
25898116ea chore: format 2026-04-12 18:12:59 -05:00
Classic298
4292358bd5 feat: log provider errors to console for better insights (#23379)
* fix: log provider errors that were silently swallowed

* fix: wrap non-JSON SSE error responses in JSON so middleware handles them
2026-04-12 18:07:20 -05:00
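
The wrapping described in the last bullet can be sketched as follows. This is an illustrative helper, not the actual open-webui code: `wrap_error_as_json` is a hypothetical name, and the envelope shape (`{"error": {"message": ...}}`) is an assumption.

```python
import json


def wrap_error_as_json(body: str) -> dict:
    """Return a provider error body as a dict, wrapping plain text if needed.

    Some providers send SSE error payloads as plain text rather than JSON;
    wrapping them lets downstream middleware parse every error uniformly.
    """
    try:
        parsed = json.loads(body)
        if isinstance(parsed, dict):
            # Upstream already sent a JSON object: pass it through unchanged.
            return parsed
    except json.JSONDecodeError:
        pass
    # Plain-text (or non-object) error: wrap it in a JSON envelope.
    return {"error": {"message": body}}
```

With this in place, a raw string like `"upstream timeout"` and a proper JSON error both reach the middleware as dicts.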
Timothy Jaeryang Baek
c47dd7b771 refac 2026-04-12 17:22:06 -05:00
Classic298
f6b85700ea fix: gate OpenAI catch-all proxy behind ENABLE_OPENAI_API_PASSTHROUGH toggle (#23640)
The catch-all /{path:path} proxy forwards any request to the upstream OpenAI-compatible API with the admin's API key and no access control. This is an intentional proxy but should be opt-in.

Adds ENABLE_OPENAI_API_PASSTHROUGH env var (defaults to False). When disabled, the catch-all returns 403. No other routers (Ollama, responses) have catch-all proxies.
2026-04-12 14:26:12 -05:00
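
A minimal sketch of the opt-in gate, assuming the env var is read as a string and compared case-insensitively (the function name and response shape here are illustrative, not the actual route handler):

```python
import os


def handle_catch_all(path: str) -> tuple[int, str]:
    """Return (status_code, detail) for the catch-all proxy route.

    The route refuses to forward anything unless the operator has
    explicitly opted in via ENABLE_OPENAI_API_PASSTHROUGH (default off).
    """
    enabled = (
        os.environ.get("ENABLE_OPENAI_API_PASSTHROUGH", "false").lower() == "true"
    )
    if not enabled:
        return 403, "OpenAI API passthrough is disabled"
    return 200, f"forwarded to upstream /{path}"
```

Defaulting to disabled means existing deployments lose the catch-all until an admin deliberately re-enables it, which is the safe direction for a route that forwards requests with the admin's API key.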
Timothy Jaeryang Baek
27169124f2 refac: async db 2026-04-12 14:22:11 -05:00
Classic298
a2a9a3a42a fix: prevent path traversal via model name in Azure deployment URLs (#23629)
The model name from user input was interpolated directly into Azure deployment URL paths without validation. A user could send a model name like '../../management/foo' to traverse the URL path and hit unintended Azure endpoints with the admin's API key.

Adds _sanitize_model_for_url that rejects path separators and traversal sequences, and percent-encodes the name. Applied at convert_to_azure_payload (covers chat completions + proxy) and the responses endpoint's direct URL construction.
2026-04-12 12:29:45 -05:00
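
The sanitizer can be sketched like this (a simplified stand-in for the described `_sanitize_model_for_url`; the exact rejection rules in the real patch may differ):

```python
from urllib.parse import quote


def sanitize_model_for_url(model: str) -> str:
    """Validate a model name before interpolating it into a URL path.

    Rejects path separators and traversal sequences outright, then
    percent-encodes everything else so the name cannot alter the URL
    structure (e.g. '../../management/foo' is refused, not encoded).
    """
    if "/" in model or "\\" in model or ".." in model:
        raise ValueError(f"invalid model name: {model!r}")
    return quote(model, safe="")
```

Rejecting rather than silently encoding traversal attempts makes abuse visible in logs instead of producing a confusing 404 from Azure.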
Classic298
e790e7be7a fix: enforce model access control on /responses endpoint (#23481)
The /responses proxy endpoint only required authentication via
get_verified_user but did not check per-model access grants. This
allowed any authenticated user to access any model through this
endpoint, bypassing the access control system.

Extract a shared check_model_access helper into utils/access_control
and replace all inline access control blocks across openai.py and
ollama.py (7 locations) with calls to this helper. This eliminates
code duplication and prevents future policy drift between endpoints.

CWE-862: Missing Authorization
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:N/A:H (6.5 Medium)
2026-04-12 11:06:33 -05:00
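
The shared helper pattern can be sketched as below. This is an assumption-laden stand-in: the real helper in utils/access_control raises an HTTP 403 via the web framework, and the field names (`access_control`, `read`, `user_ids`) are simplified here for illustration.

```python
def check_model_access(user: dict, model: dict) -> None:
    """Raise PermissionError unless `user` may use `model`.

    Centralizing this check means every endpoint (chat completions,
    /responses, embeddings, ...) enforces the same policy, instead of
    each router carrying its own slightly-divergent inline copy.
    """
    if user.get("role") == "admin":
        return  # admins bypass per-model grants
    access = model.get("access_control")
    if access is None:
        return  # no ACL configured: model is public
    if user.get("id") not in access.get("read", {}).get("user_ids", []):
        raise PermissionError("model not found")
```

Each endpoint then reduces its authorization logic to a single call, so a policy change lands everywhere at once.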
Classic298
99f3c554c8 feat: support Azure v1 endpoint format (/openai/v1) (#23484)
Azure offers two URL formats: the legacy deployment-based format
(/openai/deployments/{model}/...) and the newer v1 format
(/openai/v1/...) where the model stays in the payload body and no
api-version query parameter is needed.

Previously, the code always ran convert_to_azure_payload which
rewrites the URL to the deployment format, causing 404 errors for
users with v1-style base URLs. Now, when the base URL contains
'/openai/v1', we skip deployment URL construction and route
directly.

Applied consistently across all three Azure routing paths:
generate_chat_completion, /responses proxy, and generic proxy.
2026-04-08 13:14:46 -07:00
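
The branch between the two formats can be sketched as follows (the helper name is illustrative; the substring check on '/openai/v1' is the detection mechanism the message describes):

```python
def build_azure_url(base_url: str, model: str, api_version: str) -> str:
    """Build the chat-completions URL for either Azure URL format.

    v1-style base URLs ('/openai/v1') skip deployment-path construction
    entirely: the model stays in the payload body and no api-version
    query parameter is appended. Legacy base URLs get the
    deployment-based path and the api-version parameter.
    """
    if "/openai/v1" in base_url:
        return f"{base_url}/chat/completions"
    return (
        f"{base_url}/openai/deployments/{model}"
        f"/chat/completions?api-version={api_version}"
    )
```

Before this fix, a v1 base URL was rewritten into the deployment format anyway, producing URLs Azure does not serve and hence 404s.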
Timothy Jaeryang Baek
350d52f515 chore: format 2026-03-25 16:43:06 -05:00
Timothy Jaeryang Baek
76ece4049e refac 2026-03-24 20:32:23 -05:00
Timothy Jaeryang Baek
f7e07f3ca1 chore: format 2026-03-24 06:07:20 -05:00
Timothy Jaeryang Baek
ade617efa8 refac 2026-03-24 04:49:48 -05:00
Timothy Jaeryang Baek
108a019cb8 refac 2026-03-23 19:58:32 -05:00
Timothy Jaeryang Baek
dfc2dc2c0b refac 2026-03-22 06:29:31 -05:00
Timothy Jaeryang Baek
ee9099cab9 refac 2026-03-22 06:00:41 -05:00
Timothy Jaeryang Baek
93415a48e8 refac 2026-03-21 20:46:25 -05:00
Timothy Jaeryang Baek
adcbba34f8 refac 2026-03-21 20:03:02 -05:00
Timothy Jaeryang Baek
de3317e26b refac 2026-03-17 17:58:01 -05:00
Timothy Jaeryang Baek
c0385f60ba refac 2026-03-17 16:52:14 -05:00
Timothy Jaeryang Baek
631e30e22d refac 2026-02-21 15:35:34 -06:00
Minwoo 'Charlie' Choi
56246324b2 fix: apply AIOHTTP_CLIENT_TIMEOUT to embeddings endpoint (#21558) 2026-02-19 14:13:50 -06:00
Timothy Jaeryang Baek
e9d852545c refac 2026-02-18 14:24:42 -06:00
Timothy Jaeryang Baek
d33ad462aa refac 2026-02-13 17:38:57 -06:00
Timothy Jaeryang Baek
79ecbfc757 refac 2026-02-13 14:59:20 -06:00
Timothy Jaeryang Baek
abc9b63093 refac
Co-Authored-By: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
2026-02-13 14:55:13 -06:00
Timothy Jaeryang Baek
589c4e64c1 refac 2026-02-13 13:56:29 -06:00
Timothy Jaeryang Baek
423d8b1817 refac 2026-02-12 11:04:34 -06:00
Timothy Jaeryang Baek
a40808579f refac 2026-02-12 10:59:41 -06:00
Timothy Jaeryang Baek
ccb71a7322 refac 2026-02-11 18:32:14 -06:00
Classic298
efe5416f83 fix: reduce TTFT by caching model lookups in chat completion (#20886)

Skip expensive get_all_models() calls when models are already cached
in app.state. This significantly reduces Time To First Token (TTFT)
for chat completions and embeddings requests.

Previously, every request called get_all_models() which fetches model
lists from all configured backends. Now we check the cache first and
only call get_all_models() on cache miss.

Affected endpoints:
- openai: generate_chat_completion, embeddings
- ollama: embed, embeddings

Fixes #20069

Co-authored-by: Michael <42099345+mickeytheseal@users.noreply.github.com>
2026-02-11 18:29:10 -06:00
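
A minimal sketch of the cache-first lookup, where `state` stands in for `app.state` and `fetch_all_models` for the expensive `get_all_models()` call that queries every configured backend:

```python
from types import SimpleNamespace


def get_models(state, fetch_all_models):
    """Return cached models; hit the backends only on a cache miss."""
    models = getattr(state, "MODELS", None)
    if not models:
        # Cache miss: pay the full fetch cost once, then store the result.
        models = fetch_all_models()
        state.MODELS = models
    return models
```

The TTFT win comes from the hot path: once the cache is warm, a chat-completion request no longer waits on round-trips to every backend just to resolve one model.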
Timothy Jaeryang Baek
dddac2b0ca refac
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2026-02-11 18:19:01 -06:00
Timothy Jaeryang Baek
0da57149ae refac
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2026-02-11 18:13:30 -06:00
Thomas Rehn
ce3a615442 perf: cache OpenAI config reads to avoid redundant Redis lookups in /api/models (#21306)
Each access to request.app.state.config.<KEY> triggers a synchronous
Redis GET. In get_all_models_responses() and get_merged_models(), the
config keys OPENAI_API_BASE_URLS, OPENAI_API_KEYS, and
OPENAI_API_CONFIGS were read on every loop iteration, resulting in
200-300 Redis round-trips for OPENAI_API_BASE_URLS alone in some cases.

Read each config value once into a local variable at the start of the
function and reuse it throughout.
2026-02-11 17:59:50 -06:00
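
The effect of hoisting the read can be demonstrated with a toy config object, where each attribute access simulates one Redis GET (a stand-in for `request.app.state.config`, not the real class):

```python
class RedisBackedConfig:
    """Toy config: every attribute read counts as one simulated Redis GET."""

    def __init__(self, store):
        self.__dict__["_store"] = store
        self.__dict__["reads"] = 0

    def __getattr__(self, key):
        # Only called for keys not in __dict__, i.e. the config values.
        self.__dict__["reads"] += 1
        return self.__dict__["_store"][key]


def total_models_naive(config, n):
    # Anti-pattern: re-reads the config key on every iteration.
    return sum(len(config.OPENAI_API_BASE_URLS) for _ in range(n))


def total_models_cached(config, n):
    # Fix: one read into a local variable, reused throughout the loop.
    base_urls = config.OPENAI_API_BASE_URLS
    return sum(len(base_urls) for _ in range(n))
```

Both functions return the same total, but the naive version performs one simulated Redis GET per iteration while the cached version performs exactly one.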
Timothy Jaeryang Baek
f376d4f378 chore: format 2026-02-11 16:24:11 -06:00
Tim Baek
48a0abb40f Merge pull request #21277 from open-webui/acl
refac: acl
2026-02-09 13:34:36 -06:00
Timothy Jaeryang Baek
f7406ff576 refac 2026-02-09 13:28:14 -06:00
Tim Baek
e2d09ac361 refac 2026-02-09 09:06:48 +04:00
Classic298
494cf8b3ef fix (#21226) 2026-02-08 03:19:26 +04:00
Tim Baek
2c37daef86 refac 2026-02-06 03:23:37 +04:00
Timothy Jaeryang Baek
ea9c58ea80 feat: experimental responses api support 2026-02-01 19:39:28 -06:00
Classic298
d0c2bfdbff fix(db): release connection before LLM call in OpenAI /chat/completions (#20572)
Remove Depends(get_session) from the /chat/completions endpoint to prevent database connections from being held during the entire duration of LLM calls (30-60+ seconds for streaming responses).

Previously, the database session was acquired at request start and held until the streaming response completed. Under concurrent load, this exhausted the connection pool, causing QueuePool timeout errors for other database operations.

The fix allows Models.get_model_by_id() and has_access() to manage their own short-lived sessions internally, releasing the connection immediately after the quick authorization checks complete - before the slow external LLM API call begins.
2026-01-11 23:34:11 +04:00
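
The connection-lifetime change can be sketched with a toy pool (the real code uses SQLAlchemy sessions; `short_lived_session` and the list-based pool are illustrative stand-ins):

```python
from contextlib import contextmanager


@contextmanager
def short_lived_session(pool):
    """Borrow a connection only for the duration of the `with` block."""
    conn = pool.pop()
    try:
        yield conn
    finally:
        pool.append(conn)  # returned before any slow work continues


def handle_chat_completion(pool, authorize, call_llm):
    # Quick authorization checks use their own short-lived session ...
    with short_lived_session(pool):
        authorize()
    # ... so the connection is back in the pool before the slow
    # (30-60+ second) external LLM call even starts.
    return call_llm()
```

Under concurrent load this is the difference between a pool sized for brief authorization checks and one that must hold a connection per in-flight streaming response.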
Timothy Jaeryang Baek
9223efaff0 fix: native function calling system prompt duplication 2026-01-08 23:08:47 +04:00
Timothy Jaeryang Baek
700349064d chore: format 2026-01-08 01:55:56 +04:00
Timothy Jaeryang Baek
fe3047d53c refac 2025-12-29 02:05:55 +04:00
Timothy Jaeryang Baek
2453b75ff0 refac 2025-12-29 01:31:27 +04:00
Timothy Jaeryang Baek
b35aeb8f46 feat: custom model base model fallback
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2025-12-21 20:22:37 +04:00
Classic298
823b9a6dd9 chore/perf: Remove old SRC level log env vars with no impact (#20045)
* Update openai.py

* Update env.py

* Merge pull request open-webui#19030 from open-webui/dev (#119)

Co-authored-by: Tim Baek <tim@openwebui.com>
Co-authored-by: Claude <noreply@anthropic.com>

2025-12-20 08:16:14 -05:00