Commit Graph

977 Commits

Author SHA1 Message Date
Timothy Jaeryang Baek
d56bb2c383 refac 2026-01-11 00:52:43 +04:00
Classic298
3f133fad56 fix: release database connections immediately after auth instead of holding during LLM calls (#20545)

Authentication was using Depends(get_session) which holds a database connection
for the entire request lifecycle. For chat completions, this meant connections
were held for 30-60 seconds while waiting for LLM responses, despite only needing
the connection for ~50ms of actual database work.

With a default pool of 15 connections, this limited concurrent chat users to ~15
before pool exhaustion and timeout errors:

    sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached,
    connection timed out, timeout 30.00

The fix removes Depends(get_session) from get_current_user. Each database
operation now manages its own short-lived session internally:

    BEFORE: One session held for entire request
    ┌────────────────────────────────────────────────┐
    │ auth │ queries │ LLM wait (30s)         │ save │
    │         CONNECTION HELD ENTIRE TIME            │
    └────────────────────────────────────────────────┘

    AFTER: Short-lived sessions, released immediately
    ┌──────┐ ┌───────┐                 ┌──────┐
    │ auth │ │ query │   LLM (30s)     │ save │
    │ 10ms │ │ 20ms  │  NO CONNECTION  │ 20ms │
    └──────┘ └───────┘                 └──────┘

This is safe because:
- User model has no lazy-loaded relationships (all simple columns)
- Pydantic conversion (UserModel.model_validate) happens while session is open
- Returned object is pure Pydantic with no SQLAlchemy ties

Combined with the telemetry efficiency fix, this resolves connection pool
exhaustion for high-concurrency deployments, particularly on network-attached
databases like AWS Aurora where connection hold time is more impactful.
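
A minimal sketch of the short-lived session pattern described above, using only the standard library so it runs anywhere. The project itself uses SQLAlchemy sessions; `get_db`, the `open_connections` counter, `fake_llm_call`, and the dataclass version of `UserModel` are hypothetical stand-ins for illustration, not the project's code.

```python
import sqlite3
from contextlib import contextmanager
from dataclasses import dataclass

# Hypothetical counter standing in for "connections checked out of the pool".
open_connections = 0


@contextmanager
def get_db():
    """Open a connection, yield it, and always release it on exit."""
    global open_connections
    conn = sqlite3.connect(":memory:")
    open_connections += 1
    try:
        yield conn
    finally:
        conn.close()
        open_connections -= 1


@dataclass(frozen=True)
class UserModel:
    """Plain data object with no tie to the connection that produced it."""
    id: int
    name: str


def get_current_user(user_id: int) -> UserModel:
    # The connection lives only for the duration of this lookup.
    with get_db() as conn:
        conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO user VALUES (?, ?)", (user_id, "alice"))
        row = conn.execute(
            "SELECT id, name FROM user WHERE id = ?", (user_id,)
        ).fetchone()
        # Convert to a detached object while the connection is still open.
        return UserModel(id=row[0], name=row[1])


def fake_llm_call():
    # Report how many connections are held while "waiting" on the model.
    return "response", open_connections


def chat_completion(user_id: int):
    user = get_current_user(user_id)  # connection opened and released here
    reply, held = fake_llm_call()     # nothing is held during the long wait
    return user, reply, held
```

Calling `chat_completion(1)` returns the detached user object together with a held-connection count of zero, which is the property the AFTER diagram illustrates.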
2026-01-10 15:34:36 +04:00
Classic298
7839d043ff fix: use efficient COUNT queries in telemetry metrics to prevent connection pool exhaustion (#20542)

This fixes database connection pool exhaustion issues reported after v0.7.0,
particularly affecting PostgreSQL deployments on high-latency networks (e.g., AWS Aurora).

## The Problem

The telemetry metrics callbacks (running every 10 seconds via OpenTelemetry's
PeriodicExportingMetricReader) were using inefficient queries that loaded entire
database tables into memory just to count records:

    len(Users.get_users()["users"])  # Loads ALL user records to count them

On high-latency network-attached databases like AWS Aurora, this would:
1. Hold database connections for hundreds of milliseconds while transferring data
2. Deserialize all records into Python objects
3. Only then count the list length

Under concurrent load, these long-held connections would stack up and drain the
connection pool, resulting in:

    sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached,
    connection timed out, timeout 30.00

## The Fix

Replace inefficient full-table loads with efficient COUNT(*) queries using
methods that already exist in the codebase:

- `len(Users.get_users()["users"])` → `Users.get_num_users()`
- Similar changes for other telemetry callbacks as needed

COUNT(*) queries use database indexes and return a single integer, completing in
~5-10ms even on Aurora, versus potentially 500ms+ for loading all records.
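
The difference between the two counting strategies can be sketched with the standard-library `sqlite3` module. The table layout and the helper names `count_by_loading` and `count_in_database` are assumptions made for this illustration; the project's own methods are `Users.get_users()` and `Users.get_num_users()`.

```python
import sqlite3

# In-memory database with a hypothetical user table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO user (name) VALUES (?)",
    [(f"user{i}",) for i in range(1000)],
)


def count_by_loading(conn: sqlite3.Connection) -> int:
    # Transfers every row to the client and builds Python objects for each,
    # only to discard them after taking the length.
    rows = conn.execute("SELECT * FROM user").fetchall()
    return len(rows)


def count_in_database(conn: sqlite3.Connection) -> int:
    # The database does the counting and returns a single integer.
    (n,) = conn.execute("SELECT COUNT(*) FROM user").fetchone()
    return n
```

Both functions return the same number; only the amount of data moved over the connection, and therefore the time the connection is held, differs.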

## Why v0.7.1's Session Sharing Disable "Helped"

The v0.7.1 change to disable DATABASE_ENABLE_SESSION_SHARING by default appeared
to fix the issue, but it was masking the root cause. Disabling session sharing
causes connections to be returned to the pool faster (more connection churn),
which reduced the window for pool exhaustion but didn't address the underlying
inefficient queries.

With this fix, session sharing can be safely re-enabled for deployments that
benefit from it (especially PostgreSQL), as telemetry will no longer hold
connections for extended periods.

## Impact

- Telemetry connection usage drops from potentially seconds to ~30ms total per
  collection cycle
- Connection pool pressure from telemetry becomes negligible (~0.3% utilization)
- Enterprise PostgreSQL deployments (Aurora, RDS, etc.) should no longer
  experience pool exhaustion under normal load
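
The ~0.3% figure is consistent with the numbers quoted above if utilization is read as the fraction of each collection cycle during which telemetry occupies a connection. That reading is an assumption; the variable names are illustrative.

```python
collection_interval_s = 10.0  # telemetry callbacks run every 10 seconds
busy_time_s = 0.030           # ~30ms of connection use per collection cycle

# Fraction of each cycle during which telemetry holds a connection.
utilization = busy_time_s / collection_interval_s
print(f"{utilization:.1%}")  # prints 0.3%
```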
2026-01-10 15:33:42 +04:00
Timothy Jaeryang Baek
3c986adeda enh: kb metadata search 2026-01-09 22:21:00 +04:00
Tim Baek
daccf0713e enh: file context model setting 2026-01-09 03:41:43 -05:00
Timothy Jaeryang Baek
1138929f4d feat: headless admin creation 2026-01-09 12:01:36 +04:00
Timothy Jaeryang Baek
b377e5ff4c chore: format 2026-01-09 02:46:04 +04:00
Timothy Jaeryang Baek
9223efaff0 fix: native function calling system prompt duplication 2026-01-08 23:08:47 +04:00
Timothy Jaeryang Baek
9b06fdc8fe refac 2026-01-08 03:37:11 +04:00
Timothy Jaeryang Baek
700349064d chore: format 2026-01-08 01:55:56 +04:00
Timothy Jaeryang Baek
c417fdd94d refac 2026-01-08 01:38:40 +04:00
Timothy Jaeryang Baek
e67891a374 refac 2026-01-08 00:42:29 +04:00
Tim Baek
0654df7bdb refac 2026-01-07 10:25:13 -05:00
Tim Baek
35d385e9cc refac 2026-01-07 10:21:05 -05:00
Tim Baek
ab400e3eae enh: native tool citations
Co-Authored-By: Jannik S. <jannik@streidl.dev>
2026-01-07 10:14:45 -05:00
Tim Baek
961136413f refac 2026-01-07 09:46:07 -05:00
Tim Baek
c8622adcb0 feat: builtin kb tools 2026-01-07 08:58:58 -05:00
Tim Baek
2789f6a24d enh: builtin tools 2026-01-07 07:00:32 -05:00
Tim Baek
60e916d6c0 enh: built-in tools toggle in model editor 2026-01-07 06:22:17 -05:00
Timothy Jaeryang Baek
927a765641 refac 2026-01-06 03:24:08 +04:00
Timothy Jaeryang Baek
5921a19519 refac 2026-01-06 02:19:57 +04:00
Timothy Jaeryang Baek
119fc21257 refac 2026-01-06 01:42:49 +04:00
Classic298
713a65ee31 fix: inject full context knowledge into system message for KV prefix caching (#20317)
* Update middleware.py

* Update middleware.py

* env var

* address

* upd
2026-01-05 23:58:53 +04:00
Timothy Jaeryang Baek
8ef0f7743b refac 2026-01-05 23:13:05 +04:00
Timothy Jaeryang Baek
1d08376860 refac 2026-01-05 18:55:44 +04:00
Timothy Jaeryang Baek
646835d767 feat: builtin native tools 2026-01-05 17:45:39 +04:00
Timothy Jaeryang Baek
5c1d52231a feat: native function calling for built-in tools 2026-01-05 04:45:17 +04:00
Timothy Jaeryang Baek
b55a46ae99 refac 2026-01-05 03:46:46 +04:00
Timothy Jaeryang Baek
bd07ef87ab refac 2026-01-03 18:43:12 +04:00
Timothy Jaeryang Baek
89565c58c6 refac/fix: oauth discovery urls
Co-Authored-By: jamie-dit <80016430+jamie-dit@users.noreply.github.com>
2026-01-01 14:01:18 +04:00
Jan Kessler
6c7f966f2a properly handle async-generator Redis methods in SentinelRedisProxy to fix changed YDocManager's remove_user_from_all_documents (#20145) 2025-12-31 17:15:24 -05:00
Timothy Jaeryang Baek
fdae5644e3 refac 2026-01-01 01:51:37 +04:00
Timothy Jaeryang Baek
bf2b296239 fix: oauth server_metadata_url issue
Co-Authored-By: Shamray Alexander <843002+imsamurai@users.noreply.github.com>
2026-01-01 01:37:38 +04:00
Classic298
b91e8b73ab fix: properly raise exceptions instead of returning them in chat.py (#20276)
Change 'return Exception(...)' to 'raise Exception(...)' in chat_completed() and chat_action() functions. Returning an exception object instead of raising it causes errors to be silently swallowed, breaking error propagation.
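
A small sketch of why the distinction matters. The function bodies, the `payload` shape, and the error message are hypothetical; only the `return` versus `raise` difference mirrors the change described.

```python
def chat_completed_buggy(payload: dict):
    if "model" not in payload:
        # The caller receives an Exception *object* as a normal return value,
        # so no error propagates and no except block is triggered.
        return Exception("Model not found")
    return {"ok": True}


def chat_completed_fixed(payload: dict):
    if "model" not in payload:
        # Raising interrupts control flow and reaches the caller's handlers.
        raise Exception("Model not found")
    return {"ok": True}


def call_safely(func, payload: dict) -> str:
    # Mimics a caller that relies on exceptions to detect failure.
    try:
        func(payload)
        return "no error seen"
    except Exception as exc:
        return f"error: {exc}"
```

With an empty payload, `call_safely(chat_completed_buggy, {})` reports that no error was seen, while `call_safely(chat_completed_fixed, {})` reports the failure.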
2025-12-31 17:38:47 +04:00
Classic298
6d087202ad fix: prevent invalidate_token crash when decode_token returns None (#20277)
Add null check after decode_token() before calling decoded.get(). Invalid/expired tokens now gracefully exit instead of crashing with AttributeError.
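
A sketch of the guard, assuming a decoder that returns `None` for invalid or expired tokens. The `decode_token` stub, the `revoked` set, the `jti` claim, and the boolean return value are hypothetical and stand in for the real token handling.

```python
from typing import Optional

# Hypothetical in-memory revocation list.
revoked = set()


def decode_token(token: str) -> Optional[dict]:
    """Stub decoder: returns claims, or None for invalid/expired tokens."""
    if token == "valid":
        return {"jti": "abc123"}
    return None


def invalidate_token(token: str) -> bool:
    decoded = decode_token(token)
    # Without this check, decoded.get(...) raises AttributeError on None.
    if decoded is None:
        return False
    jti = decoded.get("jti")
    if jti:
        revoked.add(jti)
        return True
    return False
```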
2025-12-31 02:30:45 -05:00
Timothy Jaeryang Baek
fe84afd09a enh: delta annotations support 2025-12-30 20:05:31 +04:00
Timothy Jaeryang Baek
61e25dc2dc refac 2025-12-30 18:28:57 +04:00
Timothy Jaeryang Baek
aaea9a5956 refac/fix: comfyui filter output node type
Co-Authored-By: Paul <239564541+mirrordna-reflection-protocol@users.noreply.github.com>
2025-12-30 12:24:48 +04:00
Timothy Jaeryang Baek
2453b75ff0 refac 2025-12-29 01:31:27 +04:00
Timothy Jaeryang Baek
b1d0f00d8c refac/enh: db session sharing 2025-12-29 00:21:18 +04:00
Timothy Jaeryang Baek
9c2f5148d9 refac 2025-12-26 18:30:50 +04:00
Timothy Jaeryang Baek
4ab917c74b fix/refac: stt default content type 2025-12-22 09:45:55 +04:00
Timothy Jaeryang Baek
446cc0ac60 refac 2025-12-22 00:39:05 +04:00
Timothy Jaeryang Baek
01e88c6ac2 chore: format 2025-12-21 23:34:08 +04:00
Timothy Jaeryang Baek
f1bf4f20c5 feat: chat_file table 2025-12-21 23:17:53 +04:00
Classic298
ef43e81f9a fix: MCP OAuth 2.1 token exchange and multi-node propagation (#20076)
* sequential

* zero default

* fix

* fix: preserve absolute paths in sqlite+sqlcipher URLs

Previously, the connection logic incorrectly stripped the leading slash
from `sqlite+sqlcipher` paths, forcibly converting absolute paths
(e.g., `sqlite+sqlcipher:////app/data.db`) into relative paths
(which became `app/data.db`). This caused database initialization failures
when using absolute paths, such as with Docker volume mounts.
This change removes the slash-stripping logic, ensuring that absolute
path conventions (starting with `/`) are respected while maintaining
support for relative paths (which do not start with `/`).
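
The path handling can be sketched as follows. This is not the project's code: the `PREFIX` constant and both helper names are assumptions, and the sketch follows the SQLAlchemy URL convention in which three slashes precede a relative path and four precede an absolute one.

```python
PREFIX = "sqlite+sqlcipher:///"


def extract_path_stripping(url: str) -> str:
    # Old behaviour as described: leading slashes are removed, so an
    # absolute path silently becomes a relative one.
    return url[len(PREFIX):].lstrip("/")


def extract_path_preserving(url: str) -> str:
    # New behaviour as described: whatever follows the prefix is the path,
    # so a leading "/" (absolute path) is kept.
    if not url.startswith(PREFIX):
        raise ValueError(f"unsupported URL: {url}")
    return url[len(PREFIX):]
```

For `sqlite+sqlcipher:////app/data.db` the stripping variant yields `app/data.db` and the preserving variant yields `/app/data.db`; for `sqlite+sqlcipher:///data.db` the preserving variant yields `data.db`.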

* fix: MCP OAuth 2.1 token exchange and multi-node propagation

Fix two MCP OAuth 2.1 bugs affecting tool server authentication:

1. Token exchange failing with duplicate credentials (#19823)
   - Removed explicit client_id/client_secret passing in handle_callback()
   - Authlib already has credentials configured during add_client(),
     passing them again caused concatenation (e.g., "ID1,ID1") and 401 errors
   - Added token validation to detect missing access_token and provide
     clear error messages instead of cryptic database constraint errors

2. OAuth clients not propagating across multi-node setups (#19901)
   - Updated get_client() and get_client_info() to auto-lazy-load
     OAuth clients from the Redis-synced TOOL_SERVER_CONNECTIONS config
   - Clients are now instantiated on-demand on any node that needs them
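
The on-demand instantiation in item 2 can be sketched with a plain dictionary standing in for the Redis-synced `TOOL_SERVER_CONNECTIONS` config. The `OAuthClientManager` class shown here, its constructor, and the settings shape are hypothetical; only the method names `add_client` and `get_client` come from the description above.

```python
from typing import Optional


class OAuthClientManager:
    """Per-node client registry that falls back to a shared config."""

    def __init__(self, shared_config: dict):
        # Stand-in for the config that every node sees via Redis.
        self._shared_config = shared_config
        # Clients instantiated on this node only.
        self._clients = {}

    def add_client(self, client_id: str, settings: dict) -> dict:
        client = {"id": client_id, **settings}
        self._clients[client_id] = client
        return client

    def get_client(self, client_id: str) -> Optional[dict]:
        client = self._clients.get(client_id)
        if client is None:
            # Not registered locally: try to build it from the shared config.
            settings = self._shared_config.get(client_id)
            if settings is None:
                return None
            client = self.add_client(client_id, settings)
        return client


# Two nodes share one config; the client was only ever registered on node A.
shared = {"mcp:tools": {"url": "https://tools.example/mcp"}}
node_a = OAuthClientManager(shared)
node_b = OAuthClientManager(shared)
node_a.add_client("mcp:tools", shared["mcp:tools"])
```

Here `node_b.get_client("mcp:tools")` succeeds even though `add_client` was never called on node B, because the lookup instantiates the client from the shared config on first use.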

Fixes #19823, #19901

* Update db.py

* Update wrappers.py
2025-12-21 10:51:52 -05:00
Timothy Jaeryang Baek
ae203d8952 refac 2025-12-21 16:15:28 +04:00
Classic298
48ccb1e170 fix: consolidate psql cleanup logic and fix web add with cleanup (#20072)
* sequential

* consolidate logic and fix for web add

* Update WebSearch.svelte

* Update retrieval.py

* Update retrieval.py

* Update WebSearch.svelte
2025-12-21 07:14:29 -05:00
Timothy Jaeryang Baek
0dd2cfe1f2 enh: models endpoint optimization
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>

#20010
2025-12-21 15:43:02 +04:00
Timothy Jaeryang Baek
28b2fcab0c refac 2025-12-21 13:58:49 +04:00