The sharePublic prop in editor components (Knowledge, Tools, Skills,
Prompts, Models) incorrectly included an "|| edit" / "|| write_access"
condition, allowing users with write access to see and use the "Public"
sharing option regardless of their actual public sharing permission.
Additionally, all backend access/update endpoints only verified write
authorization but did not check the corresponding sharing.public_*
permission, allowing direct API calls to bypass frontend restrictions
entirely.
Frontend: removed the edit/write_access bypass from sharePublic in all
five editor components so visibility is gated solely by the user's
sharing.public_* permission or admin role.
Backend: added has_public_read_access_grant checks to the access/update
endpoints in knowledge.py, tools.py, prompts.py, skills.py, models.py,
and notes.py. Public grants are silently stripped when the user lacks
the corresponding permission.
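The guard described above can be sketched roughly as follows. The helper name has_public_read_access_grant comes from the change itself, but the data shapes (a user dict with a role and a sharing permission map, and access_control where None means "public") are illustrative assumptions, not the project's exact models.

```python
def has_public_read_access_grant(user: dict, resource_type: str) -> bool:
    # Admins always retain the public-sharing capability.
    if user.get("role") == "admin":
        return True
    # Otherwise gate on the user's sharing.public_* permission.
    sharing = user.get("permissions", {}).get("sharing", {})
    return bool(sharing.get(f"public_{resource_type}", False))


def strip_public_grants(access_control, user: dict, resource_type: str):
    # Silently strip a public grant (access_control None == "public" in
    # this sketch) when the caller lacks the permission, instead of
    # rejecting the whole update.
    if access_control is None and not has_public_read_access_grant(
        user, resource_type
    ):
        return {}  # fall back to a private (empty) access-control dict
    return access_control
```

The same check applied on both read (access) and write (update) paths is what closes the direct-API bypass.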
Fixes #21356
Ensure chat_id is reliably passed to function pipelines/manifolds during internal task invocations (web search query generation, RAG query generation, image prompt generation).
This allows stateful functions to maintain per-chat state without fragmentation, as they will now receive a consistent chat_id for all chat-scoped invocations including internal tasks.
Backend changes:
- Pass chat_id in generate_queries call for web search
- Pass chat_id in generate_queries call for RAG/retrieval
- Pass chat_id in generate_image_prompt call
Frontend changes:
- Add optional chat_id parameter to generateQueries API function
- Add optional chat_id parameter to generateAutoCompletion API function
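A minimal sketch of the backend side of this change, assuming a task function that builds an internal payload: the point is simply that chat_id is forwarded into the task's metadata rather than dropped. Function names here (call_pipeline, generate_queries' exact signature) are hypothetical stand-ins, not the project's real API.

```python
import asyncio


async def call_pipeline(payload: dict, user: dict) -> dict:
    # Stand-in for the real pipeline/manifold dispatch; echoes metadata.
    return {"metadata": payload["metadata"]}


async def generate_queries(form_data: dict, user: dict) -> dict:
    # Forward chat_id (when present) so stateful functions can key
    # per-chat state consistently, even for internal task invocations.
    payload = {
        "model": form_data.get("model"),
        "messages": form_data.get("messages", []),
        "metadata": {
            "chat_id": form_data.get("chat_id"),
            "task": "query_generation",
        },
    }
    return await call_pipeline(payload, user)
```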
Fixes #20563
* fix: prevent worker death during document upload by using run_coroutine_threadsafe
Replace asyncio.run() with asyncio.run_coroutine_threadsafe() in
save_docs_to_vector_db() to prevent uvicorn worker health check failures.
The issue: asyncio.run() creates a new event loop and blocks the thread
completely, preventing the worker from responding to health checks during
long-running embedding operations (>5 seconds default timeout).
The fix: Schedule the async embedding work on the main event loop using
run_coroutine_threadsafe(). This keeps the main loop responsive to health
check pings while the sync caller waits for the result.
Changes:
- main.py: Store main event loop reference in app.state.main_loop at startup
- retrieval.py: Use run_coroutine_threadsafe() instead of asyncio.run()
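The pattern above can be reduced to a self-contained sketch: capture the running loop at startup (in the app this lives in app.state.main_loop), then have the worker thread schedule its coroutine on that loop instead of calling asyncio.run(). Only the worker thread blocks on future.result(); the main loop keeps servicing other work, such as health checks. The embed_documents coroutine is an illustrative stand-in for the real embedding call.

```python
import asyncio


async def embed_documents(docs: list[str]) -> list[str]:
    # Stand-in for the long-running embedding coroutine.
    await asyncio.sleep(0)
    return [f"embedded:{d}" for d in docs]


def save_docs_to_vector_db(docs: list[str], main_loop) -> list[str]:
    # Runs in a worker thread: schedule the coroutine on the main loop
    # and block only this thread while waiting for the result.
    future = asyncio.run_coroutine_threadsafe(embed_documents(docs), main_loop)
    return future.result(timeout=30)


async def main() -> list[str]:
    # At startup the app would do: app.state.main_loop = asyncio.get_running_loop()
    loop = asyncio.get_running_loop()
    # Simulate the sync caller running in a worker thread.
    return await asyncio.to_thread(save_docs_to_vector_db, ["a", "b"], loop)


result = asyncio.run(main())
```

The key property: while the worker thread sits in future.result(), the main loop is free, so a >5 s embedding job no longer starves the worker's health-check endpoint.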
* add env var
---------
Co-authored-by: Claude <noreply@anthropic.com>
fix: reduce TTFT by caching model lookups in chat completion
Skip expensive get_all_models() calls when models are already cached
in app.state. This significantly reduces Time To First Token (TTFT)
for chat completions and embeddings requests.
Previously, every request called get_all_models() which fetches model
lists from all configured backends. Now we check the cache first and
only call get_all_models() on cache miss.
Affected endpoints:
- openai: generate_chat_completion, embeddings
- ollama: embed, embeddings
Fixes #20069
Co-authored-by: Michael <42099345+mickeytheseal@users.noreply.github.com>
* fix: add ScopedSession.remove() to prevent idle transaction leaks
The HTTP middleware was calling ScopedSession.commit() but not
ScopedSession.remove(), causing database connections to remain
"checked out" from the pool indefinitely. This resulted in
"idle in transaction" connections in PostgreSQL that could persist
for 30-50+ minutes.
With SQLAlchemy's scoped_session:
- commit() commits but keeps the session active
- remove() is required to return the connection to the pool
This fix adds the missing remove() call, ensuring connections are
properly returned after each HTTP request.
Also includes IDLE_TRANSACTION_ANALYSIS.md documenting the full
root cause analysis and additional recommendations.
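A sketch of the fixed middleware, using a stand-in session object that records whether the connection was returned to the pool (mirroring what scoped_session.remove() does in SQLAlchemy). Putting remove() in a finally block, as here, is one reasonable shape for the fix; the real middleware signature belongs to the web framework.

```python
import asyncio


class FakeScopedSession:
    """Stand-in for SQLAlchemy's scoped_session registry."""

    def __init__(self):
        self.committed = False
        self.removed = False

    def commit(self):
        self.committed = True  # commits, but the session stays checked out

    def remove(self):
        self.removed = True  # closes the session, returning the connection


ScopedSession = FakeScopedSession()


async def db_session_middleware(request, call_next):
    try:
        response = await call_next(request)
        ScopedSession.commit()
    finally:
        # The previously missing call: without remove(), the connection
        # lingers "idle in transaction" until the pool recycles it.
        ScopedSession.remove()
    return response
```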
* Delete IDLE_TRANSACTION_ANALYSIS.md
---------
Co-authored-by: Claude <noreply@anthropic.com>
Each access to request.app.state.config.<KEY> triggers a synchronous
Redis GET. In get_all_models_responses() and get_merged_models(), the
config keys OPENAI_API_BASE_URLS, OPENAI_API_KEYS, and
OPENAI_API_CONFIGS were read on every loop iteration, in some cases
causing 200-300 Redis round-trips for OPENAI_API_BASE_URLS alone.
Read each config value once into a local variable at the start of the
function and reuse it throughout.
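The hoisting fix can be demonstrated with a stand-in config object where every attribute read counts as one Redis GET (simulated with a counter). Reading each key once before the loop turns N-per-key round-trips into exactly one per key; the function body below is an illustrative simplification of the real get_all_models_responses().

```python
class RedisBackedConfig:
    """Stand-in: every config attribute read counts as one Redis GET."""

    def __init__(self, data: dict):
        self.__dict__["_data"] = data
        self.__dict__["gets"] = 0

    def __getattr__(self, key):
        # In the real app this is a synchronous Redis round-trip.
        self.__dict__["gets"] += 1
        return self.__dict__["_data"][key]


def get_all_models_responses(config):
    # Hoist: one GET per key, outside the loop, instead of one per iteration.
    base_urls = config.OPENAI_API_BASE_URLS
    api_keys = config.OPENAI_API_KEYS
    api_configs = config.OPENAI_API_CONFIGS

    responses = []
    for idx, url in enumerate(base_urls):
        key = api_keys[idx] if idx < len(api_keys) else None
        responses.append(
            {"url": url, "key": key, "config": api_configs.get(str(idx), {})}
        )
    return responses
```

This trades a small amount of freshness (the values are snapshotted for the duration of one call) for a fixed, loop-independent number of config reads.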