fix(db): release connection before embeddings in knowledge /metadata/reindex (#20577)

Remove Depends(get_session) from POST /metadata/reindex endpoint to prevent database connections from being held during N embedding API calls.

This endpoint is critical because it loops through ALL knowledge bases and calls embed_knowledge_base_metadata() for each one. With the original code, the request held a single connection for the entire loop (potentially minutes on large deployments), which could exhaust the connection pool.

The Knowledges.get_knowledge_bases() function manages its own short-lived session, releasing the connection before the embedding loop begins.
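
A minimal sketch of the difference, not the project's actual code: TinyPool, load_knowledge_bases(), and embed() are hypothetical stand-ins for the real connection pool, Knowledges.get_knowledge_bases(), and embed_knowledge_base_metadata(). It only records how many connections are checked out while each "embedding call" runs.

```python
import contextlib
import threading


class TinyPool:
    """Hypothetical stand-in for a DB connection pool; it only counts checkouts."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)
        self.in_use = 0

    @contextlib.contextmanager
    def session(self):
        # Fail fast instead of blocking so pool exhaustion is visible.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("connection pool exhausted")
        self.in_use += 1
        try:
            yield self
        finally:
            self.in_use -= 1
            self._slots.release()


def load_knowledge_bases(pool: TinyPool) -> list[str]:
    """Open a short-lived session, read the rows, then release the connection."""
    with pool.session():
        return ["kb-1", "kb-2", "kb-3"]  # placeholder rows


def embed(kb_id: str, pool: TinyPool, observed: list[int]) -> None:
    """Stand-in for a slow external embedding call; records connections in use."""
    observed.append(pool.in_use)


def reindex_holding_session(pool: TinyPool) -> list[int]:
    """Old shape: the session stays open for the whole embedding loop."""
    observed: list[int] = []
    with pool.session():
        for kb_id in ["kb-1", "kb-2", "kb-3"]:
            embed(kb_id, pool, observed)
    return observed


def reindex_releasing_session(pool: TinyPool) -> list[int]:
    """New shape: the connection is released before the embedding loop starts."""
    observed: list[int] = []
    for kb_id in load_knowledge_bases(pool):
        embed(kb_id, pool, observed)
    return observed
```

In this sketch, reindex_holding_session() keeps one connection checked out during every embed() call, while reindex_releasing_session() has none checked out by the time the loop runs.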
Classic298
2026-01-11 20:33:04 +01:00
committed by GitHub
parent 182d5e8591
commit 826e9ab317

@@ -345,10 +345,15 @@ async def reindex_knowledge_files(
 async def reindex_knowledge_base_metadata_embeddings(
     request: Request,
     user=Depends(get_admin_user),
-    db: Session = Depends(get_session),
 ):
-    """Batch embed all existing knowledge bases. Admin only."""
-    knowledge_bases = Knowledges.get_knowledge_bases(db=db)
+    """Batch embed all existing knowledge bases. Admin only.
+    NOTE: We intentionally do NOT use Depends(get_session) here.
+    This endpoint loops through ALL knowledge bases and calls embed_knowledge_base_metadata()
+    for each one, making N external embedding API calls. Holding a session during
+    this entire operation would exhaust the connection pool.
+    """
+    knowledge_bases = Knowledges.get_knowledge_bases()
     log.info(f"Reindexing embeddings for {len(knowledge_bases)} knowledge bases")
     success_count = 0