[GH-ISSUE #24605] issue: db healthcheck stuck, possible regression from 0.9.3 #123667

Closed
opened 2026-05-21 03:05:20 -05:00 by GiteaMirror · 3 comments

Originally created by @olevitt on GitHub (May 12, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24605

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Other

Open WebUI Version

v0.9.4

Ollama Version (if applicable)

No response

Operating System

Kubernetes (debian 13)

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

DB healthcheck should not get stuck

Actual Behavior

The DB healthcheck got stuck with the following error and could not recover:
sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back. Please rollback() fully before proceeding
The connection to the database itself was fine (confirmed manually, and by restarting another similar pod, which came up without issues), but the DB healthcheck (triggered by the readiness probe) stayed stuck on this error. The only fix was to restart the pod manually.
I suspect this may be caused by the recent changes to the DB healthcheck in v0.9.3 (#24380 and/or #24384), as we never encountered this issue in versions < 0.9.3 with the same setup.
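
For context, one common way to keep a probe from inheriting a broken transaction is to open a short-lived connection for every check. The sketch below is not Open WebUI's healthcheck code, which is not shown in this issue. `check_db` and the in-memory SQLite URL are hypothetical and chosen for illustration. It uses only documented SQLAlchemy calls (`create_engine`, `Engine.connect`, `text`).

```python
from sqlalchemy import create_engine, text
from sqlalchemy.engine import Engine
from sqlalchemy.exc import SQLAlchemyError


def check_db(engine: Engine) -> bool:
    """Hypothetical readiness check: True if a trivial query succeeds."""
    try:
        # A fresh connection is checked out for every probe. Leaving the
        # `with` block closes it, which rolls back any open transaction
        # before the connection goes back to the pool.
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        return True
    except SQLAlchemyError:
        # The failure is reported, but no connection or session outlives
        # this call, so the next probe starts from a clean state.
        return False


if __name__ == "__main__":
    # In-memory SQLite stands in for the real database (assumption).
    engine = create_engine("sqlite://", pool_pre_ping=True)
    print(check_db(engine))
```

If the check instead reuses a long-lived session or connection, a failed probe may leave it in the invalid-transaction state described above unless `rollback()` is called in the error path.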

Steps to Reproduce

Not sure exactly how to reproduce this, as the DB healthcheck has to fail at some point, probably while there is load on the database or transactions in progress ("Can't reconnect until invalid transaction is rolled back").

Logs & Screenshots

sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back. Please rollback() fully before proceeding (Background on this error at: https://sqlalche.me/e/20/8s2b)

Additional Information

I strongly suspect this may be caused by recent changes to db healthcheck from v0.9.3 (#24380 and/or #24384) as we never encountered this issue in versions < 0.9.3 with the same setup.

GiteaMirror added the bug label 2026-05-21 03:05:20 -05:00

@owui-terminator[bot] commented on GitHub (May 12, 2026):

🔍 Related Issues Found

I found some existing issues that might be related. Please check if any of these are duplicates or contain helpful solutions:

  1. 🟣 #9496 PgvectorClient.search() Fails After Connection Loss – Causes Stuck Transactions and Query Failures
    This issue reports PostgreSQL connection-loss behavior leaving SQLAlchemy transactions stuck until a rollback/restart, which is very close to the same underlying failure mode as a healthcheck getting wedged with PendingRollbackError. It is the strongest match for transaction state not being cleared after a DB problem.
    by tintina95

  2. 🟣 #21349 issue: upgrade to 0.8.0 can not start up
    Although about startup rather than readiness probes, it is another database-related regression around startup/connection handling in Open WebUI and may reflect the same class of SQLAlchemy/DB initialization failures that can prevent the app from recovering cleanly.
    by hzr42strrs-hash · bug


💡 If your issue is a duplicate, please close it and add any additional details to the existing issue instead.

This comment was generated automatically. React with 👍 if helpful, 👎 if not.


@jmleksan commented on GitHub (May 14, 2026):

Good catch, my bad. I don't think I encountered any failures during my testing, so might have slipped past.


@Classic298 commented on GitHub (May 14, 2026):

Might be addressed in dev; testing wanted.


Reference: github-starred/open-webui#123667