enh: more mature support for external (non-SQLite) databases #1072

Closed
opened 2025-11-11 14:36:43 -06:00 by GiteaMirror · 4 comments
Owner

Originally created by @PeterBeklemishev on GitHub (May 30, 2024).

Is your feature request related to a problem? Please describe.
Currently, there is support for specifying an external database via the DATABASE_URL environment variable instead of the default SQLite (added in https://github.com/open-webui/open-webui/issues/1397).
However, it seems that OpenWebUI (at least in the main branch) doesn't fully manage the database connection lifecycle, likely because SQLite is the default option, and connection management is often less critical in this context.

"Standard" approach to handling database connections in web applications looks something like
  1. Init database connection pool

  2. Start web server (so (1) should be done via something like lifespan in FastAPI terms)

  3. On every incoming request - take connection from pool, execute required sql, return response to the request - and return connection to pool

  4. Before application shutdown - gracefully close all connections to database

When using SQLite, these steps (outlined in the collapsible section) may seem overcomplicated, so OpenWebUI currently creates a single connection to the database at application start.

However, if we specify a Postgresql DSN via the DATABASE_URL env var, OpenWebUI works fine only for a limited time.
The single connection will eventually be closed by the database since connections are not immortal. Currently, neither Peewee nor OpenWebUI handles reconnections in such scenarios.

Describe the solution you'd like

OpenWebUI is build using FastAPI framework and uses Peewee ORM.
FastAPI documentation includes tutorial / how-to work with Peewee from asynchronous application, and Peewee documentation points to the same tutorial.

In short - this tutorial is about ensuring that

  • PeeWee functions correctly in asynchronous environment
  • For every request [to our FastAPI application] there is open and ready-to-be-used connection to the database.

First part can be achieved by adding the following code to the backend/apps/web/internal/db.py file (adapted from the tutorial linked above):

...
from contextvars import ContextVar

from peewee import InterfaceError as _ConnectionState
...

db_state_default = {"closed": None, "conn": None, "ctx": None, "transactions": None}
db_state = ContextVar("db_state", default=db_state_default.copy())


class PeeweeConnectionState(_ConnectionState):
    def __init__(self, **kwargs):
        super().__setattr__("_state", db_state)
        super().__init__(**kwargs)

    def __setattr__(self, name, value):
        self._state.get()[name] = value

    def __getattr__(self, name):
        return self._state.get()[name]

...

DB = connect(DATABASE_URL)
DB._state = PeeweeConnectionState()  # overrride _state property after connection

...

The second part is about ensuring an active connection. This can be done using following ad-hoc method:
Peewee has an almost ready-made solution: mixin playhouse.shortcuts.ReconnectMixin.

We just need to adapt this to psycopg2 exceptions (in the same backend/apps/web/internal/db.py file):

...
from peewee import InterfaceError as PeeWeeInterfaceError
from psycopg2 import OperationalError
from psycopg2.errors import InterfaceError
from playhouse.db_url import PooledPostgresqlDatabase, register_database
...

class PGReconnectMixin(ReconnectMixin):
    reconnect_errors = (
        (OperationalError, 'termin'),
        (InterfaceError, 'closed'),
        (PeeWeeInterfaceError, 'closed'),
    )


class ReconnectingPostgresqlDatabase(PGReconnectMixin, PooledPostgresqlDatabase):
    pass


register_database(ReconnectingPostgresqlDatabase, 'postgres+pool', 'postgresql+pool')

...

I acknowledge that this approach may not be ideal, but it should be an improvement.

Describe alternatives you've considered

Migrating from Peewee ORM to SQLAlchemy ORM (or SQLAlchemy Core) could greatly benefit the project. However, this requires substantial refactoring, which means a significant investment of person-hours to implement and even more time to properly review and test the changes.

Additional context

Q&A:

  • Will OpenWebUI stop working with a "connection closed" error after these changes? Hopefully not, as we will intercept such exceptions and reconnect to the database.
  • Is this solution complete? I am not entirely sure.
  • Will it provide isolation between different asynchronous requests to OpenWebUI? Again, I am not sure.

Any feedback or guidance on this would be greatly appreciated. Thank you for your hard work and dedication to this project.

Originally created by @PeterBeklemishev on GitHub (May 30, 2024). **Is your feature request related to a problem? Please describe.** Currently, there is support for specifying an external database via the `DATABASE_URL` environment variable instead of the default SQLite (added in https://github.com/open-webui/open-webui/issues/1397). However, it seems that OpenWebUI (at least in the `main` branch) doesn't fully manage the database connection lifecycle, likely because SQLite is the default option, and connection management is often less critical in this context. <details> <summary> "Standard" approach to handling database connections in web applications looks something like </summary> 1. Init database connection pool 2. Start web server (so (1) should be done via something like [lifespan](https://fastapi.tiangolo.com/advanced/events/) in FastAPI terms) 3. On every incoming request - take connection from pool, execute required sql, return response to the request - and return connection to pool 4. Before application shutdown - gracefully close all connections to database </details> When using SQLite, these steps (outlined in the collapsible section) may seem overcomplicated, so OpenWebUI currently creates a single connection to the database at application start. However, if we specify a Postgresql DSN via the `DATABASE_URL` env var, OpenWebUI works fine only for a limited time. The single connection will eventually be closed by the database since connections are not immortal. __Currently, neither Peewee nor OpenWebUI handles reconnections in such scenarios.__ **Describe the solution you'd like** OpenWebUI is build using FastAPI framework and uses Peewee ORM. FastAPI documentation includes [tutorial / how-to](https://fastapi.tiangolo.com/how-to/sql-databases-peewee/) work with Peewee from asynchronous application, and Peewee documentation points to the same tutorial. In short - this tutorial is about ensuring that - PeeWee functions correctly in asynchronous environment - For every request [to our FastAPI application] there is open and ready-to-be-used connection to the database. First part can be achieved by adding the following code to the `backend/apps/web/internal/db.py` file (adapted from the tutorial linked above): ```python ... from contextvars import ContextVar from peewee import InterfaceError as _ConnectionState ... db_state_default = {"closed": None, "conn": None, "ctx": None, "transactions": None} db_state = ContextVar("db_state", default=db_state_default.copy()) class PeeweeConnectionState(_ConnectionState): def __init__(self, **kwargs): super().__setattr__("_state", db_state) super().__init__(**kwargs) def __setattr__(self, name, value): self._state.get()[name] = value def __getattr__(self, name): return self._state.get()[name] ... DB = connect(DATABASE_URL) DB._state = PeeweeConnectionState() # overrride _state property after connection ... ``` The second part is about ensuring an active connection. This can be done using following ad-hoc method: Peewee has an almost ready-made solution: [mixin playhouse.shortcuts.ReconnectMixin](https://github.com/coleifer/peewee/blob/master/playhouse/shortcuts.py#L212-L242). We just need to adapt this to psycopg2 exceptions (in the same `backend/apps/web/internal/db.py` file): ```python ... from peewee import InterfaceError as PeeWeeInterfaceError from psycopg2 import OperationalError from psycopg2.errors import InterfaceError from playhouse.db_url import PooledPostgresqlDatabase, register_database ... class PGReconnectMixin(ReconnectMixin): reconnect_errors = ( (OperationalError, 'termin'), (InterfaceError, 'closed'), (PeeWeeInterfaceError, 'closed'), ) class ReconnectingPostgresqlDatabase(PGReconnectMixin, PooledPostgresqlDatabase): pass register_database(ReconnectingPostgresqlDatabase, 'postgres+pool', 'postgresql+pool') ... ``` I acknowledge that this approach may not be ideal, but it should be an improvement. **Describe alternatives you've considered** Migrating from Peewee ORM to SQLAlchemy ORM (or SQLAlchemy Core) could greatly benefit the project. However, this requires substantial refactoring, which means a significant investment of person-hours to implement and even more time to properly review and test the changes. **Additional context** Q&A: - Will OpenWebUI stop working with a "connection closed" error after these changes? Hopefully not, as we will intercept such exceptions and reconnect to the database. - Is this solution complete? I am not entirely sure. - Will it provide isolation between different asynchronous requests to OpenWebUI? Again, I am not sure. Any feedback or guidance on this would be greatly appreciated. Thank you for your hard work and dedication to this project.
Author
Owner

@tjbck commented on GitHub (May 30, 2024):

Great suggestion! As long as it won't break the existing default sqlite db, PR welcome!

@tjbck commented on GitHub (May 30, 2024): Great suggestion! As long as it won't break the existing default sqlite db, PR welcome!
Author
Owner

@PeterBeklemishev commented on GitHub (May 30, 2024):

@tjbck can you please take a look at https://github.com/open-webui/open-webui/pull/2666?

@PeterBeklemishev commented on GitHub (May 30, 2024): @tjbck can you please take a look at https://github.com/open-webui/open-webui/pull/2666?
Author
Owner

@devdev999 commented on GitHub (May 31, 2024):

Is this the reason why I get 404 not found errors after a certain period of time after initial startup, when using postgresql? Works fine on restart but logins always fail after some time.

@devdev999 commented on GitHub (May 31, 2024): Is this the reason why I get 404 not found errors after a certain period of time after initial startup, when using postgresql? Works fine on restart but logins always fail after some time.
Author
Owner

@PeterBeklemishev commented on GitHub (May 31, 2024):

@devdev999 IDK, probably not, my observation is that when connection to postgres is no longer alive - login fails with 500 status code with something like "connection already closed" in body.

(But it depends on how openwebui is deployed in your case, maybe something between your browser and OpenWebUI is converting these 500s to 404)

You can increase log verbosity by setting following env variables (main one here is UVICORN_LOG_LEVEL, and OpenWebUI-specific you can find here):

UVICORN_LOG_LEVEL=debug
DB_LOG_LEVEL=DEBUG
GLOBAL_LOG_LEVEL=DEBUG
@PeterBeklemishev commented on GitHub (May 31, 2024): @devdev999 IDK, probably not, my observation is that when connection to postgres is no longer alive - login fails with 500 status code with something like "connection already closed" in body. (But it depends on how openwebui is deployed in your case, maybe something between your browser and OpenWebUI is converting these 500s to 404) You can increase log verbosity by setting following env variables (main one here is UVICORN_LOG_LEVEL, and OpenWebUI-specific you can find [here](https://github.com/open-webui/open-webui/blob/main/backend/config.py#L48-L70)): ``` UVICORN_LOG_LEVEL=debug DB_LOG_LEVEL=DEBUG GLOBAL_LOG_LEVEL=DEBUG ```
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#1072