[PR #13396] [CLOSED] WIP: add periodic data cleanup task #62047

Closed
opened 2026-05-06 05:55:24 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/13396
Author: @CallumJHays
Created: 5/1/2025
Status: Closed

Base: dev ← Head: periodic-data-cleanup-task


📝 Commits (5)

  • b6a9f83 fix: db unbound error if connection fails
  • c5ce52f tests(fix): always use absolute imports to prevent various issues (eg. isinstance)
  • 580882f test(refactor): Add AbstractDBTest and AbstractSQLiteTest
  • c21dc5a feat: add periodic data cleanup task
  • b8be8f3 test(refactor): truncate tables in teardown using orm

📊 Changes

12 files changed (+488 additions, -64 deletions)

View changed files

➕ backend/open_webui/data_cleanup_task.py (+243 -0)
📝 backend/open_webui/env.py (+1 -0)
📝 backend/open_webui/internal/db.py (+6 -6)
📝 backend/open_webui/main.py (+4 -0)
📝 backend/open_webui/test/apps/webui/routers/test_auths.py (+2 -2)
📝 backend/open_webui/test/apps/webui/routers/test_chats.py (+2 -2)
📝 backend/open_webui/test/apps/webui/routers/test_models.py (+2 -2)
📝 backend/open_webui/test/apps/webui/routers/test_prompts.py (+2 -2)
📝 backend/open_webui/test/apps/webui/routers/test_users.py (+2 -2)
➕ backend/open_webui/test/test_data_cleanup_task.py (+182 -0)
📝 backend/open_webui/test/util/abstract_integration_test.py (+41 -47)
📝 backend/open_webui/test/util/mock_user.py (+1 -1)

📄 Description

Adds a periodic apscheduler job that implements a data retention policy by deleting expired rows from the db chat table, their associated records, and cache files. Only sqlite and postgres databases are supported.
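
As a rough illustration of the retention idea described above, here is a minimal sketch using only the standard library `sqlite3` module. The table and column names (`chat`, `chat_file`, `updated_at` as epoch seconds) and the function `delete_expired_chats` are hypothetical and chosen for the example; they are not taken from the PR's `data_cleanup_task.py`, which may work quite differently.

```python
import sqlite3
import time
from typing import Optional

SECONDS_PER_DAY = 86_400


def delete_expired_chats(
    conn: sqlite3.Connection, max_age_days: float, now: Optional[float] = None
) -> int:
    """Delete chats last updated more than max_age_days ago; return how many."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    with conn:  # commits on success, rolls back if an exception is raised
        # Remove dependent rows first (hypothetical chat_file link table).
        conn.execute(
            "DELETE FROM chat_file WHERE chat_id IN "
            "(SELECT id FROM chat WHERE updated_at < ?)",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM chat WHERE updated_at < ?", (cutoff,))
    return cur.rowcount


# Demo against an in-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chat (id TEXT PRIMARY KEY, updated_at REAL);
    CREATE TABLE chat_file (chat_id TEXT, path TEXT);
    """
)
now = 1_000_000_000.0
conn.executemany(
    "INSERT INTO chat VALUES (?, ?)",
    [("old", now - 40 * SECONDS_PER_DAY), ("new", now - 1 * SECONDS_PER_DAY)],
)
conn.executemany(
    "INSERT INTO chat_file VALUES (?, ?)", [("old", "a.txt"), ("new", "b.txt")]
)
conn.commit()

deleted = delete_expired_chats(conn, max_age_days=30, now=now)
remaining = [row[0] for row in conn.execute("SELECT id FROM chat")]
remaining_files = conn.execute("SELECT COUNT(*) FROM chat_file").fetchone()[0]
```

Running both deletes inside one transaction means a failure partway through leaves the chat rows and their dependent rows consistent with each other.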

Discussion: https://github.com/open-webui/open-webui/discussions/7465

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • [x] Target branch: Please verify that the pull request targets the dev branch.
  • [x] Description: Provide a concise description of the changes made in this pull request.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: Have you updated the relevant documentation (Open WebUI Docs or other documentation sources)?
  • [x] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [x] Testing: Have you written and run sufficient tests for validating the changes?
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [x] Prefix: To clearly categorize this pull request, prefix the pull request title, using one of the following

Changelog Entry

Description

  • Add a periodic data cleanup task that deletes old db chat table rows, their associated records, and cache files.

Added

  • Periodic Data Cleanup Task, configurable through environment variables:
    • DATA_CLEANUP_ENABLED (bool, default False): Whether to enable the data cleanup policy.
    • DATA_CLEANUP_MAX_CHAT_AGE_DAYS (float): Maximum chat age in days; older chats are deleted from the DB and their uploaded files are removed. Required if DATA_CLEANUP_ENABLED is set.
    • DATA_CLEANUP_MAX_CACHE_AGE_DAYS (float): Maximum cache file age in days; older files are deleted from the /app/data/cache/[audio|image] folders. Required if DATA_CLEANUP_ENABLED is set.
    • DATA_CLEANUP_CRON_SCHEDULE (str): Crontab schedule for the cleanup job, e.g. "0 * * * *" (at the top of every hour). Required if DATA_CLEANUP_ENABLED is set.
    • DATA_CLEANUP_LOG_LEVEL (str, default "INFO"): Sets the log level for the data_cleanup_policy module, e.g. "DEBUG" or "INFO".
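
To make the configuration rules above concrete, the following is a minimal sketch of how the variables could be read and how the cache age limit could be applied. Everything other than the environment variable names is an assumption for illustration: `CleanupConfig`, `load_cleanup_config`, and `delete_old_cache_files` are hypothetical names, the boolean parsing (only "true", case-insensitive, enables cleanup) is a guess, and file age is judged by modification time. The PR's actual code may differ on all of these points.

```python
import os
import tempfile
import time
from dataclasses import dataclass
from pathlib import Path
from typing import List, Mapping, Optional

SECONDS_PER_DAY = 86_400
REQUIRED = (
    "DATA_CLEANUP_MAX_CHAT_AGE_DAYS",
    "DATA_CLEANUP_MAX_CACHE_AGE_DAYS",
    "DATA_CLEANUP_CRON_SCHEDULE",
)


@dataclass(frozen=True)
class CleanupConfig:
    max_chat_age_days: float
    max_cache_age_days: float
    cron_schedule: str
    log_level: str


def load_cleanup_config(env: Mapping[str, str] = os.environ) -> Optional[CleanupConfig]:
    """Return None when cleanup is disabled; raise if a required variable is missing."""
    # Assumption: only the string "true" (any case) enables the task.
    if env.get("DATA_CLEANUP_ENABLED", "false").strip().lower() != "true":
        return None
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("cleanup enabled but missing: " + ", ".join(missing))
    return CleanupConfig(
        max_chat_age_days=float(env["DATA_CLEANUP_MAX_CHAT_AGE_DAYS"]),
        max_cache_age_days=float(env["DATA_CLEANUP_MAX_CACHE_AGE_DAYS"]),
        cron_schedule=env["DATA_CLEANUP_CRON_SCHEDULE"],
        log_level=env.get("DATA_CLEANUP_LOG_LEVEL", "INFO"),
    )


def delete_old_cache_files(
    cache_dir: Path, max_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """Delete files under cache_dir whose mtime is older than the limit."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for path in sorted(cache_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


# Demo: a 7-day cache limit removes a 10-day-old file and keeps a fresh one.
cfg = load_cleanup_config(
    {
        "DATA_CLEANUP_ENABLED": "true",
        "DATA_CLEANUP_MAX_CHAT_AGE_DAYS": "30",
        "DATA_CLEANUP_MAX_CACHE_AGE_DAYS": "7",
        "DATA_CLEANUP_CRON_SCHEDULE": "0 * * * *",
    }
)
with tempfile.TemporaryDirectory() as tmp:
    stale = Path(tmp) / "audio" / "stale.mp3"
    fresh = Path(tmp) / "image" / "fresh.png"
    for f in (stale, fresh):
        f.parent.mkdir(parents=True)
        f.write_bytes(b"")
    ten_days_ago = time.time() - 10 * SECONDS_PER_DAY
    os.utime(stale, (ten_days_ago, ten_days_ago))
    removed = delete_old_cache_files(Path(tmp), cfg.max_cache_age_days)
    removed_names = [p.name for p in removed]
    fresh_survived = fresh.exists()
```

Failing fast on missing required variables surfaces a misconfiguration at startup rather than at the first scheduled run. With APScheduler 3.x, `CronTrigger.from_crontab(cfg.cron_schedule)` is one way to turn the schedule string into a trigger, though how the PR wires the job into the scheduler is not shown here.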

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-06 05:55:24 -05:00
Reference: github-starred/open-webui#62047