mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 02:48:13 -05:00
[PR #16520] [CLOSED] 🚀 feat: COMPREHENSIVE DATA PRUNING SYSTEM - The Ultimate Storage Management Solution for Open WebUI #47208
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/16520
Author: @Classic298
Created: 8/12/2025
Status: ❌ Closed
Base: dev ← Head: universal_file_deletion
📝 Commits (10+)
d454e6a Feat/prune orphaned data (#16)
aadb296 Merge branch 'open-webui:main' into universal_file_deletion
028a2e5 Update prune.py
0bd42e5 Update Database.svelte
5ce002d Update PruneDataDialog.svelte
8d7273a Update prune.ts
e4a0bd8 Update Database.svelte
60edac6 Update Database.svelte
709c852 Update prune.py
34c9a88 Update prune.py
📊 Changes
6 files changed (+3015 additions, -0 deletions)
📝 backend/open_webui/main.py (+2 -0)
📝 backend/open_webui/models/folders.py (+6 -0)
➕ backend/open_webui/routers/prune.py (+1793 -0)
➕ src/lib/apis/prune.ts (+66 -0)
📝 src/lib/components/admin/Settings/Database.svelte (+249 -0)
➕ src/lib/components/common/PruneDataDialog.svelte (+899 -0)
📄 Description
🚀 feat: COMPREHENSIVE DATA PRUNING SYSTEM - The Ultimate Storage Management Solution for Open WebUI
Before submitting, make sure you've checked the following:
The pull request targets the dev branch.
🎯 COMPREHENSIVE DATA MANAGEMENT SOLUTION
After MONTHS of development and work on more than 20 community issues, this PR introduces a complete data pruning system for Open WebUI. The implementation was designed to address the most requested feature in the Open WebUI community: comprehensive storage management and cleanup.
🎉 ADDRESSES 25+ COMMUNITY ISSUES & DISCUSSIONS
This implementation closes/addresses:
Primary Issues / PRs / Discussions Resolved:
And definitely many more. In fact, I lost track of some of the discussions in my notifications.
Also there were PLENTY of feature requests, bug reports and discussions around this topic on the official Discord Server. If I had to guess, there were at least 40 real requests and discussions around this.
🛡️ PRESERVES EXISTING BEHAVIOR & FOLLOWS MAINTAINER VISION
🎯 EXACTLY THE APPROACH REQUESTED
This implementation follows the API endpoint + manual trigger approach outlined by @tjbck (maintainer) in previous discussions:
🔒 WHY THIS APPROACH WAS CHOSEN
Previous PRs proposing automated background deletion were correctly rejected because:
This PR respects these constraints by providing optional, manual, admin-only, API-driven and FULLY CONFIGURABLE data pruning.
🌟 KEY DESIGN PRINCIPLES
🔒 SAFETY FIRST
🌍 REGULATORY COMPLIANCE
🤖 AUTOMATION READY
/api/v1/prune) for external scripts and automated calling
🚀 COMPREHENSIVE FEATURE SET
👥 ADVANCED USER MANAGEMENT
last_active_at timestamps for accurate detection
💬 ADVANCED CHAT MANAGEMENT
updated_at timestamps
📁 COMPREHENSIVE FILE SYSTEM INTEGRATION
🎵 AUDIO CACHE MANAGEMENT
🗄️ DATABASE OPTIMIZATION
👥 USER & RESOURCE MANAGEMENT
🎛️ ADVANCED CONFIGURATION
🔧 TECHNICAL IMPLEMENTATION
🏗️ MULTI-STAGE PROCESSING ARCHITECTURE
🏭 MODULAR VECTOR DATABASE FRAMEWORK
🔧 CODE QUALITY IMPROVEMENTS
🔍 INTELLIGENT FILE SCANNING
🧠 ENHANCED VECTOR CLEANUP
🎵 AUDIO CACHE INTELLIGENCE
🔍 DRY-RUN PREVIEW SYSTEM
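As a rough illustration of the dry-run idea (a sketch only, not the code in prune.py): a preview can run the same age check a real prune would use and report what would be removed without deleting anything. The function name `preview_prune`, the dictionary shape of a chat, and the assumption that `updated_at` is stored as epoch seconds are all hypothetical.

```python
SECONDS_PER_DAY = 86400


def preview_prune(chats, max_age_days, now_ts):
    """Report which chats an age-based prune would remove, without removing them.

    Assumes (hypothetically) that each chat is a dict with an "id" and an
    "updated_at" value in epoch seconds.
    """
    cutoff = now_ts - max_age_days * SECONDS_PER_DAY
    stale_ids = [chat["id"] for chat in chats if chat["updated_at"] < cutoff]
    return {
        "would_delete": stale_ids,
        "would_keep": len(chats) - len(stale_ids),
    }


# Two example chats: one last updated 90 days ago, one 10 days ago.
now_ts = 1000 * SECONDS_PER_DAY
chats = [
    {"id": "a", "updated_at": now_ts - 90 * SECONDS_PER_DAY},
    {"id": "b", "updated_at": now_ts - 10 * SECONDS_PER_DAY},
]
preview = preview_prune(chats, max_age_days=60, now_ts=now_ts)
print(preview)  # {'would_delete': ['a'], 'would_keep': 1}
```

Keeping the preview free of side effects means the same selection logic can feed both the preview and the actual deletion step.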
⚡ OPTIMIZATION BENEFITS
🎨 USER EXPERIENCE
🖼️ BEAUTIFUL INTERFACE
🚧 FUTURE DEVELOPMENT OPPORTUNITIES
🗄️ VECTOR DATABASE SUPPORT
Currently Implemented:
Community Extension Framework Ready:
Adding New Vector Databases:
Changelog Entry
Description
COMPREHENSIVE DATA PRUNING SYSTEM - A production-ready, enterprise-grade pruning system developed over multiple months to address 20+ community issues. This optional, admin-controlled feature includes intelligent chat deletion, time-based user account management, comprehensive file cleanup, audio cache management, enhanced vector database optimization with a modular framework, dry-run previews, and GDPR compliance capabilities, while preserving all existing behavior.
Added
/api/v1/prune for external automated script-based integrations
Deprecated / Changed
Removed
Fixed
Security
Additional Information
🎯 ADDRESSES COMMUNITY PAIN POINTS
This PR addresses years of community feedback about:
Screenshots or Videos
Admin Panel - Database Section
Admin Panel - Prune Modal
Admin Panel - Prune Modal Docs
Admin Panel - Prune Modal Config
Admin Panel - Inactive User Management Tab
Dry-Run Preview Modal
Admin Panel - Prune Modal API helper
Shows the API call, fully configured according to the selections and settings you set in the configurator above.
Useful for external pruning automation.
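For external automation, a script could build the call roughly as follows. This is a minimal sketch: only the /api/v1/prune path comes from this PR. The base URL, the API key, the Bearer-token header, and the option names `dry_run` and `days` are placeholders or assumptions, and the real payload should be copied from the API helper in the prune modal.

```python
import json
import urllib.request


def build_prune_request(base_url, api_key, options):
    """Build (but do not send) a POST request to the prune endpoint."""
    body = json.dumps(options).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/prune",
        data=body,
        method="POST",
        headers={
            # Assumption: the instance accepts an admin API key as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical option names; take the real ones from the API helper.
request = build_prune_request(
    "http://localhost:8080",  # placeholder instance URL
    "sk-example",             # placeholder API key
    {"dry_run": True, "days": 60},
)

# To actually send the request:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the script easy to inspect before pointing it at a live instance.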
API Helper with Advanced Comments
Example:
Confirmation of prune success
Info Level Logging
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
User Feedback Tracking
Thanks for your feedback and for testing the PR. This section of the PR description will be continuously updated to keep track of the last remaining points.
Feature Wishes / To Do
Attempt to simplify the implementation with the .delete command. This needs investigation into whether UUID matching still works: ChromaDB, the files themselves, and the file handles in Open WebUI's database each have different UUIDs, so complex cross-matching is required to make it work in the first place.
The amount of tinkering needed to fully clean up ChromaDB does not allow for this to be easy, lol.
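To make the cross-matching problem concrete, here is a simplified sketch. It is not the logic in prune.py, and the data shapes are assumptions. It supposes a mapping from file IDs to vector collection IDs can be built from the database, and that the collections present in storage can be listed. The names `find_orphaned_collections`, `file_to_collection`, `live_file_ids`, and `collections_on_disk` are hypothetical.

```python
def find_orphaned_collections(file_to_collection, live_file_ids, collections_on_disk):
    """Return collection IDs in storage that no live file refers to.

    file_to_collection:  dict mapping file ID -> collection ID (assumed shape)
    live_file_ids:       set of file IDs that still exist in the database
    collections_on_disk: iterable of collection IDs found in vector storage
    """
    # Collections still referenced by a file that exists in the database.
    referenced = {
        collection_id
        for file_id, collection_id in file_to_collection.items()
        if file_id in live_file_ids
    }
    # Everything in storage that is not referenced is a cleanup candidate.
    return sorted(set(collections_on_disk) - referenced)


orphans = find_orphaned_collections(
    file_to_collection={"file-a": "col-1", "file-b": "col-2"},
    live_file_ids={"file-a"},
    collections_on_disk=["col-1", "col-2", "col-3"],
)
print(orphans)  # ['col-2', 'col-3']
```

Because the IDs differ between layers, the mapping step is the hard part in practice. The set difference itself is simple once that mapping exists.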
Tested by
Vector Database Integration Status:
Major Breakthroughs Achieved:
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.