[GH-ISSUE #5946] pt_main_thread process memory consumption monotonically increasing #14182

Closed
opened 2026-04-19 20:37:49 -05:00 by GiteaMirror · 0 comments

Originally created by @Elettrotecnica on GitHub (Oct 6, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/5946

Bug Report

Installation Method

Docker installation using the CUDA image, e.g. docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda

Environment

  • Open WebUI Version: v0.3.30

  • Operating System: Debian Linux 12

Bug Summary:

pt_main_thread never seems to release memory. This is particularly problematic when loading many files for RAG. I tried to import the codebase of a tool I would like models to use for RAG, roughly 3000 files. During the load, the resident memory of the pt_main_thread process appears to increase monotonically and is never released.

Even if the load finishes successfully, depending on how large the process has grown, one may be forced to restart the container afterwards to reclaim resources.

Reproduction Details

Steps to Reproduce:

  1. Import a folder containing "many" files, such as a sizeable codebase. The folder I was trying to import was the extracted archive from https://openacs.org/projects/openacs/download/download/openacs-full-5.10.1.tar.gz?revision_id=6172827.

  2. Observe the memory footprint of the pt_main_thread process.
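
For step 2, one possible way to record the growth is a small sampler that reads resident memory from /proc. This is only a sketch, not part of Open WebUI: it assumes a Linux host (or a shell inside the container), assumes the process shows up with the name pt_main_thread in /proc/<pid>/comm, and the helper names, sample count, and interval are made up for illustration.

```python
import os
import time


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def find_pids(name):
    """Return PIDs whose /proc/<pid>/comm equals name (empty list without /proc)."""
    if not os.path.isdir("/proc"):
        return []
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return pids


def is_non_decreasing(samples):
    """True if no sample is smaller than the one before it."""
    return all(b >= a for a, b in zip(samples, samples[1:]))


def sample_rss(pid, count=10, interval=5.0):
    """Collect up to `count` VmRSS readings (KiB), `interval` seconds apart."""
    samples = []
    for _ in range(count):
        try:
            with open(f"/proc/{pid}/status") as f:
                rss = parse_vmrss_kib(f.read())
        except OSError:
            break  # process went away while sampling
        if rss is not None:
            samples.append(rss)
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Assumed process name, taken from the report; adjust if it differs.
    for pid in find_pids("pt_main_thread"):
        readings = sample_rss(pid)
        print(pid, readings, "non-decreasing:", is_non_decreasing(readings))
```

Running it while the import is in progress and again after it has finished should show whether resident memory only ever grows and whether any of it is returned once indexing is done.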

Note that I have experienced this behavior using both CPU and GPU for indexing. In my experience, using the GPU makes memory grow faster (I ran out of memory before the import could finish).

Many thanks in advance for any feedback on this and kudos for this great tool!


Reference: github-starred/open-webui#14182