mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-08 04:16:03 -05:00
[GH-ISSUE #14260] issue: RAG file upload fails due to configuration conflict after switching to Docling processor #17194
Originally created by @eowensai on GitHub (May 23, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14260
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.6.10
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
Chrome 136.0
Confirmation
Expected Behavior
When the RAG_ALLOWED_FILE_EXTENSIONS environment variable is set, the application should honor it, allowing file uploads of the specified types. A change in the UI from the default document processor to docling should not result in a permanently broken state for file extension validation. The setting should be configurable via the UI or reliably overridden by an environment variable.
Actual Behavior
The application fails all RAG file uploads with a 400: [ERROR: File type ... is not allowed] error.
The core of the issue is that the application enters a locked state where it loads a stale/misformatted value from its database, ignoring any environment variables. The container's own startup log confirms this behavior, printing the line: INFO [open_webui.env] 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry.
This makes it impossible to configure the allowed file types after the initial switch to docling without completely deleting the persistent data volume.
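The precedence described above can be illustrated with a minimal sketch. This is not Open WebUI's actual code: `resolve_setting`, the `db_values` dictionary, and the placeholder stored value are all hypothetical, and the sketch only assumes that a persisted database entry is consulted before the environment, which is what the startup log line suggests.

```python
import os


def resolve_setting(name, db_values, default, env=os.environ):
    """Hypothetical resolver: a persisted database entry wins over the environment.

    Returns a (value, source) pair so the caller can log where the value came from.
    """
    if name in db_values:
        return db_values[name], "database"
    if name in env:
        return env[name], "environment"
    return default, "default"


# Hypothetical state: an earlier run persisted a value, then the container
# was recreated with the environment variable set.
db_values = {"RAG_ALLOWED_FILE_EXTENSIONS": "stale-value"}
env = {"RAG_ALLOWED_FILE_EXTENSIONS": ".pdf,.docx,.txt"}

value, source = resolve_setting("RAG_ALLOWED_FILE_EXTENSIONS", db_values, "", env)
print(f"{value!r} loaded from {source}")
```

Under this assumed ordering the environment variable is only read when no database entry exists, which would match the observation that deleting the data volume is the only way to make a new value take effect.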
Steps to Reproduce
docker run -d -p 3001:8080 -v openwebui-data:/app/backend/data --name openwebui-test -e WEBUI_AUTH='false' ghcr.io/open-webui/open-webui:main
docker stop openwebui-test
docker rm openwebui-test
docker run -d -p 3001:8080 -v openwebui-data:/app/backend/data --name openwebui-test -e WEBUI_AUTH='false' -e RAG_ALLOWED_FILE_EXTENSIONS='.pdf,.docx,.txt' ghcr.io/open-webui/open-webui:main
Logs & Screenshots
There are three key pieces of log evidence that demonstrate the issue:
1. Container Environment Confirmation
This printenv command, run inside the container, proves the RAG_ALLOWED_FILE_EXTENSIONS environment variable was correctly set and available to the application.
$ docker exec -it openwebui-latest /bin/bash
# printenv | grep RAG_
"RAG_ALLOWED_FILE_EXTENSIONS=.pdf,.docx,.pptx,.txt,.md,.xlsx"
2. Container Startup Log
Despite the environment variable being set, this log from the container's startup sequence shows the application explicitly ignoring it and loading the value from the database instead.
"INFO [open_webui.env] 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry"
3. File Upload Error Log
This is the resulting traceback when attempting to upload a file, which occurs because the incorrect value from the database is being used for validation.
ERROR | open_webui.routers.files:upload_file:189 - 400: [ERROR: File type .docx is not allowed] - {}
Traceback (most recent call last):
  File "/app/backend/open_webui/routers/files.py", line 105, in upload_file
    raise HTTPException(...)
Additional Information
No response
@tjbck commented on GitHub (May 23, 2025):
Should already be addressed in dev and the format should be pdf,txt,docx.
@eowensai commented on GitHub (May 23, 2025):
To provide full context, my work was on the :main image. It is possible the fix in :dev addresses this, but sharing more detail.
I tested the format you suggested (pdf,txt,docx), both with and without spaces between the entries, and found it did not resolve the error. The conclusion that leading dots were required was based on an inspection of the validation logic in /app/backend/open_webui/routers/files.py:
file_extension = os.path.splitext(filename)[1]
...
if file_extension not in request.app.state.config.ALLOWED_FILE_EXTENSIONS:
    raise HTTPException(...)
As os.path.splitext(filename)[1] returns the extension with the leading dot, the logic suggested the items in the ALLOWED_FILE_EXTENSIONS list must also contain the dot for the in check to succeed.
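The reasoning about the leading dot can be checked with a small standalone sketch. `is_allowed` mirrors only the membership check quoted above, and `normalize` is a hypothetical helper (not something taken from the Open WebUI code base) showing one way a dot-less list could be made to match.

```python
import os


def is_allowed(filename, allowed):
    """Same shape as the quoted check: compare splitext's result to the list."""
    file_extension = os.path.splitext(filename)[1]  # keeps the leading dot
    return file_extension in allowed


def normalize(entries):
    """Hypothetical helper: trim, lower-case, and ensure a leading dot."""
    result = []
    for entry in entries:
        entry = entry.strip().lower()
        if not entry:
            continue  # skip empty items such as those from a trailing comma
        result.append(entry if entry.startswith(".") else "." + entry)
    return result


dotless = "pdf, txt, docx".split(",")

print(os.path.splitext("report.docx")[1])             # .docx
print(is_allowed("report.docx", dotless))             # False
print(is_allowed("report.docx", normalize(dotless)))  # True
```

With the check written this way, a list in the pdf,txt,docx format can never match, so either the list entries need the dot or the extension needs it stripped before comparison.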
The primary issue, however, was that the application consistently ignored the environment variable after the first run. The startup logs showed 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry, which overrode any value set in the Docker command or saved in the UI, regardless of format.