[GH-ISSUE #14260] issue: RAG file upload fails due to configuration conflict after switching to Docling processor #55860

Closed
opened 2026-05-05 18:10:46 -05:00 by GiteaMirror · 2 comments

Originally created by @eowensai on GitHub (May 23, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14260

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.10

Ollama Version (if applicable)

No response

Operating System

Ubuntu 24.04

Browser (if applicable)

Chrome 136.0

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When the RAG_ALLOWED_FILE_EXTENSIONS environment variable is set, the application should honor it, allowing file uploads of the specified types. A change in the UI from the default document processor to docling should not result in a permanently broken state for file extension validation. The setting should be configurable via the UI or reliably overridden by an environment variable.

Actual Behavior

The application fails all RAG file uploads with a 400: [ERROR: File type ... is not allowed] error.

The core of the issue is that the application enters a locked state where it loads a stale/misformatted value from its database, ignoring any environment variables. The container's own startup log confirms this behavior, printing the line: INFO [open_webui.env] 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry.

This makes it impossible to configure the allowed file types after the initial switch to docling without completely deleting the persistent data volume.
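The precedence described above can be modeled with a short sketch. This is a hypothetical illustration of database-first resolution, not Open WebUI's actual code; the function name `resolve_setting`, the in-memory `db` mapping, and the `<stale value>` placeholder are all assumptions made for the example.

```python
from typing import Mapping, Optional, Tuple


def resolve_setting(
    key: str,
    env: Mapping[str, str],
    db: Mapping[str, str],
    default: Optional[str] = None,
) -> Tuple[Optional[str], str]:
    """Return (value, source) using database-first precedence.

    Hypothetical model of the behavior described in this report; it is
    not Open WebUI's implementation.
    """
    if key in db:
        # A persisted entry wins, even if the environment variable is set.
        return db[key], "database"
    if key in env:
        return env[key], "environment"
    return default, "default"


KEY = "RAG_ALLOWED_FILE_EXTENSIONS"
env = {KEY: ".pdf,.docx,.txt"}

# First run: nothing is persisted yet, so the environment value is used.
first_run = resolve_setting(KEY, env, db={})

# Later run: a value saved earlier shadows the environment variable.
# The stored value is a placeholder, not the actual database content.
later_run = resolve_setting(KEY, env, db={KEY: "<stale value>"})

print(first_run)
print(later_run)
```

Under this model, changing the environment variable after the first save has no effect, which matches the symptom reported here.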

Steps to Reproduce

  1. Start the Open WebUI container with a clean data volume.
    docker run -d -p 3001:8080 -v openwebui-data:/app/backend/data --name openwebui-test -e WEBUI_AUTH='false' ghcr.io/open-webui/open-webui:main
  2. In the Admin Panel -> Documents settings, switch the "Content Extraction Engine" from default to docling and provide the Docling URL. Save changes.
  3. Attempt to upload any document (e.g., a .pdf or .docx). The upload will fail with the 400: File type is not allowed error.
  4. Stop and remove the container. Relaunch it, this time providing the correctly formatted RAG_ALLOWED_FILE_EXTENSIONS variable.
    docker stop openwebui-test
    docker rm openwebui-test
    docker run -d -p 3001:8080 -v openwebui-data:/app/backend/data --name openwebui-test -e WEBUI_AUTH='false' -e RAG_ALLOWED_FILE_EXTENSIONS='.pdf,.docx,.txt' ghcr.io/open-webui/open-webui:main
  5. Attempt to upload a .docx file again. The action will fail with the same 400 error because the application is now loading the stale, misformatted value from the database it created in step 2.

Logs & Screenshots

There are three key pieces of log evidence that demonstrate the issue:

1. Container Environment Confirmation

This printenv command, run inside the container, proves the RAG_ALLOWED_FILE_EXTENSIONS environment variable was correctly set and available to the application.

$ docker exec -it openwebui-latest /bin/bash
# printenv | grep RAG_

"RAG_ALLOWED_FILE_EXTENSIONS=.pdf,.docx,.pptx,.txt,.md,.xlsx"

2. Container Startup Log

Despite the environment variable being set, this log from the container's startup sequence shows the application explicitly ignoring it and loading the value from the database instead.

"INFO [open_webui.env] 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry"

3. File Upload Error Log

This is the resulting traceback when attempting to upload a file, which occurs because the incorrect value from the database is being used for validation.

ERROR | open_webui.routers.files:upload_file:189 - 400: [ERROR: File type .docx is not allowed] - {}
Traceback (most recent call last):
  File "/app/backend/open_webui/routers/files.py", line 105, in upload_file
    raise HTTPException(...)

Additional Information

No response

GiteaMirror added the bug label 2026-05-05 18:10:46 -05:00

@tjbck commented on GitHub (May 23, 2025):

Should already be addressed in dev and the format should be pdf,txt,docx.


@eowensai commented on GitHub (May 23, 2025):

To provide full context: my testing was on the :main image. The fix in :dev may address this, but I am sharing more detail.

I tested the format you suggested (pdf,txt,docx), both with and without spaces between the entries, and it did not resolve the error. I concluded that leading dots were required after inspecting the validation logic in /app/backend/open_webui/routers/files.py:

file_extension = os.path.splitext(filename)[1]

# ...

if file_extension not in request.app.state.config.ALLOWED_FILE_EXTENSIONS:
    raise HTTPException(...)

As os.path.splitext(filename)[1] returns the extension with the leading dot, the logic suggested the items in the ALLOWED_FILE_EXTENSIONS list must also contain the dot for the in check to succeed.
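A minimal sketch of that mismatch, using only the standard library. It assumes the allowed list is a plain comma split of the configured string, which is an assumption about how the value is parsed and not something confirmed from the application code; the filename is made up for the example.

```python
import os

filename = "report.docx"  # example filename, not taken from the logs
file_extension = os.path.splitext(filename)[1]

# Two candidate formats for the allowed list, split on commas
# (assumed parsing, for illustration only).
with_dots = ".pdf,.docx,.txt".split(",")
without_dots = "pdf,txt,docx".split(",")

print(repr(file_extension))            # '.docx'
print(file_extension in with_dots)     # True
print(file_extension in without_dots)  # False
```

With a check of this shape, a dotless entry such as `docx` never equals `.docx`, so the upload is rejected unless either the list or the extracted extension is normalized first.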

The primary issue, however, was that the application consistently ignored the environment variable after the first run. The startup logs showed 'RAG_ALLOWED_FILE_EXTENSIONS' loaded from the latest database entry, which overrode any value set in the Docker command or saved in the UI, regardless of format.
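To see which settings are being read from the database on a given start, the startup log can be scanned for the message quoted above. This is a rough sketch: the regular expression is built only from the single log line in this report, and the helper name `keys_loaded_from_db` is hypothetical.

```python
import re
from typing import List

# Matches lines of the form quoted in this report, e.g.
# INFO [open_webui.env] 'SOME_KEY' loaded from the latest database entry
DB_LOADED = re.compile(r"'([A-Z0-9_]+)' loaded from the latest database entry")


def keys_loaded_from_db(log_text: str) -> List[str]:
    """Return the config keys reported as loaded from the database."""
    return DB_LOADED.findall(log_text)


sample = (
    "INFO [open_webui.env] 'RAG_ALLOWED_FILE_EXTENSIONS' "
    "loaded from the latest database entry"
)
print(keys_loaded_from_db(sample))
```

The log text could come from saving the output of `docker logs` for the container to a file and passing its contents to the function.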

Reference: github-starred/open-webui#55860