issue: File Upload Issues #4491

Closed
opened 2025-11-11 15:55:21 -06:00 by GiteaMirror · 2 comments

Originally created by @jeannotdamoiseaux on GitHub (Mar 19, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.5.20

Ollama Version (if applicable)

No response

Operating System

Ubuntu 22.04

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

Issue 1 (PDF Upload Failure)

  • The user interface should provide clear and informative feedback, such as:
    • "PDF format unsupported due to image data issues."
    • "Could not process this document: invalid data format detected."
  • Server logs should include detailed diagnostics, such as:
    • Specific problematic file components and processing stages.
    • Metadata about the file (e.g., type, size, affected pages).
  • Ideally, the file should upload successfully without any issues.
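The diagnostics requested above could look roughly like the following. This is a minimal sketch, not Open WebUI's actual code: `load_with_diagnostics`, `DocumentProcessingError`, the logger name, and the `stage` label are all hypothetical names introduced for illustration, and `loader` stands in for whatever parser the server calls.

```python
import logging
import os

logger = logging.getLogger("upload_diagnostics")  # hypothetical logger name


class DocumentProcessingError(Exception):
    """Carries a user-facing message alongside the original cause."""

    def __init__(self, user_message, stage):
        super().__init__(user_message)
        self.user_message = user_message
        self.stage = stage


def load_with_diagnostics(path, content_type, loader, stage="image extraction"):
    """Run loader(path); on failure, log file metadata and raise a clear error.

    `loader` is any callable taking a path; it is a stand-in for the real parser.
    """
    try:
        return loader(path)
    except Exception as exc:
        # Collect the metadata the issue asks for: name, type, size, stage.
        size = os.path.getsize(path) if os.path.exists(path) else None
        logger.error(
            "Processing failed: file=%s type=%s size=%s stage=%s cause=%s: %s",
            os.path.basename(path), content_type, size, stage,
            type(exc).__name__, exc,
        )
        # Keep the original exception chained so the traceback is not lost.
        raise DocumentProcessingError(
            "Could not process this document: invalid data format detected.",
            stage,
        ) from exc
```

The user interface would then show `user_message`, while the log line keeps the low-level cause for administrators.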

Issue 2 (File Disappearance After Timeout)

  • File uploads should complete successfully without timing out, regardless of size or type.
  • If a timeout occurs, both the user interface and server logs should clearly indicate:
    • A user-facing error message, e.g., "Upload failed: server timeout."
    • Detailed server logs explaining the cause, including file metadata and diagnostics.
  • Silent failures should be avoided through robust error handling, with logs and fallback mechanisms preserving system functionality.

Actual Behavior

Issue 1 (PDF Upload Failure)

  • The user interface displays vague error messages and fails to explain the reason for the upload failure.
  • Server logs produce unhelpful messages, such as:

Skipping data after last boundary
Cannot handle this data type: (1, 1, 1), |u1

  • Even after disabling OCR or validating the PDF file, uploads still fail, suggesting issues with image processing (e.g., unsupported data types in Pillow).
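For context on the second message: the traceback further down shows Pillow looking up the key `((1, 1, 1), '|u1')` in its `_fromarray_typemap` table and failing. That key is consistent with an 8-bit image array of shape `(height, width, 1)`, that is, a grayscale image with an explicit single-channel axis, although the issue does not confirm the actual array shape. The sketch below imitates the lookup in plain Python; the table is a simplified stand-in, and `type_key`, `lookup_mode`, and `drop_singleton_channel` are hypothetical helper names, not Pillow functions.

```python
# Simplified stand-in for Pillow's lookup table: two entries visible in the
# traceback plus the common RGB/RGBA cases. The real table is larger.
FROMARRAY_TYPEMAP = {
    ((1, 1), "|b1"): ("1", "1;8"),
    ((1, 1), "|u1"): ("L", "L"),
    ((1, 1, 3), "|u1"): ("RGB", "RGB"),
    ((1, 1, 4), "|u1"): ("RGBA", "RGBA"),
}


def type_key(shape, typestr):
    """Build the lookup key: (1, 1) plus any dimensions beyond height and width."""
    return (1, 1) + tuple(shape[2:]), typestr


def lookup_mode(shape, typestr):
    """Return the image mode for an array shape, or raise like the logged error."""
    key = type_key(shape, typestr)
    try:
        return FROMARRAY_TYPEMAP[key][0]
    except KeyError as exc:
        raise TypeError(f"Cannot handle this data type: {key[0]}, {key[1]}") from exc


def drop_singleton_channel(shape):
    """Return the shape with a trailing channel axis of length 1 removed."""
    if len(shape) == 3 and shape[2] == 1:
        return tuple(shape[:2])
    return tuple(shape)
```

Under this reading, `lookup_mode((64, 48, 1), "|u1")` raises the same message as in the logs, while the shape returned by `drop_singleton_channel` resolves to mode `L`. With a NumPy array the equivalent step would be indexing away the last axis (`arr[:, :, 0]`) before the array reaches `Image.fromarray`.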

Issue 2 (File Disappearance After Timeout)

  • Certain file uploads enter a loading state that persists for up to 1 minute before disappearing entirely.
  • The browser console logs capture errors such as:

POST url/api/v1/files/ 504 (Gateway Timeout)
SyntaxError: Unexpected token '<', "<html><h"… is not valid JSON

  • The server logs fail to capture any information about the timeout, leaving administrators without insight into the issue.
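The two console errors are probably related: a 504 is normally produced by a gateway or reverse proxy in front of the application, so its body is likely an HTML error page, and parsing that as JSON fails on the leading `<`. That would also explain why the application's own logs are silent, though the issue does not describe the proxy setup. The sketch below shows the defensive parsing idea in Python; `parse_upload_response` is a hypothetical helper, and the `detail` field is assumed from the FastAPI convention for error bodies.

```python
import json


def parse_upload_response(status, content_type, body):
    """Return (ok, payload_or_message) without assuming the body is JSON."""
    # Only attempt JSON decoding when the server says the body is JSON.
    is_json = content_type.split(";")[0].strip().lower() == "application/json"
    payload = None
    if is_json:
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            payload = None

    if 200 <= status < 300 and payload is not None:
        return True, payload
    if status == 504:
        return False, "Upload failed: server timeout."
    # "detail" is an assumed error field, not confirmed by the issue.
    if isinstance(payload, dict) and "detail" in payload:
        return False, f"Upload failed: {payload['detail']}"
    return False, f"Upload failed: unexpected response (HTTP {status})."
```

With this shape of handling, an HTML 504 page yields the message "Upload failed: server timeout." instead of a `SyntaxError` and a silently vanishing file.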

Steps to Reproduce

Issue 1 (PDF Upload Failure)

  1. Navigate to the document upload interface of Open WebUI v0.5.20.
  2. Attempt to upload a PDF file.
  3. Observe the upload failure:
    • The user interface provides vague error messages.
    • Server logs show errors like "Cannot handle this data type: (1, 1, 1), |u1".
    • Disabling OCR or validating the PDF does not resolve the issue.

Issue 2 (File Disappearance After Timeout)

  1. Navigate to the document upload interface of Open WebUI v0.5.20.
  2. Attempt to upload a file.
  3. Observe the continuous loading state for up to 1 minute.
  4. The file disappears, and the browser console logs errors:

POST url/api/v1/files/ 504 (Gateway Timeout)
SyntaxError: Unexpected token '<', "<html><h"… is not valid JSON

  5. Check the server logs and note the absence of relevant information.

Logs & Screenshots

app-open-webui-1 | 2025-03-19 13:11:46.637 | WARNING | python_multipart.multipart:_internal_write:1401 - Skipping data after last boundary - {}
app-open-webui-1 | 2025-03-19 13:11:46.658 | INFO | open_webui.routers.files:upload_file:42 - file.content_type: application/pdf - {}
app-open-webui-1 | 2025-03-19 13:12:05.187 | ERROR | open_webui.routers.retrieval:process_file:1078 - Cannot handle this data type: (1, 1, 1), |u1 - {}
app-open-webui-1 | Traceback (most recent call last):
app-open-webui-1 |
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/PIL/Image.py", line 3315, in fromarray
app-open-webui-1 | mode, rawmode = _fromarray_typemap[typekey]
app-open-webui-1 | │ │ └ ((1, 1, 1), '|u1')
app-open-webui-1 | │ └ {((1, 1), '|b1'): ('1', '1;8'), ((1, 1), '|u1'): ('L', 'L'), ((1, 1), '|i1'): ('I', 'I;8'), ((1, 1), '<u2'): ('I', 'I;16'), (...)
app-open-webui-1 | └ None
app-open-webui-1 |
app-open-webui-1 | KeyError: ((1, 1, 1), '|u1')
app-open-webui-1 |
app-open-webui-1 |
app-open-webui-1 | The above exception was the direct cause of the following exception:
app-open-webui-1 |
app-open-webui-1 |
app-open-webui-1 | Traceback (most recent call last):
app-open-webui-1 |
app-open-webui-1 | File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
app-open-webui-1 | self._bootstrap_inner()
app-open-webui-1 | │ └ <function Thread._bootstrap_inner at 0x7cea9d3b4860>
app-open-webui-1 | └ <WorkerThread(AnyIO worker thread, started 137345235945152)>
app-open-webui-1 | File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
app-open-webui-1 | self.run()
app-open-webui-1 | │ └ <function WorkerThread.run at 0x7cea642de2a0>
app-open-webui-1 | └ <WorkerThread(AnyIO worker thread, started 137345235945152)>
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
app-open-webui-1 | result = context.run(func, *args)
app-open-webui-1 | │ │ │ └ ()
app-open-webui-1 | │ │ └ functools.partial(<function upload_file at 0x7cea7148a980>, user=UserModel(id='[ANONYMIZED_USER_ID]', name='[ANONYMIZED_NAME]', email='[redacted-email@domain.tld]', role='...))
app-open-webui-1 | │ └ <method 'run' of '_contextvars.Context' objects>
app-open-webui-1 | └ <_contextvars.Context object at 0x7cea63be78c0>
app-open-webui-1 |
app-open-webui-1 | File "/app/backend/open_webui/routers/files.py", line 85, in upload_file
app-open-webui-1 | process_file(request, ProcessFileForm(file_id=id), user=user)
app-open-webui-1 | │ │ │ │ └ UserModel(id='[ANONYMIZED_USER_ID]', name='[ANONYMIZED_NAME]', email='[redacted-email@domain.tld]', role='...))
app-open-webui-1 | │ │ │ └ '[UUID_REDACTED]'
app-open-webui-1 | │ │ └ <class 'open_webui.routers.retrieval.ProcessFileForm'>
app-open-webui-1 | │ └ <starlette.requests.Request object at 0x7cea629dd8d0>
app-open-webui-1 | └ <function process_file at 0x7cea6705e7a0>
app-open-webui-1 |
app-open-webui-1 | File "/app/backend/open_webui/routers/retrieval.py", line 997, in process_file
app-open-webui-1 | docs = loader.load(
app-open-webui-1 | │ └ <function Loader.load at 0x7cea67f2f7e0>
app-open-webui-1 | └ <open_webui.retrieval.loaders.main.Loader object at 0x7cea62b4b1d0>
app-open-webui-1 |
app-open-webui-1 | File "/app/backend/open_webui/retrieval/loaders/main.py", line 129, in load
app-open-webui-1 | docs = loader.load()
app-open-webui-1 | │ └ <function BaseLoader.load at 0x7cea6d110ea0>
app-open-webui-1 | └ <langchain_community.document_loaders.pdf.PyPDFLoader object at 0x7cea62bc5d50>
app-open-webui-1 |
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/langchain_core/document_loaders/base.py", line 32, in load
app-open-webui-1 | return list(self.lazy_load())
app-open-webui-1 | │ └ <function PyPDFLoader.lazy_load at 0x7cea6c06f560>
app-open-webui-1 | └ <langchain_community.document_loaders.pdf.PyPDFLoader object at 0x7cea62bc5d50>
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/pdf.py", line 307, in lazy_load
app-open-webui-1 | yield from self.parser.lazy_parse(blob)
app-open-webui-1 | │ │ │ └ Blob [BLOB_REDACTED] /app/backend/data/uploads/[REDACTED_FILENAME.pdf]
app-open-webui-1 | │ │ └ <function PyPDFParser.lazy_parse at 0x7cea6c06e020>
app-open-webui-1 | │ └ <langchain_community.document_loaders.parsers.pdf.PyPDFParser object at 0x7cea63d8b9d0>
app-open-webui-1 | └ <langchain_community.document_loaders.pdf.PyPDFLoader object at 0x7cea62bc5d50>
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/parsers/pdf.py", line 397, in lazy_parse
app-open-webui-1 | images_from_page = self.extract_images_from_page(page)
app-open-webui-1 | │ │ └ {'/Tabs': '/S', '/Group': {'/S': '/Transparency', '/Type': '/Group', '/CS': '/DeviceRGB'}, '/Contents': IndirectObject(77, 0,...}
app-open-webui-1 | │ └ <function PyPDFParser.extract_images_from_page at 0x7cea6c06e0c0>
app-open-webui-1 | └ <langchain_community.document_loaders.parsers.pdf.PyPDFParser object at 0x7cea63d8b9d0>
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/langchain_community/document_loaders/parsers/pdf.py", line 454, in extract_images_from_page
app-open-webui-1 | Image.fromarray(np_image).save(image_bytes, format="PNG")
app-open-webui-1 | │ │ │ └ <_io.BytesIO object at 0x7cea6c0407c0>
app-open-webui-1 | │ │ └ [NUMPY_ARRAY_REDACTED]
app-open-webui-1 | │ └ <function fromarray at 0x7cea662d9260>
app-open-webui-1 | └ <module 'PIL.Image' from '/usr/local/lib/python3.11/site-packages/PIL/Image.py'>
app-open-webui-1 | File "/usr/local/lib/python3.11/site-packages/PIL/Image.py", line 3319, in fromarray
app-open-webui-1 | raise TypeError(msg) from e
app-open-webui-1 | └ 'Cannot handle this data type: (1, 1, 1), |u1'
app-open-webui-1 |
app-open-webui-1 | TypeError: Cannot handle this data type: (1, 1, 1), |u1
app-open-webui-1 | 2025-03-19 13:12:05.197 | ERROR | open_webui.routers.files:upload_file:89 - 400: Cannot handle this data type: (1, 1, 1), |u1 - {}
app-open-webui-1 | Traceback (most recent call last):

Additional Information

Summary

PDF Upload Failure

  • PDF uploads fail with unclear error messages that provide no useful information to users or administrators.
  • Server logs contain vague errors like "Cannot handle this data type: (1, 1, 1), |u1", making troubleshooting difficult.

File Disappearance After Timeout

  • Certain file uploads remain in a loading state for up to 1 minute before disappearing.
  • No meaningful error feedback is displayed to the user.
  • The server logs remain silent, leaving administrators with no diagnostics or details to address the issue.
GiteaMirror added the bug label 2025-11-11 15:55:21 -06:00

@wertrigone commented on GitHub (Mar 19, 2025):

use model

Image: https://github.com/user-attachments/assets/50f4be51-3b4b-4a85-bc78-eee272611bcb

@jeannotdamoiseaux commented on GitHub (Mar 19, 2025):

> use model

@wertrigone, could you clarify what you mean by "use model"? It's not entirely clear to me.


Reference: github-starred/open-webui#4491