[GH-ISSUE #14670] issue: Azure Document Intelligence can crash Open WebUI #17330

Closed
opened 2026-04-19 23:03:53 -05:00 by GiteaMirror · 23 comments
Owner

Originally created by @jimbo-p on GitHub (Jun 4, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14670

Originally assigned to: @tjbck on GitHub.

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.13

Ollama Version (if applicable)

No response

Operating System

Windows 11

Browser (if applicable)

Firefox

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      • Start with the initial platform/version/OS and dependencies used,
      • Specify exact install/launch/configure commands,
      • List URLs visited, user input (incl. example values/emails/passwords if needed),
      • Describe all options and toggles enabled or changed,
      • Include any files or environmental changes,
      • Identify the expected and actual result at each stage,
      • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

I drag and drop a larger PDF into Open WebUI, where "larger" means a document that may take more than one minute to OCR (i.e., > 5-10 MB). Open WebUI uses Azure Document Intelligence to OCR that PDF in preparation for the RAG workflow.

Actual Behavior

I drag and drop a larger PDF into Open WebUI, which uses Azure Document Intelligence to OCR it. If the OCR takes more than ~45 seconds to complete, Open WebUI crashes entirely.

After ~45 seconds, the document disappears from the chat box and the Open WebUI interface becomes unresponsive.

Steps to Reproduce

  1. Set Content Extraction Engine to use 'Document Intelligence'.
  2. Set corresponding Azure Doc Intelligence URL / API key.
  3. Drag and drop or upload a larger PDF into the chat box. Documents over 25 MB will almost always take long enough to trigger the crash.
  4. Wait ~50-60 seconds. The document should disappear from the chat box and Open WebUI will crash without reporting an error.

Logs & Screenshots

Cloudwatch Logs

2025-06-04T18:30:55.538Z
INFO | Request URL: https://opstech-form-recognizer.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=REDACTED&outputContentFormat=REDACTED
Request method: POST
Request headers:
content-type: application/octet-stream
Accept: application/json
x-ms-client-request-id: 0d667c42-4172-11f0-bfc1-0a58a9feac02
x-ms-useragent: REDACTED
User-Agent: azsdk-python-ai-documentintelligence/1.0.0 Python/3.11.12 (Linux-5.10.235-227.919.amzn2.x86_64-x86_64-with-glibc2.36)
Ocp-Apim-Subscription-Key: REDACTED
A body is sent with the request.

DEBUG | Starting new HTTPS connection: opstech-form-recognizer.cognitiveservices.azure.com:443

2025-06-04T18:30:57.757Z
DEBUG | HTTPS POST to /documentintelligence/documentModels/prebuilt-layout:analyze?...
Response: 202

2025-06-04T18:30:57.763Z
INFO | Response status: 202
Response headers:
Date: Wed, 04 Jun 2025 18:30:57 GMT
Content-Length: 0
Connection: keep-alive
Operation-Location: REDACTED
x-envoy-upstream-service-time: REDACTED
apim-request-id: REDACTED
Strict-Transport-Security: REDACTED
x-content-type-options: REDACTED
x-ms-region: REDACTED

2025-06-04T18:30:57.764Z
INFO | Request URL: https://opstech-form-recognizer.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout/analyzeResults/b1418de2-256b-445e-8876-6a963abdfb0f?api-version=REDACTED
Request method: GET
Request headers:
x-ms-client-request-id: 0d667c42-4172-11f0-bfc1-0a58a9feac02
x-ms-useragent: REDACTED
User-Agent: azsdk-python-ai-documentintelligence/1.0.0 Python/3.11.12 (Linux-5.10.235-227.919.amzn2.x86_64-x86_64-with-glibc2.36)
Ocp-Apim-Subscription-Key: REDACTED
No body was attached to the request - {}

....

2025-06-04T18:31:39.116Z
Ocp-Apim-Subscription-Key: REDACTED
No body was attached to the request - {}

2025-06-04T18:31:41.495Z
DEBUG | HTTPS GET to /documentintelligence/documentModels/prebuilt-layout/analyzeResults/b1418de2-256b-445e-8876-6a963abdfb0f?...
Response: 200

2025-06-04T18:31:46.242Z
INFO | Response status: 200
Response headers:
Date: Wed, 04 Jun 2025 18:31:41 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 57325967
Connection: keep-alive
x-envoy-upstream-service-time: REDACTED
apim-request-id: REDACTED
Strict-Transport-Security: REDACTED
x-content-type-options: REDACTED
x-ms-region: REDACTED

Browser Logs:
drop { target: p.is-empty.is-editor-empty, buttons: 0, clientX: 678, clientY: 229, layerX: 156, layerY: 18 }
CBWlh6ps.js:15:56554
Array [ File ]
CBWlh6ps.js:15:56687
Input files handler called with:
Array [ File ]
CBWlh6ps.js:15:55174
Processing file:
Object { name: "All Fluid Levels in Sample.pdf", type: "application/pdf", size: 24858908, extension: "pdf" }
CBWlh6ps.js:15:55258
Object { type: "file", file: "", id: null, url: "", name: "All Fluid Levels in Sample.pdf", collection_name: "", status: "uploading", size: 24858908, error: "", itemId: "30b37fea-08a9-456a-9d3c-98947fb6568a" }
DlDIEvos.js:7:7076
XHR POST
http://internal-oxygpt-alb-dev-1998233518.us-east-1.elb.amazonaws.com/api/v1/files/
[HTTP/1.1 504 Gateway Time-out 69605ms]

SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data Bt_AvK56.js:1:365
a index.ts:26
(Async: promise callback)
s index.ts:24
Et MessageInput.svelte:265
on MessageInput.svelte:352
on MessageInput.svelte:298
mn MessageInput.svelte:380

Additional Information

The logs are very verbose, so I didn't include everything, but Open WebUI continuously polls Azure Document Intelligence (as it should) until the document is ready. The failure is sudden: my logs show no error message, but the Open WebUI app crashes and has to be restarted.

On testing, it appears to happen after ~50-60 seconds of not receiving an OCR result.
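
For context, here is a minimal sketch of the long-running-operation flow those logs show, assuming the azure-ai-documentintelligence 1.0.0 Python SDK named in the User-Agent header (the endpoint, key, and filename are placeholders):

# The initial POST to prebuilt-layout:analyze returns 202 with an
# Operation-Location header; the SDK's poller then GETs that URL repeatedly
# until the analysis succeeds -- the exact request pattern in the logs above.
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential

client = DocumentIntelligenceClient(
    endpoint="https://<resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<api-key>"),
)

with open("large.pdf", "rb") as f:
    poller = client.begin_analyze_document(
        "prebuilt-layout",
        f,
        content_type="application/octet-stream",
        output_content_format="markdown",
    )

# result() blocks the upload request handler for the whole OCR run; if a proxy
# in front of Open WebUI times out the original /api/v1/files/ request first,
# the browser sees a 504 with a non-JSON body (the JSON.parse error above).
result = poller.result()
print(len(result.content or ""))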

GiteaMirror added the bug label 2026-04-19 23:03:53 -05:00

@decent-engineer-decent-datascientist commented on GitHub (Jun 4, 2025):

I’m able to recreate this on our setup.

@pierrelouisbescond commented on GitHub (Jun 5, 2025):

I confirm that I've been able to reproduce this bug using Azure Document Intelligence and a 6.3 MB Arxiv PDF document (https://arxiv.org/pdf/2505.24876).
The uploaded document simply disappears from the chat UI.

@iamcristi commented on GitHub (Jun 5, 2025):

I've also reproduced this. I've noticed CPU goes to 100% and RAM usage grows until OOM while stuck at https://github.com/open-webui/open-webui/blob/53764fe64884da147359e54ed6d9607fe57f1600/backend/open_webui/routers/retrieval.py#L1125

@jackthgu commented on GitHub (Jun 13, 2025):

Hello, are you currently using the paid version of Azure Document Intelligence?

@pierrelouisbescond commented on GitHub (Jun 13, 2025):

> Hello, are you currently using the paid version of Azure Document Intelligence?

Yes, we have an Azure company subscription.

@jackthgu commented on GitHub (Jun 15, 2025):

It looks like the request dies right at most reverse-proxy time-outs. Could you tell me:

  1. Which proxy / load balancer (Nginx, Traefik, Cloudflare, ALB, etc.)
  2. Your timeout settings (e.g., proxy_read_timeout, proxy_send_timeout)
  3. Any logs showing 504 or “upstream timed out”

Reload Nginx, rerun the upload.
If it works, we can fine-tune or document the fix. Let me know!
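
For anyone checking their proxy, a hypothetical Nginx override along these lines is the kind of thing to look at (the directive names are standard Nginx; the upstream name and values are placeholders to tune):

# Give the upload/extraction request more time than the OCR can take,
# and let large PDFs through at all. Values here are illustrative.
location / {
    proxy_pass http://open-webui:8080;
    proxy_connect_timeout 75s;
    proxy_send_timeout    900s;   # sending the request body (the upload)
    proxy_read_timeout    900s;   # waiting for the /api/v1/files/ response
    client_max_body_size  100M;
}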

@jimbo-p commented on GitHub (Jun 16, 2025):

> It looks like the request dies right at most reverse-proxy time-outs. Could you tell me:
>
> 1. **Which proxy / load balancer** (Nginx, Traefik, Cloudflare, ALB, etc.)
> 2. **Your timeout settings** (e.g., `proxy_read_timeout`, `proxy_send_timeout`)
> 3. **Any logs** showing `504` or “upstream timed out”
>
> Reload Nginx, rerun the upload. If it works, we can fine-tune or document the fix. Let me know!

  1. ALB
  2. 60 seconds (default)
  3. NA

I went ahead and updated my timeout on the ALB to 15 minutes. Open WebUI is no longer crashing and the document isn't disappearing from the chat box. However, it now spins forever. After the previous default timeout (~60 seconds), it looks like Open WebUI gives up on polling the Document Intelligence endpoint to check whether the OCR has completed.
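
For reference, on an ALB the idle timeout can be raised with the AWS CLI; a sketch (the load balancer ARN and value are placeholders):

# Raise the ALB idle timeout above the worst-case OCR duration.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123 \
  --attributes Key=idle_timeout.timeout_seconds,Value=900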

@tjbck commented on GitHub (Jun 20, 2025):

Potentially related to #15023

@jackthgu commented on GitHub (Jun 20, 2025):

As I'm currently using the free tier of Document Intelligence, I'm unable to fully replicate the issue on my end. Could you kindly share the full logs from your side for that specific scenario? It would be very helpful.

@zolgear commented on GitHub (Jun 23, 2025):

Error situation:

  • PDF file: 19 MB, 484 pages

  • The Azure Document Intelligence API call completes successfully and is confirmed in the logs:

    | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1125 - save_docs_to_vector_db: document ******* - {}

  • After that, it seems that split_documents runs until an OOM (Out of Memory) error occurs (see the sketch after this list).

  • Tested on a machine with 32 vCPUs and 128 GB RAM.
    Even when CPU usage reached 100% and RAM usage exceeded 100 GB, the embedding process did not start.
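
A minimal sketch of that failure mode, assuming the splitter behaves like LangChain's RecursiveCharacterTextSplitter (the chunk sizes and input size are illustrative, not Open WebUI's actual defaults):

# Splitting one enormous extracted text materializes every chunk (plus the
# overlap copies) in memory at once before anything is embedded or stored.
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = "lorem ipsum " * 4_000_000  # ~48 MB of text, comparable to the 57 MB Azure response

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(text)  # the full chunk list is held in RAM
print(f"{len(chunks)} chunks resident in memory simultaneously")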

@fmonnier74 commented on GitHub (Jul 21, 2025):

Was able to reproduce, even without Document Intelligence.

To reproduce:

If Open WebUI is behind a reverse proxy: upload a large full-text file that takes longer than the proxy timeout. That is when the file disappears from the interface.
If not behind a reverse proxy: upload a large full-text file and kill your browser while the upload is still in progress.

Result:

Open WebUI will keep allocating memory until OOM no matter what. Document Intelligence is just a catalyst that reaches this timeout faster, since it extracts more data from files.

I have 200 users on my deployment and I am struggling with this issue, since many users upload any file they want regardless of the current limitation.

Workaround:

Take a worst-case scenario, for example a full-text file of about 20 MB; measure the time your setup takes to perform the vectorization; set your proxy timeout to the measured time plus some margin; and set the max file size to 20 MB. That is how I approached the issue.
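
As a rough way to take that measurement, you could time a representative upload against the file endpoint seen in the browser logs above (the host, token variable, and multipart field name are assumptions to adapt to your setup):

# Elapsed time plus a safety margin becomes the proxy timeout.
time curl -s -X POST \
  -H "Authorization: Bearer $OPEN_WEBUI_API_KEY" \
  -F "file=@worst-case-20MB.txt" \
  http://localhost:3000/api/v1/files/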

Hope this can help.

@Maximilian-Pichler commented on GitHub (Aug 18, 2025):

I think we're running into two separate issues here: first, it seems like there's a proxy timeout that could be affecting large uploads; second, the way embeddings and document uploads are handled appears to be very resource-intensive.

Our setup includes Open WebUI (hosted as an Azure Web App with 32 GB RAM), Document Intelligence, and Postgres with pgvector. Whenever I try uploading a large PDF (for example, a 1000-page, 33 MB document), the app runs out of memory, crashes, and restarts. Any advice on how to tackle these problems or optimize our resource usage would be super helpful!

@zolgear commented on GitHub (Sep 1, 2025):

With v0.6.26, PDF transcription in Notes works.

Embedding in the backend causes memory to increase until OOM.
Restarting the container before the server crashes allows work to continue.
Enabling "Bypass Embedding and Retrieval" avoids the problem for now.

@Maximilian-Pichler commented on GitHub (Sep 2, 2025):

The issue still persists with v0.6.26.

@tjbck commented on GitHub (Sep 11, 2025):

We may have addressed this in dev! Testing wanted here! @Maximilian-Pichler @zolgear

@zolgear commented on GitHub (Sep 12, 2025):

I tested using the ghcr.io/open-webui/open-webui:dev-cuda image.

Test case: Knowledge Base
PDF file size: 9 MB, 132 pages.

  • OCR (Azure Document Intelligence) succeeded within 1 minute, confirmed via DEBUG log.
  • Embedding ran 28 times and finished in about 15 seconds.
  • After that, memory usage increased by about 20 GB (see screenshot).
  • About 10 GB was released, then for some reason, Azure Document Intelligence and Embedding ran a second time.
  • Both completed successfully, and the Knowledge Base worked as expected.

The issue of memory usage increasing indefinitely has been improved.
However, when the page count is large, Embedding can consume 20 GB–30 GB of memory, which is still a challenge for practical use — but this is a step forward!

2025-09-12 10:34:29.034 | INFO     | open_webui.routers.retrieval:save_docs_to_vector_db:1221 - save_docs_to_vector_db: document XXXXXXXX.pdf file-eb76f156-2532-4a64-b9d4-6e8347ea1f6d
2025-09-12 10:34:35.043 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35304 - "GET /api/v1/chats/2673db18-a775-40ee-a6c4-4d32c64e64db HTTP/1.1" 200
2025-09-12 10:34:35.049 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35298 - "GET /api/v1/chats/8d453b7e-393d-4b57-9d87-8f1afd06b5d6 HTTP/1.1" 200
2025-09-12 10:34:35.222 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35314 - "GET /api/v1/chats/9b796796-9d20-47ed-aa58-27d90265eff3 HTTP/1.1" 200
2025-09-12 10:34:35.228 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35332 - "GET /api/v1/chats/64ba9a9f-8b7c-4715-a66a-ec95b467c9a9 HTTP/1.1" 200
2025-09-12 10:34:35.234 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35330 - "GET /api/v1/chats/52c113f1-c59b-4072-b985-cabb7b0a7439 HTTP/1.1" 200
2025-09-12 10:34:35.412 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35298 - "GET /api/v1/chats/1f92a570-734c-4fbc-8c69-2815aadb2977 HTTP/1.1" 200
2025-09-12 10:34:35.424 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35304 - "GET /api/v1/chats/59e712d9-9680-4023-b6d5-586cefb7eb10 HTTP/1.1" 200
2025-09-12 10:34:35.673 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:35304 - "GET /_app/version.json HTTP/1.1" 200
2025-09-12 10:35:37.721 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:36362 - "GET /_app/version.json HTTP/1.1" 200
2025-09-12 10:36:44.698 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:36266 - "GET /_app/version.json HTTP/1.1" 200
2025-09-12 10:36:57.402 | INFO     | open_webui.routers.retrieval:save_docs_to_vector_db:1337 - generating embeddings for file-eb76f156-2532-4a64-b9d4-6e8347ea1f6d
2025-09-12 10:36:57.403 | DEBUG    | open_webui.retrieval.utils:generate_openai_batch_embeddings:747 - generate_openai_batch_embeddings:model text-embedding-ada-002 batch size: 1
2025-09-12 10:36:57.403 | DEBUG    | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): litellm.xxxxxx:443
2025-09-12 10:36:59.136 | DEBUG    | urllib3.connectionpool:_make_request:544 - https://litellm.xxxxxx:443 "POST /embeddings HTTP/1.1" 200 33142
2025-09-12 10:36:59.137 | DEBUG    | open_webui.retrieval.utils:generate_openai_batch_embeddings:747 - generate_openai_batch_embeddings:model text-embedding-ada-002 batch size: 1
2025-09-12 10:36:59.137 | DEBUG    | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): litellm.xxxxxx:443
2025-09-12 10:37:00.014 | DEBUG    | urllib3.connectionpool:_make_request:544 - https://litellm.xxxxxx:443 "POST /embeddings HTTP/1.1" 200 33089
2025-09-12 10:37:00.015 | DEBUG    | open_webui.retrieval.utils:generate_openai_batch_embeddings:747 - generate_openai_batch_embeddings:model text-embedding-ada-002 batch size: 1
2025-09-12 10:37:00.015 | DEBUG    | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): litellm.xxxxxx:443
2025-09-12 10:37:00.890 | DEBUG    | urllib3.connectionpool:_make_request:544 - https://litellm.xxxxxx:443 "POST /embeddings HTTP/1.1" 200 33089
2025-09-12 10:37:00.891 | DEBUG    | open_webui.retrieval.utils:generate_openai_batch_embeddings:747 - generate_openai_batch_embeddings:model text-embedding-ada-002 batch size: 1


2025-09-12 10:41:06.217 | INFO     | open_webui.routers.retrieval:save_docs_to_vector_db:1221 - save_docs_to_vector_db: document XXXXXXXX.pdf 1f2514f9-f14f-4333-a6e2-ce622a0b95c8
2025-09-12 10:41:06.567 | INFO     | open_webui.routers.retrieval:save_docs_to_vector_db:1337 - generating embeddings for 1f2514f9-f14f-4333-a6e2-ce622a0b95c8

2025-09-12 10:42:43.889 | DEBUG    | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): litellm.xxxxxx:443
2025-09-12 10:42:44.236 | DEBUG    | urllib3.connectionpool:_make_request:544 - https://litellm.xxxxxx:443 "POST /embeddings HTTP/1.1" 200 33118
2025-09-12 10:42:44.237 | DEBUG    | open_webui.retrieval.utils:generate_openai_batch_embeddings:747 - generate_openai_batch_embeddings:model text-embedding-ada-002 batch size: 1
2025-09-12 10:42:44.238 | DEBUG    | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): litellm.xxxxxx:443
2025-09-12 10:42:44.609 | DEBUG    | urllib3.connectionpool:_make_request:544 - https://litellm.xxxxxx:443 "POST /embeddings HTTP/1.1" 200 33099
2025-09-12 10:43:50.185 | INFO     | open_webui.routers.retrieval:process_file:1577 - added 196 items to collection 1f2514f9-f14f-4333-a6e2-ce622a0b95c8
2025-09-12 10:43:50.213 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:60700 - "GET /_app/version.json HTTP/1.1" 200
2025-09-12 10:43:50.232 | DEBUG    | open_webui.socket.main:user_join:301 - channels=[]
2025-09-12 10:43:50.233 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:49410 - "POST /api/v1/knowledge/1f2514f9-f14f-4333-a6e2-ce622a0b95c8/file/add HTTP/1.1" 200
2025-09-12 10:44:33.621 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 - 172.18.3.1:43612 - "GET /_app/version.json HTTP/1.1" 200
[Screenshot: memory usage (https://github.com/user-attachments/assets/18fdb264-95f7-4002-95e4-2665b08903d4)]

@asabla commented on GitHub (Sep 23, 2025):

We've had the exact same issue as mentioned previously in this thread. It also seems that, depending on which tier of Document Intelligence you're using, you get more data out of OCR in the responses.

The main issue seems to be caused by some of the fields stored in embedding_metadata, which grows the vector database at an insane rate. On top of that, there also seems to be an issue with self-referencing JSON (which might cause some of the out-of-memory issues we're seeing).

Initial testing suggests that removing some of the fields during stringification of the metadata in utils.py, found at ./backend/open_webui/retrieval/vector/utils.py, is enough. So far we've removed the following keys in that function (see the sketch at the end of this comment):

  • Content: a text representation of the extracted text, which means we're storing the text output twice.
  • Complicated fields: deeper JSON structures, including:
    • Figures: depends on which model you're using to identify where and what these figures are.
    • Pages: probably the least complicated of them, but still pretty large.
    • Paragraphs: heavily related to document size. A 200+ page document may produce a really large JSON structure.
    • Sections: same as paragraphs, heavily related to document size.
    • Tables: similar to figures, it depends on what you're using to identify tables, and how many there are.

I realize that some of these fields are very useful in a RAG pipeline, but I do not believe the whole complicated JSON structure returned by Azure Document Intelligence is necessary for that.

Suggestion:
Remove some of the returned fields (like content) and simplify the JSON structure. Another alternative is to ignore some or all of these fields for now.
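
A minimal sketch of that kind of key filtering, assuming the metadata arrives as a plain dict before stringification (the function name and key list are illustrative, not the actual utils.py code):

# Drop the bulky Azure Document Intelligence fields before the metadata is
# stringified and stored next to each embedding. Note that json.dumps raises
# ValueError on genuinely self-referencing data, so circular structures
# still have to be broken upstream.
import json

HEAVY_KEYS = {"content", "figures", "pages", "paragraphs", "sections", "tables"}

def slim_metadata(metadata: dict) -> dict:
    slim = {k: v for k, v in metadata.items() if k.lower() not in HEAVY_KEYS}
    # Stringify anything non-scalar so the vector store only sees flat values.
    return {
        k: v if isinstance(v, (str, int, float, bool)) else json.dumps(v, default=str)
        for k, v in slim.items()
    }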

@decent-engineer-decent-datascientist commented on GitHub (Sep 29, 2025):

@asabla Any chance you've verified the extracted tables exist in the text representation as well? It'd be a shame if we were to lose the tables altogether.

@asabla commented on GitHub (Oct 2, 2025):

@decent-engineer-decent-datascientist the tables are somewhat represented, depending on what the document looks like. Like I mentioned before, it would probably be enough to reduce the default amount of allowed JSON complexity and fix the self-referencing issue.

@tjbck commented on GitHub (Oct 2, 2025):

@asabla that metadata should have been removed in the latest release. Excel files still seem to take forever to retrieve the parsed content, so we've updated our internal content extraction logic to use the built-in method instead, FYI.

@asabla commented on GitHub (Nov 10, 2025):

Alright @tjbck, I've been testing around with different vector stores, and so far the changes seem to have solved it. Unless anyone else has or finds any further issues, I would consider this solved.

@Maximilian-Pichler commented on GitHub (Nov 10, 2025):

Can confirm.

@Ruben-Wien commented on GitHub (Nov 12, 2025):

Is there a PR associated with this fix? Can't find it.

Reference: github-starred/open-webui#17330