[GH-ISSUE #10332] feat: Docling Integration via Docling Server pattern #15858

Closed
opened 2026-04-19 21:57:07 -05:00 by GiteaMirror · 5 comments

Originally created by @flefevre on GitHub (Feb 19, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/10332

Description: This proposal aims to integrate Docling into Open WebUI using a dedicated Docling server (Docling-Serve or Docling-api) similar to the existing Apache Tika implementation. This approach ensures better flexibility, performance, and maintainability.

NB: These elements have been gathered from @MichaelKarpe and @JPC612 in the closed discussion #9238.
I am not part of the Open WebUI team; I am just trying to help this project. I am convinced that Docling integration will be a game changer for RAG results in the scientific domain, which is why I am trying to help.

Approach: Using Docling-Serve or Docling-api

There are currently two implementations for the Docling server:

  1. docling-serve: Implemented by the official Docling team at IBM: https://github.com/DS4SD/docling-serve

  2. docling-api: An independent implementation, already operational and ahead in development, featuring a Docker integration (docling-api repository): https://github.com/drmingler/docling-api

We propose leveraging docling-serve or docling-api, for example through the docling-serve Docker image, similar to the way Open WebUI currently integrates Apache Tika.

This will allow for easier deployment, maintainability, and optimized performance.

CPU/GPU considerations should be taken into account for efficient processing.

Key Improvements Over Previous Approaches

1. Avoids LangChain dependency:

Previous integration attempts used langchain-docling, which has limitations such as CPU-only execution and loss of metadata.

Direct server integration provides more control over processing parameters and metadata retention.

2. Enhances OCR Processing for RAG Pipelines:

Supports multiple OCR engines, including EasyOCR and RapidOCR, both with GPU acceleration.

Enables structured document processing, including table extraction and formula enrichment.

3. Aligns with Open WebUI’s modular architecture:

Mirrors Apache Tika implementation, ensuring easier adoption.

Allows users to configure Docling as an alternative backend for document parsing.
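
As a rough illustration of the server pattern described above, here is a minimal Python sketch of a loader that forwards a file to a Docling server and keeps basic metadata. Everything named here is an assumption made for illustration: the `DoclingServerLoader` class, the injected `post_file` transport, the `/v1alpha/convert/file` path, and the `document.md_content` response field are not confirmed by this issue and should be checked against the server you deploy.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: (url, filename, file_bytes) -> decoded JSON response.
# In a real integration this would wrap an HTTP multipart upload.
PostFile = Callable[[str, str, bytes], Dict[str, Any]]


class DoclingServerLoader:
    """Sketch of a Tika-style loader that delegates parsing to a Docling server."""

    def __init__(
        self,
        server_url: str,
        post_file: PostFile,
        convert_path: str = "/v1alpha/convert/file",  # assumed endpoint path
    ) -> None:
        # Normalise the base URL so a trailing slash does not double up.
        self.endpoint = server_url.rstrip("/") + convert_path
        self.post_file = post_file

    def load(self, filename: str, data: bytes) -> List[Dict[str, Any]]:
        response = self.post_file(self.endpoint, filename, data)
        # Assumed response shape: {"document": {"md_content": "..."}}
        text = response.get("document", {}).get("md_content", "")
        if not text:
            raise ValueError(f"Docling server returned no text for {filename}")
        # Keep the source name alongside the text so metadata is not lost.
        return [
            {
                "page_content": text,
                "metadata": {"source": filename, "parser": "docling"},
            }
        ]
```

Injecting the transport keeps the sketch independent of any particular HTTP client and makes the loader easy to exercise without a running server.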

Proposed Changes

  • Integration with docling-serve as an OCR/document parsing backend.
  • Option to switch between Tika and Docling in settings.
  • Support for GPU acceleration when available.
  • Modify retrieval/loaders/main.py to support Docling-Serve.
  • Update configuration to include DOCLING_SERVER_URL similar to TIKA_SERVER_URL.
  • Addresses previous issues with slow processing times when using langchain-docling.
  • Ensures correct handling of document metadata, including page labels.
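
The configuration items in the list above could be wired together roughly as follows. This is only a sketch under stated assumptions: `TIKA_SERVER_URL` and `DOCLING_SERVER_URL` come from the proposal, while the `CONTENT_EXTRACTION_ENGINE` variable name, the default URLs, and the helper functions `load_settings` and `select_backend` are hypothetical and used here for illustration only.

```python
import os
from typing import Dict, Mapping, Tuple


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the parsing backend settings from environment-style variables."""
    return {
        # Hypothetical switch between the two backends; defaults to Tika.
        "engine": env.get("CONTENT_EXTRACTION_ENGINE", "tika").lower(),
        "tika_url": env.get("TIKA_SERVER_URL", "http://localhost:9998"),
        # Placeholder default; adjust to wherever the Docling server listens.
        "docling_url": env.get("DOCLING_SERVER_URL", "http://localhost:5001"),
    }


def select_backend(settings: Mapping[str, str]) -> Tuple[str, str]:
    """Return (engine name, base URL) for the configured parsing backend."""
    urls = {"tika": settings["tika_url"], "docling": settings["docling_url"]}
    engine = settings["engine"]
    if engine not in urls:
        raise ValueError(f"Unsupported content extraction engine: {engine!r}")
    return engine, urls[engine].rstrip("/")
```

For example, setting `CONTENT_EXTRACTION_ENGINE=docling` and `DOCLING_SERVER_URL=http://docling:5001` would make `select_backend(load_settings())` return the Docling base URL, while an unset environment falls back to Tika.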

Additional Information

This integration will make Open WebUI more robust for document processing tasks, enhancing its usability in knowledge-based applications.

Future improvements could include support for switching between multiple OCR engines dynamically.
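
One possible shape for that future improvement is a small helper that validates the requested OCR engine and builds per-request options. This is a hedged sketch: the engine names come from this issue, but the `build_convert_options` helper and the `do_ocr` and `ocr_engine` field names are assumptions that would need to be checked against the Docling server's actual request schema.

```python
from typing import Dict, Optional

# Engines named in this proposal; a real deployment may support others.
SUPPORTED_OCR_ENGINES = ("easyocr", "rapidocr")


def build_convert_options(
    ocr_engine: Optional[str] = None,
    default_engine: str = "easyocr",
) -> Dict[str, str]:
    """Build form fields selecting the OCR engine for a single request."""
    engine = (ocr_engine or default_engine).lower()
    if engine not in SUPPORTED_OCR_ENGINES:
        raise ValueError(f"Unsupported OCR engine: {engine!r}")
    # Field names are assumed, not taken from the server documentation.
    return {"do_ocr": "true", "ocr_engine": engine}
```

Passing the engine per request, rather than fixing it at server start, is what would let a user switch engines from the settings page without redeploying the server.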

References:

Official Enhancement : #6844

Previous PR: #9238 and #7033


@VarianFry commented on GitHub (Feb 20, 2025):

Yes please. Just a user, but I've heard of Docling and played with it. Seems to be very good.


@flefevre commented on GitHub (Feb 22, 2025):

Open WebUI has just integrated Microsoft Azure's "Document Intelligence" service, which is a good solution but costs money each time you run OCR.
Perhaps we could use exactly the same pattern as this commit for the Docling PR, to ease the integration for the Open WebUI team:
https://github.com/open-webui/open-webui/commit/35f3824932833fe77ef3bce54b86803cda4838a6

What do you think?
Will you be able to reuse the work you have already done, taking inspiration from the change above?


@vishnudev-k commented on GitHub (Mar 4, 2025):

Docling has the option to run as a CLI service or an API endpoint.
This would make the document extraction much better.


@FabioPolito24 commented on GitHub (Mar 5, 2025):

I have implemented the version with docling-serve on my local branch. I can create a PR to discuss further improvements to the code. Let me know how you’d like to proceed!


@tjbck commented on GitHub (Mar 28, 2025):

Implemented in dev.


Reference: github-starred/open-webui#15858