feat: extend Docling integration configuration #6300

Open
opened 2025-11-11 16:50:26 -06:00 by GiteaMirror · 8 comments
Owner

Originally created by @Elettrotecnica on GitHub (Sep 2, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Docling's excellent quality comes at the cost of throughput, which can be in the range of seconds per page.
In the current OpenWebUI integration, we cannot set parameters that have a huge impact on performance, such as:

  • whether to perform OCR or not
  • pdf backend to use
  • table extraction mode
  • pipeline type

Sources:
https://arxiv.org/html/2408.09869v5#S4
https://github.com/docling-project/docling/discussions/2173
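
As a rough illustration of the four parameters listed above, the sketch below groups them into one settings object and turns them into an options mapping. The class and function names are hypothetical, and the keys (`do_ocr`, `pdf_backend`, `table_mode`, `pipeline`) and the example values are assumptions about what docling-serve accepts; they should be checked against its API documentation.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DoclingSettings:
    """Hypothetical container for the performance-related parameters."""

    do_ocr: Optional[bool] = None      # whether to perform OCR or not
    pdf_backend: Optional[str] = None  # pdf backend to use
    table_mode: Optional[str] = None   # table extraction mode
    pipeline: Optional[str] = None     # pipeline type


def to_options(settings: DoclingSettings) -> dict:
    """Drop unset fields so the server keeps its own defaults for them."""
    return {k: v for k, v in asdict(settings).items() if v is not None}


if __name__ == "__main__":
    # Only the fields that were explicitly set end up in the mapping.
    print(to_options(DoclingSettings(do_ocr=False, table_mode="fast")))
```

Leaving unset fields out of the mapping means an admin who only cares about OCR does not accidentally override the other server-side defaults.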

Desired Solution you'd like

It would be great if the Docling integration allowed setting at least the parameters listed above. Ideally, it should support the whole docling-serve API.

Alternatives Considered

The alternative would be an in-between component that injects these settings into each request. This would not be very elegant, in my opinion.

It would also be great if docling-serve exposed more settings through environment variables, so that they could be set once and for all as part of the deployment (although this would reduce user flexibility).
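
To make the two alternatives concrete, here is a minimal sketch of an in-between step that reads deployment-wide defaults from an environment variable and merges them into a request payload. The variable name `DOCLING_DEFAULT_OPTIONS`, both function names, and the `options` key layout are assumptions for illustration, not existing settings of docling-serve or Open WebUI.

```python
import json
import os


def defaults_from_env(environ=os.environ, var="DOCLING_DEFAULT_OPTIONS") -> dict:
    """Parse a JSON object of default options from a (hypothetical) env var."""
    raw = environ.get(var, "")
    return json.loads(raw) if raw else {}


def inject_options(payload: dict, defaults: dict) -> dict:
    """Return a copy of payload with defaults filled in under "options".

    Options already present in the payload take precedence over defaults.
    """
    merged = dict(payload)
    merged["options"] = {**defaults, **payload.get("options", {})}
    return merged


if __name__ == "__main__":
    env = {"DOCLING_DEFAULT_OPTIONS": '{"do_ocr": false}'}
    print(inject_options({"options": {"table_mode": "fast"}}, defaults_from_env(env)))
```

Letting per-request options win over the environment defaults keeps some of the user flexibility that a purely deployment-level setting would lose.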

Additional Context

No response


@alvarellos commented on GitHub (Sep 4, 2025):

Ciao Antonio.

This discussion is related:
https://github.com/open-webui/open-webui/discussions/16919

It includes the backend part to upload documents with Docling or having the possibility to activate it in workspaces and not globally. Other features could be added as well… feel free to include comments.

Regards,
Diego


@Elettrotecnica commented on GitHub (Sep 5, 2025):

Dear @alvarellos,

the scope of your feature request appears to be much larger, in that you want to allow for the knowledge backend to be configurable per-conversation, rather than globally. For my use case, it would be sufficient to extend the parameters one can set when talking to the docling backend.


@alvarellos commented on GitHub (Sep 5, 2025):

Hi Antonio.

Stefan is also asking for something similar regarding OCR options: https://github.com/open-webui/open-webui/discussions/16919

For instance, I have already included a language selector, which could be used for the Tesseract language options instead of the headers: https://github.com/open-webui/open-webui/discussions/15839 ...

I agree that more fine-tuning could be added, as there are many options available within the Docling API.


@Vincentkerstholt commented on GitHub (Sep 8, 2025):

@alvarellos this isn't the same request. @Elettrotecnica is asking for a configuration option to bypass OCR, which is a much smaller feature request.

The feature could be as minimal as adding a toggle in the docling setup, and passing the following variable along every docling parsing request:

    "options": {
      "do_ocr": false
    }

This feature would be very beneficial for Docling users, since it would drastically improve performance on digital PDFs, which already contain a text layer.

It's described in detail here:
https://github.com/docling-project/docling/discussions/2173
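
A minimal sketch of how such a toggle could be applied to each parsing request is shown below. The function name is hypothetical, and the surrounding payload fields are placeholders; only the `"options": {"do_ocr": ...}` shape comes from the snippet above.

```python
def with_ocr_toggle(payload: dict, ocr_enabled: bool) -> dict:
    """Return a copy of payload whose options carry the do_ocr flag."""
    # Copy the nested options so the caller's payload is left untouched.
    options = dict(payload.get("options", {}))
    options["do_ocr"] = ocr_enabled
    return {**payload, "options": options}


if __name__ == "__main__":
    # "document" is a placeholder key, not a documented request field.
    request = {"document": "example.pdf"}
    print(with_ocr_toggle(request, ocr_enabled=False))
```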


@Elettrotecnica commented on GitHub (Sep 8, 2025):

I have submitted a PR introducing the configuration options I am interested in. Feedback is welcome!


@gordoabc commented on GitHub (Sep 22, 2025):

It would be good to be able to set enhancements such as:

  • code enrichment
  • formula enrichment
  • picture classification

Picture description is available, but it doesn't seem to affect the resulting Markdown.

It would also be good to be able to specify the vlm_model as well as the pipeline (I'm not sure what it defaults to, but I'd like to use granite_docling).


@Elettrotecnica commented on GitHub (Sep 22, 2025):

FYI: I already opened a PR about extending the configuration further concerning vlm pipeline options: https://github.com/open-webui/open-webui/pull/17363.

This probably makes sense to implement as we go, as the total number of options is quite big. See https://github.com/docling-project/docling-serve/issues/336 for the actual full list.

Picture description is working in my experience. It appears to support PDFs only at the moment, see e.g. https://github.com/docling-project/docling/issues/2225. Only images above a certain size will be described. The docling command-line tool has an option to specify this size; I am not sure about docling-serve.


@Elettrotecnica commented on GitHub (Oct 17, 2025):

Since 339e95e9d7, this issue should be completely addressed. Unfortunately, I believe a typo slipped through, see https://github.com/open-webui/open-webui/pull/18390

Once this works as expected, this issue could be closed.

All the best

Reference: github-starred/open-webui#6300