[PR #14712] [CLOSED] WIP: Customize Docling's "Describe Pictures" feature #10360

Closed
opened 2025-11-11 19:02:55 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/14712
Author: @vaclcer
Created: 6/5/2025
Status: Closed

Base: dev ← Head: vaclavs-custom-picture-describe


📝 Commits (10+)

  • 9606f43 feat: add picture description configuration options for Docling
  • 898e56c chore: remove unused log level variables from dev script
  • e0c20e1 Merge branch 'dev' into vaclavs-custom-picture-describe
  • baf90ce refac: improve input formatting for Docling server and OCR settings
  • e3f341c add aria-hidden to svg, as these are decorative
  • e1949e6 remove aria-label from tooltip, as tippy handles this out of the box
  • b9c2fcf add aria labels to buttons that only contains decorative svgs
  • 4532b40 reindent xmlns
  • 6cb37e8 chore: format
  • ce9fb75 [i18n] Russian locale update

📊 Changes

59 files changed (+476 additions, -41 deletions)

📝 backend/open_webui/config.py (+48 -0)
📝 backend/open_webui/main.py (+14 -0)
📝 backend/open_webui/retrieval/loaders/main.py (+51 -7)
📝 backend/open_webui/routers/retrieval.py (+65 -0)
📝 src/lib/components/admin/Settings/Documents.svelte (+137 -3)
📝 src/lib/components/chat/Messages/ResponseMessage.svelte (+28 -0)
📝 src/lib/components/common/Tooltip.svelte (+1 -3)
📝 src/lib/i18n/locales/ar-BH/translation.json (+2 -0)
📝 src/lib/i18n/locales/ar/translation.json (+2 -0)
📝 src/lib/i18n/locales/bg-BG/translation.json (+2 -0)
📝 src/lib/i18n/locales/bn-BD/translation.json (+2 -0)
📝 src/lib/i18n/locales/bo-TB/translation.json (+2 -0)
📝 src/lib/i18n/locales/ca-ES/translation.json (+2 -0)
📝 src/lib/i18n/locales/ceb-PH/translation.json (+2 -0)
📝 src/lib/i18n/locales/cs-CZ/translation.json (+2 -0)
📝 src/lib/i18n/locales/da-DK/translation.json (+2 -0)
📝 src/lib/i18n/locales/de-DE/translation.json (+2 -0)
📝 src/lib/i18n/locales/dg-DG/translation.json (+2 -0)
📝 src/lib/i18n/locales/el-GR/translation.json (+2 -0)
📝 src/lib/i18n/locales/en-GB/translation.json (+2 -0)

...and 39 more files

📄 Description

Description

This PR adds customization options for Docling's content extraction engine. It follows Docling's recent development, described at https://github.com/docling-project/docling-serve/blob/main/docs/usage.md#picture-description-options.

With this PR, a user can choose how pictures in a document are described: either by a locally hosted VLM (running in the Docling environment) or by an external OpenAI-like API.
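To illustrate the two modes, here is a minimal sketch of how the options sent to a Docling server might be assembled. It is not the code from this PR. The function name `build_picture_description_options` is hypothetical, and the option keys (`do_picture_description`, `picture_description_local`, `picture_description_api`) and their fields are assumptions modeled on the docling-serve usage notes linked above, so they should be checked against that page before use.

```python
from typing import Any, Dict, Optional


def build_picture_description_options(
    mode: str,
    *,
    repo_id: Optional[str] = None,
    api_url: Optional[str] = None,
    api_model: Optional[str] = None,
    prompt: str = "Describe this image in a few sentences.",
) -> Dict[str, Any]:
    """Build a hypothetical options dict for one picture description mode.

    mode="local": a VLM hosted inside the Docling environment (needs repo_id).
    mode="api":   an external OpenAI-like endpoint (needs api_url and api_model).
    All key names are assumptions, not a confirmed schema.
    """
    options: Dict[str, Any] = {"do_picture_description": True}

    if mode == "local":
        if not repo_id:
            raise ValueError("local mode requires repo_id")
        # Assumed shape: model repository id plus the prompt to use.
        options["picture_description_local"] = {
            "repo_id": repo_id,
            "prompt": prompt,
        }
    elif mode == "api":
        if not api_url or not api_model:
            raise ValueError("api mode requires api_url and api_model")
        # Assumed shape: endpoint URL, request params, and the prompt.
        options["picture_description_api"] = {
            "url": api_url,
            "params": {"model": api_model},
            "prompt": prompt,
        }
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    return options
```

Only one of the two mode-specific keys is ever set, which mirrors the either/or choice described above. The model identifier, endpoint URL, and prompt are all supplied by the caller, so nothing here depends on a particular VLM or provider.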

Additional Information

Please advise whether this is wanted, especially with regard to the UI styling.

Screenshots or Videos

describe1
describe2

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-11 19:02:55 -06:00
Reference: github-starred/open-webui#10360