[PR #20126] feat: extend Mistral OCR with base64 mode for LLM proxy compatibility #48523

Open
opened 2026-04-30 00:31:47 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/20126
Author: @KevinRohn
Created: 12/22/2025
Status: 🔄 Open

Base: dev ← Head: dev


📝 Commits (10+)

  • 3ca208c feat: add LiteLLM/Azure compatibility for Mistral OCR
  • 61fd9c2 Merge branch 'open-webui:dev' into dev
  • 99c0ce0 Merge branch 'open-webui:dev' into dev
  • 8ff00f6 Merge branch 'open-webui:dev' into dev
  • fad7514 Merge branch 'open-webui:dev' into dev
  • 5eb9651 Merge branch 'open-webui:dev' into dev
  • 9450d5a Merge branch 'open-webui:dev' into dev
  • 7d84782 Merge branch 'open-webui:dev' into dev
  • e162af0 Merge branch 'open-webui:dev' into dev
  • 306c598 Merge branch 'open-webui:dev' into dev

📊 Changes

7 files changed (+193 additions, -14 deletions)

View changed files

📝 backend/open_webui/config.py (+12 -0)
📝 backend/open_webui/main.py (+8 -0)
📝 backend/open_webui/retrieval/loaders/main.py (+2 -0)
📝 backend/open_webui/retrieval/loaders/mistral.py (+123 -14)
📝 backend/open_webui/retrieval/utils.py (+2 -0)
📝 backend/open_webui/routers/retrieval.py (+19 -0)
📝 src/lib/components/admin/Settings/Documents.svelte (+27 -0)

📄 Description

Changelog Entry

  • Added the MISTRAL_OCR_USE_BASE64 and MISTRAL_OCR_MODEL environment variables to make Mistral OCR compatible with LLM proxies (e.g., LiteLLM): documents can be sent as base64 data URIs instead of file uploads, and the OCR model name is configurable.

Description

Added LLM proxy compatibility for Mistral OCR with base64 encoding support and custom model selection. LLM proxies like LiteLLM wrap providers (e.g., Azure AI Foundry) to expose an OCR endpoint following the Mistral OCR API spec (https://docs.mistral.ai/capabilities/vision/#optical-character-recognition-ocr).

Currently, Mistral OCR only works with the official API endpoint, using file upload with signed URLs.
This change lets you choose between the upload method and the base64 data URI format.
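For illustration, a base64 request body might be built roughly as follows. This is a minimal sketch, not the code from this PR: `build_ocr_payload` is a hypothetical helper, and the `document` / `document_url` field names reflect my reading of the Mistral OCR request format, so check them against the linked spec before relying on them.

```python
import base64
import json


def build_ocr_payload(
    file_bytes: bytes,
    model: str = "mistral-ocr-latest",
    mime_type: str = "application/pdf",
) -> dict:
    """Build a JSON body that embeds the document as a base64 data URI.

    Hypothetical helper: the field names are an assumption based on the
    Mistral OCR spec, not taken from this PR's diff.
    """
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {
        "model": model,
        "document": {
            "type": "document_url",
            # The file travels inline, so no upload or signed URL is needed.
            "document_url": f"data:{mime_type};base64,{encoded}",
        },
    }


payload = build_ocr_payload(b"%PDF-1.4 example bytes")
print(json.dumps(payload, indent=2))
```

Because the whole document is inlined, a proxy only has to forward a single JSON request to the OCR endpoint.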

Added

  • MISTRAL_OCR_USE_BASE64 environment variable and UI toggle to enable base64 data URI format instead of file upload
  • MISTRAL_OCR_MODEL environment variable and UI input to configure the OCR model name (default: mistral-ocr-latest)
  • Sync and async base64 OCR processing methods in the Mistral loader
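As a rough sketch of how the two environment variables could be read, assuming plain `os.environ` parsing rather than Open WebUI's persistent config mechanism: `env_flag` and `load_mistral_ocr_settings` are hypothetical names, and the `False` default for `MISTRAL_OCR_USE_BASE64` is an assumption (the description only states the model default).

```python
import os
from typing import Mapping


def env_flag(name: str, default: bool, environ: Mapping[str, str]) -> bool:
    """Interpret common truthy strings ("1", "true", "yes", "on") as True."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_mistral_ocr_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical loader for the two settings named in this PR."""
    return {
        # Assumed default: keep the existing upload workflow unless opted in.
        "use_base64": env_flag("MISTRAL_OCR_USE_BASE64", False, environ),
        # Default model name as stated in the PR description.
        "model": environ.get("MISTRAL_OCR_MODEL", "mistral-ocr-latest"),
    }


print(load_mistral_ocr_settings({"MISTRAL_OCR_USE_BASE64": "true"}))
```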

Changed

  • Added MISTRAL_OCR_USE_BASE64 and MISTRAL_OCR_MODEL persistent config variables
  • Added new config variables to app state
  • Pass new parameters to MistralLoader
  • Implemented _process_ocr_base64() and _process_ocr_base64_async() methods with conditional workflow selection (I hope this naming is good here)
  • Added new config variables to RAG config API endpoints
  • Added "OCR Model" input field and "Use Base64 Encoding" toggle
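The conditional workflow selection mentioned above could look something like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `OcrLoaderSketch`, `_process_ocr_upload`, and the injected `send` callable are hypothetical, and only `_process_ocr_base64` borrows its name from the description.

```python
import base64
from typing import Callable, Optional


class OcrLoaderSketch:
    """Hypothetical stand-in for MistralLoader; not the PR's actual code."""

    def __init__(
        self,
        file_bytes: bytes,
        use_base64: bool = False,
        model: str = "mistral-ocr-latest",
        send: Optional[Callable[[dict], dict]] = None,
    ):
        self.file_bytes = file_bytes
        self.use_base64 = use_base64
        self.model = model
        # Default transport echoes the request so the sketch runs offline.
        self.send = send or (lambda request: request)

    def _process_ocr_upload(self) -> dict:
        # Placeholder for the existing upload + signed URL workflow.
        return self.send({"workflow": "upload", "model": self.model})

    def _process_ocr_base64(self) -> dict:
        # Inline the file as a data URI instead of uploading it first.
        encoded = base64.b64encode(self.file_bytes).decode("ascii")
        return self.send({
            "workflow": "base64",
            "model": self.model,
            "document_url": f"data:application/pdf;base64,{encoded}",
        })

    def load(self) -> dict:
        # The flag selects the workflow; the model name applies to both.
        if self.use_base64:
            return self._process_ocr_base64()
        return self._process_ocr_upload()


print(OcrLoaderSketch(b"hello", use_base64=True).load()["workflow"])
```

Injecting the transport keeps the branching logic testable without network access; an async variant would follow the same shape with `async def` methods.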

Deprecated

  • N/A

Removed

  • N/A

Fixed

  • Mistral OCR compatibility with LLM proxies (e.g., LiteLLM)

Security

  • N/A

Breaking Changes

  • N/A

Additional Information

  • Related discussion: #14200 (Original request for custom Mistral OCR endpoints)
  • Related to #19707 (Mistral OCR additional params - base64 method and model configuration)
  • Related to #17677 (Custom endpoint for Mistral OCR)
  • LiteLLM OCR docs: https://docs.litellm.ai/docs/providers/mistral#ocr

Screenshots or Videos

Using custom model mistral-document-ai-2505:

UI Config
Screenshot 2025-12-22 at 21 19 06
Request
Screenshot 2025-12-22 at 21 20 46
LiteLLM request
Screenshot 2025-12-22 at 21 42 55

Using custom model azure-doc-intel (Large document):

UI Config
Screenshot 2025-12-22 at 21 21 26
Request
Screenshot 2025-12-22 at 21 22 16
LiteLLM request
Screenshot 2025-12-22 at 21 44 30

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-30 00:31:48 -05:00
Reference: github-starred/open-webui#48523