mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[PR #18905] [CLOSED] feat: Enhance Mistral OCR integration with configurable endpoint and … #40648
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/18905
Author: @paddy313
Created: 11/3/2025
Status: ❌ Closed
Base: dev ← Head: feature/custom_mistral_ocr_endpoints
📝 Commits (2)
0f2d59c feat: Enhance Mistral OCR integration with configurable endpoint and parameters
08b5df5 Merge branch 'dev' into feature/custom_mistral_ocr_endpoints
📊 Changes
6 files changed (+709 additions, -147 deletions)
📝 backend/open_webui/config.py (+18 -0)
📝 backend/open_webui/main.py (+4 -0)
📝 backend/open_webui/retrieval/loaders/main.py (+6 -1)
📝 backend/open_webui/retrieval/loaders/mistral.py (+583 -140)
📝 backend/open_webui/routers/retrieval.py (+18 -0)
📝 src/lib/components/admin/Settings/Documents.svelte (+80 -6)
📄 Description
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.
Before submitting, make sure you've checked the following:
The pull request targets the dev branch. Not targeting the dev branch may lead to immediate closure of the PR.
Changelog Entry
Description
Updated the Mistral OCR loader so that a custom endpoint and model can be configured, and so that PDFs can be sent to the API by either of two methods: file upload or base64 encoding.
Previously, it was not possible to define a different endpoint. However, because the Mistral OCR API can also be used through LiteLLM or Azure AI Foundry, the Mistral loader needed to support different endpoints and model names.
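As a rough illustration of the configuration described here, the sketch below resolves an endpoint and a parameter set from the two environment variables named in this PR's changelog (MISTRAL_OCR_ENDPOINT and MISTRAL_OCR_PARAMS). It is not the code from this PR: the default endpoint, the assumption that MISTRAL_OCR_PARAMS holds a JSON object, the parameter keys `model` and `pdf_transfer_method`, and the helper name `load_mistral_ocr_config` are all hypothetical.

```python
import json
import os

# Assumed default; the PR description does not state the default endpoint.
DEFAULT_ENDPOINT = "https://api.mistral.ai/v1"
# Hypothetical parameter keys and default values, for illustration only.
DEFAULT_PARAMS = {"model": "mistral-ocr-latest", "pdf_transfer_method": "upload"}


def load_mistral_ocr_config(env=None):
    """Return (endpoint, params) resolved from environment-style variables."""
    env = os.environ if env is None else env

    # Fall back to the default when the variable is unset or empty,
    # and drop any trailing slash so paths can be appended safely.
    endpoint = (env.get("MISTRAL_OCR_ENDPOINT") or DEFAULT_ENDPOINT).rstrip("/")

    # Assumption: the params variable carries a JSON object.
    raw = env.get("MISTRAL_OCR_PARAMS") or "{}"
    try:
        overrides = json.loads(raw)
    except json.JSONDecodeError:
        overrides = {}
    if not isinstance(overrides, dict):
        overrides = {}

    # User-supplied values take precedence over the defaults.
    return endpoint, {**DEFAULT_PARAMS, **overrides}
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise, for example `load_mistral_ocr_config({"MISTRAL_OCR_PARAMS": '{"model": "my-ocr"}'})`.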
LiteLLM and Azure AI Foundry do not support document upload with a signed URL; they only permit sending PDFs as base64 encoded strings. The official Mistral OCR API supports both methods. The upload method is faster for larger files. Consequently, I have provided the option in the UI, allowing users to choose between the two methods based on endpoint support.
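To make the base64 method concrete, here is a minimal sketch of how a PDF could be embedded in an OCR request body as a data URI instead of being uploaded first. The payload shape (`document` with `type: "document_url"`) reflects my understanding of the public Mistral OCR request format and may differ on LiteLLM or Azure AI Foundry; the function name and the default model name are assumptions, not taken from this PR.

```python
import base64


def build_base64_ocr_payload(pdf_bytes, model="mistral-ocr-latest"):
    """Build a JSON-serializable OCR request body with the PDF inlined.

    The PDF is base64-encoded and wrapped in a data URI, so no prior
    upload step or signed URL is needed.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "model": model,  # assumed default model name
        "document": {
            "type": "document_url",
            "document_url": f"data:application/pdf;base64,{encoded}",
        },
    }
```

Base64 grows the request body by roughly a third, which is consistent with the note above that the upload method is faster for larger files.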
Added
- MISTRAL_OCR_ENDPOINT: Custom endpoint support
- MISTRAL_OCR_PARAMS: Additional parameters for Mistral OCR, currently model and PDF transfer method.
Changed
- config.py and main.py: Added environment variable handling
- retrieval/loaders/main.py: Updated Mistral loader constructor with new parameters
- retrieval/loaders/mistral.py: Implemented base64 encoding support for API compatibility
- routers/retrieval.py: Added configuration variable management
- Documents.svelte: Added input fields for endpoint and model selection
Fixed
Additional Information
Screenshots or Videos
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.