mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-22 17:42:18 -05:00
[PR #13761] [CLOSED] perf Mistral.py #10078
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/13761
Author: @PVBLIC-F
Created: 5/10/2025
Status: ❌ Closed
Base: dev ← Head: dev
📝 Commits (3)
77fb01e Update mistral.py
13834e3 Update mistral.py
9e30df0 Update mistral.py
📊 Changes
1 file changed (+118 additions, -101 deletions)
📝 backend/open_webui/retrieval/loaders/mistral.py (+118 -101)
📄 Description
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.
Before submitting, make sure you've checked the following:
dev branch.
Changelog Entry
Description
I've overhauled the Mistral OCR loader for performance and resilience. Raw requests calls are replaced with a single aiohttp.ClientSession configured with a 30 s ClientTimeout, so no request can hang indefinitely. Every network interaction (_upload_file, _get_signed_url, _process_ocr, and _delete_file) is now a non-blocking async def method. File uploads are wrapped in a with open(...) context manager, which guarantees the PDF handle is always closed. Page-to-Document construction runs through asyncio.get_event_loop().run_in_executor, so the pages of a multi-page document are converted in parallel. The OCR step is hardened with a Tenacity @retry decorator that retries with exponential backoff. Overall concurrency is throttled with an asyncio.Semaphore(5) around the entire load() method. Finally, I added __aenter__/__aexit__ so the loader is a proper async context manager, plus a helper to shut down the session cleanly.
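The concurrency pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR. `SketchLoader`, `retry_with_backoff`, and the fake `_process_ocr` body are hypothetical stand-ins. A hand-rolled backoff loop takes the place of the Tenacity decorator, no aiohttp session or real network call is involved, and `asyncio.get_running_loop()` is used where the description mentions `asyncio.get_event_loop()`.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.01,
) -> T:
    """Retry an async call, doubling the delay after each failure.

    Hand-rolled stand-in for a Tenacity @retry decorator (assumption).
    """
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be >= 1")


class SketchLoader:
    """Hypothetical loader skeleton; performs no real network I/O."""

    def __init__(self, max_concurrency: int = 5) -> None:
        self._max_concurrency = max_concurrency
        self._semaphore: Optional[asyncio.Semaphore] = None
        self.closed = False
        self.active = 0  # load() calls currently inside the semaphore
        self.peak = 0    # highest value `active` ever reached

    async def __aenter__(self) -> "SketchLoader":
        # Create the semaphore while an event loop is running.
        self._semaphore = asyncio.Semaphore(self._max_concurrency)
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()

    async def close(self) -> None:
        # A real loader would close its HTTP session here.
        self.closed = True

    async def _process_ocr(self, name: str) -> List[str]:
        # Placeholder for the OCR request: returns three fake page texts.
        await asyncio.sleep(0.01)
        return [f"{name} page {i}" for i in range(1, 4)]

    async def load(self, name: str) -> List[str]:
        if self._semaphore is None:
            raise RuntimeError("use 'async with SketchLoader() as loader'")
        async with self._semaphore:  # at most max_concurrency loads at once
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                pages = await retry_with_backoff(lambda: self._process_ocr(name))
                loop = asyncio.get_running_loop()
                # Per-page conversion runs in the default thread pool.
                futures = [loop.run_in_executor(None, str.upper, p) for p in pages]
                return list(await asyncio.gather(*futures))
            finally:
                self.active -= 1


async def main() -> List[List[str]]:
    async with SketchLoader(max_concurrency=5) as loader:
        results = await asyncio.gather(*(loader.load(f"doc{i}") for i in range(8)))
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Holding the semaphore around the whole `load()` call caps the number of documents in flight, not the number of individual HTTP requests. A thread-pool executor mainly helps when the per-page work blocks or releases the GIL, so the benefit for plain object construction depends on what that work actually does.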
Added
Changed
Deprecated
Removed
Fixed
Security
Breaking Changes
Additional Information
Screenshots or Videos
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.