mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-08 21:09:41 -05:00
[GH-ISSUE #13309] YouTube Loader Multiple Language Support #55545
Originally created by @DrMatschhirn on GitHub (Apr 28, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/13309
Problem Description
Referring to the previously reported bug regarding limitations in retrieving transcripts from YouTube videos, I would like to submit a feature request to enhance support for multilingual content.
When retrieving YouTube transcripts via the API, it's currently necessary to manually specify the desired language and check for its availability per video. This is inefficient, especially when processing large amounts of content across different languages or when the language of the transcript varies from video to video.
Thank you for your consideration, and I look forward to any feedback and the possibility of seeing this functionality implemented in a future release.
Desired Solution you'd like
Introduce support for specifying a list of preferred languages (e.g., ["de", "en", "fr"]) during transcript retrieval. The system should then automatically select and return the transcript in the first available language from that list.
Alternatively (or additionally), add the ability to retrieve all available transcripts in all supported languages, grouped or returned in a structured format.
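A minimal sketch of the prioritized-list selection requested here, in plain Python. It assumes the transcripts available for one video have already been collected into a mapping from language code to transcript text. The function name `pick_transcript` and that mapping shape are hypothetical, chosen for illustration only, and are not taken from Open WebUI or any transcript library.

```python
from typing import Mapping, Optional, Sequence, Tuple


def pick_transcript(
    available: Mapping[str, str],
    preferred: Sequence[str],
) -> Optional[Tuple[str, str]]:
    """Return (language, transcript) for the first preferred language that exists.

    `available` is a hypothetical mapping of language code -> transcript text
    for a single video; `preferred` is the user's priority-ordered list.
    """
    for lang in preferred:
        if lang in available:
            return lang, available[lang]
    # None of the preferred languages is available for this video.
    return None


# Illustrative data only: this video has English and French transcripts.
available = {"en": "Hello and welcome.", "fr": "Bonjour et bienvenue."}
print(pick_transcript(available, ["de", "en", "fr"]))
```

With the preference list `["de", "en", "fr"]`, the German entry is skipped because it is missing and the English transcript is returned, so a single lookup per video replaces one manual attempt per language.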
Goals of this Feature:
Allow automatic language selection based on a prioritized list, without manual intervention for each video.
Avoid unnecessary retries or API calls to check for each language’s availability individually.
Enable consistent multilingual processing for applications such as translation, NLP, or content indexing.
Improve usability and automation for users working with international/multilingual YouTube content.
Alternatives Considered
Add a fallback option when none of the preferred languages are found (e.g., return the default/original language or return an error/warning).
Offer structured output listing all available transcript languages with timestamps (if applicable).
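The fallback and the structured listing of available languages could be combined along these lines. This is a sketch under stated assumptions, not a proposal for the actual implementation: the `Selection` dataclass, the `select_with_fallback` function, and the `default_language` parameter are all hypothetical names, and the per-video transcripts are again assumed to be a mapping from language code to text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Selection:
    language: Optional[str]         # chosen language code, or None if nothing usable
    matched_preferred: bool         # True only when a preferred language was found
    available_languages: List[str]  # every language the video offers, sorted


def select_with_fallback(
    available: Dict[str, str],
    preferred: Sequence[str],
    default_language: Optional[str] = None,
) -> Selection:
    """Pick a language by priority, else fall back to a default, else report None."""
    languages = sorted(available)
    for lang in preferred:
        if lang in available:
            return Selection(lang, True, languages)
    # Hypothetical fallback: the video's default/original language, if known.
    if default_language in available:
        return Selection(default_language, False, languages)
    # Nothing matched; the caller can turn this into an error or a warning.
    return Selection(None, False, languages)


result = select_with_fallback(
    {"ja": "...", "es": "..."}, ["de", "en"], default_language="ja"
)
print(result)
```

Returning the list of available languages alongside the choice lets the caller decide whether a missing match should be a warning, an error, or a retry with a different preference list.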
Additional Context
Refer to: enhancement: non-english youtube rag #1960
@Classic298 commented on GitHub (Apr 29, 2025):
Good idea!
Instead of limiting yourself to only one language (which may not be available), accepting many different languages can be a great solution that allows muuuuch more videos to be fetched.
And I have noticed this limitation myself too, recently.
@Classic298 commented on GitHub (May 5, 2025):
created a PR: https://github.com/open-webui/open-webui/pull/13528