[GH-ISSUE #13309] YouTube Loader Multiple Language Support #16879

Closed
opened 2026-04-19 22:42:28 -05:00 by GiteaMirror · 2 comments

Originally created by @DrMatschhirn on GitHub (Apr 28, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/13309

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Referring to the previously reported bug regarding limitations in retrieving transcripts from YouTube videos, I would like to submit a feature request to enhance support for multilingual content.

When retrieving YouTube transcripts via the API, it's currently necessary to manually specify the desired language and check for its availability per video. This is inefficient, especially when processing large amounts of content across different languages or when the language of the transcript varies from video to video.

Thank you for your consideration, and I look forward to any feedback and the possibility of seeing this functionality implemented in a future release.

Desired Solution you'd like

Introduce support for specifying a list of preferred languages (e.g., ["de", "en", "fr"]) during transcript retrieval. The system should then automatically select and return the transcript in the first available language from that list.
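The selection rule described above could be sketched as follows. This is a minimal illustration, not the Open WebUI implementation: the function name `select_language` and the idea that the loader already knows the set of available language codes for a video are assumptions made for the example.

```python
from typing import Iterable, Optional, Sequence


def select_language(preferred: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first code in `preferred` that also appears in `available`.

    `preferred` is ordered by priority, e.g. ["de", "en", "fr"].
    `available` is whatever set of language codes a video offers
    (hypothetical input; how it is obtained is outside this sketch).
    Returns None when no preferred language is available.
    """
    available_set = set(available)  # O(1) membership checks
    for code in preferred:
        if code in available_set:
            return code
    return None


# A video that offers English, French and Spanish, but no German:
chosen = select_language(["de", "en", "fr"], ["en", "fr", "es"])
print(chosen)  # en
```

Because the availability list is checked once per video, no separate request per candidate language is needed.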

Alternatively (or additionally), add the ability to retrieve all available transcripts in all supported languages, grouped or returned in a structured format.
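One possible shape for such a structured result is a mapping from language code to a time-ordered list of segments. The sketch below is only an assumption about the data layout: the input format (pairs of language code and segment dictionaries with `start` and `text` keys) and the name `group_transcripts` are hypothetical and not taken from the project.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple

Segment = Mapping[str, object]  # assumed keys: "start" (seconds) and "text"


def group_transcripts(
    tracks: Iterable[Tuple[str, Iterable[Segment]]],
) -> Dict[str, List[dict]]:
    """Group transcript segments by language code, sorted by start time."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for language_code, segments in tracks:
        for segment in segments:
            grouped[language_code].append(
                {"start": float(segment["start"]), "text": str(segment["text"])}
            )
    # Return a plain dict with each language's segments in chronological order.
    return {
        code: sorted(items, key=lambda item: item["start"])
        for code, items in grouped.items()
    }


tracks = [
    ("en", [{"start": 2.5, "text": "world"}, {"start": 0.0, "text": "hello"}]),
    ("de", [{"start": 0.0, "text": "hallo"}]),
]
result = group_transcripts(tracks)
print(sorted(result))  # ['de', 'en']
```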

Goals of this Feature:

  • Allow automatic language selection based on a prioritized list, without manual intervention for each video.
  • Avoid unnecessary retries or API calls to check for each language’s availability individually.
  • Enable consistent multilingual processing for applications such as translation, NLP, or content indexing.
  • Improve usability and automation for users working with international/multilingual YouTube content.

Alternatives Considered

Add a fallback option when none of the preferred languages are found (e.g., return the default/original language or return an error/warning).
Offer structured output listing all available transcript languages with timestamps (if applicable).
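The fallback behaviour mentioned here could look roughly like the sketch below. Everything in it is hypothetical: `resolve_language`, the `strict` flag, and `NoTranscriptLanguageError` are names invented for illustration, and the video's original language is assumed to be known by the caller.

```python
from typing import Collection, Optional, Sequence, Tuple


class NoTranscriptLanguageError(LookupError):
    """Raised when neither a preferred nor a fallback language is available."""


def resolve_language(
    preferred: Sequence[str],
    available: Collection[str],
    original_language: Optional[str] = None,
    strict: bool = False,
) -> Tuple[str, bool]:
    """Return (language_code, used_fallback).

    Tries each preferred language in order. If none is available and
    `strict` is False, falls back to the video's original language when
    that one is available. Otherwise raises NoTranscriptLanguageError.
    """
    for code in preferred:
        if code in available:
            return code, False
    if not strict and original_language is not None and original_language in available:
        return original_language, True
    raise NoTranscriptLanguageError(
        f"none of {list(preferred)} available; found {sorted(available)}"
    )


# No German transcript, so the original Japanese track is used instead.
print(resolve_language(["de"], ["ja", "en"], original_language="ja"))  # ('ja', True)
```

Returning the `used_fallback` flag lets a caller log a warning instead of failing, while `strict=True` covers the "return an error" variant.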

Additional Context

Refer to: enhancement: non-english youtube rag #1960


@Classic298 commented on GitHub (Apr 29, 2025):

Good idea!

Instead of limiting yourself to only one language (which may not be available), accepting several languages can be a great solution that allows many more videos to be fetched.

And I have noticed this limitation myself too, recently.


@Classic298 commented on GitHub (May 5, 2025):

Created a PR: https://github.com/open-webui/open-webui/pull/13528


Reference: github-starred/open-webui#16879