mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-08 21:09:41 -05:00
[PR #13528] [MERGED] feat: Enhance YouTube Transcription Loader for multi-language support #46273
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/13528
Author: @Classic298
Created: 5/5/2025
Status: ✅ Merged
Merged: 5/6/2025
Merged by: @tjbck
Base: dev ← Head: dev
📝 Commits (10+)
7680ac2 Update youtube.py
0a845db Update youtube.py
0a3817e Update youtube.py
1a30b37 Update youtube.py
b0d74a5 Update youtube.py
9cf3381 Update youtube.py
791dd24 Update youtube.py
67a612f Update youtube.py
5e1cb76 Update youtube.py
a129e09 Update youtube.py
📊 Changes
1 file changed (+34 additions, -19 deletions)
📝 backend/open_webui/retrieval/loaders/youtube.py (+34 -19)
📄 Description
Pull Request Checklist
Before submitting, make sure you've checked the following:
dev branch.
Changelog Entry
Description
Enhanced YouTube transcript loader to properly handle multiple language fallbacks. Previously, if a transcript wasn't available in the configured language, it would only fall back to English. Now, multiple languages can be specified in priority order, and the system will try each language in sequence before eventually falling back to English.
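The priority-order fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in youtube.py: `TranscriptUnavailable`, `build_language_order`, `fetch_with_fallback`, and the `fetch` callable are hypothetical names introduced here, and the real loader's error handling may differ.

```python
from typing import Callable, Iterable, List, Tuple


class TranscriptUnavailable(Exception):
    """Hypothetical error: no transcript exists for the requested language."""


def build_language_order(configured: Iterable[str], default: str = "en") -> List[str]:
    """Return the configured languages in priority order, ending with the default.

    Duplicates and empty entries are dropped; the default (English) is
    appended only if it is not already present.
    """
    order: List[str] = []
    for lang in configured:
        lang = lang.strip()
        if lang and lang not in order:
            order.append(lang)
    if default not in order:
        order.append(default)
    return order


def fetch_with_fallback(
    configured: Iterable[str],
    fetch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each language in turn and return (language, transcript).

    `fetch` is a stand-in (assumption) for whatever call retrieves a
    transcript; it is expected to raise TranscriptUnavailable on a miss.
    """
    order = build_language_order(configured)
    for lang in order:
        try:
            return lang, fetch(lang)
        except TranscriptUnavailable:
            continue  # fall through to the next language in the list
    raise TranscriptUnavailable(f"no transcript found in any of: {', '.join(order)}")
```

Because the default is always appended, a configured value such as `es,de` is tried in the same order as `es,de,en` in this sketch.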
This is a useful feature: in some use cases, working with videos in other languages results in an unexpected error simply because no transcript is available in, e.g., `de` and `en`.
With this change, you can now define a custom priority list, e.g. `es,de,en`, so that a transcript in each listed language is attempted in turn.
The behaviour of defaulting to English when none of the listed languages can be fetched remains unchanged, so a list of `es,de` gives the same result as `es,de,en`.
Added
Changed
`load()` method in `YoutubeLoader` class to attempt to fetch transcripts in each configured language in priority order
`YOUTUBE_LOADER_LANGUAGE` config option to clarify the new behavior: https://github.com/open-webui/docs/pull/528
Fixed
Additional Information
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.