[GH-ISSUE #1903] feat: youtube video rag #51346

Closed
opened 2026-05-05 12:20:42 -05:00 by GiteaMirror · 2 comments

Originally created by @tjbck on GitHub (May 1, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1903

Originally assigned to: @tjbck on GitHub.


@woutervdijke commented on GitHub (May 3, 2024):

Is it currently not possible to import non-English YouTube videos with this RAG feature? When I try to add one, I get the following error:

Something went wrong :/
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=Cp-SysUm2KU! This is most likely caused by:

No transcripts were found for any of the requested language codes: ['en']

For this video ( [id] ) transcripts are available in the following languages:

(MANUALLY CREATED)
- nl ("Dutch")[TRANSLATABLE]

(GENERATED)
- nl ("Dutch (auto-generated)")[TRANSLATABLE]

(TRANSLATION LANGUAGES)
- af ("Afrikaans")
- ak ("Akan")
[....]
yo ("Yoruba")
- zu ("Zulu")

English is the default setting of the LangChain YouTube loader, according to [this page of the docs](https://python.langchain.com/docs/integrations/document_loaders/youtube_transcript/), but the loader does have a `language` and a `translation` parameter.

Is there a way to change the requested language codes, in settings or while uploading? Or should this be a new feature request?
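For illustration, here is a minimal sketch of how the `language` and `translation` parameters mentioned above might be passed to the LangChain loader. The helper names `build_loader_kwargs` and `load_transcript` are hypothetical and not part of Open WebUI or LangChain; the sketch assumes the `langchain_community` package (and its transcript dependency) is installed, and it is not how Open WebUI currently wires the loader.

```python
from typing import Optional, Sequence


def build_loader_kwargs(
    languages: Sequence[str] = ("en",),
    translation: Optional[str] = None,
) -> dict:
    """Build keyword arguments for YoutubeLoader.from_youtube_url.

    `languages` lists transcript language codes in order of preference.
    `translation`, if given, requests a translated transcript.
    (Hypothetical helper, for illustration only.)
    """
    kwargs = {"add_video_info": False, "language": list(languages)}
    if translation is not None:
        kwargs["translation"] = translation
    return kwargs


def load_transcript(url: str, languages=("en",), translation=None):
    """Load a transcript as LangChain documents (hypothetical wrapper)."""
    # Imported lazily so build_loader_kwargs works without LangChain installed.
    from langchain_community.document_loaders import YoutubeLoader

    loader = YoutubeLoader.from_youtube_url(
        url, **build_loader_kwargs(languages, translation)
    )
    return loader.load()
```

With something like this, the Dutch video from the error above could be requested as `load_transcript(url, languages=["nl", "en"])`, or with `translation="en"` to ask for an English translation of the Dutch transcript.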


@thebetauser commented on GitHub (May 4, 2024):

Hello,
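There are multiple shortened forms of YouTube links; the most common one is "youtu.be". Can we add a check [to this line](https://github.com/open-webui/open-webui/blob/30b053116d6a40fc60dad1766e83ed41ffcb712c/src/lib/components/chat/MessageInput/Documents.svelte#L146) to include "youtu.be"?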
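As a rough sketch of the kind of check being requested: the actual check lives in a Svelte component, so the Python below only illustrates the logic and is not the project's code. The function name `is_youtube_url` and the host list are assumptions made for this example.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube links in this sketch (assumed list, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if the URL's host is a known YouTube domain."""
    # hostname is None when the URL has no scheme, so fall back to "".
    host = (urlparse(url).hostname or "").lower()
    return host in YOUTUBE_HOSTS
```

Matching on the parsed host rather than on a substring avoids false positives such as `https://example.com/youtu.be`.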

Reference: github-starred/open-webui#51346