feat: youtube video rag #784

Closed
opened 2025-11-11 14:31:13 -06:00 by GiteaMirror · 2 comments

Originally created by @tjbck on GitHub (May 1, 2024).

Originally assigned to: @tjbck on GitHub.

@woutervdijke commented on GitHub (May 3, 2024):

Is it currently not possible to import non-English YouTube videos with this RAG feature? When I try to add one, I get the following error:

Something went wrong :/ Could not retrieve a transcript for the video https://www.youtube.com/watch?v=Cp-SysUm2KU! This is most likely caused by: No transcripts were found for any of the requested language codes: ['en'] For this video ( [id] ) transcripts are available in the following languages: (MANUALLY CREATED) - nl ("Dutch")[TRANSLATABLE] (GENERATED) - nl ("Dutch (auto-generated)")[TRANSLATABLE] (TRANSLATION LANGUAGES) - af ("Afrikaans") - ak ("Akan") [....] yo ("Yoruba") - zu ("Zulu")

English is the default language of the LangChain YouTube loader, according to [this page of the docs](https://python.langchain.com/docs/integrations/document_loaders/youtube_transcript/), but the loader does have a `language` and a `translation` parameter.

Is there a way to change the requested language codes, in settings or while uploading? Or should this be a new feature request?
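
For illustration, here is a minimal sketch of how the `language` and `translation` parameters mentioned above could be passed to LangChain's `YoutubeLoader`. The helper names `build_loader_kwargs` and `load_transcript` are hypothetical and are not part of Open WebUI or LangChain. The sketch assumes the `langchain-community` and `youtube-transcript-api` packages are installed, and the loader's default for `translation` may differ between LangChain versions.

```python
from typing import Any, Dict, List, Optional


def build_loader_kwargs(
    languages: List[str], translation: Optional[str] = None
) -> Dict[str, Any]:
    """Build keyword arguments for LangChain's YoutubeLoader (hypothetical helper).

    `languages` lists transcript language codes in order of preference;
    `translation`, if given, requests a translation into that language code.
    """
    kwargs: Dict[str, Any] = {"add_video_info": False, "language": list(languages)}
    if translation is not None:
        # Only set when requested, so the loader's own default applies otherwise.
        kwargs["translation"] = translation
    return kwargs


def load_transcript(url: str, languages: List[str], translation: Optional[str] = None):
    """Load a video's transcript as LangChain documents (hypothetical helper)."""
    # Imported lazily: requires `langchain-community` and `youtube-transcript-api`.
    from langchain_community.document_loaders import YoutubeLoader

    loader = YoutubeLoader.from_youtube_url(
        url, **build_loader_kwargs(languages, translation)
    )
    return loader.load()
```

With this sketch, a call such as `load_transcript("https://www.youtube.com/watch?v=Cp-SysUm2KU", ["nl", "en"])` would ask for the Dutch transcript first and fall back to English, instead of requesting only `['en']` as in the error above.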

@thebetauser commented on GitHub (May 4, 2024):

Hello,
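There are multiple shortening services for YouTube links; the most common one is "youtu.be". Can we add a check [to this line](https://github.com/open-webui/open-webui/blob/30b053116d6a40fc60dad1766e83ed41ffcb712c/src/lib/components/chat/MessageInput/Documents.svelte#L146) to include "youtu.be"?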
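
As a rough sketch of the kind of check being requested, the snippet below matches on the URL's host rather than on a substring. It is written in Python purely for illustration (the line in question lives in a Svelte component), and the function name `is_youtube_url` and the host list are assumptions, not code from the project.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube links in this sketch; the list is an assumption.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if the URL's host is a known YouTube host (hypothetical helper)."""
    # `hostname` is None when the URL has no scheme, e.g. "youtu.be/abc".
    host = urlparse(url).hostname or ""
    return host in YOUTUBE_HOSTS
```

Comparing the parsed host avoids false positives from URLs that merely contain "youtu.be" somewhere in their path or query string.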

Reference: github-starred/open-webui#784