pdf text extraction #71

Closed
opened 2025-11-11 14:03:48 -06:00 by GiteaMirror · 5 comments

Originally created by @JohnZolton on GitHub (Dec 3, 2023).

I want to drag PDFs into the chat window and talk with the AI about them by loading the PDF text into the prompt. I'm trying to implement it myself but am having issues with the PDF libraries and Svelte/Vite. The PDFs would already be OCR'd, so it's just a matter of extracting text.
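
For illustration, the "load the PDF text into the prompt" step could look roughly like the sketch below. It assumes the page text has already been extracted by whatever PDF library ends up working with Svelte/Vite (extraction itself is not shown). The `buildPdfPrompt` name, the `maxChars` limit, and the prompt layout are all hypothetical and are not taken from the webui code.

```typescript
// Hypothetical helper: the name, prompt layout, and character budget are
// illustrative assumptions, not part of any existing webui API.
interface PdfPrompt {
  prompt: string;
  truncated: boolean; // true when some pages did not fit in the budget
}

function buildPdfPrompt(
  pages: string[],
  question: string,
  maxChars: number = 8000
): PdfPrompt {
  // Collapse whitespace on each page and drop pages with no text,
  // keeping the original 1-based page numbers.
  const cleaned = pages
    .map((text, i) => ({ page: i + 1, text: text.replace(/\s+/g, " ").trim() }))
    .filter((p) => p.text.length > 0);

  let body = "";
  let truncated = false;
  for (const p of cleaned) {
    const section = `[Page ${p.page}]\n${p.text}\n\n`;
    // Stop at the first page that would exceed the character budget.
    if (body.length + section.length > maxChars) {
      truncated = true;
      break;
    }
    body += section;
  }

  const prompt =
    `Use the document below to answer the question.\n\n${body}Question: ${question}`;
  return { prompt, truncated };
}
```

The character budget is a crude stand-in for a token limit; a real implementation would likely count tokens for the selected model instead.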


@DanMyers300 commented on GitHub (Dec 4, 2023):

I would like this too, but with the ability to integrate with LangChain. I'm attempting to work on it; if successful, I'll make a PR.


@tjbck commented on GitHub (Dec 5, 2023):

Hi, thanks for the feature request! I'm actively working on a RAG feature for the webui, so stay tuned! Let's continue our discussion here: #31


@bhaidar commented on GitHub (May 27, 2024):

Hi @tjbck,
I uploaded a PDF file (~2 MB) and asked LLAMA3 what the document is about, and received "I am not sure ..". Is there a special model to use with documents? Thanks


@dmvieira commented on GitHub (Sep 12, 2024):

It's not working for me either, @bhaidar. I'm trying to summarize a large PDF, and it fails because RAG is used to retrieve the PDF text. Why not add an option to use the whole PDF when generating a response?
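
To make the distinction concrete, here is a minimal sketch of the kind of option being asked for: pass the whole document when it fits in a context budget, and only fall back to retrieving a subset of chunks when it does not. This is not how the webui implements RAG. `selectContext`, `budgetChars`, and the keyword-overlap scoring (a simple stand-in for embedding similarity) are assumptions made for illustration.

```typescript
// Hypothetical sketch: names and scoring are illustrative assumptions.
type ContextMode = "full" | "retrieval";

interface ContextSelection {
  mode: ContextMode;
  chunks: string[];
}

// Lowercase word tokens with duplicates removed.
function uniqueTerms(text: string): string[] {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return words.filter((w, i) => words.indexOf(w) === i);
}

function selectContext(
  chunks: string[],
  query: string,
  budgetChars: number
): ContextSelection {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  // If the whole document fits, use all of it (what a summary needs).
  if (total <= budgetChars) {
    return { mode: "full", chunks: chunks.slice() };
  }

  // Otherwise score each chunk by how many query terms it contains.
  const queryTerms = uniqueTerms(query);
  const scored = chunks.map((chunk, index) => {
    const score = uniqueTerms(chunk).filter((t) => queryTerms.indexOf(t) !== -1).length;
    return { chunk, index, score };
  });
  scored.sort((a, b) => b.score - a.score || a.index - b.index);

  // Greedily keep the best-scoring chunks that still fit in the budget.
  const picked: { chunk: string; index: number; score: number }[] = [];
  let used = 0;
  for (const s of scored) {
    if (used + s.chunk.length > budgetChars) continue;
    picked.push(s);
    used += s.chunk.length;
  }

  // Restore document order so the selected context reads naturally.
  picked.sort((a, b) => a.index - b.index);
  return { mode: "retrieval", chunks: picked.map((s) => s.chunk) };
}
```

In `"retrieval"` mode the model only sees the chunks that matched the query, which is why a request like "summarize this document" can come back incomplete or as "I am not sure".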


@thiswillbeyourgithub commented on GitHub (Sep 12, 2024):

@bhaidar and @dmvieira this might be related to my ongoing PR #5378


Reference: github-starred/open-webui#71