[GH-ISSUE #5762] How to pass a custom file (txt\doc\pdf) to prompt through the API #3588

Closed
opened 2026-04-12 14:19:37 -05:00 by GiteaMirror · 7 comments

Originally created by @liukesoft on GitHub (Jul 18, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5762

Hello, I have deployed Ollama on my local Linux machine and would like to call it through the API. The model is llama3, and my prompt is a custom Python template from which the model should generate new Python code. How can I send my template file (txt/doc/pdf) to the interface through an API call? How do I set the parameters? Thanks!!

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
# How can I use a custom prompt template here?

GiteaMirror added the question label 2026-04-12 14:19:37 -05:00

@rick-github commented on GitHub (Jul 18, 2024):

ollama is just an inference engine; it doesn't do document extraction. For that you would use something like a document loader from langchain_community.document_loaders or llama_parse.LLamaParse. Then you include the extracted information, along with your prompt, in the prompt field of the message you send to ollama. The parameters you can set are laid out in the API guide (https://github.com/ollama/ollama/blob/main/docs/api.md).

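For example, a minimal sketch of this workflow in Python, assuming the requests, langchain-community, and pypdf packages are installed; the file name, model, and instruction text are placeholders:

import requests
from langchain_community.document_loaders import PyPDFLoader

# Extract the text client-side; ollama itself never sees the PDF.
docs = PyPDFLoader("my_template.pdf").load()
template_text = "\n".join(d.page_content for d in docs)

# Include the extracted text in the prompt field of /api/generate.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": f"Here is my Python template:\n{template_text}\n\n"
                  "Generate new Python code that follows this template.",
        "stream": False,  # return one JSON object instead of a stream
    },
)
print(resp.json()["response"])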

@robotom commented on GitHub (Oct 24, 2024):

I can do the text extraction from the document. At that point, is it recommended to just pass entire walls of text into each prompt?

Are there any tunable settings to maximize the model's understanding of this wall of text?

Or is it as simple as dumping all the text into the message that I send to the model?

Any advice on this would be super helpful! Thank you.


@rick-github commented on GitHub (Oct 24, 2024):

Increase the context size (https://github.com/ollama/ollama/issues/7341) and send the text.

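For instance, the context window can be raised per request through the options field of the API; a sketch, where "extracted.txt" stands in for the extracted document text and 8192 is an arbitrary example value (the model and your available memory must support it):

import requests

# "extracted.txt" is a placeholder for the text pulled out of your document.
long_text = open("extracted.txt", encoding="utf-8").read()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": long_text + "\n\nSummarize the text above.",
        "options": {"num_ctx": 8192},  # raise the context window for this request
        "stream": False,
    },
)
print(resp.json()["response"])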

@robotom commented on GitHub (Oct 24, 2024):

Thanks. Any idea on how well the model will understand this text? Are there other methods like tuning, params, etc. to achieve this?

Is the recommended maximum 4096, or are larger values possible? Is there any data on how context size affects CPU/RAM usage?

Thanks.


@rick-github commented on GitHub (Oct 24, 2024):

https://github.com/ollama/ollama/issues/7341#issuecomment-2436504435


@armartinez commented on GitHub (Sep 24, 2025):

@rick-github The ollama app supports passing files as input. Is this functionality available through the API?


@rick-github commented on GitHub (Sep 24, 2025):

The app does, as outlined in https://github.com/ollama/ollama/issues/5762#issuecomment-2237027537: it converts the input file into text that is added to the prompt. The API doesn't provide this; it's done by the client. In the app's case, the app is the client doing the conversion.

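In other words, a client can reproduce what the app does in a few lines. A sketch for a plain-text file sent through /api/chat; the file name and question are placeholders:

import requests

# Read the file client-side, as the app does, before calling the API.
file_text = open("notes.txt", encoding="utf-8").read()

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [
            {"role": "user",
             "content": file_text + "\n\nWhat does this file describe?"},
        ],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])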