[GH-ISSUE #7755] Proper way to train model on my data and load into Ollama? #67008

Closed
opened 2026-05-04 09:14:00 -05:00 by GiteaMirror · 4 comments

Originally created by @robotom on GitHub (Nov 20, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7755

As I've mentioned in the title, I have some huge text-based documents which exceed typical context windows, even on large machines with large models (e.g. 405B). Is there a way I could train llama3.1:8B (for example) on these docs and then load it into Ollama and ask the model about them? Thank you!

GiteaMirror added the feature request label 2026-05-04 09:14:00 -05:00

@rick-github commented on GitHub (Nov 20, 2024):

Your options are [RAG](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) or [fine-tuning](https://en.wikipedia.org/wiki/Fine-tuning_(deep_learning)). There are RAG implementations for ollama in the [community integrations](https://github.com/ollama/ollama?tab=readme-ov-file#community-integrations). RAG doesn't "understand" the document, though; it just searches based on a query and responds with a result synthesized from the search results. Fine-tuning options include [unsloth](https://unsloth.ai/), [llama factory](https://github.com/hiyouga/LLaMA-Factory) and [axolotl](https://github.com/axolotl-ai-cloud/axolotl). The problem with fine-tuning is that it takes time to generate the new model; if you are receiving daily doc dumps it might not be fit for purpose. A newer entry into the scene is [instructlab](https://github.com/instructlab), which from the sound of it is a more iterative approach to fine-tuning, so it might fit your use case better. I haven't used it yet, so my understanding may be incorrect.
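For the RAG route described above, a minimal retrieval loop over a local Ollama server can be sketched roughly as below. The `/api/embeddings` and `/api/generate` endpoints are Ollama's documented HTTP API; the embedding model, chat model, chunking, and prompt wording are illustrative assumptions, not recommendations:

```python
# Minimal RAG sketch against a local Ollama server (http://localhost:11434).
# Model names below are examples; substitute whatever you have pulled.
import json
import math
import urllib.request

OLLAMA = "http://localhost:11434"

def _post(path, payload):
    # Small helper around Ollama's JSON-over-HTTP API.
    req = urllib.request.Request(
        OLLAMA + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def embed(text):
    # /api/embeddings returns {"embedding": [...]} for a single prompt.
    return _post("/api/embeddings",
                 {"model": "nomic-embed-text", "prompt": text})["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=3):
    # Indices of the k chunks most similar to the query embedding.
    order = sorted(range(len(chunk_vecs)),
                   key=lambda i: cosine(query_vec, chunk_vecs[i]),
                   reverse=True)
    return order[:k]

def answer(question, chunks):
    # Embed everything, keep the best-matching chunks, and ask the model
    # to answer using only that retrieved context.
    vecs = [embed(c) for c in chunks]
    context = "\n---\n".join(chunks[i] for i in top_k(embed(question), vecs))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return _post("/api/generate",
                 {"model": "llama3.1:8b", "prompt": prompt,
                  "stream": False})["response"]
```

This is the "searches based on a query" behavior noted above: only the retrieved chunks reach the model, so answers spanning the whole corpus still require better chunking or a fine-tune.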


@robotom commented on GitHub (Nov 20, 2024):

Thanks for that. I had a look at the links for fine-tuning. Llama factory looks simple enough to use. I've never done this before but I think it's something like this:

  1. Create a "dataset" --- not sure what format this needs to be in but I have a bunch of docs/txts.
  2. Upload dataset, tweak the parameters and press "Train/Start".
  3. Load the new_llama3_model into an engine --- I want to load it into Ollama locally on my machine.

Do you have any advice on 1-2-3? I am designing a piece of software and I really want it to revolve around Ollama as the underlying engine.

I also don't mind taking some time to generate new models. I take it we're talking days not weeks of GPU time.

Thanks!
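For step 1 above, LLaMA-Factory's default format is an Alpaca-style JSON file of instruction/input/output records. A naive starting point for turning a folder of .txt docs into that shape might look like the sketch below (the chunk size, instruction wording, and the idea of leaving `output` empty are all hypothetical; usable fine-tuning data generally needs hand-written or synthesized answers in the `output` field):

```python
# Sketch: flatten a folder of .txt docs into an Alpaca-style JSON dataset.
# Chunk size and the instruction template are illustrative placeholders.
import json
from pathlib import Path

def build_alpaca_dataset(doc_dir, out_path, chunk_chars=2000):
    records = []
    for path in sorted(Path(doc_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Split each doc into fixed-size character chunks.
        for i in range(0, len(text), chunk_chars):
            records.append({
                "instruction": f"Summarize this excerpt from {path.name}.",
                "input": text[i:i + chunk_chars],
                "output": "",  # fill in with a reference answer before training
            })
    Path(out_path).write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records
```

The resulting file is then registered in LLaMA-Factory's dataset config and selected in the web UI before pressing Start.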

@rick-github commented on GitHub (Jan 13, 2025):

https://www.datacamp.com/tutorial/llama-factory-web-ui-guide-fine-tuning-llms
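For step 3 (loading the result into Ollama): once the fine-tuned weights are merged and converted to GGUF (e.g. with llama.cpp's conversion tooling), Ollama imports them via a Modelfile whose `FROM` points at the local file, followed by `ollama create`. A small sketch, with the GGUF path and model name as placeholders:

```python
# Sketch: package a fine-tuned GGUF for Ollama. The GGUF path and model
# name are placeholders; create_model() needs a local Ollama install.
import subprocess
from pathlib import Path

def write_modelfile(gguf_path, out="Modelfile"):
    # FROM may point at a local GGUF file; PARAMETER lines are optional.
    Path(out).write_text(f"FROM {gguf_path}\nPARAMETER temperature 0.7\n",
                         encoding="utf-8")
    return out

def create_model(name, modelfile):
    # Equivalent to running: ollama create <name> -f <modelfile>
    subprocess.run(["ollama", "create", name, "-f", modelfile], check=True)

# Example (not executed here):
# create_model("new_llama3_model", write_modelfile("./merged-model.gguf"))
```

After `ollama create` succeeds, the model is available to `ollama run` and the HTTP API like any pulled model.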

@OussamaAdaoumoum commented on GitHub (Apr 7, 2025):

@robotom , could I have your email or another way to contact you to ask about this, please?

Reference: github-starred/ollama#67008