[GH-ISSUE #10194] Add Human Feedback #68745

Closed
opened 2026-05-04 15:03:35 -05:00 by GiteaMirror · 3 comments

Originally created by @KevinKrueger on GitHub (Apr 9, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10194

Hello,

The idea is that you can say whether or not you liked an answer, optionally with a reason, so that you have the possibility to improve your model further.

This should then be possible through the API.

I know, of course, that human feedback can be much more than just “yes, it was good” or “no, it was not” plus a reason (if a reason makes any sense at all), but I prefer to approach the topic in small steps.

What do you think about this?

GiteaMirror added the feature request label 2026-05-04 15:03:35 -05:00

@KevinKrueger commented on GitHub (Apr 9, 2025):

The technical idea of the request:
{
"Output": "The model output text",
"Evaluation": 0, // 0 = bad, 1 = good, etc.
"Reason": "The reason why this is bad or good, etc.",
"Solution": "The solution to make the model better"
}
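
A minimal Python sketch of how such a feedback record could be validated and serialized. The field names come from the payload above; the `Feedback` class and its methods are hypothetical and not part of any Ollama API, and restricting `Evaluation` to 0 or 1 is an assumption (the proposal leaves room for more values).

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    """Hypothetical feedback record mirroring the proposed payload."""
    output: str                      # the model output being rated
    evaluation: int                  # assumption: 0 = bad, 1 = good
    reason: Optional[str] = None     # optional explanation of the rating
    solution: Optional[str] = None   # optional suggestion for improvement

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 0/1 scale.
        if self.evaluation not in (0, 1):
            raise ValueError("evaluation must be 0 (bad) or 1 (good)")

    def to_json(self) -> str:
        # Use the capitalized key names from the proposal.
        return json.dumps({
            "Output": self.output,
            "Evaluation": self.evaluation,
            "Reason": self.reason,
            "Solution": self.solution,
        })


fb = Feedback(output="Paris is the capital of France.", evaluation=1, reason="Correct")
print(fb.to_json())
```

Keeping `Reason` and `Solution` optional matches the "small steps" approach: a plain good/bad rating is already a valid record.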


@apunkt commented on GitHub (Apr 9, 2025):

Ollama is a backend system.
What you want to add is a frontend feature!
It is implemented in OpenWebUI, for example, but that only collects the human feedback (HF).
To reflect this feedback in the models, you need to fine-tune the model yourself with the HF using reinforcement learning (RL), a technique called Reinforcement Learning from Human Feedback (RLHF).

For this you need a) a massive amount of HF (>thousands) and b) a massive amount of hardware, as this is much more demanding than the simple inference that Ollama does.
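
To illustrate the collection step only (not the fine-tuning itself), here is a minimal Python sketch of how a frontend might log feedback records as JSON lines and later split them by rating. All function names are hypothetical, the record keys follow the payload proposed earlier in this thread, and an in-memory buffer stands in for a real log file.

```python
import io
import json
from typing import Dict, Iterable, Iterator, List, TextIO, Tuple


def append_feedback(stream: TextIO, record: Dict) -> None:
    """Write one feedback record as a single JSON line."""
    stream.write(json.dumps(record) + "\n")


def read_feedback(stream: TextIO) -> Iterator[Dict]:
    """Yield feedback records from a JSON-lines stream, skipping blank lines."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def split_by_rating(records: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate records into (good, bad) in a single pass; assumes 1 = good."""
    good: List[Dict] = []
    bad: List[Dict] = []
    for record in records:
        (good if record["Evaluation"] == 1 else bad).append(record)
    return good, bad


# Usage: an in-memory buffer stands in for a feedback log file.
log = io.StringIO()
append_feedback(log, {"Output": "2 + 2 = 4", "Evaluation": 1, "Reason": "Correct"})
append_feedback(log, {"Output": "2 + 2 = 5", "Evaluation": 0, "Reason": "Wrong sum"})
log.seek(0)

good, bad = split_by_rating(read_feedback(log))
print(len(good), len(bad))
```

A log like this is only raw material: turning it into a better model would still require the separate fine-tuning step described above.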


@KevinKrueger commented on GitHub (Apr 9, 2025):

Okay, yes, I understand. Let me put it this way: if it were easy, I would do it myself.

But it would be something if Ollama could do both ;)


Reference: github-starred/ollama#68745