[GH-ISSUE #4567] Inference results from ollama are inconsistent with inference results from Transformers (HF) #2865

Open
opened 2026-04-12 13:12:47 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @githubzuoyi on GitHub (May 22, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4567

What is the issue?

These are the evaluation results from running our own benchmark on both ollama and transformers, and the gap is huge.

My benchmark is a set of tasks similar to function calling, which is more difficult than QA tasks such as summarization: it requires the model to have solid instruction-following and logical-reasoning ability. The evaluation shows that the same model, llama3-8B-int4, produces completely different results on the two frameworks.

[Screenshot: benchmark scores for llama3-8B-int4 on ollama vs. Transformers (https://github.com/ollama/ollama/assets/22168174/2ab36ef7-d0c7-4d4f-8a72-e9b7e3c50008)]
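Before attributing a score gap like this to either framework, it helps to rule out decoding differences: ollama and Transformers ship different sampling defaults, and an int4 GGUF quantization is not the same weights as an HF bitsandbytes int4 load. Below is a minimal sketch (not from the issue; all option names are the standard ollama/Transformers generation parameters, while the benchmark prompts and helper are hypothetical) of aligning both sides on greedy decoding and measuring how often their outputs agree.

```python
# Sketch: align decoding settings across ollama and Transformers before
# comparing benchmark scores, then measure output agreement.

# Options that make ollama's generation deterministic (greedy-like decoding),
# passed as the "options" field of a /api/generate request.
OLLAMA_OPTIONS = {
    "temperature": 0.0,   # disable sampling randomness
    "seed": 0,            # fix the RNG seed
    "num_predict": 256,   # cap generated tokens
}

# Matching kwargs for transformers' model.generate().
HF_GENERATE_KWARGS = {
    "do_sample": False,     # greedy decoding
    "max_new_tokens": 256,  # same length cap as above
}

def agreement_rate(outputs_a, outputs_b):
    """Fraction of prompts whose whitespace/case-normalized outputs match."""
    assert len(outputs_a) == len(outputs_b)
    norm = lambda s: " ".join(s.split()).lower()
    matches = sum(norm(a) == norm(b) for a, b in zip(outputs_a, outputs_b))
    return matches / len(outputs_a)
```

If the agreement rate stays low even with both sides decoding greedily and with identical prompt templates (ollama applies the model's chat template from the Modelfile; Transformers only does so if you call `tokenizer.apply_chat_template`), the remaining gap is most likely the quantization format itself.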

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.37

GiteaMirror added the bug label 2026-04-12 13:12:47 -05:00

Reference: github-starred/ollama#2865