[GH-ISSUE #7151] Falcon Mamba Quality Issues #4539

Closed
opened 2026-04-12 15:28:37 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @hg0428 on GitHub (Oct 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7151

What is the issue?

Ollama has a quality issue when running Falcon Mamba that is not present when using Llama.cpp.

Ollama:

>>> What is 10*10/10 + 10/10*10?
Let's solve the given equation step by step: 
- First, we need to simplify each term in parentheses according to BODMAS rule which stands for Brackets first then Orders(or Powers and Square Roots), followed by Division or Multiplication 
from left side. Finally Addition/Subtraction 
       So our expression becomes 
       =10*(1+0)
     - Next, we solve the sum inside bracket 
      This gives us:
       =(20)*2
   
Therefore 
<div style="text-align:center">
  The final answer is equal to  **40**.
  </br>  
</body></html>

Llama.cpp:

> What is 10*10/10 + 10/10*10?
To solve this expression, we should follow the order of operations, which is often remembered by the acronym PEMDAS (Parentheses, Exponents, Multiplication and Division, and Addition and Subtraction). 

The expression given is: 10*10/10 + 10/10*10

First, we will do the multiplication and division from left to right:

1. 10*10/10 = 100/10 = 10
2. 10/10*10 = 1*10 = 10

Now, we can replace these values back into the expression:

10 + 10

Finally, we add the numbers:

10 + 10 = 20

So, the result of the expression 10*10/10 + 10/10*10 is 20.


Both were using the same model, [Falcon Mamba 7b Instruct q4_0](https://ollama.com/Hudson/falcon-mamba-instruct), and the same system prompt. All settings are default on both.

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.12

GiteaMirror added the bug label 2026-04-12 15:28:37 -05:00
Author
Owner

@victorb commented on GitHub (Oct 9, 2024):

> All settings are default on both.

Can you try setting these values yourself, so you can make sure the values are the same for both Ollama and llama.cpp?
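One way to follow this suggestion is to pin the sampler values explicitly on both sides. A sketch of what that could look like (the model file name and tag are placeholders, and the flag/option names should be double-checked against each project's docs for the versions in use):

```shell
# llama.cpp: set the sampler values on the command line
./llama-cli -m falcon-mamba-7b-instruct-q4_0.gguf \
  --temp 0.8 --repeat-penalty 1.1 --min-p 0.05 \
  -p "What is 10*10/10 + 10/10*10?"

# Ollama: pass the same values through the generate API's "options" field
curl http://localhost:11434/api/generate -d '{
  "model": "hudson/falcon-mamba-instruct",
  "prompt": "What is 10*10/10 + 10/10*10?",
  "options": {"temperature": 0.8, "repeat_penalty": 1.1, "min_p": 0.05}
}'
```

With identical values on both sides, any remaining quality gap would point at the runtime rather than the sampler configuration.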

Author
Owner

@hg0428 commented on GitHub (Oct 9, 2024):

> > All settings are default on both.
>
> Can you try setting these values yourself, so you can make sure the values are the same for both Ollama and llama.cpp?

Sure, I'll try that. Which settings do you suggest setting?

Author
Owner

@hg0428 commented on GitHub (Oct 9, 2024):

With Ollama's settings applied, Llama.cpp performs like Ollama.
The only differences between the default settings were:

| Setting        | Llama.cpp | Ollama |
| -------------- | --------- | ------ |
| Repeat Penalty | 1         | 1.1    |
| Min p          | 0.1       | 0.05   |

By adjusting the settings, I can make it perform better.
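For context on why those two values matter, here is a minimal, self-contained sketch (not Ollama's or llama.cpp's actual code) of what the two samplers conventionally do:

```python
def apply_repeat_penalty(logits, prev_tokens, penalty):
    """CTRL-style repetition penalty: push down tokens already generated.

    Positive logits are divided by the penalty, negative ones multiplied,
    so penalty = 1.0 leaves the distribution untouched.
    """
    out = dict(logits)
    for t in set(prev_tokens):
        if t in out:
            out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out


def min_p_filter(probs, min_p):
    """Min-p filtering: keep only tokens whose probability is at least
    min_p times the probability of the most likely token."""
    cutoff = min_p * max(probs.values())
    return {t: p for t, p in probs.items() if p >= cutoff}
```

With `repeat_penalty` at 1 (llama.cpp's default here) the logits are untouched, while 1.1 (Ollama's default) suppresses already-seen tokens, and a larger `min_p` prunes more of the low-probability tail, so the two defaults can produce noticeably different completions from the same weights.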


Reference: github-starred/ollama#4539