Gemma 2 9B and 27B is not behaving right #3347

@jkfnc commented on GitHub (Jun 27, 2024):

am using default q4 quants.

@jkfnc commented on GitHub (Jun 27, 2024): am using default q4 quants.

GiteaMirror commented

@jkfnc commented on GitHub (Jun 27, 2024):

fp16 has similar problem for me, it just does not stop . @Qualzz Also the response you posted has more than 40 words. Where as Gemma2:27b on aistudio.google.com sticks to the prompt and gives less than 40 words. not sure whether thats a model issue or config issue.

@jkfnc commented on GitHub (Jun 27, 2024): fp16 has similar problem for me, it just does not stop . @Qualzz Also the response you posted has more than 40 words. Where as Gemma2:27b on aistudio.google.com sticks to the prompt and gives less than 40 words. not sure whether thats a model issue or config issue.

GiteaMirror commented

@bthanos commented on GitHub (Jun 27, 2024):

I see similar strangeness. As the model is putting out words, it continues and will not stop. I tried the 27b version.

@bthanos commented on GitHub (Jun 27, 2024): I see similar strangeness. As the model is putting out words, it continues and will not stop. I tried the 27b version.

GiteaMirror commented

@bthanos commented on GitHub (Jun 27, 2024):

write some python code for a game
profile
gemma2:27b
python
Run
Copy Code
import random

def choose_random_word():
"""
This function chooses a random word from a list.
"""
words = ["python", "javascript", "javascript", "programming", "coding", "computer"]
return random.choice(words)

def play_game():
"""
Plays the game of guessing the randomly chosen word.
"""
words = ["python", "python", "javascript", "python", "computer"]
word = random.choice(words)
word_letters = list(word)
random.shuffle(word_letters) # Shuffle the letters

#Convert the shuffled list of letters back into a string
word = "".join(word_letters)

Create a set of the letters in the word

word_set = set(word_letters)

attempts = 6 # Number of attempts allowed

Number of incorrect guesses

incorrect_guesses = 0

Set the initial state of the word to a string of underscores

incorrect_guesses_list = list(word)
for i in range(len(word)):
letter_guess_list[i] = "_"
word_to_guess = "".join(word_set)

Determine the length of the word

word_length = len(word_to_guess)

Create a string of underscores for the guessed word

word_guessed = list("_" * len(word))

Convert the word to a list of letters

word_list = list(word_guess)

word_to_guess = "".join(word_list)
word_length = len(word_to_guess)

Create a list to store the letters the user has guessed

word_letters = list(word)
word_length = len(word)

Display the hidden word as a string of underscores

print(f"The word is {len(word_to_guess)} letters long.

words_to_guess = list(word_letters)

Set the number of incorrect guesses to 0

incorrect_guesses = 0

Create a list for the guessed letters

word_guessed = list(word)

Create a list to store the positions of the correct letters

(this assumes a word_guess_length of 5)

Print the number of letters in the word

(the underscore representation will be used for the user's guesses)

print(f"Guess the word: {' '.join(guessing_word)}")

Let the user guess letters

for i in range(len(word)):
print(f"Guess {word_letter[i]} : {'' * (len(word) - len(word_with_correct_letters))}")
guessing_word[i] = ""

Store the incorrect letters for the player

correct_guesses = []

Keep track of the number of incorrect guesses

incorrect_guesses_list = []

Create a list to store the letters the user has already guessed

incorrect_guesses_list = list(range(len(incorrect_guesses)))

Store the correct letters

for i in range(len(word_list)):
word_guessed[i] = "*"

Create a list of letters the user has already guessed

word_guessed = ["*"] * word_length

Loop through the letters the user has already guessed

for letter in range(len(word_guessed)):
if letter in word:
# Letter has been correctly guessed
word_guessed[letter] = "*"
else:
# Letter has not been guessed
guessing_list[len(guessing_list)] = letter

Convert the incorrect guesses to the list format

for i in range(len(word_list)):
if "*" in guessing_list:
print("You've got this!")

Store the correct letters

... (code to print the word list)

The rest of the code to play the game

... (game logic)

Create a list to store the letters the user has already guessed

(this code is incorrect)

python
Run
Copy Code

Create a list of unique letters

def play_game():

... (existing code for choosing the word)

... (existing code for guessing the word)

... (code for printing the word)

def display_word(word):
"""
Displays the guessed word with the unguessed letters as underscores.
"""
guessing_list = list(word)
if "_" in guessing_list:
print("You've got this!")

Display the word the user is guessing

print(" ".join(letter_list))

Print the positions of the incorrect letters

for letter in word_guesses:
if letter.isalpha():
print(f"The letter {letter} is in the word.")

Print the incorrect letters

for i in range(len(word)):
print(f"Letter {letter} : {i+1}st letter is {word[i]}")

... (code for guessing the word)

... (guessing code)

Explanation:

The provided code snippet seems to be intended for a simple word-guessing game, but it has some issues and needs to be completed.

Here's a breakdown of the code and the necessary changes:

Original Code:

The original code was incomplete and had a flawed approach to guessing the word.

Purpose of the Code:

This code provides the basic structure for a simple text-based word-guessing game. It's designed to handle the display of the word with only the correctly guessed letters revealed.

Corrected Code:

python
Run
Copy Code
def play_word_game():
word = "python" # Set the word to guess
word_list = list(word)

global word_list

Display the word with the letters the user has not guessed as "_"

word_list = "_" * len(word)

Print the word with the incorrect letters replaced by "*"

print("The word is: ", end="")
for i in range(len(word)):
print(word_list[i], end=" ")

... (code for the rest of the game)

... (function to check if the guessed letter is in the word)

... (function to guess the letter)

... (code to play the game)

The game logic

word_to_guess = word_guess.replace("_", "#") # Replace the "*" in the original word with "#"

... (function to guess the letter)

def print_word(word):

Replace the guessed letters with "_" for the user to see

... (rest of the code)

... (guessing_letters)

... (function to update the word)

Changes:

Display Logic: The original code snippet was missing the code to actually display the word.

Word Guessing:

python
Run
Copy Code
def guessing_game():
word = input("Enter a letter: ")

... (The rest of the code)

def display_word(word):

Create a list of letters

word_list = list(word)
for letter in word_list:
if letter == "_":
return "Incorrect"

Replace the "_" with the letters the user has guessed

word_to_guess = "_" * len(word)

Create a list for the correct and incorrect letters

guessed_letter is the index of the letter in the word

... (Code for the function to update the word)

... (Original function)

Proposed Function:

python
Run
Copy Code
def print_word(word):

Create a list of the letters in the word

word_list = list(word)
for i in range(len(word)):
letter_guess = list("_" * len(word))

... (Rest of the code)

Function to update the word

Replace the "_ " with "#" for the guessed letters

If the word is not guessed correctly, the function replaces the "_" with the correct letter

def guessing_word(guess):
# Replace the guessed_letter with the correct letter
# ... (Code for checking the guessed letter)
Function to Play the Game:

python
Run
Copy Code
def display_word_guess(word_list):

... (Rest of the function code)

Logic:

python
Run
Copy Code

Show the word with correctly guessed letters.

This function is used to solve the problem of finding the correct letter. The function takes a list of letters and a string.
It returns the index of the correct letter in the string.
def get_letter_position(letters, word):
for i in range(len(letters)):
if letters[i] == word:
return i
return "Not Found"

letter_index = word.

The function returns the index of the first instance of the letter in the list

Run
Copy Code
def find_letter_index(letters, alphabet):

for i in range(len(word)):
if word.find(letter):
return i
return "not found"

The function checks if the given letter is in the list
.
"Not Found" means the letter is not in the string.
If the letter is found, the function returns its index.
The letter's index is the index of the letter in the list.

Let's say the letter is "a" and the word is "banana".

Copy Code
Python
letters = list(input_letter.lower())
letter_index = "Not Found"

for i in range(len(word_list)):
if(
letter_index = "banana"
word = "a"

if "banana" in word_word_found = "a"

"word"

word_index = 0
for i in range(len(word)):
if (i == "banana)
letter_index:
if word.find(word_list[0]

Run
Copy Code
The code you provided has some errors.

Let's break down how to fix it.

def get_letter_index(letters):
  return "banana".find(index)
  # The function will now return the correct
  # "The"
python
Run
Copy Code
def get_letter_and_letter_index
Let's break down the corrected code, and explain how it works.


python
Run
Copy Code
def get_index_of_letter(letter, word_list):
  # Convert the word_list to a list of lower case letters

# Iterate through the word_list
for i in range(len(letter)):
  if letter.lower in word_list:
    print(f"banana"): 
  return word.find(letter)
  # "banana" is the

@bthanos commented on GitHub (Jun 27, 2024): write some python code for a game profile gemma2:27b python Run Copy Code import random def choose_random_word(): """ This function chooses a random word from a list. """ words = ["python", "javascript", "javascript", "programming", "coding", "computer"] return random.choice(words) def play_game(): """ Plays the game of guessing the randomly chosen word. """ words = ["python", "python", "javascript", "python", "computer"] word = random.choice(words) word_letters = list(word) random.shuffle(word_letters) # Shuffle the letters #Convert the shuffled list of letters back into a string word = "".join(word_letters) # Create a set of the letters in the word word_set = set(word_letters) attempts = 6 # Number of attempts allowed # Number of incorrect guesses incorrect_guesses = 0 # Set the initial state of the word to a string of underscores incorrect_guesses_list = list(word) for i in range(len(word)): letter_guess_list[i] = "_" word_to_guess = "".join(word_set) # Determine the length of the word word_length = len(word_to_guess) # Create a string of underscores for the guessed word word_guessed = list("_" * len(word)) # Convert the word to a list of letters word_list = list(word_guess) word_to_guess = "".join(word_list) word_length = len(word_to_guess) # Create a list to store the letters the user has guessed word_letters = list(word) word_length = len(word) # Display the hidden word as a string of underscores print(f"The word is {len(word_to_guess)} letters long. words_to_guess = list(word_letters) # Set the number of incorrect guesses to 0 incorrect_guesses = 0 # Create a list for the guessed letters word_guessed = list(word) # Create a list to store the positions of the correct letters # (this assumes a word_guess_length of 5) # Print the number of letters in the word # (the underscore representation will be used for the user's guesses) print(f"Guess the word: {' '.join(guessing_word)}") # Let the user guess letters for i in range(len(word)): print(f"Guess {word_letter[i]} : {'*' * (len(word) - len(word_with_correct_letters))}") guessing_word[i] = "*" # Store the incorrect letters for the player correct_guesses = [] # Keep track of the number of incorrect guesses incorrect_guesses_list = [] # Create a list to store the letters the user has already guessed incorrect_guesses_list = list(range(len(incorrect_guesses))) # Store the correct letters for i in range(len(word_list)): word_guessed[i] = "*" # Create a list of letters the user has already guessed word_guessed = ["*"] * word_length # Loop through the letters the user has already guessed for letter in range(len(word_guessed)): if letter in word: # Letter has been correctly guessed word_guessed[letter] = "*" else: # Letter has not been guessed guessing_list[len(guessing_list)] = letter # Convert the incorrect guesses to the list format for i in range(len(word_list)): if "*" in guessing_list: print("You've got this!") # Store the correct letters # ... (code to print the word list) # The rest of the code to play the game # ... (game logic) Create a list to store the letters the user has already guessed (this code is incorrect) python Run Copy Code # Create a list of unique letters def play_game(): # ... (existing code for choosing the word) # ... (existing code for guessing the word) # ... (code for printing the word) def display_word(word): """ Displays the guessed word with the unguessed letters as underscores. """ guessing_list = list(word) if "_" in guessing_list: print("You've got this!") # Display the word the user is guessing print(" ".join(letter_list)) # Print the positions of the incorrect letters for letter in word_guesses: if letter.isalpha(): print(f"The letter {letter} is in the word.") # Print the incorrect letters for i in range(len(word)): print(f"Letter {letter} : {i+1}st letter is {word[i]}") # ... (code for guessing the word) # ... (guessing code) Explanation: The provided code snippet seems to be intended for a simple word-guessing game, but it has some issues and needs to be completed. Here's a breakdown of the code and the necessary changes: Original Code: The original code was incomplete and had a flawed approach to guessing the word. Purpose of the Code: This code provides the basic structure for a simple text-based word-guessing game. It's designed to handle the display of the word with only the correctly guessed letters revealed. Corrected Code: python Run Copy Code def play_word_game(): word = "python" # Set the word to guess word_list = list(word) global word_list # Display the word with the letters the user has not guessed as "_" word_list = "_" * len(word) # Print the word with the incorrect letters replaced by "*" print("The word is: ", end="") for i in range(len(word)): print(word_list[i], end=" ") # ... (code for the rest of the game) # ... (function to check if the guessed letter is in the word) # ... (function to guess the letter) # ... (code to play the game) # The game logic word_to_guess = word_guess.replace("_", "#") # Replace the "*" in the original word with "#" # ... (function to guess the letter) def print_word(word): # Replace the guessed letters with "_" for the user to see # ... (rest of the code) # ... (guessing_letters) # ... (function to update the word) Changes: Display Logic: The original code snippet was missing the code to actually display the word. Word Guessing: python Run Copy Code def guessing_game(): word = input("Enter a letter: ") # ... (The rest of the code) def display_word(word): # Create a list of letters word_list = list(word) for letter in word_list: if letter == "_": return "Incorrect" # Replace the "_" with the letters the user has guessed word_to_guess = "_" * len(word) # Create a list for the correct and incorrect letters # guessed_letter is the index of the letter in the word # ... (Code for the function to update the word) # ... (Original function) Proposed Function: python Run Copy Code def print_word(word): # Create a list of the letters in the word word_list = list(word) for i in range(len(word)): letter_guess = list("_" * len(word)) # ... (Rest of the code) # Function to update the word # Replace the "_ " with "#" for the guessed letters # If the word is not guessed correctly, the function replaces the "_" with the correct letter def guessing_word(guess): # Replace the guessed_letter with the correct letter # ... (Code for checking the guessed letter) Function to Play the Game: python Run Copy Code def display_word_guess(word_list): # ... (Rest of the function code) Logic: python Run Copy Code # Show the word with correctly guessed letters. This function is used to solve the problem of finding the correct letter. The function takes a list of letters and a string. It returns the index of the correct letter in the string. def get_letter_position(letters, word): for i in range(len(letters)): if letters[i] == word: return i return "Not Found" letter_index = word. The function returns the index of the first instance of the letter in the list Run Copy Code def find_letter_index(letters, alphabet): for i in range(len(word)): if word.find(letter): return i return "not found" The function checks if the given letter is in the list . "Not Found" means the letter is not in the string. If the letter is found, the function returns its index. The letter's index is the index of the letter in the list. Let's say the letter is "a" and the word is "banana". Copy Code Python letters = list(input_letter.lower()) letter_index = "Not Found" for i in range(len(word_list)): if( letter_index = "banana" word = "a" if "banana" in word_word_found = "a" "word" word_index = 0 for i in range(len(word)): if (i == "banana) letter_index: if word.find(word_list[0] Run Copy Code The code you provided has some errors. Let's break down how to fix it. ```python def get_letter_index(letters): return "banana".find(index) # The function will now return the correct # "The" python Run Copy Code def get_letter_and_letter_index Let's break down the corrected code, and explain how it works. python Run Copy Code def get_index_of_letter(letter, word_list): # Convert the word_list to a list of lower case letters # Iterate through the word_list for i in range(len(letter)): if letter.lower in word_list: print(f"banana"): return word.find(letter) # "banana" is the

GiteaMirror commented

2025-11-12 11:33:31 -06:00

@gamersover commented on GitHub (Jun 28, 2024):

I also tried gemma2:9B and gemma2-27B, and found that 27B is far worse than 9B.

@gamersover commented on GitHub (Jun 28, 2024): I also tried gemma2:9B and gemma2-27B, and found that 27B is far worse than 9B.

GiteaMirror commented

@Qualzz commented on GitHub (Jun 28, 2024):

Google said that the 27b was trained with bf16, as using f16 will
produce erratic outputs.

Le ven. 28 juin 2024 à 12:23, 陈华杰 @.***> a écrit :

I also tried gemma2:9B and gemma2-27B, and found that 27B is far worse
than 9B.

—
Reply to this email directly, view it on GitHub
https://github.com/ollama/ollama/issues/5341#issuecomment-2196584776,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AIMKMGG4V3SUU36G4DV2FNTZJU2QXAVCNFSM6AAAAABKASBZA2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOJWGU4DINZXGY
.
You are receiving this because you were mentioned.Message ID:
@.***>

@Qualzz commented on GitHub (Jun 28, 2024): Google said that the 27b was trained with bf16, as using f16 will produce erratic outputs. Le ven. 28 juin 2024 à 12:23, 陈华杰 ***@***.***> a écrit : > I also tried gemma2:9B and gemma2-27B, and found that 27B is far worse > than 9B. > > — > Reply to this email directly, view it on GitHub > <https://github.com/ollama/ollama/issues/5341#issuecomment-2196584776>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AIMKMGG4V3SUU36G4DV2FNTZJU2QXAVCNFSM6AAAAABKASBZA2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOJWGU4DINZXGY> > . > You are receiving this because you were mentioned.Message ID: > ***@***.***> >

GiteaMirror commented

2025-11-12 11:33:31 -06:00

@bfroemel commented on GitHub (Jun 28, 2024):

https://github.com/ggerganov/llama.cpp/pull/8156#issuecomment-2195875770

@bfroemel commented on GitHub (Jun 28, 2024): https://github.com/ggerganov/llama.cpp/pull/8156#issuecomment-2195875770

GiteaMirror commented

2025-11-12 11:33:31 -06:00

@rick-github commented on GitHub (Jun 28, 2024):

I have found that increasing the context count makes the prompt from the first post work.

Default context:

curl -s http://ollama:11434/api/chat -d '{"model":"gemma2","messages":[{"role":"user","content":'"$(jq -sR . msg)"'}],"format":"","options":{},"stream": false}'
{"model":"gemma2","created_at":"2024-06-28T13:41:30.264385112Z","message":{"role":"assistant","content":"Enron executives discussed various financial deals, internal restructuring, and public image\n\n\nThe following the \n\nThe agenda\n\nI am=\n\nWe are\n\n\n\nThe The \nPlease Note: Please\n\nThis is the\n\nSubject to the. We will be\n\nThe\n\nThe,\n\n|\n\n\n,\n\nThe\n\n\n**\n\nA\n\nThe\n## |\n\nTo\n\n\nThe\n\nThe,\n\nI hope\n\nTo and\n\nTo\n\n\nThe\n\n\nThe\n\nIt’s\n\n\nThe\n\nIn this week.\n\nThe\n\nThe\n\n\nThe\n\nThere is\n\nWe\n\nTo the\n\nThis will be\n\nTo\n\nthe\n\nThis is\n\nPlease\n\nto\n\n\n| The\n\n\n[\n\n\n\nThe\nThe \nThe\n\nThe\n\nThis week, and\n\nThe\n\n**\n\nand\n\nIt’s\n\n\nThis\n\nand\n\nThe\n\nThe\n\n\nThank you to\n\nThe.\n\n\nI's\n\n\n\n\nThis\nto\n\nWe will\n\n\nThe\n\nTo\n\nThe\n\nTo\n\nTo\n\nTo\n\n\nThe\n\n,\n\n\n\nThe\n\nThese are\n\nTo\n\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n*\n\nof the\n\nThe\n\nThe\n\nThere will be\n\n\nPlease\n\nThe\n\nto\n\nThe\n\nPlease\n\nand\n\n\nThe\n\nThis is\n\n\nThe\n\n.\n\nI\n\n\nPlease\n\nThe\n\nThe\n\nThe\n\n**I\n\n\nThe\n\nTo\n\nThe\n\nThis is a\n\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nIn\n\nThe\n\n\nWe can\n\nThe\n\n\n*\n\nThis will be\n\nThe\n\n\n,\n\nThe\n\nWe have\n\n\n\n\\\n\nThere will be\n\nThe\n\n\nThe\n \nThis\n\nThe\n\nPlease\n\n\n\n\n\nThe\n\nThe\n\nThe\n\nTo The\n\nand to the\n\nThe\n  \n\nThe\n\nThis. This\n\nto\n\n.\n\n\n\n\n\nThe\n\n\nThe\n\nIt is a\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe\n\nThe\n\nThe\n\n\n**\n\n**\n\nThe\n\n\nThe\n\nThe \nThe\n\nThe\n\nThis\n\nThe\n\n\nThis is\n\nThe\n\nThe\n\nThis will\n\nThe\n\n\nThis\n\n\nThe\n\nThis\n\nThe\n\nTo\n\nThe\n\n\nThis\n\nThe\n  \n\nThe\n\n\nPlease\n\nThe\n\nThe\n\nThe\n\n\nPlease\n\nTo\n\n*\n\n\n\nThe\n\n**\n\nThe\n,\n\nThe\n\nThe\n\nand\n\nThe\n\nThe\n\nThis week.\n\nThe\n\nThe\n 1st\n\nThe\n\nThe\n\nThe\n The\n\n\nThe\n\n\nThis is\n\nThe\n\nThese\n\nThis\n\nThe\n\n\n\nThis\n\nThe\n\nThis will be\n\nThe\n\nThe\n\nThe\n\n\nThe\n\n**Please\n\nThe\nA\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\nThe\nTo\n\nThe\n\nThis\n\n\n\nThe\n\nThe\n \n\nThe\n\nThis week\nto\n\n\nThe\n\nThe\n\nThe\n\n\nThe\n\nThis\n\nThese\n\n*\n\n\nThis\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe.\n\nThe\n\nThe\n\n**\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe\n\nThe\n\n\n\nThere is\n\n\nThis \n\nThe\n\nThe\n\nThe\nTo\n\n\nThis\n\nThe\n\nThis\n\nThis week\n\nThis\n\nThis\n\nThese\n\nThe\n  \n\n**\n\n\nPlease,\n\nThis\n\n\nThis\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\n\n\nThe\n\nThe\n##\n\nThe\n\nThe\n\n\nIn the\n\nThe\n\n\n. I \n\nThe\n\n\n\n\n\n\n\n\n\nA and\n\n\n\nThe\n\n The\n\n\n\nI hope that this is a\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\n \n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n \n"},"done_reason":"stop","done":true,"total_duration":16496306664,"load_duration":5568411,"prompt_eval_count":2029,"prompt_eval_duration":1037676000,"eval_count":731,"eval_duration":15313688000}

Increased context:

curl -s http://ollama:11434/api/chat -d '{"model":"gemma2","messages":[{"role":"user","content":'"$(jq -sR . msg)"'}],"format":"","options":{"num_ctx": 4096},"stream": false}'
{"model":"gemma2","created_at":"2024-06-28T13:42:03.685706116Z","message":{"role":"assistant","content":"The email thread covers various Enron business activities in 2000-2001, including financial deals, internal restructuring, reputation management, and the postponement of management conferences amidst challenges facing the company.  \n"},"done_reason":"stop","done":true,"total_duration":4270076853,"load_duration":2471060407,"prompt_eval_count":2034,"prompt_eval_duration":842801000,"eval_count":46,"eval_duration":898581000}

@rick-github commented on GitHub (Jun 28, 2024): I have found that increasing the context count makes the prompt from the first post work. Default context: ``` curl -s http://ollama:11434/api/chat -d '{"model":"gemma2","messages":[{"role":"user","content":'"$(jq -sR . msg)"'}],"format":"","options":{},"stream": false}' {"model":"gemma2","created_at":"2024-06-28T13:41:30.264385112Z","message":{"role":"assistant","content":"Enron executives discussed various financial deals, internal restructuring, and public image\n\n\nThe following the \n\nThe agenda\n\nI am=\n\nWe are\n\n\n\nThe The \nPlease Note: Please\n\nThis is the\n\nSubject to the. We will be\n\nThe\n\nThe,\n\n|\n\n\n,\n\nThe\n\n\n**\n\nA\n\nThe\n## |\n\nTo\n\n\nThe\n\nThe,\n\nI hope\n\nTo and\n\nTo\n\n\nThe\n\n\nThe\n\nIt’s\n\n\nThe\n\nIn this week.\n\nThe\n\nThe\n\n\nThe\n\nThere is\n\nWe\n\nTo the\n\nThis will be\n\nTo\n\nthe\n\nThis is\n\nPlease\n\nto\n\n\n| The\n\n\n[\n\n\n\nThe\nThe \nThe\n\nThe\n\nThis week, and\n\nThe\n\n**\n\nand\n\nIt’s\n\n\nThis\n\nand\n\nThe\n\nThe\n\n\nThank you to\n\nThe.\n\n\nI's\n\n\n\n\nThis\nto\n\nWe will\n\n\nThe\n\nTo\n\nThe\n\nTo\n\nTo\n\nTo\n\n\nThe\n\n,\n\n\n\nThe\n\nThese are\n\nTo\n\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n*\n\nof the\n\nThe\n\nThe\n\nThere will be\n\n\nPlease\n\nThe\n\nto\n\nThe\n\nPlease\n\nand\n\n\nThe\n\nThis is\n\n\nThe\n\n.\n\nI\n\n\nPlease\n\nThe\n\nThe\n\nThe\n\n**I\n\n\nThe\n\nTo\n\nThe\n\nThis is a\n\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nIn\n\nThe\n\n\nWe can\n\nThe\n\n\n*\n\nThis will be\n\nThe\n\n\n,\n\nThe\n\nWe have\n\n\n\n\\\n\nThere will be\n\nThe\n\n\nThe\n \nThis\n\nThe\n\nPlease\n\n\n\n\n\nThe\n\nThe\n\nThe\n\nTo The\n\nand to the\n\nThe\n \n\nThe\n\nThis. This\n\nto\n\n.\n\n\n\n\n\nThe\n\n\nThe\n\nIt is a\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe\n\nThe\n\nThe\n\n\n**\n\n**\n\nThe\n\n\nThe\n\nThe \nThe\n\nThe\n\nThis\n\nThe\n\n\nThis is\n\nThe\n\nThe\n\nThis will\n\nThe\n\n\nThis\n\n\nThe\n\nThis\n\nThe\n\nTo\n\nThe\n\n\nThis\n\nThe\n \n\nThe\n\n\nPlease\n\nThe\n\nThe\n\nThe\n\n\nPlease\n\nTo\n\n*\n\n\n\nThe\n\n**\n\nThe\n,\n\nThe\n\nThe\n\nand\n\nThe\n\nThe\n\nThis week.\n\nThe\n\nThe\n 1st\n\nThe\n\nThe\n\nThe\n The\n\n\nThe\n\n\nThis is\n\nThe\n\nThese\n\nThis\n\nThe\n\n\n\nThis\n\nThe\n\nThis will be\n\nThe\n\nThe\n\nThe\n\n\nThe\n\n**Please\n\nThe\nA\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\nThe\nTo\n\nThe\n\nThis\n\n\n\nThe\n\nThe\n \n\nThe\n\nThis week\nto\n\n\nThe\n\nThe\n\nThe\n\n\nThe\n\nThis\n\nThese\n\n*\n\n\nThis\n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe.\n\nThe\n\nThe\n\n**\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\nThe\n\nThe\n\n\n\nThere is\n\n\nThis \n\nThe\n\nThe\n\nThe\nTo\n\n\nThis\n\nThe\n\nThis\n\nThis week\n\nThis\n\nThis\n\nThese\n\nThe\n \n\n**\n\n\nPlease,\n\nThis\n\n\nThis\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\n\n\nThe\n\nThe\n##\n\nThe\n\nThe\n\n\nIn the\n\nThe\n\n\n. I \n\nThe\n\n\n\n\n\n\n\n\n\nA and\n\n\n\nThe\n\n The\n\n\n\nI hope that this is a\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThis\n\nThe\n\n \n\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n\nThe\n \n"},"done_reason":"stop","done":true,"total_duration":16496306664,"load_duration":5568411,"prompt_eval_count":2029,"prompt_eval_duration":1037676000,"eval_count":731,"eval_duration":15313688000} ``` Increased context: ``` curl -s http://ollama:11434/api/chat -d '{"model":"gemma2","messages":[{"role":"user","content":'"$(jq -sR . msg)"'}],"format":"","options":{"num_ctx": 4096},"stream": false}' {"model":"gemma2","created_at":"2024-06-28T13:42:03.685706116Z","message":{"role":"assistant","content":"The email thread covers various Enron business activities in 2000-2001, including financial deals, internal restructuring, reputation management, and the postponement of management conferences amidst challenges facing the company. \n"},"done_reason":"stop","done":true,"total_duration":4270076853,"load_duration":2471060407,"prompt_eval_count":2034,"prompt_eval_duration":842801000,"eval_count":46,"eval_duration":898581000} ```

GiteaMirror commented

@rick-github commented on GitHub (Jun 28, 2024):

Also, something I don't recall seeing previously, and don't know if this is relevant, but the value of n_ctx in the logs is 2x the value of num_ctx.

@rick-github commented on GitHub (Jun 28, 2024): Also, something I don't recall seeing previously, and don't know if this is relevant, but the value of `n_ctx` in the logs is 2x the value of `num_ctx`.

GiteaMirror commented

https://github.com/vllm-project/vllm/pull/5908#issuecomment-2197433268

@bfroemel commented on GitHub (Jun 28, 2024):

seems like logit soft-capping is required after all. Details, see Section 2: https://storage.googleapis.com/deepmind-media/gemma/gemma-2-report.pdf

@bfroemel commented on GitHub (Jun 28, 2024): https://github.com/vllm-project/vllm/pull/5908#issuecomment-2197433268 seems like logit soft-capping is required after all. Details, see Section 2: https://storage.googleapis.com/deepmind-media/gemma/gemma-2-report.pdf

GiteaMirror commented

@bfroemel commented on GitHub (Jun 29, 2024):

In case someone is impatient as well, cannot wait a few more hours before this is available without such hassles, and just has to try out gemma2 27b locally with Ollama and logit soft-capping from https://github.com/ggerganov/llama.cpp/pull/8197 :

Clone ollama
(Go to a commit where this should work, for example current master, commit 717f7229)
put the attached patch 08-gemma-sc2.diff.txt as 08-gemma-sc.diff into ./llm/patches/
Build ollama as usual

/edit: the following is not needed. Apparently, for gemma2 models that miss this info, the same defaults for soft-capping are assumed as used in 9b and 27b, so old quants should work

~~You also need to quantize the Gemma2 model again with llama.cpp and the logit soft-capping PR:~~

~~Get the bf16 model from here: https://huggingface.co/google/gemma-2-27b-it/tree/main~~
~~Clone llama.cpp, switch to branch add-gemma2-soft-capping (might be merged to master any minute)~~
~~Do a quant that works for you~~

A gguf gemma2 model with soft-capping can be easily identified when looking into its meta data, for example, a quick'n'dirty:

% head -n 200 model_fp16.gguf | strings | grep softcapping
gemma2.attn_logit_softcapping
gemma2.final_logit_softcapping

or, with llama.cpp tools:

% ./gguf-py/scripts/gguf-dump.py --no-tensors ./model_fp16.gguf | grep softcapping
INFO:gguf-dump:* Loading: ./model_fp16.gguf
     16: FLOAT32    |        1 | gemma2.attn_logit_softcapping = 50.0
     17: FLOAT32    |        1 | gemma2.final_logit_softcapping = 30.0

With logit soft-capping it seems to behave very similar to the model served on Google's AI Studio (https://aistudio.google.com)

/edit2: updated the patch to also include a89427908d

@bfroemel commented on GitHub (Jun 29, 2024): In case someone is impatient as well, cannot wait a few more hours before this is available without such hassles, and just has to try out gemma2 27b locally with Ollama and logit soft-capping from https://github.com/ggerganov/llama.cpp/pull/8197 : 1. Clone ollama 2. (Go to a commit where this should work, for example current master, commit 717f7229) 3. put the attached patch [08-gemma-sc2.diff.txt](https://github.com/user-attachments/files/16041917/08-gemma-sc2.diff.txt) as 08-gemma-sc.diff into ./llm/patches/ 4. Build ollama as usual **/edit: the following is not needed. Apparently, for gemma2 models that miss this info, the same defaults for soft-capping are assumed as used in 9b and 27b, so old quants should work** ~You also need to quantize the Gemma2 model again with llama.cpp and the logit soft-capping PR:~ 1. ~Get the bf16 model from here: https://huggingface.co/google/gemma-2-27b-it/tree/main~ 2. ~Clone llama.cpp, switch to branch add-gemma2-soft-capping (might be merged to master any minute)~ 3. ~Do a quant that works for you~ A gguf gemma2 model with soft-capping can be easily identified when looking into its meta data, for example, a quick'n'dirty: ``` % head -n 200 model_fp16.gguf | strings | grep softcapping gemma2.attn_logit_softcapping gemma2.final_logit_softcapping ``` or, with llama.cpp tools: ``` % ./gguf-py/scripts/gguf-dump.py --no-tensors ./model_fp16.gguf | grep softcapping INFO:gguf-dump:* Loading: ./model_fp16.gguf 16: FLOAT32 | 1 | gemma2.attn_logit_softcapping = 50.0 17: FLOAT32 | 1 | gemma2.final_logit_softcapping = 30.0 ``` With logit soft-capping it seems to behave very similar to the model served on Google's AI Studio (https://aistudio.google.com) **/edit2: updated the patch to also include https://github.com/ggerganov/llama.cpp/commit/a89427908d04fcf3b4e975724596efddce4db737**

GiteaMirror commented

@bfroemel commented on GitHub (Jun 30, 2024):

Finally, official fixes are in llama.cpp. For full Gemma2 support, only alternating local sliding window attention and global attention in every other layer is still missing with probably only minor impact for small context sizes (below 4k).

1c5eba6f8e

@bfroemel commented on GitHub (Jun 30, 2024): Finally, official fixes are in llama.cpp. For full Gemma2 support, only alternating local sliding window attention and global attention in every other layer is still missing with probably only minor impact for small context sizes (below 4k). https://github.com/ggerganov/llama.cpp/commit/1c5eba6f8e628fb0a98afb27d8aaeb3b0e136451

GiteaMirror commented

@bfroemel commented on GitHub (Jul 2, 2024):

@jmorganca Couple of more gemma2 fixes are available in llama.cpp now.

Good news: for me the locally run Gemma2 model behaves now very similar to the model run on Google's servers (aistudio). Bad news: models need to be reconverted. There has been a tokenizer conversion problem which is also fixed in llama.cpp, master.

@bfroemel commented on GitHub (Jul 2, 2024): @jmorganca Couple of more gemma2 fixes are available in llama.cpp now. Good news: for me the locally run Gemma2 model behaves now very similar to the model run on Google's servers (aistudio). Bad news: models need to be reconverted. There has been a tokenizer conversion problem which is also fixed in llama.cpp, master.

GiteaMirror commented

@jtoy commented on GitHub (Jul 4, 2024):

does this require a new ollama version or just reuploading the models?

@jtoy commented on GitHub (Jul 4, 2024): does this require a new ollama version or just reuploading the models?

GiteaMirror commented

@bfroemel commented on GitHub (Jul 4, 2024):

requires

new ollama version where either all gemma-2 runtime related fixes need to go in as a separate patch, or ollama makes use of a llama.cpp submodule of at least commit 49122a873f
reconverted models with llama.cpp, at least from commit 5fac350b9c or already fixed models from here: https://huggingface.co/bartowski/gemma-2-27b-it-GGUF and here: https://huggingface.co/bartowski/gemma-2-9b-it-GGUF

@bfroemel commented on GitHub (Jul 4, 2024): requires * new ollama version where either all gemma-2 runtime related fixes need to go in as a separate patch, or ollama makes use of a llama.cpp submodule of at least commit https://github.com/ggerganov/llama.cpp/commit/49122a873f54615626d1b49a2a39013ed4be98d5 * reconverted models with llama.cpp, at least from commit https://github.com/ggerganov/llama.cpp/commit/5fac350b9cc49d0446fc291b9c4ad53666c77591 or already fixed models from here: https://huggingface.co/bartowski/gemma-2-27b-it-GGUF and here: https://huggingface.co/bartowski/gemma-2-9b-it-GGUF

GiteaMirror commented

@grigio commented on GitHub (Jul 8, 2024):

I use GGUF Q4_k_m Gemma2:9b is fine but Gemma2:27B seems to write nonsense with longer queries

On Ollama 0.1.48

@grigio commented on GitHub (Jul 8, 2024): I use GGUF Q4_k_m Gemma2:9b is fine but Gemma2:27B seems to write nonsense with longer queries On Ollama 0.1.48

GiteaMirror commented

@Qualzz commented on GitHub (Jul 9, 2024):

I use GGUF Q4_k_m Gemma2:9b is fine but Gemma2:27B seems to write nonsense with longer queries

On Ollama 0.1.48

Wait a bit, a new version of ollama with gemma 2 fixes is on the way.

@Qualzz commented on GitHub (Jul 9, 2024): > I use GGUF Q4_k_m Gemma2:9b is fine but Gemma2:27B seems to write nonsense with longer queries > > On Ollama 0.1.48 Wait a bit, a new version of ollama with gemma 2 fixes is on the way.

GiteaMirror commented