[GH-ISSUE #4163] llava broke in new version v0.1.33 #2587

Closed
opened 2026-04-12 12:55:55 -05:00 by GiteaMirror · 14 comments
Owner

Originally created by @VideoFX on GitHub (May 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4163

Originally assigned to: @jmorganca on GitHub.

What is the issue?

Ollama v0.1.33
Intel Core i9 14900K 64GB ram
Nvidia RTX 4070

llava only works for the first inference attempt. All attempts afterwards make up strange descriptions not related to the image, almost like it's looking at a different picture.

This also happens with llava:13b. It will work the first time after loading. After that, broken.

This also happens on other Windows machines with different Intel and Nvidia combinations.

I have updated Ollama, and redownloaded the llava models.

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.1.33

GiteaMirror added the bug label 2026-04-12 12:55:55 -05:00
Author
Owner

@jmorganca commented on GitHub (May 5, 2024):

Sorry you hit this issue. Looking into it now

Author
Owner

@jmorganca commented on GitHub (May 5, 2024):

Related issue in llama.cpp: https://github.com/ggerganov/llama.cpp/issues/7060

Author
Owner

@brucecai-2001 commented on GitHub (May 5, 2024):

The same issue on Ollama on Mac; llava 7b & 13b failed on the second attempt.

Author
Owner

@dlvoy commented on GitHub (May 5, 2024):

I noticed the same after updating to v0.1.33; reverting to v0.1.32 fixed the issue, so it has to be some kind of regression.

I have a simple Python script that runs the query "What is on the picture?" over a collection of photos taken with a phone. For each photo there is an API request to the generate endpoint:
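A minimal sketch of the kind of script described above, using Ollama's documented `/api/generate` endpoint (the folder name, model tag, and helper names here are illustrative, not from the original comment):

```python
import base64
import json
import os
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint

def build_payload(image_path, model="llava:34b", prompt="What is on the picture?"):
    """Build a /api/generate request body for one image.

    The `images` field takes base64-encoded image data; `stream: False`
    returns the whole response as a single JSON object.
    """
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {"model": model, "prompt": prompt, "images": [encoded], "stream": False}

def describe(image_path):
    """POST one image to the generate endpoint and return the model's text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(image_path)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running Ollama server):
#   for name in sorted(os.listdir("photos")):
#       print(name, "->", describe(os.path.join("photos", name)))
```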

From response comparisons:

  • the first response is accurate
  • the next few responses are along the lines of: "The image appears to be a distorted or abstract representation. It seems to feature multiple layers of overlapping shapes", "The image you've shared appears to be a collage of various pictures", "The image shows a collection of messages written on post-it notes"...
  • the rest of the responses are total hallucinations ("The image shows a comical representation of the "Göbekli Tepe,")

I do not know the inner workings of ollama, but it seems that previous images or context are somehow preserved for subsequent queries.

  • OS: Win10 22H2
  • GPU: 3090 Ti
  • Model: llava:34b (hash: 3d2d24f46674)
Author
Owner

@DuckyBlender commented on GitHub (May 5, 2024):

For me, the first response is empty 95% of the time; if you follow up with the question, it works. Running v0.1.33 and the moondream model.

Author
Owner

@jdavid82 commented on GitHub (May 6, 2024):

I guess Llava is now broken for everyone around the world who wants to try it as of yesterday. If there is documentation on how to install an older version, then please kindly point me to it.

Author
Owner

@dlvoy commented on GitHub (May 6, 2024):

> If there is documentation on how to install an older version then please kindly point me to it

For me downloading previous installer (https://github.com/ollama/ollama/releases/tag/v0.1.32) and running it installed previous version.

Author
Owner

@skye0402 commented on GitHub (May 6, 2024):

Same here. 0.1.32 worked, 0.1.33 doesn't. Using llava:13b-v1.6. Running on Nvidia T4 (16GB).

Author
Owner

@jmorganca commented on GitHub (May 6, 2024):

Hi there, this should be fixed by this patch https://github.com/ollama/ollama/pull/4164 for now, and we'll help hunt down why this broke more broadly in llama.cpp in the meantime. It will be fixed in the next release, 0.1.34, which should be out very soon.

Author
Owner

@TheMasterFX commented on GitHub (May 10, 2024):

Is this really fixed? I can't confirm.
I use ollama 0.1.34 and OpenWebUI v0.1.124.
Here is my conversation with two pictures:
![image](https://github.com/ollama/ollama/assets/12451336/f107f277-951b-4928-9bbb-632d031aa4a4)
![image](https://github.com/ollama/ollama/assets/12451336/f00aa492-8063-40b4-af32-7c3a254de878)
It seems like it mixed the context of the first image with the second image. Am I using it wrong? Or is this more an issue with OpenWebUI providing the context to LLava?

Author
Owner

@dlvoy commented on GitHub (May 10, 2024):

v0.1.34 fixed it - but I am using ollama directly via ollama-python lib

Author
Owner

@VideoFX commented on GitHub (May 10, 2024):

Yeah, fixed for me so far using various Python methods, including ollama-python. Works in Open WebUI as well, BUT it is also true that the context can confuse it, so my advice is to make a new chat when using Open WebUI, and beware of the growing chat context influencing the outputs.
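The fresh-chat advice above can be sketched with ollama-python like this (a sketch; assumes the ollama-python package is installed and a local server is running, and the helper names are illustrative):

```python
def fresh_messages(image_path, prompt="What is on the picture?"):
    """Build a single-turn message list.

    Creating a brand-new `messages` list per image means no accumulated
    chat context can leak from one picture into the next answer.
    """
    return [{"role": "user", "content": prompt, "images": [image_path]}]

def describe(image_path, model="llava"):
    # Assumes the ollama-python package is installed and a server is running;
    # ollama-python accepts image file paths in the `images` message field.
    import ollama
    reply = ollama.chat(model=model, messages=fresh_messages(image_path))
    return reply["message"]["content"]
```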

Author
Owner

@jdavid82 commented on GitHub (May 15, 2024):

That means it's not really fixed because I didn't have this problem in the previous version:

```python
import os
import ollama

folder_path = 'images'
filtered_folder = 'filtered'
rejected_folder = 'rejected'
image_files = []

# Collect the images to evaluate
for filename in os.listdir(folder_path):
    if filename.endswith(('.jpg', '.png', '.jpeg')):
        image_files.append(os.path.join(folder_path, filename))

image_files.sort()

for image_file in image_files:
    with open(image_file, 'rb') as file:
        img = file.read()
    score = 0
    totalChecks = 1
    minScore = 1
    print(f"Evaluating: {image_file}")
    # Ask the model up to totalChecks times; stop early once minScore is reached
    for _ in range(totalChecks):
        if score == 0:
            output = ollama.generate(model="llava", prompt="Are there any dogs in this picture? Answer with just yes or no.", images=[img])
            print(f"{output['response']}")
            if 'yes' in output['response'].lower():
                score += 1
                if score >= minScore:
                    break
    print(f"Final Score: {score}")
    # Sort the image into the filtered or rejected folder based on its score
    if score >= minScore:
        filtered_image_path = os.path.join(filtered_folder, os.path.basename(image_file))
        os.makedirs(filtered_folder, exist_ok=True)
        os.rename(image_file, filtered_image_path)
    else:
        rejected_image_path = os.path.join(rejected_folder, os.path.basename(image_file))
        os.makedirs(rejected_folder, exist_ok=True)
        os.rename(image_file, rejected_image_path)
```

This code was working in the previous version; now it only works for the first image, and after that it's no longer accurate.
The ollama API doesn't seem to have a method for clearing the context either.
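For what it's worth, per the Ollama REST API, `/api/generate` only carries context forward if you echo the `context` field from a previous response back into the next request; the script above never does, so the leakage described in this thread looks like server-side state rather than API-level context. A heavier-handed workaround (a sketch; the helper name is mine) is to also unload the model between requests with `keep_alive: 0`, which is relevant here since the bug appears after the first inference per model load:

```python
def one_shot_payload(model, prompt, image_b64):
    """Build a /api/generate request that carries nothing over.

    Omitting the `context` field starts each request fresh; per the Ollama
    REST API, `keep_alive: 0` additionally asks the server to unload the
    model right after responding, so the next image hits a freshly loaded
    model.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [image_b64],  # base64-encoded image data
        "stream": False,
        "keep_alive": 0,
        # deliberately no "context" key
    }
```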

Author
Owner

@amonpaike commented on GitHub (May 17, 2024):

@jmorganca
On Windows, native Ollama v0.1.38: only the first image is interpreted; subsequent ones are given a description of the first one.


Reference: github-starred/ollama#2587