[GH-ISSUE #10235] Granite 3.2 Vision and Chat History #32477

Closed
opened 2026-04-22 13:46:51 -05:00 by GiteaMirror · 11 comments

Originally created by @lemassykoi on GitHub (Apr 11, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10235

What is the issue?

When sending a picture to `granite3.2-vision:2b-fp16`, with and without tools set, if `messages` contains more than the user query, the request fails (after roughly the same generation time as when it succeeds).

The problem is not present with `mistral-small:24b-3.1-instruct-2503-q4_K_M` (with history, and with or without tools).

These are the only two models I found on Ollama.com that support both Vision AND Tools.

Relevant log output

FAILED 1: (with tools)

import base64
import json

import requests

url = "http://localhost:11434/api/chat"
model = "granite3.2-vision:2b-fp16"

tools = [
    {
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {
                        'type': 'string',
                        'description': 'The name of the city',
                    },
                },
                'required': ['city'],
            },
        },
    },
]
user_query = 'Please describe the attached picture'
image_path = "PXL_20250331_141822874.jpg"
with open(image_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

messages = [
    {
        "role": "user",
        "content": "Hi, my name is Bob. How are you?",
    },
    {
        "role": "assistant",
        "content": "Hello Bob! I'm fine, thanks for asking. How can I help you?",
    },
    {
        "role": "user",
        "content": user_query,
        "images": [encoded_string],
    }
]

data = {
    "model": model,
    "messages": messages,
    "stream": False,
    "keep_alive": "1m",
    "tools": tools,
    "options": {'temperature': 0.0, 'seed': 1234567890}
}
headers = {"Content-Type": "application/json"}
payload = json.dumps(data).encode("utf-8")
response = requests.post(url, headers=headers, data=payload, stream=False)

if response.status_code == 200:
    print(response.json())
    print(response.json()['message']['content'])

Answer is `unanswerable`

FAILED 2: (without tools)

user_query = 'Please describe the attached picture'
image_path = "PXL_20250331_141822874.jpg"
with open(image_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

messages = [
    {
        "role": "user",
        "content": "Hi, my name is Bob. How are you?",
    },
    {
        "role": "assistant",
        "content": "Hello Bob! I'm fine, thanks for asking. How can I help you?",
    },
    {
        "role": "user",
        "content": user_query,
        "images": [encoded_string],
    }
]

data = {
    "model": model,
    "messages": messages,
    "stream": False,
    "keep_alive": "1m",
    "options": {'temperature': 0.0, 'seed': 1234567890}
}
headers = {"Content-Type": "application/json"}
payload = json.dumps(data).encode("utf-8")
response = requests.post(url, headers=headers, data=payload, stream=False)

if response.status_code == 200:
    print(response.json())
    print(response.json()['message']['content'])

Answer is `I'm sorry, but I cannot provide a description of an image that has not been uploaded or provided to me. Please upload the image so that I can assist you further.`

SUCCESS: with tools and without history

tools = [
    {
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {
                        'type': 'string',
                        'description': 'The name of the city',
                    },
                },
                'required': ['city'],
            },
        },
    },
]
user_query = 'Please describe the attached picture'
image_path = "PXL_20250331_141822874.jpg"
with open(image_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

messages = [
    {
        "role": "user",
        "content": user_query,
        "images": [encoded_string],
    }
]

data = {
    "model": model,
    "messages": messages,
    "stream": False,
    "keep_alive": "1m",
    "tools": tools,
    "options": {'temperature': 0.0, 'seed': 1234567890}
}
headers = {"Content-Type": "application/json"}
payload = json.dumps(data).encode("utf-8")
response = requests.post(url, headers=headers, data=payload, stream=False)

if response.status_code == 200:
    print(response.json())
    print(response.json()['message']['content'])

Answer is an accurate description

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.6.5

GiteaMirror added the bug label 2026-04-22 13:46:51 -05:00

@rick-github commented on GitHub (Apr 11, 2025):

I tried your example 1 (with a different image) and it returned the expected result:

--- 10235-orig.py       2025-04-11 20:44:17.050679911 +0200
+++ 10235.py    2025-04-11 20:45:09.103211218 +0200
@@ -1,3 +1,9 @@
+#!/usr/bin/env python3
+
+import requests
+import base64
+import json
+
 tools = [
     {
         'type': 'function',
@@ -18,7 +24,9 @@
     },
 ]
 user_query = 'Please describe the attached picture'
-image_path = "PXL_20250331_141822874.jpg"
+image_path = "puppy.jpg"
+url = "http://localhost:11434/api/chat"
+model = "granite3.2-vision:2b-fp16"
 with open(image_path, "rb") as image_file:
     encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
$ ./10235.py
{'model': 'granite3.2-vision:2b-fp16', 'created_at': '2025-04-11T18:45:36.892533476Z', 'message': {'role': 'assistant', 'content': '\nThe image shows a small white dog sitting on a stone surface. The dog is wearing a red collar with a gold-colored bell attached to it. The dog appears to be looking off to the side, possibly at something or someone out of frame. The background is blurred and indistinct, but it seems to be an outdoor setting with some greenery visible.'}, 'done_reason': 'stop', 'done': True, 'total_duration': 3915885599, 'load_duration': 1938526324, 'prompt_eval_count': 2343, 'prompt_eval_duration': 857899419, 'eval_count': 81, 'eval_duration': 1111227401}

The image shows a small white dog sitting on a stone surface. The dog is wearing a red collar with a
gold-colored bell attached to it. The dog appears to be looking off to the side, possibly at something or
someone out of frame. The background is blurred and indistinct, but it seems to be an outdoor setting
with some greenery visible.

It's pretty easy to add tools to llama3.2-vision if you want another vision model with tool support: https://github.com/ollama/ollama/issues/7611#issuecomment-2585009100


@lemassykoi commented on GitHub (Apr 11, 2025):

Then the problem is with my picture?


@rick-github commented on GitHub (Apr 11, 2025):

Maybe? Is the picture from a phone? They can be quite large; perhaps the image is too large to be processed, or the processing in Ollama is scaling it down to a size where it's unrecognizable, or the number of tokens the image uses is exceeding a buffer, or leaving no space for output tokens, etc.


@lemassykoi commented on GitHub (Apr 11, 2025):

Yes, the picture is from a phone: 4032x2268 pixels.

The base64-encoded string, dumped to a txt file, is 4.8 MB.
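For what it's worth, the 4.8 MB dump is consistent with base64's fixed 4/3 expansion over the raw file bytes. A quick sketch (the ~3.6 MB raw JPEG size is an assumption back-derived from the reported numbers):

```python
import base64

# Base64 encodes each 3-byte group as 4 ASCII characters,
# so the encoded length is 4 * ceil(n / 3).
def b64_len(n_bytes: int) -> int:
    return 4 * ((n_bytes + 2) // 3)

assert len(base64.b64encode(b"abc")) == b64_len(3)  # 3 bytes -> 4 chars

raw_jpeg_size = 3_600_000  # hypothetical ~3.6 MB phone JPEG
print(b64_len(raw_jpeg_size))  # 4800000, i.e. ~4.8 MB of base64 text
```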

I tried with a smaller picture: 84x75 pixels (the Cogito logo), 4.4 KB.

Output:

{'model': 'granite3.2-vision:2b-fp16', 'created_at': '2025-04-11T19:10:17.629121353Z', 'message': {'role': 'assistant', 'content': "\nI apologize but I am unable to view or analyze images directly; however, if you could provide a detailed description of what's depicted in the image, I would be more than happy to assist with any questions related to it!"}, 'done_reason': 'stop', 'done': True, 'total_duration': 13279194626, 'load_duration': 1735265383, 'prompt_eval_count': 2344, 'prompt_eval_duration': 9482975184, 'eval_count': 51, 'eval_duration': 2053146361}

I apologize but I am unable to view or analyze images directly; however, if you could provide a detailed description of what's depicted in the image, I would be more than happy to assist with any questions related to it!

13.28 seconds
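The duration fields in the response above (`total_duration`, `eval_duration`, ...) are reported in nanoseconds, which is where the 13.28 s figure comes from:

```python
# Ollama duration fields are in nanoseconds; divide by 1e9 for seconds.
total_duration_ns = 13_279_194_626  # from the response above

print(f"{total_duration_ns / 1e9:.2f} s")  # 13.28 s
```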

Here is the base64 string:

iVBORw0KGgoAAAANSUhEUgAAAFQAAABLCAIAAABhiTfYAAAAA3NCSVQICAjb4U/gAAAAGXRFWHRTb2Z0d2FyZQBnbm9tZS1zY3JlZW5zaG907wO/PgAAEMpJREFUeJztW2lwXMdx7p6Z996+3cVicYMLLIiTIMADAG9SEg9RomSJIiXHR2TZcsqusuNybKecyo+knJQrlUo5lfyxy47jklN2pLJ80DpsibookhJFEhIPgCBBECSxALEkjsXiWACL3X1v3szkByiJpHgsFstSqsjvL4CZ/l7P9HR/3UClFNypIJ+2AZ8m7pK/U3GX/J2Ku+TvVNwlf6fiLvk7FXc0efZpG3AVpARbKJurwTF5rMfpvugsr2KPrjY8Lrwd23365JUCi6vRKTk6pS6NOt2XnHMDYiQmAYAgdF/kLg13rDNux9afGnmlYDIh+yMyNOSEo3JwXAyNi6nE5frabeDaes3jwr0n7F0Hk5UlZHmVlnUbPgXyE3HVFuLtId437MSTKmEpLuAjVUFnsKZe37nOKCugCUsOjsu2Hv6L1xP/+nROnjfLEeq2k5cSHKG4gEhMHjlrHznHQ0MCETSGBMHnRo1hdFIiosGgtow9tcXVUM4QAQDcBmmuZl1hPjAmn9ub+tZ2t0azaRveJiXH4jA+LUen5MCYOHvJOXtJDI0LjWGelxT6sCSP+NzIHegKi76Ik+8l9eV0S5O+qk4jeFVsm4irf/t9/NyAk+shX91qblqmMZq14Jdlz6ds1TMoeodF/4gYHBdD43IyIV0aLiwhzTVGZTENFFBEON3vHO/hvcNCZ/hgi7GmnjUGmcd1nVPt9+JDK4xzA85UQr5+zKoroxVFWfN+lj3/67eT756yE5ayuHIb2FSlNdewRWXM50a3gfGkeqvNOtBpj09LSvCeJfqjq/USP735S5ay1fd/OTUwKgnB9Yu1v/+cJ1uuz6bnJ+LqjeOWzRUigALbUUqpfC8p8hEA2Ndhv3Q4NTolDQ2XV2lPbnLVLGCYBg+Xjt96xPNPz01LqQ6fsfe0swdbjKzwzxp5pWB/h21ztbZea1yodfXzwXHZ0ee8f5aX+KnG4GJU6AxW1Go71hrLqzQyF/MXlbGmKu1EL1cK/ufNZHkBbazIguX0hz/84fxXAYDppHrmzYRS+IWN5v1N+tp6vaGCVpXQ2IwMR+XkjAQAnWFZIXXpwCh6TaRpfwBKQWfQHnK4AEfASEw1BGmOOd+XL2ue7+jjo1NyYTFZVskAQGNQXcqkQu5Ys7/gc6NGsfWM3R7CXLcVKCCrF+nrG/R8760/AQI0BNnyKvZ+NweA7kt8bwf7/L3E0OZ1/LNDPmWr1m5ucbWhUXd/eB8Tlnr5cPLCiGwIsjNhHiykf/u4JxwVb5+wT/Y5py44J0LOL99MNFdrW5v15ZWa20BK4UZs8rykuZp19HEhgCC+fjTVGKQtNVo6UeNGyA75/hERHhG5btyy/HISzgX8qTV1+AxvqtK+8Rn3Pz83HZ2Sg+NyVZ22qk4bn5btvc6JEL8YFecGnPaQk5+DTZWspVYrL6RFucRrXksKEdY36Ae7nNP9TmMF7RsWz7+brCimhb7MD38W7rwj1KEu/n63vbXZWN+gA4BUsLfdev6dZKCAfmWrWVVKR6fU6X6nZgGtCVBENA2sLqVr6/Vllay6lBbl0qSlOvp4a7dz6gIPDYtoTAKC14VXpjSGhojqg26ua1hZQrsvOZMzatUifU6x80pkwfNTCXXygqMz3LbSBQBKqfaQs+tgilJ8fL2xOMgIwpp6/ZUPUhejImWD+8MKjRIIFtHyQrpuMUzOyJEpeewcP3KWHzxtHzkLuR5Smkdbati6ej1QQOBD5794ODU6KZuqtYkZdaDTriimn93gyszyLHj+/JDz4iGrqVrb2qwzioPj8ld7EuGofGSV64kNLkoAAEwd93bYKQ4ra9k1URoRGAWPC4tzSXO1tm2l3lKt2Q70DjmRmDoR4ruPWe0hBxEK
cqhLh0UB9vpxy+fGNYu0zrC4NCoXBWhhbiaHf76eVwpaz3AAWL+YGQynE/Ll1lTPkFzfoH/lftdHF9c0sKmKtYf4ZAIW5N9sQY1iYwVrrGAzlruth7eFeH9EDI6Ln76SyDGTyyu1FbVaiZ+cuuBsXKpvWa7vabNebrW+4Sf5OXPmP1/ysbg83MUrikh1KVMAb7bZB07Z9WXs6ftdGvv4LlKE+nJ2qIv3DPLF5Wkl5x4D71uib2jQIhOyLyJ7Bp1zA86xHvtQl60xtB21p93+5mfMkZhoD/G9J+jj640rd0wH8yX/bqcdT6nNFVpJHmk9Y794OJWXQz53r6vEf5UfCIHyQuI24HgP375mDleUEgwU0EABXb2IxeJqdEp29DqvHEnZDnSF+Y//rFbUsIEx+epRq7mG1QXmRmdeSVLSUn9+P5VjwtJK7dKY/PnuhJDwyGpXc41GPhGCS/PowmLWMygckcleOsNiP6kLsLglZ1IKAISEnkHnxcNWylaxuPz3PybmuvK8yB/v4bEZFSyi+V783z2JhKU2LdMfXnH9qiPXQxbkYzypTvXzDPZyhAoNOT/aFd99xC7wEZ8bXTruXGcU+sh0UgFANCb+a/dMis+hSM2cfNJWrd1cKWiq0vZ12GcuiuZq9qVNLv0GWptLg2AR0zXc3zFn8kJC6xn+s1cT7SG+qpZ9eYsZKKCmjn+50fzOY+77luimjgDQeoa/12mnX6Nnfuf7I6J32PF7iJRqX4ddVkC+ts3tv7HMhgg1C6jPjaf7OReQviBlcfX7A6l9HXbSVk9scD262phJqRcOSbcBiNBYwcoLyZKF7Df7k9NJtfuIVVnC6gJprZ6h54WEzn4nOqnKC8mugymdwdNb3WUFt9gyWERyTDI+JXuHnDQ3Go/LH+2Kv3AoJaT6+jbzyU2u/BxiOzCdVD43IQQRIddDHl5p/PivfQRxIq5icZnm4hl6fiIuu8IOd1RX2NEYPLbWWFJBb1lj5Jikooj0DsPxHqe+/BZbOxI6L/Dn30mFhkRtgD212fVRGWNxlbJVnhev3DEccaRUi4OsIZguqQzJ94+IswPOrIkbG/QHWgw9jeqSIKyq0/aftHuHnZuf/BRXe9rs145aE3G5eZn2F/eYsxnuLLgD3IFC38dPiiPghcO2rmFDkHnNdF/7TMgrBcfO8dn3pjbAntpi+j3pXp/FQY1SjEyIoXFxIykyOil/sz/5wVluMPzSFnPLMi3HfdX6tqOUggLfx0pvX8S5EBF+L7lnyRx6G5mQn0mp907bSkGeF//hC545FZUFObiwiIxMytCQ80nySqm+iPj57uT5Qcdnku/tdDdfr2K3uAQEnxsBFAAqBXtP8HhSLqvSi3xz0HYzCXj7T1rTSZXrwa8/5C6YYzmNCNtWGElLDU8ocXVgsrhq7eb/8cJMOCqaq7UffS2npfb6WkXSRkLQ7bp86qOTMhwVlOKTG+dW3s3Z81zAW+02AOS4yEhMHu/hRT5S6CNuA9MUVZqqNUS4EHEmEzL/w6cxOinfbLNeO2rpDLavce1YZ+S6b7jc5IykCDoDRJQKTl90+oad+nIanKOkP/djr5SUAAADY+IPB1JuF3hN4jOx2E8qimlVMV1YwnI9eBOBociHZQWkf0SOT6t8LyiA3iHxuwOpth7b7yVPbzVX12num2rT49OKUpgtllO26g47FofP3WvOlcqcyWsMv7fT/YNn49xRtWU0kVL9UREWgGEAAIJACRb7SUOQNVawhiAtySPXdKAYxVV1+itHrIlpqUrJmbD4ySuJ4QkRLKQ/eNJb7L+1MBOJCY0CIwgAIzHx3mmrNI9Wl865k5NJwKsrY09vdf1mf8ri8I9f9HpNvBARvUOib0RciorppExa6uBp++12CxBy3VhVyqpLaVUpLc2jHhe6DQwWUSFUWw+PxNRz+2YYwQdbjL960O1Jow2vFIxNS0NDjQEAnA6LpAWfv1f3zr2RkQl5BNi41Lg4KvedsJ95
I/Ht7Z6GD1ML7kB0UkRiMjIhIzEZnZLRmLgYFR19XElwu7DYT0v8CAAEYf9JO8WtolyyY62xebmeDnMAkApicZnroTpD24HdR6wCH6krY3TuLbwMk5wcE3euc12MOsfOO7sOJr9yvzkroWsMZsvvWStnUmo6KeNJNTmjwlHRHxH9UXn0HJ+N81woQuDx9ca2FpeWtiGOUAlLlRWAS8eTfXxwXGxcalQWZ9K9zJA8IgTyyde2uX/w7My+Dru8kD608tpjRxByTMwxL5u1qk4TUikF3IF3Ou1fvZW0HSUl/OG9FCW4tclIkz8XYHPwmsgoPLs3aTCsLiXpZ3VXWZjB33yE2gXsb7abAPDqEavzwi0qaURgFDWGpoFTM1Io2LHW2LhUVwqeeSPxny/Gzw86dhrVuMUVIHhNHBgTwxOyyE/uacxwYmW+MtaGRn1gTLzUau06mCrNo0VpqKjTKbWvwy71k83LjcoS+sFZ/vYJ60Svc35g5oEWY+MyPZD/SR3oY8ykFAAaDA902lyoJQtZUW6GHfv59voIwkMrjfX1Wle/8+zeFBe3dt3BTnt0SjZVsUA+oQTWL9a+9Yj7qS2mVPDi4dRP/hR/56RtOzdcZzKhEFQsoXqHBSWwfXXmg1pZaFr4PeSz97gujsn3TlvFfvzy/Te7fwlL/u7dlNfEJQuZaSAAIEJRLtm+xringT3zRvL9bh4amnm/W/v2dk/u9cYQRiclAHT28RSHmlJWXpj5oEZ25puCRfSL97n8HvJWm32w0xY3UBOUgiPn+GRCVhaz5uqrvjtBKPDRv/us97s7PcEieuw8/87PJ187asVm5DWy1NC4kAomE0oo+OoDn+jpzQVZG+5aWat9/j6XxeVLrameweuf2hlLvXOSI8Laeua9XnddY7ClSf/+E57H1ro0hr9+O/mzVxNHz/Mrv+al0csabUUhySCruxJZI08IbFth7FjnCo/IXQdT04nr0D8TdvpHRL4X71uq32gdBKgook9tdn1vp2dFrXYi5PzitcR/v5YYGBMAICVEJiQAUIJbmw19frc2m2N9GoXH1hirF7HjPfy37ySv+anFoT3kTMTlwysNn/sW++oaLqtk393h/vpDJndgT7v1L8/H3z1l244ajysAyPNibYDd7FVIA1keRcv1kCc2uIYn5Jtt1sJi8kCLi314MAfGnNNhnufFx9alVXUjgtvAh1cayyq1Z/cmO/v5T/6cWFuvTSckIqyo1YKF836q5vn3n0RtgD2xwfS48LcHrLYQnw1XUkJXWFwaFRuXGrMae/oo8uFXHzA3LTV0Boe6bC7AbWB92S0G2NJB9sdPCcK6xdrYlOvZfck/vpcsyMGaBSxpywOddo5J1i++dTamFIxNq8ExMTAmBsbkSExMJdRkQgp5OY4EC0lLbRbmkG/L7K3O4LG1rqEJuafNeulw6puPeHoGxPlB574l+pUi7JWIp9SFiOgZdHqGRGhQTCWkIy/P7UoFiFDgoy01Wl2A1S6g1QtYTkbJ/DW4XYPHjMKXNrsGx0RrN19YYrWdt90GNlUxneFUQs0K75GY7B8RoSHnQkREYtIRQAkYGuoMPC40DQwWkupSVllCq0pJnjerM8cAcPsGjwFAKtUVFr94LTE0IR2h8rzk3kYNACKTKjIhIjGZshUA5JiY5yV+D+Z5SYGPlPhJaR4pzaMFucjmF8xvidtIHgC4o97ttH/9dnI6oQiCAlAKTB0X5JOyQhbIJ4F84veSHBNzTJxtvN4+Yz6J2ztvrzHctMwYn1IHu+yqUlq7gNYGWCCfUgKMISNACcwnP50nbq/n/5/jjv7Xsrvk71TcJX+n4i75OxV3NPn/A5O5o3COl/1iAAAAAElFTkSuQmCC

@pdevine commented on GitHub (Apr 11, 2025):

cc @gabe-l-hart


@gabe-l-hart commented on GitHub (Apr 11, 2025):

Thanks for raising this issue! We'll get into it and try to figure out what's going on.


@alex-jw-brooks commented on GitHub (Apr 17, 2025):

Hey @lemassykoi, thank you for raising this issue! I took a look; it does seem like it produces `unanswerable` more often than other implementations, e.g., transformers. I think there are a few things that may be contributing to the behavior you are seeing:

  • The LLM being quantized

  • The chat template: for granite vision, the Ollama template is slightly different from the transformers one, particularly in how tools are handled, because in transformers the JSON for the tools is pretty-printed, but it is not pretty-printed at the moment in Ollama

This may be why you're seeing pickier behavior, particularly when tools are enabled, because this model is quite sensitive to the chat format.
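The pretty-printing difference described above is just `json.dumps` with and without `indent`. A minimal illustration, using the tool spec from this issue:

```python
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "The name of the city"}},
            "required": ["city"],
        },
    },
}]

compact = json.dumps(tools)           # one line, roughly how the Ollama template renders tools
pretty = json.dumps(tools, indent=2)  # multi-line, as the transformers chat template does

print(compact.count("\n"), pretty.count("\n") > 0)  # 0 True
```

Both forms carry the same data; only the whitespace the model sees in its prompt differs.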

To clarify on this part

Maybe? Is a picture from a phone? They can be quite large, perhaps the image is too large to be processed, or the processing in ollama is scaling it down to a size where it's unrecognizable

This shouldn't be the issue in this case 🙂 Granite vision is architecturally very close to LLaVA-NeXT, so it uses AnyRes to handle different image sizes during preprocessing. Since the image is large, it will take a lot of tokens (for this size, 6-7k tokens, which is comparatively a lot since the model has a 16k context), but the image being too big to fit into the context for a single image shouldn't be the underlying cause of what you are seeing.

I tried each of your 3 examples, and was able to (sometimes) get a response out of the chat API for an image of the same size when I removed the seed and increased the temperature in all cases. Experimenting with sampling params, or potentially pre-formatting a raw request to send pretty-printed JSON for the tools, might help 🤞


@lemassykoi commented on GitHub (Apr 18, 2025):

This may be why you're seeing pickier behavior, particularly when there are tools enabled, because this model is pretty sensitive to the chat format.

Thanks for investigating.

My main problem here is not about the tools, but about the chat history. When sending a chat history (cases 1 and 2), I can't get consistent output.

Without chat history (case 3), no problem.


@lemassykoi commented on GitHub (Apr 18, 2025):

And a new problem with Granite 3.3: when using tools, thinking doesn't work:

https://github.com/ollama/ollama/issues/10338

After modifying the Modelfile to accept thinking with tools, it's the same problem as with granite3.2: if I pass chat history, thinking doesn't work.

Should the "control" message be sent before each user message? (Edit: not working either.)


@urlan commented on GitHub (Aug 30, 2025):

So... this issue has been open since April. Any change?


@lemassykoi commented on GitHub (Aug 30, 2025):

No. I switched to LM Studio.


Reference: github-starred/ollama#32477