[GH-ISSUE #15245] Degradation of ollama 0.17.4 => 0.19.0 -- unable to decode images with gemini-3-flash-preview:cloud #9751

Closed
opened 2026-04-12 22:38:45 -05:00 by GiteaMirror · 5 comments

Originally created by @tigran123 on GitHub (Apr 2, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15245

What is the issue?

Here is a sample image:

![Image](https://github.com/user-attachments/assets/90a42993-d0a1-4a28-9d7f-65b5e995e90e)

Ollama 0.17.4 handles this image correctly with the gemini-3-flash-preview:cloud model:

$ md5sum sample1.jpg 
617fd385e8732b6d35bb527eff6cefc5  sample1.jpg
$ ollama --version
ollama version is 0.17.4
$ ollama run gemini-3-flash-preview:cloud "Describe in detail what is depicted on this picture" `pwd`/sample1.jpg
Added image '/home/tigran/sample1.jpg'
Thinking...
**Describing the Scene**

I'm currently analyzing the request to describe a picture. The input image shows yellow fruit in a bowl, and I've started forming an initial overview, noting the 
presence of a pile of small, oval objects. The next step is to refine this and provide a more detailed description.


**Analyzing the Fruits**

I've completed my analysis of the fruit, identifying them as likely Barhi dates. I've noted the quantity, shape (oval with some elongation), and color variations, from 
pale lemon to deeper gold. I've also observed the smooth, glossy skin and any blemishes, reflecting their freshness, and included the presence of stems. Now, I'm 
analyzing the container.


**Detailing the Surroundings**

I've moved on to the bowl, noting its round shape and flared rim, centered in the frame. I can see that the contents nearly fill it. I'm viewing it from a high angle, 
almost from the top. The analysis now extends to include the background.

\n\n

My current focus is the background. I've noted a light tan/beige color with a subtle swirl or floral pattern. It appears to be a kitchen countertop or perhaps a 
textured tablecloth. It serves as a neutral backdrop that makes the yellow dates stand out.


...done thinking.

This high-angle, top-down photograph captures a white ceramic bowl filled with fresh yellow dates, likely of the Barhi variety.

**The Subject:**
The bowl is piled high with approximately 15 to 20 dates. These fruits are distinct from the more common brown, dried dates; they are fresh (in the "khalal" stage of 
ripeness). They are predominantly oval or egg-shaped, with a firm, smooth, and glossy texture. Their color is a vibrant, sunny yellow, ranging from a pale lemon hue to 
a deeper golden-yellow. Several dates show minor natural blemishes or small brown spots. Some thin, thread-like yellow fibers (part of the strands they grow on) are 
still attached to a few of the fruits, and small circular indentations are visible at the points where they were attached to the stem.

**The Container:**
The dates are held in a simple, clean, circular white bowl. The rim of the bowl is visible around the edges of the fruit, providing a neutral frame that makes the 
yellow color of the dates pop.

**The Background and Lighting:**
The bowl sits on a neutral-toned surface, possibly a kitchen countertop or a tablecloth, featuring a subtle, swirling tan and beige pattern. 

The lighting is bright and appears to come from the upper left, creating distinct white highlights on the glossy skin of the dates and casting soft, dark shadows toward 
the bottom right within the bowl and onto the surface beneath it. This lighting emphasizes the roundness and three-dimensional volume of the fruit. The overall 
composition is clean and focuses entirely on the freshness of the produce.

However, with version 0.19.0 we see a complete failure:

$ ollama --version
ollama version is 0.19.0
$ md5sum sample1.jpg 
617fd385e8732b6d35bb527eff6cefc5  sample1.jpg
$ ollama run gemini-3-flash-preview:cloud "Describe in detail what is depicted on this picture" `pwd`/sample1.jpg
Thinking...
**Defining the Goal**

I'm currently focused on defining the objective: I must generate a detailed description of an image located at a specific file path. I understand the user needs this 
for a local file. The immediate next step is to figure out my limitations regarding file access, given my AI nature.


...done thinking.

I do not have access to your computer's local file system, so I cannot see the image located at `/home/tigran/Pictures/sample1.jpg`.

To have me describe the image, please **upload it directly to this chat** using the "upload" or "attachment" icon (usually a plus sign or a paperclip) in your message 
box. Once you upload the file, I will be able to see it and provide a detailed description for you.

$ ollama run gemini-3-flash-preview:cloud "Describe in detail what is depicted on this picture" `pwd`/sample1.jpg
Thinking...
**Initiating Image Description**

I'm currently focused on the user's request. My initial analysis reveals they want a detailed description of an image located on their system. I've realized the crucial 
step is accessing this specific image. However, I've hit a snag: I lack the direct access needed to retrieve the file. This is my current priority to proceed.

Note that the `Added image...` line is missing from the ollama 0.19.0 output. How can this be fixed? Is there now a special command-line option that must be passed to ollama for it to add the image to the context?

OS

Ubuntu Linux 22.04.5 LTS

GPU

None

CPU

8-core Intel, 32GB RAM

Ollama version

0.19.0 and 0.17.4 compared

GiteaMirror added the bug label 2026-04-12 22:38:45 -05:00

@tigran123 commented on GitHub (Apr 2, 2026):

Of course, if I use the Ollama API from Open WebUI, it works, but that is not acceptable: I need the command-line version to work, because that is what the "vision" skill of my nanobot calls. Unfortunately, nanobot cannot use gemini-3-flash-preview:cloud as its main model because of various bugs in ollama that cause it to fail to execute any tools, so I am forced to use glm-5:cloud as the main model and teach it vision through an external ollama run... command.

![Image](https://github.com/user-attachments/assets/418dfc78-afde-4cd8-a50f-f7b4e7091d99)
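For scripted callers such as the skill described above, one possible alternative to `ollama run` is to call the HTTP API directly. The following is a minimal sketch, not a confirmed fix. It assumes a local server on the default `http://localhost:11434` and the `/api/generate` endpoint with its base64-encoded `images` field. The helper names `build_generate_payload` and `describe_via_api` are hypothetical. No JSON escaping is done, so the prompt must not contain double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON request body for /api/generate.
# Assumption: model name and prompt contain no characters that need
# JSON escaping (no double quotes, no backslashes, no control characters).
build_generate_payload() {
    model=$1
    prompt=$2
    image=$3
    # Some base64 implementations wrap their output; strip the newlines.
    encoded=$(base64 < "$image" | tr -d '\n')
    printf '{"model":"%s","prompt":"%s","images":["%s"],"stream":false}' \
        "$model" "$prompt" "$encoded"
}

# Hypothetical wrapper: send the request to a local Ollama server.
# Assumption: the server listens on the default address.
describe_via_api() {
    build_generate_payload "$1" "$2" "$3" |
        curl -s http://localhost:11434/api/generate \
            -H 'Content-Type: application/json' \
            --data-binary @-
}
```

A call would look like `describe_via_api gemini-3-flash-preview "Describe this picture" ./sample1.jpg`. The reply is a JSON object, so a caller still has to extract the text it needs.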

@rick-github commented on GitHub (Apr 2, 2026):

Starting at 0.18.0, the capabilities list for gemini-3-flash-preview:cloud no longer includes vision.

$ ollama -v
ollama version is 0.17.7
$ ollama show gemini-3-flash-preview:cloud | grep vision
    vision        
$ ollama -v
ollama version is 0.18.0
$ ollama show gemini-3-flash-preview:cloud | grep vision

Looking at the changelog, there are several cloud-related changes, so it's not clear which one is causing the issue; possibly #14608.
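The capability check above can be turned into a guard for scripts, so that a missing `vision` capability is reported instead of the image path being sent as plain text. This is a sketch under assumptions: `has_vision` and `describe_image` are hypothetical names, and the check relies on `ollama show` printing the word `vision` only when the model supports images, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin contains the
# standalone word "vision" (same test as the grep shown above).
has_vision() {
    grep -q -w 'vision'
}

# Hypothetical wrapper: only pass the image to the model when the
# capability listing reports vision support.
describe_image() {
    model=$1
    image=$2
    if ollama show "$model" | has_vision; then
        ollama run "$model" "Describe in detail what is depicted on this picture" "$image"
    else
        echo "model '$model' does not list the vision capability" >&2
        return 1
    fi
}
```

A caller can then treat a non-zero exit status from `describe_image` as "image not attached" rather than parsing the model's reply.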


@rick-github commented on GitHub (Apr 2, 2026):

Drop ":cloud" from the model name.

$ ollama -v
ollama version is 0.18.0
$ ollama show gemini-3-flash-preview | grep vision
    vision        
$ ollama run gemini-3-flash-preview "Describe in detail what is depicted on this picture" ./sample1.jpg
Added image './sample1.jpg'
Thinking...
**Analyzing the Visual Content**

I'm now zeroing in on the visual content. Initially, the goal is to create a detailed description, which means breaking down the image I'm analyzing. Currently, I see a white bowl filled with small...
...

...done thinking.

This image is a top-down, close-up photograph of a white bowl filled with fresh yellow dates, likely the Barhi variety.

...

The lighting appears to be coming from directly above, creating bright specular highlights on the top of each date and casting soft, dark shadows in the crevices between the fruits and at the very bottom 
of the bowl. The overall composition is tight and focused, emphasizing the texture and color of the fresh produce.
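For scripts that must work across versions, the workaround above can be sketched as a fallback that tries the given name first and then the name without the `:cloud` suffix. `pick_vision_model` is a hypothetical helper, and the sketch assumes `ollama show` can resolve both names on the machine where it runs.

```shell
#!/bin/sh
# Hypothetical helper: print the first of "<name>" and "<name>" without
# a trailing ":cloud" whose capability listing mentions vision.
pick_vision_model() {
    model=$1
    for candidate in "$model" "${model%:cloud}"; do
        if ollama show "$candidate" 2>/dev/null | grep -q -w 'vision'; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Neither name reports vision support.
    return 1
}
```

Usage could look like `model=$(pick_vision_model gemini-3-flash-preview:cloud) && ollama run "$model" "Describe in detail what is depicted on this picture" ./sample1.jpg`.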

@tigran123 commented on GitHub (Apr 2, 2026):

Perfect! Thank you, it works!

$ ollama run gemini-3-flash-preview "Describe in detail what is depicted on this picture" `pwd`/sample1.jpg
pulling manifest 
pulling 045bfd1a16af: 100% ▕████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  367 B                         
verifying sha256 digest 
writing manifest 
success 
Added image '/home/tigran/Pictures/sample1.jpg'
Thinking...
**Observing the Still Life**

I'm currently focused on dissecting the visual elements. The subject is clearly identifiable: a bowl of small, yellow, oval fruits. The container itself is a white, 
round bowl. The surface, which appears patterned, is tan or beige, which provides the background. I'm noting the interplay of shapes, colors, and textures.


**Analyzing the Light Source**

I've been thinking about the lighting, and I'm seeing bright, direct light from above and slightly off-center. This creates strong highlights on the fruit, with soft 
shadows underneath the bowl and the fruit. I am noting the visual impact of the light and shadow interplay. The fruits have thin, dry stems, looking like fresh yellow 
dates, and some have slight tone variations. It is a top-down close-up shot of the centered bowl.


...done thinking.

A top-down, close-up photograph depicts a white ceramic bowl filled with fresh, yellow dates (likely the Barhi variety).

The dates are oval-shaped and have a smooth, glossy skin that reflects the light, suggesting they might be slightly damp or naturally waxy. Their color is a vibrant, 
uniform yellow, though a few show tiny, natural brown specks. Several thin, light-colored, straw-like stems or fibers are still attached to some of the fruits, 
crisscrossing through the pile.

The fruits are piled high, filling the circular white bowl. The bowl itself is positioned in the center of the frame and rests on a light beige or tan surface with a 
subtle, swirling floral or marbled pattern. 

The lighting is bright and comes from above, creating prominent white highlights on the curved surfaces of the dates and casting soft shadows within the bowl and around 
its base on the patterned surface.

@tigran123 commented on GitHub (Apr 2, 2026):

We can close this now.


Reference: github-starred/ollama#9751