[GH-ISSUE #7300] Llama3.2-vision Run Error #66699

Closed
opened 2026-05-04 07:51:05 -05:00 by GiteaMirror · 21 comments

Originally created by @mruckman1 on GitHub (Oct 21, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7300

What is the issue?

  1. Updated Ollama this morning.
  2. Ran `ollama run x/llama3.2-vision` on a MacBook.
  3. Got the output below:

pulling manifest
pulling 652e85aa1e14... 100% ▕████████████████▏ 6.0 GB
pulling 622429e8d318... 100% ▕████████████████▏ 1.9 GB
pulling 962e0f69a367... 100% ▕████████████████▏ 163 B
pulling dc49c86b8ebb... 100% ▕████████████████▏ 30 B
pulling 6a50468ba2a8... 100% ▕████████████████▏ 498 B
verifying sha256 digest
writing manifest
success
> Error: llama runner process has terminated: error:Missing required key: clip.has_text_encoder

Expected: the model to download and run without error.

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.14

GiteaMirror added the bug label 2026-05-04 07:51:05 -05:00

@rick-github commented on GitHub (Oct 21, 2024):

Vision support was merged recently (https://github.com/ollama/ollama/pull/6963), 0.3.14 doesn't include it.
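A quick way to tell whether a given build predates vision support is to compare its version string against 0.4.0. A minimal sketch (the helper names are made up for illustration; the version strings, e.g. 0.4.0-rc8, come from this thread):

```python
# Minimal sketch: decide whether a reported Ollama version predates the
# llama3.2-vision support that landed in 0.4.0. Pre-release suffixes such
# as "-rc8" are stripped, so release candidates count as 0.4.0 here.
def version_tuple(v: str) -> tuple:
    return tuple(int(part) for part in v.split("-")[0].split("."))

def supports_llama32_vision(v: str) -> bool:
    return version_tuple(v) >= (0, 4, 0)

print(supports_llama32_vision("0.3.14"))     # → False
print(supports_llama32_vision("0.4.0-rc8"))  # → True
```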


@silasalves commented on GitHub (Oct 21, 2024):

What does "vision support" mean? Does it enabling "submitting multiple images for inference" or "video inference"? Or is it just the support for this particular model?

AFAIK, video or multiple images are still an open issue #3184


@rick-github commented on GitHub (Oct 21, 2024):

Vision support for llama3.2. llama3.2 doesn't do video, and doesn't work reliably with multiple images.


@pavan-otthi123 commented on GitHub (Oct 22, 2024):

Does this mean that llama3.2-vision can't be used in the current version of Ollama?

I'm also getting the same error when attempting to run the model


@rick-github commented on GitHub (Oct 22, 2024):

Version 0.4.0 (https://github.com/ollama/ollama/releases/tag/v0.4.0-rc3) will support llama3.2-vision.


@Animaxx commented on GitHub (Oct 22, 2024):

Thank you for the hard work. Could we also bring this change to the llama.cpp repo?
How can we convert the model from HF to GGUF with the llama vision structure?


@silasalves commented on GitHub (Oct 22, 2024):

@rick-github thanks for the clarification! Also, any plans for making it run on the GPU? Llama3.2 runs on my GPU (GTX1660Ti), but llama3.2-vision runs on CPU only.


@jessegross commented on GitHub (Oct 22, 2024):

> @rick-github thanks for the clarification! Also, any plans for making it run on the GPU? Llama3.2 runs on my GPU (GTX1660Ti), but llama3.2-vision runs on CPU only.

It can run on the GPU, but it needs more RAM than the text-only versions, so it has likely exceeded the limit of your GPU.


@rick-github commented on GitHub (Oct 22, 2024):

It should run on GPU if it fits:

$ ollama ps
NAME                            ID              SIZE    PROCESSOR       UNTIL   
x/llama3.2-vision:latest        25e973636a29    11 GB   100% GPU        Forever

If you can provide server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues), perhaps we can see why it's not working for you.


@silasalves commented on GitHub (Oct 22, 2024):

@jessegross Thanks for pointing that out. That sounds correct, my GPU is quite old and has only 4GB RAM.

@rick-github Thanks for the support, this is my server.log https://gist.github.com/silasalves/f2bdfc195618f19ecd557b945cab32b9

I think this is the important part?

time=2024-10-22T14:22:10.644-04:00 level=INFO source=llama-server.go:72 msg="system memory" total="31.9 GiB" free="13.6 GiB" free_swap="19.0 GiB"
time=2024-10-22T14:22:10.649-04:00 level=INFO source=memory.go:346 msg="offload to cuda" projector.weights="1.8 GiB" projector.graph="2.8 GiB" layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[4.1 GiB]" memory.gpu_overhead="0 B" memory.required.full="5.9 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B]" memory.weights.total="5.2 GiB" memory.weights.repeating="4.8 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB"
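The second log line is the key one: memory.required.full exceeds memory.available, so no layers are offloaded (layers.offload=0) and the model runs on CPU. A minimal sketch of reading those fields out of the log text (the regex and helper are illustrative; the field names are taken verbatim from the line above):

```python
import re

# Extract a size such as memory.required.full="5.9 GiB" (brackets optional,
# as in memory.available="[4.1 GiB]") from an ollama server log line.
def gib(field: str, text: str) -> float:
    pattern = re.escape(field) + r'="\[?([\d.]+) GiB\]?"'
    return float(re.search(pattern, text).group(1))

log = ('msg="offload to cuda" layers.model=41 layers.offload=0 '
       'memory.available="[4.1 GiB]" memory.required.full="5.9 GiB"')

available = gib("memory.available", log)
required = gib("memory.required.full", log)
if required > available:
    print(f"needs {required} GiB, GPU has {available} GiB: falling back to CPU")
```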

@rick-github commented on GitHub (Oct 22, 2024):

Yep, too big for your card.


@pdevine commented on GitHub (Oct 23, 2024):

@Animaxx unfortunately backporting it to work with llama.cpp would be tricky because the image preparsing step is written in Go, not C++.

I'm going to go ahead and close the issue since things are working as expected. You just need to use the pre-release to make it work.


@ludos1978 commented on GitHub (Oct 25, 2024):

I've read that Ollama 0.4 should support vision tasks, but I also understood that 0.3.14 should be able to load the x/llama-vision model. Is that correct?

If it is, I am getting the same error as mentioned above on a 90 GB M2 MacBook using 0.3.14:
Error: llama runner process has terminated: error:Missing required key: clip.has_text_encoder


@rick-github commented on GitHub (Oct 25, 2024):

0.3.14 cannot load x/llama3.2-vision.


@eulercat commented on GitHub (Oct 26, 2024):

@pdevine
Is it possible to use the REST API like this on the latest version?

curl -X POST http://127.0.0.1:11434/api/chat \
-H "Content-Type: application/json" \
-d '{ "model": "x/llama3.2-vision", 
 "message": [
     {"role": "user", 
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
     }
] }'

@pdevine commented on GitHub (Oct 28, 2024):

@eulercat we don't support pulling images with `image_url`. You'll have to base64-encode your image, so it looks like:

curl http://localhost:11434/api/chat -d '{
  "model": "x/llama3.2-vision",
  "messages": [
    {
      "role": "user",
      "content": "what is in this image?",
      "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS
4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrU
wKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
    }
  ]
}'

You can find out more information here: https://github.com/ollama/ollama/blob/main/docs/api.md#chat-request-with-images
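The same request body can be built programmatically. A minimal sketch in Python (the `model`, `messages`, and `images` field names follow the curl example above; the helper function name is made up):

```python
import base64
import json

# Build the /api/chat request body with the image base64-encoded into the
# "images" list, as in the curl example above (image_url is not supported).
def build_chat_payload(image_bytes: bytes,
                       prompt: str = "what is in this image?") -> str:
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "model": "x/llama3.2-vision",
        "messages": [
            {"role": "user", "content": prompt, "images": [encoded]},
        ],
    })

# POST the result to http://localhost:11434/api/chat, e.g. with
# urllib.request from the standard library.
```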


@pdevine commented on GitHub (Oct 28, 2024):

@ludos1978 you'll need 0.4.0 for it to work. Unfortunately we're still working through some issues w/ the release candidates.


@rick-github commented on GitHub (Oct 28, 2024):

If the image is large, it will exceed the maximum argument length of the shell.

(echo '{
         "model":"x/llama3.2-vision",
         "messages":[
           { "role":"user",
             "content":"describe this image",
             "images":["' ;
               curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
             ]
           }
         ],
         "stream":false
       }') | curl -s localhost:11434/api/chat -d @- | jq
{
  "model": "x/llama3.2-vision",
  "created_at": "2024-10-28T23:14:35.376161501Z",
  "message": {
    "role": "assistant",
    "content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The boardwalk is made of light-colored wood and features a simple design, with no visible railings or obstacles to obstruct the view.\n\nAs the boardwalk stretches out into the distance, it disappears from sight, inviting the viewer to imagine where it might lead. The surrounding grass is tall and green, swaying gently in the breeze, while trees dot the horizon, adding depth and texture to the landscape.\n\nAbove, a brilliant blue sky with white clouds provides a stunning backdrop, casting dappled shadows across the boardwalk and creating a sense of warmth and tranquility. Overall, the image exudes a sense of calmness and serenity, inviting the viewer to step into its peaceful world."
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 3744887728,
  "load_duration": 34980268,
  "prompt_eval_count": 13,
  "prompt_eval_duration": 45000000,
  "eval_count": 164,
  "eval_duration": 3302000000
}

@jhowilbur commented on GitHub (Nov 2, 2024):

> @Animaxx unfortunately backporting it to work with llama.cpp would be tricky because the image preparsing step is written in golang, and not c++.
>
> I'm going to go ahead and close the issue since things are working as expected. You just need to use the pre-release to make it work.

But with some effort, I believe it will be possible to use a Golang binding to the C++ code; they did it with whisper.cpp:
https://github.com/ggerganov/whisper.cpp/tree/master/bindings/go

To our surprise, it's calling the same libraries as those used in llama.cpp; the core that does the tensor computations is the GGML library, written in C++.


@delenius commented on GitHub (Nov 5, 2024):

I am getting the same error on an M3 MacBook with 64 GB, with Ollama 0.4.0-rc8.


@rick-github commented on GitHub (Nov 5, 2024):

Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will help in debugging.

$ curl localhost:11434/api/version
{"version":"0.4.0-rc8"}
$ (echo '{
         "model":"x/llama3.2-vision",
         "messages":[
           { "role":"user",
             "content":"describe this image",
             "images":["' ;
               curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
             ]
           }
         ],
         "stream":false
       }') | curl -s localhost:11434/api/chat -d @- | jq
{
  "model": "x/llama3.2-vision",
  "created_at": "2024-11-05T16:15:16.856668179Z",
  "message": {
    "role": "assistant",
    "content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The purpose of the image is to showcase the beauty of nature and the tranquility that can be found in such settings.\n\n* A wooden boardwalk:\n\t+ Winding its way through a grassy field\n\t+ Made of light-colored wood planks\n\t+ Surrounded by tall blades of grass on either side\n* Tall grass:\n\t+ Swaying gently in the breeze\n\t+ Varying shades of green, from light to dark\n\t+ Creating a sense of depth and texture in the image\n* Trees in the background:\n\t+ Scattered throughout the field\n\t+ Providing shade and shelter for wildlife\n\t+ Adding to the overall sense of serenity and calmness\n\nThe image effectively captures the beauty and tranquility of nature, inviting the viewer to step into the peaceful atmosphere. The use of natural colors and textures adds to the sense of realism, making the scene feel more immersive and engaging."
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 79628322199,
  "load_duration": 70623694007,
  "prompt_eval_count": 14,
  "prompt_eval_duration": 2349000000,
  "eval_count": 212,
  "eval_duration": 6235000000
}

Reference: github-starred/ollama#66699