[GH-ISSUE #7362] Llama3.2-vision image processing not implemented for /generate #30438

Closed
opened 2026-04-22 10:03:33 -05:00 by GiteaMirror · 7 comments

Originally created by @jessegross on GitHub (Oct 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7362

Originally assigned to: @pdevine on GitHub.

What is the issue?

Reported by @oderwat:
https://github.com/ollama/ollama/issues/6972#issuecomment-2437586368

OS

No response

GPU

No response

CPU

No response

Ollama version

0.4.0
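
For context, the missing piece is image handling on `/api/generate`. A minimal reproduction sketch, assuming the `llama3.2-vision` model tag and a placeholder for the image data (neither is taken from the linked report):

```
# Hypothetical sketch: send a base64-encoded image to /api/generate.
# On the affected builds, /api/chat processed images while /api/generate did not.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2-vision",
  "prompt": "What is in this image?",
  "images": ["<base64-encoded image data>"],
  "stream": false
}'
```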

GiteaMirror added the bug label 2026-04-22 10:03:33 -05:00

@chiehpower commented on GitHub (Oct 26, 2024):

Hi @jessegross

Could I confirm that currently only `/api/chat` is working properly, while `/api/generate` is not functioning as expected?

Thank you so much.

@pdevine commented on GitHub (Oct 27, 2024):

@chiehpower that's correct. You can try PR #7384 to test out the `/api/generate` handler.
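
For comparison, the already-working `/api/chat` path attaches images to individual messages. A minimal sketch under the same assumptions as above:

```
# Sketch of the equivalent /api/chat request, which already handled images.
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2-vision",
  "messages": [
    {
      "role": "user",
      "content": "What is in this image?",
      "images": ["<base64-encoded image data>"]
    }
  ],
  "stream": false
}'
```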

@chiehpower commented on GitHub (Oct 28, 2024):

Got it. Thank you so much!

@oderwat commented on GitHub (Oct 28, 2024):

This was merged into `main`, but when compiling `main` it does not work for me with 'llama-3.2-vision', with either chat or generate:

```
key clip.has_text_encoder not found in file
terminate called after throwing an instance of 'std::runtime_error'
  what():  Missing required key: clip.has_text_encoder
```

I believe this needs to be merged into v0.4.0-rc5 and released as a v0.4.0-rc6 to be usable. Or is the plan to rebase the v0.4.0 code and then create the new release candidate for v0.4.0?

Or did I understand something wrong?

@jessegross commented on GitHub (Oct 28, 2024):

You need to build according to the instructions here in order to support llama3.2-vision:
https://github.com/ollama/ollama/blob/main/docs/development.md#transition-to-go-runner

@oderwat commented on GitHub (Oct 28, 2024):

@jessegross thank you!

For reference: I am on WSL2 with an RTX 3090 Ti, and it builds (and uses CUDA) with this build script:

```
#!/bin/bash
# Raise the memlock limit for this shell; `ollama serve` below inherits it,
# so the model can be locked in memory.
echo "Setting limit higher for mlock to work"
limit=8413752832
sudo prlimit --memlock=$limit:$limit --pid $$

git fetch
git checkout main
# WSL2 exposes the CUDA driver libraries under /usr/lib/wsl/lib.
export LIBRARY_PATH=/usr/lib/wsl/lib
make -C llama -j 4
go build .
# run it for unlimited use in my private network
OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS="*" ./ollama serve
```

@oderwat commented on GitHub (Oct 28, 2024):

Using this I could update my [capollama](https://github.com/oderwat/capollama) to use the generate API 👍

Reference: github-starred/ollama#30438