[GH-ISSUE #7592] Wrong Prompt Token Report from Ignoring Image Token Count #4841

Closed
opened 2026-04-12 15:50:25 -05:00 by GiteaMirror · 1 comment

Originally created by @chigkim on GitHub (Nov 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7592

What is the issue?

If you run llama3.2-vision with /set verbose, it appears to count only the text tokens and ignore the tokens from the image. As a result, it reports an extremely slow prompt eval rate.
The API behaves the same way.

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.4.0

GiteaMirror added the bug label 2026-04-12 15:50:25 -05:00

@jessegross commented on GitHub (Nov 11, 2024):

llama3.2-vision treats the image as a single token, so the image is counted; it just has little effect on the total. Different models handle images differently, and they also differ significantly in how long image processing takes before the language model processing stage begins. We count the full elapsed time for accuracy, so TPS will vary.
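To make the effect concrete, here is a minimal sketch of how a prompt rate could be derived from the `prompt_eval_count` and `prompt_eval_duration` (nanoseconds) fields of an Ollama API response. It assumes the rate is simply count divided by duration. The helper name and all sample numbers are hypothetical, chosen only to illustrate how one extra token plus a long image-processing time drags the rate down.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def prompt_tokens_per_second(response: dict) -> float:
    """Return prompt tokens per second from an Ollama-style response dict.

    Assumes `prompt_eval_duration` is in nanoseconds and covers all prompt
    processing, including any image processing time.
    """
    count = response["prompt_eval_count"]
    duration_ns = response["prompt_eval_duration"]
    if duration_ns <= 0:
        return 0.0
    return count * NANOSECONDS_PER_SECOND / duration_ns


# Hypothetical values, not measurements from a real run.
text_only = {"prompt_eval_count": 20, "prompt_eval_duration": 200_000_000}
# Same text plus one image: one extra token, but far more elapsed time.
with_image = {"prompt_eval_count": 21, "prompt_eval_duration": 5_000_000_000}

print(f"text only:  {prompt_tokens_per_second(text_only):.1f} tokens/s")
print(f"with image: {prompt_tokens_per_second(with_image):.1f} tokens/s")
```

Under these made-up numbers the token count barely changes while the duration grows, so the reported rate falls sharply even though every token is counted.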


Reference: github-starred/ollama#4841