[GH-ISSUE #9728] Gemma3 wrong context length #52871

Closed
opened 2026-04-29 01:14:35 -05:00 by GiteaMirror · 3 comments

Originally created by @wizardbc on GitHub (Mar 13, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9728

What is the issue?

The `gemma3.context_length` reported for the Gemma3 4b, 12b, and 27b models is `8192`, but it should be `131072`.

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-29 01:14:35 -05:00

@nistvan86 commented on GitHub (Mar 13, 2025):

I think this is the default amount Ollama will allocate, and you can increase it. But a context this long requires a lot of additional VRAM; I think I read somewhere that 128k needs 20+ GB of VRAM.

Also, it seems Ollama is not very efficient at handling KV caches just yet and needs optimizations ([from Reddit](https://www.reddit.com/r/ollama/comments/1j9kv29/comment/mhfe6ok/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button)).
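One way to soften the VRAM cost of a long context, assuming a recent Ollama build with flash-attention support, is to quantize the KV cache via the server's environment variables; a sketch of launching the server that way:

```shell
# Sketch, assuming an Ollama build that supports flash attention:
# an 8-bit (q8_0) KV cache uses roughly half the memory of the f16 default.
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve
```

The exact savings depend on the model and context length, so treat the "roughly half" figure as an estimate, not a guarantee.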


@rick-github commented on GitHub (Mar 13, 2025):

#9702


@pdevine commented on GitHub (Mar 13, 2025):

See my other comment in #9702... hopefully it clarifies that yes, you can set the context to 128k right now.
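A concrete way to do that, assuming the standard Modelfile conventions, is to derive a model with a larger `num_ctx`:

```shell
# Modelfile sketch: build a 128k-context variant of Gemma3
# (the 4b and 12b tags work the same way; 128k needs substantial VRAM)
FROM gemma3:27b
PARAMETER num_ctx 131072
```

Save this as `Modelfile` and build it with `ollama create gemma3-128k -f Modelfile`; alternatively, `/set parameter num_ctx 131072` inside an `ollama run` session raises the window for that session only.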

Reference: github-starred/ollama#52871