[GH-ISSUE #9702] gemma3 model card context size does not match the description's #32096

Closed
opened 2026-04-22 13:01:56 -05:00 by GiteaMirror · 7 comments

Originally created by @MarkWard0110 on GitHub (Mar 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9702

Originally assigned to: @pdevine on GitHub.

What is the issue?

The gemma3 description (https://ollama.com/library/gemma3) says "feature a 128K context window", but the model cards define a smaller context:

1b https://ollama.com/library/gemma3:1b-it-fp16/blobs/95686f6f23df
gemma3.context_length 32768

4b https://ollama.com/library/gemma3:4b-it-fp16/blobs/8300f2d40f8b
gemma3.context_length 8192

12b https://ollama.com/library/gemma3:12b-it-fp16/blobs/6c4f660fdd8f
gemma3.context_length 8192

27b https://ollama.com/library/gemma3:27b-it-fp16/blobs/8bf5daddfa5b
gemma3.context_length 8192
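
For reference, the same metadata can be checked against a locally pulled tag (a minimal sketch; the exact output format varies by Ollama version):

```shell
# Pull a tag and inspect its model card; the reported context length
# should match the gemma3.context_length values listed above.
ollama pull gemma3:4b-it-fp16
ollama show gemma3:4b-it-fp16
```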

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-22 13:01:56 -05:00

@rick-github commented on GitHub (Mar 12, 2025):

The 1b model context window of 32768 is correct. The 4b, 12b, and 27b models use RoPE scaling (https://www.hopsworks.ai/dictionary/rope-scaling) to be able to handle extended context windows. However, it's not clear if ollama is actually doing this for gemma3 at the moment: the logs aren't showing RoPE values as they do for other models. Some needle-in-the-haystack tests would reveal whether that is the case.
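
A minimal sketch of such a probe, assuming a local Ollama server on its default port; the passphrase, filler text, and counts are all made up for illustration:

```shell
# Needle-in-the-haystack probe (sketch). Put a needle at the start of a
# long prompt; if the context is silently truncated to 8192 tokens, the
# oldest tokens (the needle) are dropped and the model can't answer.
NEEDLE="The secret passphrase is BLUE-HARBOR-42."
FILLER=$(yes "The sky is blue and the grass is green." | head -n 2000 | tr '\n' ' ')
cat > /tmp/needle.json <<EOF
{
  "model": "gemma3:4b",
  "prompt": "$NEEDLE $FILLER What is the secret passphrase?",
  "stream": false,
  "options": { "num_ctx": 32768 }
}
EOF
curl -s http://localhost:11434/api/generate -d @/tmp/needle.json
```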

@GhostGuy9 commented on GitHub (Mar 13, 2025):

Try updating ollama? I found logs stating you will need to update for gemma 3 to work.
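
For reference, the installed version can be checked as below; to my understanding, Gemma 3 support landed in the 0.6.0 release:

```shell
# Check the installed version; Gemma 3 needs a recent release.
ollama --version
# On Linux, re-running the install script upgrades in place:
curl -fsSL https://ollama.com/install.sh | sh
```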

@MarkWard0110 commented on GitHub (Mar 13, 2025):

> Try updating ollama? I found logs stating you will need to update for gemma 3 to work.

The model card on ollama.com contains the values. The source

@pdevine commented on GitHub (Mar 13, 2025):

The problem is the config setting in the original HF weights didn't specify the `max_position_embeddings` (they omitted it for the 4B/12B/27B models, but it's set correctly for the 1B model). I had shoved in a default 8K setting, which is what the converter picked up when it didn't find the value (I didn't have the correct setting from the DeepMind team when I originally wrote that).

That said, we actually *ignore* this setting in the new ollama engine, so it *should be working correctly*. If you use `/set parameter num_ctx <size>` it will get changed to whatever you set it to. We don't clamp this value like we do with the old engine.

I've also updated the converter to default to 128K, so you should see that if you reconvert the weights or when we re-push the metadata for gemma3.
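
A quick sketch of both routes (the model tag, prompt, and the 131072 value here are just placeholders):

```shell
# Inside an interactive session:
#   ollama run gemma3:4b
#   >>> /set parameter num_ctx 131072
# Or per request over the HTTP API:
curl -s http://localhost:11434/api/generate -d '{
  "model": "gemma3:4b",
  "prompt": "Hello",
  "stream": false,
  "options": { "num_ctx": 131072 }
}'
```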

@MarkWard0110 commented on GitHub (Mar 15, 2025):

> The problem is the config setting in the original HF weights didn't specify the `max_position_embeddings` (they omitted it for the 4B/12B/27B models, but it's set correctly for the 1B model). I had shoved in a default 8K setting, which is what the converter picked up when it didn't find the value (I didn't have the correct setting from the DeepMind team when I originally wrote that).
>
> That said, we actually *ignore* this setting in the new ollama engine, so it *should be working correctly*. If you use `/set parameter num_ctx <size>` it will get changed to whatever you set it to. We don't clamp this value like we do with the old engine.
>
> I've also updated the converter to default to 128K, so you should see that if you reconvert the weights or when we re-push the metadata for gemma3.

I have a program that reads the model card to determine a model's maximum context. It treats the model-card value as the largest context the model supports and will never request a context size above it.
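
Such a tool would presumably read the value via the show endpoint, which illustrates the impact of the stale metadata; a minimal sketch, assuming `jq` is available:

```shell
# Read the advertised maximum context from the local model card.
# For the 4B/12B/27B tags this returns 8192 until the metadata is
# re-pushed, so a tool trusting it will never ask for more.
curl -s http://localhost:11434/api/show -d '{"model": "gemma3:4b"}' \
  | jq '.model_info["gemma3.context_length"]'
```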

@pdevine commented on GitHub (Mar 15, 2025):

@MarkWard0110 We'll update the metadata with the larger `gemma3.context_length` soon. As I mentioned, the weights from Hugging Face didn't set the values correctly for the 4B/12B/27B weights and I picked the wrong default value. There shouldn't be any impact on actually *running* the models in Ollama, other than that the model config looks incorrect.

@rick-github commented on GitHub (Mar 30, 2025):

Models updated; re-pulling will refresh local copies.
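
A sketch of refreshing and verifying one tag (131072 being the expected 128K value):

```shell
# Re-pull to pick up the corrected manifest, then verify.
ollama pull gemma3:4b
ollama show gemma3:4b   # context length should now read 131072 (128K)
```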

Reference: github-starred/ollama#32096