[GH-ISSUE #4717] phi3:medium-128k doesn't use the full context window by default #2974

Closed
opened 2026-04-12 13:21:19 -05:00 by GiteaMirror · 1 comment

Originally created by @derluke on GitHub (May 30, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4717

I was playing with the new phi3:medium-128k model and was surprised to see that it struggled to keep track of my earlier questions or to handle long documents. On the bright side, it was surprisingly fast.

After a little digging, I found out how to specify the context size using a new Modelfile. I decided to give it a go and used this:

```
FROM phi3:medium-128k
TEMPLATE "{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>"
PARAMETER stop <|end|>
PARAMETER stop <|user|>
PARAMETER stop <|assistant|>
PARAMETER num_ctx 65536
```

(I wasn't sure exactly how much 128k is supposed to be. Assuming it is 2**17 = 131072, I decided to be on the safe side and took the next power of two below it, 2**16 = 65536.)
And it worked: the model can now read long documents and behaves as expected. It is rather slow, but that is due to my limited hardware.
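
For reference, the arithmetic behind that choice can be checked in a few lines of Python. This is only a sketch: it assumes "128k" means 128 * 1024 tokens, which the model card may or may not confirm, and `largest_power_of_two_below` is a hypothetical helper name made up for this example.

```python
def largest_power_of_two_below(n: int) -> int:
    """Return the largest power of two strictly less than n (requires n > 1)."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    # (n - 1).bit_length() - 1 is the exponent of the highest set bit of n - 1.
    return 1 << ((n - 1).bit_length() - 1)


# Assumption: "128k" is read as 128 * 1024 tokens.
full_window = 128 * 1024

print(full_window, full_window == 2**17)         # 131072 True
print(largest_power_of_two_below(full_window))   # 65536
```

Under that assumption, 65536 is exactly half of the advertised window, so the Modelfile above is a conservative setting and not the full 128k.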

It would be amazing if these models came pre-configured such that they use the full context window by default.
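
Until then, a per-request override might be an alternative to a custom Modelfile: Ollama's REST API accepts an `options` object, and `num_ctx` can be set there. A minimal sketch, assuming a local server on the default port 11434; `build_request` and `generate` are hypothetical helper names for this example, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, num_ctx: int) -> dict:
    """Build a non-streaming generate payload that asks for a specific context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def generate(model: str, prompt: str, num_ctx: int) -> str:
    """Send the request to the local server and return the generated text."""
    body = json.dumps(build_request(model, prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `generate("phi3:medium-128k", "Summarize this document: ...", 65536)` would request the same 65536-token window as the Modelfile above, with the same memory and speed cost.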

GiteaMirror added the feature request label 2026-04-12 13:21:19 -05:00

@jmorganca commented on GitHub (May 30, 2024):

Thanks for the issue! Will close this for https://github.com/ollama/ollama/issues/1005

Reference: github-starred/ollama#2974