[GH-ISSUE #3848] phi3 doesn't seem to accept SYSTEM prompt #64421

Closed
opened 2026-05-03 17:35:33 -05:00 by GiteaMirror · 18 comments

Originally created by @rb81 on GitHub (Apr 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3848

What is the issue?

Regardless of what's in the modelfile, it seems phi3 doesn't take in the SYSTEM prompt at all. I've looked around and can't find anyone else discussing this. Assuming this is a bug of some kind.
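For context, a minimal Modelfile of the kind being described might look like the sketch below (the SYSTEM text is a hypothetical placeholder, not the reporter's actual prompt):

```
# Hypothetical Modelfile illustrating the setup described in this report.
FROM phi3
SYSTEM """You are a terse assistant. Answer every question in one short sentence."""
```

Such a file would be built and run with something like `ollama create my-phi3 -f Modelfile` followed by `ollama run my-phi3`.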

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.1.32

GiteaMirror added the bug label 2026-05-03 17:35:33 -05:00

@thinkverse commented on GitHub (Apr 23, 2024):

Took a cursory glance at the technical report[^1]. I'm new to AI in general, but it looks to me like Phi3 wasn't trained with a system prompt in mind; the template in the report shows only user and assistant.

```
<|user|>\n Question <|end|>\n <|assistant|>
```

The Modelfile[^2] sample provided by Microsoft also doesn't include a system prompt. Perhaps @jmorganca or someone else from the Ollama team can shed some light on it. They know more than I do. 👍

[^1]: https://arxiv.org/html/2404.14219v1#S2
[^2]: https://github.com/Azure-Samples/Phi-3MiniSamples/blob/main/ollama/Modelfile


@thinkverse commented on GitHub (Apr 23, 2024):

Circling back to this after some testing. From my testing, it kind of supports it, but not really. The results are not consistent at all. I've tried different system prompts; some it kind of adhered to, and others it ignored completely.

The best result I had was with Ollama's Mario example, where Phi3 did not act as Mario the way, for instance, Llama3 does, but instead acted as an assistant answering questions about the Mushroom Kingdom.

The template on the HuggingFace READMEs shows `<|system|>`, but not on the `GGUF` and `ONNX` variants. And the sample Modelfile on HuggingFace[^1] again doesn't have `<|system|>`, so maybe it was trained with it for a bit and then they decided to scrap it?

If I were going to use Phi3, I wouldn't bother with a system prompt given these results.

[^1]: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/blob/main/Modelfile_q4


@rb81 commented on GitHub (Apr 24, 2024):

@thinkverse - Thanks for your comments. Perhaps they did scrap the system prompt to keep the model's training simpler? In any case, feeding in a system prompt as the user is good enough for a model this small, I guess.
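A rough sketch of that workaround over Ollama's REST API, for anyone wanting to script it (the payload follows the documented `/api/chat` shape; the model tag and prompt text are illustrative, not the reporter's actual values):

```python
import requests

# Workaround sketch: fold the would-be system instructions into the user turn,
# since the system role was reportedly being ignored at this point.
instructions = "Only respond in French."   # stand-in for the system prompt
question = "Hi! How are you today?"

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "phi3",
        "messages": [
            {"role": "user", "content": f"{instructions}\n\n{question}"},
        ],
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])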


@jmorganca commented on GitHub (Apr 24, 2024):

@rb81 Phi 3 has been updated with a system prompt as per their published tokenizer configuration.

```
% ollama run phi3
>>> /set system You are a helpful assistant that always answers in french.
Set system message.
>>> Hi!
Bonjour ! En tant qu'assistant, je suis toujours prêt à répondre et à vous assister dans votre langue maternelle française.
Comment puis-je vous aider aujourd'hui ?
```

@thinkverse commented on GitHub (Apr 24, 2024):

So basically the system prompts I was testing with were just bad. 😂 Time to learn how to write better prompts. 👍


@lostmygithubaccount commented on GitHub (Apr 24, 2024):

From some limited testing yesterday, phi3 doesn't seem great at following system prompts/instructions. This was also an issue with phi2 that made it hard to use for anything 🤷


@rb81 commented on GitHub (Apr 24, 2024):

> Phi 3 has been updated with a system prompt as per their published tokenizer configuration.
>
> ```
> % ollama run phi3
> >>> /set system You are a helpful assistant that always answers in french.
> Set system message.
> >>> Hi!
> Bonjour ! En tant qu'assistant, je suis toujours prêt à répondre et à vous assister dans votre langue maternelle française.
> Comment puis-je vous aider aujourd'hui ?
> ```

So it seems setting the system prompt at runtime as you did works, but setting it in the modelfile doesn't. Have you had the same experience?

**Edit:** It seems phi3 decides which system prompts to follow 😂 More elaborate prompts don't work, but something simple like "Only respond in French." works just fine. Weird.
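One way to check what a packaged model actually carries (a hedged suggestion; the exact output varies between Ollama versions) is to dump the effective Modelfile and see whether its TEMPLATE references the system message at all:

```
# Print the Modelfile Ollama uses for the tag, including TEMPLATE and any baked-in SYSTEM.
ollama show phi3 --modelfile

# The template and system message can also be inspected individually.
ollama show phi3 --template
ollama show phi3 --system
```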


@jxtt01 commented on GitHub (Apr 29, 2024):

Does anyone involved in this issue understand how Phi-3-Mini's system prompt is handled under the hood of Ollama? I'd like to replicate it in my own code, but I've run into the same issue that thinkverse described earlier (quoted below for clarity).

> Took a cursory glance at the technical report[1] and I'm new to AI in general but it looks to me like Phi3 wasn't trained with a system prompt in mind, the template in the report shows only user and assistant.
>
> ```
> <|user|>\n Question <|end|>\n <|assistant|>
> ```
>
> The Modelfile[2] sample provided by Microsoft also doesn't include a system prompt. Perhaps @jmorganca or someone else from the Ollama team can shed some light on it. They know more than I do. 👍
>
> Footnotes:
>
> 1. https://arxiv.org/html/2404.14219v1#S2
> 2. https://github.com/Azure-Samples/Phi-3MiniSamples/blob/main/ollama/Modelfile

@wuriyanto48 commented on GitHub (May 25, 2024):

I just noticed there is no `<|system|>` token in phi3's `chat_template` config: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/blob/main/tokenizer_config.json

Adding this config to the tokenizer, it works:

```python
tokenizer.use_default_system_prompt = True
tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + '<|end|>' }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"
```

I took the `chat_template` config from https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/tokenizer_config.json
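To sanity-check that the patched template actually emits a `<|system|>` block, the chat can be rendered to a plain string before any generation happens. A small sketch (the model id and messages are just examples; the template string is the one shown above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Patch in the template from the comment above so the system role is rendered.
tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + '<|end|>' }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"

messages = [
    {"role": "system", "content": "Only respond in French."},
    {"role": "user", "content": "Hi!"},
]

# tokenize=False returns the formatted prompt string instead of token ids,
# so the <|system|> section can be inspected directly.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```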


@spike-xiong commented on GitHub (May 31, 2024):

It seems like Phi3 doesn't accept the system role; however, they said Phi3 has 3 roles in their cookbook:
https://github.com/microsoft/Phi-3CookBook/blob/main/md/02.QuickStart/Huggingface_QuickStart.md


@spike-xiong commented on GitHub (May 31, 2024):

> I just noticed there is no `<|system|>` token in phi3's `chat_template` config: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/blob/main/tokenizer_config.json
>
> Adding this config to the tokenizer, it works:
>
> ```python
> tokenizer.use_default_system_prompt = True
> tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + '<|end|>' }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"
> ```
>
> I took the `chat_template` config from https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/tokenizer_config.json

Great, thank you! And you don't even need this line: `tokenizer.use_default_system_prompt = True`


@wuriyanto48 commented on GitHub (May 31, 2024):

Thank you @CaptXiong for pointing it out.


@rb81 commented on GitHub (May 31, 2024):

@wuriyanto48 @CaptXiong - I'm assuming what you guys figured out requires modifying the source code and recompiling, correct? If so, can we get the project team to look at this?


@wuriyanto48 commented on GitHub (May 31, 2024):

@rb81 You don't need to recompile the source code, just do it like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Pick a device for the pipeline (the original snippet assumed `device` was already defined).
device = "cuda" if torch.cuda.is_available() else "cpu"

model_id = "microsoft/Phi-3-mini-4k-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {
        "role": "system",
        "content": "You are GoodBot, You are a helpful assistant.",
    },
    {
        "role": "user",
        "content": "who are you?",
    },
]

# Override the chat template so the system role is rendered as <|system|> ... <|end|>.
tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + '<|end|>' }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
```

Hope this helps.


@rb81 commented on GitHub (May 31, 2024):

> @rb81 You don't need to recompile the source code, just do it like this: *[full code example quoted above]*

Thanks @wuriyanto48, much appreciated!


@jmorganca commented on GitHub (Jul 5, 2024):

Hey folks, will close this for now, but let me know if you're still seeing issues.


@rcastberg commented on GitHub (Aug 21, 2024):

Using Python code with the Phi models, I can get them to follow a system instruction and only return what I instruct them to.
This does not work when the model is loaded into Ollama: see the examples below, where I request that it only return "Yes", "No", or "I don't know".

I have attempted to include this in the user message, but it doesn't seem to listen to that either. Is there any way to get phi to listen to system messages in Ollama?

**Example in Ollama:**

Same behaviour whatever phi model I choose.

```
$ ollama run phi3:3.8b-instruct
>>> /set system As a knowledge responder, your task is to provide a direct answer to a factual question. Your response options are limited to "Yes", "No", or "I dont know"
Set system message.

>>> /show system
As a knowledge responder, your task is to provide a direct answer to a factual question. Your response options are limited to "Yes", "No", or "I dont know"

>>> Does a cheetah have stripes?
No. A cheetah has spots instead of stripes, which help with camounerage in their natural habitat. The black tear marks
under the eyes may also aid them to see better at high speeds by counteracting sun glare off dusty surfaces. While they
are similar looking animals called leopards and jaguars that have striking orange or spotted coats, cheetahs do not have
stripes on their fur.
```

**Example test script in Python:**

```python
import torch
from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "microsoft/Phi-3-mini-4k-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {
        "role": "system",
        "content": 'As a knowledge responder, your task is to provide a direct answer to a factual question. Your response options are limited to "Yes", "No", or "I dont know"',
    },
    {
        "role": "user",
        "content": "Does a cheetah have stripes?",
    },
]

tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + '<|end|>' }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + '<|end|>' }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device="cuda",
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output)
```

**Result:** `[{'generated_text': ' No'}]`
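For an apples-to-apples comparison with the transformers run above, the same system/user pair can also be sent through Ollama's `/api/chat` endpoint. A sketch (the model tag mirrors the one used in the Ollama example above, and `temperature: 0` is an assumption meant to mirror `do_sample=False`):

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "phi3:3.8b-instruct",  # same tag as the Ollama example above
        "messages": [
            {
                "role": "system",
                "content": 'As a knowledge responder, your task is to provide a '
                           'direct answer to a factual question. Your response '
                           'options are limited to "Yes", "No", or "I dont know"',
            },
            {"role": "user", "content": "Does a cheetah have stripes?"},
        ],
        "options": {"temperature": 0},  # greedy-ish decoding, like do_sample=False
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```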


@rick-github commented on GitHub (Aug 25, 2024):

phi3:3.8b-instruct is the q4_0 quantized version of the model. If you use the fp16 version, it follows instructions better.

```
$ ollama run phi3:3.8b-mini-128k-instruct-fp16
>>> /set system As a knowledge responder, your task is to provide a direct answer to a factual question. Your response options are limited to "Yes", "No", or "I dont know"
Set system message.
>>> Does a cheetah have stripes?
No
```