[GH-ISSUE #15056] Ollama Cloud kimi-k2.5 does not process SystemMessage correctly. #71723

Open
opened 2026-05-05 02:25:00 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @weiyinfu on GitHub (Mar 25, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15056

What is the issue?

I am using the LangChain framework, and I found the root cause. It's still a problem.
The input may contain multiple messages. The system message must be placed first; any system message that comes after a user message is ignored.

Relevant log output


```shell
curl -s localhost:11434/v1/chat/completions -d '{
  "model":"kimi-k2.5:cloud",
  "messages":[
    {"role":"user","content":"Tell me current time without calling any tools."},
    {"role":"system","content":"Current time is '"$(date)"'"}
  ],"stream":false}' | jq -r '.choices[0].message.content'

# If I change the order, the answer is right.

curl -s localhost:11434/v1/chat/completions -d '{
  "model":"kimi-k2.5:cloud",
  "messages":[
    {"role":"system","content":"Current time is '"$(date)"'"},
    {"role":"user","content":"Tell me current time without calling any tools."}
  ],"stream":false}' | jq -r '.choices[0].message.content'
```
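Until this is addressed server-side, one client-side workaround is to normalize the message list before sending it, hoisting every system message to the front. A minimal sketch over plain OpenAI-style message dicts (the function name `hoist_system_messages` is illustrative, not part of any API):

```python
def hoist_system_messages(messages):
    """Return a new message list with all system messages moved to the
    front, preserving their relative order, so endpoints that only honor
    a leading system message still see them."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest


messages = [
    {"role": "user", "content": "Tell me current time without calling any tools."},
    {"role": "system", "content": "Current time is Wed Mar 25 10:00:00 UTC 2026"},
]
print(hoist_system_messages(messages)[0]["role"])  # -> system
```

The original list is left untouched, so this can be applied right before the HTTP request without disturbing the caller's state.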

OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the cloud, bug labels 2026-05-05 02:25:01 -05:00
Author
Owner

@rick-github commented on GitHub (Mar 25, 2026):

Putting the system message first is normal practice. Why do you need to intersperse system messages?

Author
Owner

@weiyinfu commented on GitHub (Mar 26, 2026):

Best practice is not a requirement. When we say "best practice", it is not mandatory.
The first position has room for only one message, but I have multiple system messages. How should I organize them?
Kimi, OpenAI, and most LLM providers allow multiple system messages and impose no requirement on message order.

As shown in the Python demo, there is a case where multiple system messages are needed.

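One way to reconcile "only the leading system message counts" with having several system messages is to concatenate them into a single system message at index 0 before sending the request. A hedged sketch over plain OpenAI-style message dicts (the function name `merge_system_messages` is hypothetical, not a library API):

```python
def merge_system_messages(messages, sep="\n\n"):
    """Collapse every system message into one system message at index 0,
    joining their contents in original order; other messages keep their
    relative order after it."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return list(rest)
    return [{"role": "system", "content": sep.join(system_parts)}] + rest
```

This keeps the instructions from every system message visible to models that ignore any system message placed after a user turn.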
Author
Owner

@weiyinfu commented on GitHub (Mar 26, 2026):

> Putting the system message first is normal practice. Why do you need to intersperse system messages?

I am using LangChain's deepagents framework. It has a system_instruction which occupies the first system message slot.

```python
from datetime import datetime

from deepagents import create_deep_agent
from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI


def test_model_system_prompt(m: ChatOpenAI):
    # deepagents puts system_prompt into the leading system message,
    # so the SystemMessage below ends up after the user message.
    agent = create_deep_agent(model=m, system_prompt="You are a helpful agent.")
    result = agent.invoke(
        {
            "messages": [
                {"role": "user", "content": "Tell me current time."},
                SystemMessage(content=f"Current time is {datetime.now()}"),
            ]
        }
    )
    print(result["messages"][-1].content)
```

The reason I need to add a SystemMessage is this:
When the user queries "recommend some papers on cancer cure methods", the agent should call paperSearch("cancer cure 2026") rather than paperSearch("cancer cure 2025"). The agent's model was trained in 2025, so it thinks the current year is 2025.
So this SystemMessage is necessary.
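If the model only honors a leading system message, another workaround for this scenario is to fold dynamic context such as the current time into the user turn rather than a second SystemMessage. A sketch over plain message dicts (the name `with_time_context` is illustrative; this is not a deepagents API):

```python
from datetime import datetime


def with_time_context(user_text, now=None):
    """Build a user message with the current time prepended, so models
    that ignore trailing system messages still see it."""
    now = now or datetime.now()
    return {
        "role": "user",
        "content": f"(Current time: {now:%Y-%m-%d %H:%M}) {user_text}",
    }


print(with_time_context("Tell me current time.",
                        now=datetime(2026, 3, 25, 9, 30))["content"])
# -> (Current time: 2026-03-25 09:30) Tell me current time.
```

Passing a fixed `now` keeps the agent's framework-owned system prompt untouched while still giving the model an up-to-date clock.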


Reference: github-starred/ollama#71723