[GH-ISSUE #5952] find system prompt encapsulation error in mistral-nemo 12b #3718

Closed
opened 2026-04-12 14:31:48 -05:00 by GiteaMirror · 4 comments

Originally created by @map9 on GitHub (Jul 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5952

What is the issue?

I used autogen + Ollama with the mistral-nemo 12b model, and I found that Ollama either drops the system message or loses user messages. Perhaps the mistral-nemo 12b model template is defined incorrectly.

case 1:

from autogen import AssistantAgent  # autogen is assumed to be installed

extractor_system_message = "...extractor_system_message..."
extractor = AssistantAgent(
    "Extractor",
    system_message=extractor_system_message,
    llm_config=llm_config,  # llm_config (defined elsewhere) points at the local Ollama server
    human_input_mode="NEVER",
)

messages = [{"content": "...message...", "role": "user", "name": "Initializer"}]

ollama output:
time=2024-07-25T21:44:27.533+08:00 level=DEBUG source=routes.go:1337 msg="chat request" images=0 prompt="[INST]...message...[/INST]"


The error: the system prompt is lost.
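
To rule out autogen, case 1 can be reproduced directly against Ollama's /api/chat endpoint. A minimal sketch, assuming the requests library is installed, a local server on the default port, and that the mistral-nemo tag has been pulled:

import requests

# Send an explicit system message straight to Ollama, then check the
# server's DEBUG log for the rendered prompt, as in the output above.
resp = requests.post(
    "http://localhost:11434/api/chat",  # default Ollama address
    json={
        "model": "mistral-nemo",
        "messages": [
            {"role": "system", "content": "...extractor_system_message..."},
            {"role": "user", "content": "...message..."},
        ],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])

If the system text is also missing from the logged prompt here, the problem is on the Ollama/template side rather than in autogen.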

case 2:

from autogen import AssistantAgent  # same setup as case 1

extractor_system_message = "...extractor_system_message..."
extractor = AssistantAgent(
    "Extractor",
    system_message=extractor_system_message,
    llm_config=llm_config,
    human_input_mode="NEVER",
)

messages = [{"content": "...message1...", "role": "user", "name": "Initializer"},
{"content": "...message2...", "role": "user", "name": "Extractor"},
{"content": "...message3...", "role": "user", "name": "Editor"}]

ollama output:
time=2024-07-25T21:44:39.481+08:00 level=DEBUG source=routes.go:1337 msg="chat request" images=0 prompt="[INST] ...extractor_system_message...\n\n\n...message3...[/INST]"


The error: message1 and message2 are lost.

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.2.8

GiteaMirror added the bug label 2026-04-12 14:31:48 -05:00

@map9 commented on GitHub (Jul 25, 2024):

I see the same error with the llama3.1 and glm4 models.


@rick-github commented on GitHub (Jul 25, 2024):

Ollama server logs may make it easier to diagnose the issue.


@map9 commented on GitHub (Jul 25, 2024):

OK, I found it: the input messages are being truncated by the default num_ctx of 2048.
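
One way to confirm the truncation without reading the debug logs is to check prompt_eval_count in the API response. A sketch, assuming the requests library and a local Ollama server; with the default num_ctx of 2048, the count tops out near the context limit instead of covering the whole input:

import requests

# Deliberately send more input than fits in a 2048-token context.
long_history = [
    {"role": "user", "content": "word " * 3000},
    {"role": "user", "content": "final question"},
]
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "mistral-nemo", "messages": long_history, "stream": False},
)
# prompt_eval_count reports how many prompt tokens were actually
# evaluated; a value pinned near num_ctx indicates that earlier
# messages were cut off.
print(resp.json().get("prompt_eval_count"))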


@rick-github commented on GitHub (Jul 25, 2024):

You can either adjust the API call to add "options": {"num_ctx": 8192}, or create a new model with a different default context size:

$ ollama show --modelfile mistral-nemo:12b-instruct-2407-q4_K_M > Modelfile
# edit Modelfile and add "PARAMETER num_ctx 8192"
$ ollama create my-mistral-nemo -f Modelfile

Change 8192 to whatever size you need, but be aware that the larger the context window, the more VRAM/RAM is needed.
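
The first option, a per-request override on the API call, looks like this in Python. A sketch against Ollama's REST API, assuming the requests library; the model tag is the one from the ollama show command above:

import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "mistral-nemo:12b-instruct-2407-q4_K_M",
        "messages": [{"role": "user", "content": "...message..."}],
        # Per-request context-window override, as suggested above.
        "options": {"num_ctx": 8192},
        "stream": False,
    },
)
print(resp.json()["message"]["content"])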

Reference: github-starred/ollama#3718