[GH-ISSUE #6477] Llama3.1 template doesn't work well with multi function calling as well as Environment: ipython mode #4076

Open
opened 2026-04-12 14:58:56 -05:00 by GiteaMirror · 2 comments

Originally created by @martinkozle on GitHub (Aug 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6477

### What is the issue?

## Tool descriptions

The current template checks whether the final message has Role "user" to decide whether to append the tool descriptions to it:

```go
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
...
```

In a multi function calling use case, however, the last two messages are an assistant tool call followed by a tool result, and the assistant is expected to continue and call another tool instead of giving the final response. In that case the tool descriptions aren't added anywhere, because the last message isn't a user message.
This behavior is fine for single function calling, where the assistant doesn't need the tool descriptions to generate the final response, but it breaks multi function calling entirely: the assistant doesn't know what tools exist or how to call them for the second function call.

My proposed solution is:


```go
{{- $lastUserIdx := -1 }}
{{- range $i, $_ := .Messages }}
{{- with eq .Role "user" }}
{{- $lastUserIdx = $i }}{{ end }}
{{- end }}

{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools (eq $i $lastUserIdx) }}
...
```

This adds the descriptions to the last user message, which doesn't necessarily have to be the last message overall.
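
To make the difference concrete, here is a minimal, self-contained Go harness that renders both the current and the proposed logic over a conversation ending in a tool result. This is not Ollama's actual code: the `Message` shape, the example conversation, and the `[tool descriptions]` placeholder are simplified stand-ins for illustration.

```go
// Hypothetical harness comparing the current and proposed template logic.
package main

import (
	"fmt"
	"os"
	"text/template"
)

// Message is a simplified stand-in for Ollama's real message type.
type Message struct {
	Role    string
	Content string
}

// Current logic: tools are injected only when the *final* message is a user message.
const current = `{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}
{{- if and $.Tools $last }}[tool descriptions]{{ end }}
user: {{ .Content }}
{{ end }}
{{- end }}`

// Proposed logic: a first pass records the index of the last user message,
// so tools are injected there even when later assistant/tool messages exist.
const proposed = `{{- $lastUserIdx := -1 }}
{{- range $i, $_ := .Messages }}
{{- if eq .Role "user" }}{{ $lastUserIdx = $i }}{{ end }}
{{- end }}
{{- range $i, $_ := .Messages }}
{{- if eq .Role "user" }}
{{- if and $.Tools (eq $i $lastUserIdx) }}[tool descriptions]{{ end }}
user: {{ .Content }}
{{ end }}
{{- end }}`

func render(name, text string, data any) {
	t := template.Must(template.New(name).Parse(text))
	fmt.Printf("--- %s ---\n", name)
	if err := t.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
	fmt.Println()
}

func main() {
	// Multi function calling: the conversation ends with a tool result, and
	// the model is expected to issue a second tool call next.
	data := map[string]any{
		"Tools": true,
		"Messages": []Message{
			{Role: "user", Content: "Compare the weather in Skopje and Ohrid."},
			{Role: "assistant", Content: `{"name": "get_weather", "parameters": {"city": "Skopje"}}`},
			{Role: "tool", Content: `{"temperature_c": 18}`},
		},
	}
	render("current", current, data)   // [tool descriptions] never appears
	render("proposed", proposed, data) // [tool descriptions] is attached to the last user message
}
```

With the current logic the placeholder never appears, because the final message has role "tool"; with the proposed logic it is attached to the last user message, so the model still sees the tool definitions when deciding on the second call.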

## Usage of `<|eom_id|>` token

According to the Meta documentation at https://llama.meta.com/docs/model-cards-and-prompt-formats/ (the Meta Llama docs are down at the time of writing), when using `Environment: ipython` the assistant's tool-call turn should end with an `<|eom_id|>` token instead of `<|eot_id|>`, signaling that a tool response is expected next; only the assistant's final response should end with `<|eot_id|>`. The jinja2 template in `tokenizer_config.json` does this, but the Ollama template does not: it adds no token at all after the assistant tool call.
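Concretely, a rendered ipython-mode exchange should look roughly like this (the `get_weather` tool, its arguments, and the values are made up for illustration):

```
<|start_header_id|>user<|end_header_id|>

What's the weather in Skopje?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{"name": "get_weather", "parameters": {"city": "Skopje"}}<|eom_id|><|start_header_id|>ipython<|end_header_id|>

{"temperature_c": 18}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

It is currently 18°C in Skopje.<|eot_id|>
```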
The parts of the jinja2 template that implement this:

```jinja2
...
{%- if builtin_tools is defined or tools is not none %}
    {{- "Environment: ipython\n" }}
...
{%- for message in messages %}
  ...
        {%- if builtin_tools is defined %}
          {#- This means we're in ipython mode #}
          {{- "<|eom_id|>" }}
        {%- else %}
          {{- "<|eot_id|>" }}
        {%- endif %}
...
```

What needs to be added to the Ollama template (the `<|eom_id|>` at the end):

```
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}<|eom_id|>
```
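
For context, here is a hedged sketch of how the whole assistant branch could mirror the jinja2 logic above. This illustrates the shape only and is not Ollama's exact template; `$last` is assumed to come from the enclosing range, as in the earlier excerpt:

```go
{{- if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>

{{ if .ToolCalls }}
{{- /* tool-call turn: end with <|eom_id|> so the model expects a tool result next */ -}}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}<|eom_id|>
{{- else }}
{{- /* normal turn: end with <|eot_id|>, unless this is the open turn being generated */ -}}
{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- end }}
```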

## Conclusion

These two issues lead to poorer performance when using Llama3.1 in Ollama for multi function calling through the `chat` API. I believe both should be addressed in order to achieve the intended quality and results from Llama3.1.

### OS

Linux

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.3.5

GiteaMirror added the bug label 2026-04-12 14:58:56 -05:00

@thedan158 commented on GitHub (Feb 10, 2025):

Any update on this? I'm still encountering this with llama3.2 and 3.3, using ChatOllama from LangChain and also ollamajs.


@jean-rl commented on GitHub (Apr 3, 2025):

I also noticed this issue and it's really hurting the performance of one of my agents.

Reference: github-starred/ollama#4076