[GH-ISSUE #8312] ChatOllama does not work with tools reliably in LangChain #5323

Closed
opened 2026-04-12 16:31:30 -05:00 by GiteaMirror · 6 comments

Originally created by @dlin95123 on GitHub (Jan 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8312

What is the issue?

I use ChatOllama in LangChain and find that it does not handle tool output correctly. I reported the bug in LangChain's GitHub issues, but I was told to report it here instead. With the same LLM model, if I use ChatGroq instead of ChatOllama, it works correctly. Others have reported similar issues.

The issue I reported at LangChain's GitHub is: https://github.com/langchain-ai/langchain/issues/28934

Another similar issue reported there is: https://github.com/langchain-ai/langchain/issues/29030

It would be great if you could address this issue ASAP as it blocks users from using Ollama in more advanced applications.
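
For reference, here is a minimal sketch of the kind of tool-calling setup this report is about, assuming the `langchain-ollama` package is installed and a tool-capable model such as `llama3.1` is pulled locally (the tool and model names are just illustrative):

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama


# Illustrative tool for the example; any LangChain tool works the same way.
@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


# Illustrative model name; any locally pulled, tool-capable model should do.
llm = ChatOllama(model="llama3.1")
llm_with_tools = llm.bind_tools([add])

msg = llm_with_tools.invoke("What is 2 + 3?")
# With ChatGroq the same call returned correct tool calls for the reporter;
# with ChatOllama the tool output was not handled correctly in larger chains.
print(msg.tool_calls)
```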

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.5.4

GiteaMirror added the bug label 2026-04-12 16:31:30 -05:00

@ofri commented on GitHub (Jan 5, 2025):

@dlin95123 My issue (29030) did get resolved by updating to the latest Ollama (apparently I was running an old version). Admittedly, though, I have not gotten to very complex tool-usage cases yet :)


@dlin95123 commented on GitHub (Jan 5, 2025):

Thanks for the comments, @ofri. I have run into this problem with more advanced tool applications. The only way to get my code to work with the same LLM model is to replace ChatOllama with ChatGroq. ChatGPT works great too, but I want to run my LLM locally...


@rick-github commented on GitHub (Jan 5, 2025):

Have you tried increasing the context size (`num_ctx`)?
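
For concreteness, a minimal sketch of how the context size can be raised from LangChain, assuming `langchain-ollama` exposes `num_ctx` as a model parameter (model name illustrative; 8192 is the value that ended up working below):

```python
from langchain_ollama import ChatOllama

# num_ctx raises the context window above the server's default; tool schemas
# and intermediate tool messages can otherwise overflow a small context.
llm = ChatOllama(model="llama3.1", num_ctx=8192)
```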


@dlin95123 commented on GitHub (Jan 5, 2025):

@rick-github Thank you for the pointer. It works for me after increasing the context size to 8192. Is there a max for setting the context size?


@rick-github commented on GitHub (Jan 6, 2025):

llama3.1 has a max context size of 131072 tokens:

```console
$ ollama show llama3.1
  Model
    architecture        llama
    parameters          8.0B
    context length      131072
    embedding length    4096
    quantization        Q4_0

  Parameters
    stop    "<|start_header_id|>"
    stop    "<|end_header_id|>"
    stop    "<|eot_id|>"

  License
    LLAMA 3.1 COMMUNITY LICENSE AGREEMENT
    Llama 3.1 Version Release Date: July 23, 2024
```
But setting it this high will require more VRAM. You can mitigate this a bit by setting `OLLAMA_NUM_PARALLEL=1` if you are not going to do concurrent completions:

| num_ctx | VRAM |
|---|---|
| 2048 | 5334 MiB |
| 4096 | 5628 MiB |
| 8192 | 6404 MiB |
| 16384 | 7956 MiB |
| 65536 | 17268 MiB |
| 131072 | 29684 MiB |

You can reduce that a little further with [flash attention](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-enable-flash-attention) (`OLLAMA_FLASH_ATTENTION=1`):

| num_ctx | VRAM |
|---|---|
| 2048 | 5334 MiB |
| 4096 | 5590 MiB |
| 8192 | 6102 MiB |
| 16384 | 7126 MiB |
| 65536 | 13270 MiB |
| 131072 | 21620 MiB |
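
As a quick check, the maximum context length shown above can also be read programmatically from the server's `/api/show` endpoint; a rough sketch, assuming the default local address and that the response exposes the architecture-prefixed `llama.context_length` field:

```python
import requests

# Ask the local Ollama server for the model's details (default address assumed).
resp = requests.post(
    "http://localhost:11434/api/show",
    json={"model": "llama3.1"},
    timeout=30,
)
resp.raise_for_status()

# For llama-architecture models the maximum context size is reported under
# the "llama.context_length" key of "model_info" (assumed field names).
model_info = resp.json().get("model_info", {})
print(model_info.get("llama.context_length"))  # expected: 131072
```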

@dlin95123 commented on GitHub (Jan 6, 2025):

Thanks again for the info. I can close this issue.

Reference: github-starred/ollama#5323