[GH-ISSUE #10712] toolcall supports streaming output #69097

Closed
opened 2026-05-04 17:08:41 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @williamlzw on GitHub (May 15, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10712

I use the C# OllamaSharp library to call Ollama models, but I found that the tool call feature does not support streaming output.
When I call the qwen API through the OpenAI C# SDK instead, tool calls do support streaming output.
baseurl: https://dashscope.aliyuncs.com/compatible-mode/v1
model: qwen3-235b-a22b
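For context, OpenAI-compatible endpoints stream tool calls as a sequence of deltas: each chunk carries fragments of the function name and JSON arguments, keyed by a tool-call index, and the client stitches them back together. A minimal sketch of that accumulation step (the delta dicts are hypothetical examples shaped like OpenAI chat-completion chunks, not output captured from Ollama or DashScope):

```python
def accumulate_tool_calls(deltas):
    """Merge streamed tool-call deltas (OpenAI-compatible chunk shape)
    into complete tool calls, ordered by their stream index."""
    calls = {}
    for delta in deltas:
        for tc in delta.get("tool_calls") or []:
            # each fragment names its slot via "index"
            slot = calls.setdefault(tc["index"], {"id": "", "name": "", "arguments": ""})
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function") or {}
            if fn.get("name"):
                slot["name"] += fn["name"]
            if fn.get("arguments"):
                # argument JSON arrives in pieces and is concatenated
                slot["arguments"] += fn["arguments"]
    return [calls[i] for i in sorted(calls)]
```

This is what lets a client render a tool call progressively instead of waiting for the full response.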

GiteaMirror added the feature request label 2026-05-04 17:08:41 -05:00
Author
Owner

@williamlzw commented on GitHub (May 15, 2025):

```python
import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the next line with
    # your Model Studio API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # How to get an API key: https://help.aliyun.com/zh/model-studio/developer-reference/get-api-key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen3-235b-a22b",  # Model list: https://help.aliyun.com/zh/model-studio/getting-started/models
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': '你是谁?'}  # "Who are you?"
    ]
)
print(completion.choices[0].message.content)
```
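The snippet above makes a plain, non-streaming request. The streaming tool-call variant at the heart of this issue would look roughly like this (a sketch: `get_weather` and its schema are made-up examples, and the actual network call is left commented out):

```python
def make_stream_request_kwargs():
    """Build kwargs for a streaming chat completion that offers a tool.
    The tool schema below is a made-up example."""
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]
    return dict(
        model="qwen3-235b-a22b",
        messages=[{"role": "user", "content": "What is the weather in Paris?"}],
        tools=tools,
        stream=True,  # the point of the issue: tool calls arriving as deltas
    )

# stream = client.chat.completions.create(**make_stream_request_kwargs())
# for chunk in stream:
#     delta = chunk.choices[0].delta
#     if delta.tool_calls:
#         print(delta.tool_calls)  # partial name/arguments fragments
```

Against DashScope this streams tool-call fragments chunk by chunk; the report is that Ollama's endpoint did not behave this way at the time.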
Author
Owner

@rick-github commented on GitHub (May 15, 2025):

#10415

Author
Owner

@williamlzw commented on GitHub (May 19, 2025):

When the OpenAI SDK makes a tool call, it exposes the raw response data in real time, so the client can parse the think tags and render them as they stream in. That is what produces the effect of a tool call whose think output renders as a stream. Ollama's tool-call parsing would need a callback added that exposes the raw data in the same way.
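The real-time parsing described here can be sketched as an incremental splitter that separates think-tag content from the answer while withholding a possible partial tag at the end of the buffer. The `<think>`/`</think>` tag names follow the convention used by Qwen-style models; the function below is an illustrative sketch, not Ollama's or the SDK's actual code:

```python
def split_think_stream(chunks):
    """Incrementally separate <think>...</think> content from answer text
    in a stream of text deltas. Returns (thinking_text, answer_text)."""
    OPEN, CLOSE = "<think>", "</think>"
    buf, thinking, answer = "", [], []
    in_think = False
    for chunk in chunks:
        buf += chunk
        while True:
            tag = CLOSE if in_think else OPEN
            i = buf.find(tag)
            if i >= 0:
                # complete tag found: flush text before it, flip state
                (thinking if in_think else answer).append(buf[:i])
                buf = buf[i + len(tag):]
                in_think = not in_think
            else:
                # hold back any prefix of the tag dangling at the buffer's end,
                # since the rest of the tag may arrive in the next chunk
                hold = 0
                for k in range(1, len(tag)):
                    if buf.endswith(tag[:k]):
                        hold = k
                keep = len(buf) - hold
                (thinking if in_think else answer).append(buf[:keep])
                buf = buf[keep:]
                break
    (thinking if in_think else answer).append(buf)  # flush remainder
    return "".join(thinking), "".join(answer)
```

In a real client each flushed piece would be rendered immediately rather than collected into lists, but the buffering logic is the same.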

Author
Owner

@williamlzw commented on GitHub (May 19, 2025):

https://github.com/imxcstar/Tinvo/blob/master/Tinvo.Provider.OpenAI/AIScheduler/OpenAIProviderParser.cs
Author
Owner

@williamlzw commented on GitHub (May 20, 2025):

https://github.com/ollama/ollama/issues/8529

Reference: github-starred/ollama#69097