[GH-ISSUE #15457] OpenAI-compatible streaming: tool_calls index is always 0 for multiple tool calls #9880

Closed
opened 2026-04-12 22:44:32 -05:00 by GiteaMirror · 0 comments

Originally created by @CPIDLE on GitHub (Apr 9, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15457

Originally assigned to: @drifkin on GitHub.

What is the issue?

When a model returns multiple tool calls in a single streaming response via the OpenAI-compatible API (/v1/chat/completions), all tool call chunks have index: 0 instead of incrementing indices (0, 1, 2...).

This is different from #7881 (https://github.com/ollama/ollama/issues/7881), which was about the index field being missing entirely (fixed in v0.4.7). The field is now present but always 0.

Reproduction

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "stream": true,
    "messages": [
      {"role": "system", "content": "Use the provided tools."},
      {"role": "user", "content": "Create hello.py with print(\"hello\") and world.py with print(\"world\")."}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "file_write",
        "description": "Write content to a file",
        "parameters": {
          "type": "object",
          "properties": {
            "filePath": {"type": "string"},
            "content": {"type": "string"}
          },
          "required": ["filePath", "content"]
        }
      }
    }]
  }'

Actual output (abbreviated)

data: {...,"tool_calls":[{"id":"abc123","function":{"arguments":"{\"filePath\":\"hello.py\",...}","name":"file_write"},"type":"function","index":0}]}
data: {...,"tool_calls":[{"id":"def456","function":{"arguments":"{\"filePath\":\"world.py\",...}","name":"file_write"},"type":"function","index":0}]}

Both chunks have "index": 0 despite having different id values.

Expected output

The second tool call should have "index": 1:

data: {...,"tool_calls":[{...,"index":0}]}   # first tool call
data: {...,"tool_calls":[{...,"index":1}]}   # second tool call

Per OpenAI's streaming spec (https://platform.openai.com/docs/api-reference/chat/streaming), index should enumerate tool calls sequentially.
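
For reference, a minimal sketch (TypeScript, illustrative only; not Ollama's actual Go code, and all names here are made up) of how an emitter can number tool calls per the spec: keep a counter per streamed response and stamp each new tool call with the next value.

// Hypothetical sketch: assign sequential indices within one streamed response.
interface ToolCallDelta {
  id: string;
  index: number;
  type: "function";
  function: { name: string; arguments: string };
}

function makeToolCallNumberer() {
  let nextIndex = 0; // reset for each new chat completion response
  return (id: string, name: string, args: string): ToolCallDelta => ({
    id,
    index: nextIndex++, // 0, 1, 2, ... instead of always 0
    type: "function",
    function: { name, arguments: args },
  });
}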

Impact

The Vercel AI SDK (@ai-sdk/openai-compatible) uses index as the array key to track tool calls. When all indices are 0, the second tool call either gets merged into the first or is silently dropped, causing a 100% failure rate on any task requiring multiple tool calls in one response.
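
To illustrate why this matters, here is a simplified accumulator of the kind such clients use (an assumed sketch, not the SDK's actual source): deltas are merged into an array slot keyed by index, so a second distinct call arriving with index 0 is folded into the first slot instead of creating a new one.

// Hypothetical client-side accumulator keyed by `index`.
interface StreamedToolCall {
  id?: string;
  index: number;
  function?: { name?: string; arguments?: string };
}

function accumulate(calls: StreamedToolCall[], delta: StreamedToolCall): void {
  const existing = calls[delta.index];
  if (!existing) {
    // First chunk seen for this index: start a new tool call slot.
    calls[delta.index] = { ...delta, function: { ...delta.function } };
  } else {
    // Same index is treated as a continuation of the same tool call, so a
    // second distinct call that also carries index 0 is folded into the first
    // and its separate `id` is lost.
    const fn = (existing.function ??= {});
    fn.arguments = (fn.arguments ?? "") + (delta.function?.arguments ?? "");
  }
}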

Tested with:

  • Ollama 0.20.2 (local) and 0.20.4 (remote)
  • Models: qwen2.5-coder:7b, qwen3-coder:30b, gemma4:e2b, gemma4:e4b
  • Also occurs when going through LiteLLM (ollama_chat/ backend)

Workaround

An HTTP proxy that reassigns sequential indices based on unique tool call id values: opencode-bench/fix-proxy (https://github.com/CPIDLE/opencode-bench/tree/master/fix-proxy).
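
For reference, a minimal sketch of that idea, assuming Node.js and that each SSE data: line carries one complete JSON chunk (this is not the linked project's code; the port and endpoint are placeholders):

// Index-fixing proxy sketch: listen on 11435, forward to Ollama on 11434,
// and rewrite tool_calls[].index so each unique id gets 0, 1, 2, ...
import http from "node:http";

const UPSTREAM = "http://localhost:11434";

http.createServer((clientReq, clientRes) => {
  const upstreamReq = http.request(
    UPSTREAM + clientReq.url,
    { method: clientReq.method, headers: clientReq.headers },
    (upstreamRes) => {
      clientRes.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
      const indexById = new Map<string, number>(); // tool call id -> corrected index
      let buffer = "";

      upstreamRes.on("data", (chunk: Buffer) => {
        buffer += chunk.toString("utf8");
        const lines = buffer.split("\n");
        buffer = lines.pop() ?? ""; // keep any partial line for the next chunk
        for (const line of lines) clientRes.write(fixLine(line, indexById) + "\n");
      });
      upstreamRes.on("end", () => {
        if (buffer) clientRes.write(fixLine(buffer, indexById));
        clientRes.end();
      });
    }
  );
  clientReq.pipe(upstreamReq);
}).listen(11435);

function fixLine(line: string, indexById: Map<string, number>): string {
  if (!line.startsWith("data: ") || line.includes("[DONE]")) return line;
  try {
    const payload = JSON.parse(line.slice("data: ".length));
    for (const tc of payload.choices?.[0]?.delta?.tool_calls ?? []) {
      if (tc.id && !indexById.has(tc.id)) indexById.set(tc.id, indexById.size);
      if (tc.id) tc.index = indexById.get(tc.id); // reassign by unique id
    }
    return "data: " + JSON.stringify(payload);
  } catch {
    return line; // pass through anything that isn't a parseable chunk
  }
}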

Environment

  • OS: Linux (DGX Spark) + Windows 11
  • Ollama: 0.20.2 / 0.20.4
  • GPU: RTX 4090 / Grace CPU
Reference: github-starred/ollama#9880