[GH-ISSUE #23921] fix: Premature finish_reason 'stop' on first SSE chunk breaks API clients for Ollama reasoning models #35637

Closed
opened 2026-04-25 09:48:05 -05:00 by GiteaMirror · 0 comments

Originally created by @pvyswiss on GitHub (Apr 21, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23921

Bug Description

API clients receive the first SSE chunk, after which the stream appears to end. Ollama reasoning models (DeepSeek R1, Gemma 4) hang after "let me think." The Web UI works fine.

Related to: #23917 (Bug 2)

Root Cause

openai_chat_chunk_message_template() in misc.py uses truthy checks:

if not content and not reasoning_content and not tool_calls:
    template['choices'][0]['finish_reason'] = 'stop'

When Ollama sends the first chunk for a reasoning model, both content and thinking are empty strings (""). Empty strings are falsy in Python, so finish_reason: "stop" is set on the very first chunk.

API clients complying with the OpenAI spec close the stream on finish_reason: "stop".
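The falsy-empty-string problem can be reproduced in isolation. This is a minimal sketch of the buggy condition, not the actual `misc.py` code; the variable values mimic what Ollama sends on the first chunk of a reasoning model:

```python
# First chunk from a reasoning model: no visible text, no thinking yet.
content = ""            # empty string, not None
reasoning_content = ""  # thinking has not started streaming
tool_calls = None

finish_reason = None
# The buggy truthy check: "" is falsy, so this fires on chunk 1.
if not content and not reasoning_content and not tool_calls:
    finish_reason = "stop"

print(finish_reason)  # "stop" -- a spec-compliant client closes the stream here
```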

Fix

Only set finish_reason: "stop" when usage is present (final chunk):

if usage and not content and not reasoning_content and not tool_calls:
    template['choices'][0]['finish_reason'] = 'stop'
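The gating behavior can be sketched as a standalone helper (hypothetical name, for illustration only): only the final chunk, which carries usage statistics, is allowed to set `finish_reason`:

```python
def finish_reason_for_chunk(content, reasoning_content, tool_calls, usage):
    """Sketch of the fixed logic: an empty delta alone is not enough to
    finish the stream; the chunk must also carry usage stats."""
    if usage and not content and not reasoning_content and not tool_calls:
        return "stop"
    return None

# First chunk of a reasoning model: empty strings, no usage yet.
print(finish_reason_for_chunk("", "", None, None))  # None -> stream stays open

# Final chunk: empty delta plus usage stats.
print(finish_reason_for_chunk("", "", None, {"total_tokens": 42}))  # "stop"
```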

Impact

  • Web UI: None (browser path ignores finish_reason)
  • API clients: Stream continues correctly through reasoning and content phases

File

backend/open_webui/utils/misc.py

Reproduction

curl -sN -H "Authorization: Bearer $KEY" \
  -d '{"model":"deepseek-r1:32b","messages":[{"role":"user","content":"hi"}],"stream":true}' \
  http://localhost:3000/api/chat/completions
# Result: First chunk has finish_reason:"stop", stream ends prematurely
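The client-side effect of the premature `finish_reason` can be illustrated without a network. This sketch (sample chunk payloads are made up to mirror the buggy output) shows how a spec-compliant client's read loop terminates before any content arrives:

```python
import json

# Sample SSE data payloads mimicking the buggy server: the very first
# chunk already carries finish_reason "stop".
chunks = [
    '{"choices":[{"delta":{"reasoning_content":""},"finish_reason":"stop"}]}',
    '{"choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}',
]

received = []
for raw in chunks:
    choice = json.loads(raw)["choices"][0]
    received.append(choice["delta"])
    if choice["finish_reason"] == "stop":
        break  # client closes here -- the "Hello" chunk is never read

print(len(received))  # 1
```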

Reference: github-starred/open-webui#35637