[GH-ISSUE #4788] Add EventSource for format /api/generate #49528

Closed
opened 2026-04-28 12:07:21 -05:00 by GiteaMirror · 2 comments

Originally created by @Vali-98 on GitHub (Jun 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4788

What is the issue?

This was tested specifically with /api/generate and react-native-sse.

https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events
Stream responses sent by Ollama don't seem to conform to the SSE specification, and break when used with EventSource-like libraries.

Having this implementation will help with frontends and systems which prefer the EventSource format.
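For illustration only, a rough sketch of what consuming an SSE-framed /api/generate stream could look like with the standard EventSource API. The endpoint shape below is hypothetical, since EventSource itself only issues GET requests; libraries such as react-native-sse layer request bodies on top of the same SSE parsing rules.

// Hypothetical sketch: consuming an SSE-framed /api/generate stream with the
// standard EventSource API. The query-string endpoint is made up for
// illustration; EventSource only supports GET requests.
const source = new EventSource(
  "http://localhost:11434/api/generate?model=llama3&prompt=hello", // hypothetical
);

let text = "";

source.onmessage = (event: MessageEvent) => {
  // Each SSE "data:" line would carry one JSON chunk, as /api/generate sends today.
  const chunk = JSON.parse(event.data);
  text += chunk.response ?? "";
  if (chunk.done) {
    console.log(text);
    source.close();
  }
};

source.onerror = () => source.close();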

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.1.39

GiteaMirror added the bug label 2026-04-28 12:07:21 -05:00

@Vali-98 commented on GitHub (Jul 3, 2024):

This should be solved with https://github.com/ollama/ollama/pull/5209.


@prantlf commented on GitHub (Jul 30, 2025):

This issue wasn't fixed. Ollama doesn't support SSE.

Streaming is supported by chunked encoding with the content type application/x-ndjson. For example:

curl -v -H 'Content-Type: application/json' \
  -d '{"model":"deepseek-r1:1.5b","messages":[{"role":"user","content":"what is ollama?"}]}' \
  http://localhost:11434/api/chat

< HTTP/1.1 200 OK
< Content-Type: application/x-ndjson
< Date: Wed, 30 Jul 2025 09:45:03 GMT
< Transfer-Encoding: chunked
<
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.537758Z","message":{"role":"assistant","content":"\n\n"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.654744Z","message":{"role":"assistant","content":"\u003c/think\u003e"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.76711Z","message":{"role":"assistant","content":"\n\n"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.877521Z","message":{"role":"assistant","content":"O"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.988219Z","message":{"role":"assistant","content":"ll"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:04.098806Z","message":{"role":"assistant","content":"ama"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:04.21934Z","message":{"role":"assistant","content":" is"},"done":false}
...
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.36868Z","message":{"role":"assistant","content":" setup"},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.517462Z","message":{"role":"assistant","content":"."},"done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.665553Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":49729537151,"load_duration":3445259896,"prompt_eval_count":9,"prompt_eval_duration":1024746038,"eval_count":274,"eval_duration":45257702287}

The same response will be sent even if the client requests SSE with the header Accept: text/event-stream.
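Until then, clients have to split the NDJSON stream into lines themselves. A rough sketch of that parsing with the Fetch API (not an official Ollama client, just the one-JSON-object-per-line rule shown in the response above):

// Rough sketch (not an official Ollama client): reading the current
// application/x-ndjson stream from /api/chat with the Fetch API and parsing
// one JSON object per newline-terminated line.
async function streamChat(prompt: string): Promise<string> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "deepseek-r1:1.5b",
      messages: [{ role: "user", content: prompt }],
    }),
  });

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let answer = "";

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // NDJSON: every complete line is one JSON chunk; keep any partial line buffered.
    let newline: number;
    while ((newline = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line) continue;
      const chunk = JSON.parse(line);
      answer += chunk.message?.content ?? "";
      if (chunk.done) return answer;
    }
  }
  return answer;
}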

Until SSE is supported, other content types should be rejected with the status 415 (https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/415):

HTTP/1.1 415 Unsupported Media Type
Accept-Post: application/x-ndjson

The simplest SSE support (https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream) can be added by prefixing each JSON object (chunk) with data: and terminating it with two line breaks (\n\n). The rest of the SSE protocol isn't needed at the beginning. A sketch of that framing follows below.
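As an illustrative sketch of that framing (Ollama's server is written in Go; this only shows the transformation itself):

// Sketch of the minimal SSE framing described above: each NDJSON chunk becomes
// one "data:" line terminated by a blank line. The other SSE fields (event,
// id, retry) are not needed for a first implementation.
function toSseFrame(jsonChunk: string): string {
  return `data: ${jsonChunk}\n\n`;
}

// toSseFrame('{"model":"deepseek-r1:1.5b","done":false}')
// => 'data: {"model":"deepseek-r1:1.5b","done":false}\n\n'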

For example, this is how Google Gemini supports streaming with SSE (https://ai.google.dev/gemini-api/docs/text-generation):

POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse

< HTTP/2 200
< content-type: text/event-stream
<
data: {"candidates": ...}

data: {"candidates": ... "finishReason": "STOP" ...}
<!-- gh-comment-id:3135741284 --> @prantlf commented on GitHub (Jul 30, 2025): This issue wasn't fixed. Ollama doesn't support SSE. Streaming is supported by chunked encoding with the content type `application/x-ndjson`. For example: ``` curl -v -H 'Content-Type: application/json' \ -d '{"model":"deepseek-r1:1.5b","messages":[{"role":"user","content":"what is ollama?"}]}' \ http://localhost:11434/api/chat < HTTP/1.1 200 OK < Content-Type: application/x-ndjson < Date: Wed, 30 Jul 2025 09:45:03 GMT < Transfer-Encoding: chunked < {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.537758Z","message":{"role":"assistant","content":"\n\n"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.654744Z","message":{"role":"assistant","content":"\u003c/think\u003e"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.76711Z","message":{"role":"assistant","content":"\n\n"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.877521Z","message":{"role":"assistant","content":"O"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:03.988219Z","message":{"role":"assistant","content":"ll"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:04.098806Z","message":{"role":"assistant","content":"ama"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:04.21934Z","message":{"role":"assistant","content":" is"},"done":false} ... {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.36868Z","message":{"role":"assistant","content":" setup"},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.517462Z","message":{"role":"assistant","content":"."},"done":false} {"model":"deepseek-r1:1.5b","created_at":"2025-07-30T09:45:48.665553Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":49729537151,"load_duration":3445259896,"prompt_eval_count":9,"prompt_eval_duration":1024746038,"eval_count":274,"eval_duration":45257702287} ``` The same response will be sent even of the client requests SSE by the header `Accept: text/event-stream`. Until SSE is supported, other content types should be [rejected with the status 415](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/415): ``` HTTP/1.1 415 Unsupported Media Type Accept-Post: application/x-ndjson ``` The [simplest SSE support](https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream) can be added by prefixing each JSON object (chunk) with `data: ` and ending it with two line breaks (`\n`). No need for the rest of the SSE protocl, at the beginning. For example thi sis how [Google Gemini supports streaming with SSE](https://ai.google.dev/gemini-api/docs/text-generation) supports SSE: ``` POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse < HTTP/2 200 < content-type: text/event-stream < data: {"candidates": ...} data: {"candidates": ... "finishReason": "STOP" ...} ```