[GH-ISSUE #670] Context at each token, to allow interrupting a response? #46811

Closed
opened 2026-04-28 00:19:41 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @tonyg on GitHub (Oct 2, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/670

Forgive me if I misunderstand, but would it be possible to get context at every streamed JSON partial-response, instead of just at the end, to allow the user to interrupt a response without the conversation losing a memory of how far it had gotten before being interrupted?

GiteaMirror added the feature request label 2026-04-28 00:19:41 -05:00
Author
Owner

@jmorganca commented on GitHub (Dec 24, 2023):

Hi there, we've designed the new [Chat completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-chat-completion) endpoint for this use case. With chat completions, there's no more worrying about passing in `context`. Instead, you can accumulate the response-so-far and send it in the next request.

I'll mark this issue as closed; that said, let me know if it doesn't solve your use case and I'll make sure to re-open it :)
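A minimal sketch of the accumulation approach described above, assuming the `/api/chat` streaming format (newline-delimited JSON objects carrying `message.content` fragments and a `done` flag); the function and variable names are illustrative, not part of the Ollama API:

```python
import json

def accumulate_stream(chunks, messages):
    """Fold streamed /api/chat partial responses into a single assistant
    message. `chunks` is an iterable of JSON lines as the chat endpoint
    streams them; if the stream is cut off early, whatever arrived so far
    is still kept."""
    partial = ""
    for line in chunks:
        part = json.loads(line)
        partial += part.get("message", {}).get("content", "")
        if part.get("done"):
            break
    # Append the (possibly truncated) reply so the next request carries
    # the conversation state without needing the old `context` field.
    return messages + [{"role": "assistant", "content": partial}]

# Simulated stream, interrupted before the model finished:
stream = [
    '{"message": {"content": "The sky is"}, "done": false}',
    '{"message": {"content": " blue because"}, "done": false}',
]
history = [{"role": "user", "content": "Why is the sky blue?"}]
next_messages = accumulate_stream(stream, history)
```

On the next turn, `next_messages` (plus the new user message) is sent back as the `messages` array, so nothing of the interrupted reply is lost.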


Reference: github-starred/ollama#46811