[GH-ISSUE #8732] RFE: Please do not remove the generate context parameter #5665

Closed
opened 2026-04-12 16:57:43 -05:00 by GiteaMirror · 3 comments

Originally created by @asterbini on GitHub (Jan 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8732

I am using the context returned from the generate call to continue the chat without re-interpreting the earlier prompts.
Why has this parameter been deprecated?
Please keep it.
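For reference, here is a minimal sketch of the pattern being described, assuming a local Ollama server and a placeholder model name (the `context` field itself is part of the documented `/api/generate` response):

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint

# First turn: no context yet.
r1 = requests.post(OLLAMA_URL, json={
    "model": "llama3",  # placeholder model name
    "prompt": "My name is Ada. Remember it.",
    "stream": False,
})
ctx = r1.json()["context"]  # opaque token array encoding the conversation so far

# Later turn: send the saved context back instead of replaying the history.
r2 = requests.post(OLLAMA_URL, json={
    "model": "llama3",
    "prompt": "What is my name?",
    "context": ctx,  # the deprecated parameter this issue asks to keep
    "stream": False,
})
print(r2.json()["response"])
```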

GiteaMirror added the feature request label 2026-04-12 16:57:43 -05:00

@meltyli commented on GitHub (Feb 27, 2025):

Is there a substitute option, perhaps in the chat endpoint? The context parameter is currently more efficient than resending the earlier prompt as part of each following prompt.
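For comparison, the `/api/chat` route has no `context` parameter; the caller resends the accumulated message list each turn. A minimal sketch, again assuming a local server and a placeholder model name:

```python
import requests

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

# /api/chat keeps no handle to prior state: the full history travels
# with every request, and the server re-derives its cache from it.
messages = [{"role": "user", "content": "My name is Ada. Remember it."}]
r1 = requests.post(OLLAMA_CHAT_URL,
                   json={"model": "llama3", "messages": messages, "stream": False})
messages.append(r1.json()["message"])  # append the assistant reply to the history

messages.append({"role": "user", "content": "What is my name?"})
r2 = requests.post(OLLAMA_CHAT_URL,
                   json={"model": "llama3", "messages": messages, "stream": False})
print(r2.json()["message"]["content"])
```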


@asterbini commented on GitHub (Feb 27, 2025):

As I understand it, OpenAI caches earlier chats so that a list of earlier prompts lets them internally retrieve the context and continue the conversation.
The context parameter lets us implement simple, stateless "continue chatting" multi-prompt software.


@WizardMiner commented on GitHub (May 5, 2025):

Please do not remove the context parameter.

This is necessary for resuming from prior points in the conversation: for instance, if we want to try a prompt three different ways, or if we want to back up several prompt/response iterations and take the conversation in different directions. Context seeds the response with where we are now. Without it, we can't roll back the conversation. With context, we can store and reuse that point of view for unlimited kinds of interactions, provided the underlying LLM hasn't changed.

Consider a dam water monitoring scenario, or patient health monitoring, or fraud analysis. We load all the safeguards into a context, then have the LLM evaluate each prompt against the current status. We don't want to reload the rules each time or manage context degradation over time. With the context array parameter, we can return to that exact moment of understanding as many times as we want.

Please do not remove the context parameter, or at least give us an alternative so we can manage our own. The LLMs don't care how we call them; they don't need an exclusive, ongoing chat memory. Calling the API with the same context array over and over is very useful. Without this feature, energy use increases and response quality decreases. LLMs often come to different conclusions given the same input so that they sound conversational, which makes it a huge problem to get back to the same state of understanding. The context parameter is critical. Please keep it.
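A minimal sketch of the snapshot-and-branch pattern described above, assuming a local Ollama server and a placeholder model name (the `generate` helper is hypothetical, just for illustration):

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"  # placeholder model name

def generate(prompt, context=None):
    """One /api/generate call; returns (response_text, new_context)."""
    body = {"model": MODEL, "prompt": prompt, "stream": False}
    if context is not None:
        body["context"] = context
    data = requests.post(OLLAMA_URL, json=body).json()
    return data["response"], data["context"]

# Load the safeguards/rules once and snapshot the resulting context.
_, rules_ctx = generate("You are a dam water monitoring assistant. Rules: ...")

# Persist the snapshot; it stays reusable as long as the model is unchanged.
with open("rules_ctx.json", "w") as f:
    json.dump(rules_ctx, f)

# Branch from the same frozen point of view as many times as needed,
# without resending the rules or accumulating conversational drift.
for reading in ["gate 3 flow +12%", "reservoir level nominal"]:
    answer, _ = generate(f"Evaluate this reading: {reading}", context=rules_ctx)
    print(answer)
```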
