Support client-side context clipping using OpenAI compliant API #1129

Closed
opened 2025-11-11 14:38:13 -06:00 by GiteaMirror · 1 comment

Originally created by @matbeedotcom on GitHub (Jun 4, 2024).

**Is your feature request related to a problem? Please describe.**
Yes. Once the maximum context window is reached, requests fail because the entire conversation history is sent to the LLM API.

**Describe the solution you'd like**
Support a variety of context clipping mechanisms.

I'm trying to use Exllamav2 models, as Ollama is far too slow with LLaMa3-70b.
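One possible shape for such a mechanism is a client-side trimmer that drops the oldest messages until the conversation fits a token budget, keeping the system prompt and the most recent turns. The sketch below is illustrative only, not Open WebUI's actual implementation; it approximates token counts as roughly four characters per token, where a real client would use the model's tokenizer.

```python
# Hypothetical sketch of client-side context clipping before sending
# messages to an OpenAI-compatible chat API. Token counts are
# approximated as len(content) // 4; a real client would use the
# model's tokenizer instead.

def approx_tokens(message: dict) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)

def clip_context(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the system prompt (if any) plus as many of the most
    recent messages as fit within max_tokens, dropping oldest first."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(approx_tokens(m) for m in system)
    kept: list[dict] = []
    for m in reversed(rest):  # walk newest to oldest
        cost = approx_tokens(m)
        if cost > budget:
            break  # everything older than this is dropped
        kept.append(m)
        budget -= cost
    return system + list(reversed(kept))

messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "old question " * 50},
    {"role": "assistant", "content": "old answer " * 50},
    {"role": "user", "content": "latest question"},
]
# With a tight budget, only the system prompt and the newest
# message survive; the oversized older turns are dropped.
clipped = clip_context(messages, max_tokens=100)
```

Other clipping strategies (summarizing the dropped prefix, or keeping the first user message for task grounding) would slot into the same place in the request path.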


@tjbck commented on GitHub (Jun 4, 2024):

#1268


Reference: github-starred/open-webui#1129