mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-10 07:43:10 -05:00
feat: Limit the number of tokens when sending requests directly #4790
Originally created by @shentong0722 on GitHub (Apr 10, 2025).
Check Existing Issues
Problem Description
I'm using Groq's free plan, and Groq limits the maximum single input to 6000 tokens for most models. When my context is too long, sending it to Groq will result in an error.
I tried limiting the context refresh tokens and context tokens in the settings, but neither had any effect.
Desired Solution you'd like
Could context truncation be performed directly when sending requests to the upstream server? I need this feature very much.
Alternatives Considered
No response
Additional Context
No response
@shentong0722 commented on GitHub (Apr 10, 2025):
Found it:
"""
title: Token Clip Filter
author: houxin
author_url: https://github.com/hx173149
funding_url: https://github.com/hx173149
version: 0.1
"""
from pydantic import BaseModel, Field
from typing import Optional
import tiktoken
class Filter:
class Valves(BaseModel):
priority: int = Field(
default=0, description="Priority level for the filter operations."
)
n_token_limit: int = Field(
default=7000, description="Number of token limit to retain."
)
pass
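The truncation strategy itself is independent of the filter plumbing: keep dropping the oldest message until the running token total fits under the limit, always preserving the most recent message. A minimal standalone sketch of that loop, using a whitespace word count as a rough stand-in for a real tokenizer such as tiktoken (the function and message names here are illustrative, not part of any API):

```python
def truncate_messages(messages, token_limit, count_tokens):
    """Drop the oldest messages until the total token count fits the limit.

    Always keeps at least the newest message, even if it alone exceeds the limit.
    """
    msgs = list(messages)
    while len(msgs) > 1 and sum(count_tokens(m["content"]) for m in msgs) > token_limit:
        msgs.pop(0)  # oldest message goes first
    return msgs


# Rough stand-in tokenizer: one token per whitespace-separated word.
def approx_tokens(text):
    return len(text.split())


history = [
    {"role": "user", "content": "first question with many many words here"},
    {"role": "assistant", "content": "a long answer " * 5},
    {"role": "user", "content": "latest question"},
]

# 24 approximate tokens total; with a limit of 20, the oldest message is dropped.
trimmed = truncate_messages(history, token_limit=20, count_tokens=approx_tokens)
```

In a real deployment the `count_tokens` argument would be backed by the model's actual tokenizer, since word counts can undercount tokens significantly for non-English text or code.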