feat: Implement logit_bias parameter for token-level generation bias #3553

New Issue

GiteaMirror · 2025-11-11T15:33:49-06:00

GiteaMirror commented

2025-11-11 15:33:49 -06:00

Originally created by @Fluorite8 on GitHub (Feb 3, 2025).

Originally assigned to: @dannyl1u on GitHub.

Is your feature request related to a problem? Please describe.

I would like to request the addition of logit_bias parameter support for per-token bias adjustment. This feature is already implemented in major APIs like llama.cpp and OpenAI. For example, llama.cpp supports passing a logit_bias parameter as a dictionary of {token1: bias1, token2: bias2, ...}.

Reference implementation in llama.cpp:
21c84b5d2d/examples/server/server.cpp (L169C61-L169C69)

This feature would enable important use cases such as:

Boosting specific tokens (e.g., 'Yes'/'No' and ) for constrained responses
Improving Chain-of-Thought reasoning accuracy by penalizing transition tokens like 'Alternatively,' and 'Also,' as suggested in recent research [arXiv:2501.18585v1]

Describe the solution you'd like

Implement logit_bias support through modifications to these core components:

Frontend parameter UI in src/lib/components/chat/Settings/Advanced/AdvancedParams.svelte
Settings state management in src/lib/components/chat/Settings/General.svelte
Backend parameter parsing in backend/open_webui/utils/misc.py
API payload construction in backend/open_webui/utils/payload.py

Proposed implementation details:

Accept input as comma-separated "token_id:bias_value" pairs
Convert to dictionary format for API requests
Validate token IDs as integers and bias values between -30 to 30

Describe alternatives you've considered

An alternative approach could allow raw JSON input for advanced users, but this would require:

Strict input validation to prevent security risks
Additional error handling for malformed JSON
More complex UI/UX considerations

However, the proposed dictionary-based implementation balances flexibility with safety, following established patterns in the codebase.

Originally created by @Fluorite8 on GitHub (Feb 3, 2025). Originally assigned to: @dannyl1u on GitHub. **Is your feature request related to a problem? Please describe.** I would like to request the addition of logit_bias parameter support for per-token bias adjustment. This feature is already implemented in major APIs like llama.cpp and OpenAI. For example, llama.cpp supports passing a logit_bias parameter as a dictionary of {token1: bias1, token2: bias2, ...}. Reference implementation in llama.cpp: https://github.com/ggerganov/llama.cpp/blob/21c84b5d2dc04050714567501bf78762bfa17846/examples/server/server.cpp#L169C61-L169C69 This feature would enable important use cases such as: - Boosting specific tokens (e.g., 'Yes'/'No' and <eos>) for constrained responses - Improving Chain-of-Thought reasoning accuracy by penalizing transition tokens like 'Alternatively,' and 'Also,' as suggested in recent research [arXiv:2501.18585v1] **Describe the solution you'd like** Implement logit_bias support through modifications to these core components: 1. Frontend parameter UI in `src/lib/components/chat/Settings/Advanced/AdvancedParams.svelte` 2. Settings state management in `src/lib/components/chat/Settings/General.svelte` 3. Backend parameter parsing in `backend/open_webui/utils/misc.py` 4. API payload construction in `backend/open_webui/utils/payload.py` Proposed implementation details: - Accept input as comma-separated "token_id:bias_value" pairs - Convert to dictionary format for API requests - Validate token IDs as integers and bias values between -30 to 30 **Describe alternatives you've considered** An alternative approach could allow raw JSON input for advanced users, but this would require: - Strict input validation to prevent security risks - Additional error handling for malformed JSON - More complex UI/UX considerations However, the proposed dictionary-based implementation balances flexibility with safety, following established patterns in the codebase.

GiteaMirror closed this issue

2025-11-11 15:33:50 -06:00

GiteaMirror commented

2025-11-11 15:33:50 -06:00

@dannyl1u commented on GitHub (Mar 4, 2025):

Completed in #10373

@dannyl1u commented on GitHub (Mar 4, 2025): Completed in #10373

GiteaMirror referenced this issue

2026-04-19 20:05:03 -05:00

[GH-ISSUE #3553] Ollama: 500, message='Internal Server Error', url=URL('http://localhost:11434/api/chat') #13306

GiteaMirror referenced this issue

2026-04-25 03:22:49 -05:00