[GH-ISSUE #18619] issue: Nested options.max_tokens in model parameters not recursively converted to num_predict #34184

Closed
opened 2026-04-25 08:05:56 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @elazar on GitHub (Oct 25, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18619

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.33 (tested) / main branch (as of Oct 25, 2025)

Ollama Version (if applicable)

v0.5.11

Operating System

Linux (Ubuntu 22.04) - should also be reproducible on Debian 12, macOS, and Windows

Browser (if applicable)

Chrome 130.0.6723.92 (for model settings UI)

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When model parameters contain max_tokens nested within an options dictionary structure, the parameter should be recursively converted to num_predict before the request is sent to Ollama. This ensures compatibility with Ollama's native API and prevents warning messages.

Summary:

  • No warning in Ollama logs
  • max_tokens converted to num_predict before reaching Ollama
  • Parameter properly applied to limit output length
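
To illustrate the intended transformation, here is a minimal sketch (the dict shapes are simplified examples, not the exact internal payload):

# Simplified illustration of the expected rename (example values only).
params = {"temperature": 0.7, "options": {"max_tokens": 4096, "top_p": 0.9}}

# What should reach Ollama: the nested key renamed to num_predict.
expected = {"temperature": 0.7, "options": {"num_predict": 4096, "top_p": 0.9}}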

Actual Behavior

Nested max_tokens parameters within model parameter dictionaries are copied as-is without conversion, causing:

  1. Ollama warnings in logs: level=WARN msg="invalid option provided" option=max_tokens
  2. Parameter silently ignored by Ollama (no effect on output length)
  3. User confusion when saved model settings don't work as expected

Summary:

  • Warning logged: invalid option provided: max_tokens
  • Parameter passed unconverted to Ollama
  • Parameter ignored (output not limited)

Steps to Reproduce

Prerequisites

  • Fresh Ubuntu 22.04 system
  • Docker Engine v24.0.5+ installed
  • Ollama v0.5.11+ running with debug logging enabled
  • Open WebUI v0.6.33 (main branch)
  • Chrome browser v130+

Docker Setup

  1. Enable Ollama debug logging:

    # Edit Ollama service or set environment variable
    export OLLAMA_DEBUG=1
    ollama serve
    
  2. Run Open WebUI container:

    docker run -d -p 3000:8080 \
      --add-host=host.docker.internal:host-gateway \
      -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
      -v open-webui:/app/backend/data \
      --name open-webui \
      --restart always \
      ghcr.io/open-webui/open-webui:main
    
  3. Access Open WebUI:

    • Navigate to http://localhost:3000 in Chrome
    • Create admin account
    • Verify Ollama connection

Reproduction Steps

  1. Create a model with nested parameters via UI:

    • In Open WebUI, click "Workspace" → "Models"
    • Click "+ Add Model"
    • Fill in:
      • Name: test-model
      • Base Model: llama2
      • Model Parameters: Click "Advanced"
  2. Add custom parameters with nested structure:

    • In the "Custom Parameters" JSON field, enter:
    {
      "temperature": 0.7,
      "options": {
        "max_tokens": 4096,
        "top_p": 0.9
      }
    }
    
    • Click "Save"
  3. Alternatively, create via API:

    # Get API key from Settings → Account → API Keys
    export OPENWEBUI_API_KEY="your_api_key_here"
    
    curl -X POST http://localhost:3000/api/models/create \
      -H "Authorization: Bearer $OPENWEBUI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "test-model",
        "base_model_id": "llama2",
        "params": {
          "temperature": 0.7,
          "options": {
            "max_tokens": 4096,
            "top_p": 0.9
          }
        }
      }'
    
  4. Use the model in a chat:

    • Start a new chat in Open WebUI
    • Select the test-model from the model dropdown
    • Send any message: "Hello, how are you?"
  5. Verify the bug:

    # Monitor Ollama logs in real-time
    journalctl -u ollama -f | grep -E "(max_tokens|invalid option)"
    
    # OR if Ollama is in container:
    docker logs -f ollama 2>&1 | grep -E "(max_tokens|invalid option)"
    
  6. Observe warning in logs:

    time=2025-10-25T10:30:15.789-05:00 level=WARN source=types.go:753 msg="invalid option provided" option=max_tokens
    

Logs & Screenshots

Browser Console Logs

Console: No errors, UI functions normally

Network Tab → API Calls:
POST /api/chat
Status: 200 OK
Response: [Chat completion received successfully]

Docker Container Logs

Open WebUI Container

No errors; the conversion logic silently passes max_tokens through unchanged.

$ docker logs open-webui -f
INFO:     127.0.0.1:42356 - "POST /api/chat HTTP/1.1" 200 OK
INFO:     127.0.0.1:42357 - "GET /api/models HTTP/1.1" 200 OK

Ollama

$ journalctl -u ollama -f
Oct 25 10:30:15 hostname ollama[12345]: time=2025-10-25T10:30:15.123-05:00 level=INFO source=server.go:123 msg="request received" method=POST path=/api/chat
Oct 25 10:30:15 hostname ollama[12345]: time=2025-10-25T10:30:15.124-05:00 level=WARN source=types.go:753 msg="invalid option provided" option=max_tokens
Oct 25 10:30:18 hostname ollama[12345]: time=2025-10-25T10:30:18.456-05:00 level=INFO source=server.go:145 msg="request completed" duration=3.332s tokens=150

Additional Information

Root Cause Analysis

Origin: Initial architecture of apply_model_params_to_body_ollama function

Historical Context:

  • ff46fe2b4 (Sept 7, 2024): Function created with top-level conversion only
  • v0.6.16: "Smarter Custom Parameter Handling" feature enabled JSON input for model params
  • This created nested structures the original function wasn't designed to handle

Architectural Limitation:

The function performs top-level parameter name conversion but uses deep_update and apply_model_params_to_body helpers that copy nested dictionaries without recursive processing of their contents.

# Top-level conversion happens:
name_differences = {"max_tokens": "num_predict"}
for key, value in name_differences.items():
    if params.get(key):
        params[value] = params[key]  # Only checks top level
        del params[key]

# But nested dicts are merged as-is:
params = deep_update(params, custom_params)  # No conversion inside
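
To make the pass-through concrete, the following self-contained sketch reproduces the behavior (deep_update here is a simplified stand-in for the real helper in backend/open_webui/utils/payload.py):

# Minimal stand-in for deep_update (assumed behavior: recursive dict merge).
def deep_update(base: dict, overrides: dict) -> dict:
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base

params = {"temperature": 0.7}
custom_params = {"options": {"max_tokens": 4096, "top_p": 0.9}}
params = deep_update(params, custom_params)

# Top-level rename only: "max_tokens" is not a top-level key here, so
# nothing changes and the nested key survives untouched.
if params.get("max_tokens"):
    params["num_predict"] = params.pop("max_tokens")

print(params)
# {'temperature': 0.7, 'options': {'max_tokens': 4096, 'top_p': 0.9}}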

Impact Assessment

  • Severity: Low-Medium
  • Type: Architectural oversight (recursive conversion was never implemented)
  • Scope: Models with custom parameter JSON containing nested structures
  • User Impact:
    • Confusing warnings in logs
    • Parameters silently ignored
    • Saved model settings don't work as expected
  • Workaround: Use flat parameter structure (non-intuitive for advanced users)

Environment Configuration

Test Environment:

  • Docker: Engine 24.0.5
  • Open WebUI: v0.6.33 / main branch (commit 6bc5d33)
  • Ollama: v0.5.11 with OLLAMA_DEBUG=1
  • OS: Ubuntu 22.04 LTS
  • Browser: Chrome 130.0.6723.92
  • Python: 3.11 (in container)

Environment Variables:

OLLAMA_BASE_URL=http://host.docker.internal:11434
OLLAMA_DEBUG=1  # On Ollama host to see warnings
# Standard Docker setup per README.md

Configuration Files:

None - uses default Open WebUI Docker configuration

Proposed Fix

Location: backend/open_webui/utils/payload.py, function apply_model_params_to_body_ollama

Solution: Add recursive conversion function:

def convert_max_tokens_recursive(d):
    """Recursively convert max_tokens to num_predict in nested dicts."""
    if isinstance(d, dict):
        if "max_tokens" in d:
            d["num_predict"] = d.pop("max_tokens")
        for value in d.values():
            if isinstance(value, dict):
                convert_max_tokens_recursive(value)

def apply_model_params_to_body_ollama(params: dict, form_data: dict) -> dict:
    """Apply model parameters to request body for Ollama."""
    # remove_open_webui_params, deep_update, apply_model_params_to_body,
    # and mappings are the existing helpers/definitions in payload.py.
    params = remove_open_webui_params(params)

    custom_params = params.pop("custom_params", {})
    if custom_params:
        try:
            custom_params = json.loads(custom_params)
            params = deep_update(params, custom_params)
        except Exception:
            pass

    # Top-level parameter name conversion (kept for parity with the
    # existing code; the recursive pass below also covers this case)
    name_differences = {
        "max_tokens": "num_predict",
    }

    for key, value in name_differences.items():
        if params.get(key) is not None:
            params[value] = params.pop(key)

    # FIX: Recursively convert nested max_tokens
    convert_max_tokens_recursive(params)

    # Continue with existing logic
    form_data["options"] = apply_model_params_to_body(
        params, form_data.get("options", {}), mappings
    )

    return form_data

Rationale:

  • Handles arbitrarily nested parameter structures
  • Preserves all other parameters unchanged
  • Minimal performance impact (only processes dicts)
  • Backward compatible with existing flat structures

Related Issues & Context

  • Related to: Open WebUI Issue #18618 (root-level regression)
  • Related to: Ollama issue #12779
  • Feature history:
    • v0.6.16: JSON custom parameters enabled (created the use case)
    • The conversion logic was never updated to handle nested structures

Why This Bug Wasn't Caught

  1. No test coverage for nested parameter structures
  2. Uncommon usage pattern - most users don't create nested param structures
  3. Non-breaking - generates warnings but doesn't crash
  4. Feature evolution - JSON params added after conversion logic designed
  5. Silent failure - parameter just ignored, no error to user

Additional Testing Performed

# Test with various nesting levels
params_test_cases = [
    {"max_tokens": 100},  # Top-level (issue #18618 per "Related Issues & Context")
    {"options": {"max_tokens": 100}},  # One level
    {"options": {"advanced": {"max_tokens": 100}}},  # Two levels
]
  • All cases should convert max_tokens → num_predict
  • Currently: only max_tokens nested one level within "options" works (and only after the Feb 2025 fix for the #18618 nested case)
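
As a quick follow-up, the proposed fix could be exercised against these cases with an assertion script (a sketch; convert_max_tokens_recursive is restated from the Proposed Fix for self-containment):

# Check the proposed fix against the nesting cases above (sketch).
def convert_max_tokens_recursive(d):
    if isinstance(d, dict):
        if "max_tokens" in d:
            d["num_predict"] = d.pop("max_tokens")
        for value in d.values():
            if isinstance(value, dict):
                convert_max_tokens_recursive(value)

cases = [
    {"max_tokens": 100},
    {"options": {"max_tokens": 100}},
    {"options": {"advanced": {"max_tokens": 100}}},
]
expected = [
    {"num_predict": 100},
    {"options": {"num_predict": 100}},
    {"options": {"advanced": {"num_predict": 100}}},
]
for case, want in zip(cases, expected):
    convert_max_tokens_recursive(case)
    assert case == want, (case, want)
print("all cases converted")
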
GiteaMirror added the bug label 2026-04-25 08:05:56 -05:00
Reference: github-starred/open-webui#34184