issue: Performance bottleneck in /api/models: excessive DB queries in get_all_models and access filtering #5420

Closed
opened 2025-11-11 16:20:32 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @escargotbuffed on GitHub (Jun 2, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.13

Ollama Version (if applicable)

No response

Operating System

Ubuntu 22.04

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
    • Start with the initial platform/version/OS and dependencies used,
    • Specify exact install/launch/configure commands,
    • List URLs visited, user input (incl. example values/emails/passwords if needed),
    • Describe all options and toggles enabled or changed,
    • Include any files or environmental changes,
    • Identify the expected and actual result at each stage,
    • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

  • Calling GET /api/models should return a filtered list of models within a few hundred milliseconds at most.
  • The endpoint should retrieve all available models (from OpenAI/functions) and apply access-control filtering without issuing dozens of individual database queries.

Actual Behavior

  • The response time for GET /api/models often exceeds 7–8 seconds.

  • Profiling shows two main hotspots:

    1. get_all_models (≈ 4.5 s) — fetching/merging every model from external sources
    2. Access-filtering loop (≈ 3.3 s) — for each model without arena=true, the code calls Models.get_model_by_id(model_id) (one DB query per model) and then, if the user is not the owner, has_access(...), which in turn calls Groups.get_groups_by_member_id(user_id) (another DB query per model)

Because there are roughly 40–50 models, the combined effect is tens of sequential database round-trips, causing a multi-second delay before any data is returned to the client.

Even though Redis is configured, the final filtered model list is not being cached. Every request re-executes all database queries against the remote Postgres instance.
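For reference, the two per-model queries could in principle collapse into two bulk reads: all model rows in one query, and the user's group IDs in another. Below is a minimal sketch of the filtering half, assuming that bulk data has already been fetched; the helper name and the access_control shape (`{"read": {"user_ids": [...], "group_ids": [...]}}`) are assumptions for illustration, not the project's confirmed schema:

```python
# Sketch only: the N+1 pattern above collapsed into pure in-memory filtering.
# Callers are assumed to have issued exactly two queries up front: one for all
# model rows (model_info_by_id) and one for the user's group memberships.

def filter_models_bulk(models, user_id, model_info_by_id, user_group_ids):
    """Filter models using pre-fetched data instead of per-model DB round-trips."""
    group_ids = set(user_group_ids)
    filtered = []
    for model in models:
        info = model_info_by_id.get(model["id"])
        if info is None:
            continue  # no DB row for this model
        if info["user_id"] == user_id:
            filtered.append(model)  # owners always see their own models
            continue
        # NOTE: a None access_control may mean "public" upstream; it is
        # treated as private here to keep the sketch simple.
        read = (info.get("access_control") or {}).get("read", {})
        if user_id in read.get("user_ids", []) or group_ids & set(
            read.get("group_ids", [])
        ):
            filtered.append(model)
    return filtered
```

With this shape, the cost of the access-filtering stage becomes two round-trips total instead of two per model.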

Steps to Reproduce

  1. Ensure the application is pointed at a remote (non-local) PostgreSQL database instance.

  2. Authenticate as a normal user (role = “user”) who owns or has permission on a subset of models.

  3. Issue a request to the /api/models endpoint:

    GET https://<your-domain>/api/models
    Authorization: Bearer <valid token>
    
  4. Observe the response time in your browser’s network tab (it will typically take around 8 seconds).
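The latency in step 4 can also be measured from a script instead of the browser's network tab. A stdlib-only sketch (the URL and token are placeholders, and the `opener` parameter exists only to make the helper testable):

```python
import time
import urllib.request

def time_get(url, token, opener=urllib.request.urlopen):
    """Time a single authenticated GET and return the elapsed milliseconds."""
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    t0 = time.perf_counter()
    with opener(req) as resp:
        resp.read()  # include body transfer in the measurement
    return (time.perf_counter() - t0) * 1000.0

# e.g. time_get("https://<your-domain>/api/models", "<valid token>")
```

Running this a few times in a row makes it easy to confirm the multi-second response times independently of browser caching.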

Logs & Screenshots

2025-06-02 00:19:59.509 | DEBUG    | open_webui.main:get_models:1198 
    Times (ms): get_all_models=4484.3, tags_processing=0.1, sorting=0.1, access_filtering=3355.3, total=7839.7  
2025-06-02 00:19:59.512 | INFO     | uvicorn.protocols.http.httptools_impl:send:476 
    177.39.9.158:0 - "GET /api/models HTTP/1.1" 200 - {}  

@app.get("/api/models")
async def get_models(request: Request, user=Depends(get_verified_user)):
    t0 = time.perf_counter()
    def get_filtered_models(models, user):
        filtered_models = []
        for model in models:
            if model.get("arena"):
                if has_access(
                    user.id,
                    type="read",
                    access_control=model.get("info", {})
                    .get("meta", {})
                    .get("access_control", {}),
                ):
                    filtered_models.append(model)
                continue

            model_info = Models.get_model_by_id(model["id"])
            if model_info:
                if user.id == model_info.user_id or has_access(
                    user.id, type="read", access_control=model_info.access_control
                ):
                    filtered_models.append(model)

        return filtered_models

    t1 = time.perf_counter()
    all_models = await get_all_models(request, user=user)
    t2 = time.perf_counter()

    models = []
    for model in all_models:
        # Filter out filter pipelines
        if "pipeline" in model and model["pipeline"].get("type", None) == "filter":
            continue

        try:
            model_tags = [
                tag.get("name")
                for tag in model.get("info", {}).get("meta", {}).get("tags", [])
            ]
            tags = [tag.get("name") for tag in model.get("tags", [])]

            tags = list(set(model_tags + tags))
            model["tags"] = [{"name": tag} for tag in tags]
        except Exception as e:
            log.debug(f"Error processing model tags: {e}")
            model["tags"] = []

        models.append(model)
    t3 = time.perf_counter()

    model_order_list = request.app.state.config.MODEL_ORDER_LIST
    if model_order_list:
        model_order_dict = {model_id: i for i, model_id in enumerate(model_order_list)}
        # Sort models by order list priority, with fallback for those not in the list
        models.sort(
            key=lambda x: (model_order_dict.get(x["id"], float("inf")), x["name"])
        )

    t4 = time.perf_counter()
    # Filter out models that the user does not have access to
    if user.role == "user" and not BYPASS_MODEL_ACCESS_CONTROL:
        models = get_filtered_models(models, user)

    t5 = time.perf_counter()
    log.debug(
        f"/api/models returned filtered models accessible to the user: {json.dumps([model['id'] for model in models])}"
    )

    log.debug(
        f"Times (ms): get_all_models={(t2 - t1) * 1000:.1f}, "
        f"tags_processing={(t3 - t2) * 1000:.1f}, "
        f"sorting={(t4 - t3) * 1000:.1f}, "
        f"access_filtering={(t5 - t4) * 1000:.1f}, "
        f"total={(t5 - t0) * 1000:.1f}"
    )
    
    return {"data": models}

Additional Information

  • Database & Cache Configuration

    • Postgres is running on a separate host (e.g., postgres://user:pass@db.example.com:5432/open_webui).
    • Redis is running and reachable, but the code never writes the final filtered-model list or intermediate permission results into Redis. Consequently, Redis is effectively unused for /api/models.
  • Why This Matters

    • Because the database is not local, each SQL call adds network latency (often 20–50 ms or more). With ~45 models, that quickly multiplies into multiple seconds.
    • Redis is already configured in the project, but no code path writes the merged/filtered model list into Redis. Even if Redis is available, every request misses the cache and repeats all remote queries.
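One way to close that gap is a short-TTL cache keyed per user. A minimal sketch; the key format and TTL are assumptions, and `cache` can be any client exposing redis-py's get/set interface (redis-py's `set()` accepts an `ex=` expiry in seconds):

```python
import json

def cached_filtered_models(cache, user_id, ttl_seconds, compute):
    """Serve the filtered model list from cache, recomputing only on a miss.

    `cache` needs only get/set; a redis.Redis client fits that shape.
    `compute` is a zero-argument callable that runs the expensive
    fetch-and-filter path and returns a JSON-serializable list.
    """
    key = f"models:filtered:{user_id}"  # hypothetical key format
    raw = cache.get(key)
    if raw is not None:
        return json.loads(raw)  # cache hit: no DB round-trips
    models = compute()
    cache.set(key, json.dumps(models), ex=ttl_seconds)
    return models
```

Any write path that changes model permissions would also need to delete the key, otherwise a user can see a stale list for up to the TTL.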
GiteaMirror added the bug label 2025-11-11 16:20:32 -06:00
Reference: github-starred/open-webui#5420