[GH-ISSUE #17350] issue: Llama.cpp server timing metrics not parsed correctly #18251

Closed
opened 2026-04-20 00:27:23 -05:00 by GiteaMirror · 1 comment

Originally created by @ITankForCAD on GitHub (Sep 11, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17350

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.28

Ollama Version (if applicable)

No response

Operating System

Fedora 42

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      ◦ Start with the initial platform/version/OS and dependencies used,
      ◦ Specify exact install/launch/configure commands,
      ◦ List URLs visited, user input (incl. example values/emails/passwords if needed),
      ◦ Describe all options and toggles enabled or changed,
      ◦ Include any files or environmental changes,
      ◦ Identify the expected and actual result at each stage,
      ◦ Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

After assistant message completion, timing statistics should be displayed under the message.

Actual Behavior

Llama.cpp timing statistics do not appear alongside the other usage metrics.

Steps to Reproduce

  1. Start a llama-server instance (a command sketch follows this list)
  2. Chat with a loaded model through Open WebUI
  3. Upon assistant message completion, inspect the usage and timing statistics shown below the chat bubble.
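
For concreteness, a minimal sketch of the reproduction, not taken verbatim from the report: the model path, port, and prompt are placeholders, and the request assumes llama.cpp's OpenAI-compatible endpoint.

```
# Launch a llama.cpp server (model path and port are illustrative)
llama-server -m ./model.gguf --port 8080

# Query the server directly, as the reporter did with curl, to confirm
# the response carries a "timings" field
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```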

Logs & Screenshots

Image: https://github.com/user-attachments/assets/0e88e6ec-f532-4a7d-88cf-ed6a4efc128f

The first screenshot is the response from a server instance using curl. In the response object, there is a field called "timings".
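
The screenshot itself did not survive mirroring; an excerpt of a response with that shape might look like the following. The "choices" payload is omitted and the numbers are illustrative, but the "timings" keys are the ones llama.cpp's server emits.

```
{
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 34,
    "total_tokens": 46
  },
  "timings": {
    "prompt_n": 12,
    "prompt_ms": 80.1,
    "prompt_per_second": 149.8,
    "predicted_n": 34,
    "predicted_ms": 1201.4,
    "predicted_per_second": 28.3
  }
}
```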

Image: https://github.com/user-attachments/assets/b5d94133-4458-4e90-bfa8-e72141d67880

The second screenshot is from the PR that introduced this feature; see the relevant commit: https://github.com/open-webui/open-webui/pull/17070/commits/e830b4959ecd4b2795e29e53026984a58a7696a9

Additional Information

The fix seems simple enough. An "s" appears to be missing from the end of the key "timing", so the .get() call falls back to an empty dictionary even when timing statistics are present in the response.
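
A hypothetical sketch of that one-character fix; the variable names here are made up, and the real parsing code lives in the commit referenced below.

```python
# A made-up response dict with the shape llama.cpp's server returns;
# only the keys matter for illustrating the bug.
response = {
    "usage": {"prompt_tokens": 12, "completion_tokens": 34},
    "timings": {"prompt_ms": 80.1, "predicted_ms": 1201.4},
}

# Buggy lookup: the key is misspelled, so it always falls back to {}
# and the UI has no timing stats to display.
timings = response.get("timing", {})   # -> {}

# Fixed lookup: matches the field llama.cpp actually emits.
timings = response.get("timings", {})  # -> {"prompt_ms": 80.1, ...}
```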

GiteaMirror added the bug label 2026-04-20 00:27:23 -05:00

@tjbck commented on GitHub (Sep 11, 2025):

cf72f5503f39834b9da44ebbb426a3674dad0caa
