[PR #6122] [CLOSED] llama: Implement timings response in Go server #43273

Closed
opened 2026-04-24 22:55:52 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6122
Author: @dhiltgen
Created: 8/1/2024
Status: Closed

Base: jmorganca/llama ← Head: go_server_timings


📝 Commits (1)

  • b9db0d3 Implement timings response in Go server

📊 Changes

1 file changed (+65 additions, -9 deletions)


📝 llama/runner/runner.go (+65 -9)

📄 Description

This implements the fields necessary for run --verbose to generate timing information.
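As a rough illustration of what a timings payload of this kind could look like, the sketch below converts token counts and measured durations into a small JSON object. The `Timings` type, its field names, and `newTimings` are hypothetical and are not taken from `llama/runner/runner.go`; the sample numbers are the Go runner figures quoted further down.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Timings is a hypothetical response payload. The field names are
// illustrative only and are not copied from the PR.
type Timings struct {
	PromptN     int     `json:"prompt_n"`     // tokens in the prompt
	PromptMS    float64 `json:"prompt_ms"`    // time spent evaluating the prompt
	PredictedN  int     `json:"predicted_n"`  // tokens generated
	PredictedMS float64 `json:"predicted_ms"` // time spent generating
}

// toMS converts a duration to fractional milliseconds.
func toMS(d time.Duration) float64 {
	return float64(d) / float64(time.Millisecond)
}

// newTimings builds the payload from counts and measured durations.
func newTimings(promptN int, prompt time.Duration, predictedN int, predicted time.Duration) Timings {
	return Timings{
		PromptN:     promptN,
		PromptMS:    toMS(prompt),
		PredictedN:  predictedN,
		PredictedMS: toMS(predicted),
	}
}

func main() {
	// Sample values: 47 prompt tokens in 2.29s, 48 generated tokens in 437ms.
	t := newTimings(47, 2290*time.Millisecond, 48, 437*time.Millisecond)
	out, err := json.Marshal(t)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A client that receives counts and durations like these has what it needs to print the count, duration, and rate lines shown in the examples.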

(Examples from my other branch wiring this into the main ollama serve: https://github.com/ollama/ollama/pull/5287)
C++ runner:

% ollama run orca-mini --verbose "what is the origin of independence day?"
 Independence Day, also known as the Fourth of July, celebrates the adoption of the
Declaration of Independence on July 4, 1776. This document declared the United States as
a new nation and is considered a founding moment in American history. The holiday has
become an important cultural and historical event for Americans, with parades, fireworks,
barbecues, and other festivities taking place throughout the country.

total duration:       1.929017s
load duration:        1.036014583s
prompt eval count:    48 token(s)
prompt eval duration: 95.572ms
prompt eval rate:     502.24 tokens/s
eval count:           84 token(s)
eval duration:        796.532ms
eval rate:            105.46 tokens/s

Go runner:

% ollama run orca-mini --verbose "what is the origin of independence day?"
 Day, also known as Canada Day, commemorates the day in 1867 when British
North America Act was passed, granting responsible government and Canada
as a self-governing dominion within the British Empire.

total duration:       3.265021459s
load duration:        535.450084ms
prompt eval count:    47 token(s)
prompt eval duration: 2.29s
prompt eval rate:     20.52 tokens/s
eval count:           48 token(s)
eval duration:        437ms
eval rate:            109.84 tokens/s
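The rate lines in both outputs are consistent with dividing the token count by the matching duration in seconds. The sketch below checks that arithmetic against the figures quoted above; `tokensPerSecond` and `rateFromString` are hypothetical helpers written for this illustration, not functions from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// tokensPerSecond returns count divided by the duration in seconds,
// or 0 when the duration is not positive.
func tokensPerSecond(count int, d time.Duration) float64 {
	if d <= 0 {
		return 0
	}
	return float64(count) / d.Seconds()
}

// rateFromString parses a duration written the way the verbose output
// prints it (for example "796.532ms") and returns the resulting rate.
func rateFromString(count int, dur string) (float64, error) {
	d, err := time.ParseDuration(dur)
	if err != nil {
		return 0, err
	}
	return tokensPerSecond(count, d), nil
}

func main() {
	samples := []struct {
		label string
		count int
		dur   string
	}{
		{"C++ prompt eval rate", 48, "95.572ms"},
		{"C++ eval rate", 84, "796.532ms"},
		{"Go prompt eval rate", 47, "2.29s"},
		{"Go eval rate", 48, "437ms"},
	}
	for _, s := range samples {
		r, err := rateFromString(s.count, s.dur)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-22s %.2f tokens/s\n", s.label+":", r)
	}
}
```

With these inputs the computed rates round to the values printed in the two runs (502.24, 105.46, 20.52, and 109.84 tokens/s).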

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 22:55:52 -05:00
Reference: github-starred/ollama#43273