[PR #14240] [MERGED] bench: improve benchmarking tool #14584

Closed
opened 2026-04-13 00:58:33 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14240
Author: @dhiltgen
Created: 2/13/2026
Status: Merged
Merged: 3/15/2026
Merged by: @dhiltgen

Base: main ← Head: bench


📝 Commits (1)

  • 2963e47 bench: improve benchmarking tool

📊 Changes

3 files changed (+1460 additions, -298 deletions)


📝 cmd/bench/README.md (+60 -32)
📝 cmd/bench/bench.go (+313 -126)
📝 cmd/bench/bench_test.go (+1087 -140)

📄 Description

New features:

  • Warmup phase to eliminate cold-start outliers
  • Time-to-first-token measured in each epoch
  • VRAM/memory tracking to identify CPU spillover
  • Controlled prompt length
  • Defaults to 6 epochs and 200 tokens max
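
The warmup phase and per-epoch time-to-first-token measurement listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `cmd/bench/bench.go`: `generateFunc`, `epochResult`, and `runEpochs` are hypothetical names, the generator is a fake that stands in for a streaming request, and the warmup count of 1 is an assumption. Only the default of 6 epochs comes from the description.

```go
package main

import (
	"fmt"
	"time"
)

// generateFunc is a hypothetical stand-in for one streaming generation
// request. It calls onToken once for every token it produces.
type generateFunc func(onToken func())

// epochResult holds the timings collected for one measured epoch.
type epochResult struct {
	TTFT   time.Duration // request start to first token
	Total  time.Duration // request start to end of stream
	Tokens int
}

// runEpochs issues `warmup` unmeasured requests, then `epochs` measured ones.
// Warmup results are discarded so cold-start cost does not skew the epochs.
func runEpochs(gen generateFunc, warmup, epochs int) []epochResult {
	for i := 0; i < warmup; i++ {
		gen(func() {}) // result intentionally ignored
	}
	results := make([]epochResult, 0, epochs)
	for i := 0; i < epochs; i++ {
		var r epochResult
		start := time.Now()
		gen(func() {
			if r.Tokens == 0 {
				r.TTFT = time.Since(start) // first token of this epoch
			}
			r.Tokens++
		})
		r.Total = time.Since(start)
		results = append(results, r)
	}
	return results
}

func main() {
	calls := 0
	// Fake generator: the first call simulates a slow cold start.
	fake := func(onToken func()) {
		calls++
		if calls == 1 {
			time.Sleep(20 * time.Millisecond)
		}
		for i := 0; i < 3; i++ {
			time.Sleep(time.Millisecond)
			onToken()
		}
	}
	for i, r := range runEpochs(fake, 1, 6) {
		fmt.Printf("epoch %d: ttft=%v total=%v tokens=%d\n", i+1, r.TTFT, r.Total, r.Tokens)
	}
}
```

Because the slow first call is absorbed by the warmup run, none of the six reported epochs includes the simulated cold-start delay.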

Benchstat fixes:

  • ns/request was reported instead of ns/op: the non-standard unit created a separate group instead of grouping with timing metrics
  • Token count was written as the N field: benchstat interprets N as iteration count for statistical weighting, not as a token count
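
To illustrate the two fixes, here is a hedged sketch of emitting a result line in the Go benchmark text format that benchstat reads: a name starting with `Benchmark`, an iteration count, then value/unit pairs. The helper `benchLine`, the benchmark name, and the `tokens/op` unit are assumptions made for this example and may not match what the tool actually prints. The point is only that the iteration count goes in the N field, timing uses the standard `ns/op` unit, and the token count travels as its own metric.

```go
package main

import "fmt"

// benchLine formats one result in the Go benchmark text format.
// iterations is the N field (how many requests the value averages over);
// tokens is reported as a separate, hypothetical "tokens/op" metric.
func benchLine(name string, iterations int, nsPerOp float64, tokens int) string {
	return fmt.Sprintf("Benchmark%s\t%d\t%.0f ns/op\t%d tokens/op",
		name, iterations, nsPerOp, tokens)
}

func main() {
	// One measured request that took 1.5 s and produced 200 tokens.
	fmt.Println(benchLine("Generate", 1, 1.5e9, 200))
}
```

With the standard unit in place, benchstat can group the line with other timing results, and N no longer carries a value that would be misread as a statistical weight.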

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:58:33 -05:00
Reference: github-starred/ollama#14584