[PR #13944] [CLOSED] server: usage api #24991

Closed
opened 2026-04-19 17:56:13 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13944
Author: @BruceMacD
Created: 1/28/2026
Status: Closed

Base: main ← Head: brucemacd/usage-api


📝 Commits (3)

91dc088 server: usage api
43d9907 fix tests
9ac1300 fix lint

📊 Changes

9 files changed (+381 additions, -12 deletions)

📝 api/types.go (+13 -0)
📝 docs/api.md (+48 -0)
📝 server/routes.go (+21 -1)
📝 server/routes_debug_test.go (+2 -0)
📝 server/routes_generate_renderer_test.go (+2 -0)
📝 server/routes_generate_test.go (+94 -11)
📝 server/routes_harmony_streaming_test.go (+3 -0)
➕ server/usage.go (+62 -0)
➕ server/usage_test.go (+136 -0)

📄 Description

Add a new /api/usage endpoint that shows aggregate usage statistics per model since the server started.

curl http://localhost:11434/api/usage | jq
{
  "start": "2026-01-28T01:22:12.913022Z",
  "usage": [
    {
      "model": "qwen3-coder:30b",
      "requests": 1,
      "prompt_tokens": 13,
      "completion_tokens": 248
    }
  ]
}
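As a client-side illustration of the response shape above, here is a minimal Go sketch that decodes the JSON and sums tokens across models. The type names `ModelUsage` and `UsageResponse` and the helpers `parseUsage` and `totalTokens` are assumptions made for this example. They are not taken from the PR's `api/types.go`, and only the JSON keys come from the sample output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ModelUsage is a hypothetical type name; its JSON tags mirror the keys
// shown in the example response.
type ModelUsage struct {
	Model            string `json:"model"`
	Requests         int64  `json:"requests"`
	PromptTokens     int64  `json:"prompt_tokens"`
	CompletionTokens int64  `json:"completion_tokens"`
}

// UsageResponse is a hypothetical type name for the top-level object.
type UsageResponse struct {
	Start time.Time    `json:"start"` // RFC 3339 timestamp of server start
	Usage []ModelUsage `json:"usage"`
}

// parseUsage decodes a response body in the shape shown above.
func parseUsage(body []byte) (UsageResponse, error) {
	var resp UsageResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

// totalTokens sums prompt and completion tokens across all models.
func totalTokens(resp UsageResponse) int64 {
	var total int64
	for _, u := range resp.Usage {
		total += u.PromptTokens + u.CompletionTokens
	}
	return total
}

func main() {
	// Body copied from the example response in the description.
	body := []byte(`{"start":"2026-01-28T01:22:12.913022Z","usage":[{"model":"qwen3-coder:30b","requests":1,"prompt_tokens":13,"completion_tokens":248}]}`)
	resp, err := parseUsage(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Usage[0].Model, totalTokens(resp))
}
```

In a real client the body would come from an HTTP GET against `/api/usage` rather than a literal.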

Currently:

  • Per-response metrics exist (prompt_eval_count, eval_count), but there is no overall usage total
  • OpenAI and Anthropic compatibility endpoints already return standardized usage objects which tools can use to track context consumption
  • No aggregate usage endpoint exists to query cumulative statistics across requests

The endpoint will return aggregate statistics about model usage since the server started.
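One way such per-model aggregation could work is an in-memory map guarded by a mutex, updated once per completed request. The sketch below is only an illustration under that assumption and is not the code in the PR's `server/usage.go`. The names `usageTracker`, `modelTotals`, `record`, and `snapshot` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// modelTotals holds running counters for one model (hypothetical type).
type modelTotals struct {
	Requests         int64
	PromptTokens     int64
	CompletionTokens int64
}

// usageTracker is a hypothetical in-memory aggregator. Its state lives only
// in process memory, so it resets whenever the server restarts.
type usageTracker struct {
	mu     sync.Mutex
	start  time.Time
	models map[string]*modelTotals
}

func newUsageTracker() *usageTracker {
	return &usageTracker{
		start:  time.Now().UTC(),
		models: make(map[string]*modelTotals),
	}
}

// record adds one completed request's per-response counts
// (prompt_eval_count, eval_count) to the model's running totals.
func (t *usageTracker) record(model string, promptEvalCount, evalCount int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	m, ok := t.models[model]
	if !ok {
		m = &modelTotals{}
		t.models[model] = m
	}
	m.Requests++
	m.PromptTokens += promptEvalCount
	m.CompletionTokens += evalCount
}

// snapshot returns a copy of the totals so callers never touch shared state.
func (t *usageTracker) snapshot() map[string]modelTotals {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make(map[string]modelTotals, len(t.models))
	for name, m := range t.models {
		out[name] = *m
	}
	return out
}

func main() {
	tracker := newUsageTracker()
	tracker.record("qwen3-coder:30b", 13, 248)
	tracker.record("qwen3-coder:30b", 20, 100)
	fmt.Printf("%+v\n", tracker.snapshot()["qwen3-coder:30b"])
}
```

Keeping the counters in memory fits the stated scope of statistics since the server started. Persisting them across restarts would be a separate design.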

Non-goals

  • Historical time-bucketed usage across server restarts

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 17:56:13 -05:00
Reference: github-starred/ollama#24991