[PR #5709] [MERGED] Add Metrics to api/embed response #11894

Closed
opened 2026-04-12 23:42:00 -05:00 by GiteaMirror · 0 comments
📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5709
Author: @royjhan
Created: 7/15/2024
Status: Merged
Merged: 7/30/2024
Merged by: @royjhan

Base: main ← Head: royh-embed-tokens


📊 Changes

6 files changed (+40 additions, -16 deletions)

📝 api/types.go (+4 -0)
📝 integration/embed_test.go (+9 -1)
📝 llm/ext_server/server.cpp (+6 -1)
📝 llm/server.go (+7 -6)
📝 server/routes.go (+12 -6)
📝 server/sched_test.go (+2 -2)

📄 Description

server.cpp returns "timings" once per request_completion, so the per-completion values must be aggregated to report metrics for a batch of completions.

Supported metrics: prompt_eval_count (total number of tokens evaluated), load duration, and total duration.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

### 📝 Commits (10+)

- [`2ea86a5`](https://github.com/ollama/ollama/commit/2ea86a5a9b5fa2bbf9b677f28666b34b3633f155) add prompt tokens to embed response
- [`15b7e41`](https://github.com/ollama/ollama/commit/15b7e4103aaec7e5153419f8ff1f3d7e8336b4e8) rm slog
- [`a62d212`](https://github.com/ollama/ollama/commit/a62d2124f61a1fe812b944cb3ccf05352f7acc7a) metrics
- [`1afe799`](https://github.com/ollama/ollama/commit/1afe7999546be7b0c38d6646fcdf29892a497a0a) types
- [`f8b7b71`](https://github.com/ollama/ollama/commit/f8b7b71ca6bf71ad7dc14c0a3a6369d4bf9b2693) prompt n
- [`6ca8f19`](https://github.com/ollama/ollama/commit/6ca8f19a1f97d0c05dd71423aa66a9e05bc5bcf0) clean up
- [`ea3deb9`](https://github.com/ollama/ollama/commit/ea3deb90a5e1b040a18af457dbbee081d2adb993) reset submodule
- [`c0e62bc`](https://github.com/ollama/ollama/commit/c0e62bce1cf95462bdd982180f361ec83f47fe71) update tests
- [`b543e1a`](https://github.com/ollama/ollama/commit/b543e1a44b4fb1ba64d87e9154a8d29c44fc9543) test name
- [`1862a82`](https://github.com/ollama/ollama/commit/1862a821418ce71f09e9a43d9d7886fe624b8c5a) list metrics
GiteaMirror added the pull-request label 2026-04-12 23:42:00 -05:00
Reference: github-starred/ollama#11894