[GH-ISSUE #10419] Provide an API to retrieve the number of requests being processed #6847

Closed
opened 2026-04-12 18:39:19 -05:00 by GiteaMirror · 4 comments

Originally created by @cr7258 on GitHub (Apr 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10419

We have integrated [Ollama into our inference platform](https://github.com/InftyAI/llmaz/blob/main/docs/examples/ollama/playground.yaml), and we are currently implementing a feature that waits for all active requests to complete before shutting down the pod, ensuring graceful termination.

We hope Ollama can provide an API for retrieving the number of requests being processed. This could be a Prometheus metric (gauge type), for example:

- [llama.cpp metrics](https://github.com/ggml-org/llama.cpp/blob/558a764713468f26f5a163d25a22100c9a04a48f/examples/server/README.md#get-metrics-prometheus-compatible-metrics-exporter)
- [vLLM metrics](https://docs.vllm.ai/en/latest/design/v1/metrics.html#metrics-publishing-prometheus)
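
To make the request concrete, here is a minimal sketch of the behavior being asked for: a gauge that counts requests currently being processed, rendered in the Prometheus text exposition format, plus a helper that waits for the gauge to reach zero before shutdown. This is not an existing Ollama API. The class `InFlightGauge`, the function `wait_until_drained`, and the metric name `requests_in_flight` are all hypothetical names chosen for illustration.

```python
import threading
import time
from contextlib import contextmanager


class InFlightGauge:
    """Thread-safe count of requests currently being processed."""

    def __init__(self, name: str = "requests_in_flight") -> None:
        # The metric name is an assumption, not something Ollama exposes.
        self.name = name
        self._value = 0
        self._lock = threading.Lock()

    @contextmanager
    def track(self):
        """Increment on entry and decrement on exit, even if the handler raises."""
        with self._lock:
            self._value += 1
        try:
            yield
        finally:
            with self._lock:
                self._value -= 1

    def value(self) -> int:
        with self._lock:
            return self._value

    def render(self) -> str:
        """Render the gauge in the Prometheus text exposition format."""
        return (
            f"# HELP {self.name} Number of requests currently being processed.\n"
            f"# TYPE {self.name} gauge\n"
            f"{self.name} {self.value()}\n"
        )


def wait_until_drained(gauge: InFlightGauge, timeout_s: float = 30.0,
                       poll_s: float = 0.05) -> bool:
    """Poll until no requests are in flight; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if gauge.value() == 0:
            return True
        time.sleep(poll_s)
    return gauge.value() == 0
```

A request handler would wrap its work in `with gauge.track():`, and a shutdown hook would call `wait_until_drained(gauge)` before letting the process exit. In a Kubernetes setup the same check could instead be done from outside the process by scraping the metric, which is why a gauge on a metrics endpoint would be sufficient for this use case.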
GiteaMirror added the feature request label 2026-04-12 18:39:19 -05:00

@googs1025 commented on GitHub (Apr 26, 2025):

/cc


@rick-github commented on GitHub (Apr 26, 2025):

#3144


@ParthSareen commented on GitHub (Apr 28, 2025):

Will close this in favor of tracking https://github.com/ollama/ollama/issues/10419


@hendrikebbers commented on GitHub (May 19, 2025):

@ParthSareen I assume you mean #3144?


Reference: github-starred/ollama#6847