[GH-ISSUE #8008] Return prompt cache utilization on completion responses #5125

Open
opened 2026-04-12 16:13:31 -05:00 by GiteaMirror · 2 comments

Originally created by @reckart on GitHub (Dec 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/8008

Since Ollama has prompt caching now (right?), it would be great if the utilization of the cache could be returned in completion responses.

E.g. the OpenAI-compatible API could be extended with the new [`usage/prompt_tokens_details/cached_tokens`](https://platform.openai.com/docs/guides/prompt-caching) field.

A similar field in the Ollama API would also be great.
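
For illustration, here is a minimal sketch of how a client might read the requested field if it were present. It assumes the response follows the OpenAI usage shape (`usage.prompt_tokens_details.cached_tokens`); the sample payload, its numbers, and the helper names `cached_prompt_tokens` and `cache_hit_ratio` are hypothetical and not part of any Ollama API.

```python
import json

# Sample payload shaped like an OpenAI-compatible completion response.
# The numbers are illustrative only, not measured output from Ollama.
SAMPLE_RESPONSE = json.loads("""
{
  "usage": {
    "prompt_tokens": 2006,
    "completion_tokens": 300,
    "total_tokens": 2306,
    "prompt_tokens_details": {"cached_tokens": 1920}
  }
}
""")


def cached_prompt_tokens(response: dict) -> int:
    """Return usage.prompt_tokens_details.cached_tokens, or 0 if it is absent."""
    usage = response.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    return int(details.get("cached_tokens") or 0)


def cache_hit_ratio(response: dict) -> float:
    """Fraction of prompt tokens served from the cache (0.0 when unknown)."""
    usage = response.get("usage") or {}
    prompt_tokens = int(usage.get("prompt_tokens") or 0)
    if prompt_tokens == 0:
        return 0.0
    return cached_prompt_tokens(response) / prompt_tokens


if __name__ == "__main__":
    print(cached_prompt_tokens(SAMPLE_RESPONSE))
    print(round(cache_hit_ratio(SAMPLE_RESPONSE), 3))
```

Both helpers fall back to zero when the field is missing, so the same code would keep working against a server that does not report cache usage yet.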

GiteaMirror added the feature request label 2026-04-12 16:13:31 -05:00

@codefromthecrypt commented on GitHub (Nov 3, 2025):

It's been a while on this, and the lack of `cached_tokens` and `reasoning_tokens` interferes with openai-agents SDK usage. I will try to get a workaround in there until this is sorted here.
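
As a rough idea of what a client-side workaround could look like (a sketch only, not necessarily what the openai-agents SDK does), the missing fields can be defaulted before the usage data reaches code that expects them. The helper name `normalize_usage` is hypothetical; the field paths follow the OpenAI usage shape, where `reasoning_tokens` sits under `completion_tokens_details`.

```python
from copy import deepcopy
from typing import Optional


def normalize_usage(usage: Optional[dict]) -> dict:
    """Return a copy of `usage` with cached/reasoning token counts defaulted to 0.

    A value of 0 here means "not reported by the server", which is an
    assumption of this sketch and not a statement that nothing was cached.
    """
    normalized = deepcopy(usage) if usage else {}

    prompt_details = normalized.get("prompt_tokens_details") or {}
    if prompt_details.get("cached_tokens") is None:
        prompt_details["cached_tokens"] = 0
    normalized["prompt_tokens_details"] = prompt_details

    completion_details = normalized.get("completion_tokens_details") or {}
    if completion_details.get("reasoning_tokens") is None:
        completion_details["reasoning_tokens"] = 0
    normalized["completion_tokens_details"] = completion_details

    return normalized


if __name__ == "__main__":
    # Usage block without the detail fields, as described in this issue.
    raw = {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
    print(normalize_usage(raw))
```

The input is copied rather than mutated, so the original response object stays untouched for logging or debugging.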


@codefromthecrypt commented on GitHub (Nov 3, 2025):

I raised this, but I'm not sure when it will merge: https://github.com/openai/openai-agents-python/pull/2034


Reference: github-starred/ollama#5125