mirror of
https://github.com/ollama/ollama.git
synced 2026-03-09 07:16:38 -05:00
Previously, only the last token's processing time was included in the prompt processing time, giving an artificially high rate. In addition, the reported token count included only the tokens that missed the cache, instead of the total number of prompt tokens that we have historically reported.
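
To illustrate the two problems described above, here is a minimal sketch in Python. It is not code from the ollama repository: the function names, the (tokens, seconds) batch representation, and the choice to divide total tokens by total elapsed time are all assumptions made for the example.

```python
from typing import List, Tuple

# Each batch is (tokens_processed, seconds_taken). Hypothetical representation,
# chosen only for this illustration.
Batch = Tuple[int, float]


def naive_rate(batches: List[Batch]) -> float:
    """Rate as described in the bug: only cache-miss tokens are counted,
    and only the last step's duration is used as the elapsed time."""
    missed_tokens = sum(tokens for tokens, _ in batches)
    last_duration = batches[-1][1]
    return missed_tokens / last_duration


def corrected_rate(cached_tokens: int, batches: List[Batch]) -> float:
    """Assumed fix: count every prompt token (cached or not) and divide
    by the time accumulated across all processing steps."""
    total_tokens = cached_tokens + sum(tokens for tokens, _ in batches)
    total_duration = sum(seconds for _, seconds in batches)
    return total_tokens / total_duration


if __name__ == "__main__":
    batches = [(512, 0.5), (512, 0.5), (64, 0.25)]
    print("naive tokens/s:    ", naive_rate(batches))
    print("corrected tokens/s:", corrected_rate(100, batches))
```

With these made-up numbers the naive calculation divides by only the final 0.25 s step, so it reports a much higher rate than the calculation that uses the full 1.25 s of processing.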