[GH-ISSUE #121] Performance question? #42

Closed
opened 2026-04-12 09:34:32 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @kosecki123 on GitHub (Jul 19, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/121

This is just a request for info rather than a bug.

What kind of performance / latency on prompts should we expect when running on an M2 Pro? It seems to take up to 10s to generate answers using the `llama2` model. Is that something that can improve in the future?


@jmorganca commented on GitHub (Jul 19, 2023):

Hi @kosecki123 . Absolutely. That 10 seconds is loading in the model, and we're working on making that much shorter!

Thanks for the issue!


@zhiyun-deng commented on GitHub (Jul 29, 2023):

Why not keep the model loaded in between prompts? Providing such an option would make the user experience much smoother.


@jmorganca commented on GitHub (Jul 30, 2023):

@zhiyun-deng 💯 – the next release will do this! You can also test this out on main


@jmorganca commented on GitHub (Aug 23, 2023):

Closing for now, as the same model will stay loaded between prompts.
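For readers landing here later: a minimal sketch of how this behavior can be controlled, assuming a later Ollama release where the `OLLAMA_KEEP_ALIVE` environment variable and the `keep_alive` request field are available (both postdate this thread):

```shell
# Server-wide: keep models loaded for 30 minutes after each request.
# OLLAMA_KEEP_ALIVE accepts a duration string, or -1 to never unload.
export OLLAMA_KEEP_ALIVE=30m
ollama serve

# Per request: override via the generate API's keep_alive field.
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30m"
}'
```

With the model kept resident, only the first prompt pays the multi-second load cost discussed above; subsequent prompts start generating almost immediately.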

Reference: github-starred/ollama#42