[GH-ISSUE #2875] Set attention type for Mistral 7B #1756

Closed
opened 2026-04-12 11:45:59 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @commit4ever on GitHub (Mar 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2875

Hi - is there any documentation that talks about how attention types can be set?
There seem to be options, as per the Mistral repo.
https://github.com/mistralai/mistral-src
https://huggingface.co/docs/transformers/main/model_doc/mistral — the HF docs talk about using flash attention.

Also, any pointers on how to interpret the debug messages, especially when the model loads and when it receives a `generate` request?

Any inputs would be much appreciated!

Thanks.
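(Editor's note, a hedged sketch: at the time this issue was opened, Ollama exposed no per-model attention-type setting. Later releases added the `OLLAMA_FLASH_ATTENTION` environment variable to enable flash attention server-wide, and `OLLAMA_DEBUG` for verbose load/generate logging. The snippet below assumes one of those later releases; check `ollama --version` and the FAQ for your build.)

```shell
# Enable flash attention for supported GPU backends
# (available in later Ollama releases, not at the time of this issue)
export OLLAMA_FLASH_ATTENTION=1

# Verbose logging: prints extra detail during model load
# and while handling each generate request
export OLLAMA_DEBUG=1

# Start the server with both settings applied
ollama serve
```

The variables must be set in the environment of the `ollama serve` process itself; setting them in the client shell that runs `ollama run` has no effect.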

Author
Owner

@pdevine commented on GitHub (May 18, 2024):

I'm going to close this as a dupe of #4051 . It's coming soon!

<!-- gh-comment-id:2118615509 -->

Reference: github-starred/ollama#1756