[GH-ISSUE #3175] Run Mixtral-8x7B on Consumer Hardware with Expert Offloading #1955

Closed
opened 2026-04-12 12:06:20 -05:00 by GiteaMirror · 1 comment

Originally created by @arjunkrishna on GitHub (Mar 16, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3175

What are you trying to do?

mixtral:8x7B runs slowly on an RTX 3090 because of the model's size.

How should we solve this?

This article says we can offload some of the experts to make it run faster:
https://kaitchup.substack.com/p/run-mixtral-8x7b-on-consumer-hardware
If you have already implemented this in ollama, then I apologize.
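
To illustrate the general idea only (this is not how Ollama or the linked mixtral-offloading project implements it), expert offloading can be sketched as an LRU cache: Mixtral routes each token to 2 of its 8 experts per layer, so only a few experts need to sit in fast memory at once, and the rest are fetched on demand. Every name below (`ExpertCache`, `load_expert`, `run_demo`) and the routing sequence are hypothetical, and placeholder strings stand in for real weight tensors.

```python
from collections import OrderedDict


class ExpertCache:
    """Toy LRU cache that keeps at most `capacity` experts in fast memory.

    Hypothetical sketch: a real implementation would move weight tensors
    between host RAM and GPU memory instead of storing placeholder strings.
    """

    def __init__(self, capacity, load_expert):
        self.capacity = capacity        # number of experts that fit in fast memory
        self.load_expert = load_expert  # callback that fetches an expert from slow memory
        self.resident = OrderedDict()   # expert_id -> weights, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, expert_id):
        if expert_id in self.resident:
            self.hits += 1
            self.resident.move_to_end(expert_id)  # mark as most recently used
            return self.resident[expert_id]
        self.misses += 1
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)     # evict the least recently used expert
        weights = self.load_expert(expert_id)
        self.resident[expert_id] = weights
        return weights


def run_demo():
    cache = ExpertCache(capacity=2, load_expert=lambda i: f"weights-{i}")
    # Made-up routing decisions: each token uses 2 experts.
    for routed in [(0, 1), (0, 1), (1, 2), (0, 2)]:
        for expert_id in routed:
            cache.get(expert_id)
    return cache.hits, cache.misses, list(cache.resident)


if __name__ == "__main__":
    print(run_demo())
```

Every miss in this sketch stands for a slow transfer from host memory, so the benefit depends on how often consecutive tokens reuse the same experts.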

What is the impact of not solving this?

If this is solved, Mixtral may run faster on an RTX 3090.

Anything else?

https://github.com/dvmazur/mixtral-offloading



@arjunkrishna commented on GitHub (Mar 16, 2024):

Ah... I see this is already covered by another issue about Mixtral slowness on the 3090. I am closing this as a duplicate of https://github.com/ollama/ollama/issues/1556. I am going to try the mixtral:8x7b-instruct-v0.1-q2_K version instead.
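
A back-of-the-envelope estimate shows why a lower-bit quantization might help here. This sketch assumes roughly 46.7 billion total parameters for Mixtral-8x7B and 24 GB of VRAM on an RTX 3090. The bits-per-weight figures are illustrative assumptions, not measured values for any specific quantization format, and the estimate ignores the KV cache and runtime overhead.

```python
MIXTRAL_PARAMS = 46.7e9   # approximate total parameter count (assumption)
RTX_3090_VRAM_GB = 24     # VRAM on an RTX 3090


def weight_gigabytes(n_params, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits(n_params, bits_per_weight, vram_gb):
    """True if the weights alone fit in VRAM (ignores KV cache and overhead)."""
    return weight_gigabytes(n_params, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    # Illustrative average bits per weight, not exact format specifications.
    for bpw in (4.5, 3.0):
        size = weight_gigabytes(MIXTRAL_PARAMS, bpw)
        verdict = "fits" if fits(MIXTRAL_PARAMS, bpw, RTX_3090_VRAM_GB) else "does not fit"
        print(f"{bpw} bits/weight: {size:.1f} GB, {verdict} in {RTX_3090_VRAM_GB} GB")
```

Under these assumptions, the weights exceed 24 GB at about 4.5 bits per weight but fit at about 3 bits per weight, which is the motivation for trying a smaller quantization.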


Reference: github-starred/ollama#1955