[GH-ISSUE #2798] When I load two models (mistral and llama2), inference seems to go to CPU for both #63730

Closed
opened 2026-05-03 14:48:02 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @pankajkumar229 on GitHub (Feb 28, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2798

It would be nice if one of them were loaded on the GPU, even if both could not fit in GPU memory simultaneously.
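The requested behavior amounts to a placement policy: given a VRAM budget, keep as many models as fit on the GPU and fall back to CPU for the rest. The sketch below is purely illustrative of that policy, not ollama's actual scheduler; the model names, sizes (in GB), and the greedy largest-first strategy are all assumptions for the example.

```python
def place_models(models: dict[str, float], vram_budget: float) -> dict[str, str]:
    """Greedy placement sketch: assign each model to the GPU while its
    weights still fit in the remaining VRAM budget, otherwise to the CPU.

    models      -- mapping of model name to approximate size in GB (hypothetical)
    vram_budget -- total GPU memory available for model weights, in GB
    """
    placement: dict[str, str] = {}
    used = 0.0
    # Largest-first so the biggest model gets priority for GPU memory.
    for name, size in sorted(models.items(), key=lambda kv: kv[1], reverse=True):
        if used + size <= vram_budget:
            placement[name] = "gpu"
            used += size
        else:
            placement[name] = "cpu"
    return placement

# Hypothetical sizes on a 6 GB card: only one of the two fits.
print(place_models({"mistral": 4.1, "llama2": 3.8}, vram_budget=6.0))
# → {'mistral': 'gpu', 'llama2': 'cpu'}
```

Under this policy the scenario from the issue would run mistral on the GPU and llama2 on the CPU, rather than pushing both to CPU.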

Author
Owner

@pankajkumar229 commented on GitHub (Feb 28, 2024):

wrong observation

<!-- gh-comment-id:1968286584 -->

Reference: github-starred/ollama#63730