[GH-ISSUE #14699] Models load into ram first and then into gpu on latest update #35270

Closed
opened 2026-04-22 19:39:50 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @MCorbo7 on GitHub (Mar 8, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14699

What is the issue?

Models that took 15 seconds to load now take over a minute. Downgraded to 1.17.4, which doesn't have this issue.

Relevant log output


OS

Windows 11

GPU

RTX 5070 12GB

CPU

i7 14700f

Ollama version

1.17.7

GiteaMirror added the bug, needs more info labels 2026-04-22 19:39:50 -05:00
Author
Owner

@rick-github commented on GitHub (Mar 8, 2026):

[Server logs](https://docs.ollama.com/troubleshooting) will aid in debugging.


Reference: github-starred/ollama#35270