[GH-ISSUE #15143] Fail to run some models on 0.19 #35455

Open
opened 2026-04-22 19:57:16 -05:00 by GiteaMirror · 5 comments

Originally created by @razvanab on GitHub (Mar 30, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15143

What is the issue?

Ollama fails to run some models in the latest version, 0.19.0.

Errors:

```
500 Internal Server Error: llama runner process has terminated: %!w(<nil>)
500 Internal Server Error: memory layout cannot be allocated
Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details
```

[server.log](https://github.com/user-attachments/files/26347819/server.log)
[app.log](https://github.com/user-attachments/files/26347820/app.log)
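
The last error above says to check the Ollama server logs; on a default Windows install these live under %LOCALAPPDATA%\Ollama. A minimal sketch for pulling the most recent entries, assuming that default location:

```powershell
# Show the last 100 lines of the Ollama server log on Windows.
# Path assumes the default install location; adjust if Ollama was installed elsewhere.
Get-Content "$env:LOCALAPPDATA\Ollama\server.log" -Tail 100
```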

Relevant log output


OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.19.0

GiteaMirror added the bug label 2026-04-22 19:57:16 -05:00

@razvanab commented on GitHub (Mar 30, 2026):

I forgot to mention that these models worked very well in the previous version.


@dhiltgen commented on GitHub (Mar 30, 2026):

It looks like your GPU(s) aren't being discovered

```
time=2026-03-30T13:46:23.278+03:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.9 GiB" available="19.1 GiB"
```

I notice a few env vars that might be related... `CUDA_LAUNCH_BLOCKING=1` and `CUDA_VISIBLE_DEVICES=0`

Can you try removing those and see if the situation improves? You also have quite a few different CUDA versions installed - v12.8, v12.9, v13.0, v13.1, v13.2 - perhaps try starting the ollama server in a shell where those are removed from the PATH and other env vars, to see if library loading is somehow getting mixed up.
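
As a concrete sketch of that suggestion (the PATH filter below is a rough illustration, not an exact recipe), the variables can be cleared for a single PowerShell session before starting the server:

```powershell
# Clear the suspect CUDA variables for this session only (no permanent change).
Remove-Item Env:CUDA_LAUNCH_BLOCKING -ErrorAction SilentlyContinue
Remove-Item Env:CUDA_VISIBLE_DEVICES -ErrorAction SilentlyContinue

# Crudely strip CUDA toolkit entries from PATH for this session,
# so Ollama resolves only its own bundled libraries.
$env:PATH = ($env:PATH -split ';' | Where-Object { $_ -notmatch 'CUDA' }) -join ';'

# Start the server in the cleaned environment.
ollama serve
```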


@isMTv commented on GitHub (Mar 30, 2026):

Hi everyone, I also encountered this problem.
**OS:** Windows 11
**RAM DDR5:** 64 GB
**GPU:** Nvidia 2070 Super

**Running the model:** qwen3-coder-next
`Error: 500 Internal Server Error: memory layout cannot be allocated`

So far, I've determined that it works fine in version 0.17.7 of Ollama.

How can I resolve this issue in newer versions?
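
Until the regression is tracked down, one interim workaround is rolling back to the last release that worked. A minimal sketch, assuming the v0.17.7 release tag exists on GitHub and the Windows installer keeps Ollama's usual OllamaSetup.exe asset name:

```powershell
# Download the v0.17.7 Windows installer (asset name assumed from Ollama's release convention).
Invoke-WebRequest -Uri "https://github.com/ollama/ollama/releases/download/v0.17.7/OllamaSetup.exe" `
  -OutFile "$env:TEMP\OllamaSetup-0.17.7.exe"

# Run the installer; it replaces the currently installed version in place.
Start-Process "$env:TEMP\OllamaSetup-0.17.7.exe" -Wait
```

Note that the Windows app may offer to auto-update afterwards, so the update prompt would need to be declined to stay pinned.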


@razvanab commented on GitHub (Apr 2, 2026):

> It looks like your GPU(s) aren't being discovered
>
> ```
> time=2026-03-30T13:46:23.278+03:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.9 GiB" available="19.1 GiB"
> ```
>
> I notice a few env vars that might be related... `CUDA_LAUNCH_BLOCKING=1` and `CUDA_VISIBLE_DEVICES=0`
>
> Can you try removing those and see if the situation improves? You also have quite a few different CUDA versions installed - v12.8, v12.9, v13.0, v13.1, v13.2 - perhaps try starting the ollama server in a shell where those are removed from the PATH and other env vars, to see if library loading is somehow getting mixed up.

I did remove CUDA_LAUNCH_BLOCKING=1 and CUDA_VISIBLE_DEVICES=0 and the unused CUDA versions and it does work now. Thanks.
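
For anyone verifying the same fix, the quickest check is whether the server log's "inference compute" line now reports a CUDA device instead of only id=cpu. A sketch, again assuming the default Windows log location:

```powershell
# Look for the GPU discovery line; a working setup should list a CUDA device
# rather than only id=cpu as in the log excerpt quoted above.
Select-String -Path "$env:LOCALAPPDATA\Ollama\server.log" -Pattern "inference compute"
```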


@PureBlissAK commented on GitHub (Apr 18, 2026):

🤖 Automated Triage & Analysis Report

Issue: #15143
Analyzed: 2026-04-18T18:23:01.172673

Analysis

  • Type: unknown
  • Severity: medium
  • Components: unknown

Implementation Plan

  • Effort: medium
  • Steps:

This issue has been triaged and marked for implementation.


Reference: github-starred/ollama#35455