[GH-ISSUE #10742] Ollama 0.6.3 and higher can't run Gemma2:9b #53569

Closed
opened 2026-04-29 03:49:11 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @csngmusic on GitHub (May 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10742

What is the issue?

Hello, I've encountered an issue where I can't run this model, or any of the Gemma3 models, in a terminal.
GPU: Radeon RX 7600 XT
CPU: Intel Core i7-4790K
RAM: 16GB
VRAM: 16GB
Error info:
C:\Users\Server>ollama run gemma2:9b
⠇ >>> What can you do?
As an open-weights AI assistant, I am trained on a massive dataset of text and code. This allows me to perform a
variety of tasks, including:

Communication and Language:

  • Generating creative content: Write stories, poems, articlesError: an error was encountered while running the model: read tcp 127.0.0.1:50126->127.0.0.1:50124: wsarecv: An existing connection was forcibly closed by the remote host.

So basically the model begins its answer and then gets cut off somehow. The port is always different; I checked and nothing else is using those ports, or 11434.
The answers were just fine when I used Ollama v0.5.12.0.
Here are the logs it gave me:
[app-1.log](https://github.com/user-attachments/files/20256525/app-1.log)
[server-1.log](https://github.com/user-attachments/files/20256524/server-1.log)

Relevant log output


OS

Windows

GPU

AMD

CPU

Intel

Ollama version

0.7.0

GiteaMirror added the bug, amd labels 2026-04-29 03:49:13 -05:00
Author
Owner

@rick-github commented on GitHub (May 17, 2025):

$ ollama -v
ollama version is 0.7.0
$ ollama run gemma2:9b
>>> What can you do?
I am Gemma, an open-weights AI assistant. Here's a glimpse of what I can do:

**Communication and Language:**

* **Engage in conversations:** I can chat with you on a wide range of topics, answer your questions, and even tell you stories.
* **Generate different creative text formats:**

I can write poems, code, scripts, musical pieces, email, letters, etc. 
* **Translate languages:** While I primarily communicate in English, I have some capacity to translate between other languages.
* **Summarize text:** Give me a long piece of writing, and I can provide a concise summary.

**Knowledge and Information:**

* **Answer your questions:** I have been trained on a massive dataset of text and code, so I can provide information on many subjects. Keep in mind that my 
knowledge is only up to a certain point in time, as I am not connected to the internet for real-time updates.
* **Help with brainstorming:** Need ideas for a project or story? I can help you generate different concepts.

**Remember:**

* I am still under development and learning new things every day.
* I don't have personal experiences or feelings.
* I am not able to access or provide real-time information or perform actions in the real world.

I'm excited to see how you use me! Let me know if you have any questions or tasks for me.

>>> /bye

Unfortunately there's nothing in the logs that's a smoking gun. Can you set OLLAMA_DEBUG=1 and try again with a prompt? The extra debug logging may contain relevant information.
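One way to make sure the debug flag actually reaches the server process on Windows — a minimal sketch, assuming the tray app is quit first so port 11434 is free — is to run the server in the foreground from the same cmd session that set the variable:

C:\Users\Server>set OLLAMA_DEBUG=1
C:\Users\Server>ollama serve

Debug-level logging then streams to that console, and a second terminal can run ollama run gemma2:9b to reproduce the error.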

Author
Owner

@csngmusic commented on GitHub (May 18, 2025):

I'll install the newest version again and try it out. Could it use more VRAM than 0.5.12 by any chance? I'm pretty limited in this respect, as it already takes around 11 GB to run.

Author
Owner

@csngmusic commented on GitHub (May 18, 2025):

Here's the output again
C:\Users\Server>set OLLAMA_DEBUG=1

C:\Users\Server>ollama run gemma2:9b

What can you do?
I am Gemma, an open-weights AI assistant. Here's a glimpse of what I can do:

Communication and Language:

  • Generate creative content: Write stories, poems, articles, and more.
  • Engage in conversationsError: an error was encountered while running the model: read tcp 127.0.0.1:53332->127.0.0.1:53329: wsarecv: An existing connection was forcibly closed by the remote host.

logs once again:
[app.log](https://github.com/user-attachments/files/20273320/app.log)
[server.log](https://github.com/user-attachments/files/20273319/server.log)

Author
Owner

@rick-github commented on GitHub (May 18, 2025):

[GIN] 2025/05/18 - 17:43:15 | 200 |   40.5405071s |       127.0.0.1 | POST     "/api/generate"
Exception 0xc0000005 0x0 0x0 0x7ff84e4d51fe
PC=0x7ff84e4d51fe
signal arrived during external code execution

runtime.cgocall(0x7ff7eaeb4c40, 0xc00011fbd8)
        C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc00011fbb0 sp=0xc00011fb48 pc=0x7ff7ea1f241e
github.com/ollama/ollama/llama._Cfunc_llama_decode(0x23966985890, {0x1, 0x239141da0a0, 0x0, 0x239141db0b0, 0x239141d9090, 0x23966b48090, 0x2395b5656a0})
        _cgo_gotypes.go:597 +0x50 fp=0xc00011fbd8 sp=0xc00011fbb0 pc=0x7ff7ea5a3f30

The code in llama_decode() tried to dereference a null pointer and the runner was killed for an access violation. It's not clear why the pointer was null. Interestingly, there's another recent access-violation issue (#10758) involving ROCm which was mitigated by setting OLLAMA_NUM_PARALLEL=1. Can you try that and see if it makes a difference?
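A note on applying that variable: when Ollama is started from the Windows tray app, a set in a cmd window only affects programs launched from that same window, not the already-running server. A sketch of making it stick — assuming the tray app is in use — is to quit Ollama from the tray icon, write the variable to the user environment, and then restart Ollama:

C:\Users\Server>setx OLLAMA_NUM_PARALLEL 1

setx only applies to processes started afterwards, so the server has to be restarted for the setting to take effect.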

Author
Owner

@csngmusic commented on GitHub (May 19, 2025):

C:\Users\Server>set OLLAMA_DEBUG=1

C:\Users\Server>set OLLAMA_NUM_PARALLEL=1

C:\Users\Server>ollama run gemma2:9b

What can you do?
As an open-weights AI, I am trained on a massive dataset of text and code. This allows me to perform a variety of
tasks, including:

Communication and Language:

  • Generate creative content: Write stories, poems, articles,Error: an error was encountered while running the model: read tcp 127.0.0.1:52609->127.0.0.1:52607: wsarecv: An existing connection was forcibly closed by the remote host.

Still have this issue. I'll provide new logs:

[server.log](https://github.com/user-attachments/files/20279500/server.log)
[app.log](https://github.com/user-attachments/files/20279501/app.log)

Author
Owner

@csngmusic commented on GitHub (Jun 21, 2025):

Tried updating Ollama to 0.9.2 (the latest release); it did not fix the issue. I can provide a screen recording in case there's some sort of abnormality in GPU, CPU, RAM, or VRAM usage.
What I noticed is that ollama ps shows the model running 100% on GPU, yet when I look at RAM and VRAM usage they are roughly equal, both around 7-10GB. I don't think I noticed this behavior before on v0.5.12.
I can't switch to NVIDIA :(
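For context on that ollama ps reading, the PROCESSOR column reports how the loaded model is split between system RAM and VRAM. A hypothetical example with made-up sizes and ID — not output from this system:

NAME         ID              SIZE      PROCESSOR          UNTIL
gemma2:9b    ff02c3702f32    8.1 GB    100% GPU           4 minutes from now
gemma2:9b    ff02c3702f32    8.1 GB    29%/71% CPU/GPU    4 minutes from now

100% GPU means the weights are reported as fully offloaded to VRAM; seeing several GB of host RAM in use alongside that can still be expected for the server process itself and OS file caching.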

Author
Owner

@rick-github commented on GitHub (Jun 22, 2025):

I got access to a ROCm system this week and was unable to reproduce the issue. Other users with similar problems have resolved it by [upgrading drivers](https://github.com/ollama/ollama/issues/11072#issuecomment-2972129657); is that something you could try?

Author
Owner

@csngmusic commented on GitHub (Jun 24, 2025):

Oh... That worked, thanks! I have no idea why I didn't think of updating them sooner.


Reference: github-starred/ollama#53569