[GH-ISSUE #12209] Version 11 bombing out and responds with GGGGGGGGGGGGGGG #70183

@R1U2 commented on GitHub (Sep 9, 2025):

OK, as promised, I started with Ollama 0.11.8 and llama3.2:latest on my Jetson. Open WebUI is a separate install on omv7/docker with an Intel Haswell processor.
Running from the CLI gets the same response. If I ask another question or start a new conversation, it will respond with GGGGGGGGGGGG either immediately or after the third question.

Ran deepseek-r1:7b for about 8 questions. No GGGGGGGGGGGGGGGGGGG responses, but twice when I changed the subject it gave me a response to the previous question.

Cleared the chats and started with llama3.2:latest again. Managed to do 5 questions before it gave me the GGGGGGGGGGGGG.

Cleared the chat and asked Qwen2.5-coder:3b some questions. It gave me GGGGGGGGGGGGGGGGGG after 5 questions.

Cleared the chat and ran deepseek-r1:1.5b, which gave me GGGGGGGGGGGGGGGG after two questions.

I will later roll back to Ollama version 0.11.7 and retest again.

GiteaMirror commented

@dhiltgen commented on GitHub (Sep 9, 2025):

From the screenshots above, it looks like you're on Jetpack v6. Did we select the correct runtime in the "inference compute" log line? Something like this:

```
time=2025-09-09T14:23:40.179-07:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-67834ba8-0312-50b2-9286-9b3b02e80059 library=cuda variant=jetpack6 compute=8.7 driver=12.6 name=Orin total="61.4 GiB" available="51.2 GiB"
```
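As a rough illustration of how to read that line, here is a minimal Python sketch that splits it into key/value fields so the `library` and `variant` values can be checked at a glance. The helper name `parse_log_line` is hypothetical and not part of Ollama. The sketch assumes the fields are space-separated `key=value` pairs with optional double quotes, as in the sample line above; `shlex` would raise an error on a line with unbalanced quotes.

```python
import shlex


def parse_log_line(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honoring double-quoted values.

    Hypothetical helper for illustration only; assumes the layout of the
    sample "inference compute" line quoted in this thread.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip any token that is not a key=value pair
            fields[key] = value
    return fields


# Sample line copied from the comment above.
sample = (
    'time=2025-09-09T14:23:40.179-07:00 level=INFO source=types.go:131 '
    'msg="inference compute" id=GPU-67834ba8-0312-50b2-9286-9b3b02e80059 '
    'library=cuda variant=jetpack6 compute=8.7 driver=12.6 name=Orin '
    'total="61.4 GiB" available="51.2 GiB"'
)

fields = parse_log_line(sample)
if fields.get("msg") == "inference compute":
    # prints: cuda jetpack6 Orin
    print(fields["library"], fields["variant"], fields["name"])
```

On a Jetpack v6 device, a `variant` other than `jetpack6` would suggest that a different runtime was selected.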

@R1U2 commented on GitHub (Sep 9, 2025):

@dhiltgen I just ran a test on my Intel NUC (omv7/docker, Ollama 0.11.8 latest) running llama3.2b. With short questions the issue does not present itself. So I'm not sure where I must select the correct runtime?

2026-05-04 20:37:09 -05:00

@HazmanNaim commented on GitHub (Sep 11, 2025):

Hi, I encountered a similar issue. I am running Ollama (Docker version 0.11.10) on a Jetson Orin and experimenting with LangGraph agents. For some unknown reason, Ollama starts responding with GGGG after a few interactions, regardless of which model is loaded. In one case, triggering a tool call immediately caused Ollama to respond with GGGG. However, if I run Ollama on an amd64 machine, it is stable with no issues.

So I rolled back to Docker Ollama version 0.10.0, and the issue seems to have gone away. The issue is probably related to version 0.11 when it runs on Jetson.

@v1ckxy commented on GitHub (Sep 16, 2025):

Same here. Orin Nano after two messages starts throwing G's

@eschoell commented on GitHub (Sep 22, 2025):

I am having the same issue running on an Orin. The gpt-oss model runs fine, while any other will quickly -- if not immediately -- fail. Based on that, it seems that whatever was done to support the gpt-oss model is the cause.

@thunderfm commented on GitHub (Sep 26, 2025):

Updated to 0.12.2 today and it seems to have been fixed. Tried a bunch of different models and they're all working well now.

@dhiltgen commented on GitHub (Sep 26, 2025):

@R1U2 please look in the server logs to see if Ollama auto-detected the correct runtime. Selecting the runtime is not something you have to do yourself; Ollama is supposed to figure it out from information on the system. If we chose the wrong runtime, then gibberish responses (or crashing) will happen. Our troubleshooting guide explains how to find the logs: https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md

@eschoell commented on GitHub (Sep 27, 2025):

It has not been fixed. I am still experiencing the same problem with version 0.12.3.

@eschoell commented on GitHub (Sep 27, 2025):

I see the log, but I do not see where it talks about finding the runtime...

@rick-github commented on GitHub (Sep 27, 2025):

https://github.com/ollama/ollama/issues/12209#issuecomment-3271542429

Or post the log.

@R1U2 commented on GitHub (Sep 28, 2025):

Hi all, I've been following the comments coming in. I've been busy away from home during the week and also learning how to use ComfyUI.

During this exercise I have learnt a few things about my Jetson Orin Nano. Ollama as well as ComfyUI can be run either in Jetson containers or in Docker containers. The reason I don't like Jetson containers is that they are not persistent: when I reset the unit the container is gone, and everything needs to be downloaded again when a new container is started. I don't know enough about running it from the CLI to keep it running long enough to play around with it. Hence the need to run it in a Docker environment with Portainer and a decent docker-compose.yaml file to bring it up, and then use Portainer to quickly change the network, attached storage, GPU settings, etc.

That said, the previous Ollama container I ran was, I believe, bottlenecked on the CPU. Although it was still fast, I felt it was not accessing the GPU on the Jetson as it should; I saw that in jtop. After getting ComfyUI to run with a good docker-compose.yaml that performs well and uses the GPU, I redid my Ollama compose file and started it up again. I realized that DustyNV's last version, 34.4.0, only included ollama/ollama:0.10.0 for Jetson. I will now retest version 0.11.8 with my new compose setup and then try the new 0.12.2. If the results are the same, I will post the logs as @dhiltgen requested.

Be back soon.

@R1U2 commented on GitHub (Sep 28, 2025):

Ok test results are in.

Spun up ollama 0.11.8 to retest.

llama3.2b had no issues and I could ask it about twenty questions.
I moved over to deepseek-R1:1.5b, and on the second question I got the gggggggggg. Log file below.

[_ollama_logs.txt](https://github.com/user-attachments/files/22579894/_ollama_logs.txt)

@R1U2 commented on GitHub (Sep 28, 2025):

Ran deepseek-r1:7b.

6 questions and it bombed out.
I changed the subject on question 5 and asked it to tell me a joke. It replied, but with no joke, continuing the previous line of questioning. Question 6 was answered with ggggggggggggg. The log below does not show much.

[_ollama_logs(1).txt](https://github.com/user-attachments/files/22580175/_ollama_logs.1.txt)

@R1U2 commented on GitHub (Sep 28, 2025):

Started a new chat with qwen; it bombed out on the second question. The log below does not show much.

[_ollama_logs(2).txt](https://github.com/user-attachments/files/22580186/_ollama_logs.2.txt)

@R1U2 commented on GitHub (Sep 28, 2025):

Spun up Ollama 0.12.3 with llama3.2 latest. 6 questions in, it bombed out with gggggggggg.
Log below.

[_ollama_logs.txt](https://github.com/user-attachments/files/22580221/_ollama_logs.txt)

@thunderfm - Still not fixed.

I will now revert to 0.10.0 until this is fixed. If there is anything you want me to help test, let me know.

@eschoell commented on GitHub (Oct 22, 2025):

It seems that there should be enough info to proceed fixing this, correct? The issue still has the "needs more info" tag.

I have resorted to running the latest version in a Docker container (just for gpt-oss:20b) alongside the native build of v0.10 for *everything else*. This is clearly not a sustainable workaround.

@dhiltgen commented on GitHub (Oct 22, 2025):

@R1U2 your logs aren't complete so I can't tell if this is a discovery problem where we're using the wrong CUDA runtime, or possibly over-committing GPU memory, or something else.

I believe you said you're using a container, so something like this should hopefully work: (adjust the flags if you need to)

```
docker run --rm -it --runtime=nvidia -e JETSON_JETPACK=6 -e OLLAMA_DEBUG=2 ollama/ollama 2>&1 | tee serve.log
```

As soon as you see the log line `... msg="inference compute" ...` show up, ctrl-c the docker run and share that serve.log.

(I should also point out, if your container does not have JETSON_JETPACK=6 it's probable we're using the wrong runtime - see the note at https://github.com/ollama/ollama/blob/main/docs/docker.md#start-the-container)
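For anyone who already has a longer log and wants to share only the startup portion, the "stop once the inference compute line appears" step can be sketched in Python as below. This is only an illustration under assumptions: `lines_through_marker` is a hypothetical helper, the first and last lines in `demo` are made up for the example, and the middle line is an abbreviated form of the log line quoted earlier in this thread.

```python
from typing import Iterable, Iterator

# Substring that identifies the log line requested in this thread.
MARKER = 'msg="inference compute"'


def lines_through_marker(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Yield lines up to and including the first one that contains marker."""
    for line in lines:
        yield line
        if marker in line:
            return  # stop reading once the requested line has been seen


# Hypothetical, abbreviated log lines used only to exercise the helper.
demo = [
    'level=INFO msg="hypothetical startup line"',
    'level=INFO msg="inference compute" library=cuda variant=jetpack6',
    'level=INFO msg="hypothetical later line"',
]

captured = list(lines_through_marker(demo))
print(len(captured))  # prints: 2
```

Passing an open file handle for serve.log instead of `demo` would give the same truncation for a real log, since a file object iterates line by line.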

@undeadindustries commented on GitHub (Oct 26, 2025):

Just chiming in that I'm getting this exact same issue on an NVIDIA DGX Spark, version 0.12.3.

@dhiltgen commented on GitHub (Nov 5, 2025):

@undeadindustries can you share more complete logs so we can try to isolate what's going wrong?

@undeadindustries commented on GitHub (Nov 10, 2025):

@dhiltgen Absolutely. Just to make sure I'm giving you exactly what you want: which command should I run for the logs, and/or which log file would you like?

Thanks for looking into it!
