[GH-ISSUE #12209] Version 11 bombing out and responds with GGGGGGGGGGGGGGG #8124

Open
opened 2026-04-12 20:29:28 -05:00 by GiteaMirror · 25 comments

Originally created by @R1U2 on GitHub (Sep 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12209

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Hi, this started last week. When running Ollama in a Docker environment on my Jetson Orin Nano, llama3.2 will, after two or three responses in Open WebUI, start replying with GGGGGGGGGGGGGGGGGGGGGGGGG.
At first I thought it was due to the Jetson settings etc., but nothing fixed it. I then rolled my Ollama version back to 0.10 and it has been running stable. I just stopped my container and started it again, and as it is set to latest it pulled ver 0.11 again. I thought the issue might have been addressed, but unfortunately not. I will be rolling my version back to 0.10 again.

Relevant log output


OS

Docker

GPU

Nvidia

CPU

Intel

Ollama version

0.11 latest

GiteaMirror added the nvidia, bug, needs more info labels 2026-04-12 20:29:28 -05:00

@rick-github commented on GitHub (Sep 9, 2025):

More context in #12142.
Also reported in the [discord](https://discord.com/channels/1128867683291627614/1211804431340019753/threads/1408405019774156961).


@dhiltgen commented on GitHub (Sep 9, 2025):

Which jetpack version are you running?
What did the "inference compute" line in the server log show?


@R1U2 commented on GitHub (Sep 9, 2025):

Hi, my device is the Jetson Orin Nano Super.

I am running Ollama in a Docker container from a docker-compose.yaml file.

A week ago, while looking into why it is doing this, I read on Reddit about another user with the same issue on his distro, except that flash attention was not enabled in the Ollama version he was using. He mentioned that once it was enabled his GGGGGGGGGGGGGGGGG responses went away. So basically the opposite of what we are experiencing now.

Looking at Ollama ver 0.11.8, flash attention is now permanently enabled. I will run a test later today with ver 0.11.7 and see if I can recreate the GGGGGGGGGGGGGGGG response with that version, then I will do 0.11.8 to compare the two.

I will run it with deepseek-r1 as well as llama3.2.

![Image](https://github.com/user-attachments/assets/904dee7b-91b0-4e7d-b898-c0038e569b54)

@R1U2 commented on GitHub (Sep 9, 2025):

Ok, as promised, I started with Ollama 0.11.8 and llama3.2:latest on my Jetson. Open WebUI runs on a separate host (omv7/Docker, Intel Haswell processor).
Running from the CLI gets the same response. If I ask another question or start a new conversation it will either be GGGGGGGGGGGG immediately or after the third question.

![Image](https://github.com/user-attachments/assets/86aa976c-e328-4684-b7a1-1bb438dd6101)

Ran deepseek-r1:7b for about 8 questions. No GGGGGGGGGGGGGGGGGGG responses, but twice when I changed the subject it gave me a response to the previous question.

![Image](https://github.com/user-attachments/assets/ea397150-25a5-4419-8630-6afe7893371c)

Cleared the chats and started with llama3.2:latest again. Managed to ask 5 questions before it gave me the GGGGGGGGGGGGG.

Cleared the chat and asked Qwen2.5-coder:3b some questions. It gave me GGGGGGGGGGGGGGGGGG after 5 questions.

![Image](https://github.com/user-attachments/assets/1ea0af9b-ba6e-4c61-8f86-3c17764cba83)

Cleared and ran deepseek-r1:1.5b; it gave me GGGGGGGGGGGGGGGG after two questions.

![Image](https://github.com/user-attachments/assets/0fcdfbe5-fc58-49bf-b5b5-008a738e09c9)

I will later roll back to Ollama version 0.11.7 and retest again.


@dhiltgen commented on GitHub (Sep 9, 2025):

From the screenshots above, it looks like you're on Jetpack v6. Did we select the correct runtime in the "inference compute" log line? Something like this:

```
time=2025-09-09T14:23:40.179-07:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-67834ba8-0312-50b2-9286-9b3b02e80059 library=cuda variant=jetpack6 compute=8.7 driver=12.6 name=Orin total="61.4 GiB" available="51.2 GiB"
```

@R1U2 commented on GitHub (Sep 9, 2025):

@dhiltgen I ran a test now on my Intel NUC (omv7/Docker, Ollama 0.11.8 latest) running llama3.2. Using short questions the issue does not present itself. So I'm not sure where I must select the correct runtime?


@HazmanNaim commented on GitHub (Sep 11, 2025):

Hi, I encountered a similar issue. I am running Ollama (Docker version 0.11.10) on a Jetson Orin and experimenting with LangGraph agents. For some unknown reason, Ollama starts responding with GGGG after a few interactions, regardless of which model is loaded. In one case, triggering a tool call immediately caused Ollama to respond with GGGG. However, if I run Ollama on an amd64 machine, it is stable with no issues.

So I rolled back to Docker Ollama version 0.10.0, and the issue seems to have gone away. The problem is probably something related to version 0.11 when running on Jetson.


@v1ckxy commented on GitHub (Sep 16, 2025):

Same here. My Orin Nano starts throwing G's after two messages.


@eschoell commented on GitHub (Sep 22, 2025):

I am having the same issue running on an Orin. The gpt-oss model runs fine, while any other will quickly -- if not immediately -- fail. Based on that, it seems that whatever was done to support the gpt-oss model is the cause.


@thunderfm commented on GitHub (Sep 26, 2025):

Updated to 0.12.2 today and it seems to have been fixed. Tried a bunch of different models and they're all working well now.


@dhiltgen commented on GitHub (Sep 26, 2025):

@R1U2 please look in the server logs to see if Ollama auto-detected the correct runtime. This is not something you have to do; Ollama is supposed to figure it out from information on the system. If we chose the wrong runtime, gibberish responses (or crashes) will happen. Our troubleshooting guide explains how to find the logs: https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md
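As a quick way to check, you can filter the server log for the detection line rather than reading the whole file. The snippet below is a hypothetical sketch: it writes a sample log line (in the format quoted earlier in this thread) to a temp file and greps out the runtime variant; on a real system you would grep your actual server log instead (e.g. `docker logs <ollama-container> 2>&1 | grep 'inference compute'`).

```shell
# Hypothetical sketch: filter the runtime-detection line from a server log.
# /tmp/sample_server.log stands in for your real log; with Docker you would use:
#   docker logs <ollama-container> 2>&1 | grep 'inference compute'
cat > /tmp/sample_server.log <<'EOF'
time=2025-09-09T14:23:40.179-07:00 level=INFO source=types.go:131 msg="inference compute" library=cuda variant=jetpack6 compute=8.7 driver=12.6 name=Orin
EOF

# Extract the selected runtime variant; on a Jetson Orin this should be a
# jetpack variant, not the plain desktop CUDA runtime.
grep -o 'variant=[a-z0-9]*' /tmp/sample_server.log
```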


@eschoell commented on GitHub (Sep 27, 2025):

It has not been fixed. I am still experiencing the same problem with version 0.12.3.


@eschoell commented on GitHub (Sep 27, 2025):

I see the log, but I do not see where it talks about finding the runtime...


@rick-github commented on GitHub (Sep 27, 2025):

https://github.com/ollama/ollama/issues/12209#issuecomment-3271542429

Or post the log.


@R1U2 commented on GitHub (Sep 28, 2025):

Hi all, I have been watching the comments come in. I've been busy away from home during the week and have also been playing around with ComfyUI, learning how to use it.

During this exercise I have learnt a few things about my Jetson Orin Nano. Ollama, like ComfyUI, can be run either in Jetson containers or in Docker containers. The reason I don't like Jetson containers is that they are not persistent: when I reset the unit the container is gone, and everything needs to be downloaded again when a new container is started. I don't know enough about running it from the CLI to keep it running long enough to play around with it. Hence the need to run it in a Docker environment with Portainer, and a decent docker-compose.yaml file to bring it up. Portainer then makes it quick to change the network, attached storage, GPU settings, etc.

That said, I believe my previous Ollama container ran with a CPU bottleneck. Although it was still fast, I felt it was not accessing the GPU on the Jetson as it should; I saw that in jtop. After getting ComfyUI to run with a good docker-compose.yaml that performs well and uses the GPU, I redid my Ollama compose file and started it up again. I realized that DustyNV's last release, version 34.4.0, only included the ollama/ollama:0.10.0 version for Jetson. I will now retest with my new compose setup on version 0.11.8 and then on the new 0.12.2. If the results are the same I will post the logs as @dhiltgen requested.

Be back soon.


@R1U2 commented on GitHub (Sep 28, 2025):

Ok, test results are in.

Spun up Ollama 0.11.8 to retest.

llama3.2 had no issues and I could ask it about twenty questions.
I moved over to deepseek-r1:1.5b; on the second question I get the gggggggggg. Log file below.

[_ollama_logs.txt](https://github.com/user-attachments/files/22579894/_ollama_logs.txt)


@R1U2 commented on GitHub (Sep 28, 2025):

Ran deepseek-r1:7b.

6 questions and it bombed out.
I changed the subject on question 5 and asked it to tell me a joke; it replied, but with no joke and with the previous line of questioning. Question 6 was answered with ggggggggggggg. The log below does not show much.

[_ollama_logs(1).txt](https://github.com/user-attachments/files/22580175/_ollama_logs.1.txt)


@R1U2 commented on GitHub (Sep 28, 2025):

Started a new chat with Qwen; it bombed out on the second question. The log, below, does not show much.

[_ollama_logs(2).txt](https://github.com/user-attachments/files/22580186/_ollama_logs.2.txt)


@R1U2 commented on GitHub (Sep 28, 2025):

Spun up Ollama 0.12.3 with llama3.2:latest. 6 questions in, it bombs out with gggggggggg.
Log below.

[_ollama_logs.txt](https://github.com/user-attachments/files/22580221/_ollama_logs.txt)

@thunderfm - Still not fixed.

Will now revert back to 0.10.0 again until this is fixed. If there is anything you want me to assist with testing wise, let me know.


@eschoell commented on GitHub (Oct 22, 2025):

It seems that there should be enough info to proceed fixing this, correct? The issue still has the "needs more info" tag.

I have resorted to running the latest version in a Docker container (just for gpt-oss:20b) alongside the native build of v0.10 for *everything else*. This is clearly not a sustainable workaround.


@dhiltgen commented on GitHub (Oct 22, 2025):

@R1U2 your logs aren't complete, so I can't tell if this is a discovery problem where we're using the wrong CUDA runtime, or possibly over-committing GPU memory, or something else.

I believe you said you're using a container, so something like this should hopefully work (adjust the flags if you need to):

```
docker run --rm -it --runtime=nvidia -e JETSON_JETPACK=6 -e OLLAMA_DEBUG=2 ollama/ollama 2>&1 | tee serve.log
```

As soon as you see the log line `... msg="inference compute" ...` show up, ctrl-c the docker run and share that serve.log.

(I should also point out, if your container does not have JETSON_JETPACK=6 it's probable we're using the wrong runtime; see the note at https://github.com/ollama/ollama/blob/main/docs/docker.md#start-the-container)
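For those running via docker-compose (as several people in this thread are), the same flags can be expressed in the compose file. The sketch below is a minimal hypothetical example, not a tested or official config; the service name, image tag, volume path, and port mapping are placeholders to adjust for your setup:

```yaml
# Hypothetical docker-compose sketch for Ollama on a Jetson (Jetpack 6).
services:
  ollama:
    image: ollama/ollama:latest
    runtime: nvidia              # use the NVIDIA container runtime
    environment:
      - JETSON_JETPACK=6         # hint the Jetpack 6 CUDA runtime, per the comment above
      - OLLAMA_DEBUG=2           # verbose logging for troubleshooting
    volumes:
      - ./ollama:/root/.ollama   # persist models across container restarts
    ports:
      - "11434:11434"
    restart: unless-stopped
```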


@undeadindustries commented on GitHub (Oct 26, 2025):

Just chiming in that I'm getting this exact same issue on an NVIDIA DGX Spark, version 0.12.3.


@dhiltgen commented on GitHub (Nov 5, 2025):

@undeadindustries can you share more complete logs so we can try to isolate what's going wrong?


@undeadindustries commented on GitHub (Nov 10, 2025):

@dhiltgen absolutely. Just to make sure I'm giving you exactly what you want. Which command should I run for the logs and/or which log file would you like?

Thanks for looking into it!


@dhiltgen commented on GitHub (Nov 12, 2025):

@undeadindustries make sure you're running the latest version, start the server with OLLAMA_DEBUG=2, and share the log from startup to the point where it reports "inference compute" so we can see why it's failing to discover your GPU properly.


Reference: github-starred/ollama#8124