[GH-ISSUE #13313] ministral-3:14b num_ctx over 128512 results in repetition #55306

Closed
opened 2026-04-29 08:49:24 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @dan-and on GitHub (Dec 3, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13313

What is the issue?

The ministral-3:14b (ministral-3:14b-instruct-2512-q4_K_M) model description mentions a context length of 262144.

During my first tests I found that any value over 128512 results in the first token (sometimes the first two tokens) being repeated endlessly, and ministral-3:14b is not usable until num_ctx is set to 128512 or lower.

Update: The same applies to ministral-3:14b-instruct-2512-q8_0, but at num_ctx over 206848.

Important:
This is not a problem with ministral-3:8b or ministral-3:3b. These two models work fine with a num_ctx of 262144.
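The failing thresholds above were found by trial and error. Assuming the failure is monotone in num_ctx, a small script can bisect the largest working value automatically. This is only a sketch: the endpoint is Ollama's default `/api/generate`, and the repetition check is a crude heuristic keyed to the degenerate outputs seen in this issue — adapt both to your setup.

```shell
#!/bin/sh
# Bisect the largest num_ctx that still produces sane output.
# Assumes a local Ollama server on the default port; adapt MODEL as needed.
MODEL="ministral-3:14b"

probe() {  # $1 = num_ctx; succeeds if the reply looks sane
  reply=$(curl -s http://localhost:11434/api/generate \
    -d "{\"model\":\"$MODEL\",\"prompt\":\"why is the sky blue?\",\"stream\":false,\"options\":{\"num_ctx\":$1}}")
  [ -n "$reply" ] || return 1   # no/empty reply counts as a failure
  # crude heuristic: a broken reply repeats its first token back-to-back
  case "$reply" in
    *TheTheTheThe*|*HelloHelloHelloHello*) return 1 ;;
    *) return 0 ;;
  esac
}

bisect() {  # $1 = known-good lower bound, $2 = upper bound
  lo=$1 hi=$2
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))          # round up so lo always advances
    if probe "$mid"; then lo=$mid; else hi=$((mid - 1)); fi
  done
  echo "$lo"
}
```

Usage, starting from a known-good lower bound: `bisect 4096 262144` prints the largest num_ctx that still passes the probe (128512 for the q4_K_M case reported here), in about 18 probes instead of hundreds.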

System:
I am running Ubuntu 24.04, with 3x RTX 3080 - 20GB and 2x RTX 3060 - 12GB, so a total of 84 GB raw (76 GB usable) VRAM.

[ollama_0.13.1_ministral.txt](https://github.com/user-attachments/files/23904723/ollama_0.13.1_ministral.txt)

Relevant log output

# ollama show ministral-3:14b
  Model
    architecture        mistral3
    parameters          13.9B
    context length      262144
    embedding length    5120
    quantization        Q4_K_M

  Capabilities
    completion
    vision
    tools

  Parameters
    temperature    0.15

  System
    You are Ministral-3-14B-Instruct-2512, a Large Language Model (LLM) created by Mistral AI, a French
      startup headquartered in Paris.
    You power an AI assistant called Le Chat.
    ...


# ollama run ministral-3:14b
>>> /set parameter num_ctx 128512
Set parameter 'num_ctx' to '128512'
>>> why is the sky blue?
The sky appears blue due to a phenomenon called **Rayleigh scattering**, which is the scattering of sunlight by the molecules and tiny particles in Earth's atmosphere. Here’s a step-by-step explanation:

1. **Sunlight Composition**: Sunlight (white light) is made up of all colors of the visible spectrum—red, orange, yellow, green, blue, indigo, and violet—each with different wavelengths. Blue and violet light have the shortest
wavelengths, while red light has the longest.

2. **Scattering by the Atmosphere**: When sunlight reaches Earth's atmosphere, it interacts with nitrogen and oxygen molecules. Shorter wavelengths (blue and violet) are scattered more efficiently than longer wavelengths (red, orange,
yellow) because they interact more strongly with the molecules.

3. **Why Not Violet?**: Although violet light is scattered even more than blue, the human eye is less sensitive to violet light, and some of it is also absorbed by the upper atmosphere. As a result, blue light dominates what we
perceive.

4. **Perception**: Our eyes detect the scattered blue light from all directions, making the sky appear blue during the day.

At sunrise or sunset, the sky often appears red or orange because sunlight passes through more of the atmosphere, scattering the shorter blue wavelengths and leaving the longer red and orange wavelengths to reach our eyes.

>>> /set parameter num_ctx 128513
Set parameter 'num_ctx' to '128513'
>>> why is the sky blue?
TheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheTheThe

>>> Send a message (/? for help)


# ollama ps
NAME               ID              SIZE     PROCESSOR    CONTEXT    UNTIL
ministral-3:14b    8a5cdca192c0    40 GB    100% GPU     128513     59 minutes from now


# ollama -v
ollama version is 0.13.1

# nvidia-smi
Wed Dec  3 11:16:24 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:23:00.0 Off |                  N/A |
|  0%   32C    P8             21W /  105W |       4MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3080        On  |   00000000:25:00.0 Off |                  N/A |
|  0%   29C    P8              4W /  105W |   14319MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 3060        On  |   00000000:2F:00.0 Off |                  N/A |
|  0%   36C    P8              9W /  105W |   10791MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 3080        On  |   00000000:30:00.0 Off |                  N/A |
|  0%   32C    P8              8W /  105W |   14137MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 3060        On  |   00000000:31:00.0 Off |                  N/A |
|  0%   26C    P8              5W /  105W |       4MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A           20744      C   /usr/bin/ollama                       14310MiB |
|    2   N/A  N/A           20744      C   /usr/bin/ollama                       10782MiB |
|    3   N/A  N/A           20744      C   /usr/bin/ollama                       14128MiB |
+-----------------------------------------------------------------------------------------+

OS

Linux Ubuntu 24.04

GPU

3x RTX 3080 20GB
2x RTX 3060 12GB

CPU

AMD Ryzen 5 5500GT

Ollama version

0.13.1

GiteaMirror added the bug label 2026-04-29 08:49:24 -05:00
Author
Owner

@rick-github commented on GitHub (Dec 3, 2025):

This may be caused by the model being split over multiple GPUs. I ran the same commands as your example and the second `why is the sky blue?` ran without a problem; the difference is that in my test the model fit on one GPU. What is the output of `nvidia-smi` for the two different `num_ctx` values?

It's also interesting that the size of the model in your case (40GB) is less than in my single GPU case (44GB). Typically the memory footprint goes up when a model is split across multiple devices.
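One way to test the multi-GPU hypothesis is to pin the server to a single device. Ollama honors `CUDA_VISIBLE_DEVICES` set on the server process; the sketch below assumes a systemd-managed Linux install and that the q4_K_M weights fit on one 20GB card at a reduced num_ctx — adapt the unit name and GPU index to your setup.

```shell
# Restrict the Ollama server to GPU 0 so the model cannot be split.
# (Assumption: systemd install; run "ollama serve" manually instead if not.)
sudo systemctl stop ollama
CUDA_VISIBLE_DEVICES=0 ollama serve &

# In another terminal: retry the failing prompt with a context that
# fits on a single card, and compare against the multi-GPU behavior.
ollama run ministral-3:14b
```

If the repetition disappears when the model sits on one GPU, that would point at the split/tensor-placement path rather than the model itself.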


@dan-and commented on GitHub (Dec 3, 2025):

That may explain why CPU offloading causes issues / is not currently supported (see https://github.com/ollama/ollama/issues/13312).

However, I am able to run ministral-3:14b (q8_0) with a num_ctx of 206848 across several GPUs, as it requires 60GB of VRAM:

$ ollama ps
NAME                       ID              SIZE     PROCESSOR    CONTEXT    UNTIL
ministral-3_14b_q8:206k    932e1d757f66    60 GB    100% GPU     206848     59 minutes from now

$ nvidia-smi
Wed Dec  3 17:29:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:23:00.0 Off |                  N/A |
| 50%   38C    P8             23W /  105W |   18839MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3080        On  |   00000000:25:00.0 Off |                  N/A |
| 50%   38C    P8             16W /  105W |   19629MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 3060        On  |   00000000:2F:00.0 Off |                  N/A |
|  0%   36C    P8              9W /  105W |       4MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 3080        On  |   00000000:30:00.0 Off |                  N/A |
| 50%   35C    P8              9W /  105W |   18895MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 3060        On  |   00000000:31:00.0 Off |                  N/A |
|  0%   33C    P8             10W /  105W |       4MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           79030      C   /usr/bin/ollama                       18830MiB |
|    1   N/A  N/A           79030      C   /usr/bin/ollama                       19620MiB |
|    3   N/A  N/A           79030      C   /usr/bin/ollama                       18886MiB |
+-----------------------------------------------------------------------------------------+ 

$ ollama ps
NAME                       ID              SIZE     PROCESSOR    CONTEXT    UNTIL
ministral-3_14b_q8:206k    932e1d757f66    60 GB    100% GPU     206849     59 minutes from now

$ nvidia-smi
Wed Dec  3 17:44:31 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.105.08             Driver Version: 580.105.08     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:23:00.0 Off |                  N/A |
| 44%   37C    P2             95W /  105W |   14423MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3080        On  |   00000000:25:00.0 Off |                  N/A |
| 50%   39C    P2             90W /  105W |   16337MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 3060        On  |   00000000:2F:00.0 Off |                  N/A |
|  0%   36C    P2             27W /  105W |   10955MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 3080        On  |   00000000:30:00.0 Off |                  N/A |
| 50%   39C    P2             81W /  105W |   16339MiB /  20480MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 3060        On  |   00000000:31:00.0 Off |                  N/A |
|  0%   34C    P2             28W /  105W |     113MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           81364      C   /usr/bin/ollama                       14414MiB |
|    1   N/A  N/A           81364      C   /usr/bin/ollama                       16328MiB |
|    2   N/A  N/A           81364      C   /usr/bin/ollama                       10946MiB |
|    3   N/A  N/A           81364      C   /usr/bin/ollama                       16330MiB |
|    4   N/A  N/A           81364      C   /usr/bin/ollama                         104MiB |
+-----------------------------------------------------------------------------------------+

The allocation went from 3 GPUs to 5, and the last one holding just 104MiB is especially concerning.


@dabe-19 commented on GitHub (Dec 4, 2025):

I made this comment on a similar thread. I don't have quite the same capacity as you, but if you're expecting total VRAM usage to be close to your full GPU capacity, this might be related. It was certainly preventing me from running any of the new, smaller models at a ctx of 4096, and lowering it to 2048 didn't even help.

https://github.com/ollama/ollama/issues/13315#issuecomment-3609968353


@crwsolutions commented on GitHub (Dec 4, 2025):

I have the same problem, I think. This is the behavior out of the box:

C:\Users\yes>ollama run ministral-3:14b
>>> Hi
HelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHelloHello

>>> /bye

C:\Users\yes>ollama --version
ollama version is 0.13.1

My PC: Windows 11, nvidia 5060 TI + nvidia 3060


@dan-and commented on GitHub (Dec 8, 2025):

Fixed with ollama 0.13.2-rc2.


Reference: github-starred/ollama#55306