[GH-ISSUE #12304] GGML_ASSERT Crash with Qwen3-Embedding on Kaby Lake CPU (i7-8809G) #54688

Closed
opened 2026-04-29 06:55:41 -05:00 by GiteaMirror · 3 comments

Originally created by @Yue-bin on GitHub (Sep 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12304

What is the issue?

Hi Ollama Team,

I'm experiencing a consistent server crash when using the hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:Q8_0 embedding model. The crash appears to be specific to the CPU architecture and the model used.

The issue is reproducible on both Ollama v0.11.10 and v0.11.11 with the same result. The logs indicate that Ollama loads the libggml-cpu-haswell.so backend, and the crash occurs on a Kaby Lake CPU.

  1. Environment
  • Operating System:

```bash
❯ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm
```

  • CPU Information:

```bash
❯ lscpu | grep -E "Model name|Architecture|Vendor ID"
Architecture:                         x86_64
Vendor ID:                            GenuineIntel
Model name:                           Intel(R) Core(TM) i7-8809G CPU @ 3.10GHz
```

  • GPU Information:
    The server is running in CPU-only mode.
  2. Steps to Reproduce

  1. Run Ollama v0.11.11 on a system with an Intel Core i7-8809G (Kaby Lake) processor.
  2. Pull the embedding model: ollama pull hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:Q8_0
  3. Send a basic request to the /api/embed endpoint using this model.
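The steps above can be sketched as follows (the model name and the /api/embed endpoint are from the report; the "hello world" input and default port are assumptions for illustration):

```shell
# Pull the model, then send a minimal embedding request to a locally
# running Ollama server (default port 11434). The input text is an
# arbitrary placeholder; any input reportedly triggers the crash.
ollama pull hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:Q8_0
curl -s http://localhost:11434/api/embed \
  -d '{"model": "hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:Q8_0", "input": "hello world"}'
```

On the affected hardware, this request crashes the server instead of returning an embeddings array.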

  3. Actual Behavior

The Ollama server crashes. The log shows a GGML_ASSERT(i01 >= 0 && i01 < ne01) failed error in ggml-cpu/ops.cpp, followed by a SIGABRT.

  4. Expected Behavior

The server should process the embedding request successfully.

  5. Workaround Found

This crash does not occur when using a different embedding model like mxbai-embed-large on the exact same hardware and Ollama version. This strongly suggests the bug is an interaction between the specific model (Qwen3-Embedding) and the ggml CPU backend used for this processor microarchitecture.
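As a side check (not part of the original report), the backend choice can be cross-checked against the CPU's advertised SIMD flags: the Haswell variant of the ggml CPU backend targets AVX2, which Kaby Lake supports (and Kaby Lake has no AVX-512), so libggml-cpu-haswell.so is the expected selection on this CPU:

```shell
# List the AVX/FMA feature flags the CPU advertises (Linux only).
# A Kaby Lake part is expected to show avx, avx2, and fma,
# but no avx512* flags.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^avx|^fma' | sort -u
```

This suggests the backend selection itself is correct, and the fault lies in the op implementation for this model rather than in dispatching to the wrong binary.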

  6. Log Output

ollama_log.txt (attached): https://github.com/user-attachments/files/22359419/ollama_log.txt

Because of my poor English, this report template was generated with AI assistance.

Relevant log output

See the attached file: https://github.com/user-attachments/files/22359419/ollama_log.txt

OS

Linux

GPU

No response

CPU

Intel

Ollama version

v0.11.11 and v0.11.10

GiteaMirror added the bug label 2026-04-29 06:55:41 -05:00

@rick-github commented on GitHub (Sep 16, 2025):

#12014


@pdevine commented on GitHub (Sep 17, 2025):

I'm going to close as a dupe. If you want to try qwen3 out in the Ollama engine (vs. the legacy llama.cpp implementation) #12301 is close to merging.


@Yue-bin commented on GitHub (Sep 18, 2025):

> I'm going to close as a dupe. If you want to try qwen3 out in the Ollama engine (vs. the legacy llama.cpp implementation) #12301 is close to merging.

thanks


Reference: github-starred/ollama#54688