[GH-ISSUE #13571] KaLM-Embedding/KaLM-Embedding-Gemma3-12B-2511 can not generate embedding #55449

Closed
opened 2026-04-29 09:14:39 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @Genuifx on GitHub (Dec 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13571

KaLM-Embedding/KaLM-Embedding-Gemma3-12B-2511 is recognized as a chat model.

```
curl http://localhost:11434/api/embed -d '{
  "model": "KaLM-Embedding/KaLM-Embedding-Gemma3-12B-2511",
  "input": "Why is the sky blue?"
}'
{"error":"this model does not support embeddings"}
```

@anumukul commented on GitHub (Dec 26, 2025):

Hi, I’d like to take this issue and can deliver a fix within 24 hours.
I’ve worked on similar projects before and have relevant experience, so I should be able to handle this efficiently.


@rick-github commented on GitHub (Jan 4, 2026):

The model is missing metadata that identifies it as an embedding model:

```console
$ ollama show KaLM-Embedding/KaLM-Embedding-Gemma3-12B-2511
  Model
    architecture        gemma3
    parameters          11.8B
    context length      131072
    embedding length    3840
    quantization        BF16

  Capabilities
    completion
```
I imported the original model from https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511 and uploaded it to the ollama library at https://ollama.com/frob/KaLM-Embedding-Gemma3-12B-2511.

```console
$ ollama show frob/KaLM-Embedding-Gemma3-12B-2511
  Model
    architecture        gemma-embedding
    parameters          11.8B
    context length      131072
    embedding length    3840
    quantization        F16

  Capabilities
    embedding

$ curl -s http://localhost:11434/api/embed -d '{
  "model": "frob/KaLM-Embedding-Gemma3-12B-2511",
  "input": "Why is the sky blue?"
}' | jq -c '.embeddings[0]|.[:3] + ["..."] + .[-3:]'
[0.022915034,-0.016313432,-0.08286263,"...",0.0087134335,0.0029289108,0.012182114]
```
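The jq filter above only previews the first and last three components of the embedding vector. A tiny Python equivalent, purely illustrative:

```python
def preview(vec, k=3):
    """Return the first and last k components with an ellipsis,
    mirroring the jq filter `.[:3] + ["..."] + .[-3:]`."""
    return vec[:k] + ["..."] + vec[-k:]

# Example on a dummy 7-component vector:
print(preview([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))
# [0.1, 0.2, 0.3, '...', 0.5, 0.6, 0.7]
```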

@ashunaveed commented on GitHub (Jan 4, 2026):

How do you convert Hugging Face safetensors models into Ollama models for embedding use, keeping the embedding dimension provided by the model?
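One way to do this, sketched from rick-github's import of the original Hugging Face repository above (a hedged outline, not verified against this particular model; the local paths and model name are illustrative):

```
# Clone the safetensors repository from Hugging Face
git clone https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511

# Minimal Modelfile pointing at the safetensors directory
echo 'FROM ./KaLM-Embedding-Gemma3-12B-2511' > Modelfile

# Import it; ollama converts the safetensors weights during create
ollama create kalm-embedding -f Modelfile

# Verify: Capabilities should list "embedding", and the embedding
# length shown should match the model's native dimension
ollama show kalm-embedding
```

Whether the result is recognized as an embedding model depends on the architecture metadata the converter detects, which is exactly the problem this issue reports for the upstream upload.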


@tatankam commented on GitHub (Mar 2, 2026):

It doesn't work for me. I get:

```
curl http://localhost:11434/api/embeddings -d '{
    "model": "frob/KaLM-Embedding-Gemma3-12B-2511",
    "prompt": "Ciao, come va oggi?"
  }'
{"error":"llama runner process has terminated: GGML_ASSERT(ggml_nbytes(src0) \u003c= INT_MAX) failed"}
```

@rick-github commented on GitHub (Mar 2, 2026):

Set `OLLAMA_CONTEXT_LENGTH=8192` in the server environment.


@tatankam commented on GitHub (Mar 3, 2026):

Hi,
on my ollama server the value is already larger:
`Environment=OLLAMA_CONTEXT_LENGTH=32768`

but it still doesn't work.
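The `GGML_ASSERT` in the error fires when a single tensor exceeds INT_MAX bytes (about 2 GiB). A back-of-envelope check, assuming the offending tensor is a full n_ctx × n_ctx f32 attention-score matrix (an assumption, not confirmed in this thread), suggests why 32768 could still overflow while 8192 fits:

```python
# INT_MAX is the limit GGML_ASSERT(ggml_nbytes(src0) <= INT_MAX) enforces.
INT_MAX = 2**31 - 1  # 2,147,483,647 bytes (~2 GiB)

def attn_scores_bytes(n_ctx: int, elem_size: int = 4) -> int:
    """Bytes in a hypothetical n_ctx x n_ctx f32 attention-score tensor."""
    return n_ctx * n_ctx * elem_size

# Model default, tatankam's setting, and rick-github's suggestion:
for n_ctx in (131072, 32768, 8192):
    size = attn_scores_bytes(n_ctx)
    print(n_ctx, size, "over INT_MAX" if size > INT_MAX else "fits")
```

Under this assumption, 32768² × 4 bytes ≈ 4.29 GB still exceeds INT_MAX, while 8192² × 4 bytes ≈ 268 MB fits, which would explain why only the smaller context length avoids the assert.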

Reference: github-starred/ollama#55449