[GH-ISSUE #2965] Is it possible to support more embedding models in the future? #1821

Closed
opened 2026-04-12 11:52:45 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @wwjCMP on GitHub (Mar 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2965

The nomic-embed-text is great, but it does not support Chinese.

GiteaMirror added the model label 2026-04-12 11:52:45 -05:00

@jackleeforce commented on GitHub (Mar 7, 2024):

jina-embeddings-v2-base-zh is good, but I have no idea how to run it with ollama. I'm trying this: https://github.com/ollama/ollama/issues/327


@jackleeforce commented on GitHub (Mar 7, 2024):

OK, I have now run a customized embedding model, Dmeta-embedding-zh (https://huggingface.co/DMetaSoul/Dmeta-embedding-zh), successfully with ollama. Since ollama uses llama.cpp for inference, and llama.cpp supports BERT, which is the architecture of most embedding models, these are the steps:

  • Convert the Hugging Face model into a GGUF file:
1. git clone https://github.com/ggerganov/llama.cpp.git
2. cd llama.cpp
3. download the embedding model https://huggingface.co/DMetaSoul/Dmeta-embedding-zh into the folder llama.cpp/models
4. python3.11 convert-hf-to-gguf.py --outtype f32 ./models/Dmeta-embedding-zh --outfile ./models/Dmeta-embedding-zh/Dmeta-embedding-zh.gguf
5. Dmeta-embedding-zh.gguf is the only file we need for the next step.
  • Add the customized model into ollama, following the tutorial in import-from-gguf (https://github.com/ollama/ollama?tab=readme-ov-file#import-from-gguf)
  • Enjoy it!
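For reference, the import-from-gguf step linked above boils down to a one-line Modelfile pointing at the converted file (the exact relative path depends on where you put the GGUF; this is a sketch, not the tutorial verbatim):

```
FROM ./Dmeta-embedding-zh.gguf
```

followed by `ollama create dmeta-embedding-zh -f Modelfile` to register it locally (the model name `dmeta-embedding-zh` is just an example).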

@wwjCMP commented on GitHub (Mar 7, 2024):

> [quoting the Dmeta-embedding-zh conversion steps above]

[screenshot: Snipaste_2024-03-07_18-31-41]
I have an unexpected problem here.


@wwjCMP commented on GitHub (Mar 7, 2024):

> [quoting the Dmeta-embedding-zh conversion steps above]

Can the model you converted be used normally?


@jackleeforce commented on GitHub (Mar 8, 2024):

> [quoting the Dmeta-embedding-zh conversion steps above]
>
> [screenshot: Snipaste_2024-03-07_18-31-41] I have an unexpected problem here

After you create `example` from the `Modelfile`, there is no need to use `ollama pull`; `ollama pull` is for pulling models from the official repository. After `ollama create example -f Modelfile`, the model `example` is already in your local environment, so just use `ollama run example`:

[screenshot]

@jackleeforce commented on GitHub (Mar 8, 2024):

> [quoting the Dmeta-embedding-zh conversion steps above]
>
> Can the model you converted be used normally?

Yes, I integrated it into dify (https://github.com/langgenius/dify); it works perfectly~

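For context on what an integration like dify does with these embeddings: retrieval typically ranks documents by cosine similarity between the query vector and each document vector. A minimal sketch with made-up 4-dimensional vectors (real Dmeta-embedding-zh vectors are much longer):

```python
import math

def cosine_similarity(a, b):
    # cosine similarity = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embedding vectors, for illustration only
query = [0.1, 0.2, 0.3, 0.4]
doc_a = [0.1, 0.2, 0.3, 0.4]   # identical direction: similarity ~ 1.0
doc_b = [0.4, -0.3, 0.2, -0.1] # different direction: lower similarity

print(cosine_similarity(query, doc_a))
print(cosine_similarity(query, doc_b))
```

The document whose vector scores highest against the query is the one a RAG pipeline would retrieve first.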

@wwjCMP commented on GitHub (Mar 8, 2024):

> [quoting the Dmeta-embedding-zh conversion steps and the ollama create / ollama run explanation above]

At the beginning I also used the `ollama run` command, but it gave the following error:
Error: embedding models do not support chat

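That error is expected: an embedding model cannot be chatted with via `ollama run`; it is queried through ollama's embeddings endpoint instead. A minimal sketch against a local ollama server, assuming the model was created under the name `example` as in the steps above:

```shell
# Ask the locally running ollama server for an embedding vector.
# The response is JSON of the form {"embedding": [0.01, -0.23, ...]}.
curl http://localhost:11434/api/embeddings -d '{
  "model": "example",
  "prompt": "这是一个测试句子"
}'
```

This requires the ollama server to be running (`ollama serve`); the prompt text here is just an arbitrary Chinese test sentence.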

@wwjCMP commented on GitHub (Mar 8, 2024):

I think I got it

Reference: github-starred/ollama#1821