[GH-ISSUE #11244] Support for Baidu's ERNIE 4.5/VL #85092

Open
opened 2026-05-09 22:31:53 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @3unnycheung on GitHub (Jun 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11244

https://huggingface.co/collections/baidu/ernie-45-6861cd4c9be84540645f35c9

![Image](https://github.com/user-attachments/assets/08d68683-0fe7-42db-bcfd-e060f9465e3a)

![Image](https://github.com/user-attachments/assets/d0425f4c-19a8-46c5-bb1e-2a9f62cad5ea)
GiteaMirror added the model label 2026-05-09 22:31:53 -05:00

@dengcao commented on GitHub (Jul 3, 2025):

Error: unable to load model: https://huggingface.co/dengcao/ERNIE-4.5-0.3B-PT-GGUF


@Himanshu8881212 commented on GitHub (Jul 4, 2025):

I read the paper, where they mention that they have implemented 4-bit and 2-bit lossless quantization. If we can get those models into Ollama, it would be a start, for both the LLM and the VLM.


@Mugane commented on GitHub (Jul 6, 2025):

I prefer the reference model. We can quantize it on our own.


@fighter3005 commented on GitHub (Jul 10, 2025):

Also, the VL model would be very strong: vision input sets Ernie apart from DeepSeek, Qwen, Magistral, and the other reasoning models that don't support it, and puts it closer to SOTA models like Gemini 2.5 Flash, GPT-4.1, or o1.


@fighter3005 commented on GitHub (Jul 18, 2025):

Quick question: llama.cpp will support the non-VL versions soon, and possibly the VL version at some point. Should we wait for that and use those implementations, or should we implement model support for Ernie with the "new" model engine ourselves?


@3unnycheung commented on GitHub (Jul 19, 2025):

> Quick question: llama.cpp will support the non-VL versions soon, and possibly the VL version at some point. Should we wait for that and use those implementations, or should we implement model support for Ernie with the "new" model engine ourselves?

https://github.com/ggml-org/llama.cpp/releases/tag/b5924


@rick-github commented on GitHub (Aug 16, 2025):

```console
$ ollama -v
ollama version is 0.11.5-rc2
$ ollama run ernie-4.5:21b-a3b-q4_K_M hello
Hello! 😊 How can I assist you today? Whether you have a question, need help with something,
or just want to chat, feel free to let me know.
```

@fighter3005 commented on GitHub (Aug 16, 2025):

> $ ollama -v
> ollama version is 0.11.5-rc2
> $ ollama run ernie-4.5:21b-a3b-q4_K_M hello
> Hello! 😊 How can I assist you today? Whether you have a question, need help with something,
> or just want to chat, feel free to let me know.

Is that the llama.cpp backend? Sadly they don't have vision support yet. :(

Reference: github-starred/ollama#85092