[GH-ISSUE #8618] Support Janus-Pro-7b for vision models #31337

Open
opened 2026-04-22 11:42:31 -05:00 by GiteaMirror · 56 comments
Owner

Originally created by @franz101 on GitHub (Jan 27, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8618

Just announced and performing great with OCR
https://huggingface.co/deepseek-ai/Janus-Pro-7B

Originally created by @franz101 on GitHub (Jan 27, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/8618 Just announced and performing great with OCR https://huggingface.co/deepseek-ai/Janus-Pro-7B
GiteaMirror added the feature request label 2026-04-22 11:42:32 -05:00
Author
Owner

@skytodmoon commented on GitHub (Jan 28, 2025):

Mark +1

<!-- gh-comment-id:2617303549 --> @skytodmoon commented on GitHub (Jan 28, 2025): Mark +1
Author
Owner

@libing64 commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2617374513 --> @libing64 commented on GitHub (Jan 28, 2025): +1
Author
Owner

@kattatzu commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2617517827 --> @kattatzu commented on GitHub (Jan 28, 2025): +1
Author
Owner

@dengber commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2617932076 --> @dengber commented on GitHub (Jan 28, 2025): +1
Author
Owner

@random-zhu commented on GitHub (Jan 28, 2025):

Mark +1

<!-- gh-comment-id:2618165434 --> @random-zhu commented on GitHub (Jan 28, 2025): Mark +1
Author
Owner

@sakujor commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2618258583 --> @sakujor commented on GitHub (Jan 28, 2025): +1
Author
Owner

@DhairyaNxtgen commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2618851476 --> @DhairyaNxtgen commented on GitHub (Jan 28, 2025): +1
Author
Owner

@TheurgicDuke771 commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2619002437 --> @TheurgicDuke771 commented on GitHub (Jan 28, 2025): +1
Author
Owner

@philogicae commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2619479068 --> @philogicae commented on GitHub (Jan 28, 2025): +1
Author
Owner

@ImranR98 commented on GitHub (Jan 28, 2025):

Commenting "+1" sends an unnecessary email to everyone who is subscribed to the issue. Probably a better idea to just add a thumbs up to the original post.

<!-- gh-comment-id:2619487542 --> @ImranR98 commented on GitHub (Jan 28, 2025): Commenting "+1" sends an unnecessary email to everyone who is subscribed to the issue. Probably a better idea to just add a thumbs up to the original post.
Author
Owner

@edgett commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2619988996 --> @edgett commented on GitHub (Jan 28, 2025): +1
Author
Owner

@BrokenByteOfCode commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2620094803 --> @BrokenByteOfCode commented on GitHub (Jan 28, 2025): +1
Author
Owner

@movitecc commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2620203738 --> @movitecc commented on GitHub (Jan 28, 2025): +1
Author
Owner

@iammrbt commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2620266561 --> @iammrbt commented on GitHub (Jan 28, 2025): +1
Author
Owner

@cmheong commented on GitHub (Jan 28, 2025):

+1

<!-- gh-comment-id:2620286543 --> @cmheong commented on GitHub (Jan 28, 2025): +1
Author
Owner

@wwek commented on GitHub (Jan 29, 2025):

+1

<!-- gh-comment-id:2620503692 --> @wwek commented on GitHub (Jan 29, 2025): +1
Author
Owner

@OverStruck commented on GitHub (Jan 29, 2025):

+1

<!-- gh-comment-id:2620721155 --> @OverStruck commented on GitHub (Jan 29, 2025): +1
Author
Owner

@4austinpowers commented on GitHub (Jan 29, 2025):

+1

<!-- gh-comment-id:2620988388 --> @4austinpowers commented on GitHub (Jan 29, 2025): +1
Author
Owner

@deadprogram commented on GitHub (Jan 29, 2025):

How about also https://huggingface.co/deepseek-ai/Janus-Pro-1B for whoever has the correct setup also to import this, please.

<!-- gh-comment-id:2621491441 --> @deadprogram commented on GitHub (Jan 29, 2025): How about also https://huggingface.co/deepseek-ai/Janus-Pro-1B for whoever has the correct setup also to import this, please.
Author
Owner

@tobalo commented on GitHub (Jan 29, 2025):

+1

<!-- gh-comment-id:2622707185 --> @tobalo commented on GitHub (Jan 29, 2025): +1
Author
Owner

@nurena24 commented on GitHub (Jan 30, 2025):

+1

<!-- gh-comment-id:2623387602 --> @nurena24 commented on GitHub (Jan 30, 2025): +1
Author
Owner

@xindoreen commented on GitHub (Jan 30, 2025):

+1

<!-- gh-comment-id:2623443963 --> @xindoreen commented on GitHub (Jan 30, 2025): +1
Author
Owner

@toplinuxsir commented on GitHub (Jan 30, 2025):

+1

<!-- gh-comment-id:2623762112 --> @toplinuxsir commented on GitHub (Jan 30, 2025): +1
Author
Owner

@zytoh0 commented on GitHub (Jan 30, 2025):

Just announced and performing great with OCR https://huggingface.co/deepseek-ai/Janus-Pro-7B
Not just 7B but also 1B :)
https://huggingface.co/deepseek-ai/Janus-Pro-1B
https://huggingface.co/deepseek-ai/Janus-Pro-7B

<!-- gh-comment-id:2625205408 --> @zytoh0 commented on GitHub (Jan 30, 2025): > Just announced and performing great with OCR https://huggingface.co/deepseek-ai/Janus-Pro-7B Not just 7B but also 1B :) https://huggingface.co/deepseek-ai/Janus-Pro-1B https://huggingface.co/deepseek-ai/Janus-Pro-7B
Author
Owner

@MIC-BO commented on GitHub (Jan 31, 2025):

+1

<!-- gh-comment-id:2626021477 --> @MIC-BO commented on GitHub (Jan 31, 2025): +1
Author
Owner

@snt1017 commented on GitHub (Jan 31, 2025):

+1

<!-- gh-comment-id:2626741135 --> @snt1017 commented on GitHub (Jan 31, 2025): +1
Author
Owner

@jorgevespa commented on GitHub (Jan 31, 2025):

+1

<!-- gh-comment-id:2628322058 --> @jorgevespa commented on GitHub (Jan 31, 2025): +1
Author
Owner

@isaacasancheza commented on GitHub (Jan 31, 2025):

+1

<!-- gh-comment-id:2628323542 --> @isaacasancheza commented on GitHub (Jan 31, 2025): +1
Author
Owner

@wlsoft2006 commented on GitHub (Feb 1, 2025):

+1

<!-- gh-comment-id:2628652310 --> @wlsoft2006 commented on GitHub (Feb 1, 2025): +1
Author
Owner

@kongkang commented on GitHub (Feb 1, 2025):

+1

<!-- gh-comment-id:2628922380 --> @kongkang commented on GitHub (Feb 1, 2025): +1
Author
Owner

@jackwang2 commented on GitHub (Feb 3, 2025):

+1

<!-- gh-comment-id:2629649962 --> @jackwang2 commented on GitHub (Feb 3, 2025): +1
Author
Owner

@maddinek commented on GitHub (Feb 3, 2025):

+1

<!-- gh-comment-id:2630270989 --> @maddinek commented on GitHub (Feb 3, 2025): +1
Author
Owner

@jangrewe commented on GitHub (Feb 4, 2025):

Please STOP COMMENTING +1, use the 👍 reaction to the original post instead!

<!-- gh-comment-id:2633406462 --> @jangrewe commented on GitHub (Feb 4, 2025): ### Please **STOP COMMENTING +1**, use the 👍 reaction to the original post instead!
Author
Owner

@philogicae commented on GitHub (Feb 4, 2025):

Please STOP COMMENTING +1, use the 👍 reaction to the original post instead!

No.

Image

<!-- gh-comment-id:2633667160 --> @philogicae commented on GitHub (Feb 4, 2025): > ### Please **STOP COMMENTING +1**, use the 👍 reaction to the original post instead! No. ![Image](https://github.com/user-attachments/assets/b32f941a-4768-45f5-b9fd-77a3eac7e446)
Author
Owner

@jangrewe commented on GitHub (Feb 4, 2025):

No.

What kind of special idiot... individual are you? This is not about notifications, but about useless noise that adds nothing to the discussion.

<!-- gh-comment-id:2633673627 --> @jangrewe commented on GitHub (Feb 4, 2025): > No. What kind of special ~idiot~... _individual_ are you? This is not about notifications, but about useless noise that adds nothing to the discussion.
Author
Owner

@svaningelgem commented on GitHub (Feb 4, 2025):

What kind of special idiot are you?

Let's keep things professional, even though other people might annoy you...

What would be most useful to me is a guide on how to create & upload such model. I'd do this myself then...

<!-- gh-comment-id:2633678589 --> @svaningelgem commented on GitHub (Feb 4, 2025): > What kind of special idiot are you? Let's keep things professional, even though other people might annoy you... What would be most useful to me is a guide on how to create & upload such model. I'd do this myself then...
Author
Owner

@jangrewe commented on GitHub (Feb 4, 2025):

Let's keep things professional

fixed.

<!-- gh-comment-id:2633683111 --> @jangrewe commented on GitHub (Feb 4, 2025): > Let's keep things professional fixed.
Author
Owner

@dandv commented on GitHub (Feb 4, 2025):

Let's keep things professional

No, but seriously, what kind of people who can:

  • use GitHub
  • are interested in a CLI tool
  • to run inference locally

Don't already know to NOT SPAM WITH STUPDID +1s

AND

Keep doing it after commends advising very nicely NOT TO DO SO.

Are these bots? An influx of complete and utter GitHub n00bs?

<!-- gh-comment-id:2633709802 --> @dandv commented on GitHub (Feb 4, 2025): > Let's keep things professional No, but seriously, what kind of people who can: - use GitHub - are interested in a CLI tool - to run inference locally Don't already know to NOT SPAM WITH STUPDID `+1`s AND Keep doing it after commends advising [very nicely](https://github.com/ollama/ollama/issues/8618#issuecomment-2619487542) NOT TO DO SO. Are these bots? An influx of complete and utter GitHub n00bs?
Author
Owner

@vertago1 commented on GitHub (Feb 4, 2025):

Let's keep things professional

No, but seriously, what kind of people who can:

  • use GitHub
  • are interested in a CLI tool
  • to run inference locally

Don't already know to NOT SPAM WITH STUPDID +1s

AND

Keep doing it after commends advising very nicely NOT TO DO SO.

Are these bots? An influx of complete and utter GitHub n00bs?

They must not be devs or they would realize that kind of thing leads to turning off notifications for a thread and it going off the devs radar which is counterproductive if they really want this added.

<!-- gh-comment-id:2633802969 --> @vertago1 commented on GitHub (Feb 4, 2025): > > Let's keep things professional > > No, but seriously, what kind of people who can: > > * use GitHub > * are interested in a CLI tool > * to run inference locally > > Don't already know to NOT SPAM WITH STUPDID `+1`s > > AND > > Keep doing it after commends advising [very nicely](https://github.com/ollama/ollama/issues/8618#issuecomment-2619487542) NOT TO DO SO. > > Are these bots? An influx of complete and utter GitHub n00bs? They must not be devs or they would realize that kind of thing leads to turning off notifications for a thread and it going off the devs radar which is counterproductive if they really want this added.
Author
Owner

@cmheong commented on GitHub (Feb 5, 2025):

I got to this thread because a Google search directed me here, so this is probably not the place to post this comment, so my apologies in advance to the irritable ones on the mailing list. The reason everyone is here is we want to use Janus-Pro-7b from ollama. I get it, it is not supported as of now. Now I only got ollama last week so I am definitely a newbie. I simply asked Deepseek how to run Janus-Pro-7b-LM from ollama, and the instructions it gave actually worked. I am now running it from ollama. For those who are interested, the instructions are:
Download the gguf from https://huggingface.co/mradermacher/Janus-Pro-7B-LM-GGUF/blob/main/Janus-Pro-7B-LM.Q4_K_M.gguf
Copy it to your docker ollama container. I used 'docker cp'
Make the file Modelfile in the same directory containing the line:
./Janus-Pro-7B-LM.Q4_K_M.gguf
From your docker container, run the command
ollama create janus-pro-7b-lm -f Modelfile
Then run
ollama run janus-pro-7b-lm
That is all. Have fun with janus-pro-7b. I sure am.

<!-- gh-comment-id:2636982785 --> @cmheong commented on GitHub (Feb 5, 2025): I got to this thread because a Google search directed me here, so this is probably not the place to post this comment, so my apologies in advance to the irritable ones on the mailing list. The reason everyone is here is we want to use Janus-Pro-7b from ollama. I get it, it is not supported as of now. Now I only got ollama last week so I am definitely a newbie. I simply asked Deepseek how to run Janus-Pro-7b-LM from ollama, and the instructions it gave actually worked. I am now running it from ollama. For those who are interested, the instructions are: Download the gguf from https://huggingface.co/mradermacher/Janus-Pro-7B-LM-GGUF/blob/main/Janus-Pro-7B-LM.Q4_K_M.gguf Copy it to your docker ollama container. I used 'docker cp' Make the file Modelfile in the same directory containing the line: ./Janus-Pro-7B-LM.Q4_K_M.gguf From your docker container, run the command ollama create janus-pro-7b-lm -f Modelfile Then run ollama run janus-pro-7b-lm That is all. Have fun with janus-pro-7b. I sure am.
Author
Owner

@davrot commented on GitHub (Feb 5, 2025):

@cmheong Could you share the working Modelfile with us? Thanks!

<!-- gh-comment-id:2637905792 --> @davrot commented on GitHub (Feb 5, 2025): @cmheong Could you share the working Modelfile with us? Thanks!
Author
Owner

@jangrewe commented on GitHub (Feb 5, 2025):

@davrot Uhm... he says what you need to put in there? Those files are not rocket surgery, but just to make sure:

FROM  /path/to/Janus-Pro-7B-LM.Q4_K_M.gguf

For your reference: https://github.com/ollama/ollama/blob/main/docs/modelfile.md

<!-- gh-comment-id:2637921330 --> @jangrewe commented on GitHub (Feb 5, 2025): @davrot Uhm... he says what you need to put in there? Those files are not rocket surgery, but just to make sure: ``` FROM /path/to/Janus-Pro-7B-LM.Q4_K_M.gguf ``` For your reference: https://github.com/ollama/ollama/blob/main/docs/modelfile.md
Author
Owner

@jangrewe commented on GitHub (Feb 5, 2025):

@davrot Open WebUI != Ollama

<!-- gh-comment-id:2638212029 --> @jangrewe commented on GitHub (Feb 5, 2025): @davrot Open WebUI != Ollama
Author
Owner

@sealad886 commented on GitHub (Feb 6, 2025):

@jangrewe Can tell me how do you send images to "ollama run janus-pro-7b-lm" ?

Image

Multimodal Models are described in the main README.md, near the bottom.

If you're having issues with a specific non-Ollama tool/frontend that connects to the Ollama API, see the documentation for that tool separately.

<!-- gh-comment-id:2638339779 --> @sealad886 commented on GitHub (Feb 6, 2025): > [@jangrewe](https://github.com/jangrewe) Can tell me how do you send images to "ollama run janus-pro-7b-lm" ? > > ![Image](https://github.com/user-attachments/assets/64c426f4-a657-48a4-a9c4-3035d701c17b) [Multimodal Models](https://github.com/ollama/ollama?tab=readme-ov-file#multimodal-models) are described in the main README.md, near the bottom. If you're having issues with a specific non-Ollama tool/frontend that _connects to_ the Ollama API, see the documentation for that tool separately.
Author
Owner

@davrot commented on GitHub (Feb 6, 2025):

ollama run janus-pro-7b-lm "What do you see in the image /data_1/deepseek/kohlfahrt0015.jpg"
?**

I don't see an image, I see a question asking me to provide information about a specific image or data file that may contain
a unique identifier and name format, possibly related to "deepseek" and "kohlfahrt". However, there is no actual visual
content associated with this request. It seems like the text contains placeholder characters, which might be due to encoding
issues or incomplete instructions. If you could provide more context or clarify what you're trying to achieve by asking
about an image or data file based on a specific name and identifier, I'd be happy to assist further!

ollama run llama3.2-vision:11b "What do you see in the image /data_1/deepseek/kohlfahrt0015.jpg"
Added image '/data_1/deepseek/kohlfahrt0015.jpg'
The image shows a group of people walking together, with trees and buildings visible in the background.

  • A group of people are walking together.
    + There are approximately 10 individuals in the group.
    + They appear to be walking on a sidewalk or path.
    + Some of them are looking at something off-camera, while others seem to be engaged in conversation.
  • The group is made up of both men and women.
    + The men are wearing casual clothing such as jeans and t-shirts.
    + The women are also dressed casually, with some wearing dresses or skirts.
  • They are all wearing similar jackets or coats.
    + The jackets are dark-colored and appear to be waterproof or windproof.
    + Some of the individuals have their hands in their pockets, while others are holding onto bags or other items.

Overall, the image suggests that the group is on a casual outing or hike, possibly enjoying the outdoors together.

<!-- gh-comment-id:2639057214 --> @davrot commented on GitHub (Feb 6, 2025): > ollama run janus-pro-7b-lm "What do you see in the image /data_1/deepseek/kohlfahrt0015.jpg" ?** I don't see an image, I see a question asking me to provide information about a specific image or data file that may contain a unique identifier and name format, possibly related to "deepseek" and "kohlfahrt". However, there is no actual visual content associated with this request. It seems like the text contains placeholder characters, which might be due to encoding issues or incomplete instructions. If you could provide more context or clarify what you're trying to achieve by asking about an image or data file based on a specific name and identifier, I'd be happy to assist further! > ollama run llama3.2-vision:11b "What do you see in the image /data_1/deepseek/kohlfahrt0015.jpg" Added image '/data_1/deepseek/kohlfahrt0015.jpg' The image shows a group of people walking together, with trees and buildings visible in the background. * A group of people are walking together. + There are approximately 10 individuals in the group. + They appear to be walking on a sidewalk or path. + Some of them are looking at something off-camera, while others seem to be engaged in conversation. * The group is made up of both men and women. + The men are wearing casual clothing such as jeans and t-shirts. + The women are also dressed casually, with some wearing dresses or skirts. * They are all wearing similar jackets or coats. + The jackets are dark-colored and appear to be waterproof or windproof. + Some of the individuals have their hands in their pockets, while others are holding onto bags or other items. Overall, the image suggests that the group is on a casual outing or hike, possibly enjoying the outdoors together.
Author
Owner

@sealad886 commented on GitHub (Feb 6, 2025):

Hey @davrot thanks for pasting from the shell terminal there. If you could, if would be very helpful to use the Markdown tags for indicating scripting, etc, so that that output is a bit clearer in terms of what commands you gave and what the output was, vs your own exposition (if any--based on the text, I'm assuming that's 100% LLM generated).

As another resource, you can check out the Llama3.2-Vision blog post that has usage information for that model, or the LLaVA announcement post that uses a slightly different method to interact with the model.

Overall, CLI-based multimodal interaction doesn't appear to be consistent across models. All models should be able to accept an image through the API, it seems. Refer back to those blog posts (in particular the Llama3.2-Vision one) for links to the docs.

<!-- gh-comment-id:2639263399 --> @sealad886 commented on GitHub (Feb 6, 2025): Hey @davrot thanks for pasting from the shell terminal there. If you could, if would be very helpful to use the Markdown tags for indicating scripting, etc, so that that output is a bit clearer in terms of what commands you gave and what the output was, vs your own exposition (if any--based on the text, I'm assuming that's 100% LLM generated). As another resource, you can check out the [Llama3.2-Vision](https://ollama.com/blog/llama3.2-vision) blog post that has usage information for that model, or the [LLaVA announcement post](https://ollama.com/blog/vision-models) that uses a slightly different method to interact with the model. Overall, CLI-based multimodal interaction doesn't appear to be consistent across models. All models should be able to accept an image through the API, it seems. Refer back to those blog posts (in particular the Llama3.2-Vision one) for links to the docs.
Author
Owner

@sealad886 commented on GitHub (Feb 6, 2025):

It doesn't appear that the GGUF available from HF actually works.

input:

response: ollama.ChatResponse = ollama.chat(model=model, messages=[
    {
            'role': 'user',
            'contents': 'Tell me about this image.',
            'images': ['/path/to/local/image.webp']
    }
])

print(response.message.content):

 * Hello, World!</div>
        <p id="text-1" class="para">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget arcu quis sapien euismod bibendum.</p>
        <p id="text-2" class="para">Nunc et orci non libero luctus convallis nec vel quam. Aliquam erat volutpat. Suspendisse sit amet ante ut nunc tristique aliquet.</p>
      </div>
    </body>
  </html>

To be fair, I don't know if the webp format is supported in this model or in the conversion to what I assume is base64, so that may be one thing causing issues here. But suffice it to say that that response is a wildly inappropriate response to the query posed.

<!-- gh-comment-id:2639319157 --> @sealad886 commented on GitHub (Feb 6, 2025): It doesn't appear that the GGUF available from HF actually works. input: ```python response: ollama.ChatResponse = ollama.chat(model=model, messages=[ { 'role': 'user', 'contents': 'Tell me about this image.', 'images': ['/path/to/local/image.webp'] } ]) ``` `print(response.message.content)`: ```python * Hello, World!</div> <p id="text-1" class="para">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget arcu quis sapien euismod bibendum.</p> <p id="text-2" class="para">Nunc et orci non libero luctus convallis nec vel quam. Aliquam erat volutpat. Suspendisse sit amet ante ut nunc tristique aliquet.</p> </div> </body> </html> ``` To be fair, I don't know if the webp format is supported in this model or in the conversion to what I assume is base64, so that may be one thing causing issues here. But suffice it to say that that response is a wildly inappropriate response to the query posed.
Author
Owner

@davrot commented on GitHub (Feb 6, 2025):

It seems that llama.cpp is working on it:

Add supports for Janus vision encoder and projector [WIP] #11646
https://github.com/ggerganov/llama.cpp/pull/11646

<!-- gh-comment-id:2639543744 --> @davrot commented on GitHub (Feb 6, 2025): It seems that llama.cpp is working on it: > Add supports for Janus vision encoder and projector [WIP] #11646 > https://github.com/ggerganov/llama.cpp/pull/11646
Author
Owner

@ravenouse commented on GitHub (Feb 6, 2025):

From my understanding, the current GGUF models available on Hugging Face do not include the vision encoder and projector components—only the language model. This means that the Janus model lacks image understanding when running with Ollama.

I have submitted a PR to llama.cpp and am working on adding support for the Janus vision encoder and projector. The main challenge is the customized code used by the DeepSeek team, along with potential modifications to the clip model architecture in C++. As a result, this PR may take some time to complete.

<!-- gh-comment-id:2641170724 --> @ravenouse commented on GitHub (Feb 6, 2025): From my understanding, the current GGUF models available on Hugging Face do not include the vision encoder and projector components—only the language model. This means that the Janus model lacks image understanding when running with Ollama. I have submitted a [PR](https://github.com/ggerganov/llama.cpp/pull/11646) to llama.cpp and am working on adding support for the Janus vision encoder and projector. The main challenge is the customized code used by the DeepSeek team, along with potential modifications to the clip model architecture in C++. As a result, this PR may take some time to complete.
Author
Owner

@S4GU4R0 commented on GitHub (Feb 8, 2025):

Are these bots? An influx of complete and utter GitHub n00bs?

It seems like it, or they're literally children. Having worked with kids in an online context, enthusiasm sometimes comes across as spam and bot-like behavior.

<!-- gh-comment-id:2645864070 --> @S4GU4R0 commented on GitHub (Feb 8, 2025): > Are these bots? An influx of complete and utter GitHub n00bs? It seems like it, or they're literally children. Having worked with kids in an online context, enthusiasm sometimes comes across as spam and bot-like behavior.
Author
Owner

@Forevery1 commented on GitHub (Feb 14, 2025):

+1

<!-- gh-comment-id:2660131074 --> @Forevery1 commented on GitHub (Feb 14, 2025): +1
Author
Owner

@DarkAlchy commented on GitHub (Feb 16, 2025):

Janus-7B is the best vision model I have tried to date locally as I gave it an image, and what it described I fed to Flux. The output Flux dev gave back was almost a verbatim copy. It did mess up the woman (a silhouette) to be a man, but the room was almost identical even to the images on the walls. Jaw dropped. Llama-3.2-vision is not even close, and the other ones I used to use are rubbish in comparison.

<!-- gh-comment-id:2661575807 --> @DarkAlchy commented on GitHub (Feb 16, 2025): Janus-7B is the best vision model I have tried to date locally as I gave it an image, and what it described I fed to Flux. The output Flux dev gave back was almost a verbatim copy. It did mess up the woman (a silhouette) to be a man, but the room was almost identical even to the images on the walls. Jaw dropped. Llama-3.2-vision is not even close, and the other ones I used to use are rubbish in comparison.
Author
Owner

@byjlw commented on GitHub (Feb 16, 2025):

Janus-7B is the best vision model I have tried to date locally as I gave it an image, and what it described I fed to Flux. The output Flux dev gave back was almost a verbatim copy. It did mess up the woman (a silhouette) to be a man, but the room was almost identical even to the images on the walls. Jaw dropped. Llama-3.2-vision is not even close, and the other ones I used to use are rubbish in comparison.

How did you run it? Can you describe exact steps?
Every other comment suggests that image input doesn't work with Ollama

<!-- gh-comment-id:2661642410 --> @byjlw commented on GitHub (Feb 16, 2025): > Janus-7B is the best vision model I have tried to date locally as I gave it an image, and what it described I fed to Flux. The output Flux dev gave back was almost a verbatim copy. It did mess up the woman (a silhouette) to be a man, but the room was almost identical even to the images on the walls. Jaw dropped. Llama-3.2-vision is not even close, and the other ones I used to use are rubbish in comparison. How did you run it? Can you describe exact steps? Every other comment suggests that image input doesn't work with Ollama
Author
Owner

@DarkAlchy commented on GitHub (Feb 17, 2025):

I would like to use Ollama with it, but I used it in Comfy UI.

Image

<!-- gh-comment-id:2661743871 --> @DarkAlchy commented on GitHub (Feb 17, 2025): I would like to use Ollama with it, but I used it in Comfy UI. ![Image](https://github.com/user-attachments/assets/d581aabc-a710-4e58-895c-e88881e5c622)
Author
Owner

@snailfrying commented on GitHub (Feb 18, 2025):

https://ollama.com/gguf/DeepSeek-Janus-Pro-7B This website can be deployed, and the corresponding huggingface also has corresponding files that support ollama, as well as commands for using the model. However, I deployed it but did not use it properly。

<!-- gh-comment-id:2664776973 --> @snailfrying commented on GitHub (Feb 18, 2025): https://ollama.com/gguf/DeepSeek-Janus-Pro-7B This website can be deployed, and the corresponding huggingface also has corresponding files that support ollama, as well as commands for using the model. However, I deployed it but did not use it properly。
Author
Owner

@ghmole commented on GitHub (Mar 22, 2025):

+1

<!-- gh-comment-id:2745278268 --> @ghmole commented on GitHub (Mar 22, 2025): +1
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#31337