[GH-ISSUE #20099] feat: Need to Bypass Non-Multimodal LLM for ComfyUI Image Generation/Editing in open-webui #19085

Closed
opened 2026-04-20 01:24:42 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @freesunshine on GitHub (Dec 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20099

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

I am serving the Qwen 235b thinking model with vLLM. It is not a multimodal model and therefore cannot process images. To work around this, I configured two ComfyUI instances, one for image generation and one for image editing. The current issues are as follows:

  1. Image generation and editing are both handled by ComfyUI without involving Qwen. However, every time generation or editing finishes, Qwen still outputs a lot of irrelevant text, and I have to terminate it manually.

  2. When I select the image generation feature and try to upload a local image as input for image editing, open-webui reports an error stating that the model is not multimodal, and the process cannot continue. This prevents me from editing locally uploaded images, even though the image editing functionality is provided by ComfyUI and has nothing to do with Qwen. Is it possible to bypass the LLM for both input and output when the image generation feature is selected?

Is this module community-contributed, meaning I should not open an issue here?

Desired Solution you'd like

Bypass the LLM for both input and output when the image generation feature is selected.
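
A rough sketch of the requested behavior, to make the routing concrete. Everything here is hypothetical and for illustration only: `ChatRequest`, `ChatResponse`, `handle_request`, `image_backend`, and `llm` are made-up names, not open-webui's actual code or API. The assumption is that the image generation toggle is known before the multimodal check runs, so the request can go straight to the image backend (for example ComfyUI) and the LLM is never called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ChatRequest:
    prompt: str
    image_generation: bool = False  # hypothetical flag for the "image generation" toggle
    images: List[bytes] = field(default_factory=list)  # locally uploaded images


@dataclass
class ChatResponse:
    text: str = ""
    images: List[bytes] = field(default_factory=list)


def handle_request(
    req: ChatRequest,
    model_is_multimodal: bool,
    image_backend: Callable[[str, Optional[bytes]], bytes],  # stand-in for a ComfyUI call
    llm: Callable[[str], str],  # stand-in for the text model
) -> ChatResponse:
    # Requested behavior: when image generation is selected, skip the LLM
    # for both input and output, and skip the multimodal capability check.
    if req.image_generation:
        source = req.images[0] if req.images else None  # None means generate, otherwise edit
        return ChatResponse(images=[image_backend(req.prompt, source)])

    # Normal chat path: uploaded images still require a multimodal model.
    if req.images and not model_is_multimodal:
        raise ValueError("model is not multimodal and cannot accept images")
    return ChatResponse(text=llm(req.prompt))
```

With this ordering, an uploaded image plus the image generation toggle reaches the image backend even when the selected model is text-only, and no LLM text follows the result.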

Alternatives Considered

No response

Additional Context

No response


@owui-terminator[bot] commented on GitHub (Dec 22, 2025):

🔍 Similar Issues Found

I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:

  1. #14431 feat: Comfy UI Improvements to support Keywords, which opens the door for Audio and Video Generation.
    by digitalassassins • May 28, 2025

  2. #18058 issue: handle thinking for Qwen3-VL models
    by SlavikCA • Oct 05, 2025 • bug

  3. #16645 issue: Multimodal models cannot recognize larger-sized images
    by AXuanCreator • Aug 15, 2025 • bug

  4. #18381 feat: Qwen3-Next reasoning support
    by R3tr0ooo • Oct 17, 2025


💡 Tips:

  • If this is a duplicate, please consider closing this issue and adding any additional details to the existing one
  • If you found a solution in any of these issues, please share it here to help others

This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.


Reference: github-starred/open-webui#19085