Mirror of https://github.com/open-webui/open-webui.git, synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #18671] issue: Image Generation Display Issue with OpenRouter.ai API (Multimodal Output) #34195
Originally created by @aiinsightec on GitHub (Oct 27, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18671
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
v0.6.34
Ollama Version (if applicable)
No response
Operating System
Ubuntu 22.04
Browser (if applicable)
Chrome
Confirmation
Expected Behavior
Images generated by OpenRouter.ai models should be displayed within the OpenWebUI chat interface, similar to how they would appear with OpenAI DALL-E or other image generation integrations.
Actual Behavior
Hi team,
I'm experiencing an issue with generated images in OpenWebUI when using models from OpenRouter.ai that support multimodal output (specifically image generation).
Problem Description:
When I configure OpenWebUI to use an OpenRouter.ai API key with image-capable models (e.g., google/gemini-2.5-flash-image-preview), the text responses are displayed correctly, but the generated images are not visible in the chat interface. I'm using a custom proxy that normalizes OpenRouter's multimodal responses into a format that should be compatible with OpenAI-like API structures.
Expected Behavior:
Images generated by OpenRouter.ai models should be displayed within the OpenWebUI chat interface, similar to how they would appear with OpenAI DALL-E or other image generation integrations.
Steps to Reproduce (if applicable):
Configure OpenWebUI with an OpenRouter.ai API key (or through a proxy).
Select an image generation capable model (e.g., google/gemini-2.5-flash-image-preview).
Send a prompt requesting an image (e.g., "Generate a beautiful sunset over mountains").
Observe that only text content (if any) is displayed, but no image appears.
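For reference, the request that reproduces this can be sketched as follows. This is a minimal sketch, not my exact setup: the `OPENROUTER_API_KEY` variable and the use of the `requests` library are my additions, and the `modalities` field follows OpenRouter's image-generation documentation.

```python
# Minimal reproduction sketch (assumption: image output is requested via the
# "modalities" field, per OpenRouter's image-generation documentation).
import json

payload = {
    "model": "google/gemini-2.5-flash-image-preview",
    "modalities": ["image", "text"],  # request image output alongside text
    "messages": [
        {"role": "user", "content": "Generate a beautiful sunset over mountains"},
    ],
}

print(json.dumps(payload, indent=2))

# Sending it (requires a valid key):
#   import os, requests
#   r = requests.post(
#       "https://openrouter.ai/api/v1/chat/completions",
#       headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
#       json=payload,
#   )
```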
Context and Details:
OpenRouter.ai Response Format: OpenRouter.ai returns image generation responses in the message.images field, where images is an array of objects. Each object contains an image_url field, which in turn has a url field containing a data:image/png;base64,... string. You can find their documentation here: OpenRouter Image Generation Documentation
Example of OpenRouter's image response structure:
```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I've generated a beautiful sunset image for you.",
        "images": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
            }
          }
        ]
      }
    }
  ]
}
```
Proxy Conversion (if applicable): My proxy converts binary responses from other models (and direct base64 responses from Gemini's inline_data) into this message.images structure, or into an OpenAI-style images structure (data[].b64_json). It also converts image URLs to base64 when the upstream API returns URLs instead of inline base64 data.
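A sketch of the kind of normalization described above (the function name and the strategy of embedding the images as markdown in `content` are my assumptions, not the proxy's actual code):

```python
# Hypothetical sketch: fold OpenRouter's message.images into the text content
# as markdown image embeds, so an OpenAI-style client can render them.
def normalize_openrouter_message(message: dict) -> dict:
    images = message.get("images") or []
    content = message.get("content") or ""
    for img in images:
        url = img.get("image_url", {}).get("url")
        if url:  # base64 data URL or a regular URL
            content += f"\n\n![generated image]({url})"
    normalized = dict(message)
    normalized["content"] = content
    normalized.pop("images", None)
    return normalized
```

For example, an assistant message with one base64 image comes back with the data URL appended to `content` as `![generated image](data:image/png;base64,...)` and the `images` field removed.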
Troubleshooting done:
Confirmed OpenRouter.ai API is returning valid image data (base64 URLs).
Checked network requests in browser developer tools; the responses coming into OpenWebUI contain the expected JSON structure with base64 image data.
Verified that the content field is present alongside the images field in the message object when images are generated.
Tested with various image prompts and models.
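The base64 validation mentioned above can be done with a small helper (hypothetical, not part of Open WebUI): it checks that the data URL decodes cleanly and starts with the PNG magic bytes.

```python
# Hypothetical helper to confirm a data URL carries a structurally valid PNG.
import base64

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def is_valid_png_data_url(url: str) -> bool:
    prefix = "data:image/png;base64,"
    if not url.startswith(prefix):
        return False
    try:
        raw = base64.b64decode(url[len(prefix):], validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return False
    return raw.startswith(PNG_MAGIC)
```

Note that `iVBORw0KGgo...`, the prefix seen in the example responses above, is exactly the base64 encoding of the PNG magic bytes.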
I suspect there might be a specific parsing requirement in OpenWebUI for handling multimodal output, especially from non-OpenAI APIs, or potentially a rendering issue with data:image/png;base64 URLs under certain conditions in the UI components.
Could you please provide guidance on the expected JSON structure for image generation responses for OpenWebUI, or suggest any specific configurations or known issues related to displaying images from custom API endpoints/proxies?
Thank you for your time and assistance!
Steps to Reproduce
.
Logs & Screenshots
.
Additional Information
No response
@outis151 commented on GitHub (Oct 27, 2025):
Seems like a duplicate of #16935