[GH-ISSUE #13180] feat: Support gpt-image-1 from OpenAI new Image gen model #16834

Closed
opened 2026-04-19 22:39:43 -05:00 by GiteaMirror · 17 comments

Originally created by @sFritsch09 on GitHub (Apr 23, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/13180

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

To support the gpt-image-1 model for image generation with a chosen quality and corresponding size, you need to pass parameters such as the image dimensions and a quality setting to the OpenAI Images API.

The example below shows how to call gpt-image-1 with these parameters using the official OpenAI Python SDK.


Example: Calling GPT-Image-1 with Quality and Size Parameters

from openai import OpenAI
import base64
client = OpenAI()

result = client.images.generate(
    model="gpt-image-1",
    prompt="Draw a 2D pixel art style sprite sheet of a tabby gray cat",
    size="1024x1024",
    background="transparent",
    quality="high",
)

# The SDK returns a typed response; read the base64 payload of the first image
image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("sprite.png", "wb") as f:
    f.write(image_bytes)

Notes:

Customize Image Output
You can configure the following output options (combined in the sketch after this list):

Size: Image dimensions (e.g., 1024x1024, 1024x1536)
Quality: Rendering quality (e.g., low, medium, high)
Format: File output format (e.g., png, jpeg, webp)
Compression: Compression level (0-100%) for JPEG and WebP formats
Background: Transparent or opaque
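
A minimal sketch combining these options; the output_format and output_compression parameter names follow the image-generation guide linked under Model Docs, so treat them as illustrative if your SDK version differs:

from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="gpt-image-1",
    prompt="A watercolor lighthouse at dusk",
    size="1024x1536",
    quality="medium",
    output_format="webp",     # file output format: png, jpeg, or webp
    output_compression=80,    # 0-100, applies to JPEG and WebP only
)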

Model Docs:

https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1

https://platform.openai.com/docs/models/gpt-image-1

Desired Solution you'd like

Support gpt-image-1

Alternatives Considered

No response

Additional Context

No response


@spammenotinoz commented on GitHub (Apr 24, 2025):

I uploaded a pipe, seems to do the job for both image creation and editing.
https://openwebui.com/f/spammenot/gpt_image_1

But please review the code, as I can't code.
Edit: v0.3.0 updated to be non-blocking


@coskunm commented on GitHub (Apr 24, 2025):

> I uploaded a pipe, seems to do the job for both image creation and editing. https://openwebui.com/f/spammenot/gpt_image_1
>
> But please review the code, as I can't code. Edit: v0.3.0 updated to be non-blocking

Is it working? I tried it, but it is not generating an image. Also, I didn't understand the proxy variables. Could you share an example of the valves screen?


@spammenotinoz commented on GitHub (Apr 24, 2025):

> > I uploaded a pipe, seems to do the job for both image creation and editing. https://openwebui.com/f/spammenot/gpt_image_1
> > But please review the code, as I can't code. Edit: v0.3.0 updated to be non-blocking
>
> Is it working? I tried it, but it is not generating an image. Also, I didn't understand the proxy variables. Could you share an example of the valves screen?

Like GPT-4.1, you do need to be a verified organisation to use this model.

Here is a sample screenshot of my settings; proxy etc. left as default:
Image

Yes it works

Image


@kristaller486 commented on GitHub (Apr 24, 2025):

Unfortunately, it doesn't work if you need to edit the generated image.


@spammenotinoz commented on GitHub (Apr 24, 2025):

> Unfortunately, it doesn't work if you need to edit the generated image.

As above, it works for me. Do you have an example I can check?
Note: only png, webp, or jpg files under 25MB are currently supported by OpenAI.

Otherwise, I wonder if it works for me because I hard-coded client-side image compression at 1024x1024.
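
A minimal sketch of that kind of client-side preprocessing, using Pillow; the 1024x1024 target and the PNG re-encode are illustrative, not the pipe's exact code:

from io import BytesIO

from PIL import Image


def compress_for_edit(image_bytes: bytes, max_side: int = 1024) -> bytes:
    # Downscale so the longest side is at most max_side, then re-encode as PNG
    img = Image.open(BytesIO(image_bytes))
    img.thumbnail((max_side, max_side))
    buf = BytesIO()
    img.save(buf, format="PNG")
    data = buf.getvalue()
    if len(data) > 25 * 1024 * 1024:  # OpenAI's documented 25MB limit
        raise ValueError("Image still exceeds 25MB after compression")
    return data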


@danieldilly commented on GitHub (Apr 24, 2025):

After adding this pipeline function, Open WebUI doesn't load for me anymore. I just get a black background. How can I fix this? This was the first function I ever tried using.

Image


@danieldilly commented on GitHub (Apr 24, 2025):

I removed the container and re-added it and it still doesn't work.

2025-04-24 12:06:37   File "/app/backend/open_webui/utils/models.py", line 55, in get_all_base_models
2025-04-24 12:06:37     function_models = await get_function_models(request)
2025-04-24 12:06:37                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-04-24 12:06:37   File "/app/backend/open_webui/functions.py", line 94, in get_function_models
2025-04-24 12:06:37     for p in sub_pipes:
2025-04-24 12:06:37 TypeError: 'coroutine' object is not iterable

Can I not use Open WebUI ever again now or what?

Update: I had to remove the volume and I lost all my chats and prompts and everything, but at least it's working again.

Update: Tried the function again, same result. After enabling it I just get a black screen when loading Open WebUI. It totally breaks it for me. Thanks for trying, though.
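
For context, the traceback points at the pipe's async pipes() method: the failing loop iterates its return value without awaiting it, so the coroutine object itself is what reaches the for loop. A hedged workaround, assuming the installed Open WebUI version does not await pipes(), is to make that method synchronous:

from typing import List


class Pipe:
    # ... valves, __init__, and the rest of the pipe unchanged ...

    def pipes(self) -> List[dict]:
        # Returning a plain list means callers that iterate the result
        # directly (without awaiting) no longer see a coroutine object
        return [{"id": "gpt-image-1", "name": "GPT Image 1"}]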


@barnabehvrd commented on GitHub (Apr 24, 2025):

I do not have the same issue as @danieldilly with v0.3.1. Have you tried this version?

However, I have another issue:

Whenever I ask for an image generation, the plugin will also edit the image it has just generated, resulting in ~$0.02 per generation for a low-quality 1024x1024 image, due to several API calls with a lot of input tokens from the editing.

cf. https://files.voltis.cloud/S7Tqu883bBc1UWqBCYcl8eG6q37yQKmf.mp4

I'm also happy to share the little dinosaur with you:
Image

Finally, @spammenotinoz, could you create a GitHub repo so we could move the issue to it, instead of talking in this issue?


@spammenotinoz commented on GitHub (Apr 25, 2025):

> I do not have the same issue as @danieldilly with v0.3.1. Have you tried this version?
>
> However, I have another issue:
>
> Whenever I ask for an image generation, the plugin will also edit the image it has just generated, resulting in ~$0.02 per generation for a low-quality 1024x1024 image, due to several API calls with a lot of input tokens from the editing.
>
> cf. https://files.voltis.cloud/S7Tqu883bBc1UWqBCYcl8eG6q37yQKmf.mp4
>
> I'm also happy to share the little dinosaur with you: Image
>
> Finally, @spammenotinoz, could you create a GitHub repo so we could move the issue to it, instead of talking in this issue?

Uploaded version 0.3.2, logic change: the edit function is only invoked when the latest user message (prompt) contains an image.
Removed proxy support (poor implementation; it was impacting reliability).

Long weekend here; I will think about adding a repo, but as mentioned above I can't code. The issue here is that, unlike ChatGPT, to edit you need to call a different endpoint. You're not actually conversing with an LLM.
i.e.: I am not coding to detect whether the user wants to edit an existing image, just making the assumption that if there is an image in the prompt, invoke the EDIT function/endpoint.
The limitation being that if you want to edit an image returned by the model, you need to paste it in.

I can't code; this was really just a quick implementation, but overall it has been working for me.
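
In code terms, the 0.3.2 gating amounts to something like the sketch below (simplified, not the pipe verbatim; the full listing appears later in this thread):

def dispatch(pipe, messages: list, model_id: str, n: int, size: str, quality: str):
    # Reuse the pipe's own prompt/image extraction from the latest user message
    prompt, images = pipe.convert_message_to_prompt(messages)
    if images:
        # The message carries at least one image: call the edit endpoint
        return pipe.edit_image(
            base64_images=images, prompt=prompt, model=model_id,
            n=n, size=size, quality=quality,
        )
    # No image attached: plain text-to-image generation
    return pipe.generate_image(
        prompt=prompt, model=model_id, n=n, size=size, quality=quality,
    )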


@JiangNanGenius commented on GitHub (Apr 26, 2025):

Thank you for your helpful code! I have a suggestion: could you add an option to modify the base URL? This would be especially useful for users outside the US, as it allows them to specify different API servers or proxy addresses. Hope you can consider this—thank you!


@JiangNanGenius commented on GitHub (Apr 26, 2025):

"""
title: OpenAI Image Generator (GPTo\GPT-image-1)
description: Quick Pipe to enable image creation and editing with gpt-image-1
author: MorningBean.ai
author_url: https://morningbean.ai
funding_url: FREE
version: 0.3.2
license: MIT
requirements: typing, pydantic, openai
environment_variables:
disclaimer: This pipe is provided as is without any guarantees.
Please ensure that it meets your requirements.
0.3.2 Logic fix to only invoke editing when the latest user message (prompt) contains an image.
0.3.0 BugFix move to Non-Blocking
"""

import json
import random
import base64
import asyncio
import re
import tempfile
import os
from typing import List, AsyncGenerator, Callable, Awaitable
from pydantic import BaseModel, Field
from openai import OpenAI

class Pipe:
    class Valves(BaseModel):
        OPENAI_API_KEYS: str = Field(
            default="", description="OpenAI API Keys, comma-separated"
        )
        IMAGE_NUM: int = Field(default=2, description="Number of images (1-10)")
        IMAGE_SIZE: str = Field(
            default="1024x1024",
            description="Image size: 1024x1024, 1536x1024, 1024x1536, auto",
        )
        IMAGE_QUALITY: str = Field(
            default="auto", description="Image quality: high, medium, low, auto"
        )
        MODERATION: str = Field(
            default="auto", description="Moderation strictness: auto (default) or low"
        )
        BASE_URL: str = Field(
            # The SDK appends paths like /images/generations, so the /v1
            # suffix is required; without it the API returns 404
            default="https://api.openai.com/v1",
            description="Custom base URL for OpenAI",
        )

    def __init__(self):
        self.type = "manifold"
        self.name = "ChatGPT: "
        self.valves = self.Valves()
        self.emitter: Callable[[dict], Awaitable[None]] | None = None

    def _get_proxy_url(self) -> str | None:
        # Proxy logic has been removed, but you can customize return here if needed
        return None

    async def emit_status(self, message: str = "", done: bool = False):
        if self.emitter:
            await self.emitter(
                {"type": "status", "data": {"description": message, "done": done}}
            )

    async def pipes(self) -> List[dict]:
        return [{"id": "gpt-image-1", "name": "GPT Image 1"}]

    def convert_message_to_prompt(self, messages: List[dict]) -> tuple[str, List[dict]]:
        for msg in reversed(messages):
            if msg.get("role") != "user":
                continue
            content = msg.get("content")
            # If content is a list (it can be mixed text and images)
            if isinstance(content, list):
                text_parts, image_data_list = [], []
                for part in content:
                    if part.get("type") == "text":
                        text_parts.append(part.get("text", ""))
                    elif part.get("type") == "image_url":
                        url = part.get("image_url", {}).get("url", "")
                        if url.startswith("data:"):
                            header, data = url.split(";base64,", 1)
                            mime = header.split("data:")[-1]
                            image_data_list.append({"mimeType": mime, "data": data})
                prompt = (
                    " ".join(text_parts).strip() or "Please edit the provided image(s)"
                )
                return prompt, image_data_list

            # If content is a plain string (search for embedded images in it)
            if isinstance(content, str):
                pattern = r"!\[[^\]]*\]\(data:([^;]+);base64,([^)]+)\)"
                matches = re.findall(pattern, content)
                image_data_list = [{"mimeType": m, "data": d} for m, d in matches]
                clean = (
                    re.sub(pattern, "", content).strip()
                    or "Please edit the provided image(s)"
                )
                return clean, image_data_list

        # Default case: No images found, return a default prompt
        return "Please edit the provided image(s)", []

    async def _run_blocking(self, fn: Callable, *args, **kwargs):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, lambda: fn(*args, **kwargs))

    async def generate_image(
        self,
        prompt: str,
        model: str,
        n: int,
        size: str,
        quality: str,
    ) -> AsyncGenerator[str, None]:
        await self.emit_status("🖼️ Generating image(s)...")
        key = random.choice(self.valves.OPENAI_API_KEYS.split(",")).strip()
        if not key:
            yield "Error: OPENAI_API_KEYS not set"
            return

        client = OpenAI(api_key=key, base_url=self.valves.BASE_URL)

        def _call_gen():
            try:
                return client.images.generate(
                    model=model,
                    prompt=prompt,
                    n=n,
                    size=size,
                    quality=quality,
                    moderation=self.valves.MODERATION,
                )
            except TypeError:
                # Fallback for older versions or extra params
                return client.images.generate(
                    model=model,
                    prompt=prompt,
                    n=n,
                    size=size,
                    quality=quality,
                    extra_body={"moderation": self.valves.MODERATION},
                )

        try:
            resp = await self._run_blocking(_call_gen)
            for i, img in enumerate(resp.data, 1):
                yield f"![image_{i}](data:image/png;base64,{img.b64_json})"
            await self.emit_status("🎉 Image generation successful", done=True)
        except Exception as e:
            yield f"Error during image generation: {e}"
            await self.emit_status("❌ Image generation failed", done=True)

    async def edit_image(
        self,
        base64_images: List[dict],
        prompt: str,
        model: str,
        n: int,
        size: str,
        quality: str,
    ) -> AsyncGenerator[str, None]:
        await self.emit_status("✂️ Editing image(s)...")
        key = random.choice(self.valves.OPENAI_API_KEYS.split(",")).strip()
        if not key:
            yield "Error: OPENAI_API_KEYS not set"
            return

        client = OpenAI(api_key=key, base_url=self.valves.BASE_URL)

        for idx, img_dict in enumerate(base64_images, start=1):
            tmp_path = None
            try:
                data = base64.b64decode(img_dict["data"])
                # Check size limit 25MB
                if len(data) > 25 * 1024 * 1024:
                    raise ValueError("Image exceeds 25MB limit")

                suffix = {
                    "image/png": ".png",
                    "image/jpeg": ".jpg",
                    "image/webp": ".webp",
                }.get(img_dict["mimeType"])
                if not suffix:
                    raise ValueError(f"Unsupported format: {img_dict['mimeType']}")

                tmpf = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
                tmpf.write(data)
                tmpf.close()
                tmp_path = tmpf.name

                def _call_edit():
                    with open(tmp_path, "rb") as f:
                        try:
                            return client.images.edit(
                                model=model,
                                image=f,
                                prompt=prompt,
                                n=n,
                                size=size,
                                extra_body={
                                    "quality": quality,
                                    "moderation": self.valves.MODERATION,
                                },
                            )
                        except TypeError:
                            return client.images.edit(
                                model=model,
                                image=f,
                                prompt=prompt,
                                n=n,
                                size=size,
                                moderation=self.valves.MODERATION,
                            )

                resp = await self._run_blocking(_call_edit)
                for i, out_img in enumerate(resp.data, start=1):
                    yield (
                        f"![edited_image_{idx}_{i}]"
                        f"(data:image/png;base64,{out_img.b64_json})"
                    )
            except Exception as e:
                yield f"Error editing image {idx}: {e}"
            finally:
                if tmp_path and os.path.exists(tmp_path):
                    os.unlink(tmp_path)

        await self.emit_status("🎉 Image edit successful", done=True)

    async def pipe(
        self,
        body: dict,
        __event_emitter__: Callable[[dict], Awaitable[None]] = None,
    ) -> AsyncGenerator[str, None]:
        self.emitter = __event_emitter__
        msgs = body.get("messages", [])

        # Hardcoded for demonstration, could be configurable or from the body
        model_id = "gpt-image-1"
        n = min(max(1, self.valves.IMAGE_NUM), 10)
        size = self.valves.IMAGE_SIZE
        quality = self.valves.IMAGE_QUALITY

        prompt, imgs = self.convert_message_to_prompt(msgs)

        if imgs:
            async for out in self.edit_image(
                base64_images=imgs,
                prompt=prompt,
                model=model_id,
                n=n,
                size=size,
                quality=quality,
            ):
                yield out
        else:
            async for out in self.generate_image(
                prompt=prompt,
                model=model_id,
                n=n,
                size=size,
                quality=quality,
            ):
                yield out

@JiangNanGenius commented on GitHub (Apr 26, 2025):

> > I do not have the same issue as @danieldilly with v0.3.1. Have you tried this version?
> >
> > However, I have another issue:
> >
> > Whenever I ask for an image generation, the plugin will also edit the image it has just generated, resulting in ~$0.02 per generation for a low-quality 1024x1024 image, due to several API calls with a lot of input tokens from the editing.
> >
> > cf. https://files.voltis.cloud/S7Tqu883bBc1UWqBCYcl8eG6q37yQKmf.mp4
> >
> > I'm also happy to share the little dinosaur with you: Image
> >
> > Finally, @spammenotinoz, could you create a GitHub repo so we could move the issue to it, instead of talking in this issue?
>
> Uploaded version 0.3.2, logic change: the edit function is only invoked when the latest user message (prompt) contains an image. Removed proxy support (poor implementation; it was impacting reliability).
>
> Long weekend here; I will think about adding a repo, but as mentioned above I can't code. The issue here is that, unlike ChatGPT, to edit you need to call a different endpoint. You're not actually conversing with an LLM. i.e.: I am not coding to detect whether the user wants to edit an existing image, just making the assumption that if there is an image in the prompt, invoke the EDIT function/endpoint. The limitation being that if you want to edit an image returned by the model, you need to paste it in.
>
> I can't code; this was really just a quick implementation, but overall it has been working for me.

I tried changing it to add base-URL settings; tested successfully.


@GaussianGuaicai commented on GitHub (Apr 26, 2025):

> > I do not have the same issue as @danieldilly with v0.3.1. Have you tried this version?
> >
> > However, I have another issue:
> >
> > Whenever I ask for an image generation, the plugin will also edit the image it has just generated, resulting in ~$0.02 per generation for a low-quality 1024x1024 image, due to several API calls with a lot of input tokens from the editing.
> >
> > cf. https://files.voltis.cloud/S7Tqu883bBc1UWqBCYcl8eG6q37yQKmf.mp4
> >
> > I'm also happy to share the little dinosaur with you: Image
> >
> > Finally, @spammenotinoz, could you create a GitHub repo so we could move the issue to it, instead of talking in this issue?
>
> Uploaded version 0.3.2, logic change: the edit function is only invoked when the latest user message (prompt) contains an image.
> Removed proxy support (poor implementation; it was impacting reliability).
>
> Long weekend here; I will think about adding a repo, but as mentioned above I can't code. The issue here is that, unlike ChatGPT, to edit you need to call a different endpoint. You're not actually conversing with an LLM.
> i.e.: I am not coding to detect whether the user wants to edit an existing image, just making the assumption that if there is an image in the prompt, invoke the EDIT function/endpoint.
> The limitation being that if you want to edit an image returned by the model, you need to paste it in.
>
> I can't code; this was really just a quick implementation, but overall it has been working for me.

The biggest downside is too many tokens when you have Title Generation or Autocomplete enabled, because image data is treated as text tokens too; you always get the millions-of-tokens warning.
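
One hedged mitigation, purely illustrative since Open WebUI's task-generation path is not controlled by the pipe, is to strip embedded base64 images from the chat history before it is reused for auxiliary prompts; the helper below mirrors the markdown-image regex the pipe already uses:

import re

# Markdown-embedded base64 image, the same shape the pipe extracts
DATA_URI = re.compile(r"!\[[^\]]*\]\(data:[^;]+;base64,[^)]+\)")


def strip_inline_images(text: str) -> str:
    # Replace inline images with a short placeholder so auxiliary prompts
    # (title generation, tags, autocomplete) don't pay for image tokens
    return DATA_URI.sub("[image omitted]", text)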


@spammenotinoz commented on GitHub (Apr 26, 2025):

Ahh, I use Title Generation and don't seem to have the issue. My titles use nano and cost about 2c per day.
I don't use autocomplete, but surely this bug would impact standard chats with images as well, since these features are also based on the user prompt, not the response?

Will look into this after the holidays.


@mazierovictor commented on GitHub (Apr 27, 2025):

How do I integrate this function into an assistant?


@MichaelMKenny commented on GitHub (Apr 28, 2025):

I've updated @spammenotinoz's pipe to fix it generating multiple images for each image provided when editing an image. It also no longer re-calls the OpenAI image-gen API when a failure happens. Here's my gist:

https://gist.github.com/MichaelMKenny/c6f07ce661165d1a84ef7b41ad08216b

Enjoy! And thank you @spammenotinoz for making the original code :)


@auggie246 commented on GitHub (Apr 28, 2025):

When I upload an image via Open WebUI, the body that gets passed into the pipe does not contain any images:

{
  "model": "gpt_image_1.gpt-image-1",
  "messages": [
    {
      "role": "user",
      "content": "### Task:\nGenerate 1-3 broad tags categorizing the main themes of the chat history, along with 1-3 more specific subtopic tags.\n\n### Guidelines:\n- Start with high-level domains (e.g. Science, Technology, Philosophy, Arts, Politics, Business, Health, Sports, Entertainment, Education)\n- Consider including relevant subfields/subdomains if they are strongly represented throughout the conversation\n- If content is too short (less than 3 messages) or too diverse, use only [\"General\"]\n- Use the chat's primary language; default to English if multilingual\n- Prioritize accuracy over specificity\n\n### Output:\nJSON format: { \"tags\": [\"tag1\", \"tag2\", \"tag3\"] }\n\n### Chat History:\n<chat_history>\nUSER: auggie is here\nASSISTANT: Error editing image 1: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}\n</chat_history>"
    }
  ],
  "stream": false
}
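
For what it's worth, the body shown above is one of Open WebUI's internal task prompts (tag generation), not the actual chat turn, so it arrives without attachments. A hedged guard, noting that the metadata key is an assumption about Open WebUI's request shape rather than a documented contract, could let the pipe skip image calls for such requests:

def is_task_request(body: dict) -> bool:
    # Heuristic: internal task prompts carry task metadata and/or start
    # with a "### Task:" preamble instead of a normal user message
    if body.get("metadata", {}).get("task"):
        return True
    msgs = body.get("messages", [])
    content = msgs[-1].get("content", "") if msgs else ""
    return isinstance(content, str) and content.startswith("### Task:")
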
Reference: github-starred/open-webui#16834