[GH-ISSUE #7759] [bug] text-generation-inference backend for VLMs does not see images #14878

Closed
opened 2026-04-19 21:08:07 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @lucyknada on GitHub (Dec 11, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/7759

issue: sending images to tgi returns nonsense, probably because the image is not sent in a way it expects it to?

image

ref: https://huggingface.co/docs/text-generation-inference/main/en/basic_tutorials/visual_language_models#hugging-face-hub-python-library

comparing that to the payload openwebui sends via devtools:

image

it seems they might expect the markdown embed? ![](base64-here) or are they unable to use images mid-context and force you to send "inputs" key?

just guessing / mindstorming what the issue could be, thanks!

Originally created by @lucyknada on GitHub (Dec 11, 2024). Original GitHub issue: https://github.com/open-webui/open-webui/issues/7759 issue: sending images to tgi returns nonsense, probably because the image is not sent in a way it expects it to? ![image](https://github.com/user-attachments/assets/dda71a48-44ed-4282-904c-6361a801095d) ref: https://huggingface.co/docs/text-generation-inference/main/en/basic_tutorials/visual_language_models#hugging-face-hub-python-library comparing that to the payload openwebui sends via devtools: ![image](https://github.com/user-attachments/assets/dfc65d4c-af42-4066-9995-5aa2d7e18380) it seems they might expect the markdown embed? `![](base64-here)` or are they unable to use images mid-context and force you to send "inputs" key? just guessing / mindstorming what the issue could be, thanks!
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#14878