[GH-ISSUE #8719] Add /api/v1/embeddings endpoint for 100% OpenAI compatibility #30755

Closed
opened 2026-04-25 04:58:44 -05:00 by GiteaMirror · 23 comments
Owner

Originally created by @hdnh2006 on GitHub (Jan 21, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8719

Feature Request

Hello! I want to use OpenWebUI through the API Key provided by the interface.

![Image](https://github.com/user-attachments/assets/7a866677-1815-4b10-8d09-58d2bec922bc)

I use LiteLLM for 100% OpenAI compatibility, and for my use case I need users to be able to call an embeddings model registered in LiteLLM. However, the endpoint `/api/embeddings` doesn't exist under `/api/v1/`:

![Image](https://github.com/user-attachments/assets/1d3b5860-c0c6-4305-8605-9d9e9a2f5a74)

![Image](https://github.com/user-attachments/assets/304240f3-f586-426f-88c0-559c1dae2f39)

It would be wonderful if you could add it, so anyone can use OpenWebUI as a 100% OpenAI-compatible API.

Would that be possible?
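A minimal sketch of the call being requested, assuming the endpoint existed at `/api/v1/embeddings` and accepted the standard OpenAI request shape (the host, key, and model name here are placeholders, not taken from a working deployment):

```python
import json
import urllib.request


def build_embeddings_request(base_url: str, api_key: str, model: str, texts: list) -> urllib.request.Request:
    """Construct an OpenAI-style embeddings request against an Open WebUI host."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/embeddings",  # the endpoint requested in this issue
        data=payload,
        headers={
            # The API key issued by the Open WebUI interface (placeholder value)
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_embeddings_request(
    "http://localhost:8080", "sk-placeholder", "text-embedding-3-small", ["hello"]
)
# urllib.request.urlopen(req) would then return an OpenAI-shaped embeddings response
```

This mirrors what any OpenAI SDK would send under the hood, which is why the endpoint would make Open WebUI a drop-in base URL for existing clients.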


Is your feature request related to a problem? Please describe.
No, I just want OpenWebUI to be usable as an API as well.

Describe the solution you'd like
Create `/api/v1/embeddings` so that it calls the `/openai/embeddings` endpoint.

Describe alternatives you've considered
This endpoint should simply call the `/openai/embeddings` endpoint. That's it:

![Image](https://github.com/user-attachments/assets/d0336a3f-2fe6-4608-b665-e8d69a78d52a)

Additional context
In past versions (`0.4.*`) I was able to call the `/openai/embeddings` endpoint using my API key, but that's no longer allowed.
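The proposed pass-through can be sketched as a trivial route mapping. This is illustrative only: the route names below are taken from this issue's discussion, not from the actual Open WebUI codebase.

```python
# Map the public OpenAI-compatible path onto the internal /openai router,
# as proposed in this issue. Only the embeddings route is shown.
ROUTE_MAP = {
    "/api/v1/embeddings": "/openai/embeddings",
}


def resolve_internal_path(public_path: str) -> str:
    """Return the internal router path that should serve a public request."""
    if public_path not in ROUTE_MAP:
        raise LookupError(f"no internal route registered for {public_path}")
    return ROUTE_MAP[public_path]
```

In a real implementation the forwarding would also carry the request body and the authenticated user's credentials through to the internal router.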

GiteaMirror added the enhancement, good first issue, help wanted, core labels 2026-04-25 04:58:44 -05:00
Author
Owner

@jjamgmv commented on GitHub (Jan 27, 2025):

I have the same issue when using Continue.dev.

Author
Owner

@Aleksahek commented on GitHub (Jan 28, 2025):

I get this: "404 Client Error: Not Found for url: http://localhost:11434/api/embed"

Author
Owner

@hdnh2006 commented on GitHub (Jan 29, 2025):

> I get this: "404 Client Error: Not Found for url: http://localhost:11434/api/embed"

That's Ollama, bro. That's a different thing.

Author
Owner

@Mavyre commented on GitHub (Feb 8, 2025):

Great idea. That would allow using Open WebUI as an embedding provider for users, whether Ollama or OpenAI is used behind it. That would be a real improvement!

Author
Owner

@drupol commented on GitHub (Feb 8, 2025):

I also support this initiative.

Author
Owner

@tjbck commented on GitHub (Feb 13, 2025):

PR Welcome!

Author
Owner

@hdnh2006 commented on GitHub (Feb 14, 2025):

> PR Welcome!

Ok @tjbck, the problem here is that you say you are going to deprecate the route `openai/{path:path}`, as shown [here](https://github.com/open-webui/open-webui/blob/2017856791b666fac5f1c2f80a3bc7916439438b/backend/open_webui/routers/openai.py#L722).

Is there any chance to revert this deprecation? Then we could create the `/api/v1/embeddings` endpoint that calls `/openai/embeddings`.

Author
Owner

@hdnh2006 commented on GitHub (Feb 14, 2025):

No worries @tjbck, I have opened a PR. I hope this helps everyone!

PR: https://github.com/open-webui/open-webui/pull/10018

Author
Owner

@hdnh2006 commented on GitHub (Feb 17, 2025):

Since the PR was closed, I would like to ask if you could be more specific about what exactly you expect, @tjbck.

I would like to have this feature if possible, but I don't understand your comment about it:

> This does not implement the `/embeddings` endpoint as it should. Please refer to how other endpoints are implemented.

Originally posted by @tjbck in https://github.com/open-webui/open-webui/issues/10018#issuecomment-2660359753

Author
Owner

@tjbck commented on GitHub (Feb 17, 2025):

@hdnh2006 Your implementation was for OpenAI only. If you take a look at the `/chat/completion` endpoint, you'll see we have a compatibility layer that allows users to access models from other providers, including Ollama. The `/embeddings` endpoint should be no exception. Hope that helps!

Additionally, you should follow FastAPI conventions and adhere to other conventions used throughout the codebase. However, that's a discussion for another time—I can provide a more detailed code review when needed.
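The compatibility layer described above can be sketched as a dispatch on which backend owns the requested model. Everything here is hypothetical (registry contents, handler names); the point is only the shape of the routing, not the real implementation:

```python
# Hypothetical model registry: model id -> owning backend.
MODEL_REGISTRY = {
    "text-embedding-3-small": "openai",
    "nomic-embed-text": "ollama",
}


def dispatch_embeddings(model: str, texts: list) -> dict:
    """Route an embeddings request to the backend that owns the model,
    mirroring the compatibility layer described for /chat/completion."""
    backend = MODEL_REGISTRY.get(model)
    if backend == "openai":
        # Forward as-is: the upstream already speaks the OpenAI schema.
        return {"handler": "openai_router.embeddings", "model": model, "input": texts}
    if backend == "ollama":
        # A real implementation would also convert Ollama's native response
        # back into the OpenAI embeddings shape before returning it.
        return {"handler": "ollama_router.embed", "model": model, "input": texts}
    raise ValueError(f"unknown model: {model}")
```

The design point is that the caller never needs to know which provider serves the model; the registry decides.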

Author
Owner

@jayteaftw commented on GitHub (Feb 27, 2025):

Hey @tjbck, I just wanted to follow up, as I have added more functionality to the embedding API to allow prefixing queries and documents before sending them to an embedding API. This allows using instruction-based embedding models. If you could review https://github.com/open-webui/open-webui/pull/8594 and let me know if anything needs to be changed or modified, I would greatly appreciate it :)

Author
Owner

@segtio commented on GitHub (Mar 24, 2025):

+1

Author
Owner

@AndiMajore commented on GitHub (Apr 16, 2025):

Supporting this request big time! We have been using Ollama internally for many months, hosting many different LLMs and using OpenWebUI as a chat layer but also as an authentication layer. Until now there was always either a dedicated Ollama library available that allowed the use of the `/ollama/<api>` endpoint, or some workaround. But now I am reaching the limits with e.g. Neo4j, as no Ollama connector in the form of `apoc.ml.ollama.embedding` exists. For now I had to resort to using Ollama without OpenWebUI's authentication layer, since Ollama provides the `/v1` endpoint, which lets me use `apoc.ml.openai.embedding`. Bringing back the authentication layer, i.e. proxying the requests through OpenWebUI, would be amazing, so I really hope this `/ollama/v1` proxy will still happen!

Author
Owner

@hdnh2006 commented on GitHub (Apr 17, 2025):

I will work on this @AndiMajore. @tjbck disliked my first PR; I hope the next one will be suitable.

Author
Owner

@pranitl commented on GitHub (May 12, 2025):

@hdnh2006 @AndiMajore , how far along are you guys on this? I was going to submit a similar PR.

Author
Owner

@hdnh2006 commented on GitHub (May 13, 2025):

> @hdnh2006 @AndiMajore, how far along are you guys on this? I was going to submit a similar PR.

My PR was rejected, and I have not had time to dedicate to this. If there's anything you think I can help you with, please let me know.

Author
Owner

@pranitl commented on GitHub (May 15, 2025):

> > @hdnh2006 @AndiMajore, how far along are you guys on this? I was going to submit a similar PR.
>
> My PR was rejected and I have not had time to dedicate to this. If there's anything you consider I can help you with, please let me know.

Created this one, hopefully it can be merged: https://github.com/open-webui/open-webui/pull/13898

Author
Owner

@gaby commented on GitHub (May 15, 2025):

@pranitl Looking at your PR, how are file uploads managed?

  • Will it still use the Embedding Engine defined by OWUI?

I have Qdrant set up for embedding in OWUI; my OpenAI backend is vLLM, which supports `/v1/embeddings` as documented here: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#supported-apis

Trying to see if your PR makes this work out of the box.
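One way to check "out of the box" compatibility with an OpenAI-speaking backend like vLLM is to validate that responses match the OpenAI embeddings response shape. A minimal shape check, with a hand-written sample response (not real output from any backend):

```python
def looks_like_openai_embeddings(resp: dict) -> bool:
    """Shallow check that a response matches the OpenAI embeddings shape:
    {"object": "list", "data": [{"object": "embedding", "embedding": [...]}, ...], "model": ...}"""
    if resp.get("object") != "list" or not isinstance(resp.get("data"), list):
        return False
    return all(
        item.get("object") == "embedding" and isinstance(item.get("embedding"), list)
        for item in resp["data"]
    )


# Hand-crafted sample mimicking the documented OpenAI response format.
sample = {
    "object": "list",
    "model": "text-embedding-3-small",
    "data": [{"object": "embedding", "index": 0, "embedding": [0.1, -0.2]}],
}
```

If a proxy endpoint returns responses passing this check regardless of which backend served the model, existing OpenAI clients should work unchanged.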

Author
Owner

@pranitl commented on GitHub (May 15, 2025):

@gaby File uploads for RAG KBs should still use the existing RAG-specific embedding engine and model configuration; the logic for that lives primarily in backend/open_webui/routers/retrieval.py and related retrieval modules. The new /openai/v1/embeddings and /ollama/v1/embeddings endpoints are separate, general-purpose APIs. The goal is just to embed any text using the models specified in the API calls, resolved by the various backends (OpenAI or Ollama) through OpenWebUI's general model configuration for chat/API access. There's no automatic link where file uploads would start using these new v1 endpoints for their embedding needs, because the RAG pipeline is distinct.

Author
Owner

@hdnh2006 commented on GitHub (Jun 4, 2025):

A new [PR](https://github.com/open-webui/open-webui/pull/14667#issue-3118119576) that solves this has been opened.

Please check it.

Author
Owner

@Ithanil commented on GitHub (Jun 4, 2025):

@hdnh2006 Thanks for your great effort, hope this one goes through!

Author
Owner

@hdnh2006 commented on GitHub (Jun 5, 2025):

Thanks, @Ithanil and all the OpenWebUI developers. I hope this contribution can help everyone make OpenWebUI not only the best interface for AI interaction, but also an administration panel for any AI application. We need to make it more OpenAI-compatible, let's keep working hard 💪

Author
Owner

@Ithanil commented on GitHub (Jun 5, 2025):

@hdnh2006 Yes, it's incredibly useful that Open WebUI can also serve as an easily manageable API endpoint for end users.


Reference: github-starred/open-webui#30755