[PR #10728] llm/api/runner: Add ability to generate multimodal embeddings via embeddings endpoints #23886

Open
opened 2026-04-19 17:16:21 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10728
Author: @ajroetker
Created: 5/16/2025
Status: 🔄 Open

Base: main ← Head: feat_multimodal-embeddings


📝 Commits (1)

  • 5b971ff Add ability to generate multimodal embeddings via embeddings endpoints

📊 Changes

5 files changed (+129 additions, -21 deletions)


📝 api/types.go (+4 -2)
📝 llm/server.go (+9 -8)
📝 runner/ollamarunner/runner.go (+87 -7)
📝 server/routes.go (+28 -3)
📝 server/sched_test.go (+1 -1)

📄 Description

Re: https://github.com/ollama/ollama/issues/5304

Explain the problem you are trying to solve, not what you are trying to do.
I'm trying to generate embeddings of PDFs so that I can compare the embeddings of webpage descriptions with the embeddings of their PDFs and merge the results.

Explain how the change will be tested.
I tested the change by adding examples to the API example clients, but I did not commit them because the existing examples looked intentionally minimal in scope.

Happy to change anything, add tests or address feedback but wanted to see if there was a reason this wasn't added already.

go run main.go foo.png
[-1.7418279647827148 1.9501290321350098 -1.991140604019165 3.5079853534698486 0.8471968173980713 -4.52871036529541 3.787987232208252 1.8196138143539429 -0.24654169380664825 -2.8415348529815674 -3.5892462730407715 -3.5571892261505127 0.5544070601463318 1.1104142665863037 -3.033186674118042 4.001904487609863

A simple test using llava can be done with...

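package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"

	"github.com/ollama/ollama/api"
)

func main() {
	imgData, err := os.ReadFile("foo.png")
	if err != nil {
		log.Fatal(err)
	}

	// Named baseURL so the variable does not shadow the net/url package.
	baseURL, err := url.Parse("http://localhost:11434")
	if err != nil {
		log.Fatal(err)
	}
	client := api.NewClient(baseURL, http.DefaultClient)

	// The Image field relies on this PR's changes to api/types.go.
	req := &api.EmbeddingRequest{
		Model: "llava",
		Image: api.ImageData(imgData),
	}

	ctx := context.Background()
	embedResponse, err := client.Embeddings(ctx, req)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(embedResponse.Embedding)
}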

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 17:16:21 -05:00
Reference: github-starred/ollama#23886