[PR #15355] Feature/7170 image url cli private #77416

Open
opened 2026-05-05 10:05:10 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15355
Author: @ljluestc
Created: 4/6/2026
Status: 🔄 Open

Base: main ← Head: feature/7170-image-url-cli-private


📝 Commits (5)

  • 5a86ebc ps: report actual layer counts instead of percentage
  • 234f652 feat(cli): support image URLs in multimodal prompt parsing
  • de549aa cmd: support image URLs in multimodal prompt input
  • 73e3809 feat: add external image URL support for multimodal requests
  • 627054c fix tests

📊 Changes

17 files changed (+667 additions, -69 deletions)

View changed files

➕ PR_DESCRIPTION.md (+95 -0)
📝 api/types.go (+14 -0)
📝 cmd/cmd.go (+6 -11)
📝 cmd/interactive.go (+72 -21)
📝 cmd/interactive_test.go (+38 -0)
📝 docs/api.md (+3 -1)
📝 docs/capabilities/vision.mdx (+3 -0)
📝 docs/cli.mdx (+4 -0)
📝 docs/faq.mdx (+6 -6)
📝 docs/openapi.yaml (+6 -0)
📝 envconfig/config.go (+91 -20)
📝 llm/server.go (+10 -0)
➕ server/image_downloader.go (+192 -0)
➕ server/image_downloader_test.go (+57 -0)
📝 server/routes.go (+62 -8)
📝 server/sched.go (+6 -2)
📝 server/sched_test.go (+2 -0)

📄 Description

feat: support external image URLs for multimodal inputs (#7170)

Closes #7170

Summary

This PR adds first-class support for external image URLs in multimodal workflows, including CLI prompt parsing and server-side request handling.
Users can now provide http:// / https:// image URLs, and Ollama will download, validate, and attach those images to multimodal requests.

Problem

Previously, image inputs were effectively limited to local files (CLI) or pre-encoded image data (the base64 images field in API payloads).
For remote/server deployments, this made common workflows cumbersome because users had to manually download and encode images before inference.

What Changed

CLI

  • cmd/interactive.go
    • Added URL detection for image links in prompt text.
    • Added URL image loading with:
      • request timeout,
      • status validation,
      • content-type validation,
      • max-size protection.
    • Kept existing local file behavior.
    • Updated multimodal usage hint to include URL usage.
  • cmd/interactive_test.go
    • Added URL extraction and URL image loading tests.
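
The diff itself is not reproduced on this page, so the following is only a minimal sketch of what URL detection in prompt text could look like. The function name `extractImageURLs`, the regular expression, and the list of extensions are all assumptions made for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// imageURLPattern is a hypothetical rule: an http(s) URL that ends in a
// common image extension. The detection rule used in the PR may differ.
var imageURLPattern = regexp.MustCompile(`(?i)https?://[^\s"'<>]+\.(?:png|jpe?g|webp)\b`)

// extractImageURLs returns the prompt with image URLs removed and
// whitespace collapsed, plus the URLs that were found, in order.
func extractImageURLs(prompt string) (string, []string) {
	urls := imageURLPattern.FindAllString(prompt, -1)
	cleaned := imageURLPattern.ReplaceAllString(prompt, "")
	return strings.Join(strings.Fields(cleaned), " "), urls
}

func main() {
	text, urls := extractImageURLs("describe https://example.com/cat.png please")
	fmt.Printf("prompt=%q urls=%v\n", text, urls)
}
```

A pattern like this stops at the extension, so a query string after `.png` would be left behind in the prompt text; a real implementation would probably parse the candidate with `net/url` instead.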

API types

  • api/types.go
    • Added ImageURL type:
      • url
      • allow_http
    • Added image_urls to:
      • GenerateRequest
      • Message
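
To make the new request shape concrete, here is a rough sketch of how the fields listed above might serialize. Only the JSON names `url`, `allow_http`, and `image_urls` come from the description; the Go field names, the `omitempty` tags, the trimmed-down `GenerateRequest`, and the model name are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ImageURL carries the two fields named in the description.
// Go field names and omitempty are assumptions.
type ImageURL struct {
	URL       string `json:"url"`
	AllowHTTP bool   `json:"allow_http,omitempty"`
}

// GenerateRequest is a trimmed, hypothetical stand-in for the real
// api.GenerateRequest, which has many more fields.
type GenerateRequest struct {
	Model     string     `json:"model"`
	Prompt    string     `json:"prompt"`
	ImageURLs []ImageURL `json:"image_urls,omitempty"`
}

// encodeRequest marshals the request to its JSON wire form.
func encodeRequest(req GenerateRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	body, err := encodeRequest(GenerateRequest{
		Model:     "llava", // example model name only
		Prompt:    "What is in this image?",
		ImageURLs: []ImageURL{{URL: "https://example.com/cat.png"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Because the new field is optional, a request that omits `image_urls` serializes exactly as before, which is consistent with the backward-compatibility claim in this description.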

Server

  • server/image_downloader.go (new)
    • Added centralized image downloader with:
      • scheme validation (https by default, optional http via allow_http),
      • host allow-list enforcement,
      • download timeout,
      • max-size enforcement,
      • cache support.
  • server/routes.go
    • Restored/fixed GenerateHandler declaration.
    • Added generate request image_urls processing (processImageURLs).
    • Added chat message image_urls processing (processMessageImageURLs).
    • Feeds downloaded images into the existing images flow before inference.
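
The scheme and allow-list checks described for the downloader could be sketched roughly as below. `validateImageURL` is a hypothetical helper, and treating an empty allow-list as "no host restriction" is an assumption; the PR may define that case differently.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// validateImageURL is a hypothetical helper: https is always accepted,
// http only when allowHTTP is set, and the host must appear in
// allowedHosts when that list is non-empty (assumption).
func validateImageURL(raw string, allowHTTP bool, allowedHosts []string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	switch u.Scheme {
	case "https":
		// always permitted
	case "http":
		if !allowHTTP {
			return errors.New("http is not allowed for this image URL")
		}
	default:
		return fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return errors.New("missing host")
	}
	if len(allowedHosts) == 0 {
		return nil
	}
	for _, h := range allowedHosts {
		if host == strings.ToLower(strings.TrimSpace(h)) {
			return nil
		}
	}
	return fmt.Errorf("host %q is not in the allow-list", host)
}

func main() {
	allowed := []string{"example.com"}
	fmt.Println(validateImageURL("https://example.com/a.png", false, allowed))
	fmt.Println(validateImageURL("http://example.com/a.png", false, allowed))
	fmt.Println(validateImageURL("https://other.test/a.png", false, allowed))
}
```

A hostname check like this does not by itself stop a permitted name from resolving to an internal address, so reviewers may want to confirm how the downloader handles redirects and private IP ranges.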

Env config

  • envconfig/config.go
    • Added settings for image URL handling:
      • OLLAMA_IMAGE_URL_ENABLED
      • OLLAMA_IMAGE_URL_MAX_SIZE
      • OLLAMA_IMAGE_URL_TIMEOUT
      • OLLAMA_IMAGE_URL_ALLOWED_HOSTS
      • OLLAMA_IMAGE_URL_CACHE_DIR
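
As an illustration of how these settings might be read, the sketch below parses the five variables into a struct. Only the variable names come from the list above. The value formats (a byte count, a Go duration string, a comma-separated host list), the defaults, and the `imageURLConfig` type are assumptions, and the real envconfig package likely uses its own helpers.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// imageURLConfig is a hypothetical container for the new settings.
type imageURLConfig struct {
	Enabled      bool
	MaxSize      int64 // bytes (assumed unit)
	Timeout      time.Duration
	AllowedHosts []string
	CacheDir     string
}

// loadImageURLConfig reads settings through getenv so it can be tested
// without touching the process environment. Defaults are illustrative.
func loadImageURLConfig(getenv func(string) string) imageURLConfig {
	cfg := imageURLConfig{MaxSize: 20 << 20, Timeout: 30 * time.Second}
	if v, err := strconv.ParseBool(getenv("OLLAMA_IMAGE_URL_ENABLED")); err == nil {
		cfg.Enabled = v
	}
	if v, err := strconv.ParseInt(getenv("OLLAMA_IMAGE_URL_MAX_SIZE"), 10, 64); err == nil && v > 0 {
		cfg.MaxSize = v
	}
	if v, err := time.ParseDuration(getenv("OLLAMA_IMAGE_URL_TIMEOUT")); err == nil && v > 0 {
		cfg.Timeout = v
	}
	for _, h := range strings.Split(getenv("OLLAMA_IMAGE_URL_ALLOWED_HOSTS"), ",") {
		if h = strings.TrimSpace(h); h != "" {
			cfg.AllowedHosts = append(cfg.AllowedHosts, h)
		}
	}
	cfg.CacheDir = getenv("OLLAMA_IMAGE_URL_CACHE_DIR")
	return cfg
}

func main() {
	fmt.Printf("%+v\n", loadImageURLConfig(os.Getenv))
}
```

Unset or malformed values fall back to the defaults here rather than failing startup; whether the PR does the same is not stated in the description.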

Security and Operational Controls

  • URL scheme restrictions (https default, optional http only when explicitly allowed).
  • Host allow-list support.
  • Download timeout.
  • Maximum image size enforcement.
  • Content type checks for image data.
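
The timeout, size, and content-type controls listed above can be combined in a single bounded fetch. The following is a sketch under stated assumptions, not the PR's downloader: `fetchImage`, `limits`, `defaultLimits`, and `serveBytes` are hypothetical names, the limits are arbitrary, and sniffing the body with `http.DetectContentType` is one possible way to check the content type.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// limits groups the download bounds; the values are illustrative.
type limits struct {
	Timeout time.Duration
	MaxSize int64
}

var defaultLimits = limits{Timeout: 10 * time.Second, MaxSize: 10 << 20}

// fetchImage downloads rawURL with a timeout, rejects non-200 responses,
// enforces a size cap, and requires the bytes to sniff as an image.
func fetchImage(rawURL string, lim limits) ([]byte, error) {
	client := &http.Client{Timeout: lim.Timeout}
	resp, err := client.Get(rawURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	// Read one byte past the cap so an oversized body can be detected.
	data, err := io.ReadAll(io.LimitReader(resp.Body, lim.MaxSize+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > lim.MaxSize {
		return nil, fmt.Errorf("image exceeds %d bytes", lim.MaxSize)
	}
	// Sniff the bytes instead of trusting the Content-Type header alone.
	if ct := http.DetectContentType(data); !strings.HasPrefix(ct, "image/") {
		return nil, fmt.Errorf("unexpected content type %q", ct)
	}
	return data, nil
}

// serveBytes starts a local test server so the sketch runs offline.
func serveBytes(body []byte) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Write(body)
	}))
}

func main() {
	srv := serveBytes([]byte("\x89PNG\r\n\x1a\n")) // PNG signature only
	defer srv.Close()
	data, err := fetchImage(srv.URL, defaultLimits)
	fmt.Println(len(data), err)
}
```

Reading through `io.LimitReader` keeps memory bounded even when the server omits or misreports `Content-Length`, which is why the cap is applied to the body rather than to the header.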

Backward Compatibility

  • Backward compatible: Yes
  • Existing images behavior: Unchanged
  • Existing local-file CLI behavior: Unchanged
  • New fields/config are optional.

Validation

go test ./cmd -count=1
go test ./server -run TestImageDownloader -count=1
go test ./server -run TestDummyDoesNotExist -count=1

Change Type

  • Feature
  • CLI
  • API / server

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 10:05:10 -05:00
Reference: github-starred/ollama#77416