[PR #14739] server: handle NaN values in embedding responses #77108

Open
opened 2026-05-05 09:48:25 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14739
Author: @mvanhorn
Created: 3/9/2026
Status: 🔄 Open

Base: main ← Head: osc/14657-fix-embedding-nan


📝 Commits (2)

  • 1f90db4 server: handle NaN values in embedding responses
  • 6aa8c5a server: consolidate embedding validation to single ValidateEmbedding call

📊 Changes

5 files changed (+107 additions, -1 deletions)

View changed files

➕ llm/embedding_test.go (+79 -0)
📝 llm/server.go (+12 -0)
📝 runner/llamarunner/runner.go (+4 -0)
📝 runner/ollamarunner/runner.go (+7 -1)
📝 server/routes.go (+5 -0)

📄 Description

Fixes #14657

Summary

  • Added ValidateEmbedding function in the llm package to detect NaN/Inf values before JSON serialization
  • Applied validation in both runner-level embedding handlers (ollamarunner and llamarunner) where the crash originates
  • Also added NaN/Inf check in the deprecated EmbeddingsHandler endpoint which was missing the validation that EmbedHandler already had via normalize()
  • Returns a clear error message ("model produced invalid embedding values (NaN or Inf)") instead of crashing with json: unsupported value: NaN

Context

Go's encoding/json does not support NaN or Inf float values. When a model (e.g., bge-m3 with certain inputs) produces NaN values in its embeddings, the JSON encoder crashes with an unhelpful 500 error. The EmbedHandler path already catches this via the normalize() function, but the runner-level handlers and the deprecated EmbeddingsHandler did not have this protection.

The workaround OLLAMA_FLASH_ATTENTION=false mentioned in the issue suggests the root cause may be in flash attention computation, which could be investigated separately as a deeper fix.

Testing

  • Added TestValidateEmbedding with coverage for valid embeddings, NaN, positive/negative Inf, empty/nil slices, and edge cases
  • Existing TestNormalize continues to pass

This contribution was developed with AI assistance (Claude Code).


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 09:48:25 -05:00

Reference: github-starred/ollama#77108