[PR #14489] fs/ggml: prevent runtime panics on malformed or corrupt GGUF inputs #61400

Open
opened 2026-04-29 16:28:06 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14489
Author: @maralcbr
Created: 2/27/2026
Status: 🔄 Open

Base: main ← Head: fix/gguf-string-length-panic


📝 Commits (2)

  • 2f0d019 fs/ggml: prevent runtime panics on malformed GGUF inputs
  • 3145dde fs/ggml: move panic-prevention tests into dedicated TestDecodeGGUFCorruptInputs

📊 Changes

2 files changed (+153 additions, -1 deletions)


📝 fs/ggml/gguf.go (+26 -1)
📝 fs/ggml/gguf_test.go (+127 -0)

📄 Description

Summary

readGGUFString, readGGUFArray, and the tensor-info decoder all cast user-supplied uint64 values to int without first validating them. On 64-bit platforms any value larger than math.MaxInt64 wraps to a negative integer, which then causes make() to panic at runtime with:

panic: runtime error: makeslice: len out of range
    github.com/ollama/ollama/fs/ggml/gguf.go:361 +0xc5
github.com/ollama/ollama/fs/ggml.(*gguf).Decode ...
    github.com/ollama/ollama/fs/ggml/gguf.go:195 +0x45a

This was reproduced when loading multi-shard Unsloth UD-Q3_K_XL GGUFs (https://huggingface.co/unsloth/Qwen3.5-122B-A10B-GGUF) that contain mxfp4 tensors. Before tensor-info reads were separated from size-validation seeks (which now run in a dedicated post-processing loop), a misaligned file reader could land on raw weight data whose bytes decoded as an enormous string length. That crashed the entire Ollama server process instead of returning a descriptive error.
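The conversion behavior described above can be reproduced in isolation. The following is a minimal sketch, assuming a 64-bit platform; `allocFromLength` is a hypothetical helper and not a function from fs/ggml, and the `recover` is there only so the demonstration can report the panic instead of terminating.

```go
package main

import (
	"fmt"
	"math"
)

// allocFromLength mimics the unchecked pattern: convert the raw uint64
// to int, then allocate. The deferred recover turns the runtime panic
// into an error so the caller can inspect it.
// (Hypothetical helper for illustration; not part of fs/ggml.)
func allocFromLength(n uint64) (buf []byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return make([]byte, int(n)), nil
}

func main() {
	// One past math.MaxInt64: the smallest value that no longer fits
	// in a 64-bit signed int.
	n := uint64(math.MaxInt64) + 1

	// On a 64-bit platform the conversion yields a negative number.
	fmt.Println("int(n) is negative:", int(n) < 0)

	// make() with a negative length panics at runtime.
	_, err := allocFromLength(n)
	fmt.Println(err)
}
```

Relying on `recover` in production code would hide the root cause, which is why the change validates the value before converting it.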

Changes

  • readGGUFString: validate the raw uint64 length before casting to int; return a descriptive error if it exceeds 1 GiB (far beyond any legitimate GGUF string).
  • readGGUFArray: validate the element count before casting to int; return an error if it exceeds 2^32.
  • gguf.Decode tensor loop: validate that dims does not exceed GGML_MAX_DIMS (4) before allocating the shape slice, matching the GGUF spec (https://github.com/ggml-org/ggml/blob/master/docs/gguf.md).

All three fixes convert would-be runtime panics into well-formed errors that propagate up cleanly.
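The three checks listed above could look roughly like the sketch below. This is not the PR's actual diff: the constant names, function names, and error messages are assumptions made for illustration, and the length prefix is assumed to be an 8-byte little-endian value. Only the limits (1 GiB, 2^32, and 4 dimensions) come from the description.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Limits taken from the description; the constant names are hypothetical.
const (
	maxStringLen  uint64 = 1 << 30 // 1 GiB
	maxArrayCount uint64 = 1 << 32
	maxDims       uint32 = 4 // GGML_MAX_DIMS
)

// readString reads a length-prefixed string and rejects oversized
// lengths before the uint64 is ever converted to int.
func readString(r io.Reader) (string, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return "", err
	}
	if n > maxStringLen {
		return "", fmt.Errorf("string length %d exceeds maximum %d", n, maxStringLen)
	}
	buf := make([]byte, int(n)) // safe: n is at most 1 GiB, which fits in int
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// checkArrayCount rejects element counts above 2^32.
func checkArrayCount(n uint64) error {
	if n > maxArrayCount {
		return fmt.Errorf("array length %d exceeds maximum %d", n, maxArrayCount)
	}
	return nil
}

// checkDims rejects tensors that declare more than GGML_MAX_DIMS dimensions.
func checkDims(dims uint32) error {
	if dims > maxDims {
		return fmt.Errorf("tensor has %d dimensions, maximum is %d", dims, maxDims)
	}
	return nil
}

// decodeString is a convenience wrapper for in-memory input.
func decodeString(b []byte) (string, error) {
	return readString(bytes.NewReader(b))
}

func main() {
	// Eight 0xff bytes decode as a length of math.MaxUint64.
	corrupt := bytes.Repeat([]byte{0xff}, 8)
	_, err := decodeString(corrupt)
	fmt.Println(err)
	fmt.Println(checkArrayCount(1 << 40))
	fmt.Println(checkDims(5))
}
```

Comparing while the value is still a `uint64` is the key design choice: once the value has been converted to `int`, a wrapped negative number can no longer be distinguished from a legitimately small one by an upper-bound check alone.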

Tests

Three new sub-tests, first added to TestWriteGGUF and then moved into a dedicated TestDecodeGGUFCorruptInputs by the second commit:

  • oversized_string_length_returns_error: covers maxUint64, maxInt64+1 (the overflow boundary), and a value just over the 1 GiB cap
  • oversized_array_length_returns_error: maxUint64 element count
  • too_many_tensor_dims_returns_error: dims = 5 > GGML_MAX_DIMS
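The string-length boundary cases named above can be sketched as a small table-driven check. This is an illustration only and not the test code from the PR: `validateLen`, `lengthPrefix`, and `checkRejects` are hypothetical names, and the 8-byte little-endian prefix is an assumption about the on-disk encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

const maxStringLen uint64 = 1 << 30 // 1 GiB cap from the description

// validateLen is a hypothetical stand-in for the decoder's length check.
func validateLen(n uint64) error {
	if n > maxStringLen {
		return fmt.Errorf("string length %d exceeds maximum %d", n, maxStringLen)
	}
	return nil
}

// lengthPrefix encodes n as an 8-byte little-endian length prefix.
func lengthPrefix(n uint64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, n)
	return b
}

// checkRejects decodes the prefix and reports whether the length was
// rejected, and whether any panic occurred along the way.
func checkRejects(prefix []byte) (rejected, panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	n := binary.LittleEndian.Uint64(prefix)
	return validateLen(n) != nil, false
}

func main() {
	cases := []struct {
		name       string
		length     uint64
		wantReject bool
	}{
		{"maxUint64", math.MaxUint64, true},
		{"maxInt64+1", uint64(math.MaxInt64) + 1, true},
		{"just over 1 GiB", maxStringLen + 1, true},
		{"exactly 1 GiB", maxStringLen, false},
	}
	for _, c := range cases {
		rejected, panicked := checkRejects(lengthPrefix(c.length))
		ok := rejected == c.wantReject && !panicked
		fmt.Printf("%-16s rejected=%v panicked=%v ok=%v\n", c.name, rejected, panicked, ok)
	}
}
```

Including a value exactly at the cap alongside the value just over it guards against an off-by-one in the comparison.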

Test plan

  • go test ./fs/ggml/... passes (all new sub-tests return errors rather than panicking)
  • Existing TestWriteGGUF round-trip tests still pass
  • Smoke test with an Unsloth UD-Q3_K_XL merged GGUF: ollama create completes without server panic

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 16:28:06 -05:00

Reference: github-starred/ollama#61400