[PR #8301] [MERGED] Runner for Ollama engine #12682

Closed
opened 2026-04-13 00:06:51 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/8301
Author: @jessegross
Created: 1/4/2025
Status: Merged
Merged: 2/14/2025
Merged by: @jessegross

Base: main ← Head: jessegross/new_runner


📝 Commits (10+)

  • 5ae4f91 backend: Don't return an error on Close
  • a0c7212 backend: Consistently use int (vs. int64) for tensor shapes
  • 3ddf5c4 backend: Support graph computation that does not return an output
  • dc0edc2 backend: API to support full precision matmul
  • b913c13 ggml-backend: Let GGML allocate context memory
  • 713786e ggml-backend: Ensure data is available after async computation
  • 93a163a ggml-backend: Close on nil should be a no-op
  • 48baaee model: Load tensors behind an interface
  • 1d9a88c vocab: Use int32 for special tokens
  • fac7944 models: Move model into their own directory

📊 Changes

41 files changed (+3103 additions, -335 deletions)

View changed files

➖ cache/cache.go (+0 -63)
📝 cmd/cmd.go (+5 -2)
📝 cmd/runner/main.go (+1 -1)
📝 envconfig/config.go (+3 -0)
➕ kvcache/cache.go (+54 -0)
➕ kvcache/causal.go (+455 -0)
➕ kvcache/causal_test.go (+506 -0)
➕ kvcache/encoder.go (+97 -0)
➕ kvcache/wrapper.go (+93 -0)
📝 llm/server.go (+3 -0)
📝 ml/backend.go (+30 -21)
📝 ml/backend/ggml/ggml.go (+86 -38)
📝 model/model.go (+67 -97)
📝 model/model_test.go (+4 -4)
📝 model/models/llama/model.go (+28 -16)
📝 model/models/mllama/imageproc.go (+0 -0)
📝 model/models/mllama/imageproc_test.go (+0 -0)
📝 model/models/mllama/model.go (+18 -8)
📝 model/models/mllama/model_text.go (+44 -28)
📝 model/models/mllama/model_vision.go (+10 -10)

...and 21 more files

📄 Description

Instructions (works best on Metal):

1. Start the server with the OLLAMA_NEW_ENGINE environment variable set:

   OLLAMA_NEW_ENGINE=1 ./ollama serve

2. Run a model supported by the new Ollama engine, e.g. Llama 3.1 8B Q4_K_M:

   ./ollama run jessegross/llama3.1
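The two commands above can be wrapped in a small launcher script. This is only a sketch: the `./ollama` path and the model name come from the description, while the `DRY_RUN` guard and the `run` helper are illustrative additions so the script can be exercised without the binary present. Note that `ollama serve` blocks, so in real use it runs in one terminal (or backgrounded) and the model command in another.

```shell
#!/bin/sh
# Sketch of the PR's test instructions. DRY_RUN=1 (the default here)
# only echoes the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# 1. Start the server with the new engine enabled.
#    (Blocks when actually executed; background it or use a second terminal.)
OLLAMA_NEW_ENGINE=1 run ./ollama serve

# 2. Run a model supported by the new engine.
run ./ollama run jessegross/llama3.1
```

With `DRY_RUN=0` the script executes the real commands, assuming a built `ollama` binary in the current directory.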


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:06:51 -05:00

Reference: github-starred/ollama#12682