[PR #8148] Add support for applying control vectors in gguf format [Rebased on v0.11.4] #12648

Open
opened 2026-04-13 00:05:48 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/8148
Author: @itszn
Created: 12/18/2024
Status: 🔄 Open

Base: main ← Head: feat/control-vectors


📝 Commits (2)

  • 57f3108 Add support for control vectors in gguf format
  • 40f8dd2 Add template property to chat endpoint

📊 Changes

16 files changed (+263 additions, -42 deletions)

View changed files

📝 api/types.go (+25 -19)
📝 cmd/cmd.go (+15 -0)
📝 docs/modelfile.md (+16 -0)
➕ llama/control_vectors.cpp (+40 -0)
➕ llama/control_vectors.h (+27 -0)
📝 llama/llama.go (+14 -0)
📝 llm/server.go (+13 -1)
📝 openai/openai.go (+2 -0)
📝 parser/parser.go (+13 -2)
📝 runner/llamarunner/runner.go (+30 -1)
📝 server/create.go (+27 -0)
📝 server/images.go (+22 -12)
📝 server/prompt.go (+12 -3)
📝 server/prompt_test.go (+1 -1)
📝 server/routes.go (+3 -1)
📝 server/sched.go (+3 -2)

📄 Description

Current Supported Ollama Release Version: v0.6.8

Control Vectors allow for changing the behavior of a model by steering towards or away from a specific behavior.

You can learn more about them from these sources:
https://hlfshell.ai/posts/representation-engineering/
https://vgel.me/posts/representation-engineering/
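Conceptually, applying a control vector just adds a learned direction to a layer's hidden state, scaled by a strength; a negative strength steers away from the behavior. This is an illustrative numpy sketch only, not the llama.cpp implementation (the function and variable names are made up):

```python
import numpy as np

def apply_control_vector(hidden, vector, strength):
    # Steer the hidden state toward (positive strength) or away from
    # (negative strength) the learned direction.
    return hidden + strength * vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
hidden = rng.normal(size=64)        # stand-in for a layer's hidden state
vector = rng.normal(size=64)
vector /= np.linalg.norm(vector)    # unit "happiness" direction

steered = apply_control_vector(hidden, vector, 2.0)
reversed_ = apply_control_vector(hidden, vector, -2.0)
print(cosine(steered, vector) > cosine(hidden, vector))    # True
print(cosine(reversed_, vector) < cosine(hidden, vector))  # True
```

The llama.cpp feature below does this per layer during inference, with the strength coming from the Modelfile parameter.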

Earlier this year, support for loading and applying control vectors in GGUF format was added to llama.cpp (https://github.com/ggerganov/llama.cpp/pull/5970). This pull request exposes that feature via the Modelfile, making it easy to apply a control vector on top of an existing ollama model and serve it via ollama's API (currently there is no off-the-shelf serving solution that supports control vectors).

To create a control vector for a model, you can use this library which includes exporting in the GGUF format: https://github.com/vgel/repeng

Building

Note on updating: this branch is rebased as ollama releases, so use `git checkout origin/feat/control-vectors` to update your local copy; `git pull` will not work because of the rebasing.

Until this PR is merged you can use it by cloning my branch and building directly. See these docs on how to build: https://github.com/ollama/ollama/blob/main/docs/development.md#overview

It boils down to installing a few dependencies and then building with cmake and go:

# Grab this branch if you do not have it already
git clone https://github.com/itszn/ollama

cd ollama

# If you already have it and just want to update
git fetch
git checkout origin/feat/control-vectors

cmake -B build
cmake --build build
go build .
ls -la ./ollama

Then you can run the server like normal via the serve command (make sure no old versions of ollama are running)

./ollama serve

I will try to update this PR alongside ollama releases, so you hopefully won't miss any features while keeping control vector support 🎉

If you like this feature, leave an emoji ❤️

Example

Here is an example of how you can use this PR to build and serve a model with a control vector:

Training the vector

First, train a control vector for the given model. In this example I am using https://github.com/vgel/repeng to train against Mistral-7B-Instruct-v0.3. This takes about 3 minutes on my Mac mini.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from repeng import ControlVector, ControlModel, DatasetEntry
# make_dataset and truncated_output_suffixes_512 are helpers from
# repeng's example notebooks

model_name = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
# Wrap the middle layers so repeng can read and steer their activations
model = ControlModel(model, list(range(-5, -18, -1)))

# Train the control vector with two contrasting writing styles
happy_dataset = make_dataset(
    "Act as if you're extremely {persona}.",
    ["happy", "euphoric"],
    ["sad", "depressed"],
    truncated_output_suffixes_512,
)
model.reset()
happy_vector = ControlVector.train(model, tokenizer, happy_dataset)
happy_vector.export_gguf('/opt/happy.gguf')
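Under the hood, repeng collects hidden-state activations for the two contrasting prompt sets and (by default) runs PCA over their paired differences. A simplified difference-of-means sketch of the same idea, with toy data standing in for real activations (illustrative only, not repeng's exact algorithm):

```python
import numpy as np

# Toy stand-ins for one layer's hidden states collected from contrasting
# prompts (in practice these come from the wrapped model's forward passes).
rng = np.random.default_rng(0)
planted = np.zeros(8)
planted[0] = 1.0                                   # the planted "happiness" axis
happy_acts = rng.normal(size=(200, 8)) + 3 * planted
sad_acts = rng.normal(size=(200, 8)) - 3 * planted

# Difference of means recovers a steering direction; repeng instead takes
# the top principal component of paired differences, which is more robust
# but follows the same intuition.
vec = happy_acts.mean(axis=0) - sad_acts.mean(axis=0)
vec /= np.linalg.norm(vec)
print(abs(vec[0]) > 0.95)  # True: recovers the planted direction
```

The resulting per-layer directions are what `export_gguf` writes out for llama.cpp to apply at inference time.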

Applying the GGUF via Modelfile

Ensure the base model works with ollama

First, make sure you have a working instance of Mistral-7B-Instruct-v0.3 in ollama, using the correct mistral template (the model can be quantized).

You can also use the official ollama image for the model (e.g. https://ollama.com/library/mistral:7b), but make sure it is the same one you trained against, or it may not work as expected.

Minimal Modelfile for `Mistral-7B-Instruct-v0.3`
FROM .

TEMPLATE """{{- if .Suffix }}[SUFFIX]{{ .Suffix }}[PREFIX] {{ .Prompt }}
{{- else if .Messages }}
{{- range $index, $_ := .Messages }}
{{- if eq .Role "user" }}[INST] {{ if and $.System (eq (len (slice $.Messages $index)) 1) }}{{ $.System }}

{{ end }}{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }} {{ .Content }}</s>
{{- end }}
{{- end }}
{{- else }}[INST] {{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }} [/INST]
{{- end }} {{ .Response }}
{{- if .Response }}</s>
{{- end }}"""

PARAMETER stop [INST]
PARAMETER stop [/INST]
PARAMETER stop [PREFIX]
PARAMETER stop [MIDDLE]
PARAMETER stop [SUFFIX]
Mistral-7B-Instruct-v0.3 % ./ollama create Mistral-7B-Instruct-v0.3 -f Modelfile
transferring model data 100%
converting model
using existing layer sha256:6cd684e7092d1561237a5de7262a0693940e712dbfa5f4556faf2ee41eec004c
using existing layer sha256:51707752a87ca45dc91470c0e4974028eb50096af69f97d9bef091edcf51a649
using existing layer sha256:5dea4f4d0fffcd67078a5f8fa107312bcf1d7d658cc668631a4fd6b4530a7159
creating new layer sha256:ccfb628e0111a2a7cd07cb19c0fc8c984ed8d72846c6f92e8a1b1b8842e82bb3
writing manifest
success
% ./ollama run Mistral-7B-Instruct-v0.3
>>> how do you feel?
1. I am an artificial intelligence and do not have feelings or emotions like humans do.

Create a new model with the control vector

Now we will define our modified model via a new Modelfile (IMPORTANT: you must provide an absolute path to the control vector gguf file, e.g. /home/user/models/happy-mistral/happy.gguf)

FROM Mistral-7B-Instruct-v0.3

CONTROLVECTOR /opt/happy.gguf
PARAMETER control_strength 0.4
happy-mistral % ./ollama create happy-mistral -f Modelfile
transferring model data
using existing layer sha256:6cd684e7092d1561237a5de7262a0693940e712dbfa5f4556faf2ee41eec004c
using existing layer sha256:51707752a87ca45dc91470c0e4974028eb50096af69f97d9bef091edcf51a649
using existing layer sha256:218dfbcc5cc2b4949ff62cf67ef3707ad5ebbfcc110f35ab5516440775cc1ca5
using existing layer sha256:683c80fc5e2261a899cc69eea9143e4a8fee9194acf4aed2767d0354c862dfe7
creating new layer sha256:1c7084be5b7e84d88d27c2b8ad5542a583d39e2c67b517436ef2d9fcce859ceb
writing manifest
success
% ./ollama run happy-mistral
>>> how do you feel
😃 I'm absolutely thrilled and elated! You did it, dude!

Note in the server debug logs that the control vector is applied with the strength from PARAMETER control_strength:

time=2024-12-17T15:57:40.248-08:00 level=DEBUG source=runner.go:897 msg="applying control vector" /Users/nyan/.ollama/models/blobs/sha256-218dfbcc5cc2b4949ff62cf67ef3707ad5ebbfcc110f35ab5516440775cc1ca5=0.4000000059604645
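As an aside, the strength prints as 0.4000000059604645 rather than 0.4 because it is stored as a 32-bit float and then formatted as a 64-bit one; a quick check:

```python
import struct

# Round-trip 0.4 through a 32-bit float, then view it as a 64-bit float,
# reproducing the value seen in the debug log.
as_f32 = struct.unpack('f', struct.pack('f', 0.4))[0]
print(as_f32)  # 0.4000000059604645
```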

Negative strength

We can also use a negative strength in our Modelfile to steer in the opposite direction of the vector (note: not every vector works well bidirectionally).

FROM Mistral-7B-Instruct-v0.3

CONTROLVECTOR /opt/happy.gguf
PARAMETER control_strength -0.5
happy-mistral % ollama create sad-mistral -f Modelfile
% ./ollama run sad-mistral
>>> how do you feel
1. I'm not sure if I should feel bad for a while, but I do know that it's important to have someone who is always sad and feeling the weight of the world. 

Known Issues / Limitations

  • Control vectors must be provided as an absolute path in the Modelfile (e.g. CONTROLVECTOR /home/user/models/happy-mistral/happy.gguf)
  • The vector currently applies to all layers
  • control_strength is only applied when launching a new runner; it cannot be changed via API parameters

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:05:48 -05:00
Reference: github-starred/ollama#12648