[GH-ISSUE #8110] Support llama.cpp's Control Vector Functionality #5185

Open
opened 2026-04-12 16:18:54 -05:00 by GiteaMirror · 3 comments

Originally created by @amyb-asu on GitHub (Dec 16, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/8110

llama.cpp added support for control vectors a while ago: https://github.com/ggerganov/llama.cpp/pull/5970

A vector stored as a `.gguf` file can be loaded with `llama_control_vector_load` and then applied to a context with `llama_control_vector_apply`:

https://github.com/ollama/ollama/blob/main/llama/common.h#L645
https://github.com/ollama/ollama/blob/main/llama/llama.h#L571

Example of how llama.cpp normally applies them: https://github.com/ollama/ollama/blob/main/llama/common.cpp#L920-L944
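
Conceptually, applying a control vector means adding a scaled, per-layer direction to the model's hidden state for a range of layers. The sketch below illustrates only that idea in plain Python; it is not llama.cpp's code, and the function name, the dict-of-lists layout, and the `layer_start`/`layer_end` arguments are assumptions made for the example.

```python
from typing import Dict, List


def apply_control_vector(
    hidden_states: Dict[int, List[float]],
    directions: Dict[int, List[float]],
    strength: float,
    layer_start: int,
    layer_end: int,
) -> Dict[int, List[float]]:
    """Return new hidden states with strength * direction added per layer.

    Hypothetical helper: layers outside [layer_start, layer_end], or layers
    with no direction, are copied through unchanged.
    """
    steered: Dict[int, List[float]] = {}
    for layer, hidden in hidden_states.items():
        direction = directions.get(layer)
        in_range = layer_start <= layer <= layer_end
        if direction is None or not in_range:
            steered[layer] = list(hidden)
            continue
        if len(direction) != len(hidden):
            raise ValueError(f"layer {layer}: dimension mismatch")
        # Element-wise: h' = h + strength * d
        steered[layer] = [h + strength * d for h, d in zip(hidden, direction)]
    return steered
```

A negative `strength` pushes the hidden state away from the trained direction instead of toward it.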

The vectors can be trained and exported to .gguf via https://github.com/vgel/repeng/

It would be great if we could load control vectors the same way that LoRA adapters can currently be loaded, for example with a Modelfile like this:

FROM ./models/mistralai/Mistral-7B-Instruct-v0.1
CONTROLVECTOR ./vectors/my_control_vector.gguf 1.5
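
To make the proposed syntax concrete, here is a rough Python sketch of how a `CONTROLVECTOR <path> [strength]` line could be split into a path and a strength. This is not how Ollama parses Modelfiles; the function name, the `NamedTuple`, and the default strength of 1.0 when the number is omitted are all assumptions for illustration, and paths containing spaces are not handled.

```python
from typing import NamedTuple


class ControlVectorDirective(NamedTuple):
    path: str
    strength: float


def parse_controlvector_line(
    line: str, default_strength: float = 1.0
) -> ControlVectorDirective:
    """Parse a hypothetical 'CONTROLVECTOR <path> [strength]' line."""
    parts = line.split()
    if not parts or parts[0].upper() != "CONTROLVECTOR":
        raise ValueError("not a CONTROLVECTOR line")
    if len(parts) == 2:
        # Strength omitted: fall back to the assumed default.
        return ControlVectorDirective(parts[1], default_strength)
    if len(parts) == 3:
        # float() raises ValueError if the strength is not a number.
        return ControlVectorDirective(parts[1], float(parts[2]))
    raise ValueError("expected: CONTROLVECTOR <path> [strength]")


directive = parse_controlvector_line(
    "CONTROLVECTOR ./vectors/my_control_vector.gguf 1.5"
)
```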

I think this feature would open up a range of model customization that is currently only possible through adapter training, which is more difficult, much slower, and far more memory intensive.

GiteaMirror added the feature request label 2026-04-12 16:18:54 -05:00

@amyb-asu commented on GitHub (Dec 17, 2024):

I have this mostly working now, but I'm not sure of the best way to save the strength float for the vector alongside the actual vector gguf layer. Is it possible to include it as extra metadata, or should I add it as a PARAMETER command?

Any ideas?

Anyway, I'll drop a draft PR soon once I've done some testing. Be warned that it's probably not up to your standards for merging yet, but it does seem to be functional.


@amyb-asu commented on GitHub (Dec 17, 2024):

Here it is in action! This is a control vector trained in ~3 min on a Mac Mini using https://github.com/vgel/repeng/ on `Mistral-7B-Instruct-v0.1`:

trippy_dataset = make_dataset(
    "Act as if you're extremely {persona}.",
    ["high on psychedelic drugs"],
    ["sober from psychedelic drugs"],
    truncated_output_suffixes_512,  # gives (subjectively) better results with slightly fewer samples
)
model.reset()
trippy_vector = ControlVector.train(model, tokenizer, trippy_dataset)
trippy_vector.export_gguf('/tmp/trippy.gguf')

![image](https://github.com/user-attachments/assets/72aa43ee-d0cf-4988-b0c0-b5a9bea4cd31)

As I mentioned before, I still need to propagate the strength value (0.3 here) to the actual vector application; right now it is hard-coded.


@itszn commented on GitHub (Dec 18, 2024):

Here is the PR for this feature: https://github.com/ollama/ollama/pull/8148

Reference: github-starred/ollama#5185