[GH-ISSUE #7079] Support for I16 data type in conversion from Safetensors #4495

Open
opened 2026-04-12 15:25:22 -05:00 by GiteaMirror · 5 comments

Originally created by @josefblaha on GitHub (Oct 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7079

I tried importing the ISTA-DASLab/Meta-Llama-3.1-8B-Instruct-AQLM-PV-2Bit-1x16-hf model from Hugging Face (https://huggingface.co/ISTA-DASLab/Meta-Llama-3.1-8B-Instruct-AQLM-PV-2Bit-1x16-hf). It's in Safetensors format with tensor types FP16 and I16.

I downloaded the files and created a simple Modelfile in the same directory:

FROM .

From model creation I got this:

PS D:\OllamaModels\llama3.1-8b-instruct-aqlm> ollama create llama3.1-instruct-aqlm:8b
transferring model data 100%
converting model
Error: unknown data type: I16

Could Ollama conversion support the I16 data type?
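
For anyone hitting the same error, here is a minimal sketch for checking which tensor dtypes a Safetensors file declares before running `ollama create`. It relies only on the published Safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names `read_safetensors_header` and `count_dtypes` are made up for illustration and are not part of Ollama or any library. It only reports dtypes and does not make the conversion work.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def count_dtypes(path):
    """Count tensors per dtype string (for example "F16" or "I16")."""
    header = read_safetensors_header(path)
    return Counter(
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )
```

Calling `count_dtypes` on each shard of the model and printing the result should show whether any tensors use I16 (or I32), which are the dtypes the error message complains about.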

GiteaMirror added the feature request label 2026-04-12 15:25:22 -05:00

@YangWang92 commented on GitHub (Oct 20, 2024):

Hi @josefblaha,
I sincerely hope I didn’t offend you by using AQLM. I was wondering if you might be interested in taking a look at our VPTQ project? https://github.com/microsoft/VPTQ

Similar to AQLM, VPTQ also uses vector quantization to quantize models down to extremely low bits.
One key difference is that the packaged models in the open-source community use i32 instead of i16.

Would you be interested in collaborating with us to explore applying VPTQ to Ollama?
Thanks!


@josefblaha commented on GitHub (Oct 22, 2024):

@YangWang92 I'm afraid I can't help you with that. I'm a mere user of Ollama and I don't have any insight into the various model formats.


@YangWang92 commented on GitHub (Oct 22, 2024):

Never mind, Ollama is quite useful~


@BMukhtar commented on GitHub (Nov 13, 2024):

+1 to feature request, AFAIK AQLM+PV (note before was only AQLM) is the best performing method at the moment (better than VPTQ)


@coyoteXujie commented on GitHub (Mar 6, 2025):

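I tried importing the QwQ-32B-AWQ model from ModelScope (https://www.modelscope.cn/models/Qwen/QwQ-32B-AWQ).

I downloaded the files and created a simple Modelfile based on the one Ollama uses for qwq (https://ollama.com/library/qwq):

FROM /home/paas/ollama/modelfile/QwQ-32B-AWQ
TEMPLATE """{{- if or .System .Tools }}<|im_start|>system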
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>
For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.6

From model creation I got this:

![Image](https://github.com/user-attachments/assets/530a7cf2-63ae-42a7-a424-1dcec7cafede)

Could Ollama conversion support the I32 data type?

Reference: github-starred/ollama#4495