[GH-ISSUE #13707] Support request for https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 #71048

Closed
opened 2026-05-04 23:50:56 -05:00 by GiteaMirror · 6 comments

Originally created by @eslowney on GitHub (Jan 13, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13707

When attempting to run `ollama create` from the HF safetensors files, I get:

`invalid character 'I' looking for beginning of value`

ollama v0.13.5
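
A minimal reproduction sketch (the download step and the model tag `nemotron-test` are assumptions; local paths are illustrative):

```console
$ huggingface-cli download nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 \
    --local-dir /models/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
$ printf 'FROM /models/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8\n' > Modelfile
$ ollama create nemotron-test -f Modelfile
# fails with: invalid character 'I' looking for beginning of value
```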


@illusdolphin commented on GitHub (Jan 14, 2026):

It's already supported (https://ollama.com/library/nemotron-3-nano), so just `ollama run nemotron-3-nano`.


@eslowney commented on GitHub (Jan 14, 2026):

> It's already supported (https://ollama.com/library/nemotron-3-nano), so just `ollama run nemotron-3-nano`.

I'm trying to run it offline, so the `run` commands don't work for me (I have no network connection to ollama.com or anywhere else).


@rick-github commented on GitHub (Jan 15, 2026):

Download the model when online, then copy it to your offline machine.

```console
$ cd /tmp
$ OLLAMA_MODELS=/tmp/models ollama serve 2>&- >&- &   # temporary server with a throwaway model dir, output silenced
$ ollama pull nemotron-3-nano
$ kill $!                                             # stop the temporary server
$ zip -r models.zip models
$ rm -rf models
```

Copy models.zip to your offline machine and extract it into the ollama directory.

```console
$ sudo -u ollama -s
$ cd ~/.ollama
$ unzip /tmp/models.zip
```
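
On the offline machine, a quick sanity check (a sketch, assuming the ollama service is running as usual):

```console
$ ollama list                  # the copied model should show up
$ ollama run nemotron-3-nano   # runs entirely from the local blobs, no network needed
```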

Alternatively, show what command you are using to convert the HF safetensors into an ollama model.


@eslowney commented on GitHub (Jan 15, 2026):

I just have a Modelfile with `FROM /models/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8` and then run `ollama create Nemotron-3-Nano-30B` from within that Modelfile directory.

However, I also tried converting the model to a GGUF first with `convert_hf_to_gguf.py` from llama.cpp, and that also throws a different error, so it could just be that the model itself isn't supported by anything yet.
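
For reference, the llama.cpp conversion step was presumably along these lines (the script path and output filename are illustrative, not from the thread):

```console
$ python llama.cpp/convert_hf_to_gguf.py /models/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 \
    --outfile nemotron-3-nano.gguf
```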


@rick-github commented on GitHub (Jan 15, 2026):

> that is also throwing a different error

Difficult to diagnose the issue without the actual error message. Nemotron is supported by ollama; safetensors import is not; llama.cpp conversion should work with the BF16 tensors.


@eslowney commented on GitHub (Jan 16, 2026):

> > that is also throwing a different error
>
> Difficult to diagnose the issue without the actual error message. Nemotron is supported by ollama; safetensors import is not; llama.cpp conversion should work with the BF16 tensors.

Thanks, I downloaded the -BF16 version (instead of -FP8); it quantized fine and I was able to import it into ollama.
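
For anyone landing here later, a sketch of the working path described in this thread (the BF16 directory name, the Q4_K_M quantization type, and `llama-quantize` being on PATH are assumptions):

```console
# convert the BF16 safetensors to GGUF with llama.cpp
$ python llama.cpp/convert_hf_to_gguf.py /models/Nemotron-3-Nano-30B-A3B-BF16 \
    --outfile nemotron-bf16.gguf
# quantize the GGUF (Q4_K_M chosen as an example)
$ llama-quantize nemotron-bf16.gguf nemotron-q4km.gguf Q4_K_M
# import the quantized GGUF into ollama
$ printf 'FROM ./nemotron-q4km.gguf\n' > Modelfile
$ ollama create Nemotron-3-Nano-30B -f Modelfile
```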

Reference: github-starred/ollama#71048