[GH-ISSUE #3667] exception create_tensor: tensor 'blk.0.ffn_gate.0.weight' not found #2258

Closed
opened 2026-04-12 12:31:55 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @nkeilar on GitHub (Apr 16, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3667

What is the issue?

Getting this error when trying to use wizardlm2:8x22b-q2_K on a dual 3090 system.

[screenshot of the error: https://github.com/ollama/ollama/assets/325430/e80c8180-0b82-43d9-8625-ffa795045bfc]

ollama version is 0.1.31

Someone else is having the same issue in this thread, but I think it's a new issue:

https://github.com/ollama/ollama/issues/3032#issuecomment-2058129280
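
For context: the missing tensor name points to a GGUF layout mismatch. Newer llama.cpp conversions pack all experts of an MoE layer into a single stacked tensor per projection (e.g. blk.0.ffn_gate_exps.weight), while the llama.cpp build bundled with Ollama 0.1.31 still looks for per-expert tensors such as blk.0.ffn_gate.0.weight, hence the "not found" error. One way to check which layout a local model file uses, assuming the gguf-py tools from llama.cpp (pip install gguf) and a hypothetical blob path (Ollama stores models by digest under ~/.ollama/models/blobs/):

pip install gguf
# <digest> is a placeholder; list the actual blobs with: ls ~/.ollama/models/blobs/
gguf-dump ~/.ollama/models/blobs/sha256-<digest> | grep 'blk.0.ffn_'

If the output shows ffn_gate_exps / ffn_up_exps / ffn_down_exps names rather than ffn_gate.0, ffn_gate.1, ..., the file uses the new stacked-expert layout and needs a newer runner.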

What did you expect to see?

Model loads into memory

Steps to reproduce

Install the latest Ollama, then try to load the wizardlm2:8x22b-q2_K model. The exact commands on Linux are sketched below (the install one-liner is the standard script; the model tag is from this report):
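
curl -fsSL https://ollama.com/install.sh | sh    # installs the latest release (0.1.31 at the time)
ollama run wizardlm2:8x22b-q2_K                  # pulls the model, then fails at load with the error above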

Are there any recent changes that introduced the issue?

No response

OS

Linux

Architecture

amd64

Platform

No response

Ollama version

0.1.31

GPU

Nvidia

GPU info

Tue Apr 16 14:09:48 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3090 Off | 00000000:01:00.0 On | N/A |
| 33% 43C P5 48W / 350W | 1429MiB / 24576MiB | 23% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 3090 Off | 00000000:08:00.0 Off | N/A |
| 30% 35C P8 25W / 200W | 276MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3657 G /usr/lib/xorg/Xorg 654MiB |
| 0 N/A N/A 4314 G /usr/bin/gnome-shell 82MiB |
| 0 N/A N/A 10313 G /usr/bin/nextcloud 82MiB |
| 0 N/A N/A 911826 C /usr/local/bin/ollama 260MiB |
| 0 N/A N/A 1525117 G ...onEnabled --variations-seed-version 36MiB |
| 0 N/A N/A 3518670 G /usr/lib/firefox/firefox 198MiB |
| 1 N/A N/A 3657 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 911826 C /usr/local/bin/ollama 260MiB |
+-----------------------------------------------------------------------------------------+

CPU

Intel

Other software

No response

GiteaMirror added the bug label 2026-04-12 12:31:55 -05:00
Author
Owner

@jmorganca commented on GitHub (Apr 16, 2024):

Hi there, sorry this isn't more apparent in the error, but the new 8x22b models require Ollama 0.1.32 which will be released soon – in the meantime you can install the prerelease from https://github.com/ollama/ollama/releases/tag/v0.1.32

Hope this helps!
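
A sketch of that workaround on Linux, assuming the install script's OLLAMA_VERSION override (documented in the Ollama FAQ) also serves prerelease tags:

# Option 1: version override via the install script
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.1.32 sh

# Option 2: fetch the binary from the release page (asset name assumed
# to follow the usual ollama-linux-amd64 pattern for v0.1.32)
sudo curl -L https://github.com/ollama/ollama/releases/download/v0.1.32/ollama-linux-amd64 -o /usr/local/bin/ollama
sudo chmod +x /usr/local/bin/ollama

ollama -v    # confirm it now reports 0.1.32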

Author
Owner

@nkeilar commented on GitHub (Apr 16, 2024):

Okay, sorry - just found this - https://github.com/ggerganov/llama.cpp/issues/6665 which seems related. Sounds like it will be out soon.
