[GH-ISSUE #7591] Can't compile ollama 0.4.1 on ARM Jetson AGX Orin #51351

Closed
opened 2026-04-28 19:38:26 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @rebotnix on GitHub (Nov 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7591

What is the issue?

I am having trouble compiling the latest ollama on an ARM NVIDIA Jetson platform; the CUDA version is 11.2 with JetPack 5.1.2.

ggml-quants.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
ggml-quants.c:4023:52: warning: implicit declaration of function ‘vmmlaq_s32’; did you mean ‘vmlaq_s32’? [-Wimplicit-function-declaration]
4023 | sumv0 = vmlaq_f32(sumv0,(vcvtq_f32_s32(vmmlaq_s32((vmmlaq_s32((vmmlaq_s32((vmmlaq_s32(vdupq_n_s32(0), l0, r0)),
| ^~~~~~~~~~
| vmlaq_s32
ggml-quants.c:4023:52: error: incompatible type for argument 1 of ‘vcvtq_f32_s32’
4023 | sumv0 = vmlaq_f32(sumv0,(vcvtq_f32_s32(vmmlaq_s32((vmmlaq_s32((vmmlaq_s32((vmmlaq_s32(vdupq_n_s32(0), l0, r0)),
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| int
4024 | l1, r1)), l2, r2)), l3, r3))), scale);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ggml-cpu-impl.h:164,
from ggml-quants.c:32:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:15168:26: note: expected ‘int32x4_t’ but argument is of type ‘int’
15168 | vcvtq_f32_s32 (int32x4_t __a)
| ~~~~~~~~~~^~~
ggml-quants.c: In function ‘ggml_vec_dot_q4_1_q8_1’:
ggml-quants.c:4603:52: error: incompatible type for argument 1 of ‘vcvtq_f32_s32’
4603 | sumv0 = vmlaq_f32(sumv0,(vcvtq_f32_s32(vmmlaq_s32((vmmlaq_s32((vmmlaq_s32((vmmlaq_s32(vdupq_n_s32(0), l0, r0)),
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| int
4604 | l1, r1)), l2, r2)), l3, r3))), scale);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ggml-cpu-impl.h:164,
from ggml-quants.c:32:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:15168:26: note: expected ‘int32x4_t’ but argument is of type ‘int’
15168 | vcvtq_f32_s32 (int32x4_t __a)
| ~~~~~~~~~~^~~
ggml-quants.c: In function ‘ggml_vec_dot_q8_0_q8_0’:
ggml-quants.c:5607:52: error: incompatible type for argument 1 of ‘vcvtq_f32_s32’
5607 | sumv0 = vmlaq_f32(sumv0,(vcvtq_f32_s32(vmmlaq_s32((vmmlaq_s32((vmmlaq_s32((vmmlaq_s32(vdupq_n_s32(0), l0, r0)),
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| int
5608 | l1, r1)), l2, r2)), l3, r3))), scale);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ggml-cpu-impl.h:164,
from ggml-quants.c:32:
/usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:15168:26: note: expected ‘int32x4_t’ but argument is of type ‘int’
15168 | vcvtq_f32_s32 (int32x4_t __a)
| ~~~~~~~~~~^~~
make[2]: *** [make/gpu.make:85: /opt/ssd/ollama/ollama/llama/build/linux-arm64/runners/cuda_v11/ollama_llama_server] Error 1
make[2]: Leaving directory '/opt/ssd/ollama/ollama/llama'
make[1]: *** [Makefile:41: cuda_v11] Error 2
make[1]: Leaving directory '/opt/ssd/ollama/ollama/llama'
make: *** [Makefile:4: all] Error 2

Has anyone found a fix for this?

OS

Linux

GPU

Nvidia embedded AGX Orin 64GB

CPU

ARM

Ollama version

0.4.1

GiteaMirror added the build, bug, nvidia labels 2026-04-28 19:38:27 -05:00
Author
Owner

@davidADSP commented on GitHub (Nov 10, 2024):

Perhaps you need to upgrade to jetpack 6?

Author
Owner

@rebotnix commented on GitHub (Nov 10, 2024):

I don't know yet; older versions of ollama, up to and including 0.4.1, compiled. I have to check the ggml-quants.c file; it seems ARM NEON related, not CUDA. @davidADSP I will check today.

Author
Owner

@dhiltgen commented on GitHub (Nov 13, 2024):

Now that #7217 is merged, you should be able to build main as long as you rebase.

Reference: github-starred/ollama#51351