[GH-ISSUE #5100] Jetson - Alternating Errors (Timed Out & CUDA Error) When Trying to Use Ollama #65255

Closed
opened 2026-05-03 20:13:39 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @Vassar-HARPER-Project on GitHub (Jun 17, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5100

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

When running `ollama run` commands (regardless of model, even for something small like `gemma:2b`), the command does not work properly and results in an error. After testing, the error consistently alternates (swapping every run) between this output:

Error: timed out waiting for llama runner to start - progress 1.00 -

and this output:

Error: llama runner process has terminated: signal: aborted (core dumped)
CUDA error: CUBLAS_STATUS_EXECUTION_FAILED
  current device: 0, in function ggml_cuda_mul_mat_batched_cublas at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:1905
  cublasGemmBatchedEx(ctx.cublas_handle(), CUBLAS_OP_T, CUBLAS_OP_N, ne01, ne11, ne10, alpha, (const void **) (ptrs_src.get() + 0*ne23), CUDA_R_16F, nb01/nb00, (const void **) (ptrs_src.get() + 1*ne23), CUDA_R_16F, nb11/nb10, beta, ( void **) (ptrs_dst.get() + 0*ne23), cu_data_type, ne01, ne23, cu_compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:100: !"CUDA error"

I'm running on a Jetson AGX Orin with 32GB of memory.
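
For anyone hitting the same pair of errors, the server-side log usually carries the full llama runner output, including the complete CUDA/cuBLAS context. A minimal sketch of how to pull it, assuming Ollama was installed with the official Linux install script (which registers a systemd service named `ollama`):

```sh
# Show the most recent Ollama server output; the llama runner's stderr
# (with the full CUDA error) is forwarded into this log.
journalctl -u ollama --no-pager -n 200
```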

OS

Linux

GPU

Nvidia

CPU

Other

Ollama version

0.1.44

GiteaMirror added the bug and nvidia labels 2026-05-03 20:13:50 -05:00

@jmorganca commented on GitHub (Jun 18, 2024):

I know @dhiltgen has been testing Ollama on the Jetsons, but I'm not sure Ollama supports the AGX Orin yet.


@Vassar-HARPER-Project commented on GitHub (Jun 18, 2024):

I can tell you it has worked on this exact device at some point in the past, before a re-flash (a factory reset, for all intents and purposes).


@dhiltgen commented on GitHub (Jun 18, 2024):

PR #4741 will add support for JetPack 5 and JetPack 6 devices in the official builds.
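
For reference, one quick way to confirm which JetPack generation a device is on is to read the L4T release file that standard JetPack images ship (a sketch, assuming a stock JetPack install; L4T R35.x corresponds to JetPack 5 and R36.x to JetPack 6):

```sh
# Prints the L4T release string, e.g. "# R35 (release), REVISION: 4.1, ..."
cat /etc/nv_tegra_release
```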

Reference: github-starred/ollama#65255