[GH-ISSUE #12083] option to disable cuda graph with parameter or env #8027

Open
opened 2026-04-12 20:15:50 -05:00 by GiteaMirror · 0 comments

Originally created by @bogan-FMA on GitHub (Aug 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12083

Hello,

Currently, CUDA graphs can only be enabled or disabled at compile time, but we always use the pre-compiled version.

#if (defined(GGML_CUDA_USE_GRAPHS) || defined(GGML_HIP_GRAPHS)) || defined(GGML_MUSA_GRAPHS)
#define USE_CUDA_GRAPH
#endif

Is there any option to disable CUDA graphs at runtime?
If not, could this be added as a feature? In our usage, CUDA graphs sometimes cause bad performance. Thanks.
vLLM already offers something similar through its eager mode.
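
To make the request concrete, here is a minimal sketch of how a runtime switch could sit on top of the compile-time one. It is only an illustration: the variable name `EXAMPLE_DISABLE_CUDA_GRAPHS` and both helper functions are hypothetical and are not taken from the Ollama or ggml sources.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical environment variable name, chosen for this sketch only.
static const char * const kDisableGraphsEnv = "EXAMPLE_DISABLE_CUDA_GRAPHS";

// Returns true when the variable's value should be read as "disable".
// An unset variable, an empty string, or "0" leaves graphs enabled.
static bool env_value_means_disabled(const char * value) {
    return value != nullptr && value[0] != '\0' && std::strcmp(value, "0") != 0;
}

// Combines the compile-time switch with the runtime override.
// compiled_with_graphs stands in for "USE_CUDA_GRAPH was defined at build time".
static bool cuda_graphs_enabled(bool compiled_with_graphs) {
    if (!compiled_with_graphs) {
        return false;  // nothing to toggle if graph support was not built in
    }
    return !env_value_means_disabled(std::getenv(kDisableGraphsEnv));
}
```

With something like this, the backend would query `cuda_graphs_enabled(...)` once at startup and fall back to launching kernels directly when it returns false, so users of pre-compiled builds could opt out without recompiling.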

GiteaMirror added the feature request label 2026-04-12 20:15:50 -05:00

Reference: github-starred/ollama#8027