[GH-ISSUE #6569] TensorRT Support #66173

Open
opened 2026-05-04 00:28:14 -05:00 by GiteaMirror · 2 comments

Originally created by @JonahMMay on GitHub (Aug 30, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6569

Does ollama leverage TensorRT, and if not, can support for it be added?

GiteaMirror added the feature request label 2026-05-04 00:28:14 -05:00

@rick-github commented on GitHub (Aug 30, 2024):

That would be a feature request for [llama.cpp](https://github.com/ggerganov/llama.cpp/issues).

@AbhiAbzs commented on GitHub (Jan 25, 2025):

Jan.ai has already added support for TensorRT alongside its llama.cpp engine. If possible, could you please add TensorRT support here as well? Its benchmarks look very promising for inference. I tested it once with FaceFusion, and after the initial load time it reduced inference time by more than 4x.
I don't know whether it's technically feasible to add this, but it seems promising.
https://jan.ai/post/benchmarking-nvidia-tensorrt-llm


Reference: github-starred/ollama#66173