[PR #5638] [CLOSED] feat: add NeuralSpeed backend to boost up the inference speed on CPU … #10603

Closed · opened 2025-11-12 15:32:19 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5638
Author: @atelepov
Created: 7/11/2024
Status: Closed

Base: main ← Head: main


📝 Commits (1)

  • 4e44cd3 feat: add NeuralSpeed backend to boost up the inference speed on CPU side

📊 Changes

24 files changed (+43846 additions, -234 deletions)

View changed files

📝 .gitmodules (+6 -1)
📝 llm/dyn_ext_server.go (+1 -1)
📝 llm/generate/gen_common.sh (+44 -32)
📝 llm/generate/gen_linux.sh (+10 -0)
📝 llm/llama.cpp (+1 -1)
➕ llm/neural_speed (+1 -0)
➕ llm/ns_ext_server/CMakeLists.txt (+16 -0)
➕ llm/ns_ext_server/ext_server.cpp (+311 -0)
➕ llm/ns_ext_server/ext_server.h (+93 -0)
➕ llm/ns_ext_server/httplib.h (+8794 -0)
➕ llm/ns_ext_server/json.hpp (+24596 -0)
➕ llm/ns_ext_server/server.cpp (+1615 -0)
➕ llm/ns_ext_server/unicode.h (+784 -0)
➕ llm/ns_ext_server/utils.hpp (+488 -0)
➖ llm/patches/01-cache.diff (+0 -21)
➖ llm/patches/02-cudaleaks.diff (+0 -117)
➖ llm/patches/04-locale.diff (+0 -13)
➖ llm/patches/05-fix-clip-free.diff (+0 -45)
📝 llm/patches/llama.cpp/03-load_exception.diff (+0 -0)
➕ llm/patches/neural_speed/0001-refine-model_load-to-provide-unified-interface-cross.diff (+529 -0)

...and 4 more files

📄 Description

…side


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-12 15:32:19 -06:00

Reference: github-starred/ollama-ollama#10603