[PR #5898] [MERGED] server: speed up single gguf creates #58656

Closed
opened 2026-04-29 13:33:39 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5898
Author: @joshyan1
Created: 7/24/2024
Status: Merged
Merged: 8/12/2024
Merged by: @joshyan1

Base: main ← Head: jyan/gguf


📝 Commits (10+)

📊 Changes

2 files changed (+96 additions, -3 deletions)

📝 server/model.go (+14 -3)
📝 server/model_test.go (+82 -0)

📄 Description

The blob for a file with a single GGUF is already copied to the server by the request to /api/blobs/:digest, so createModel can reuse that blob instead of rewriting it.

This does not improve create speeds for safetensors or for creates that involve multiple GGUF files.

New logs

[GIN] 2024/07/23 - 17:21:45 | 201 |  5.298126708s |       127.0.0.1 | POST     "/api/blobs/sha256:54696cbcadd1959275fc99f9cc67880d2f38419124da06cdf2140bad2dc3d94c"
[GIN] 2024/07/23 - 17:21:45 | 200 |    8.519125ms |       127.0.0.1 | POST     "/api/create"

Old logs

[GIN] 2024/07/23 - 17:24:27 | 201 |  5.283626083s |       127.0.0.1 | POST     "/api/blobs/sha256:54696cbcadd1959275fc99f9cc67880d2f38419124da06cdf2140bad2dc3d94c"
[GIN] 2024/07/23 - 17:24:32 | 200 |  4.302020959s |       127.0.0.1 | POST     "/api/create"

resolves: https://github.com/ollama/ollama/issues/5388


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

### 📝 Commits (10+)

- [`fed007f`](https://github.com/ollama/ollama/commit/fed007fde0e2aca4b0cda7ef2bfa482f730db880) vroom
- [`620413d`](https://github.com/ollama/ollama/commit/620413d262e381c6ff8a2ea6c925b6166aaeefa9) test setup
- [`9cc312a`](https://github.com/ollama/ollama/commit/9cc312a04bc8e99e421fdcb68bf530cb018f774d) reuse gguf
- [`7d424ea`](https://github.com/ollama/ollama/commit/7d424ea9060391d1af9e256ddc6cde32570c6d91) test
- [`6df1f76`](https://github.com/ollama/ollama/commit/6df1f76369441433f6ded46b0291fc5f0a9a0062) test complete
- [`27550da`](https://github.com/ollama/ollama/commit/27550da12932fbb1f31f07563671d74a10393f58) report err
- [`d5802e3`](https://github.com/ollama/ollama/commit/d5802e3079becc6fbba8c663c13cbd50fcdbc0a4) lint
- [`0dbd1ae`](https://github.com/ollama/ollama/commit/0dbd1aeb06094c09ba6ab38a23145573e32fe5f1) remove println
- [`15e6a4f`](https://github.com/ollama/ollama/commit/15e6a4fbd4746e244dbea8ee03e40fe5536703cf) move err
- [`92047c0`](https://github.com/ollama/ollama/commit/92047c04cd76190a458a0f29584bc133fd3939fb) log
GiteaMirror added the pull-request label 2026-04-29 13:33:39 -05:00
Reference: github-starred/ollama#58656