[PR #11058] gguf: even faster parsing by lazily reading array values #13425

Open
opened 2026-04-13 00:26:59 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/11058

State: open
Merged: No


This change speeds up GGUF parsing by skipping array items on the initial read and returning a struct that reads them lazily using file offsets. The result is a noticeable performance improvement for both string and numeric array types in all benchmarked cases.

An added benefit is that the array data is always available.
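
The idea of recording an offset during the initial read and decoding items only on request can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `lazyArray`, `skipUint32Array`, `Values`, and `sampleFile` are hypothetical names, and the layout (a little-endian `uint64` count followed by `uint32` items) is a simplification assumed for the example rather than the full GGUF encoding.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// lazyArray is a hypothetical handle produced by the initial parse.
// It records where the items live instead of holding the items.
type lazyArray struct {
	r      io.ReaderAt // random-access source, e.g. an *os.File
	offset int64       // byte offset of the first item
	count  uint64      // number of items
}

// readAt fills buf from r starting at off, tolerating an EOF that
// coincides with a complete read.
func readAt(r io.ReaderAt, off int64, buf []byte) error {
	_, err := io.ReadFull(io.NewSectionReader(r, off, int64(len(buf))), buf)
	return err
}

// skipUint32Array reads only the item count at pos, then returns a
// handle plus the position just past the array. No item is decoded.
func skipUint32Array(r io.ReaderAt, pos int64) (lazyArray, int64, error) {
	var hdr [8]byte
	if err := readAt(r, pos, hdr[:]); err != nil {
		return lazyArray{}, 0, err
	}
	n := binary.LittleEndian.Uint64(hdr[:])
	start := pos + 8
	return lazyArray{r: r, offset: start, count: n}, start + int64(n)*4, nil
}

// Values decodes the items on demand using the recorded offset.
func (a lazyArray) Values() ([]uint32, error) {
	buf := make([]byte, a.count*4)
	if err := readAt(a.r, a.offset, buf); err != nil {
		return nil, err
	}
	out := make([]uint32, a.count)
	for i := range out {
		out[i] = binary.LittleEndian.Uint32(buf[i*4:])
	}
	return out, nil
}

// sampleFile builds an in-memory stand-in for a file: a count of 3,
// three uint32 items, and one trailing byte representing later data.
func sampleFile() io.ReaderAt {
	var b bytes.Buffer
	binary.Write(&b, binary.LittleEndian, uint64(3))
	binary.Write(&b, binary.LittleEndian, []uint32{10, 20, 30})
	b.WriteByte(0xFF)
	return bytes.NewReader(b.Bytes())
}

func main() {
	arr, next, err := skipUint32Array(sampleFile(), 0)
	if err != nil {
		panic(err)
	}
	vals, err := arr.Values()
	if err != nil {
		panic(err)
	}
	fmt.Println(arr.count, next, vals)
}
```

In this sketch the initial pass touches only the 8-byte count, so its cost does not grow with array length; the items are paid for only when `Values` is called.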

```
goos: darwin
goarch: arm64
pkg: github.com/ollama/ollama/fs/gguf
cpu: Apple M3 Max
BenchmarkReadArray/float32-16               8242            145775 ns/op
BenchmarkReadArray/string-16                 157           7694131 ns/op
BenchmarkReadArray/int32-16                 7747            149120 ns/op
BenchmarkReadArray/uint32-16                8326            146902 ns/op
PASS
ok      github.com/ollama/ollama/fs/gguf        7.287s
```

```
goos: darwin
goarch: arm64
pkg: github.com/ollama/ollama/fs/ggml
cpu: Apple M3 Max
BenchmarkReadArray/float32-maxArraySize=-1-16                 67          17617177 ns/op
BenchmarkReadArray/float32-maxArraySize=0-16                  67          17463565 ns/op
BenchmarkReadArray/float32-maxArraySize=1024-16               68          17402137 ns/op
BenchmarkReadArray/string-maxArraySize=-1-16                  45          23541357 ns/op
BenchmarkReadArray/string-maxArraySize=0-16                   93          12076803 ns/op
BenchmarkReadArray/string-maxArraySize=1024-16                93          12190660 ns/op
BenchmarkReadArray/int32-maxArraySize=-1-16                   68          17309949 ns/op
BenchmarkReadArray/int32-maxArraySize=0-16                    66          17324967 ns/op
BenchmarkReadArray/int32-maxArraySize=1024-16                 68          17315369 ns/op
BenchmarkReadArray/uint32-maxArraySize=-1-16                  67          17362288 ns/op
BenchmarkReadArray/uint32-maxArraySize=0-16                   67          17207407 ns/op
BenchmarkReadArray/uint32-maxArraySize=1024-16                68          17178488 ns/op
PASS
ok      github.com/ollama/ollama/fs/ggml        20.938s
```
GiteaMirror added the pull-request label 2026-04-13 00:26:59 -05:00
Reference: github-starred/ollama#13425