[GH-ISSUE #11811] API client silently ignores large responses #7838

Closed
opened 2026-04-12 20:00:22 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @msiebuhr on GitHub (Aug 8, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11811

What is the issue?

When uploading a very large document through the Golang API's Generate method (https://pkg.go.dev/github.com/ollama/ollama/api#Client.Generate, version 0.11.3), the call returns without error and without ever having called fn (GenerateResponseFunc):

In this case a "please turn this HTML table into CSV" and then 2.2 MB of HTML table with plenty of white space.

req := &api.GenerateRequest{
    Model: "<whatever>",
    Prompt: "<something very large>",
    Stream: new(bool),
}

err := client.Generate(ctx, req, func(r api.GenerateResponse) error {
    // Never called
    fmt.Print(r.Response)
    return nil
})

if err != nil {
    // No error either
    panic(err)
}

If I manually build the same request and pipe it through curl, it works fine and a reply is returned.

From my reading of the sources at https://github.com/ollama/ollama/blob/114c3f22657750cfb57f70c4a0d6e7389fb7a9fe/api/client.go#L211-L214 and the documentation for bufio's Scanner.Buffer() (https://pkg.go.dev/bufio#Scanner.Buffer), it seems the Ollama client will ignore answers larger than maxBufferSize, which is hard-coded to 512 KB: https://github.com/ollama/ollama/blob/114c3f22657750cfb57f70c4a0d6e7389fb7a9fe/api/client.go#L162

If I use stream: true I get most of the data back, but it still swallows the very last message, which carries the context.


I used a script to generate the large-ish requests, here a ~2.2 MB request:

% ./make_request.py | jq . -c | wc --chars
2264580

And showing the large context (~1.8 MB) after running it through ollama:

% ./make_request.py | curl -s http://localhost:11434/api/generate --data-binary @- | jq .context -c | wc --chars
1803319

Relevant log output


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.11.3

GiteaMirror added the bug label 2026-04-12 20:00:22 -05:00
Author
Owner

@msiebuhr commented on GitHub (Aug 8, 2025):

Seems to have been introduced in https://github.com/ollama/ollama/commit/9e2de1bd2c09cfc6a68deb50e7ec5033df6d22ed

Author
Owner

@msiebuhr commented on GitHub (Jan 16, 2026):

Fixed by 9667c2282f477fb3ba947585c5417ffbc4654a43

Reference: github-starred/ollama#7838