[PR #3907] [CLOSED] Add /api/infill for fill-in-the-middle #37189

Closed
opened 2026-04-22 21:54:37 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/3907
Author: @Crustack
Created: 4/25/2024
Status: Closed

Base: main ← Head: infill-api


📝 Commits (1)

  • a2b1f49 Add POST /api/infill for fill-in-the-middle

📊 Changes

6 files changed (+380 additions, -118 deletions)

View changed files

📝 api/client.go (+14 -0)
📝 api/types.go (+30 -0)
📝 llm/ext_server/server.cpp (+82 -73)
📝 llm/server.go (+94 -45)
📝 server/routes.go (+157 -0)
📝 server/sched_test.go (+3 -0)

📄 Description

This PR closes #3869

Adds /api/infill to leverage llama.cpp's POST /infill API for infilling / fill-in-the-middle / code-completion.

An example request looks like this:

POST /api/infill HTTP/1.1
Host: localhost:11434
Content-Type: application/json
Content-Length: 199
{
    "stream": false,
    "model": "codellama:7b-instruct-q3_K_M",
    "input_prefix": "public int gcd(int x, int y) {",
    "input_suffix": "\n}",
    "options": {
        "num_predict": 10
    }
}

Response:

{
    "model": "codellama:7b-instruct-q3_K_M",
    "created_at": "2024-04-25T11:09:06.691622744Z",
    "response": "\n    return y == 0 ? x :",
    "done": true,
    "total_duration": 3385921466,
    "load_duration": 948913941,
    "prompt_eval_count": 18,
    "prompt_eval_duration": 1175793000,
    "eval_count": 10,
    "eval_duration": 1219642000
}

(Streaming is also available)

Note: this could probably use more refactoring to clean up duplicate code. I am very new to Go programming, so feedback would be appreciated.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 21:54:37 -05:00

Reference: github-starred/ollama#37189