[GH-ISSUE #9894] Ollama endpoints convention #6478

Closed
opened 2026-04-12 18:03:06 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @Propfend on GitHub (Mar 19, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9894

Why not follow llama.cpp's endpoint convention?

That is, /v1/completions and /v1/chat/completions instead of /api/chat and /api/generate.

I notice that many of the problems the community currently has involve proxying with Ollama.

I have experience proxying to an Ollama server, and using it in a load balancer alongside other LLM engines such as llama.cpp is difficult: because their endpoints differ, the result of a request is not predictable.

We could deprecate the existing endpoints when introducing the new ones, so that the update is not a breaking change.

GiteaMirror added the feature request label 2026-04-12 18:03:06 -05:00

@rick-github commented on GitHub (Mar 19, 2025):

https://github.com/ollama/ollama/blob/main/docs/openai.md#endpoints
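
The linked page documents Ollama's OpenAI-compatible endpoints, which include /v1/chat/completions and /v1/completions. As a rough sketch of what that means for the load-balancer case described above, a proxy can build one OpenAI-style request and send it to either engine. The backend names, hosts, and ports below are placeholders, and `build_chat_request` is a hypothetical helper, not part of either project.

```python
# Hypothetical backend pool: the hosts and ports are placeholders (assumptions).
BACKENDS = {
    "ollama": "http://localhost:11434",
    "llama.cpp": "http://localhost:8080",
}

# The same OpenAI-style path is used for every backend in the pool.
CHAT_PATH = "/v1/chat/completions"


def build_chat_request(backend, model, messages, temperature=0.0, max_tokens=1024):
    """Return (url, json_body) for an OpenAI-style chat request (hypothetical helper)."""
    url = BACKENDS[backend].rstrip("/") + CHAT_PATH
    body = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": False,
    }
    return url, body


if __name__ == "__main__":
    msgs = [{"role": "user", "content": "Why is the sky blue?"}]
    for name in BACKENDS:
        url, _ = build_chat_request(name, "qwen2.5:latest", msgs)
        print(name, "->", url)
```

Only the base URL changes per backend; the path and the body stay identical, which is the predictability the original post asks for.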


@pearsonkyle commented on GitHub (Mar 20, 2025):

There's quite a difference between the default sampling parameters of the two endpoints, so much so that the OpenAI format is effectively useless. Does anyone have insight into how to make the OpenAI API interface more consistent with api/chat?


@rick-github commented on GitHub (Mar 20, 2025):

Examples?


@pearsonkyle commented on GitHub (Mar 20, 2025):

I take it back, things seem to be working as intended now. Sorry for the confusion. This issue can probably be closed.

```
import httpx
from openai import OpenAI

model = "qwen2.5:latest"
api_key: str = "ollama"
base_url: str = "http://localhost:11434"

# OpenAI-compatible endpoint (/v1/chat/completions)
client = OpenAI(
    base_url=f"{base_url}/v1",
    api_key=api_key
)

messages = [{"role": "system", "content": "Talk like a pirate."},
            {"role": "user", "content": "Why is the sky blue?"}]

response1 = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0.0,
    max_tokens=1024
)

print(response1.choices[0].message.content)

# Native endpoint (/api/chat) with the equivalent sampling options
payload = {
    "model": model,
    "messages": messages,
    "stream": False,
    "options": {
        "temperature": 0.0,
        "num_predict": 1024
    }
}
# Use a separate name so the OpenAI client above is not shadowed
with httpx.Client() as http_client:
    response2 = http_client.post(f"{base_url}/api/chat", json=payload, timeout=600)

print(response2.json()['message']['content'])
```

Output:

```
Arrr, matey! The sky be blue 'cause o' the way light behaves when it hits the atmosphere. Sunlight, which is made up of different colors, comes through the air and then gets scattered all around by gases and tiny particles in the atmosphere. Blue light gets scattered more than other colors because it travels as shorter, smaller waves. So, when ye look up at the sky, ye see that blue color most of the time!
Arrr, matey! The sky be blue 'cause o' the way light behaves when it hits the atmosphere. Sunlight, which is made up of different colors, comes through the air and then gets scattered all around by gases and tiny particles in the atmosphere. Blue light gets scattered more than other colors because it travels as shorter, smaller waves. So, when ye look up at the sky, ye see that blue color most of the time!
```
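
The comparison above pairs temperature and max_tokens on the OpenAI-style call with options.temperature and options.num_predict on /api/chat. A minimal sketch of that translation follows, covering only those two parameters as used in the script; `to_native_payload` is a hypothetical helper name, not something Ollama provides.

```python
def to_native_payload(model, messages, temperature=None, max_tokens=None):
    """Build an /api/chat body from OpenAI-style arguments (hypothetical helper).

    Only the two parameters used in the script above are translated:
    temperature -> options.temperature, max_tokens -> options.num_predict.
    """
    options = {}
    if temperature is not None:
        options["temperature"] = temperature
    if max_tokens is not None:
        options["num_predict"] = max_tokens

    payload = {"model": model, "messages": messages, "stream": False}
    if options:
        # Omit "options" entirely so unset values fall back to server defaults.
        payload["options"] = options
    return payload


if __name__ == "__main__":
    msgs = [{"role": "user", "content": "Why is the sky blue?"}]
    print(to_native_payload("qwen2.5:latest", msgs, temperature=0.0, max_tokens=1024))
```

Leaving unset parameters out of the payload, rather than sending a guessed default, keeps any remaining difference attributable to the server's own defaults.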

@rick-github commented on GitHub (Mar 21, 2025):

There was a time when they were inconsistent (#6665, #6688) but everything should be aligned now. If you do see inconsistencies please open an issue.
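
For anyone who wants to check for such inconsistencies before opening an issue, one possible sketch is to compare the text returned by both endpoints for the same deterministic request. It assumes the two response shapes read in the script earlier in this thread (choices[0].message.content for /v1/chat/completions, message.content for /api/chat); `extract_content` and `same_output` are hypothetical helper names.

```python
def extract_content(response: dict) -> str:
    """Pull the assistant text out of either response shape (hypothetical helper)."""
    if "choices" in response:
        # OpenAI-style body, as returned by /v1/chat/completions
        return response["choices"][0]["message"]["content"]
    # Native body, as returned by /api/chat with "stream": False
    return response["message"]["content"]


def same_output(openai_response: dict, native_response: dict) -> bool:
    """True when both endpoints returned the same text, ignoring outer whitespace."""
    return extract_content(openai_response).strip() == extract_content(native_response).strip()


if __name__ == "__main__":
    # Hand-written stand-ins for real responses, for illustration only.
    a = {"choices": [{"message": {"role": "assistant", "content": "Arrr, matey!"}}]}
    b = {"message": {"role": "assistant", "content": "Arrr, matey!"}}
    print(same_output(a, b))
```

An exact match is only a meaningful signal when sampling is deterministic, for example with temperature set to 0.0 on both requests as in the script earlier in the thread.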


Reference: github-starred/ollama#6478