[PR #2410] [CLOSED] Added Encoding endpoint #21420

Closed
opened 2026-04-19 15:37:24 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/2410
Author: @suvalaki
Created: 2/8/2024
Status: Closed

Base: main ← Head: main


📝 Commits (1)

  • 60d90ca Added Encoding endpoint

📊 Changes

3 files changed (+169 additions, -0 deletions)

View changed files

📝 api/client.go (+13 -0)
📝 api/types.go (+10 -0)
📝 server/routes.go (+146 -0)

📄 Description

It seems useful to expose the encoding function of a model that is called by the generate methods, to enable token counting without running the model end to end.

Some thoughts:

  • I'm not sure whether replicating the logic that modifies the prompt (the same as the generate function) is correct here, or whether we should simplify and just look at the raw prompt string.
  • I haven't made a similar endpoint for getting the tokens from a chat call.

My current thinking is that simplifying this would be better, and I don't necessarily need to replicate all the prompt logic. I'm interested in feedback and in any pushback folks have to exposing the function. A sketch of the simplified approach follows below.
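To make the simplified option concrete, here is a minimal sketch of what a raw-prompt handler could look like, assuming we skip the generate path's prompt templating entirely. The Tokenizer interface, EncodeHandler, and the request/response types here are illustrative placeholders, not ollama's actual internals:

```go
package server

import (
	"encoding/json"
	"net/http"
	"time"
)

// Tokenizer is a hypothetical stand-in for whatever the loaded
// model exposes for encoding text into token IDs.
type Tokenizer interface {
	Encode(prompt string) ([]int, error)
}

type EncodeRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type EncodeResponse struct {
	Model           string    `json:"model"`
	CreatedAt       time.Time `json:"created_at"`
	Context         []int     `json:"context"`
	PromptEvalCount int       `json:"prompt_eval_count"`
}

// EncodeHandler tokenizes the raw prompt string as-is,
// without applying any model-specific prompt template.
func EncodeHandler(tok Tokenizer) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req EncodeRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		tokens, err := tok.Encode(req.Prompt)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp := EncodeResponse{
			Model:           req.Model,
			CreatedAt:       time.Now().UTC(),
			Context:         tokens,
			PromptEvalCount: len(tokens),
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(resp)
	}
}
```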

Following up from https://github.com/ollama/ollama/issues/1345

Btw, the proposal adds a new endpoint, /api/encode. At the request level a call looks like this:
Input

```json
{
  "model": "mistral:latest",
  "prompt": "Why is the sky blue?"
}
```

Output

```json
{
  "model": "mistral:latest",
  "created_at": "2024-02-05T21:49:44.472893Z",
  "total_duration": 8965307875,
  "load_duration": 8961889917,
  "context": [
    733, 16289, 28793, 28705, 4315, 349, 272, 7212,
    5045, 28804, 733, 28748, 16289, 28793, 13
  ],
  "prompt_eval_count": 15
}
```
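For reference, a call against the proposed endpoint could look like the Go snippet below. The field names mirror the example payloads above, and localhost:11434 is ollama's default address; the endpoint itself is only a proposal and does not exist in released ollama:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Build the request body shown in the Input example above.
	body, _ := json.Marshal(map[string]string{
		"model":  "mistral:latest",
		"prompt": "Why is the sky blue?",
	})
	resp, err := http.Post("http://localhost:11434/api/encode",
		"application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Decode only the fields needed for token counting.
	var out struct {
		Context         []int `json:"context"`
		PromptEvalCount int   `json:"prompt_eval_count"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("prompt uses %d tokens: %v\n", out.PromptEvalCount, out.Context)
}
```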

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 15:37:24 -05:00
Reference: github-starred/ollama#21420