[GH-ISSUE #14633] Ollama Initialization #35241

Closed
opened 2026-04-22 19:37:36 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @archer12082002-cpu on GitHub (Mar 5, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14633

I am creating a small project, written as a Python script, that accesses llama3:8b through Ollama in offline mode. The objective is to read a sentence and rephrase it, so I give a prompt accordingly and plan to attach an XML or TXT file with it containing 10 examples, to give the model a better understanding. I am facing some problems with this:

  1. For the first sentence it takes longer than usual to produce output. How do I set a timeout for it, or is there a way to warm the model up first and then give it inputs?
  2. The ideal timeout would be 2 minutes per sentence.
  3. Is it better to attach the examples as a separate attachment, or to give them in the prompt only?
GiteaMirror added the feature request label 2026-04-22 19:37:36 -05:00
Author
Owner

@rick-github commented on GitHub (Mar 5, 2026):

  1. The script can pre-load a model by calling it without a prompt. Setting `keep_alive=-1` will prevent the model from being unloaded after it has been idle for more than 5 minutes (the default):

```python
ollama.chat(model="llama3:8b", keep_alive=-1)
```

  2. The ollama server doesn't have a timeout, but one can be set when initializing the client:

```python
client = ollama.Client(timeout=120)
```

If the ollama server doesn't respond, the client will throw an `httpx.ReadTimeout` exception, so the call should be wrapped in a `try`/`except` block.

  3. It would be better to add examples of how to format text to the system message:
```python
import ollama
import httpx

model = "llama3:8b"
client = ollama.Client(timeout=120)  # timeout in seconds

# Pre-load the model so the first real request isn't slowed by a cold start;
# keep_alive=-1 keeps it resident indefinitely.
ollama.chat(model=model, keep_alive=-1)

# Few-shot examples go into the system message.
with open("example.txt") as f:
    examples = f.read()
messages = [
    {"role": "system", "content": f"Your job is to rephrase sentences. Here are some examples:\n{examples}\n"}
]

sentence = input("Enter a sentence to be rephrased: ")
try:
    response = client.chat(
        model=model,
        messages=messages + [{"role": "user", "content": sentence}],
        keep_alive=-1)
    print(response.message.content)
except httpx.ReadTimeout:
    print("timed out")
except Exception as e:
    print(f"chat failed: {e}")
```
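As a minimal sketch of the "examples in the system message" approach above (the pair format and the example sentences here are hypothetical, not from the thread), the examples file can be built from original/rephrased pairs and folded into the prompt as plain text:

```python
# Hypothetical few-shot pairs; in practice these would come from example.txt.
examples = [
    ("The cat sat on the mat.", "A cat was sitting on the mat."),
    ("It is raining heavily.", "Heavy rain is falling."),
]

# Render each pair in a simple labeled format the model can imitate.
example_text = "\n".join(
    f"Original: {src}\nRephrased: {dst}\n" for src, dst in examples
)

system_prompt = (
    "Your job is to rephrase sentences. Here are some examples:\n"
    + example_text
)
print(system_prompt)
```

Keeping the examples inline like this avoids any attachment handling: the model only ever sees one system message, and the file format stays entirely under the script's control.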

Reference: github-starred/ollama#35241