[GH-ISSUE #9723] A client-only interface #68412

Closed · opened 2026-05-04 13:51:31 -05:00 by GiteaMirror · 2 comments

Originally created by @Sora233 on GitHub (Mar 13, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9723

Hello, I want a small client-only interface, without any runtime drivers.
![Image](https://github.com/user-attachments/assets/ab082232-fadf-418b-9f51-5e46b42cc0e0)

Everything is running on `OLLAMA_HOST`.

How can I do this?

GiteaMirror added the feature request label 2026-05-04 13:51:32 -05:00

@rick-github commented on GitHub (Apr 21, 2025):

Terminal clients can be found [here](https://github.com/ollama/ollama?tab=readme-ov-file#terminal).

Or you can just extract `bin/ollama` from the .tgz file and use that; it won't try to load runtime drivers unless you start it as `ollama serve`.
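For example, something like this (a sketch: the tarball name assumes the Linux amd64 release, and `remote-host:11434` stands in for wherever your server lives):

```console
$ tar -xzf ollama-linux-amd64.tgz bin/ollama
$ OLLAMA_HOST=remote-host:11434 ./bin/ollama run gemma3
```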

If you want the most minimal client, you can just use a simple Python script:

```python
#!/usr/bin/env python3

import ollama
import argparse
import sys
try:
  import readline  # optional: line editing and history at the prompt
except ImportError:
  pass

parser = argparse.ArgumentParser()
parser.add_argument("--system", help="Set system message", default=None)
parser.add_argument("--num_ctx", help="Set context size", default=None)
parser.add_argument("--num_gpu", help="Set number of GPU layers", default=None)
parser.add_argument("--temperature", help="Set temperature", default=None)
parser.add_argument("model")
parser.add_argument("prompts", nargs='*')
args = parser.parse_args()

# ollama.Client() honors OLLAMA_HOST, so all the work happens on the server.
client = ollama.Client()
userprompt = ">>> " if sys.stdin.isatty() else ""

# Options left as None fall back to the server's defaults.
options = {
  "temperature": float(args.temperature) if args.temperature else None,
  "num_ctx": int(args.num_ctx) if args.num_ctx else None,
  "num_gpu": int(args.num_gpu) if args.num_gpu else None,
}

def chat(messages, prompt):
  # Append the user turn, stream the reply, then record the assistant
  # turn so context carries across prompts.
  messages.append({"role": "user", "content": prompt})
  response = client.chat(model=args.model, messages=messages, options=options, stream=True)
  m = ''
  for r in response:
    c = r['message']['content']
    print(c, end='', flush=True)
    m = m + c
  print()
  messages.append({"role": "assistant", "content": m})
  return messages

messages = []
if args.system:
  messages.append({"role": "system", "content": args.system})
# Run any prompts given on the command line first, then go interactive.
for prompt in args.prompts:
  messages = chat(messages, prompt)
while True:
  try:
    prompt = input(userprompt)
  except (EOFError, KeyboardInterrupt):  # Ctrl-D / Ctrl-C exits cleanly
    print()
    break
  if prompt == "/bye":
    break
  messages = chat(messages, prompt)
```
```console
$ OLLAMA_HOST=:11434 ./simple.py --system "Speak like a pirate" gemma3 hello
Ahoy there, matey! What brings ye to these waters? Speak yer piece, or be walkin' the plank!

(Just to give you a better feel for it, here are some other phrases I might use:)

*   Shiver me timbers!
*   Avast ye!
*   Aye, aye, Captain!
*   Land ho!
*   Heave ho!

What's on yer mind, savvy?
>>> count to 8
Aye, aye, Cap'n! Let's see...

One! *Scratch, scratch*
Two! *Swish of a cutlass*
Three! *A hearty bellow*
Four! *A wink and a grin*
Five! *A quick, "Yo ho ho!"*
Six! *A thoughtful pause*
Seven! *A flourish of the hand*
Eight! *A triumphant shout!*

There ye have it, Cap'n! Eight be a fine number, aye?
>>> /bye
```
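
If you want to avoid installing the `ollama` package as well, something along these lines should work using only the standard library against the server's `/api/chat` endpoint. This is a rough, non-streaming sketch; the `localhost:11434` fallback, the plain `host:port` reading of `OLLAMA_HOST`, and the `raw.py` filename are all assumptions:

```python
#!/usr/bin/env python3
# Dependency-free sketch: one-shot chat against an Ollama server's
# /api/chat endpoint using only the standard library.

import json
import os
import sys
import urllib.request

# Assumes OLLAMA_HOST is a bare host:port; shorthand like ":11434" is
# not normalized here the way the ollama CLI does.
host = os.environ.get("OLLAMA_HOST", "localhost:11434")
url = f"http://{host}/api/chat"

def chat(model, messages):
  # Non-streaming request: the server replies with a single JSON object.
  body = json.dumps({"model": model, "messages": messages, "stream": False}).encode()
  req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
  with urllib.request.urlopen(req) as resp:
    return json.load(resp)["message"]["content"]

model = sys.argv[1]
messages = [{"role": "user", "content": " ".join(sys.argv[2:])}]
print(chat(model, messages))
```

Invoked the same way, e.g. `OLLAMA_HOST=localhost:11434 ./raw.py gemma3 hello`.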

@Sora233 commented on GitHub (Apr 22, 2025):

That's great. Thanks.

Reference: github-starred/ollama#68412