[GH-ISSUE #1997] 🔙 Some kind of regression while running on some LlamaIndex versions (Kaggle & Killercoda) #63190

GiteaMirror · 2026-05-03T12:27:56-05:00

GiteaMirror commented

2026-05-03 12:27:56 -05:00

Originally created by @adriens on GitHub (Jan 15, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1997

Originally assigned to: @jmorganca on GitHub.

❔ About

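While working on an ollama tutorial on Kaggle, I have been hitting a regression with LlamaIndex for the past few days.

Here is the output I could get with any model (it worked every time):

![image](https://github.com/langchain-ai/langchainjs/assets/5235127/89ebe9c2-55d4-41da-8b32-74d243759f2e)

... versus now (the code is broken and fails consistently):

![image](https://github.com/langchain-ai/langchainjs/assets/5235127/4121bd48-0c35-461b-81ba-f2353b06ee45)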

ℹ️

✔️ Everything works perfectly well on my laptop

🤔 Looks like something changed that causes this "regression" while playing around in some cases 💭

📜 Detailed stacktrace

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:10, in map_exceptions(map)
      9 try:
---> 10     yield
     11 except Exception as exc:  # noqa: PIE786

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:206, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    205 with map_exceptions(exc_map):
--> 206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )
    211     for option in socket_options:

File /opt/conda/lib/python3.10/socket.py:845, in create_connection(address, timeout, source_address)
    844 try:
--> 845     raise err
    846 finally:
    847     # Break explicitly a reference cycle

File /opt/conda/lib/python3.10/socket.py:833, in create_connection(address, timeout, source_address)
    832     sock.bind(source_address)
--> 833 sock.connect(sa)
    834 # Break explicitly a reference cycle

OSError: [Errno 99] Cannot assign requested address

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:67, in map_httpcore_exceptions()
     66 try:
---> 67     yield
     68 except Exception as exc:

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:231, in HTTPTransport.handle_request(self, request)
    230 with map_httpcore_exceptions():
--> 231     resp = self._pool.handle_request(req)
    233 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:268, in ConnectionPool.handle_request(self, request)
    267         self.response_closed(status)
--> 268     raise exc
    269 else:

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:251, in ConnectionPool.handle_request(self, request)
    250 try:
--> 251     response = connection.handle_request(request)
    252 except ConnectionNotAvailable:
    253     # The ConnectionNotAvailable exception is a special case, that
    254     # indicates we need to retry the request on a new connection.
   (...)
    258     # might end up as an HTTP/2 connection, but which actually ends
    259     # up as HTTP/1.1.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:99, in HTTPConnection.handle_request(self, request)
     98         self._connect_failed = True
---> 99         raise exc
    100 elif not self._connection.is_available():

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:76, in HTTPConnection.handle_request(self, request)
     75 try:
---> 76     stream = self._connect(request)
     78     ssl_object = stream.get_extra_info("ssl_object")

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:124, in HTTPConnection._connect(self, request)
    123 with Trace("connect_tcp", logger, request, kwargs) as trace:
--> 124     stream = self._network_backend.connect_tcp(**kwargs)
    125     trace.return_value = stream

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:205, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    200 exc_map: ExceptionMapping = {
    201     socket.timeout: ConnectTimeout,
    202     OSError: ConnectError,
    203 }
--> 205 with map_exceptions(exc_map):
    206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ConnectError: [Errno 99] Cannot assign requested address

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
Cell In[13], line 5
      2 from llama_index.llms import Ollama
      4 llm = Ollama(model=OLLAMA_MODEL)
----> 5 response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
      6 (Answer with markdown sections, markdown with be the GitHub flavor.)""")
      7 print(response)

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/base.py:226, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict(_self, *args, **kwargs)
    216 with wrapper_logic(_self) as callback_manager:
    217     event_id = callback_manager.on_event_start(
    218         CBEventType.LLM,
    219         payload={
   (...)
    223         },
    224     )
--> 226     f_return_val = f(_self, *args, **kwargs)
    227     if isinstance(f_return_val, Generator):
    228         # intercept the generator and add a callback to the end
    229         def wrapped_gen() -> CompletionResponseGen:

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/ollama.py:180, in Ollama.complete(self, prompt, formatted, **kwargs)
    171 payload = {
    172     self.prompt_key: prompt,
    173     "model": self.model,
   (...)
    176     **kwargs,
    177 }
    179 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 180     response = client.post(
    181         url=f"{self.base_url}/api/generate",
    182         json=payload,
    183     )
    184     response.raise_for_status()
    185     raw = response.json()

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1146, in Client.post(self, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
   1125 def post(
   1126     self,
   1127     url: URLTypes,
   (...)
   1139     extensions: typing.Optional[RequestExtensions] = None,
   1140 ) -> Response:
   1141     """
   1142     Send a `POST` request.
   1143 
   1144     **Parameters**: See `httpx.request`.
   1145     """
-> 1146     return self.request(
   1147         "POST",
   1148         url,
   1149         content=content,
   1150         data=data,
   1151         files=files,
   1152         json=json,
   1153         params=params,
   1154         headers=headers,
   1155         cookies=cookies,
   1156         auth=auth,
   1157         follow_redirects=follow_redirects,
   1158         timeout=timeout,
   1159         extensions=extensions,
   1160     )

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:828, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    813     warnings.warn(message, DeprecationWarning)
    815 request = self.build_request(
    816     method=method,
    817     url=url,
   (...)
    826     extensions=extensions,
    827 )
--> 828 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:915, in Client.send(self, request, stream, auth, follow_redirects)
    907 follow_redirects = (
    908     self.follow_redirects
    909     if isinstance(follow_redirects, UseClientDefault)
    910     else follow_redirects
    911 )
    913 auth = self._build_request_auth(request, auth)
--> 915 response = self._send_handling_auth(
    916     request,
    917     auth=auth,
    918     follow_redirects=follow_redirects,
    919     history=[],
    920 )
    921 try:
    922     if not stream:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:943, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    940 request = next(auth_flow)
    942 while True:
--> 943     response = self._send_handling_redirects(
    944         request,
    945         follow_redirects=follow_redirects,
    946         history=history,
    947     )
    948     try:
    949         try:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:980, in Client._send_handling_redirects(self, request, follow_redirects, history)
    977 for hook in self._event_hooks["request"]:
    978     hook(request)
--> 980 response = self._send_single_request(request)
    981 try:
    982     for hook in self._event_hooks["response"]:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1016, in Client._send_single_request(self, request)
   1011     raise RuntimeError(
   1012         "Attempted to send an async request with a sync Client instance."
   1013     )
   1015 with request_context(request=request):
-> 1016     response = transport.handle_request(request)
   1018 assert isinstance(response.stream, SyncByteStream)
   1020 response.request = request

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:230, in HTTPTransport.handle_request(self, request)
    216 assert isinstance(request.stream, SyncByteStream)
    218 req = httpcore.Request(
    219     method=request.method,
    220     url=httpcore.URL(
   (...)
    228     extensions=request.extensions,
    229 )
--> 230 with map_httpcore_exceptions():
    231     resp = self._pool.handle_request(req)
    233 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    151     value = typ()
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.
    158     return exc is not value

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:84, in map_httpcore_exceptions()
     81     raise
     83 message = str(exc)
---> 84 raise mapped_exc(message) from exc

ConnectError: [Errno 99] Cannot assign requested address
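The innermost frame of the trace is `socket.create_connection`, so the request fails before any HTTP traffic reaches the server. A minimal sketch for checking this outside of `llama_index`, using only the standard library, could look like the following. The helper name `can_connect` is hypothetical, and the address to probe is an assumption: Ollama's default port is 11434, and the thread does not show which `base_url` the notebook uses.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, name-resolution failures and
        # errors such as "[Errno 99] Cannot assign requested address"
        # are all subclasses of OSError.
        return False
```

Calling `can_connect("localhost", 11434)` and `can_connect("127.0.0.1", 11434)` side by side may help tell a name-resolution or loopback problem apart from a server that is simply not running; which of these applies on Kaggle is not established in this thread.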

🎟️ Potentially related issues

https://github.com/jmorganca/ollama/issues/1478
https://github.com/jmorganca/ollama/issues/1641
https://github.com/jmorganca/ollama/issues/1550
https://github.com/jmorganca/ollama/pull/1146

GiteaMirror added the bug label 2026-05-03 12:27:56 -05:00

GiteaMirror closed this issue

2026-05-03 12:27:58 -05:00

GiteaMirror commented

2026-05-03 12:27:58 -05:00

@adriens commented on GitHub (Jan 15, 2024):

❔ Is there a way to install any previous ollama version, from shell (so I can point where it started to fail)?

GiteaMirror commented

2026-05-03 12:27:59 -05:00

@jmorganca commented on GitHub (Jan 15, 2024):

@adriens sorry you hit this. Will look into it. Until it's fixed, you can install previous versions with this script (for example, 0.1.17):

curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.17#' | sh
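The one-liner above works by rewriting the download URL inside the install script so that it points at a specific GitHub release. When several versions need to be tried in turn, the same command can be generated per version. This is only a sketch: `install_command` is a hypothetical helper, and it assumes every release follows the same `v<version>` URL pattern as the 0.1.17 example.

```python
def install_command(version: str) -> str:
    """Build the shell one-liner that installs a given ollama release."""
    # Assumption: each release lives under .../releases/download/v<version>,
    # as in the 0.1.17 example from the thread.
    release_url = (
        f"https://github.com/jmorganca/ollama/releases/download/v{version}"
    )
    return (
        "curl https://ollama.ai/install.sh"
        f" | sed 's#https://ollama.ai/download#{release_url}#'"
        " | sh"
    )
```

Printing `install_command("0.1.17")` reproduces the command given above, which can then be pasted into a shell or notebook cell.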

GiteaMirror commented

2026-05-03 12:28:00 -05:00

@adriens commented on GitHub (Jan 15, 2024):

Thanks a lot for the fast answer and the shell tip 👍

GiteaMirror commented

2026-05-03 12:28:02 -05:00

@adriens commented on GitHub (Jan 15, 2024):

Test in progress: I will keep you up-to-date ⚡

GiteaMirror commented

2026-05-03 12:28:03 -05:00

@adriens commented on GitHub (Jan 15, 2024):

Surprisingly, it looks like all previous versions are failing... I'm unable to reproduce a successful run:

| `ollama` version | Result |
| --- | --- |
| v0.1.20 | 👎 |
| v0.1.17 | 👎 |
| v0.1.16 | 👎 |

👉 Here are the two runs:

👍 A successful run: https://www.kaggle.com/adriensales/ollama-running-local-models-w-llamaindex-cpu
👎 A broken one: https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu?scriptVersionId=158989000

GiteaMirror commented

2026-05-03 12:28:04 -05:00

@adriens commented on GitHub (Jan 15, 2024):

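I gave it a try on Killercoda and could easily reproduce the behavior:

![image](https://github.com/jmorganca/ollama/assets/5235127/889ffba0-979b-4da4-acb1-0f55dae4941f)

Then pip install llama_index:

![image](https://github.com/jmorganca/ollama/assets/5235127/4a2447a9-5020-4146-922f-c6b1e8249a34)

Then running

 python demo.py

... produces the timeout: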

llm = Ollama(model=OLLAMA_MODEL)
response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
(Answer with markdown sections, markdown with be the GitHub flavor.)""")
print(response)
ubuntu $ python demo.py 
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_backends/sync.py", line 126, in read
    return self._sock.recv(max_bytes)
socket.timeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 67, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 231, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection.py", line 103, in handle_request
    return self._connection.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 133, in handle_request
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 111, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 176, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 212, in _receive_event
    data = self._network_stream.read(
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_backends/sync.py", line 126, in read
    return self._sock.recv(max_bytes)
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "demo.py", line 6, in <module>
    response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
  File "/usr/local/lib/python3.8/dist-packages/llama_index/llms/base.py", line 226, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/llama_index/llms/ollama.py", line 180, in complete
    response = client.post(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1146, in post
    return self.request(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 828, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 915, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 943, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 980, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1016, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 231, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 84, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out

GiteaMirror commented

2026-05-03 12:28:05 -05:00

@adriens commented on GitHub (Jan 15, 2024):

🤔 Maybe something around llama_index ❔

GiteaMirror commented

2026-05-03 12:28:06 -05:00

@adriens commented on GitHub (Jan 16, 2024):

I tried an earlier `llama_index` release:

!pip install llama-index==0.9.23

... but still got the same issue:

![image](https://github.com/jmorganca/ollama/assets/5235127/78a4308d-b8a9-4b42-b18d-88195aaab49c)

GiteaMirror commented

2026-05-03 12:28:07 -05:00

@adriens commented on GitHub (Jan 16, 2024):

https://github.com/jmorganca/ollama/issues/1863

GiteaMirror commented

2026-05-03 12:28:09 -05:00

@adriens commented on GitHub (Jan 16, 2024):

https://github.com/jmorganca/ollama/issues/1910

GiteaMirror commented

2026-05-03 12:28:11 -05:00

@adriens commented on GitHub (Jan 16, 2024):

✋ Compatibility matrix

I got it working with some configurations; here is the full matrix:

| `ollama` | `llama_index` | Status |
| --- | --- | --- |
| `v0.1.16` | `0.9.21` | 🆗 |
| `v0.1.17` | `v0.9.21` | 🆗 |
| `v0.1.18` | `v0.9.21` | 🆗 |
| `v0.1.20` | `v0.9.21` | 🆗 |
| `v0.1.16` | `0.9.22` | 👎 |
| `v0.1.16` | `v0.9.31` (current) | 👎 |
| `v0.1.17` | `v0.9.31` (current) | 👎 |
| `v0.1.18` | `v0.9.31` (current) | ❔ |
| `v0.1.19` | `v0.9.31` (current) | ❔ |
| `v0.1.20` | `v0.9.31` (current) | 👎 |
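
To make rows like these easier to reproduce on another host (Kaggle, Killercoda, a laptop), the Python-side versions can be read from package metadata instead of being typed by hand. The sketch below is only an illustration: the helper names `installed_version` and `version_report` are hypothetical, and the distribution names `llama-index` and `httpx` are taken from the `pip install` line and the tracebacks in this thread. The Ollama server version is not a Python package and has to be recorded separately.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_report(dist_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Distribution names assumed from this thread; adjust as needed.
    for name, version in version_report(["llama-index", "httpx"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting that output next to each 🆗 / 👎 result would show whether two environments that behave differently really run the same client libraries.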

GiteaMirror commented

2026-05-03 12:28:14 -05:00

@adriens commented on GitHub (Jan 17, 2024):

[🆓 Local & Open Source AI: a kind ollama & LlamaIndex intro](https://dev.to/adriens/local-open-source-ai-a-kind-ollama-llamaindex-intro-1nnc)

GiteaMirror commented

2026-05-03 12:28:16 -05:00

@tinycrops commented on GitHub (Jan 25, 2024):

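I was using a derivative of adriens's [notebook](https://www.kaggle.com/code/matthewhendricks/notebook0cd9dcd006) and had to interrupt the cell because the call never returned: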

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 53
     43 llm = Ollama(model=OLLAMA_MODEL)
     44 # response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
     45 # (Answer with markdown sections, markdown with be the GitHub flavor.)""")
     46 # print(response)
   (...)
     51 
     52 # bash_chain.run(text)
---> 53 llm.invoke(f"Translate to a scientific lecture: {PROMPT}")

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:230, in BaseLLM.invoke(self, input, config, stop, **kwargs)
    220 def invoke(
    221     self,
    222     input: LanguageModelInput,
   (...)
    226     **kwargs: Any,
    227 ) -> str:
    228     config = ensure_config(config)
    229     return (
--> 230         self.generate_prompt(
    231             [self._convert_input(input)],
    232             stop=stop,
    233             callbacks=config.get("callbacks"),
    234             tags=config.get("tags"),
    235             metadata=config.get("metadata"),
    236             run_name=config.get("run_name"),
    237             **kwargs,
    238         )
    239         .generations[0][0]
    240         .text
    241     )

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:525, in BaseLLM.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    517 def generate_prompt(
    518     self,
    519     prompts: List[PromptValue],
   (...)
    522     **kwargs: Any,
    523 ) -> LLMResult:
    524     prompt_strings = [p.to_string() for p in prompts]
--> 525     return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:698, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, run_name, **kwargs)
    682         raise ValueError(
    683             "Asked to cache, but no cache found at `langchain.cache`."
    684         )
    685     run_managers = [
    686         callback_manager.on_llm_start(
    687             dumpd(self),
   (...)
    696         )
    697     ]
--> 698     output = self._generate_helper(
    699         prompts, stop, run_managers, bool(new_arg_supported), **kwargs
    700     )
    701     return output
    702 if len(missing_prompts) > 0:

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:562, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    560     for run_manager in run_managers:
    561         run_manager.on_llm_error(e, response=LLMResult(generations=[]))
--> 562     raise e
    563 flattened_outputs = output.flatten()
    564 for manager, flattened_output in zip(run_managers, flattened_outputs):

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:549, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    539 def _generate_helper(
    540     self,
    541     prompts: List[str],
   (...)
    545     **kwargs: Any,
    546 ) -> LLMResult:
    547     try:
    548         output = (
--> 549             self._generate(
    550                 prompts,
    551                 stop=stop,
    552                 # TODO: support multiple run managers
    553                 run_manager=run_managers[0] if run_managers else None,
    554                 **kwargs,
    555             )
    556             if new_arg_supported
    557             else self._generate(prompts, stop=stop)
    558         )
    559     except BaseException as e:
    560         for run_manager in run_managers:

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:400, in Ollama._generate(self, prompts, stop, images, run_manager, **kwargs)
    398 generations = []
    399 for prompt in prompts:
--> 400     final_chunk = super()._stream_with_aggregation(
    401         prompt,
    402         stop=stop,
    403         images=images,
    404         run_manager=run_manager,
    405         verbose=self.verbose,
    406         **kwargs,
    407     )
    408     generations.append([final_chunk])
    409 return LLMResult(generations=generations)

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:309, in _OllamaCommon._stream_with_aggregation(self, prompt, stop, run_manager, verbose, **kwargs)
    300 def _stream_with_aggregation(
    301     self,
    302     prompt: str,
   (...)
    306     **kwargs: Any,
    307 ) -> GenerationChunk:
    308     final_chunk: Optional[GenerationChunk] = None
--> 309     for stream_resp in self._create_generate_stream(prompt, stop, **kwargs):
    310         if stream_resp:
    311             chunk = _stream_response_to_generation_chunk(stream_resp)

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:154, in _OllamaCommon._create_generate_stream(self, prompt, stop, images, **kwargs)
    146 def _create_generate_stream(
    147     self,
    148     prompt: str,
   (...)
    151     **kwargs: Any,
    152 ) -> Iterator[str]:
    153     payload = {"prompt": prompt, "images": images}
--> 154     yield from self._create_stream(
    155         payload=payload,
    156         stop=stop,
    157         api_url=f"{self.base_url}/api/generate/",
    158         **kwargs,
    159     )

File /opt/conda/lib/python3.10/site-packages/requests/models.py:865, in Response.iter_lines(self, chunk_size, decode_unicode, delimiter)
    856 """Iterates over the response data, one line at a time.  When
    857 stream=True is set on the request, this avoids reading the
    858 content at once into memory for large responses.
    859 
    860 .. note:: This method is not reentrant safe.
    861 """
    863 pending = None
--> 865 for chunk in self.iter_content(
    866     chunk_size=chunk_size, decode_unicode=decode_unicode
    867 ):
    869     if pending is not None:
    870         chunk = pending + chunk

File /opt/conda/lib/python3.10/site-packages/requests/utils.py:571, in stream_decode_response_unicode(iterator, r)
    568     return
    570 decoder = codecs.getincrementaldecoder(r.encoding)(errors="replace")
--> 571 for chunk in iterator:
    572     rv = decoder.decode(chunk)
    573     if rv:

File /opt/conda/lib/python3.10/site-packages/requests/models.py:816, in Response.iter_content.<locals>.generate()
    814 if hasattr(self.raw, "stream"):
    815     try:
--> 816         yield from self.raw.stream(chunk_size, decode_content=True)
    817     except ProtocolError as e:
    818         raise ChunkedEncodingError(e)

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:624, in HTTPResponse.stream(self, amt, decode_content)
    608 """
    609 A generator wrapper for the read() method. A call will block until
    610 ``amt`` bytes have been read from the connection or until the
   (...)
    621     'content-encoding' header.
    622 """
    623 if self.chunked and self.supports_chunked_reads():
--> 624     for line in self.read_chunked(amt, decode_content=decode_content):
    625         yield line
    626 else:

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:828, in HTTPResponse.read_chunked(self, amt, decode_content)
    825     return
    827 while True:
--> 828     self._update_chunk_length()
    829     if self.chunk_left == 0:
    830         break

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:758, in HTTPResponse._update_chunk_length(self)
    756 if self.chunk_left is not None:
    757     return
--> 758 line = self._fp.fp.readline()
    759 line = line.split(b";", 1)[0]
    760 try:

File /opt/conda/lib/python3.10/socket.py:705, in SocketIO.readinto(self, b)
    703 while True:
    704     try:
--> 705         return self._sock.recv_into(b)
    706     except timeout:
    707         self._timeout_occurred = True

KeyboardInterrupt:

GiteaMirror commented

2026-05-03 12:28:17 -05:00

@adriens commented on GitHub (Jan 25, 2024):

🙏 @MeDott29 for the code submission 🐱

GiteaMirror commented

2026-05-03 12:28:18 -05:00

@pdevine commented on GitHub (Mar 12, 2024):

Hey @adriens , this seems to be working fine at least locally. Llama Index added us to a new "ollama" package. I don't have access to Kaggle/Killercoda though, but:

% python3
Python 3.11.7 (main, Dec  4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from llama_index.llms.ollama import Ollama
>>> llm = Ollama(model="llama2", request_timeout=30.0)
>>> resp = llm.complete("Who is Grigori Perelman and why is he so important in mathematics? (Answer with markdown sections, markdown with the GitHub flavor.)")
>>> print(resp)

Grigori Perelman is a Russian mathematician who made significant contributions to the field of geometry and topology, particularly in the area of Riemannian geometry and the Poincaré conjecture. He is considered one of the most important mathematicians of the 21st century, and his work has had a profound impact on the field of mathematics.

Early Life and Education
-------------------------

Grigori Perelman was born in Leningrad (now St. Petersburg), Russia in 1966. He grew up in a family of mathematicians and began studying mathematics at an early age. He graduated from the University of Leningrad in 1987 with a degree in mathematics and went on to pursue his graduate studies at the Steklov Institute of Mathematics in St. Petersburg.

Contributions to Mathematics
-----------------------------

Perelman's most significant contribution to mathematics is his proof of the Poincaré conjecture, which was a longstanding problem in topology. The conjecture states that a simply connected, closed three-dimensional manifold must be topologically equivalent to a three-dimensional sphere. Perelman's proof, which was published in 2003, involved the use of a combination of geometric and topological techniques.

Perelman's work on the Poincaré conjecture is considered one of the most important achievements in mathematics in the last century, and it has had a significant impact on the field of geometry and topology. His proof has been hailed as a masterpiece of mathematical rigor and creativity, and it has opened up new areas of research in geometry and topology.
...

I think maybe this is an issue with Kaggle?
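
One visible difference in this thread is that the session above passes `request_timeout=30.0`, while the failing notebook calls `Ollama(model=OLLAMA_MODEL)` with no explicit timeout. Whether that explains the Kaggle failures is not confirmed here. The standard-library sketch below only illustrates the mechanism behind a read timeout: the same slow server either fails or succeeds depending on the client-side limit. `SlowHandler`, `start_slow_server`, `fetch` and the 0.5 second delay are all made up for the demo and are not part of Ollama or LlamaIndex.

```python
import http.client
import http.server
import socket
import threading
import time


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for a model server that is slow to answer."""

    delay_seconds = 0.5  # arbitrary delay for the demo

    def do_GET(self):
        time.sleep(self.delay_seconds)  # simulate a long generation
        body = b"done"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client already gave up and closed the socket

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_slow_server():
    """Start the slow server on a free local port and return it."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, timeout):
    """Return the response body, or 'timed out' if the read exceeds timeout."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read().decode()
    except socket.timeout:
        return "timed out"
    finally:
        conn.close()
```

Calling `fetch(port, timeout=0.1)` against this server gives up before the reply arrives, while `fetch(port, timeout=5.0)` gets the body. On a CPU-only host where generation is slow, raising `request_timeout` on the `Ollama(...)` constructor is the analogous knob to try.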

GiteaMirror commented

2026-05-03 12:28:19 -05:00

@adriens commented on GitHub (Mar 20, 2024):

I'm giving it a try right now 🤞
https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu

![image](https://github.com/ollama/ollama/assets/5235127/d709eca4-1e4c-4ef6-b86e-1b233a1bd6fc)

GiteaMirror commented

2026-05-03 12:28:20 -05:00

@adriens commented on GitHub (Mar 20, 2024):

Run in progress ⏳

GiteaMirror commented

2026-05-03 12:28:22 -05:00

@adriens commented on GitHub (Mar 20, 2024):

[Now I'm getting this](https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu/log?scriptVersionId=168043744):

ImportError: cannot import name 'Ollama' from 'llama_index.llms' (unknown location)

GiteaMirror commented

2026-05-03 12:28:23 -05:00

@pdevine commented on GitHub (Mar 20, 2024):

@adriens it's `from llama_index.llms.ollama import Ollama`. They changed the package.
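
For a notebook that has to run against both older and newer `llama_index` installs, one option is to try the new import path first and fall back to the old one. This is a minimal sketch, not something taken from the LlamaIndex docs: the helper name `first_importable` is hypothetical, and the two module paths are simply the ones quoted in this thread.

```python
import importlib

# Import locations mentioned in this thread, newest first.
OLLAMA_CANDIDATES = [
    ("llama_index.llms.ollama", "Ollama"),
    ("llama_index.llms", "Ollama"),
]


def first_importable(candidates):
    """Return the first attribute that can be imported, or None if none can."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module (or its parent package) is not installed
        obj = getattr(module, attr_name, None)
        if obj is not None:
            return obj  # found the class at this location
    return None


if __name__ == "__main__":
    Ollama = first_importable(OLLAMA_CANDIDATES)
    print("Ollama class found" if Ollama else "No Ollama integration installed")
```

The `getattr` check matters because the error above shows that `llama_index.llms` itself can import successfully while no longer exposing `Ollama`.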

GiteaMirror commented

2026-05-03 12:28:24 -05:00

@adriens commented on GitHub (Apr 1, 2024):

Hi @pdevine, sorry for the late feedback.

I've just patched the notebook. I'll keep you posted within a few minutes 🤞

GiteaMirror commented

2026-05-03 12:28:25 -05:00

@adriens commented on GitHub (Apr 2, 2024):

![image](https://github.com/ollama/ollama/assets/5235127/dc3d0f64-0d17-4120-96b2-a203b5665ad0)

GiteaMirror commented

2026-05-03 12:28:26 -05:00

@pdevine commented on GitHub (Apr 2, 2024):

Hey @adriens, you should follow the LlamaIndex docs here: https://docs.llamaindex.ai/en/stable/examples/llm/ollama/

You'll need to `pip install llama-index-llms-ollama` first.

GiteaMirror commented

2026-05-03 12:28:27 -05:00

@asma-10 commented on GitHub (Apr 5, 2024):

Hello everyone, I have the same issue when using Ollama.
Have you found a solution? Is it coming from Ollama itself?

![Capture d'écran 2024-04-05 210536](https://github.com/ollama/ollama/assets/101638276/9a8e0866-17b3-46ba-b36d-d0a73031edd8)

GiteaMirror commented

2026-05-03 12:28:27 -05:00

@jmorganca commented on GitHub (May 10, 2024):

Make sure you have the latest `llama_index` package: https://docs.llamaindex.ai/en/stable/api_reference/llms/ollama/

GiteaMirror commented

2026-05-03 12:28:28 -05:00

@jmorganca commented on GitHub (May 10, 2024):

Let me know if you're still encountering this @adriens :)

GiteaMirror commented

2026-05-03 12:28:29 -05:00

@adriens commented on GitHub (May 11, 2024):

Hi @jmorganca , I'm giving it a try right now ⚡

GiteaMirror commented

2026-05-03 12:28:30 -05:00

@adriens commented on GitHub (May 11, 2024):

I applied the modifications but I'm still hitting a timeout:

---------------------------------------------------------------------------
ReadTimeout                               Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:69, in map_httpcore_exceptions()
     68 try:
---> 69     yield
     70 except Exception as exc:

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:233, in HTTPTransport.handle_request(self, request)
    232 with map_httpcore_exceptions():
--> 233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:216, in ConnectionPool.handle_request(self, request)
    215     self._close_connections(closing)
--> 216     raise exc from None
    218 # Return the response. Note that in this case we still have to manage
    219 # the point at which the response is closed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:196, in ConnectionPool.handle_request(self, request)
    194 try:
    195     # Send the request on the assigned connection.
--> 196     response = connection.handle_request(
    197         pool_request.request
    198     )
    199 except ConnectionNotAvailable:
    200     # In some cases a connection may initially be available to
    201     # handle a request, but then become unavailable.
    202     #
    203     # In this case we clear the connection and try again.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:101, in HTTPConnection.handle_request(self, request)
     99     raise exc
--> 101 return self._connection.handle_request(request)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:143, in HTTP11Connection.handle_request(self, request)
    142         self._response_closed()
--> 143 raise exc

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:113, in HTTP11Connection.handle_request(self, request)
    104 with Trace(
    105     "receive_response_headers", logger, request, kwargs
    106 ) as trace:
    107     (
    108         http_version,
    109         status,
    110         reason_phrase,
    111         headers,
    112         trailing_data,
--> 113     ) = self._receive_response_headers(**kwargs)
    114     trace.return_value = (
    115         http_version,
    116         status,
    117         reason_phrase,
    118         headers,
    119     )

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:186, in HTTP11Connection._receive_response_headers(self, request)
    185 while True:
--> 186     event = self._receive_event(timeout=timeout)
    187     if isinstance(event, h11.Response):

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:224, in HTTP11Connection._receive_event(self, timeout)
    223 if event is h11.NEED_DATA:
--> 224     data = self._network_stream.read(
    225         self.READ_NUM_BYTES, timeout=timeout
    226     )
    228     # If we feed this case through h11 we'll raise an exception like:
    229     #
    230     #     httpcore.RemoteProtocolError: can't handle event type
   (...)
    234     # perspective. Instead we handle this case distinctly and treat
    235     # it as a ConnectError.

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:124, in SyncStream.read(self, max_bytes, timeout)
    123 exc_map: ExceptionMapping = {socket.timeout: ReadTimeout, OSError: ReadError}
--> 124 with map_exceptions(exc_map):
    125     self._sock.settimeout(timeout)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ReadTimeout: timed out

The above exception was the direct cause of the following exception:

ReadTimeout                               Traceback (most recent call last)
Cell In[12], line 6
      3 #from llama_index.llms.ollama import Ollama
      5 llm = Ollama(model=OLLAMA_MODEL)
----> 6 response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
      7 (Answer with markdown sections, markdown with be the GitHub flavor.)""")
      8 print(response)

File /opt/conda/lib/python3.10/site-packages/llama_index/core/llms/callbacks.py:331, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict(_self, *args, **kwargs)
    314 dispatcher.event(
    315     LLMCompletionStartEvent(
    316         model_dict=model_dict,
   (...)
    320     )
    321 )
    322 event_id = callback_manager.on_event_start(
    323     CBEventType.LLM,
    324     payload={
   (...)
    328     },
    329 )
--> 331 f_return_val = f(_self, *args, **kwargs)
    332 if isinstance(f_return_val, Generator):
    333     # intercept the generator and add a callback to the end
    334     def wrapped_gen() -> CompletionResponseGen:

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/ollama/base.py:303, in Ollama.complete(self, prompt, formatted, **kwargs)
    300     payload["format"] = "json"
    302 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 303     response = client.post(
    304         url=f"{self.base_url}/api/generate",
    305         json=payload,
    306     )
    307     response.raise_for_status()
    308     raw = response.json()

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1145, in Client.post(self, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
   1124 def post(
   1125     self,
   1126     url: URLTypes,
   (...)
   1138     extensions: RequestExtensions | None = None,
   1139 ) -> Response:
   1140     """
   1141     Send a `POST` request.
   1142 
   1143     **Parameters**: See `httpx.request`.
   1144     """
-> 1145     return self.request(
   1146         "POST",
   1147         url,
   1148         content=content,
   1149         data=data,
   1150         files=files,
   1151         json=json,
   1152         params=params,
   1153         headers=headers,
   1154         cookies=cookies,
   1155         auth=auth,
   1156         follow_redirects=follow_redirects,
   1157         timeout=timeout,
   1158         extensions=extensions,
   1159     )

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:827, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    812     warnings.warn(message, DeprecationWarning)
    814 request = self.build_request(
    815     method=method,
    816     url=url,
   (...)
    825     extensions=extensions,
    826 )
--> 827 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:914, in Client.send(self, request, stream, auth, follow_redirects)
    906 follow_redirects = (
    907     self.follow_redirects
    908     if isinstance(follow_redirects, UseClientDefault)
    909     else follow_redirects
    910 )
    912 auth = self._build_request_auth(request, auth)
--> 914 response = self._send_handling_auth(
    915     request,
    916     auth=auth,
    917     follow_redirects=follow_redirects,
    918     history=[],
    919 )
    920 try:
    921     if not stream:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:942, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    939 request = next(auth_flow)
    941 while True:
--> 942     response = self._send_handling_redirects(
    943         request,
    944         follow_redirects=follow_redirects,
    945         history=history,
    946     )
    947     try:
    948         try:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:979, in Client._send_handling_redirects(self, request, follow_redirects, history)
    976 for hook in self._event_hooks["request"]:
    977     hook(request)
--> 979 response = self._send_single_request(request)
    980 try:
    981     for hook in self._event_hooks["response"]:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1015, in Client._send_single_request(self, request)
   1010     raise RuntimeError(
   1011         "Attempted to send an async request with a sync Client instance."
   1012     )
   1014 with request_context(request=request):
-> 1015     response = transport.handle_request(request)
   1017 assert isinstance(response.stream, SyncByteStream)
   1019 response.request = request

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:232, in HTTPTransport.handle_request(self, request)
    218 assert isinstance(request.stream, SyncByteStream)
    220 req = httpcore.Request(
    221     method=request.method,
    222     url=httpcore.URL(
   (...)
    230     extensions=request.extensions,
    231 )
--> 232 with map_httpcore_exceptions():
    233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    151     value = typ()
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.
    158     return exc is not value

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:86, in map_httpcore_exceptions()
     83     raise
     85 message = str(exc)
---> 86 raise mapped_exc(message) from exc

ReadTimeout: timed out
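
For anyone reading the trace: the failure is raised from `SyncStream.read`, so the client was waiting for response bytes that did not arrive within the configured timeout. The sketch below reproduces that behavior with the standard library only. `SlowHandler`, its half-second delay, and `post_with_timeout` are hypothetical stand-ins for illustration, not Ollama or LlamaIndex code.

```python
import socket
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server that is slow to answer."""

    delay_seconds = 0.5  # assumed response delay, for illustration only

    def do_POST(self):
        # Drain the request body, then simulate slow generation.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        time.sleep(self.delay_seconds)
        body = b'{"response": "done"}'
        try:
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            # The client may already have given up and closed the socket.
            pass

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


def post_with_timeout(url, timeout):
    """POST an empty JSON body; return the response text or 'timed out'."""
    request = urllib.request.Request(url, data=b"{}", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode()
    except socket.timeout:
        return "timed out"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, socket.timeout):
            return "timed out"
        raise


def demo():
    """Call the slow server once with a short and once with a long timeout."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/api/generate"
    try:
        short = post_with_timeout(url, timeout=0.1)
        long = post_with_timeout(url, timeout=5.0)
    finally:
        server.shutdown()
        server.server_close()
    return short, long


if __name__ == "__main__":
    print(demo())
```

The same request fails or succeeds depending only on how the timeout compares with the server's response time, which is consistent with the code working on a laptop and timing out on a slower hosted CPU runtime. That is a plausible reading of the trace, not a confirmed diagnosis.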


GiteaMirror commented

2026-05-03 12:28:31 -05:00

@adriens commented on GitHub (May 11, 2024):

I'm trying a new RUN...


GiteaMirror commented

2026-05-03 12:28:33 -05:00

@adriens commented on GitHub (May 11, 2024):

Nope, I could not make it run, @jmorganca.

![image](https://github.com/ollama/ollama/assets/5235127/a2000f0f-3f2f-4c84-8b06-739d9413cb76)

See the [Notebook](https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu?scriptVersionId=177116161).

Would you share some code?

GiteaMirror commented

2026-05-03 12:28:35 -05:00

@adriens commented on GitHub (May 11, 2024):

Assuming there is an `ollama` instance running in the background, here is mine:

```python
!pip install --upgrade llama-index-llms-ollama
!pip install --upgrade llama-index

# Just runs .complete to make sure the LLM is listening
from llama_index.llms.ollama import Ollama

# OLLAMA_MODEL is assumed to be defined in an earlier notebook cell
llm = Ollama(model=OLLAMA_MODEL)
response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
(Answer with markdown sections, markdown with be the GitHub flavor.)""")
print(response)
```

![image](https://github.com/ollama/ollama/assets/5235127/f8048caf-4580-4e3f-8560-5e5688b5a6cb)
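
Since the snippet above assumes the server is already listening, a small pre-flight check can separate "server not up yet" from "server up but slow to answer". This is only a sketch using the standard library; `wait_until_reachable` and its parameters are hypothetical names, and the URL to pass is whatever `base_url` the client uses (Ollama listens on `http://localhost:11434` by default).

```python
import time
import urllib.error
import urllib.request


def wait_until_reachable(base_url, attempts=10, delay=0.5, timeout=2.0):
    """Return True once an HTTP GET to base_url gets any response."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(base_url, timeout=timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused, timed out, and similar: wait and retry.
            time.sleep(delay)
    return False


if __name__ == "__main__":
    # Assumed default Ollama address; adjust to the notebook's setup.
    print(wait_until_reachable("http://localhost:11434", attempts=3))
```

If the check passes and `.complete` still times out, the trace shows the client being built with `Timeout(self.request_timeout)`, so passing a larger `request_timeout` to `Ollama(...)` may be worth trying. That is a guess at a workaround, not a confirmed fix.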

GiteaMirror commented

2026-05-03 12:28:37 -05:00

@adriens commented on GitHub (May 11, 2024):

Giving it a try, keeping only the llama-index.ollama part...

GiteaMirror commented

2026-05-03 12:28:38 -05:00

@adriens commented on GitHub (May 11, 2024):

Nope, still failing because of a timeout 💩
Any idea how to make it work, or can you reproduce it, @jmorganca?

GiteaMirror commented

2026-05-03 12:28:39 -05:00

@MohammedMusadiq commented on GitHub (Nov 10, 2024):

Hey @adriens, were you able to make it work?

GiteaMirror commented

2026-05-03 12:28:42 -05:00

@adriens commented on GitHub (Nov 18, 2024):

Hi @MohammedMusadiq, unfortunately not.


Reference: github-starred/ollama#63190
