[GH-ISSUE #1997] 🔙 Some kind of regression while running on some LlamaIndex versions (Kaggle & Killercoda) #63190

Closed
opened 2026-05-03 12:27:56 -05:00 by GiteaMirror · 34 comments
Owner

Originally created by @adriens on GitHub (Jan 15, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1997

Originally assigned to: @jmorganca on GitHub.

About

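While working on an `ollama` tutorial on Kaggle, I have been facing a regression with LlamaIndex for the past few days.

Here is the output I used to get with any model (it worked every time):

![image](https://github.com/langchain-ai/langchainjs/assets/5235127/89ebe9c2-55d4-41da-8b32-74d243759f2e)

... vs. now (the code is broken and fails consistently):

![image](https://github.com/langchain-ai/langchainjs/assets/5235127/4121bd48-0c35-461b-81ba-f2353b06ee45)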

ℹ️

  • ✔️ Everything works perfectly well on my laptop

🤔 Looks like something changed that causes this "regression" while playing around in some cases 💭

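🎟️ Potentially related issues

  • https://github.com/jmorganca/ollama/issues/1478
  • https://github.com/jmorganca/ollama/issues/1641
  • https://github.com/jmorganca/ollama/issues/1550
  • https://github.com/jmorganca/ollama/pull/1146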

📜 Detailed stacktrace

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:10, in map_exceptions(map)
      9 try:
---> 10     yield
     11 except Exception as exc:  # noqa: PIE786

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:206, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    205 with map_exceptions(exc_map):
--> 206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )
    211     for option in socket_options:

File /opt/conda/lib/python3.10/socket.py:845, in create_connection(address, timeout, source_address)
    844 try:
--> 845     raise err
    846 finally:
    847     # Break explicitly a reference cycle

File /opt/conda/lib/python3.10/socket.py:833, in create_connection(address, timeout, source_address)
    832     sock.bind(source_address)
--> 833 sock.connect(sa)
    834 # Break explicitly a reference cycle

OSError: [Errno 99] Cannot assign requested address

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:67, in map_httpcore_exceptions()
     66 try:
---> 67     yield
     68 except Exception as exc:

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:231, in HTTPTransport.handle_request(self, request)
    230 with map_httpcore_exceptions():
--> 231     resp = self._pool.handle_request(req)
    233 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:268, in ConnectionPool.handle_request(self, request)
    267         self.response_closed(status)
--> 268     raise exc
    269 else:

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:251, in ConnectionPool.handle_request(self, request)
    250 try:
--> 251     response = connection.handle_request(request)
    252 except ConnectionNotAvailable:
    253     # The ConnectionNotAvailable exception is a special case, that
    254     # indicates we need to retry the request on a new connection.
   (...)
    258     # might end up as an HTTP/2 connection, but which actually ends
    259     # up as HTTP/1.1.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:99, in HTTPConnection.handle_request(self, request)
     98         self._connect_failed = True
---> 99         raise exc
    100 elif not self._connection.is_available():

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:76, in HTTPConnection.handle_request(self, request)
     75 try:
---> 76     stream = self._connect(request)
     78     ssl_object = stream.get_extra_info("ssl_object")

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:124, in HTTPConnection._connect(self, request)
    123 with Trace("connect_tcp", logger, request, kwargs) as trace:
--> 124     stream = self._network_backend.connect_tcp(**kwargs)
    125     trace.return_value = stream

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:205, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    200 exc_map: ExceptionMapping = {
    201     socket.timeout: ConnectTimeout,
    202     OSError: ConnectError,
    203 }
--> 205 with map_exceptions(exc_map):
    206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ConnectError: [Errno 99] Cannot assign requested address

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
Cell In[13], line 5
      2 from llama_index.llms import Ollama
      4 llm = Ollama(model=OLLAMA_MODEL)
----> 5 response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
      6 (Answer with markdown sections, markdown with be the GitHub flavor.)""")
      7 print(response)

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/base.py:226, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict(_self, *args, **kwargs)
    216 with wrapper_logic(_self) as callback_manager:
    217     event_id = callback_manager.on_event_start(
    218         CBEventType.LLM,
    219         payload={
   (...)
    223         },
    224     )
--> 226     f_return_val = f(_self, *args, **kwargs)
    227     if isinstance(f_return_val, Generator):
    228         # intercept the generator and add a callback to the end
    229         def wrapped_gen() -> CompletionResponseGen:

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/ollama.py:180, in Ollama.complete(self, prompt, formatted, **kwargs)
    171 payload = {
    172     self.prompt_key: prompt,
    173     "model": self.model,
   (...)
    176     **kwargs,
    177 }
    179 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 180     response = client.post(
    181         url=f"{self.base_url}/api/generate",
    182         json=payload,
    183     )
    184     response.raise_for_status()
    185     raw = response.json()

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1146, in Client.post(self, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
   1125 def post(
   1126     self,
   1127     url: URLTypes,
   (...)
   1139     extensions: typing.Optional[RequestExtensions] = None,
   1140 ) -> Response:
   1141     """
   1142     Send a `POST` request.
   1143 
   1144     **Parameters**: See `httpx.request`.
   1145     """
-> 1146     return self.request(
   1147         "POST",
   1148         url,
   1149         content=content,
   1150         data=data,
   1151         files=files,
   1152         json=json,
   1153         params=params,
   1154         headers=headers,
   1155         cookies=cookies,
   1156         auth=auth,
   1157         follow_redirects=follow_redirects,
   1158         timeout=timeout,
   1159         extensions=extensions,
   1160     )

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:828, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    813     warnings.warn(message, DeprecationWarning)
    815 request = self.build_request(
    816     method=method,
    817     url=url,
   (...)
    826     extensions=extensions,
    827 )
--> 828 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:915, in Client.send(self, request, stream, auth, follow_redirects)
    907 follow_redirects = (
    908     self.follow_redirects
    909     if isinstance(follow_redirects, UseClientDefault)
    910     else follow_redirects
    911 )
    913 auth = self._build_request_auth(request, auth)
--> 915 response = self._send_handling_auth(
    916     request,
    917     auth=auth,
    918     follow_redirects=follow_redirects,
    919     history=[],
    920 )
    921 try:
    922     if not stream:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:943, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    940 request = next(auth_flow)
    942 while True:
--> 943     response = self._send_handling_redirects(
    944         request,
    945         follow_redirects=follow_redirects,
    946         history=history,
    947     )
    948     try:
    949         try:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:980, in Client._send_handling_redirects(self, request, follow_redirects, history)
    977 for hook in self._event_hooks["request"]:
    978     hook(request)
--> 980 response = self._send_single_request(request)
    981 try:
    982     for hook in self._event_hooks["response"]:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1016, in Client._send_single_request(self, request)
   1011     raise RuntimeError(
   1012         "Attempted to send an async request with a sync Client instance."
   1013     )
   1015 with request_context(request=request):
-> 1016     response = transport.handle_request(request)
   1018 assert isinstance(response.stream, SyncByteStream)
   1020 response.request = request

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:230, in HTTPTransport.handle_request(self, request)
    216 assert isinstance(request.stream, SyncByteStream)
    218 req = httpcore.Request(
    219     method=request.method,
    220     url=httpcore.URL(
   (...)
    228     extensions=request.extensions,
    229 )
--> 230 with map_httpcore_exceptions():
    231     resp = self._pool.handle_request(req)
    233 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    151     value = typ()
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.
    158     return exc is not value

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:84, in map_httpcore_exceptions()
     81     raise
     83 message = str(exc)
---> 84 raise mapped_exc(message) from exc

ConnectError: [Errno 99] Cannot assign requested address
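
`[Errno 99] Cannot assign requested address` is raised by `sock.connect` itself, before any HTTP request is sent, so one way to narrow this down is to test the raw TCP connection outside of LlamaIndex. A minimal sketch, assuming the Ollama server listens on its default `localhost:11434`; the `can_connect` helper is hypothetical and not part of any library:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to (host, port) succeeds."""
    try:
        # Same stdlib call that httpcore uses in the traceback above.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and errno 99.
        return False
```

Comparing `can_connect("localhost", 11434)` with `can_connect("127.0.0.1", 11434)` in the failing notebook may show whether the problem depends on how `localhost` resolves in that environment or on the server not running at all.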
GiteaMirror added the bug label 2026-05-03 12:27:56 -05:00

@adriens commented on GitHub (Jan 15, 2024):

Is there a way to install a previous ollama version from the shell (so I can pinpoint where it started to fail)?


@jmorganca commented on GitHub (Jan 15, 2024):

@adriens sorry you hit this. Will look into it. Until it's fixed, you can install previous versions with this script (for example, 0.1.17)

curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.17#' | sh
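
To try several releases in a row, the command above can be parameterised by version. A minimal sketch that only builds the shell pipeline as a string; the `install_command` helper and the constant names are hypothetical, and the URLs are the ones from the command above:

```python
INSTALL_SCRIPT = "https://ollama.ai/install.sh"
DEFAULT_DOWNLOAD = "https://ollama.ai/download"
RELEASES = "https://github.com/jmorganca/ollama/releases/download"


def install_command(version: str) -> str:
    """Build the shell pipeline that installs a specific ollama release."""
    # Accept both "0.1.17" and "v0.1.17".
    tag = version if version.startswith("v") else f"v{version}"
    # Rewrite the download URL inside the install script to point at the release.
    sed_expr = f"s#{DEFAULT_DOWNLOAD}#{RELEASES}/{tag}#"
    return f"curl {INSTALL_SCRIPT} | sed '{sed_expr}' | sh"
```

In a notebook, the resulting string could be printed and pasted into a shell cell for each version under test.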

@adriens commented on GitHub (Jan 15, 2024):

Thanks a lot for the fast answer and the shell tip 👍


@adriens commented on GitHub (Jan 15, 2024):

Test in progress: I will keep you up-to-date


@adriens commented on GitHub (Jan 15, 2024):

Surprisingly, it looks like all previous versions are failing... I'm unable to reproduce a successful run:

| `ollama` version | Result |
| --- | --- |
| v0.1.20 | 👎 |
| v0.1.17 | 👎 |
| v0.1.16 | 👎 |

👉 Here are:

  • 👍 A successful run: https://www.kaggle.com/adriensales/ollama-running-local-models-w-llamaindex-cpu
  • 👎 A broken one: https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu?scriptVersionId=158989000

@adriens commented on GitHub (Jan 15, 2024):

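I gave it a try on Killercoda and could easily reproduce the behavior:

![image](https://github.com/jmorganca/ollama/assets/5235127/889ffba0-979b-4da4-acb1-0f55dae4941f)

Then `pip install llama_index`:

![image](https://github.com/jmorganca/ollama/assets/5235127/4a2447a9-5020-4146-922f-c6b1e8249a34)

Then running `python demo.py` produces the timeout: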

llm = Ollama(model=OLLAMA_MODEL)
response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
(Answer with markdown sections, markdown with be the GitHub flavor.)""")
print(response)
ubuntu $ python demo.py 
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_backends/sync.py", line 126, in read
    return self._sock.recv(max_bytes)
socket.timeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 67, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 231, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/connection.py", line 103, in handle_request
    return self._connection.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 133, in handle_request
    raise exc
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 111, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 176, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_sync/http11.py", line 212, in _receive_event
    data = self._network_stream.read(
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_backends/sync.py", line 126, in read
    return self._sock.recv(max_bytes)
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "demo.py", line 6, in <module>
    response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
  File "/usr/local/lib/python3.8/dist-packages/llama_index/llms/base.py", line 226, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/llama_index/llms/ollama.py", line 180, in complete
    response = client.post(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1146, in post
    return self.request(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 828, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 915, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 943, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 980, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_client.py", line 1016, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 231, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/dist-packages/httpx/_transports/default.py", line 84, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out
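
The `httpx.ReadTimeout` above means the connection was established but no response arrived within the client's timeout, which can happen when generation on a CPU-only host is slow. A minimal sketch that calls the same `/api/generate` endpoint with an explicit, longer timeout, using only the standard library. The default base URL `http://localhost:11434`, the `"stream": False` payload field, the 300-second value and the `generate` helper name are assumptions for illustration and are not taken from the thread:

```python
import json
import urllib.request


def generate(prompt: str, model: str,
             base_url: str = "http://localhost:11434",
             timeout: float = 300.0) -> str:
    """POST a prompt to /api/generate and return the generated text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    # Ignore proxy environment variables: the server is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

If a call like this succeeds with a long timeout while the LlamaIndex call still fails, the client-side timeout becomes a likely suspect rather than the server.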

@adriens commented on GitHub (Jan 15, 2024):

🤔 Maybe something around `llama_index` ❔

@adriens commented on GitHub (Jan 16, 2024):

I gave it a try with an earlier `llama_index` release:

```python
!pip install llama-index==0.9.23
```

... but still got the same issue:

![image](https://github.com/jmorganca/ollama/assets/5235127/78a4308d-b8a9-4b42-b18d-88195aaab49c)

@adriens commented on GitHub (Jan 16, 2024):

- https://github.com/jmorganca/ollama/issues/1863

@adriens commented on GitHub (Jan 16, 2024):

- https://github.com/jmorganca/ollama/issues/1910

@adriens commented on GitHub (Jan 16, 2024):

## ✋ Compatibility matrix

I got it working with some version combinations. Here is the matrix:

| `ollama` | `llama_index` | Status |
| --- | --- | --- |
| `v0.1.16` | `v0.9.21` | 🆗 |
| `v0.1.17` | `v0.9.21` | 🆗 |
| `v0.1.18` | `v0.9.21` | 🆗 |
| `v0.1.20` | `v0.9.21` | 🆗 |
| `v0.1.16` | `v0.9.22` | 👎 |
| `v0.1.16` | `v0.9.31` (current) | 👎 |
| `v0.1.17` | `v0.9.31` (current) | 👎 |
| `v0.1.18` | `v0.9.31` (current) | ❔ |
| `v0.1.19` | `v0.9.31` (current) | ❔ |
| `v0.1.20` | `v0.9.31` (current) | 👎 |
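
To make each row of a matrix like this reproducible, it can help to print the exact versions present in the environment under test. The following is a minimal sketch, assuming the `ollama` CLI is on `PATH` and that the PyPI distribution names are `llama-index` and `llama-index-llms-ollama`; the helper names are made up for illustration.

```python
import shutil
import subprocess
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of a Python distribution, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def ollama_cli_version() -> str:
    """Return the output of `ollama --version`, or 'not found' if the CLI is absent."""
    if shutil.which("ollama") is None:
        return "not found"
    result = subprocess.run(
        ["ollama", "--version"], capture_output=True, text=True, check=False
    )
    # The CLI may print warnings on stderr when no server is running.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    print("ollama:", ollama_cli_version())
    for pkg in ("llama-index", "llama-index-llms-ollama", "httpx"):
        print(f"{pkg}:", installed_version(pkg))
```

Pasting this output next to each 🆗 / 👎 result would show which library versions were actually loaded in the Kaggle or Killercoda session.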

@adriens commented on GitHub (Jan 17, 2024):

[🆓 Local & Open Source AI: a kind ollama & LlamaIndex intro](https://dev.to/adriens/local-open-source-ai-a-kind-ollama-llamaindex-intro-1nnc)

@tinycrops commented on GitHub (Jan 25, 2024):

I was using a derivative of adriens' [notebook](https://www.kaggle.com/code/matthewhendricks/notebook0cd9dcd006):

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 53
     43 llm = Ollama(model=OLLAMA_MODEL)
     44 # response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
     45 # (Answer with markdown sections, markdown with be the GitHub flavor.)""")
     46 # print(response)
   (...)
     51 
     52 # bash_chain.run(text)
---> 53 llm.invoke(f"Translate to a scientific lecture: {PROMPT}")

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:230, in BaseLLM.invoke(self, input, config, stop, **kwargs)
    220 def invoke(
    221     self,
    222     input: LanguageModelInput,
   (...)
    226     **kwargs: Any,
    227 ) -> str:
    228     config = ensure_config(config)
    229     return (
--> 230         self.generate_prompt(
    231             [self._convert_input(input)],
    232             stop=stop,
    233             callbacks=config.get("callbacks"),
    234             tags=config.get("tags"),
    235             metadata=config.get("metadata"),
    236             run_name=config.get("run_name"),
    237             **kwargs,
    238         )
    239         .generations[0][0]
    240         .text
    241     )

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:525, in BaseLLM.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    517 def generate_prompt(
    518     self,
    519     prompts: List[PromptValue],
   (...)
    522     **kwargs: Any,
    523 ) -> LLMResult:
    524     prompt_strings = [p.to_string() for p in prompts]
--> 525     return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:698, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, run_name, **kwargs)
    682         raise ValueError(
    683             "Asked to cache, but no cache found at `langchain.cache`."
    684         )
    685     run_managers = [
    686         callback_manager.on_llm_start(
    687             dumpd(self),
   (...)
    696         )
    697     ]
--> 698     output = self._generate_helper(
    699         prompts, stop, run_managers, bool(new_arg_supported), **kwargs
    700     )
    701     return output
    702 if len(missing_prompts) > 0:

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:562, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    560     for run_manager in run_managers:
    561         run_manager.on_llm_error(e, response=LLMResult(generations=[]))
--> 562     raise e
    563 flattened_outputs = output.flatten()
    564 for manager, flattened_output in zip(run_managers, flattened_outputs):

File /opt/conda/lib/python3.10/site-packages/langchain_core/language_models/llms.py:549, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
    539 def _generate_helper(
    540     self,
    541     prompts: List[str],
   (...)
    545     **kwargs: Any,
    546 ) -> LLMResult:
    547     try:
    548         output = (
--> 549             self._generate(
    550                 prompts,
    551                 stop=stop,
    552                 # TODO: support multiple run managers
    553                 run_manager=run_managers[0] if run_managers else None,
    554                 **kwargs,
    555             )
    556             if new_arg_supported
    557             else self._generate(prompts, stop=stop)
    558         )
    559     except BaseException as e:
    560         for run_manager in run_managers:

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:400, in Ollama._generate(self, prompts, stop, images, run_manager, **kwargs)
    398 generations = []
    399 for prompt in prompts:
--> 400     final_chunk = super()._stream_with_aggregation(
    401         prompt,
    402         stop=stop,
    403         images=images,
    404         run_manager=run_manager,
    405         verbose=self.verbose,
    406         **kwargs,
    407     )
    408     generations.append([final_chunk])
    409 return LLMResult(generations=generations)

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:309, in _OllamaCommon._stream_with_aggregation(self, prompt, stop, run_manager, verbose, **kwargs)
    300 def _stream_with_aggregation(
    301     self,
    302     prompt: str,
   (...)
    306     **kwargs: Any,
    307 ) -> GenerationChunk:
    308     final_chunk: Optional[GenerationChunk] = None
--> 309     for stream_resp in self._create_generate_stream(prompt, stop, **kwargs):
    310         if stream_resp:
    311             chunk = _stream_response_to_generation_chunk(stream_resp)

File /opt/conda/lib/python3.10/site-packages/langchain_community/llms/ollama.py:154, in _OllamaCommon._create_generate_stream(self, prompt, stop, images, **kwargs)
    146 def _create_generate_stream(
    147     self,
    148     prompt: str,
   (...)
    151     **kwargs: Any,
    152 ) -> Iterator[str]:
    153     payload = {"prompt": prompt, "images": images}
--> 154     yield from self._create_stream(
    155         payload=payload,
    156         stop=stop,
    157         api_url=f"{self.base_url}/api/generate/",
    158         **kwargs,
    159     )

File /opt/conda/lib/python3.10/site-packages/requests/models.py:865, in Response.iter_lines(self, chunk_size, decode_unicode, delimiter)
    856 """Iterates over the response data, one line at a time.  When
    857 stream=True is set on the request, this avoids reading the
    858 content at once into memory for large responses.
    859 
    860 .. note:: This method is not reentrant safe.
    861 """
    863 pending = None
--> 865 for chunk in self.iter_content(
    866     chunk_size=chunk_size, decode_unicode=decode_unicode
    867 ):
    869     if pending is not None:
    870         chunk = pending + chunk

File /opt/conda/lib/python3.10/site-packages/requests/utils.py:571, in stream_decode_response_unicode(iterator, r)
    568     return
    570 decoder = codecs.getincrementaldecoder(r.encoding)(errors="replace")
--> 571 for chunk in iterator:
    572     rv = decoder.decode(chunk)
    573     if rv:

File /opt/conda/lib/python3.10/site-packages/requests/models.py:816, in Response.iter_content.<locals>.generate()
    814 if hasattr(self.raw, "stream"):
    815     try:
--> 816         yield from self.raw.stream(chunk_size, decode_content=True)
    817     except ProtocolError as e:
    818         raise ChunkedEncodingError(e)

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:624, in HTTPResponse.stream(self, amt, decode_content)
    608 """
    609 A generator wrapper for the read() method. A call will block until
    610 ``amt`` bytes have been read from the connection or until the
   (...)
    621     'content-encoding' header.
    622 """
    623 if self.chunked and self.supports_chunked_reads():
--> 624     for line in self.read_chunked(amt, decode_content=decode_content):
    625         yield line
    626 else:

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:828, in HTTPResponse.read_chunked(self, amt, decode_content)
    825     return
    827 while True:
--> 828     self._update_chunk_length()
    829     if self.chunk_left == 0:
    830         break

File /opt/conda/lib/python3.10/site-packages/urllib3/response.py:758, in HTTPResponse._update_chunk_length(self)
    756 if self.chunk_left is not None:
    757     return
--> 758 line = self._fp.fp.readline()
    759 line = line.split(b";", 1)[0]
    760 try:

File /opt/conda/lib/python3.10/socket.py:705, in SocketIO.readinto(self, b)
    703 while True:
    704     try:
--> 705         return self._sock.recv_into(b)
    706     except timeout:
    707         self._timeout_occurred = True

KeyboardInterrupt: 

@adriens commented on GitHub (Jan 25, 2024):

🙏 @MeDott29 for the code submission 🐱


@pdevine commented on GitHub (Mar 12, 2024):

Hey @adriens, this seems to be working fine, at least locally. LlamaIndex moved the Ollama integration into a new "ollama" package. I don't have access to Kaggle/Killercoda, but:

% python3
Python 3.11.7 (main, Dec  4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from llama_index.llms.ollama import Ollama
>>> llm = Ollama(model="llama2", request_timeout=30.0)
>>> resp = llm.complete("Who is Grigori Perelman and why is he so important in mathematics? (Answer with markdown sections, markdown with the GitHub flavor.)")
>>> print(resp)

Grigori Perelman is a Russian mathematician who made significant contributions to the field of geometry and topology, particularly in the area of Riemannian geometry and the Poincaré conjecture. He is considered one of the most important mathematicians of the 21st century, and his work has had a profound impact on the field of mathematics.

Early Life and Education
-------------------------

Grigori Perelman was born in Leningrad (now St. Petersburg), Russia in 1966. He grew up in a family of mathematicians and began studying mathematics at an early age. He graduated from the University of Leningrad in 1987 with a degree in mathematics and went on to pursue his graduate studies at the Steklov Institute of Mathematics in St. Petersburg.

Contributions to Mathematics
-----------------------------

Perelman's most significant contribution to mathematics is his proof of the Poincaré conjecture, which was a longstanding problem in topology. The conjecture states that a simply connected, closed three-dimensional manifold must be topologically equivalent to a three-dimensional sphere. Perelman's proof, which was published in 2003, involved the use of a combination of geometric and topological techniques.

Perelman's work on the Poincaré conjecture is considered one of the most important achievements in mathematics in the last century, and it has had a significant impact on the field of geometry and topology. His proof has been hailed as a masterpiece of mathematical rigor and creativity, and it has opened up new areas of research in geometry and topology.
...

I think maybe this is an issue with Kaggle?

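
One way to check whether the hosting platform, rather than LlamaIndex, is the problem is to talk to the Ollama server directly. The sketch below uses only the standard library and assumes the server listens on its default address, `http://localhost:11434`. The function names and the generous timeout are illustrative and not taken from this thread.

```python
import json
import time
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434",
                        timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on the server root succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, unreachable addresses and socket timeouts.
        return False


def time_generate(model: str, prompt: str,
                  base_url: str = "http://localhost:11434",
                  timeout: float = 600.0):
    """POST to /api/generate without streaming; return (seconds, response text)."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    # The timeout applies to each blocking socket operation, not the whole call.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read())
    return time.monotonic() - start, body.get("response", "")
```

If `ollama_is_reachable()` returns `False`, the failure is a connection problem. If it returns `True` but `time_generate("llama2", "Hello")` takes longer than the client's `request_timeout`, a `ReadTimeout` on a slow CPU-only host would be the expected outcome.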

@adriens commented on GitHub (Mar 20, 2024):

I'm giving it a try right now 🤞
https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu

![image](https://github.com/ollama/ollama/assets/5235127/d709eca4-1e4c-4ef6-b86e-1b233a1bd6fc)

@adriens commented on GitHub (Mar 20, 2024):

Run in progress ⏳

@adriens commented on GitHub (Mar 20, 2024):

[Now I'm getting this](https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu/log?scriptVersionId=168043744):

```
ImportError: cannot import name 'Ollama' from 'llama_index.llms' (unknown location)
```

@pdevine commented on GitHub (Mar 20, 2024):

@adriens the import is now `from llama_index.llms.ollama import Ollama`. They changed the package.
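
A small guard around the import can turn the bare `ImportError` into an actionable message. This is only a sketch: `load_ollama_class` is a hypothetical helper name, and the install hint assumes the integration ships as the `llama-index-llms-ollama` distribution.

```python
import importlib


def load_ollama_class():
    """Import and return the LlamaIndex `Ollama` class, with a clearer error."""
    try:
        module = importlib.import_module("llama_index.llms.ollama")
    except ImportError as exc:
        # Raised when llama_index or its Ollama integration is not installed.
        raise ImportError(
            "Could not import llama_index.llms.ollama; "
            "try `pip install llama-index-llms-ollama`."
        ) from exc
    return module.Ollama
```

In a notebook, `Ollama = load_ollama_class()` would then replace the old `from llama_index.llms import Ollama` line.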

@adriens commented on GitHub (Apr 1, 2024):

Hi @pdevine, sorry for the late feedback.

I've just patched the notebook. I'll keep you posted within a few minutes 🤞

@adriens commented on GitHub (Apr 2, 2024):

![image](https://github.com/ollama/ollama/assets/5235127/dc3d0f64-0d17-4120-96b2-a203b5665ad0)

@pdevine commented on GitHub (Apr 2, 2024):

Hey @adriens, you should follow the LlamaIndex docs here: https://docs.llamaindex.ai/en/stable/examples/llm/ollama/

You'll need to `pip install llama-index-llms-ollama` first.
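
Once the package is installed, the one knob that appears in this thread for slow, CPU-only hosts is `request_timeout`, which the local REPL session earlier in the thread sets to `30.0`. The sketch below raises it. The value `300.0` is an arbitrary assumption rather than a recommendation from the docs, and `make_llm` is a hypothetical helper.

```python
def make_llm(model: str = "llama2", request_timeout: float = 300.0):
    """Build a LlamaIndex Ollama client with a longer read timeout (in seconds)."""
    # Imported lazily so the ImportError only surfaces when the helper is called.
    # Requires: pip install llama-index-llms-ollama
    from llama_index.llms.ollama import Ollama

    return Ollama(model=model, request_timeout=request_timeout)


if __name__ == "__main__":
    llm = make_llm()
    print(llm.complete("Who is Grigori Perelman?"))
```

Whether a longer timeout is enough depends on how fast the host can generate tokens, so it may need tuning per platform.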

@asma-10 commented on GitHub (Apr 5, 2024):

Hello everyone, I have the same issue when using Ollama. Have you found any solution? Is it coming from Ollama itself?

![Screenshot 2024-04-05 210536](https://github.com/ollama/ollama/assets/101638276/9a8e0866-17b3-46ba-b36d-d0a73031edd8)

@jmorganca commented on GitHub (May 10, 2024):

Make sure to have the latest `llama_index` package: https://docs.llamaindex.ai/en/stable/api_reference/llms/ollama/

@jmorganca commented on GitHub (May 10, 2024):

Let me know if you're still encountering this @adriens :)


@adriens commented on GitHub (May 11, 2024):

Hi @jmorganca, I'm giving it a try right now ⚡

@adriens commented on GitHub (May 11, 2024):

I applied the modifications but I'm still hitting a timeout:

---------------------------------------------------------------------------
ReadTimeout                               Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:69, in map_httpcore_exceptions()
     68 try:
---> 69     yield
     70 except Exception as exc:

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:233, in HTTPTransport.handle_request(self, request)
    232 with map_httpcore_exceptions():
--> 233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:216, in ConnectionPool.handle_request(self, request)
    215     self._close_connections(closing)
--> 216     raise exc from None
    218 # Return the response. Note that in this case we still have to manage
    219 # the point at which the response is closed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py:196, in ConnectionPool.handle_request(self, request)
    194 try:
    195     # Send the request on the assigned connection.
--> 196     response = connection.handle_request(
    197         pool_request.request
    198     )
    199 except ConnectionNotAvailable:
    200     # In some cases a connection may initially be available to
    201     # handle a request, but then become unavailable.
    202     #
    203     # In this case we clear the connection and try again.

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/connection.py:101, in HTTPConnection.handle_request(self, request)
     99     raise exc
--> 101 return self._connection.handle_request(request)

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:143, in HTTP11Connection.handle_request(self, request)
    142         self._response_closed()
--> 143 raise exc

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:113, in HTTP11Connection.handle_request(self, request)
    104 with Trace(
    105     "receive_response_headers", logger, request, kwargs
    106 ) as trace:
    107     (
    108         http_version,
    109         status,
    110         reason_phrase,
    111         headers,
    112         trailing_data,
--> 113     ) = self._receive_response_headers(**kwargs)
    114     trace.return_value = (
    115         http_version,
    116         status,
    117         reason_phrase,
    118         headers,
    119     )

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:186, in HTTP11Connection._receive_response_headers(self, request)
    185 while True:
--> 186     event = self._receive_event(timeout=timeout)
    187     if isinstance(event, h11.Response):

File /opt/conda/lib/python3.10/site-packages/httpcore/_sync/http11.py:224, in HTTP11Connection._receive_event(self, timeout)
    223 if event is h11.NEED_DATA:
--> 224     data = self._network_stream.read(
    225         self.READ_NUM_BYTES, timeout=timeout
    226     )
    228     # If we feed this case through h11 we'll raise an exception like:
    229     #
    230     #     httpcore.RemoteProtocolError: can't handle event type
   (...)
    234     # perspective. Instead we handle this case distinctly and treat
    235     # it as a ConnectError.

File /opt/conda/lib/python3.10/site-packages/httpcore/_backends/sync.py:124, in SyncStream.read(self, max_bytes, timeout)
    123 exc_map: ExceptionMapping = {socket.timeout: ReadTimeout, OSError: ReadError}
--> 124 with map_exceptions(exc_map):
    125     self._sock.settimeout(timeout)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.

File /opt/conda/lib/python3.10/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ReadTimeout: timed out

The above exception was the direct cause of the following exception:

ReadTimeout                               Traceback (most recent call last)
Cell In[12], line 6
      3 #from llama_index.llms.ollama import Ollama
      5 llm = Ollama(model=OLLAMA_MODEL)
----> 6 response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
      7 (Answer with markdown sections, markdown with be the GitHub flavor.)""")
      8 print(response)

File /opt/conda/lib/python3.10/site-packages/llama_index/core/llms/callbacks.py:331, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict(_self, *args, **kwargs)
    314 dispatcher.event(
    315     LLMCompletionStartEvent(
    316         model_dict=model_dict,
   (...)
    320     )
    321 )
    322 event_id = callback_manager.on_event_start(
    323     CBEventType.LLM,
    324     payload={
   (...)
    328     },
    329 )
--> 331 f_return_val = f(_self, *args, **kwargs)
    332 if isinstance(f_return_val, Generator):
    333     # intercept the generator and add a callback to the end
    334     def wrapped_gen() -> CompletionResponseGen:

File /opt/conda/lib/python3.10/site-packages/llama_index/llms/ollama/base.py:303, in Ollama.complete(self, prompt, formatted, **kwargs)
    300     payload["format"] = "json"
    302 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 303     response = client.post(
    304         url=f"{self.base_url}/api/generate",
    305         json=payload,
    306     )
    307     response.raise_for_status()
    308     raw = response.json()

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1145, in Client.post(self, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
   1124 def post(
   1125     self,
   1126     url: URLTypes,
   (...)
   1138     extensions: RequestExtensions | None = None,
   1139 ) -> Response:
   1140     """
   1141     Send a `POST` request.
   1142 
   1143     **Parameters**: See `httpx.request`.
   1144     """
-> 1145     return self.request(
   1146         "POST",
   1147         url,
   1148         content=content,
   1149         data=data,
   1150         files=files,
   1151         json=json,
   1152         params=params,
   1153         headers=headers,
   1154         cookies=cookies,
   1155         auth=auth,
   1156         follow_redirects=follow_redirects,
   1157         timeout=timeout,
   1158         extensions=extensions,
   1159     )

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:827, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    812     warnings.warn(message, DeprecationWarning)
    814 request = self.build_request(
    815     method=method,
    816     url=url,
   (...)
    825     extensions=extensions,
    826 )
--> 827 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:914, in Client.send(self, request, stream, auth, follow_redirects)
    906 follow_redirects = (
    907     self.follow_redirects
    908     if isinstance(follow_redirects, UseClientDefault)
    909     else follow_redirects
    910 )
    912 auth = self._build_request_auth(request, auth)
--> 914 response = self._send_handling_auth(
    915     request,
    916     auth=auth,
    917     follow_redirects=follow_redirects,
    918     history=[],
    919 )
    920 try:
    921     if not stream:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:942, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    939 request = next(auth_flow)
    941 while True:
--> 942     response = self._send_handling_redirects(
    943         request,
    944         follow_redirects=follow_redirects,
    945         history=history,
    946     )
    947     try:
    948         try:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:979, in Client._send_handling_redirects(self, request, follow_redirects, history)
    976 for hook in self._event_hooks["request"]:
    977     hook(request)
--> 979 response = self._send_single_request(request)
    980 try:
    981     for hook in self._event_hooks["response"]:

File /opt/conda/lib/python3.10/site-packages/httpx/_client.py:1015, in Client._send_single_request(self, request)
   1010     raise RuntimeError(
   1011         "Attempted to send an async request with a sync Client instance."
   1012     )
   1014 with request_context(request=request):
-> 1015     response = transport.handle_request(request)
   1017 assert isinstance(response.stream, SyncByteStream)
   1019 response.request = request

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:232, in HTTPTransport.handle_request(self, request)
    218 assert isinstance(request.stream, SyncByteStream)
    220 req = httpcore.Request(
    221     method=request.method,
    222     url=httpcore.URL(
   (...)
    230     extensions=request.extensions,
    231 )
--> 232 with map_httpcore_exceptions():
    233     resp = self._pool.handle_request(req)
    235 assert isinstance(resp.stream, typing.Iterable)

File /opt/conda/lib/python3.10/contextlib.py:153, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    151     value = typ()
    152 try:
--> 153     self.gen.throw(typ, value, traceback)
    154 except StopIteration as exc:
    155     # Suppress StopIteration *unless* it's the same exception that
    156     # was passed to throw().  This prevents a StopIteration
    157     # raised inside the "with" statement from being suppressed.
    158     return exc is not value

File /opt/conda/lib/python3.10/site-packages/httpx/_transports/default.py:86, in map_httpcore_exceptions()
     83     raise
     85 message = str(exc)
---> 86 raise mapped_exc(message) from exc

ReadTimeout: timed out
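
For readers following the trace: it ends in a read timeout, not a connection failure. The frames show `socket.timeout` being mapped to `ReadTimeout` while the client waits for response headers, so the TCP connection was established and the request was sent, but no response bytes arrived before the client's timeout expired. Below is a minimal stdlib-only sketch of that situation, assuming a hypothetical local server that accepts connections and never replies. `start_silent_server` and `read_with_timeout` are illustrative names, not part of httpx, httpcore, or Ollama.

```python
import socket


def start_silent_server() -> tuple[socket.socket, int]:
    """Hypothetical stand-in for a slow backend: listens but never answers."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)               # connections are queued, never accept()ed
    return server, server.getsockname()[1]


def read_with_timeout(host: str, port: int, timeout: float) -> str:
    """Connect, send a request, then wait at most `timeout` seconds for a reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Connecting and sending succeed: the kernel completes the handshake
        # and buffers the bytes even though nobody reads them.
        sock.sendall(b"POST /api/generate HTTP/1.1\r\nHost: localhost\r\n\r\n")
        try:
            data = sock.recv(1024)
        except socket.timeout:
            # This is the condition httpcore reports as ReadTimeout.
            return "read timed out"
        return "got data" if data else "connection closed"
```

Calling `read_with_timeout` against the silent server with a short timeout takes the `socket.timeout` branch, which is the same branch the notebook hits when the model takes longer to answer than the client is willing to wait.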

@adriens commented on GitHub (May 11, 2024):

I'm trying a new run...


@adriens commented on GitHub (May 11, 2024):

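Nope, I could not make it run, @jmorganca.
![image](https://github.com/ollama/ollama/assets/5235127/a2000f0f-3f2f-4c84-8b06-739d9413cb76)

cf. [Notebook](https://www.kaggle.com/code/adriensales/ollama-running-local-models-w-llamaindex-cpu?scriptVersionId=177116161)

Would you share some code?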


@adriens commented on GitHub (May 11, 2024):

Assuming there is an `ollama` instance running in the background, here is mine:

```python
!pip install --upgrade llama-index-llms-ollama
!pip install --upgrade llama-index

# Just runs .complete to make sure the LLM is listening
from llama_index.llms.ollama import Ollama
#from llama_index.llms.ollama import Ollama

llm = Ollama(model=OLLAMA_MODEL)
response = llm.complete("""Who is Grigori Perelman and why is he so important in mathematics?
(Answer with markdown sections, markdown with be the GitHub flavor.)""")
print(response)
```

![image](https://github.com/ollama/ollama/assets/5235127/f8048caf-4580-4e3f-8560-5e5688b5a6cb)
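
The snippet above uses `.complete` both to check that the server is listening and to generate text, so a slow model and an unreachable server look alike. A possible way to separate the two is sketched below, with several assumptions: the server is at Ollama's default address `http://localhost:11434`, the helper names `ollama_is_up` and `build_llm` are hypothetical, and a 300-second `request_timeout` is an arbitrary value that may or may not be long enough for CPU-only inference. `request_timeout` and `base_url` are the fields the traceback shows `Ollama.complete` reading. This is not a confirmed fix for the issue.

```python
import urllib.request

# Assumption: Ollama's default listen address; change it if your server differs.
OLLAMA_BASE_URL = "http://localhost:11434"


def ollama_is_up(base_url: str = OLLAMA_BASE_URL, timeout: float = 5.0) -> bool:
    """Cheap liveness probe: True only if the server root answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, unreachable hosts, timeouts and HTTP errors.
        return False


def build_llm(model: str, request_timeout: float = 300.0,
              base_url: str = OLLAMA_BASE_URL):
    """Create the LlamaIndex client with a longer read timeout (value is a guess)."""
    # Imported lazily so the probe above works without llama-index installed.
    from llama_index.llms.ollama import Ollama

    return Ollama(model=model, base_url=base_url, request_timeout=request_timeout)
```

In the notebook this would be used as `if ollama_is_up(): llm = build_llm(OLLAMA_MODEL)` before calling `llm.complete(...)`. If the probe succeeds and `.complete` still times out, the server is reachable and generation is simply slower than the timeout allows.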


@adriens commented on GitHub (May 11, 2024):

Giving it a try, keeping only the llama-index.ollama part...


@adriens commented on GitHub (May 11, 2024):

Nope, still failing because of a timeout 💩
Any idea how to make it work, or can you reproduce it, @jmorganca?


@MohammedMusadiq commented on GitHub (Nov 10, 2024):

Hey @adriens, were you able to make it work?


@adriens commented on GitHub (Nov 18, 2024):

Hi @MohammedMusadiq, unfortunately not.

Reference: github-starred/ollama#63190