[GH-ISSUE #15507] LangChain: chatollama invocation fails when messages contain only SystemMessage/AIMessage without any HumanMessage #9910

Open
opened 2026-04-12 22:45:32 -05:00 by GiteaMirror · 2 comments

Originally created by @Bkbest on GitHub (Apr 11, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15507

What is the issue?

When invoking the Ollama LLM via ChatOllama with a message list that contains only SystemMessage(s)/AIMessage(s) and no HumanMessage, the invocation fails with an error. This is problematic when using message-summarization libraries like langmem, which can sometimes replace all messages with a single SystemMessage during summarization.

Steps to Reproduce:

  1. Create a ChatOllama model instance
  2. Prepare a message list containing only SystemMessage(s) - for example, after a summarization process collapses all previous messages into a single system message
  3. Attempt to invoke the model with `model.ainvoke(messages)`

Model being used: minimax-m2.7:cloud (ollama pro)
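
A minimal standalone repro, distilled from the report above (a sketch; it assumes langchain-ollama is installed and the model is available):

```python
import asyncio

from langchain_core.messages import SystemMessage
from langchain_ollama import ChatOllama


async def main() -> None:
    model = ChatOllama(model="minimax-m2.7:cloud")
    # A message list with no HumanMessage: only a single SystemMessage.
    messages = [SystemMessage(content="this is a single system message")]
    # Per the report, this raises ollama._types.ResponseError instead of
    # returning a reply.
    response = await model.ainvoke(messages)
    print(response.content)


asyncio.run(main())
```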

Code I have:

```python
import asyncio

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.prompts import PromptTemplate

# State, Runtime, llm, llm_tools, AGENT_DESCRIPTION, asummarize_messages, and
# the default summary prompts come from the project and langmem (full code is
# linked below).


async def llm_with_tools(state: State, runtime: Runtime):
    """
    Processes messages using LLM with tools when required.

    Args:
        state: Current state containing messages and tool requirement

    Returns:
        Dict containing updated messages
    """
    info = runtime.execution_info
    if info.node_attempt > 1:
        print("sleeping for 60 seconds before retrying.")
        await asyncio.sleep(60)

    # Create the prompt template with system prompt and messages
    agent_description = AGENT_DESCRIPTION
    prompt = PromptTemplate.from_template(agent_description)
    system_message = [
        SystemMessage(
            content=prompt.format(
                current_date=state["current_date"],
                skills_description=state["skills_description"],
            )
        )
    ]

    # Add the system message to the beginning of the messages to be summarized
    # so that it is included in the summary.
    state["messages"] = system_message + state["messages"]

    summarization_result = await asummarize_messages(
        state["messages"],
        running_summary=state.get("summary"),
        model=llm,
        max_tokens=10000,
        max_tokens_before_summary=5000,
        max_summary_tokens=1000,
        initial_summary_prompt=DEFAULT_INITIAL_SUMMARY_PROMPT,
        existing_summary_prompt=DEFAULT_EXISTING_SUMMARY_PROMPT,
    )
    messags_after_summarization = summarization_result.messages

    # If there are exactly two system messages after summarization, convert
    # the second one to an AIMessage
    system_message_indices = [
        i
        for i, msg in enumerate(messags_after_summarization)
        if isinstance(msg, SystemMessage)
    ]
    if len(system_message_indices) == 2:
        print(system_message_indices)
        second_system_idx = system_message_indices[1]
        messags_after_summarization[second_system_idx] = AIMessage(
            content=messags_after_summarization[second_system_idx].content
        )
        # Insert a HumanMessage after the second system message using the
        # first message from state
        # human_msg = HumanMessage(content="Look at the summary and the conversation below and decide what to do next.")
        # messags_after_summarization.insert(second_system_idx, human_msg)
        print(messags_after_summarization)

    await asyncio.sleep(10)
    response = await llm_tools.ainvoke(messags_after_summarization)
    state_update = {"messages": [response]}
    if summarization_result.running_summary:
        state_update["summary"] = summarization_result.running_summary

    # Return the response as a message
    return state_update
```

Full Code: https://github.com/Bkbest/basic_deep_agent

Expected Behavior:

The model should be able to process a message list even if it does not contain a HumanMessage.

Current Workaround:

Insert a dummy HumanMessage before the system message when this scenario occurs.
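
A minimal sketch of that workaround; the helper name and placeholder text here are illustrative, not from the original code:

```python
from langchain_core.messages import BaseMessage, HumanMessage


def ensure_human_message(messages: list[BaseMessage]) -> list[BaseMessage]:
    """Add a placeholder HumanMessage when the list contains none, since the
    backend rejects conversations made up of only system/assistant messages."""
    if any(isinstance(m, HumanMessage) for m in messages):
        return messages
    return messages + [HumanMessage(content="Continue from the summary above.")]


# Usage: response = await llm_tools.ainvoke(ensure_human_message(messages))
```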

Versions I am using:

- langchain-ollama: 1.0.1
- langchain: 1.2.10
- langmem: 0.0.30

Relevant log output:

```shell
2026-04-11T20:20:44.443753Z [info     ] Retrying task llm_with_tools after 1.15 seconds (attempt 1) after ResponseError Service Temporarily Unavailable (status code: 503) [langgraph.pregel._retry] api_variant=local_dev assistant_id=fe096781-5601-53d2-b2f6-0d3403f7e9ca graph_id=agent langgraph_api_version=0.7.56 request_id=bd85cd81-e24d-4e35-adf2-a9eecd822482 run_attempt=1 run_id=019d7e32-29ee-7103-9a76-8df5d8300795 thread_id=2a49f20a-8cf8-4530-aa65-232198cd0833 thread_name=MainThread
Traceback (most recent call last):
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langgraph\pregel\_retry.py", line 211, in arun_with_retry
    return await task.proc.ainvoke(task.input, config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langgraph\_internal\_runnable.py", line 705, in ainvoke
    input = await asyncio.create_task(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langgraph\_internal\_runnable.py", line 473, in ainvoke
    ret = await self.afunc(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\AI\basic-agent\basic_deep_agent\src\AI_Nodes\nodes.py", line 93, in llm_with_tools
    response = await llm_tools.ainvoke(messags_after_summarization)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\runnables\base.py", line 5708, in ainvoke
    return await self.bound.ainvoke(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py", line 425, in ainvoke
    llm_result = await self.agenerate_prompt(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1134, in agenerate_prompt
    return await self.agenerate(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1092, in agenerate
    raise exceptions[0]
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1318, in _agenerate_with_cache
    async for chunk in self._astream(messages, stop=stop, **kwargs):
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_ollama\chat_models.py", line 1193, in _astream
    async for chunk in self._aiterate_over_stream(messages, stop, **kwargs):
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_ollama\chat_models.py", line 1131, in _aiterate_over_stream
    async for stream_resp in self._acreate_chat_stream(messages, stop, **kwargs):
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_ollama\chat_models.py", line 937, in _acreate_chat_stream
    async for part in await self._async_client.chat(**chat_params):
  File "C:\Users\aravi\AppData\Local\Programs\Python\Python312\Lib\site-packages\ollama\_client.py", line 741, in inner
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: Service Temporarily Unavailable (status code: 503)
During task with name 'llm_with_tools' and id 'fa2904e7-60e4-5492-37ff-aec55365e083'
sleeping for 60 seconds before retrying.
```

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.17.1

GiteaMirror added the bug label 2026-04-12 22:45:32 -05:00

@rick-github commented on GitHub (Apr 12, 2026):

```console
$ curl localhost:11434/v1/chat/completions -d '{
    "model":"minimax-m2.7:cloud",
    "messages":[{"role":"system","content":"this is a single system message"}],
    "stream":false
  }'
{"error":"Internal Server Error (ref: 1bd5e687-9e90-42aa-b772-13c5674cb27b)"}
```

@Bkbest commented on GitHub (Apr 13, 2026):

Yes, even if I add an AIMessage, it still fails. Only a HumanMessage results in a successful call.
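
In code, the reported behavior amounts to this (a sketch; message contents are placeholders):

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_ollama import ChatOllama

model = ChatOllama(model="minimax-m2.7:cloud")

# Both of these fail, per the reports above:
model.invoke([SystemMessage(content="summary of the conversation")])
model.invoke([SystemMessage(content="summary"), AIMessage(content="assistant turn")])

# Only a list that includes a HumanMessage succeeds:
model.invoke([SystemMessage(content="summary"), HumanMessage(content="continue")])
```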


Reference: github-starred/ollama#9910