[GH-ISSUE #9770] [false] New format option is caching overly-agressively #68445

Closed
opened 2026-05-04 13:58:39 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @jdblack on GitHub (Mar 14, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9770

Originally assigned to: @ParthSareen on GitHub.

What is the issue?

It appears that both /api/generate and /api/chat return incorrect cached responses when format is used. I believe that longer prompts are susceptible to overly aggressive caching even when important leading parts of the prompt are modified. I saw the same behavior with /api/generate + format, though I am only using /api/chat here.

In this bug report example, we are asking gemma3 (via ollama of course!) to extract a list of facts and related entities from news articles by providing an instructional prompt, with the article that should be parsed. The example article, with title, is ~ 9600 characters (2k tokens).

Below, when the format option is used, the same cached response appears to be returned consistently, even when the instructional prompt is changed to something nonsensical. A substantial change in the length of the article content does seem to have some effect.

Versions:
Ollama: {"version" : "0.6.0" }
Models: gemma3, 4.3B params, Q4_K_M
Kubernetes: v1.32.1

First prompt, without format argument.

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "I will give you an article. Extract from it as many facts as possible"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.title}\n#{a.publishedAt}\n#{a.content}"}
newsapi(dev)> ])
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T13:59:39.616754526Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "Okay, here's a breakdown of the article, categorized with key takeaways and a summary of its core arguments:\n\n**I. Core Argument & Critique of Traditional DCF Valuation**\n\n* **DCF is Flawed:** The article fundamentally argues that relying *solely* on Discounted Cash Flow (DCF) analysis is often misleading. It highlights that a significant portion of the \"value\" derived from DCF calculations rests on highly uncertain assumptions, particularly the terminal value (the estimated value of the company beyond the explicit forecast period).\n* **Terminal Value is the Weak Link:** The article emphasizes that the terminal value is the most subjective and prone to error in DCF models. It’s often based on optimistic projections of future growth that rarely materialize.\n* **Beware of \"Magic Numbers\":** The article cautions against treating DCF output as a definitive “magic number.” It’s a tool, not a gospel.\n\n\n**II. Key Principles for Investing (Beyond DCF)**\n\n* **Focus on Operating Cash Flow (OCF):** This is presented as the *most* important investment screen. A company must consistently generate enough OCF to cover its expenses. This is a fundamental test of a business’s viability.\n* **Realistic Imagination (for Terminal Value):** Instead of forcing a terminal value based on overly optimistic growth, the article advocates for “realistic imagination.” This means considering how a sector or product might evolve, factoring in potential shifts in consumer needs or regulatory landscapes.\n* **Buy Below Fair Value:**  Identify a company’s “fair value” (which may be based on a range of estimates, not just DCF) and purchase it at a discount to that level.  This incorporates a margin of safety.\n* **Normalize Cash Yield:**  Estimate the average cash flow a company can generate over a business cycle (typically 3-4 years) and compare it to the current market valuation. This provides a more grounded view of a company’s potential.\n\n\n**III. 
Investment Strategy & Time Horizon**\n\n* **Longer Holding Periods:** The article recommends a holding period of 5 years as a sweet spot. This allows for real fundamentals to emerge, reduces the impact of short-term market noise, and facilitates compounding.\n* **Time as a Filter:**  Historical data shows that longer holding periods generally improve the risk-return balance. Time acts as a powerful filter, smoothing out volatility.\n* **Resilient Portfolio Construction:**  Build a portfolio that can withstand market storms by focusing on companies with strong cash flow generation and a margin of safety.\n\n**IV.  Shared DNA of Successful Companies (Examples)**\n\n* **Amazon, Tesla, Apple:** The article uses these companies as examples of businesses that demonstrate resilience and enduring value. They are characterized by consistent cash flow generation and adaptability.\n\n\n\n**In essence, the article advocates for a more holistic and pragmatic approach to investing, moving beyond the potentially misleading precision of DCF and embracing a longer-term perspective focused on fundamental cash flow generation and a healthy margin of safety.**\n\n---\n\nDo you want me to:\n\n*   Summarize a specific section of the article in more detail?\n*   Analyze the article's strengths and weaknesses?\n*   Generate questions based on the article's content?"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 354558840787,
 "load_duration" => 2254856491,
 "prompt_eval_count" => 2048,
 "prompt_eval_duration" => 191752000000,
 "eval_count" => 676,
 "eval_duration" => 160506000000}

Second identical prompt, also without format. Response, though similar, is not identical (this is fine).

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "I will give you an article. Extract from it as many facts as possible"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.title}\n#{a.publishedAt}\n#{a.content}"}
newsapi(dev)> ])
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:05:20.588578371Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "Okay, here's a breakdown of the article, categorized for clarity and with key takeaways:\n\n**I. Core Argument & Critique of DCF Valuation**\n\n* **The Problem with DCF:** The article strongly argues that relying *solely* on Discounted Cash Flow (DCF) analysis is often misleading.  It emphasizes that a significant portion (80%) of the “value” derived from a DCF model is based on highly uncertain terminal value assumptions – essentially, predicting the distant future.\n* **Fragility of DCF:** The model is inherently fragile because it’s so sensitive to the assumptions made about long-term growth rates and discount rates.\n* **Focus on Realistic Imagination:** The article advocates for shifting from a purely numerical approach to a more qualitative “realistic imagination” – considering multiple potential future scenarios and a company’s adaptability.\n\n**II. Key Principles for Investing**\n\n1. **Cash Flow is King:**\n   * **Operating Cash Flow (OCF) as the Primary Screen:** The most important factor is a company's ability to generate sufficient OCF to cover its expenses.  If a company can't consistently produce OCF, it’s a red flag.\n   * **Normalized Cash Yield:**  This is a key concept – estimating the average cash flow a company can generate over a 3-4 year cycle and comparing it to the current market valuation.  It's like a “yield” on an equity investment.\n\n2. **Long-Term Perspective:**\n   * **5-Year Holding Period:** The article suggests a 5-year holding period is often optimal – long enough to allow fundamentals to play out, but short enough to avoid being overly influenced by short-term market fluctuations.\n   * **Time as a Filter:**  Longer holding periods tend to smooth out volatility and improve the risk-return balance.\n\n3. 
**Margin of Safety:**\n   * **Buy Below Fair Value:**  Always aim to purchase assets below their estimated “fair value.”\n   * **Resilience:** Companies that can withstand economic downturns and maintain their cash flow are more likely to succeed over the long term.\n\n**III. Identifying Winners – Characteristics of Durable Companies**\n\n* **Amazon, Tesla, Apple – The “DNA”:** These companies are presented as examples of businesses with enduring qualities:\n    * **Strong Cash Flow Generation:** They consistently generate significant OCF.\n    * **Adaptability:** They demonstrate the ability to evolve and remain relevant in changing market conditions.\n    * **Resilience:** They’ve shown the ability to weather economic storms.\n\n**IV. Portfolio Construction & Investment Philosophy**\n\n* **Focus on Quality:** Prioritize companies with strong cash flow, adaptability, and resilience.\n* **Patient Approach:**  Don't chase short-term gains.  Let compounding work its magic over time.\n* **Building a Resilient Portfolio:**  Diversify, but focus on companies with durable characteristics.\n\n\n\n**In essence, the article promotes a more holistic and patient approach to investing, emphasizing the importance of cash flow, adaptability, and a long-term perspective.** It’s a cautionary tale against relying too heavily on mathematical models and encourages investors to develop a deeper understanding of the businesses they invest in.\n\n---\n\nWould you like me to:\n\n*   Summarize a specific section in more detail?\n*   Answer a particular question about the article?\n*   Generate a list of key takeaways?"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 332303953100,
 "load_duration" => 2639887280,
 "prompt_eval_count" => 2048,
 "prompt_eval_duration" => 189785000000,
 "eval_count" => 720,
 "eval_duration" => 139833000000}

Now, we introduce a format with the following schema:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "fact": {
        "type": "string"
      },
      "date": {
        "type": "string"
      },
      "entities": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "fact",
      "date",
      "entities"
    ]
  }
}

Here, we provide the format listed above, and we get a cached response.

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "I will give you an article. Extract from it as many facts as possible"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.title}\n#{a.publishedAt}\n#{a.content}"}
newsapi(dev)> ], format: format)
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:09:22.983885988Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n  {\n    \"fact\": \"The article emphasizes the limitations of DCF (Discounted Cash Flow) valuation, arguing that a significant portion of the calculated 'value' relies on uncertain terminal assumptions.\",\n    \"date\": \"Throughout the article\"\n  ,\n  \"entities\": [\"DCF\", \"Terminal Value\", \"Amazon\", \"Tesla\", \"Apple\", \"S&P 500\"]\n}\n]"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 215217546152,
 "load_duration" => 2463070585,
 "prompt_eval_count" => 2048,
 "prompt_eval_duration" => 192975000000,
 "eval_count" => 94,
 "eval_duration" => 19224000000}

Let's change the instructions in the prompt to "All you do is say happy birthday", but with the article still attached. We see that we still get the same cached response, even though the instructions have changed dramatically.

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "All you do is say happy birthday"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.title}\n#{a.publishedAt}\n#{a.content}"}
newsapi(dev)> ], format: format)
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:14:06.428583326Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n  {\n    \"fact\": \"The article emphasizes the limitations of DCF (Discounted Cash Flow) valuation, arguing that 80% of the calculated ‘value’ relies on uncertain terminal assumptions.\"\n  ,\n  \"date\": \"October 26, 2023\"\n  ,\n  \"entities\": [\"DCF valuation\", \"Amazon\", \"Tesla\", \"Apple\", \"S&P 500\", \"normalized cash yield\"]\n}\n]"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 221272249553,
 "load_duration" => 2303993079,
 "prompt_eval_count" => 2048,
 "prompt_eval_duration" => 192038000000,
 "eval_count" => 105,
 "eval_duration" => 26429000000}

Let's further test things by removing the article title and date to prove that it's providing a cached result (as the date is not in the article).

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "All you do is say happy birthday"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.content}"}
newsapi(dev)> ], format: format)
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:24:00.142707945Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n  {\n    \"fact\": \"The article emphasizes the limitations of DCF valuation, particularly the reliance on terminal value assumptions, which are often overly optimistic and prone to error.\",\n    \"date\": \"October 26, 2023\"\n  ,\n  \"entities\": [\n    \"Amazon\",\n    \"Apple\",\n    \"Tesla\",\n    \"S&P 500\"\n  ]\n}\n]"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 216398898821,
 "load_duration" => 2210238573,
 "prompt_eval_count" => 2047,
 "prompt_eval_duration" => 193039000000,
 "eval_count" => 98,
 "eval_duration" => 20638000000}

However, if we provide only the first half of the article, then we get an empty array response (which we would expect, because the instructions and the format are in conflict).

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "All you do is say happy birthday"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.content[0..(a.content.length/2)]}"}
newsapi(dev)> ], format: format)
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:34:22.335167517Z",
 "message" => {"role" => "assistant", "content" => "[ ]"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 105913645208,
 "load_duration" => 2189835287,
 "prompt_eval_count" => 1091,
 "prompt_eval_duration" => 102657000000,
 "eval_count" => 3,
 "eval_duration" => 566000000}

The same thing happens if we give only the second half of the article: an empty array, because no sensible response is possible.

newsapi(dev)* o.chat([
newsapi(dev)*     { role: :user, content: "All you do is say happy birthday"},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.content[(a.content.length/2)..]}"}
newsapi(dev)> ], format: format)
=>
{"model" => "gemma3:latest",
 "created_at" => "2025-03-14T14:36:42.213911452Z",
 "message" => {"role" => "assistant", "content" => "[ ]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 93354048044,
 "load_duration" => 2150341175,
 "prompt_eval_count" => 1002,
 "prompt_eval_duration" => 89970000000,
 "eval_count" => 4,
 "eval_duration" => 698000000}

The test code being used:

  require 'net/http'
  require 'uri'
  require 'json'

  # URL and models are defined elsewhere in the surrounding class.
  def chat(messages, options={})
    model = options[:model] || models.first

    url = URI.parse("#{URL}/api/chat")
    http_client = Net::HTTP.new(url.host, url.port)
    http_client.use_ssl = true
    # Very long read timeout: CPU-only inference takes minutes per request.
    http_client.read_timeout = 10_000_000

    req = {
      model: model,
      messages: messages,
      stream: false
    }
    # Only send format when the caller supplied one.
    req[:format] = options[:format] if options[:format]

    request = Net::HTTP::Post.new(url.request_uri)
    request.content_type = 'application/json'
    request.body = req.to_json
    response = http_client.request(request)
    JSON.parse(response.body)
  end

I would post the article content, but I don't want to cause a copyright issue, so the example article is here:
https://blogs.cfainstitute.org/investor/2025/01/13/the-discounted-cash-flow-dilemma-a-tool-for-theorists-or-practitioners/

Relevant log output


OS

Ubuntu Linux latest

GPU

none

CPU

5x nuc N150

Ollama version

0.6.0

GiteaMirror added the bug label 2026-05-04 13:58:39 -05:00

@rick-github commented on GitHub (Mar 14, 2025):

 "prompt_eval_count" => 2048,

This indicates that the context buffer was completely full. A quick token count of the article gives around 2800, so going in, ollama is going to do two passes over the input - the first will drop messages, so the instructions are removed, and since the prompt is still > 2048, the text is truncated. It's admirable that gemma3 still recognizes that you want a summary.
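A rough client-side check for this condition could look like the sketch below. It assumes a context size of 2048 tokens, as implied by the responses in the report. `context_saturated?`, `DEFAULT_NUM_CTX` and the `slack` parameter are hypothetical names for illustration, not part of ollama or of the reporter's code. The one-token slack is also an assumption, added because one of the reported responses shows a count of 2047.

```ruby
# Assumption: 2048 is the context size in effect for the reported requests.
DEFAULT_NUM_CTX = 2048

# Hypothetical helper: returns true when the reported prompt token count is at
# (or within `slack` tokens of) the context size, which suggests that the
# prompt was shortened before the model saw it.
def context_saturated?(response, num_ctx = DEFAULT_NUM_CTX, slack: 1)
  count = response["prompt_eval_count"]
  return false if count.nil?

  count >= num_ctx - slack
end

full_article = { "prompt_eval_count" => 2048 }
half_article = { "prompt_eval_count" => 1091 }

puts context_saturated?(full_article)
puts context_saturated?(half_article)
```

A check like this only hints at truncation; it cannot tell which messages were dropped.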

When you change the instruction to `All you do is say happy birthday`, the prompt is once again too big, so ollama removes the instructions and gemma3 just produces a summary as before.

When you start shortening the article, the instruction survives, but as you note, the format is incompatible with the instruction.

I think if you set `num_ctx` to something larger (e.g. 4096), you will get results in line with the instructions.
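
A minimal sketch of that suggestion, assuming the request is built the same way as in the report: `num_ctx` goes inside the `options` object of the `/api/chat` body. The helper name `build_chat_request` and the 4096 default are illustrative assumptions, and nothing is sent over the network here.

```ruby
require 'json'

# Hypothetical helper (not from the original report): builds the JSON body
# for POST /api/chat with an explicit context window.
def build_chat_request(model, messages, num_ctx: 4096, format: nil)
  req = {
    model: model,
    messages: messages,
    stream: false,
    options: { num_ctx: num_ctx } # larger than the 2048 that was filled above
  }
  # Only include the format key when the caller asked for structured output
  req[:format] = format if format
  req
end

body = build_chat_request(
  'gemma3',
  [{ role: :user, content: 'Extract facts from this article' }],
  num_ctx: 4096
)
puts JSON.pretty_generate(body)
```

The resulting hash can be assigned to `request.body` via `to_json`, as the test code in the report already does.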


@Steve0929 commented on GitHub (Mar 14, 2025):

I think there is definitely some kind of caching issue. I tested sending Prompt A to Gemma 3, and then I sent a completely different Prompt B. However, the response to Prompt B included information from Prompt A, which couldn’t have been inferred from Prompt B alone.

Ollama 0.6.0
Gemma3, 4b Q4_K_M


@rick-github commented on GitHub (Mar 14, 2025):

Can you provide more information, or better yet, code that demonstrates the problem?


@jdblack commented on GitHub (Mar 15, 2025):

> Can you provide more information, or better yet, code that demonstrates the problem?

Happy to! The code that I was using is at the end of the bug report (under "The test code being used"). I'll also paste the current code I'm testing with at the end of this comment.

I'm actively testing with new code that attempts to calculate and set num_ctx on the fly. I'll close the issue once I've verified that's the problem. It's taking some time though; I'm trying to run the gemma3 4.3b model on 4 core systems with N150s, so it can take some time between tests.
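
One way such an on-the-fly estimate could look, as a rough sketch only: it assumes about 4 characters per token, which is a rule of thumb rather than the model's tokenizer. The counts quoted earlier in this thread (around 2800 tokens for roughly 9600 characters) suggest that heuristic can undercount, so the sketch adds headroom for the reply and rounds up. The names `estimate_num_ctx`, `reply_budget`, and `step` are hypothetical.

```ruby
# Assumed average characters per token; a heuristic, not a tokenizer.
CHARS_PER_TOKEN = 4.0

# Hypothetical estimator: picks a num_ctx from the total message length,
# adds room for the generated reply, and rounds up to a multiple of `step`.
def estimate_num_ctx(messages, reply_budget: 1024, step: 1024)
  chars = messages.sum { |m| m[:content].to_s.length }
  needed = (chars / CHARS_PER_TOKEN).ceil + reply_budget
  ((needed + step - 1) / step) * step
end

messages = [{ role: :user, content: 'x' * 9600 }]
puts estimate_num_ctx(messages)
```

Counting tokens with the model's real tokenizer would be more reliable; the rounding here only keeps the value from changing on every request.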

If that doesn't work out, I'll put together a proof of concept script into git that you can view. Would ruby be ok for you, or should I write it in a different language?

require 'json'
require 'net/http'
require 'uri'

class Ollama
  URL = 'https://ollama'.freeze

  def chat(messages, options={})
    # Rough token estimate, printed for reference only: word count * 1.2
    est_context = messages.to_json.split.count * 1.2
    puts "Calculated context is #{est_context}"

    model = options[:model] || models.first
    num_ctx = options[:num_ctx]

    url = URI.parse("#{URL}/api/chat")
    http_client = Net::HTTP.new(url.host, url.port)
    http_client.use_ssl = true
    http_client.read_timeout = 10_000_000

    req = {
      model: model,
      messages: messages,
      stream: false
    }
    # Only send num_ctx and format when the caller supplied them
    req[:options] = { num_ctx: num_ctx } if num_ctx
    req[:format] = options[:format] if options[:format]

    request = Net::HTTP::Post.new(url.request_uri)
    request.content_type = 'application/json'
    request.body = req.to_json
    response = http_client.request(request)
    JSON.parse(response.body)
  end

  # Names of the models available on the server
  def models
    res = Net::HTTP.get(URI.parse("#{URL}/api/tags"))
    JSON.parse(res)['models'].map { |model| model['name'] }
  end
end

I'm currently calling it by hand in the Rails console with the following:

newsapi(dev)> reload!
Reloading...
=> nil
newsapi(dev)> o = Ollama.new
=> #<Ollama:0x00000001326e9e88>
newsapi(dev)* puts Time.now; puts o.chat([
newsapi(dev)"     { role: :user, content: "You are a news reporter that specializes in extracting facts.  Extract  a long list of as many key facts as you can
newsapi(dev)"  find from the article as you can find.  You should be able to find many facts.  Each fact can be 1-3 sentences because each must be self contained.   Make your best
newsapi(dev)*  guess at the timestamp of each fact, relying on the article's date if you can't find one. Then, for each fact,  identify every associated entity for the fact."},
newsapi(dev)*     { role: :assistant, content: "Ok"},
newsapi(dev)*     { role: :user, content: "#{a.title}\n#{a.publishedAt}\n#{a.content}"}
newsapi(dev)> ], format: format) rescue  puts Time.now; puts Time.now

@rick-github commented on GitHub (Mar 15, 2025):

> Happy to! The code that I was using is at the end of the bug report ( Under the "The test code being used").

@jdblack Your initial report was quite thorough, thanks for that. It's why I'm pretty sure the problem is the size of the context. My follow up question was for @Steve0929. I think that's a different problem, for which more information is required for debugging.


@jdblack commented on GitHub (Mar 15, 2025):

> > Happy to! The code that I was using is at the end of the bug report ( Under the "The test code being used").
>
> @jdblack Your initial report was quite thorough, thanks for that. It's why I'm pretty sure the problem is the size of the context. My follow up question was for @Steve0929. I think that's a different problem, for which more information is required for debugging.

Ahhh, sorry! It was context, exactly as you predicted!

newsapi(dev)> 5.times do pp  o.tester(id);  end
{"model" => "gemma3:latest",
 "created_at" => "2025-03-15T08:07:23.824674826Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n" +
    "  {\n" +
    "    \"fact\": \"Donald Trump has returned to the U.S. presidency.\",\n" +
    "    \"date\": \"Unspecified - implied to be the present day\",\n" +
    "    \"entities\": [\"Donald Trump\", \"Oval Office\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump intends to sign dozens of executive orders upon his return.\",\n" +
    "    \"date\": \"Unspecified - implied to be the present day\",\n" +
    "    \"entities\": [\"executive orders\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The executive orders will likely address border policy, the environment, trade, and a potential TikTok ban delay.\",\n" +
    "    \"date\": \"Unspecified - implied to be the present day\",\n" +
    "    \"entities\": [\"border policy\", \"environment\", \"trade\", \"TikTok\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump’s administration will likely impact tech companies through tariffs, surveillance, climate change policy, antitrust cases, and immigration policies.\",\n" +
    "    \"date\": \"Next four years\",\n" +
    "    \"entities\": [\"Apple\", \"China\", \"US surveillance state\", \"climate change\", \"H-1B visas\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The administration will likely challenge legal challenges to its policies.\",\n" +
    "    \"date\": \"Next four years\",\n" +
    "    \"entities\": [\"court\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The EU’s Digital Services Act (DSA) and Digital Markets Act (DMA) laws will be a point of contention.\",\n" +
    "    \"date\": \"Unspecified - implied to be the present day\",\n" +
    "    \"entities\": [\"EU\", \"Digital Services Act\", \"Digital Markets Act\"]\n" +
    "  }\n" +
    "]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 127055925176,
 "load_duration" => 2609077213,
 "prompt_eval_count" => 622,
 "prompt_eval_duration" => 56534000000,
 "eval_count" => 360,
 "eval_duration" => 67334000000}
{"model" => "gemma3:latest",
 "created_at" => "2025-03-15T08:09:38.773502603Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n" +
    "  {\n" +
    "    \"fact\": \"Donald Trump has returned to office and intends to sign dozens of executive orders.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Donald Trump\", \"Oval Office\", \"Executive Orders\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Initial executive orders are expected to focus on border policy, the environment, trade, and a potential TikTok ban delay.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Border Policy\", \"Environment\", \"Trade\", \"TikTok\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Key areas of impact for tech companies under Trump’s administration include tariffs on goods assembled in China (particularly by Apple), the deployment of the US surveillance state for deportation purposes, and reduced US involvement in global climate change efforts.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"China\", \"Apple\", \"US Surveillance State\", \"Climate Change\", \"H-1B Visas\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump is anticipated to challenge legal challenges to his policies, testing the limits of his presidential power.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Legal Challenges\", \"Presidential Power\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Potential legal battles will involve areas such as tariffs, surveillance, and climate change policies.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Tariffs\", \"Surveillance\", \"Climate Change\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Other areas of concern include antitrust regulations, data privacy, and the use of AI.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Antitrust\", \"Data Privacy\", \"AI\"]\n" +
    "  }\n" +
    "]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 134870384932,
 "load_duration" => 2329031430,
 "prompt_eval_count" => 622,
 "prompt_eval_duration" => 53963000000,
 "eval_count" => 444,
 "eval_duration" => 78072000000}
{"model" => "gemma3:latest",
 "created_at" => "2025-03-15T08:12:07.145122611Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n" +
    "  {\n" +
    "    \"fact\": \"Donald Trump has returned to the presidency and plans to sign dozens of executive orders.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Donald Trump\", \"Oval Office\", \"Executive Orders\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Initial executive orders are expected to focus on border policy, the environment, trade, and a potential TikTok ban delay.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Border Policy\", \"Environment\", \"Trade\", \"TikTok\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump’s administration will likely impact tech companies through tariffs, surveillance, climate change policies, antitrust cases, and regulatory decisions.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Tariffs\", \"Apple\", \"China\", \"Surveillance State\", \"Climate Change\", \"Antitrust Cases\", \"Environmental Regulation\", \"US Surveillance State\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Areas of concern include the future of H-1B visas, the EU’s DSA and DMA laws, and Section 230 liability shield for tech companies.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"H-1B Visas\", \"EU\", \"DSA\", \"DMA\", \"Section 230\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The administration will likely face legal challenges regarding its policies, particularly those related to data privacy and surveillance.\",\n" +
    "    \"date\": \"October 26, 2023 (Implied)\",\n" +
    "    \"entities\": [\"Data Privacy\", \"Surveillance\"]\n" +
    "  }\n" +
    "]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 148266882058,
 "load_duration" => 56944659,
 "prompt_eval_count" => 622,
 "prompt_eval_duration" => 55150000000,
 "eval_count" => 399,
 "eval_duration" => 93058000000}
{"model" => "gemma3:latest",
 "created_at" => "2025-03-15T08:14:01.985859313Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n" +
    "  {\n" +
    "    \"fact\": \"Donald Trump has returned to the office of the President of the United States.\",\n" +
    "    \"date\": \"October 26, 2023\",\n" +
    "    \"entities\": [\"Donald Trump\", \"Oval Office\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump intends to sign dozens of executive orders upon his return, focusing on border policy, the environment, trade, and potentially delaying the TikTok ban.\",\n" +
    "    \"date\": \"October 26, 2023\",\n" +
    "    \"entities\": [\"executive orders\", \"border policy\", \"environment\", \"trade\", \"TikTok\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Several key areas will be impacted by Trump’s decisions, including tariffs against China (specifically Apple), the US surveillance state, climate change efforts, antitrust cases, electric vehicle incentives, FCC spectrum licenses, Section 230 liability, immigration policy (H-1B visas), and the EU’s DSA and DMA laws.\",\n" +
    "    \"date\": \"October 26, 2023\",\n" +
    "    \"entities\": [\"China\", \"Apple\", \"US surveillance state\", \"climate change\", \"antitrust cases\", \"electric vehicles\", \"FCC\", \"Section 230\", \"H-1B visas\", \"EU\", \"DSA\", \"DMA\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump’s policies are expected to face legal challenges, testing the limits of his power.\",\n" +
    "    \"date\": \"October 26, 2023\",\n" +
    "    \"entities\": [\"legal challenges\"]\n" +
    "  }\n" +
    "]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 114787944482,
 "load_duration" => 2258733285,
 "prompt_eval_count" => 622,
 "prompt_eval_duration" => 52651000000,
 "eval_count" => 348,
 "eval_duration" => 59392000000}
{"model" => "gemma3:latest",
 "created_at" => "2025-03-15T08:16:14.848943297Z",
 "message" =>
  {"role" => "assistant",
   "content" =>
    "[\n" +
    "  {\n" +
    "    \"fact\": \"Donald Trump has returned to the Oval Office.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"Donald Trump\", \"Oval Office\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump plans to sign dozens of executive orders on his first day back.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"executive orders\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The executive orders are expected to address border policy, the environment, trade, and a potential TikTok ban.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"border policy\", \"environment\", \"trade\", \"TikTok\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Trump's policies will likely impact tech companies through tariffs, surveillance, climate change efforts, antitrust cases, and immigration policies.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"Apple\", \"China\", \"US surveillance state\", \"climate change\", \"antitrust cases\", \"H-1B visas\", \"EU’s DSA and DMA laws\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"Areas of concern include the future of Section 230 liability shield and the potential for the FCC to revoke spectrum licenses.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"Section 230\", \"FCC\"]\n" +
    "  },\n" +
    "  {\n" +
    "    \"fact\": \"The impact of these policies will be felt across various sectors, including technology and international relations.\",\n" +
    "    \"date\": \"October 26, 2023 (implied)\",\n" +
    "    \"entities\": [\"technology\", \"international relations\"]\n" +
    "  }\n" +
    "]\n"},
 "done_reason" => "stop",
 "done" => true,
 "total_duration" => 132804690845,
 "load_duration" => 2299108361,
 "prompt_eval_count" => 622,
 "prompt_eval_duration" => 54162000000,
 "eval_count" => 417,
 "eval_duration" => 75837000000}
=> **5**
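
The diagnosis in this thread rests on comparing `prompt_eval_count` with the context size: a count equal to the window (2048 in the first report) suggests the prompt was cut, while 622 in the runs above leaves room. A small sketch of that check follows; the helper name `context_saturated?` is hypothetical and the 2048 window is the value inferred earlier in the thread, not something the API response reports directly.

```ruby
# Hypothetical check: true when the evaluated prompt filled the whole
# context window, which hints that messages may have been dropped or truncated.
def context_saturated?(response, num_ctx)
  response.fetch('prompt_eval_count', 0) >= num_ctx
end

truncated_run = { 'prompt_eval_count' => 2048 }
fitting_run   = { 'prompt_eval_count' => 622 }

puts context_saturated?(truncated_run, 2048)
puts context_saturated?(fitting_run, 2048)
```

A check like this could run after every `chat` call so that a silently truncated prompt is flagged instead of being mistaken for a caching problem.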

@rick-github commented on GitHub (Mar 15, 2025):

OK, I'm closing this as resolved. @Steve0929, please open a new issue and include logs and examples.

Reference: github-starred/ollama#68445