[GH-ISSUE #6176] System Prompts can not work on the first round. #65896

New Issue

GiteaMirror · 2026-05-03T23:06:14-05:00

GiteaMirror commented

2026-05-03 23:06:14 -05:00

Originally created by @DirtyKnightForVi on GitHub (Aug 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6176

What is the issue?

Description

Bug Summary:
System Prompts can not work on the first round.

Actual Behavior:
For a specific task scenario, there might be special System Prompts. However, in the current version (at least starting from 3.10), an additional round of conversation is needed before these System Prompts can take effect.

Assuming the scenario is to generate SQL.

In the previous normal version, you could set the System Prompts in Setting --- General, then start inputting questions to generate SQL.

However, in the current version, after inputting a question, an additional round of Q&A with the LLM is required to produce the desired SQL.

model

WizardLM2-8x22b

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.3

https://github.com/open-webui/open-webui/discussions/4381#discussion-7014017

Originally created by @DirtyKnightForVi on GitHub (Aug 5, 2024). Original GitHub issue: https://github.com/ollama/ollama/issues/6176 # What is the issue? ## Description **Bug Summary:** System Prompts can not work on the first round. **Actual Behavior:** For a specific task scenario, there might be special System Prompts. However, in the current version (at least starting from 3.10), an additional round of conversation is needed before these System Prompts can take effect. Assuming the scenario is to generate SQL. In the previous normal version, you could set the System Prompts in Setting --- General, then start inputting questions to generate SQL. However, in the current version, after inputting a question, an additional round of Q&A with the LLM is required to produce the desired SQL. ## model WizardLM2-8x22b ### OS Linux ### GPU Nvidia ### CPU Intel ### Ollama version 0.3.3 https://github.com/open-webui/open-webui/discussions/4381#discussion-7014017

GiteaMirror added the bug label 2026-05-03 23:06:14 -05:00

GiteaMirror closed this issue

2026-05-03 23:06:15 -05:00

GiteaMirror commented

2026-05-03 23:06:17 -05:00

@DirtyKnightForVi commented on GitHub (Aug 5, 2024):

It seems that after the version update of Ollama, there is a setting that prevents the large model from answering certain questions. However, my task is related to confidential financial data, and it requires a response.

@DirtyKnightForVi commented on GitHub (Aug 5, 2024): It seems that after the version update of Ollama, there is a setting that prevents the large model from answering certain questions. However, my task is related to confidential financial data, and it requires a response.

GiteaMirror commented

2026-05-03 23:06:18 -05:00

@rick-github commented on GitHub (Aug 5, 2024):

You will need to provide more information to debug this: the system prompt you are using, a sample query that doesn't return the correct results, and ideally a capture of the request.

A simple test shows that ollama responds using the guidance of the system prompt on the first interaction:

$ curl localhost:11434/api/version
{"version":"0.3.3"}
$ curl -s localhost:11434/api/chat -d '{
    "model":"glm4",
    "messages":[
        {"role":"system","content":"reply with SQL commands for a table with the following schema: `TABLE somerandomname ( name VARCHAR(255) PRIMARY KEY)`"},
        {"role":"user","content":"how do i get all names starting with the letter `R` from the table"}
    ],
    "format":"json",
    "stream":false}' | jq -r .message.content
{
  "query": "SELECT * FROM somerandomname WHERE name LIKE 'R%'"
}

Compare to the same question without the mention of SQL in the system prompt:

$ curl -s localhost:11434/api/chat -d '{"model":"glm4","messages":[{"role":"system","content":"you are a helpful assistant"},{"role":"user","content":"how do i get all names starting with the letter `R` from the table"}],"format":"json","stream":false}' | jq -r .message.content
{"name": "John", "age": 25, "city": "New York"}

If you restart the ollama server with OLLAMA_DEBUG=1, the prompt sent to llama.cpp will be in the logs.

@rick-github commented on GitHub (Aug 5, 2024): You will need to provide more information to debug this: the system prompt you are using, a sample query that doesn't return the correct results, and ideally a capture of the request. A simple test shows that ollama responds using the guidance of the system prompt on the first interaction: ``` $ curl localhost:11434/api/version {"version":"0.3.3"} $ curl -s localhost:11434/api/chat -d '{ "model":"glm4", "messages":[ {"role":"system","content":"reply with SQL commands for a table with the following schema: `TABLE somerandomname ( name VARCHAR(255) PRIMARY KEY)`"}, {"role":"user","content":"how do i get all names starting with the letter `R` from the table"} ], "format":"json", "stream":false}' | jq -r .message.content { "query": "SELECT * FROM somerandomname WHERE name LIKE 'R%'" } ``` Compare to the same question without the mention of SQL in the system prompt: ``` $ curl -s localhost:11434/api/chat -d '{"model":"glm4","messages":[{"role":"system","content":"you are a helpful assistant"},{"role":"user","content":"how do i get all names starting with the letter `R` from the table"}],"format":"json","stream":false}' | jq -r .message.content {"name": "John", "age": 25, "city": "New York"} ``` If you restart the ollama server with OLLAMA_DEBUG=1, the prompt sent to llama.cpp will be in the [logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues).

GiteaMirror commented

2026-05-03 23:06:21 -05:00

@Rudd-O commented on GitHub (Aug 8, 2024):

I have this issue too. Totally ignores role system messages. Leads to breakage in Home Assistant assist. https://github.com/home-assistant/core/issues/123316 I have verified that the role system is being sent by HA and also by open-webui.

Using latest freshest installation of ollama installed today with curl | bash.

@Rudd-O commented on GitHub (Aug 8, 2024): I have this issue too. Totally ignores role system messages. Leads to breakage in Home Assistant assist. https://github.com/home-assistant/core/issues/123316 I have verified that the role system is being sent by HA and also by open-webui. Using latest freshest installation of ollama installed today with curl | bash.

GiteaMirror commented

2026-05-03 23:06:22 -05:00

@Rudd-O commented on GitHub (Aug 8, 2024):

I can confirm that the system prompt is totally ignored. Here is a screenshot proving it -- and I have personally verified the system prompt text is sent as system role.

Also happens with mistral-nemo for the record.

@Rudd-O commented on GitHub (Aug 8, 2024): I can confirm that the system prompt is totally ignored. Here is a screenshot proving it -- and I have personally verified the system prompt text is sent as system role. Also happens with mistral-nemo for the record. ![webui](https://github.com/user-attachments/assets/8c0d44ad-4f79-4e45-a72c-c073c60ba2b1)

GiteaMirror commented

2026-05-03 23:06:23 -05:00

@Rudd-O commented on GitHub (Aug 8, 2024):

More grist for the mill:

The problem happens when the length of the system prompt exceeds a certain size. If the system prompt is short, it is taken into account. If in my case it exceeds about 13K characters, it is totally ignored top to bottom. If, however, I halve the amount of text I send in the system prompt — or I double the context token size to 4096 — BAM, it works!

(affects both mistral-nemo and llama3.1)

@Rudd-O commented on GitHub (Aug 8, 2024): More grist for the mill: The problem happens when the length of the system prompt exceeds a certain size. If the system prompt is short, it is taken into account. If in my case it exceeds about 13K characters, **it is totally ignored top to bottom**. If, however, I halve the amount of text I send in the system prompt — **or** I double the context token size to 4096 — BAM, it works! (affects both mistral-nemo and llama3.1)

GiteaMirror commented

2026-05-03 23:06:23 -05:00

@Rudd-O commented on GitHub (Aug 8, 2024):

proof it works. note context size has been increased to 4096. for reference, the name alfred is mentioned in the first line of the prompt, and the kitchen motion sensor temperature is maybe 10 lines above the last line.

@Rudd-O commented on GitHub (Aug 8, 2024): ![4096](https://github.com/user-attachments/assets/57b7a6d2-06b6-4720-9a02-bcf3990cf0e6) proof it works. note context size has been increased to 4096. for reference, the name alfred is mentioned in the first line of the prompt, and the kitchen motion sensor temperature is maybe 10 lines above the last line.

GiteaMirror commented

2026-05-03 23:06:24 -05:00

@DirtyKnightForVi commented on GitHub (Aug 8, 2024):

My System Prompt over 16K, but GLM-4 In open webui works well, WizardLM2 does not.
However, if any param set in workspace or chat control , example num_ctx, bigger LLM works well.

https://github.com/open-webui/open-webui/discussions/4399

@DirtyKnightForVi commented on GitHub (Aug 8, 2024): My System Prompt over 16K, but GLM-4 In open webui works well, WizardLM2 does not. However, if any param set in `workspace` or `chat control` , example `num_ctx`, bigger LLM works well. https://github.com/open-webui/open-webui/discussions/4399

GiteaMirror commented

2026-05-03 23:06:26 -05:00

@rick-github commented on GitHub (Aug 8, 2024):

https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726

@rick-github commented on GitHub (Aug 8, 2024): https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726

GiteaMirror commented

2026-05-03 23:06:27 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

I'm following, same issue with Llama3 and 3.1.
Short system prompt is considered, long one (not above context length limit) is discarded

error happened during debug:
"truncating input message which exceed context length"

Therefore only user prompt is considered...

@cannox227 commented on GitHub (Aug 13, 2024): I'm following, same issue with Llama3 and 3.1. Short system prompt is considered, long one (not above context length limit) is discarded error happened during debug: "truncating input message which exceed context length" Therefore only user prompt is considered...

GiteaMirror commented

2026-05-03 23:06:29 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

Server logs will help in debugging.

@rick-github commented on GitHub (Aug 13, 2024): [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will help in debugging.

GiteaMirror commented

2026-05-03 23:06:30 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

@rick-github
Actually, I'm already seeing them, that's why I was mentioned that error.
A better explained issue reference is explained from my colleague @vividfog in this issue here, since the same thing is happening.

@cannox227 commented on GitHub (Aug 13, 2024): @rick-github Actually, I'm already seeing them, that's why I was mentioned that error. A better explained issue reference is explained from my colleague @vividfog in this [issue here](https://github.com/ollama/ollama/issues/6253#issuecomment-2286521829), since the same thing is happening.

GiteaMirror commented

2026-05-03 23:06:35 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

Yes, but there are no relevant server logs. Posting one Debug line of interest: with no context is insufficient. If there's a bug somewhere, having information on how ollama and llama.cpp are processing tokens makes it easier to narrow down the fault. If multiple users are experiencing problems, then having multiple copies of server logs allows cross-checking to either eliminate or correlate possible causes. For example, I am unable to replicate these issues when context size is modified as per https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726. I would love to help those having this problem, but if I can't replicate it, I can't debug it.

@rick-github commented on GitHub (Aug 13, 2024): Yes, but there are no relevant server logs. Posting one `Debug line of interest:` with no context is insufficient. If there's a bug somewhere, having information on how ollama and llama.cpp are processing tokens makes it easier to narrow down the fault. If multiple users are experiencing problems, then having multiple copies of server logs allows cross-checking to either eliminate or correlate possible causes. For example, I am unable to replicate these issues when context size is modified as per https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726. I would love to help those having this problem, but if I can't replicate it, I can't debug it.

GiteaMirror commented

2026-05-03 23:06:36 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

@rick-github
I created a dummy example with a dummy prompt you can try.

I'm running llama3.1, according to these infos

  "parameters": "stop                           \"<|start_header_id|>\"\nstop                           \"<|end_header_id|>\"\nstop                           \"<|eot_id|>\"",
  "template": "{{ if .Messages }}\n{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>\n{{- if .System }}\n\n{{ .System }}\n{{- end }}\n{{- if .Tools }}\n\nYou are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the orginal use question.\n{{- end }}\n{{- end }}<|eot_id|>\n{{- range $i, $_ := .Messages }}\n{{- $last := eq (len (slice $.Messages $i)) 1 }}\n{{- if eq .Role \"user\" }}<|start_header_id|>user<|end_header_id|>\n{{- if and $.Tools $last }}\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n{{ $.Tools }}\n{{- end }}\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- else if eq .Role \"assistant\" }}<|start_header_id|>assistant<|end_header_id|>\n{{- if .ToolCalls }}\n\n{{- range .ToolCalls }}{\"name\": \"{{ .Function.Name }}\", \"parameters\": {{ .Function.Arguments }}}{{ end }}\n{{- else }}\n\n{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}\n{{- end }}\n{{- else if eq .Role \"tool\" }}<|start_header_id|>ipython<|end_header_id|>\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- end }}\n{{- end }}\n{{- else }}\n{{- if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}",
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": [
      "llama"
    ],
    "parameter_size": "8.0B",
    "quantization_level": "Q4_0"
  },
  "model_info": {
    "general.architecture": "llama",
    "general.basename": "Meta-Llama-3.1",
    "general.file_type": 2,
    "general.finetune": "Instruct",
    "general.languages": [
      "en",
      "de",
      "fr",
      "it",
      "pt",
      "hi",
      "es",
      "th"
    ],
    "general.license": "llama3.1",
    "general.parameter_count": 8030261312,
    "general.quantization_version": 2,
    "general.size_label": "8B",
    "general.tags": [
      "facebook",
      "meta",
      "pytorch",
      "llama",
      "llama-3",
      "text-generation"
    ],
    "general.type": "model",
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 131072,
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": null,
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": null,
    "tokenizer.ggml.tokens": null
  }

I asked an LLM to provide me a dummy system prompt, which I'm pasting two times so that I can have a "long" system prompt. The system prompt is concerning "Ice Cream Maker LLM instruction".

I'm sending to my ollama server instance this post request.

{
  "model": "llama3.1",
  "messages":[
    {
      "role": "system",
       "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity."
    },
    {
      "role": "user",
      "content": "Hi!"
    }
  ]
}

This is the answer i get

{
  "id": "chatcmpl-472",
  "object": "chat.completion",
  "created": 1723576946,
  "model": "llama3.1",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 24,
    "total_tokens": 37
  }
}

We both can see that there is something wrong with the amount of prompt_tokens.

Here there is a complete log from the server with DEBUG log level.

DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=118 tid="0x205ada080" timestamp=1723576942
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=119 tid="0x205ada080" timestamp=1723576942
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576942
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=120 tid="0x205ada080" timestamp=1723576942
DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112]
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [print_timings] prompt eval time     =    1473.49 ms /    13 tokens (  113.35 ms per token,     8.82 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=8.822567312795302 slot_id=0 t_prompt_processing=1473.494 t_token=113.3456923076923 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [print_timings] generation eval time =    2340.13 ms /    24 runs   (   97.51 ms per token,    10.26 tokens per second) | n_decoded=24 n_tokens_second=10.255822957146899 slot_id=0 t_token=97.50558333333333 t_token_generation=2340.134 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [print_timings]           total time =    3813.63 ms | slot_id=0 t_prompt_processing=1473.494 t_token_generation=2340.134 t_total=3813.6279999999997 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576946 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576946
[GIN] 2024/08/13 - 22:22:26 | 200 |  3.896133166s |       127.0.0.1 | POST     "/v1/chat/completions"
time=2024-08-13T22:22:26.255+03:00 level=DEBUG source=sched.go:403 msg="context for request finished"

As you can see from the logs here there is where the problem happens

time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

So it seems that with a 22k char prompt, the context length of 132k just explode.
Any idea on this? I hope it is reproducible enough 😄

@cannox227 commented on GitHub (Aug 13, 2024): @rick-github I created a dummy example with a dummy prompt you can try. I'm running llama3.1, according to these infos ```json "parameters": "stop \"<|start_header_id|>\"\nstop \"<|end_header_id|>\"\nstop \"<|eot_id|>\"", "template": "{{ if .Messages }}\n{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>\n{{- if .System }}\n\n{{ .System }}\n{{- end }}\n{{- if .Tools }}\n\nYou are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the orginal use question.\n{{- end }}\n{{- end }}<|eot_id|>\n{{- range $i, $_ := .Messages }}\n{{- $last := eq (len (slice $.Messages $i)) 1 }}\n{{- if eq .Role \"user\" }}<|start_header_id|>user<|end_header_id|>\n{{- if and $.Tools $last }}\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n{{ $.Tools }}\n{{- end }}\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- else if eq .Role \"assistant\" }}<|start_header_id|>assistant<|end_header_id|>\n{{- if .ToolCalls }}\n\n{{- range .ToolCalls }}{\"name\": \"{{ .Function.Name }}\", \"parameters\": {{ .Function.Arguments }}}{{ end }}\n{{- else }}\n\n{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}\n{{- end }}\n{{- else if eq .Role \"tool\" }}<|start_header_id|>ipython<|end_header_id|>\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- end }}\n{{- end }}\n{{- else }}\n{{- if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}", "details": { "parent_model": "", "format": "gguf", "family": "llama", "families": [ "llama" ], "parameter_size": "8.0B", "quantization_level": "Q4_0" }, "model_info": { "general.architecture": "llama", "general.basename": "Meta-Llama-3.1", "general.file_type": 2, "general.finetune": "Instruct", "general.languages": [ "en", "de", "fr", "it", "pt", "hi", "es", "th" ], "general.license": "llama3.1", "general.parameter_count": 8030261312, "general.quantization_version": 2, "general.size_label": "8B", "general.tags": [ "facebook", "meta", "pytorch", "llama", "llama-3", "text-generation" ], "general.type": "model", "llama.attention.head_count": 32, "llama.attention.head_count_kv": 8, "llama.attention.layer_norm_rms_epsilon": 0.00001, "llama.block_count": 32, "llama.context_length": 131072, "llama.embedding_length": 4096, "llama.feed_forward_length": 14336, "llama.rope.dimension_count": 128, "llama.rope.freq_base": 500000, "llama.vocab_size": 128256, "tokenizer.ggml.bos_token_id": 128000, "tokenizer.ggml.eos_token_id": 128009, "tokenizer.ggml.merges": null, "tokenizer.ggml.model": "gpt2", "tokenizer.ggml.pre": "llama-bpe", "tokenizer.ggml.token_type": null, "tokenizer.ggml.tokens": null } ``` I asked an LLM to provide me a dummy system prompt, which I'm pasting two times so that I can have a "long" system prompt. The system prompt is concerning "Ice Cream Maker LLM instruction". I'm sending to my ollama server instance this post request. ```json { "model": "llama3.1", "messages":[ { "role": "system", "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity." }, { "role": "user", "content": "Hi!" } ] } ``` This is the answer i get ```json { "id": "chatcmpl-472", "object": "chat.completion", "created": 1723576946, "model": "llama3.1", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 24, "total_tokens": 37 } } ``` We both can see that there is something wrong with the amount of prompt_tokens. Here there is a complete log from the server with DEBUG log level. ```log DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=118 tid="0x205ada080" timestamp=1723576942 DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=119 tid="0x205ada080" timestamp=1723576942 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576942 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=120 tid="0x205ada080" timestamp=1723576942 DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112] DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [print_timings] prompt eval time = 1473.49 ms / 13 tokens ( 113.35 ms per token, 8.82 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=8.822567312795302 slot_id=0 t_prompt_processing=1473.494 t_token=113.3456923076923 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [print_timings] generation eval time = 2340.13 ms / 24 runs ( 97.51 ms per token, 10.26 tokens per second) | n_decoded=24 n_tokens_second=10.255822957146899 slot_id=0 t_token=97.50558333333333 t_token_generation=2340.134 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [print_timings] total time = 3813.63 ms | slot_id=0 t_prompt_processing=1473.494 t_token_generation=2340.134 t_total=3813.6279999999997 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576946 truncated=false DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576946 [GIN] 2024/08/13 - 22:22:26 | 200 | 3.896133166s | 127.0.0.1 | POST "/v1/chat/completions" time=2024-08-13T22:22:26.255+03:00 level=DEBUG source=sched.go:403 msg="context for request finished" ``` As you can see from the logs here there is where the problem happens ``` time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" ``` So it seems that with a 22k char prompt, the context length of 132k just explode. Any idea on this? I hope it is reproducible enough 😄

GiteaMirror commented

2026-05-03 23:06:37 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

Thanks, let me dig in to this.

@rick-github commented on GitHub (Aug 13, 2024): Thanks, let me dig in to this.

GiteaMirror commented

2026-05-03 23:06:38 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

OK, I think there is a misunderstanding of how ollama manages the context size. The value of llama.context_length in the model parameter is the maximum context window that the model supports. However, because of the way the attention mechanism that transformer models use works, allocating the full context window is very expensive in terms of VRAM. The larger the context window, the less VRAM there is for loading the actual model weights. For that reason ollama doesn't allocate the context window from the value in the model: what ollama allocates is either determined by the option num_ctx in the API call, the PARAMETER num_ctx in the Modelfile, or by a default value of 2048 tokens. If the API call or the Modelfile don't specify num_ctx, ollama will use 2048.

So, to your example: if I used the request as given (with the exception of adding "stream":false), then yes, I get a poor response and errors in the logs:

{
  "model": "llama3.1",
  "created_at": "2024-08-13T20:39:02.605379472Z",
  "message": {
    "role": "assistant",
    "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 473421758,
  "load_duration": 20491602,
  "prompt_eval_count": 13,
  "prompt_eval_duration": 19839000,
  "eval_count": 24,
  "eval_duration": 285788000
}

ollama  | DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=40084 status=200 tid="140541018742784" timestamp=1723581542
ollama  | time=2024-08-13T20:39:02.256Z level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
ollama  | time=2024-08-13T20:39:02.256Z level=DEBUG source=routes.go:1346 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

What's interesting in the logs is that the entire system prompt is discarded as opposed to truncated, so that needs a look.

Now, let's take the request and add a larger context window, "options":{"num_ctx": 32000} (content shortened for space):

{
  "model": "llama3.1",
  "messages": [
    {
      "role": "system",
      "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging...."
    },
    {
      "role": "user",
      "content": "Hi!"
    }
  ],
  "options": {
    "num_ctx": 32000
  },
  "stream": false
}

This is the result of the new request:

{
  "model": "llama3.1",
  "created_at": "2024-08-13T20:45:11.208923511Z",
  "message": {
    "role": "assistant",
    "content": "Welcome to our virtual ice cream parlor! I'm thrilled to have you here. What brings you today? Are you in the mood for something classic and comforting, or perhaps something new and adventurous?\n\nWe've got all sorts of delicious flavors to choose from, made with love and care using only the freshest ingredients. Plus, I'd be happy to help you create your very own unique ice cream flavor!\n\nWhat would you like to do? Take a look at our menu, get inspired by some fun flavor ideas, or let me know what's on your mind and we can start from there!"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 5874873383,
  "load_duration": 2213909696,
  "prompt_eval_count": 4590,
  "prompt_eval_duration": 1822245000,
  "eval_count": 121,
  "eval_duration": 1734623000
}

I haven't read the system message to determine how accurate the response is, but the response mentions ice cream, so the system message wasn't ignored.

As you noted, prompt_eval_count seems low. It's understandable for the first attempt where the system message was completely dropped, but it seems like it should be higher for the second attempt. It could be that there's some sort of token caching going on, eg words that occur multiple times are tokenized just once and cached for later use. I'm not familiar with that part of llama.cpp but it could be an interesting dive in to the code.

@rick-github commented on GitHub (Aug 13, 2024): OK, I think there is a misunderstanding of how ollama manages the context size. The value of `llama.context_length` in the model parameter is the maximum context window that the model supports. However, because of the way the attention mechanism that transformer models use works, allocating the full context window is very expensive in terms of VRAM. The larger the context window, the less VRAM there is for loading the actual model weights. For that reason ollama doesn't allocate the context window from the value in the model: what ollama allocates is either determined by the option `num_ctx` in the API call, the `PARAMETER num_ctx` in the Modelfile, or by a default value of 2048 tokens. If the API call or the Modelfile don't specify `num_ctx`, ollama will use 2048. So, to your example: if I used the request as given (with the exception of adding `"stream":false`), then yes, I get a poor response and errors in the logs: ```json { "model": "llama3.1", "created_at": "2024-08-13T20:39:02.605379472Z", "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?" }, "done_reason": "stop", "done": true, "total_duration": 473421758, "load_duration": 20491602, "prompt_eval_count": 13, "prompt_eval_duration": 19839000, "eval_count": 24, "eval_duration": 285788000 } ``` ``` ollama | DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=40084 status=200 tid="140541018742784" timestamp=1723581542 ollama | time=2024-08-13T20:39:02.256Z level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 ollama | time=2024-08-13T20:39:02.256Z level=DEBUG source=routes.go:1346 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" ``` What's interesting in the logs is that the entire system prompt is discarded as opposed to truncated, so that needs a look. Now, let's take the request and add a larger context window, `"options":{"num_ctx": 32000}` (`content` shortened for space): ```json { "model": "llama3.1", "messages": [ { "role": "system", "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging...." }, { "role": "user", "content": "Hi!" } ], "options": { "num_ctx": 32000 }, "stream": false } ``` This is the result of the new request: ```json { "model": "llama3.1", "created_at": "2024-08-13T20:45:11.208923511Z", "message": { "role": "assistant", "content": "Welcome to our virtual ice cream parlor! I'm thrilled to have you here. What brings you today? Are you in the mood for something classic and comforting, or perhaps something new and adventurous?\n\nWe've got all sorts of delicious flavors to choose from, made with love and care using only the freshest ingredients. Plus, I'd be happy to help you create your very own unique ice cream flavor!\n\nWhat would you like to do? Take a look at our menu, get inspired by some fun flavor ideas, or let me know what's on your mind and we can start from there!" }, "done_reason": "stop", "done": true, "total_duration": 5874873383, "load_duration": 2213909696, "prompt_eval_count": 4590, "prompt_eval_duration": 1822245000, "eval_count": 121, "eval_duration": 1734623000 } ``` I haven't read the system message to determine how accurate the response is, but the response mentions ice cream, so the system message wasn't ignored. As you noted, `prompt_eval_count` seems low. It's understandable for the first attempt where the system message was completely dropped, but it seems like it should be higher for the second attempt. It could be that there's some sort of token caching going on, eg words that occur multiple times are tokenized just once and cached for later use. I'm not familiar with that part of llama.cpp but it could be an interesting dive in to the code.

GiteaMirror commented

2026-05-03 23:06:39 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

Thanks for your answer @rick-github, I did read this sort of "fix" in other issues. However, I did try to append

"options": {
    "num_ctx": 32000
  },
  "stream": false

to the request, but I get the same behaviour. (Truncated / omitted system prompt)

{
  "id": "chatcmpl-441",
  "object": "chat.completion",
  "created": 1723583662,
  "model": "llama3.1",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It's nice to meet you. Is there something I can help you with or would you like to chat?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 23,
    "total_tokens": 36
  }
}

I don't know if there's something related to hardware (I'm running ollama on the basic m1 processor).

If there is a lower level debug to do with llama.cpp I can help with that, but of course I'll need to be guided by someone more expert with the codebase 😄

LOGS

DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=58 tid="0x205ada080" timestamp=1723584168
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=59 tid="0x205ada080" timestamp=1723584168
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584168
time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=60 tid="0x205ada080" timestamp=1723584168
DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112]
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [print_timings] prompt eval time     =    1276.53 ms /    13 tokens (   98.19 ms per token,    10.18 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=10.183849824250254 slot_id=0 t_prompt_processing=1276.531 t_token=98.1946923076923 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [print_timings] generation eval time =    2067.85 ms /    24 runs   (   86.16 ms per token,    11.61 tokens per second) | n_decoded=24 n_tokens_second=11.606257707280509 slot_id=0 t_token=86.16041666666666 t_token_generation=2067.85 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [print_timings]           total time =    3344.38 ms | slot_id=0 t_prompt_processing=1276.531 t_token_generation=2067.85 t_total=3344.381 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584172 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584172
[GIN] 2024/08/14 - 00:22:52 | 200 |  3.428835917s |       127.0.0.1 | POST     "/v1/chat/completions"
time=2024-08-14T00:22:52.194+03:00 level=DEBUG source=sched.go:403 msg="context for request finished"

@cannox227 commented on GitHub (Aug 13, 2024): Thanks for your answer @rick-github, I did read this sort of "fix" in other issues. However, I did try to append ```json "options": { "num_ctx": 32000 }, "stream": false ``` to the request, but I get the same behaviour. (Truncated / omitted system prompt) ```json { "id": "chatcmpl-441", "object": "chat.completion", "created": 1723583662, "model": "llama3.1", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with or would you like to chat?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 23, "total_tokens": 36 } } ``` I don't know if there's something related to hardware (I'm running ollama on the basic m1 processor). If there is a lower level debug to do with llama.cpp I can help with that, but of course I'll need to be guided by someone more expert with the codebase 😄 LOGS ```log DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=58 tid="0x205ada080" timestamp=1723584168 DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=59 tid="0x205ada080" timestamp=1723584168 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584168 time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=60 tid="0x205ada080" timestamp=1723584168 DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112] DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [print_timings] prompt eval time = 1276.53 ms / 13 tokens ( 98.19 ms per token, 10.18 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=10.183849824250254 slot_id=0 t_prompt_processing=1276.531 t_token=98.1946923076923 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [print_timings] generation eval time = 2067.85 ms / 24 runs ( 86.16 ms per token, 11.61 tokens per second) | n_decoded=24 n_tokens_second=11.606257707280509 slot_id=0 t_token=86.16041666666666 t_token_generation=2067.85 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [print_timings] total time = 3344.38 ms | slot_id=0 t_prompt_processing=1276.531 t_token_generation=2067.85 t_total=3344.381 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584172 truncated=false DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584172 [GIN] 2024/08/14 - 00:22:52 | 200 | 3.428835917s | 127.0.0.1 | POST "/v1/chat/completions" time=2024-08-14T00:22:52.194+03:00 level=DEBUG source=sched.go:403 msg="context for request finished" ```

GiteaMirror commented

2026-05-03 23:06:39 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

Once again, server logs of the failure would be illuminating.

@rick-github commented on GitHub (Aug 13, 2024): Once again, server logs of the failure would be illuminating.

GiteaMirror commented

2026-05-03 23:06:40 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

Once again, server logs of the failure would be illuminating.

I did edit the previous comment, let me know if it is enough. However, it's the same error as cited before, nothing different.

@cannox227 commented on GitHub (Aug 13, 2024): > Once again, server logs of the failure would be illuminating. I did edit the previous comment, let me know if it is enough. However, it's the same error as cited before, nothing different.

GiteaMirror commented

2026-05-03 23:06:41 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

OK, I see from POST "/v1/chat/completions" that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting PARAMETER num_ctx in the Modelfile.

@rick-github commented on GitHub (Aug 13, 2024): OK, I see from `POST "/v1/chat/completions"` that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting `PARAMETER num_ctx` in the Modelfile.

GiteaMirror commented

2026-05-03 23:06:42 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

OK, I see from POST "/v1/chat/completions" that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting PARAMETER num_ctx in the Modelfile.

Ok, so should I hardcode a high value on the Modelfile, run the server and then perform OpenAI compatible calls because num_ctxwill always be ignored?
Can you provide an example @rick-github ?

@cannox227 commented on GitHub (Aug 13, 2024): > OK, I see from `POST "/v1/chat/completions"` that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting `PARAMETER num_ctx` in the Modelfile. Ok, so should I hardcode a high value on the Modelfile, run the server and then perform OpenAI compatible calls because `num_ctx`will always be ignored? Can you provide an example @rick-github ?

GiteaMirror commented

2026-05-03 23:06:42 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

num_ctx in OpenAI endpoints (localhost:11434/v1) are ignored. If you send a request with num_ctx to the ollama endpoints (localhost:11434/api) and it's different to the the value of num_ctx in the Modelfile, the model will be reloaded with the new context window size. If you never send num_ctx with a request to either endpoint, the model will continue to use the value of PARAMETER num_ctx.

@rick-github commented on GitHub (Aug 13, 2024): `num_ctx` in OpenAI endpoints (`localhost:11434/v1`) are ignored. If you send a request with `num_ctx` to the ollama endpoints (`localhost:11434/api`) and it's different to the the value of `num_ctx` in the Modelfile, the model will be reloaded with the new context window size. If you never send `num_ctx` with a request to either endpoint, the model will continue to use the value of `PARAMETER num_ctx`.

GiteaMirror commented

2026-05-03 23:06:43 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

Ok I think we've found the fix then, this is how I was able to run it (I saw from prompt that system prompt was included) 😂

DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x205ada080" timestamp=1723585063
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=57093 status=200 tid="0x16e08f000" timestamp=1723585063
time=2024-08-14T00:37:43.436+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nYou are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=2 tid="0x205ada080" timestamp=1723585063
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063
DEBUG [update_slots] slot progression | ga_i=0 n_past=0 n_past_se=0 n_prompt_tokens_processed=4590 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063

LLM RESPONSE

{
  "id": "chatcmpl-140",
  "object": "chat.completion",
  "created": 1723586678,
  "model": "llama3.1_big",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Welcome to our ice cream parlor! It's great to meet you! I'm your friendly Ice Cream Maker LLM, here to help you create the perfect scoop (or two, or three...)!\n\nWhat brings you in today? Do you have a favorite flavor combination in mind, or would you like some suggestions from me? Perhaps you're celebrating a special occasion and want an ice cream experience that's tailored just for you?\n\nLet's get this ice cream party started!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 4590,
    "completion_tokens": 96,
    "total_tokens": 4686
  }
}

Note: Prompt tokens count it is still low

Following this guide this was I did:

Creating a copy of the current llama3.1 model file

ollama show llama3.1 --modelfile > llama3_custom.modelfile

Editing the llama3_custom.modelfile file by appending this

PARAMETER num_ctx 32000

NOTE: probably the number can be higher

Create a copy of the original model with the custom modelfile

ollama create llama3.1_custom --file llama3_custom.modelfile

Running the server as before

ollama serve

Thanks @rick-github for your support!
What should we do now? Add the correct num_ctx to the original modelfile that has been uploaded to the official Ollama repo? I would be delighted if I can contribute to the project in any way! 😄

@cannox227 commented on GitHub (Aug 13, 2024): Ok I think we've found the fix then, this is how I was able to run it (**I saw from prompt that system prompt was included**) 😂 ```log DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x205ada080" timestamp=1723585063 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=57093 status=200 tid="0x16e08f000" timestamp=1723585063 time=2024-08-14T00:37:43.436+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nYou are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=2 tid="0x205ada080" timestamp=1723585063 DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 DEBUG [update_slots] slot progression | ga_i=0 n_past=0 n_past_se=0 n_prompt_tokens_processed=4590 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 ``` LLM RESPONSE ```json { "id": "chatcmpl-140", "object": "chat.completion", "created": 1723586678, "model": "llama3.1_big", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Welcome to our ice cream parlor! It's great to meet you! I'm your friendly Ice Cream Maker LLM, here to help you create the perfect scoop (or two, or three...)!\n\nWhat brings you in today? Do you have a favorite flavor combination in mind, or would you like some suggestions from me? Perhaps you're celebrating a special occasion and want an ice cream experience that's tailored just for you?\n\nLet's get this ice cream party started!" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 4590, "completion_tokens": 96, "total_tokens": 4686 } } ``` *Note: Prompt tokens count it is still low* --- Following this [guide](https://www.gpu-mart.com/blog/custom-llm-models-with-ollama-modelfile) this was I did: 1. Creating a copy of the current llama3.1 model file ```shell ollama show llama3.1 --modelfile > llama3_custom.modelfile ``` 2. Editing the `llama3_custom.modelfile` file by appending this ```shell PARAMETER num_ctx 32000 ``` *NOTE: probably the number can be higher* 3. Create a copy of the original model with the custom modelfile ```shell ollama create llama3.1_custom --file llama3_custom.modelfile ``` 4. Running the server as before ```shell ollama serve ``` Thanks @rick-github for your support! What should we do now? Add the correct `num_ctx` to the original modelfile that has been uploaded to the official Ollama repo? I would be delighted if I can contribute to the project in any way! 😄

GiteaMirror commented

2026-05-03 23:06:43 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.

@rick-github commented on GitHub (Aug 13, 2024): You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.

GiteaMirror commented

2026-05-03 23:06:44 -05:00

@cannox227 commented on GitHub (Aug 13, 2024):

You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.

Ok I see. Is there any way to add the support with a dynamical context length, such as by not ignoring num_ctx when sent from OpenAI client? Can you link me where in the repo the code that supports OpenAI-like calls is present? Maybe I could start edit that...

@cannox227 commented on GitHub (Aug 13, 2024): > You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low. Ok I see. Is there any way to add the support with a dynamical context length, such as by not ignoring `num_ctx` when sent from OpenAI client? Can you link me where in the repo the code that supports OpenAI-like calls is present? Maybe I could start edit that...

GiteaMirror commented

2026-05-03 23:06:45 -05:00

@rick-github commented on GitHub (Aug 13, 2024):

The developers prefer to maintain alignment with the OpenAI standard for the compatibility endpoints, so I don't think a change to support num_ctx would be accepted. If you wanted have a go anyway, you would need to start with the ChatCompletionRequest and CompletionRequest structures.

@rick-github commented on GitHub (Aug 13, 2024): The developers prefer to [maintain alignment](https://github.com/ollama/ollama/issues/6089#issuecomment-2261039577) with the OpenAI standard for the compatibility endpoints, so I don't think a change to support `num_ctx` would be accepted. If you wanted have a go anyway, you would need to start with the [ChatCompletionRequest](https://github.com/ollama/ollama/blob/a0a40aa20cf774de20844426358ea9c5d9fa924f/openai/openai.go#L73) and [CompletionRequest](https://github.com/ollama/ollama/blob/a0a40aa20cf774de20844426358ea9c5d9fa924f/openai/openai.go#L108) structures.

GiteaMirror commented

2026-05-03 23:06:46 -05:00

@Rudd-O commented on GitHub (Oct 8, 2024):

num_ctx has been added to the ollama integration. It should no longer be needed to create a Modelfile and a custom model.

@Rudd-O commented on GitHub (Oct 8, 2024): num_ctx has been added to the ollama integration. It should no longer be needed to create a Modelfile and a custom model.

Sign in to join this conversation.

Branches Tags

main

dhiltgen/ci

dhiltgen/llama-runner

parth-launch-codex-app

hoyyeva/anthropic-local-image-path

hoyyeva/anthropic-reference-images-path

parth-anthropic-reference-images-path

brucemacd/download-before-remove

hoyyeva/editor-config-repair

parth-mlx-decode-checkpoints

hoyyeva/fix-codex-model-metadata-warning

hoyyeva/qwen

parth/hide-claude-desktop-till-release

hoyyeva/opencode-image-modality

parth-add-claude-code-autoinstall

release_v0.22.0

pdevine/manifest-list

codex/fix-codex-model-metadata-warning

pdevine/addressable-manifest

brucemacd/launch-fetch-reccomended

jmorganca/llama-compat

launch-copilot-cli

hoyyeva/opencode-thinking

release_v0.20.7

parth-auto-save-backup

parth-test

jmorganca/gemma4-audio-replacements

fix-manifest-digest-on-pull

hoyyeva/vscode-improve

brucemacd/install-server-wait

parth/update-claude-docs

brucemac/start-ap-install

pdevine/mlx-update

pdevine/qwen35_vision

drifkin/api-show-fallback

mintlify/image-generation-1773352582

hoyyeva/server-context-length-local-config

jmorganca/faster-reptition-penalties

jmorganca/convert-nemotron

parth-pi-thinking

pdevine/sampling-penalties

jmorganca/fix-create-quantization-memory

dongchen/resumable_transfer_fix

pdevine/sampling-cache-error

jessegross/mlx-usage

hoyyeva/openclaw-config

hoyyeva/app-html

pdevine/qwen3next

brucemacd/sign-sh-install

brucemacd/tui-update

brucemacd/usage-api

jmorganca/launch-empty

fix-app-dist-embed

mxyng/mlx-compile

mxyng/mlx-quant

mxyng/mlx-glm4.7

mxyng/mlx

brucemacd/simplify-model-picker

jmorganca/qwen3-concurrent

fix-glm-4.7-flash-mla-config

drifkin/qwen3-coder-opening-tag

brucemacd/usage-cli

fix-cuda12-fattn-shmem

ollama-imagegen-docs

parth/fix-multiline-inputs

brucemacd/config-docs

mxyng/model-files

mxyng/simple-execute

fix-imagegen-ollama-models

mxyng/async-upload

jmorganca/lazy-no-dtype-changes

imagegen-auto-detect-create

parth/decrease-concurrent-download-hf

fix-mlx-quantize-init

jmorganca/x-cleanup

usage

imagegen-readme

jmorganca/glm-image

mlx-gpu-cd

jmorganca/imagegen-modelfile

parth/agent-skills

parth/agent-allowlist

parth/signed-in-offline

parth/agents

parth/fix-context-chopping

improve-cloud-flow

parth/add-models-websearch

parth/prompt-renderer-mcp

jmorganca/native-settings

jmorganca/download-stream-hash

jmorganca/client2-rebased

brucemacd/oai-chat-req-multipart

jessegross/multi_chunk_reserve

grace/additional-omit-empty

grace/mistral-3-large

mxyng/tokenizer2

mxyng/tokenizer

jessegross/flash

hoyyeva/windows-nacked-app

mxyng/cleanup-attention

grace/deepseek-parser

hoyyeva/remember-unsent-prompt

parth/add-lfs-pointer-error-conversion

parth/olmo2-test2

hoyyeva/ollama-launchagent-plist

nicole/olmo-model

parth/olmo-test

mxyng/remove-embedded

parth/render-template

jmorganca/intellect-3

parth/remove-prealloc-linter

jmorganca/cmd-eval

nicole/nomic-embed-text-fix

mxyng/lint-2

hoyyeva/add-gemini-3-pro-preview

hoyyeva/load-model-list

mxyng/expand-path

mxyng/environ-2

hoyyeva/deeplink-json-encoding

parth/improve-tool-calling-tests

hoyyeva/conversation

hoyyeva/assistant-edit-response

hoyyeva/thinking

origin/brucemacd/invalid-char-i-err

parth/improve-tool-calling

jmorganca/required-omitempty

grace/qwen3-vl-tests

mxyng/iter-client

parth/docs-readme

nicole/embed-test

pdevine/integration-benchstat

parth/remove-generate-cmd

parth/add-toolcall-id

mxyng/server-tests

jmorganca/glm-4.6

jmorganca/gin-h-compat

drifkin/stable-tool-args

pdevine/qwen3-more-thinking

parth/add-websearch-client

nicole/websearch_local

jmorganca/qwen3-coder-updates

grace/deepseek-v3-migration-tests

mxyng/fix-create

jmorganca/cloud-errors

pdevine/parser-tidy

revert-12233-parth/simplify-entrypoints-runner

parth/enable-so-gpt-oss

brucemacd/qwen3vl

jmorganca/readme-simplify

parth/gpt-oss-structured-outputs

revert-12039-jmorganca/tools-braces

mxyng/embeddings

mxyng/gguf

mxyng/benchmark

mxyng/types-null

parth/move-parsing

mxyng/gemma2

jmorganca/docs

mxyng/16-bit

mxyng/create-stdin

pdevine/authorizedkeys

mxyng/quant

parth/opt-in-error-context-window

brucemacd/cache-models

brucemacd/runner-completion

jmorganca/llama-update-6

brucemacd/benchmark-list

brucemacd/partial-read-caps

parth/deepseek-r1-tools

mxyng/omit-array

parth/tool-prefix-temp

brucemacd/runner-test

jmorganca/qwen25vl

brucemacd/model-forward-test-ext

parth/python-function-parsing

jmorganca/cuda-compression-none

drifkin/num-parallel

drifkin/chat-truncation-fix

jmorganca/sync

parth/python-tools-calling

drifkin/array-head-count

brucemacd/create-no-loop

parth/server-enable-content-stream-with-tools

qwen25omni

mxyng/v3

brucemacd/ropeconfig

jmorganca/silence-tokenizer

parth/sample-so-test

parth/sampling-structured-outputs

brucemacd/doc-go-engine

parth/constrained-sampling-json

jmorganca/mistral-wip

brucemacd/mistral-small-convert

parth/sample-unmarshal-json-for-params

brucemacd/jomorganca/mistral

pdevine/bfloat16

jmorganca/mistral

brucemacd/mistral

pdevine/logging

parth/sample-correctness-fix

parth/sample-fix-sorting

jmorgan/sample-fix-sorting-extras

jmorganca/temp-0-images

brucemacd/parallel-embed-models

brucemacd/shim-grammar

jmorganca/fix-gguf-error

bmizerany/nameswork

jmorganca/faster-releases

bmizerany/validatenames

brucemacd/err-no-vocab

brucemacd/rope-config

brucemacd/err-hint

brucemacd/qwen2_5

brucemacd/logprobs

brucemacd/new_runner_graph_bench

progress-flicker

brucemacd/forward-test

brucemacd/go_qwen2

pdevine/gemma2

jmorganca/add-missing-symlink-eval

mxyng/next-debug

parth/set-context-size-openai

brucemacd/next-bpe-bench

brucemacd/next-bpe-test

brucemacd/new_runner_e2e

brucemacd/new_runner_qwen2

pdevine/convert-cohere2

brucemacd/convert-cli

parth/log-probs

mxyng/next-mlx

mxyng/cmd-history

parth/templating

parth/tokenize-detokenize

brucemacd/check-key-register

bmizerany/grammar

jmorganca/vendor-081b29bd

mxyng/func-checks

jmorganca/fix-null-format

parth/fix-default-to-warn-json

jmorganca/qwen2vl

jmorganca/no-concat

parth/cmd-cleanup-SO

brucemacd/check-key-register-structured-err

parth/openai-stream-usage

parth/fix-referencing-so

stream-tools-stop

jmorganca/degin-1

brucemacd/install-path-clean

brucemacd/push-name-validation

brucemacd/browser-key-register

jmorganca/openai-fix-first-message

jmorganca/fix-proxy

jessegross/sample

parth/disallow-streaming-tools

dhiltgen/remove_submodule

jmorganca/ga

jmorganca/mllama

pdevine/newlines

pdevine/geems-2b

jmorganca/llama-bump

mxyng/modelname-7

mxyng/gin-slog

mxyng/modelname-6

jyan/convert-prog

jyan/quant5

paligemma-support

pdevine/import-docs

jmorganca/openai-context

jyan/paligemma

jyan/p2

jyan/palitest

bmizerany/embedspeedup

jmorganca/llama-vit

brucemacd/allow-ollama

royh/ep-methods

royh/whisper

mxyng/api-models

mxyng/fix-memory

jyan/q4_4/8

jyan/ollama-v

royh/stream-tools

roy-embed-parallel

bmizerany/hrm

revert-5963-revert-5924-mxyng/llama3.1-rope

royh/embed-viz

jyan/local2

jyan/auth

jyan/local

jyan/parse-temp

jmorganca/template-mistral

jyan/reord-g

royh-openai-suffixdocs

royh-imgembed

royh-embed-parallel

jyan/quant4

royh-precision

jyan/progress

pdevine/fix-template

jyan/quant3

pdevine/ggla

mxyng/update-registry-domain

jmorganca/ggml-static

mxyng/create-context

jyan/v0.146

mxyng/layers-from-files

build_dist

bmizerany/noseek

royh-ls

royh-name

timeout

mxyng/server-timestamp

bmizerany/nosillyggufslurps

royh-params

jmorganca/llama-cpp-7c26775

royh-openai-delete

royh-show-rigid

jmorganca/enable-fa

jmorganca/no-error-template

jyan/format

royh-testdelete

bmizerany/fastverify

language_support

pdevine/ps-glitches

brucemacd/tokenize

bruce/iq-quants

bmizerany/filepathwithcoloninhost

mxyng/split-bin

bmizerany/client-registry

jmorganca/if-none-match

native

jmorganca/native

jmorganca/batch-embeddings

jmorganca/initcmake

jmorganca/mm

pdevine/showggmlinfo

modenameenforcealphanum

bmizerany/modenameenforcealphanum

jmorganca/done-reason

jmorganca/llama-cpp-8960fe8

ollama.com

bmizerany/filepathnobuild

bmizerany/types/model/defaultfix

rmdisplaylong

nogogen

bmizerany/x

modelfile-readme

bmizerany/replacecolon

jmorganca/limit

jmorganca/execstack

jmorganca/replace-assets

mxyng/tune-concurrency

jmorganca/testing

whitespace-detection

jmorganca/options

upgrade-all

scratch

cuda-search

mattw/airenamer

mattw/allmodelsonhuggingface

mattw/quantcontext

mattw/whatneedstorun

brucemacd/llama-mem-calc

mattw/faq-context

mattw/communitylinks

mattw/noprune

mattw/python-functioncalling

rename

mxyng/install

pulse

remove-first

editor

mattw/selfqueryingretrieval

cgo

mattw/howtoquant

api

matt/streamingapi

format-config

mxyng/extra-args

shell

update-nous-hermes

cp-model

upload-progress

fix-unknown-model

fix-model-names

delete-fix

insecure-registry

ls

deletemodels

progressbar

readme-updates

license-layers

skip-list

list-models

modelpath

matt/examplemodelfiles

distribution

go-opts

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/ollama#65896