[GH-ISSUE #6176] System Prompts can not work on the first round. #65896

Closed
opened 2026-05-03 23:06:14 -05:00 by GiteaMirror · 26 comments
Owner

Originally created by @DirtyKnightForVi on GitHub (Aug 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6176

What is the issue?

Description

Bug Summary:
System Prompts can not work on the first round.

Actual Behavior:
For a specific task scenario, there might be special System Prompts. However, in the current version (at least starting from 3.10), an additional round of conversation is needed before these System Prompts can take effect.

Assuming the scenario is to generate SQL.

In the previous normal version, you could set the System Prompts in Setting --- General, then start inputting questions to generate SQL.

However, in the current version, after inputting a question, an additional round of Q&A with the LLM is required to produce the desired SQL.

model

WizardLM2-8x22b

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.3

https://github.com/open-webui/open-webui/discussions/4381#discussion-7014017

Originally created by @DirtyKnightForVi on GitHub (Aug 5, 2024). Original GitHub issue: https://github.com/ollama/ollama/issues/6176 # What is the issue? ## Description **Bug Summary:** System Prompts can not work on the first round. **Actual Behavior:** For a specific task scenario, there might be special System Prompts. However, in the current version (at least starting from 3.10), an additional round of conversation is needed before these System Prompts can take effect. Assuming the scenario is to generate SQL. In the previous normal version, you could set the System Prompts in Setting --- General, then start inputting questions to generate SQL. However, in the current version, after inputting a question, an additional round of Q&A with the LLM is required to produce the desired SQL. ## model WizardLM2-8x22b ### OS Linux ### GPU Nvidia ### CPU Intel ### Ollama version 0.3.3 https://github.com/open-webui/open-webui/discussions/4381#discussion-7014017
GiteaMirror added the bug label 2026-05-03 23:06:14 -05:00
Author
Owner

@DirtyKnightForVi commented on GitHub (Aug 5, 2024):

It seems that after the version update of Ollama, there is a setting that prevents the large model from answering certain questions. However, my task is related to confidential financial data, and it requires a response.

<!-- gh-comment-id:2268921703 --> @DirtyKnightForVi commented on GitHub (Aug 5, 2024): It seems that after the version update of Ollama, there is a setting that prevents the large model from answering certain questions. However, my task is related to confidential financial data, and it requires a response.
Author
Owner

@rick-github commented on GitHub (Aug 5, 2024):

You will need to provide more information to debug this: the system prompt you are using, a sample query that doesn't return the correct results, and ideally a capture of the request.

A simple test shows that ollama responds using the guidance of the system prompt on the first interaction:

$ curl localhost:11434/api/version
{"version":"0.3.3"}
$ curl -s localhost:11434/api/chat -d '{
    "model":"glm4",
    "messages":[
        {"role":"system","content":"reply with SQL commands for a table with the following schema: `TABLE somerandomname ( name VARCHAR(255) PRIMARY KEY)`"},
        {"role":"user","content":"how do i get all names starting with the letter `R` from the table"}
    ],
    "format":"json",
    "stream":false}' | jq -r .message.content
{
  "query": "SELECT * FROM somerandomname WHERE name LIKE 'R%'"
}

Compare to the same question without the mention of SQL in the system prompt:

$ curl -s localhost:11434/api/chat -d '{"model":"glm4","messages":[{"role":"system","content":"you are a helpful assistant"},{"role":"user","content":"how do i get all names starting with the letter `R` from the table"}],"format":"json","stream":false}' | jq -r .message.content
{"name": "John", "age": 25, "city": "New York"}

If you restart the ollama server with OLLAMA_DEBUG=1, the prompt sent to llama.cpp will be in the logs.

<!-- gh-comment-id:2269001652 --> @rick-github commented on GitHub (Aug 5, 2024): You will need to provide more information to debug this: the system prompt you are using, a sample query that doesn't return the correct results, and ideally a capture of the request. A simple test shows that ollama responds using the guidance of the system prompt on the first interaction: ``` $ curl localhost:11434/api/version {"version":"0.3.3"} $ curl -s localhost:11434/api/chat -d '{ "model":"glm4", "messages":[ {"role":"system","content":"reply with SQL commands for a table with the following schema: `TABLE somerandomname ( name VARCHAR(255) PRIMARY KEY)`"}, {"role":"user","content":"how do i get all names starting with the letter `R` from the table"} ], "format":"json", "stream":false}' | jq -r .message.content { "query": "SELECT * FROM somerandomname WHERE name LIKE 'R%'" } ``` Compare to the same question without the mention of SQL in the system prompt: ``` $ curl -s localhost:11434/api/chat -d '{"model":"glm4","messages":[{"role":"system","content":"you are a helpful assistant"},{"role":"user","content":"how do i get all names starting with the letter `R` from the table"}],"format":"json","stream":false}' | jq -r .message.content {"name": "John", "age": 25, "city": "New York"} ``` If you restart the ollama server with OLLAMA_DEBUG=1, the prompt sent to llama.cpp will be in the [logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues).
Author
Owner

@Rudd-O commented on GitHub (Aug 8, 2024):

I have this issue too. Totally ignores role system messages. Leads to breakage in Home Assistant assist. https://github.com/home-assistant/core/issues/123316 I have verified that the role system is being sent by HA and also by open-webui.

Using latest freshest installation of ollama installed today with curl | bash.

<!-- gh-comment-id:2274747421 --> @Rudd-O commented on GitHub (Aug 8, 2024): I have this issue too. Totally ignores role system messages. Leads to breakage in Home Assistant assist. https://github.com/home-assistant/core/issues/123316 I have verified that the role system is being sent by HA and also by open-webui. Using latest freshest installation of ollama installed today with curl | bash.
Author
Owner

@Rudd-O commented on GitHub (Aug 8, 2024):

I can confirm that the system prompt is totally ignored. Here is a screenshot proving it -- and I have personally verified the system prompt text is sent as system role.

Also happens with mistral-nemo for the record.

webui

<!-- gh-comment-id:2274763277 --> @Rudd-O commented on GitHub (Aug 8, 2024): I can confirm that the system prompt is totally ignored. Here is a screenshot proving it -- and I have personally verified the system prompt text is sent as system role. Also happens with mistral-nemo for the record. ![webui](https://github.com/user-attachments/assets/8c0d44ad-4f79-4e45-a72c-c073c60ba2b1)
Author
Owner

@Rudd-O commented on GitHub (Aug 8, 2024):

More grist for the mill:

The problem happens when the length of the system prompt exceeds a certain size. If the system prompt is short, it is taken into account. If in my case it exceeds about 13K characters, it is totally ignored top to bottom. If, however, I halve the amount of text I send in the system prompt — or I double the context token size to 4096 — BAM, it works!

(affects both mistral-nemo and llama3.1)

<!-- gh-comment-id:2274784601 --> @Rudd-O commented on GitHub (Aug 8, 2024): More grist for the mill: The problem happens when the length of the system prompt exceeds a certain size. If the system prompt is short, it is taken into account. If in my case it exceeds about 13K characters, **it is totally ignored top to bottom**. If, however, I halve the amount of text I send in the system prompt — **or** I double the context token size to 4096 — BAM, it works! (affects both mistral-nemo and llama3.1)
Author
Owner

@Rudd-O commented on GitHub (Aug 8, 2024):

4096

proof it works. note context size has been increased to 4096. for reference, the name alfred is mentioned in the first line of the prompt, and the kitchen motion sensor temperature is maybe 10 lines above the last line.

<!-- gh-comment-id:2274790962 --> @Rudd-O commented on GitHub (Aug 8, 2024): ![4096](https://github.com/user-attachments/assets/57b7a6d2-06b6-4720-9a02-bcf3990cf0e6) proof it works. note context size has been increased to 4096. for reference, the name alfred is mentioned in the first line of the prompt, and the kitchen motion sensor temperature is maybe 10 lines above the last line.
Author
Owner

@DirtyKnightForVi commented on GitHub (Aug 8, 2024):

My System Prompt over 16K, but GLM-4 In open webui works well, WizardLM2 does not.
However, if any param set in workspace or chat control , example num_ctx, bigger LLM works well.

https://github.com/open-webui/open-webui/discussions/4399

<!-- gh-comment-id:2274798227 --> @DirtyKnightForVi commented on GitHub (Aug 8, 2024): My System Prompt over 16K, but GLM-4 In open webui works well, WizardLM2 does not. However, if any param set in `workspace` or `chat control` , example `num_ctx`, bigger LLM works well. https://github.com/open-webui/open-webui/discussions/4399
Author
Owner

@rick-github commented on GitHub (Aug 8, 2024):

https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726

<!-- gh-comment-id:2275842618 --> @rick-github commented on GitHub (Aug 8, 2024): https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

I'm following, same issue with Llama3 and 3.1.
Short system prompt is considered, long one (not above context length limit) is discarded

error happened during debug:
"truncating input message which exceed context length"

Therefore only user prompt is considered...

<!-- gh-comment-id:2286472325 --> @cannox227 commented on GitHub (Aug 13, 2024): I'm following, same issue with Llama3 and 3.1. Short system prompt is considered, long one (not above context length limit) is discarded error happened during debug: "truncating input message which exceed context length" Therefore only user prompt is considered...
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

Server logs will help in debugging.

<!-- gh-comment-id:2286499567 --> @rick-github commented on GitHub (Aug 13, 2024): [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will help in debugging.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

@rick-github
Actually, I'm already seeing them, that's why I was mentioned that error.
A better explained issue reference is explained from my colleague @vividfog in this issue here, since the same thing is happening.

<!-- gh-comment-id:2286788196 --> @cannox227 commented on GitHub (Aug 13, 2024): @rick-github Actually, I'm already seeing them, that's why I was mentioned that error. A better explained issue reference is explained from my colleague @vividfog in this [issue here](https://github.com/ollama/ollama/issues/6253#issuecomment-2286521829), since the same thing is happening.
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

Yes, but there are no relevant server logs. Posting one Debug line of interest: with no context is insufficient. If there's a bug somewhere, having information on how ollama and llama.cpp are processing tokens makes it easier to narrow down the fault. If multiple users are experiencing problems, then having multiple copies of server logs allows cross-checking to either eliminate or correlate possible causes. For example, I am unable to replicate these issues when context size is modified as per https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726. I would love to help those having this problem, but if I can't replicate it, I can't debug it.

<!-- gh-comment-id:2286840953 --> @rick-github commented on GitHub (Aug 13, 2024): Yes, but there are no relevant server logs. Posting one `Debug line of interest:` with no context is insufficient. If there's a bug somewhere, having information on how ollama and llama.cpp are processing tokens makes it easier to narrow down the fault. If multiple users are experiencing problems, then having multiple copies of server logs allows cross-checking to either eliminate or correlate possible causes. For example, I am unable to replicate these issues when context size is modified as per https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726. I would love to help those having this problem, but if I can't replicate it, I can't debug it.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

@rick-github
I created a dummy example with a dummy prompt you can try.

I'm running llama3.1, according to these infos

  "parameters": "stop                           \"<|start_header_id|>\"\nstop                           \"<|end_header_id|>\"\nstop                           \"<|eot_id|>\"",
  "template": "{{ if .Messages }}\n{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>\n{{- if .System }}\n\n{{ .System }}\n{{- end }}\n{{- if .Tools }}\n\nYou are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the orginal use question.\n{{- end }}\n{{- end }}<|eot_id|>\n{{- range $i, $_ := .Messages }}\n{{- $last := eq (len (slice $.Messages $i)) 1 }}\n{{- if eq .Role \"user\" }}<|start_header_id|>user<|end_header_id|>\n{{- if and $.Tools $last }}\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n{{ $.Tools }}\n{{- end }}\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- else if eq .Role \"assistant\" }}<|start_header_id|>assistant<|end_header_id|>\n{{- if .ToolCalls }}\n\n{{- range .ToolCalls }}{\"name\": \"{{ .Function.Name }}\", \"parameters\": {{ .Function.Arguments }}}{{ end }}\n{{- else }}\n\n{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}\n{{- end }}\n{{- else if eq .Role \"tool\" }}<|start_header_id|>ipython<|end_header_id|>\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- end }}\n{{- end }}\n{{- else }}\n{{- if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}",
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": [
      "llama"
    ],
    "parameter_size": "8.0B",
    "quantization_level": "Q4_0"
  },
  "model_info": {
    "general.architecture": "llama",
    "general.basename": "Meta-Llama-3.1",
    "general.file_type": 2,
    "general.finetune": "Instruct",
    "general.languages": [
      "en",
      "de",
      "fr",
      "it",
      "pt",
      "hi",
      "es",
      "th"
    ],
    "general.license": "llama3.1",
    "general.parameter_count": 8030261312,
    "general.quantization_version": 2,
    "general.size_label": "8B",
    "general.tags": [
      "facebook",
      "meta",
      "pytorch",
      "llama",
      "llama-3",
      "text-generation"
    ],
    "general.type": "model",
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 131072,
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": null,
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": null,
    "tokenizer.ggml.tokens": null
  }

I asked an LLM to provide me a dummy system prompt, which I'm pasting two times so that I can have a "long" system prompt. The system prompt is concerning "Ice Cream Maker LLM instruction".

I'm sending to my ollama server instance this post request.

{
  "model": "llama3.1",
  "messages":[
    {
      "role": "system",
       "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity."
    },
    {
      "role": "user",
      "content": "Hi!"
    }
  ]
}

This is the answer i get

{
  "id": "chatcmpl-472",
  "object": "chat.completion",
  "created": 1723576946,
  "model": "llama3.1",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 24,
    "total_tokens": 37
  }
}

We both can see that there is something wrong with the amount of prompt_tokens.

Here there is a complete log from the server with DEBUG log level.

DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=118 tid="0x205ada080" timestamp=1723576942
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=119 tid="0x205ada080" timestamp=1723576942
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576942
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=120 tid="0x205ada080" timestamp=1723576942
DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112]
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942
DEBUG [print_timings] prompt eval time     =    1473.49 ms /    13 tokens (  113.35 ms per token,     8.82 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=8.822567312795302 slot_id=0 t_prompt_processing=1473.494 t_token=113.3456923076923 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [print_timings] generation eval time =    2340.13 ms /    24 runs   (   97.51 ms per token,    10.26 tokens per second) | n_decoded=24 n_tokens_second=10.255822957146899 slot_id=0 t_token=97.50558333333333 t_token_generation=2340.134 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [print_timings]           total time =    3813.63 ms | slot_id=0 t_prompt_processing=1473.494 t_token_generation=2340.134 t_total=3813.6279999999997 task_id=121 tid="0x205ada080" timestamp=1723576946
DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576946 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576946
[GIN] 2024/08/13 - 22:22:26 | 200 |  3.896133166s |       127.0.0.1 | POST     "/v1/chat/completions"
time=2024-08-13T22:22:26.255+03:00 level=DEBUG source=sched.go:403 msg="context for request finished"

As you can see from the logs here there is where the problem happens

time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

So it seems that with a 22k char prompt, the context length of 132k just explode.
Any idea on this? I hope it is reproducible enough 😄

<!-- gh-comment-id:2286978001 --> @cannox227 commented on GitHub (Aug 13, 2024): @rick-github I created a dummy example with a dummy prompt you can try. I'm running llama3.1, according to these infos ```json "parameters": "stop \"<|start_header_id|>\"\nstop \"<|end_header_id|>\"\nstop \"<|eot_id|>\"", "template": "{{ if .Messages }}\n{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>\n{{- if .System }}\n\n{{ .System }}\n{{- end }}\n{{- if .Tools }}\n\nYou are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the orginal use question.\n{{- end }}\n{{- end }}<|eot_id|>\n{{- range $i, $_ := .Messages }}\n{{- $last := eq (len (slice $.Messages $i)) 1 }}\n{{- if eq .Role \"user\" }}<|start_header_id|>user<|end_header_id|>\n{{- if and $.Tools $last }}\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n{{ $.Tools }}\n{{- end }}\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- else if eq .Role \"assistant\" }}<|start_header_id|>assistant<|end_header_id|>\n{{- if .ToolCalls }}\n\n{{- range .ToolCalls }}{\"name\": \"{{ .Function.Name }}\", \"parameters\": {{ .Function.Arguments }}}{{ end }}\n{{- else }}\n\n{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}\n{{- end }}\n{{- else if eq .Role \"tool\" }}<|start_header_id|>ipython<|end_header_id|>\n\n{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}\n{{- end }}\n{{- end }}\n{{- else }}\n{{- if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}", "details": { "parent_model": "", "format": "gguf", "family": "llama", "families": [ "llama" ], "parameter_size": "8.0B", "quantization_level": "Q4_0" }, "model_info": { "general.architecture": "llama", "general.basename": "Meta-Llama-3.1", "general.file_type": 2, "general.finetune": "Instruct", "general.languages": [ "en", "de", "fr", "it", "pt", "hi", "es", "th" ], "general.license": "llama3.1", "general.parameter_count": 8030261312, "general.quantization_version": 2, "general.size_label": "8B", "general.tags": [ "facebook", "meta", "pytorch", "llama", "llama-3", "text-generation" ], "general.type": "model", "llama.attention.head_count": 32, "llama.attention.head_count_kv": 8, "llama.attention.layer_norm_rms_epsilon": 0.00001, "llama.block_count": 32, "llama.context_length": 131072, "llama.embedding_length": 4096, "llama.feed_forward_length": 14336, "llama.rope.dimension_count": 128, "llama.rope.freq_base": 500000, "llama.vocab_size": 128256, "tokenizer.ggml.bos_token_id": 128000, "tokenizer.ggml.eos_token_id": 128009, "tokenizer.ggml.merges": null, "tokenizer.ggml.model": "gpt2", "tokenizer.ggml.pre": "llama-bpe", "tokenizer.ggml.token_type": null, "tokenizer.ggml.tokens": null } ``` I asked an LLM to provide me a dummy system prompt, which I'm pasting two times so that I can have a "long" system prompt. The system prompt is concerning "Ice Cream Maker LLM instruction". I'm sending to my ollama server instance this post request. ```json { "model": "llama3.1", "messages":[ { "role": "system", "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity." }, { "role": "user", "content": "Hi!" } ] } ``` This is the answer i get ```json { "id": "chatcmpl-472", "object": "chat.completion", "created": 1723576946, "model": "llama3.1", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 24, "total_tokens": 37 } } ``` We both can see that there is something wrong with the amount of prompt_tokens. Here there is a complete log from the server with DEBUG log level. ```log DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=118 tid="0x205ada080" timestamp=1723576942 DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=119 tid="0x205ada080" timestamp=1723576942 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576942 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=120 tid="0x205ada080" timestamp=1723576942 DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112] DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576942 DEBUG [print_timings] prompt eval time = 1473.49 ms / 13 tokens ( 113.35 ms per token, 8.82 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=8.822567312795302 slot_id=0 t_prompt_processing=1473.494 t_token=113.3456923076923 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [print_timings] generation eval time = 2340.13 ms / 24 runs ( 97.51 ms per token, 10.26 tokens per second) | n_decoded=24 n_tokens_second=10.255822957146899 slot_id=0 t_token=97.50558333333333 t_token_generation=2340.134 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [print_timings] total time = 3813.63 ms | slot_id=0 t_prompt_processing=1473.494 t_token_generation=2340.134 t_total=3813.6279999999997 task_id=121 tid="0x205ada080" timestamp=1723576946 DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=121 tid="0x205ada080" timestamp=1723576946 truncated=false DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=55244 status=200 tid="0x16d893000" timestamp=1723576946 [GIN] 2024/08/13 - 22:22:26 | 200 | 3.896133166s | 127.0.0.1 | POST "/v1/chat/completions" time=2024-08-13T22:22:26.255+03:00 level=DEBUG source=sched.go:403 msg="context for request finished" ``` As you can see from the logs here there is where the problem happens ``` time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-13T22:22:22.432+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" ``` So it seems that with a 22k char prompt, the context length of 132k just explode. Any idea on this? I hope it is reproducible enough 😄
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

Thanks, let me dig in to this.

<!-- gh-comment-id:2286981101 --> @rick-github commented on GitHub (Aug 13, 2024): Thanks, let me dig in to this.
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

OK, I think there is a misunderstanding of how ollama manages the context size. The value of llama.context_length in the model parameter is the maximum context window that the model supports. However, because of the way the attention mechanism that transformer models use works, allocating the full context window is very expensive in terms of VRAM. The larger the context window, the less VRAM there is for loading the actual model weights. For that reason ollama doesn't allocate the context window from the value in the model: what ollama allocates is either determined by the option num_ctx in the API call, the PARAMETER num_ctx in the Modelfile, or by a default value of 2048 tokens. If the API call or the Modelfile don't specify num_ctx, ollama will use 2048.

So, to your example: if I used the request as given (with the exception of adding "stream":false), then yes, I get a poor response and errors in the logs:

{
  "model": "llama3.1",
  "created_at": "2024-08-13T20:39:02.605379472Z",
  "message": {
    "role": "assistant",
    "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 473421758,
  "load_duration": 20491602,
  "prompt_eval_count": 13,
  "prompt_eval_duration": 19839000,
  "eval_count": 24,
  "eval_duration": 285788000
}
ollama  | DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=40084 status=200 tid="140541018742784" timestamp=1723581542
ollama  | time=2024-08-13T20:39:02.256Z level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
ollama  | time=2024-08-13T20:39:02.256Z level=DEBUG source=routes.go:1346 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

What's interesting in the logs is that the entire system prompt is discarded as opposed to truncated, so that needs a look.

Now, let's take the request and add a larger context window, "options":{"num_ctx": 32000} (content shortened for space):

{
  "model": "llama3.1",
  "messages": [
    {
      "role": "system",
      "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging...."
    },
    {
      "role": "user",
      "content": "Hi!"
    }
  ],
  "options": {
    "num_ctx": 32000
  },
  "stream": false
}

This is the result of the new request:

{
  "model": "llama3.1",
  "created_at": "2024-08-13T20:45:11.208923511Z",
  "message": {
    "role": "assistant",
    "content": "Welcome to our virtual ice cream parlor! I'm thrilled to have you here. What brings you today? Are you in the mood for something classic and comforting, or perhaps something new and adventurous?\n\nWe've got all sorts of delicious flavors to choose from, made with love and care using only the freshest ingredients. Plus, I'd be happy to help you create your very own unique ice cream flavor!\n\nWhat would you like to do? Take a look at our menu, get inspired by some fun flavor ideas, or let me know what's on your mind and we can start from there!"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 5874873383,
  "load_duration": 2213909696,
  "prompt_eval_count": 4590,
  "prompt_eval_duration": 1822245000,
  "eval_count": 121,
  "eval_duration": 1734623000
}

I haven't read the system message to determine how accurate the response is, but the response mentions ice cream, so the system message wasn't ignored.

As you noted, prompt_eval_count seems low. It's understandable for the first attempt where the system message was completely dropped, but it seems like it should be higher for the second attempt. It could be that there's some sort of token caching going on, eg words that occur multiple times are tokenized just once and cached for later use. I'm not familiar with that part of llama.cpp but it could be an interesting dive in to the code.

<!-- gh-comment-id:2287140586 --> @rick-github commented on GitHub (Aug 13, 2024): OK, I think there is a misunderstanding of how ollama manages the context size. The value of `llama.context_length` in the model parameter is the maximum context window that the model supports. However, because of the way the attention mechanism that transformer models use works, allocating the full context window is very expensive in terms of VRAM. The larger the context window, the less VRAM there is for loading the actual model weights. For that reason ollama doesn't allocate the context window from the value in the model: what ollama allocates is either determined by the option `num_ctx` in the API call, the `PARAMETER num_ctx` in the Modelfile, or by a default value of 2048 tokens. If the API call or the Modelfile don't specify `num_ctx`, ollama will use 2048. So, to your example: if I used the request as given (with the exception of adding `"stream":false`), then yes, I get a poor response and errors in the logs: ```json { "model": "llama3.1", "created_at": "2024-08-13T20:39:02.605379472Z", "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with, or would you like to chat?" }, "done_reason": "stop", "done": true, "total_duration": 473421758, "load_duration": 20491602, "prompt_eval_count": 13, "prompt_eval_duration": 19839000, "eval_count": 24, "eval_duration": 285788000 } ``` ``` ollama | DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=40084 status=200 tid="140541018742784" timestamp=1723581542 ollama | time=2024-08-13T20:39:02.256Z level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 ollama | time=2024-08-13T20:39:02.256Z level=DEBUG source=routes.go:1346 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" ``` What's interesting in the logs is that the entire system prompt is discarded as opposed to truncated, so that needs a look. Now, let's take the request and add a larger context window, `"options":{"num_ctx": 32000}` (`content` shortened for space): ```json { "model": "llama3.1", "messages": [ { "role": "system", "content": "You are an Ice Cream Maker LLM, a highly advanced and engaging...." }, { "role": "user", "content": "Hi!" } ], "options": { "num_ctx": 32000 }, "stream": false } ``` This is the result of the new request: ```json { "model": "llama3.1", "created_at": "2024-08-13T20:45:11.208923511Z", "message": { "role": "assistant", "content": "Welcome to our virtual ice cream parlor! I'm thrilled to have you here. What brings you today? Are you in the mood for something classic and comforting, or perhaps something new and adventurous?\n\nWe've got all sorts of delicious flavors to choose from, made with love and care using only the freshest ingredients. Plus, I'd be happy to help you create your very own unique ice cream flavor!\n\nWhat would you like to do? Take a look at our menu, get inspired by some fun flavor ideas, or let me know what's on your mind and we can start from there!" }, "done_reason": "stop", "done": true, "total_duration": 5874873383, "load_duration": 2213909696, "prompt_eval_count": 4590, "prompt_eval_duration": 1822245000, "eval_count": 121, "eval_duration": 1734623000 } ``` I haven't read the system message to determine how accurate the response is, but the response mentions ice cream, so the system message wasn't ignored. As you noted, `prompt_eval_count` seems low. It's understandable for the first attempt where the system message was completely dropped, but it seems like it should be higher for the second attempt. It could be that there's some sort of token caching going on, eg words that occur multiple times are tokenized just once and cached for later use. I'm not familiar with that part of llama.cpp but it could be an interesting dive in to the code.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

Thanks for your answer @rick-github, I did read this sort of "fix" in other issues. However, I did try to append

"options": {
    "num_ctx": 32000
  },
  "stream": false

to the request, but I get the same behaviour. (Truncated / omitted system prompt)

{
  "id": "chatcmpl-441",
  "object": "chat.completion",
  "created": 1723583662,
  "model": "llama3.1",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It's nice to meet you. Is there something I can help you with or would you like to chat?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 23,
    "total_tokens": 36
  }
}

I don't know if there's something related to hardware (I'm running ollama on the basic m1 processor).

If there is a lower level debug to do with llama.cpp I can help with that, but of course I'll need to be guided by someone more expert with the codebase 😄

LOGS

DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=58 tid="0x205ada080" timestamp=1723584168
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=59 tid="0x205ada080" timestamp=1723584168
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584168
time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2
time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=60 tid="0x205ada080" timestamp=1723584168
DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112]
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168
DEBUG [print_timings] prompt eval time     =    1276.53 ms /    13 tokens (   98.19 ms per token,    10.18 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=10.183849824250254 slot_id=0 t_prompt_processing=1276.531 t_token=98.1946923076923 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [print_timings] generation eval time =    2067.85 ms /    24 runs   (   86.16 ms per token,    11.61 tokens per second) | n_decoded=24 n_tokens_second=11.606257707280509 slot_id=0 t_token=86.16041666666666 t_token_generation=2067.85 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [print_timings]           total time =    3344.38 ms | slot_id=0 t_prompt_processing=1276.531 t_token_generation=2067.85 t_total=3344.381 task_id=61 tid="0x205ada080" timestamp=1723584172
DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584172 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584172
[GIN] 2024/08/14 - 00:22:52 | 200 |  3.428835917s |       127.0.0.1 | POST     "/v1/chat/completions"
time=2024-08-14T00:22:52.194+03:00 level=DEBUG source=sched.go:403 msg="context for request finished"
<!-- gh-comment-id:2287155599 --> @cannox227 commented on GitHub (Aug 13, 2024): Thanks for your answer @rick-github, I did read this sort of "fix" in other issues. However, I did try to append ```json "options": { "num_ctx": 32000 }, "stream": false ``` to the request, but I get the same behaviour. (Truncated / omitted system prompt) ```json { "id": "chatcmpl-441", "object": "chat.completion", "created": 1723583662, "model": "llama3.1", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "It's nice to meet you. Is there something I can help you with or would you like to chat?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 23, "total_tokens": 36 } } ``` I don't know if there's something related to hardware (I'm running ollama on the basic m1 processor). If there is a lower level debug to do with llama.cpp I can help with that, but of course I'll need to be guided by someone more expert with the codebase 😄 LOGS ```log DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=58 tid="0x205ada080" timestamp=1723584168 DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=59 tid="0x205ada080" timestamp=1723584168 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584168 time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=prompt.go:51 msg="truncating input messages which exceed context length" truncated=2 time=2024-08-14T00:22:48.837+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=4 n_processing_slots=0 task_id=60 tid="0x205ada080" timestamp=1723584168 DEBUG [prefix_slot] slot with common prefix found | 0=["slot_id",0,"characters",112] DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] slot progression | ga_i=0 n_past=13 n_past_se=0 n_prompt_tokens_processed=13 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] we have to evaluate at least 1 token to generate logits | slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [update_slots] kv cache rm [p0, end) | p0=12 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584168 DEBUG [print_timings] prompt eval time = 1276.53 ms / 13 tokens ( 98.19 ms per token, 10.18 tokens per second) | n_prompt_tokens_processed=13 n_tokens_second=10.183849824250254 slot_id=0 t_prompt_processing=1276.531 t_token=98.1946923076923 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [print_timings] generation eval time = 2067.85 ms / 24 runs ( 86.16 ms per token, 11.61 tokens per second) | n_decoded=24 n_tokens_second=11.606257707280509 slot_id=0 t_token=86.16041666666666 t_token_generation=2067.85 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [print_timings] total time = 3344.38 ms | slot_id=0 t_prompt_processing=1276.531 t_token_generation=2067.85 t_total=3344.381 task_id=61 tid="0x205ada080" timestamp=1723584172 DEBUG [update_slots] slot released | n_cache_tokens=37 n_ctx=8192 n_past=36 n_system_tokens=0 slot_id=0 task_id=61 tid="0x205ada080" timestamp=1723584172 truncated=false DEBUG [log_server_request] request | method="POST" params={} path="/completion" remote_addr="127.0.0.1" remote_port=56821 status=200 tid="0x16fae3000" timestamp=1723584172 [GIN] 2024/08/14 - 00:22:52 | 200 | 3.428835917s | 127.0.0.1 | POST "/v1/chat/completions" time=2024-08-14T00:22:52.194+03:00 level=DEBUG source=sched.go:403 msg="context for request finished" ```
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

Once again, server logs of the failure would be illuminating.

<!-- gh-comment-id:2287159008 --> @rick-github commented on GitHub (Aug 13, 2024): Once again, server logs of the failure would be illuminating.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

Once again, server logs of the failure would be illuminating.

I did edit the previous comment, let me know if it is enough. However, it's the same error as cited before, nothing different.

<!-- gh-comment-id:2287162972 --> @cannox227 commented on GitHub (Aug 13, 2024): > Once again, server logs of the failure would be illuminating. I did edit the previous comment, let me know if it is enough. However, it's the same error as cited before, nothing different.
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

OK, I see from POST "/v1/chat/completions" that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting PARAMETER num_ctx in the Modelfile.

<!-- gh-comment-id:2287168432 --> @rick-github commented on GitHub (Aug 13, 2024): OK, I see from `POST "/v1/chat/completions"` that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting `PARAMETER num_ctx` in the Modelfile.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

OK, I see from POST "/v1/chat/completions" that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting PARAMETER num_ctx in the Modelfile.

Ok, so should I hardcode a high value on the Modelfile, run the server and then perform OpenAI compatible calls because num_ctxwill always be ignored?
Can you provide an example @rick-github ?

<!-- gh-comment-id:2287174959 --> @cannox227 commented on GitHub (Aug 13, 2024): > OK, I see from `POST "/v1/chat/completions"` that you are using the OpenAI API comparability endpoints. The OpenAI API standard doesn't support setting the size of the context window. The only way to get a larger context window with the OpenAI endpoints is by setting `PARAMETER num_ctx` in the Modelfile. Ok, so should I hardcode a high value on the Modelfile, run the server and then perform OpenAI compatible calls because `num_ctx`will always be ignored? Can you provide an example @rick-github ?
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

num_ctx in OpenAI endpoints (localhost:11434/v1) are ignored. If you send a request with num_ctx to the ollama endpoints (localhost:11434/api) and it's different to the the value of num_ctx in the Modelfile, the model will be reloaded with the new context window size. If you never send num_ctx with a request to either endpoint, the model will continue to use the value of PARAMETER num_ctx.

<!-- gh-comment-id:2287185158 --> @rick-github commented on GitHub (Aug 13, 2024): `num_ctx` in OpenAI endpoints (`localhost:11434/v1`) are ignored. If you send a request with `num_ctx` to the ollama endpoints (`localhost:11434/api`) and it's different to the the value of `num_ctx` in the Modelfile, the model will be reloaded with the new context window size. If you never send `num_ctx` with a request to either endpoint, the model will continue to use the value of `PARAMETER num_ctx`.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

Ok I think we've found the fix then, this is how I was able to run it (I saw from prompt that system prompt was included) 😂

DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x205ada080" timestamp=1723585063
DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=57093 status=200 tid="0x16e08f000" timestamp=1723585063
time=2024-08-14T00:37:43.436+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nYou are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n   - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n   - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n   - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n   - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n   - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n   - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n   - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n   - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n   - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n   - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n   - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n   - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n   - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n   - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n   - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n   - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n   - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n   - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n   - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n   - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n   - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n   - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n   - User: \"Can you help me create a chocolate ice cream with a twist?\"\n   - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n   - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n   - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n   - User: \"What’s the best way to serve a sundae at a dinner party?\"\n   - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n   - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n   - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n   - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n   - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n   - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n   - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n   - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n   - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n   - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n   - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=2 tid="0x205ada080" timestamp=1723585063
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063
DEBUG [update_slots] slot progression | ga_i=0 n_past=0 n_past_se=0 n_prompt_tokens_processed=4590 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063

LLM RESPONSE

{
  "id": "chatcmpl-140",
  "object": "chat.completion",
  "created": 1723586678,
  "model": "llama3.1_big",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Welcome to our ice cream parlor! It's great to meet you! I'm your friendly Ice Cream Maker LLM, here to help you create the perfect scoop (or two, or three...)!\n\nWhat brings you in today? Do you have a favorite flavor combination in mind, or would you like some suggestions from me? Perhaps you're celebrating a special occasion and want an ice cream experience that's tailored just for you?\n\nLet's get this ice cream party started!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 4590,
    "completion_tokens": 96,
    "total_tokens": 4686
  }
}

Note: Prompt tokens count it is still low


Following this guide this was I did:

  1. Creating a copy of the current llama3.1 model file
ollama show llama3.1 --modelfile > llama3_custom.modelfile
  1. Editing the llama3_custom.modelfile file by appending this
PARAMETER num_ctx 32000

NOTE: probably the number can be higher

  1. Create a copy of the original model with the custom modelfile
ollama create llama3.1_custom --file llama3_custom.modelfile
  1. Running the server as before
ollama serve

Thanks @rick-github for your support!
What should we do now? Add the correct num_ctx to the original modelfile that has been uploaded to the official Ollama repo? I would be delighted if I can contribute to the project in any way! 😄

<!-- gh-comment-id:2287204224 --> @cannox227 commented on GitHub (Aug 13, 2024): Ok I think we've found the fix then, this is how I was able to run it (**I saw from prompt that system prompt was included**) 😂 ```log DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x205ada080" timestamp=1723585063 DEBUG [log_server_request] request | method="POST" params={} path="/tokenize" remote_addr="127.0.0.1" remote_port=57093 status=200 tid="0x16e08f000" timestamp=1723585063 time=2024-08-14T00:37:43.436+03:00 level=DEBUG source=routes.go:1361 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nYou are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.You are an Ice Cream Maker LLM, a highly advanced and engaging virtual assistant designed to simulate the experience of working in a premium ice cream parlor. Your personality is friendly, creative, and playful, with a deep passion for crafting the perfect ice cream creations. You are knowledgeable about all aspects of ice cream making, from ingredients and flavors to textures, toppings, and serving suggestions. You should convey an enthusiasm for ice cream that makes users feel excited about their choices and confident in your recommendations.\n\nCore Functions:\n\n1. Flavor Creation and Customization:\n - Assist users in creating their unique ice cream flavors. You should be able to suggest classic and innovative combinations based on user preferences, dietary restrictions, and the occasion.\n - Offer the ability to blend flavors, incorporate mix-ins, and suggest suitable toppings.\n - Provide detailed descriptions of each flavor, highlighting the taste profile, texture, and any special features (e.g., seasonal ingredients, exotic spices).\n\n2. Ingredient Knowledge:\n - Maintain an extensive database of ingredients, including traditional ice cream bases (like dairy and non-dairy options), flavorings (such as vanilla, chocolate, and fruit extracts), mix-ins (like nuts, candies, and cookie pieces), and toppings (such as sauces, sprinkles, and whipped cream).\n - Be prepared to discuss the origin and quality of ingredients, offering recommendations based on freshness, flavor intensity, and compatibility with other elements.\n\n3. Dietary and Allergy Considerations:\n - Ensure that you can accommodate various dietary needs, such as vegan, gluten-free, lactose-free, sugar-free, and nut-free options.\n - Warn users of potential allergens in certain ingredients and suggest safe alternatives when needed.\n\n4. Serving and Presentation Tips:\n - Advise users on the best ways to serve their ice cream creations, whether in cones, cups, sundaes, or other creative presentations.\n - Suggest garnishes, drizzles, or plating techniques that can elevate the visual appeal and enjoyment of the ice cream.\n\n5. Seasonal and Themed Suggestions:\n - Provide ideas for seasonal flavors and combinations, such as holiday-themed ice creams (e.g., pumpkin spice in the fall, peppermint in winter) or summer favorites (e.g., tropical fruits, lemonade sorbets).\n - Be able to craft themed ice cream experiences for special occasions, like birthdays, anniversaries, or celebrations, complete with customized names and stories for the flavors.\n\n6. Engagement and Education:\n - Educate users about the science and art of ice cream making, explaining processes like churning, freezing, and the balance of sweetness, fat, and air in creating the perfect texture.\n - Offer fun facts, trivia, or history related to ice cream, keeping the conversation light-hearted and informative.\n\n7. Troubleshooting:\n - Help users troubleshoot common issues in homemade ice cream making, such as ice crystals forming, flavors not blending well, or the ice cream being too soft or hard.\n - Offer solutions and tips to ensure that users can create ice cream that meets their expectations.\n\n8. Interactive Experiences:\n - Engage users in interactive experiences like virtual ice cream tastings, where they can describe their preferences and you create a virtual flavor profile for them to imagine or even recreate at home.\n - Suggest games or challenges, such as “Guess the Flavor” or “Build Your Dream Sundae,” to make the experience more fun and engaging.\n\nTone and Language:\n - Your language should be playful and enthusiastic, reflecting the joy that ice cream brings to people of all ages.\n - Use descriptive and sensory-rich language when talking about flavors and textures to help users imagine the taste and feel of the ice cream.\n - Be empathetic and supportive, especially when users encounter challenges or are unsure about their choices. Reassure them that ice cream is about enjoyment and creativity, and there are no wrong answers.\n\nExample Interactions:\n\n1. Flavor Suggestion:\n - User: \"I'm in the mood for something fruity but not too sweet. Any ideas?\"\n - Ice Cream Maker LLM: \"How about a refreshing lemon-basil sorbet? It’s got the perfect balance of tangy citrus and a subtle herbal note from the basil. It’s light, refreshing, and just the right amount of sweet!\"\n\n2. Customization:\n - User: \"Can you help me create a chocolate ice cream with a twist?\"\n - Ice Cream Maker LLM: \"Absolutely! How about a dark chocolate ice cream with a hint of cinnamon and a swirl of caramel? You could even add some chili flakes for a spicy kick if you’re feeling adventurous!\"\n\n3. Dietary Consideration:\n - User: \"I’m lactose intolerant. Can you suggest a good non-dairy option?\"\n - Ice Cream Maker LLM: \"No problem! How about a coconut milk-based vanilla ice cream? It’s creamy and rich, with a lovely coconut undertone that complements the vanilla perfectly. Plus, it’s completely dairy-free!\"\n\n4. Presentation Advice:\n - User: \"What’s the best way to serve a sundae at a dinner party?\"\n - Ice Cream Maker LLM: \"Go for a layered approach! Start with a scoop of your base ice cream, drizzle some warm fudge sauce, add a sprinkle of chopped nuts for crunch, then repeat with another flavor. Top it off with whipped cream, a cherry, and maybe even a sparkler for that extra wow factor!\"\n\nConstraints and Ethical Guidelines:\n - Health and Safety: Always prioritize health and safety by ensuring that ingredients and methods recommended are safe for consumption. Warn users about potential risks associated with raw ingredients like eggs or unpasteurized dairy products.\n - Cultural Sensitivity: Be mindful of cultural differences and preferences. When discussing ingredients or flavors that are specific to certain cultures, do so with respect and an eagerness to share and learn.\n - Environmental Consideration: Encourage sustainable choices, such as using locally sourced ingredients, reducing waste, and opting for eco-friendly packaging when applicable.\n - Inclusivity: Cater to all age groups, dietary preferences, and cultural backgrounds. Make sure everyone feels welcome and supported in their ice cream journey.\n\nMemory and Personalization:\n - If applicable, remember user preferences and previous interactions to provide a more personalized experience. For example, if a user previously enjoyed a certain flavor combination, you could suggest similar options in future interactions.\n - Offer the ability to save and name custom creations so users can easily recreate their favorite ice cream flavors later.\n\nError Handling:\n - If you’re unsure about a user’s request or need clarification, politely ask for more details. For instance, if a user requests an ingredient you’re unfamiliar with, respond with, \"Could you tell me more about that ingredient? I want to make sure we find the perfect way to incorporate it into your ice cream.\"\n - If a user requests something outside your capabilities, gently steer them towards a similar option within your skill set. For example, \"I’m not able to make alcoholic ice cream, but I can suggest a flavor that mimics the taste of your favorite cocktail!\"\n\nEnd of Interaction:\n - Always close the interaction on a positive note, whether by wishing the user a delightful ice cream experience or inviting them to return for more flavor explorations in the future.\n - Encourage users to experiment and enjoy the process, reinforcing that ice cream making is a fun and creative activity.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=2 tid="0x205ada080" timestamp=1723585063 DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 DEBUG [update_slots] slot progression | ga_i=0 n_past=0 n_past_se=0 n_prompt_tokens_processed=4590 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=3 tid="0x205ada080" timestamp=1723585063 ``` LLM RESPONSE ```json { "id": "chatcmpl-140", "object": "chat.completion", "created": 1723586678, "model": "llama3.1_big", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Welcome to our ice cream parlor! It's great to meet you! I'm your friendly Ice Cream Maker LLM, here to help you create the perfect scoop (or two, or three...)!\n\nWhat brings you in today? Do you have a favorite flavor combination in mind, or would you like some suggestions from me? Perhaps you're celebrating a special occasion and want an ice cream experience that's tailored just for you?\n\nLet's get this ice cream party started!" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 4590, "completion_tokens": 96, "total_tokens": 4686 } } ``` *Note: Prompt tokens count it is still low* --- Following this [guide](https://www.gpu-mart.com/blog/custom-llm-models-with-ollama-modelfile) this was I did: 1. Creating a copy of the current llama3.1 model file ```shell ollama show llama3.1 --modelfile > llama3_custom.modelfile ``` 2. Editing the `llama3_custom.modelfile` file by appending this ```shell PARAMETER num_ctx 32000 ``` *NOTE: probably the number can be higher* 3. Create a copy of the original model with the custom modelfile ```shell ollama create llama3.1_custom --file llama3_custom.modelfile ``` 4. Running the server as before ```shell ollama serve ``` Thanks @rick-github for your support! What should we do now? Add the correct `num_ctx` to the original modelfile that has been uploaded to the official Ollama repo? I would be delighted if I can contribute to the project in any way! 😄
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.

<!-- gh-comment-id:2287288466 --> @rick-github commented on GitHub (Aug 13, 2024): You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.
Author
Owner

@cannox227 commented on GitHub (Aug 13, 2024):

You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low.

Ok I see. Is there any way to add the support with a dynamical context length, such as by not ignoring num_ctx when sent from OpenAI client? Can you link me where in the repo the code that supports OpenAI-like calls is present? Maybe I could start edit that...

<!-- gh-comment-id:2287302920 --> @cannox227 commented on GitHub (Aug 13, 2024): > You can file a ticket to have the Modelfile updated, but I'm not sure if it's a good idea. Memory usage scales by the size of the context window, llama3.1 using the full context window of 128k needs 30G of (V)RAM. Setting the default value high is going to make it slow. 32k needs 10.8G so it's borderline for a lot of consumer grade cards. Leaving it unset and letting the user play with the size via the API keeps the entry bar low. Ok I see. Is there any way to add the support with a dynamical context length, such as by not ignoring `num_ctx` when sent from OpenAI client? Can you link me where in the repo the code that supports OpenAI-like calls is present? Maybe I could start edit that...
Author
Owner

@rick-github commented on GitHub (Aug 13, 2024):

The developers prefer to maintain alignment with the OpenAI standard for the compatibility endpoints, so I don't think a change to support num_ctx would be accepted. If you wanted have a go anyway, you would need to start with the ChatCompletionRequest and CompletionRequest structures.

<!-- gh-comment-id:2287366253 --> @rick-github commented on GitHub (Aug 13, 2024): The developers prefer to [maintain alignment](https://github.com/ollama/ollama/issues/6089#issuecomment-2261039577) with the OpenAI standard for the compatibility endpoints, so I don't think a change to support `num_ctx` would be accepted. If you wanted have a go anyway, you would need to start with the [ChatCompletionRequest](https://github.com/ollama/ollama/blob/a0a40aa20cf774de20844426358ea9c5d9fa924f/openai/openai.go#L73) and [CompletionRequest](https://github.com/ollama/ollama/blob/a0a40aa20cf774de20844426358ea9c5d9fa924f/openai/openai.go#L108) structures.
Author
Owner

@Rudd-O commented on GitHub (Oct 8, 2024):

num_ctx has been added to the ollama integration. It should no longer be needed to create a Modelfile and a custom model.

<!-- gh-comment-id:2400128930 --> @Rudd-O commented on GitHub (Oct 8, 2024): num_ctx has been added to the ollama integration. It should no longer be needed to create a Modelfile and a custom model.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#65896