[GH-ISSUE #8705] LLaMa 3.2-based chatbot is pretending the user-provided document doesn't exist #83327

Closed
opened 2026-05-09 17:49:39 -05:00 by GiteaMirror · 4 comments

Originally created by @AutistiCoder on GitHub (Jan 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8705

I'm trying to create a chatbot app in TypeScript that uses pdf-parse to parse a PDF, feeds the extracted text into LLaMa 3.2, and then has the LLM answer questions based solely on the document:


```
import { readFile } from "fs";
import * as fs from "fs/promises";
import * as readline from "readline";
import pdf from "pdf-parse-debugging-disabled";
type Message = {role: string, content: string};
const decoder = new TextDecoder("utf-8")
const messages: Message[] = [];
async function chat()
{
    console.log(messages); // Debugging: Ensure the messages array has the document text
    const llamaRequestBody = {"messages": messages,"model":"llama3.2"}
    const llamaResponse = await fetch("http://localhost:11434/api/chat",{
        body: JSON.stringify(llamaRequestBody),method:"POST"
    });
    if (!llamaResponse.ok)
        throw llamaResponse.status;
    if (!llamaResponse.body)
        throw "null response body";
    const reader = llamaResponse.body.getReader();
    return reader;
};
function addMessage(message: Message)
{
    messages.push(message);
}
const rl = readline.createInterface(process.stdin,process.stdout);
const logOutput = async (reader: ReadableStreamDefaultReader)=>{
    let assistantResponse = {role:"assistant",content:""};
    while (true)
    {
        const {done,value} = await reader.read();
        if (done)
            break;
        const parsed = JSON.parse(decoder.decode(value)) as {message: Message};
        assistantResponse.content += parsed.message.content;
        process.stdout.write(parsed.message.content);
    }
};
rl.question("which document?", async (answer) => {
    try {
        const fileBuffer = await fs.readFile(answer);
        const pdfData = await pdf(fileBuffer);

        addMessage({ role: "system", content: `Below is the document content. Answer user questions strictly based on this content. No outside knowledge.\n\n${pdfData.text}` });

        rl.question("What would you like to say to the chatbot? ", async (userInput) => {
            addMessage({ role: "user", content: userInput });

            const reader = await chat();
            await logOutput(reader);
        });
    } catch (err) {
        console.error("Error reading file:", err);
    }
});
```

When I ask a question that is answered only in the document, LLaMa acts as if the app never sent it any document, even though I can confirm that the document's contents are loaded correctly and are passed in the API request.
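
A side note on the stream handling in the code above, separate from the missing-document symptom: `logOutput` parses each chunk from `reader.read()` as exactly one JSON object. Ollama streams its chat responses as newline-delimited JSON, and a network chunk may hold several lines or only part of one, so that parse can throw intermittently. A minimal sketch of line buffering follows; the class name `NdjsonBuffer` is hypothetical and not part of any library.

```typescript
// Collects decoded text chunks and yields only complete newline-delimited
// JSON values. Chunk boundaries need not line up with line boundaries.
class NdjsonBuffer {
    private pending = "";

    // Feed one decoded chunk; returns every JSON value completed by it.
    push(chunk: string): unknown[] {
        this.pending += chunk;
        const lines = this.pending.split("\n");
        // The last element is an unfinished line ("" if the chunk ended on "\n").
        this.pending = lines.pop() ?? "";
        return lines
            .filter((line) => line.trim() !== "")
            .map((line) => JSON.parse(line));
    }
}
```

In the read loop this would be fed with `decoder.decode(value, { stream: true })`, so that multi-byte characters split across chunks are also handled, and each returned object would be treated the way `parsed` is today.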


@rick-github commented on GitHub (Jan 30, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging, but a likely candidate is that you are overflowing the context buffer. [Increase `num_ctx`](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size).

```
const llamaRequestBody = {"messages": messages,"model":"llama3.2","options":{"num_ctx":4096}}
```
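
To make the overflow concrete, here is a rough sketch of sizing `num_ctx` from the prompt instead of hard-coding it. The figure of about 4 characters per token is only a crude rule of thumb for English text, not the tokenizer the model actually uses. The names `estimateTokens`, `pickNumCtx`, and `buildChatRequest`, the 512-token reply reserve, the 2048 starting size, and the 131072 cap are all assumptions chosen for illustration. Only `options.num_ctx` comes from the suggestion above.

```typescript
type Message = { role: string; content: string };

// Crude estimate: assumes roughly 4 characters per token (an assumption,
// not the model's real tokenizer).
function estimateTokens(text: string): number {
    return Math.ceil(text.length / 4);
}

// Doubles the context size from 2048 until it covers the estimated prompt
// plus a reserve for the reply, stopping at maxCtx.
function pickNumCtx(messages: Message[], replyReserve = 512, maxCtx = 131072): number {
    const promptTokens = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
    const needed = promptTokens + replyReserve;
    let numCtx = 2048;
    while (numCtx < needed && numCtx < maxCtx) {
        numCtx *= 2;
    }
    return numCtx;
}

// Builds a chat request body with num_ctx chosen from the messages.
function buildChatRequest(model: string, messages: Message[]) {
    return { model, messages, options: { num_ctx: pickNumCtx(messages) } };
}
```

A larger context uses more memory, so the server logs remain the better way to confirm whether the prompt was actually truncated.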

@AutistiCoder commented on GitHub (Jan 30, 2025):

Increasing `num_ctx` to 4096 did the trick (for now). Thanks!


@abhishek-syno commented on GitHub (Jan 31, 2025):

Is there a way to set `num_ctx=4096` permanently through a configuration file? @rick-github


@rick-github commented on GitHub (Jan 31, 2025):

```console
echo FROM model-with-default-context > Modelfile
echo PARAMETER num_ctx 4096 >> Modelfile
ollama create model-with-context-4096
```
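
For an app like the one in the original post, the same Modelfile text can be generated from TypeScript. This is a minimal sketch; `modelfileFor` and `chatRequestFor` are hypothetical helper names, and it assumes only the `FROM` and `PARAMETER num_ctx` lines shown in the commands above.

```typescript
type Message = { role: string; content: string };

// Builds Modelfile text equivalent to the two echo commands above.
function modelfileFor(baseModel: string, numCtx: number): string {
    if (!Number.isInteger(numCtx) || numCtx <= 0) {
        throw new RangeError(`num_ctx must be a positive integer, got ${numCtx}`);
    }
    return `FROM ${baseModel}\nPARAMETER num_ctx ${numCtx}\n`;
}

// With the context size stored in the derived model, the request body
// no longer needs an "options" field.
function chatRequestFor(derivedModel: string, messages: Message[]) {
    return { model: derivedModel, messages };
}
```

The string returned by `modelfileFor("llama3.2", 4096)` would be written to a file named `Modelfile` before running `ollama create`, and the app would then pass the newly created model name to `chatRequestFor` in place of `"llama3.2"`.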
Reference: github-starred/ollama#83327