[GH-ISSUE #3239] Vercel AI SDK with Ollama - not for production #1998

Closed
opened 2026-04-12 12:11:47 -05:00 by GiteaMirror · 4 comments

Originally created by @jakobhoeg on GitHub (Mar 19, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3239

What is the issue?

The blog post (https://ollama.com/blog/openai-compatibility) on integrating the Vercel AI SDK with Ollama should be updated to state that the setup is NOT for production and only works locally.

Calls from Next.js API routes can't be proxied to localhost. When the route is called on a hosted instance, it runs on the hosting server, where localhost refers to that server itself, so it can't reach an Ollama instance running on a local machine.
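
To make the failure mode concrete, here is a minimal sketch (not taken from the blog post) of a guard that fails fast when a hosted route is configured with a loopback address. The helper names `isLoopbackUrl` and `resolveOllamaBaseUrl` and the `isHosted` flag are hypothetical and exist only for illustration; how a deployment detects that it is hosted is left to the caller.

```typescript
// Hypothetical helper: true when the URL's host is a loopback address,
// i.e. an address that always refers to the machine running the code.
function isLoopbackUrl(raw: string): boolean {
  const host = new URL(raw).hostname.toLowerCase();
  return (
    host === "localhost" ||
    host === "[::1]" || // URL.hostname keeps the brackets for IPv6 literals
    /^127\.\d+\.\d+\.\d+$/.test(host)
  );
}

// Hypothetical guard: build the OpenAI-compatible base URL, but refuse a
// loopback address when the route runs on a hosted server, because there
// "localhost" is the server itself rather than the developer's machine.
function resolveOllamaBaseUrl(configured: string, isHosted: boolean): string {
  if (isHosted && isLoopbackUrl(configured)) {
    throw new Error(
      `Ollama URL ${configured} is a loopback address and cannot be reached from a hosted route`
    );
  }
  // Ollama's OpenAI-compatible endpoints live under /v1.
  return `${configured.replace(/\/+$/, "")}/v1`;
}
```

Under these assumptions, a local run with `http://localhost:11434` resolves normally, while the same value in a hosted deployment raises an error at startup instead of failing on the first request.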

What did you expect to see?

I expected to be able to host it through Vercel and still be able to hit my locally running Ollama. The post doesn't state that this isn't possible, and since it features the Vercel AI SDK, it implies that this should work.

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

No response

OS

Windows

Architecture

No response

Platform

No response

Ollama version

No response

GPU

No response

GPU info

No response

CPU

No response

Other software

Vercel

GiteaMirror added the bug, needs more info, and api labels 2026-04-12 12:11:48 -05:00

@xycjscs commented on GitHub (Mar 21, 2024):

With the OpenAI SDK, Ollama can be accessed from the local machine and from a remote machine (via Python and via curl). However, once the app is hosted on Vercel, it fails.
Any clue or idea? I have already added it to my production deployment on Vercel.


@xycjscs commented on GitHub (Mar 21, 2024):

> With the OpenAI SDK, Ollama can be accessed from the local machine and from a remote machine (via Python and via curl). However, once the app is hosted on Vercel, it fails. Any clue or idea? I have already added it to my production deployment on Vercel.

import { CHAT_SETTING_LIMITS } from "@/lib/chat-setting-limits"
import { checkApiKey, getServerProfile } from "@/lib/server/server-chat-helpers"
import { ChatSettings } from "@/types"
import { OpenAIStream, StreamingTextResponse } from "ai"
import OpenAI from "openai"
import { ChatCompletionCreateParamsBase } from "openai/resources/chat/completions.mjs"
import { ServerRuntime } from "next"

export const runtime: ServerRuntime = "edge"

export async function POST(request: Request) {
  const json = await request.json()
  const { chatSettings, messages } = json as {
    chatSettings: ChatSettings
    messages: any[]
  }

  try {
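    const openai = new OpenAI({
      // The OpenAI client requires a key, but Ollama does not use it
      apiKey: "any",
      // This URL must be reachable from the server running this route,
      // not only from the developer's machine
      baseURL: `${process.env.NEXT_PUBLIC_OLLAMA_URL}/v1`
    })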

    console.log(messages)

    const response = await openai.chat.completions.create({
      model: chatSettings.model as ChatCompletionCreateParamsBase["model"],
      messages: messages as ChatCompletionCreateParamsBase["messages"],
      temperature: chatSettings.temperature,
      max_tokens:
        CHAT_SETTING_LIMITS[chatSettings.model].MAX_TOKEN_OUTPUT_LENGTH,
      stream: true
    })

    const stream = OpenAIStream(response)

    return new StreamingTextResponse(stream)
  } catch (error: any) {
    const errorMessage = error.error?.message || "An unexpected error occurred"
    const errorCode = error.status || 500
    return new Response(JSON.stringify({ message: errorMessage }), {
      status: errorCode
    })
  }
}
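
One way to narrow down a failure like this is to check reachability from the same place the route runs. The sketch below is an assumption-laden illustration, not part of the original comment: `probeOllama` and the `FetchLike` type are hypothetical names, and the probe relies on Ollama's `/api/tags` endpoint (which lists local models) answering on the configured host. The fetch implementation is injectable so the logic can be exercised without a network.

```typescript
// Minimal shape of the fetch function the probe needs (hypothetical type).
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal }
) => Promise<{ ok: boolean; status: number }>;

// Hypothetical probe: reports whether an Ollama server answers at baseUrl
// when called from the environment this code is running in.
async function probeOllama(
  baseUrl: string,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 3000
): Promise<string> {
  const controller = new AbortController();
  // Abort the request if the server does not answer in time.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/api/tags`, {
      signal: controller.signal
    });
    return res.ok ? "reachable" : `http-error:${res.status}`;
  } catch {
    // Network errors, DNS failures, and timeouts all end up here.
    return "unreachable";
  } finally {
    clearTimeout(timer);
  }
}
```

Calling `probeOllama` with the deployed value of the Ollama URL from inside the hosted route, and again from a local script, would show whether the difference lies in network reachability rather than in the SDK code.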

@dhiltgen commented on GitHub (Nov 6, 2024):

Is this still a problem? Does it require adjusting the OLLAMA_ORIGINS setting?

https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-allow-additional-web-origins-to-access-ollama


@pdevine commented on GitHub (Dec 19, 2024):

I'm going to close this, but we can reopen it if it's still a problem.

Reference: github-starred/ollama#1998