[GH-ISSUE #5768] ollama serve supports only llama3; for other models like gemma it's a 404 error #50103

Closed
opened 2026-04-28 14:07:35 -05:00 by GiteaMirror · 14 comments

Originally created by @RakshitAralimatti on GitHub (Jul 18, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5768

Previously I downloaded Llama 3, ran it with `ollama serve`, and made API calls from Python.

Now I have downloaded Gemma 2, and when I run `ollama serve` and set the model to `gemma2` in the API call, it returns a 404, but when I use `llama3` it works fine.
Thanks in advance

GiteaMirror added the question label 2026-04-28 14:07:35 -05:00

@rick-github commented on GitHub (Jul 18, 2024):

Does `ollama list` show the gemma2 model? If you include this line in your Python app, does it show the same list?

`print("\n".join([m["name"] for m in ollama.Client().list()["models"]]))`


@RakshitAralimatti commented on GitHub (Jul 19, 2024):

Thanks for your response, @rick-github.
When I do `ollama list`, it shows the model name gemma2.


@rick-github commented on GitHub (Jul 19, 2024):

Does your app see the same models when it runs the `print` command? Is it possible that the model name in the app is misspelled? If you copy and paste the model name from your code into `ollama run <modelname>`, does it start the model?


@RakshitAralimatti commented on GitHub (Jul 19, 2024):

@rick-github Thanks for your response.
Yes, it shows the model name with `print`, and when I run `ollama run gemma2` in the terminal it works.
But when I use `ollama serve` with my Python script, it shows a 404 error.


@rick-github commented on GitHub (Jul 19, 2024):

What is the exact error message returned with the 404 response?


@RakshitAralimatti commented on GitHub (Jul 19, 2024):

@rick-github
![ollama404](https://github.com/user-attachments/assets/977707ed-4357-46d8-b6a8-a5ec97c97712)

The code I am trying to run:

```python
output = infer.generate({
    "model": "gemma2",
    "prompt": "Write a story of 500 words",
    "stream": False,
})
```

The `generate` function sends the request to http://localhost:11434/api/generate.
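
For reference, a self-contained equivalent of that call using the standard `requests` package (the `infer` wrapper is the reporter's own code and is not shown in the issue), which also surfaces the response body behind the 404:

```python
import requests

# POST the same payload directly to the documented /api/generate endpoint
# and show both the status code and the body, so a 404 explains itself.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "gemma2", "prompt": "Write a story of 500 words", "stream": False},
)
print(resp.status_code)
print(resp.text)  # e.g. {"error":"model \"gemma2\" not found, try pulling it first"}
```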

@rick-github commented on GitHub (Jul 19, 2024):

The response to `infer.generate` will have a body that tells you why it's returning 404. What is the content of that body?


@RakshitAralimatti commented on GitHub (Jul 19, 2024):

@rick-github
This is the body it returns: **Content - b'{"error":"model \"gemma2\" not found, try pulling it first"}'**
But I do have gemma2; it shows up in `ollama list`:

```
NAME            ID            SIZE    MODIFIED
gemma2:latest   ff02c3702f32  5.4 GB  24 hours ago
llama3:latest   365c0bd3c000  4.7 GB  3 weeks ago
example:latest  65cd853c43c7  16 GB   5 weeks ago
phi3:mini       64c1188f2485  2.4 GB  5 weeks ago
phi3:latest     64c1188f2485  2.4 GB  7 weeks ago
```


@rick-github commented on GitHub (Jul 19, 2024):

Are you sure that your app is talking to the same ollama server that you are using when you run `ollama list`? When you ran the `print` command, did you run it within the app? What was the output of the `print` command?


@RakshitAralimatti commented on GitHub (Jul 19, 2024):

@rick-github
You're right: when I ran the `print` command within my Python script, it showed only `llama3:latest`.
But why is that? I have run ollama in only one terminal.


@rick-github commented on GitHub (Jul 19, 2024):

It seems like you are running 2 (or more) ollama servers. Is your app running in a container? Is `OLLAMA_HOST` set in your environment or the environment of the app? What's the output of `ps wwp$(pidof ollama)`?
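
One way to check for a second server from Python, as a sketch: query the documented `/api/tags` endpoint on both the default address and whatever `OLLAMA_HOST` points at (this assumes `OLLAMA_HOST`, if set, holds a full URL):

```python
import os
import requests

# Candidate servers: the built-in default, plus OLLAMA_HOST if set.
hosts = {"default": "http://127.0.0.1:11434"}
if os.environ.get("OLLAMA_HOST"):
    hosts["OLLAMA_HOST"] = os.environ["OLLAMA_HOST"]

# /api/tags lists the models each server holds locally; differing lists
# mean the CLI and the app are not talking to the same server.
for label, host in hosts.items():
    models = requests.get(f"{host}/api/tags", timeout=5).json().get("models", [])
    print(label, host, sorted(m["name"] for m in models))
```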


@cleverpig commented on GitHub (Jul 22, 2024):

Watching this problem; it seems like mine.


@RakshitAralimatti commented on GitHub (Jul 22, 2024):

@rick-github
When I run `pidof ollama`, the output I get is `50778`.
If I am running two (or more) Ollama servers, how can I stop them all?
And `OLLAMA_HOST` is not set in the environment.


@rick-github commented on GitHub (Jul 22, 2024):

If `pidof ollama` shows only one process ID, there is only one ollama server on that machine. But since `ollama list` and your app see two different model lists, there must be two servers. So you will have to figure out why that is.
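
To make that mismatch concrete, a sketch that compares the CLI's view with the Python client's view (same dict-shaped `list()` response assumed as earlier in the thread):

```python
import subprocess
import ollama

# What the CLI reports (shells out to the same binary the user runs).
cli_output = subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout

# What the Python client reports.
api_models = sorted(m["name"] for m in ollama.Client().list()["models"])

print("CLI:\n", cli_output)
print("Client:", api_models)
# If the two lists disagree, two different servers are answering.
```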

Reference: github-starred/ollama#50103