[GH-ISSUE #13748] Is Ollama good for production deployment for processing large data in batches? #34772

Closed
opened 2026-04-22 18:36:25 -05:00 by GiteaMirror · 2 comments

Originally created by @abhijithneilabraham on GitHub (Jan 16, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13748

Hello! We're considering Ollama for production deployments of local LLMs on our customers' on-premise infrastructure. We process data in batches so that the LLM can access the full dataset and reason about it at the row level. Could Ollama serve as the default backend for batch processing large-scale data with open-source LLMs?

Here's the project: https://github.com/vitalops/datatune

GiteaMirror added the question label 2026-04-22 18:36:25 -05:00

@SingularityMan commented on GitHub (Jan 16, 2026):

While Ollama is fantastic for backend automation and rapid prototyping, it doesn't scale very well unless you have really good hardware, and even then a lot of its optimizations are automated, which leaves little room for tuning. You might want to use vLLM instead, since it is much better suited to scalable production deployments, but it is a more advanced local inference setup and performs best in a Linux environment.

Depending on the size of your customer base, I'd rather go with something more customizable.
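
One way to keep that choice open is to hide the backend behind a single callable, so the batching logic does not care whether Ollama or something else answers. The sketch below is only an illustration under stated assumptions, not datatune's actual implementation: the function names (`chunked`, `build_prompt`, `ollama_generate`, `process_rows`), the model name `llama3`, and the prompt layout are all hypothetical. The only real API it relies on is Ollama's `/api/generate` endpoint on its default port with `stream` set to `false`.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator, List

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def chunked(rows: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch: List[str] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def build_prompt(instruction: str, batch: List[str]) -> str:
    """Number each row so a reply can be matched back to its row."""
    numbered = "\n".join(f"{i}: {row}" for i, row in enumerate(batch))
    return f"{instruction}\n\n{numbered}"


def ollama_generate(prompt: str, model: str = "llama3") -> str:
    """Send one non-streaming request to Ollama; the model name is a placeholder."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def process_rows(
    rows: Iterable[str],
    instruction: str,
    generate: Callable[[str], str] = ollama_generate,
    batch_size: int = 8,
) -> List[str]:
    """Run every batch through `generate` and collect one raw reply per batch."""
    return [generate(build_prompt(instruction, b)) for b in chunked(rows, batch_size)]
```

Because `generate` is just a function from prompt to text, a wrapper around vLLM's OpenAI-compatible server could be passed in its place without touching the batching code. Batches here are sent one after another; a production setup would likely also want concurrency, retries, and output parsing, none of which this sketch attempts.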


@abhijithneilabraham commented on GitHub (Jan 17, 2026):

@SingularityMan Thanks for the comment!


Reference: github-starred/ollama#34772