[GH-ISSUE #6143] Support for AWS Neuron Inferentia GPU #3837

Open
opened 2026-04-12 14:40:23 -05:00 by GiteaMirror · 7 comments

Originally created by @mavwolverine on GitHub (Aug 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6143

This would add the ability to run ollama on inf2 instance types in AWS.

GiteaMirror added the feature request label 2026-04-12 14:40:23 -05:00

@mavwolverine commented on GitHub (Aug 2, 2024):

$ lspci -d '1d0f:' | grep NeuronDevice
00:1f.0 System peripheral: Amazon.com, Inc. NeuronDevice (Inferentia2)
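The check above could be wrapped in a small script, for example as a first step toward device discovery. This is only a sketch: `has_neuron_device` is a hypothetical helper name, and it assumes the `NeuronDevice` string appears in the `lspci` output exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ollama): succeeds when the lspci output
# on stdin contains a NeuronDevice entry.
has_neuron_device() {
    grep -q 'NeuronDevice'
}

# 1d0f is the Amazon PCI vendor ID used in the command above.
# lspci is only called when it is installed.
if command -v lspci >/dev/null 2>&1 && lspci -d '1d0f:' | has_neuron_device; then
    echo "Neuron device found"
else
    echo "No Neuron device found"
fi
```

Reading from stdin keeps the helper testable on machines without Inferentia hardware.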


@mavwolverine commented on GitHub (Aug 2, 2024):

/sys/module/neuron/version
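A minimal sketch of how this file might be read to detect the driver, assuming it is readable only while the `neuron` kernel module is loaded. The helper name `neuron_driver_version` and its optional path argument are hypothetical, added for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ollama): print the Neuron driver version,
# or return non-zero when the version file is missing or unreadable.
# The optional first argument overrides the sysfs path.
neuron_driver_version() {
    version_file="${1:-/sys/module/neuron/version}"
    [ -r "$version_file" ] || return 1
    cat "$version_file"
}

if version=$(neuron_driver_version); then
    echo "Neuron driver version: $version"
else
    echo "Neuron driver not loaded"
fi
```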


@mavwolverine commented on GitHub (Aug 2, 2024):

```
# Configure Linux for Neuron repository updates
sudo tee /etc/yum.repos.d/neuron.repo > /dev/null <<EOF
[neuron]
name=Neuron YUM Repository
baseurl=https://yum.repos.neuron.amazonaws.com
enabled=1
metadata_expire=0
EOF
sudo rpm --import https://yum.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB

# Update OS packages
sudo yum update -y

# Install OS headers
sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r) -y

# Install git
sudo yum install git -y

# install Neuron Driver
sudo yum install aws-neuronx-dkms-2.* -y

# Install Neuron Runtime
sudo yum install aws-neuronx-collectives-2.* -y
sudo yum install aws-neuronx-runtime-lib-2.* -y

# Install Neuron Tools
sudo yum install aws-neuronx-tools-2.* -y

# Add PATH
export PATH=/opt/aws/neuron/bin:$PATH
```
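After the install steps, a quick sanity check might look like the sketch below. It assumes the tools package places a `neuron-ls` binary in `/opt/aws/neuron/bin` (the directory added to `PATH` above). `neuron_tools_installed` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of ollama): succeeds when the given directory
# contains an executable neuron-ls. Defaults to the PATH entry added above.
neuron_tools_installed() {
    bin_dir="${1:-/opt/aws/neuron/bin}"
    [ -x "$bin_dir/neuron-ls" ]
}

if neuron_tools_installed; then
    echo "Neuron tools found"
else
    echo "Neuron tools not found; check the install steps"
fi
```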

@ollieanwyll commented on GitHub (Aug 5, 2024):

Would also love to see Ollama support AWS Inf2 instances.


@CptTZ commented on GitHub (Aug 8, 2024):

I think the limitation is with llama.cpp, so we should revive this issue first - https://github.com/ggerganov/llama.cpp/issues/2109


@muhammad-asn commented on GitHub (Feb 14, 2025):

Would love to see Ollama support AWS Inferentia 2.


@MoonKraken commented on GitHub (Feb 28, 2025):

Agree this would be amazing

Reference: github-starred/ollama#3837