[GH-ISSUE #516] Docker Container on KVM-Based Ubuntu Machine Fails to Start 'llama.cpp Server' with Illegal Instruction Error #240

Closed
opened 2026-04-12 09:45:37 -05:00 by GiteaMirror · 21 comments

Originally created by @philipempl on GitHub (Sep 12, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/516

Issue Summary:

I encountered an issue while running a Docker container on a KVM-based Ubuntu machine. The container is built using the following Dockerfile and runs a Go application:

Dockerfile:

```Dockerfile
# Stage 1: Build the binary
FROM golang:alpine AS builder

# Install required dependencies
RUN apk add --no-cache git build-base cmake

# Set the working directory within the container
WORKDIR /app

# Clone the source code from the GitHub repository
RUN git clone https://github.com/jmorganca/ollama.git .

# Build the binary with static linking
RUN go generate ./... \
    && go build -ldflags '-linkmode external -extldflags "-static"' -o .

# Stage 2: Create the final image
FROM alpine

ENV OLLAMA_HOST "0.0.0.0"

# Install required runtime dependencies
RUN apk add --no-cache libstdc++ curl

# Copy the Modelfile into the container
COPY Modelfile /Modelfile

# Copy the custom entry point script into the container
COPY entrypoint.sh /entrypoint.sh

# Make the script executable
RUN chmod +x /entrypoint.sh

# Create a non-root user
ARG USER=ollama
ARG GROUP=ollama
RUN addgroup $GROUP && adduser -D -G $GROUP $USER

# Copy the binary from the builder stage
COPY --from=builder /app/ollama /bin/ollama

USER $USER:$GROUP

ENTRYPOINT ["/entrypoint.sh"]

Entrypoint.sh:

```shell
#!/bin/sh

./bin/ollama serve &

sleep 5

curl -X POST http://ollama:11434/api/pull -d '{"name": "llama2"}'

sleep 10

tail -f /dev/null
```
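
Note that the fixed `sleep` calls make the pull race against server startup. A more robust variant would poll the API until it responds; a minimal sketch, assuming the server is reachable on localhost from inside the same container:

```shell
#!/bin/sh
/bin/ollama serve &

# Poll until the server answers instead of sleeping a fixed amount of time.
until curl -sf http://localhost:11434/ > /dev/null; do
  sleep 1
done

curl -X POST http://localhost:11434/api/pull -d '{"name": "llama2"}'

tail -f /dev/null
```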

Error Log:

```
ollama | [GIN] 2023/09/12 - 14:16:41 | 500 | 1m30s | 10.10.2.6 | POST "/api/generate"
ollama | 2023/09/12 14:15:11 llama.go:311: waiting for llama.cpp server to start responding
ollama | 2023/09/12 14:15:41 llama.go:292: error starting llama.cpp server: llama.cpp server did not start within allotted time, retrying
ollama | 2023/09/12 14:15:41 llama.go:329: llama.cpp server exited with error: signal: illegal instruction (core dumped)
ollama | 2023/09/12 14:15:41 llama.go:285: starting llama.cpp server
ollama | 2023/09/12 14:15:41 llama.go:311: waiting for llama.cpp server to start responding
```

System Specification:

```
description: Computer
product: KVM (8.2.0)
vendor: Red Hat
version: RHEL-8.2.0 PC (Q35 + ICH9, 2009)
width: 64 bits
capabilities: smbios-2.8 dmi-2.8 smp vsyscall32
configuration: boot=normal family=Red Hat sku=8.2.0 uuid=AC06E592-B8AE-F64D-A219-4EC4D8C1C5A0
*-core
   description: Motherboard
   product: RHEL-AV
   vendor: Red Hat
   physical id: 0
   version: RHEL-8.2.0 PC (Q35 + ICH9, 2009)
*-firmware
      description: BIOS
      vendor: SeaBIOS
      physical id: 0
      version: 1.16.0-3.module+el8.7.0+1084+97b81f61
      date: 04/01/2014
      size: 96KiB
*-cpu:0
      description: CPU
      product: Intel Core i7 9xx (Nehalem Core i7, IBRS update)
      vendor: Intel Corp.
      physical id: 400
      bus info: cpu@0
      version: RHEL-8.2.0 PC (Q35 + ICH9, 2009)
      slot: CPU 0
      size: 2GHz
      capacity: 2GHz
      width: 64 bits
      capabilities: [List of CPU capabilities]
      configuration: [CPU configuration details]
```

Issue Description:

I have created a Docker container using the provided Dockerfile, and it runs successfully on Windows and macOS. However, when running it on a KVM-based Ubuntu machine, I encounter the following issue:

  1. The application attempts to start a "llama.cpp server" but fails with a "signal: illegal instruction (core dumped)" error.
  2. The error occurs when I try to access the "api/generate" endpoint.

Steps to Reproduce:

  1. Build the Docker container using the provided Dockerfile.
  2. Run the container on a KVM-based Ubuntu machine.
  3. Run ollama serve
  4. Run ollama pull llama2
  5. Access the "api/generate" endpoint:
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'

Expected Behavior:

The application should start without errors, and I should be able to access the "api/generate" endpoint.

Actual Behavior:

The application encounters a "signal: illegal instruction (core dumped)" error when starting the "llama.cpp server," and I cannot access the "api/generate" endpoint.


@philipempl commented on GitHub (Sep 13, 2023):

Additional information: I run the ollama server on a remote host and call the API from a different computer.


@philipempl commented on GitHub (Sep 13, 2023):

Maybe related to #484? But pulling the model works without any problems.


@mxyng commented on GitHub (Sep 13, 2023):

Can you verify that the architecture of the Docker image matches the VM? I've seen similar issues when trying to run arm images on x86.

```
$ docker image inspect <image> --format='{{ .Architecture }}'
arm64
```
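
For comparison, the architecture the VM itself reports can be checked on the Docker host; if the image prints `arm64` but the host prints `x86_64`, the image was built for the wrong platform. A small sketch, assuming a reasonably recent Docker CLI:

```shell
# Run on the Docker host (the KVM guest in this case):
uname -m                                     # machine architecture, e.g. x86_64
docker version --format '{{.Server.Arch}}'   # architecture of the Docker daemon
```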

@philipempl commented on GitHub (Sep 14, 2023):

Thanks for your response. This is the result:

```
$ docker image inspect lecture-chat_ollama --format='{{ .Architecture }}'
amd64
```

@mxyng commented on GitHub (Sep 14, 2023):

To confirm, the Docker host is also x86?

Can you paste the output of the Docker build? In particular, the `go generate` step, as that will show what platform it thinks it's building for.


@philipempl commented on GitHub (Sep 15, 2023):

Of course. Although I had checked the log several times, I only just noticed the error message "cuBLAS not found". Could it be that my VM does not have a GPU available? As far as I've read, llama.cpp uses the CPU by default. Anyway, here is the complete log:

```shell
Building ollama
Sending build context to Docker daemon   5.12kB
Step 1/17 : FROM golang:alpine AS builder
 ---> 09df25511440
Step 2/17 : RUN apk add --no-cache git build-base cmake
 ---> Running in b0e8e0dc9b56
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
(1/40) Installing libgcc (12.2.1_git20220924-r10)
(2/40) Installing libstdc++ (12.2.1_git20220924-r10)
(3/40) Installing zstd-libs (1.5.5-r4)
(4/40) Installing binutils (2.40-r7)
(5/40) Installing libmagic (5.45-r0)
(6/40) Installing file (5.45-r0)
(7/40) Installing libgomp (12.2.1_git20220924-r10)
(8/40) Installing libatomic (12.2.1_git20220924-r10)
(9/40) Installing gmp (6.2.1-r3)
(10/40) Installing isl26 (0.26-r1)
(11/40) Installing mpfr4 (4.2.0_p12-r0)
(12/40) Installing mpc1 (1.3.1-r1)
(13/40) Installing gcc (12.2.1_git20220924-r10)
(14/40) Installing libstdc++-dev (12.2.1_git20220924-r10)
(15/40) Installing musl-dev (1.2.4-r1)
(16/40) Installing libc-dev (0.7.2-r5)
(17/40) Installing g++ (12.2.1_git20220924-r10)
(18/40) Installing make (4.4.1-r1)
(19/40) Installing fortify-headers (1.1-r3)
(20/40) Installing patch (2.7.6-r10)
(21/40) Installing build-base (0.5-r3)
(22/40) Installing libacl (2.3.1-r3)
(23/40) Installing libbz2 (1.0.8-r5)
(24/40) Installing libexpat (2.5.0-r1)
(25/40) Installing lz4-libs (1.9.4-r4)
(26/40) Installing xz-libs (5.4.3-r0)
(27/40) Installing libarchive (3.7.2-r0)
(28/40) Installing brotli-libs (1.0.9-r14)
(29/40) Installing libunistring (1.1-r1)
(30/40) Installing libidn2 (2.3.4-r1)
(31/40) Installing nghttp2-libs (1.55.1-r0)
(32/40) Installing libcurl (8.2.1-r0)
(33/40) Installing ncurses-terminfo-base (6.4_p20230506-r0)
(34/40) Installing libncursesw (6.4_p20230506-r0)
(35/40) Installing libformw (6.4_p20230506-r0)
(36/40) Installing rhash-libs (1.4.3-r2)
(37/40) Installing libuv (1.44.2-r2)
(38/40) Installing cmake (3.26.5-r0)
(39/40) Installing pcre2 (10.42-r1)
(40/40) Installing git (2.40.1-r0)
Executing busybox-1.36.1-r2.trigger
OK: 311 MiB in 56 packages
Removing intermediate container b0e8e0dc9b56
 ---> 5ef279345f5e
Step 3/17 : WORKDIR /app
 ---> Running in 804bd7a46685
Removing intermediate container 804bd7a46685
 ---> 1c6520a2bd10
Step 4/17 : RUN git clone https://github.com/jmorganca/ollama.git .
 ---> Running in 66a78e86b6b6
Cloning into '.'...
Removing intermediate container 66a78e86b6b6
 ---> 4ca99a2920e2
Step 5/17 : RUN go generate ./...     && go build -ldflags '-linkmode external -extldflags "-static"' -o .
 ---> Running in a0f8ed6b4383
go: downloading github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58
go: downloading golang.org/x/term v0.10.0
go: downloading github.com/mattn/go-runewidth v0.0.14
go: downloading github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db
go: downloading github.com/gin-contrib/cors v1.4.0
go: downloading github.com/gin-gonic/gin v1.9.1
go: downloading golang.org/x/crypto v0.10.0
go: downloading golang.org/x/exp v0.0.0-20230817173708-d852ddb80c63
go: downloading gonum.org/v1/gonum v0.13.0
go: downloading github.com/chzyer/readline v1.5.1
go: downloading github.com/dustin/go-humanize v1.0.1
go: downloading github.com/olekukonko/tablewriter v0.0.5
go: downloading github.com/spf13/cobra v1.7.0
go: downloading golang.org/x/sys v0.11.0
go: downloading github.com/rivo/uniseg v0.2.0
go: downloading github.com/spf13/pflag v1.0.5
go: downloading golang.org/x/net v0.10.0
go: downloading github.com/gin-contrib/sse v0.1.0
go: downloading github.com/mattn/go-isatty v0.0.19
go: downloading github.com/go-playground/validator/v10 v10.14.0
go: downloading github.com/ugorji/go/codec v1.2.11
go: downloading google.golang.org/protobuf v1.30.0
go: downloading github.com/pelletier/go-toml/v2 v2.0.8
go: downloading gopkg.in/yaml.v3 v3.0.1
go: downloading golang.org/x/text v0.10.0
go: downloading github.com/go-playground/universal-translator v0.18.1
go: downloading github.com/leodido/go-urn v1.2.4
go: downloading github.com/gabriel-vasile/mimetype v1.4.2
go: downloading github.com/go-playground/locales v0.14.1
Submodule 'llm/llama.cpp/ggml' (https://github.com/ggerganov/llama.cpp.git) registered for path 'ggml'
Submodule 'llm/llama.cpp/gguf' (https://github.com/ggerganov/llama.cpp.git) registered for path 'gguf'
Cloning into '/app/llm/llama.cpp/ggml'...
From https://github.com/ggerganov/llama.cpp
 * branch            9e232f0234073358e7031c1b8d7aa45020469a3b -> FETCH_HEAD
Submodule path 'ggml': checked out '9e232f0234073358e7031c1b8d7aa45020469a3b'
-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.5s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/ggml/build/cpu
[  9%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 18%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 27%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[ 27%] Built target ggml
[ 36%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 45%] Linking CXX static library libllama.a
[ 45%] Built target llama
[ 54%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[ 63%] Building CXX object examples/CMakeFiles/common.dir/console.cpp.o
[ 72%] Building CXX object examples/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 72%] Built target common
[ 81%] Built target BUILD_INFO
[ 90%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[100%] Linking CXX executable ../../bin/server
[100%] Built target server
Cloning into '/app/llm/llama.cpp/gguf'...
From https://github.com/ggerganov/llama.cpp
 * branch            53885d7256909ec3e2176cdc2477f3986c15ec69 -> FETCH_HEAD
Submodule path 'gguf': checked out '53885d7256909ec3e2176cdc2477f3986c15ec69'
-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.5s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/gguf/build/cpu
[  9%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 18%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 27%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
/app/llm/llama.cpp/gguf/k_quants.c:182:14: warning: 'make_qkx1_quants' defined but not used [-Wunused-function]
  182 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
      |              ^~~~~~~~~~~~~~~~
[ 27%] Built target ggml
[ 36%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 45%] Linking CXX static library libllama.a
[ 45%] Built target llama
[ 54%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 63%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 72%] Building CXX object common/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 72%] Built target common
[ 81%] Built target BUILD_INFO
[ 90%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[100%] Linking CXX executable ../../bin/server
[100%] Built target server
Submodule path 'ggml': checked out '9e232f0234073358e7031c1b8d7aa45020469a3b'
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.1s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/ggml/build/cpu
[  9%] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.40.1")
[  9%] Built target BUILD_INFO
[ 18%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 27%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 36%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[ 36%] Built target ggml
[ 45%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 54%] Linking CXX static library libllama.a
[ 54%] Built target llama
[ 63%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[ 72%] Building CXX object examples/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 81%] Built target common
[ 90%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[100%] Linking CXX executable ../../bin/server
[100%] Built target server
Submodule path 'gguf': checked out '53885d7256909ec3e2176cdc2477f3986c15ec69'
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.1s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/gguf/build/cpu
[ 27%] Built target ggml
[ 45%] Built target llama
[ 72%] Built target common
[ 81%] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.40.1")
[ 81%] Built target BUILD_INFO
[100%] Built target server
-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Could not find nvcc, please set CUDAToolkit_ROOT.
CMake Warning at CMakeLists.txt:291 (message):
  cuBLAS not found


-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.5s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/ggml/build/cuda-
[  9%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 18%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 27%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[ 27%] Built target ggml
[ 36%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 45%] Linking CXX static library libllama.a
[ 45%] Built target llama
[ 54%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[ 63%] Building CXX object examples/CMakeFiles/common.dir/console.cpp.o
[ 72%] Building CXX object examples/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 72%] Built target common
[ 81%] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.40.1")
[ 81%] Built target BUILD_INFO
[ 90%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[100%] Linking CXX executable ../../bin/server
[100%] Built target server
-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Could not find nvcc, please set CUDAToolkit_ROOT.
CMake Warning at CMakeLists.txt:292 (message):
  cuBLAS not found


-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.5s)
-- Generating done (0.1s)
-- Build files have been written to: /app/llm/llama.cpp/gguf/build/cuda-
[  9%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 18%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 27%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
/app/llm/llama.cpp/gguf/k_quants.c:182:14: warning: 'make_qkx1_quants' defined but not used [-Wunused-function]
  182 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
      |              ^~~~~~~~~~~~~~~~
[ 27%] Built target ggml
[ 36%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 45%] Linking CXX static library libllama.a
[ 45%] Built target llama
[ 54%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 63%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 72%] Building CXX object common/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 72%] Built target common
[ 81%] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.40.1")
[ 81%] Built target BUILD_INFO
[ 90%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[100%] Linking CXX executable ../../bin/server
[100%] Built target server
Removing intermediate container a0f8ed6b4383
 ---> 427153c98da2
Step 6/17 : FROM alpine
 ---> 7e01a0d0a1dc
Step 7/17 : ENV OLLAMA_HOST "0.0.0.0"
 ---> Running in 2593a5532740
Removing intermediate container 2593a5532740
 ---> 5ae06a41c895
Step 8/17 : RUN apk add --no-cache libstdc++ curl
 ---> Running in 5545d4188c0b
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
(1/9) Installing ca-certificates (20230506-r0)
(2/9) Installing brotli-libs (1.0.9-r14)
(3/9) Installing libunistring (1.1-r1)
(4/9) Installing libidn2 (2.3.4-r1)
(5/9) Installing nghttp2-libs (1.55.1-r0)
(6/9) Installing libcurl (8.2.1-r0)
(7/9) Installing curl (8.2.1-r0)
(8/9) Installing libgcc (12.2.1_git20220924-r10)
(9/9) Installing libstdc++ (12.2.1_git20220924-r10)
Executing busybox-1.36.1-r2.trigger
Executing ca-certificates-20230506-r0.trigger
OK: 14 MiB in 24 packages
Removing intermediate container 5545d4188c0b
 ---> ee52ddb1a545
Step 9/17 : COPY Modelfile /Modelfile
 ---> a71e55fb0b6b
Step 10/17 : COPY entrypoint.sh /entrypoint.sh
 ---> ba1178d45ab5
Step 11/17 : RUN chmod +x /entrypoint.sh
 ---> Running in ded0bbdc5617
Removing intermediate container ded0bbdc5617
 ---> 31a70282a3b8
Step 12/17 : ARG USER=ollama
 ---> Running in 0b1d97ed807c
Removing intermediate container 0b1d97ed807c
 ---> 8ebdd7daf6c2
Step 13/17 : ARG GROUP=ollama
 ---> Running in 0fd3a90b08ef
Removing intermediate container 0fd3a90b08ef
 ---> 98f2fc43f0a5
Step 14/17 : RUN addgroup $GROUP && adduser -D -G $GROUP $USER
 ---> Running in e34a68a7842b
Removing intermediate container e34a68a7842b
 ---> ed481f22c190
Step 15/17 : COPY --from=builder /app/ollama /bin/ollama
 ---> 132bb7485e41
Step 16/17 : USER $USER:$GROUP
 ---> Running in 4a1b67258634
Removing intermediate container 4a1b67258634
 ---> 6b0361cc9c31
Step 17/17 : ENTRYPOINT ["/entrypoint.sh"]
 ---> Running in 5ec3ba6a8960
Removing intermediate container 5ec3ba6a8960
 ---> 8bbfc34c9162
Successfully built 8bbfc34c9162
Successfully tagged lecture-chat_ollama:latest
```

@mxyng commented on GitHub (Sep 15, 2023):

That output looks OK. `cuBLAS not found` is a warning emitted when building on a platform without the CUDA toolkit; the server will use the CPU.

I notice the CPU platform is fairly old:

```
product: Intel Core i7 9xx (Nehalem Core i7, IBRS update)
```

I can't definitively say this is the issue, but it's pretty suspect. Are you able to try the same setup on another, newer platform?
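The `Nehalem` guest CPU model predates AVX, so even a modern physical host will present a virtual CPU without it. If the physical CPU does support AVX, exposing the host CPU model to the guest should avoid the illegal instruction. A sketch, assuming a libvirt-managed VM; `<domain>` is a placeholder for the VM name:

```shell
# On the KVM host, not inside the guest:
virsh dumpxml <domain> | grep -A2 '<cpu'    # inspect the current virtual CPU model
virsh edit <domain>                         # change the <cpu> element to:
                                            #   <cpu mode='host-passthrough'/>
virsh shutdown <domain> && virsh start <domain>   # full power cycle to apply
```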


@philipempl commented on GitHub (Sep 17, 2023):

Thanks for your support :). Unfortunately not; as a university, we are dependent on the computing resources of our internal IT.


@mxyng commented on GitHub (Sep 18, 2023):

Understood. I don't have the capacity to investigate this at the moment, but I'll leave the issue open for now.


@khromov commented on GitHub (Sep 26, 2023):

I have the same issue on an Intel(R) Xeon(R) CPU X3480; almost certainly an "old CPU" issue.

```
2023/09/26 23:40:47 llama.go:346: waiting for llama runner to start responding
2023/09/26 23:40:47 llama.go:320: llama runner exited with error: signal: illegal instruction
```

@khromov commented on GitHub (Sep 26, 2023):

I tried compiling llama.cpp from source and it did work, so that is an option for older hardware. If it's caused by some compile flag, I don't know whether Ollama can do anything about it other than perhaps offer a "bare" binary for older CPUs without fancy acceleration flags.
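
For reference, llama.cpp's CMake build had switches for exactly those acceleration features, so a "bare" binary can be produced by turning them off. A sketch; the option names are taken from llama.cpp's CMakeLists of that era and should be verified against the checked-out revision:

```shell
# Build llama.cpp without AVX/AVX2/FMA/F16C so the binary runs on pre-AVX CPUs.
cmake -B build -DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF
cmake --build build
```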


@theflu commented on GitHub (Oct 19, 2023):

I had what seems to be the same issue (running on an older Intel(R) Celeron(R) CPU G3930). I got around this by building Ollama from source, following development.md.


@JulioBarros commented on GitHub (Nov 1, 2023):

@theflu I'm still getting illegal instruction with an old GPU even after rebuilding it. Are there build flags that I should use? Like to tell it not to use CUDA or to assume an older version or something? TIA.


@theflu commented on GitHub (Nov 1, 2023):

> @theflu I'm still getting illegal instruction with an old GPU even after rebuilding it. Are there build flags that I should use? Like to tell it not to use CUDA or to assume an older version or something? TIA.

I am pretty sure CUDA is required. This was to get it working on an older CPU. Though you should be able to run it on your CPU without CUDA. You might need CUDA installed to compile it. Not sure if there is a flag to compile without CUDA support.


@JulioBarros commented on GitHub (Nov 1, 2023):

Sorry, I could have been clearer. I have two 1080 Tis and am getting the illegal instruction even if I build it myself. I have CUDA and cuDNN installed for other projects and was wondering if there was a particular way to build this so that it works in that case.


@theflu commented on GitHub (Nov 1, 2023):

1080s should work; I have it running on 1070s. What CPU are you using?


@JulioBarros commented on GitHub (Nov 1, 2023):

I have two 1080 Ti cards.

I still get: signal: illegal instruction (core dumped)


@JulioBarros commented on GitHub (Nov 1, 2023):

Wiped out my ~/.ollama directory and it worked, so maybe it was a corrupt model file or something. Thanks.


@unclamped commented on GitHub (Nov 2, 2023):

Getting this same issue on a Pentium G4560, running NixOS as my host OS, running ollama inside of a podman container through distrobox.

```
2023/11/02 20:08:04 llama.go:370: starting llama runner
2023/11/02 20:08:04 llama.go:428: waiting for llama runner to start responding
2023/11/02 20:08:04 llama.go:385: signal: illegal instruction (core dumped)
2023/11/02 20:08:04 llama.go:393: error starting llama runner: llama runner process has terminate
```

@easp commented on GitHub (Nov 18, 2023):

@unclamped, it doesn't look like that CPU has AVX instructions, which the Ollama binaries require.
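
An easy way to confirm: list the AVX-family flags the kernel sees; empty output means the CPU (or the VM's virtual CPU model) does not advertise them:

```shell
# Prints avx, avx2, etc. if present; nothing on a pre-AVX CPU such as the G4560.
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u
```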


@jmorganca commented on GitHub (Feb 23, 2024):

Hi, this should be fixed by now. Ollama supports no-AVX, AVX, and AVX2 CPUs.
