[GH-ISSUE #1869] Provide instructions for running Ollama as a service in a GitLab CI/CD job #63108

Closed
opened 2026-05-03 12:07:55 -05:00 by GiteaMirror · 11 comments

Originally created by @Bengt on GitHub (Jan 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1869

I have a GitLab CI/CD job that runs my tests like this:

```yaml
run-pytest-python3.11:
    needs:
        -   build-pytest-python3.11
    coverage: '/(?i)total.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$/'
    image:
        # yamllint disable-line rule:line-length
        name: registry.gitlab.com/openknowledge-gmbh/projects/ml-platform/1-llm-chatbot:pytest-python3.11-latest
        entrypoint: [""]
    script:
        -   cd /1-llm-chatbot
        -   PYTHONPATH=. venv/bin/python -m pytest --cov=llm_chatbot
```

My test suite includes tests of the code that uses the Ollama server to answer the test requests. How do I run `ollama serve` as a [GitLab service](https://docs.gitlab.com/ee/ci/services/)?


@aliirz commented on GitHub (Jan 9, 2024):

Have you considered running Ollama as a Docker container and then exposing it as a GitLab service?


@mxyng commented on GitHub (Jan 9, 2024):

+1 to running as a GitLab service.

Here's an (unvetted) example:

```yaml
my-job:
  services:
    - name: ollama/ollama:0.1.19
      alias: ollama
  script:
    - nc -vz ollama 11434
```
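
If `nc` isn't available in the job image, the same reachability check can be done over HTTP. This is only a sketch under that assumption; the server's root endpoint replies with a plain "Ollama is running":

```yaml
my-job:
  services:
    - name: ollama/ollama:0.1.19
      alias: ollama
  script:
    # Hit the service through its alias; a successful response confirms the server is reachable.
    - curl -sf http://ollama:11434/
```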

@Bengt commented on GitHub (Jan 10, 2024):

When I try `ollama pull`, I get the following error:

```
Error: could not connect to ollama server, run 'ollama serve' to start it
```

@Bengt commented on GitHub (Jan 10, 2024):

I have added an entrypoint to the ollama service like so, but that does not help, either:

```yaml
services:
    -   alias: ollama
        name: ollama/ollama:0.1.19
        entrypoint: ["ollama", "serve"]
```

@Bengt commented on GitHub (Jan 10, 2024):

Setting `OLLAMA_HOST` did not help, either.


@Bengt commented on GitHub (Jan 10, 2024):

Activating the service logging like so:

https://docs.gitlab.com/ee/ci/services/#capturing-service-container-logs

```yaml
variables:
  CI_DEBUG_SERVICES: "true"
```

Gives this:

```
Waiting for services to be up and running (timeout 30 seconds)...
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.835151997Z Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.837068506Z Your new public key is:
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.837079866Z
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.837084625Z ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOzRULmG6k0cYtK68EBuTNrFNw0CvVI/sVR8XqaHSj5F
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.837088825Z
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.838342722Z 2024/01/10 19:38:24 images.go:808: total blobs: 0
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.838445250Z 2024/01/10 19:38:24 images.go:815: total unused blobs removed: 0
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.838710697Z 2024/01/10 19:38:24 routes.go:930: Listening on [::]:11434 (version 0.1.19)
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:25.495087885Z 2024/01/10 19:38:25 shim_ext_server.go:142: Dynamic LLM variants [cuda]
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:25.495117114Z 2024/01/10 19:38:25 gpu.go:35: Detecting GPU type
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:25.495682538Z 2024/01/10 19:38:25 gpu.go:40: CUDA not detected: Unable to load libnvidia-ml.so library to query for Nvidia GPUs: /usr/lib/wsl/lib/libnvidia-ml.so.1: cannot open shared object file: No such file or directory
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:25.495820096Z 2024/01/10 19:38:25 gpu.go:46: ROCm not detected: Unable to load librocm_smi64.so library to query for Radeon GPUs: /opt/rocm/lib/librocm_smi64.so: cannot open shared object file: No such file or directory
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:25.495838166Z 2024/01/10 19:38:25 routes.go:953: no GPU detected
```

@Bengt commented on GitHub (Jan 10, 2024):

Do I maybe need to configure the web origin hosts?

https://github.com/jmorganca/ollama/blob/main/docs/faq.md#how-can-i-allow-additional-web-origins-to-access-ollama


@mxyng commented on GitHub (Jan 10, 2024):

Logs indicate the service is up and running and serving on 0.0.0.0:

```
[service:ollama__ollama-ollama-ollama-ollama] 2024-01-10T19:38:24.838710697Z 2024/01/10 19:38:24 routes.go:930: Listening on [::]:11434 (version 0.1.19)
```

Keep in mind the address GitLab exposes is `alias:port`, so `OLLAMA_HOST` must be set for the client, like this: `OLLAMA_HOST=ollama:11434 ollama pull`. While the port (11434) should be exposed by default, it's possible GitLab requires it to be set explicitly.
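
Applied to the job in question, that hint would look roughly like the sketch below. It assumes the `ollama` CLI is installed in the job image and that the test code also honours `OLLAMA_HOST`; the model name is only an example:

```yaml
run-pytest-python3.11:
    services:
        -   name: ollama/ollama:0.1.19
            alias: ollama
    variables:
        # Point both the ollama CLI and any client library used by the tests at the service container.
        OLLAMA_HOST: ollama:11434
    script:
        -   ollama pull llama2
        -   cd /1-llm-chatbot
        -   PYTHONPATH=. venv/bin/python -m pytest --cov=llm_chatbot
```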


@Bengt commented on GitHub (Jan 11, 2024):

I couldn't make the service work, so for now, I am managing the Ollama server in a bash process like so:

```yaml
pytest-python3.11:
    before_script:
        -   apt update
        -   apt install --yes curl
        -   curl https://ollama.ai/install.sh | sh
    coverage: '/(?i)total.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$/'
    image:
        # yamllint disable-line rule:line-length
        name: registry.gitlab.com/openknowledge-gmbh/projects/ml-platform/1-llm-chatbot:pytest-python3.11-latest
        entrypoint: [""]
    script:
        -   ollama serve &
        -   pid=$!
        -   export OLLAMA_HOST=localhost
        -   export PYTHONPATH=.
        -   cd /1-llm-chatbot
        #   Wait for ollama to start
        -   sleep 60
        -   ollama pull llama2
        -   venv/bin/python -m pytest --cov=llm_chatbot
        -   kill $pid
```
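
One possible refinement of this workaround, assuming curl remains available from the `before_script` step: poll the local API until the server answers instead of sleeping for a fixed 60 seconds. Only the `script` section changes; the rest of the job stays the same.

```yaml
    script:
        -   ollama serve &
        -   pid=$!
        -   export OLLAMA_HOST=localhost
        -   export PYTHONPATH=.
        -   cd /1-llm-chatbot
        #   Poll until the server responds (the root endpoint returns "Ollama is running"), up to ~60 seconds
        -   for i in $(seq 1 30); do curl -sf http://localhost:11434/ && break; sleep 2; done
        -   ollama pull llama2
        -   venv/bin/python -m pytest --cov=llm_chatbot
        -   kill $pid
```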

@Bengt commented on GitHub (Jan 11, 2024):

Thanks for the hint, @mxyng! I think I did that at some point, but maybe I was mistaken. I will try that again later.


@DevOpsJeremy commented on GitHub (Feb 1, 2025):

This is the solution I came up with. Run `ollama serve` as a service, then pull the model via the Ollama API in the `before_script`:

```yaml
ollama:
  services:
    - name: ollama/ollama
      alias: ollama
      command: ["serve"]
  before_script:
    - curl ollama:11434/api/pull -sS -X POST -d "{\"model\":\"llama3.2\",\"stream\":false}"
  script: curl -sS ollama:11434
```