[GH-ISSUE #12262] Independent Model downloads #54664

Closed
opened 2026-04-29 06:47:41 -05:00 by GiteaMirror · 6 comments

Originally created by @walnut-co on GitHub (Sep 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12262

I am using an Azure Container Apps environment with Docker images deployed via CI/CD tooling. I am targeting a faster container start time, so it would be good if models could be downloaded separately.

Current way

- run the pipeline to deploy Ollama with ENV "OLLAMA_MODELS" set to the mounted folder
- call /api/pull to download the model
- start using the API

New Way

- run the pipeline to download models to an Azure file share
- run the pipeline to deploy Ollama with ENV "OLLAMA_MODELS" set to the mounted folder (a restart is required)
- start using the API

The new way would have other benefits: a new model can be deployed faster, outside of the container, and it helps with quick scaling.
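A minimal sketch of the proposed flow, assuming the Azure file share is mounted at /mnt/share and the stock ollama/ollama image; the mount point, ports, and model name are illustrative, not a tested deployment:

```shell
# Pipeline stage 1: populate the file share ahead of deployment by running
# a throwaway Ollama server against it and pulling the model into it.
OLLAMA_MODELS=/mnt/share/models ollama serve &
SERVER_PID=$!
sleep 2                      # give the server a moment to come up
ollama pull qwen2.5:0.5b     # blobs and manifests land on the share
kill "$SERVER_PID"

# Pipeline stage 2: deploy the container with the share mounted and
# OLLAMA_MODELS pointing at it, so startup needs no download.
docker run -d -p 11434:11434 \
  -v /mnt/share/models:/models \
  -e OLLAMA_MODELS=/models \
  ollama/ollama
```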

GiteaMirror added the feature request label 2026-04-29 06:47:41 -05:00

@rick-github commented on GitHub (Sep 12, 2025):

```sh
#!/bin/bash

die(){
  echo "$1"
  exit 1
}

_=$(command -v jq) || die "Need jq"
_=$(command -v curl) || die "Need curl"

! PARSED=$(getopt --options=n --longoptions=dryrun --name "$0" -- "$@")
[[ ${PIPESTATUS[0]} -ne 0 ]] && die "Parsing failed"
eval set -- "$PARSED"

DRYRUN=

while true; do
  case "$1" in
    -n|--dryrun)
      DRYRUN=echo
      shift
      ;;
    --)
      shift
      break
      ;;
    *)
      die "Parsing failed"
      ;;
  esac
done

[ -z "$*" ] && die "usage: $0 modelname [modelname ...]"

pull_model() {
  model="$1"
  registry=registry.ollama.ai
  library=library
  # Split "registry/library/name:tag" into its parts, with defaults.
  name="${model%:*}" ; name="${name##*/}"
  [[ "$model" = */*/* ]] && registry="${model%%/*}"
  [[ "$model" = */* ]] && { library="${model%/*}" ; library="${library#*/}" ; }
  [[ "$model" = *:* ]] && tag="${model##*:}" || tag=latest

  OLLAMA_MODELS=${OLLAMA_MODELS-/usr/share/ollama/.ollama/models}

  cd "$OLLAMA_MODELS" || die "Couldn't cd to OLLAMA_MODELS ($OLLAMA_MODELS)"
  [ ! -d blobs -o ! -d manifests ] && die "Missing blobs or manifests directory"
  manifest_dir="manifests/$registry/$library/$name"
  [ -e "$manifest_dir/$tag" ] && { echo "$name:$tag already exists" ; return ; }
  [ ! -d "$manifest_dir" ] && { $DRYRUN mkdir -p "$manifest_dir" || die "Couldn't mkdir manifest dir ($manifest_dir)" ; }

  [ -n "$DRYRUN" ] && echo curl -sL "https://$registry/v2/$library/$name/manifests/$tag"
  manifest=$(curl -sL "https://$registry/v2/$library/$name/manifests/$tag") || die "Couldn't fetch manifest"
  errors=$(jq -cn "$manifest | .errors")
  [ "$errors" = "null" ] || die "$errors"

  config=$(jq -rn "$manifest | .config.digest") || die "No config digest"

  $DRYRUN curl -#L -C - -o "blobs/${config/:/-}" "https://$registry/v2/$library/$name/blobs/$config" || die "Couldn't fetch config blob"

  for layer in $(jq -rn "$manifest | .layers[].digest") ; do
    $DRYRUN curl -#L -C - -o "blobs/${layer/:/-}" "https://$registry/v2/$library/$name/blobs/$layer" || die "Couldn't fetch layer"
  done

  [ -n "$DRYRUN" ] && echo "echo '$manifest' > '$manifest_dir/$tag'" || { echo "$manifest" > "$manifest_dir/$tag" || die "Couldn't write manifest" ; }
}

for model in "$@" ; do
  pull_model "$model"
done
```
```console
$ mkdir -p /tmp/models/{blobs,manifests}
$ OLLAMA_MODELS=/tmp/models ollama-pull.sh qwen2.5:0.5b
$ find /tmp/models
/tmp/models/
/tmp/models/blobs
/tmp/models/blobs/sha256-c5396e06af294bd101b30dce59131a76d2b773e76950acc870eda801d3ab0515
/tmp/models/blobs/sha256-005f95c7475154a17e84b85cd497949d6dd2a4f9d77c096e3c66e4d9c32acaf5
/tmp/models/blobs/sha256-832dd9e00a68dd83b3c3fb9f5588dad7dcf337a0db50f7d9483f310cd292e92e
/tmp/models/blobs/sha256-eb4402837c7829a690fa845de4d7f3fd842c2adee476d5341da8a46ea9255175
/tmp/models/blobs/sha256-66b9ea09bd5b7099cbb4fc820f31b575c0366fa439b08245566692c6784e281e
/tmp/models/manifests
/tmp/models/manifests/registry.ollama.ai
/tmp/models/manifests/registry.ollama.ai/library
/tmp/models/manifests/registry.ollama.ai/library/qwen2.5
/tmp/models/manifests/registry.ollama.ai/library/qwen2.5/0.5b
```
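Since the script downloads with `curl -C -` (resumable), an interrupted run can leave truncated blobs behind. One way to sanity-check a populated share, sketched here as a hypothetical helper (`verify_blobs` is not part of the script above), is to verify each blob against the sha256 digest encoded in its filename:

```shell
#!/bin/sh
# verify_blobs DIR: compare each blob's actual sha256 against the digest
# encoded in its filename (blobs are named "sha256-<digest>").
verify_blobs() {
  dir="$1"
  for f in "$dir"/sha256-*; do
    [ -e "$f" ] || continue
    want="${f##*/sha256-}"
    got=$(sha256sum "$f" | awk '{print $1}')
    if [ "$want" = "$got" ]; then
      echo "OK  ${f##*/}"
    else
      echo "BAD ${f##*/}"
    fi
  done
}
```

For example, `verify_blobs /tmp/models/blobs`; any `BAD` line means that blob should be deleted and re-downloaded.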

@walnut-co commented on GitHub (Sep 13, 2025):

@rick-github thanks for the script. I gave it a go and it works.

I noticed that doing it this way, the log line below appears when the first request is sent, and that first request takes a while:

`stderr F load_tensors: loading model tensors, this can take a while... (mmap = true)`

Any suggestions?


@rick-github commented on GitHub (Sep 13, 2025):

This is normal. The model has to be loaded into VRAM before it can be used, so the first request has to wait for that to happen before the model can be run.
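If that first-request delay matters, a deployment pipeline can warm the model right after the container starts. A hedged sketch, relying on Ollama's documented behavior that a generate request without a prompt pre-loads the model (host and model name are illustrative):

```shell
# Send an empty generate request so the model is loaded into VRAM before
# real traffic arrives; keep_alive -1 keeps it resident indefinitely.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "qwen2.5:0.5b", "keep_alive": -1}'
```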


@walnut-co commented on GitHub (Sep 14, 2025):

@rick-github considering the model has to be loaded into VRAM, should we ideally architect one model per container? Any thoughts?


@rick-github commented on GitHub (Sep 14, 2025):

If you want one model per container, install just one model.


@pdevine commented on GitHub (Sep 15, 2025):

I think this is answered (thanks @rick-github!), so I'll go ahead and close the issue.


Reference: github-starred/ollama#54664