mirror of
https://github.com/moghtech/komodo.git
synced 2026-05-06 08:55:40 -05:00
[GH-ISSUE #769] Errors when batch-deploying multiple stacks using the same repo #7483
Originally created by @chrschorn on GitHub (Aug 26, 2025).
Original GitHub issue: https://github.com/moghtech/komodo/issues/769
Hey, I'm using many stacks (20+) configured from a central git repo with Komodo v1.19.1. The idea is to use a simple procedure to update all changed stacks whenever a new commit is made to the repo.
Deploying the stacks individually works fine. However, I'm running into multiple issues when trying to batch deploy:
`BatchDeployStackIfChanged` won't recognize that anything has changed. Additionally, the `compose.yaml` editor typically doesn't show up unless I recently used "Redeploy" (see screenshot). Any advice on how to properly configure this? Or perhaps I'm looking at a bug here? I have a suspicion that Komodo is trying to clone the repo many times to the same location, overwriting/deleting files in race-condition-like behavior.
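The suspected failure mode — several workers cloning into one destination, each wiping the directory before re-creating it — can be sketched in a few lines. This is only an illustration of the race hypothesis, not Komodo's actual code; `fake_clone` is a hypothetical stand-in, and the per-path lock shows how serializing access would avoid the trampling:

```python
import shutil
import tempfile
import threading
from collections import defaultdict
from pathlib import Path

# One lock per clone destination: concurrent "clones" of the same repo
# path run one at a time instead of deleting each other's files.
_path_locks = defaultdict(threading.Lock)

def fake_clone(dest: Path, files: list[str]) -> None:
    """Stand-in for a clone: wipe the destination, then re-create files."""
    with _path_locks[str(dest)]:
        if dest.exists():
            shutil.rmtree(dest)  # the dangerous window: the dir vanishes
        dest.mkdir(parents=True)
        for name in files:
            (dest / name).write_text("contents")

def run_concurrent_clones(dest: Path, n: int = 8) -> bool:
    """Launch n clones into the same path; return True if the file survived."""
    threads = [
        threading.Thread(target=fake_clone, args=(dest, ["compose.yaml"]))
        for _ in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (dest / "compose.yaml").exists()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ok = run_concurrent_clones(Path(tmp) / "repo")
        print("compose.yaml present after concurrent clones:", ok)  # True
```

Without the lock, a thread can `rmtree` the directory while another thread is between its `mkdir` and `write_text`, which would leave intermittent "no such file" failures of exactly this shape.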
@mbecker20 commented on GitHub (Aug 26, 2025):
It sounds like Komodo Core is having trouble cloning the repos. It clones in addition to the Periphery clone, so it can safely pull the repo and check for newer files compared to the Periphery clone. Is there a networking reason why your Periphery servers would be able to clone the repos, but the Komodo Core container wouldn't?
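One way to test this theory is a plain TCP reachability probe run from inside the Core container versus a Periphery host. The hostname and port below are placeholders for your git server, and this only checks network reachability, not git authentication:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # "git.example.internal" and 443 are placeholders for your git server.
    print(can_reach("git.example.internal", 443))
```

If this returns `False` inside the Core container but `True` on the Periphery hosts, the problem is at the network level rather than in Komodo's stack configuration.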
@chrschorn commented on GitHub (Aug 26, 2025):
Thank you for pointing me in the network direction! I was able to solve issue 1, but 2 still persists.
In case anyone runs into the same networking issue as me: because Komodo was running on the same VM as my reverse proxy, it was communicating with the reverse proxy via IPv6. I had configured the reverse proxy to only accept private IPv4 (but not IPv6) ranges for internal network communication. That led to Komodo not being able to access the git server at all. It wasn't very clear that Komodo was unable to clone the repo, but ultimately it was a setup issue on my end.
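For anyone debugging a similar setup, the gotcha is that an allow-list of private IPv4 ranges silently rejects every IPv6 source address, even "internal" ones like a ULA. A small Python illustration (the addresses are examples, not anyone's real config):

```python
import ipaddress

# Allow-list covering only private IPv4 ranges -- the misconfiguration
# described above. IPv6 sources (e.g. a ULA in fd00::/8) never match.
IPV4_PRIVATE_ALLOWLIST = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_allowed(source: str) -> bool:
    """True only if the source address falls in a listed IPv4 private range."""
    addr = ipaddress.ip_address(source)
    # A v6 address is never contained in a v4 network, so it falls through.
    return any(addr in net for net in IPV4_PRIVATE_ALLOWLIST)

if __name__ == "__main__":
    print(is_allowed("192.168.1.10"))  # True: IPv4 private
    print(is_allowed("fd00::1"))       # False: IPv6, despite being internal
```

The fix is either to add the relevant IPv6 ranges to the proxy's allow-list or to force the internal traffic onto IPv4.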
As for issue 2: when I use more than 5-10 stacks with a single shared "Repo" resource and use `BatchPullStack`, most (but not all) stacks fail with an error 500. The same works correctly with an individually configured repo on each stack.

@mbecker20 commented on GitHub (Aug 26, 2025):
@chrschorn there is a lot of config here; the error you see is letting you know that when you move to a repo, some other configuration is wrong, like the run directory or file path.
@Elekam commented on GitHub (Aug 28, 2025):
I have the same issue. I'm quite confused because I have 50+ stacks running, only 14 fail, and these 14 stacks are very similar to all the others. The run directory is identical, using "./application_name" with the compose.yaml inside that folder inside the repo.
I don't really see why these stacks fail while the others don't. The only difference between them is that they point at different directories (though in the exact same way) and use different env vars, which shouldn't be related to this.
I can also redeploy these stacks manually and it works fine, but they don't work when triggered through the Global Update Schedule.
Edit: I also checked the permissions of all my volume mounts and they all seem fine, and are identical between the failing and non-failing stacks.
@mbecker20 commented on GitHub (Aug 30, 2025):
@chrschorn @Elekam I understand these issues are frustrating. The next step is to see if this issue can be reproduced on my side so I can figure out what might be happening. Do you guys have any steps to reproduce the issue you can provide?
@chrschorn commented on GitHub (Aug 31, 2025):
Steps that lead to the issue on my end:
- (`stacks/immich`, `stacks/paperless`, etc.). Not sure if this is important.
- `BatchPullStack` with target `*` in a procedure

@joeknock90 commented on GitHub (Oct 6, 2025):
Just wanted to throw in that I'm also experiencing this issue, with more or less the same setup as above: ~20 stacks across 4 servers. Pulling stacks individually works great, but the procedure fails with the same 500 error.
@durandguru commented on GitHub (Oct 20, 2025):
I have the same. 4 servers. On my two Intel servers the Batch Pull Stack procedure always works. I also have two servers in Oracle Cloud running Ubuntu on arm64; those two always fail with the Batch Pull Stack procedure. Manually pulling stacks works, but only when I select a few at a time, not all stacks at once (one server has 12 stacks, the other 22).
Running on 1.19.5
Error in Batch Pull Stack
ERROR: Failed stage 'Stage 1' execution after 59.99791408s
TRACE:
1: ERROR: Failed on PullStack(PullStack { stack: "starbase80-911", services: [] })
2: ERROR: execution not successful. see update '68f62c72353a32f10d52ba27'
Error in stack
ERROR: Failed at PullStack
TRACE:
1: 500 Internal Server Error
2: Failed to validate run directory on host after stack write (canonicalize error)
3: No such file or directory (os error 2)
Other Error (the compose file is in the shared git repo)
ERROR: Failed at PullStack
TRACE:
1: 500 Internal Server Error
2: Missing compose file at compose.yaml
Doing a manual pull does work without a problem.
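The `canonicalize error` with `os error 2` in the first trace fits the race theory from earlier in the thread: canonicalizing a path fails with "No such file or directory" exactly when the directory was removed out from under the check, e.g. by a concurrent clone wiping the destination. A minimal Python analogue — `Path.resolve(strict=True)` behaves like Rust's `std::fs::canonicalize`, and `validate_run_directory` plus the `stacks/immich` path are illustrative names, not Komodo internals:

```python
import shutil
import tempfile
from pathlib import Path

def validate_run_directory(run_dir: Path) -> Path:
    """Canonicalize the run directory; fails if it no longer exists."""
    # strict=True raises FileNotFoundError (os error 2) when any
    # component of the path is missing, like Rust's fs::canonicalize.
    return run_dir.resolve(strict=True)

if __name__ == "__main__":
    tmp = Path(tempfile.mkdtemp())
    run_dir = tmp / "stacks" / "immich"
    run_dir.mkdir(parents=True)
    print(validate_run_directory(run_dir))  # succeeds while the dir exists

    shutil.rmtree(run_dir)  # another clone wipes it mid-run
    try:
        validate_run_directory(run_dir)
    except FileNotFoundError as e:
        print("canonicalize error, errno:", e.errno)  # errno 2
    shutil.rmtree(tmp)
```

The second trace ("Missing compose file at compose.yaml") would be the same window seen from the other side: the file check runs while the shared clone directory is in its wiped state.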