Gitea Running as Pod in Kubernetes invoking OOM/bad pack header? #10617

Open
opened 2025-11-02 09:12:49 -06:00 by GiteaMirror · 6 comments

Originally created by @wattsap on GitHub (Apr 8, 2023).

Description

hello,

I am seeing an interesting issue with cloning a repo from gitea hosted on K3s. I recently migrated from a standalone docker VM as part of a larger migration.

The repo in question is 694 MiB, and after the migration, attempts to clone it to another VM outside the K3s cluster fail with the error below:

/usr/bin/git clone --origin origin 'https://user:pass@gitea.company.net/user/repo.git' /var/lib/awx/projects/_8__homelab
Cloning into '/var/lib/awx/projects/_8__homelab'...
remote: Enumerating objects: 4997, done.
fatal: the remote end hung up unexpectedly
fatal: protocol error: bad pack header

The pod's CPU and memory limits are pretty reasonable:

containers:
  - name: gitea
    image: gitea/gitea:{{ gitea_version }}
    resources:
      requests:
        memory: 500M
        cpu: 200m
      limits:
        memory: 2000M
        cpu: 2000m

and when the clone is attempted, I can see it is not trying to cross those thresholds (the green line indicates the resource requests, not the resource limits):

Screenshot 2023-04-08 at 1 11 19 PM

Screenshot 2023-04-08 at 1 11 37 PM

Thinking it might be something with the ingress-nginx controller, I tested from other pods in the cluster and got the same result hitting the ip of the clusterip service directly, bypassing the ingress/external routing.
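One more data point worth gathering at this stage: git's own transport tracing shows exactly where the pack stream dies. The sketch below demonstrates the flag on a throwaway local repo (all paths are illustrative); pointing the same `GIT_TRACE_PACKET=1 git clone ...` at the failing HTTPS remote captures the last packets exchanged before the "bad pack header" error.

```shell
# Demo of git packet tracing on a throwaway repo (paths illustrative).
git init -q /tmp/trace-src
git -C /tmp/trace-src -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m init

# file:// forces the real upload-pack wire protocol, unlike a plain path,
# so the trace shows the same packet exchange an HTTPS clone would use.
GIT_TRACE_PACKET=1 git clone -q file:///tmp/trace-src /tmp/trace-dst 2>/tmp/trace.log
```

Against the real server, the last lines of the trace reveal whether the remote closed the stream mid-pack or never started sending it.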

In the gitea pod logs, I see the below during the clone attempt:
2023/04/08 17:06:52 [64319e8c-24] router: slow POST /user/repo.git/git-upload-pack for 10.43.3.3:0, elapsed 3981.0ms @ repo/http.go:492(repo.ServiceUploadPack)
2023/04/08 17:06:59 [64319f33] router: completed GET / for 10.43.1.248:57104, 200 OK in 15.6ms @ web/home.go:33(web.Home)
2023/04/08 17:07:09 [64319f3d] router: completed GET / for 10.43.1.248:33618, 200 OK in 4.8ms @ web/home.go:33(web.Home)
2023/04/08 17:07:19 [64319f47] router: completed GET / for 10.43.1.248:57398, 200 OK in 54.5ms @ web/home.go:33(web.Home)
2023/04/08 17:07:29 [64319f51] router: completed GET / for 10.43.1.248:35500, 200 OK in 4.5ms @ web/home.go:33(web.Home)
2023/04/08 17:07:37 [64319f59] router: completed GET / for 10.43.3.3:0, 200 OK in 32.6ms @ web/home.go:33(web.Home)
2023/04/08 17:07:39 [64319f5b] router: completed GET / for 10.43.1.248:39398, 200 OK in 4.0ms @ web/home.go:33(web.Home)
2023/04/08 17:07:49 [64319f65] router: completed GET / for 10.43.1.248:60854, 200 OK in 4.9ms @ web/home.go:33(web.Home)
2023/04/08 17:07:50 [64319f28-3] router: completed POST /user/repo.git/git-upload-pack for 10.43.3.3:0, 200 OK in 61683.7ms @ repo/http.go:492(repo.ServiceUploadPack)

Based on various general googling, I have made the following changes to the config for the repo in /data/git/repositories/user/repo.git:
bash-5.1# cat config
[core]
repositoryformatversion = 0
filemode = true
bare = true
packedGitLimit = 256m

[pack]
windowMemory = 100m
packSizeLimit = 100m
threads = "1"

[http]
postBuffer = 200000000
bash-5.1#
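For reference, the same settings can be applied with `git config` rather than hand-editing the file (a sketch on a throwaway bare repo; the path is illustrative, on the pod it would be the repo under /data/git/repositories). One caveat worth knowing: `http.postBuffer` is a client-side setting that mainly affects pushes over smart HTTP, so setting it in the server-side repo config likely has no effect on clones.

```shell
# Apply the pack-memory limits via git config on a throwaway bare repo
# (path is illustrative, not the actual Gitea data directory).
git init -q --bare /tmp/demo.git
git -C /tmp/demo.git config core.packedGitLimit 256m
git -C /tmp/demo.git config pack.windowMemory 100m
git -C /tmp/demo.git config pack.packSizeLimit 100m
git -C /tmp/demo.git config pack.threads 1
git -C /tmp/demo.git config --get pack.windowMemory   # -> 100m
```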

The repo itself appears to be healthy; running git fsck in the gitea pod for the repo comes back successfully:

bash-5.1# git fsck --full
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3271/3271), done.
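A note for readers debugging the same thing: fsck checks object integrity but says nothing about pack layout. `git count-objects -v` reports how many packfiles exist and their on-disk size, which is useful context when tuning pack.packSizeLimit. A minimal sketch (demoed on a fresh repo; run the same command inside the failing repo on the pod):

```shell
# Summarize object/pack counts and on-disk size; inside the real repo,
# "size-pack" shows how much data upload-pack has to stream on a clone.
git init -q --bare /tmp/stats.git
git -C /tmp/stats.git count-objects -v
```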

I'm not really sure where to go from here from a debugging perspective; I'm afraid I don't know enough about the git protocol in general. Is there a tunable I am missing in the gitea config?

Thank you very much for your time.

Gitea Version

1.18.1

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

2.25.1

Operating System

Kubernetes

How are you running Gitea?

bare metal k3s cluster, gitea docker image gitea/gitea:1.18.1, postgres as backend DB. gitea and postgres are running as separate statefulsets, each with PVCs that are made from PVs mounting NFS shares.

Previous working configuration was single VM using docker-compose and the same gitea/postgres containers, mounting the same NFS shares as docker named volumes.

Database

PostgreSQL

GiteaMirror added the type/bug label 2025-11-02 09:12:49 -06:00

@wxiaoguang commented on GitHub (Apr 9, 2023):

> I have made the following changes based on various general googling to the config for the repo in /data/git/repositories/user/repo.git ( git config )

I guess it doesn't help, right?


@wattsap commented on GitHub (Apr 9, 2023):

prior to adding the [pack] section of the config, I was seeing signal 9 errors in the gitea server logs, which is what made me think it was OOM - after changing the config those messages are gone, which is encouraging, but client-side the result is still the same during a clone


@wxiaoguang commented on GitHub (Apr 9, 2023):

I haven't tested and have no idea how to fine-tune the config at the moment (sorry), just sharing some thoughts: it seems the git process itself hits the OOM (otherwise the Gitea process would have been killed). Gitea executes the git command to serve repository content when cloning; if that git process triggers OOM and gets killed, the client sees a broken connection / protocol error. Maybe Gitea also consumes some amount of memory, so there is less free memory for git than before?

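One way to confirm this theory (a sketch; cgroup v1 and v2 expose different files, so both paths are probed) is to check the cgroup's OOM counters from inside the pod right after a failed clone:

```shell
# Print whichever cgroup OOM accounting file exists in this container.
# Under v1, memory.oom_control includes an oom_kill_disable/under_oom line;
# under v2, memory.events has an "oom_kill" counter. A nonzero counter
# means the kernel killed a process in this cgroup.
for f in /sys/fs/cgroup/memory/memory.oom_control /sys/fs/cgroup/memory.events; do
  if [ -r "$f" ]; then
    echo "== $f"
    cat "$f"
  fi
done
# On the node itself, the kernel log records OOM kills, e.g. lines like
# "Killed process ... (git)" in dmesg output.
```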

@wattsap commented on GitHub (Apr 11, 2023):

I wondered that also, but I was running top in the pod during the clone and it still had plenty of memory:

Mem: 16228700K used, 165240K free, 41992K shrd, 161464K buff, 13025436K cached
CPU:  52% usr   4% sys   0% nic   0% idle  42% io   0% irq   0% sirq
Load average: 1.77 0.68 0.30 4/484 122
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
  121   120 git      R    1513m   9%   1  43% /usr/libexec/git-core/git pack-objects --revs --thin --stdout --progress --delta-base-offset
   18    16 git      S     853m   5%   0   0% /usr/local/bin/gitea web
  120    18 git      S     5428   0%   1   0% /usr/bin/git -c protocol.version=2 -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= upload-pack --stateless-rpc /data/git/repositories/user/repo.git
   17    15 root     S     4632   0%   1   0% sshd: /usr/sbin/sshd -D -e [listener] 0 of 10-100 startups
   75    68 root     S     2596   0%   0   0% bash
  111   104 root     S     2592   0%   1   0% bash

it doesn't seem like the container OS is running out
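Worth noting for anyone reading along: `top` inside a container reads the host's /proc/meminfo, so the ~16 GiB figure above reflects node memory, not the pod's 2000M limit. What the kernel actually enforces for the container is in the cgroup files (a sketch; v1 paths first, v2 fallbacks):

```shell
# Show the memory limit and current usage the kernel enforces for this
# container's cgroup. These, not top's host-wide numbers, determine
# whether pack-objects can be OOM-killed inside the pod.
for f in /sys/fs/cgroup/memory/memory.limit_in_bytes /sys/fs/cgroup/memory.max \
         /sys/fs/cgroup/memory/memory.usage_in_bytes /sys/fs/cgroup/memory.current; do
  if [ -r "$f" ]; then echo "$f: $(cat "$f")"; fi
done
```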


@wxiaoguang commented on GitHub (Apr 11, 2023):

I see, can you try changing the kernel setting vm.overcommit_memory=1?


@wattsap commented on GitHub (Apr 11, 2023):

I didn't set it that way, but it looks like it already is:

bash-5.1# cat /proc/sys/vm/overcommit_memory 
1
Reference: github-starred/gitea#10617