During git clone, gitea fails with accept4: too many open files; retrying in 1s #15023

Closed
opened 2025-11-02 11:28:14 -06:00 by GiteaMirror · 23 comments

Originally created by @galets on GitHub (Oct 13, 2025).

Description

While pulling a large repository with LFS, the Gitea instance stops responding and the website shows "500 Internal Server Error". The log initially shows:

Oct 13 14:20:34 gitea gitea[75]: 2025/10/13 14:20:34 HTTPRequest [I] router: completed GET /api/internal/repo/galets/documents.git/info/lfs/objects/9634ecdcbb32d105bb617de1bd6e93bc4d78bef0f76a8890f862b3f56ce6056e for 10.17.0.2:0, 200 OK in 94.2ms @ lfs/server.go:83(lfs.DownloadHandler)
Oct 13 14:20:34 gitea gitea[75]: 2025/10/13 14:20:34 HTTPRequest [I] router: completed GET /api/internal/repo/galets/documents.git/info/lfs/objects/97968b00ff273fd1a85e53a793befba5a6e6dd0ad8486998f156b5c73cdbf484 for 10.17.0.2:0, 200 OK in 152.2ms @ lfs/server.go:83(lfs.DownloadHandler)
Oct 13 14:20:35 gitea gitea[75]: 2025/10/13 14:20:35 HTTPRequest [W] router: slow      GET /api/internal/repo/galets/documents.git/info/lfs/objects/8583e862b3ed1b0a6b7519dc237383ae58c1d97aac85acca0ccc0bf04a949e81 for 10.17.0.2:0, elapsed 3475.9ms @ lfs/server.go:83(lfs.DownloadHandler)
Oct 13 14:20:38 gitea gitea[75]: 2025/10/13 14:20:38 HTTPRequest [I] router: completed GET /api/internal/repo/galets/documents.git/info/lfs/objects/8583e862b3ed1b0a6b7519dc237383ae58c1d97aac85acca0ccc0bf04a949e81 for 10.17.0.2:0, 200 OK in 6255.2ms @ lfs/server.go:83(lfs.DownloadHandler)
Oct 13 14:20:38 gitea gitea[75]: 2025/10/13 14:20:38 HTTPRequest [W] router: slow      GET /api/internal/repo/galets/documents.git/info/lfs/objects/983f1289778dfa7964d4953f9312270353f958a14ca87b56c5ef6c6669fd1d25 for 10.17.0.2:0, elapsed 3234.2ms @ lfs/server.go:83(lfs.DownloadHandler)

Then:

Oct 13 14:14:28 gitea gitea[75]: 2025/10/13 14:14:28 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:29 gitea gitea[75]: 2025/10/13 14:14:29 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:30 gitea gitea[75]: 2025/10/13 14:14:30 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:31 gitea gitea[75]: 2025/10/13 14:14:31 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:32 gitea gitea[75]: 2025/10/13 14:14:32 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:33 gitea gitea[75]: 2025/10/13 14:14:33 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:34 gitea gitea[75]: 2025/10/13 14:14:34 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:35 gitea gitea[75]: 2025/10/13 14:14:35 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:36 gitea gitea[75]: 2025/10/13 14:14:36 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:37 gitea gitea[75]: 2025/10/13 14:14:37 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:38 gitea gitea[75]: 2025/10/13 14:14:38 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:39 gitea gitea[75]: 2025/10/13 14:14:39 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:40 gitea gitea[75]: 2025/10/13 14:14:40 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s
Oct 13 14:14:41 gitea gitea[75]: 2025/10/13 14:14:41 modules/log/misc.go:71:(*loggerToWriter).Write() [I] http: Accept error: accept tcp [::]:3000: accept4: too many open files; retrying in 1s

git clone shows:

$ git clone ssh://gitea@*********/galets/documents.git
Cloning into 'documents'...
remote: Enumerating objects: 36549, done.
remote: Counting objects: 100% (36549/36549), done.
remote: Compressing objects: 100% (24795/24795), done.
remote: Total 36549 (delta 10668), reused 36495 (delta 10648), pack-reused 0 (from 0)
Receiving objects: 100% (36549/36549), 10.79 MiB | 12.14 MiB/s, done.
Resolving deltas: 100% (10668/10668), done.
Updating files: 100% (15543/15543), done.
Filtering content:  31% (4266/13761), 12.21 GiB | 17.92 MiB/s

Once the gitea server goes into that condition, it does not self-recover.

Also:

$ ulimit -u
578039
$ cat /etc/systemd/system.conf | grep NOFILE
DefaultLimitNOFILE=65535:524288
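As an aside, `DefaultLimitNOFILE` in `system.conf` is only systemd's default; what matters is the limit the running gitea process actually inherited, which `/proc` reports directly. A minimal sketch (the current shell stands in here for `$(pidof gitea)`, which is an assumption about your setup):

```shell
# Inspect a process's effective open-files limit and its current fd count.
# Substitute the real gitea pid, e.g. pid=$(pidof gitea).
pid=$$
grep 'Max open files' /proc/"$pid"/limits
echo "currently open: $(ls /proc/"$pid"/fd | wc -l)"
```

If "currently open" climbs steadily toward the soft limit during a clone, something is leaking descriptors.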

Gitea Version

v1.24.6

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

Ubuntu 22.04 / Debian 13

How are you running Gitea?

Gitea is running in systemd container (nspawn). Host is running Ubuntu 22.04. Container is running Debian 13.

Database

SQLite

GiteaMirror added the type/bug label 2025-11-02 11:28:14 -06:00

@wxiaoguang commented on GitHub (Oct 14, 2025):

It seems that there is some resource leaking.


@wxiaoguang commented on GitHub (Oct 14, 2025):

Propose a PR to try to fix: LFS SSH: Fix missing Close when error occurs and add more tests #35658

If you can collect more information (for example: when error occurs, what are the leaked resources? file description or tcp socket connection), it would be very helpful.


@galets commented on GitHub (Oct 14, 2025):

I have an environment where the issue is reproducible. Can you write up the steps I need to follow to collect the data you want?

Also, would you like me to deploy the fix you proposed and see if it resolves the issue?


@wxiaoguang commented on GitHub (Oct 14, 2025):

Some tools like ss or lsof can help to list the open FD or TCP sockets in a process.

For example: ss -anp and lsof -anp $(pidof gitea)

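A lighter-weight alternative to a full `lsof` dump is to count a process's descriptors straight from `/proc`, and to see how many of them are sockets (the resource that turns out to leak here). A sketch, with the current shell standing in for `$(pidof gitea)`:

```shell
# Quick fd census for one process, without lsof.
# Substitute the real gitea pid, e.g. pid=$(pidof gitea).
pid=$$
echo "open fds: $(ls /proc/"$pid"/fd | wc -l)"
# fd symlinks to sockets look like "socket:[12345]"
echo "sockets:  $(ls -l /proc/"$pid"/fd | grep -c 'socket:')"
```

Running this in a loop during a clone shows whether the growth is in sockets or in regular files.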

@galets commented on GitHub (Oct 14, 2025):

I dumped the whole lsof, as there weren't many other processes in the container

(Update: deleted the log attachment)

Please let me know if you need anything else. I will restart gitea in a few hours


@galets commented on GitHub (Oct 14, 2025):

Also, something occurred to me that is worth mentioning: a lot of the exhausted connections belong to the web UI process, not the workers:

# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 Oct13 ?        00:00:01 /usr/lib/systemd/systemd
root          16       1  0 Oct13 ?        00:00:42 /usr/lib/systemd/systemd-journald
systemd+      42       1  0 Oct13 ?        00:00:01 /usr/lib/systemd/systemd-networkd
root          65       1  0 Oct13 ?        00:00:00 /usr/sbin/cron -f
message+      66       1  0 Oct13 ?        00:00:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root          67       1  0 Oct13 ?        00:00:00 /usr/lib/systemd/systemd-logind
gitea         69       1 21 Oct13 ?        07:05:42 /usr/local/lib/gitea/gitea web --config /usr/local/var/gitea/custom/conf/app.ini
root          87       1  0 Oct13 pts/0    00:00:00 /sbin/agetty -o -- \u --noreset --noclear --keep-baud 115200,57600,38400,9600 - vt220
root          90       1  0 Oct13 ?        00:00:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root         228       1  0 Oct13 ?        00:00:00 /usr/lib/systemd/systemd --user
root         230     228  0 Oct13 ?        00:00:00 (sd-pam)
gitea       2128       1  0 23:22 ?        00:00:00 /usr/lib/systemd/systemd --user
gitea       2130    2128  0 23:22 ?        00:00:00 (sd-pam)
gitea       2203       1  1 23:22 ?        00:00:31 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2230       1  0 23:22 ?        00:00:18 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2257       1  0 23:22 ?        00:00:14 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2286       1  0 23:22 ?        00:00:19 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2311       1  0 23:22 ?        00:00:15 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2333       1  1 23:22 ?        00:00:22 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2359       1  0 23:22 ?        00:00:19 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
gitea       2385       1  1 23:22 ?        00:00:43 /usr/local/lib/gitea/gitea --config=/usr/local/var/gitea/custom/conf/app.ini serv key-4
root        2442       1  0 23:38 pts/2    00:00:00 /bin/bash -l
root        2444    2442  0 23:38 pts/2    00:00:00 (sd-
root        2573    2442  0 23:59 pts/2    00:00:00 ps -ef

@wxiaoguang commented on GitHub (Oct 15, 2025):

Hmm, I just realized that it is difficult to "catch" a snapshot when the resource leaking occurs.

The reason is that the leaking occurs in the "git hook gitea subcommand" (not the gitea server). When the leaking occurs, the git client is likely to exit soon, then the "git hook gitea subcommand" also exits, and everything recovers. It takes quite good timing to catch the resource leaking in the "git hook gitea subcommand".


@wxiaoguang commented on GitHub (Oct 15, 2025):

I dumped the whole lsof, as there weren't many other processes in the container

gitea-lsof.txt.gz

Please let me know if you need anything else. I will restart gitea in a few hours

OK, I think we now know the problem is caused by the gitea-subcommand-to-gitea-server connection.

There are plenty of lines like this:

gitea       69   81 gitea             gitea 3459u     IPv6           31151935       0t0        TCP localhost:3000->localhost:36762 (ESTABLISHED)

I guess "Fix missing Close when error occurs and add more tests #35658" might be able to fix. You can restart the server :)


@galets commented on GitHub (Oct 15, 2025):

Would you like me to deploy the commit that fixes it, to test?


@wxiaoguang commented on GitHub (Oct 15, 2025):

Would you like me to deploy the commit that fixes it, to test?

If you are able to build your own docker image and deploy, it would be quite helpful. Thank you very much.

(And I am still working on it to try to debug more details)


@galets commented on GitHub (Oct 15, 2025):

Can you supply the exact commit hash?


@wxiaoguang commented on GitHub (Oct 15, 2025):

You can checkout my branch: https://github.com/wxiaoguang/gitea/tree/fix-lfs-ssh

At the moment there is only one commit: https://github.com/go-gitea/gitea/pull/35658/commits/db70766908e7bc2aaa3b3e6303e8446aded89b0d. I might add more commits later if I find something new.


@wxiaoguang commented on GitHub (Oct 15, 2025):

If you'd like to apply it to 1.24, the only related change is modules/lfstransfer/backend/backend.go, others are not directly related.

(ps: I think it is also safe to run my fix-lfs-ssh branch; although it is based on the main branch, it should be stable enough, and you can switch to 1.25, which is going to be released soon)


@galets commented on GitHub (Oct 15, 2025):

I applied the following and am re-testing now:

$ git diff
diff --git a/modules/lfstransfer/backend/backend.go b/modules/lfstransfer/backend/backend.go
index dd4108ea56..f4e6157091 100644
--- a/modules/lfstransfer/backend/backend.go
+++ b/modules/lfstransfer/backend/backend.go
@@ -157,7 +157,7 @@ func (g *GiteaBackend) Batch(_ string, pointers []transfer.BatchItem, args trans
 }
 
 // Download implements transfer.Backend. The returned reader must be closed by the caller.
-func (g *GiteaBackend) Download(oid string, args transfer.Args) (io.ReadCloser, int64, error) {
+func (g *GiteaBackend) Download(oid string, args transfer.Args) (_ io.ReadCloser, _ int64, retErr error) {
        idMapStr, exists := args[argID]
        if !exists {
                return nil, 0, ErrMissingID
@@ -188,7 +188,15 @@ func (g *GiteaBackend) Download(oid string, args transfer.Args) (io.ReadCloser,
        if err != nil {
                return nil, 0, fmt.Errorf("failed to get response: %w", err)
        }
-       // no need to close the body here by "defer resp.Body.Close()", see below
+       // We must return the ReaderCloser but not "ReadAll", to avoid OOM.
+       // "transfer.Backend" will check io.Closer interface and close the Body reader.
+       // So only close the Body when error occurs
+       defer func() {
+               if retErr != nil {
+                       _ = resp.Body.Close()
+               }
+       }()
+
        if resp.StatusCode != http.StatusOK {
                return nil, 0, statusCodeToErr(resp.StatusCode)
        }
@@ -197,7 +205,6 @@ func (g *GiteaBackend) Download(oid string, args transfer.Args) (io.ReadCloser,
        if err != nil {
                return nil, 0, fmt.Errorf("failed to parse content length: %w", err)
        }
-       // transfer.Backend will check io.Closer interface and close this Body reader
        return resp.Body, respSize, nil
 }
 

@wxiaoguang commented on GitHub (Oct 15, 2025):

Good news, I found a new resource leaking point.

The "http client" is abused: it creates a lot of "connection pools", and every pool holds an open connection.

Will propose a fix soon.


@galets commented on GitHub (Oct 15, 2025):

I confirm that the backend.go fix did not resolve the issue; I am still getting the same error.


@wxiaoguang commented on GitHub (Oct 15, 2025):

I think this patch will (highly likely) fix it; just add DisableKeepAlives: true in two places:

diff --git a/modules/httplib/request.go b/modules/httplib/request.go
index 49ea6f4b73..3c65dfe820 100644
--- a/modules/httplib/request.go
+++ b/modules/httplib/request.go
@@ -167,6 +167,8 @@ func (r *Request) getResponse() (*http.Response, error) {
                        TLSClientConfig: r.setting.TLSClientConfig,
                        Proxy:           http.ProxyFromEnvironment,
                        DialContext:     TimeoutDialer(r.setting.ConnectTimeout),
+
+                       DisableKeepAlives: true,
                }
        } else if t, ok := trans.(*http.Transport); ok {
                if t.TLSClientConfig == nil {
@@ -175,6 +177,7 @@ func (r *Request) getResponse() (*http.Response, error) {
                if t.DialContext == nil {
                        t.DialContext = TimeoutDialer(r.setting.ConnectTimeout)
                }
+               t.DisableKeepAlives = true
        }

@galets commented on GitHub (Oct 15, 2025):

Trying the following (previous change rolled back):

$ git diff
diff --git a/modules/httplib/request.go b/modules/httplib/request.go
index 49ea6f4b73..32e42c7f8e 100644
--- a/modules/httplib/request.go
+++ b/modules/httplib/request.go
@@ -167,6 +167,7 @@ func (r *Request) getResponse() (*http.Response, error) {
                        TLSClientConfig: r.setting.TLSClientConfig,
                        Proxy:           http.ProxyFromEnvironment,
                        DialContext:     TimeoutDialer(r.setting.ConnectTimeout),
+                        DisableKeepAlives: true,
                }
        } else if t, ok := trans.(*http.Transport); ok {
                if t.TLSClientConfig == nil {
@@ -175,6 +176,7 @@ func (r *Request) getResponse() (*http.Response, error) {
                if t.DialContext == nil {
                        t.DialContext = TimeoutDialer(r.setting.ConnectTimeout)
                }
+                t.DisableKeepAlives = true
        }
 
        client := &http.Client{

@galets commented on GitHub (Oct 15, 2025):

With the checkout at 96%:

root@gitea:/usr/local/lib/gitea# lsof | wc -l
4006

@galets commented on GitHub (Oct 15, 2025):

git clone completed; the fix is working.


@wxiaoguang commented on GitHub (Oct 15, 2025):

Thank you very much, you can keep the quick patch for your instance.

I will try to make a complete fix (it would be more complex and needs some time)


@wxiaoguang commented on GitHub (Oct 15, 2025):

Made more changes. IMO the complete fix should look like this: Fix missing Close when error occurs and abused connection pool #35658


@wxiaoguang commented on GitHub (Oct 15, 2025):

1.25-nightly is ready; it contains the complete fix and will become 1.25.0 soon.

It would be quite helpful if you could deploy and try it (1.25-nightly should be pretty stable now; even if there are any bugs, we can quickly fix them, and you can safely downgrade to 1.24.x with an UPDATE version SQL statement since there is no incompatible change).

* https://dl.gitea.com/gitea/1.25-nightly/
* https://hub.docker.com/r/gitea/gitea/tags?name=1.25-nightly
Reference: github-starred/gitea#15023