Defunct Git Processes left lying around #1392

Closed
opened 2025-11-02 03:59:09 -06:00 by GiteaMirror · 14 comments
Owner

Originally created by @trebonian on GitHub (Dec 20, 2017).

  • Gitea version (or commit ref): e67b4055f9
  • Git version: 2.15.0
  • Operating system: Alpine Linux
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • [x ] SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • [ x] Not relevant
  • Log gist:

Description

After running any operation that uses the new commit-info module on a directory - it leaves a defunct process lying around saying "[git] " in the list of processes.
...

Screenshots

Originally created by @trebonian on GitHub (Dec 20, 2017). <!-- 1. Please speak English, this is the language all of us can speak and write. 2. Please ask questions or configuration/deploy problems on our Discord server (https://discord.gg/NsatcWJ) or forum (https://discourse.gitea.io). 3. Please take a moment to check that your issue doesn't already exist. 4. Please give all relevant information below for bug reports, because incomplete details will be handled as an invalid report. --> - Gitea version (or commit ref): e67b4055f9087cc381df437502f3cd912abb53ed - Git version: 2.15.0 - Operating system: Alpine Linux - Database (use `[x]`): - [ ] PostgreSQL - [ ] MySQL - [ ] MSSQL - [x ] SQLite - Can you reproduce the bug at https://try.gitea.io: - [ ] Yes (provide example URL) - [ ] No - [ x] Not relevant - Log gist: ## Description After running any operation that uses the new commit-info module on a directory - it leaves a defunct process lying around saying "[git] <defunct>" in the list of processes. ... ## Screenshots <!-- **If this issue involves the Web Interface, please include a screenshot** -->
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

I fixed it locally in /vendor/code.gitea.io/git/commit-info.go:
I added a line "defer cmd.Wait() after the cmd.exec of git log

@trebonian commented on GitHub (Dec 20, 2017): I fixed it locally in /vendor/code.gitea.io/git/commit-info.go: I added a line "defer cmd.Wait() after the cmd.exec of git log
Author
Owner

@lunny commented on GitHub (Dec 20, 2017):

@trebonian could you send a PR to github.com/go-gitea/git?

@lunny commented on GitHub (Dec 20, 2017): @trebonian could you send a PR to `github.com/go-gitea/git`?
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

Not sure how to do it without a fork of gitea on github.
Here is a "git diff" of my change:

diff --git a/vendor/code.gitea.io/git/commit_info.go b/vendor/code.gitea.io/git/commit_info.go
index 106ebc61..c2691fc6 100644
--- a/vendor/code.gitea.io/git/commit_info.go
+++ b/vendor/code.gitea.io/git/commit_info.go
@@ -198,6 +198,7 @@ func getCommitsInfo(state *getCommitsInfoState) error {
        }
        cmd := exec.CommandContext(ctx, "git", args...)
        cmd.Dir = state.headCommit.repo.Path
+       defer cmd.Wait()

        readCloser, err := cmd.StdoutPipe()
        if err != nil {

EDIT: change the formatting so it's readable // BKC

@trebonian commented on GitHub (Dec 20, 2017): Not sure how to do it without a fork of gitea on github. Here is a "git diff" of my change: ``` diff --git a/vendor/code.gitea.io/git/commit_info.go b/vendor/code.gitea.io/git/commit_info.go index 106ebc61..c2691fc6 100644 --- a/vendor/code.gitea.io/git/commit_info.go +++ b/vendor/code.gitea.io/git/commit_info.go @@ -198,6 +198,7 @@ func getCommitsInfo(state *getCommitsInfoState) error { } cmd := exec.CommandContext(ctx, "git", args...) cmd.Dir = state.headCommit.repo.Path + defer cmd.Wait() readCloser, err := cmd.StdoutPipe() if err != nil { ``` EDIT: change the formatting so it's readable // BKC
Author
Owner

@ethantkoenig commented on GitHub (Dec 20, 2017):

I don't think calling cmd.Wait() is the right solution here, since that will block until the git log command completely finishes (which will take a long time for large repos)

@ethantkoenig commented on GitHub (Dec 20, 2017): I don't think calling `cmd.Wait()` is the right solution here, since that will block until the `git log` command completely finishes (which will take a long time for large repos)
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

its deferred until this routine returns - which I assumed meant all the log output was processed.
The cmd.Wait() needs to be called before the routine returns. Is there a better place for it?

@trebonian commented on GitHub (Dec 20, 2017): its deferred until this routine returns - which I assumed meant all the log output was processed. The cmd.Wait() needs to be called before the routine returns. Is there a better place for it?
Author
Owner

@ethantkoenig commented on GitHub (Dec 20, 2017):

@trebonian No, the function may return without processing all of the git log output.

@ethantkoenig commented on GitHub (Dec 20, 2017): @trebonian No, the function may return without processing all of the `git log` output.
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

@ethantkoenig then whoever processes the last output needs to do a "Wait" otherwise zombies will be left lying around

@trebonian commented on GitHub (Dec 20, 2017): @ethantkoenig then whoever processes the last output needs to do a "Wait" otherwise zombies will be left lying around
Author
Owner

@ethantkoenig commented on GitHub (Dec 20, 2017):

It was my understanding that defer cancel() (here) would clean up the process, but I'll investigate more when I get the chance.

@ethantkoenig commented on GitHub (Dec 20, 2017): It was my understanding that `defer cancel()` ([here](https://github.com/go-gitea/git/blob/4573c63ec9c8257caf11e361c238b6ceb53781d7/commit_info.go#L193)) would clean up the process, but I'll investigate more when I get the chance.
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

Thanks - the symptom that I am seeing is that there is a new "git" zombie processes after every invocation of gitCommitsInfo.

I am not sure how the Golang runtime handles the "cancel" but the main process needs to do an OS wait on the child process id (at the linux level). I don't know how it can do this if the only handle to that child process is lost after gitCommitsInfo returns.

@trebonian commented on GitHub (Dec 20, 2017): Thanks - the symptom that I am seeing is that there is a new "git" zombie processes after every invocation of gitCommitsInfo. I am not sure how the Golang runtime handles the "cancel" but the main process needs to do an OS wait on the child process id (at the linux level). I don't know how it can do this if the only handle to that child process is lost after gitCommitsInfo returns.
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

PS - it should only call the Wait if we know the child is done - otherwise we will block the main process.

@trebonian commented on GitHub (Dec 20, 2017): PS - it should only call the Wait if we know the child is done - otherwise we will block the main process.
Author
Owner

@lunny commented on GitHub (Dec 20, 2017):

I think @trebonian is right. cmd.Wait is needed if cmd.Start but not cmd.Run.

@lunny commented on GitHub (Dec 20, 2017): I think @trebonian is right. `cmd.Wait` is needed if `cmd.Start` but not `cmd.Run`.
Author
Owner

@trebonian commented on GitHub (Dec 20, 2017):

I think @ethantkoenig is correct, and we don't want the main process to block - but I also think we need to call cmd.Wait() in the main process when the cmd'ed process terminates. Note that the current code does not seem to check for any kind of process error while the cmd is running.

I did the following after the cmd.Start() and it seems to work:
go func () { cmd.Wait() }()

This starts a goroutine that will wait until the process exits. This seems to fix up the zombies - but it's not the nicest.

I am not sure, but I think there needs to be something with a "done" channel to detect and report process errors (like getting killed, or signalled while running), and the cmd process done detection is merged with the other thread done check's lower down.

@trebonian commented on GitHub (Dec 20, 2017): I think @ethantkoenig is correct, and we don't want the main process to block - but I also think we need to call cmd.Wait() in the main process when the cmd'ed process terminates. Note that the current code does not seem to check for any kind of process error while the cmd is running. I did the following after the cmd.Start() and it seems to work: go func () { cmd.Wait() }() This starts a goroutine that will wait until the process exits. This seems to fix up the zombies - but it's not the nicest. I am not sure, but I think there needs to be something with a "done" channel to detect and report process errors (like getting killed, or signalled while running), and the cmd process done detection is merged with the other thread done check's lower down.
Author
Owner

@lunny commented on GitHub (Dec 20, 2017):

go func () { cmd.Wait() }()

seems work. Or maybe handle the error and write to log.

@lunny commented on GitHub (Dec 20, 2017): ```Go go func () { cmd.Wait() }() ``` seems work. Or maybe handle the error and write to log.
Author
Owner

@ethantkoenig commented on GitHub (Dec 20, 2017):

After running benchmarks, it looks like calling cmd.Wait() from the main process is okay, so long as we have previously closed the stdout pipe. Closing stdout will end the child process, so that the subsequent call to cmd.Wait() becomes effectively non-blocking. Benchmarks below:

Current implementation (creates defunct processes):

BenchmarkEntries_GetCommitsInfo/gitea-40         	     500	  27661283 ns/op
BenchmarkEntries_GetCommitsInfo/manyfiles-40     	     100	 195525556 ns/op
BenchmarkEntries_GetCommitsInfo/moby-40          	     300	  50553596 ns/op
BenchmarkEntries_GetCommitsInfo/go-40            	      30	 372394389 ns/op
BenchmarkEntries_GetCommitsInfo/linux-40         	      20	 704852371 ns/op

Proposed changes:

BenchmarkEntries_GetCommitsInfo/gitea-40         	     500	  26902208 ns/op
BenchmarkEntries_GetCommitsInfo/manyfiles-40     	     100	 201779563 ns/op
BenchmarkEntries_GetCommitsInfo/moby-40          	     300	  49267027 ns/op
BenchmarkEntries_GetCommitsInfo/go-40            	      50	 365035008 ns/op
BenchmarkEntries_GetCommitsInfo/linux-40         	      20	 716164050 ns/op

I will send a fix to https://github.com/go-gitea/git.

@ethantkoenig commented on GitHub (Dec 20, 2017): After running benchmarks, it looks like calling `cmd.Wait()` from the main process is okay, so long as we have previously closed the stdout pipe. Closing stdout will end the child process, so that the subsequent call to `cmd.Wait()` becomes effectively non-blocking. Benchmarks below: Current implementation (creates defunct processes): ``` BenchmarkEntries_GetCommitsInfo/gitea-40 500 27661283 ns/op BenchmarkEntries_GetCommitsInfo/manyfiles-40 100 195525556 ns/op BenchmarkEntries_GetCommitsInfo/moby-40 300 50553596 ns/op BenchmarkEntries_GetCommitsInfo/go-40 30 372394389 ns/op BenchmarkEntries_GetCommitsInfo/linux-40 20 704852371 ns/op ``` Proposed changes: ``` BenchmarkEntries_GetCommitsInfo/gitea-40 500 26902208 ns/op BenchmarkEntries_GetCommitsInfo/manyfiles-40 100 201779563 ns/op BenchmarkEntries_GetCommitsInfo/moby-40 300 49267027 ns/op BenchmarkEntries_GetCommitsInfo/go-40 50 365035008 ns/op BenchmarkEntries_GetCommitsInfo/linux-40 20 716164050 ns/op ``` I will send a fix to https://github.com/go-gitea/git.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/gitea#1392