Increase commits listing page speed #74

Closed
opened 2025-11-02 03:07:43 -06:00 by GiteaMirror · 9 comments
Owner

Originally created by @thibaultmeyer on GitHub (Nov 22, 2016).

Description

The commits listing page (e.g. https://127.0.0.1:3000/torvalds/linux-kernel/commits/master) could be sped up by no longer dumping the repository's entire commit history for every page, and instead passing the `--max-count` and `--skip` arguments to `git rev-list`.

Related to #3818: the Linux kernel repository has more than 600K commits, so it would be nice to avoid dumping all of them for every page when this is not necessary.

Alternatively, caching could be a viable way to avoid running this command for every page, since it currently runs even when the repository has not been modified.

More information at https://github.com/gogits/gogs/issues/3830#issuecomment-262221320
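
As a rough illustration of the `--max-count` / `--skip` idea above, here is a minimal Go sketch. The helper names `revListPageArgs` and `listCommitPage`, the 1-based `page` parameter, and `pageSize` are assumptions made for this example; they are not taken from the Gogs or Gitea code base. Only the `git rev-list` flags themselves come from the issue text.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// revListPageArgs builds the arguments for `git rev-list` so that only one
// page of commits is printed. Pages are 1-based. The function name and the
// page/pageSize parameters are hypothetical, chosen for this sketch.
func revListPageArgs(revision string, page, pageSize int) []string {
	if page < 1 {
		page = 1
	}
	if pageSize < 1 {
		pageSize = 1
	}
	return []string{
		"rev-list",
		fmt.Sprintf("--max-count=%d", pageSize),
		fmt.Sprintf("--skip=%d", (page-1)*pageSize),
		revision,
	}
}

// listCommitPage runs git inside repoPath and returns the commit IDs of a
// single page instead of the whole history.
func listCommitPage(repoPath, revision string, page, pageSize int) ([]string, error) {
	cmd := exec.Command("git", revListPageArgs(revision, page, pageSize)...)
	cmd.Dir = repoPath
	out, err := cmd.Output()
	if err != nil {
		return nil, err
	}
	return strings.Fields(string(out)), nil
}

func main() {
	// Page 3 with 50 commits per page skips the first 100 commits.
	fmt.Println(strings.Join(revListPageArgs("master", 3, 50), " "))
	// prints: rev-list --max-count=50 --skip=100 master
}
```

Note that git still has to walk past the skipped commits, so very deep pages are not free; the saving is that only one page of output is produced and parsed.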

GiteaMirror added the issue/confirmed and type/enhancement labels 2025-11-02 03:07:43 -06:00
Author
Owner

@bkcsoft commented on GitHub (Nov 29, 2016):

Preferably the entire `git` package should have caching where possible, IMO.

@denji commented on GitHub (Dec 2, 2016):

Readahead cache index/reindex ([`vcs.proto`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/vcs.proto))
- `BehindAhead` - Protocol Buffers cache? [`sourcegraph/…/vcs/gitcmd/repo.go#L175-L200`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L175-L200)

* LRU `groupcache/lru` on-demand
  * `commitLogCache`: [`sourcegraph/…/vcs/gitcmd/repo.go#L400-L422`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L400-L422)
  * `diffCache`: [`sourcegraph/…/vcs/gitcmd/repo.go#L432-L493`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L432-L493)
  * `blameCache`: [`sourcegraph/…/vcs/gitcmd/repo.go#L517-L655`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L517-L655)
  * `readFileBytesCache`: [`sourcegraph/…/vcs/gitcmd/repo.go#L736-L768`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L736-L768)
  * `lsTreeCache`: [`sourcegraph/…/vcs/gitcmd/repo.go#L846-L963`](https://github.com/EricAnderson1000/sourcegraph/blob/master/pkg/vcs/gitcmd/repo.go#L846-L963)

Refs
---
* https://github.com/gogits/gogs/issues/13#issuecomment-228562316
* https://github.com/alexanderGugel/arc
* https://github.com/allegro/bigcache
* https://github.com/coocood/freecache
* https://github.com/cubicdaiya/cachectl
* https://github.com/goburrow/cache
* https://github.com/golang/groupcache
* https://github.com/hashicorp/golang-lru
* https://github.com/josephlewis42/multicache
* https://github.com/karlseguin/ccache
* https://github.com/Netflix/rend
* https://github.com/bluele/gcache
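
To make the on-demand LRU idea in this comment concrete, here is a small sketch built only on the Go standard library (`container/list`). It is not the sourcegraph or `groupcache/lru` code; the `lruCache` type, its methods, and the cache keys are hypothetical and exist only for illustration. It is also not safe for concurrent use without an added lock.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached key/value pair stored in the recency list.
type entry struct {
	key   string
	value []string
}

// lruCache is a minimal least-recently-used cache. Hypothetical type for this
// sketch; it does not mirror any API of groupcache/lru or golang-lru.
type lruCache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

func newLRUCache(capacity int) *lruCache {
	return &lruCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached value and marks it as most recently used.
func (c *lruCache) Get(key string) ([]string, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put stores a value and evicts the least recently used entry when the
// cache grows beyond its capacity.
func (c *lruCache) Put(key string, value []string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	cache := newLRUCache(2)
	cache.Put("linux.git@master:page=1", []string{"a1", "b2"})
	cache.Put("linux.git@master:page=2", []string{"c3"})
	cache.Get("linux.git@master:page=1")                  // page 1 is now the most recent
	cache.Put("linux.git@master:page=3", []string{"d4"}) // evicts page 2
	_, ok := cache.Get("linux.git@master:page=2")
	fmt.Println(ok) // prints: false
}
```

Keying on a branch name, as in this toy `main`, goes stale as soon as the branch moves; keying on the resolved commit ID would likely be safer in a real setup.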

@bkcsoft commented on GitHub (Dec 2, 2016):

@denji I've personally used `hashicorp/golang-lru`, and IMO it isn't really suited to this task :( Built-in hooks for redis/memcache would be a plus, but that can be done "in-house" if necessary.


@denji commented on GitHub (Dec 2, 2016):

sourcegraph used `github.com/golang/groupcache/lru`.

My suggestion is to keep the cache disabled by default, use the minimum size, and optionally use a hybrid/adaptive big-cache (via config).


@tycho commented on GitHub (Apr 15, 2018):

So these days the commit list page doesn't seem to be too costly. I'm guessing that `rev-list --count` no longer runs on that page?

But the cost does hit the first time you load the summary page for the repository, and it costs about 6 seconds for a mirror of linux.git.

Also pushing can be super expensive. I've started carrying this patch in my build of gitea, for testing purposes:

```diff
diff --git a/vendor/code.gitea.io/git/commit.go b/vendor/code.gitea.io/git/commit.go
index 299a2381..7ed71c27 100644
--- a/vendor/code.gitea.io/git/commit.go
+++ b/vendor/code.gitea.io/git/commit.go
@@ -10,7 +10,6 @@ import (
        "container/list"
        "fmt"
        "net/http"
-       "strconv"
        "strings"
 )
 
@@ -158,19 +157,7 @@ func CommitChanges(repoPath string, opts CommitChangesOptions) error {
 }
 
 func commitsCount(repoPath, revision, relpath string) (int64, error) {
-       var cmd *Command
-       cmd = NewCommand("rev-list", "--count")
-       cmd.AddArguments(revision)
-       if len(relpath) > 0 {
-               cmd.AddArguments("--", relpath)
-       }
-
-       stdout, err := cmd.RunInDir(repoPath)
-       if err != nil {
-               return 0, err
-       }
-
-       return strconv.ParseInt(strings.TrimSpace(stdout), 10, 64)
+       return 0, nil
 }
 
 // CommitsCount returns number of total commits of until given revision.
```

With the above patch, pushing a full mirror of linux.git goes from this:

```
[...]
[Macaron] 2018-04-15 08:09:50: Completed POST /api/internal/push/update 202 Accepted in 5.580406861s
[Macaron] 2018-04-15 08:09:50: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:09:55: Completed POST /api/internal/push/update 202 Accepted in 5.438243945s
[Macaron] 2018-04-15 08:09:55: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:10:01: Completed POST /api/internal/push/update 202 Accepted in 5.712241444s
[Macaron] 2018-04-15 08:10:01: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:10:08: Completed POST /api/internal/push/update 202 Accepted in 6.736438453s
[Macaron] 2018-04-15 08:10:08: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:10:13: Completed POST /api/internal/push/update 202 Accepted in 5.919010004s
[Macaron] 2018-04-15 08:10:13: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:10:19: Completed POST /api/internal/push/update 202 Accepted in 5.782453907s
[Macaron] 2018-04-15 08:10:19: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:10:25: Completed POST /api/internal/push/update 202 Accepted in 5.790490746s
[Macaron] 2018-04-15 08:10:25: Started POST /api/internal/push/update for 127.0.0.1
[...]
```

To this:

```
[...]
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:41: Completed POST /api/internal/push/update 202 Accepted in 53.959803ms
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:41: Completed POST /api/internal/push/update 202 Accepted in 55.238858ms
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:41: Completed POST /api/internal/push/update 202 Accepted in 59.450086ms
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:41: Completed POST /api/internal/push/update 202 Accepted in 54.134458ms
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:41: Completed POST /api/internal/push/update 202 Accepted in 56.100003ms
[Macaron] 2018-04-15 08:34:41: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:42: Completed POST /api/internal/push/update 202 Accepted in 55.861499ms
[Macaron] 2018-04-15 08:34:42: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:42: Completed POST /api/internal/push/update 202 Accepted in 54.701342ms
[Macaron] 2018-04-15 08:34:42: Started POST /api/internal/push/update for 127.0.0.1
[Macaron] 2018-04-15 08:34:42: Completed POST /api/internal/push/update 202 Accepted in 54.761946ms
[...]
```

So that's about two orders of magnitude of improvement (roughly 5.9 s down to about 56 ms per request in the logs above), and with no real loss as far as I'm concerned (what does one use the commit counts for, anyway?)

Maybe calculating the commit counts should be a background job that gets cached until the next repo modification? I really shouldn't be forced to sit there for ages waiting for a push to complete just because it has to count revisions on every tag or branch I push.
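
The "cache until the next repo modification" idea could look roughly like the sketch below. This is an illustration under stated assumptions, not Gitea's implementation: `countCache`, `Get`, and `Invalidate` are hypothetical names, the slow path is passed in as a plain function (standing in for something like `git rev-list --count`), and the count used in `main` is a stand-in value.

```go
package main

import (
	"fmt"
	"sync"
)

// countCache memoizes commit counts per repository and revision.
// Hypothetical type for this sketch.
type countCache struct {
	mu     sync.Mutex
	counts map[string]map[string]int64 // repoPath -> revision -> count
}

func newCountCache() *countCache {
	return &countCache{counts: make(map[string]map[string]int64)}
}

// Get returns the cached count and calls compute only on a cache miss.
func (c *countCache) Get(repoPath, revision string, compute func() (int64, error)) (int64, error) {
	c.mu.Lock()
	if n, ok := c.counts[repoPath][revision]; ok {
		c.mu.Unlock()
		return n, nil
	}
	c.mu.Unlock()

	// Slow path, run without holding the lock so other lookups are not blocked.
	n, err := compute()
	if err != nil {
		return 0, err
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts[repoPath] == nil {
		c.counts[repoPath] = make(map[string]int64)
	}
	c.counts[repoPath][revision] = n
	return n, nil
}

// Invalidate drops every cached count for a repository, e.g. after a push.
func (c *countCache) Invalidate(repoPath string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.counts, repoPath)
}

func main() {
	cache := newCountCache()
	calls := 0
	slowCount := func() (int64, error) { calls++; return 600000, nil } // stand-in value

	cache.Get("linux.git", "master", slowCount)
	cache.Get("linux.git", "master", slowCount) // served from the cache
	cache.Invalidate("linux.git")               // simulate a push
	cache.Get("linux.git", "master", slowCount) // recomputed after invalidation
	fmt.Println(calls)                          // prints: 2
}
```

This sketch deliberately ignores two things a real version would need to consider: concurrent misses for the same key compute the count twice, and a push that lands while a count is being computed can leave a stale entry behind.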


@bkcsoft commented on GitHub (Apr 28, 2018):

Commit Count is used here:

![image](https://user-images.githubusercontent.com/4726179/39401704-013d97bc-4b4c-11e8-8659-bfb8ef952294.png)

@stale[bot] commented on GitHub (Jan 27, 2019):

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.


@lunny commented on GitHub (Feb 7, 2019):

A Linux repository with 798,649 commits takes 5313 ms to load the first commits list page.


@techknowlogick commented on GitHub (Dec 9, 2020):

Closing as we've gotten the linux kernel to load fairly quickly these days.


Reference: github-starred/gitea#74