Slow Performance on file history for large repos #9387

Closed
opened 2025-11-02 08:37:20 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @zeripath on GitHub (Aug 11, 2022).

Using git log --follow to fetch file history causes growing slowdowns on large repositories and can make the commit count incorrect.

Example URL: https://gitea.com/marktsai0316/linux/commits/branch/master/scripts/clang-tools

This is essentially:

git rev-list --count $REVISION -- $FILE_PATH

followed by:

git log $REVISION --follow --pretty=format:%H -- $FILE_PATH

The second command is much slower than the first, and because of its --follow it can report a different number of commits than the first one counted. The --follow appears to be the cause of most of the slowdown.
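The count mismatch can be reproduced in a throwaway repository. This is only a minimal sketch: the file names `old.txt` and `new.txt` and the three-commit history are made up for illustration, and it assumes git's default rename detection treats the unchanged file as a rename.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; old.txt / new.txt are hypothetical names.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

printf 'line one\nline two\nline three\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

# Rename without changing the content, so --follow can track it.
git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

printf 'line four\n' >> new.txt
git commit -q -am "edit new.txt"

# Path-limited count: only sees commits that touched the path new.txt.
count_plain=$(git rev-list --count HEAD -- new.txt)

# Listing with --follow: continues through the rename to old.txt.
count_follow=$(git log HEAD --follow --pretty=format:%H -- new.txt | grep -c .)

echo "rev-list --count: $count_plain"
echo "log --follow:     $count_follow"
```

In this setup the two numbers should differ, because the commit that created `old.txt` is only reachable through the rename.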

Without --follow we could use git rev-list for both of these calls, and --skip and --max-count would come for free (in contrast to the current system, where the skip doesn't work).
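A rough sketch of what paging with `git rev-list` alone could look like. The `PAGE` and `PAGE_SIZE` variables and the five-commit test repository are assumptions made for the example, not how Gitea currently names or computes anything.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository with five commits touching one file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

for i in 1 2 3 4 5; do
  echo "change $i" >> file.txt
  git add file.txt
  git commit -q -m "change $i"
done

REVISION=HEAD
FILE_PATH=file.txt
PAGE=2        # hypothetical 1-based page number
PAGE_SIZE=2   # hypothetical page size

# Total number of commits for the pager.
total=$(git rev-list --count "$REVISION" -- "$FILE_PATH")

# One page of commit hashes; git applies the skip and the limit itself.
skip=$(( (PAGE - 1) * PAGE_SIZE ))
page_hashes=$(git rev-list --skip="$skip" --max-count="$PAGE_SIZE" \
  "$REVISION" -- "$FILE_PATH")

echo "total commits: $total"
echo "page $PAGE:"
echo "$page_hashes"
```

Because both commands walk the same path-limited history, the total and the pages stay consistent with each other, at the cost of no longer following renames.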

Looking at the history for this line, I don't think there was any particular reasoning behind adding --follow; my guess is that it simply seemed nice to have.

So... a simple speed improvement here is to drop --follow and switch to rev-list for these calls.

An additional speed improvement is to add a deferrable route, as on the commit infos page.

Originally posted by @zeripath in https://github.com/go-gitea/gitea/issues/19812#issuecomment-1207478181

GiteaMirror added the type/enhancement label 2025-11-02 08:37:20 -06:00
Author
Owner

@bvp commented on GitHub (Feb 15, 2023):

This breaks compatibility with small repositories where there is file renaming


Reference: github-starred/gitea#9387