Slow page loads with a large repo #166

Closed
opened 2025-11-02 03:11:40 -06:00 by GiteaMirror · 65 comments

Originally created by @deanpcmad on GitHub (Dec 26, 2016).

Possibly linked/related to #490

I like to keep mirrors of popular projects such as Rails on my Gitea server. However, whenever I go to view that repo, the page can take 10+ seconds to load (sometimes causing an nginx 502 timeout error).

GiteaMirror added the issue/confirmed, type/enhancement labels 2025-11-02 03:11:40 -06:00

@lunny commented on GitHub (Jan 14, 2017):

I tested this on macOS on the lowest-powered MacBook Air; loading Rails took 3500ms ~ 4200ms. I think that's good enough for v1.1.


@clandmeter commented on GitHub (Jan 18, 2017):

@lunny could you please give alpinelinux aports (https://github.com/alpinelinux/aports) a try?

Then try to browse the main directory.

It takes a couple of minutes on a powerful server, mainly because of git log...

I remember this issue was also reported on Gogs before but was never taken care of. Some have suggested using a caching system. A simpler approach would be to fetch only the directory list (like GitHub does) and, if needed, fetch the commit messages asynchronously via JavaScript. cgit shows only the directory list, which is pretty fast (if the current implementation allows it, please add an option to disable fetching of commit messages).


@lunny commented on GitHub (Jan 18, 2017):

I will try it. @clandmeter


@lunny commented on GitHub (Jan 18, 2017):

Which page do you want to test? @clandmeter On my machine, the main page takes 1763ms and the first release page takes 6662ms.


@clandmeter commented on GitHub (Jan 18, 2017):

@lunny can you check the main directory like this one at github:

https://github.com/alpinelinux/aports/tree/master/main


@clandmeter commented on GitHub (Jan 18, 2017):

@lunny btw, I'm using 1.0.1. I believe the performance commits for the tags page landed after 1.0.1, or in another branch.


@clandmeter commented on GitHub (Jan 20, 2017):

@lunny I think https://github.com/go-gitea/gitea/issues/502 is related?


@lunny commented on GitHub (Jan 20, 2017):

@clandmeter Yes, I tested on master. I think v1.0.1 may be slower than master. And yes, it's related to #502.


@clandmeter commented on GitHub (Jan 20, 2017):

@lunny I tried master today on both Linux (Alpine Linux) and Windows 10. Both crash at startup, so I cannot verify whether it's faster.


@lunny commented on GitHub (Jan 20, 2017):

Where is the crash log?


@clandmeter commented on GitHub (Jan 20, 2017):

C:\Users\carlo\Desktop\gitea>gitea.exe web
2017/01/20 12:47:02 [W] Custom config 'C:/Users/carlo/Desktop/gitea/custom/conf/app.ini' not found, ignore this if you're running first time
2017/01/20 12:47:02 [T] Custom path: C:/Users/carlo/Desktop/gitea/custom
2017/01/20 12:47:02 [T] Log path: C:/Users/carlo/Desktop/gitea/log
2017/01/20 12:47:02 [I] Gitea v1.0.0+137-g1610b9f
2017/01/20 12:47:02 [I] Log Mode: Console(Trace)
2017/01/20 12:47:02 [I] Cache Service Enabled
2017/01/20 12:47:02 [I] Session Service Enabled
2017/01/20 12:47:02 [I] SQLite3 Supported
2017/01/20 12:47:02 [I] Run Mode: Development
panic: Macaron handler must be a callable function

goroutine 1 [running]:
panic(0xeec4e0, 0xc0434b7400)
        /usr/local/go/src/runtime/panic.go:500 +0x1af
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandler(0xeec4e0, 0xc0434b73e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:50 +0xbf
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandlers(0xc0434bfc80, 0x6, 0x8)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:58 +0x54
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Handle(0xc04200f360, 0x100f580, 0x4, 0xc0434c4160, 0x1b, 0xc0434c2f90, 0x6, 0x8, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:176 +0x417
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post(0xc04200f360, 0x1011b19, 0x6, 0xc0434c2f90, 0x3, 0x3, 0x10)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:210 +0x7c
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post-fm(0x1011b19, 0x6, 0xc0434c2f90, 0x3, 0x3, 0x3)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x63
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).route(0xc0434bd100, 0xc0434a6bb8, 0x100f580, 0x4, 0xc0434a6cc8, 0x3, 0x3, 0xeec4e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:322 +0x12e
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).Post(0xc0434bd100, 0xc0434a6cc8, 0x3, 0x3, 0xc0434b73e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x99
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1.6()
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:409 +0x4d9
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x1020d48, 0xe, 0xc0434a6f58, 0xc0434b70c0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1()
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:417 +0xc42
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x100e74a, 0x3, 0xc0434a71a0, 0xc04348cfc0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/routers/api/v1.RegisterRoutes(0xc0422c6580)
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:450 +0xdf
code.gitea.io/gitea/cmd.runWeb.func17()
        /srv/app/src/code.gitea.io/gitea/cmd/web.go:609 +0x31
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x100f074, 0x4, 0xc0434a74a8, 0xc04348cfb0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/cmd.runWeb(0xc042184140, 0x0, 0xc042184100)
        /srv/app/src/code.gitea.io/gitea/cmd/web.go:610 +0x1506
code.gitea.io/gitea/vendor/github.com/urfave/cli.HandleAction(0xf07f80, 0x113f1c8, 0xc042184140, 0xc0421a2200, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:471 +0xc0
code.gitea.io/gitea/vendor/github.com/urfave/cli.Command.Run(0x100edc8, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10318a9, 0x16, 0x0, ...)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/command.go:191 +0xcce
code.gitea.io/gitea/vendor/github.com/urfave/cli.(*App).Run(0xc04246e340, 0xc04203e3a0, 0x2, 0x2, 0x0, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:241 +0x6aa
main.main()
        /srv/app/src/code.gitea.io/gitea/main.go:39 +0x35b

@drsect0r commented on GitHub (Jan 20, 2017):

I am stopped by the same panic message as @clandmeter (I don't know if it is the same issue, I was trying to update my Gitea installation - running on Docker)

bash-4.3$ /app/gitea/gitea web 
2017/01/20 11:54:36 [T] Custom path: /data/gitea
2017/01/20 11:54:36 [T] Log path: /data/gitea/log
panic: Macaron handler must be a callable function

goroutine 1 [running]:
panic(0x7ffa36d24140, 0xc42152b720)
	/usr/lib/go/src/runtime/panic.go:500 +0x1a5
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandler(0x7ffa36d24140, 0xc42152b700)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:50 +0xba
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandlers(0xc421555400, 0x6, 0x8)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:58 +0x4f
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Handle(0xc4205cc460, 0x7ffa3661ce90, 0x4, 0xc42155e7c0, 0x1b, 0xc421568300, 0x6, 0x8, 0x7ffa37b93020)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:176 +0x412
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post(0xc4205cc460, 0x7ffa3661f447, 0x6, 0xc421568300, 0x3, 0x3, 0x10)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:210 +0x77
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post-fm(0x7ffa3661f447, 0x6, 0xc421568300, 0x3, 0x3, 0x3)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x5e
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).route(0xc4215430c0, 0xc4214d6bb8, 0x7ffa3661ce90, 0x4, 0xc4214d6cc8, 0x3, 0x3, 0x7ffa36d24140)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:322 +0x129
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).Post(0xc4215430c0, 0xc4214d6cc8, 0x3, 0x3, 0xc42152b700)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x94
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1.6()
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:409 +0x4d4
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3662e69d, 0xe, 0xc4214d6f58, 0xc42152b3e0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1()
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:417 +0xc3d
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3661c13f, 0x3, 0xc4214d71a0, 0xc4214912e0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/routers/api/v1.RegisterRoutes(0xc420473980)
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:450 +0xda
code.gitea.io/gitea/cmd.runWeb.func17()
	/srv/app/src/code.gitea.io/gitea/cmd/web.go:609 +0x2c
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3661c9d4, 0x4, 0xc4214d74a8, 0xc4214912d0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/cmd.runWeb(0xc4201c17c0, 0x0, 0xc4201c1700)
	/srv/app/src/code.gitea.io/gitea/cmd/web.go:610 +0x1501
code.gitea.io/gitea/vendor/github.com/urfave/cli.HandleAction(0x7ffa36d3fcc0, 0x7ffa36e476e8, 0xc4201c17c0, 0xc420058d00, 0x0)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:471 +0xbb
code.gitea.io/gitea/vendor/github.com/urfave/cli.Command.Run(0x7ffa3661c730, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffa3663ebbf, 0x16, 0x0, ...)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/command.go:191 +0xcc9
code.gitea.io/gitea/vendor/github.com/urfave/cli.(*App).Run(0xc42024b520, 0xc42000c140, 0x2, 0x2, 0x0, 0x0)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:241 +0x6a5
main.main()
	/srv/app/src/code.gitea.io/gitea/main.go:39 +0x356

@lunny commented on GitHub (Jan 20, 2017):

resolved by #708


@clandmeter commented on GitHub (Jan 23, 2017):

@lunny it seems the master branch is working again, so I did some small tests:

  1. Release page: 1.0.0+dev Page: 19309ms Template: 9ms. It seems to list all 147 releases Alpine has even though it shows a pager at the bottom (paging seems to be broken).
  2. Directory listing of aports/main takes ages (it never completes), so it's unusable for projects with lots of items in their directories. There seems to be a PR for this already: #570

@lunny commented on GitHub (Jan 23, 2017):

Yes. This issue should be fixed by #570 .


@lunny commented on GitHub (Feb 24, 2017):

Moving this to v1.2 since #570 has been moved.


@clandmeter commented on GitHub (Oct 31, 2017):

@lunny any progress in this area?

I'm still getting very slow loads on directories with a lot of content:
Gitea Version: d545e32 Page: 418155ms Template: 11903ms

https://try.gitea.io/clandmeter/aports/src/branch/master/community

Would it be possible to have a pager or disable the loading of commit history?


@lunny commented on GitHub (Nov 3, 2017):

GitHub only shows the first 1000 files.


@lunny commented on GitHub (May 11, 2018):

See https://try.gitea.io/joshfng/gitlab-ce, it took about 13 seconds.


@alexandrul commented on GitHub (May 16, 2018):

When creating a pull request it takes ~12 seconds to show a single commit after selecting the branches:

Gitea Version: 1.4.1 Page: 12296ms Template: 9468ms

The repository has around 5k commits and the repo home page loads in 4 seconds.


@silverwind commented on GitHub (May 23, 2018):

I don't think the number of commits is really the bottleneck. I have a repo with a size of around 2 GB and 500 commits, and it's already taking 5 seconds to load. Maybe the git commands that are run when showing a repo need to be optimized or their results cached.


@lafriks commented on GitHub (May 23, 2018):

There was a problem with the commit count before, but I fixed that by adding a cache for it, so that should not be a problem anymore. I think the current problem is that many files in the view slow down the last-commit info calculation.


@clandmeter commented on GitHub (May 23, 2018):

The problem I am facing is with directories that contain many objects. For each object a git command is executed, which is rather expensive in terms of CPU and I/O. For instance, https://github.com/alpinelinux/aports/tree/master/main loads extremely slowly (if it loads at all) because git has to fetch the latest commit info for each object. A simple way to solve this is a setting that disables fetching of git information if the object count is larger than x.
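A minimal Go sketch of such a threshold setting, assuming a hypothetical `listDirectory` helper and a caller-supplied `lookup` function standing in for the expensive per-object git command. None of these names are Gitea APIs, and the limit of 1000 is only an illustrative default.

```go
package main

import "fmt"

// maxEntriesWithCommitInfo is a hypothetical setting: directories with more
// entries than this are listed without last-commit information.
const maxEntriesWithCommitInfo = 1000

// Entry is one row of a directory listing. LastCommit stays empty when the
// commit lookup was skipped.
type Entry struct {
	Name       string
	LastCommit string
}

// listDirectory builds the listing and calls lookup (the expensive part) only
// when the directory has at most limit entries.
func listDirectory(names []string, limit int, lookup func(name string) string) []Entry {
	withInfo := len(names) <= limit
	entries := make([]Entry, 0, len(names))
	for _, name := range names {
		e := Entry{Name: name}
		if withInfo {
			e.LastCommit = lookup(name)
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	lookup := func(name string) string { return "commit-of-" + name }
	small := listDirectory([]string{"a", "b"}, maxEntriesWithCommitInfo, lookup)
	// A limit of 2 is used here only to demonstrate the skipped lookup.
	large := listDirectory([]string{"a", "b", "c"}, 2, lookup)
	fmt.Println(small)
	fmt.Println(large)
}
```

With this shape the page for a huge directory still renders the file names, just without the commit column, which matches the cgit-style listing mentioned earlier in the thread.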


@davydov-vyacheslav commented on GitHub (May 23, 2018):

Isn't there a way to do it GitLab-style (I believe I saw it there): make an AJAX request for each file/directory and update the information asynchronously?


@leepfrog-ger commented on GitHub (Jul 5, 2018):

Could this help with this issue? https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/


@lunny commented on GitHub (Jul 6, 2018):

Maybe we could calculate each file's last commit asynchronously?
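One way this could look on the Go side is a small worker pool that computes the last commit per entry in the background and delivers results over a channel, so the listing itself can be returned first. This is only a sketch: `lastCommitsAsync`, `result`, and the `lookup` callback (standing in for the per-file git call) are hypothetical names, not Gitea code.

```go
package main

import (
	"fmt"
	"sync"
)

// result pairs a directory entry with the commit that last touched it.
type result struct {
	Name     string
	CommitID string
}

// lastCommitsAsync starts up to `workers` goroutines that call lookup for each
// name and returns immediately. Results arrive on the channel in no particular
// order; the channel is closed once every name has been processed.
func lastCommitsAsync(names []string, workers int, lookup func(name string) string) <-chan result {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	out := make(chan result)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				out <- result{Name: name, CommitID: lookup(name)}
			}
		}()
	}
	go func() {
		for _, name := range names {
			jobs <- name
		}
		close(jobs)
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	lookup := func(name string) string { return "commit-of-" + name }
	for r := range lastCommitsAsync([]string{"main", "community", "testing"}, 2, lookup) {
		fmt.Println(r.Name, r.CommitID)
	}
}
```

In a web setting the consumer of the channel would be whatever answers the follow-up requests from the page; that part is left out here.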


@clandmeter commented on GitHub (Jul 6, 2018):

@lunny before there is a real solution to this problem could you add an option to disable commit info in listings so gitea doesn't spawn git cmd?


@lunny commented on GitHub (Jul 6, 2018):

@clandmeter that could be a temporary solution.


@clandmeter commented on GitHub (Jul 6, 2018):

@lunny that would be great. I would love to test some Alpine Linux related things with gitea but our repo is just too large to make it work atm.


@lafriks commented on GitHub (Sep 11, 2018):

I think rewriting this functionality to use the go-git library would greatly improve performance.


@Siesh1oo commented on GitHub (Sep 12, 2018):

@lafriks wrote:

> There was a problem with the commit count before, but I fixed that by adding a cache for it, so that should not be a problem anymore. I think the current problem is that many files in the view slow down the last-commit info calculation.

The bottleneck is caused by the huge number of git rev-list and git cat-file calls for larger repos. Caching the output, or the rendered HTML, in the macaron cache (which maps to redis or memcached) or as a static file might help.

Pre-rendering HTML at git-receive or git-update time might be another option (to avoid slow rendering of the first request).

> I think rewriting this functionality to use the go-git library would greatly improve performance.

You would still need to walk the git tree on disk and collect the git rev-list information; would this get any faster?
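As a rough illustration of the caching idea, here is a minimal in-memory sketch in Go. It assumes the cache key is the commit ID plus the directory path; since a commit ID identifies an immutable tree, such entries never need invalidation. The `lastCommitCache` type and its `get` method are hypothetical names, and a real deployment would presumably use the configured cache backend rather than a plain map.

```go
package main

import (
	"fmt"
	"sync"
)

// lastCommitCache maps "commitID:dir" to the last-commit info of every entry
// in that directory (entry name -> commit ID).
type lastCommitCache struct {
	mu    sync.Mutex
	items map[string]map[string]string
}

func newLastCommitCache() *lastCommitCache {
	return &lastCommitCache{items: make(map[string]map[string]string)}
}

// get returns the cached listing for (commitID, dir), calling compute only on
// a miss. The lock is held during compute to keep the sketch simple.
func (c *lastCommitCache) get(commitID, dir string, compute func() map[string]string) map[string]string {
	key := commitID + ":" + dir
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.items[key]; ok {
		return v
	}
	v := compute()
	c.items[key] = v
	return v
}

func main() {
	cache := newLastCommitCache()
	calls := 0
	compute := func() map[string]string {
		calls++ // stands in for the expensive git calls
		return map[string]string{"APKBUILD": "c1"}
	}
	cache.get("c9", "main", compute)
	cache.get("c9", "main", compute)
	fmt.Println("expensive computations:", calls) // prints "expensive computations: 1"
}
```

The first request for a new head commit still pays the full cost, which is the case the pre-rendering suggestion above is aimed at.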


@filipnavara commented on GitHub (Sep 20, 2018):

> Could this help with this issue? https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/

I gave it a try, but the improvements were quite negligible. On a Windows host, executing the git command to find the latest commit for one file/directory takes slightly less than a second, and I believe the overhead is not only the repository access. We have roughly 50 directories in a repository and the listing takes 25 - 50 seconds. I upgraded the storage on the machine to an SSD with higher throughput and got about a 30% boost and more consistent times, but it still takes 20 seconds to load the repository page.

UPDATE: The serialized commit graph helps only a bit. It reduces some I/O, especially when hitting old history stored in pack files. However, the tree objects still have to be loaded anyway, which dominates the time in the end.


@filipnavara commented on GitHub (Sep 22, 2018):

@lafriks go-git is no silver bullet either, but there's a potential to improve the load times with it.

I implemented a simple Go program to list the root of a repository using `go-git` and find the last commit for each entry. Timing it on my test repository yields a result similar to whatever Gitea does now, but there are a few things to note:

  • Inspecting the code in the git module makes me believe that there is some parallelization involved, albeit little. I didn't do any parallelization at all in my test code.
  • I didn't do proper benchmarking comparing the low-level code. I get around 15s time from my listing program and around 17s from loading a whole Gitea page.
  • There's a huge potential for parallel processing of the repository history using go-git, which simply couldn't be achieved by the git command line today. The trick is to process all the files at once while walking the commit history. Now the history is loaded many times over and the same trees are loaded and examined for each file.
  • The worst-case scenario is old files, where a lot of commit history and trees have to be loaded from disk. This eventually hits the pack files, and `go-git` seems too eager to read far too much data (as evidenced by the `timeit` tool on Windows when comparing simple log queries to `git log -1 <file name>`).

I have never written in Go before, so if someone wants to improve upon my measly attempt you are more than welcome. I'd be especially interested if someone could do the implementation of walking the history only once and processing more files at the same time (and stopping once we know the commits for all the files).

https://gist.github.com/filipnavara/8e6fdf980130d6ca120bfda4c25481e9
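
The single-pass idea described above (walk the history once, check all unresolved files at each commit, stop as soon as every file has a commit) can be sketched roughly as follows. This is a toy in-memory model, not the code from the gist and not the `go-git` API: the `Commit` type, `lastCommitForPaths` and `sampleHistory` are hypothetical names, history is assumed to be linear (one parent per commit), and each commit carries a full path-to-hash snapshot instead of real tree objects.

```go
package main

import "fmt"

// Commit is a toy stand-in for a git commit: an ID, a single parent,
// and a snapshot mapping each path to the hash of its content.
type Commit struct {
	ID     string
	Parent *Commit
	Tree   map[string]string
}

// lastCommitForPaths walks the history from head exactly once. At every
// commit it compares each still-unresolved path against the parent snapshot
// and records the first (newest) commit where the path differs.
func lastCommitForPaths(head *Commit, paths []string) map[string]string {
	result := make(map[string]string, len(paths))
	remaining := make(map[string]bool, len(paths))
	for _, p := range paths {
		remaining[p] = true
	}
	// Stop early once every requested path has been resolved.
	for c := head; c != nil && len(remaining) > 0; c = c.Parent {
		for p := range remaining {
			cur, inCur := c.Tree[p]
			prev, inPrev := "", false
			if c.Parent != nil {
				prev, inPrev = c.Parent.Tree[p]
			}
			if inCur != inPrev || cur != prev {
				result[p] = c.ID
				delete(remaining, p) // deleting during range is safe in Go
			}
		}
	}
	return result
}

// sampleHistory builds three commits: c1 adds README and main.go,
// c2 edits main.go, c3 adds docs.
func sampleHistory() *Commit {
	c1 := &Commit{ID: "c1", Tree: map[string]string{"README": "a1", "main.go": "b1"}}
	c2 := &Commit{ID: "c2", Parent: c1, Tree: map[string]string{"README": "a1", "main.go": "b2"}}
	c3 := &Commit{ID: "c3", Parent: c2, Tree: map[string]string{"README": "a1", "main.go": "b2", "docs": "d1"}}
	return c3
}

func main() {
	last := lastCommitForPaths(sampleHistory(), []string{"README", "main.go", "docs"})
	fmt.Println(last["README"], last["main.go"], last["docs"]) // c1 c2 c3
}
```

Compared with running one log query per file, each commit is visited once no matter how many files are listed. A real implementation would additionally have to deal with merges and would compare tree entry hashes rather than full snapshots.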


@filipnavara commented on GitHub (Sep 22, 2018):

I updated the Gist with some naïve multi-file processing. Now I get 4s times on my repo, which is about 4x faster than the baseline. Worst-case with walking the whole history of the entire repo using go-git is around 30s.

UPDATE: Using KeepDescriptors option to prevent go-git from reopening pack files all the time slashes another 0.5s from the time (or 12% if you prefer).

UPDATE 2: Trying some performance optimizations at https://github.com/filipnavara/go-git/tree/perf-read. I'm now at ±2.6s on the tests. There was a small gain (±0.25s) from avoiding `reader.Seek(0, io.SeekCurrent)` when reading packfiles where the offset was already known. Another problem with my code is that it accesses most commits twice, which caused them to actually be read from disk twice for non-packfile objects. Lastly, there was a huge gain from using the in-memory packfile indexes to look up commit hashes instead of looking into the objects directory, if the indexes were already loaded. I still see quite weird and erratic reads on the packfiles themselves, but I wasn't able to figure out what causes them.

UPDATE 3: I found the bottleneck when reading packfile objects and implemented a workaround. Now I am at 1.37s, or about 90% faster than my Gitea listing on the same machine. Profiler shows that it's only around 30% I/O bound now, so any further optimization will need someone with more Go experience.


@filipnavara commented on GitHub (Sep 23, 2018):

I'll try to upstream my performance improvements to go-git, but as far as Gitea goes I would really appreciate any help.


@filipnavara commented on GitHub (Sep 23, 2018):

Proof of concept:

  • https://github.com/filipnavara/gitea/tree/perf-read ... Changes to `gitea` and `git` module to handle some read-only operations using `go-git`.
  • https://github.com/filipnavara/go-git/tree/perf-read ... Updated `go-git` with performance-related fixes.

Current status: `Page: 1688ms Template: 28ms` on the top-level listing vs `Page: 16898ms Template: 14736ms` with the latest Gitea release. No `git` command at all is invoked for loading directory listings.

@filipnavara commented on GitHub (Sep 24, 2018):

Btw, the algorithm I implemented is the same one used by `libgit2sharp`, and at the basic conceptual level it is similar to what `git` does today. It is not necessarily efficient when deep history has to be traversed (e.g. looking at a directory that has not changed for a long time relative to the rest of the repository). It is easy to detect that case and impose some limits on the history traversed to maintain more consistent performance, at the expense of not showing all the commit information.

There's ongoing work to speed that up using further improvements on top of the commit graph feature - https://blogs.msdn.microsoft.com/devops/2018/07/16/super-charging-the-git-commit-graph-iv-bloom-filters/ - but it's not even finalized in git itself by now.


@vtolstov commented on GitHub (Sep 24, 2018):

@filipnavara Thanks for go-git.


@filipnavara commented on GitHub (Sep 26, 2018):

I have locally implemented support for the Git 2.18+ serialized commit graphs in `go-git`. As expected, the performance benefits of that alone are not worth the additional complexity. However, adding the bloom filter optimization works wonders when looking into repository directories that haven't changed for quite a while. It could easily bring another 10x speed-up for those use cases, at the expense of up-front calculations (± 10 minutes for 30000 revisions in non-optimized code) and storage (640 additional bytes per revision on top of the Git commit graph).
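
To illustrate why a per-commit filter helps, here is a minimal sketch of a changed-path Bloom filter. It is not the on-disk format used in the prototype or proposed for git: the 640-byte size is taken from the figure above, while the FNV-based double hashing, the choice of 7 probes, and the names `PathFilter`, `AddChanged` and `MayContain` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	filterBits = 640 * 8 // 640 bytes per revision, as quoted above
	numHashes  = 7       // assumed number of probes per path
)

// PathFilter is a fixed-size Bloom filter holding the paths a commit changed.
type PathFilter [filterBits / 64]uint64

// hashes returns two independent 32-bit hashes used for double hashing.
func hashes(path string) (uint32, uint32) {
	a := fnv.New32a()
	a.Write([]byte(path))
	b := fnv.New32()
	b.Write([]byte(path))
	return a.Sum32(), b.Sum32()
}

func (f *PathFilter) add(path string) {
	h1, h2 := hashes(path)
	for i := uint32(0); i < numHashes; i++ {
		bit := (h1 + i*h2) % filterBits
		f[bit/64] |= uint64(1) << (bit % 64)
	}
}

// AddChanged records a changed file together with all of its parent
// directories, so that directory listings can query the filter too.
func (f *PathFilter) AddChanged(path string) {
	for i, ch := range path {
		if ch == '/' {
			f.add(path[:i])
		}
	}
	f.add(path)
}

// MayContain reports false only if the commit definitely did not touch path;
// true means "possibly changed", and the real trees must then be compared.
func (f *PathFilter) MayContain(path string) bool {
	h1, h2 := hashes(path)
	for i := uint32(0); i < numHashes; i++ {
		bit := (h1 + i*h2) % filterBits
		if f[bit/64]&(uint64(1)<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	var f PathFilter
	f.AddChanged("modules/git/commit_info.go")
	fmt.Println(f.MayContain("modules/git")) // true: parent directory of the changed file

	var empty PathFilter
	fmt.Println(empty.MayContain("modules/git")) // false: nothing was recorded
}
```

During a history walk, a commit whose filter answers `false` for every remaining path can be skipped without loading any of its tree objects, which is where the savings on rarely changed directories would come from.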


@lafriks commented on GitHub (Sep 26, 2018):

@filipnavara I looked through your changes in gitea and must say they look amazing, great work! :) I will look more into this once we have got 1.6.0 out the door


@filipnavara commented on GitHub (Sep 26, 2018):

@lafriks Thanks much! I found some flaws in the commit_info.go implementation for getLastCommitForPaths which I still plan to fix, but it would be great to get some of the changes upstream. I'll try to help as much as my time permits.


@filipnavara commented on GitHub (Sep 27, 2018):

I committed a fix for reporting the revisions across more complicated commit graphs and across merges.

My experiments with using serialized commit graphs are tracked at https://github.com/src-d/go-git/issues/965. The Gitea counterpart is at https://github.com/filipnavara/go-git/tree/commitgraph. The code is NOT production ready, does NOT handle errors correctly and, most importantly, leaks file handles at the moment. I am only sharing it to show what further performance improvements are achievable. There's a tool for generating the precomputed commit graphs in the go-git branch under _examples/commit-graph. It generates commit graph information in a Git 2.18+ compatible format with the addition of optional path filter data. This tool is SLOW and is meant to be run only once in a while (e.g. after repository import or along with `git gc`). It is possible to update this index incrementally, but that is not currently implemented. The precomputed information is only used for commits where it is available; otherwise standard Git objects are used. With the precomputed information I am now getting sub-second page loads for every directory listing in our repository, even if it contains paths changed 7+ years ago for which a lot of data would have to be read without the optimizations.


@stale[bot] commented on GitHub (Jan 8, 2019):

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.


@dfredell commented on GitHub (Mar 21, 2019):

Oh my @filipnavara Awesome job on the performance improvements. I pulled your perf-read branch and built it.
I have a repo with 25,000 files in one folder. The previous gitea web ui would take > 3 hours to load, but with your branch it loaded in 19s!
I would love to see this change make it into the master line.


@filipnavara commented on GitHub (Mar 21, 2019):

@dfredell Unfortunately I am busy and don't have time to upstream it. However I do update the branch every now and then to track upstream changes. Once #6364 gets merged I will do it again and probably open a PR to start the discussion.


@LukeOwlclaw commented on GitHub (Mar 28, 2019):

https://github.com/go-gitea/gitea/pull/6364 was merged yesterday! 😄


@jchook commented on GitHub (Sep 5, 2019):

This issue seems to persist on the demo site (e.g. for the [golang](https://try.gitea.io/tboerger/Go) repo), taking 4.2s to respond to an HTTP GET.

During evaluation, this kind of problem might cause someone to use [cgit](https://git.zx2c4.com/cgit/) instead.

@davidsvantesson commented on GitHub (Sep 13, 2019):

I have a repo with a folder with more than 2000 files. This takes ~25 seconds to load (not production site), of which 24 seconds are spent in getLastCommitForPaths (run from recent Gitea master branch).

In addition to any performance improvements possible, maybe a new option could be introduced to display only file names (without latest commit info) if a folder contains more than x entries (folders and files). That way very big folders can still be shown quickly but if you want to see commit details/history you need to enter the specific file.
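
A rough sketch of how such an option could behave is shown below. Nothing here is an existing Gitea setting or function: `maxWithInfo`, `Entry`, `listEntries` and the `lookup` callback (standing in for the expensive last-commit query) are all hypothetical.

```go
package main

import "fmt"

// Entry is one row of a directory listing.
type Entry struct {
	Name       string
	LastCommit string // empty when commit info was skipped
}

// listEntries returns the rows for a directory. The expensive lookup is only
// invoked when the directory has at most maxWithInfo entries; larger
// directories are listed by name only.
func listEntries(names []string, maxWithInfo int, lookup func([]string) map[string]string) []Entry {
	var commits map[string]string // stays nil when the lookup is skipped
	if len(names) <= maxWithInfo {
		commits = lookup(names)
	}
	entries := make([]Entry, len(names))
	for i, n := range names {
		// Reading from a nil map yields the zero value, i.e. "".
		entries[i] = Entry{Name: n, LastCommit: commits[n]}
	}
	return entries
}

func main() {
	lookup := func(names []string) map[string]string {
		out := make(map[string]string, len(names))
		for _, n := range names {
			out[n] = "abc123" // placeholder commit ID
		}
		return out
	}
	small := listEntries([]string{"a.txt", "b.txt"}, 2, lookup)
	large := listEntries([]string{"a.txt", "b.txt", "c.txt"}, 2, lookup)
	fmt.Printf("%q %q\n", small[0].LastCommit, large[0].LastCommit) // "abc123" ""
}
```

The template could then render an empty commit column, or a link to the file history, for rows where `LastCommit` is empty.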


@filipnavara commented on GitHub (Sep 13, 2019):

@davidsvantesson You can speed it up a bit by building a commit-graph file (`git commit-graph write`). I would be interested in how much it helps for your repository.


@davidsvantesson commented on GitHub (Sep 13, 2019):

@filipnavara That is very interesting, but I do not see any change in performance for listing repo files in Gitea. Maybe Gitea doesn't run operations where it benefits from it?

Edit: That is strange, because I have the code of #7314, but it doesn't seem to improve my performance. I will investigate it some more.


@davidsvantesson commented on GitHub (Sep 14, 2019):

I think the problem is that `getLastCommitForPaths` has to check the remaining paths against every commit. Take an extreme case where 2000 files are added in the initial commit and then 10000 commits are made that don't change anything in the folder: it then has to loop 20,000,000 times (2000*10000). In a more realistic case where the files are added one by one, it will still be about half of that.

It would be interesting to change to use something like git log --max-count=1 on each file to see how it affects the performance.


@filipnavara commented on GitHub (Sep 14, 2019):

Yes, that is the pathological case and there is no way around it unless you introduce some new cache or statistical structure (bloom filters) to speed this up. The algorithm in getLastCommitForPaths goes through the history only once and thus saves a lot of git object accesses compared to running git log on each file.


@davidsvantesson commented on GitHub (Sep 15, 2019):

@filipnavara A simple command-line git operation made it clear that Gitea is already very efficient. The limitation seems to be in Git itself. The performance of this operation can't be improved much, since git doesn't cache the information we want in the tree and doesn't directly store which files are changed by a commit, so we get this high order of complexity. I find it a bit strange that there is no option to cache additional information in git to speed this up, as it should be quite a common use case.

I still think not showing this information (by default) for very large folders can be useful for these special cases.


@guillep2k commented on GitHub (Sep 15, 2019):

It's normal to warn the user if the diff will be too large, or there are too many files to diff. So for this operation too I think it's useful to hold back on the details if there is some indication that the operation will take too long (e.g. repository size? some statistics?).


@lunny commented on GitHub (Sep 15, 2019):

I think it could be improved by adding a cache system in front of the git command.


@davidsvantesson commented on GitHub (Sep 15, 2019):

A cache system would be good for viewing the "HEAD", which is what most people use. It wouldn't help if someone wants to browse old history, unless some cache option is built into git for all trees (which I think would be outside the scope of Gitea).

@guillep2k I thought it could be based on the number of entries (folders and files) in the folder being displayed. However a more true indication of the time needed will be the number of entries times the number of commits (in that folder), which you still can obtain with little effort.


@filipnavara commented on GitHub (Sep 15, 2019):

I have a prototype implementation of the [git bloom filters](https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-bloom-filters/), which speed up browsing both HEAD and old history. I didn't pursue it further because I was waiting for an official git implementation. That said, I can revive it if anyone is brave enough to give it a try.

@strk commented on GitHub (Oct 22, 2019):

This is still a problem with 1.9.4 (for the record). I get 6 seconds rendering for a mirror of qgis:
https://dev.git.osgeo.org/gitea/qgis/qgis

Note that try.gitea.io gives a 500 (Internal Server Error) on the page I tried to setup for that:
https://try.gitea.io/strk/QGIS


@davidsvantesson commented on GitHub (Oct 22, 2019):

@filipnavara Do you have an insight in the chances that bloom filters get into git officially anytime soon?

What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?


@filipnavara commented on GitHub (Oct 22, 2019):

> Do you have an insight in the chances that bloom filters get into git officially anytime soon?

I don't know if there has been any progress. There were a few people who were interested in it, but it didn't really move forward except for a few experimental implementations at the end of last year.

> What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?

Azure GIT hosting does exactly that. It is perfectly doable and a viable approach in the short term, at a small storage expense to duplicate some data structures. I will be happy to release my experimental implementation if someone wants to pick it up after me. I wrote all the code for reading and producing the bloom filters. The reading part was easy to integrate into Gitea. The writing part I did not integrate at all and that still needs to be done (manual index building and scheduled index building). I currently don't have any free engineering hours to dedicate to the project, but I will be more than happy to help with it in any other way.


@guillep2k commented on GitHub (Oct 22, 2019):

> > What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?

@davidsvantesson I don't know what bloom filters are, but Gitea currently supports [a considerable span of git versions](https://docs.gitea.io/en-us/#system-requirements), and there are plans to migrate to a pure golang implementation (I don't recall the library name). So, I wouldn't count much on implementing something that requires the latest git version. 😅

@filipnavara commented on GitHub (Oct 22, 2019):

@guillep2k It is called go-git and I was one of the people doing the migration of Gitea code to use it. Coincidentally, I was also the person who added one of the latest git features to go-git (commit graph files) specifically to speed up Gitea file listing. I also implemented the bloom filters on top of go-git in a file format compatible with one of the implementations discussed on the git mailing list... so I would say that it is very much possible to use the latest git features if there is a use case for it and sufficient demand.


@guillep2k commented on GitHub (Oct 22, 2019):

@filipnavara Yes, go-git was it. What I meant is that we should not count on users having the latest git installed on their systems. We can certainly provide the feature if it's implemented inside Gitea itself.


@filipnavara commented on GitHub (Oct 22, 2019):

> We can certainly provide the feature if it's implemented inside Gitea itself.

That's exactly what I do - both in Gitea and indirectly in go-git. The commitgraph file and the bloom filters are optional git indexes stored in the .git directory. Gitea/go-git can consume and generate them, and a new enough git can use them if they exist.


@zeripath commented on GitHub (Oct 30, 2019):

It seems that some of the docker users aren't getting the git commitGraph gitconfig changes.

Reference: github-starred/gitea#166