mirror of
https://github.com/go-gitea/gitea.git
synced 2026-03-21 22:16:14 -05:00
Proposal: An abstract layer for managed git repositories #12427
Open
opened 2025-11-02 10:09:24 -06:00 by GiteaMirror
·
1 comment
type/proposal
Originally created by @lunny on GitHub (Feb 3, 2024).
Background
As more and more large Gitea instances appear, the current implementation shows the following drawbacks.
Git repositories are stored on disk under a single directory. This is hard to scale for large Gitea instances, because the repository's absolute path is already used everywhere.
Git itself supports shared repositories, but Gitea does not yet use this feature to reduce the disk usage of forked repositories. Some design questions need to be considered. Which repository should be the root of the base and forked repositories? Should there be a hidden repository acting as the root? This is also related to the layer.
When renaming a repository or a user, some folders need to be renamed, and these operations are mixed with database transactions. There is a high risk of inconsistency between the names on disk and the database records.
Purpose
I therefore propose an abstraction layer for managed repositories.
What are managed repositories? We currently have the git package, which can handle all git repositories. Some repositories are created for pushing, editing and various other reasons. Others, such as the code repository, wiki repository, profile repository and package repositories, are what we call managed repositories: they are not created and destroyed for one specific operation.
All operations on managed repositories will depend on a new package named gitrepo rather than depending directly on the git package.
I think this has some benefits.
gitrepo package. After all abstractions are completed, we can have a proxy mode inside the gitrepo package, i.e. OriginalGitStorageService could keep the original logic with a root repositories path, while HTTPGitStorageService could store the managed git repositories on a server separate from the Gitea server and provide an HTTP service to read/write managed git repositories.
Concepts
I previously sent some PRs that tried to introduce a layer in modules/git, but I found that was not the right direction. The modules/git package should be a base package that always focuses on handling on-disk operations, whether the repository is a managed one, a wiki one, a temporary one or a hidden one. So I think some concepts need to be introduced for clarity.
modules/git: This package should be a low-level package that can handle any on-disk git repository. For managed git repositories, a new package should be introduced.
modules/gitrepo: This is the new package, introduced as an abstraction layer to handle managed git repositories. It may include different storage strategies, but the interface exposed to other packages stays almost the same as before, hiding the implementation details. This package will depend on modules/git and should not depend on any models packages. Other modules and services layer packages can depend on it.
Refactoring
To achieve this, we need to do some refactoring.
Move managed git operations and setting.RepoRootPath to the modules/gitrepo package.
All operations related to managed git repositories should move to the gitrepo package rather than depend on modules/git directly.
modules/git is still useful. It can handle temporary repositories, and modules/gitrepo depends on it.
An abstract storage repository interface like
Therefore, we need CodeStorageRepository, WikiStorageRepository, ProfileStorageRepository and PackageRepository, which implement this interface.
The interface should only focus on the storage of managed git repositories.
All functions under modules/gitrepo should take this interface as the second parameter; the first is context.Context.
Storage strategies
The relative path is currently generated dynamically from the owner name and repository name. It should instead be stored in the database, so we can add some new columns to the repository database table, i.e.
For generating the storage path, we can introduce different storage strategies, i.e.
The strategy should be applied only to newly created repositories; previously created repositories will rely on the database column for their storage relative path.
Some strategies will require disk operations when renaming, which should be part of the strategy.
We can have a conversion tool to convert the traditional relative path strategy to the hashed relative path. The hashed relative path will use the repository’s ID which is a 64-bit
Multiple storage services
After the first two steps, we have enough abstraction to introduce GitStorageService. A GitStorageService could have such an interface
A repository interface
Since for difference
Git Objects rewrite
Many git objects contain a reference to git.Repository, which prevents the abstraction above, so a preparatory step is to remove that reference from git objects such as git.Commit, git.Tag, etc.
Related PRs
#28937
#28940
#28966
@silverwind commented on GitHub (Mar 9, 2024):
That will be a massive benefit for big hosters with many forks per repo and this is also how GitHub works under the hood. A repo and all of its forks use a shared git repo on the server, so if a repo has 1000 forks, you are only storing their changed branches.
Care needs to be taken to prevent cross-repo influences. GitHub also had a number of issues related to this in the past (this comes to mind).