mirror of
https://github.com/go-gitea/gitea.git
synced 2026-03-13 11:31:28 -05:00
Migration of repositories with tags to organisation fails #4127
Closed · opened 2025-11-02 05:39:21 -06:00 by GiteaMirror · 29 comments
Originally created by @oaxiento on GitHub (Oct 16, 2019).
Description
We are currently migrating all of our 230 repositories to Gitea. Here's how we do it:
To keep it simple our Gitea instance has just one organisation with around 110 "Owner" team members. When we migrate repositories with more than 50 tags to that organisation, the git client pushes everything without an error, but Gitea fails when creating releases for every tag with the following message:
In the Gitea logfile I can see that Gitea calls `repo_permission.go` for every tag and every user of that organisation, which results in a lot of calls and presumably database queries.

When migrating that same repository to a "standalone" user account, everything works as expected: all releases get created, and the Gitea logfile shows that Gitea doesn't call `repo_permission.go` in this case.

We also tried to migrate a very large repository with around 15,000 commits and 7,000 tags. Although we get an error from the git client, everything (tags, commits) gets transmitted and all releases get created by Gitea when migrating to a user repository.
Long story short: when we migrate repositories with more than 50 tags (or far more) to a user account, everything works. When we migrate them to an organisation with a lot of members, Gitea gets stuck while creating releases from tags.
Workaround
As a workaround we first migrate to a user account and then transmit the ownership of the repository to the organisation.
@guillep2k commented on GitHub (Oct 16, 2019):
This seems to be related to #8528 (#8273).
I'd recommend changing `app.ini` and setting:

(I'd experiment with the values and see if they improve the results.)
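The actual settings block is elided in this mirror, but the keys named later in this thread (`MAX_IDLE_CONNS`, `CONN_MAX_LIFE_TIME`) live in the `[database]` section of `app.ini`. An illustrative fragment; the values are guesses to experiment with, not the ones from the original comment:

```ini
; [database] section of app.ini — keys as discussed later in this thread.
; Values below are illustrative starting points only.
[database]
MAX_IDLE_CONNS     = 10
CONN_MAX_LIFE_TIME = 5m
```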
@zeripath commented on GitHub (Oct 16, 2019):
Is easier to read.
@oaxiento commented on GitHub (Oct 17, 2019):
Thank you both for your quick response. I changed the database settings and tried migrating one of our biggest repositories (>6,000 tags, >14,000 commits) to our Gitea organisation. After creating 854 releases and around 100,000 `repo_permission.go` calls I see the same error though.

I think this is an efficiency issue, as I don't see why there have to be so many `repo_permission.go` calls when creating releases from tags. When I migrate the same repository to a user account and then move it to the organisation, `repo_permission.go` only runs once, for the user transferring the repository.

@lunny commented on GitHub (Oct 17, 2019):
@mansshardt Thanks for your reports. Could you share your database settings?
@oaxiento commented on GitHub (Oct 17, 2019):
Sure! Here they are:
@lunny commented on GitHub (Oct 17, 2019):
`300` means 300 nanoseconds, NOT 300 seconds; that's too short. Could you change that to `5m` and try again?

@guillep2k commented on GitHub (Oct 17, 2019):
Sorry, my fault about the nanoseconds.
@zeripath commented on GitHub (Oct 17, 2019):
My fault too. I should have checked.
@oaxiento commented on GitHub (Oct 17, 2019):
No problem! I changed the setting to `CONN_MAX_LIFE_TIME = 5m` and restarted Gitea. Sadly, this didn't make a difference. After creating 766 releases I still get the same error.

@guillep2k commented on GitHub (Oct 17, 2019):
This is most likely because of attempting too many connections to MariaDB, but I thought this would be solved by the new settings.
As TCP can only cycle through fewer than 64,515 socket pairs in a short lapse of time (closed sockets linger in TIME_WAIT), and this import operation seems to be depleting that range, perhaps you can try using UNIX sockets for your connection to MariaDB instead?
@oaxiento commented on GitHub (Oct 17, 2019):
I just pushed (--mirror) again and checked the connections to the database via
netstat -an | grep ':3306' | wc -l. During the creation of releases I have around 25,000 tcp sockets in TIME_WAIT. As mentioned before I'd guess that every permission call results in one database connection which doesn't get reused.I don't understand why so many connections are nessecary in the first place. In my opinion fiddling with MariaDB properties or database settings to mitigate the issue shouldn't be a solution. I also can't use unix sockets as the database is not on the same machine.
@guillep2k commented on GitHub (Oct 17, 2019):
@mansshardt I agree that this should not be the solution. I was only offering alternatives. 😄
For some reason
MAX_IDLE_CONNSdidn't have any effect. AFAIU it should have had the effect of reusing the connections for most transactions. Especially afterCONN_MAX_LIFE_TIME = 5m.@zeripath commented on GitHub (Oct 17, 2019):
You probably need my PR, which allows you to set `MaxOpenConns` to prevent too many open connections to the DB.
@guillep2k commented on GitHub (Oct 17, 2019):
@zeripath If I'm not mistaken, the problem in this issue is the number of closed connections, not the open ones.
@oaxiento commented on GitHub (Oct 18, 2019):
@guillep2k Sure, and thanks again for your input.
The issue is not that there are too many open connections to the database, as @guillep2k mentioned. The problem is that during the creation of releases from tags, a huge number of connections get established and quickly closed. This doesn't seem very efficient. As the system holds a TCP socket in TIME_WAIT for some time, we have a lot of these (~25,000) during the creation of releases from tags.
It seems that `MAX_IDLE_CONNS` and `CONN_MAX_LIFE_TIME = 5m` don't have any effect. I would expect these settings to build some kind of connection pool, which they don't.

I think two elements are relevant in this issue:
@guillep2k commented on GitHub (Oct 18, 2019):
I've checked the docs and the sources and I couldn't find a reason why the connections are not pooled.
https://www.alexedwards.net/blog/configuring-sqldb
@zeripath It looks like your PR #8528 could be related after all, but the default value should be working nonetheless. In theory, by not calling `SetMaxOpenConns()` we're effectively using `MaxOpenConns == 0`, which should allow for any value in `SetMaxIdleConns()`. But that somehow is not working.

@zeripath commented on GitHub (Oct 19, 2019):
@guillep2k yeah, so a long enough lifetime, a large enough `MaxOpenConns`, and a `MaxIdleConns` near `MaxOpenConns` should prevent the rapid opening and closing of connections, avoiding the port-number depletion and at least papering over this implementation fault.
We need to think about these permissions calls a bit better and consider if we can cache these results in some way.
In this particular case if we look at
cmd/hook.go:280f4bebbf/cmd/hook.go (L124-L186)Within the context of the hook we read each line of the provided stdin. We get one line per updated ref and they are of the form:
This then gets translated to a GET request to the Gitea server calling 280f4bebbf/routers/private/hook.go (L126-L244). This has the benefit of meaning each commit SHA id is logged for free, but if you're updating a lot of refs, that means you get a lot of separate HTTP requests.
Pre-receive has a similar architecture.
Now, that architecture means that even if we were doing this within a single session we wouldn't get much benefit from session caching - although it might have some benefit.
A better architecture would be to pass all of the refs in a POST; we could then create a `repofiles.PushUpdates` which could have all the updated refs.

Unfortunately, when I made these changes to the hooks I considered but dismissed the idea that anyone would be likely to send almost a thousand updates in one push, so in terms of doing the least work I only made the simplest implementation.
@guillep2k commented on GitHub (Oct 19, 2019):
Although optimization is always a good thing, I think the root of the problem here is the connection pool. It will bite us back anytime, not only with the migration of large repositories.
@zeripath commented on GitHub (Oct 19, 2019):
Yeah, at the end of the day - it doesn't matter how efficient this bit of code is - if you have not configured the pool properly you could run it out of connections with the correct kind of load.
At least with #8528 we will expose all the configurables that Go provides to the user - if that's still not enough then we'll have to think about writing our own pool (one which could, at the least, handle this error and wait).
If `MaxOpenConns` and `MaxIdleConns` are equal, then there should be at most `MSL * MaxOpenConns / MaxLifetime` TIME_WAIT connections. If you change `MaxIdleConns` to be different from `MaxOpenConns`, you're likely to need to increase the max lifetime, but there will be a point at which there is no stable solution.
Without setting `MaxOpenConns`, a sufficient load will cause port exhaustion.
@guillep2k commented on GitHub (Oct 19, 2019):
@zeripath Mmm... I was about to write a long explanation of how `MaxOpenConns` should not affect the number of closed connections, but now I think I see your point. The only way to keep the system from creating TIME_WAIT entries is to keep it from closing connections as much as possible, so `MaxIdleConns` should be equal to `MaxOpenConns` in this type of application, where many users can be doing operations at the same time.
Again, your PR seems on point. What I wonder is: what's the strategy in the database driver for scheduling the connection requests when they are all busy? FIFO? Are there others available?
@zeripath commented on GitHub (Oct 19, 2019):
Without looking at the code I would guess it's actually "random" - the most obvious implementation is a simple spin lock with a wait until you actually get the real lock. I would bet there is no formal queue - too expensive - so we're into OS level queuing algorithms, suggesting a likely bias towards LIFO.
@guillep2k commented on GitHub (Oct 19, 2019):
@mansshardt I know it's a bit of a hassle, but is it possible for you to grab @zeripath's PR, build from source, and try these settings?

(It's important for the test that the first two values match.)
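The settings themselves are elided in this mirror; presumably the "first two values" are the `MAX_OPEN_CONNS` and `MAX_IDLE_CONNS` keys introduced by PR #8528. An illustrative fragment with guessed values, not the ones from the original comment:

```ini
[database]
MAX_OPEN_CONNS     = 20   ; must equal MAX_IDLE_CONNS for this test
MAX_IDLE_CONNS     = 20
CONN_MAX_LIFE_TIME = 5m
```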
@oaxiento commented on GitHub (Oct 19, 2019):
@guillep2k I will try that on Monday when I am back at the office and get back to you.
@oaxiento commented on GitHub (Oct 21, 2019):
I just had the chance to test with a build from @zeripath PR and the following db settings:
With this build and settings I can see a proper database pooling. During migration I have two or three tcp sockets in
ESTABLISHEDstate, which get used randomly. After five minutes the sockets go in TIME_WAIT and then get removed, as expected. With this build the migration of huge repos with a lot of tags works well, even though it's pretty slow. But all releases get created properly. I think #8602 should improve the speed issue. Hope to see both fixes/improvements in a official release soon.@zeripath commented on GitHub (Oct 21, 2019):
@mansshardt would you be able to try #8602? It would be a great test of the code if it worked.
@oaxiento commented on GitHub (Oct 21, 2019):
@zeripath With my build from #8602 git doesn't timeout anymore, but just 190 of over 6000 tags got created as releases. Gitea didn't throw any error.
@zeripath commented on GitHub (Oct 21, 2019):
Damn that means that I have a bug...
@zeripath commented on GitHub (Oct 21, 2019):
OK @mansshardt, I think that 200 is too large a batch size for Gitea to process without the internal request timing out, and that's why you only get ~200 processed.
@stale[bot] commented on GitHub (Dec 20, 2019):
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.