mirror of
https://github.com/go-gitea/gitea.git
synced 2026-03-16 13:13:23 -05:00
Migrations permanently stuck if gitea is restarted during the migration #6293
Closed
opened 2025-11-02 06:51:15 -06:00 by GiteaMirror
·
19 comments
Labels: type/bug
Originally created by @Qix- on GitHub (Nov 11, 2020).
Description
https://github.com/go-gitea/gitea/issues/8812#issuecomment-549700212
Same as mentioned there. Forcefully restarting gitea while a migration is happening will cause any unfinished/pending migrations to hang indefinitely. Manually running the cron tasks in the administration panel does nothing.
I just spent about 5 hours scouring the web for clone links for a bunch of dependencies we need to mirror, I would really prefer not to have to do that again.
@Qix- commented on GitHub (Nov 11, 2020):
This definitely should get fixed permanently, but I'm also open to any manual workarounds that don't involve deleting each of the 74 mirrors I just created and re-initializing all of them manually...
@lunny commented on GitHub (Nov 11, 2020):
Just delete the repository from admin panel and then migrate it again.
@Qix- commented on GitHub (Nov 11, 2020):
I will spend another 5 hours re-initializing all of them >.> is there a way to kick off the migrations manually?
@zeripath commented on GitHub (Nov 11, 2020):
You could use the API?
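As an illustration of the API route, here is a minimal Go sketch that builds a request for Gitea's `POST /api/v1/repos/migrate` endpoint, so a list of clone URLs could be re-submitted in a loop. The `migrateOptions` type and `newMigrateRequest` helper are hypothetical names, and the JSON field set is an assumption that may differ between Gitea versions (older releases may expect a numeric `uid` rather than `repo_owner`); check the Swagger page of your own instance before relying on it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// migrateOptions is a hypothetical subset of the migrate request body.
// The field names are assumptions; verify them against your instance's
// API documentation.
type migrateOptions struct {
	CloneAddr string `json:"clone_addr"`
	RepoName  string `json:"repo_name"`
	RepoOwner string `json:"repo_owner,omitempty"`
	Mirror    bool   `json:"mirror"`
}

// newMigrateRequest builds, but does not send, the HTTP request that asks
// Gitea to migrate (or mirror) one repository.
func newMigrateRequest(baseURL, token string, opts migrateOptions) (*http.Request, error) {
	body, err := json.Marshal(opts)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/repos/migrate", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	// Token authentication; the token is created in the user's settings.
	req.Header.Set("Authorization", "token "+token)
	return req, nil
}
```

Sending each request with `http.DefaultClient.Do(req)` inside a loop over the saved clone links would avoid clicking through the migration form once per repository.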
@zeripath commented on GitHub (Nov 11, 2020):
why is your gitea being restarted so much?
@zeripath commented on GitHub (Nov 11, 2020):
(This also leads to the question as to why the migration isn't being cancelled when the machine is restarted, and why the migration stuff isn't restartable...)
@Qix- commented on GitHub (Nov 11, 2020):
Well, it froze and prevented anyone from SSHing in and had to be force killed. 🙃 So 0 for 2 now.
Why is gitea not fault tolerant is a question-as-a-response, lol.
@zeripath commented on GitHub (Nov 11, 2020):
OK so you're running SQLite in production and you've hit #13271.
That was fixed by: #13505 and should be fixed in 1.13 by #13507.
Bugs happen. No-one is paying any of us to work on Gitea.
@Qix- commented on GitHub (Nov 11, 2020):
Yes, I'm fully aware how OSS works (check my profile). The sort of silly question why I'm restarting gitea (faults happen in production...) deserved an answer in-kind. Gitea is not critical for us, I'm not demanding anything, etc.
Thank you for the links, I'll patiently await 1.13 then 🙂
@6543 commented on GitHub (Nov 11, 2020):
@Qix- gitea is trying to be tolerant, it's just that SQLite is very limited ... so if you don't use it for ~5 repos but mirror 74+ repos and more, you really should consider moving to MySQL
@Qix- commented on GitHub (Nov 11, 2020):
@6543 Why? SQLite is very robust if used correctly. It's been around for decades and is used successfully in production (see: android) every day by billions of users.
That's a weak argument. I'm not trying to debate here, I was simply reporting a bug. There's no reason, however, to insinuate that lack of fault tolerance is somehow my fault. It's a bug, it's nobody's fault, and I'm grateful for the project of course.
I was simply filing a bug.
@6543 commented on GitHub (Nov 11, 2020):
I have nothing against you, I just want to point out that SQLite easily deadlocks when it is used by multiple actors (yes we are trying to get rid of it).
And thanks for bug-reporting, without reports we would not be aware of many bugs 👍
@zeripath commented on GitHub (Nov 11, 2020):
@Qix- I'm sorry if you thought that: https://github.com/go-gitea/gitea/issues/13513#issuecomment-725354404 was an inappropriate question
It isn't inappropriate, because repos should get deleted if the migration is cancelled because of shutdown. The deadlock explains why they weren't and is the root cause of the problems you are seeing.
@Qix- commented on GitHub (Nov 11, 2020):
I merely insinuated that a web service would be more robust if it could survive unexpected shutdowns. Gitea being force-killed put it into a corrupted state that cannot be resumed or error-corrected, which is a dist-sys problem.
I'm a dist-sys architect; asking me "why are you restarting [a web service]" is like asking me "why did you make your server's power go out during a thunderstorm?". I didn't want that to happen, but it happens. A robust service would be fault tolerant of that.
With a single instance running, I highly doubt this is purely SQLite's fault (there are not multiple actors here). Perhaps I'm missing implementation details, but it seems like maybe something could be improved to increase the robustness against failures.
That's all I was implying. 🙂 I wasn't trying to put anyone down, but I didn't see how the question fit the bug report at all.
@zeripath commented on GitHub (Nov 11, 2020):
(@Qix- your replies are reading very aggressively - I'm sorry if mine are reading in the same way. I'm not trying to be aggressive or defensive here.)
There already is code to clean up a migration if it fails or gitea is shut down during a migration - however, this relies on the db not being totally deadlocked at that point.
Clearly, that is not a completely robust solution: assuming that the connection to the db was OK at shutdown is probably not something we can rely on, and we rather need something that can look at in-progress tasks and allow them to be cancelled. It's worth noting, however, that if SQLite has gone down like this we're in serious trouble: the goroutines block until the db context is killed at hammer, by which time all git operations have to die too. The migration as a whole could and should have a context which is cancelled at shutdown, but xorm does not provide a way for us to make a db request with a specific context (AFAIK), so I don't think there is a way. <- OK, it looks like this is actually possible, we just need to set the session context. This would mean propagating the context down to the models package.
Sequencing these things is not simple - and the answer is that sqlite deadlocks are IMO critical security issues to be solved as soon as possible.
Now it would be helpful to provide some way of cancelling migrations - which has been discussed on a different issue and is also not simple. Tasks can run on different gitea instances, so the request to cancel a migration would have to be published somewhere and then picked up by the gitea instance running the task before it could be cancelled. But of course that would not solve the issue you were having, as it was due to a deadlock.
I hope that now you see why asking why you were stopping and starting gitea so much is relevant. If you're having to stop and start a web service constantly because of a problem with it, the bug that is forcing you to restart may be the actual cause of what you're seeing.
@Qix- commented on GitHub (Nov 11, 2020):
I'm not being aggressive, I just seem to have a different viewpoint than you about software robustness.
A fault tolerant web service has the property that, in the event of a failure of any kind, it is able to error-correct and resume operations without manual intervention.
There could be a new cron-job; pseudo-code:
I don't see how what I'm saying is "aggressive", I apologize if you've perceived it that way. However, I'm not going to pretend the current behavior is correct or that it's not a bug. If you're not interested in fixing it, that's fine - I can find another solution, it's not a problem. However, I wanted to let you know that this is indeed an issue and that I simply wanted to express that the two responses - "why are you restarting?" and "It's SQLite's problem" - don't make much sense to me as they do not address the fault tolerance point.
If SQLite makes it easy for gitea to fail, then gitea should probably have error-correcting logic to correct any errors SQLite might cause.
That's all.
@6543 commented on GitHub (Nov 11, 2020):
@Qix- Since what you suggest is a new topic i have created a new issue ... #13515
keep bugs and requests separated ...
@Qix- commented on GitHub (Nov 11, 2020):
I had to restart it once. I don't know where you got the idea that I was just constantly bringing it up and down. It froze the entire external sshd instance once and that was enough for it to ignore all of the migrations.
@techknowlogick commented on GitHub (Nov 11, 2020):
Locking as this issue has been closed and whenever a comment is made 400+ people get an email.