Mirror of https://github.com/go-gitea/gitea.git · synced 2026-03-19 06:32:57 -05:00
Recurring freezes on BSD #8318
Closed · opened 2025-11-02 08:02:05 -06:00 by GiteaMirror · 34 comments
Originally created by @phryk on GitHub (Jan 4, 2022).
Gitea Version
1.15.8
Git Version
2.34.1
Operating System
FreeBSD
How are you running Gitea?
Installed from a custom package repository built by a local poudriere.
The same problem occurred with versions from the official FreeBSD pkg repo.
Gitea is running within a jail and has an nginx running in the same jail in front of it as a transparent proxy.
Database
PostgreSQL
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
I'm not sure what that even means. Do you want a paste?
Description
I've been experiencing recurring freezes where only `killall -9 gitea` helps, for about a year now. This happens anywhere from a couple of hours to a couple of weeks after the service is started.
I cannot reproduce this issue at will, but if I wait long enough it always shows up.
I have previously tried setting the log level to Debug, but haven't seen anything that
would tell me what the actual issue is. As this happened last summer, I don't have
those particular logs anymore, but have re-escalated my log level to Debug to be able
to attach a log when this bug next hits.
My latest attempt in figuring out what's going on was to enable gitea's metrics and have
them graphed in Grafana, but every single metric just shows a straight line up to the point
when gitea freezes, which is when data stops coming in.
In the meantime, until I can attach a log: any hunches on what might trigger this behavior?
Any other data I can supply to help triage this issue?
Best wishes,
phryk
Screenshots
No response
@lunny commented on GitHub (Jan 4, 2022):
How did you compile Gitea yourself?
And could you capture the CPU, disk, and memory usage when Gitea freezes?
@phryk commented on GitHub (Jan 4, 2022):
It's built by poudriere, which is a bulk package builder for FreeBSD that
uses FreeBSD's ports system to do the actual building.
You can find out more about it here: https://github.com/freebsd/poudriere/wiki
Actual disk usage I sadly don't have data for, but here are graphs for the other metrics, as well as gitea and nginx (the green one is gitea, or rather the transparent proxy for it).
@phryk commented on GitHub (Jan 4, 2022):
I should probably also mention that no other services seem to be affected by this, so it looks like it's not hogging any system resources (or rather not any used by anything else) when it freezes – it just stops doing stuff.
@zeripath commented on GitHub (Jan 4, 2022):
This sounds like a deadlock happening somewhere but postgres isn't usually a DB that suffers from them. (SQLite is usually the way we detect these.) Apart from DB deadlocks I'm not sure there's any other obvious thing that would cause a deadlock.
One problem in 1.15.7 and earlier was that `git cat-file --batch` wouldn't fail fast if the git repo is broken, but #17992 should prevent that. Is this definitely still happening on 1.15.8?

If so, a few thoughts:
@phryk commented on GitHub (Jan 4, 2022):
`grep -i error gitea.log` shows nothing of interest; the only legitimate errors are from gitea being unable to update a mirror whose source repo got DMCA'd…
@zeripath commented on GitHub (Jan 7, 2022):
The logs appear somewhat confusing:
This implies that the problem has occurred somewhere in:
b25a571bc9/modules/context/repo.go (L551-L558)

But the problem is that there isn't really any place for a deadlock to occur in there.
My thoughts for progressing this further are to apply the following:
That would help us to see if there is deadlock somewhere.
One final random thought is maybe the problem is in log/file.go - I guess it's worth another review.
@zeripath commented on GitHub (Jan 15, 2022):
Have you seen any further freezes?
@phryk commented on GitHub (Jan 16, 2022):
Yes and no. The bug seems to have triggered again, but this time the process actually managed to die.
Logs are on their way to you.
@zeripath commented on GitHub (Jan 16, 2022):
OK, so somewhat relievingly, the problem is not in `process.GetManager().Add()`, and the fact that the error has occurred in the same place suggests that logging is not to blame.
The problem lies somewhere in:
6cb5069bf6/modules/git/command.go (L123-L155)

The only places left therefore are:

- `ctx, cancel := context.WithTimeout(c.parentContext, timeout)` – is this deadlocking somehow?
- `cmd := exec.CommandContext(ctx, c.name, c.args...)` – this requires stating the filesystem. Is the filesystem locking for some reason?
- `if err := cmd.Start(); err != nil {` – this involves starting the process itself, so maybe there is some issue with bsd process creation here.

None of these are looking greatly soluble. I guess adding some logging to this section of code would be the only thing to do to move things closer to working out what the possible reason is.
(remember github likes to pretend the final empty line doesn't exist so if you copy this add a terminal empty line.)
@phryk commented on GitHub (Jan 16, 2022):
Patched, deployed and will get back next time the bug strikes. Thanks for the assistance. :)
@thearchivalone commented on GitHub (Jan 20, 2022):
@phryk thanks for posting this bug report. I'm having the same issue running Gitea in Bastille running postgres in the same container. I'm currently updating to the latest release but did notice that the hang happens the same way as you and have to fully restart the server when I need to update a setting or two (still getting Gitea moved over and configured properly from a Linux server).
Definitely seeing this issue with 1.15.10 on my end.
The easiest way to get it to trigger for me was to just spam
`service gitea restart` until it freezes at the "Waiting for PIDS" message. Sending this to the port maintainer.

@zeripath commented on GitHub (Jan 20, 2022):
@bedwardly-down as I said above this isn't looking like a Gitea bug per se. The three points of possible deadlock are all deep in go std library code and likely at system calls.
My greatest suspicion unfortunately falls on the context.WithTimeout call. If it's there, then that's a serious problem and working around it will not be easy (although we could simply drop the WithTimeout, assuming WithCancel is unaffected).
@thearchivalone commented on GitHub (Jan 20, 2022):
@zeripath thanks for responding. I'm waiting on the FreeBSD port maintainer to get back to me on it but go apps having this kind of issue is not uncommon on FreeBSD, according to other maintainers I've been interacting with in the IRC and official Discord channels. I ran a test yesterday that kind of goes along with your assumptions too.
I'm primarily involved with the Nodejs ecosystem right now, so I use the PM2 process manager and Nodemon pretty regularly. Running Gitea through PM2 and having that strapped to FreeBSD's init system showed that the port was attempting to start and stop at the exact same time in various places causing it to deadlock. I had a similar occurrence with Caddy and a few other go apps on my old Gentoo server (it uses a modified version of FreeBSD's init system with most of the core functionality being exactly the same between the two). Running Gitea and Caddy outside of init but instead directly seemed to work fine in several of my tests.
@phryk commented on GitHub (Jan 20, 2022):
Bug hit again. Logs will shortly be on the way to @zeripath; the last 3 lines from `gitea.log` look like this:

@zeripath commented on GitHub (Jan 20, 2022):
OK well we've established that the problem is in starting the process:
6cb5069bf6/modules/git/command.go (L148)

I think this is likely to be an os/jail problem. @bedwardly-down's comment suggests that perhaps the issue might be some deadlock in PM2 with processes being created at exactly the same time. If so, there's nothing we as Gitea can do.
You could try to use the gogit variant - as this will create a lot fewer calls to git - which might reduce the issue?
@thearchivalone commented on GitHub (Jan 22, 2022):
To further clarify, PM2 was not how I normally ran Gitea. It was a test to see what’s actually happening behind the scenes with a tool that has a built in monitor that prints some basic but useful information. It also doesn’t seem to have any issues with Sqlite3.
@thearchivalone commented on GitHub (Jan 23, 2022):
@phryk how dependent are you on getting your git up and running? A large chunk of my daily needs are built around git and version control, so this definitely was an inconvenience for me. I hope it gets fixed pretty quickly upstream.
@lunny commented on GitHub (Jan 23, 2022):
Have you caught the issue with pprof enabled? If so, could you upload the pprof report?
@thearchivalone commented on GitHub (Jan 23, 2022):
I haven’t tried or heard of that. I’ll have to tinker with that later.
@phryk commented on GitHub (Jan 23, 2022):
@lunny I've tried that before, but with the process frozen, pprof doesn't answer anymore either.
@zeripath recommended periodically polling pprof and saving the last couple results. I might get to that in a couple days.
Would be easier, if the info collected by pprof would also go into the openmetrics output as that's already being polled. :P
@bedwardly-down I'm running a private gitea instance which I use for all my projects, so it is a bit of an inconvenience for me personally, but I don't have anyone else depending on the service.
@zeripath commented on GitHub (Jan 23, 2022):
I'm really not sure pprof is going to help much. The problem appears to be in `os/exec/exec.go` within go's std library itself.

os/exec/exec.go: `(*cmd).Start()`

Walking through the code of Start in there points to:
As the place where the problem is.
syscall/exec_unix.go: `os.StartProcess(...)`

which on bsd, unix and linux all call `forkExec` in `syscall/exec_unix.go`. A cursory glance at this code shows:

Now, if there is a panic in there, the `ForkLock` could end up being left locked, but the panic should be seen in our logs and I see no evidence of this. Which leads me to think that either `forkExecPipe` or `forkAndExecInChild` is blocking.

forkExecPipe

Now `forkExecPipe` on linux is somewhat more complex than on bsd:

BSD (`syscall/forkpipe2.go`):

Linux (`syscall/exec_linux.go`):

But the complexity here is simply falling back. So if there is a deadlock here, it's in the syscall pipe2. https://www.freebsd.org/cgi/man.cgi?query=pipe2&sektion=2&format=html
The man page does not indicate that this could block but dtrace and ktrace will capture these calls.
forkAndExecInChild

The implementations between linux and bsd are substantially different here, and it's getting into the deep systems-programming level of starting processes that I'm afraid I know little to nothing about. The file is `syscall/exec_bsd.go`.

Fundamentally there is an assembly call into `runtime_BeforeFork`, which calls `systemstack(beforeFork)`, which tells the system stack to block signals. (Could the blocking of signals be causing a problem? Does your jail send up a signal if fork is blocked?)

Then https://www.freebsd.org/cgi/man.cgi?fork(2), and we check the pid returned; if we're the parent or an error occurs, it returns the pid or error after running `runtime_AfterFork()`, which calls `systemstack(afterFork)`, reversing the changes of fork.

Remaining part of forkExec

`afterForkLock.Unlock()` – I guess this could block, but I don't understand why it would take down the whole system.
Summary
- `os/exec/exec.go:(*cmd).Start()` calls `os.StartProcess(...)` and hence `forkExec` in `syscall/exec_unix.go`.
- `forkAndExecInChild` in `syscall/exec_bsd.go` calls `fork(2)`, which could be blocking either at that call or at the `systemstack` calls.
- `pipe(2)` in `forkAndExecInChild` could be blocking, but again this should be dtraceable/ktraceable.
- After `ForkLock` has unlocked, there is some code around reading the error from the child pipe, but this shouldn't kill the whole go process.

So we're likely looking at a bug in go's runtime, either due to some weird resource-limit handling in the jail and the way it reports issues to go, or, even more difficult to fix, some bug in the OS about creating processes after some limit has been reached.
Maybe this https://github.com/golang/go/issues/43873 is related?
You've never told us the version of FreeBSD you're running or given us any information about how you have set up the jail. Could the parameters of your jail be responsible?
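If someone wants to try to reproduce the suspected fork/exec hang outside Gitea, a small stress test that spawns many short-lived processes concurrently, the way Gitea spawns git, might do it (a sketch; worker counts and the `true` binary are arbitrary choices, and on an affected system this may simply hang rather than fail):

```go
package main

import (
	"context"
	"log"
	"os/exec"
	"sync"
	"time"
)

// spawnMany runs `workers` goroutines, each fork/exec-ing `perWorker`
// short-lived processes under a context timeout, mimicking Gitea's git
// invocation pattern. It returns the last error seen, if any.
func spawnMany(workers, perWorker int, name string) error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		last error
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				err := exec.CommandContext(ctx, name).Run()
				cancel()
				if err != nil {
					mu.Lock()
					last = err
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return last
}

func main() {
	if err := spawnMany(8, 200, "true"); err != nil {
		log.Fatal(err)
	}
	log.Println("all spawns completed without hanging")
}
```

Running this in the affected jail while watching with dtrace/ktrace could show whether the hang is reachable without Gitea at all.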
@thearchivalone commented on GitHub (Jan 23, 2022):
@zeripath Thanks for looking into this as much as you have. For mine:
@phryk I just had a thought: are you running postgres in the same jail as Gitea instead of separately and interacting with it through a port? To get it to work, it needed some modifications to its main config file to allow it to use more RAM and other system resources than most BSD jails initially allocate. I haven’t tried postgres in a separate jail yet since many online sources recommended encapsulating DB and its software in a single unit to allow easier transfer and backup. I wonder if the deadlock is caused because of that.
@thearchivalone commented on GitHub (Jan 23, 2022):
The issue wasn’t solved by moving postgres to a separate container for me. Even running postgres through a port and connecting that way caused gitea to hang. Removing the app.ini and letting it try to generate a new one also caused it to hang even with the /usr/local/etc/gitea/conf directory permissions set to 755 or 777. So far, it looks like SQLite is the definitive way to run it right now.
Can I get anyone else to try it with Sqlite? So far, that works in my tests but isn’t ideal if you have big repositories
@zeripath commented on GitHub (Jan 23, 2022):
@bedwardly-down are you sure that you have the same problem as @phryk? It would be useful to double-check where your hangs are happening. I cannot see why SQLite would be better for a problem relating to forking.
If you're finding that SQLite is better, then could you try connecting to postgres over a unix socket instead of a tcp port, as it might be that you're suffering tcp port exhaustion instead.
Thinking again about this blocking problem relating to fork, I wonder if the block is happening because a page fault occurs in fork and the signal cannot be handled?
@phryk what version of go are you building Gitea with? Please ensure that go is the most recent version. It might help to `allow.mlock` in the jail.

@thearchivalone commented on GitHub (Jan 23, 2022):
@zeripath honestly, it may be different. There’s not enough info from the original poster to get a clear picture and the only common thread we have is postgres and I’m grasping straws using my own limited knowledge of how gitea and its database support works.
@zeripath commented on GitHub (Jan 23, 2022):
@bedwardly-down could you apply the patch in https://github.com/go-gitea/gitea/issues/18180#issuecomment-1013916215 and check your logs when the deadlock occurs to see if it's at the same place.
If you find that the final log line is the same as in @phryk you're hitting the same problem. If not - well then we get to find another bug.
If switching to sqlite helps, then that means port exhaustion is more likely. Similarly, if using a unix socket for postgres helps, then it's far more likely to be a port-exhaustion problem.
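For reference, pointing Gitea at postgres over a unix socket is a small app.ini change: set `HOST` to the socket directory instead of host:port. A sketch only, assuming peer auth and the common FreeBSD/Linux socket location; check your postgres `unix_socket_directories` setting:

```ini
; app.ini [database] section - illustrative values, not a drop-in config.
; A HOST that is a filesystem path makes Gitea connect over a unix socket.
[database]
DB_TYPE = postgres
HOST    = /var/run/postgresql/
NAME    = gitea
USER    = gitea
; with peer/trust auth over the socket, PASSWD can stay empty
PASSWD  =
```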
@thearchivalone commented on GitHub (Jan 23, 2022):
Didn’t apply the patch. Running it straight through the unix socket inside the same container finally solved my issue. Looking at the system logs, there was no indication that anything was happening prior running through a port from either a separate or local port that my permissions were wrong within Postgres. I fixed them and can now run Gitea with no issues. So, I’ll have to agree that mine was probably port exhaustion as you suggested.
@zeripath commented on GitHub (Jan 23, 2022):
If you're using an http/https proxy like nginx you should also be able to make gitea run as http+unix which would then prevent port exhaustion (except through migrations) as a cause for problems. (Well I say "eliminate" I mean stop Gitea from being the cause - the port exhaustion could still happen in nginx.)
Actually it looks like Go is now setting SO_REUSEADDR so our server listening shouldn't be affected by port exhaustion - perhaps the DB connection sockets aren't set with SO_REUSEADDR?
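The http+unix setup suggested above is likewise an app.ini change; a sketch (socket path and permission value are assumptions to adapt to your nginx user):

```ini
; app.ini [server] section - sketch only; adjust paths and permissions.
[server]
PROTOCOL  = http+unix
; the socket Gitea listens on; nginx's proxy_pass points at this socket
HTTP_ADDR = /var/run/gitea/gitea.sock
; open permissions so the nginx user can connect (tighten as appropriate)
UNIX_SOCKET_PERMISSION = 666
```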
@thearchivalone commented on GitHub (Jan 23, 2022):
I’m using Caddy in a separate container and reverse proxying straight from the local container ports. I don't believe Bastille supports unix sockets right now but that's possibly in the works. I have under 10 containers so far and caddy is only currently serving 3 while the rest are completely internal. That’s not good. I’ll have to research a bit on how to wrangle the containers into a single socket. Thanks.
@tsowa commented on GitHub (Jan 25, 2022):
I have been using gitea for more than a year and have never had such problems. My configuration:
$ uname -srm
FreeBSD 13.0-STABLE amd64
Jails were created with make distribution and make installworld from /usr/src; actually I am using one jail as a template and then only tar -x to the specific location. The main system (host) is built from source; after the host is built, the jails are updated with the same binaries: I am using a custom script which does mergemaster, make installworld, make delete-old(libs) to the specific jail location with -D and DESTDIR parameters. Packages are built in a different jail (I am not using poudriere) and are available to other jails through a custom pkg repository.
From pf (packet filter) I am making a redirect to a jail with nginx:
In the jail the nginx config looks like:
proxy-headers.conf:
ssl-params.conf:
Nginx makes a proxy to the jail where gitea is installed together with postgresql.
My jails config /etc/jail.conf:
Make sure you have sysv* set in your config if you are using postgresql.
My limits for the gitea jail:
and the average resource use:
In fact, there are three separate instances of gitea in the jail - each with custom startup script. But the startup script is based on the standard /usr/local/etc/rc.d/gitea so this should not be the problem.
Someone said above that this error can be provoked by service restart so I have tried:
but nothing wrong happened. Maybe this depends on the kind of git repository. If you wish, I can prepare an empty gitea instance for you, and you can then interact with it to see if this problem occurs.
@tsowa commented on GitHub (Jan 25, 2022):
For completeness: I am not using ZFS but UFS with soft-updates:
In the past I used soft-update journaling, but there was a bug in soft-update journaling which caused the file system to lock up: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224292 and I have since changed the filesystem configuration so that UFS only uses soft-updates.
@phryk commented on GitHub (Jan 27, 2022):
My setup looks a bit different. I'm not really sure what
(if anything) of this is useful to know, but here you go:
The setup is a custom FreeBSD (12.2) install with redundant ZFS pools
and a hybrid HDD/SSD UFS mirror (which is where database-specific data lives).
All drives use 256-bit AES-XTS encryption via `geom_eli`, accelerated with AESNI.

All currently deployed packages are built by a poudriere run on the host OS,
but this is a relatively recent change on this setup and the bug already happened
way before it.
I'm running a custom thinjail setup using `/etc/jail.conf` in which I have just one
base system (`/jail/base`) that's being mounted read-only into all jail roots with
`nullfs`, and another read-write `nullfs` mount of `/jail/rw/<jailname>` for most
other stuff on top, the Postgres data directory being a notable exception, as
that's yet another read-write `nullfs` mount.

All jails are on an extra loopback interface, `lo1`, so communications to and
from them can't reach the internet without going through the firewall (pf).

I use nginx as a reverse proxy, with gitea running on TCP port 3000 in the same
jail (`http`). PostgreSQL however runs in a separate jail (`database`) and is
only exposed through a UNIX socket in a directory that is `nullfs`-mounted into
`http` so services in that jail can access the database (this offers pretty neat
and granular access control).
I think I already said this to @zeripath on IRC, but it might be worth reiterating
that I kind of get the feeling that the root cause of my problem might be buried
somewhere in `nullfs`.

@feld commented on GitHub (Jan 27, 2022):
I also use the nullfs trick to mount postgres, mysql, and syslogd sockets into jails but have never encountered issues with it FWIW
edit: what is your SSD? Some have small buffers that cause the entire device to choke if you fill it completely with writes
@phryk commented on GitHub (Mar 29, 2022):
30 days ago I did a (minor) FreeBSD update which contained a new kernel
version, and updated gitea to 1.15.10. Go might also have been updated,
though I'm not sure about that; it's now at 1.17.6. I have seen no freezes since.
I'm closing this issue for now – feel free to reopen if you think this isn't warranted.
I would of course prefer to know what the underlying issue was, but it
sounds like that would require investing a huge amount of time. :P