Mirror of https://github.com/fosrl/newt.git, synced 2026-05-06 07:59:04 -05:00
[GH-ISSUE #238] Newt ends up using 100% of resources after DNS failure on network #1747
Originally created by @KusAdama on GitHub (Feb 17, 2026).
Original GitHub issue: https://github.com/fosrl/newt/issues/238
Describe the Bug
Within a few days I partially lost connectivity, and new connections became impossible to make because the network was blocking my chosen DNS resolver.
While debugging what was working and what was not (already-established connections persist, so the networks Newt provides from inside my network remain reachable from outside, since they already exist), I restarted the DNS on the router and changed DNS servers several times.
It turns out Newt cannot recover from this: it stays stuck, consuming all CPU resources. I saw the same behavior on two machines, and I noticed it because both machines spun their fans up to high noise.
The WireGuard tunnels all reconnected automatically; only the Newt instances got stuck like this, and each one needed a manual restart.
All the logs show only the same two lines (repeated about 10 times):
ERROR: 2026/02/15 22:23:26 Failed to resolve endpoint: DNS lookup failed: lookup sub.domain.xyz on 127.0.0.11:53: read udp 127.0.0.1:37338->127.0.0.11:53: i/o timeout
INFO: 2026/02/15 22:23:26 Connecting to endpoint: sub.domain.xyz
Nothing more after that, and Newt sits at 100% CPU.
Or:
INFO: 2026/02/17 20:44:27 Connecting to endpoint: sub.domain.xyz
ERROR: 2026/02/17 20:44:27 Failed to resolve endpoint: DNS lookup failed: lookup sub.domain.xyz on 127.0.0.11:53: server misbehaving
Nothing more after that, and Newt sits at 100% CPU.
One machine runs two networks with Newt; the other runs one.
It seems to me that Newt either stops logging, or there is some counter limiting retry attempts and it stops making them after a while?
Environment
To Reproduce
I don't know precisely how to reproduce it. It happened twice, on both machines, simply by switching between a working and a non-working DNS state within my network.
Expected Behavior
Newt should reconnect automatically instead of stalling and consuming all CPU resources.
@github-actions[bot] commented on GitHub (Mar 4, 2026):
This issue has been automatically marked as stale due to 14 days of inactivity. It will be closed in 14 days if no further activity occurs.
@github-actions[bot] commented on GitHub (Mar 22, 2026):
This issue has been automatically marked as stale due to 14 days of inactivity. It will be closed in 14 days if no further activity occurs.
@strausmann commented on GitHub (Mar 27, 2026):
We have additional data points that may be related to this issue. In our environment, we observed 234% CPU on a Newt instance that was also experiencing a TCP connection leak (details in #268).
The high CPU correlates with:
After restarting Newt, CPU immediately dropped to below 3%. This suggests the CPU spike is not caused by DNS failures alone, but also by the goroutine overhead of managing thousands of leaked TCP connections in proxy/manager.go (handleTCPProxy).
In the current code, handleTCPProxy uses net.Dial("tcp", targetAddr) without a timeout and then calls io.Copy without any read/write deadline. If the remote end holds the connection open (common with SMTP, SSH, or any long-lived protocol), the goroutines and file descriptors accumulate indefinitely.
This is a separate but compounding issue to the DNS-related CPU spike you reported: both contribute to resource exhaustion under failure conditions.
@github-actions[bot] commented on GitHub (Apr 11, 2026):
This issue has been automatically marked as stale due to 14 days of inactivity. It will be closed in 14 days if no further activity occurs.
@github-actions[bot] commented on GitHub (Apr 25, 2026):
This issue has been automatically closed due to inactivity. If you believe this is still relevant, please open a new issue with up-to-date information.