[GH-ISSUE #2913] newt holepunch does not re-resolve endpoint hostname after connection loss (stale IP on reconnect) #13072

Open
opened 2026-05-13 18:37:11 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @timootten on GitHub (Apr 28, 2026).
Original GitHub issue: https://github.com/fosrl/pangolin/issues/2913

Originally assigned to: @oschwartz10612 on GitHub.

Describe the Bug

newt/holepunch does not re-resolve endpoint hostname after connection loss (stale IP on reconnect)

Summary

When the IP address behind the configured gerbil/exit-node hostname changes (e.g. due to a failover, migration, or DNS record update), newt does not perform a fresh DNS lookup on reconnect. Instead, it continues using the IP it resolved at startup, leaving the tunnel in a permanently broken state until newt is manually restarted.

This affects any dynamic IP environment — including setups using tools like external-dns that automatically update DNS records at the provider level during cluster events.

Actual Behavior

Holepunch keeps sending to the originally resolved IP for the entire runtime. No re-resolution occurs even well past the DNS TTL. The only recovery path is manually restarting newt.

Evidence / Logs

Starting hole punch to <exit-node-hostname>
Resolved exit node: <Address B>
Sent UDP hole punch to <Address B>
Sent UDP hole punch to <Address B>
... (repeated indefinitely, no re-resolution to Address A)

No subsequent resolution attempt is visible in the logs after the DNS record changes.

Environment

  • Tool: newt + holepunch
  • DNS managed externally (e.g. Cloudflare via external-dns)
  • Deployment: Kubernetes

Suggested Fix

On each reconnect attempt (or periodically, respecting TTL), re-resolve the exit-node hostname from DNS rather than caching the IP from the initial startup resolution.

Environment

  • OS Type & Version: Debian 13
  • Pangolin Version: 1.17.1
  • Gerbil Version: 1.3.1
  • Traefik Version: 3.6
  • Newt Version: 11.0
  • Olm Version: (if applicable)

To Reproduce

  1. Point the exit-node hostname DNS record to Address A
  2. Start newt — confirm holepunch targets Address A in the logs
  3. Update the DNS record to point to Address B (simulating a failover)
  4. Restart newt — confirm holepunch correctly targets Address B
  5. Update the DNS record back to Address A
  6. Wait longer than the DNS TTL (e.g. 8+ minutes with a TTL of ~60s)
  7. Observe holepunch logs — it continues targeting Address B
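When reproducing step 6, it can help to confirm that the new record is actually visible from the machine or pod running newt, which rules out resolver caching as the cause. The following is a small standalone watcher sketch, not part of newt; the program name `dnswatch`, the 10-second poll interval and the `normalize` helper are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"sort"
	"strings"
	"time"
)

// normalize sorts a copy of the address list so that a mere reordering
// of records is not reported as a change.
func normalize(ips []string) string {
	s := append([]string(nil), ips...)
	sort.Strings(s)
	return strings.Join(s, ",")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: dnswatch <hostname>")
		return
	}
	host := os.Args[1]
	prev := ""

	// Poll the system resolver and print a line whenever the result changes.
	for {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		ips, err := net.DefaultResolver.LookupHost(ctx, host)
		cancel()

		now := time.Now().Format(time.RFC3339)
		if err != nil {
			fmt.Printf("%s lookup error: %v\n", now, err)
		} else if cur := normalize(ips); cur != prev {
			fmt.Printf("%s %s -> %s\n", now, host, cur)
			prev = cur
		}
		time.Sleep(10 * time.Second)
	}
}
```

If this watcher reports Address A while the holepunch logs still show Address B, the stale address is held inside newt and not by the local resolver.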

Expected Behavior

After the DNS TTL expires, newt/holepunch should re-resolve the exit-node hostname and reconnect using the updated IP address — without requiring a manual restart.

GiteaMirror added the bug label 2026-05-13 18:37:11 -05:00

Reference: github-starred/pangolin#13072