mirror of
https://github.com/fosrl/pangolin.git
synced 2026-05-07 13:19:07 -05:00
[GH-ISSUE #1806] Issues with key file persistence & exit node lookup #3953
Originally created by @Soitora on GitHub (Nov 3, 2025).
Original GitHub issue: https://github.com/fosrl/pangolin/issues/1806
Originally assigned to: @oschwartz10612 on GitHub.
Describe the Bug
After restarting my Unraid server (no config changes for months), Pangolin/Gerbil failed to start. On boot, Gerbil logs an error about failing to assign an IP due to an invalid CIDR, and fetching the remote config returns a 404 for the gerbil endpoint.
Key log lines:
Manual check of the endpoint:
Response:
This happened after a power outage / UPS-triggered shutdown and subsequent server reboot. No configuration, keys, or exit node edits were made by me prior to this failure.
Environment
To Reproduce
I cannot reliably reproduce this from scratch, but the failure happened under the following conditions:
invalid CIDR address: error and the /api/v1/gerbil/get-config endpoint returns a 404.
Observed behavior suggests a database state mismatch where the exit node entry and/or sequence cause Gerbil to generate or reference an invalid/empty CIDR.
Expected Behavior
/api/v1/gerbil/get-config endpoint should return appropriate config (not 404) when the service is up.
@Soitora commented on GitHub (Nov 3, 2025):
I managed to resolve it, finally... (a couple of days later)
Here’s what I did:
db.sqlite file and:
exitNodes table.
exitNodeId to 1 (instead of 2).
sqlite_sequence table so the exitNodes sequence was reset to 1.
letsencrypt folder so Pangolin could generate new certificates.
This indicates the issue wasn't with my Unraid setup or environment, but likely a mismatch or corruption in the Pangolin database state that caused Gerbil to fail to assign its IP on boot?
Resetting the exit node and certificates forced a clean reinitialization, resolving the invalid CIDR and 404 config issues.
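The database steps above can be sketched in Python with the standard sqlite3 module. This is a minimal illustration, not Pangolin's tooling: the demo runs against an in-memory database with a made-up two-column schema (only the exitNodes, exitNodeId, and sqlite_sequence names come from this thread), and reset_exit_nodes is a hypothetical helper. It clears the sqlite_sequence row instead of editing it by hand, so the next inserted exit node gets ID 1 again. Anyone trying this on a real db.sqlite should work on a backup copy with the stack stopped.

```python
import sqlite3


def reset_exit_nodes(conn: sqlite3.Connection) -> None:
    """Delete all exit node rows and reset the table's AUTOINCREMENT counter."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM exitNodes")
        # sqlite_sequence is SQLite's bookkeeping table for AUTOINCREMENT columns.
        conn.execute("DELETE FROM sqlite_sequence WHERE name = 'exitNodes'")


# Demo on a throwaway in-memory database. The schema is an assumption made
# for illustration; the real exitNodes table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exitNodes ("
    "exitNodeId INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.execute("INSERT INTO exitNodes (name) VALUES ('old-node')")
conn.execute("DELETE FROM exitNodes")

# Without a reset, AUTOINCREMENT never reuses an ID, so this row gets ID 2.
stale_id = conn.execute(
    "INSERT INTO exitNodes (name) VALUES ('stale-node')"
).lastrowid

reset_exit_nodes(conn)

# After the reset the counter starts over.
fresh_id = conn.execute(
    "INSERT INTO exitNodes (name) VALUES ('fresh-node')"
).lastrowid
print(stale_id, fresh_id)
```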
@oschwartz10612 commented on GitHub (Nov 8, 2025):
@Soitora sorry for this! Glad it's fixed. Thanks for the write-up of the fix.
@Alucard133 commented on GitHub (Nov 19, 2025):
I actually had the same issue today, thanks @Soitora for the explanation on how to fix it!
@PhilRW commented on GitHub (Dec 5, 2025):
This helped me today as well; definitely seems like a bug. I encountered it TWICE today and my only action was doing a docker-compose down to the stack.
@Soitora commented on GitHub (Dec 6, 2025):
@oschwartz10612 Maybe it should be investigated further if it still happens to people? Even I had it happen to me twice; the first time I solved it by just starting from scratch.
@Jurrer commented on GitHub (Dec 10, 2025):
Happened to me when I was updating compose, thanks to you @Soitora kind sir, I managed to fix this issue.
I only removed the entry from exitNodes and it worked
@cantchooseaname8 commented on GitHub (Dec 12, 2025):
Just happened to me as well. I originally had a local only site without gerbil in my compose. As soon as I updated the compose to include gerbil, it started throwing this error.
@GovSat1 commented on GitHub (Dec 12, 2025):
Same issue here
@oschwartz10612 commented on GitHub (Dec 12, 2025):
What could be happening here is you are losing or incorporating the key file in the config that gerbil sends to pangolin. That could be why we are getting a 404.
Could anyone confirm if this key file was removed or changed or corrupted and that is causing this issue?
@okfro commented on GitHub (Dec 14, 2025):
Bingo. I have been having this issue every time I docker compose down && docker compose up -d my stack. At v1.12.x, the SQLfu above would mitigate the problem:
Then, starting in v1.13.x, I started getting a different error: "Failed to parse private key: wgtypes: failed to parse base64-encoded key: illegal base64 data at input byte 44". I assumed this was a variation of the same fault: wgtypes doesn't exist in the schema (afaict), but it's obviously a wireguard error and I was guessing it was emerging from trying to parse the exit node. So I deleted the exit node. It seems that in v1.13.x, Pangolin is no longer recreating the exit node if missing, because I then started getting a missing exitNode error. So I manually inserted a record:
And I was back to the original error. So at that point, I just removed Gerbil from the stack so I could get back up and running.
This morning, seeing @oschwartz10612's comment, I took a look at my ./stack_folder/pangolin/key file. Over the past week, I have been moving all my config content into git, and the key file was included.
cat pangolin/key > +DpQehbCWGJwlzYa0Cjdt36U77YEHTaQ/PyxjhDtTmg=⏎
nano pangolin/key > +DpQehbCWGJwlzYa0Cjdt36U77YEHTaQ/PyxjhDtTmg= ^o^n
So the sneaky ⏎ made it into git, and that gives the trailing ^o^n on the key. I removed that, restored gerbil to the stack, restarted and voilà!
I hope this is the end of the problem(s) for me. Thanks to all in this thread for sharing your insights.
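One way to check a key file for stray bytes like the ones described above is to compare its raw contents against the shape of a WireGuard key: 32 bytes, which base64-encodes to exactly 44 characters. The sketch below is a standalone check, not Gerbil's own parsing code; check_key_bytes is a hypothetical helper, and the demo key is simply 32 zero bytes rather than a real key.

```python
import base64
import binascii


def check_key_bytes(raw: bytes) -> list[str]:
    """Return a list of problems found in the raw contents of a key file."""
    problems = []
    # A 32-byte key encodes to exactly 44 base64 characters, padding included.
    if len(raw) != 44:
        problems.append(f"expected 44 bytes, found {len(raw)}")
    try:
        # validate=True rejects any byte outside the base64 alphabet,
        # including whitespace that lenient decoders would skip.
        decoded = base64.b64decode(raw, validate=True)
    except binascii.Error as exc:
        problems.append(f"not strict base64: {exc}")
    else:
        if len(decoded) != 32:
            problems.append(f"decodes to {len(decoded)} bytes, expected 32")
    return problems


# Demo with a placeholder key (32 zero bytes), not a real one.
clean_key = base64.b64encode(bytes(32))
clean_report = check_key_bytes(clean_key)
dirty_report = check_key_bytes(clean_key + b"\n")
print(clean_report)
print(dirty_report)
```

To inspect a real file, read it in binary mode so nothing is stripped on the way in, for example `check_key_bytes(Path("pangolin/key").read_bytes())` with `pathlib.Path`; the path here is the one mentioned in the comment above and will differ per setup.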
@Soitora commented on GitHub (Dec 17, 2025):
I had a loss of power (although system safely shut down with UPS), and this issue came back once I rebooted.
Is there supposed to be a key in the main folders?
@oschwartz10612 commented on GitHub (Dec 19, 2025):
If you are using gerbil there should be a file called key in the config dir, but it looks like it's not there!? That's strange! Does it come back if you recover your instance?
@Soitora commented on GitHub (Dec 19, 2025):
I checked my weekly backups all the way to late July, and not a single one of them has a file called Key
@adrianipopescu commented on GitHub (Dec 21, 2025):
I fixed this for me by killing both pangolin and gerbil containers, then removed the new autogenerated key from gerbil, restart gerbil to generate its key, and then pangolin up to let pangolin reinsert its row in the db for this gerbil with that key
now -- it does seem like I need to regenerate newt identities as I'm seeing a lot of
here's an easy approach to read the db, feel free to run deletes and whatnot
@farru1998 commented on GitHub (Jan 15, 2026):
In my case as well the key is somehow getting deleted after a pod restart in kubernetes environment, although I have persisted it via persistent volume claim.
@oschwartz10612 commented on GitHub (Jan 17, 2026):
You can use the GENERATE_AND_SAVE_KEY_TO env var or --generateAndSaveKeyTo flag on gerbil to define where it writes the key file to. Maybe this will help?
We will work on adjusting gerbil, I think, to use an ephemeral key in the future.