[GH-ISSUE #2120] Pangolin leaking memory after upgrading to 1.13.1 #4017

Open
opened 2026-04-20 08:23:57 -05:00 by GiteaMirror · 146 comments
Owner

Originally created by @Ragnaruk on GitHub (Dec 18, 2025).
Original GitHub issue: https://github.com/fosrl/pangolin/issues/2120

Originally assigned to: @oschwartz10612, @miloschwartz on GitHub.

Describe the Bug

Pangolin seems to be leaking memory after upgrading from 1.10.2(?) to 1.13.1.

Image

Environment

  • OS Type & Version: Ubuntu 24.04.3 LTS
  • Pangolin Version: 1.13.1
  • Gerbil Version: 1.3.0
  • Traefik Version: 3.6

To Reproduce

~8k allowed and ~2k blocked requests a day.

Expected Behavior

Memory usage stays constant.

GiteaMirror added the needs investigating, bug labels 2026-04-20 08:23:57 -05:00
Author
Owner

@Ragnaruk commented on GitHub (Dec 18, 2025):

Not sure if the cache is the problem, but you should consider periodically logging its stats.

Also, there's no real need to use the cache here if MaxMind is not enabled. Plus, using a separate cache (maybe an LRU TTL one?) and limiting the number of keys also sounds like a good idea.

async function getCountryCodeFromIp(ip: string): Promise<string | undefined> {
    const geoIpCacheKey = `geoip:${ip}`;

    let cachedCountryCode: string | undefined = cache.get(geoIpCacheKey);

    if (!cachedCountryCode) {
        cachedCountryCode = await getCountryCodeForIp(ip); // do it locally
        // Cache for longer since IP geolocation doesn't change frequently
        cache.set(geoIpCacheKey, cachedCountryCode, 300); // 5 minutes
    }

    return cachedCountryCode;
}
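
A bounded LRU + TTL cache along the lines suggested could be sketched like this (a minimal illustration with made-up names, not Pangolin's actual code):

```javascript
// Minimal LRU cache with a per-entry TTL (hypothetical sketch).
class LruTtlCache {
    constructor(maxKeys, ttlMs) {
        this.maxKeys = maxKeys;
        this.ttlMs = ttlMs;
        this.map = new Map(); // Map iteration order doubles as recency order
    }

    get(key) {
        const entry = this.map.get(key);
        if (!entry) return undefined;
        if (Date.now() > entry.expiresAt) {
            this.map.delete(key); // expired
            return undefined;
        }
        // Refresh recency: re-insert so the key moves to the "newest" end.
        this.map.delete(key);
        this.map.set(key, entry);
        return entry.value;
    }

    set(key, value) {
        if (this.map.has(key)) this.map.delete(key);
        this.map.set(key, { value, expiresAt: Date.now() + this.ttlMs });
        if (this.map.size > this.maxKeys) {
            // Evict the least recently used key (first in iteration order).
            this.map.delete(this.map.keys().next().value);
        }
    }
}

const geoCache = new LruTtlCache(10000, 5 * 60 * 1000); // 10k keys, 5 min TTL
geoCache.set('geoip:1.1.1.1', 'AU');
console.log(geoCache.get('geoip:1.1.1.1')); // 'AU'
```

With a hard key cap, a flood of unique IPs can no longer grow the cache without bound; old entries just get evicted.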
Author
Owner

@nath1416 commented on GitHub (Dec 19, 2025):

I also have this problem, after setting up the GeoLite2-Country database.
My 1 GiB VPS with 1 GiB of swap runs out of memory and Docker gets killed by the OOM killer.

Here is a screenshot of CPU usage, which spiked to 100% when the containers got killed.

Using:

  • Debian bookworm with docker
Image
Author
Owner

@oschwartz10612 commented on GitHub (Dec 19, 2025):

Hmm, that's interesting. Did this only happen after adding the country database, @nath1416?

Author
Owner

@nath1416 commented on GitHub (Dec 19, 2025):

Yes, I upgraded to 1.13.1 and added the country database at the same time.
Since then I have had memory leaks.

Currently I have 298985 total requests and 7373 blocked.

Did not find anything interesting in the logs, but will check if it happens again and try to provide them here.

Author
Owner

@djcrafts commented on GitHub (Dec 21, 2025):

I've opened PR #2133 that should fix this memory leak.

What was changed:

  • Added a maxKeys: 10000 limit to the cache to prevent unbounded growth (uses LRU eviction)
  • Skip caching when GeoIP/ASN lookups return undefined (e.g., when MaxMind isn't configured)
  • Added cache stats logging every 5 minutes for monitoring

The cache was growing without limits, which is especially problematic with GeoIP enabled since every unique IP gets cached. The 10k key limit should be plenty for normal traffic while preventing OOM issues.

@nath1416 @Ragnaruk Would appreciate if you could test this when you get a chance, since you're both experiencing the issue.
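
The "skip caching when lookups return undefined" part of that description could look roughly like this (a sketch with illustrative names, not the actual PR diff):

```javascript
// Sketch: only cache successful lookups, so that an unconfigured MaxMind
// (which always yields undefined) doesn't fill the cache with unique IPs.
function getCachedCountryCode(cache, ip, lookup) {
    const key = `geoip:${ip}`;
    const cached = cache.get(key);
    if (cached !== undefined) return cached;

    const countryCode = lookup(ip);
    if (countryCode !== undefined) {
        cache.set(key, countryCode); // store real results only
    }
    return countryCode;
}

const store = new Map();
console.log(getCachedCountryCode(store, '1.1.1.1', () => 'AU'));      // 'AU'
console.log(getCachedCountryCode(store, '2.2.2.2', () => undefined)); // undefined
console.log(store.size); // 1 — the failed lookup was not cached
```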

Author
Owner

@oschwartz10612 commented on GitHub (Dec 22, 2025):

Please reopen if still an issue on 1.14+

Author
Owner

@Ragnaruk commented on GitHub (Dec 22, 2025):

Version: 1.14.0-rc.0.

I debugged the process and researched the problem for a bit, and I am ~90% certain that it's not a memory leak but memory fragmentation.

Unfortunately, switching to jemalloc didn't seem to help much.

For now, I've just set a memory limit and am accepting periodic restarts.

Author
Owner

@oschwartz10612 commented on GitHub (Dec 22, 2025):

Hum interesting. So limiting the cache must not have helped.

What is the memory footprint when it gets restarted?

Author
Owner

@Ragnaruk commented on GitHub (Dec 22, 2025):

Fresh process: 296 MB.
Upper limit: >1.5 GB.

Author
Owner

@oschwartz10612 commented on GitHub (Dec 22, 2025):

It might need more memory than that, but if it keeps increasing
indefinitely then that's a problem. This is a Node.js application, so I
would expect a high base memory load and for it to fluctuate as it
garbage collects and so on.

Author
Owner

@Ragnaruk commented on GitHub (Dec 22, 2025):

pmap returns the following, so I'm pretty sure it's musl allocations rather than JS.

total 19480228 873476 843124 0
000005bd64000000 1081344 58532 58532 0 rw-p [ anon ]
0000be2875820000 100752 27534 0 0 r-xp /usr/local/bin/node
0000ec9cc08f7000 18432 17488 17488 0 rw-p [ anon ]
0000eca0d6800000 261888 10468 10468 0 rwxp [ anon ]
0000ec9ccb28e000 9344 9148 9148 0 rw-p [ anon ]
0000ec9cbf0e1000 9728 8408 8408 0 rw-p [ anon ]
0000ec9cc1bf7000 7168 5960 5960 0 rw-p [ anon ]
0000003534cc0000 5048 5048 5048 0 rw-p [ anon ]
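
For anyone who wants to repeat the jemalloc experiment in the stock container, one way is to preload it ahead of musl's allocator; a sketch for an Alpine-based image (the package name and library path are assumptions to verify against the actual base image):

```dockerfile
# Sketch: preload jemalloc in front of musl's malloc (paths are assumptions)
FROM fosrl/pangolin:1.14.0-rc.0
RUN apk add --no-cache jemalloc
ENV LD_PRELOAD=/usr/lib/libjemalloc.so.2
```

Note that, as reported above, this did not seem to help much in this case.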
Author
Owner

@jjeuriss commented on GitHub (Dec 22, 2025):

I'm able to reproduce this too, by scrolling through my photos on my NAS (which are accessed through Pangolin). This seems to heavily increase memory usage, mainly in Pangolin, and eventually completely locks up my VPS. On this 1 GB VPS, Pangolin ran smoothly for a few months before version 1.13.1. I upgraded from 1.11 via 1.12, but didn't use 1.12 for long. I'm planning to export a log of memory usage and a CSV to see the metrics. Unfortunately, running a Prometheus client is most likely pushing it, so I'm periodically logging "docker stats --no-stream" to a file to get an idea of the memory pressure.

If I can help in any other way, please let me know.

Author
Owner

@jjeuriss commented on GitHub (Dec 24, 2025):

My problems are not over yet with 1.14.0.
Is there a way to downgrade to 1.12 or will the migrated databases not allow that?

Author
Owner

@joerg-hro commented on GitHub (Dec 24, 2025):

I have the same problem. I had a backup of the config folder (1.10.1). I restored it. Then I installed the image (1.10.1). It's working again.

Author
Owner

@oschwartz10612 commented on GitHub (Dec 24, 2025):

@jjeuriss I would be curious if this is the pangolin container or the Traefik container. Are you able to profile the containers and reproduce?

Author
Owner

@jjeuriss commented on GitHub (Dec 25, 2025):

@oschwartz10612 Although I frequently run into the problem, I fear my 'reproduction' scenario isn't actually helping to reproduce the problem.

Had to do manual reboots of my VPS to recover:
2025-12-24 14:07
2025-12-25 06:07
2025-12-25 07:53

I am keeping statistics on memory and CPU, and plotted one period where my VPS got stuck after only 2 hours (on 2025-12-25 07:53) in an Excel file. Tbh I don't see anything really weird w.r.t. memory usage. I see CPU spikes after Docker starts up, which seems normal...
Excel and full CSV attached.
I've also attached the pangolin docker logs, which shows some errors.

pangolin.log

memory-metrics-chart-2025-12-25-07-53-00-v3.xlsx

memory-log-test.zip

Pangolin
Image
Traefik

Image

The dips/spikes in the graph are where my VPS restarts (it completely locks up, can't even SSH into it anymore, so I have to reboot it).

These graphs are IMHO inconclusive. I don't know why it happens, only that it happens on 1.13 and didn't use to happen on 1.11.

I'm not sure exactly how to profile a docker container. Is there a guide or set of commands you want me to run?

Author
Owner

@Josh-Voyles commented on GitHub (Dec 25, 2025):

I'm also experiencing issues on my AWS instance. After running for a while, Pangolin locks up. The only service I can access is Authentik, which is not protected. I'm running 1.14.1.

Author
Owner

@asardaes commented on GitHub (Dec 26, 2025):

I didn't experience OOM kills as such, but I did report abnormal IO in #2134 (which also led to the VM freezing entirely and needing a reboot). I eventually figured out it was memory pressure forcing memory pages to disk and immediately reading them again in some endless death loop. What worked for me was to enable zram swap as described in the arch wiki - I configured 512M for my 1G VPS, and since actual usage gets compressed, I think my memory usage actually went down. I also applied the optimizations mentioned in section 2.4 from that wiki except for vm.watermark_boost_factor; from what I could gather that might not really help for a VM with little memory, so I left it at the default of 10. I've had zero issues since, although I don't have a lot of requests going through the VPS, so YMMV.

Author
Owner

@nath1416 commented on GitHub (Dec 26, 2025):

I still have the same issue; I just restarted my instance after 3 days, and the swap was full.
I will try to set up better monitoring this week.

Author
Owner

@koenieee commented on GitHub (Dec 26, 2025):

Same issue here. I have to reboot the whole VPS (1 GB of RAM) every day now.

Author
Owner

@jjeuriss commented on GitHub (Dec 26, 2025):

I'll try version 1.12.3 to see if this also existed there. (I know it ran fine on 1.11.1, but I don't know yet whether it regressed in 1.12 or in 1.13.) I'll keep the third-party components (Traefik and Gerbil) at the same latest versions.
Interestingly, the base memory usage of pangolin dropped from about 280MB to about 243MB.

Author
Owner

@Josh-Voyles commented on GitHub (Dec 26, 2025):

Last night, I upgraded my t2.micro instance to a t3a.small instance to see if it just needed more memory. However, I can see now that, over time, the Docker container's memory usage gradually increases. It takes about 5 hours, but then my instance becomes unusable and CPU hits 100 percent. In the attached image from docker stats, you can see the abnormally high memory usage. The other two services aren't visible because things started to glitch before I lost the connection. I'm happy to investigate further if needed.

Image
Author
Owner

@laugmanuel commented on GitHub (Dec 26, 2025):

I can confirm this as well.
I've attached screenshots of memory usage of the three components of the last 24h.

I've upgraded to 1.14.0 on the 23.12. and to 1.14.1 on 24.12.

Screenshot_20251226_165647_Chrome.jpg

Screenshot_20251226_165545_Chrome.jpg

Screenshot_20251226_165535_Chrome.jpg

I've also attached a Pangolin memory usage graph for the last 7 days, and it looks like for me it started happening on the 24.12. in the evening. This seems to suggest that it has to do with that release (1.14.1)...

Is there a way to dump and analyse node memory footprint to find the cause?

Screenshot_20251226_165836_Chrome.jpg

Author
Owner

@Josh-Voyles commented on GitHub (Dec 26, 2025):

As others have mentioned, happy to help as much as I can.

But for now, I've set up a CloudWatch alarm on AWS: when my CPU exceeds a 60 percent average for 1 minute, it sends me an email and reboots my instance.

Also, further investigation suggests I started having issues right around the time I updated my Newt instances from 1.6 to 1.8.

Note: I'm updating Newt to 1.8.1 and will monitor whether anything changes.

Author
Owner

@SamTV12345 commented on GitHub (Dec 26, 2025):

I'm idling at around 250 MB of memory usage. Can that also limit performance? Over a 1 Gbit fiber connection, a speedtest through Pangolin only gets me 50-60 MB/s. Wouldn't it have been possible to build the server in Go? It has a much lower memory and CPU footprint than Node.

Author
Owner

@laugmanuel commented on GitHub (Dec 26, 2025):

I've limited memory consumption via a Docker memory limit of 768M (an arbitrary value). This seems to help my setup. The container still leaks memory up to the limit, but stays stable afterwards and doesn't get killed.

services:
  pangolin:
    image: fosrl/pangolin:1.14.1
    container_name: pangolin
    restart: unless-stopped
    [...]
    deploy:
      resources:
        limits:
          memory: 768M

Screenshot_20251226_222554_Chrome.jpg

(see stable line at the end at ~20:40)

Author
Owner

@Josh-Voyles commented on GitHub (Dec 26, 2025):

@laugmanuel This looks like it could be helpful. When you hit your memory limit, are you experiencing any performance impact to Pangolin?

Author
Owner

@laugmanuel commented on GitHub (Dec 27, 2025):

When you hit your memory limit, are you experiencing any performance impact to Pangolin?

I'm not really running any performance sensitive applications over Pangolin, but so far it seems to work just fine.

Edit: after some time, Pangolin gets killed for me - however, it's far better than exhausting the entirety of VM memory...

Author
Owner

@laugmanuel commented on GitHub (Dec 28, 2025):

I did some more digging using Node inspection tooling (--inspect + port-forward + Chrome DevTools) and checking memory allocations over time.

To me it looks like the memory usage increases only with denied requests. The screenshot below shows this: the part between the red lines is caused by unauthorized requests, while the same number of requests with valid auth shows almost no allocations (green):

Image

I'm struggling to export the memory snapshot, so I can't look into that right now. If someone has some ideas on how to investigate further, I'm more than happy to help.

Author
Owner

@koenieee commented on GitHub (Dec 28, 2025):

When you hit your memory limit, are you experiencing any performance impact to Pangolin?

I'm not really running any performance sensitive applications over Pangolin, but so far it seems to work just fine.

Edit: after some time, Pangolin gets killed for me - however, it's far better than exhausting the entirety of VM memory...

I have tried your fix, but it still keeps getting killed. Thanks for the tip; how can we limit the memory usage better?

Author
Owner

@laugmanuel commented on GitHub (Dec 28, 2025):

I have tried your fix, but it still keeps getting killed. Thanks for the tip, how can we limit the memory usage better?

It was not meant as a fix, but as a temporary workaround to limit Pangolin's impact to itself and not affect other resources on the same host.

One general question: do you, by any chance, use some sort of monitoring tool to check Pangolin itself or a resource protected by it? I've noticed that Pangolin returns HTTP 200 even if the resource is protected and the request is unauthorized. The reason is that the client is redirected to the Pangolin auth page, which is served successfully. In my case, it looks like a major contributor was my monitoring.

However, even if that is the case, it's still a potential DoS if denied requests leak memory...

Author
Owner

@Josh-Voyles commented on GitHub (Dec 28, 2025):

I have tried your fix, but it still keeps getting killed. Thanks for the tip, how can we limit the memory usage better?

It was not meant as a fix, but as a temporary workaround to limit Pangolin's impact to itself and not affect other resources on the same host.

One general question: do you, by any chance, use some sort of monitoring tool to check Pangolin itself or a resource protected by it? I've noticed that Pangolin returns HTTP 200 even if the resource is protected and the request is unauthorized. The reason is that the client is redirected to the Pangolin auth page, which is served successfully. In my case, it looks like a major contributor was my monitoring.

However, even if that is the case, it's still a potential DoS if denied requests leak memory...

If you're getting a 200 for protected resources, that seems odd. I'm getting a 302, which is a redirect.

Author
Owner

@sambilbow commented on GitHub (Dec 28, 2025):

> **One general question**: do you, by any chance, use some sort of monitoring tool to check Pangolin itself or a resource protected by it? I've noticed that Pangolin returns HTTP 200 even if the resource is protected and the request is unauthorized. The reason is that the client is redirected to the Pangolin auth page, which is served successfully. In my case, it looks like a major contributor was my monitoring.

Yes. I use Gatus and get 200 on proxied private resources. But it might be following the redirect, I guess?


@laugmanuel commented on GitHub (Dec 28, 2025):

> If you're getting 200 for protected resources that seems odd. I'm getting code 302 which is redirect.

If the monitoring is following the redirect, a 200 is expected, as it's the result of the Pangolin auth page. However, this redirect still causes a `No Valid Auth` event in Pangolin and results in my observed behaviour above (memory allocations).
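The redirect-vs-200 behaviour described here is easy to demonstrate with a stub server. This is illustrative Python, not Pangolin's code; `StubProxy` and the paths are made up:

```python
# Minimal sketch: a stub "protected resource" that redirects unauthenticated
# requests to an auth page, the way described in the comments above.
import http.server
import threading
import urllib.error
import urllib.request

class StubProxy(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/protected":
            # Unauthorized request: redirect the client to the auth page.
            self.send_response(302)
            self.send_header("Location", "/auth")
            self.end_headers()
        else:
            # The auth page itself is served successfully.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"auth page")

    def log_message(self, *args):  # keep the demo quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), StubProxy)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"

# A monitor that follows redirects lands on the auth page and reports 200.
followed_status = urllib.request.urlopen(f"{base}/protected").status

# A monitor that does NOT follow redirects sees the real 302.
class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # refuse to follow; urlopen raises HTTPError instead

try:
    urllib.request.build_opener(NoRedirect()).open(f"{base}/protected")
    raw_status = None
except urllib.error.HTTPError as e:
    raw_status = e.code

print(followed_status, raw_status)  # 200 302
server.shutdown()
```

So a health check that follows redirects will happily report the protected resource as "up" while generating a `No Valid Auth` event on every probe.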


@nath1416 commented on GitHub (Dec 29, 2025):

This would make sense. I do have a misconfigured Gatus health check that results in multiple `Denied` events, so this could be related. I will turn off the endpoint in Gatus and check if it still crashes.

I tried your suggestion @laugmanuel to limit the RAM usage of the Pangolin container. It resulted in a restart of the container; after that it has been working fine so far.

Image
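For anyone wanting to apply the same workaround, the compose-level memory cap can be sketched like this (service name and limit are illustrative, not an official recommendation):

```yaml
services:
  pangolin:
    # ...existing image/volumes/networks unchanged...
    mem_limit: 512m          # hard cap; the kernel OOM-kills the process at this point
    restart: unless-stopped  # bring the container back up after an OOM kill
```

This doesn't fix the leak, but it contains the blast radius to the Pangolin container instead of the whole host.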

@oschwartz10612 commented on GitHub (Dec 31, 2025):

Could anyone confirm if turning off the request logs fixes the memory problem?


@oschwartz10612 commented on GitHub (Dec 31, 2025):

Also interested if anyone could turn on debug logs and watch the cache print statements to see if we are building up memory in there. I don't think so, but I would like to check. I think this is in 1.14.


@Josh-Voyles commented on GitHub (Jan 1, 2026):

> Could anyone confirm if turning off the request logs fixes the memory problem?

I've turned request logs off and will report back tomorrow.


@kazooie13 commented on GitHub (Jan 1, 2026):

Same problem for me after updating: memory runs full over time, then Pangolin locks up and shortly after that the VPS (very limited – 1 GB RAM) crashes. Pangolin idles at around 350 MB. I use WebDAV over Pangolin in combination with path rules, so there are many requests. Unfortunately, I can’t tell if the problem existed before the upgrade, because I added the path rules shortly after the upgrade. I’m trying to turn off the request logs. So far the consumption is still relatively high, and I can’t yet tell whether it will continue to increase and crash.


@Josh-Voyles commented on GitHub (Jan 1, 2026):

> Could anyone confirm if turning off the request logs fixes the memory problem?

Turning off request logs does not solve the issue.


@oschwartz10612 commented on GitHub (Jan 1, 2026):

Can anyone pinpoint whether it's the Traefik container, Pangolin, or Gerbil? Are we 100% sure it's the Pangolin container? Just want to be certain, because I know there were previous issues with Traefik running away with memory.


@Joly0 commented on GitHub (Jan 1, 2026):

Not sure about the others, but when I look on my server with htop, the process currently consuming about 15 GB (out of 16 on my server) is `node --enable-source-maps dist/server.mjs`.

I am not sure exactly what process this is or which container is running it; maybe someone else can tell.

Edit: Use `docker exec pangolin ps a` to get all the processes in the Pangolin container; the node process mentioned above is the main thread in the Pangolin container. So this error almost certainly comes from Pangolin itself and not Traefik or Gerbil.


@sambilbow commented on GitHub (Jan 1, 2026):

Same as above. I posted some images of usage within container on Discord


@kazooie13 commented on GitHub (Jan 2, 2026):

I can confirm that `node --enable-source-maps dist/server.mjs` is the affected process and that it originates from Pangolin itself.

Also, disabling request logs does not resolve the issue.


@laugmanuel commented on GitHub (Jan 2, 2026):

I can also confirm that it's the Pangolin container itself:

Image

I didn't see any more problems after fixing the monitoring check (mentioned above). I reverted that fix and enabled debug logging for Pangolin to see cache growth - however, the caching seems to work just fine and doesn't show any abnormalities:

```sh
pangolin          | 2026-01-02T04:34:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 9344, Misses: 3068, Hit rate: 75.28%
pangolin          | 2026-01-02T04:39:22+00:00 [debug]: Cache stats - Keys: 19, Hits: 9628, Misses: 3172, Hit rate: 75.22%
pangolin          | 2026-01-02T04:44:22+00:00 [debug]: Cache stats - Keys: 17, Hits: 9910, Misses: 3270, Hit rate: 75.19%
pangolin          | 2026-01-02T04:49:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 10192, Misses: 3368, Hit rate: 75.16%
pangolin          | 2026-01-02T04:54:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 10474, Misses: 3466, Hit rate: 75.14%
pangolin          | 2026-01-02T04:59:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 10756, Misses: 3564, Hit rate: 75.11%
pangolin          | 2026-01-02T05:04:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 11040, Misses: 3660, Hit rate: 75.10%
pangolin          | 2026-01-02T05:09:22+00:00 [debug]: Cache stats - Keys: 17, Hits: 11330, Misses: 3758, Hit rate: 75.09%
pangolin          | 2026-01-02T05:14:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 11614, Misses: 3850, Hit rate: 75.10%
pangolin          | 2026-01-02T05:19:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 11901, Misses: 3943, Hit rate: 75.11%
pangolin          | 2026-01-02T05:24:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 12188, Misses: 4036, Hit rate: 75.12%
pangolin          | 2026-01-02T05:29:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 12475, Misses: 4129, Hit rate: 75.13%
pangolin          | 2026-01-02T05:34:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 12762, Misses: 4222, Hit rate: 75.14%
pangolin          | 2026-01-02T05:39:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 13046, Misses: 4314, Hit rate: 75.15%
pangolin          | 2026-01-02T05:44:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 13321, Misses: 4403, Hit rate: 75.16%
pangolin          | 2026-01-02T05:49:22+00:00 [debug]: Cache stats - Keys: 19, Hits: 13612, Misses: 4500, Hit rate: 75.15%
pangolin          | 2026-01-02T05:54:22+00:00 [debug]: Cache stats - Keys: 17, Hits: 13899, Misses: 4593, Hit rate: 75.16%
pangolin          | 2026-01-02T05:59:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 14186, Misses: 4686, Hit rate: 75.17%
pangolin          | 2026-01-02T06:04:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 14473, Misses: 4779, Hit rate: 75.18%
pangolin          | 2026-01-02T06:09:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 14757, Misses: 4871, Hit rate: 75.18%
pangolin          | 2026-01-02T06:14:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 15044, Misses: 4964, Hit rate: 75.19%
pangolin          | 2026-01-02T06:19:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 15331, Misses: 5057, Hit rate: 75.20%
pangolin          | 2026-01-02T06:24:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 15618, Misses: 5150, Hit rate: 75.20%
pangolin          | 2026-01-02T06:29:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 15905, Misses: 5243, Hit rate: 75.21%
pangolin          | 2026-01-02T06:34:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 16192, Misses: 5336, Hit rate: 75.21%
pangolin          | 2026-01-02T06:39:22+00:00 [debug]: Cache stats - Keys: 17, Hits: 16539, Misses: 5437, Hit rate: 75.26%
pangolin          | 2026-01-02T06:44:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 16826, Misses: 5530, Hit rate: 75.26%
pangolin          | 2026-01-02T06:49:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 17113, Misses: 5623, Hit rate: 75.27%
pangolin          | 2026-01-02T06:54:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 17400, Misses: 5716, Hit rate: 75.27%
pangolin          | 2026-01-02T06:59:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 17687, Misses: 5809, Hit rate: 75.28%
pangolin          | 2026-01-02T07:04:22+00:00 [debug]: Cache stats - Keys: 15, Hits: 17974, Misses: 5902, Hit rate: 75.28%
```
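For reference, the hit rate printed in these lines is just hits / (hits + misses); recomputing it from the first log line confirms the figures are consistent:

```python
# Recompute the hit rate reported in the first log line above.
hits, misses = 9344, 3068
rate = 100 * hits / (hits + misses)
print(f"{rate:.2f}%")  # 75.28%
```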


@oschwartz10612 commented on GitHub (Jan 2, 2026):

Hmm, this is going to be a tough one. I will see about building the container with `--inspect` in the node command so I can use the heap inspector from Chrome to see where memory is building up.
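If anyone wants to try this locally in the meantime, a compose override along these lines should expose the inspector (a sketch only, assuming the image's entrypoint runs `dist/server.mjs`; names and ports are illustrative):

```yaml
# docker-compose.override.yml (sketch)
services:
  pangolin:
    command: node --inspect=0.0.0.0:9229 --enable-source-maps dist/server.mjs
    ports:
      - "127.0.0.1:9229:9229"  # then attach via chrome://inspect on the host
```

From Chrome's inspector you can take heap snapshots a few hours apart and diff them to see which allocations are growing.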


@0i5e4u commented on GitHub (Jan 2, 2026):

I disabled every non-HTTPS resource and the container seems to stay at a stable RAM consumption. Non-reachable targets are also disabled for testing. Can someone confirm this? Running since 14 h.

```sh
root@ubuntu:~# docker ps | grep pangolin
37b8f017bf25 fosrl/pangolin:1.14.1 "docker-entrypoint.s…" 14 hours ago Up 14 hours (healthy)

CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
9bc00849ddba traefik 0.08% 34.38MiB / 848.6MiB 4.05% 230MB / 232MB 523MB / 49.2kB 7
8e621351d8dd gerbil 0.00% 5.949MiB / 848.6MiB 0.70% 230MB / 232MB 58.7MB / 0B 7
37b8f017bf25 pangolin 2.09% 290.4MiB / 500MiB 58.09% 20.6MB / 137MB 652MB / 174MB 22
```


@buster39 commented on GitHub (Jan 3, 2026):

Reverted back to 1.13.1 and deleted all of CrowdSec in my setup - stable again for approx. 3 days now.


@kazooie13 commented on GitHub (Jan 3, 2026):

I am also using CrowdSec with a new postoverflow whitelist rule. I don't know whether that could have an impact on the problem (I don't think so), and I haven't tried without it yet.


@Josh-Voyles commented on GitHub (Jan 3, 2026):

> I am also using CrowdSec with a new postoverflow whitelist rule. I don't know whether that could have an impact on the problem (I don't think so), and I haven't tried without it yet.

CrowdSec was never part of my config and I still had issues. I am trying to revert back to 13.1. I still have request logs off.


@kazooie13 commented on GitHub (Jan 3, 2026):

Is there any instruction on how to revert ~~to version 1.13.1~~? Can I simply adjust the Compose file, or will that cause conflicts with the database? I urgently need a stable system and a temporary fix for the problem.


@joerg-hro commented on GitHub (Jan 3, 2026):

I had version 1.10.1 installed. Before updating to 1.13.1, I backed up the config-directory. Since I had the same problems with version 1.13.1, I adapted the Pangolin version of the docker-compose file to version 1.10. Before deploying the docker-compose.yaml file, I restored the config-directory. Now Pangolin is running perfectly again.


@jjeuriss commented on GitHub (Jan 3, 2026):

To be honest, I don't understand why the last comments are about reverting to 1.13.1. The problem seems to have been introduced in version 1.13.1 as per the OP, so you'd have to revert to something older to get rid of it (e.g. 1.11.1, or maybe 1.12.3, but nobody has confirmed or denied yet that 1.12 was stable).

AFAIK you can only revert to 1.11.1 if you took a backup of the whole config directory (and docker-compose.yml).


@buster39 commented on GitHub (Jan 3, 2026):

> T.b.h. I don't understand why the last comments are about reverting to 1.13.1. The problem seems to have been introduced in version 1.13.1 as per the OP, ...

True - I was just telling what happened to me. Maybe we'll find a place to start digging.


@jjeuriss commented on GitHub (Jan 3, 2026):

@buster39 did you reinstall everything or revert to a backup?


@jjeuriss commented on GitHub (Jan 3, 2026):

By the way, I noticed that the base memory usage of Pangolin (i.e. after running it for about 5 min) is 230 MB on 1.11.1, whereas it is 305 MB on 1.14.1. Both without CrowdSec. I hope the increased memory usage is expected.


@buster39 commented on GitHub (Jan 3, 2026):

Just a little setup on the smallest VPS with 1 GB RAM. If I remember correctly, the lockup of my server started after upgrading to 1.13.1; the server became unreachable a few hours after each restart. Upgrading to 1.14 didn't help. I had some CrowdSec-related issues in the log, so I decided to start by removing CrowdSec from the setup. I had no backup of the config - I just changed the versions in the compose YML and deleted everything related to CrowdSec. But - of course - I had to pull the Docker containers again for the different versions.


@jjeuriss commented on GitHub (Jan 3, 2026):

Ah yeah. I tried running an earlier version of Pangolin (1.10 or lower) on a 1 GB VPS with CrowdSec too. That simply locks up the VPS, as there's not enough memory to run Pangolin + CrowdSec on a system with so little memory. I don't think that's related to this bug, though.
This bug describes that on 1.13+ the memory usage of Pangolin increases in certain scenarios due to a memory leak (which in turn can also lock up a VPS with only 1 GB of memory).


@jjeuriss commented on GitHub (Jan 3, 2026):

> @jjeuriss I would be curious if this is the pangolin container or the Traefik container. Are you able to profile the containers and reproduce?

@oschwartz10612 I still seem to be able to reproduce as follows, with failed requests.

I've got photos.mydomain.com forwarded through Pangolin. Through that domain I can browse the photos on my Synology NAS in a web browser without a problem. But somehow the Synology Photos app on my phone, which points to this same domain, is not working correctly yet. So if I run the Photos app on my phone on my 4G network and it connects to my NAS through Pangolin, the images do not show up (probably failing somewhere). The fact that I can't view my photos is out of scope for this discussion; I'm just using it as a repro scenario.

To make the memory shoot up (from 350MB to 395MB in 1-2 minutes), I just need to browse the Synology Photos app on my phone on my 4G network (which then doesn't show any images at all). Doing this shoots up the memory, and it doesn't seem to go down automatically any more. It also shoots up the CPU usage. The longer I scroll (i.e. issue failing requests), the more memory is consumed.

Note: occasionally I do see memory dropping again.

Image

Image

This repro scenario may be useful for debugging this, but I have no experience with inspecting or profiling containers. If you can let me know how to inspect that, I can further assist with that.

As a work-around for this memory leak problem, I've set a fairly conservative memory limit (400MB in my case), which then auto-restarts the container and prevents my VPS from locking up. You can see the auto-restart of the container in the graph.

I'll now do one more test with the limit set at 500MB and see whether memory usage goes down again over time.

Update: when my Pangolin usage goes above 425MB, my 1GB VPS simply dies and I need to reboot it.


@AlexWhitehouse commented on GitHub (Jan 5, 2026):

> Not sure about the others, but when I look on my server with htop, the process currently consuming about 15 GB (out of 16 on my server) is `node --enable-source-maps dist/server.mjs`.
>
> I am not sure exactly what process this is or which container is running it; maybe someone else can tell.
>
> Edit: Use `docker exec pangolin ps a` to get all the processes in the Pangolin container; the node process mentioned above is the main thread in the Pangolin container. So this error almost certainly comes from Pangolin itself and not Traefik or Gerbil.

I am on Pangolin v1.12.1 and am experiencing the same behaviour.

I'm getting a weird periodic CPU usage spike, alongside ever-increasing memory consumption, with the htop output showing the above process from the Pangolin container as the culprit.

Image

@hansencheck24 commented on GitHub (Jan 6, 2026):

I'm on Pangolin v1.14.1 and having the same issue. For now I have disabled the `maxmind_db_path` in the config and disabled log retention in the settings.

Image

The three memory spikes happened before I disabled both of them; it's currently running with stable memory usage. Hope this helps.


@oschwartz10612 commented on GitHub (Jan 6, 2026):

Yes, that is helpful, thank you! Can anyone else repro?


@kazooie13 commented on GitHub (Jan 6, 2026):

Unfortunately, that didn’t work for me. I’m still getting the same CPU spikes as mentioned above; disabling the path in the config and deactivating log retention didn’t make any difference.

(The problem already existed for me before I even added the country path).


@Josh-Voyles commented on GitHub (Jan 7, 2026):

A couple of findings:

I reverted from 14.1 to 13.1 with request logs still off and didn't have any issues for a couple of days.

Then, in 13.1, I tried turning request logs back on and had issues the same night.

I turned request logs off again (still running 13.1), and everything seems to be running fine.

I'm looking at my pangolin container after running for over 24 hours and memory still seems normal.

I'm running a t2.micro on AWS with 1 vCPU and 1GB of memory.

I hope this info helps track down the problem.


@jjeuriss commented on GitHub (Jan 7, 2026):

I developed an extremely lightweight Prometheus exporter container to measure CPU, memory, PID and disk statistics of containers running on a host. This allows me to inspect the containers of the Pangolin solution (pangolin+traefik+gerbil) on my 1GB VPS. I had to develop my own because typical solutions like `cadvisor` consume around 100MB, which would run my VPS out of memory even more quickly. My custom exporter consumes about 10MB, so that's workable. If you're interested, you can find it here: https://github.com/jjeuriss/tiny-docker-exporter.

I keep seeing the same thing: the Pangolin solution is stable until you start doing some extra monitoring (e.g. uptime-kuma) or letting it handle requests it cannot fulfill (in my case that's scrolling through my photos in the Synology Photos app, which doesn't seem to work through Pangolin). Even my Docker memory cap of 400MB doesn't make my setup stable: the container doesn't seem to get killed in time; my VPS just hangs and SSH is impossible until I reboot it.
I tried my failure scenario (scrolling through photos it cannot access) once more and noticed a high peak in storage being read/written. That may point to some problem?

Image

With these new graphs and a way to reproduce, I'd really like to try some things out.
I tried turning off request logs through the Pangolin GUI, but that hasn't reduced the read/write throughput:

Image

I don't have GeoLite2-Country set up, so I don't see how disabling maxmind_db_path could help. How do I disable that exactly by the way?

Is there any way I can see what Pangolin is doing reading and writing this much?
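One low-overhead way to see this (a hedged sketch under my own assumptions — the helper name and approach are mine, not something Pangolin ships) is to sample the kernel's per-process I/O accounting from `/proc/<pid>/io` inside the container and diff consecutive samples:

```typescript
// Sketch: read Linux per-process I/O counters from /proc/<pid>/io.
// Linux-only; run it inside the container (e.g. via `docker exec`).
import { readFileSync } from "node:fs";

export function ioCounters(pid: number | "self" = "self"): { read: number; write: number } {
    const text = readFileSync(`/proc/${pid}/io`, "utf8");
    const counters: Record<string, number> = {};
    for (const line of text.trim().split("\n")) {
        const [key, value] = line.split(":");
        counters[key.trim()] = Number(value);
    }
    // read_bytes/write_bytes count what actually hit the storage layer,
    // unlike rchar/wchar which also include page-cache traffic.
    return { read: counters["read_bytes"], write: counters["write_bytes"] };
}

// Example: sample the current process.
console.log(ioCounters("self"));
```

Sampling this once a second for the node process and diffing consecutive values would show whether the read spike comes from Pangolin's own process or from elsewhere in the container.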


@asardaes commented on GitHub (Jan 7, 2026):

@jjeuriss for disk IO, see my comment above, that's the exact issue I had, it was memory pressure at the kernel level, which led to using the boot disk as a kind of swap even without a swap partition configured, which killed the whole VM hard.


@jjeuriss commented on GitHub (Jan 8, 2026):

> @jjeuriss for disk IO, see my comment above, that's the exact issue I had, it was memory pressure at the kernel level, which led to using the boot disk as a kind of swap even without a swap partition configured, which killed the whole VM hard.

Nice, thanks, @asardaes! That workaround actually works. I cannot seem to crash my Pangolin anymore with it, and even with ZRAM filled up, it seems to stay responsive.

Image

After scrolling a long time in my (dysfunctional) photos app, I was still able to fill up the ZRAM. Ideally at some point Pangolin evicts the data in it again; I'll monitor that. I assume the memory leak described here is exactly about that, and the ZRAM now delays the point of failure.

Image

These were the commands I used to enable ZRAM (as @asardaes advised). Just putting these in copy-paste ready format for future reference.

Stop Docker containers:

```bash
docker stop pangolin gerbil traefik
```

Install zram-tools:

```bash
apt-get update && apt-get install -y zram-tools
```

Configure zram (512MB):

```bash
sed -i 's/#SIZE=256/SIZE=512/' /etc/default/zramswap
```

Restart zramswap:

```bash
systemctl restart zramswap && sleep 2 && swapon --show
```

Apply VM tuning:

```bash
cat >> /etc/sysctl.conf << 'EOF'
vm.swappiness=10
vm.vfs_cache_pressure=50
vm.overcommit_memory=1
EOF

sysctl -p
```

Restart Docker containers:

```bash
docker start pangolin gerbil traefik
```

Verify:

```bash
free -h && swapon --show
```

**If you want to revert this you can execute:**

Stop Docker containers:

```bash
docker stop pangolin gerbil traefik
```

Stop and disable zramswap:

```bash
systemctl stop zramswap
systemctl disable zramswap
```

Uninstall zram-tools:

```bash
apt-get remove -y zram-tools && apt-get autoremove -y
```

Remove the VM tuning parameters added to `/etc/sysctl.conf` (vm.swappiness, vm.vfs_cache_pressure, vm.overcommit_memory):

```bash
sed -i '/^vm.swappiness=10$/d; /^vm.vfs_cache_pressure=50$/d; /^vm.overcommit_memory=1$/d' /etc/sysctl.conf
```

Apply sysctl changes to reload defaults:

```bash
sysctl -p
```

Restart Docker containers:

```bash
docker start pangolin gerbil traefik
```

Verify the revert:

```bash
free -h && swapon --show
```

@jjeuriss commented on GitHub (Jan 9, 2026):

I've enabled debug logs and added some extra prints to check memory usage.

When my photos app has SSO enabled (and thus needs to go through an extra authentication step), it results in a flood of unauthenticated requests. These unauthenticated requests seem to massively increase heap memory.
When I turn off SSO authentication, my app correctly shows my images and the heap memory stays constant!

Clearly memory is being leaked for unauthenticated requests, even with request logs disabled in the GUI.

I'm working on a fix!
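For anyone wanting to reproduce this kind of measurement, the "extra prints" mentioned above can be as simple as the following sketch (illustrative only, not the actual debug patch):

```typescript
// Sketch: log process memory periodically; a leak shows up as heapUsed
// (or external/rss) climbing across samples while traffic is replayed.
function logMemory(label: string): void {
    const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
    const mb = (n: number) => `${(n / 1024 / 1024).toFixed(1)} MB`;
    console.log(
        `[mem] ${label} rss=${mb(rss)} heapTotal=${mb(heapTotal)} ` +
        `heapUsed=${mb(heapUsed)} external=${mb(external)}`
    );
}

// Sample once a minute; unref() so the timer never keeps the process alive.
setInterval(() => logMemory("periodic"), 60_000).unref();
logMemory("startup");
```

Logging a sample per rejected request while replaying the failure scenario makes the correlation between unauthenticated traffic and heap growth directly visible.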


@Josh-Voyles commented on GitHub (Jan 9, 2026):

I'll just be hanging on 13.1 with request logs off until a fix is released. Stable for a few days now.


@jjeuriss commented on GitHub (Jan 10, 2026):

Did a few more tests to see where the high disk I/O problems started, because I think they are the root cause of the memory leak. Sure, the ZRAM workaround from @asardaes helps to avoid them, but it hides the problem: the leaked memory then builds up as zram swap.
I repeated the same reproduction scenario (scrolling through photos that each get an unauthenticated error) on each of these Pangolin versions:

- 1.11.1
- 1.12.0
- 1.12.3
- 1.13.0

Image

Clearly the problem started at 1.13.0 (and remains on 1.14.1 by the way), so the diff from 1.12.3 to 1.13.0 should reveal it.

Now, I still need to figure out what's causing it... Already tried a couple of things on my fork, but so far, no luck.
I'm thinking now it might be related to the analytics that were added.


@Yonoesio commented on GitHub (Jan 13, 2026):

> I don't know if this will help. I've spent a bit of time, with AI assistance, looking for a "quick-and-dirty" solution to keep my VPS alive. I have neither the technical knowledge nor much English; I'm just trying to contribute.

# Investigation: Memory Leak Management in Pangolin (Virtualized Environment with OPNsense/ICMP)

# ⚠️ Disclaimer and Methodology

This document is the result of experimental research. The author (user) states that they **do not possess the deep technical knowledge** of systems engineering required to resolve the underlying root cause within the Pangolin source code.

The resolution of this problem was achieved through a process of **trial and error** based strictly on tests performed in a production environment, with the support of **Gemini AI** for diagnosis, technical structuring of solutions, and the generation of this English translation. The methods described here are "configuration patches" to ensure service availability, not a fix for the original software's code.

## 1. The Problem: The Memory Leak

During monitoring with `htop` and `docker stats`, an uncontrolled growth of the **RES (Resident Set Size)** memory of the Pangolin process was detected.

- **Starting Point:** The process begins with a healthy consumption of 400 MB.
- **The Leak:** Consumption rises linearly at a rate of 30-50 MB per minute.
- **Critical Point:** Upon reaching 1.5 GB, the system began to overflow. Without limits, the process would consume all 4 GB of the VPS RAM, forcing the use of SWAP.
- **Impact:** The jump in SWAP usage from 93 MB to +170 MB caused a CPU stall (I/O wait), rendering the VPS inaccessible via SSH and dropping the connection for the 15 managed VMs.

## 2. Testing and Diagnosis

Stress tests were conducted focusing on traffic persistence:

- **ICMP:** Latency monitoring to observe the impact of container restarts.
- **OPNsense:** Integration of filtering rules and traffic management, confirming that the collapse was not network-related but due to host resource exhaustion.

## 3. The Solution: The Docker "Cage" (Hard Limits)

Since correcting the memory leak in the source code was not an option, a "self-cleaning" mechanism for the container was implemented.

It was discovered that the standard `deploy: resources` block in Docker Compose does not effectively stop SWAP usage in standalone environments. Therefore, a configuration patch was applied using host-level directives to strictly "enclose" the process.

### Configuration Patch (`docker-compose.yml`):

```yaml
services:
  pangolin:
    image: [YOUR_IMAGE]
    container_name: pangolin
    restart: always
    # Forced physical RAM limit
    mem_limit: 1800M
    # Equalizing RAM+SWAP to prohibit disk usage (SWAP = 0)
    memswap_limit: 1800M
```

## 4. Comparative Results

After applying the limits and observing the system, the results are as follows:

| Metric | Before limits | After limits |
| --- | --- | --- |
| Total VPS RAM | 3.8 GB (saturated) | 2.6 GB (peak usage) |
| Pangolin RES memory | +1500 MB (growing) | 1800 MB limit (reset) |
| SWAP status | +173 MB (growing) | 93.5 MB (frozen) |
| Availability | Total system crash | Functional auto-restart |

## 5. Technical Conclusion

The implemented solution acts as a **safety circuit breaker**. Upon reaching the 1.8 GB limit, the Linux kernel (via Docker) executes an OOM kill on the container. Thanks to the `restart: always` policy, the container restarts in less than 3 seconds with clean memory (400 MB), preventing the VPS from collapsing.

This method ensures that, despite the memory leak, the 15 VMs maintain 99% uptime without manual intervention.


@jjeuriss commented on GitHub (Jan 13, 2026):

Yeah, limiting memory helps in some cases, @Yonoesio, as was mentioned earlier in this thread by @Ragnaruk in https://github.com/fosrl/pangolin/issues/2120#issuecomment-3683502087. In more extreme cases (e.g. a VPS with low memory and a high volume of unauthenticated requests), it doesn't prevent the VPS from hanging.


@jjeuriss commented on GitHub (Jan 13, 2026):

Still haven't found the root cause of the leak. Help is welcome.

My VPS is also a bit too small to test this on properly, because whenever memory hits about 450MB, my VPS already hangs. So I have about 100MB of RAM to play with from the ~350MB it starts up with.


@oschwartz10612 commented on GitHub (Jan 14, 2026):

Appreciating all of the feedback, everyone. We are going to put some real effort into this before 1.15 to see if we can resolve it. Worried it's DEEP 😅


@Yonoesio commented on GitHub (Jan 16, 2026):

# Key findings on Memory Stability and Storage Drivers (Assisted by Gemini AI)

**Technical Disclaimer:** This report was structured and translated with the assistance of Gemini AI. The user (author) performed the empirical testing and environmental changes but does not claim deep expertise in systems engineering. The findings below are based on recent, real-world observations.


**Recent Findings (Casual Discovery):** I would like to share a significant and somewhat casual discovery regarding the memory leak reported in this issue. After experiencing constant system crashes on a 4GB RAM VPS, I performed a clean migration of my stack. I cannot strictly confirm that there is a direct technical correlation, but the change in stability has been spectacular.

1. **Infrastructure Change:** Migrated from a raw `containerd` setup to Docker with the `overlay2` storage driver.
2. **Strict Resource Constraints:** Implemented hard limits in Docker Compose (outside the `deploy` block to ensure enforcement):

```yaml
mem_limit: 1800M
memswap_limit: 1800M
```

Real-time Observations: While I am waiting for more time to pass to generate a complete usage graph, the visual evidence is clear. I am now seeing active memory releases every 10 to 15 minutes, a behavior that was non-existent before.

- **Current Peak:** ~495 MB (with 15 VMs active).
- **Post-Garbage Collection (GC):** The memory successfully drops back to a stable baseline of 354-380 MB.

Conclusion: I don't have the technical background to explain why, but switching to the Docker Storage Driver (overlay2) combined with hard limits has transformed a broken system into a stable one. Previously, memory grew linearly until a total host crash. Now, the Node.js Garbage Collector seems to be functioning correctly in a "sawtooth" pattern.
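For what it's worth, the sawtooth is ordinary V8 garbage-collector behaviour; if you want GC pressure to kick in well below the container's hard cap, one option (an assumption about deployment, not a Pangolin setting) is to set `NODE_OPTIONS=--max-old-space-size=<MB>` on the container and verify the effective ceiling from inside the process:

```typescript
// Sketch: print the heap ceiling V8 is actually running with.
// heap_size_limit reflects --max-old-space-size plus V8's other spaces,
// so expect it to read somewhat above the configured value.
import { getHeapStatistics } from "node:v8";

const heapLimitMb = getHeapStatistics().heap_size_limit / 1024 / 1024;
console.log(`V8 heap_size_limit: ${heapLimitMb.toFixed(0)} MB`);
```

With the V8 limit below the Docker `mem_limit`, the process hits GC pressure (or an in-process out-of-memory abort) before the kernel OOM-kills the whole container.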

I will provide a full graph once it's completed, but I wanted to share this "spectacular" improvement immediately as it might provide a clue to the developers or relief to other users.

Image


@Josh-Voyles commented on GitHub (Jan 18, 2026):

Quick update: In 13.1, even with logs off, I eventually had problems again; it just took a whole week to manifest.


@rex1234 commented on GitHub (Jan 18, 2026):

No settings mentioned here helped me get rid of this memory leak. The node process starts eating my memory until Pangolin restarts, which makes all sites unavailable for a while multiple times per day. This bug should finally get full attention; it's been open for more than a month, and the availability of all running services suffers from it.


@jjeuriss commented on GitHub (Jan 22, 2026):

I agree, this is IMHO the top priority bug.

I can't use v1.14.1 or higher for more than half a day due to this bug. I've already tried a few things on my fork of this project, but none have resolved it so far.

Things I know so far:

- There's a peak in read I/O (>100MB/s) that happens after a bunch of unauthenticated requests. This peak does not occur for authenticated requests. The peak is easily visible when capturing the docker metrics with tiny-docker-exporter.
- Unauthenticated requests seem to drive up memory usage fast, while authenticated requests do not. Due to this, bot scans drive up memory usage over time.
- The problem started at version 1.13.0 and is not reproducible in 1.12.3. I ran multiple tests to confirm this. The high I/O also does not occur on 1.12.3.
- 1.13.0 has a higher base memory usage than 1.12.3.
- There are multiple layers in the code where caches could be added to reduce the frequency of database operations, but this doesn't avoid the big I/O peak.
- Disabling request logs doesn't help.
- Systematic testing with feature flags (https://github.com/jjeuriss/pangolin/commit/cbe315c2):
  - DISABLE_AUDIT_LOGGING=true → issue still reproduced (not audit logging)
  - DISABLE_GEOIP_LOOKUP + DISABLE_ASN_LOOKUP → issue still reproduced (not geo/ASN lookups)
  - DISABLE_RULES_CHECK=true → issue still reproduced (not the rules check)
  - DISABLE_SESSION_QUERIES=true → issue NOT reproduced, but this breaks auth entirely (not a viable fix)

Attempted fixes (all failed to resolve the issue):

  1. Added comprehensive debug logging (https://github.com/jjeuriss/pangolin/commit/6754a9f1) to trace database queries, cache hits/misses, and request flows - This helped identify patterns but didn't fix the issue
  2. Fixed infinite redirect loop (https://github.com/jjeuriss/pangolin/commit/99cdbed2) - Auth pages were redirecting to themselves with nested redirect parameters. Fixed the redirect logic, but the I/O spike persisted.
  3. Increased resource cache TTL from 5s to 60s (https://github.com/jjeuriss/pangolin/commit/96587485) - Reduced database query frequency but the I/O spike still occurred
  4. Added React cache() wrapper to verifySession (https://github.com/jjeuriss/pangolin/commit/277aef5a) - This didn't work because React's cache() only deduplicates during SSR of a single page render, not across HTTP requests
  5. Added server-side caching to /api/v1/user endpoint (https://github.com/jjeuriss/pangolin/commit/a5932f95) - Reduced database queries from 562 to 2 per test (99% reduction), but the VPS still froze and I/O spikes persisted
  6. Added caching to session verification queries (https://github.com/jjeuriss/pangolin/commit/691e582a) - Cached getUserSessionWithUser, getUserOrgRole, getRoleResourceAccess, and getUserResourceAccess with 60-second TTL. Achieved 95%
    reduction in database queries and kept them fast, but the memory still grew, I/O spikes still occurred, and VPS still froze
  7. Feature flags for systematic testing (https://github.com/jjeuriss/pangolin/commit/cbe315c2) - Added flags to disable audit logging, GeoIP lookups, ASN lookups, rules checking, and org policy checks. None of these disabled features
    prevented the issue.

The problem remains 100% reproducible and makes v1.13.0+ unusable in production with high volumes of unauthenticated requests.

At this point I need help from the core team to identify what changed in v1.13.0 that could cause this. I've ruled out the suspects I could think of and am stuck.
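The bounded-cache idea raised earlier in this thread (a separate LRU cache with a TTL and a key limit, so cache memory cannot grow without bound) can be sketched in a few lines. This is a hypothetical illustration, not Pangolin's actual cache:

```typescript
// Minimal bounded LRU cache with per-entry TTL. All names are
// hypothetical; a Map preserves insertion order, which we use as
// the recency order.
type Entry<V> = { value: V; expiresAt: number };

class LruTtlCache<V> {
    private map = new Map<string, Entry<V>>();

    constructor(private maxKeys: number, private ttlMs: number) {}

    get(key: string): V | undefined {
        const entry = this.map.get(key);
        if (!entry) return undefined;
        if (Date.now() > entry.expiresAt) {
            this.map.delete(key); // expired: drop it
            return undefined;
        }
        // Re-insert to mark the key as most recently used.
        this.map.delete(key);
        this.map.set(key, entry);
        return entry.value;
    }

    set(key: string, value: V): void {
        if (this.map.has(key)) this.map.delete(key);
        this.map.set(key, { value, expiresAt: Date.now() + this.ttlMs });
        if (this.map.size > this.maxKeys) {
            // Evict the least recently used key (first in insertion order).
            const oldest = this.map.keys().next().value as string;
            this.map.delete(oldest);
        }
    }

    get size(): number {
        return this.map.size;
    }
}
```

With a hard key limit, worst-case cache memory is bounded regardless of how many distinct unauthenticated IPs or paths hit the server.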


@Vangreen commented on GitHub (Jan 26, 2026):

For me, version 1.15 fixed the problem.
Before, there were 1-2 restarts per day. Now it has run for 2 days without high RAM usage.

![Image](https://github.com/user-attachments/assets/e5a58e52-d7b9-4a01-b43f-2668d2ea6de8)

@oschwartz10612 commented on GitHub (Jan 26, 2026):

Good to know @Vangreen thank you! I forgot to update this thread but we made some improvements in 1.15. Could everyone try it out and let me know if the issue still persists?


@n1LWeb commented on GitHub (Jan 27, 2026):

For me the issue still persists in 1.15.1, and the process locks up way before the 24GB of memory on my VPS is filled. My limit is now set at 1000MB, and the restart happens about every 2 hours.

![Image](https://github.com/user-attachments/assets/b5ab34e3-1822-4da3-b4ff-4230ead30f86)

The limit is set in docker to prevent pangolin from growing more and locking up eventually.


@Vangreen commented on GitHub (Jan 27, 2026):

@n1LWeb do you have health checks set up for your resources? I've noticed that this high-RAM-usage behavior begins when I set up status monitoring for my resources.


@n1LWeb commented on GitHub (Jan 27, 2026):

@Vangreen Yes, for almost all resources I'm using multiple targets per resource. At the moment I'm just testing with 2 newt connections over the same internet connection, but soon the 2 newt connections will be routed via 2 different internet connections (DSL/fibre). Then I'll need the health checks so Pangolin will not route over a failing connection.


@rex1234 commented on GitHub (Jan 27, 2026):

I can confirm the issue is caused by unauthenticated requests that some of my services were making; after I fixed those so everything is properly authorized, the memory leak seems to be gone.


@jjeuriss commented on GitHub (Jan 27, 2026):

I’m still seeing the same issues in 1.15.1. This thread already points out that unauthenticated failures are at the root of the problem. Note that unauthenticated requests aren’t only caused by monitoring; they’re also triggered by bots crawling a domain and trying every available page. Monitoring does make it worse, but it isn’t the sole cause.

The underlying unauthenticated-request problem does not appear to be fixed. In 1.12.3, these failures do not result in significant I/O usage or memory spikes. However, starting with 1.13.0 and continuing through 1.15.1, they clearly do.

After sending a burst of unauthenticated requests (~500–1000), I observed a massive spike in read I/O on my VPS.
A few hours later, the system completely hung after a second spike occurred (not manually triggered, probably a scan).

![Image](https://github.com/user-attachments/assets/70df690f-3d8a-4594-90c4-3f4b37d9087c)

These kinds of high I/O peaks don't occur on 1.12.3 (I used that version again for the last 4 days and saw no issues). Going back to it now. Looking forward to a fix for this issue still!


@SamTV12345 commented on GitHub (Jan 28, 2026):

Same issue for me. With 1.15.1 my 1GB vm is OOM after less than 2 days.


@Boscovitz commented on GitHub (Jan 28, 2026):

> Same issue for me. With 1.15.1 my 1GB vm is OOM after less than 2 days.

Same here with 1GB. Every 2-3 days and I have to restart the vps because it hangs OOM.


@oschwartz10612 commented on GitHub (Jan 30, 2026):

Hmm, wonder if it's a dependency... Will experiment.


@formless63 commented on GitHub (Feb 4, 2026):

Happening to me as well. 2GB VPS goes OOM every 24-36 hours.

Edit: can confirm this continued when bumping to 1.15.2 as well. Potentially even happening faster now.

![Image](https://github.com/user-attachments/assets/9c9b237f-2d26-44dc-a9a6-8cfcbc71b131)

@maiestro commented on GitHub (Feb 12, 2026):

Hello,

I wanted to ask if there is any news regarding this issue? I am currently using Pangolin v.1.11.1, which is essentially the last version where the memory issue has not occurred.

I am grateful for any information.
Best regards


@ghostklart commented on GitHub (Feb 21, 2026):

Hello, I've actually started to have the same issue on 1.15.4.
Will try to revert back to 1.12.3.


@N3m351x commented on GitHub (Feb 24, 2026):

> Hello, I'm actually started to have the same issue on 1.15.4. Will try to reverse back to 1.12.3.

Did the downgrade fix your issue?


@ChrissiBe commented on GitHub (Feb 24, 2026):

Same issue on my VPS with 1GB memory. After 1-2 days the system ran out of memory; no login possible.
Downgrading to 1.13.0 works with no problems. I tried Debian and Ubuntu minimal configurations, but same problem.
1.13.0 works; 1.13.1 and newer run out of memory.


@maiestro commented on GitHub (Feb 24, 2026):

I also have a VPS with 1GB RAM. The problem occurred after about 6 hours for me.
I had problems with the following configuration:

Pangolin v. 1.11.1
Gerbil v. 1.3.0
Traefik v. 3.6.7 (plugins: badger v. 1.3.1; geoblock v. 0.3.6)

@ChrissiBe: Could you please also tell me your (currently working) version numbers? I would then like to try exactly the same setup on my VPS.

The version numbers for Pangolin, Gerbil, and Traefik are located in
`{DOCKERINSTALLPATH}/config.yml`

and for the Traefik plugins under:
`{DOCKERINSTALLPATH}/traefik/traefik_config.yml` in the experimental->plugins section.

EDIT: Pangolin Version


@kazooie13 commented on GitHub (Feb 24, 2026):

Could you please provide us with an update of the current status of the issue?

It has been open for over two months now and has become the issue with the most comments/reports.

If no more resources are being allocated to resolving it, I will need to look for an alternative.


@ChrissiBe commented on GitHub (Feb 24, 2026):

@maiestro
Pangolin 1.13.0
Gerbil 1.3.0
Traefik 3.6.7

I tested all of the Pangolin versions up to 1.15.4.
All of them have this out-of-memory problem. I did a fresh installation with Ubuntu (22 and 24) and Debian (12 and 13).

Only 1.13.0 and older are working.
Now I installed with the quick-setup script, ran docker compose down, set the Pangolin version to 1.13.0 in the .yml,
then docker compose pull and docker compose up -d.


@huzky-v commented on GitHub (Feb 25, 2026):

Not sure if it helps, but I tried downgrading the zod dependency to v3 and updating the code to adapt to zod/v3 with codex (so use it at your own risk; some schema definitions may not be accurate).

The downgrade is based on the 1.15.4 codebase.
Observed with --inspect that the heap usage is lower when starting the Pangolin stack.
I wonder if anyone can test this with real traffic, as I don't have that VPS size and traffic.

The POC branch is here: https://github.com/huzky-v/pangolin/tree/zod-v4-to-v3 (you may check the code and build the Docker image, or just use docker.io/xerial817/pangolin-zod-poc:latest if you are in yolo mode).

There's also an unresolved report of a zod/v4 memory leak: https://github.com/colinhacks/zod/issues/5490

EDIT: seems not working 😫


@n1LWeb commented on GitHub (Feb 26, 2026):

@huzky-v I tried it and got the increasing memory usage on this version, too.


@duchu commented on GitHub (Feb 27, 2026):

On version 1.16.0, the problem still occurs.


@Josh-Voyles commented on GitHub (Feb 27, 2026):

Are we missing something? The release candidate said no known bugs. Do I need to scrap my install and rebuild from scratch? Use RHEL instead of Ubuntu? I'm happy to do whatever; I just need instructions.

Also, is it just this thread of people having issues? What are the other users doing that we aren't?


@formless63 commented on GitHub (Feb 27, 2026):

> Are we missing something? The release candidate said no known bugs.

I get the feeling that there's been zero effort to investigate. I'm wondering if we all have something in common.

Personally, Ubuntu, running in docker compose, multiple newt sites, many popular selfhosted apps underneath. I added f2b on the VPS after this started as I initially assumed it was related to getting spammed with failed connection/auth attempts.

When digging and trying to resolve I made some notes:

  • Node.js heap was tiny (~5.5MB used) — the leak was not in the JS heap, pointing to native memory.
  • DB size was stable at 113MB.
  • Log files were negligible (15KB for Pangolin, 32MB Traefik access.log)
  • save_logs: true in config was not the issue
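The heap-vs-native distinction in the notes above can be confirmed from inside the process by logging `process.memoryUsage()` periodically. A minimal sketch (the helper name is hypothetical): if `rss` climbs while `heapUsed` stays flat, the growth is outside V8.

```typescript
// Log JS-heap vs resident-set size to tell a JS leak from a native one.
// "Native" growth (rss up, heapUsed flat) points at Buffers, zlib,
// sockets, or native addons rather than JS objects.
function memorySnapshot() {
    const m = process.memoryUsage();
    const mb = (n: number) => Math.round((n / 1024 / 1024) * 10) / 10;
    return {
        rssMb: mb(m.rss),            // total resident memory of the process
        heapUsedMb: mb(m.heapUsed),  // live JS objects in the V8 heap
        externalMb: mb(m.external),  // C++ objects bound to JS (Buffers etc.)
    };
}

// e.g. log once a minute:
// setInterval(() => console.log(JSON.stringify(memorySnapshot())), 60_000);
```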

@Josh-Voyles commented on GitHub (Feb 27, 2026):

> I get the feeling that there's been zero effort to investigate. I'm wondering if we all have something in common.

I know there's been work by the devs and community, but it seems like it's not clear what's going on. I'm sure if the majority of users and their SaaS platform were having issues, all efforts would be focused on this. However, I'm not convinced that's the case.

So, that's why I'm asking what needs to change on my end.


@AlexWhitehouse commented on GitHub (Feb 27, 2026):

I was having the issue; I restarted the container having changed nothing and am no longer experiencing it. Unhelpful, I know, but it suggests this is more of a race condition than something permanent.


@SamTV12345 commented on GitHub (Feb 27, 2026):

It still occurs for me. I "solved" the issue by adding a cron job that restarts my 1 GB VPS every midnight.


@huzky-v commented on GitHub (Feb 27, 2026):

> > Are we missing something? The release candidate said no known bugs.
>
> I get the feeling that there's been zero effort to investigate. I'm wondering if we all have something in common.
>
> Personally, Ubuntu, running in docker compose, multiple newt sites, many popular selfhosted apps underneath. I added f2b on the VPS after this started as I initially assumed it was related to getting spammed with failed connection/auth attempts.
>
> When digging and trying to resolve I made some notes:
>
> • Node.js heap was tiny (~5.5MB used) — the leak was not in the JS heap, pointing to native memory.
> • DB size was stable at 113MB.
> • Log files were negligible (15KB for Pangolin, 32MB Traefik access.log)
> • save_logs: true in config was not the issue

I have tried some testing on my 1GB test VPS; the command is
`echo "GET https://resource.protected.ltd" | vegeta attack -duration=3600s --rate=10 | vegeta report`
which basically sends 10 requests/s to the target protected resource.

Things I tried:

  1. Using the same node image as 1.12.3, which is node 22
  2. Not using the alpine image
  3. Downgrading some of the packages
  4. Removing some of the logic for the unauth case

Here are some of my observations during my debug, build, test loop:

  1. Even on 1.12.3, the memory still grows on unauthenticated requests, to the point that docker stats is not able to return data
  2. The heap snapshot on 1.12.3 is around 200MB, and it grows as the versions progress.
  3. Even if I remove all logic in src/app/auth/resource/[resourceGuid]/page.tsx, which shows the auth page when not authenticated, the memory still grows
  4. Downgrading the zod library as stated above did cut heap usage by about 10%, and docker stats doesn't hang as soon, but it will still crash.
  5. When comparing heap diffs, there are always a bunch of strings, i18n stuff, and some zlib objects hanging around
  6. My best guess is that 1.12.3 is still OK because its base memory usage is low enough for the VPS to run, leaving headroom for GC. The memory leak? I think it still exists, but it's just too hard to locate, and hard to find a minimal reproducible snippet for.

Dang. I am not affected by this case (I have a large-memory VPS instance, and frankly it's not growing too much memory), but I am literally out of ideas.
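Writing V8 heap snapshots on demand makes the "compare heap diffs" workflow above repeatable: take one snapshot, replay the unauthenticated burst, take another, and diff the two in Chrome DevTools. A hypothetical helper (not part of Pangolin), using Node's built-in `v8.writeHeapSnapshot`:

```typescript
import { writeHeapSnapshot } from "node:v8";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Write a .heapsnapshot file and return its path. Synchronous: it
// blocks the event loop while writing, which is acceptable for
// debugging but not under production traffic.
function dumpHeap(dir: string = tmpdir()): string {
    return writeHeapSnapshot(join(dir, `pangolin-${Date.now()}.heapsnapshot`));
}

// Example: trigger a snapshot from outside the container with
// `kill -USR2 <pid>` (the signal choice here is arbitrary).
// process.on("SIGUSR2", () => console.log("wrote", dumpHeap()));
```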


@n1LWeb commented on GitHub (Feb 27, 2026):

Some insights:

If I disable all health checks, the memory usage is mostly stable. But I need them, as most of my resources are reachable over two different newt instances.

Maybe the people without the issue haven't enabled health checks yet?

I switched from my 1GB x86 VPS to my 24GB ARM VPS, but Pangolin still crashes when it grows past 1GB of RAM usage. Setting a 900MB limit in Docker restarts the container about every 2 hours, but it's usable.
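If health checks aggravate the problem, one plausible mechanism is probes piling up against a slow or unreachable target. A leak-resistant loop keeps at most one probe in flight per target and aborts each probe after a deadline. A hypothetical sketch (not Pangolin's implementation; requires Node 18+ for global `fetch` and `AbortSignal.timeout`):

```typescript
// Probe a URL, treating timeouts and connection errors as "unhealthy"
// instead of leaving the request pending.
async function probe(url: string, timeoutMs = 5000): Promise<boolean> {
    try {
        const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
        return res.ok;
    } catch {
        return false; // timeout, DNS failure, connection refused, ...
    }
}

async function healthCheckLoop(url: string, intervalMs: number): Promise<never> {
    // Awaiting the probe before sleeping guarantees at most one request
    // in flight per target, so a slow target cannot accumulate sockets.
    while (true) {
        const healthy = await probe(url);
        console.log(`${url} healthy=${healthy}`);
        await new Promise((r) => setTimeout(r, intervalMs));
    }
}
```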


@Alloc86 commented on GitHub (Feb 27, 2026):

Just to chime in as I have a different scale of proxy on my end:

  • Running Debian 13
  • Pangolin in Docker
  • Only local proxy for three public resources -> docker instances on the same host, no Newt or anything
  • Little traffic, as it's only resources used by myself (probably some bot traffic though)
  • Unfortunately I started with Pangolin 1.13, so no experience with the older version

I can't reproduce the issue on demand, but it has locked up 3-5 times due to memory exhaustion. The current "session" has been fine for 3-4 weeks already though (with no changes on my end); maybe fewer bots are hitting it or something.

<!-- gh-comment-id:3974434770 -->

@oschwartz10612 commented on GitHub (Feb 27, 2026):

Thanks everyone for the continued information and concerns.

We are looking at it, but it has been hard to pin down. With all of the reports in here, I am not sure there has been a "smoking gun" I can just go fix. On top of that, applying the dependency updates from Dependabot did not fix it either, if it is a dependency issue.

I will make it a point to look into this again with the new info ASAP, and maybe we can do a patch or two.

What is even more baffling is that we have thousands of users and sites on the cloud, yet we don't see the issue there, LOL. So all I can say is we are still thoroughly confused but want to get this resolved!

In the meantime, I would highly suggest adding resource limits to the container; Docker should handle killing and restarting it:

https://docs.docker.com/reference/compose-file/deploy/#resources
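
For reference, a minimal sketch of such a limit in a compose file, per the linked documentation (the service name, image tag, and 512M threshold here are placeholders; adjust them to your stack):

```yaml
services:
  pangolin:
    image: fosrl/pangolin:latest   # placeholder tag
    restart: unless-stopped        # bring the container back after an OOM kill
    deploy:
      resources:
        limits:
          memory: 512M             # past this, the kernel OOM-kills the container
```

With `restart: unless-stopped` set, Docker restarts the container automatically after it is killed, which keeps the proxy available even while the leak remains unfixed.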

<!-- gh-comment-id:3974490069 -->

@formless63 commented on GitHub (Feb 27, 2026):

> I will try to make it a point to look into this again with the new info ASAP and maybe we can do a patch or two or something.

If there is anything specific those of us who are affected can do to help, please let us know. I'm happy to set up custom logging of some sort if there are configurations that might bring more details to light, or anything else that might produce good data for you to work with.

Thanks for all of the work you do!

<!-- gh-comment-id:3974680898 -->

@Joly0 commented on GitHub (Feb 27, 2026):

By the way, I was affected by this problem as well a while ago. I had added a lot of things to my Pangolin stack (like the traefik-dashboard and other things by hhftechnology). I tried resetting and re-installing Pangolin, adding only the crowdsec and geoblock-updater containers to the stack, and so far everything is buttery smooth and stable.

<!-- gh-comment-id:3975640975 -->

@Ragnaruk commented on GitHub (Mar 2, 2026):

> What is even more baffling is we have thousands of users and sites on the cloud yet we dont see the issue LOL so all I can say is we are still throughly confused but want to get this resolved!

Could it be the sqlite driver? You probably use Postgres in your cloud.

<!-- gh-comment-id:3985855134 -->

@n1LWeb commented on GitHub (Mar 2, 2026):

> > What is even more baffling is we have thousands of users and sites on the cloud yet we dont see the issue LOL so all I can say is we are still throughly confused but want to get this resolved!
>
> Could it be the sqlite driver? You probably use Postgres in your cloud.

I'm using sqlite and have the issue.

Others?

<!-- gh-comment-id:3986173892 -->

@joerg-hro commented on GitHub (Mar 2, 2026):

> > > What is even more baffling is we have thousands of users and sites on the cloud yet we dont see the issue LOL so all I can say is we are still throughly confused but want to get this resolved!
> >
> > Could it be the sqlite driver? You probably use Postgres in your cloud.
>
> I'm using sqlite and have the issue.
>
> Others?

Me too.

<!-- gh-comment-id:3986562387 -->

@Josh-Voyles commented on GitHub (Mar 2, 2026):

Sqlite here.

<!-- gh-comment-id:3987511814 -->

@oschwartz10612 commented on GitHub (Mar 4, 2026):

Ahh yes, this is good info. It must be the sqlite driver or something related, then. This helps narrow it down! Let me do some thinking. Maybe it's time to move to libsqlite3 and get off better-sqlite...

<!-- gh-comment-id:3999494021 -->

@hansencheck24 commented on GitHub (Mar 4, 2026):

I'm using `ghcr.io/fosrl/pangolin:postgresql-1.15.1` and have the issue.

<!-- gh-comment-id:3999591480 -->

@maiestro commented on GitHub (Mar 9, 2026):

In the meantime, I have installed Pangolin on two different small VPSs, each with 1GB RAM, 1-core vCPU, and the latest Debian 13 for testing purposes:

On the IONOS system, a failure occurs almost immediately (even the eth0 interface fails after a while).
Based on my testing, the failure occurs shortly after I visit the Pangolin configuration web interface to make some settings.

On the Netcup system, Pangolin runs with almost 80% RAM utilization, but has been stable so far (3 days).

```
root@IONOS-VPS# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Vendor ID: AuthenticAMD
Model name: AMD EPYC-Milan Processor
CPU family: 25
...
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
```

```
root@NETCUP-VPS# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Vendor ID: GenuineIntel
Model name: QEMU Virtual CPU version 2.5+
CPU family: 15
...
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
```

Perhaps other systems with Pangolin problems look similar to my IONOS VPS system?

<!-- gh-comment-id:4023556862 -->

@n1LWeb commented on GitHub (Mar 9, 2026):

For me the issue exists on a RackNerd VPS and on an Oracle Free Tier ARM VPS.

On both only if I have activated health checks.

```
oracle$ lscpu
Architecture:             aarch64
  CPU op-mode(s):         32-bit, 64-bit
  Byte Order:             Little Endian
CPU(s):                   4
  On-line CPU(s) list:    0-3
Vendor ID:                ARM
  Model name:             Neoverse-N1
    Model:                1
    Thread(s) per core:   1
    Core(s) per cluster:  4
    Socket(s):            -
    Cluster(s):           1
    Stepping:             r3p1
    BogoMIPS:             50.00
    Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
NUMA:                     
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-3
```

Racknerd:

```
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  1
  On-line CPU(s) list:   0
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Red Hat
  Model name:            Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    BIOS Model name:     RHEL 7.6.0 PC (i440FX + PIIX, 1996)  CPU @ 2.0GHz
    BIOS CPU family:     1
    CPU family:          6
    Model:               79
    Thread(s) per core:  1
    Core(s) per socket:  1
    Socket(s):           1
    Stepping:            1
    BogoMIPS:            5199.99
Virtualization features: 
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   32 KiB (1 instance)
  L1i:                   32 KiB (1 instance)
  L2:                    4 MiB (1 instance)
  L3:                    16 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0
```
<!-- gh-comment-id:4023621232 -->

@0i5e4u commented on GitHub (Mar 9, 2026):

Same problems here:

```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Vendor ID: AuthenticAMD
BIOS Vendor ID: QEMU
Model name: AMD EPYC-Milan Processor
BIOS Model name: pc-i440fx-6.1 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 25
Model: 1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
Stepping: 1
BogoMIPS: 3992.49
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat npt nrip_save umip pku ospke vaes vpclmulqdq rdpid
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
```

Hosted on Strato

<!-- gh-comment-id:4024652608 -->

@harrybaumann commented on GitHub (Mar 10, 2026):

I have the issue on the same 1 GB IONOS VPS as mentioned above. Sqlite version of Pangolin, using 3 Newt connections. There is not much load, as there are only a handful of users.

However, the Pangolin instance stops responding after some hours. Sometimes it stays responsive for 1 or 2 days, but not more. It is then necessary to restart the instance via the IONOS management console.

<!-- gh-comment-id:4031685675 -->

@joerg-hro commented on GitHub (Mar 10, 2026):

> I have the issue on the same 1 GB IONOS VPS as mentioned above. Sqlite version of pangolin. Using 3 Newt connections. There is not much load, as there are only a hand-full of users.
>
> However, the pangolin instance stops responding after some hours. Sometimes it stays responsible for 1 or 2 days, but not more. It is necessary to restart the instance via IONOS management console.

I have the same configuration with the same issues.

<!-- gh-comment-id:4031797703 -->

@xylcro commented on GitHub (Mar 10, 2026):

Same issue here. Also using sqlite

<!-- gh-comment-id:4033472871 -->

@Madnex commented on GitHub (Mar 13, 2026):

Same issue here as well. I started using Gatus for monitoring and had a config that hit the Pangolin auth page for the checks. Memory consumption of the Pangolin container then went up steadily until I fixed the Gatus config. I still had to redeploy Pangolin afterwards to free up the memory. Using Pangolin v1.16.2.

<!-- gh-comment-id:4057352399 -->

@xylcro commented on GitHub (Mar 16, 2026):

> Same issue here as well. I started to use gatus for monitoring and had used a config that hit the pangolin auth page for the checks. Memory consumsuption of the pangolin container went up steadily then until I fixed the gatus config. Then still had to redeploy pangolin to free up the memory. Using pangolin v1.16.2

I use Gatus too, how'd you fix your config?

<!-- gh-comment-id:4065961379 -->

@sambilbow commented on GitHub (Mar 16, 2026):

I also use Gatus to hit my proxied endpoints... Interesting

<!-- gh-comment-id:4065969172 -->

@Madnex commented on GitHub (Mar 16, 2026):

> > Same issue here as well. I started to use gatus for monitoring and had used a config that hit the pangolin auth page for the checks. Memory consumsuption of the pangolin container went up steadily then until I fixed the gatus config. Then still had to redeploy pangolin to free up the memory. Using pangolin v1.16.2
>
> I use Gatus too, how'd you fix your config?

I just added IP-based bypass rules in Pangolin. It was a bit tricky to find out when the check is actually working rather than hitting the authentication page. One thing that helped was adding this condition: `"[BODY] != pat(*Powered by Pangolin*)"`

However, even after fixing this, I see that the Pangolin container is still steadily grabbing more RAM, just much more slowly now. I assume it adds up every time someone tries to access an endpoint and hits the authentication page. At least it's not my own monitoring anymore...
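
For anyone else pointing Gatus at a Pangolin-proxied resource, a sketch of an endpoint using that condition (the endpoint name, URL, and interval are placeholders; the body pattern is the one quoted above, which fails the check when the Pangolin auth page answers instead of the backend):

```yaml
endpoints:
  - name: my-app                        # placeholder
    url: "https://app.example.com"      # placeholder: the proxied resource
    interval: 5m
    conditions:
      - "[STATUS] == 200"
      - "[BODY] != pat(*Powered by Pangolin*)"  # fail if the auth page answered
```

Combined with an IP-based bypass rule for the monitor's address, this keeps the health check from hammering the auth page on every run.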

<!-- gh-comment-id:4066104876 -->

@dunamos commented on GitHub (Mar 16, 2026):

Hi!

I have the same issue, and I only noticed it after adding a monitoring tool (Dockhand in my case).
I am not sure which is the chicken and which the egg: did adding the monitoring cause the OOM issues (I have since added a RAM limit), or was it happening before and I just didn't notice?

<!-- gh-comment-id:4067680852 -->

@formless63 commented on GitHub (Mar 16, 2026):

Also using Gatus to do health checks here, but I do have bypass rules in place.

It certainly seems related to authorization hits. When I set up f2b and started blocking bots and such, the memory growth slowed but has not completely stopped.

<!-- gh-comment-id:4067722569 -->

@harrybaumann commented on GitHub (Mar 16, 2026):

I think I have found "my" issue with a 1 GB VPS. Although the RAM filled up quite fast, it wasn't the reason the instance became unavailable. I noticed that the instance's 10 GB hard disk was completely full (100%) when the machine stopped working.

Due to a sqlite database that accumulated more and more data (multiple GB, maybe logging?) and a Docker image that grew with every version update, the lifetime of my small instance got shorter and shorter.

I've "fixed" the issue by upgrading to a bigger instance with more disk space. Pangolin has now worked well for 5 days on the new machine running a copy of the original Docker volume, so I believe the crashes are gone.

Maybe the issue I had wasn't the one discussed in this topic.

<!-- gh-comment-id:4067791487 -->

@dunamos commented on GitHub (Mar 16, 2026):

> I think I have found "my" issue with a 1 GB VPS. Although the RAM filled up quite fast, it wasn't the reason for the instance to become unavailable. I noticed that the instance's harddisk of 10 GB was completely full (100%), when the machine stopped working.
>
> Due to a sqllite database that had more and more data in it (multiple GB, maybe logging?) and a docker image that became larger with every version update, the lifetime of my small instance got shorter.
>
> I've "fixed" the issue by upgrading to a bigger instance with more harddisk space. I can confirm that pangolin works well for 5 days now on this new machine running a copy of the original docker volume, so I believe, the crashes are gone.
>
> Maybe the issue I had wasn't the issue discussed in this topic.

In my case, my VPS became unresponsive due to kswapd0 using 100% of my CPU.
That makes sense, because my RAM was saturated and my storage space was very low, so no swap was available.
After cleaning up my VPS storage a bit and adding a 500M RAM limit to my Pangolin container, I have fewer issues.

<!-- gh-comment-id:4067818272 -->

@xylcro commented on GitHub (Mar 16, 2026):

After removing the proxy host from Gatus monitoring, the RAM usage appears to remain stable. It definitely has something to do with Gatus hitting the auth page...

<img width="748" height="312" alt="Image" src="https://github.com/user-attachments/assets/729cb903-24f1-4088-a7b4-448aa1f7d972" />
<!-- gh-comment-id:4068211508 -->

@n1LWeb commented on GitHub (Mar 16, 2026):

Not specific to Gatus; any repeated access does it, I think. But thanks for bringing Gatus to my attention; it might be time to switch, as it looks like a better fit for my workflow than Uptime Kuma.

I don't check many of my Pangolin services with Uptime Kuma, though. Instead I use Pangolin's own health checks, and it makes a huge difference if I disable them; sadly, I need them for failover.

The other thing that was new around the time the problems started is the country filtering. On the 24GB RAM VPS I'm using, Pangolin becomes unresponsive as soon as it hits around 1GB of RAM, well before memory is full.

<!-- gh-comment-id:4068277419 -->

@cmmrandau commented on GitHub (Mar 19, 2026):

It did it again. This is a vanilla setup with crowdsec. The only other services running are the Arcane agent and Watchtower (and the Pulse agent as a systemd service).

<img width="1657" height="522" alt="Image" src="https://github.com/user-attachments/assets/68cd0a37-a173-4a63-9f63-2a24ad7613df" />
<!-- gh-comment-id:4092820218 -->

@TubaApollo commented on GitHub (Mar 22, 2026):

I am also experiencing an issue with some kind of memory leak, so I tried to trace it down.

I took two V8 heap snapshots from the running server process (`ee-1.16.2`, Node `v24.14.0`), the second **after forcing GC** via `HeapProfiler.collectGarbage`:

| Object | 20 min uptime | 1.5h uptime (post-GC) | Growth |
|---|---|---|---|
| `Gzip` | 810 | 1,059 | +249 |
| `ServerResponse` | 814 | 1,063 | +249 |
| `zlib_memory` (native, ~263 KB each) | 807 (219 MB) | 1,055 (286 MB) | +248 (+67 MB) |

All leaked sockets trace back to the Next.js server.

The bulk of the leaked memory is native zlib allocations:

- Container RSS after 1.5h: 1.54 GiB
- V8 heap: 253 MB
- Untracked native (zlib): ~1.2 GB

The compression is applied in `node_modules/next/dist/server/lib/router-server.js`:

```js
if ((config?.compress) !== false) {
    compress = (0, _compression.default)();
}
// ...
if (compress) {
    compress(req, res, ()=>{});
}
```

Disabling compression at `node_modules/next/dist/server/lib/router-server.js` line 109 helped in my case:

```diff
- if ((config == null ? void 0 : config.compress) !== false) {
+ if (false) {
```

A proper fix would likely be setting `compress: false` in the Next.js config, but I have only verified the direct patch above.
So far I am seeing a massive improvement; maybe someone can confirm. Not sure if it's related.
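
For anyone who prefers trying the config route over patching `node_modules`, a sketch of that setting (`compress` is a documented Next.js option, but this thread reports it not taking effect with the bundled server, so treat it as unverified here):

```javascript
// next.config.js -- unverified sketch: disables Next.js's built-in gzip
// middleware so the reverse proxy in front (e.g. Traefik) handles
// response compression instead.
module.exports = {
    compress: false
};
```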

<!-- gh-comment-id:4105385954 --> @TubaApollo commented on GitHub (Mar 22, 2026): I am also experiencing an issue with some kind of memory leak, so I tried to trace it down. I took two V8 heap snapshots from the running server process (`ee-1.16.2`, Node `v24.14.0`), the second **after forcing GC** via `HeapProfiler.collectGarbage`: | Object | 20 min uptime | 1.5h uptime (post-GC) | Growth | |---|---|---|---| | `Gzip` | 810 | 1,059 | +249 | | `ServerResponse` | 814 | 1,063 | +249 | | `zlib_memory` (native, ~263 KB each) | 807 (219 MB) | 1,055 (286 MB) | +248 (+67 MB) | All leaked sockets trace back to the Next.js server. The bulk of the leaked memory is native zlib allocations: Container RSS after 1.5h: 1.54 GiB V8 heap: 253 MB Untracked native (zlib): ~1.2 GB The compression is applied in `node_modules/next/dist/server/lib/router-server.js`: ```js if ((config?.compress) !== false) { compress = (0, _compression.default)(); } // ... if (compress) { compress(req, res, ()=>{}); } ``` Disable compression in `node_modules/next/dist/server/lib/router-server.js` line 109 helped in my case: ```diff - if ((config == null ? void 0 : config.compress) !== false) { + if (false) { ``` A proper fix would likely be setting compress: false in Next.js's config, but I have only verified the direct patch above. And so far I am seeing a massive improvement, maybe someone can confirm that. Not sure if it's related.

@huzky-v commented on GitHub (Mar 22, 2026):

> I am also experiencing an issue with some kind of memory leak, so I tried to trace it down.
>
> I took two V8 heap snapshots from the running server process (`ee-1.16.2`, Node `v24.14.0`), the second **after forcing GC** via `HeapProfiler.collectGarbage`:
>
> | Object | 20 min uptime | 1.5h uptime (post-GC) | Growth |
> |---|---|---|---|
> | `Gzip` | 810 | 1,059 | +249 |
> | `ServerResponse` | 814 | 1,063 | +249 |
> | `zlib_memory` (native, ~263 KB each) | 807 (219 MB) | 1,055 (286 MB) | +248 (+67 MB) |
>
> All leaked sockets trace back to the Next.js server.
>
> The bulk of the leaked memory is native zlib allocations:
>
> Container RSS after 1.5h: 1.54 GiB
> V8 heap: 253 MB
> Untracked native (zlib): ~1.2 GB
>
> The compression is applied in `node_modules/next/dist/server/lib/router-server.js`:
>
> ```js
> if ((config?.compress) !== false) {
>     compress = (0, _compression.default)();
> }
> // ...
> if (compress) {
>     compress(req, res, ()=>{});
> }
> ```
>
> Disable compression in `node_modules/next/dist/server/lib/router-server.js` line 109 helped in my case:
>
> ```diff
> - if ((config == null ? void 0 : config.compress) !== false) {
> + if (false) {
> ```
>
> A proper fix would likely be setting compress: false in Next.js's config, but I have only verified the direct patch above.
> And so far I am seeing a massive improvement, maybe someone can confirm that. Not sure if it's related.

That's also what I observed on the heap in my load testing, and I had tried that disable-compression config before.

But the config is not respected, and the gzip response is still there.


@TubaApollo commented on GitHub (Mar 22, 2026):

> > I am also experiencing an issue with some kind of memory leak, so I tried to trace it down.
> > I took two V8 heap snapshots from the running server process (`ee-1.16.2`, Node `v24.14.0`), the second after forcing GC via `HeapProfiler.collectGarbage`:
> >
> > | Object | 20 min uptime | 1.5h uptime (post-GC) | Growth |
> > |---|---|---|---|
> > | `Gzip` | 810 | 1,059 | +249 |
> > | `ServerResponse` | 814 | 1,063 | +249 |
> > | `zlib_memory` (native, ~263 KB each) | 807 (219 MB) | 1,055 (286 MB) | +248 (+67 MB) |
> >
> > All leaked sockets trace back to the Next.js server.
> > The bulk of the leaked memory is native zlib allocations:
> > Container RSS after 1.5h: 1.54 GiB
> > V8 heap: 253 MB
> > Untracked native (zlib): ~1.2 GB
> > The compression is applied in `node_modules/next/dist/server/lib/router-server.js`:
> >
> > ```js
> > if ((config?.compress) !== false) {
> >     compress = (0, _compression.default)();
> > }
> > // ...
> > if (compress) {
> >     compress(req, res, ()=>{});
> > }
> > ```
> >
> > Disable compression in `node_modules/next/dist/server/lib/router-server.js` line 109 helped in my case:
> >
> > ```diff
> > - if ((config == null ? void 0 : config.compress) !== false) {
> > + if (false) {
> > ```
> >
> > A proper fix would likely be setting compress: false in Next.js's config, but I have only verified the direct patch above.
> > And so far I am seeing a massive improvement, maybe someone can confirm that. Not sure if it's related.
>
> That's also what I observed on my loadtesting on the heap, and tried that disable compression config before
>
> But the config is not respected and gzip response is still there.

I tried with the `compress: false` option passed to `next()` first. But it is ignored because `router-server.js` reads the Next.js file config (`next.config.js`), not the constructor options. The only way I got it to work was patching `router-server.js` directly.
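If that's right, the file-based equivalent of the ignored constructor option would be a minimal `next.config.js` placed where `loadConfig()` looks for it — a sketch, untested against Pangolin's image layout:

```js
// next.config.js -- sketch: loadConfig() reads this file from disk,
// so compress: false here is picked up where the next() option is not.
module.exports = {
    compress: false,
};
```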


@huzky-v commented on GitHub (Mar 22, 2026):

> I tried with the `compress: false` option passed to `next()` first. But it is ignored because `router-server.js` reads from the Next.js file config (`next.config.js`), not the constructor options. The only way I got it to work was patching `router-server.js` directly.

My approach was to set https://github.com/fosrl/pangolin/blob/main/next.config.ts with the compression option disabled, but it did not work.

I don't know if I set the config wrong.


@TubaApollo commented on GitHub (Mar 22, 2026):

> My approach was to set https://github.com/fosrl/pangolin/blob/main/next.config.ts with the compression option disabled, but it did not work.

I rechecked. The config seems to be baked into `/app/.next/required-server-files.json`. You would need a full rebuild if you don't want to patch it. (Turns out this is wrong; this file is not read at runtime, I think?)


@huzky-v commented on GitHub (Mar 22, 2026):

> I rechecked. The config seems to be baked into `/app/.next/required-server-files.json`. You would need a full rebuild if you don't want to patch it.

My tests always rebuild the Docker image after making changes (including the Next config), but it doesn't work.
Maybe I'll try building that image again and checking the files.


@huzky-v commented on GitHub (Mar 22, 2026):

Image

Still no luck for me just adding `compress: false` to `next.config.ts` 😕

The response still has `gzip`.


@TubaApollo commented on GitHub (Mar 22, 2026):

> Image Still no luck for me to just add the `compress: false` to `next.config.ts` 😕
>
> The response still have `gzip`

I am not fully sure, but although the config is baked into `required-server-files.json`, `router-server.js` never reads it? It calls `loadConfig()`, which looks for a physical `next.config.js` file on disk, and that file doesn't exist in the container. So it defaults to `compress: true`.
So you will probably need to adjust the Dockerfile accordingly if you haven't already.
By adding something like this:

```dockerfile
COPY --from=builder-dev /app/next.config.ts ./next.config.ts
```

But I would rather have someone with a bit more of a clue confirm this, haha.


@huzky-v commented on GitHub (Mar 22, 2026):

OK, with the idea from @TubaApollo, I finally managed to get the compress option in Next.js turned off.

The idea is to add a file `next.override.ts`:

```ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
    compress: false
};

export default nextConfig;
```

EDIT: The image also does not need to be rebuilt.
Existing Pangolin users can simply mount the `next.override.ts` file into the Pangolin container in the compose file to hot-patch it without rebuilding the image:

```diff
  pangolin:
    image: fosrl/pangolin:ee-latest
...
    volumes:
+     - "./next.override.ts:/app/next.config.ts"
```

Please note that since gzip is now disabled in Next.js, you may see a massive jump in bandwidth usage if you don't change the Traefik config.

To compensate for gzip being turned off in Next.js, offload the compression to Traefik.
Add a middleware in `config/traefik/dynamic_config.yml`:

```diff
 http:
   middlewares:
     ......
+    gz-compress:
+      compress: {}
 ....
   next-router:
+    middlewares:
+      - gz-compress
 .....
```

I don't know if there is a negative effect from the gzip middleware, though.

My testing shows that there is no more zlib stuff in the heap dump, but I can't tell much from my instance, as it is not resource-limited.


@huzky-v commented on GitHub (Mar 23, 2026):

Moved my Pangolin instance to a 1 GB RAM VPS; we'll see what happens over the next couple of days.
Current state:

- pangolin (CPU): 5.45%
- (MEM USAGE / LIMIT): 432.6MiB / 954.9MiB
- (BLOCK I/O): 34.6GB / 3.73MB

EDIT: The instance crashed; still not working 😩


@Josh-Voyles commented on GitHub (Mar 23, 2026):

@huzky-v It didn't seem to make a difference for me. However, I'm not running the enterprise build.

I'm going to try disabling my Uptime Kuma checks. It's been brutal these last few weeks.

Image

@TubaApollo commented on GitHub (Mar 25, 2026):

> Moved my pangolin instance to a 1G ram VPS, see what will happen in couple days. Current state is like this
>
> pangolin (CPU): 5.45% (MEM USAGE / LIMIT) 432.6MiB / 954.9MiB (BLOCK I/O) 34.6GB / 3.73MB
>
> EDIT: The instance is crashed, still not working 😩

Hm, there might be another (possibly smaller) memory leak, because for me it's definitely a lot better. I have 16 GB of memory available, and before, all of it was consumed within a few days. Now I am at about 1.4 GB since the fix, so it definitely did something.


@hansencheck24 commented on GitHub (Apr 10, 2026):

[fix-memory-leak.patch](https://github.com/user-attachments/files/26628528/fix-memory-leak.patch)

Can anyone help me with how to build a custom Pangolin image so I can test this patch? I would like to build a custom PostgreSQL variant of the image.

Reference: github-starred/pangolin#4017