[GH-ISSUE #2120] Pangolin leaking memory after upgrading to 1.13.1 #8839
Originally created by @Ragnaruk on GitHub (Dec 18, 2025).
Original GitHub issue: https://github.com/fosrl/pangolin/issues/2120
Originally assigned to: @oschwartz10612, @miloschwartz on GitHub.
Describe the Bug
Pangolin seems to be leaking memory after upgrading from 1.10.2(?) to 1.13.1.
Environment
To Reproduce
~8k allowed and ~2k blocked requests a day.
Expected Behavior
Memory usage stays constant.
@Ragnaruk commented on GitHub (Dec 18, 2025):
Not sure if the cache is the problem, but you should consider periodically logging its stats.
Also, there's no real need to use a cache here if MaxMind is not enabled. Plus, using a separate cache (maybe an LRU TTL one?) and limiting the number of keys also sounds like a good idea.
@nath1416 commented on GitHub (Dec 19, 2025):
I also have this problem, after setting up the GeoLite2-Country database. My 1 GiB VPS with 1 GiB of swap runs out of memory, and Docker gets killed by the OOM killer.
Here is a screenshot of CPU usage, which spiked to 100% when the containers get killed.
Using:
@oschwartz10612 commented on GitHub (Dec 19, 2025):
Hm, that's interesting. Did this only happen after adding the country database, @nath1416?
@nath1416 commented on GitHub (Dec 19, 2025):
Yes, I upgraded to 1.13.1 and added the country database at the same time.
After that, I have memory leaks.
Currently I have 298985 total requests and 7373 blocked.
Did not find anything interesting in the logs, but will check if it happens again and try to provide them here.
@djcrafts commented on GitHub (Dec 21, 2025):
I've opened PR #2133 that should fix this memory leak.
What was changed:
Added maxKeys: 10000 limit to the cache to prevent unbounded growth (uses LRU eviction)
Skip caching when GeoIP/ASN lookups return undefined (e.g., when MaxMind isn't configured)
Added cache stats logging every 5 minutes for monitoring
The cache was growing without limits - especially problematic with GeoIP enabled since every unique IP gets cached. The 10k key limit should be plenty for normal traffic while preventing OOM issues.
@nath1416 @Ragnaruk Would appreciate if you could test this when you get a chance, since you're both experiencing the issue.
@oschwartz10612 commented on GitHub (Dec 22, 2025):
Please reopen if still an issue on 1.14+
@Ragnaruk commented on GitHub (Dec 22, 2025):
Version: 1.14.0-rc.0.
I debugged the process and researched the problem for a bit, and I am ~90% certain that it's not a memory leak but memory fragmentation.
Unfortunately, switching to jemalloc didn't seem to help much.
For now, I've just set a memory limit and am accepting periodic restarts.
@oschwartz10612 commented on GitHub (Dec 22, 2025):
Hm, interesting. So limiting the cache must not have helped.
What is the memory footprint when it gets restarted?
@Ragnaruk commented on GitHub (Dec 22, 2025):
Fresh process: 296 MB.
Upper limit: >1.5 GB.
@oschwartz10612 commented on GitHub (Dec 22, 2025):
It might need more memory than that, but if it keeps increasing indefinitely then that's a problem. This is a Node.js application, so I would expect a high base memory load and for it to fluctuate as it garbage collects and so on.
@Ragnaruk commented on GitHub (Dec 22, 2025):
pmap returns the following, so I'm pretty sure it's musl allocations rather than JS.
@jjeuriss commented on GitHub (Dec 22, 2025):
I'm able to reproduce this too, by scrolling through my photos on my NAS (which are accessed through Pangolin). This seems to heavily increase memory usage, mainly in pangolin, and eventually completely locks up my VPS. On this 1GB VPS, Pangolin ran smoothly for a few months before version 1.13.1; I upgraded from 1.11 via 1.12, but didn't use 1.12 for long. I'm planning to export a log of memory usage and a CSV to see the metrics. Unfortunately, running a Prometheus client is likely pushing it, so I'm periodically logging "docker stats --no-stream" to a file to get an idea of the memory pressure.
If I can help in any other way, please let me know.
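For anyone wanting the same lightweight logging, a minimal sketch of such a periodic docker stats logger (container names and the log path are assumptions; adjust to your stack):

#!/bin/sh
# Append a timestamped docker stats snapshot for the stack every 60 s.
while true; do
  echo "$(date -Iseconds) $(docker stats --no-stream \
    --format '{{.Name}} {{.CPUPerc}} {{.MemUsage}}' \
    pangolin gerbil traefik | tr '\n' ';')" >> /var/log/pangolin-stats.log
  sleep 60
done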
@jjeuriss commented on GitHub (Dec 24, 2025):
My problems are not over yet with 1.14.0.
Is there a way to downgrade to 1.12 or will the migrated databases not allow that?
@joerg-hro commented on GitHub (Dec 24, 2025):
I have the same problem. I had a backup of the config folder (1.10.1). I restored it. Then I installed the image (1.10.1). It's working again.
@oschwartz10612 commented on GitHub (Dec 24, 2025):
@jjeuriss I would be curious if this is the pangolin container or the Traefik container. Are you able to profile the containers and reproduce?
@jjeuriss commented on GitHub (Dec 25, 2025):
@oschwartz10612 Although I frequently run into the problem, I fear my 'reproduction' scenario isn't actually helping to reproduce the problem.
Had to do manual reboots of my VPS to recover:
2025-12-24 14:07
2025-12-25 06:07
2025-12-25 07:53
I am keeping statistics on memory and CPU, and plotted one period where my VPS got stuck after only 2 hours (on 2025-12-25 07:53) in an Excel. Tbh I don't see anything really weird w.r.t. memory usage. I see CPU spikes after the docker starts up; that seems normal...
Excel and full CSV attached.
I've also attached the pangolin docker logs, which shows some errors.
pangolin.log
memory-metrics-chart-2025-12-25-07-53-00-v3.xlsx
memory-log-test.zip
(graphs: Pangolin and Traefik container memory usage)
The dips/spikes in the graph are where my VPS restarts (it completely locks up, can't even SSH into it anymore, so I have to reboot it).
These graphs are IMHO inconclusive. I don't know why it happens, only that it happens on 1.13 and didn't use to happen on 1.11.
I'm not sure exactly how to profile a docker container. Is there a guide or set of commands you want me to run?
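Not an official guide, but one way to get a heap profiler onto the pangolin container is to temporarily run the same image with Node's inspector enabled and tunnel the debug port; a sketch, with the volume path, image tag, and SSH details as assumptions (the entry command is the node process seen in ps later in this thread):

# Stop the stack's pangolin service, then run the same image with the
# inspector listening:
docker compose stop pangolin
docker run --rm --name pangolin-debug \
  -v ./config:/app/config \
  -p 127.0.0.1:9229:9229 \
  fosrl/pangolin:1.14.1 \
  node --enable-source-maps --inspect=0.0.0.0:9229 dist/server.mjs

# From your workstation, tunnel the port and open chrome://inspect:
ssh -L 9229:127.0.0.1:9229 user@your-vps

Note this debug container doesn't join the compose network or publish the app's normal ports, so add those (from your compose file) if you want to drive real traffic through it.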
@Josh-Voyles commented on GitHub (Dec 25, 2025):
I'm also experiencing issues on my AWS instance. After running for a while, Pangolin locks up. The only service I can access is Authentik, which is not protected. I'm running 1.14.1.
@asardaes commented on GitHub (Dec 26, 2025):
I didn't experience OOM kills as such, but I did report abnormal IO in #2134 (which also led to the VM freezing entirely and needing a reboot). I eventually figured out it was memory pressure forcing memory pages to disk and immediately reading them again in some endless death loop. What worked for me was to enable zram swap as described in the arch wiki - I configured 512M for my 1G VPS, and since actual usage gets compressed, I think my memory usage actually went down. I also applied the optimizations mentioned in section 2.4 from that wiki except for
vm.watermark_boost_factor; from what I could gather, that might not really help for a VM with little memory, so I left it at the default of 10. I've had zero issues since, although I don't have a lot of requests going through the VPS, so YMMV.
@nath1416 commented on GitHub (Dec 26, 2025):
I still have the same issue, just restarted my instance after 3 days, the swap was full.
I will try to set up better monitoring this week.
@koenieee commented on GitHub (Dec 26, 2025):
Same issue here. I have to reboot the whole vps machine with 1 gb of ram every day now.
@jjeuriss commented on GitHub (Dec 26, 2025):
I'll try version 1.12.3 to see if this also existed there. (I know it ran fine on 1.11.1, but don't know yet whether it regressed in 1.12 or in 1.13). I'll keep the 3rd party components (traefik and gerbil) the same latest version.
Interestingly, the base memory usage of pangolin dropped from about 280MB to about 243MB.
@Josh-Voyles commented on GitHub (Dec 26, 2025):
Last night, I upgraded my t2.micro instance to a t3a.small instance to see if it just needed more memory. However, I can see now that, over time, the docker container memory usage gradually increases. It takes about 5 hours, but then my instance becomes unusable and CPU hits 100 percent. In the attached image from docker stats, you can see the abnormally high memory usage. The other two services aren't visible because things started to glitch before losing connection. I'm happy to investigate further if needed.
@laugmanuel commented on GitHub (Dec 26, 2025):
I can confirm this as well.
I've attached screenshots of memory usage of the three components of the last 24h.
I upgraded to 1.14.0 on 23.12. and to 1.14.1 on 24.12.
I've also attached a pangolin memory usage graph for the last 7 days, and it looks like for me it started happening on 24.12. in the evening. This seems to suggest that it has to do with that release (1.14.1)...
Is there a way to dump and analyse node memory footprint to find the cause?
@Josh-Voyles commented on GitHub (Dec 26, 2025):
As others have mentioned, happy to help as much as I can.
But for now, I set up a CloudWatch alarm on AWS: when my CPU exceeds a 60 percent average for 1 minute, it sends me an email and reboots my instance.
Also, more investigation reveals I seem to have started having issues right around the time I updated my Newt instances from 1.6 to 1.8.
Note: I'm updating Newt to 1.8.1 and will monitor if anything changes.
@SamTV12345 commented on GitHub (Dec 26, 2025):
I'm idling at around 250 MB of memory usage. Can that also limit performance? Over a 1 Gb fiber connection, I'm only getting 50-60 MB/s through Pangolin in a speed test. Wouldn't it have been possible to build the server in Go? It has a much lower memory and CPU footprint than Node.
@laugmanuel commented on GitHub (Dec 26, 2025):
I've limited the memory consumption via Docker memory limits to 769M (an arbitrary value). This seems to help my setup. The container still leaks memory up to the limit, but stays stable afterwards and is not getting killed.
(see stable line at the end at ~20:40)
@Josh-Voyles commented on GitHub (Dec 26, 2025):
@laugmanuel This looks like it could be helpful. When you hit your memory limit, are you experiencing any performance impact to Pangolin?
@laugmanuel commented on GitHub (Dec 27, 2025):
I'm not really running any performance sensitive applications over Pangolin, but so far it seems to work just fine.
Edit: after some time, Pangolin gets killed for me - however, it's far better than exhausting the entirety of VM memory...
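To confirm it's really Docker's OOM killer cycling the container (rather than a crash), a quick check, assuming the container is named pangolin:

# Did the last exit come from an OOM kill, and how often has it restarted?
docker inspect -f 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}} Restarts={{.RestartCount}}' pangolin

# Kernel-side confirmation of OOM events:
dmesg -T | grep -iE 'oom|killed process' | tail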
@laugmanuel commented on GitHub (Dec 28, 2025):
I did some more digging using Node inspection tooling (--inspect + port-forward + Chrome DevTools) and checking memory allocations over time.
To me it looks like the memory usage only increases with denied requests. The screenshot below shows this. The part between the red lines is caused by unauthorized requests; the same amount of requests with valid auth shows almost no allocations (green):
I'm struggling to export the memory snapshot, so I can't look into that right now. If someone has some ideas on how to investigate further, I'm more than happy to help.
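If the DevTools export keeps failing, Node can write snapshots straight to disk on a signal instead; a sketch, assuming you can edit the container's start command and that the app's working directory is /app (the --heapsnapshot-signal flag exists since Node 12):

# Add the flag to the container command, e.g.:
#   node --heapsnapshot-signal=SIGUSR2 --enable-source-maps dist/server.mjs

# Trigger a snapshot (assumes node is PID 1 in the container; check with ps):
docker exec pangolin kill -USR2 1

# Copy the written Heap.*.heapsnapshot file out and open it in DevTools
# (the exact filename is timestamped; this one is illustrative):
docker cp pangolin:/app/Heap.20251228.120000.1.0.001.heapsnapshot .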
@koenieee commented on GitHub (Dec 28, 2025):
I have tried your fix, but it still keeps getting killed. Thanks for the tip, how can we limit the memory usage better?
@laugmanuel commented on GitHub (Dec 28, 2025):
It was not meant as a fix, but as a temporary workaround to limit the impact of Pangolin to itself and not affect other resources on the same host.
One general question: do you, by any chance, use some sort of monitoring tool to check Pangolin itself or a resource protected by it? I've noticed that Pangolin returns HTTP 200 even if the resource is protected and the request is unauthorized. The reason is that the client is redirected to the Pangolin auth page, which is served successfully. In my case, it looks like a major contributor was my monitoring.
However, even if that is the case, it's still a potential DoS if denied requests leak memory...
@Josh-Voyles commented on GitHub (Dec 28, 2025):
If you're getting 200 for protected resources, that seems odd. I'm getting code 302, which is a redirect.
@sambilbow commented on GitHub (Dec 28, 2025):
Yes. I use Gatus and get 200 on proxied private resources. But it might be following the redirect I guess?
@laugmanuel commented on GitHub (Dec 28, 2025):
If the monitoring is following the redirect, a 200 is expected, as it's the result of the Pangolin auth page. However, this redirect still causes a No Valid Auth event in Pangolin and results in my observed behaviour above (memory allocations).
@nath1416 commented on GitHub (Dec 29, 2025):
This would make sense; I do have a misconfigured Gatus health check that results in multiple Denied events. This could be related. I will turn off the endpoint in Gatus and check if it still crashes.
I tried your suggestion @laugmanuel to limit the RAM usage for the Pangolin container. It resulted in a restart of the container; after that, it is working fine so far.
@oschwartz10612 commented on GitHub (Dec 31, 2025):
Could anyone confirm if turning off the request logs fixes the memory
problem?
@oschwartz10612 commented on GitHub (Dec 31, 2025):
Also interested if anyone could turn on debug logs and watch the cache print statements to see if we are building memory in there. I don't think so, but would like to check. I think this is in 1.14.
@Josh-Voyles commented on GitHub (Jan 1, 2026):
I've turned request logs off and will report back tomorrow.
@kazooie13 commented on GitHub (Jan 1, 2026):
Same problem for me after updating: memory runs full over time, then Pangolin locks up and shortly after that the VPS (very limited – 1 GB RAM) crashes. Pangolin idles at around 350 MB. I use WebDAV over Pangolin in combination with path rules, so there are many requests. Unfortunately, I can’t tell if the problem existed before the upgrade, because I added the path rules shortly after the upgrade. I’m trying to turn off the request logs. So far the consumption is still relatively high, and I can’t yet tell whether it will continue to increase and crash.
@Josh-Voyles commented on GitHub (Jan 1, 2026):
Turning off request logs does not solve the issue.
@oschwartz10612 commented on GitHub (Jan 1, 2026):
Can anyone pinpoint whether it's the traefik container, pangolin, or gerbil? Are we 100% sure it's the pangolin container? Just want to be sure, because I know there were previous issues with Traefik running away with memory.
@Joly0 commented on GitHub (Jan 1, 2026):
Not sure about the others, but when I look at my server with htop, the process that's currently consuming about 15 GB (out of 16 on my server) is "node --enable-source-maps dist/server.mjs".
I am not sure what process exactly this is or which container is running this, just maybe someone else can tell that.
Edit: Use docker exec pangolin ps a to get all the processes in the pangolin container; the mentioned node process is the main thread in the pangolin container. So this error pretty surely comes from pangolin itself and not traefik or gerbil.
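To attribute a runaway host PID from htop to a container without guessing, something like this works (12345 is a placeholder PID):

PID=12345   # the node PID reported by htop on the host
# docker top prints host-namespace PIDs (column 2), so scan all containers:
for c in $(docker ps --format '{{.Names}}'); do
  docker top "$c" | awk -v p="$PID" '$2 == p { found = 1 } END { exit !found }' \
    && echo "$PID belongs to $c"
done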
@sambilbow commented on GitHub (Jan 1, 2026):
Same as above. I posted some images of usage within the container on Discord.
@kazooie13 commented on GitHub (Jan 2, 2026):
I can confirm that "node --enable-source-maps dist/server.mjs" is the affected process and that it originates from Pangolin itself.
Also, disabling request logs does not resolve the issue.
@laugmanuel commented on GitHub (Jan 2, 2026):
I can also confirm, that it's the pangolin container itself:
I didn't see any more problems after fixing the monitoring check (mentioned above). I reverted that fix and enabled debug logging for Pangolin to see cache growth - however, the caching seems to work just fine and doesn't show any abnormalities:
@oschwartz10612 commented on GitHub (Jan 2, 2026):
Hm, this is going to be a tough one. I will see about building the container with --inspect in the node command to see if I can use the heap inspector from Chrome to find where memory is building up.
@0i5e4u commented on GitHub (Jan 2, 2026):
I disabled every non-HTTPS resource, and the container seems to stay at a stable RAM consumption.
Non-reachable targets are also disabled for testing. Can someone confirm this?
Running since 14h:
root@ubuntu:~# docker ps | grep pangolin
37b8f017bf25 fosrl/pangolin:1.14.1 "docker-entrypoint.s…" 14 hours ago Up 14 hours (healthy)
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
9bc00849ddba traefik 0.08% 34.38MiB / 848.6MiB 4.05% 230MB / 232MB 523MB / 49.2kB 7
8e621351d8dd gerbil 0.00% 5.949MiB / 848.6MiB 0.70% 230MB / 232MB 58.7MB / 0B 7
37b8f017bf25 pangolin 2.09% 290.4MiB / 500MiB 58.09% 20.6MB / 137MB 652MB / 174MB 22
@buster39 commented on GitHub (Jan 3, 2026):
Reverted back to 1.13.1 and deleted all of crowdsec in my setup - stable again for approx. 3 days now.
@kazooie13 commented on GitHub (Jan 3, 2026):
I am also using crowdsec, with a new postoverflow whitelist rule. I don't know whether that could have an impact on the problem (I don't think so), and I haven't tried without it yet.
@Josh-Voyles commented on GitHub (Jan 3, 2026):
Crowdsec was never part of my config and I still had issues. I am trying to revert back to 1.13.1. I still have request logs off.
@kazooie13 commented on GitHub (Jan 3, 2026):
Is there any instruction on how to revert to version 1.13.1? Can I simply adjust the Compose file, or will that cause conflicts with the database? I urgently need a stable system and a temporary fix for the problem.
@joerg-hro commented on GitHub (Jan 3, 2026):
I had version 1.10.1 installed. Before updating to 1.13.1, I backed up the config-directory. Since I had the same problems with version 1.13.1, I adapted the Pangolin version of the docker-compose file to version 1.10. Before deploying the docker-compose.yaml file, I restored the config-directory. Now Pangolin is running perfectly again.
@jjeuriss commented on GitHub (Jan 3, 2026):
T.b.h. I don't understand why the last comments are about reverting to 1.13.1. The problem seems to have been introduced in version 1.13.1 as per the OP, so you'd have to revert to something older to get rid of it (e.g. 1.11.1, or maybe 1.12.3, but nobody has confirmed/denied yet that 1.12 was stable).
AFAIK you can only revert to 1.11.1 if you took a backup of the whole config directory (and docker-compose.yml).
@buster39 commented on GitHub (Jan 3, 2026):
True - I just told what happened to me. Maybe we'll find a point where to dig.
@jjeuriss commented on GitHub (Jan 3, 2026):
@buster39 did you reinstall everything or revert to a backup?
@jjeuriss commented on GitHub (Jan 3, 2026):
By the way, I noticed that the base memory usage of Pangolin (i.e. after running it for about 5 min) on 1.11.1 is 230MB, whereas it is 305MB on 1.14.1. Both without crowdsec. I hope the increased memory usage is expected.
@buster39 commented on GitHub (Jan 3, 2026):
Just a little setup on the smallest VPS with 1 GB RAM - if I remember correctly, the lock-ups of my server started after upgrading to 1.13.1. The server became unreachable a few hours after restarts.
Upgrading to 1.14 didn't help. I had some crowdsec-related issues in the log, so I decided to start with removing crowdsec from the setup.
I had no backup of the config - I just changed the versions in the compose yml and deleted everything related to crowdsec.
But - of course - I had to pull the docker containers again for the different versions.
@jjeuriss commented on GitHub (Jan 3, 2026):
Ah yeah. I tried running an earlier version of Pangolin (1.10 or lower) on a 1GB VPS with Crowdsec too. This simply locks up the VPS, as there's not enough memory to run Pangolin + Crowdsec on a system with so little memory. I don't think that's related to this bug though.
This bug describes that on 1.13+ the memory usage of Pangolin increases in certain scenarios due to a memory leak (which in turn can also lock up a VPS with only 1GB of memory).
@jjeuriss commented on GitHub (Jan 3, 2026):
@oschwartz10612 I still seem to be able to reproduce as follows with failed requests.
I've got photos.mydomain.com forwarded through Pangolin. Through that domain I can browse my photos on my Synology NAS in a web browser without a problem. But somehow, the Synology Photos app on my phone, which points to this same domain, is not working correctly yet. So if I run the Photos app on my phone on my 4G network and it connects to my NAS through Pangolin, the images do not show up (probably failing somewhere). The fact that viewing my photos doesn't work is out of scope for this discussion; I'm just using it as a repro scenario.
To make the memory shoot up (from 350MB to 395MB in 1-2 minutes), I just need to browse the Synology Photos app on my phone on my 4G network (which then doesn't show any images at all). Doing this shoots up the memory, and it doesn't seem to go down automatically any more. It also shoots up the CPU usage. The longer I scroll (i.e. issue failing requests), the more memory is consumed.
Note: occasionally I do see memory dropping again.
This repro scenario may be useful for debugging this, but I have no experience with inspecting or profiling containers. If you can let me know how to inspect that, I can further assist with that.
As a workaround for this memory leak problem, I've set a fairly conservative memory limit (400MB in my case), which then auto-restarts the container, preventing my VPS from locking up. You can see the auto-restart of the container in the graph.
I'll now do 1 more test with setting the limit at 500MB and see whether memory usage goes down again over time.
Update: when my pangolin usage goes above 425MB, my 1GB VPS simply dies and I need to reboot it.
@AlexWhitehouse commented on GitHub (Jan 5, 2026):
I am on Pangolin v1.12.1 and am experiencing the same behaviour.
I'm getting a weird periodic CPU usage spike, alongside ever increasing memory consumption with the htop output showing the above process from the Pangolin container being the culprit.
@hansencheck24 commented on GitHub (Jan 6, 2026):
I'm on pangolin v1.14.1 and having the same issue. I currently disabled the maxmind_db_path in the config and disabled the log retention in settings.
The 3 spikes in memory happened before I disabled both of them, and it's currently running with stable memory usage. Hope this helps.
@oschwartz10612 commented on GitHub (Jan 6, 2026):
Yes, that helps, thank you! Can anyone else repro?
@kazooie13 commented on GitHub (Jan 6, 2026):
Unfortunately, that didn't work for me. I'm still getting the same CPU spikes as mentioned above; disabling the path in the config and deactivating the log retention didn't make any difference.
(The problem already existed for me before I even added the country path.)
@Josh-Voyles commented on GitHub (Jan 7, 2026):
A couple of findings:
I reverted from 1.14.1 to 1.13.1 with request logs still off and didn't have any issues for a couple of days.
Then, on 1.13.1, I tried turning request logs back on and had issues the same night.
I turned request logs off again (still running 1.13.1), and everything seems to be running fine.
I'm looking at my pangolin container after running for over 24 hours and memory still seems normal.
I'm running a t2.micro on AWS with 1 vCPU and 1GB of memory.
I hope this info helps track down the problem.
@jjeuriss commented on GitHub (Jan 7, 2026):
I developed an extremely lightweight Prometheus exporter container to measure CPU, memory, PID and disk statistics of containers running on a host. This allows me to inspect the containers of the Pangolin solution (pangolin + traefik + gerbil) on my 1GB VPS. I had to develop one because typical solutions like cadvisor were consuming around 100MB, which would run my VPS out of memory even more quickly. My custom exporter consumes about 10MB, so that's workable. If you're interested, you can find it here: https://github.com/jjeuriss/tiny-docker-exporter
I keep seeing the same thing: the Pangolin solution is stable until you start doing some extra monitoring (e.g. uptime-kuma) or letting it handle requests it cannot fulfill (in my case that's scrolling through my photos in my Synology Photos app, which doesn't seem to work through Pangolin). Even my Docker memory cap of 400MB doesn't make my setup stable: the container doesn't seem to get killed in time; my VPS just hangs and SSH is impossible till I reboot it.
I tried my failure scenario (scrolling through photos it cannot access) once more and noticed a high peak in storage being read/written. That may point to some problem?
With these new graphs and a way to reproduce, I'd really like to try some things out.
I tried turning off requests logs through the Pangolin GUI, but that hasn't reduced the read/write throughput:
I don't have GeoLite2-Country set up, so I don't see how disabling maxmind_db_path could help. How do I disable that exactly, by the way?
Is there any way I can see what Pangolin is doing reading and writing this much?
@asardaes commented on GitHub (Jan 7, 2026):
@jjeuriss for disk IO, see my comment above; that's the exact issue I had. It was memory pressure at the kernel level, which led to using the boot disk as a kind of swap even without a swap partition configured, and that killed the whole VM hard.
@jjeuriss commented on GitHub (Jan 8, 2026):
Nice, thanks, @asardaes! That workaround actually works. I can't seem to crash my Pangolin anymore with it, and even with ZRAM filled up, it stays responsive.
After scrolling a long time through my (dysfunctional) photos, I was still able to fill up the ZRAM. Ideally, at some point Pangolin evicts that data again. I'll monitor that. I assume the memory leak described here is exactly about that, and the ZRAM now just delays the point of failure.
These were the commands I used to enable ZRAM (as @asardaes advised), put in copy-paste-ready format for future reference; a consolidated sketch follows the steps below.
Stop Docker containers:
Install zram-tools:
Configure zram (512MB):
Restart zramswap:
Apply VM tuning:
Restart Docker containers:
Verify:
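The command snippets themselves didn't survive the mirror; a reconstruction of the same steps for Debian-based systems (the sysctl values are illustrative - take the exact ones from the Arch wiki section referenced above):

docker compose down                        # stop Docker containers
sudo apt install zram-tools                # install zram-tools
# configure a 512 MB zram device (SIZE is in MiB):
printf 'ALGO=zstd\nSIZE=512\n' | sudo tee /etc/default/zramswap
sudo systemctl restart zramswap            # restart zramswap
# apply VM tuning (illustrative values; these are the keys named in the
# revert steps below):
printf 'vm.swappiness=180\nvm.vfs_cache_pressure=500\nvm.overcommit_memory=1\n' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
docker compose up -d                       # restart Docker containers
swapon --show && zramctl                   # verify zram swap is active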
If you want to revert this, you can execute the following (see the sketch after this list):
Stop Docker containers
Stop and disable zramswap
Uninstall zram-tools
Remove VM tuning parameters from /etc/sysctl.conf: remove the lines added (vm.swappiness, vm.vfs_cache_pressure, vm.overcommit_memory)
Apply sysctl changes to reload defaults
Restart Docker containers
Verify the revert
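And a matching revert sketch, under the same assumptions:

docker compose down                        # stop Docker containers
sudo systemctl disable --now zramswap      # stop and disable zramswap
sudo apt purge zram-tools                  # uninstall zram-tools
# remove the added sysctl lines:
sudo sed -i '/^vm\.\(swappiness\|vfs_cache_pressure\|overcommit_memory\)=/d' /etc/sysctl.conf
sudo sysctl -p                             # reload (a reboot restores kernel defaults)
docker compose up -d                       # restart Docker containers
swapon --show                              # verify the zram swap is gone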
@jjeuriss commented on GitHub (Jan 9, 2026):
I've enabled debug logs and added some extra prints to check memory usage.
When my photos app has SSO enabled (and thus needs to go through an extra authentication step), it results in a flood of unauthenticated requests. These unauthenticated requests seem to massively increase heap memory.
When I turn off SSO authentication, my app correctly shows my images and the heap memory stays constant!
Clearly memory is being leaked for unauthenticated requests, even with request logs disabled in the GUI.
I'm working on a fix!
@Josh-Voyles commented on GitHub (Jan 9, 2026):
I'll just stay on 1.13.1 with request logs off until a fix is released. Stable for a few days now.
@jjeuriss commented on GitHub (Jan 10, 2026):
Did a few more tests to see where the high disk I/O problems started, because I think those are at the root of the memory leak. Sure, the ZRAM workaround from @asardaes helps to avoid them, but it hides the problem: they then build up as zram swap.
I repeated the same reproduction scenario (scrolling through photos that are each getting an unauthenticated error) on each of the Pangolin versions:
Clearly the problem started at 1.13.0 (and remains on 1.14.1 by the way), so the diff from 1.12.3 to 1.13.0 should reveal it.
Now, I still need to figure out what's causing it... Already tried a couple of things on my fork, but so far, no luck.
I'm thinking now it might be related to the analytics that were added.
@Yonoesio commented on GitHub (Jan 13, 2026):
Investigation: Memory Leak Management in Pangolin (Virtualized Environment with OPNsense/ICMP)
⚠️ Disclaimer and Methodology
This document is the result of experimental research. The author (user) states that they do not possess the deep technical knowledge of systems engineering required to resolve the underlying root cause within the Pangolin source code.
The resolution of this problem was achieved through a process of trial and error based strictly on tests performed in a production environment, with the support of Gemini AI for diagnosis, technical structuring of solutions, and the generation of this English translation. The methods described here are "configuration patches" to ensure service availability, not a fix for the original software's code.
1. The Problem: The Memory Leak
During monitoring with htop and docker stats, an uncontrolled growth of the RES (Resident Set Size) memory of the Pangolin process was detected.
2. Testing and Diagnosis
Stress tests were conducted focusing on traffic persistence:
3. The Solution: The Docker "Cage" (Hard Limits)
Since correcting the memory leak in the source code was not an option, a "self-cleaning" mechanism for the container was implemented.
It was discovered that the standard deploy: resources block in Docker Compose does not effectively stop swap usage in standalone environments. Therefore, a configuration patch was applied using host directives to strictly "enclose" the process.
Configuration Patch (docker-compose.yml):
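The YAML itself wasn't mirrored, but the same host-level limits described below (1.8 GB hard cap, swap pinned to the same value, always restart) can also be applied to a running container from the CLI; a sketch with an assumed container name:

# Cap memory, pin swap to the same value so the container cannot spill
# into host swap, and let Docker restart it after an OOM kill:
docker update --memory=1800m --memory-swap=1800m pangolin
docker update --restart=always pangolin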
4. Comparative Results
After applying the limits and observing the system, the results are as follows:
5. Technical Conclusion
The implemented solution acts as a safety circuit breaker. Upon reaching the 1.8 GB limit, the Linux kernel (via Docker) executes an OOM Kill on the container. Thanks to the restart: always policy, the container restarts in less than 3 seconds with clean memory (400 MB), preventing the VPS from collapsing.
This method ensures that, despite the memory leak, the 15 VMs maintain 99% uptime without manual intervention.
@jjeuriss commented on GitHub (Jan 13, 2026):
Yeah, limiting memory helps in some cases, @Yonoesio, as was mentioned earlier in this thread by @Ragnaruk in https://github.com/fosrl/pangolin/issues/2120#issuecomment-3683502087 . In more extreme cases (e.g. a VPS with low memory and a high volume of unauthenticated requests), it doesn't prevent the VPS from hanging.
@jjeuriss commented on GitHub (Jan 13, 2026):
Still haven't found the root cause of the leak. Help is welcome.
My VPS is also a bit too small to test this on properly, because whenever memory hits about 450MB, my VPS already hangs. So I have about 100MB of RAM to play with from the ~350MB it starts up with.
@oschwartz10612 commented on GitHub (Jan 14, 2026):
Appreciating all of the feedback, everyone. We are going to put some real effort into this before 1.15 to see if we can resolve it. Worried it's DEEP 😅
@Yonoesio commented on GitHub (Jan 16, 2026):
Key findings on Memory Stability and Storage Drivers (Assisted by Gemini AI)
**Technical Disclaimer:** This report was structured and translated with the assistance of Gemini AI. The user (author) performed the empirical testing and environmental changes but does not claim deep expertise in systems engineering. The findings below are based on recent, real-world observations.
Recent Findings (Casual Discovery): I would like to share a significant and somewhat casual discovery regarding the memory leak reported in this issue. After experiencing constant system crashes on a 4GB RAM VPS, I performed a clean migration of my stack. I cannot strictly confirm if there is a direct technical correlation, but the change in stability has been spectacular.
I migrated from a containerd setup to Docker with the overlay2 storage driver (keeping hard memory limits in the deploy block to ensure enforcement).
Real-time Observations: While I am waiting for more time to pass to generate a complete usage graph, the visual evidence is clear. I am now seeing active memory releases every 10 to 15 minutes, a behavior that was non-existent before.
Conclusion: I don't have the technical background to explain why, but switching to the Docker storage driver (overlay2) combined with hard limits has transformed a broken system into a stable one. Previously, memory grew linearly until a total host crash. Now, the Node.js garbage collector seems to be functioning correctly in a "sawtooth" pattern.
I will provide a full graph once it's completed, but I wanted to share this "spectacular" improvement immediately, as it might provide a clue to the developers or relief to other users.
@Josh-Voyles commented on GitHub (Jan 18, 2026):
Quick update: on 1.13.1, even with logs off, I eventually had problems again; it just took a whole week to manifest.
@rex1234 commented on GitHub (Jan 18, 2026):
No settings mentioned here helped me get rid of this memory leak. The node process starts eating my memory until Pangolin restarts, which makes all sites unavailable for a while, multiple times per day. This bug should finally get full attention; it has been open for more than a month, and the availability of all running services suffers from it.
@jjeuriss commented on GitHub (Jan 22, 2026):
I agree, this is IMHO the top priority bug.
I can't use v1.14.1 or higher for more than half a day due to this bug. I've already tried a few things on my fork of this project, but none have resolved it so far.
Things I know so far:
- DISABLE_AUDIT_LOGGING=true → issue still reproduced (not audit logging)
- DISABLE_GEOIP_LOOKUP + DISABLE_ASN_LOOKUP → issue still reproduced (not geo/ASN lookups)
- DISABLE_RULES_CHECK=true → issue still reproduced (not rules check)
- DISABLE_SESSION_QUERIES=true → issue NOT reproduced, but this breaks auth entirely (not a viable fix)
Attempted fixes (all failed to resolve the issue):
- … reduction in database queries and kept them fast, but the memory still grew, I/O spikes still occurred, and the VPS still froze
- … prevented the issue.
The problem remains 100% reproducible and makes v1.13.0+ unusable in production with high volumes of unauthenticated requests.
At this point I need help from the core team to identify what changed in v1.13.0 that could cause this. I've ruled out the suspects I could think of and am stuck.
@Vangreen commented on GitHub (Jan 26, 2026):
For me, version 1.15 fixed the problem.
Before, there were 1-2 restarts per day. Now it runs for 2 days without high RAM usage.
@oschwartz10612 commented on GitHub (Jan 26, 2026):
Good to know @Vangreen thank you! I forgot to update this thread but we made some improvements in 1.15. Could everyone try it out and let me know if the issue still persists?
@n1LWeb commented on GitHub (Jan 27, 2026):
For me the issue still persists in 1.15.1, and the process locks up way before the 24GB memory of my VPS is filled. My limit is now set at 1000MB, and the restart happens about every 2 hours.
The limit is set in Docker to prevent pangolin from growing more and locking up eventually.
@Vangreen commented on GitHub (Jan 27, 2026):
@n1LWeb do you have health checks for resources set up? I noticed that when I have status monitoring set up for my resources, this behavior with high RAM usage begins.
@n1LWeb commented on GitHub (Jan 27, 2026):
@Vangreen Yes, for almost all resources; I'm using multiple targets per resource. At the moment I'm just testing with 2 newt connections over the same internet connection, but soon the 2 newt connections will be routed via 2 different internet connections (DSL/fibre). Then I'll need the health checks so pangolin will not route over a failing connection.
@rex1234 commented on GitHub (Jan 27, 2026):
I can confirm that the issue is caused by unauthenticated requests that some of my services were making; after I fixed that and all are properly authorized, the memory leak seems to be gone.
@jjeuriss commented on GitHub (Jan 27, 2026):
I’m still seeing the same issues in 1.15.1. This thread already points out that unauthenticated failures are at the root of the problem. Note that unauthenticated requests aren’t only caused by monitoring; they’re also triggered by bots crawling a domain and trying every available page. Monitoring does make it worse, but it isn’t the sole cause.
The underlying unauthenticated-request problem does not appear to be fixed. In 1.12.3, these failures do not result in significant I/O usage or memory spikes. However, starting with 1.13.0 and continuing through 1.15.1, they clearly do.
After sending a burst of unauthenticated requests (~500–1000), I observed a massive spike in read I/O on my VPS.
A few hours later, the system completely hung after a second spike occurred (not manually triggered, probably a scan).
These kinds of high I/O peaks don't occur on 1.12.3 (I used that version again for the last 4 days and saw no issues). Going back to it now. Looking forward to a fix for this issue still!
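For anyone who wants to reproduce this without a misbehaving app, a crude burst of unauthenticated requests against any protected resource shows the same pattern; a sketch (the domain is a placeholder):

# Fire ~1000 unauthenticated requests and watch `docker stats` in a
# second terminal while they land on the auth redirect:
for i in $(seq 1 1000); do
  curl -s -o /dev/null "https://photos.example.com/?n=$i" &
done
wait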
@SamTV12345 commented on GitHub (Jan 28, 2026):
Same issue for me. With 1.15.1 my 1GB vm is OOM after less than 2 days.
@Boscovitz commented on GitHub (Jan 28, 2026):
Same here with 1GB. Every 2-3 days and I have to restart the vps because it hangs OOM.
@oschwartz10612 commented on GitHub (Jan 30, 2026):
Hm, wonder if it's a dependency... Will experiment.
@formless63 commented on GitHub (Feb 4, 2026):
Happening to me as well. 2GB VPS goes OOM every 24-36 hours.
Edit: can confirm this continued when bumping to 1.15.2 as well. Potentially even happening faster now.
@maiestro commented on GitHub (Feb 12, 2026):
Hello,
I wanted to ask if there is any news regarding this issue? I am currently using Pangolin v.1.11.1, which is essentially the last version where the memory issue has not occurred.
I am grateful for any information.
Best regards
@ghostklart commented on GitHub (Feb 21, 2026):
Hello, I've actually started to have the same issue on 1.15.4.
Will try to revert back to 1.12.3.
@N3m351x commented on GitHub (Feb 24, 2026):
Did the downgrade fix your issue?
@ChrissiBe commented on GitHub (Feb 24, 2026):
Same issue on my VPS with 1GB memory. After 1-2 days the system ran out of memory. No login possible.
Downgrading to 1.13.0 works with no problems. I tried with Debian and Ubuntu minimal configurations but had the same problem.
1.13.0 works; 1.13.1 and newer run out of memory.
@maiestro commented on GitHub (Feb 24, 2026):
I also have a VPS with 1GB RAM. The problem occurred after about 6 hours for me.
I had problems with the following configuration:
Pangolin v. 1.11.1
Gerbil v. 1.3.0
Traefik v. 3.6.7 (plugins: badger v. 1.3.1; geoblock v. 0.3.6)
@ChrissiBe: Could you please also tell me your (currently working) version numbers? I would then like to try exactly the same setup on my VPS.
The version numbers for Pangolin, Gerbil, and Traefik are located in {DOCKERINSTALLPATH}/config.yml, and for the Traefik plugins under {DOCKERINSTALLPATH}/traefik/traefik_config.yml in the experimental->plugins section.
EDIT: Pangolin Version
@kazooie13 commented on GitHub (Feb 24, 2026):
Could you please provide us with an update on the current status of the issue?
It has been open for over two months now and has become the issue with the most comments/reports.
If no more resources are being allocated to resolving it, I will need to look for an alternative.
@ChrissiBe commented on GitHub (Feb 24, 2026):
@maiestro
Pangolin 1.13.0
Gerbil 1.3.0
Traefik 3.6.7
I tested all of the Pangolin versions up to 1.15.4.
All versions have this out-of-memory problem. I did fresh installations with Ubuntu (22 and 24) and Debian (12 and 13).
Only 1.13.0 and older are working.
Now I installed with the quick-setup script, ran docker compose down, set the Pangolin version to 1.13.0 in the .yml,
then docker compose pull and docker compose up -d.
@huzky-v commented on GitHub (Feb 25, 2026):
Not sure if it helps, but I tried downgrading the zod dependency to v3 and updated the code to adapt to zod/v3 with codex (so use it at your own risk; some schema definitions may not be accurate). The downgrade is based on the 1.15.4 codebase.
Observed with --inspect that the heap usage is lower when starting the pangolin stack.
Wonder if anyone can test that with real traffic, as I don't have that VPS size and traffic.
The POC branch is here: https://github.com/huzky-v/pangolin/tree/zod-v4-to-v3 - you may check the code and build the Docker image (or you can just use docker.io/xerial817/pangolin-zod-poc:latest if you are in yolo mode).
And there is also a still-unsolved report of a zod/v4 memory leak issue: https://github.com/colinhacks/zod/issues/5490
EDIT: seems not working 😫
@n1LWeb commented on GitHub (Feb 26, 2026):
@huzky-v I tried it and got the increasing memory usage on this version, too.
@duchu commented on GitHub (Feb 27, 2026):
On version 1.16.0, the problem still occurs.
@Josh-Voyles commented on GitHub (Feb 27, 2026):
Are we missing something? The release candidate said no known bugs. Do I need to scrap my install and rebuild from scratch? Use RHEL instead of Ubuntu? I'm happy to do whatever; I just need instructions.
Also, is it just this thread of people having issues? What are the other users doing that we aren't?
@formless63 commented on GitHub (Feb 27, 2026):
I get the feeling that there's been zero effort to investigate. I'm wondering if we all have something in common.
Personally: Ubuntu, running in docker compose, multiple newt sites, many popular selfhosted apps underneath. I added f2b on the VPS after this started, as I initially assumed it was related to getting spammed with failed connection/auth attempts.
When digging and trying to resolve I made some notes:
@Josh-Voyles commented on GitHub (Feb 27, 2026):
I know there's been work by the devs and community, but it seems like it's not clear what's going on. I'm sure if the majority of users and their SaaS platform were having issues, all efforts would be focused on this. However, I'm not convinced that's the case.
So, that's why I'm asking what needs to change on my end.
@AlexWhitehouse commented on GitHub (Feb 27, 2026):
I was having the issue; I restarted the container having changed nothing and am no longer experiencing it. Unhelpful, I know, but it suggests this is more of a race condition than something permanent.
@SamTV12345 commented on GitHub (Feb 27, 2026):
It still occurs for me. I "solved" the issue by adding a cron job that runs every midnight where my 1 GB VPS is restarted.
@huzky-v commented on GitHub (Feb 27, 2026):
I have tried some testing on my 1GB testing VPS; the command is
echo "GET https://resource.protected.ltd" | vegeta attack -duration=3600s --rate=10 | vegeta report
basically sending 10 requests/s to the target protected resource.
Things I tried:
- 1.12.3, which is a node 22 alpine image
Here are some of my observations during my debug, build, test loop:
- Even on 1.12.3, the memory will still grow on unauthenticated requests and make docker stats unable to return data.
- The base memory usage of 1.12.3 is around 200MB, and it grows with the version progress.
- src/app/auth/resource/[resourceGuid]/page.tsx, which shows the auth page when not authenticated: the memory still grows.
- The zod library, as stated above: it did cut some of the heap usage, like 10%, and docker stats will not hang as soon, but it will still crash.
- In the heap: string, i18n stuff, some zlib hanging around.
- 1.12.3 is still OK because the base memory usage is low enough for the VPS to run, and there is headroom to GC. The memory leak? I think it still exists, but it's just too hard to locate, and hard to find a minimal reproducible snippet for it.
Dang. I am not affected by this case (I have a large-memory VPS instance, and frankly not growing too much memory because of that), but I am literally out of ideas.
@n1LWeb commented on GitHub (Feb 27, 2026):
Some insights:
If I disable all health checks, the memory usage is mostly stable. But I need them, as most of my resources are reachable over two different newt instances.
Maybe the people without the issue didn't enable health checks yet?
I switched from my 1GB x86 VPS to my 24GB ARM VPS, but pangolin still crashes if it grows over 1GB of RAM usage. Setting a 900MB limit in Docker restarts the container about every 2 hours, but it's usable.
@Alloc86 commented on GitHub (Feb 27, 2026):
Just to chime in as I have a different scale of proxy on my end:
I don't get reproducible issues, but it locked up due to memory issues maybe 3-5 times. The current "session" has been fine for 3-4 weeks already though (with no change either); maybe fewer bots hitting it or something.
@oschwartz10612 commented on GitHub (Feb 27, 2026):
Thanks everyone for the continued information and concerns.
We are looking at it, but it has been hard to pin down. With all of the reports in here, I am not sure there has been a "smoking gun" I can just go fix. On top of that, despite doing updates to packages from dependabot, that did not fix it either, if it is a dependency thing.
I will try to make it a point to look into this again with the new info ASAP, and maybe we can do a patch or two or something.
What is even more baffling is that we have thousands of users and sites on the cloud, yet we don't see the issue, LOL. So all I can say is we are still thoroughly confused but want to get this resolved!
I would highly suggest adding resource limits to the container though - docker should handle killing it and restarting it:
https://docs.docker.com/reference/compose-file/deploy/#resources
@formless63 commented on GitHub (Feb 27, 2026):
If there is anything specific those of us who are affected can do to help, please let us know. I'm happy to set up custom logging of some sort if there are configurations that might bring more details to light for you to work with. - or any other potential item that might produce good data for you.
Thanks for all of the work you do!
@Joly0 commented on GitHub (Feb 27, 2026):
Btw, I was affected by this problem as well a while ago. I had added a lot of things to my pangolin stack (like the traefik-dashboard and other things by hhftechnology). I tried resetting and re-installing pangolin, only adding crowdsec and the geoblock-updater containers to the stack, and so far everything is buttery smooth and stable.
@Ragnaruk commented on GitHub (Mar 2, 2026):
Could it be the sqlite driver? You probably use Postgres in your cloud.
@n1LWeb commented on GitHub (Mar 2, 2026):
I'm using sqlite and have the issue.
Others?
@joerg-hro commented on GitHub (Mar 2, 2026):
me too
@Josh-Voyles commented on GitHub (Mar 2, 2026):
Sqlite here.
@oschwartz10612 commented on GitHub (Mar 4, 2026):
Ahh yes, this is good info. It must be the sqlite driver or something else then. Helps narrow it down! Let me do some thinking.
Maybe it's time to upgrade to libsqlite3 to get off better-sqlite...
@hansencheck24 commented on GitHub (Mar 4, 2026):
I'm using ghcr.io/fosrl/pangolin:postgresql-1.15.1 and have the issue.
@maiestro commented on GitHub (Mar 9, 2026):
In the meantime, I have installed Pangolin on two different small VPSs, each with 1GB RAM, 1-core vCPU, and the latest Debian 13 for testing purposes:
On the IONOS system, a failure occurs almost immediately (even the eth0 interface fails after a while).
Based on my testing, the failure occurs shortly after I visit the Pangolin configuration web interface to make some settings.
On the Netcup system, Pangolin runs with almost 80% RAM utilization, but has been stable so far (3 days).
root@IONOS-VPS# lscpu
root@NETCUP-VPS# lscpu
Perhaps other systems with Pangolin problems look similar to my IONOS VPS system?
@n1LWeb commented on GitHub (Mar 9, 2026):
For me the issue exists on a RackNerd VPS and on an Oracle Free Tier ARM VPS.
On both only if I have activated health checks.
Racknerd:
@0i5e4u commented on GitHub (Mar 9, 2026):
Same here with Problems:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Vendor ID: AuthenticAMD
BIOS Vendor ID: QEMU
Model name: AMD EPYC-Milan Processor
BIOS Model name: pc-i440fx-6.1 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 25
Model: 1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
Stepping: 1
BogoMIPS: 3992.49
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat npt nrip_save umip pku ospke vaes vpclmulqdq rdpid
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
Hosted on Strato
@harrybaumann commented on GitHub (Mar 10, 2026):
I have the issue on the same 1 GB IONOS VPS as mentioned above. Sqlite version of Pangolin, using 3 Newt connections. There is not much load, as there are only a handful of users.
However, the Pangolin instance stops responding after some hours. Sometimes it stays responsive for 1 or 2 days, but not more. It is then necessary to restart the instance via the IONOS management console.
@joerg-hro commented on GitHub (Mar 10, 2026):
I have the same configuration with the same issues.
@xylcro commented on GitHub (Mar 10, 2026):
Same issue here. Also using sqlite
@Madnex commented on GitHub (Mar 13, 2026):
Same issue here as well. I started to use Gatus for monitoring and had used a config that hit the Pangolin auth page for the checks. Memory consumption of the pangolin container then went up steadily until I fixed the Gatus config. Even then, I still had to redeploy Pangolin to free up the memory. Using Pangolin v1.16.2.
@xylcro commented on GitHub (Mar 16, 2026):
I use Gatus too; how did you fix your config?
@sambilbow commented on GitHub (Mar 16, 2026):
I also use Gatus to hit my proxied endpoints... Interesting
@Madnex commented on GitHub (Mar 16, 2026):
I just added IP-based bypass rules in Pangolin. It was a bit tricky to find out when the check is actually working and not hitting the authentication page. One thing that helped was adding this condition: "[BODY] != pat(*Powered by Pangolin*)"
However, even after fixing this, I see that the pangolin container is steadily grabbing more RAM, just much more slowly now. I assume every time someone tries to access an endpoint and hits the authentication page, it adds up. At least it's not my own monitoring anymore...
@dunamos commented on GitHub (Mar 16, 2026):
Hi!
I have the same issue, and I only noticed it after adding a monitoring tool (Dockhand in my case).
I am not sure which is the chicken and which the egg: did adding the monitoring cause the OOM issues (I have since added a max RAM limit), or was it happening before and I didn't notice?
@formless63 commented on GitHub (Mar 16, 2026):
Also using Gatus to do health checks here, but I do have bypass rules in place.
It certainly seems related to authorization hits. When I set up f2b and started blocking bots and such, the memory growth slowed, but it has not completely resolved.
@harrybaumann commented on GitHub (Mar 16, 2026):
I think I have found "my" issue with a 1 GB VPS. Although the RAM filled up quite fast, it wasn't the reason for the instance becoming unavailable. I noticed that the instance's hard disk of 10 GB was completely full (100%) when the machine stopped working.
Due to a sqlite database that had more and more data in it (multiple GB, maybe logging?) and a Docker image that became larger with every version update, the lifetime of my small instance got shorter.
I've "fixed" the issue by upgrading to a bigger instance with more hard disk space. I can confirm that pangolin has worked well for 5 days now on this new machine running a copy of the original docker volume, so I believe the crashes are gone.
Maybe the issue I had wasn't the issue discussed in this topic.
@dunamos commented on GitHub (Mar 16, 2026):
In my case, my VPS became unresponsive due to kswapd0 using 100% of my CPU.
It makes sense, because my RAM was saturated and my storage space was very low, so no swap was available.
After cleaning my VPS storage a bit and adding a 500M RAM limit to my Pangolin container, I have fewer issues.
@xylcro commented on GitHub (Mar 16, 2026):
After removing the proxy host from Gatus monitoring, the RAM appears to remain stable. It definitely has something to do with Gatus hitting the auth page...
@n1LWeb commented on GitHub (Mar 16, 2026):
Not specific to Gatus, but any repeated accesses, I think. Thanks for bringing Gatus to my attention though; it might be time to switch, as it looks like a better fit for my workflow than Uptime Kuma.
I don't check much of my pangolin services using Uptime Kuma, however; instead I'm using the health checks of Pangolin itself, and it makes a huge difference if I disable them. Sadly, I need them for failover.
The other thing that was new around the time the problems started is the country filtering. On the 24GB RAM VPS I'm using, pangolin becomes unresponsive as soon as it hits around 1GB RAM, way before my memory is full.
@cmmrandau commented on GitHub (Mar 19, 2026):
It did it again. This is on a vanilla setup with crowdsec. Only other services running are arcane agent and watchtower (and pulse agent as a systemd service).
@TubaApollo commented on GitHub (Mar 22, 2026):
I am also experiencing an issue with some kind of memory leak, so I tried to trace it down.
I took two V8 heap snapshots from the running server process (ee-1.16.2, Node v24.14.0), the second after forcing GC via HeapProfiler.collectGarbage:
- GzipServerResponse
- zlib_memory (native, ~263 KB each)
All leaked sockets trace back to the Next.js server.
The bulk of the leaked memory is native zlib allocations:
Container RSS after 1.5h: 1.54 GiB
V8 heap: 253 MB
Untracked native (zlib): ~1.2 GB
The compression is applied in node_modules/next/dist/server/lib/router-server.js.
Disabling compression in node_modules/next/dist/server/lib/router-server.js at line 109 helped in my case. A proper fix would likely be setting compress: false in Next.js's config, but I have only verified the direct patch above.
And so far I am seeing a massive improvement, maybe someone can confirm that. Not sure if it's related.
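For anyone wanting to see where their installed Next.js version wires up compression before patching anything, a read-only look inside the running container (assuming the app lives in /app, as the paths above suggest):

# Print the compression-related lines with their line numbers; the
# "line 109" above may differ between Next.js versions.
docker exec pangolin grep -n -i compress /app/node_modules/next/dist/server/lib/router-server.js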
@huzky-v commented on GitHub (Mar 22, 2026):
That's also what I observed in my load testing on the heap, and I had tried that disable-compression config before.
But the config is not respected, and the gzip response is still there.
@TubaApollo commented on GitHub (Mar 22, 2026):
I tried with the compress: false option passed to next() first. But it is ignored because router-server.js reads from the Next.js file config (next.config.js), not the constructor options. The only way I got it to work was patching router-server.js directly.
@huzky-v commented on GitHub (Mar 22, 2026):
My approach was to set https://github.com/fosrl/pangolin/blob/main/next.config.ts with the compression option disabled, but it did not work.
Don't know if I set the config wrong.
@TubaApollo commented on GitHub (Mar 22, 2026):
I rechecked. The config seems to be baked into /app/.next/required-server-files.json. You would need a full rebuild if you don't want to patch it. (Turns out this is wrong; this file is not read at runtime, I think?)
@huzky-v commented on GitHub (Mar 22, 2026):
My test always rebuilds the docker images for each test after making changes (including the next config), but it doesn't work.
Maybe I'll try to build that image again and check the files.
@huzky-v commented on GitHub (Mar 22, 2026):
The response still has gzip.
@TubaApollo commented on GitHub (Mar 22, 2026):
I am not fully sure, but although the config is baked into required-server-files.json, router-server.js never reads it? It calls loadConfig(), which looks for a physical next.config.js file on disk, and that file doesn't exist in the container. So it defaults to compress: true. So you will probably need to adjust the Dockerfile accordingly if you haven't already.
By adding something like this:
COPY --from=builder-dev /app/next.config.ts ./next.config.ts
But I would rather have someone with a bit more of a clue confirm this haha
@huzky-v commented on GitHub (Mar 22, 2026):
OK, with the idea from @TubaApollo, I finally managed to get the compress option in Next.js turned off.
The idea is to add a file next.override.ts.
EDIT: The image also does not need to be rebuilt. For existing pangolin users, just pass the next.override.ts file into the pangolin volume in the compose file to hot-patch it without rebuilding the image.
Please note that as gzip is disabled in Next.js, you may see a massive jump in bandwidth usage if you don't change the traefik config. To compensate for gzip being turned off in Next.js, offload the gzip compression to traefik: add a middleware in config/traefik/dynamic_config.yml. Don't know if there is a negative effect from the gzip middleware though.
My testing shows that there is no more zlib stuff in the heap dump, but I can't tell with my instance, as my instance is not resource limited.
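A quick way to verify from the outside which layer (if any) is compressing after such a change; a sketch with a placeholder domain:

# Request a page with gzip accepted and dump only the response headers;
# no content-encoding header means compression is off end-to-end.
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' https://pangolin.example.com/ | grep -i content-encoding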
@huzky-v commented on GitHub (Mar 23, 2026):
Moved my pangolin instance to a 1G RAM VPS; we'll see what happens in a couple of days.
Current state is like this:
pangolin
(CPU): 5.45%
(MEM USAGE / LIMIT) 432.6MiB / 954.9MiB
(BLOCK I/O) 34.6GB / 3.73MB
EDIT: The instance crashed, still not working 😩
@Josh-Voyles commented on GitHub (Mar 23, 2026):
@huzky-v It didn't seem to make a difference for me. However, I'm not running the enterprise build.
I'm going to try disabling my uptime kuma checks. It's been brutal these last few weeks.
@TubaApollo commented on GitHub (Mar 25, 2026):
Mh, there might be another (possibly smaller) memory leak? Because for me it's definitely a lot better. I have 16GB of memory available, and before, it took all of them within a few days. Now I am at about 1.4GB since the fix, so it definitely did something.
@hansencheck24 commented on GitHub (Apr 10, 2026):
fix-memory-leak.patch
Can anyone help me with how to build a custom Pangolin image so I can test this patch? I would like to build a custom PostgreSQL variant of the image.