kubernetes manifest / kustomize #442

Open
opened 2025-11-13 12:00:43 -06:00 by GiteaMirror · 16 comments
Owner

Originally created by @rdxmb on GitHub (Jun 16, 2025).

Originally assigned to: @marcschaeferger on GitHub.

Hello,

we've just built a pangolin stack for kubernetes via kustomize. Are you interested in a PR?

Limitations

Since we did not fully understand which configs have to be shared across the containers (none?), our stack looks like this:

  • An all-in-one pod with all containers included
  • Additionally a Kubernetes Job to create some random secrets such as session secret and admin password
  • One PVC mounted into all containers
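
As a rough sketch (file names here are hypothetical, not the actual PR layout), the stack above could be wired up like this:

```yaml
# kustomization.yaml: sketch of the layout described above
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: pangolin
resources:
  - pod.yaml          # all-in-one Pod with all containers
  - pvc.yaml          # single PVC mounted into every container
  - secrets-job.yaml  # Job generating session secret and admin password
```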

Of course, this could be cleaned up for production - but it could be a first step. If you like the idea, please let me know.

GiteaMirror added the help wanted, good first issue labels 2025-11-13 12:00:43 -06:00

@oschwartz10612 commented on GitHub (Jun 16, 2025):

Yeah, absolutely, all PRs are welcome! I know the community has been eager for k8s support, but our experience with it is a bit lacking, so we have not had the chance.

For the config: the containers don't actually all have to reference the same config. As long as they can address each other over the network for API calls, the configs can be separated out of that config directory. Gerbil does not need it at all, though it is in our deployment, which might need to change. Traefik just needs its config files, and Pangolin needs just the config.yml.

Only pangolin needs to access the DB.
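
A hedged sketch of what that separation could look like in a Pod spec (volume names and mount paths are assumptions, not the project's actual layout):

```yaml
# Sketch: each container mounts only what it needs, per the comment above.
# Names and paths are hypothetical.
containers:
  - name: pangolin
    volumeMounts:
      - name: pangolin-config       # ConfigMap containing only config.yml
        mountPath: /app/config/config.yml
        subPath: config.yml
        readOnly: true
  - name: traefik
    volumeMounts:
      - name: traefik-config        # ConfigMap with Traefik's config files
        mountPath: /etc/traefik
        readOnly: true
  - name: gerbil                    # needs no config mount at all
```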

@rdxmb commented on GitHub (Jun 16, 2025):

Thanks for the detailed answer! That's good to know.

In Kubernetes, it's important to differentiate between read-only configuration and data that will be written by the application. We would also prefer to mount only the configuration needed per container. Could you clean up the docker-compose.yml file in that way? (Read-only configurations could be mounted with ro.)

It would be great to have this groundwork so that we can clean up our k8s manifest as well and create a scalable microservice stack.
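
For illustration, such a split in docker-compose.yml might look like this (paths and volume names are hypothetical):

```yaml
# Sketch: read-only config marked with :ro, writable data in its own volume.
services:
  pangolin:
    volumes:
      - ./config/config.yml:/app/config/config.yml:ro  # read-only config
      - pangolin-data:/app/data                        # writable app data
volumes:
  pangolin-data:
```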

@github-actions[bot] commented on GitHub (Jul 1, 2025):

This issue has been automatically marked as stale due to 14 days of inactivity. It will be closed in 14 days if no further activity occurs.

@vtmocanu commented on GitHub (Jul 9, 2025):

also interested in this one, not stale

@marcschaeferger commented on GitHub (Aug 17, 2025):

I'm working on it. They will be released soon; currently in the alpha phase.

@chiqors commented on GitHub (Sep 5, 2025):

I am interested in this. A Helm chart would be helpful.

@marcschaeferger commented on GitHub (Sep 8, 2025):

@chiqors I can send you the Helm chart for Newt or Pangolin if you need it immediately.

The official release will take another 1–2 weeks for the final setup, e.g., publishing on ArtifactHub, setting up CI, and creating beginner-friendly Helm documentation.

@marcschaeferger commented on GitHub (Sep 8, 2025):

The same applies for the manifest files and Kustomize, with options for Service/PodMonitor, PrometheusRules, and multiple deployment setups — for example, a single all-in-one pod or a multi-pod setup where each component runs separately.

Both approaches can be configured to use an external managed PostgreSQL database (e.g., via a PostgreSQL operator like CloudNative-PG or a separate Helm installation of PostgreSQL) or a PostgreSQL instance deployed directly by the Application Manifest/Kustomize.

@chiqors commented on GitHub (Sep 8, 2025):

> @chiqors I can send you the Helm chart for Newt or Pangolin if you need it immediately.
>
> The official release will take another 1–2 weeks for the final setup, e.g., publishing on ArtifactHub, setting up CI, and creating beginner-friendly Helm documentation.

I see, it's fine, I can wait. Does that use Ingress, Gateway API, or another method to reverse proxy / expose the service?

@marcschaeferger commented on GitHub (Sep 8, 2025):

@chiqors
There are two options:

Option 1:

Use the Traefik pod with an external IP, for example with MetalLB or a load balancer via Cloud Controller Manager annotations, depending on your cloud provider. This approach is similar to the Docker Compose installation, just transferred to Kubernetes. There's no need for Ingress or anything else. The Traefik pod reads the config from Pangolin via the HTTP provider, as it does in the normal Pangolin installation.
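
As a sketch of Option 1, the Traefik static configuration could point its HTTP provider at Pangolin roughly like this (service name, port, and path are assumptions based on the typical Pangolin setup):

```yaml
# Sketch: Traefik pulls its dynamic routing config from Pangolin over HTTP.
providers:
  http:
    endpoint: "http://pangolin:3001/api/v1/traefik-config"  # assumed URL
    pollInterval: "5s"
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
```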

Option 2 (preferred):

Use the Pangolin Traefik Controller, which is a small microservice I wrote in Go. It also supports OTEL (metrics, tracing, and more) and interacts with the K8s API to create IngressRoutes, middlewares, etc. for all resources you create within Pangolin. With this, you don't need a dedicated Traefik pod for Pangolin; just use the Traefik IngressController. Later, I plan to add NGINX support as well.
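
To illustrate Option 2, this is the kind of Traefik IngressRoute such a controller could generate per Pangolin resource (all names here are placeholders, not the controller's actual output):

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: example-resource          # would be derived from the Pangolin resource
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`app.example.com`)
      kind: Rule
      services:
        - name: example-service   # backend Service for the resource
          port: 80
```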

@chiqors commented on GitHub (Sep 11, 2025):

> @chiqors There are two options:
>
> Option 1:
>
> Use the Traefik pod with an external IP, for example with MetalLB or a load balancer via Cloud Controller Manager annotations, depending on your cloud provider. This approach is similar to the Docker Compose installation, just transferred to Kubernetes. There's no need for Ingress or anything else. The Traefik pod reads the config from Pangolin via the HTTP provider, as it does in the normal Pangolin installation.
>
> Option 2 (preferred):
>
> Use the Pangolin Traefik Controller, which is a small microservice I wrote in Go. It also supports OTEL (metrics, tracing, and more) and interacts with the K8s API to create IngressRoutes, middlewares, etc. for all resources you create within Pangolin. With this, you don't need a dedicated Traefik pod for Pangolin; just use the Traefik IngressController. Later, I plan to add NGINX support as well.

Thank you for the detailed explanation.

I'm currently using Pangolin on Docker as a reverse proxy manager, which works flawlessly for my setup: exposing multiple services via subdomains from a single public IP.

Now, I'm planning to migrate to a bare-metal, single-node Kubernetes cluster with Cilium CNI. I want to run the entire Pangolin stack inside the cluster.

My initial thought was to expose Pangolin using a NodePort and then let it handle the reverse proxying from there. However, I was under the impression that this approach might require the Pangolin instance to be outside the cluster, which is what I want to avoid.

Given that my goal is to have everything running within Kubernetes, which of your two options would you recommend for my bare-metal setup?

@marcschaeferger commented on GitHub (Sep 11, 2025):

@chiqors

Given your bare-metal, single-node Kubernetes setup with Cilium CNI and the requirement to run everything inside the cluster, the recommended approach is Option 2 — use the Pangolin Traefik Controller.


Why this is the better choice

  • Kubernetes-native management and orchestration
    This approach aligns perfectly with Kubernetes best practices by leveraging CRDs (Custom Resource Definitions) and the Kubernetes API directly.

  • Scalability, observability, and maintainability
    It’s easier to scale, monitor, and integrate with other Kubernetes-native tools such as Prometheus, Grafana, and Jaeger. This makes the stack more modular, maintainable, and future-proof, especially when adding more nodes in the future.

  • Reusable for other applications
    The Traefik IngressController can be used for exposing any application in your cluster, not only Pangolin, which would be the case in Option 1.

  • Easier high availability compared to a standalone Traefik Pod
    Running Traefik as an IngressController allows HA by simply scaling the deployment (e.g., one pod per node), whereas a plain Traefik pod needs manual tweaks for redundancy.

And many other reasons...


Explanation

An Ingress Controller in Kubernetes is essentially a routing component that integrates deeply with the Kubernetes API. Compared to running a plain Traefik pod, it offers richer and more native functionality:

  • All routing configurations are stored as Kubernetes resources (Ingress, IngressRoute (Traefik-specific), Middleware (Traefik-specific), etc.) instead of large, centralized configuration files.
  • Switching between controllers (e.g., NGINX ↔ Traefik) is much simpler if you stick to standard Ingress resources, since these are compatible with any controller.
  • These resources can be inspected, modified, and applied at any time via kubectl, making changes far more transparent and less error-prone.
  • Being standard Kubernetes YAML definitions, they can easily be incorporated into GitOps workflows (Argo CD, Flux, etc.), allowing automated deployments, version control, and reproducibility.
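
For example, a plain Ingress like the following works with any conforming controller; switching from Traefik to NGINX is mostly a matter of changing ingressClassName (names below are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  ingressClassName: traefik   # swap to "nginx" for ingress-nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
```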

There is more...


Single-node consideration:
If possible, avoid running production workloads on a single-node cluster. Aim for at least a 3-node cluster for high availability (HA) and operational resilience. You can even use low-cost hardware such as Raspberry Pi / Compute Modules for additional nodes — these can serve as control-plane nodes or workers, depending on your needs.
At minimum, I would recommend 3 Raspberry Pi control-plane nodes and 1 dedicated worker node (the one you initially planned to use as your single node). This setup will save you troubleshooting time and provide much better fault tolerance.

Bare-metal consideration:
If your node(s) are self-hosted and you opt for Option 2, you will also need something like MetalLB (or another bare-metal load balancer) to provide external IPs for your Traefik IngressController service. For cloud setups you can sometimes use MetalLB as well, or the provider's options, e.g. a Hetzner load balancer.

If you have more questions just let me know.

@chiqors commented on GitHub (Sep 13, 2025):

@marcschaeferger

I'm testing on a VPS platform that only provides a single public IP, so I can't use Layer 2 ARP announcements or cloud load balancer IPs. I'm used to hosting services with Docker Compose, and Pangolin has been a really helpful reverse proxy so far. Since this is just for my personal homelab and I'm still learning about Kubernetes, it's not easy for me to migrate to a K8s cluster.

@marcschaeferger commented on GitHub (Sep 13, 2025):

@chiqors

Just because you have a single public IP doesn’t necessarily mean you can’t use Layer 2 ARP advertising if you have a private IP as well. This really depends on the capabilities and policies of your provider.
Also, I strongly recommend that you never expose the Kubernetes API directly on a public IP without a firewall in front of it.

I don’t know the exact details of your setup, provider networking, or budget, but let’s make a reasonable assumption:

  • You have 1 public IP.
  • You can create a private network with multiple internal IPs between your nodes.

In that case, here’s what I’d suggest:

  • Run at least 3 nodes (master + worker), ideally 4 or 5 (either 3 control-plane + workers, or 3 master/worker combined).
  • Assign private IPs for node-to-node communication and cluster networking + KubeAPI (maybe a VIP or something).
  • Use the public IP with MetalLB.
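
Under those assumptions, the MetalLB side could look roughly like this (the address is a documentation placeholder for your single public IP):

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: public-pool
  namespace: metallb-system
spec:
  addresses:
    - 203.0.113.10/32       # your single public IP
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: public-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - public-pool
```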

Even if your current provider doesn't allow this kind of setup, there are plenty of VPS hosters worldwide that do support private networks and ARP announcements, and many of them are still quite affordable. And if you want to stick with your provider, you can still use NodePort with the Pangolin-Controller (Traefik Ingress Controller) inside your Kubernetes cluster.


I’m aware of the Operator you mentioned. In fact, I’ve been developing my own Pangolin Operator over the last 3–4 weeks, which will be released under fosrl soon. I just need a bit more time to wrap up some remaining work and do some alignment with Owen, as the plan is a declarative configuration for all platforms, and they should align. You can see a related comment here: https://github.com/fosrl/pangolin/issues/691#issuecomment-3255161366

The Operator you referenced is an early rough draft, as the author also stated. This will also be discussed at the upcoming community event next week.
Currently, it still uses Go 1.22, which carries a couple of upstream CVEs, and many parts are marked as TODO in the code. I think it's also his first operator; at least the codebase suggests so.

Some of the things that stand out to me with the current version that are blockers in my eyes:

Finalizer Cleanup

  • The finalizer is currently removed without cleaning up external resources (Tunnel + Resource).

Reconcile Idempotency & Update Handling

  • For PangolinResource: check if a resource exists (by Name/Domain/Subdomain) before creating to avoid duplicates after requeues.
  • Implement update paths for changes in the Spec.
  • For PangolinTarget: support updates rather than only creating once.

Watches & Indexes

  • Binding Controller should watch Service/Endpoints changes (Watch + MapFunc, EnqueueByField).
  • Resource Controller could watch Organization/Tunnel changes via a Field Indexer (e.g. .spec.tunnelRef.name) for targeted reconciles.
  • Tunnel Controller: Owns(Deployment/Secret) is fine, but ensure proper OwnerReferences/Labels for garbage collection.

TODOs in the Tunnel Reconciler should be implemented:

  • Populate the Secret from API data (NewtID / NewtSecretKey).
  • Build the Deployment/PodTemplate from Spec/Defaults with Image, Replicas, readiness/liveness probes, resource limits, labels/OwnerRefs, and update Status.ReadyReplicas.
  • Consider ImagePullPolicy/Secrets, SecurityContext, NodeSelector/Tolerations.

and a few other things.

@chiqors commented on GitHub (Sep 13, 2025):

That's helpful information. Thanks for the answer; I get an early picture of what you're building. I will keep watching for updates :)

@Xoffio commented on GitHub (Nov 4, 2025):

Here is a Helm chart I created: https://github.com/XpaceOff/xo-pangolin. Hopefully it helps anyone who needs something now while we wait for an official chart.

It’s set up to work with Traefik as the Ingress controller. I created an IngressRouteTCP that passes TLS through to Pangolin’s Traefik. If you’re using NGINX, you’ll need to modify the Ingress.

Thanks to the passthrough I didn't need any special microservice, but now I have two Traefiks, which I'm fine with.
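
A minimal sketch of such a TLS-passthrough route (service name and port are assumptions for illustration, not taken from the chart):

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: pangolin-passthrough
spec:
  entryPoints:
    - websecure
  routes:
    - match: HostSNI(`*`)
      services:
        - name: pangolin-traefik   # Service fronting Pangolin's inner Traefik
          port: 443
  tls:
    passthrough: true              # TLS is terminated by Pangolin's Traefik
```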


Reference: github-starred/pangolin#442