Compare commits


227 Commits
1.3.4 ... 1.4.9

Author SHA1 Message Date
Marc Schäfer
e07439a366 ci(actions): add GHCR mirroring and cosign signing for Docker images
mirror images from Docker Hub to GHCR using skopeo (preserves multi-arch manifests)
login to GHCR via docker/login-action for signing/pushing
install cosign and perform dual signing: keyless (OIDC) + key-based; verify signatures
add required permissions for id-token/packages and reference necessary secrets
2025-10-21 00:06:27 +02:00
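The mirror-and-sign flow this commit describes can be sketched as workflow steps; image names, registry paths, and secret names below are placeholders, not the repository's actual configuration:

```yaml
# Hedged sketch of the CI steps (paths and names hypothetical)
permissions:
  id-token: write   # OIDC token for keyless cosign signing
  packages: write   # push to GHCR
steps:
  - name: Mirror image to GHCR (skopeo --all preserves multi-arch manifests)
    run: |
      skopeo copy --all \
        docker://docker.io/example/newt:${TAG} \
        docker://ghcr.io/example/newt:${TAG}
  - name: Dual-sign and verify
    run: |
      cosign sign --yes ghcr.io/example/newt:${TAG}                               # keyless (OIDC)
      cosign sign --yes --key env://COSIGN_PRIVATE_KEY ghcr.io/example/newt:${TAG} # key-based
      cosign verify --key cosign.pub ghcr.io/example/newt:${TAG}
```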
Marc Schäfer
9f43d4ce6d ci(actions): pin action versions to commit SHAs for security
- Pin actions/checkout to SHA for v5.0.0
- Pin docker/setup-qemu-action to SHA for v3.6.0
- Pin docker/setup-buildx-action to SHA for v3.11.1
- Pin docker/login-action to SHA for v3.6.0
- Pin actions/setup-go to SHA for v6.0.0
- Pin actions/upload-artifact to SHA for v4.6.2
2025-10-20 23:23:13 +02:00
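Pinning to a commit SHA means the `uses:` reference names an immutable commit rather than a movable tag, with the tag recorded in a trailing comment. The SHA below is a placeholder, not one of the real pins:

```yaml
steps:
  # Immutable pin; "# v5.0.0" documents which release the SHA maps to.
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # v5.0.0
```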
Marc Schäfer
5888553c50 Update mirror.yaml 2025-10-20 21:06:50 +02:00
Marc Schäfer
f63b1b689f Create mirror.yaml 2025-10-20 21:01:19 +02:00
Marc Schäfer
7f104d1a0c Merge pull request #13 from marcschaeferger/otel
Adding Otel to main after upstream sync
2025-10-10 19:43:40 +02:00
Marc Schäfer
9de29e7e00 Merge branch 'main' into otel 2025-10-10 19:41:08 +02:00
Marc Schäfer
cf611fe849 Merge branch 'main' of https://github.com/marcschaeferger/newt 2025-10-10 19:19:19 +02:00
Marc Schäfer
23e2731473 Merge pull request #11 from marcschaeferger/codex/implement-review-suggestions-for-newt-code
Adjust telemetry metrics for heartbeat timestamps and uptime
2025-10-10 19:18:49 +02:00
Marc Schäfer
186b51e000 refactor(telemetry): update OpenTelemetry SDK imports and types for metrics and tracing 2025-10-10 19:17:02 +02:00
Marc Schäfer
d21f4951e9 Add WebSocket and proxy lifecycle metrics 2025-10-10 19:15:33 +02:00
Marc Schäfer
e04c654292 Merge pull request #9 from marcschaeferger/dependabot/go_modules/prod-minor-updates-8fc2d76c77
Bump the prod-minor-updates group across 1 directory with 4 updates
2025-10-10 18:21:31 +02:00
Marc Schäfer
e43fbebcb8 Merge pull request #10 from marcschaeferger/codex/review-opentelemetry-metrics-and-tracing
Enhance telemetry metrics and context propagation
2025-10-10 18:21:14 +02:00
Marc Schäfer
1afed32562 Merge branch 'main' into codex/review-opentelemetry-metrics-and-tracing 2025-10-10 18:20:41 +02:00
Marc Schäfer
46384e6242 fix(metrics): update metrics recommendations and add OpenTelemetry review documentation 2025-10-10 18:18:38 +02:00
Marc Schäfer
52e4a57cc1 Enhance telemetry metrics and context propagation 2025-10-10 18:17:59 +02:00
Marc Schäfer
1a9f6c4685 fix(github-actions): add permissions section for content read access in workflows 2025-10-10 15:34:00 +02:00
Marc Schäfer
b6f5458ad9 fix(telemetry): enhance session observation logic for tunnel IDs and site-level aggregation 2025-10-10 15:30:06 +02:00
Marc Schäfer
4ef9737862 fix(observability): enhance clarity and structure of metrics documentation 2025-10-10 15:29:53 +02:00
Marc Schäfer
b68777e83a fix(prometheus): clarify instructions regarding scraping the Collector 2025-10-10 15:29:45 +02:00
Marc Schäfer
8d26de5f4d fix(docker-compose): improve comments for clarity on port mapping and collector usage 2025-10-10 15:29:24 +02:00
Marc Schäfer
c32828128f fix(readme): enhance clarity and structure of installation and documentation sections 2025-10-10 14:49:14 +02:00
Marc Schäfer
3cd7329d8b fix(prometheus): update comment for clarity and consistency in scraping instructions 2025-10-10 14:47:49 +02:00
Marc Schäfer
3490220803 fix(docker-compose, prometheus): remove unnecessary comments and improve clarity 2025-10-10 14:46:17 +02:00
Marc Schäfer
bd62da4cc9 fix(docker-compose, prometheus, telemetry, proxy): standardize collector naming and improve error handling 2025-10-10 14:42:05 +02:00
Marc Schäfer
8d0e6be2c7 fix(metrics): enhance documentation clarity and structure for metrics recommendations 2025-10-10 14:17:24 +02:00
Marc Schäfer
b62e18622e fix(manager, stub, util): enhance error handling and logging consistency 2025-10-10 14:16:28 +02:00
dependabot[bot]
89274eb9a8 Bump the prod-minor-updates group across 1 directory with 4 updates
Bumps the prod-minor-updates group with 3 updates in the / directory: [go.opentelemetry.io/otel/exporters/prometheus](https://github.com/open-telemetry/opentelemetry-go), [golang.org/x/crypto](https://github.com/golang/crypto) and [google.golang.org/grpc](https://github.com/grpc/grpc-go).


Updates `go.opentelemetry.io/otel/exporters/prometheus` from 0.57.0 to 0.60.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/exporters/prometheus/v0.57.0...exporters/prometheus/v0.60.0)

Updates `golang.org/x/crypto` from 0.42.0 to 0.43.0
- [Commits](https://github.com/golang/crypto/compare/v0.42.0...v0.43.0)

Updates `golang.org/x/net` from 0.44.0 to 0.45.0
- [Commits](https://github.com/golang/net/compare/v0.44.0...v0.45.0)

Updates `google.golang.org/grpc` from 1.75.1 to 1.76.0
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.75.1...v1.76.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/prometheus
  dependency-version: 0.60.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
- dependency-name: golang.org/x/crypto
  dependency-version: 0.43.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
- dependency-name: golang.org/x/net
  dependency-version: 0.45.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
- dependency-name: google.golang.org/grpc
  dependency-version: 1.76.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-09 09:41:55 +00:00
Owen Schwartz
b383cec0b0 Merge pull request #157 from fosrl/dev
No cloud, config file overwriting, hp
2025-10-08 17:42:45 -07:00
Owen Schwartz
fb110ba2a1 Merge pull request #156 from fosrl/dependabot/go_modules/prod-minor-updates-51461da29c
Bump the prod-minor-updates group across 1 directory with 2 updates
2025-10-08 17:40:23 -07:00
dependabot[bot]
f287888480 Bump the prod-minor-updates group across 1 directory with 2 updates
Bumps the prod-minor-updates group with 2 updates in the / directory: [github.com/docker/docker](https://github.com/docker/docker) and [golang.org/x/net](https://github.com/golang/net).


Updates `github.com/docker/docker` from 28.4.0+incompatible to 28.5.0+incompatible
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.4.0...v28.5.0)

Updates `golang.org/x/net` from 0.44.0 to 0.45.0
- [Commits](https://github.com/golang/net/compare/v0.44.0...v0.45.0)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-version: 28.5.0+incompatible
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
- dependency-name: golang.org/x/net
  dependency-version: 0.45.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-08 09:23:09 +00:00
Marc Schäfer
77d56596ab fix(wgtester): improve logging format for consistency and clarity 2025-10-08 08:14:35 +02:00
Marc Schäfer
6ec0ab813c fix(main): refactor logging messages and introduce constants for improved readability 2025-10-08 08:14:25 +02:00
Marc Schäfer
fef9e8c76b fix(websocket): improve error type handling in connection establishment and ping monitoring 2025-10-08 08:14:04 +02:00
Marc Schäfer
ae5129a7c7 fix(sonar-telemetry): update observeSessionsFor function to include siteID and improve attribute handling 2025-10-08 08:13:35 +02:00
Marc Schäfer
ed127a2d61 fix(docker-compose): update comments in metrics configuration for clarity and consistency 2025-10-08 08:12:58 +02:00
Marc Schäfer
20ddbb5382 fix(telemetry): update proxyStopper to be a no-op function when registration fails 2025-10-08 08:12:20 +02:00
Marc Schäfer
5cbda35637 fix(docker-compose): update newt service configuration to use local build and environment file 2025-10-08 07:34:27 +02:00
Marc Schäfer
60196455d1 fix(telemetry): improve error handling and formatting in telemetry setup functions 2025-10-08 07:33:11 +02:00
Marc Schäfer
84e659acde docs(observability): update code blocks to specify language for better syntax highlighting 2025-10-08 01:12:51 +02:00
Marc Schäfer
e16881b7c8 fix(sonar): SetObservableCallback uses unregister stopper instead of empty function to satisfy S1186 2025-10-08 01:09:18 +02:00
Marc Schäfer
587e829e42 fix(build): use Registration.Unregister() without context; return tracer shutdown func from setupTracing 2025-10-08 01:07:08 +02:00
Marc Schäfer
ee2f8899ff refactor(telemetry): reduce cognitive complexity by splitting registerInstruments and Init; add unregister stoppers; extract state_view helpers 2025-10-08 01:06:13 +02:00
Marc Schäfer
744a741556 docs(README): add Observability Quickstart section and link to docs/observability.md 2025-10-08 01:01:33 +02:00
Marc Schäfer
aea80200e0 docs: add Quickstart in observability; examples: add docker-compose.metrics.collector.yml and prometheus.with-collector.yml (collector-only scrape) 2025-10-08 00:58:30 +02:00
Marc Schäfer
b20f7a02b2 feat(metrics): NEWT_METRICS_INCLUDE_TUNNEL_ID toggle; conditionally drop tunnel_id across bytes/sessions/proxy/reconnect; docs and smoke test updated; examples/prometheus.yml with relabels; docker-compose defaults avoid double-scrape 2025-10-08 00:53:40 +02:00
Marc Schäfer
f28d90595b fix(telemetry): adapt to RegisterCallback returning (Registration, error) 2025-10-08 00:46:41 +02:00
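The adaptation here is that callers must now handle an error and keep the returned registration handle for later cleanup. A stdlib-only sketch of that calling pattern, where the otel SDK's `metric.Registration` is mimicked by a hypothetical local type:

```go
package main

import (
	"errors"
	"fmt"
)

// Registration mimics the otel SDK's metric.Registration, which
// RegisterCallback returns alongside an error; Unregister stops the
// observable callback. All names here are illustrative.
type Registration struct{ active *bool }

func (r Registration) Unregister() error {
	if r.active != nil {
		*r.active = false
	}
	return nil
}

// registerCallback stands in for meter.RegisterCallback with the
// newer (Registration, error) signature.
func registerCallback(cb func() error) (Registration, error) {
	if cb == nil {
		return Registration{}, errors.New("nil callback")
	}
	active := true
	return Registration{active: &active}, nil
}

func main() {
	reg, err := registerCallback(func() error { return nil })
	if err != nil {
		fmt.Println("registration failed:", err)
		return
	}
	// Keep the stopper around for shutdown; deferring it mirrors
	// the "unregister stopper" pattern in nearby commits.
	defer reg.Unregister()
	fmt.Println("callback registered")
}
```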
Marc Schäfer
4a90e36a44 docs+examples: document direction=ingress|egress, initiator and error_type enums; add cardinality relabel tips; provide Collector variants; add scripts/smoke-metrics.sh 2025-10-08 00:46:01 +02:00
Marc Schäfer
9ace45e71f fix(metrics): direction=ingress|egress for bytes; remove transport on tunnel_sessions; extend allow-list (msg_type, phase); add units for histograms and bytes; handle callback errors; normalize error_type taxonomy; HTTP error mapping to enums 2025-10-08 00:43:53 +02:00
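Normalizing the error taxonomy and mapping HTTP failures to enums keeps the `error_type` label bounded. A sketch of such a mapping; the enum values below are assumptions, not the project's exact taxonomy:

```go
package main

import (
	"fmt"
	"net/http"
)

// errorType collapses an HTTP status code into a small enum suitable
// as a low-cardinality metric label. Taxonomy is illustrative only.
func errorType(status int) string {
	switch {
	case status == http.StatusUnauthorized || status == http.StatusForbidden:
		return "auth"
	case status == http.StatusRequestTimeout || status == http.StatusGatewayTimeout:
		return "timeout"
	case status >= 400 && status < 500:
		return "client_error"
	case status >= 500:
		return "server_error"
	default:
		return "none"
	}
}

func main() {
	for _, s := range []int{401, 404, 504, 502, 200} {
		fmt.Println(s, errorType(s))
	}
}
```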
Marc Schäfer
75d5e695d6 fix: update IncReconnect for auth failures; import metric in proxy manager for observable callback 2025-10-08 00:32:39 +02:00
Marc Schäfer
d74065a71b feat(phase2): websocket connect latency and message counters; proxy active/buffer/drops gauges and counters; config apply histogram; reconnect initiator label; update call-sites 2025-10-08 00:30:07 +02:00
Marc Schäfer
f86031f458 docs: update observability catalog to include site_id labels and clarify transport vs protocol; add METRICS_RECOMMENDATIONS.md with roadmap and ops guidance 2025-10-08 00:10:54 +02:00
Marc Schäfer
31f70e5032 test(telemetry): assert allowed attribute site_id appears in metrics exposition 2025-10-08 00:10:17 +02:00
Marc Schäfer
31514f26df feat(proxy): add site_id (and optional region) to bytes attribute sets for tunnel metrics 2025-10-08 00:10:03 +02:00
Marc Schäfer
09fcb36963 fix(main): remove duplicate ClearTunnelID/State and call telemetry.UpdateSiteInfo after resolving client ID 2025-10-08 00:09:44 +02:00
Marc Schäfer
83c3ae5cf9 feat(telemetry/state_view): add site_id label to gauges and set tunnel_sessions transport=wireguard (no hardcoded tcp) 2025-10-08 00:09:30 +02:00
Marc Schäfer
1e88fb86b4 feat(telemetry,metrics): allow site_id/region in attribute filter; read site_id from NEWT_SITE_ID/NEWT_ID or OTEL_RESOURCE_ATTRIBUTES; propagate site_id/region labels across metrics; include site labels in build_info; seed global site info 2025-10-08 00:09:17 +02:00
Marc Schäfer
62407b0c74 remove: removed test results 2025-10-08 00:02:44 +02:00
Marc Schäfer
d91c6ef168 fix: Update observability documentation to correct code block syntax and improve clarity 2025-10-08 00:00:56 +02:00
Marc Schäfer
59e8d79404 chore: Update docker-compose.metrics.yml for improved service configuration 2025-10-07 23:55:47 +02:00
Marc Schäfer
d907ae9e84 fix: Remove unnecessary blank line in prometheus.yml 2025-10-07 23:55:23 +02:00
Marc Schäfer
d745aa79d4 feat: Add Grafana dashboard and Prometheus datasource configuration files 2025-10-07 18:45:40 +02:00
Marc Schäfer
427ab67bb5 fix: Update observability documentation to clarify resource attributes and scraping strategy 2025-10-07 18:45:02 +02:00
Marc Schäfer
a86b14d97d refactor: Simplify telemetry metrics by removing site_id and enhancing tunnel_id usage 2025-10-07 18:43:09 +02:00
Marc Schäfer
f8fd8e1bc5 fix: Update otel-collector.yaml and docker-compose to correct endpoint configurations and enhance resource detection 2025-10-07 17:53:55 +02:00
Marc Schäfer
0b5e662abc fix: Update otel-collector.yaml to correct resource attribute checks and streamline processor/exporter configuration 2025-10-07 12:37:44 +02:00
Marc Schäfer
bd55269b39 feat: Add .env.example file and update docker-compose to use environment variables 2025-10-07 12:37:16 +02:00
Marc Schäfer
3e9c74a65b chore: Update OpenTelemetry collector image to version 0.136.0 2025-10-07 11:51:13 +02:00
Marc Schäfer
922591b269 chore: Update Dockerfile to enhance Go proxy settings and optimize build process 2025-10-07 11:36:23 +02:00
Marc Schäfer
cfe52caa4a chore: No code changes made to the Dockerfile 2025-10-07 11:30:53 +02:00
Marc Schäfer
d31d08c1c8 feat: Update Dockerfile to include installation of git and ca-certificates 2025-10-07 11:25:07 +02:00
Marc Schäfer
9ac4cee48d feat: Add Docker Compose configuration for OpenTelemetry collector and Prometheus 2025-10-07 11:09:20 +02:00
Marc Schäfer
b53fb70778 feat: Implement telemetry for reconnect reasons and RTT reporting
- Added telemetry hooks to track reconnect reasons for WireGuard connections, including server requests and authentication errors.
- Introduced RTT reporting to telemetry for better latency monitoring.
- Enhanced metrics configuration with flags for Prometheus and OTLP exporters.
- Implemented graceful shutdown and signal handling in the main application.
- Updated WebSocket client to classify connection errors and report them to telemetry.
- Added support for async byte counting in metrics.
- Improved handling of reconnect scenarios in the WireGuard service.
- Added documentation for applying patches and rollback procedures.
2025-10-07 09:17:05 +02:00
Marc Schäfer
0f83489f11 Add OpenTelemetry configuration and observability documentation 2025-10-07 09:16:44 +02:00
Marc Schäfer
09e9bd9493 Implement TelemetryView for thread-safe session management and observability 2025-10-07 09:16:17 +02:00
Marc Schäfer
2d4f656852 Add telemetry metrics and constants for improved observability 2025-10-07 09:15:36 +02:00
Marc Schäfer
8f7f9c417c Refactor WireGuard and netstack services for telemetry integration 2025-10-07 09:13:05 +02:00
Marc Schäfer
660adcc72d Instrument authentication and WebSocket connection logic for telemetry events 2025-10-07 09:13:04 +02:00
Marc Schäfer
0d55e35784 Add tunnel latency and reconnect telemetry to ping logic 2025-10-07 09:13:04 +02:00
Marc Schäfer
ceef228665 Refactor ProxyManager for per-tunnel metrics, async bytes collection, and session counting 2025-10-07 09:13:03 +02:00
Marc Schäfer
496ff0734c Integrate tunnel metrics and telemetry reporting throughout main application logic 2025-10-07 09:13:03 +02:00
Marc Schäfer
a89f13870c Initialize telemetry and start admin HTTP server for metrics export 2025-10-07 09:13:03 +02:00
Marc Schäfer
85394d3255 Add flags and environment variables for telemetry and metrics configuration 2025-10-07 09:13:02 +02:00
Marc Schäfer
0405aebb45 Expose admin/metrics endpoint in Dockerfile 2025-10-07 09:13:02 +02:00
Marc Schäfer
9c0f4599b8 Update dependencies for telemetry and metrics support 2025-10-07 09:13:01 +02:00
Owen
348b8f6b94 Try to fix overwriting config file 2025-10-01 10:31:14 -07:00
miloschwartz
71c5bf7e65 update template 2025-09-29 16:38:49 -07:00
Owen
dda0b414cc Add timeouts to hp 2025-09-29 14:55:26 -07:00
Owen
8f224e2a45 Add no cloud option 2025-09-29 12:25:07 -07:00
Owen Schwartz
90243cd6c6 Merge pull request #148 from fosrl/dependabot/go_modules/github.com/docker/docker-28.4.0incompatible
Bump github.com/docker/docker from 28.3.3+incompatible to 28.4.0+incompatible
2025-09-28 17:58:58 -07:00
Owen Schwartz
9b79af10ed Merge pull request #153 from fosrl/dev
Dev
2025-09-28 17:58:38 -07:00
Owen
31b1ffcbe9 Merge branch 'dev' into docker-events 2025-09-28 17:44:09 -07:00
dependabot[bot]
f1c4e1db71 Bump github.com/docker/docker
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.3.3+incompatible to 28.4.0+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.3.3...v28.4.0)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-version: 28.4.0+incompatible
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-28 23:31:05 +00:00
Owen
72a61d0933 Merge branch 'main' into dev 2025-09-28 16:27:05 -07:00
Owen
e489a2cc66 Merge branch 'main' of github.com:fosrl/newt 2025-09-28 16:26:58 -07:00
Owen
4e648af8e9 Pick up the existing interface private key 2025-09-28 16:26:36 -07:00
Owen
5d891225de Fix generateAndSaveKeyTo 2025-09-28 11:28:31 -07:00
Owen Schwartz
9864965381 Merge pull request #152 from didotb/didotb-docs-blueprint-file
docs: Add blueprint-file as a new cli arg and env var
2025-09-25 18:08:50 -07:00
Owen
75f6362a90 Add logging to config 2025-09-25 17:18:28 -07:00
Andrew Barrientos
30907188fb docs: Add new cli arg and env var
Include blueprint-file as an option in the cli arguments and environment variable
2025-09-26 06:46:32 +08:00
Owen Schwartz
5f11df8df2 Merge pull request #147 from marcschaeferger/Dependency-Update-09-25
Golang Dependency Update 09-2025
2025-09-21 20:10:13 -04:00
Owen Schwartz
7eea6dd335 Merge pull request #146 from marcschaeferger/github-actions
fix(gh-actions): Workflow does not contain permissions
2025-09-21 20:09:35 -04:00
Marc Schäfer
9dc5a3d91c fix(deps): add missing gopkg.in/yaml.v3 v3.0.1 back 2025-09-22 00:40:18 +02:00
Marc Schäfer
1881309148 chore(deps): update golang.org/x/crypto to v0.42.0, golang.org/x/net to v0.44.0, and golang.org/x/sys to v0.36.0 2025-09-22 00:30:33 +02:00
Marc Schäfer
aff928e60f fix(gh-actions): Workflow does not contain permissions 2025-09-22 00:22:42 +02:00
Marc Schäfer
fd6b1ae323 Merge pull request #1 from marcschaeferger/dependabot/go_modules/golang.org/x/net-0.44.0
Bump golang.org/x/net from 0.43.0 to 0.44.0
2025-09-21 21:47:59 +02:00
Marc Schäfer
831ae2d9c5 Merge pull request #2 from marcschaeferger/dependabot/go_modules/golang.org/x/crypto-0.42.0
Bump golang.org/x/crypto from 0.41.0 to 0.42.0
2025-09-21 21:47:45 +02:00
dependabot[bot]
a63a27e3ab Bump golang.org/x/crypto from 0.41.0 to 0.42.0
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.41.0 to 0.42.0.
- [Commits](https://github.com/golang/crypto/compare/v0.41.0...v0.42.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.42.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-21 19:40:14 +00:00
dependabot[bot]
34d558a5a2 Bump golang.org/x/net from 0.43.0 to 0.44.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.43.0 to 0.44.0.
- [Commits](https://github.com/golang/net/compare/v0.43.0...v0.44.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.44.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-21 19:40:11 +00:00
Owen
f6e7bfe8ea Watching socket and quieting some events 2025-09-21 11:32:47 -04:00
Owen
60873f0a4f React to docker events 2025-09-21 11:19:52 -04:00
Owen Schwartz
50bb81981b Merge pull request #132 from fosrl/dependabot/github_actions/actions/setup-go-6
Bump actions/setup-go from 5 to 6
2025-09-20 11:43:42 -04:00
Owen Schwartz
4ced99fa3f Merge pull request #143 from rgutmen/mlts-pkcs12-compatibility
mTLS PKCS#12 compatibility
2025-09-20 11:43:24 -04:00
rgutmen
9bd96ac540 Support TLS_CLIENT_CERT, TLS_CLIENT_KEY and TLS_CA_CERT in Docker Compose 2025-09-20 09:15:58 +01:00
Owen Schwartz
c673743692 Merge pull request #142 from marcschaeferger/main
Add Badges to README.md
2025-09-19 11:55:03 -04:00
Marc Schäfer
a08a3b9665 feat(Docs): Add License Badge and PkgGo Badge 2025-09-19 16:34:44 +02:00
Marc Schäfer
0fc13be413 feat(Docs): Adding GoReport Badge 2025-09-19 16:25:04 +02:00
Owen
92cedd00b3 Quiet up the logs 2025-09-15 10:58:40 -07:00
Owen
8b0cc36554 Add blueprint yaml sending 2025-09-08 15:25:05 -07:00
dependabot[bot]
ba9ca9f097 Bump actions/setup-go from 5 to 6
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5 to 6.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-08 10:14:37 +00:00
Owen
8b4a88937c Merge branch 'main' into dev 2025-09-06 17:38:46 -07:00
Owen Schwartz
58412a7a61 Merge pull request #129 from l3pr-org/main
Implement more privacy-respecting DNS service
2025-09-04 10:39:33 -07:00
Stanley Wisnioski
2675b812aa Update README.md
Updated README.md to reflect change of default DNS server from Google to Quad9.
2025-09-04 10:03:58 -04:00
Stanley Wisnioski
217a9346c6 Change DNS Server in clients.go
Changed DNS server from Google (8.8.8.8) to Quad9 (9.9.9.9)
2025-09-04 10:00:48 -04:00
Stanley Wisnioski
eda8073bce Change DNS Server
Changed DNS server from Google (8.8.8.8) to Quad9 (9.9.9.9)
2025-09-04 09:58:43 -04:00
Owen
2969f9d2d6 Ensure backward compatibility with --docker-socket 2025-09-02 14:08:24 -07:00
Owen
07b7025a24 Ensure backward compatibility with --docker-socket 2025-09-02 13:56:18 -07:00
Owen
502ebfc362 Make sure to call stop function inside of clients 2025-09-01 15:45:23 -07:00
Owen
288413fd15 Limit the number of times the message is re-sent
Fixes #115
2025-09-01 11:53:46 -07:00
Owen
0ba44206b1 Print the body for debug 2025-09-01 11:51:23 -07:00
Owen
3f8dcd8f22 Update docs with enforce-hc-cert 2025-09-01 10:59:54 -07:00
Owen
c5c0143013 Allow health checks to self-signed HTTPS by default
Fixes #122
2025-09-01 10:56:08 -07:00
Owen
87ac5c97e3 Merge branch 'main' of github.com:fosrl/newt 2025-08-30 18:07:22 -07:00
Owen
e2238c3cc8 Merge branch 'Pallavikumarimdb-feat/Split-mTLS-client-and-CA-certificates' 2025-08-30 18:07:07 -07:00
Owen
58a67328d3 Merge branch 'feat/Split-mTLS-client-and-CA-certificates' of github.com:Pallavikumarimdb/newt into Pallavikumarimdb-feat/Split-mTLS-client-and-CA-certificates 2025-08-30 18:06:18 -07:00
Owen Schwartz
002fdc4d3f Merge pull request #97 from Nemental/feat/docker-socket-protocol
feat: docker socket protocol
2025-08-30 16:53:21 -07:00
Owen Schwartz
9a1fa2c19f Merge pull request #117 from fosrl/dependabot/github_actions/docker/setup-buildx-action-3
Bump docker/setup-buildx-action from 2 to 3
2025-08-30 16:52:06 -07:00
Owen Schwartz
a6797172ef Merge pull request #118 from fosrl/dependabot/github_actions/actions/setup-go-5
Bump actions/setup-go from 4 to 5
2025-08-30 16:51:59 -07:00
Owen Schwartz
d373de7fa1 Merge pull request #119 from fosrl/dependabot/github_actions/docker/login-action-3
Bump docker/login-action from 2 to 3
2025-08-30 16:51:52 -07:00
Owen Schwartz
f876bad632 Merge pull request #120 from fosrl/dependabot/github_actions/actions/checkout-5
Bump actions/checkout from 3 to 5
2025-08-30 16:51:45 -07:00
dependabot[bot]
54b096e6a7 Bump actions/checkout from 3 to 5
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-30 22:26:53 +00:00
dependabot[bot]
10720afd31 Bump docker/login-action from 2 to 3
Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-version: '3'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-30 22:26:50 +00:00
dependabot[bot]
0b37f20d5d Bump actions/setup-go from 4 to 5
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4 to 5.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-30 22:26:47 +00:00
dependabot[bot]
aa6e54f383 Bump docker/setup-buildx-action from 2 to 3
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '3'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-30 22:26:43 +00:00
Owen Schwartz
30f8eb9785 Merge pull request #116 from Lokowitz/update-version
Update version
2025-08-30 15:26:09 -07:00
Marvin
e765d9c774 Update go.mod 2025-08-28 17:34:34 +02:00
Marvin
3ae4ac23ef Update test.yml 2025-08-28 17:33:59 +02:00
Marvin
6a98b90b01 Update cicd.yml 2025-08-28 17:33:39 +02:00
Marvin
e0ce9d4e48 Update dependabot.yml 2025-08-28 17:33:04 +02:00
Marvin
5914c9ed33 Update .go-version 2025-08-28 17:32:27 +02:00
Owen Schwartz
109bda961f Merge pull request #103 from fosrl/dependabot/go_modules/prod-minor-updates-50897cc7ef
Bump the prod-minor-updates group with 2 updates
2025-08-27 11:02:27 -07:00
Owen Schwartz
c2a93134b1 Merge pull request #106 from fosrl/dependabot/docker/minor-updates-887f07f54c
Bump golang from 1.24-alpine to 1.25-alpine in the minor-updates group
2025-08-27 11:02:16 -07:00
Owen Schwartz
100d8e6afe Merge pull request #114 from firecat53/1.4.2
Update version to 1.4.2
2025-08-27 11:01:18 -07:00
Scott Hansen
04f2048a0a Update flake.nix to 1.4.2 2025-08-27 10:58:00 -07:00
dependabot[bot]
04de5ef8ba Bump the prod-minor-updates group with 2 updates
Bumps the prod-minor-updates group with 2 updates: [golang.org/x/crypto](https://github.com/golang/crypto) and [golang.org/x/net](https://github.com/golang/net).


Updates `golang.org/x/crypto` from 0.40.0 to 0.41.0
- [Commits](https://github.com/golang/crypto/compare/v0.40.0...v0.41.0)

Updates `golang.org/x/net` from 0.42.0 to 0.43.0
- [Commits](https://github.com/golang/net/compare/v0.42.0...v0.43.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
- dependency-name: golang.org/x/net
  dependency-version: 0.43.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-25 11:38:15 +00:00
dependabot[bot]
e77601cccc Bump golang from 1.24-alpine to 1.25-alpine in the minor-updates group
Bumps the minor-updates group with 1 update: golang.


Updates `golang` from 1.24-alpine to 1.25-alpine

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.25-alpine
  dependency-type: direct:production
  dependency-group: minor-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-25 09:47:03 +00:00
Owen
e9752f868e Merge branch 'main' into dev 2025-08-23 12:17:58 -07:00
Owen Schwartz
866afaf749 Merge pull request #108 from firecat53/main
Bugfix for #107. Only update main.go
2025-08-22 21:42:36 -07:00
Owen
a12ae17a66 Add note about config 2025-08-22 21:34:47 -07:00
Owen
e0cba2e5c6 Merge branch 'site-targets' into dev 2025-08-19 10:57:25 -07:00
Scott Hansen
79f3db6fb6 Bugfix for #107. Only update main.go 2025-08-16 15:25:23 -07:00
Owen Schwartz
009b4cf425 Merge pull request #107 from firecat53/main
Update version to 1.4.1 and update version_replaceme when using nix build
2025-08-15 09:40:40 -07:00
Scott Hansen
9c28d75155 Update version to 1.4.1 and update version_replaceme when using nix build 2025-08-14 11:47:40 -07:00
Owen
bad244d0ea Merge branch 'main' into dev 2025-08-13 14:56:02 -07:00
Owen
d013dc0543 Adjust logging 2025-08-13 14:18:47 -07:00
Owen
0047b54e94 Don't override ENV
Fixes #101
2025-08-12 20:44:34 -07:00
Owen
f0c8d2c7c7 Change permissions to 0600
Fixes #104
2025-08-11 08:15:36 -07:00
Owen
28b6865f73 Healthcheck working 2025-08-11 08:14:29 -07:00
Pallavi
d52f89f629 Split mTLS client and CA certificates 2025-08-05 01:08:29 +05:30
Owen
289cce3a22 Add health checks 2025-08-03 18:43:43 -07:00
Owen
e8612c7e6b Handle adding and removing healthchecks 2025-08-03 17:02:15 -07:00
Owen
6820f8d23e Add basic healthchecks 2025-08-03 16:12:00 -07:00
Owen
151d0e38e6 Stop sending requests when you get a terminate 2025-08-03 14:47:36 -07:00
Nemental
a9d8ec0b1e docs: update docker socket part 2025-07-30 15:28:55 +02:00
Nemental
e9dbfb239b fix: remove hardcoded protocol from socket path 2025-07-30 09:36:53 +02:00
Nemental
a79dccc0e4 feat: check socket protocol support 2025-07-30 09:36:19 +02:00
Nemental
42dfb6b3d8 feat: add type and function for docker endpoint parsing 2025-07-30 09:31:41 +02:00
Owen Schwartz
3ccd755d55 Merge pull request #95 from fosrl/dependabot/go_modules/prod-patch-updates-e08645070f
Bump github.com/docker/docker from 28.3.2+incompatible to 28.3.3+incompatible in the prod-patch-updates group
2025-07-29 23:24:19 -07:00
Owen Schwartz
a0f0b674e8 Merge pull request #96 from firecat53/main
Update flake.nix to 1.4.0
2025-07-29 23:24:03 -07:00
Owen
9e675121d3 Don't reset DNS 2025-07-29 22:42:54 -07:00
Owen
45d17da570 Fix the bind problem by just recreating the dev
TODO: WHY CANT WE REBIND TO A PORT - WE NEED TO FIX THIS BETTER
2025-07-29 20:58:48 -07:00
Owen
dfba35f8bb Use the tunnel ip 2025-07-29 16:31:42 -07:00
Scott Hansen
9e73aab21d Update flake.nix to 1.4.0 2025-07-29 14:14:42 -07:00
dependabot[bot]
e1ddad006a Bump github.com/docker/docker in the prod-patch-updates group
Bumps the prod-patch-updates group with 1 update: [github.com/docker/docker](https://github.com/docker/docker).


Updates `github.com/docker/docker` from 28.3.2+incompatible to 28.3.3+incompatible
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.3.2...v28.3.3)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-version: 28.3.3+incompatible
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-patch-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-29 16:39:02 +00:00
Owen
29567d6e0b Dont print private key 2025-07-28 20:07:13 -07:00
Owen
47321ea9ad Update readme: env 2025-07-28 12:34:38 -07:00
Owen
abfc9d8efc Update readme: cli 2025-07-28 12:12:40 -07:00
Owen
c6929621e7 Merge branch 'main' into dev 2025-07-28 12:02:22 -07:00
Owen
46993203a3 Update readme 2025-07-28 12:02:10 -07:00
Owen
8306084354 SSH not ready 2025-07-28 12:02:09 -07:00
Owen
02c1e2b7d0 Compute kind of works now!? 2025-07-28 12:02:09 -07:00
Owen
ae7e2a1055 Clean up operation 2025-07-28 12:02:09 -07:00
Owen Schwartz
88f1335cff Merge pull request #93 from Lokowitz/sync-go-version
Sync go version
2025-07-28 11:59:10 -07:00
Owen
8bf9c9795b Netstack working 2025-07-27 10:25:34 -07:00
Marvin
5d343cd420 modified: go.mod
modified:   go.sum
2025-07-26 13:25:52 +00:00
Marvin
d1473b7e22 Update go.mod 2025-07-26 10:32:20 +02:00
Marvin
2efbd7dd6a Update Dockerfile 2025-07-26 10:31:53 +02:00
Marvin
82a3a39a1f Update .go-version 2025-07-26 10:31:35 +02:00
Marvin
df09193834 Update cicd.yml 2025-07-26 10:31:20 +02:00
Marvin
b2fe4e3b03 Update test.yml 2025-07-26 10:31:05 +02:00
Owen
e14d53087f Starting to work on option 2025-07-25 16:16:33 -07:00
Owen
3583270f73 Adding option for netstack 2025-07-25 16:16:00 -07:00
Owen
f5be05c55a Add flag 2025-07-25 16:14:25 -07:00
Owen
d09e3fbd60 Proxies working 2025-07-25 16:10:53 -07:00
Owen
493831b5f0 Pm working 2025-07-25 13:09:11 -07:00
Owen
9fc692c090 Proxy working? 2025-07-25 12:00:09 -07:00
Owen
ccb7008579 Just hp like olm 2025-07-25 11:42:36 -07:00
Owen
f17dbe1fef Use normal udp 2025-07-25 11:05:24 -07:00
Owen
27561f52ca Dont restart netstack 2025-07-25 11:01:54 -07:00
Owen
499ebcd928 Maybe its working? 2025-07-25 10:59:34 -07:00
Owen
40dfab31a5 Maybe basic func 2025-07-25 10:50:02 -07:00
Owen
56377ec87e Exit well 2025-07-24 20:46:33 -07:00
Owen
008be54c55 Add get config 2025-07-24 12:40:14 -07:00
Owen
64c22a94a4 Log to file optionally and update config locations 2025-07-24 12:01:53 -07:00
Owen Schwartz
468c93c581 Merge pull request #91 from fosrl/dependabot/go_modules/prod-minor-updates-17f8beca3b
Bump software.sslmate.com/src/go-pkcs12 from 0.5.0 to 0.6.0 in the prod-minor-updates group
2025-07-23 11:26:32 -07:00
Owen Schwartz
c53b859cda Merge pull request #92 from nepthar/patch-1
Nit: Typo fix in help string
2025-07-23 11:26:15 -07:00
Jordan Parker
6cd824baf2 Nit: Typo fix in help string 2025-07-23 10:25:11 -04:00
dependabot[bot]
d8c5182acd Bump software.sslmate.com/src/go-pkcs12 in the prod-minor-updates group
Bumps the prod-minor-updates group with 1 update: software.sslmate.com/src/go-pkcs12.


Updates `software.sslmate.com/src/go-pkcs12` from 0.5.0 to 0.6.0

---
updated-dependencies:
- dependency-name: software.sslmate.com/src/go-pkcs12
  dependency-version: 0.6.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: prod-minor-updates
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-23 10:01:13 +00:00
Owen
c8c4666d63 Change rm to keep 2025-07-22 11:36:31 -07:00
Owen
f1fcc13e66 Holepunch to the right endpoint 2025-07-21 17:04:05 -07:00
Owen Schwartz
52bbc2fe31 Merge pull request #90 from firecat53/main
Update flake.nix to 1.3.4
2025-07-21 14:54:22 -07:00
Scott Hansen
b5ee12f84a Update flake.nix for 1.3.4 2025-07-21 11:23:50 -07:00
Owen
510e78437c Add client type 2025-07-18 16:55:38 -07:00
Owen
e14cffce1c Merge branch 'main' into clients-fr 2025-07-18 16:53:27 -07:00
Owen
629a92ee81 Make client work for olm 2025-07-18 16:53:13 -07:00
Owen
56df75544d Adjust logging 2025-07-18 16:52:59 -07:00
Owen
5b2e743470 Remove defers causing bad file descriptor issues 2025-07-18 15:49:57 -07:00
Owen
b5025c142f Working on it 2025-07-18 15:25:39 -07:00
66 changed files with 10188 additions and 617 deletions

.env.example (new file, 5 lines)

@@ -0,0 +1,5 @@
# Copy this file to .env and fill in your values
# Required for connecting to Pangolin service
PANGOLIN_ENDPOINT=https://example.com
NEWT_ID=changeme-id
NEWT_SECRET=changeme-secret


@@ -0,0 +1,47 @@
body:
- type: textarea
attributes:
label: Summary
description: A clear and concise summary of the requested feature.
validations:
required: true
- type: textarea
attributes:
label: Motivation
description: |
Why is this feature important?
Explain the problem this feature would solve or what use case it would enable.
validations:
required: true
- type: textarea
attributes:
label: Proposed Solution
description: |
How would you like to see this feature implemented?
Provide as much detail as possible about the desired behavior, configuration, or changes.
validations:
required: true
- type: textarea
attributes:
label: Alternatives Considered
description: Describe any alternative solutions or workarounds you've thought about.
validations:
required: false
- type: textarea
attributes:
label: Additional Context
description: Add any other context, mockups, or screenshots about the feature request here.
validations:
required: false
- type: markdown
attributes:
value: |
Before submitting, please:
- Check if there is an existing issue for this feature.
- Clearly explain the benefit and use case.
- Be as specific as possible to help contributors evaluate and implement.

.github/ISSUE_TEMPLATE/1.bug_report.yml (new file, 51 lines)

@@ -0,0 +1,51 @@
name: Bug Report
description: Create a bug report
labels: []
body:
- type: textarea
attributes:
label: Describe the Bug
description: A clear and concise description of what the bug is.
validations:
required: true
- type: textarea
attributes:
label: Environment
description: Please fill out the relevant details below for your environment.
value: |
- OS Type & Version: (e.g., Ubuntu 22.04)
- Pangolin Version:
- Gerbil Version:
- Traefik Version:
- Newt Version:
- Olm Version: (if applicable)
validations:
required: true
- type: textarea
attributes:
label: To Reproduce
description: |
Steps to reproduce the behavior, please provide a clear description of how to reproduce the issue, based on the linked minimal reproduction. Screenshots can be provided in the issue body below.
If using code blocks, make sure syntax highlighting is correct and double-check that the rendered preview is not broken.
validations:
required: true
- type: textarea
attributes:
label: Expected Behavior
description: A clear and concise description of what you expected to happen.
validations:
required: true
- type: markdown
attributes:
value: |
Before posting the issue, go through the steps you've written down to make sure they are detailed and clear.
- type: markdown
attributes:
value: |
Contributors should be able to follow the steps provided in order to reproduce the bug.

.github/ISSUE_TEMPLATE/config.yml (new file, 8 lines)

@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
- name: Need help or have questions?
url: https://github.com/orgs/fosrl/discussions
about: Ask questions, get help, and discuss with other community members
- name: Request a Feature
url: https://github.com/orgs/fosrl/discussions/new?category=feature-requests
about: Feature requests should be opened as discussions so others can upvote and comment


@@ -33,3 +33,8 @@ updates:
minor-updates:
update-types:
- "minor"
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"


@@ -1,61 +1,160 @@
name: CI/CD Pipeline
# CI/CD workflow for building, publishing, mirroring, signing container images and building release binaries.
# Actions are pinned to specific SHAs to reduce supply-chain risk. This workflow triggers on tag push events.
permissions:
contents: read
packages: write # for GHCR push
id-token: write # for Cosign Keyless (OIDC) Signing
# Required secrets:
# - DOCKER_HUB_USERNAME / DOCKER_HUB_ACCESS_TOKEN: push to Docker Hub
# - GITHUB_TOKEN: used for GHCR login and OIDC keyless signing
# - COSIGN_PRIVATE_KEY / COSIGN_PASSWORD / COSIGN_PUBLIC_KEY: for key-based signing
on:
push:
tags:
- "*"
push:
tags:
- "*"
concurrency:
group: ${{ github.ref }}
cancel-in-progress: true
jobs:
release:
name: Build and Release
runs-on: ubuntu-latest
release:
name: Build and Release
runs-on: ubuntu-latest
# Job-level timeout to avoid runaway or stuck runs
timeout-minutes: 120
env:
# Target images
DOCKERHUB_IMAGE: docker.io/${{ secrets.DOCKER_HUB_USERNAME }}/${{ github.event.repository.name }}
GHCR_IMAGE: ghcr.io/${{ github.repository_owner }}/${{ github.event.repository.name }}
steps:
- name: Checkout code
uses: actions/checkout@v3
steps:
- name: Checkout code
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Log in to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKER_HUB_USERNAME }}
password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
- name: Log in to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
with:
registry: docker.io
username: ${{ secrets.DOCKER_HUB_USERNAME }}
password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
- name: Extract tag name
id: get-tag
run: echo "TAG=${GITHUB_REF#refs/tags/}" >> $GITHUB_ENV
- name: Extract tag name
id: get-tag
run: echo "TAG=${GITHUB_REF#refs/tags/}" >> $GITHUB_ENV
shell: bash
- name: Install Go
uses: actions/setup-go@v4
with:
go-version: 1.23.1
- name: Install Go
uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
with:
go-version: 1.25
- name: Update version in main.go
run: |
TAG=${{ env.TAG }}
if [ -f main.go ]; then
sed -i 's/version_replaceme/'"$TAG"'/' main.go
echo "Updated main.go with version $TAG"
else
echo "main.go not found"
fi
- name: Update version in main.go
run: |
TAG=${{ env.TAG }}
if [ -f main.go ]; then
sed -i 's/version_replaceme/'"$TAG"'/' main.go
echo "Updated main.go with version $TAG"
else
echo "main.go not found"
fi
shell: bash
- name: Build and push Docker images
run: |
TAG=${{ env.TAG }}
make docker-build-release tag=$TAG
- name: Build and push Docker images (Docker Hub)
run: |
TAG=${{ env.TAG }}
make docker-build-release tag=$TAG
echo "Built & pushed to: ${{ env.DOCKERHUB_IMAGE }}:${TAG}"
shell: bash
- name: Build binaries
run: |
make go-build-release
- name: Log in to GHCR
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Upload artifacts from /bin
uses: actions/upload-artifact@v4
with:
name: binaries
path: bin/
- name: Install skopeo + jq
# skopeo: copy/inspect images between registries
# jq: JSON parsing tool used to extract digest values
run: |
sudo apt-get update -y
sudo apt-get install -y skopeo jq
skopeo --version
shell: bash
- name: Copy tag from Docker Hub to GHCR
# Mirror the already-built image (all architectures) to GHCR so we can sign it
run: |
set -euo pipefail
TAG=${{ env.TAG }}
echo "Copying ${{ env.DOCKERHUB_IMAGE }}:${TAG} -> ${{ env.GHCR_IMAGE }}:${TAG}"
skopeo copy --all --retry-times 3 \
docker://$DOCKERHUB_IMAGE:$TAG \
docker://$GHCR_IMAGE:$TAG
shell: bash
- name: Install cosign
# cosign is used to sign and verify container images (key and keyless)
uses: sigstore/cosign-installer@faadad0cce49287aee09b3a48701e75088a2c6ad # v4.0.0
- name: Dual-sign and verify (GHCR & Docker Hub)
# Sign each image by digest using keyless (OIDC) and key-based signing,
# then verify both the public key signature and the keyless OIDC signature.
env:
TAG: ${{ env.TAG }}
COSIGN_PRIVATE_KEY: ${{ secrets.COSIGN_PRIVATE_KEY }}
COSIGN_PASSWORD: ${{ secrets.COSIGN_PASSWORD }}
COSIGN_PUBLIC_KEY: ${{ secrets.COSIGN_PUBLIC_KEY }}
COSIGN_YES: "true"
run: |
set -euo pipefail
issuer="https://token.actions.githubusercontent.com"
id_regex="^https://github.com/${{ github.repository }}/.+" # accept this repo (all workflows/refs)
for IMAGE in "${GHCR_IMAGE}" "${DOCKERHUB_IMAGE}"; do
echo "Processing ${IMAGE}:${TAG}"
DIGEST="$(skopeo inspect --retry-times 3 docker://${IMAGE}:${TAG} | jq -r '.Digest')"
REF="${IMAGE}@${DIGEST}"
echo "Resolved digest: ${REF}"
echo "==> cosign sign (keyless) --recursive ${REF}"
cosign sign --recursive "${REF}"
echo "==> cosign sign (key) --recursive ${REF}"
cosign sign --key env://COSIGN_PRIVATE_KEY --recursive "${REF}"
echo "==> cosign verify (public key) ${REF}"
cosign verify --key env://COSIGN_PUBLIC_KEY "${REF}" -o text
echo "==> cosign verify (keyless policy) ${REF}"
cosign verify \
--certificate-oidc-issuer "${issuer}" \
--certificate-identity-regexp "${id_regex}" \
"${REF}" -o text
done
shell: bash
- name: Build binaries
run: |
make go-build-release
shell: bash
- name: Upload artifacts from /bin
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: binaries
path: bin/
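The pipeline above verifies its own signatures in CI; downstream users can run the same checks before deploying an image. A minimal sketch of a consumer-side verification step (hypothetical: `OWNER` and `TAG` are placeholders, and `cosign.pub` is assumed to be the maintainer's published public key):

```yaml
- name: Verify image signatures (sketch)
  run: |
    # Key-based verification (requires the maintainer's published cosign.pub)
    cosign verify --key cosign.pub "ghcr.io/OWNER/newt:TAG" -o text
    # Keyless (OIDC) verification against the source repository identity
    cosign verify \
      --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
      --certificate-identity-regexp "^https://github.com/OWNER/newt/.+" \
      "ghcr.io/OWNER/newt:TAG" -o text
  shell: bash
```

Verifying by tag resolves to the current digest at pull time; pinning the digest explicitly (as the signing step does) is stricter.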

.github/workflows/mirror.yaml (new file, 139 lines)

@@ -0,0 +1,139 @@
name: Mirror & Sign (Docker Hub to GHCR)
on:
workflow_dispatch: {}
permissions:
contents: read
packages: write
id-token: write # for keyless OIDC
env:
SOURCE_IMAGE: docker.io/fosrl/newt
DEST_IMAGE: ghcr.io/${{ github.repository_owner }}/${{ github.event.repository.name }}
jobs:
mirror-and-dual-sign:
runs-on: ubuntu-latest
steps:
- name: Install skopeo + jq
run: |
sudo apt-get update -y
sudo apt-get install -y skopeo jq
skopeo --version
- name: Install cosign
uses: sigstore/cosign-installer@faadad0cce49287aee09b3a48701e75088a2c6ad # v4.0.0
- name: Input check
run: |
test -n "${SOURCE_IMAGE}" || (echo "SOURCE_IMAGE is empty" && exit 1)
echo "Source : ${SOURCE_IMAGE}"
echo "Target : ${DEST_IMAGE}"
# Auth for skopeo (containers-auth)
- name: Skopeo login to GHCR
run: |
skopeo login ghcr.io -u "${{ github.actor }}" -p "${{ secrets.GITHUB_TOKEN }}"
# >>> IMPORTANT: Auth for cosign (docker-config) <<<
- name: Docker login to GHCR (for cosign)
run: |
echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
# Optional (if Docker Hub private / tight limits)
# - name: Login to Docker Hub (skopeo and cosign share this via docker login)
# run: |
# echo "${{ secrets.DOCKERHUB_TOKEN }}" | docker login docker.io -u "${{ secrets.DOCKERHUB_USERNAME }}" --password-stdin
# skopeo login docker.io -u "${{ secrets.DOCKERHUB_USERNAME }}" -p "${{ secrets.DOCKERHUB_TOKEN }}"
- name: List source tags
run: |
set -euo pipefail
skopeo list-tags --retry-times 3 docker://"${SOURCE_IMAGE}" \
| jq -r '.Tags[]' | sort -u > src-tags.txt
echo "Found source tags: $(wc -l < src-tags.txt)"
head -n 20 src-tags.txt || true
- name: List destination tags (skip existing)
run: |
set -euo pipefail
if skopeo list-tags --retry-times 3 docker://"${DEST_IMAGE}" >/tmp/dst.json 2>/dev/null; then
jq -r '.Tags[]' /tmp/dst.json | sort -u > dst-tags.txt
else
: > dst-tags.txt
fi
echo "Existing destination tags: $(wc -l < dst-tags.txt)"
- name: Mirror, dual-sign, and verify
env:
# keyless
COSIGN_YES: "true"
# key-based
COSIGN_PRIVATE_KEY: ${{ secrets.COSIGN_PRIVATE_KEY }}
COSIGN_PASSWORD: ${{ secrets.COSIGN_PASSWORD }}
# verify
COSIGN_PUBLIC_KEY: ${{ secrets.COSIGN_PUBLIC_KEY }}
run: |
set -euo pipefail
copied=0; skipped=0; v_ok=0; errs=0
issuer="https://token.actions.githubusercontent.com"
id_regex="^https://github.com/${{ github.repository }}/.+"
while read -r tag; do
[ -z "$tag" ] && continue
if grep -Fxq "$tag" dst-tags.txt; then
echo "::notice ::Skip (exists) ${DEST_IMAGE}:${tag}"
skipped=$((skipped+1))
continue
fi
echo "==> Copy ${SOURCE_IMAGE}:${tag} → ${DEST_IMAGE}:${tag}"
if ! skopeo copy --all --retry-times 3 \
docker://"${SOURCE_IMAGE}:${tag}" docker://"${DEST_IMAGE}:${tag}"; then
echo "::warning title=Copy failed::${SOURCE_IMAGE}:${tag}"
errs=$((errs+1)); continue
fi
copied=$((copied+1))
digest="$(skopeo inspect --retry-times 3 docker://"${DEST_IMAGE}:${tag}" | jq -r '.Digest')"
ref="${DEST_IMAGE}@${digest}"
echo "==> cosign sign (keyless) --recursive ${ref}"
if ! cosign sign --recursive "${ref}"; then
echo "::warning title=Keyless sign failed::${ref}"
errs=$((errs+1))
fi
echo "==> cosign sign (key) --recursive ${ref}"
if ! cosign sign --key env://COSIGN_PRIVATE_KEY --recursive "${ref}"; then
echo "::warning title=Key sign failed::${ref}"
errs=$((errs+1))
fi
echo "==> cosign verify (public key) ${ref}"
if ! cosign verify --key env://COSIGN_PUBLIC_KEY "${ref}" -o text; then
echo "::warning title=Verify(pubkey) failed::${ref}"
errs=$((errs+1))
fi
echo "==> cosign verify (keyless policy) ${ref}"
if ! cosign verify \
--certificate-oidc-issuer "${issuer}" \
--certificate-identity-regexp "${id_regex}" \
"${ref}" -o text; then
echo "::warning title=Verify(keyless) failed::${ref}"
errs=$((errs+1))
else
v_ok=$((v_ok+1))
fi
done < src-tags.txt
echo "---- Summary ----"
echo "Copied : $copied"
echo "Skipped : $skipped"
echo "Verified OK : $v_ok"
echo "Errors : $errs"


@@ -1,5 +1,8 @@
name: Run Tests
permissions:
contents: read
on:
pull_request:
branches:
@@ -11,12 +14,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Set up Go
uses: actions/setup-go@v4
uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
with:
go-version: '1.23'
go-version: 1.25
- name: Build go
run: go build

.gitignore (1 line changed)

@@ -6,3 +6,4 @@ nohup.out
*.iml
certs/
newt_arm64
.env


@@ -1 +1 @@
1.23.2
1.25


@@ -1,4 +1,9 @@
FROM golang:1.24.5-alpine AS builder
#ghcr.io/marcschaeferger/newt-private:1.0.0-otel
#tademsh/newt:1.0.0-otel
FROM golang:1.25-alpine AS builder
# Install git and ca-certificates
RUN apk --no-cache add ca-certificates git tzdata
# Set the working directory inside the container
WORKDIR /app
@@ -6,14 +11,19 @@ WORKDIR /app
# Copy go mod and sum files
COPY go.mod go.sum ./
# Coolify specific Test - set Go proxy to direct to avoid issues
# ENV GOSUMDB=off
ENV GOPROXY=https://goproxy.io,https://proxy.golang.org,direct
RUN go env | grep -E 'GOPROXY|GOSUMDB|GOPRIVATE' && go mod download
# Download all dependencies
RUN go mod download
#RUN go mod download
# Copy the source code into the container
COPY . .
# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -o /newt
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /newt
FROM alpine:3.22 AS runner
@@ -22,6 +32,9 @@ RUN apk --no-cache add ca-certificates tzdata
COPY --from=builder /newt /usr/local/bin/
COPY entrypoint.sh /
# Admin/metrics endpoint (Prometheus scrape)
EXPOSE 2112
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["newt"]
CMD ["newt"]

README.md (342 lines changed)

@@ -1,18 +1,24 @@
<!-- markdownlint-disable MD033 -->
# Newt
[![PkgGoDev](https://pkg.go.dev/badge/github.com/fosrl/newt)](https://pkg.go.dev/github.com/fosrl/newt)
[![GitHub License](https://img.shields.io/github/license/fosrl/newt)](https://github.com/fosrl/newt/blob/main/LICENSE)
[![Go Report Card](https://goreportcard.com/badge/github.com/fosrl/newt)](https://goreportcard.com/report/github.com/fosrl/newt)
Newt is a fully user space [WireGuard](https://www.wireguard.com/) tunnel client and TCP/UDP proxy, designed to securely expose private resources controlled by Pangolin. By using Newt, you don't need to manage complex WireGuard tunnels and NATing.
### Installation and Documentation
## Installation and Documentation
Newt is used with Pangolin and Gerbil as part of the larger system. See documentation below:
- [Full Documentation](https://docs.fossorial.io)
- [Full Documentation](https://docs.fossorial.io)
- Observability Quickstart: see `docs/observability.md` — canonical Prometheus/OTel Collector quickstart and smoke tests
## Preview
<img src="public/screenshots/preview.png" alt="Preview"/>
_Sample output of a Newt container connected to Pangolin and hosting various resource target proxies._
_Sample output of a Newt connected to Pangolin and hosting various resource target proxies._
## Key Functions
@@ -22,7 +28,7 @@ Using the Newt ID and a secret, the client will make HTTP requests to Pangolin t
### Receives WireGuard Control Messages
When Newt receives WireGuard control messages, it will use the information encoded (endpoint, public key) to bring up a WireGuard tunnel using [netstack](https://github.com/WireGuard/wireguard-go/blob/master/tun/netstack/examples/http_server.go) fully in user space. It will ping over the tunnel to ensure the peer on the Gerbil side is brought up.
When Newt receives WireGuard control messages, it will use the information encoded (endpoint, public key) to bring up a WireGuard tunnel using [netstack](https://github.com/WireGuard/wireguard-go/blob/master/tun/netstack/examples/http_server.go) fully in user space. It will ping over the tunnel to ensure the peer on the Gerbil side is brought up.
### Receives Proxy Control Messages
@@ -30,22 +36,94 @@ When Newt receives WireGuard control messages, it will use the information encod
## CLI Args
- `endpoint`: The endpoint where both Gerbil and Pangolin reside in order to connect to the websocket.
- `id`: Newt ID generated by Pangolin to identify the client.
- `secret`: A unique secret (not shared and kept private) used to authenticate the client ID with the websocket in order to receive commands.
- `mtu`: MTU for the internal WG interface. Default: 1280
- `dns`: DNS server to use to resolve the endpoint
- `log-level` (optional): The log level to use. Default: INFO
- `updown` (optional): A script to be called when targets are added or removed.
- `tls-client-cert` (optional): Client certificate (p12 or pfx) for mTLS. See [mTLS](#mtls)
- `docker-socket` (optional): Set the Docker socket to use the container discovery integration
- `docker-enforce-network-validation` (optional): Validate the container target is on the same network as the newt process
- `health-file` (optional): Check if connection to WG server (pangolin) is ok. creates a file if ok, removes it if not ok. Can be used with docker healtcheck to restart newt
- `id`: Newt ID generated by Pangolin to identify the client.
- `secret`: A unique secret (not shared and kept private) used to authenticate the client ID with the websocket in order to receive commands.
- `endpoint`: The endpoint where both Gerbil and Pangolin reside in order to connect to the websocket.
- `mtu` (optional): MTU for the internal WG interface. Default: 1280
- `dns` (optional): DNS server to use to resolve the endpoint. Default: 9.9.9.9
- `log-level` (optional): The log level to use (DEBUG, INFO, WARN, ERROR, FATAL). Default: INFO
- `enforce-hc-cert` (optional): Enforce certificate validation for health checks. Default: false (accepts any cert)
- `docker-socket` (optional): Set the Docker socket to use the container discovery integration
- `ping-interval` (optional): Interval for pinging the server. Default: 3s
- `ping-timeout` (optional): Timeout for each ping. Default: 5s
- `updown` (optional): A script to be called when targets are added or removed.
- `tls-client-cert` (optional): Client certificate (p12 or pfx) for mTLS. See [mTLS](#mtls)
- `tls-client-cert` (optional): Path to client certificate (PEM format, optional if using PKCS12). See [mTLS](#mtls)
- `tls-client-key` (optional): Path to private key for mTLS (PEM format, optional if using PKCS12)
- `tls-ca-cert` (optional): Path to CA certificate to verify server (PEM format, optional if using PKCS12)
- `docker-enforce-network-validation` (optional): Validate the container target is on the same network as the newt process. Default: false
- `health-file` (optional): Checks whether the connection to the WG server (Pangolin) is OK; creates a file when healthy and removes it when not. Can be used with a Docker healthcheck to restart Newt.
- `accept-clients` (optional): Enable WireGuard server mode to accept incoming newt client connections. Default: false
- `generateAndSaveKeyTo` (optional): Path to save generated private key
- `native` (optional): Use native WireGuard interface when accepting clients (requires WireGuard kernel module and Linux, must run as root). Default: false (uses userspace netstack)
- `interface` (optional): Name of the WireGuard interface. Default: newt
- `keep-interface` (optional): Keep the WireGuard interface. Default: false
- `blueprint-file` (optional): Path to blueprint file to define Pangolin resources and configurations.
- `no-cloud` (optional): Don't fail over to the cloud when using managed nodes in Pangolin Cloud. Default: false
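The `health-file` flag pairs naturally with a container healthcheck. A minimal compose sketch (values are illustrative; the checked path must match `HEALTH_FILE`, and note that plain compose only marks the container unhealthy, so actually restarting on failure typically needs an external supervisor such as autoheal):

```yaml
services:
  newt:
    image: fosrl/newt
    restart: unless-stopped
    environment:
      - PANGOLIN_ENDPOINT=https://example.com
      - NEWT_ID=changeme-id
      - NEWT_SECRET=changeme-secret
      - HEALTH_FILE=/tmp/healthy
    healthcheck:
      # Healthy while Newt keeps the file present
      test: ["CMD", "test", "-f", "/tmp/healthy"]
      interval: 30s
      timeout: 5s
      retries: 3
```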
## Environment Variables
All CLI arguments can be set using environment variables as an alternative to command line flags. Environment variables are particularly useful when running Newt in containerized environments.
- `PANGOLIN_ENDPOINT`: Endpoint of your pangolin server (equivalent to `--endpoint`)
- `NEWT_ID`: Newt ID generated by Pangolin (equivalent to `--id`)
- `NEWT_SECRET`: Newt secret for authentication (equivalent to `--secret`)
- `MTU`: MTU for the internal WG interface. Default: 1280 (equivalent to `--mtu`)
- `DNS`: DNS server to use to resolve the endpoint. Default: 9.9.9.9 (equivalent to `--dns`)
- `LOG_LEVEL`: Log level (DEBUG, INFO, WARN, ERROR, FATAL). Default: INFO (equivalent to `--log-level`)
- `DOCKER_SOCKET`: Path to Docker socket for container discovery (equivalent to `--docker-socket`)
- `PING_INTERVAL`: Interval for pinging the server. Default: 3s (equivalent to `--ping-interval`)
- `PING_TIMEOUT`: Timeout for each ping. Default: 5s (equivalent to `--ping-timeout`)
- `UPDOWN_SCRIPT`: Path to updown script for target add/remove events (equivalent to `--updown`)
- `TLS_CLIENT_CERT`: Path to client certificate for mTLS (equivalent to `--tls-client-cert`)
- `TLS_CLIENT_CERT`: Path to client certificate for mTLS (equivalent to `--tls-client-cert`)
- `TLS_CLIENT_KEY`: Path to private key for mTLS (equivalent to `--tls-client-key`)
- `TLS_CA_CERT`: Path to CA certificate to verify server (equivalent to `--tls-ca-cert`)
- `DOCKER_ENFORCE_NETWORK_VALIDATION`: Validate container targets are on same network. Default: false (equivalent to `--docker-enforce-network-validation`)
- `ENFORCE_HC_CERT`: Enforce certificate validation for health checks. Default: false (equivalent to `--enforce-hc-cert`)
- `HEALTH_FILE`: Path to health file for connection monitoring (equivalent to `--health-file`)
- `ACCEPT_CLIENTS`: Enable WireGuard server mode. Default: false (equivalent to `--accept-clients`)
- `GENERATE_AND_SAVE_KEY_TO`: Path to save generated private key (equivalent to `--generateAndSaveKeyTo`)
- `USE_NATIVE_INTERFACE`: Use native WireGuard interface (Linux only). Default: false (equivalent to `--native`)
- `INTERFACE`: Name of the WireGuard interface. Default: newt (equivalent to `--interface`)
- `KEEP_INTERFACE`: Keep the WireGuard interface after shutdown. Default: false (equivalent to `--keep-interface`)
- `CONFIG_FILE`: Load the config json from this file instead of in the home folder.
- `BLUEPRINT_FILE`: Path to blueprint file to define Pangolin resources and configurations. (equivalent to `--blueprint-file`)
- `NO_CLOUD`: Don't fail over to the cloud when using managed nodes in Pangolin Cloud. Default: false (equivalent to `--no-cloud`)
## Loading secrets from files
You can use `CONFIG_FILE` to define a location of a config file to store the credentials between runs.
```sh
$ cat ~/.config/newt-client/config.json
{
"id": "spmzu8rbpzj1qq6",
"secret": "f6v61mjutwme2kkydbw3fjo227zl60a2tsf5psw9r25hgae3",
"endpoint": "https://pangolin.fossorial.io",
"tlsClientCert": ""
}
```
Newt also writes to this file on first startup, so after one successful run you no longer need to pass `--id` and `--secret` every time.
Default locations:
- **macOS**: `~/Library/Application Support/newt-client/config.json`
- **Windows**: `%PROGRAMDATA%\newt\newt-client\config.json`
- **Linux/Others**: `~/.config/newt-client/config.json`
<!-- Observability Quickstart moved to docs/observability.md (canonical). -->
## Examples
**Note**: When both environment variables and CLI arguments are provided, CLI arguments take precedence.
- Example:
```bash
./newt \
newt \
--id 31frd0uzbjvp721 \
--secret h51mmlknrvrwv8s4r1i210azhumt6isgbpyavxodibx1k2d6 \
--endpoint https://example.com
@@ -55,58 +133,164 @@ You can also run it with Docker compose. For example, a service in your `docker-
```yaml
services:
newt:
image: fosrl/newt
container_name: newt
restart: unless-stopped
environment:
- PANGOLIN_ENDPOINT=https://example.com
- NEWT_ID=2ix2t8xk22ubpfy
- NEWT_SECRET=nnisrfsdfc7prqsp9ewo1dvtvci50j5uiqotez00dgap0ii2
- HEALTH_FILE=/tmp/healthy
newt:
image: fosrl/newt
container_name: newt
restart: unless-stopped
environment:
- PANGOLIN_ENDPOINT=https://example.com
- NEWT_ID=2ix2t8xk22ubpfy
- NEWT_SECRET=nnisrfsdfc7prqsp9ewo1dvtvci50j5uiqotez00dgap0ii2
- HEALTH_FILE=/tmp/healthy
```
You can also pass the CLI args to the container:
```yaml
services:
newt:
image: fosrl/newt
container_name: newt
restart: unless-stopped
command:
- --id 31frd0uzbjvp721
- --secret h51mmlknrvrwv8s4r1i210azhumt6isgbpyavxodibx1k2d6
- --endpoint https://example.com
- --health-file /tmp/healthy
newt:
image: fosrl/newt
container_name: newt
restart: unless-stopped
command:
- --id 31frd0uzbjvp721
- --secret h51mmlknrvrwv8s4r1i210azhumt6isgbpyavxodibx1k2d6
- --endpoint https://example.com
- --health-file /tmp/healthy
```
## Accept Client Connections
When the `--accept-clients` flag is enabled (or `ACCEPT_CLIENTS=true` environment variable is set), Newt operates as a WireGuard server that can accept incoming client connections from other devices. This enables peer-to-peer connectivity through the Newt instance.
### How It Works
In client acceptance mode, Newt:
- **Creates a WireGuard service** that can accept incoming connections from other WireGuard clients
- **Starts a connection testing server** (WGTester) that responds to connectivity checks from remote clients
- **Manages peer configurations** dynamically based on Pangolin's instructions
- **Enables bidirectional communication** between the Newt instance and connected clients
### Use Cases
- **Site-to-site connectivity**: Connect multiple locations through a central Newt instance
- **Client access to private networks**: Allow remote clients to access resources behind the Newt instance
- **Development environments**: Provide developers secure access to internal services
### Client Tunneling Modes
Newt supports two WireGuard tunneling modes:
#### Userspace Mode (Default)
By default, Newt uses a fully userspace WireGuard implementation using [netstack](https://github.com/WireGuard/wireguard-go/blob/master/tun/netstack/examples/http_server.go). This mode:
- **Does not require root privileges**
- **Works on all supported platforms** (Linux, Windows, macOS)
- **Does not require WireGuard kernel module** to be installed
- **Runs entirely in userspace** - no system network interface is created
- **Is containerization-friendly** - works seamlessly in Docker containers
This is the recommended mode for most deployments, especially containerized environments.
In this mode, TCP and UDP are proxied out of Newt from the remote client using TCP/UDP resources in Pangolin.
#### Native Mode (Linux only)
When using the `--native` flag or setting `USE_NATIVE_INTERFACE=true`, Newt uses the native WireGuard kernel module. This mode:
- **Requires root privileges** to create and manage network interfaces
- **Only works on Linux** with the WireGuard kernel module installed
- **Creates a real network interface** (e.g., `newt0`) on the system
- **May offer better performance** for high-throughput scenarios
- **Requires proper network permissions** and may conflict with existing network configurations
In this mode it functions like a traditional VPN interface - all data arrives on the interface and you must get it to the destination (or access things locally).
#### Native Mode Requirements
To use native mode:
1. Run on a Linux system
2. Install the WireGuard kernel module
3. Run Newt as root (`sudo`)
4. Ensure the system allows creation of network interfaces
Docker Compose example:
```yaml
services:
  newt:
    image: fosrl/newt
    container_name: newt
    restart: unless-stopped
    environment:
      - PANGOLIN_ENDPOINT=https://example.com
      - NEWT_ID=2ix2t8xk22ubpfy
      - NEWT_SECRET=nnisrfsdfc7prqsp9ewo1dvtvci50j5uiqotez00dgap0ii2
      - ACCEPT_CLIENTS=true
```
### Technical Details
When client acceptance is enabled:
- **WGTester Server**: Runs on `port + 1` (e.g., if WireGuard uses port 51820, WGTester uses 51821)
- **Connection Testing**: Responds to UDP packets with magic header `0xDEADBEEF` for connectivity verification
- **Dynamic Configuration**: Peer configurations are managed remotely through Pangolin
- **Proxy Integration**: Can work with both userspace (netstack) and native WireGuard modes
**Note**: Client acceptance mode requires coordination with Pangolin for peer management and configuration distribution.
### Docker Socket Integration
Newt can integrate with the Docker socket to provide remote inspection of Docker containers. This allows Pangolin to query and retrieve detailed information about containers running on the Newt client, including metadata, network configuration, port mappings, and more.
**Configuration:**
You can specify the Docker socket path using the `--docker-socket` CLI argument or by setting the `DOCKER_SOCKET` environment variable. On most Linux systems the socket is `/var/run/docker.sock`. When deploying Newt as a container, you need to mount the host socket as a volume so the Newt container can access it. If the Docker socket is not available or accessible, Newt gracefully disables Docker integration and continues normal operation.
Supported values include:
- Local UNIX socket (default): `unix:///var/run/docker.sock`
  > You must mount the socket file into the container using a volume so Newt can access it.
- TCP socket (e.g., via Docker Socket Proxy): `tcp://localhost:2375`
- HTTP/HTTPS endpoints (e.g., remote Docker APIs): `http://your-host:2375`
- SSH connections (experimental, requires SSH setup): `ssh://user@host`
```yaml
services:
  newt:
    image: fosrl/newt
    container_name: newt
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - PANGOLIN_ENDPOINT=https://example.com
      - NEWT_ID=2ix2t8xk22ubpfy
      - NEWT_SECRET=nnisrfsdfc7prqsp9ewo1dvtvci50j5uiqotez00dgap0ii2
      - DOCKER_SOCKET=unix:///var/run/docker.sock
```
>If you previously used just a path like `/var/run/docker.sock`, it still works — Newt assumes it is a UNIX socket by default.
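That default-scheme behavior can be sketched as a small normalization step. This is a simplified rendition of what the Docker integration does with the configured value, not Newt's exact code:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeDockerHost mirrors the documented behavior: an empty value falls
// back to the default socket, bare paths are treated as UNIX sockets, and
// values with an explicit scheme are passed through unchanged.
func normalizeDockerHost(raw string) string {
	if raw == "" {
		return "unix:///var/run/docker.sock" // documented default
	}
	if !strings.Contains(raw, "://") {
		return "unix://" + raw // no scheme: assume a UNIX socket path
	}
	return raw
}

func main() {
	for _, v := range []string{"", "/var/run/docker.sock", "tcp://localhost:2375"} {
		fmt.Printf("%q -> %q\n", v, normalizeDockerHost(v))
	}
}
```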
#### Hostnames vs IPs
When the Docker socket integration is used, either the container's hostname (generally the container name) or its IP address is sent to Pangolin, depending on the network mode Newt runs in:
- **Running in Network Mode 'host'**: IP addresses will be used
- **Running in Network Mode 'bridge'**: IP addresses will be used
- **Running in docker-compose without a network specification**: Docker compose creates a network for the compose by default, hostnames will be used
You can pass in an updown script for Newt to call when it is adding or removing a target:
`--updown "python3 test.py"`
It will be called with arguments when a target is added or removed:
`python3 test.py add tcp localhost:8556`
`python3 test.py remove tcp localhost:8556`
Returning a string from the script in the format of a target (`ip:dst` so `10.0.
You can look at updown.py as a reference script to get started!
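The hook can be any executable that reads the action, protocol, and target from its arguments. Below is a minimal handler sketch in Go; the argument layout follows the invocations shown above, and printing an `ip:port` string on `add` to override the target is an assumption based on the description of returned strings:

```go
package main

import (
	"fmt"
	"os"
)

// handle processes one updown invocation. For "add" it returns the target
// string to print; printing a different ip:port would override the target.
func handle(action, proto, target string) string {
	switch action {
	case "add":
		fmt.Fprintf(os.Stderr, "adding %s target %s\n", proto, target)
		return target // echo the original target (no override)
	case "remove":
		fmt.Fprintf(os.Stderr, "removing %s target %s\n", proto, target)
	}
	return ""
}

func main() {
	if len(os.Args) < 4 {
		fmt.Fprintln(os.Stderr, "usage: updown <add|remove> <tcp|udp> <host:port>")
		return
	}
	if out := handle(os.Args[1], os.Args[2], os.Args[3]); out != "" {
		fmt.Println(out) // Newt reads stdout for an override target
	}
}
```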
### mTLS
Newt supports mutual TLS (mTLS) authentication if the server is configured to request a client certificate. You can use either a PKCS12 (.p12/.pfx) file or split PEM files for the client cert, private key, and CA.
#### Option 1: PKCS12 (Legacy)
> This is the original method and still supported.
- File must contain:
- Client private key
- Public certificate
- CA certificate
- Encrypted `.p12` files are **not supported**
Example:
```bash
newt \
--id 31frd0uzbjvp721 \
--secret h51mmlknrvrwv8s4r1i210azhumt6isgbpyavxodibx1k2d6 \
--endpoint https://example.com \
--tls-client-cert ./client.p12
```
#### Option 2: Split PEM Files (Preferred)
You can now provide separate files for:
- `--tls-client-cert`: client certificate (`.crt` or `.pem`)
- `--tls-client-key`: client private key (`.key` or `.pem`)
- `--tls-ca-cert`: CA cert to verify the server
Example:
```bash
newt \
--id 31frd0uzbjvp721 \
--secret h51mmlknrvrwv8s4r1i210azhumt6isgbpyavxodibx1k2d6 \
--endpoint https://example.com \
--tls-client-cert ./client.crt \
--tls-client-key ./client.key \
--tls-ca-cert ./ca.crt
```
```yaml
services:
  newt:
    image: fosrl/newt
    container_name: newt
    restart: unless-stopped
    environment:
      - PANGOLIN_ENDPOINT=https://example.com
      - NEWT_ID=2ix2t8xk22ubpfy
      - NEWT_SECRET=nnisrfsdfc7prqsp9ewo1dvtvci50j5uiqotez00dgap0ii2
      - TLS_CLIENT_CERT=./client.p12
```
## Build
### Container
Ensure Docker is installed.

**blueprint.yaml** (new file):
resources:
  resource-nice-id:
    name: this is my resource
    protocol: http
    full-domain: level1.test3.example.com
    host-header: example.com
    tls-server-name: example.com
    auth:
      pincode: 123456
      password: sadfasdfadsf
      sso-enabled: true
      sso-roles:
        - Member
      sso-users:
        - owen@fossorial.io
      whitelist-users:
        - owen@fossorial.io
    targets:
      # - site: glossy-plains-viscacha-rat
      - hostname: localhost
        method: http
        port: 8000
        healthcheck:
          port: 8000
          hostname: localhost
      # - site: glossy-plains-viscacha-rat
      - hostname: localhost
        method: http
        port: 8001
  resource-nice-id2:
    name: this is other resource
    protocol: tcp
    proxy-port: 3000
    targets:
      # - site: glossy-plains-viscacha-rat
      - hostname: localhost
        port: 3000

**clients.go** (new file):
package main

import (
	"fmt"
	"strings"

	"github.com/fosrl/newt/logger"
	"github.com/fosrl/newt/proxy"
	"github.com/fosrl/newt/websocket"
	"github.com/fosrl/newt/wgnetstack"
	"github.com/fosrl/newt/wgtester"

	"golang.zx2c4.com/wireguard/tun/netstack"
)

var wgService *wgnetstack.WireGuardService
var wgTesterServer *wgtester.Server
var ready bool
func setupClients(client *websocket.Client) {
	var host = endpoint
	if strings.HasPrefix(host, "http://") {
		host = strings.TrimPrefix(host, "http://")
	} else if strings.HasPrefix(host, "https://") {
		host = strings.TrimPrefix(host, "https://")
	}
	host = strings.TrimSuffix(host, "/")

	if useNativeInterface {
		setupClientsNative(client, host)
	} else {
		setupClientsNetstack(client, host)
	}
	ready = true
}
func setupClientsNetstack(client *websocket.Client, host string) {
	logger.Info("Setting up clients with netstack...")

	// Create WireGuard service
	var err error
	wgService, err = wgnetstack.NewWireGuardService(interfaceName, mtuInt, generateAndSaveKeyTo, host, id, client, "9.9.9.9")
	if err != nil {
		logger.Fatal("Failed to create WireGuard service: %v", err)
	}

	// Set up callback to restart wgtester with netstack when WireGuard is ready
	wgService.SetOnNetstackReady(func(tnet *netstack.Net) {
		wgTesterServer = wgtester.NewServerWithNetstack("0.0.0.0", wgService.Port, id, tnet) // TODO: maybe make this the same IP as the wg server?
		err := wgTesterServer.Start()
		if err != nil {
			logger.Error("Failed to start WireGuard tester server: %v", err)
		}
	})

	wgService.SetOnNetstackClose(func() {
		if wgTesterServer != nil {
			wgTesterServer.Stop()
			wgTesterServer = nil
		}
	})

	client.OnTokenUpdate(func(token string) {
		wgService.SetToken(token)
	})
}
func setDownstreamTNetstack(tnet *netstack.Net) {
	if wgService != nil {
		wgService.SetOthertnet(tnet)
	}
}

func closeClients() {
	logger.Info("Closing clients...")
	if wgService != nil {
		wgService.Close(!keepInterface)
		wgService = nil
	}
	closeWgServiceNative()
	if wgTesterServer != nil {
		wgTesterServer.Stop()
		wgTesterServer = nil
	}
}
func clientsHandleNewtConnection(publicKey string, endpoint string) {
	if !ready {
		return
	}
	// split off the port from the endpoint
	parts := strings.Split(endpoint, ":")
	if len(parts) < 2 {
		logger.Error("Invalid endpoint format: %s", endpoint)
		return
	}
	endpoint = strings.Join(parts[:len(parts)-1], ":")

	if wgService != nil {
		wgService.StartHolepunch(publicKey, endpoint)
	}
	clientsHandleNewtConnectionNative(publicKey, endpoint)
}

func clientsOnConnect() {
	if !ready {
		return
	}
	if wgService != nil {
		wgService.LoadRemoteConfig()
	}
	clientsOnConnectNative()
}
func clientsAddProxyTarget(pm *proxy.ProxyManager, tunnelIp string) {
	if !ready {
		return
	}
	// add a udp proxy for localhost and the wgService port
	// TODO: make sure this port is not used in a target
	if wgService != nil {
		pm.AddTarget("udp", tunnelIp, int(wgService.Port), fmt.Sprintf("127.0.0.1:%d", wgService.Port))
	}
	clientsAddProxyTargetNative(pm, tunnelIp)
}

Docker Compose example (new file, OTel Collector + Prometheus):
services:
  otel-collector:
    image: otel/opentelemetry-collector:0.111.0
    command: ["--config=/etc/otelcol/config.yaml"]
    volumes:
      - ./examples/otel-collector.yaml:/etc/otelcol/config.yaml:ro
    ports:
      - "4317:4317" # OTLP gRPC in
      - "8889:8889" # Prometheus scrape out
  newt:
    build: .
    image: newt:dev
    environment:
      OTEL_SERVICE_NAME: newt
      NEWT_METRICS_PROMETHEUS_ENABLED: "true"
      NEWT_METRICS_OTLP_ENABLED: "true"
      OTEL_EXPORTER_OTLP_ENDPOINT: "otel-collector:4317"
      OTEL_EXPORTER_OTLP_INSECURE: "true"
      OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE: "cumulative"
      NEWT_ADMIN_ADDR: "0.0.0.0:2112"
    ports:
      - "2112:2112"
    depends_on:
      - otel-collector
  prometheus:
    image: prom/prometheus:v2.55.0
    volumes:
      - ./examples/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"

Docker Compose example (new file, Collector pattern):
services:
  newt:
    build: .
    image: newt:dev
    env_file:
      - .env
    environment:
      - NEWT_METRICS_PROMETHEUS_ENABLED=false # important: disable direct /metrics scraping
      - NEWT_METRICS_OTLP_ENABLED=true # OTLP to the Collector
      # optional:
      # - NEWT_METRICS_INCLUDE_TUNNEL_ID=false
    # When using the Collector pattern, do NOT map the Newt admin/metrics port
    # (2112) on the application service. Mapping 2112 here can cause port
    # conflicts and may result in duplicated Prometheus scraping (app AND
    # collector being scraped for the same metrics). Instead either:
    #   - leave ports unset on the app service (recommended), or
    #   - map 2112 only on a dedicated metrics/collector service that is
    #     responsible for exposing metrics to Prometheus.
    # Example: do NOT map here
    # ports: []
    # Example: map 2112 only on a collector service
    # collector:
    #   ports:
    #     - "2112:2112" # collector's prometheus exporter (scraped by Prometheus)
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otelcol/config.yaml"]
    volumes:
      - ./examples/otel-collector.yaml:/etc/otelcol/config.yaml:ro
    ports:
      - "4317:4317" # OTLP gRPC
      - "8889:8889" # Prometheus Exporter (scraped by Prometheus)
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./examples/prometheus.with-collector.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"

Docker Compose example (new file, direct scrape with Grafana):
services:
  # Recommended Variant A: Direct Prometheus scrape of Newt (/metrics)
  # Optional: You may add the Collector service and enable OTLP export, but do NOT
  # scrape both Newt and the Collector for the same process.
  newt:
    build: .
    image: newt:dev
    env_file:
      - .env
    environment:
      OTEL_SERVICE_NAME: newt
      NEWT_METRICS_PROMETHEUS_ENABLED: "true"
      NEWT_METRICS_OTLP_ENABLED: "false" # avoid double-scrape by default
      NEWT_ADMIN_ADDR: ":2112"
      # Base NEWT configuration
      PANGOLIN_ENDPOINT: ${PANGOLIN_ENDPOINT}
      NEWT_ID: ${NEWT_ID}
      NEWT_SECRET: ${NEWT_SECRET}
      LOG_LEVEL: "DEBUG"
    ports:
      - "2112:2112"
  # Optional Variant B: Enable the Collector and switch Prometheus scrape to it.
  # collector:
  #   image: otel/opentelemetry-collector-contrib:0.136.0
  #   command: ["--config=/etc/otelcol/config.yaml"]
  #   volumes:
  #     - ./examples/otel-collector.yaml:/etc/otelcol/config.yaml:ro
  #   ports:
  #     - "4317:4317" # OTLP gRPC in
  #     - "8889:8889" # Prometheus scrape out
  prometheus:
    image: prom/prometheus:v3.6.0
    volumes:
      - ./examples/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana:12.2.0
    container_name: newt-metrics-grafana
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
    ports:
      - "3005:3000"
    depends_on:
      - prometheus
    volumes:
      - ./examples/grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro
      - ./examples/grafana/provisioning/dashboards:/etc/grafana/provisioning/dashboards:ro
      - ./examples/grafana/dashboards:/var/lib/grafana/dashboards:ro

Docker integration source (modified):

@@ -10,6 +10,7 @@ import (
"time"
"github.com/docker/docker/api/types/container"
"github.com/docker/docker/api/types/events"
"github.com/docker/docker/api/types/filters"
"github.com/docker/docker/client"
"github.com/fosrl/newt/logger"
@@ -53,22 +54,65 @@ type Network struct {
DNSNames []string `json:"dnsNames,omitempty"`
}
// Structure for the parts of a docker api endpoint
type dockerHost struct {
	protocol string // e.g. unix, http, tcp, ssh
	address  string // e.g. "/var/run/docker.sock" or "host:port"
}

// Parse the docker api endpoint into its parts
func parseDockerHost(raw string) (dockerHost, error) {
	switch {
	case strings.HasPrefix(raw, "unix://"):
		return dockerHost{"unix", strings.TrimPrefix(raw, "unix://")}, nil
	case strings.HasPrefix(raw, "ssh://"):
		// SSH is treated as TCP-like transport by the docker client
		return dockerHost{"ssh", strings.TrimPrefix(raw, "ssh://")}, nil
	case strings.HasPrefix(raw, "tcp://"), strings.HasPrefix(raw, "http://"), strings.HasPrefix(raw, "https://"):
		s := raw
		s = strings.TrimPrefix(s, "tcp://")
		s = strings.TrimPrefix(s, "http://")
		s = strings.TrimPrefix(s, "https://")
		return dockerHost{"tcp", s}, nil
	case strings.HasPrefix(raw, "/"):
		// Absolute path without scheme - treat as unix socket
		return dockerHost{"unix", raw}, nil
	default:
		// For relative paths or other formats, also default to unix
		return dockerHost{"unix", raw}, nil
	}
}
// CheckSocket checks if Docker socket is available
func CheckSocket(socketPath string) bool {
	// Use the provided socket path or default to standard location
	if socketPath == "" {
		socketPath = "unix:///var/run/docker.sock"
	}
	// Ensure the socket path is properly formatted
	if !strings.Contains(socketPath, "://") {
		// If no scheme provided, assume unix socket
		socketPath = "unix://" + socketPath
	}
	host, err := parseDockerHost(socketPath)
	if err != nil {
		logger.Debug("Invalid Docker socket path '%s': %v", socketPath, err)
		return false
	}
	protocol := host.protocol
	addr := host.address
	// ssh might need different verification, but tcp works for basic reachability
	conn, err := net.DialTimeout(protocol, addr, 2*time.Second)
	if err != nil {
		logger.Debug("Docker not reachable via %s at %s: %v", protocol, addr, err)
		return false
	}
	defer conn.Close()
	logger.Debug("Docker reachable via %s at %s", protocol, addr)
	return true
}
@@ -116,7 +160,13 @@ func IsWithinHostNetwork(socketPath string, targetAddress string, targetPort int
func ListContainers(socketPath string, enforceNetworkValidation bool) ([]Container, error) {
	// Use the provided socket path or default to standard location
	if socketPath == "" {
		socketPath = "unix:///var/run/docker.sock"
	}
	// Ensure the socket path is properly formatted for the Docker client
	if !strings.Contains(socketPath, "://") {
		// If no scheme provided, assume unix socket
		socketPath = "unix://" + socketPath
	}
// Used to filter down containers returned to Pangolin
@@ -132,7 +182,7 @@ func ListContainers(socketPath string, enforceNetworkValidation bool) ([]Contain
	// Create client with custom socket path
	cli, err := client.NewClientWithOpts(
		client.WithHost(socketPath),
		client.WithAPIVersionNegotiation(),
	)
	if err != nil {
@@ -182,7 +232,6 @@ func ListContainers(socketPath string, enforceNetworkValidation bool) ([]Contain
		hostname = containerInfo.Config.Hostname
	}
	// Skip host container if set
	if hostContainerId != "" && c.ID == hostContainerId {
		continue
@@ -273,3 +322,128 @@ func getHostContainer(dockerContext context.Context, dockerClient *client.Client
	return &hostContainer, nil
}

// EventCallback defines the function signature for handling Docker events
type EventCallback func(containers []Container)

// EventMonitor handles Docker event monitoring
type EventMonitor struct {
	client                   *client.Client
	ctx                      context.Context
	cancel                   context.CancelFunc
	callback                 EventCallback
	socketPath               string
	enforceNetworkValidation bool
}

// NewEventMonitor creates a new Docker event monitor
func NewEventMonitor(socketPath string, enforceNetworkValidation bool, callback EventCallback) (*EventMonitor, error) {
	if socketPath == "" {
		socketPath = "unix:///var/run/docker.sock"
	}
	if !strings.Contains(socketPath, "://") {
		socketPath = "unix://" + socketPath
	}
	cli, err := client.NewClientWithOpts(
		client.WithHost(socketPath),
		client.WithAPIVersionNegotiation(),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create Docker client: %v", err)
	}
	ctx, cancel := context.WithCancel(context.Background())
	return &EventMonitor{
		client:                   cli,
		ctx:                      ctx,
		cancel:                   cancel,
		callback:                 callback,
		socketPath:               socketPath,
		enforceNetworkValidation: enforceNetworkValidation,
	}, nil
}

// Start begins monitoring Docker events
func (em *EventMonitor) Start() error {
	logger.Debug("Starting Docker event monitoring")
	// Filter for container events we care about
	eventFilters := filters.NewArgs()
	eventFilters.Add("type", "container")
	// eventFilters.Add("event", "create")
	eventFilters.Add("event", "start")
	eventFilters.Add("event", "stop")
	// eventFilters.Add("event", "destroy")
	// eventFilters.Add("event", "die")
	// eventFilters.Add("event", "pause")
	// eventFilters.Add("event", "unpause")

	// Start listening for events
	eventCh, errCh := em.client.Events(em.ctx, events.ListOptions{
		Filters: eventFilters,
	})

	go func() {
		defer func() {
			if err := em.client.Close(); err != nil {
				logger.Error("Error closing Docker client: %v", err)
			}
		}()
		for {
			select {
			case event := <-eventCh:
				logger.Debug("Docker event received: %s %s for container %s", event.Action, event.Type, event.Actor.ID[:12])
				// Fetch updated container list and trigger callback
				go em.handleEvent(event)
			case err := <-errCh:
				if err != nil && err != context.Canceled {
					logger.Error("Docker event stream error: %v", err)
					// Try to reconnect after a brief delay
					time.Sleep(5 * time.Second)
					if em.ctx.Err() == nil {
						logger.Info("Attempting to reconnect to Docker event stream")
						eventCh, errCh = em.client.Events(em.ctx, events.ListOptions{
							Filters: eventFilters,
						})
					}
				}
				return
			case <-em.ctx.Done():
				logger.Info("Docker event monitoring stopped")
				return
			}
		}
	}()
	return nil
}

// handleEvent processes a Docker event and triggers the callback with updated container list
func (em *EventMonitor) handleEvent(event events.Message) {
	// Add a small delay to ensure Docker has fully processed the event
	time.Sleep(100 * time.Millisecond)
	containers, err := ListContainers(em.socketPath, em.enforceNetworkValidation)
	if err != nil {
		logger.Error("Failed to list containers after Docker event %s: %v", event.Action, err)
		return
	}
	logger.Debug("Triggering callback with %d containers after Docker event %s", len(containers), event.Action)
	em.callback(containers)
}

// Stop stops the event monitoring
func (em *EventMonitor) Stop() {
	logger.Info("Stopping Docker event monitoring")
	if em.cancel != nil {
		em.cancel()
	}
}

# Newt Metrics: Recommendations, Gaps, and Roadmap
This document captures the current state of Newt metrics, prioritized fixes, and a pragmatic roadmap for near-term improvements.
1) Current setup (summary)
- Export: Prometheus exposition (default), optional OTLP (gRPC)
- Existing instruments:
  - Sites: newt_site_registrations_total, newt_site_online (0/1), newt_site_last_heartbeat_timestamp_seconds
  - Tunnel/Traffic: newt_tunnel_sessions, newt_tunnel_bytes_total, newt_tunnel_latency_seconds, newt_tunnel_reconnects_total
  - Connection lifecycle: newt_connection_attempts_total, newt_connection_errors_total
  - Operations: newt_config_reloads_total, process_start_time_seconds, newt_config_apply_seconds, newt_cert_rotation_total
  - Build metadata: newt_build_info
  - Control plane: newt_websocket_connect_latency_seconds, newt_websocket_messages_total, newt_websocket_connected, newt_websocket_reconnects_total
  - Proxy: newt_proxy_active_connections, newt_proxy_buffer_bytes, newt_proxy_async_backlog_bytes, newt_proxy_drops_total, newt_proxy_accept_total, newt_proxy_connection_duration_seconds, newt_proxy_connections_total
  - Go runtime: GC, heap, goroutines via runtime instrumentation
2) Main issues addressed now
- Attribute filter (allow-list) extended to include site_id and region in addition to existing keys (tunnel_id, transport, protocol, direction, result, reason, error_type, version, commit).
- site_id and region propagation: site_id/region remain resource attributes. Metric labels mirror them on per-site gauges and counters by default; set `NEWT_METRICS_INCLUDE_SITE_LABELS=false` to drop them for multi-tenant scrapes.
- Label semantics clarified:
  - transport: control-plane mechanism (e.g., websocket, wireguard)
  - protocol: L4 payload type (tcp, udp)
  - newt_tunnel_bytes_total uses protocol and direction, not transport.
- Robustness improvements: removed duplicate clear logic on reconnect; avoided empty site_id by reading NEWT_SITE_ID/NEWT_ID and OTEL_RESOURCE_ATTRIBUTES.
3) Remaining gaps and deviations
- Some call sites still need initiator label on reconnect outcomes (client vs server). This is planned.
- Config apply duration and cert rotation counters are planned.
- Registration and config reload failures are not yet emitted; add failure code paths so result labels expose churn.
- Document using `process_start_time_seconds` (and `time()` in PromQL) to derive uptime; no explicit restart counter is needed.
- Metric helpers often use `context.Background()`. Where lightweight contexts exist (e.g., HTTP handlers), propagate them to ease future correlation.
- Tracing coverage is limited to admin HTTP and WebSocket connect spans; extend to blueprint fetches, proxy accept loops, and WireGuard updates when OTLP is enabled.
4) Roadmap (phased)
- Phase 1 (done in this iteration)
  - Fix attribute filter (site_id, region)
  - Propagate site_id (and optional region) across metrics
  - Correct label semantics (transport vs protocol); fix sessions transport labelling
  - Documentation alignment
- Phase 2 (next)
  - Reconnect: add initiator label (client/server)
  - Config & PKI: newt_config_apply_seconds{phase,result}; newt_cert_rotation_total{result}
  - WebSocket disconnect and keepalive failure counters
  - Proxy connection lifecycle metrics (accept totals, duration histogram)
  - Pangolin blueprint/config fetch latency and status metrics
  - Certificate rotation duration histogram to complement success/failure counter
5) Operational guidance
- Do not double-scrape: scrape either Newt (/metrics) or the Collector's Prometheus exporter (not both) to avoid double-counting cumulative counters.
- For high cardinality tunnel_id, consider relabeling or dropping per-tunnel series in Prometheus to control cardinality.
- OTLP troubleshooting: enable TLS via OTEL_EXPORTER_OTLP_CERTIFICATE, use OTEL_EXPORTER_OTLP_HEADERS for auth; verify endpoint reachability.
6) Example alerts/recording rules (suggestions)
- Reconnect spikes:
  - sum by (site_id) (increase(newt_tunnel_reconnects_total[5m]))
- Sustained connection errors:
  - sum by (site_id, transport, error_type) (rate(newt_connection_errors_total[5m]))
- Heartbeat gaps:
  - time() - max_over_time(newt_site_last_heartbeat_timestamp_seconds[15m])
- Proxy drops:
  - sum by (site_id, protocol) (increase(newt_proxy_drops_total[5m]))
- WebSocket connect p95 (when added):
  - histogram_quantile(0.95, sum(rate(newt_websocket_connect_latency_seconds_bucket[5m])) by (le, site_id))
7) Collector configuration
- Direct scrape variant requires no attribute promotion since site_id is already a metric label.
- Transform/promote variant remains optional for environments that rely on resource-to-label promotion.
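The Collector pattern described above boils down to one OTLP receiver feeding a Prometheus exporter. A minimal sketch of what the referenced `examples/otel-collector.yaml` might contain follows; the component names (`otlp`, `prometheus`) are the standard upstream Collector components, and the concrete config in the repository may differ:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # Newt exports OTLP metrics here

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # Prometheus scrapes the Collector here

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```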

**docs/observability.md** (new file):
<!-- markdownlint-disable MD033 -->
# OpenTelemetry Observability for Newt
This document describes how Newt exposes metrics using the OpenTelemetry (OTel) Go SDK, how to enable Prometheus scraping, and how to send data to an OpenTelemetry Collector for further export.
Goals
- Provide a /metrics endpoint in Prometheus exposition format (via OTel Prometheus exporter)
- Keep metrics backend-agnostic; optional OTLP export to a Collector
- Use OTel semantic conventions where applicable and enforce SI units
- Low-cardinality, stable labels only
Enable via flags (ENV mirrors)
- --metrics (default: true) ↔ NEWT_METRICS_PROMETHEUS_ENABLED
- --metrics-admin-addr (default: 127.0.0.1:2112) ↔ NEWT_ADMIN_ADDR
- --otlp (default: false) ↔ NEWT_METRICS_OTLP_ENABLED
Enable exporters via environment variables (no code changes required)
- NEWT_METRICS_PROMETHEUS_ENABLED=true|false (default: true)
- NEWT_METRICS_OTLP_ENABLED=true|false (default: false)
- OTEL_EXPORTER_OTLP_ENDPOINT=collector:4317
- OTEL_EXPORTER_OTLP_INSECURE=true|false (default: true for dev)
- OTEL_SERVICE_NAME=newt (default)
- OTEL_SERVICE_VERSION=<version>
- OTEL_RESOURCE_ATTRIBUTES=service.instance.id=<id>,site_id=<id>
- OTEL_METRIC_EXPORT_INTERVAL=15s (default)
- NEWT_ADMIN_ADDR=127.0.0.1:2112 (default admin HTTP with /metrics)
- NEWT_METRICS_INCLUDE_SITE_LABELS=true|false (default: true; disable to drop site_id/region as metric labels and rely on resource attributes only)
Runtime behavior
- When Prometheus exporter is enabled, Newt serves /metrics on NEWT_ADMIN_ADDR (default :2112)
- When OTLP is enabled, metrics and traces are exported to OTLP gRPC endpoint
- Go runtime metrics (goroutines, GC, memory) are exported automatically
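For the direct-scrape setup, the Prometheus side is a single static target pointing at the admin address. This is a sketch of what the referenced `examples/prometheus.yml` might contain; the job name and target hostname are assumptions:

```yaml
scrape_configs:
  - job_name: newt
    static_configs:
      - targets: ["newt:2112"]   # NEWT_ADMIN_ADDR serving /metrics
```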
Metric catalog (current)
Unless otherwise noted, `site_id` and `region` are available via resource attributes and, by default, as metric labels. Set `NEWT_METRICS_INCLUDE_SITE_LABELS=false` to drop them from counter/histogram label sets in high-cardinality environments.
| Metric | Instrument | Key attributes | Purpose | Example |
| --- | --- | --- | --- | --- |
| `newt_build_info` | Observable gauge (Int64) | `version`, `commit`, `site_id`, `region` (optional when site labels enabled) | Emits build metadata with value `1` for scrape-time verification. | `newt_build_info{version="1.5.0"} 1` |
| `newt_site_registrations_total` | Counter (Int64) | `result` (`success`/`failure`), `site_id`, `region` (optional) | Counts Pangolin registration attempts. | `newt_site_registrations_total{result="success",site_id="acme-edge-1"} 1` |
| `newt_site_online` | Observable gauge (Int64) | `site_id` | Reports whether the site is currently connected (`1`) or offline (`0`). | `newt_site_online{site_id="acme-edge-1"} 1` |
| `newt_site_last_heartbeat_timestamp_seconds` | Observable gauge (Float64) | `site_id` | Unix timestamp of the most recent Pangolin heartbeat (derive age via `time() - metric`). | `newt_site_last_heartbeat_timestamp_seconds{site_id="acme-edge-1"} 1.728e+09` |
| `newt_tunnel_sessions` | Observable gauge (Int64) | `site_id`, `tunnel_id` (when enabled) | Counts active tunnel sessions per peer; collapses to per-site when tunnel IDs are disabled. | `newt_tunnel_sessions{site_id="acme-edge-1",tunnel_id="wgpub..."} 3` |
| `newt_tunnel_bytes_total` | Counter (Int64) | `direction` (`ingress`/`egress`), `protocol` (`tcp`/`udp`), `tunnel_id` (optional), `site_id`, `region` (optional) | Measures proxied traffic volume across tunnels. | `newt_tunnel_bytes_total{direction="ingress",protocol="tcp",site_id="acme-edge-1"} 4096` |
| `newt_tunnel_latency_seconds` | Histogram (Float64) | `transport` (e.g., `wireguard`), `tunnel_id` (optional), `site_id`, `region` (optional) | Captures RTT or configuration-driven latency samples. | `newt_tunnel_latency_seconds_bucket{transport="wireguard",le="0.5"} 42` |
| `newt_tunnel_reconnects_total` | Counter (Int64) | `initiator` (`client`/`server`), `reason` (enumerated), `tunnel_id` (optional), `site_id`, `region` (optional) | Tracks reconnect causes for troubleshooting flaps. | `newt_tunnel_reconnects_total{initiator="client",reason="timeout",site_id="acme-edge-1"} 5` |
| `newt_connection_attempts_total` | Counter (Int64) | `transport` (`auth`/`websocket`), `result`, `site_id`, `region` (optional) | Measures control-plane dial attempts and their outcomes. | `newt_connection_attempts_total{transport="websocket",result="success",site_id="acme-edge-1"} 8` |
| `newt_connection_errors_total` | Counter (Int64) | `transport`, `error_type`, `site_id`, `region` (optional) | Buckets connection failures by normalized error class. | `newt_connection_errors_total{transport="websocket",error_type="tls_handshake",site_id="acme-edge-1"} 1` |
| `newt_config_reloads_total` | Counter (Int64) | `result`, `site_id`, `region` (optional) | Counts remote blueprint/config reloads. | `newt_config_reloads_total{result="success",site_id="acme-edge-1"} 3` |
| `process_start_time_seconds` | Observable gauge (Float64) | — | Unix timestamp of the Newt process start time (use `time() - process_start_time_seconds` for uptime). | `process_start_time_seconds 1.728e+09` |
| `newt_config_apply_seconds` | Histogram (Float64) | `phase` (`interface`/`peer`), `result`, `site_id`, `region` (optional) | Measures time spent applying WireGuard configuration phases. | `newt_config_apply_seconds_sum{phase="peer",result="success",site_id="acme-edge-1"} 0.48` |
| `newt_cert_rotation_total` | Counter (Int64) | `result`, `site_id`, `region` (optional) | Tracks client certificate rotation attempts. | `newt_cert_rotation_total{result="success",site_id="acme-edge-1"} 2` |
| `newt_websocket_connect_latency_seconds` | Histogram (Float64) | `transport="websocket"`, `result`, `error_type` (on failure), `site_id`, `region` (optional) | Measures WebSocket dial latency and exposes failure buckets. | `newt_websocket_connect_latency_seconds_bucket{result="success",le="0.5",site_id="acme-edge-1"} 9` |
| `newt_websocket_messages_total` | Counter (Int64) | `direction` (`in`/`out`), `msg_type` (`text`/`ping`/`pong`), `site_id`, `region` (optional) | Accounts for control WebSocket traffic volume by type. | `newt_websocket_messages_total{direction="out",msg_type="ping",site_id="acme-edge-1"} 12` |
| `newt_websocket_connected` | Observable gauge (Int64) | `site_id`, `region` (optional) | Reports current WebSocket connectivity (`1` when connected). | `newt_websocket_connected{site_id="acme-edge-1"} 1` |
| `newt_websocket_reconnects_total` | Counter (Int64) | `reason` (`tls_handshake`, `dial_timeout`, `io_error`, `ping_write`, `timeout`, etc.), `site_id`, `region` (optional) | Counts reconnect attempts with normalized reasons for failure analysis. | `newt_websocket_reconnects_total{reason="timeout",site_id="acme-edge-1"} 3` |
| `newt_proxy_active_connections` | Observable gauge (Int64) | `protocol` (`tcp`/`udp`), `direction` (`ingress`/`egress`), `tunnel_id` (optional), `site_id`, `region` (optional) | Current proxy connections per tunnel and protocol. | `newt_proxy_active_connections{protocol="tcp",direction="egress",site_id="acme-edge-1"} 4` |
| `newt_proxy_buffer_bytes` | Observable gauge (Int64) | `protocol`, `direction`, `tunnel_id` (optional), `site_id`, `region` (optional) | Volume of buffered data awaiting flush in proxy queues. | `newt_proxy_buffer_bytes{protocol="udp",direction="egress",site_id="acme-edge-1"} 2048` |
| `newt_proxy_async_backlog_bytes` | Observable gauge (Int64) | `protocol`, `direction`, `tunnel_id` (optional), `site_id`, `region` (optional) | Tracks async write backlog when deferred flushing is enabled. | `newt_proxy_async_backlog_bytes{protocol="tcp",direction="egress",site_id="acme-edge-1"} 512` |
| `newt_proxy_drops_total` | Counter (Int64) | `protocol`, `tunnel_id` (optional), `site_id`, `region` (optional) | Counts proxy drop events caused by downstream write errors. | `newt_proxy_drops_total{protocol="udp",site_id="acme-edge-1"} 1` |
| `newt_proxy_connections_total` | Counter (Int64) | `event` (`opened`/`closed`), `protocol`, `tunnel_id` (optional), `site_id`, `region` (optional) | Tracks proxy connection lifecycle events for rate/SLO calculations. | `newt_proxy_connections_total{event="opened",protocol="tcp",site_id="acme-edge-1"} 10` |
Conventions
- Durations in seconds (unit: s), names end with _seconds
- Sizes in bytes (unit: By), names end with _bytes
- Counters end with _total
- Labels must be low-cardinality and stable
Histogram buckets
- Latency (seconds): 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30
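For reference, a small stdlib sketch of how a measured latency maps onto these bounds (the `le` upper bounds the histogram exports); `bucketBound` is an illustrative helper, not part of the repo:

```go
package main

import (
	"fmt"
	"sort"
)

// latencyBuckets mirrors the documented latency histogram bounds (seconds).
var latencyBuckets = []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30}

// bucketBound returns the smallest upper bound ("le") containing v,
// or ok=false when v falls into the implicit +Inf bucket.
func bucketBound(v float64) (float64, bool) {
	i := sort.SearchFloat64s(latencyBuckets, v)
	if i == len(latencyBuckets) {
		return 0, false
	}
	return latencyBuckets[i], true
}

func main() {
	b, _ := bucketBound(0.3)
	fmt.Println(b) // 0.5
}
```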
Local quickstart
1) Direct Prometheus scrape (do not also scrape the Collector):
```sh
NEWT_METRICS_PROMETHEUS_ENABLED=true \
NEWT_METRICS_OTLP_ENABLED=false \
NEWT_ADMIN_ADDR="127.0.0.1:2112" \
./newt
curl -s http://localhost:2112/metrics | grep ^newt_
```
2) Using the Collector (compose-style):
```sh
NEWT_METRICS_PROMETHEUS_ENABLED=true \
NEWT_METRICS_OTLP_ENABLED=true \
OTEL_EXPORTER_OTLP_ENDPOINT=collector:4317 \
OTEL_EXPORTER_OTLP_INSECURE=true \
OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=cumulative \
./newt
```
Collector config example: examples/otel-collector.yaml
Prometheus scrape config: examples/prometheus.yml
Adding new metrics
- Use helpers in internal/telemetry/metrics.go for counters/histograms
- Keep labels low-cardinality
- Add observable gauges through SetObservableCallback
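The pattern behind such observable gauges can be sketched with stdlib atomics (names here are illustrative; the real `SetObservableCallback` helper registers the callback with the OTel meter):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// gaugeSource illustrates the observable-gauge pattern: producers update an
// atomic value, and the metrics SDK reads it via a callback at collection
// time instead of recording on every change.
type gaugeSource struct{ v atomic.Int64 }

func (g *gaugeSource) Set(n int64)    { g.v.Store(n) }
func (g *gaugeSource) Observe() int64 { return g.v.Load() }

func main() {
	active := &gaugeSource{}
	active.Set(4) // e.g. four active proxy connections
	// The SDK would invoke Observe() once per collection cycle.
	fmt.Println(active.Observe())
}
```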
Optional tracing
- When --otlp is enabled, you can wrap outbound HTTP clients with otelhttp.NewTransport to create spans for HTTP requests to Pangolin. This affects traces only and does not add metric labels.
OTLP TLS example
- Enable TLS to Collector with a custom CA and headers:
```sh
NEWT_METRICS_OTLP_ENABLED=true \
OTEL_EXPORTER_OTLP_ENDPOINT=collector:4317 \
OTEL_EXPORTER_OTLP_INSECURE=false \
OTEL_EXPORTER_OTLP_CERTIFICATE=/etc/otel/custom-ca.pem \
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer abc123,tenant=acme" \
./newt
```
Prometheus scrape strategy (choose one)
Important: Do not scrape both Newt (2112) and the Collector's Prometheus exporter (8889) at the same time for the same process; doing so double-counts cumulative counters.
A) Scrape Newt directly:
```yaml
global:
scrape_interval: 15s
scrape_configs:
- job_name: newt
static_configs:
- targets: ["newt:2112"]
```
B) Scrape the Collector's Prometheus exporter:
```yaml
global:
scrape_interval: 15s
scrape_configs:
- job_name: otel-collector
static_configs:
- targets: ["otel-collector:8889"]
```
Reason mapping (source → reason)
- Server instructs reconnect/terminate → server_request
- Heartbeat/Ping threshold exceeded → timeout
- Peer closed connection gracefully → peer_close
- Route/Interface change detected → network_change
- Auth/token failure (HTTP 401/403) → auth_error
- TLS/WG handshake error → handshake_error
- Config reloaded/applied (causing reconnection) → config_change
- Other/unclassified errors → error
PromQL snippets
- Ingress throughput (5m):
```promql
sum(rate(newt_tunnel_bytes_total{direction="ingress"}[5m]))
```
- P95 latency (seconds):
```promql
histogram_quantile(0.95, sum(rate(newt_tunnel_latency_seconds_bucket[5m])) by (le))
```
- Active sessions:
```promql
sum(newt_tunnel_sessions)
```
Compatibility notes
- Gauges do not use the _total suffix (e.g., newt_tunnel_sessions).
- site_id/region remain resource attributes. Metric labels for these fields appear on per-site gauges (e.g., `newt_site_online`) and, by default, on counters/histograms; disable them with `NEWT_METRICS_INCLUDE_SITE_LABELS=false` if needed. `tunnel_id` is a metric label (WireGuard public key). Never expose secrets in labels.
- NEWT_METRICS_INCLUDE_TUNNEL_ID (default: true) toggles whether tunnel_id is included as a label on bytes/sessions/proxy/reconnect metrics. Disable in high-cardinality environments.
- Avoid double-scraping: scrape either Newt (/metrics) or the Collector's Prometheus exporter, not both.
- Prometheus accepts remote_write only when started with `--web.enable-remote-write-receiver`; otherwise use Mimir/Cortex/VictoriaMetrics/Thanos Receive as the remote_write target.
- No free text in labels; use only the enumerated constants for reason, protocol (tcp|udp), and transport (e.g., websocket|wireguard).
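One way to keep such enumerated values from drifting is a small constants block with a guard; a sketch with illustrative names (not the repo's actual constants):

```go
package main

import "fmt"

// Enumerated label values; free text must never reach these labels.
const (
	TransportWebsocket = "websocket"
	TransportWireguard = "wireguard"
	ProtocolTCP        = "tcp"
	ProtocolUDP        = "udp"
)

// validProtocol guards against free text leaking into the protocol label.
func validProtocol(p string) bool {
	return p == ProtocolTCP || p == ProtocolUDP
}

func main() {
	fmt.Println(validProtocol("tcp"), validProtocol("http")) // true false
}
```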
Further reading
- See docs/METRICS_RECOMMENDATIONS.md for roadmap, label guidance (transport vs protocol), and example alerts.
Cardinality tips
- tunnel_id can grow in larger fleets. Use metric_relabel_configs to drop the label entirely or to drop specific series, for example:
```yaml
metric_relabel_configs:
  # Drop the tunnel_id label everywhere to reduce series
  - action: labeldrop
    regex: tunnel_id
  # Or drop series only for specific high-churn tunnels (example values)
  - source_labels: [tunnel_id]
    regex: "(wg-pubkey-1|wg-pubkey-2)"
    action: drop
```
- Note: dropping a label can merge otherwise-identical series and cause duplicate-sample errors for counters; prefer NEWT_METRICS_INCLUDE_TUNNEL_ID=false at the source when possible.
Quickstart: direct Prometheus scraping (recommended)
```sh
# Start (direct /metrics scrape, no double collection)
docker compose -f docker-compose.metrics.yml up -d
# Smoke checks
./scripts/smoke-metrics.sh
# Hide tunnel IDs (optional):
# EXPECT_TUNNEL_ID=false NEWT_METRICS_INCLUDE_TUNNEL_ID=false ./scripts/smoke-metrics.sh
```
- Prometheus UI: <http://localhost:9090>
- Default scrape interval: 15s
- No OTLP active (NEWT_METRICS_OTLP_ENABLED=false in docker-compose.metrics.yml)
Common PromQL quick checks
```promql
# Online status of a site over the last 5 minutes
max_over_time(newt_site_online{site_id="$site"}[5m])
# TCP egress bytes per site/tunnel (10m)
sum by (site_id, tunnel_id) (increase(newt_tunnel_bytes_total{protocol="tcp",direction="egress"}[10m]))
# WebSocket connect P95
histogram_quantile(0.95, sum by (le, site_id) (rate(newt_websocket_connect_latency_seconds_bucket[5m])))
# Reconnects by initiator
sum by (initiator, reason) (increase(newt_tunnel_reconnects_total{site_id="$site"}[30m]))
```
Troubleshooting
- `curl -s http://localhost:2112/metrics`: verify the endpoint is reachable and exposes newt_* metrics
- Check Collector logs for OTLP connection issues
- Verify Prometheus targets are UP and scraping either Newt or the Collector (not both)

docs/otel-review.md Normal file
@@ -0,0 +1,129 @@
# Newt OpenTelemetry Review
## Overview
This document summarises the current OpenTelemetry (OTel) instrumentation in Newt, assesses
compliance with OTel guidelines, and lists concrete improvements to pursue before release.
It is based on the implementation in `internal/telemetry` and the call-sites that emit
metrics and traces across the code base.
## Current metric instrumentation
All instruments are registered in `internal/telemetry/metrics.go`. They are grouped
into site, tunnel, connection, configuration, build, WebSocket, and proxy domains.
A global attribute filter (see `buildMeterProvider`) constrains exposed label keys to
`site_id`, `region`, and a curated list of low-cardinality dimensions so that Prometheus
exports stay bounded.
- **Site lifecycle**: `newt_site_registrations_total`, `newt_site_online`, and
`newt_site_last_heartbeat_timestamp_seconds` capture registration attempts and liveness. They
are fed either manually (`IncSiteRegistration`) or via the `TelemetryView` state
callback that publishes observable gauges for the active site.
- **Tunnel health and usage**: Counters and histograms track bytes, latency, reconnects,
and active sessions per tunnel (`newt_tunnel_*` family). Attribute helpers respect
the `NEWT_METRICS_INCLUDE_TUNNEL_ID` toggle to keep cardinality manageable on larger
fleets.
- **Connection attempts**: `newt_connection_attempts_total` and
`newt_connection_errors_total` are emitted throughout the WebSocket client to classify
authentication, dial, and transport failures.
- **Operations/configuration**: `newt_config_reloads_total`,
`process_start_time_seconds`, `newt_config_apply_seconds`, and
`newt_cert_rotation_total` provide visibility into blueprint reloads, process boots,
configuration timings, and certificate rotation outcomes.
- **Build metadata**: `newt_build_info` records the binary version/commit together
with optional site metadata when build information is supplied at startup.
- **WebSocket control-plane**: `newt_websocket_connect_latency_seconds`,
`newt_websocket_messages_total`, `newt_websocket_connected`, and
`newt_websocket_reconnects_total` report connect latency, ping/pong/text activity,
connection state, and reconnect reasons.
- **Proxy data-plane**: Observable gauges (`newt_proxy_active_connections`,
`newt_proxy_buffer_bytes`, `newt_proxy_async_backlog_bytes`) plus counters for
drops, accepts, connection lifecycle events (`newt_proxy_connections_total`), and
duration histograms (`newt_proxy_connection_duration_seconds`) surface backlog,
drop behaviour, and churn alongside per-protocol byte counters.
Refer to `docs/observability.md` for a tabular catalogue with instrument types,
attributes, and sample exposition lines.
## Tracing coverage
Tracing is optional and enabled only when OTLP export is configured. When active:
- The admin HTTP mux is wrapped with `otelhttp.NewHandler`, producing spans for
`/metrics` and `/healthz` requests.
- The WebSocket dial path creates a `ws.connect` span around the gRPC-based handshake.
No other subsystems currently create spans, so data-plane operations, blueprint fetches,
Docker discovery, and WireGuard reconfiguration happen without trace context.
## Guideline & best-practice alignment
The implementation adheres to most OTel Go recommendations:
- **Naming & units:** Every instrument follows the `newt_*` prefix with `_total`
suffixes for counters and `_seconds`/`_bytes` unit conventions. Histograms are
registered with explicit second-based buckets.
- **Resource attributes:** Service name/version and optional `site_id`/`region`
populate the `resource.Resource`. Metric labels mirror these by default (and on
per-site gauges) but can be disabled with `NEWT_METRICS_INCLUDE_SITE_LABELS=false`
to avoid unnecessary cardinality growth.
- **Attribute hygiene:** A single attribute filter (`sdkmetric.WithView`) enforces
the allow-list of label keys to prevent accidental high-cardinality emission.
- **Runtime metrics:** Go runtime instrumentation is enabled automatically through
`runtime.Start`.
- **Configuration via environment:** `telemetry.FromEnv` honours `OTEL_*` variables
alongside `NEWT_*` overrides so operators can configure exporters without code
changes.
- **Shutdown handling:** `Setup.Shutdown` iterates exporters in reverse order to
flush buffers before process exit.
## Adjustments & improvements
The review identified a few actionable adjustments:
1. **Record registration failures:** `newt_site_registrations_total` is currently
incremented only on success. Emit `result="failure"` samples whenever Pangolin
rejects a registration or credential exchange so operators can alert on churn.
2. **Surface config reload failures:** `telemetry.IncConfigReload` is invoked with
`result="success"` only. Callers should record a failure result when blueprint
parsing or application aborts before success counters are incremented.
3. **Expose robust uptime:** Document using `time() - process_start_time_seconds`
to derive uptime now that the restart counter has been replaced with a timestamp
gauge.
4. **Propagate contexts where available:** Many emitters call metric helpers with
`context.Background()`. Passing real contexts (when inexpensive) would allow future
exporters to correlate spans and metrics.
5. **Extend tracing coverage:** Instrument critical flows such as blueprint fetches,
WireGuard reconfiguration, proxy accept loops, and Docker discovery to provide
end-to-end visibility when OTLP tracing is enabled.
## Metrics to add before release
Prioritised additions that would close visibility gaps:
1. **Config reload error taxonomy:** Split reload attempts into a dedicated
`newt_config_reload_errors_total{phase}` counter to make blueprint validation failures
visible alongside the existing success counter.
2. **Config source visibility:** Export `newt_config_source_info{source,version}` so
operators can audit the active blueprint origin/commit during incidents.
3. **Certificate expiry:** Emit `newt_cert_expiry_timestamp_seconds` (per cert) to
enable proactive alerts before mTLS credentials lapse.
4. **Blueprint/config pull latency:** Measuring Pangolin blueprint fetch durations and
HTTP status distribution would expose slow control-plane operations.
5. **Tunnel setup latency:** Histograms for DNS resolution and tunnel handshakes would
help correlate connect latency spikes with network dependencies.
These metrics rely on data that is already available in the code paths mentioned
above and would round out operational dashboards.
## Tracing wishlist
To benefit from tracing when OTLP is active, add spans around:
- Pangolin REST calls (wrap the HTTP client with `otelhttp.NewTransport`).
- Docker discovery cycles and target registration callbacks.
- WireGuard reconfiguration (interface bring-up, peer updates).
- Proxy dial/accept loops for both TCP and UDP targets.
Capturing these stages will let operators correlate latency spikes with reconnects
and proxy drops using distributed traces in addition to the metric signals.


@@ -0,0 +1,898 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"decimals": 0,
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 500
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "value_and_name"
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "go_goroutine_count",
"instant": true,
"legendFormat": "",
"refId": "A"
}
],
"title": "Goroutines",
"transformations": [],
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"decimals": 1,
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "orange",
"value": 256
},
{
"color": "red",
"value": 512
}
]
},
"unit": "bytes"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 6,
"y": 0
},
"id": 2,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "value_and_name"
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "go_memory_gc_goal_bytes / 1024 / 1024",
"format": "time_series",
"instant": true,
"legendFormat": "",
"refId": "A"
}
],
"title": "GC Target Heap (MiB)",
"transformations": [],
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"decimals": 2,
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "orange",
"value": 10
},
{
"color": "red",
"value": 25
}
]
},
"unit": "ops"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 12,
"y": 0
},
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "value_and_name"
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(http_server_request_duration_seconds_count[$__rate_interval]))",
"instant": false,
"legendFormat": "req/s",
"refId": "A"
}
],
"title": "HTTP Requests / s",
"transformations": [],
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"decimals": 3,
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "orange",
"value": 0.1
},
{
"color": "red",
"value": 0.5
}
]
},
"unit": "ops"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 18,
"y": 0
},
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "value_and_name"
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(newt_connection_errors_total{site_id=~\"$site_id\"}[$__rate_interval]))",
"instant": false,
"legendFormat": "errors/s",
"refId": "A"
}
],
"title": "Connection Errors / s",
"transformations": [],
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "bytes"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 7
},
"id": 5,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(go_memory_used_bytes)",
"legendFormat": "Used",
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "go_memory_gc_goal_bytes",
"legendFormat": "GC Goal",
"refId": "B"
}
],
"title": "Go Heap Usage vs GC Goal",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"decimals": 0,
"mappings": [],
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 7
},
"id": 6,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "rate(go_memory_allocations_total[$__rate_interval])",
"legendFormat": "Allocations/s",
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "rate(go_memory_allocated_bytes_total[$__rate_interval])",
"legendFormat": "Allocated bytes/s",
"refId": "B"
}
],
"title": "Allocation Activity",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 16
},
"id": 7,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.5, sum(rate(http_server_request_duration_seconds_bucket[$__rate_interval])) by (le))",
"legendFormat": "p50",
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.95, sum(rate(http_server_request_duration_seconds_bucket[$__rate_interval])) by (le))",
"legendFormat": "p95",
"refId": "B"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.99, sum(rate(http_server_request_duration_seconds_bucket[$__rate_interval])) by (le))",
"legendFormat": "p99",
"refId": "C"
}
],
"title": "HTTP Request Duration Quantiles",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "ops"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 16
},
"id": 8,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(http_server_request_duration_seconds_count[$__rate_interval])) by (http_response_status_code)",
"legendFormat": "{{http_response_status_code}}",
"refId": "A"
}
],
"title": "HTTP Requests by Status",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "ops"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 25
},
"id": 9,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(newt_connection_attempts_total{site_id=~\"$site_id\"}[$__rate_interval])) by (transport, result)",
"legendFormat": "{{transport}} • {{result}}",
"refId": "A"
}
],
"title": "Connection Attempts by Transport/Result",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "ops"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 25
},
"id": 10,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(newt_connection_errors_total{site_id=~\"$site_id\"}[$__rate_interval])) by (transport, error_type)",
"legendFormat": "{{transport}} • {{error_type}}",
"refId": "A"
}
],
"title": "Connection Errors by Type",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"decimals": 3,
"mappings": [],
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 34
},
"id": 11,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.5, sum(rate(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\", tunnel_id=~\"$tunnel_id\"}[$__rate_interval])) by (le))",
"legendFormat": "p50",
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.95, sum(rate(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\", tunnel_id=~\"$tunnel_id\"}[$__rate_interval])) by (le))",
"legendFormat": "p95",
"refId": "B"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "histogram_quantile(0.99, sum(rate(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\", tunnel_id=~\"$tunnel_id\"}[$__rate_interval])) by (le))",
"legendFormat": "p99",
"refId": "C"
}
],
"title": "Tunnel Latency Quantiles",
"type": "timeseries"
},
{
"cards": {},
"color": {
"cardColor": "#b4ff00",
"colorScale": "sqrt",
"colorScheme": "interpolateTurbo"
},
"dataFormat": "tsbuckets",
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"custom": {},
"mappings": [],
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 34
},
"heatmap": {},
"hideZeroBuckets": true,
"id": 12,
"legend": {
"show": false
},
"options": {
"calculate": true,
"cellGap": 2,
"cellSize": "auto",
"color": {
"exponent": 0.5
},
"exemplars": {
"color": "rgba(255,255,255,0.7)"
},
"filterValues": {
"le": 1e-9
},
"legend": {
"show": false
},
"tooltip": {
"mode": "single",
"show": true
},
"xAxis": {
"show": true
},
"yAxis": {
"decimals": 3,
"show": true
}
},
"pluginVersion": "11.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"editorMode": "code",
"expr": "sum(rate(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\", tunnel_id=~\"$tunnel_id\"}[$__rate_interval])) by (le)",
"format": "heatmap",
"legendFormat": "{{le}}",
"refId": "A"
}
],
"title": "Tunnel Latency Bucket Rate",
"type": "heatmap"
}
],
"refresh": "30s",
"schemaVersion": 39,
"style": "dark",
"tags": [
"newt",
"otel"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "Prometheus",
"value": "prometheus"
},
"hide": 0,
"label": "Datasource",
"name": "DS_PROMETHEUS",
"options": [],
"query": "prometheus",
"refresh": 1,
"type": "datasource"
},
{
"current": {
"selected": false,
"text": "All",
"value": "$__all"
},
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"definition": "label_values(target_info, site_id)",
"hide": 0,
"includeAll": true,
"label": "Site",
"multi": true,
"name": "site_id",
"options": [],
"query": {
"query": "label_values(target_info, site_id)",
"refId": "SiteIdVar"
},
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": false,
"text": "All",
"value": "$__all"
},
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"definition": "label_values(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\"}, tunnel_id)",
"hide": 0,
"includeAll": true,
"label": "Tunnel",
"multi": true,
"name": "tunnel_id",
"options": [],
"query": {
"query": "label_values(newt_tunnel_latency_seconds_bucket{site_id=~\"$site_id\"}, tunnel_id)",
"refId": "TunnelVar"
},
"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "Newt Overview",
"uid": "newt-overview",
"version": 1,
"weekStart": ""
}


@@ -0,0 +1,9 @@
apiVersion: 1
providers:
- name: "newt"
folder: "Newt"
type: file
disableDeletion: false
allowUiUpdates: true
options:
path: /var/lib/grafana/dashboards


@@ -0,0 +1,9 @@
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
url: http://prometheus:9090
uid: prometheus
isDefault: true
editable: true


@@ -0,0 +1,61 @@
# Variant A: Direct scrape of Newt (/metrics) via Prometheus (no Collector needed)
# Note: Newt already exposes labels like site_id, protocol, direction. Do not promote
# resource attributes into labels when scraping Newt directly.
#
# Example Prometheus scrape config:
# global:
# scrape_interval: 15s
# scrape_configs:
# - job_name: newt
# static_configs:
# - targets: ["newt:2112"]
#
# Variant B: Use OTEL Collector (Newt -> OTLP -> Collector -> Prometheus)
# This pipeline scrapes metrics from the Collector's Prometheus exporter.
# Labels are already on datapoints; promotion from resource is OPTIONAL and typically NOT required.
# If you enable transform/promote below, ensure you do not duplicate labels.
receivers:
otlp:
protocols:
grpc:
endpoint: ":4317"
processors:
memory_limiter:
check_interval: 5s
limit_percentage: 80
spike_limit_percentage: 25
resourcedetection:
detectors: [env, system]
timeout: 5s
batch: {}
# OPTIONAL: Only enable if you need to promote resource attributes to labels.
# WARNING: Newt already provides site_id as a label; avoid double-promotion.
# transform/promote:
# error_mode: ignore
# metric_statements:
# - context: datapoint
# statements:
# - set(attributes["service_instance_id"], resource.attributes["service.instance.id"]) where resource.attributes["service.instance.id"] != nil
# - set(attributes["site_id"], resource.attributes["site_id"]) where resource.attributes["site_id"] != nil
exporters:
prometheus:
endpoint: ":8889"
send_timestamps: true
# prometheusremotewrite:
# endpoint: http://mimir:9009/api/v1/push
debug:
verbosity: basic
service:
pipelines:
metrics:
receivers: [otlp]
processors: [memory_limiter, resourcedetection, batch] # add transform/promote if you really need it
exporters: [prometheus]
traces:
receivers: [otlp]
processors: [memory_limiter, resourcedetection, batch]
exporters: [debug]


@@ -0,0 +1,16 @@
global:
scrape_interval: 15s
scrape_configs:
# IMPORTANT: Do not scrape Newt directly; scrape only the Collector!
- job_name: 'otel-collector'
static_configs:
- targets: ['otel-collector:8889']
# optional: limit metric cardinality
metric_relabel_configs:
- action: labeldrop
regex: 'tunnel_id'
# - action: keep
# source_labels: [site_id]
# regex: '(site-a|site-b)'

examples/prometheus.yml Normal file

@@ -0,0 +1,21 @@
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'newt'
scrape_interval: 15s
static_configs:
- targets: ['newt:2112'] # /metrics
metric_relabel_configs:
# optional: drop tunnel_id
- action: labeldrop
regex: 'tunnel_id'
# optional: allow only specific sites
- action: keep
source_labels: [site_id]
regex: '(site-a|site-b)'
# WARNING: Do not enable this together with the 'newt' job above or you will double-count.
# - job_name: 'otel-collector'
# static_configs:
# - targets: ['otel-collector:8889']

flake.lock generated

@@ -2,11 +2,11 @@
"nodes": {
"nixpkgs": {
"locked": {
"lastModified": 1752308619,
"narHash": "sha256-pzrVLKRQNPrii06Rm09Q0i0dq3wt2t2pciT/GNq5EZQ=",
"lastModified": 1756217674,
"narHash": "sha256-TH1SfSP523QI7kcPiNtMAEuwZR3Jdz0MCDXPs7TS8uo=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "650e572363c091045cdbc5b36b0f4c1f614d3058",
"rev": "4e7667a90c167f7a81d906e5a75cba4ad8bee620",
"type": "github"
},
"original": {


@@ -22,17 +22,25 @@
system:
let
pkgs = pkgsFor system;
# Update version when releasing
version = "1.4.2";
# Update the version in a new source tree
srcWithReplacedVersion = pkgs.runCommand "newt-src-with-version" { } ''
cp -r ${./.} $out
chmod -R +w $out
rm -rf $out/.git $out/result $out/.envrc $out/.direnv
sed -i "s/version_replaceme/${version}/g" $out/main.go
'';
in
{
default = self.packages.${system}.pangolin-newt;
pangolin-newt = pkgs.buildGoModule {
pname = "pangolin-newt";
version = "1.3.2";
src = ./.;
vendorHash = "sha256-Y/f7GCO7Kf1iQiDR32DIEIGJdcN+PKS0OrhBvXiHvwo=";
version = version;
src = srcWithReplacedVersion;
vendorHash = "sha256-PENsCO2yFxLVZNPgx2OP+gWVNfjJAfXkwWS7tzlm490=";
meta = with pkgs.lib; {
description = "A tunneling client for Pangolin";
homepage = "https://github.com/fosrl/newt";

go.mod

@@ -1,36 +1,51 @@
module github.com/fosrl/newt
go 1.23.1
toolchain go1.23.2
go 1.25
require (
github.com/docker/docker v28.3.2+incompatible
github.com/docker/docker v28.5.0+incompatible
github.com/google/gopacket v1.1.19
github.com/gorilla/websocket v1.5.3
github.com/prometheus/client_golang v1.23.2
github.com/vishvananda/netlink v1.3.1
golang.org/x/crypto v0.40.0
golang.org/x/exp v0.0.0-20250218142911-aa4b98e5adaa
golang.org/x/net v0.42.0
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0
go.opentelemetry.io/contrib/instrumentation/runtime v0.63.0
go.opentelemetry.io/otel v1.38.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.38.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.38.0
go.opentelemetry.io/otel/exporters/prometheus v0.60.0
go.opentelemetry.io/otel/metric v1.38.0
go.opentelemetry.io/otel/sdk v1.38.0
go.opentelemetry.io/otel/sdk/metric v1.38.0
golang.org/x/crypto v0.43.0
golang.org/x/net v0.45.0
golang.org/x/exp v0.0.0-20250718183923-645b1fa84792
golang.zx2c4.com/wireguard v0.0.0-20250521234502-f333402bd9cb
golang.zx2c4.com/wireguard/wgctrl v0.0.0-20241231184526-a9ab2273dd10
gopkg.in/yaml.v3 v3.0.1
google.golang.org/grpc v1.76.0
gvisor.dev/gvisor v0.0.0-20250503011706-39ed1f5ac29c
software.sslmate.com/src/go-pkcs12 v0.5.0
software.sslmate.com/src/go-pkcs12 v0.6.0
)
require (
github.com/Microsoft/go-winio v0.6.0 // indirect
github.com/containerd/errdefs v1.0.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v5 v5.0.3 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/containerd/errdefs v0.3.0 // indirect
github.com/containerd/errdefs/pkg v0.3.0 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/docker/go-connections v0.5.0 // indirect
github.com/docker/go-connections v0.6.0 // indirect
github.com/docker/go-units v0.4.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/google/btree v1.1.2 // indirect
github.com/google/btree v1.1.3 // indirect
github.com/google/go-cmp v0.7.0 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.2 // indirect
github.com/josharian/native v1.1.0 // indirect
github.com/mdlayher/genetlink v1.3.2 // indirect
github.com/mdlayher/netlink v1.7.2 // indirect
@@ -39,20 +54,33 @@ require (
github.com/moby/sys/atomicwriter v0.1.0 // indirect
github.com/moby/term v0.5.2 // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/image-spec v1.1.1 // indirect
github.com/opencontainers/image-spec v1.1.0 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/prometheus/client_model v0.6.2 // indirect
github.com/prometheus/common v0.66.1 // indirect
github.com/prometheus/otlptranslator v0.0.2 // indirect
github.com/prometheus/procfs v0.17.0 // indirect
github.com/vishvananda/netns v0.0.5 // indirect
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 // indirect
go.opentelemetry.io/otel v1.36.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.38.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.38.0 // indirect
go.opentelemetry.io/otel/trace v1.38.0 // indirect
go.opentelemetry.io/proto/otlp v1.7.1 // indirect
go.yaml.in/yaml/v2 v2.4.2 // indirect
golang.org/x/mod v0.28.0 // indirect
golang.org/x/sync v0.17.0 // indirect
golang.org/x/sys v0.37.0 // indirect
golang.org/x/text v0.30.0 // indirect
golang.org/x/time v0.12.0 // indirect
golang.org/x/tools v0.37.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.62.0 // indirect
go.opentelemetry.io/otel v1.37.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.36.0 // indirect
go.opentelemetry.io/otel/metric v1.36.0 // indirect
go.opentelemetry.io/otel/trace v1.36.0 // indirect
golang.org/x/mod v0.23.0 // indirect
golang.org/x/sync v0.11.0 // indirect
golang.org/x/sys v0.34.0 // indirect
golang.org/x/time v0.7.0 // indirect
golang.org/x/tools v0.30.0 // indirect
go.opentelemetry.io/otel/metric v1.37.0 // indirect
golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250825161204-c5933d9347a5 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20250825161204-c5933d9347a5 // indirect
google.golang.org/protobuf v1.36.8 // indirect
)

go.sum

@@ -2,11 +2,14 @@ github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c h1:udKWzYgxTojEK
github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/Microsoft/go-winio v0.6.0 h1:slsWYD/zyx7lCXoZVlvQrj0hPTM1HI4+v1sIda2yDvg=
github.com/Microsoft/go-winio v0.6.0/go.mod h1:cTAf44im0RAYeL23bpB+fzCyDH2MJiz2BO69KH/soAE=
github.com/cenkalti/backoff v2.2.1+incompatible h1:tNowT99t7UNflLxfYYSlKYsBpXdEet03Pg2g16Swow4=
github.com/cenkalti/backoff/v5 v5.0.2 h1:rIfFVxEf1QsI7E1ZHfp/B4DF/6QBAUhmgkxc0H7Zss8=
github.com/cenkalti/backoff/v5 v5.0.2/go.mod h1:rkhZdG3JZukswDf7f0cwqPNk4K0sa+F97BxZthm/crw=
github.com/containerd/errdefs v1.0.0 h1:tg5yIfIlQIrxYtu9ajqY42W3lpS19XqdxRQeEwYG8PI=
github.com/containerd/errdefs v1.0.0/go.mod h1:+YBYIdtsnF4Iw6nWZhJcqGSg/dwvV7tyJ/kCkyJ2k+M=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/cenkalti/backoff/v5 v5.0.3 h1:ZN+IMa753KfX5hd8vVaMixjnqRZ3y8CuJKRKj1xcsSM=
github.com/cenkalti/backoff/v5 v5.0.3/go.mod h1:rkhZdG3JZukswDf7f0cwqPNk4K0sa+F97BxZthm/crw=
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/containerd/errdefs v0.3.0 h1:FSZgGOeK4yuT/+DnF07/Olde/q4KBoMsaamhXxIMDp4=
github.com/containerd/errdefs v0.3.0/go.mod h1:+YBYIdtsnF4Iw6nWZhJcqGSg/dwvV7tyJ/kCkyJ2k+M=
github.com/containerd/errdefs/pkg v0.3.0 h1:9IKJ06FvyNlexW690DXuQNx2KA2cUJXx151Xdx3ZPPE=
github.com/containerd/errdefs/pkg v0.3.0/go.mod h1:NJw6s9HwNuRhnjJhM7pylWwMyAkmCQvQ4GpJHEqRLVk=
github.com/containerd/log v0.1.0 h1:TCJt7ioM2cr/tfR8GPbGf9/VRAX8D2B4PjzCpfX540I=
@@ -15,23 +18,25 @@ github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5QvfrDyIgxBk=
github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
github.com/docker/docker v28.3.2+incompatible h1:wn66NJ6pWB1vBZIilP8G3qQPqHy5XymfYn5vsqeA5oA=
github.com/docker/docker v28.3.2+incompatible/go.mod h1:eEKB0N0r5NX/I1kEveEz05bcu8tLC/8azJZsviup8Sk=
github.com/docker/go-connections v0.5.0 h1:USnMq7hx7gwdVZq1L49hLXaFtUdTADjXGp+uj1Br63c=
github.com/docker/go-connections v0.5.0/go.mod h1:ov60Kzw0kKElRwhNs9UlUHAE/F9Fe6GLaXnqyDdmEXc=
github.com/docker/docker v28.5.0+incompatible h1:ZdSQoRUE9XxhFI/B8YLvhnEFMmYN9Pp8Egd2qcaFk1E=
github.com/docker/docker v28.5.0+incompatible/go.mod h1:eEKB0N0r5NX/I1kEveEz05bcu8tLC/8azJZsviup8Sk=
github.com/docker/go-connections v0.6.0 h1:LlMG9azAe1TqfR7sO+NJttz1gy6KO7VJBh+pMmjSD94=
github.com/docker/go-connections v0.6.0/go.mod h1:AahvXYshr6JgfUJGdDCs2b5EZG/vmaMAntpSFH5BFKE=
github.com/docker/go-units v0.4.0 h1:3uh0PgVws3nIA0Q+MwDC8yjEPf9zjRfZZWXZYDct3Tw=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4=
github.com/docker/go-units v0.5.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg=
github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U=
github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/logr v1.4.3 h1:CjnDlHq8ikf6E492q6eKboGOC0T8CDaOvkHCIg8idEI=
github.com/go-logr/logr v1.4.3/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/google/btree v1.1.2 h1:xf4v41cLI2Z6FxbKm+8Bu+m8ifhj15JuZ9sa0jZCMUU=
github.com/google/btree v1.1.2/go.mod h1:qOPhT0dTNdNzV6Z/lhRX0YXUafgPLFUh+gZMl761Gm4=
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
github.com/google/btree v1.1.3 h1:CVpQJjYgC4VbzxeGVHfvZrv1ctoYCAI8vbl07Fcxlyg=
github.com/google/btree v1.1.3/go.mod h1:qOPhT0dTNdNzV6Z/lhRX0YXUafgPLFUh+gZMl761Gm4=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/gopacket v1.1.19 h1:ves8RnFZPGiFnTS0uPQStjwru6uO6h+nlr9j6fL7kF8=
@@ -40,12 +45,20 @@ github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.3 h1:5ZPtiqj0JL5oKWmcsq4VMaAW5ukBEgSGXEN89zeH1Jo=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.3/go.mod h1:ndYquD05frm2vACXE1nsccT4oJzjhw2arTS2cpUD1PI=
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc h1:GN2Lv3MGO7AS6PrRoT6yV5+wkrOpcszoIsO4+4ds248=
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc/go.mod h1:+JKpmjMGhpgPL+rXZ5nsZieVzvarn86asRlBg4uNGnk=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.2 h1:8Tjv8EJ+pM1xP8mK6egEbD1OgnVTyacbefKhmbLhIhU=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.2/go.mod h1:pkJQ2tZHJ0aFOVEEot6oZmaVEZcRme73eIFmhiVuRWs=
github.com/josharian/native v1.1.0 h1:uuaP0hAbW7Y4l0ZRQ6C9zfb7Mg1mbFKry/xzDAfmtLA=
github.com/josharian/native v1.1.0/go.mod h1:7X/raswPFr05uY3HiLlYeyQntB6OO7E/d2Cu7qoaN2w=
github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/klauspost/compress v1.18.0 h1:c/Cqfb0r+Yi+JtIEq73FWXVkRonBlf0CRNYc8Zttxdo=
github.com/klauspost/compress v1.18.0/go.mod h1:2Pp+KzxcywXVXMr50+X0Q/Lsb43OQHYWRCY2AiWywWQ=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0SNc=
github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
github.com/mdlayher/genetlink v1.3.2 h1:KdrNKe+CTu+IbZnm/GVUMXSqBBLqcGpRDa0xkQy56gw=
github.com/mdlayher/genetlink v1.3.2/go.mod h1:tcC3pkCrPUGIKKsCsp0B3AdaaKuHtaxoJRz3cc+528o=
github.com/mdlayher/netlink v1.7.2 h1:/UtM3ofJap7Vl4QWCPDGXY8d3GIY2UGSDbK+QWmY8/g=
@@ -64,112 +77,126 @@ github.com/moby/term v0.5.2 h1:6qk3FJAFDs6i/q3W/pQ97SX192qKfZgGjCQqfCJkgzQ=
github.com/moby/term v0.5.2/go.mod h1:d3djjFCrjnB+fl8NJux+EJzu0msscUP+f8it8hPkFLc=
github.com/morikuni/aec v1.0.0 h1:nP9CBfwrvYnBRgY6qfDQkygYDmYwOilePFkwzv4dU8A=
github.com/morikuni/aec v1.0.0/go.mod h1:BbKIizmSmc5MMPqRYbxO4ZU0S0+P200+tUnFx7PXmsc=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U=
github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM=
github.com/opencontainers/image-spec v1.1.1 h1:y0fUlFfIZhPF1W537XOLg0/fcx6zcHCJwooC2xJA040=
github.com/opencontainers/image-spec v1.1.1/go.mod h1:qpqAh3Dmcf36wStyyWU+kCeDgrGnAve2nCC8+7h8Q0M=
github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQb2IpWsCzug=
github.com/opencontainers/image-spec v1.1.0/go.mod h1:W4s4sFTMaBeK1BQLXbG4AdM2szdn85PY75RI83NrTrM=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/prometheus/client_golang v1.23.2 h1:Je96obch5RDVy3FDMndoUsjAhG5Edi49h0RJWRi/o0o=
github.com/prometheus/client_golang v1.23.2/go.mod h1:Tb1a6LWHB3/SPIzCoaDXI4I8UHKeFTEQ1YCr+0Gyqmg=
github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk=
github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE=
github.com/prometheus/common v0.66.1 h1:h5E0h5/Y8niHc5DlaLlWLArTQI7tMrsfQjHV+d9ZoGs=
github.com/prometheus/common v0.66.1/go.mod h1:gcaUsgf3KfRSwHY4dIMXLPV0K/Wg1oZ8+SbZk/HH/dA=
github.com/prometheus/otlptranslator v0.0.2 h1:+1CdeLVrRQ6Psmhnobldo0kTp96Rj80DRXRd5OSnMEQ=
github.com/prometheus/otlptranslator v0.0.2/go.mod h1:P8AwMgdD7XEr6QRUJ2QWLpiAZTgTE2UYgjlu3svompI=
github.com/prometheus/procfs v0.17.0 h1:FuLQ+05u4ZI+SS/w9+BWEM2TXiHKsUQ9TADiRH7DuK0=
github.com/prometheus/procfs v0.17.0/go.mod h1:oPQLaDAMRbA+u8H5Pbfq+dl3VDAvHxMUOVhe0wYB2zw=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
github.com/vishvananda/netlink v1.3.1 h1:3AEMt62VKqz90r0tmNhog0r/PpWKmrEShJU0wJW6bV0=
github.com/vishvananda/netlink v1.3.1/go.mod h1:ARtKouGSTGchR8aMwmkzC0qiNPrrWO5JS/XMVl45+b4=
github.com/vishvananda/netns v0.0.5 h1:DfiHV+j8bA32MFM7bfEunvT8IAqQ/NzSJHtcmW5zdEY=
github.com/vishvananda/netns v0.0.5/go.mod h1:SpkAiCQRtJ6TvvxPnOSyH3BMl6unz3xZlaprSwhNNJM=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
go.opentelemetry.io/auto/sdk v1.1.0 h1:cH53jehLUN6UFLY71z+NDOiNJqDdPRaXzTel0sJySYA=
go.opentelemetry.io/auto/sdk v1.1.0/go.mod h1:3wSPjt5PWp2RhlCcmmOial7AvC4DQqZb7a7wCow3W8A=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 h1:F7Jx+6hwnZ41NSFTO5q4LYDtJRXBf2PD0rNBkeB/lus=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0/go.mod h1:UHB22Z8QsdRDrnAtX4PntOl36ajSxcdUMt1sF7Y6E7Q=
go.opentelemetry.io/otel v1.36.0 h1:UumtzIklRBY6cI/lllNZlALOF5nNIzJVb16APdvgTXg=
go.opentelemetry.io/otel v1.36.0/go.mod h1:/TcFMXYjyRNh8khOAO9ybYkqaDBb/70aVwkNML4pP8E=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.36.0 h1:dNzwXjZKpMpE2JhmO+9HsPl42NIXFIFSUSSs0fiqra0=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.36.0/go.mod h1:90PoxvaEB5n6AOdZvi+yWJQoE95U8Dhhw2bSyRqnTD0=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.36.0 h1:nRVXXvf78e00EwY6Wp0YII8ww2JVWshZ20HfTlE11AM=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.36.0/go.mod h1:r49hO7CgrxY9Voaj3Xe8pANWtr0Oq916d0XAmOoCZAQ=
go.opentelemetry.io/otel/metric v1.36.0 h1:MoWPKVhQvJ+eeXWHFBOPoBOi20jh6Iq2CcCREuTYufE=
go.opentelemetry.io/otel/metric v1.36.0/go.mod h1:zC7Ks+yeyJt4xig9DEw9kuUFe5C3zLbVjV2PzT6qzbs=
go.opentelemetry.io/otel/sdk v1.36.0 h1:b6SYIuLRs88ztox4EyrvRti80uXIFy+Sqzoh9kFULbs=
go.opentelemetry.io/otel/sdk v1.36.0/go.mod h1:+lC+mTgD+MUWfjJubi2vvXWcVxyr9rmlshZni72pXeY=
go.opentelemetry.io/otel/sdk/metric v1.36.0 h1:r0ntwwGosWGaa0CrSt8cuNuTcccMXERFwHX4dThiPis=
go.opentelemetry.io/otel/sdk/metric v1.36.0/go.mod h1:qTNOhFDfKRwX0yXOqJYegL5WRaW376QbB7P4Pb0qva4=
go.opentelemetry.io/otel/trace v1.36.0 h1:ahxWNuqZjpdiFAyrIoQ4GIiAIhxAunQR6MUoKrsNd4w=
go.opentelemetry.io/otel/trace v1.36.0/go.mod h1:gQ+OnDZzrybY4k4seLzPAWNwVBBVlF2szhehOBB/tGA=
go.opentelemetry.io/proto/otlp v1.6.0 h1:jQjP+AQyTf+Fe7OKj/MfkDrmK4MNVtw2NpXsf9fefDI=
go.opentelemetry.io/proto/otlp v1.6.0/go.mod h1:cicgGehlFuNdgZkcALOCh3VE6K/u2tAjzlRhDwmVpZc=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0 h1:RbKq8BG0FI8OiXhBfcRtqqHcZcka+gU3cskNuf05R18=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0/go.mod h1:h06DGIukJOevXaj/xrNjhi/2098RZzcLTbc0jDAUbsg=
go.opentelemetry.io/contrib/instrumentation/runtime v0.63.0 h1:PeBoRj6af6xMI7qCupwFvTbbnd49V7n5YpG6pg8iDYQ=
go.opentelemetry.io/contrib/instrumentation/runtime v0.63.0/go.mod h1:ingqBCtMCe8I4vpz/UVzCW6sxoqgZB37nao91mLQ3Bw=
go.opentelemetry.io/otel v1.38.0 h1:RkfdswUDRimDg0m2Az18RKOsnI8UDzppJAtj01/Ymk8=
go.opentelemetry.io/otel v1.38.0/go.mod h1:zcmtmQ1+YmQM9wrNsTGV/q/uyusom3P8RxwExxkZhjM=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.38.0 h1:vl9obrcoWVKp/lwl8tRE33853I8Xru9HFbw/skNeLs8=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.38.0/go.mod h1:GAXRxmLJcVM3u22IjTg74zWBrRCKq8BnOqUVLodpcpw=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.38.0 h1:GqRJVj7UmLjCVyVJ3ZFLdPRmhDUp2zFmQe3RHIOsw24=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.38.0/go.mod h1:ri3aaHSmCTVYu2AWv44YMauwAQc0aqI9gHKIcSbI1pU=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.38.0 h1:lwI4Dc5leUqENgGuQImwLo4WnuXFPetmPpkLi2IrX54=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.38.0/go.mod h1:Kz/oCE7z5wuyhPxsXDuaPteSWqjSBD5YaSdbxZYGbGk=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.38.0 h1:aTL7F04bJHUlztTsNGJ2l+6he8c+y/b//eR0jjjemT4=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.38.0/go.mod h1:kldtb7jDTeol0l3ewcmd8SDvx3EmIE7lyvqbasU3QC4=
go.opentelemetry.io/otel/exporters/prometheus v0.60.0 h1:cGtQxGvZbnrWdC2GyjZi0PDKVSLWP/Jocix3QWfXtbo=
go.opentelemetry.io/otel/exporters/prometheus v0.60.0/go.mod h1:hkd1EekxNo69PTV4OWFGZcKQiIqg0RfuWExcPKFvepk=
go.opentelemetry.io/otel/metric v1.38.0 h1:Kl6lzIYGAh5M159u9NgiRkmoMKjvbsKtYRwgfrA6WpA=
go.opentelemetry.io/otel/metric v1.38.0/go.mod h1:kB5n/QoRM8YwmUahxvI3bO34eVtQf2i4utNVLr9gEmI=
go.opentelemetry.io/otel/sdk v1.38.0 h1:l48sr5YbNf2hpCUj/FoGhW9yDkl+Ma+LrVl8qaM5b+E=
go.opentelemetry.io/otel/sdk v1.38.0/go.mod h1:ghmNdGlVemJI3+ZB5iDEuk4bWA3GkTpW+DOoZMYBVVg=
go.opentelemetry.io/otel/sdk/metric v1.38.0 h1:aSH66iL0aZqo//xXzQLYozmWrXxyFkBJ6qT5wthqPoM=
go.opentelemetry.io/otel/sdk/metric v1.38.0/go.mod h1:dg9PBnW9XdQ1Hd6ZnRz689CbtrUp0wMMs9iPcgT9EZA=
go.opentelemetry.io/otel/trace v1.38.0 h1:Fxk5bKrDZJUH+AMyyIXGcFAPah0oRcT+LuNtJrmcNLE=
go.opentelemetry.io/otel/trace v1.38.0/go.mod h1:j1P9ivuFsTceSWe1oY+EeW3sc+Pp42sO++GHkg4wwhs=
go.opentelemetry.io/proto/otlp v1.7.1 h1:gTOMpGDb0WTBOP8JaO72iL3auEZhVmAQg4ipjOVAtj4=
go.opentelemetry.io/proto/otlp v1.7.1/go.mod h1:b2rVh6rfI/s2pHWNlB7ILJcRALpcNDzKhACevjI+ZnE=
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
go.yaml.in/yaml/v2 v2.4.2 h1:DzmwEr2rDGHl7lsFgAHxmNz/1NlQ7xLIrlN2h5d1eGI=
go.yaml.in/yaml/v2 v2.4.2/go.mod h1:081UH+NErpNdqlCXm3TtEran0rJZGxAYx9hb/ELlsPU=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.40.0 h1:r4x+VvoG5Fm+eJcxMaY8CQM7Lb0l1lsmjGBQ6s8BfKM=
golang.org/x/crypto v0.40.0/go.mod h1:Qr1vMER5WyS2dfPHAlsOj01wgLbsyWtFn/aY+5+ZdxY=
golang.org/x/exp v0.0.0-20250218142911-aa4b98e5adaa h1:t2QcU6V556bFjYgu4L6C+6VrCPyJZ+eyRsABUPs1mz4=
golang.org/x/exp v0.0.0-20250218142911-aa4b98e5adaa/go.mod h1:BHOTPb3L19zxehTsLoJXVaTktb06DFgmdW6Wb9s8jqk=
golang.org/x/crypto v0.43.0 h1:dduJYIi3A3KOfdGOHX8AVZ/jGiyPa3IbBozJ5kNuE04=
golang.org/x/crypto v0.43.0/go.mod h1:BFbav4mRNlXJL4wNeejLpWxB7wMbc79PdRGhWKncxR0=
golang.org/x/lint v0.0.0-20200302205851-738671d3881b/go.mod h1:3xt1FjdF8hUf6vQPIChWIBhFzV8gjjsPE/fR3IyQdNY=
golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg=
golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.23.0 h1:Zb7khfcRGKk+kqfxFaP5tZqCnDZMjC5VtUBs87Hr6QM=
golang.org/x/mod v0.23.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/mod v0.28.0 h1:gQBtGhjxykdjY9YhZpSlZIsbnaE2+PgjfLWUQTnoZ1U=
golang.org/x/mod v0.28.0/go.mod h1:yfB/L0NOf/kmEbXjzCPOx1iK1fRutOydrCMsqRhEBxI=
golang.org/x/exp v0.0.0-20250718183923-645b1fa84792 h1:R9PFI6EUdfVKgwKjZef7QIwGcBKu86OEFpJ9nUEP2l4=
golang.org/x/exp v0.0.0-20250718183923-645b1fa84792/go.mod h1:A+z0yzpGtvnG90cToK5n2tu8UJVP2XUATh+r+sfOOOc=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.42.0 h1:jzkYrhi3YQWD6MLBJcsklgQsoAcw89EcZbJw8Z614hs=
golang.org/x/net v0.42.0/go.mod h1:FF1RA5d3u7nAYA4z2TkclSCKh68eSXtiFwcWQpPXdt8=
golang.org/x/net v0.45.0 h1:RLBg5JKixCy82FtLJpeNlVM0nrSqpCRYzVU1n8kj0tM=
golang.org/x/net v0.45.0/go.mod h1:ECOoLqd5U3Lhyeyo/QDCEVQ4sNgYsqvCZ722XogGieY=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.11.0 h1:GGz8+XQP4FvTTrjZPzNKTMFtSXH80RAzG+5ghFPgK9w=
golang.org/x/sync v0.11.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.17.0 h1:l60nONMj9l5drqw6jlhIELNv9I0A4OFgRsG9k2oT9Ug=
golang.org/x/sync v0.17.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.10.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.34.0 h1:H5Y5sJ2L2JRdyv7ROF1he/lPdvFsd0mJHFw2ThKHxLA=
golang.org/x/sys v0.34.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/sys v0.37.0 h1:fdNQudmxPjkdUTPnLn5mdQv7Zwvbvpaxqs831goi9kQ=
golang.org/x/sys v0.37.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.27.0 h1:4fGWRpyh641NLlecmyl4LOe6yDdfaYNrGb2zdfo4JV4=
golang.org/x/text v0.27.0/go.mod h1:1D28KMCvyooCX9hBiosv5Tz/+YLxj0j7XhWjpSUF7CU=
golang.org/x/time v0.7.0 h1:ntUhktv3OPE6TgYxXWv9vKvUSJyIFJlyohwbkEwPrKQ=
golang.org/x/time v0.7.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/text v0.30.0 h1:yznKA/E9zq54KzlzBEAWn1NXSQ8DIp/NYMy88xJjl4k=
golang.org/x/text v0.30.0/go.mod h1:yDdHFIX9t+tORqspjENWgzaCVXgk0yYnYuSZ8UzzBVM=
golang.org/x/time v0.12.0 h1:ScB/8o8olJvc+CQPWrK3fPZNfh7qgwCrY0zJmoEQLSE=
golang.org/x/time v0.12.0/go.mod h1:CDIdPxbZBQxdj6cxyCIdrNogrJKMJ7pr37NYpMcMDSg=
golang.org/x/tools v0.0.0-20200130002326-2f3ba24bd6e7/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.37.0 h1:DVSRzp7FwePZW356yEAChSdNcQo6Nsp+fex1SUW09lE=
golang.org/x/tools v0.37.0/go.mod h1:MBN5QPQtLMHVdvsbtarmTNukZDdgwdwlO5qGacAzF0w=
golang.org/x/tools v0.0.0-20200130002326-2f3ba24bd6e7/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.30.0 h1:BgcpHewrV5AUp2G9MebG4XPFI1E2W41zU1SaqVA9vJY=
golang.org/x/tools v0.30.0/go.mod h1:c347cR/OJfw5TI+GfX7RUPNMdDRRbjvYTS0jPyvsVtY=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2 h1:B82qJJgjvYKsXS9jeunTOisW56dUokqW/FOteYJJ/yg=
golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2/go.mod h1:deeaetjYA+DHMHg+sMSMI58GrEteJUUzzw7en6TJQcI=
golang.zx2c4.com/wireguard v0.0.0-20250521234502-f333402bd9cb h1:whnFRlWMcXI9d+ZbWg+4sHnLp52d5yiIPUxMBSt4X9A=
golang.zx2c4.com/wireguard v0.0.0-20250521234502-f333402bd9cb/go.mod h1:rpwXGsirqLqN2L0JDJQlwOboGHmptD5ZD6T2VmcqhTw=
golang.zx2c4.com/wireguard/wgctrl v0.0.0-20241231184526-a9ab2273dd10 h1:3GDAcqdIg1ozBNLgPy4SLT84nfcBjr6rhGtXYtrkWLU=
golang.zx2c4.com/wireguard/wgctrl v0.0.0-20241231184526-a9ab2273dd10/go.mod h1:T97yPqesLiNrOYxkwmhMI0ZIlJDm+p0PMR8eRVeR5tQ=
gonum.org/v1/gonum v0.16.0 h1:5+ul4Swaf3ESvrOnidPp4GZbzf0mxVQpDCYUQE7OJfk=
gonum.org/v1/gonum v0.16.0/go.mod h1:fef3am4MQ93R2HHpKnLk4/Tbh/s0+wqD5nfa6Pnwy4E=
google.golang.org/genproto/googleapis/api v0.0.0-20250825161204-c5933d9347a5 h1:BIRfGDEjiHRrk0QKZe3Xv2ieMhtgRGeLcZQ0mIVn4EY=
google.golang.org/genproto/googleapis/api v0.0.0-20250825161204-c5933d9347a5/go.mod h1:j3QtIyytwqGr1JUDtYXwtMXWPKsEa5LtzIFN1Wn5WvE=
google.golang.org/genproto/googleapis/rpc v0.0.0-20250825161204-c5933d9347a5 h1:eaY8u2EuxbRv7c3NiGK0/NedzVsCcV6hDuU5qPX5EGE=
google.golang.org/genproto/googleapis/rpc v0.0.0-20250825161204-c5933d9347a5/go.mod h1:M4/wBTSeyLxupu3W3tJtOgB14jILAS/XWPSSa3TAlJc=
google.golang.org/grpc v1.76.0 h1:UnVkv1+uMLYXoIz6o7chp59WfQUYA2ex/BXQ9rHZu7A=
google.golang.org/grpc v1.76.0/go.mod h1:Ju12QI8M6iQJtbcsV+awF5a4hfJMLi4X0JLo94ULZ6c=
google.golang.org/protobuf v1.36.8 h1:xHScyCOEuuwZEc6UtSOvPbAT4zRh0xcNRYekJwfqyMc=
google.golang.org/protobuf v1.36.8/go.mod h1:fuxRtAxBytpl4zzqUh6/eyUujkJdNiuEkXntxiD/uRU=
google.golang.org/genproto v0.0.0-20230920204549-e6e6cdab5c13 h1:vlzZttNJGVqTsRFU9AmdnrcO1Znh8Ew9kCD//yjigk0=
google.golang.org/genproto/googleapis/api v0.0.0-20250519155744-55703ea1f237 h1:Kog3KlB4xevJlAcbbbzPfRG0+X9fdoGM+UBRKVz6Wr0=
google.golang.org/genproto/googleapis/api v0.0.0-20250519155744-55703ea1f237/go.mod h1:ezi0AVyMKDWy5xAncvjLWH7UcLBB5n7y2fQ8MzjJcto=
google.golang.org/genproto/googleapis/rpc v0.0.0-20250519155744-55703ea1f237 h1:cJfm9zPbe1e873mHJzmQ1nwVEeRDU/T1wXDK2kUSU34=
google.golang.org/genproto/googleapis/rpc v0.0.0-20250519155744-55703ea1f237/go.mod h1:qQ0YXyHHx3XkvlzUtpXDkS29lDSafHMZBAZDc03LQ3A=
google.golang.org/grpc v1.72.1 h1:HR03wO6eyZ7lknl75XlxABNVLLFc2PAb6mHlYh756mA=
google.golang.org/grpc v1.72.1/go.mod h1:wH5Aktxcg25y1I3w7H69nHfXdOG3UiadoBtjh3izSDM=
google.golang.org/protobuf v1.36.6 h1:z1NpPI8ku2WgiWnf+t9wTPsn6eP1L7ksHUlkfLvd9xY=
google.golang.org/protobuf v1.36.6/go.mod h1:jduwjTPXsFjZGTmRluh+L6NjiWu7pchiJ2/5YcXBHnY=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gotest.tools/v3 v3.4.0 h1:ZazjZUfuVeZGLAmlKKuyv3IKP5orXcwtOwDQH6YVr6o=
gotest.tools/v3 v3.4.0/go.mod h1:CtbdzLSsqVhDgMtKsx03ird5YTGB3ar27v0u/yKBW5g=
gvisor.dev/gvisor v0.0.0-20250503011706-39ed1f5ac29c h1:m/r7OM+Y2Ty1sgBQ7Qb27VgIMBW8ZZhT4gLnUyDIhzI=
gvisor.dev/gvisor v0.0.0-20250503011706-39ed1f5ac29c/go.mod h1:3r5CMtNQMKIvBlrmM9xWUNamjKBYPOWyXOjmg5Kts3g=
software.sslmate.com/src/go-pkcs12 v0.5.0 h1:EC6R394xgENTpZ4RltKydeDUjtlM5drOYIG9c6TVj2M=
software.sslmate.com/src/go-pkcs12 v0.5.0/go.mod h1:Qiz0EyvDRJjjxGyUQa2cCNZn/wMyzrRJ/qcDXOQazLI=
software.sslmate.com/src/go-pkcs12 v0.6.0 h1:f3sQittAeF+pao32Vb+mkli+ZyT+VwKaD014qFGq6oU=
software.sslmate.com/src/go-pkcs12 v0.6.0/go.mod h1:Qiz0EyvDRJjjxGyUQa2cCNZn/wMyzrRJ/qcDXOQazLI=

healthcheck/healthcheck.go (new file)

@@ -0,0 +1,517 @@
package healthcheck
import (
"context"
"crypto/tls"
"encoding/json"
"fmt"
"net/http"
"strings"
"sync"
"time"
"github.com/fosrl/newt/logger"
)
// Health represents the health status of a target
type Health int
const (
StatusUnknown Health = iota
StatusHealthy
StatusUnhealthy
)
func (s Health) String() string {
switch s {
case StatusHealthy:
return "healthy"
case StatusUnhealthy:
return "unhealthy"
default:
return "unknown"
}
}
// Config holds the health check configuration for a target
type Config struct {
ID int `json:"id"`
Enabled bool `json:"hcEnabled"`
Path string `json:"hcPath"`
Scheme string `json:"hcScheme"`
Mode string `json:"hcMode"`
Hostname string `json:"hcHostname"`
Port int `json:"hcPort"`
Interval int `json:"hcInterval"` // in seconds
UnhealthyInterval int `json:"hcUnhealthyInterval"` // in seconds
Timeout int `json:"hcTimeout"` // in seconds
Headers map[string]string `json:"hcHeaders"`
Method string `json:"hcMethod"`
Status int `json:"hcStatus"` // HTTP status code
}
// Target represents a health check target with its current status
type Target struct {
Config Config `json:"config"`
Status Health `json:"status"`
LastCheck time.Time `json:"lastCheck"`
LastError string `json:"lastError,omitempty"`
CheckCount int `json:"checkCount"`
ticker *time.Ticker
ctx context.Context
cancel context.CancelFunc
}
// StatusChangeCallback is called when any target's status changes
type StatusChangeCallback func(targets map[int]*Target)
// Monitor manages health check targets and their monitoring
type Monitor struct {
targets map[int]*Target
mutex sync.RWMutex
callback StatusChangeCallback
client *http.Client
enforceCert bool
}
// NewMonitor creates a new health check monitor
func NewMonitor(callback StatusChangeCallback, enforceCert bool) *Monitor {
logger.Debug("Creating new health check monitor with certificate enforcement: %t", enforceCert)
// Configure TLS settings based on certificate enforcement
transport := &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: !enforceCert,
},
}
return &Monitor{
targets: make(map[int]*Target),
callback: callback,
enforceCert: enforceCert,
client: &http.Client{
Timeout: 30 * time.Second,
Transport: transport,
},
}
}
// parseHeaders parses the headers string into a map
func parseHeaders(headersStr string) map[string]string {
headers := make(map[string]string)
if headersStr == "" {
return headers
}
// Try to parse as JSON first
if err := json.Unmarshal([]byte(headersStr), &headers); err == nil {
return headers
}
// Fallback to simple key:value parsing
pairs := strings.Split(headersStr, ",")
for _, pair := range pairs {
kv := strings.SplitN(strings.TrimSpace(pair), ":", 2)
if len(kv) == 2 {
headers[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
}
}
return headers
}
// AddTarget adds a new health check target
func (m *Monitor) AddTarget(config Config) error {
m.mutex.Lock()
defer m.mutex.Unlock()
logger.Info("Adding health check target: ID=%d, hostname=%s, port=%d, enabled=%t",
config.ID, config.Hostname, config.Port, config.Enabled)
return m.addTargetUnsafe(config)
}
// AddTargets adds multiple health check targets in bulk
func (m *Monitor) AddTargets(configs []Config) error {
m.mutex.Lock()
defer m.mutex.Unlock()
logger.Debug("Adding %d health check targets in bulk", len(configs))
for _, config := range configs {
if err := m.addTargetUnsafe(config); err != nil {
logger.Error("Failed to add target %d: %v", config.ID, err)
return fmt.Errorf("failed to add target %d: %v", config.ID, err)
}
logger.Debug("Successfully added target: ID=%d, hostname=%s", config.ID, config.Hostname)
}
// Don't notify callback immediately - let the initial health checks complete first
// The callback will be triggered when the first health check results are available
logger.Debug("Successfully added all %d health check targets", len(configs))
return nil
}
// addTargetUnsafe adds a target without acquiring the mutex (internal method)
func (m *Monitor) addTargetUnsafe(config Config) error {
// Set defaults
if config.Scheme == "" {
config.Scheme = "http"
}
if config.Mode == "" {
config.Mode = "http"
}
if config.Method == "" {
config.Method = "GET"
}
if config.Interval == 0 {
config.Interval = 30
}
if config.UnhealthyInterval == 0 {
config.UnhealthyInterval = 30
}
if config.Timeout == 0 {
config.Timeout = 5
}
logger.Debug("Target %d configuration: scheme=%s, method=%s, interval=%ds, timeout=%ds",
config.ID, config.Scheme, config.Method, config.Interval, config.Timeout)
// Ensure the headers map is initialized so header application is straightforward
if config.Headers == nil {
config.Headers = make(map[string]string)
}
// Remove existing target if it exists
if existing, exists := m.targets[config.ID]; exists {
logger.Info("Replacing existing target with ID %d", config.ID)
existing.cancel()
}
// Create new target
ctx, cancel := context.WithCancel(context.Background())
target := &Target{
Config: config,
Status: StatusUnknown,
ctx: ctx,
cancel: cancel,
}
m.targets[config.ID] = target
// Start monitoring if enabled
if config.Enabled {
logger.Info("Starting monitoring for target %d (%s:%d)", config.ID, config.Hostname, config.Port)
go m.monitorTarget(target)
} else {
logger.Debug("Target %d added but monitoring is disabled", config.ID)
}
return nil
}
// RemoveTarget removes a health check target
func (m *Monitor) RemoveTarget(id int) error {
m.mutex.Lock()
defer m.mutex.Unlock()
target, exists := m.targets[id]
if !exists {
logger.Warn("Attempted to remove non-existent target with ID %d", id)
return fmt.Errorf("target with id %d not found", id)
}
logger.Info("Removing health check target: ID=%d", id)
target.cancel()
delete(m.targets, id)
// Notify callback of status change
if m.callback != nil {
go m.callback(m.GetTargets())
}
logger.Info("Successfully removed target %d", id)
return nil
}
// RemoveTargets removes multiple health check targets
func (m *Monitor) RemoveTargets(ids []int) error {
m.mutex.Lock()
defer m.mutex.Unlock()
logger.Info("Removing %d health check targets", len(ids))
var notFound []int
for _, id := range ids {
target, exists := m.targets[id]
if !exists {
notFound = append(notFound, id)
logger.Warn("Target with ID %d not found during bulk removal", id)
continue
}
logger.Debug("Removing target %d", id)
target.cancel()
delete(m.targets, id)
}
removedCount := len(ids) - len(notFound)
logger.Info("Successfully removed %d targets", removedCount)
// Notify callback of status change if any targets were removed
if len(notFound) != len(ids) && m.callback != nil {
go m.callback(m.GetTargets())
}
if len(notFound) > 0 {
logger.Error("Some targets not found during removal: %v", notFound)
return fmt.Errorf("targets not found: %v", notFound)
}
return nil
}
// RemoveTargetsByID is a convenience method that accepts either a single ID or multiple IDs
func (m *Monitor) RemoveTargetsByID(ids ...int) error {
return m.RemoveTargets(ids)
}
// GetTargets returns a copy of all targets
func (m *Monitor) GetTargets() map[int]*Target {
m.mutex.RLock()
defer m.mutex.RUnlock()
return m.getAllTargetsUnsafe()
}
// getAllTargetsUnsafe returns a copy of all targets without acquiring the mutex (internal method)
func (m *Monitor) getAllTargetsUnsafe() map[int]*Target {
targets := make(map[int]*Target)
for id, target := range m.targets {
// Create a copy to avoid race conditions
targetCopy := *target
targets[id] = &targetCopy
}
return targets
}
// getAllTargets returns a copy of all targets (deprecated, use GetTargets)
func (m *Monitor) getAllTargets() map[int]*Target {
return m.GetTargets()
}
// monitorTarget monitors a single target
func (m *Monitor) monitorTarget(target *Target) {
logger.Info("Starting health check monitoring for target %d (%s:%d)",
target.Config.ID, target.Config.Hostname, target.Config.Port)
// Initial check
oldStatus := target.Status
m.performHealthCheck(target)
// Notify callback after initial check if status changed or if it's the first check
if (oldStatus != target.Status || oldStatus == StatusUnknown) && m.callback != nil {
logger.Info("Target %d initial status: %s", target.Config.ID, target.Status.String())
go m.callback(m.GetTargets())
}
// Set up ticker based on current status
interval := time.Duration(target.Config.Interval) * time.Second
if target.Status == StatusUnhealthy {
interval = time.Duration(target.Config.UnhealthyInterval) * time.Second
}
logger.Debug("Target %d: initial check interval set to %v", target.Config.ID, interval)
target.ticker = time.NewTicker(interval)
defer target.ticker.Stop()
for {
select {
case <-target.ctx.Done():
logger.Info("Stopping health check monitoring for target %d", target.Config.ID)
return
case <-target.ticker.C:
oldStatus := target.Status
m.performHealthCheck(target)
// Update ticker interval if status changed
newInterval := time.Duration(target.Config.Interval) * time.Second
if target.Status == StatusUnhealthy {
newInterval = time.Duration(target.Config.UnhealthyInterval) * time.Second
}
if newInterval != interval {
logger.Debug("Target %d: updating check interval from %v to %v due to status change",
target.Config.ID, interval, newInterval)
target.ticker.Stop()
target.ticker = time.NewTicker(newInterval)
interval = newInterval
}
// Notify callback if status changed
if oldStatus != target.Status && m.callback != nil {
logger.Info("Target %d status changed: %s -> %s",
target.Config.ID, oldStatus.String(), target.Status.String())
go m.callback(m.GetTargets())
}
}
}
}
// performHealthCheck performs a health check on a target
func (m *Monitor) performHealthCheck(target *Target) {
target.CheckCount++
target.LastCheck = time.Now()
target.LastError = ""
// Build URL
url := fmt.Sprintf("%s://%s", target.Config.Scheme, target.Config.Hostname)
if target.Config.Port > 0 {
url = fmt.Sprintf("%s:%d", url, target.Config.Port)
}
if target.Config.Path != "" {
if !strings.HasPrefix(target.Config.Path, "/") {
url += "/"
}
url += target.Config.Path
}
logger.Debug("Target %d: performing health check %d to %s",
target.Config.ID, target.CheckCount, url)
if target.Config.Scheme == "https" {
logger.Debug("Target %d: HTTPS health check with certificate enforcement: %t",
target.Config.ID, m.enforceCert)
}
// Create request
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(target.Config.Timeout)*time.Second)
defer cancel()
req, err := http.NewRequestWithContext(ctx, target.Config.Method, url, nil)
if err != nil {
target.Status = StatusUnhealthy
target.LastError = fmt.Sprintf("failed to create request: %v", err)
logger.Warn("Target %d: failed to create request: %v", target.Config.ID, err)
return
}
// Add headers
for key, value := range target.Config.Headers {
req.Header.Set(key, value)
}
// Perform request
resp, err := m.client.Do(req)
if err != nil {
target.Status = StatusUnhealthy
target.LastError = fmt.Sprintf("request failed: %v", err)
logger.Warn("Target %d: health check failed: %v", target.Config.ID, err)
return
}
defer resp.Body.Close()
// Check response status; a configured Status of 0 means "accept any 2xx"
expectedStatus := target.Config.Status
if expectedStatus > 0 {
logger.Debug("Target %d: checking health status against expected code %d", target.Config.ID, expectedStatus)
// Check for specific status code
if resp.StatusCode == expectedStatus {
target.Status = StatusHealthy
logger.Debug("Target %d: health check passed (status: %d, expected: %d)", target.Config.ID, resp.StatusCode, expectedStatus)
} else {
target.Status = StatusUnhealthy
target.LastError = fmt.Sprintf("unexpected status code: %d (expected: %d)", resp.StatusCode, expectedStatus)
logger.Warn("Target %d: health check failed with status code %d (expected: %d)", target.Config.ID, resp.StatusCode, expectedStatus)
}
} else {
// Check for 2xx range
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
target.Status = StatusHealthy
logger.Debug("Target %d: health check passed (status: %d)", target.Config.ID, resp.StatusCode)
} else {
target.Status = StatusUnhealthy
target.LastError = fmt.Sprintf("unhealthy status code: %d", resp.StatusCode)
logger.Warn("Target %d: health check failed with status code %d", target.Config.ID, resp.StatusCode)
}
}
}
// Stop stops monitoring all targets
func (m *Monitor) Stop() {
m.mutex.Lock()
defer m.mutex.Unlock()
targetCount := len(m.targets)
logger.Info("Stopping health check monitor with %d targets", targetCount)
for id, target := range m.targets {
logger.Debug("Stopping monitoring for target %d", id)
target.cancel()
}
m.targets = make(map[int]*Target)
logger.Info("Health check monitor stopped")
}
// EnableTarget enables monitoring for a specific target
func (m *Monitor) EnableTarget(id int) error {
m.mutex.Lock()
defer m.mutex.Unlock()
target, exists := m.targets[id]
if !exists {
logger.Warn("Attempted to enable non-existent target with ID %d", id)
return fmt.Errorf("target with id %d not found", id)
}
if !target.Config.Enabled {
logger.Info("Enabling health check monitoring for target %d", id)
target.Config.Enabled = true
target.cancel() // Stop existing monitoring
ctx, cancel := context.WithCancel(context.Background())
target.ctx = ctx
target.cancel = cancel
go m.monitorTarget(target)
} else {
logger.Debug("Target %d is already enabled", id)
}
return nil
}
// DisableTarget disables monitoring for a specific target
func (m *Monitor) DisableTarget(id int) error {
m.mutex.Lock()
defer m.mutex.Unlock()
target, exists := m.targets[id]
if !exists {
logger.Warn("Attempted to disable non-existent target with ID %d", id)
return fmt.Errorf("target with id %d not found", id)
}
if target.Config.Enabled {
logger.Info("Disabling health check monitoring for target %d", id)
target.Config.Enabled = false
target.cancel()
target.Status = StatusUnknown
// Notify callback of status change
if m.callback != nil {
go m.callback(m.GetTargets())
}
} else {
logger.Debug("Target %d is already disabled", id)
}
return nil
}


@@ -0,0 +1,80 @@
package state
import (
"sync"
"sync/atomic"
"time"
"github.com/fosrl/newt/internal/telemetry"
)
// TelemetryView is a minimal, thread-safe implementation to feed observables.
// Since one Newt process represents one site, we expose a single logical site.
// site_id is a resource attribute, so we do not emit per-site labels here.
type TelemetryView struct {
online atomic.Bool
lastHBUnix atomic.Int64 // unix seconds
// per-tunnel sessions
sessMu sync.RWMutex
sessions map[string]*atomic.Int64
}
var (
globalView atomic.Pointer[TelemetryView]
)
// Global returns the process-wide TelemetryView, creating and registering it
// exactly once even under concurrent first use.
func Global() *TelemetryView {
	if v := globalView.Load(); v != nil {
		return v
	}
	v := &TelemetryView{sessions: make(map[string]*atomic.Int64)}
	if !globalView.CompareAndSwap(nil, v) {
		// Another goroutine initialized the view first; use its instance.
		return globalView.Load()
	}
	telemetry.RegisterStateView(v)
	return v
}
// Instrumentation helpers
func (v *TelemetryView) IncSessions(tunnelID string) {
v.sessMu.Lock(); defer v.sessMu.Unlock()
c := v.sessions[tunnelID]
if c == nil { c = &atomic.Int64{}; v.sessions[tunnelID] = c }
c.Add(1)
}
func (v *TelemetryView) DecSessions(tunnelID string) {
v.sessMu.Lock(); defer v.sessMu.Unlock()
if c := v.sessions[tunnelID]; c != nil {
c.Add(-1)
if c.Load() <= 0 { delete(v.sessions, tunnelID) }
}
}
func (v *TelemetryView) ClearTunnel(tunnelID string) {
v.sessMu.Lock(); defer v.sessMu.Unlock()
delete(v.sessions, tunnelID)
}
func (v *TelemetryView) SetOnline(b bool) { v.online.Store(b) }
func (v *TelemetryView) TouchHeartbeat() { v.lastHBUnix.Store(time.Now().Unix()) }
// --- telemetry.StateView interface ---
func (v *TelemetryView) ListSites() []string { return []string{"self"} }
func (v *TelemetryView) Online(_ string) (bool, bool) { return v.online.Load(), true }
func (v *TelemetryView) LastHeartbeat(_ string) (time.Time, bool) {
sec := v.lastHBUnix.Load()
if sec == 0 { return time.Time{}, false }
return time.Unix(sec, 0), true
}
func (v *TelemetryView) ActiveSessions(_ string) (int64, bool) {
// aggregated sessions (not used for per-tunnel gauge)
v.sessMu.RLock(); defer v.sessMu.RUnlock()
var sum int64
for _, c := range v.sessions { if c != nil { sum += c.Load() } }
return sum, true
}
// Extended accessor used by telemetry callback to publish per-tunnel samples.
func (v *TelemetryView) SessionsByTunnel() map[string]int64 {
v.sessMu.RLock(); defer v.sessMu.RUnlock()
out := make(map[string]int64, len(v.sessions))
for id, c := range v.sessions { if c != nil && c.Load() > 0 { out[id] = c.Load() } }
return out
}


@@ -0,0 +1,19 @@
package telemetry
// Protocol labels (low-cardinality)
const (
ProtocolTCP = "tcp"
ProtocolUDP = "udp"
)
// Reconnect reason bins (fixed, low-cardinality)
const (
ReasonServerRequest = "server_request"
ReasonTimeout = "timeout"
ReasonPeerClose = "peer_close"
ReasonNetworkChange = "network_change"
ReasonAuthError = "auth_error"
ReasonHandshakeError = "handshake_error"
ReasonConfigChange = "config_change"
ReasonError = "error"
)


@@ -0,0 +1,32 @@
package telemetry
import "testing"
func TestAllowedConstants(t *testing.T) {
allowedReasons := map[string]struct{}{
ReasonServerRequest: {},
ReasonTimeout: {},
ReasonPeerClose: {},
ReasonNetworkChange: {},
ReasonAuthError: {},
ReasonHandshakeError: {},
ReasonConfigChange: {},
ReasonError: {},
}
for k := range allowedReasons {
if k == "" {
t.Fatalf("empty reason constant")
}
}
allowedProtocols := map[string]struct{}{
ProtocolTCP: {},
ProtocolUDP: {},
}
for k := range allowedProtocols {
if k == "" {
t.Fatalf("empty protocol constant")
}
}
}


@@ -0,0 +1,542 @@
package telemetry
import (
"context"
"sync"
"sync/atomic"
"time"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
)
// Instruments and helpers for Newt metrics following the naming, units, and
// low-cardinality label guidance from the issue description.
//
// Counters end with _total, durations are in seconds, sizes in bytes.
// Only low-cardinality stable labels are supported: tunnel_id,
// transport, direction, result, reason, error_type.
var (
initOnce sync.Once
meter metric.Meter
// Site / Registration
mSiteRegistrations metric.Int64Counter
mSiteOnline metric.Int64ObservableGauge
mSiteLastHeartbeat metric.Float64ObservableGauge
// Tunnel / Sessions
mTunnelSessions metric.Int64ObservableGauge
mTunnelBytes metric.Int64Counter
mTunnelLatency metric.Float64Histogram
mReconnects metric.Int64Counter
// Connection / NAT
mConnAttempts metric.Int64Counter
mConnErrors metric.Int64Counter
// Config/Restart
mConfigReloads metric.Int64Counter
mConfigApply metric.Float64Histogram
mCertRotationTotal metric.Int64Counter
mProcessStartTime metric.Float64ObservableGauge
// Build info
mBuildInfo metric.Int64ObservableGauge
// WebSocket
mWSConnectLatency metric.Float64Histogram
mWSMessages metric.Int64Counter
mWSDisconnects metric.Int64Counter
mWSKeepaliveFailure metric.Int64Counter
mWSSessionDuration metric.Float64Histogram
mWSConnected metric.Int64ObservableGauge
mWSReconnects metric.Int64Counter
// Proxy
mProxyActiveConns metric.Int64ObservableGauge
mProxyBufferBytes metric.Int64ObservableGauge
mProxyAsyncBacklogByte metric.Int64ObservableGauge
mProxyDropsTotal metric.Int64Counter
mProxyAcceptsTotal metric.Int64Counter
mProxyConnDuration metric.Float64Histogram
mProxyConnectionsTotal metric.Int64Counter
buildVersion string
buildCommit string
processStartUnix = float64(time.Now().UnixNano()) / 1e9
wsConnectedState atomic.Int64
)
// Proxy connection lifecycle events.
const (
ProxyConnectionOpened = "opened"
ProxyConnectionClosed = "closed"
)
// attrsWithSite appends site/region labels only when explicitly enabled to keep
// label cardinality low by default.
func attrsWithSite(extra ...attribute.KeyValue) []attribute.KeyValue {
attrs := make([]attribute.KeyValue, len(extra))
copy(attrs, extra)
if ShouldIncludeSiteLabels() {
attrs = append(attrs, siteAttrs()...)
}
return attrs
}
func registerInstruments() error {
var err error
initOnce.Do(func() {
meter = otel.Meter("newt")
if e := registerSiteInstruments(); e != nil {
err = e
return
}
if e := registerTunnelInstruments(); e != nil {
err = e
return
}
if e := registerConnInstruments(); e != nil {
err = e
return
}
if e := registerConfigInstruments(); e != nil {
err = e
return
}
if e := registerBuildWSProxyInstruments(); e != nil {
err = e
return
}
})
return err
}
func registerSiteInstruments() error {
var err error
mSiteRegistrations, err = meter.Int64Counter("newt_site_registrations_total",
metric.WithDescription("Total site registration attempts"))
if err != nil {
return err
}
mSiteOnline, err = meter.Int64ObservableGauge("newt_site_online",
metric.WithDescription("Site online (0/1)"))
if err != nil {
return err
}
mSiteLastHeartbeat, err = meter.Float64ObservableGauge("newt_site_last_heartbeat_timestamp_seconds",
metric.WithDescription("Unix timestamp of the last site heartbeat"),
metric.WithUnit("s"))
if err != nil {
return err
}
return nil
}
func registerTunnelInstruments() error {
var err error
mTunnelSessions, err = meter.Int64ObservableGauge("newt_tunnel_sessions",
metric.WithDescription("Active tunnel sessions"))
if err != nil {
return err
}
mTunnelBytes, err = meter.Int64Counter("newt_tunnel_bytes_total",
metric.WithDescription("Tunnel bytes ingress/egress"),
metric.WithUnit("By"))
if err != nil {
return err
}
mTunnelLatency, err = meter.Float64Histogram("newt_tunnel_latency_seconds",
metric.WithDescription("Per-tunnel latency in seconds"),
metric.WithUnit("s"))
if err != nil {
return err
}
mReconnects, err = meter.Int64Counter("newt_tunnel_reconnects_total",
metric.WithDescription("Tunnel reconnect events"))
if err != nil {
return err
}
return nil
}
func registerConnInstruments() error {
var err error
mConnAttempts, err = meter.Int64Counter("newt_connection_attempts_total",
metric.WithDescription("Connection attempts"))
if err != nil {
return err
}
mConnErrors, err = meter.Int64Counter("newt_connection_errors_total",
metric.WithDescription("Connection errors by type"))
if err != nil {
return err
}
return nil
}
func registerConfigInstruments() error {
mConfigReloads, _ = meter.Int64Counter("newt_config_reloads_total",
metric.WithDescription("Configuration reloads"))
mConfigApply, _ = meter.Float64Histogram("newt_config_apply_seconds",
metric.WithDescription("Configuration apply duration in seconds"),
metric.WithUnit("s"))
mCertRotationTotal, _ = meter.Int64Counter("newt_cert_rotation_total",
metric.WithDescription("Certificate rotation events (success/failure)"))
mProcessStartTime, _ = meter.Float64ObservableGauge("process_start_time_seconds",
metric.WithDescription("Unix timestamp of the process start time"),
metric.WithUnit("s"))
if mProcessStartTime != nil {
if _, err := meter.RegisterCallback(func(ctx context.Context, o metric.Observer) error {
o.ObserveFloat64(mProcessStartTime, processStartUnix)
return nil
}, mProcessStartTime); err != nil {
otel.Handle(err)
}
}
return nil
}
func registerBuildWSProxyInstruments() error {
// Build info gauge (value 1 with version/commit attributes)
mBuildInfo, _ = meter.Int64ObservableGauge("newt_build_info",
metric.WithDescription("Newt build information (value is always 1)"))
// WebSocket
mWSConnectLatency, _ = meter.Float64Histogram("newt_websocket_connect_latency_seconds",
metric.WithDescription("WebSocket connect latency in seconds"),
metric.WithUnit("s"))
mWSMessages, _ = meter.Int64Counter("newt_websocket_messages_total",
metric.WithDescription("WebSocket messages by direction and type"))
mWSDisconnects, _ = meter.Int64Counter("newt_websocket_disconnects_total",
metric.WithDescription("WebSocket disconnects by reason/result"))
mWSKeepaliveFailure, _ = meter.Int64Counter("newt_websocket_keepalive_failures_total",
metric.WithDescription("WebSocket keepalive (ping/pong) failures"))
mWSSessionDuration, _ = meter.Float64Histogram("newt_websocket_session_duration_seconds",
metric.WithDescription("Duration of established WebSocket sessions"),
metric.WithUnit("s"))
mWSConnected, _ = meter.Int64ObservableGauge("newt_websocket_connected",
metric.WithDescription("WebSocket connection state (1=connected, 0=disconnected)"))
mWSReconnects, _ = meter.Int64Counter("newt_websocket_reconnects_total",
metric.WithDescription("WebSocket reconnect attempts by reason"))
// Proxy
mProxyActiveConns, _ = meter.Int64ObservableGauge("newt_proxy_active_connections",
metric.WithDescription("Proxy active connections per tunnel and protocol"))
mProxyBufferBytes, _ = meter.Int64ObservableGauge("newt_proxy_buffer_bytes",
metric.WithDescription("Proxy buffer bytes (may approximate async backlog)"),
metric.WithUnit("By"))
mProxyAsyncBacklogByte, _ = meter.Int64ObservableGauge("newt_proxy_async_backlog_bytes",
metric.WithDescription("Unflushed async byte backlog per tunnel and protocol"),
metric.WithUnit("By"))
mProxyDropsTotal, _ = meter.Int64Counter("newt_proxy_drops_total",
metric.WithDescription("Proxy drops due to write errors"))
mProxyAcceptsTotal, _ = meter.Int64Counter("newt_proxy_accept_total",
metric.WithDescription("Proxy connection accepts by protocol and result"))
mProxyConnDuration, _ = meter.Float64Histogram("newt_proxy_connection_duration_seconds",
metric.WithDescription("Duration of completed proxy connections"),
metric.WithUnit("s"))
mProxyConnectionsTotal, _ = meter.Int64Counter("newt_proxy_connections_total",
metric.WithDescription("Proxy connection lifecycle events by protocol"))
// Register a default callback for build info if version/commit set
reg, e := meter.RegisterCallback(func(ctx context.Context, o metric.Observer) error {
if buildVersion == "" && buildCommit == "" {
return nil
}
attrs := []attribute.KeyValue{}
if buildVersion != "" {
attrs = append(attrs, attribute.String("version", buildVersion))
}
if buildCommit != "" {
attrs = append(attrs, attribute.String("commit", buildCommit))
}
if ShouldIncludeSiteLabels() {
attrs = append(attrs, siteAttrs()...)
}
o.ObserveInt64(mBuildInfo, 1, metric.WithAttributes(attrs...))
return nil
}, mBuildInfo)
if e != nil {
otel.Handle(e)
} else {
// Provide a functional stopper that unregisters the callback
obsStopper = func() { _ = reg.Unregister() }
}
if mWSConnected != nil {
if regConn, err := meter.RegisterCallback(func(ctx context.Context, o metric.Observer) error {
val := wsConnectedState.Load()
o.ObserveInt64(mWSConnected, val, metric.WithAttributes(attrsWithSite()...))
return nil
}, mWSConnected); err != nil {
otel.Handle(err)
} else {
wsConnStopper = func() { _ = regConn.Unregister() }
}
}
return nil
}
// Observable registration: Newt can register a callback to report gauges.
// Call SetObservableCallback once to start observing online status, last
// heartbeat seconds, and active sessions.
var (
obsOnce sync.Once
obsStopper func()
proxyObsOnce sync.Once
proxyStopper func()
wsConnStopper func()
)
// SetObservableCallback registers a single callback that will be invoked
// on collection. Use the provided observer to emit values for the observable
// gauges defined here.
//
// Example inside your code (where you have access to current state):
//
// telemetry.SetObservableCallback(func(ctx context.Context, o metric.Observer) error {
// o.ObserveInt64(mSiteOnline, 1)
// o.ObserveFloat64(mSiteLastHeartbeat, float64(lastHB.Unix()))
// o.ObserveInt64(mTunnelSessions, int64(len(activeSessions)))
// return nil
// })
func SetObservableCallback(cb func(context.Context, metric.Observer) error) {
obsOnce.Do(func() {
reg, e := meter.RegisterCallback(cb, mSiteOnline, mSiteLastHeartbeat, mTunnelSessions)
if e != nil {
otel.Handle(e)
obsStopper = func() {
// no-op: registration failed; keep stopper callable
}
return
}
// Provide a functional stopper mirroring proxy/build-info behavior
obsStopper = func() { _ = reg.Unregister() }
})
}
// SetProxyObservableCallback registers a callback to observe proxy gauges.
func SetProxyObservableCallback(cb func(context.Context, metric.Observer) error) {
proxyObsOnce.Do(func() {
reg, e := meter.RegisterCallback(cb, mProxyActiveConns, mProxyBufferBytes, mProxyAsyncBacklogByte)
if e != nil {
otel.Handle(e)
proxyStopper = func() {
// no-op: registration failed; keep stopper callable
}
return
}
// Provide a functional stopper to unregister later if needed
proxyStopper = func() { _ = reg.Unregister() }
})
}
// Build info registration
func RegisterBuildInfo(version, commit string) {
buildVersion = version
buildCommit = commit
}
// Config reloads
func IncConfigReload(ctx context.Context, result string) {
mConfigReloads.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("result", result),
)...))
}
// Helpers for counters/histograms
func IncSiteRegistration(ctx context.Context, result string) {
attrs := []attribute.KeyValue{
attribute.String("result", result),
}
mSiteRegistrations.Add(ctx, 1, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func AddTunnelBytes(ctx context.Context, tunnelID, direction string, n int64) {
attrs := []attribute.KeyValue{
attribute.String("direction", direction),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mTunnelBytes.Add(ctx, n, metric.WithAttributes(attrsWithSite(attrs...)...))
}
// AddTunnelBytesSet adds bytes using a pre-built attribute.Set to avoid per-call allocations.
func AddTunnelBytesSet(ctx context.Context, n int64, attrs attribute.Set) {
mTunnelBytes.Add(ctx, n, metric.WithAttributeSet(attrs))
}
// --- WebSocket helpers ---
func ObserveWSConnectLatency(ctx context.Context, seconds float64, result, errorType string) {
attrs := []attribute.KeyValue{
attribute.String("transport", "websocket"),
attribute.String("result", result),
}
if errorType != "" {
attrs = append(attrs, attribute.String("error_type", errorType))
}
mWSConnectLatency.Record(ctx, seconds, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func IncWSMessage(ctx context.Context, direction, msgType string) {
mWSMessages.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("direction", direction),
attribute.String("msg_type", msgType),
)...))
}
func IncWSDisconnect(ctx context.Context, reason, result string) {
mWSDisconnects.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("reason", reason),
attribute.String("result", result),
)...))
}
func IncWSKeepaliveFailure(ctx context.Context, reason string) {
mWSKeepaliveFailure.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("reason", reason),
)...))
}
// SetWSConnectionState updates the backing gauge for the WebSocket connected state.
func SetWSConnectionState(connected bool) {
if connected {
wsConnectedState.Store(1)
} else {
wsConnectedState.Store(0)
}
}
// IncWSReconnect increments the WebSocket reconnect counter with a bounded reason label.
func IncWSReconnect(ctx context.Context, reason string) {
if reason == "" {
reason = "unknown"
}
mWSReconnects.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("reason", reason),
)...))
}
func ObserveWSSessionDuration(ctx context.Context, seconds float64, result string) {
mWSSessionDuration.Record(ctx, seconds, metric.WithAttributes(attrsWithSite(
attribute.String("result", result),
)...))
}
// --- Proxy helpers ---
func ObserveProxyActiveConnsObs(o metric.Observer, value int64, attrs []attribute.KeyValue) {
o.ObserveInt64(mProxyActiveConns, value, metric.WithAttributes(attrs...))
}
func ObserveProxyBufferBytesObs(o metric.Observer, value int64, attrs []attribute.KeyValue) {
o.ObserveInt64(mProxyBufferBytes, value, metric.WithAttributes(attrs...))
}
func ObserveProxyAsyncBacklogObs(o metric.Observer, value int64, attrs []attribute.KeyValue) {
o.ObserveInt64(mProxyAsyncBacklogByte, value, metric.WithAttributes(attrs...))
}
func IncProxyDrops(ctx context.Context, tunnelID, protocol string) {
attrs := []attribute.KeyValue{
attribute.String("protocol", protocol),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mProxyDropsTotal.Add(ctx, 1, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func IncProxyAccept(ctx context.Context, tunnelID, protocol, result, reason string) {
attrs := []attribute.KeyValue{
attribute.String("protocol", protocol),
attribute.String("result", result),
}
if reason != "" {
attrs = append(attrs, attribute.String("reason", reason))
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mProxyAcceptsTotal.Add(ctx, 1, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func ObserveProxyConnectionDuration(ctx context.Context, tunnelID, protocol, result string, seconds float64) {
attrs := []attribute.KeyValue{
attribute.String("protocol", protocol),
attribute.String("result", result),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mProxyConnDuration.Record(ctx, seconds, metric.WithAttributes(attrsWithSite(attrs...)...))
}
// IncProxyConnectionEvent records proxy connection lifecycle events (opened/closed).
func IncProxyConnectionEvent(ctx context.Context, tunnelID, protocol, event string) {
if event == "" {
event = "unknown"
}
attrs := []attribute.KeyValue{
attribute.String("protocol", protocol),
attribute.String("event", event),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mProxyConnectionsTotal.Add(ctx, 1, metric.WithAttributes(attrsWithSite(attrs...)...))
}
// --- Config/PKI helpers ---
func ObserveConfigApply(ctx context.Context, phase, result string, seconds float64) {
mConfigApply.Record(ctx, seconds, metric.WithAttributes(attrsWithSite(
attribute.String("phase", phase),
attribute.String("result", result),
)...))
}
func IncCertRotation(ctx context.Context, result string) {
mCertRotationTotal.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("result", result),
)...))
}
func ObserveTunnelLatency(ctx context.Context, tunnelID, transport string, seconds float64) {
attrs := []attribute.KeyValue{
attribute.String("transport", transport),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mTunnelLatency.Record(ctx, seconds, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func IncReconnect(ctx context.Context, tunnelID, initiator, reason string) {
attrs := []attribute.KeyValue{
attribute.String("initiator", initiator),
attribute.String("reason", reason),
}
if ShouldIncludeTunnelID() && tunnelID != "" {
attrs = append(attrs, attribute.String("tunnel_id", tunnelID))
}
mReconnects.Add(ctx, 1, metric.WithAttributes(attrsWithSite(attrs...)...))
}
func IncConnAttempt(ctx context.Context, transport, result string) {
mConnAttempts.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("transport", transport),
attribute.String("result", result),
)...))
}
func IncConnError(ctx context.Context, transport, typ string) {
mConnErrors.Add(ctx, 1, metric.WithAttributes(attrsWithSite(
attribute.String("transport", transport),
attribute.String("error_type", typ),
)...))
}


@@ -0,0 +1,59 @@
package telemetry
import (
"sync"
"time"
)
func resetMetricsForTest() {
initOnce = sync.Once{}
obsOnce = sync.Once{}
proxyObsOnce = sync.Once{}
obsStopper = nil
proxyStopper = nil
if wsConnStopper != nil {
wsConnStopper()
}
wsConnStopper = nil
meter = nil
mSiteRegistrations = nil
mSiteOnline = nil
mSiteLastHeartbeat = nil
mTunnelSessions = nil
mTunnelBytes = nil
mTunnelLatency = nil
mReconnects = nil
mConnAttempts = nil
mConnErrors = nil
mConfigReloads = nil
mConfigApply = nil
mCertRotationTotal = nil
mProcessStartTime = nil
mBuildInfo = nil
mWSConnectLatency = nil
mWSMessages = nil
mWSDisconnects = nil
mWSKeepaliveFailure = nil
mWSSessionDuration = nil
mWSConnected = nil
mWSReconnects = nil
mProxyActiveConns = nil
mProxyBufferBytes = nil
mProxyAsyncBacklogByte = nil
mProxyDropsTotal = nil
mProxyAcceptsTotal = nil
mProxyConnDuration = nil
mProxyConnectionsTotal = nil
processStartUnix = float64(time.Now().UnixNano()) / 1e9
wsConnectedState.Store(0)
includeTunnelIDVal.Store(false)
includeSiteLabelVal.Store(false)
}


@@ -0,0 +1,106 @@
package telemetry
import (
"context"
"sync/atomic"
"time"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
)
// StateView provides a read-only view for observable gauges.
// Implementations must be concurrency-safe and avoid blocking operations.
// All methods should be fast and use RLocks where applicable.
type StateView interface {
// ListSites returns a stable, low-cardinality list of site IDs to expose.
ListSites() []string
// Online returns whether the site is online.
Online(siteID string) (online bool, ok bool)
// LastHeartbeat returns the last heartbeat time for a site.
LastHeartbeat(siteID string) (t time.Time, ok bool)
// ActiveSessions returns the current number of active sessions for a site,
// aggregated across its tunnels (or per site, if sessions are tracked site-scoped).
ActiveSessions(siteID string) (n int64, ok bool)
}
var (
stateView atomic.Value // of type StateView
)
// RegisterStateView sets the global StateView used by the default observable callback.
func RegisterStateView(v StateView) {
stateView.Store(v)
// If instruments are registered, ensure a callback exists.
if v != nil {
SetObservableCallback(func(ctx context.Context, o metric.Observer) error {
if v := stateView.Load(); v != nil {
if sv, ok := v.(StateView); ok {
for _, siteID := range sv.ListSites() {
observeSiteOnlineFor(o, sv, siteID)
observeLastHeartbeatFor(o, sv, siteID)
observeSessionsFor(o, siteID, sv)
}
}
}
return nil
})
}
}
func observeSiteOnlineFor(o metric.Observer, sv StateView, siteID string) {
if online, ok := sv.Online(siteID); ok {
val := int64(0)
if online {
val = 1
}
o.ObserveInt64(mSiteOnline, val, metric.WithAttributes(
attribute.String("site_id", siteID),
))
}
}
func observeLastHeartbeatFor(o metric.Observer, sv StateView, siteID string) {
if t, ok := sv.LastHeartbeat(siteID); ok {
ts := float64(t.UnixNano()) / 1e9
o.ObserveFloat64(mSiteLastHeartbeat, ts, metric.WithAttributes(
attribute.String("site_id", siteID),
))
}
}
func observeSessionsFor(o metric.Observer, siteID string, view interface{}) {
if tm, ok := view.(interface{ SessionsByTunnel() map[string]int64 }); ok {
sessions := tm.SessionsByTunnel()
// If tunnel_id labels are enabled, preserve existing per-tunnel observations
if ShouldIncludeTunnelID() {
for tid, n := range sessions {
attrs := []attribute.KeyValue{
attribute.String("site_id", siteID),
}
if tid != "" {
attrs = append(attrs, attribute.String("tunnel_id", tid))
}
o.ObserveInt64(mTunnelSessions, n, metric.WithAttributes(attrs...))
}
return
}
// When tunnel_id is disabled, collapse per-tunnel counts into a single site-level value
var total int64
for _, n := range sessions {
total += n
}
// If there are no per-tunnel entries, fall back to ActiveSessions() if available
if total == 0 {
if svAny := stateView.Load(); svAny != nil {
if sv, ok := svAny.(StateView); ok {
if n, ok2 := sv.ActiveSessions(siteID); ok2 {
total = n
}
}
}
}
o.ObserveInt64(mTunnelSessions, total, metric.WithAttributes(attribute.String("site_id", siteID)))
return
}
}


@@ -0,0 +1,384 @@
package telemetry
import (
"context"
"errors"
"net/http"
"os"
"strings"
"sync/atomic"
"time"
promclient "github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"go.opentelemetry.io/contrib/instrumentation/runtime"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
"go.opentelemetry.io/otel/exporters/prometheus"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/resource"
"go.opentelemetry.io/otel/sdk/trace"
semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
"google.golang.org/grpc/credentials"
)
// Config controls telemetry initialization via env flags.
//
// Defaults align with the issue requirements:
// - Prometheus exporter enabled by default (/metrics)
// - OTLP exporter disabled by default
// - Durations in seconds, bytes in raw bytes
// - Admin HTTP server address configurable (for mounting /metrics)
type Config struct {
ServiceName string
ServiceVersion string
// Optional resource attributes
SiteID string
Region string
PromEnabled bool
OTLPEnabled bool
OTLPEndpoint string // host:port
OTLPInsecure bool
MetricExportInterval time.Duration
AdminAddr string // e.g.: ":2112"
// Optional build info for newt_build_info metric
BuildVersion string
BuildCommit string
}
// FromEnv reads configuration from environment variables.
//
// NEWT_METRICS_PROMETHEUS_ENABLED (default: true)
// NEWT_METRICS_OTLP_ENABLED (default: false)
// OTEL_EXPORTER_OTLP_ENDPOINT (default: "localhost:4317")
// OTEL_EXPORTER_OTLP_INSECURE (default: true)
// OTEL_METRIC_EXPORT_INTERVAL (default: 15s)
// OTEL_SERVICE_NAME (default: "newt")
// OTEL_SERVICE_VERSION (default: "")
// NEWT_ADMIN_ADDR (default: ":2112")
func FromEnv() Config {
// Prefer explicit NEWT_* env vars, then fall back to OTEL_RESOURCE_ATTRIBUTES
site := getenv("NEWT_SITE_ID", "")
if site == "" {
site = getenv("NEWT_ID", "")
}
region := os.Getenv("NEWT_REGION")
if site == "" || region == "" {
if ra := os.Getenv("OTEL_RESOURCE_ATTRIBUTES"); ra != "" {
m := parseResourceAttributes(ra)
if site == "" {
site = m["site_id"]
}
if region == "" {
region = m["region"]
}
}
}
return Config{
ServiceName: getenv("OTEL_SERVICE_NAME", "newt"),
ServiceVersion: os.Getenv("OTEL_SERVICE_VERSION"),
SiteID: site,
Region: region,
PromEnabled: getenv("NEWT_METRICS_PROMETHEUS_ENABLED", "true") == "true",
OTLPEnabled: getenv("NEWT_METRICS_OTLP_ENABLED", "false") == "true",
OTLPEndpoint: getenv("OTEL_EXPORTER_OTLP_ENDPOINT", "localhost:4317"),
OTLPInsecure: getenv("OTEL_EXPORTER_OTLP_INSECURE", "true") == "true",
MetricExportInterval: getdur("OTEL_METRIC_EXPORT_INTERVAL", 15*time.Second),
AdminAddr: getenv("NEWT_ADMIN_ADDR", ":2112"),
}
}
// Setup holds initialized telemetry providers and (optionally) a /metrics handler.
// Call Shutdown when the process terminates to flush exporters.
type Setup struct {
MeterProvider *metric.MeterProvider
TracerProvider *trace.TracerProvider
PrometheusHandler http.Handler // nil if Prometheus exporter disabled
shutdowns []func(context.Context) error
}
// Init configures OpenTelemetry metrics and (optionally) tracing.
//
// It sets a global MeterProvider and TracerProvider, registers runtime instrumentation,
// installs recommended histogram views for *_latency_seconds, and returns a Setup with
// a Shutdown method to flush exporters.
func Init(ctx context.Context, cfg Config) (*Setup, error) {
// Configure tunnel_id label inclusion from env (default true)
if getenv("NEWT_METRICS_INCLUDE_TUNNEL_ID", "true") == "true" {
includeTunnelIDVal.Store(true)
} else {
includeTunnelIDVal.Store(false)
}
if getenv("NEWT_METRICS_INCLUDE_SITE_LABELS", "true") == "true" {
includeSiteLabelVal.Store(true)
} else {
includeSiteLabelVal.Store(false)
}
res := buildResource(ctx, cfg)
UpdateSiteInfo(cfg.SiteID, cfg.Region)
s := &Setup{}
readers, promHandler, shutdowns, err := setupMetricExport(ctx, cfg, res)
if err != nil {
return nil, err
}
s.PrometheusHandler = promHandler
// Build provider
mp := buildMeterProvider(res, readers)
otel.SetMeterProvider(mp)
s.MeterProvider = mp
s.shutdowns = append(s.shutdowns, mp.Shutdown)
// Optional tracing
if cfg.OTLPEnabled {
if tp, shutdown := setupTracing(ctx, cfg, res); tp != nil {
otel.SetTracerProvider(tp)
s.TracerProvider = tp
s.shutdowns = append(s.shutdowns, func(c context.Context) error {
return errors.Join(shutdown(c), tp.Shutdown(c))
})
}
}
// Add metric exporter shutdowns
s.shutdowns = append(s.shutdowns, shutdowns...)
// Runtime metrics
_ = runtime.Start(runtime.WithMeterProvider(mp))
// Instruments
if err := registerInstruments(); err != nil {
return nil, err
}
if cfg.BuildVersion != "" || cfg.BuildCommit != "" {
RegisterBuildInfo(cfg.BuildVersion, cfg.BuildCommit)
}
return s, nil
}
func buildResource(ctx context.Context, cfg Config) *resource.Resource {
attrs := []attribute.KeyValue{
semconv.ServiceName(cfg.ServiceName),
semconv.ServiceVersion(cfg.ServiceVersion),
}
if cfg.SiteID != "" {
attrs = append(attrs, attribute.String("site_id", cfg.SiteID))
}
if cfg.Region != "" {
attrs = append(attrs, attribute.String("region", cfg.Region))
}
res, _ := resource.New(ctx, resource.WithFromEnv(), resource.WithHost(), resource.WithAttributes(attrs...))
return res
}
func setupMetricExport(ctx context.Context, cfg Config, _ *resource.Resource) ([]metric.Reader, http.Handler, []func(context.Context) error, error) {
var readers []metric.Reader
var shutdowns []func(context.Context) error
var promHandler http.Handler
if cfg.PromEnabled {
reg := promclient.NewRegistry()
exp, err := prometheus.New(prometheus.WithRegisterer(reg))
if err != nil {
return nil, nil, nil, err
}
readers = append(readers, exp)
promHandler = promhttp.HandlerFor(reg, promhttp.HandlerOpts{})
}
if cfg.OTLPEnabled {
mopts := []otlpmetricgrpc.Option{otlpmetricgrpc.WithEndpoint(cfg.OTLPEndpoint)}
if hdrs := parseOTLPHeaders(os.Getenv("OTEL_EXPORTER_OTLP_HEADERS")); len(hdrs) > 0 {
mopts = append(mopts, otlpmetricgrpc.WithHeaders(hdrs))
}
if cfg.OTLPInsecure {
mopts = append(mopts, otlpmetricgrpc.WithInsecure())
} else if certFile := os.Getenv("OTEL_EXPORTER_OTLP_CERTIFICATE"); certFile != "" {
if creds, cerr := credentials.NewClientTLSFromFile(certFile, ""); cerr == nil {
mopts = append(mopts, otlpmetricgrpc.WithTLSCredentials(creds))
}
}
mexp, err := otlpmetricgrpc.New(ctx, mopts...)
if err != nil {
return nil, nil, nil, err
}
readers = append(readers, metric.NewPeriodicReader(mexp, metric.WithInterval(cfg.MetricExportInterval)))
shutdowns = append(shutdowns, mexp.Shutdown)
}
return readers, promHandler, shutdowns, nil
}
func buildMeterProvider(res *resource.Resource, readers []metric.Reader) *metric.MeterProvider {
var mpOpts []metric.Option
mpOpts = append(mpOpts, metric.WithResource(res))
for _, r := range readers {
mpOpts = append(mpOpts, metric.WithReader(r))
}
mpOpts = append(mpOpts, metric.WithView(metric.NewView(
metric.Instrument{Name: "newt_*_latency_seconds"},
metric.Stream{Aggregation: metric.AggregationExplicitBucketHistogram{Boundaries: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30}}},
)))
mpOpts = append(mpOpts, metric.WithView(metric.NewView(
metric.Instrument{Name: "newt_*"},
metric.Stream{AttributeFilter: func(kv attribute.KeyValue) bool {
k := string(kv.Key)
switch k {
case "tunnel_id", "transport", "direction", "protocol", "result", "reason", "initiator", "error_type", "msg_type", "phase", "version", "commit", "site_id", "region":
return true
default:
return false
}
}},
)))
return metric.NewMeterProvider(mpOpts...)
}
func setupTracing(ctx context.Context, cfg Config, res *resource.Resource) (*trace.TracerProvider, func(context.Context) error) {
topts := []otlptracegrpc.Option{otlptracegrpc.WithEndpoint(cfg.OTLPEndpoint)}
if hdrs := parseOTLPHeaders(os.Getenv("OTEL_EXPORTER_OTLP_HEADERS")); len(hdrs) > 0 {
topts = append(topts, otlptracegrpc.WithHeaders(hdrs))
}
if cfg.OTLPInsecure {
topts = append(topts, otlptracegrpc.WithInsecure())
} else if certFile := os.Getenv("OTEL_EXPORTER_OTLP_CERTIFICATE"); certFile != "" {
if creds, cerr := credentials.NewClientTLSFromFile(certFile, ""); cerr == nil {
topts = append(topts, otlptracegrpc.WithTLSCredentials(creds))
}
}
exp, err := otlptracegrpc.New(ctx, topts...)
if err != nil {
return nil, nil
}
tp := trace.NewTracerProvider(trace.WithBatcher(exp), trace.WithResource(res))
return tp, exp.Shutdown
}
// Shutdown flushes exporters and providers in reverse init order.
func (s *Setup) Shutdown(ctx context.Context) error {
var err error
for i := len(s.shutdowns) - 1; i >= 0; i-- {
err = errors.Join(err, s.shutdowns[i](ctx))
}
return err
}
func parseOTLPHeaders(h string) map[string]string {
m := map[string]string{}
if h == "" {
return m
}
pairs := strings.Split(h, ",")
for _, p := range pairs {
kv := strings.SplitN(strings.TrimSpace(p), "=", 2)
if len(kv) == 2 {
m[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
}
}
return m
}
// parseResourceAttributes parses OTEL_RESOURCE_ATTRIBUTES formatted as k=v,k2=v2
func parseResourceAttributes(s string) map[string]string {
m := map[string]string{}
if s == "" {
return m
}
parts := strings.Split(s, ",")
for _, p := range parts {
kv := strings.SplitN(strings.TrimSpace(p), "=", 2)
if len(kv) == 2 {
m[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
}
}
return m
}
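Both parseOTLPHeaders and parseResourceAttributes implement the same comma-separated k=v grammar, splitting on commas, then on the first "=", and trimming whitespace on both key and value. A standalone sketch of that logic (parseKVList is a hypothetical name used only for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// parseKVList mirrors the k=v,k2=v2 parsing above: malformed
// segments without "=" are skipped rather than treated as errors.
func parseKVList(s string) map[string]string {
	m := map[string]string{}
	for _, p := range strings.Split(s, ",") {
		kv := strings.SplitN(strings.TrimSpace(p), "=", 2)
		if len(kv) == 2 {
			m[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
		}
	}
	return m
}

func main() {
	m := parseKVList("site_id=site-a, region=eu-central-1")
	fmt.Println(m["site_id"], m["region"]) // site-a eu-central-1
}
```

Note that values containing "=" survive intact because only the first "=" splits the pair, which matters for OTLP headers carrying base64-encoded credentials.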
// Global site/region used to enrich metric labels.
var siteIDVal atomic.Value
var regionVal atomic.Value
var (
includeTunnelIDVal atomic.Value // bool; default true
includeSiteLabelVal atomic.Value // bool; default false
)
// UpdateSiteInfo updates the global site_id and region used for metric labels.
// Thread-safe via atomic.Value: subsequent metric emissions include the new
// labels; prior emissions remain unchanged.
func UpdateSiteInfo(siteID, region string) {
if siteID != "" {
siteIDVal.Store(siteID)
}
if region != "" {
regionVal.Store(region)
}
}
func getSiteID() string {
if v, ok := siteIDVal.Load().(string); ok {
return v
}
return ""
}
func getRegion() string {
if v, ok := regionVal.Load().(string); ok {
return v
}
return ""
}
// siteAttrs returns label KVs for site_id and region (if set).
func siteAttrs() []attribute.KeyValue {
var out []attribute.KeyValue
if s := getSiteID(); s != "" {
out = append(out, attribute.String("site_id", s))
}
if r := getRegion(); r != "" {
out = append(out, attribute.String("region", r))
}
return out
}
// SiteLabelKVs exposes site label KVs for other packages (e.g., proxy manager).
func SiteLabelKVs() []attribute.KeyValue {
if !ShouldIncludeSiteLabels() {
return nil
}
return siteAttrs()
}
// ShouldIncludeTunnelID returns whether tunnel_id labels should be emitted.
func ShouldIncludeTunnelID() bool {
if v, ok := includeTunnelIDVal.Load().(bool); ok {
return v
}
return true
}
// ShouldIncludeSiteLabels returns whether site_id/region should be emitted as
// metric labels in addition to resource attributes.
func ShouldIncludeSiteLabels() bool {
if v, ok := includeSiteLabelVal.Load().(bool); ok {
return v
}
return false
}
func getenv(k, d string) string {
if v := os.Getenv(k); v != "" {
return v
}
return d
}
func getdur(k string, d time.Duration) time.Duration {
if v := os.Getenv(k); v != "" {
if p, e := time.ParseDuration(v); e == nil {
return p
}
}
return d
}


@@ -0,0 +1,53 @@
package telemetry
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"go.opentelemetry.io/otel/attribute"
)
// Test that disallowed attributes are filtered from the exposition.
func TestAttributeFilterDropsUnknownKeys(t *testing.T) {
ctx := context.Background()
resetMetricsForTest()
t.Setenv("NEWT_METRICS_INCLUDE_SITE_LABELS", "true")
cfg := Config{ServiceName: "newt", PromEnabled: true, AdminAddr: "127.0.0.1:0"}
tel, err := Init(ctx, cfg)
if err != nil {
t.Fatalf("init: %v", err)
}
defer func() { _ = tel.Shutdown(context.Background()) }()
if tel.PrometheusHandler == nil {
t.Fatalf("prom handler nil")
}
ts := httptest.NewServer(tel.PrometheusHandler)
defer ts.Close()
// Add samples with disallowed attribute keys
for _, k := range []string{"forbidden", "site_id", "host"} {
set := attribute.NewSet(attribute.String(k, "x"))
AddTunnelBytesSet(ctx, 123, set)
}
time.Sleep(50 * time.Millisecond)
resp, err := http.Get(ts.URL)
if err != nil {
t.Fatalf("GET: %v", err)
}
defer resp.Body.Close()
b, _ := io.ReadAll(resp.Body)
body := string(b)
if strings.Contains(body, "forbidden=") {
t.Fatalf("unexpected forbidden attribute leaked into metrics: %s", body)
}
if !strings.Contains(body, "site_id=\"x\"") {
t.Fatalf("expected allowed attribute site_id to be present in metrics, got: %s", body)
}
}


@@ -0,0 +1,76 @@
package telemetry
import (
"bufio"
"context"
"io"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
"time"
)
// Golden test that /metrics contains expected metric names.
func TestMetricsGoldenContains(t *testing.T) {
ctx := context.Background()
resetMetricsForTest()
t.Setenv("NEWT_METRICS_INCLUDE_SITE_LABELS", "true")
cfg := Config{ServiceName: "newt", PromEnabled: true, AdminAddr: "127.0.0.1:0", BuildVersion: "test"}
tel, err := Init(ctx, cfg)
if err != nil {
t.Fatalf("telemetry init error: %v", err)
}
defer func() { _ = tel.Shutdown(context.Background()) }()
if tel.PrometheusHandler == nil {
t.Fatalf("prom handler nil")
}
ts := httptest.NewServer(tel.PrometheusHandler)
defer ts.Close()
// Trigger counters to ensure they appear in the scrape
IncConnAttempt(ctx, "websocket", "success")
IncWSReconnect(ctx, "io_error")
IncProxyConnectionEvent(ctx, "", "tcp", ProxyConnectionOpened)
if tel.MeterProvider != nil {
_ = tel.MeterProvider.ForceFlush(ctx)
}
time.Sleep(100 * time.Millisecond)
var body string
for i := 0; i < 5; i++ {
resp, err := http.Get(ts.URL)
if err != nil {
t.Fatalf("GET metrics failed: %v", err)
}
b, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
body = string(b)
if strings.Contains(body, "newt_connection_attempts_total") {
break
}
time.Sleep(100 * time.Millisecond)
}
f, err := os.Open(filepath.Join("testdata", "expected_contains.golden"))
if err != nil {
t.Fatalf("read golden: %v", err)
}
defer f.Close()
s := bufio.NewScanner(f)
for s.Scan() {
needle := strings.TrimSpace(s.Text())
if needle == "" {
continue
}
if !strings.Contains(body, needle) {
t.Fatalf("expected metrics body to contain %q. body=\n%s", needle, body)
}
}
if err := s.Err(); err != nil {
t.Fatalf("scan golden: %v", err)
}
}


@@ -0,0 +1,65 @@
package telemetry
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
// Smoke test that /metrics contains at least one newt_* metric when Prom exporter is enabled.
func TestMetricsSmoke(t *testing.T) {
ctx := context.Background()
resetMetricsForTest()
t.Setenv("NEWT_METRICS_INCLUDE_SITE_LABELS", "true")
cfg := Config{
ServiceName: "newt",
PromEnabled: true,
OTLPEnabled: false,
AdminAddr: "127.0.0.1:0",
BuildVersion: "test",
BuildCommit: "deadbeef",
MetricExportInterval: 5 * time.Second,
}
tel, err := Init(ctx, cfg)
if err != nil {
t.Fatalf("telemetry init error: %v", err)
}
defer func() { _ = tel.Shutdown(context.Background()) }()
// Serve the Prom handler on a test server
if tel.PrometheusHandler == nil {
t.Fatalf("Prometheus handler nil; PromEnabled should enable it")
}
ts := httptest.NewServer(tel.PrometheusHandler)
defer ts.Close()
// Record a simple metric and then fetch /metrics
IncConnAttempt(ctx, "websocket", "success")
if tel.MeterProvider != nil {
_ = tel.MeterProvider.ForceFlush(ctx)
}
// Give the exporter a tick to collect
time.Sleep(100 * time.Millisecond)
var body string
for i := 0; i < 5; i++ {
resp, err := http.Get(ts.URL)
if err != nil {
t.Fatalf("GET /metrics failed: %v", err)
}
b, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
body = string(b)
if strings.Contains(body, "newt_connection_attempts_total") {
break
}
time.Sleep(100 * time.Millisecond)
}
if !strings.Contains(body, "newt_connection_attempts_total") {
t.Fatalf("expected newt_connection_attempts_total in metrics, got:\n%s", body)
}
}


@@ -0,0 +1,7 @@
newt_connection_attempts_total
newt_websocket_connected
newt_websocket_reconnects_total
newt_proxy_connections_total
newt_build_info
process_start_time_seconds

key (new file, 1 line)

@@ -0,0 +1 @@
oBvcoMJZXGzTZ4X+aNSCCQIjroREFBeRCs+a328xWGA=


@@ -4,7 +4,8 @@ package main
import (
"fmt"
"strings"
"os"
"runtime"
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/proxy"
@@ -13,71 +14,61 @@ import (
"github.com/fosrl/newt/wgtester"
)
var wgService *wg.WireGuardService
var wgTesterServer *wgtester.Server
var wgServiceNative *wg.WireGuardService
func setupClients(client *websocket.Client) {
var host = endpoint
if strings.HasPrefix(host, "http://") {
host = strings.TrimPrefix(host, "http://")
} else if strings.HasPrefix(host, "https://") {
host = strings.TrimPrefix(host, "https://")
func setupClientsNative(client *websocket.Client, host string) {
if runtime.GOOS != "linux" {
logger.Fatal("Tunnel management is only supported on Linux right now!")
os.Exit(1)
}
host = strings.TrimSuffix(host, "/")
// make sure we are sudo
if os.Geteuid() != 0 {
logger.Fatal("You must run this program as root to manage tunnels on Linux.")
os.Exit(1)
}
// Create WireGuard service
wgService, err = wg.NewWireGuardService(interfaceName, mtuInt, generateAndSaveKeyTo, host, id, client)
wgServiceNative, err = wg.NewWireGuardService(interfaceName, mtuInt, generateAndSaveKeyTo, host, id, client)
if err != nil {
logger.Fatal("Failed to create WireGuard service: %v", err)
}
defer wgService.Close(rm)
wgTesterServer = wgtester.NewServer("0.0.0.0", wgService.Port, id) // TODO: maybe make this the same ip of the wg server?
wgTesterServer = wgtester.NewServer("0.0.0.0", wgServiceNative.Port, id) // TODO: maybe make this the same ip of the wg server?
err := wgTesterServer.Start()
if err != nil {
logger.Error("Failed to start WireGuard tester server: %v", err)
} else {
// Make sure to stop the server on exit
defer wgTesterServer.Stop()
}
client.OnTokenUpdate(func(token string) {
wgService.SetToken(token)
wgServiceNative.SetToken(token)
})
}
func closeClients() {
if wgService != nil {
wgService.Close(rm)
wgService = nil
}
if wgTesterServer != nil {
wgTesterServer.Stop()
wgTesterServer = nil
func closeWgServiceNative() {
if wgServiceNative != nil {
wgServiceNative.Close(!keepInterface)
wgServiceNative = nil
}
}
func clientsHandleNewtConnection(publicKey string) {
if wgService == nil {
return
func clientsOnConnectNative() {
if wgServiceNative != nil {
wgServiceNative.LoadRemoteConfig()
}
wgService.SetServerPubKey(publicKey)
}
func clientsOnConnect() {
if wgService == nil {
return
func clientsHandleNewtConnectionNative(publicKey, endpoint string) {
if wgServiceNative != nil {
wgServiceNative.StartHolepunch(publicKey, endpoint)
}
wgService.LoadRemoteConfig()
}
func clientsAddProxyTarget(pm *proxy.ProxyManager, tunnelIp string) {
if wgService == nil {
return
}
func clientsAddProxyTargetNative(pm *proxy.ProxyManager, tunnelIp string) {
// add a udp proxy for localhost and the wgService port
// TODO: make sure this port is not used in a target
pm.AddTarget("udp", tunnelIp, int(wgService.Port), fmt.Sprintf("127.0.0.1:%d", wgService.Port))
if wgServiceNative != nil {
pm.AddTarget("udp", tunnelIp, int(wgServiceNative.Port), fmt.Sprintf("127.0.0.1:%d", wgServiceNative.Port))
}
}


@@ -2,6 +2,7 @@ package logger
import (
"fmt"
"io"
"log"
"os"
"sync"
@@ -48,6 +49,11 @@ func (l *Logger) SetLevel(level LogLevel) {
l.level = level
}
// SetOutput sets the output destination for the logger
func (l *Logger) SetOutput(w io.Writer) {
l.logger.SetOutput(w)
}
// log handles the actual logging
func (l *Logger) log(level LogLevel, format string, args ...interface{}) {
if level < l.level {
@@ -120,3 +126,8 @@ func Error(format string, args ...interface{}) {
func Fatal(format string, args ...interface{}) {
GetLogger().Fatal(format, args...)
}
// SetOutput sets the output destination for the default logger
func SetOutput(w io.Writer) {
GetLogger().SetOutput(w)
}

main.go (905 changes): file diff suppressed because it is too large

@@ -0,0 +1,802 @@
diff --git a/Dockerfile b/Dockerfile
index b9c4d29..b9b6dea 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -22,6 +22,9 @@ RUN apk --no-cache add ca-certificates tzdata
COPY --from=builder /newt /usr/local/bin/
COPY entrypoint.sh /
+# Admin/metrics endpoint (Prometheus scrape)
+EXPOSE 2112
+
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
-CMD ["newt"]
\ No newline at end of file
+CMD ["newt"]
diff --git a/go.mod b/go.mod
index d475835..5909955 100644
--- a/go.mod
+++ b/go.mod
@@ -7,6 +7,14 @@ require (
github.com/google/gopacket v1.1.19
github.com/gorilla/websocket v1.5.3
github.com/vishvananda/netlink v1.3.1
+ go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.62.0
+ go.opentelemetry.io/contrib/instrumentation/runtime v0.62.0
+ go.opentelemetry.io/otel v1.37.0
+ go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.37.0
+ go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.37.0
+ go.opentelemetry.io/otel/sdk/metric v1.37.0
+ go.opentelemetry.io/otel/sdk/trace v1.37.0
+ go.opentelemetry.io/otel/semconv v1.26.0
golang.org/x/crypto v0.42.0
golang.org/x/exp v0.0.0-20250718183923-645b1fa84792
golang.org/x/net v0.44.0
diff --git a/main.go b/main.go
index 12849b1..c223b75 100644
--- a/main.go
+++ b/main.go
@@ -1,7 +1,9 @@
package main
import (
+ "context"
"encoding/json"
+ "errors"
"flag"
"fmt"
"net"
@@ -22,6 +24,9 @@ import (
"github.com/fosrl/newt/updates"
"github.com/fosrl/newt/websocket"
+ "github.com/fosrl/newt/internal/state"
+ "github.com/fosrl/newt/internal/telemetry"
+ "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/device"
"golang.zx2c4.com/wireguard/tun"
@@ -116,6 +121,13 @@ var (
healthMonitor *healthcheck.Monitor
enforceHealthcheckCert bool
+ // Observability/metrics flags
+ metricsEnabled bool
+ otlpEnabled bool
+ adminAddr string
+ region string
+ metricsAsyncBytes bool
+
// New mTLS configuration variables
tlsClientCert string
tlsClientKey string
@@ -126,6 +138,10 @@ var (
)
func main() {
+ // Prepare context for graceful shutdown and signal handling
+ ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
+ defer stop()
+
// if PANGOLIN_ENDPOINT, NEWT_ID, and NEWT_SECRET are set as environment variables, they will be used as default values
endpoint = os.Getenv("PANGOLIN_ENDPOINT")
id = os.Getenv("NEWT_ID")
@@ -141,6 +157,13 @@ func main() {
useNativeInterfaceEnv := os.Getenv("USE_NATIVE_INTERFACE")
enforceHealthcheckCertEnv := os.Getenv("ENFORCE_HC_CERT")
+ // Metrics/observability env mirrors
+ metricsEnabledEnv := os.Getenv("NEWT_METRICS_PROMETHEUS_ENABLED")
+ otlpEnabledEnv := os.Getenv("NEWT_METRICS_OTLP_ENABLED")
+ adminAddrEnv := os.Getenv("NEWT_ADMIN_ADDR")
+ regionEnv := os.Getenv("NEWT_REGION")
+ asyncBytesEnv := os.Getenv("NEWT_METRICS_ASYNC_BYTES")
+
keepInterface = keepInterfaceEnv == "true"
acceptClients = acceptClientsEnv == "true"
useNativeInterface = useNativeInterfaceEnv == "true"
@@ -272,6 +295,35 @@ func main() {
flag.StringVar(&healthFile, "health-file", "", "Path to health file (if unset, health file won't be written)")
}
+ // Metrics/observability flags (mirror ENV if unset)
+ if metricsEnabledEnv == "" {
+ flag.BoolVar(&metricsEnabled, "metrics", true, "Enable Prometheus /metrics exporter")
+ } else {
+ if v, err := strconv.ParseBool(metricsEnabledEnv); err == nil { metricsEnabled = v } else { metricsEnabled = true }
+ }
+ if otlpEnabledEnv == "" {
+ flag.BoolVar(&otlpEnabled, "otlp", false, "Enable OTLP exporters (metrics/traces) to OTEL_EXPORTER_OTLP_ENDPOINT")
+ } else {
+ if v, err := strconv.ParseBool(otlpEnabledEnv); err == nil { otlpEnabled = v }
+ }
+ if adminAddrEnv == "" {
+ flag.StringVar(&adminAddr, "metrics-admin-addr", "127.0.0.1:2112", "Admin/metrics bind address")
+ } else {
+ adminAddr = adminAddrEnv
+ }
+ // Async bytes toggle
+ if asyncBytesEnv == "" {
+ flag.BoolVar(&metricsAsyncBytes, "metrics-async-bytes", false, "Enable async bytes counting (background flush; lower hot path overhead)")
+ } else {
+ if v, err := strconv.ParseBool(asyncBytesEnv); err == nil { metricsAsyncBytes = v }
+ }
+ // Optional region flag (resource attribute)
+ if regionEnv == "" {
+ flag.StringVar(&region, "region", "", "Optional region resource attribute (also NEWT_REGION)")
+ } else {
+ region = regionEnv
+ }
+
// do a --version check
version := flag.Bool("version", false, "Print the version")
@@ -286,6 +338,50 @@ func main() {
loggerLevel := parseLogLevel(logLevel)
logger.GetLogger().SetLevel(parseLogLevel(logLevel))
+ // Initialize telemetry after flags are parsed (so flags override env)
+ tcfg := telemetry.FromEnv()
+ tcfg.PromEnabled = metricsEnabled
+ tcfg.OTLPEnabled = otlpEnabled
+ if adminAddr != "" { tcfg.AdminAddr = adminAddr }
+ // Resource attributes (if available)
+ tcfg.SiteID = id
+ tcfg.Region = region
+ // Build info
+ tcfg.BuildVersion = newtVersion
+ tcfg.BuildCommit = os.Getenv("NEWT_COMMIT")
+
+ tel, telErr := telemetry.Init(ctx, tcfg)
+ if telErr != nil {
+ logger.Warn("Telemetry init failed: %v", telErr)
+ }
+ if tel != nil {
+ // Admin HTTP server (exposes /metrics when Prometheus exporter is enabled)
+ mux := http.NewServeMux()
+ mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(200) })
+ if tel.PrometheusHandler != nil {
+ mux.Handle("/metrics", tel.PrometheusHandler)
+ }
+ admin := &http.Server{
+ Addr: tcfg.AdminAddr,
+ Handler: otelhttp.NewHandler(mux, "newt-admin"),
+ ReadTimeout: 5 * time.Second,
+ WriteTimeout: 10 * time.Second,
+ ReadHeaderTimeout: 5 * time.Second,
+ IdleTimeout: 30 * time.Second,
+ }
+ go func() {
+ if err := admin.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
+ logger.Warn("admin http error: %v", err)
+ }
+ }()
+ defer func() {
+ ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ defer cancel()
+ _ = admin.Shutdown(ctx)
+ }()
+ defer func() { _ = tel.Shutdown(context.Background()) }()
+ }
+
newtVersion := "version_replaceme"
if *version {
fmt.Println("Newt version " + newtVersion)
@@ -557,7 +653,10 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
}
// Use reliable ping for initial connection test
logger.Debug("Testing initial connection with reliable ping...")
- _, err = reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ lat, err := reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ if err == nil && wgData.PublicKey != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", wgData.PublicKey, "wireguard", lat.Seconds())
+ }
if err != nil {
logger.Warn("Initial reliable ping failed, but continuing: %v", err)
} else {
@@ -570,14 +669,20 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
// as the pings will continue in the background
if !connected {
logger.Debug("Starting ping check")
- pingStopChan = startPingCheck(tnet, wgData.ServerIP, client)
+ pingStopChan = startPingCheck(tnet, wgData.ServerIP, client, wgData.PublicKey)
}
// Create proxy manager
pm = proxy.NewProxyManager(tnet)
+ pm.SetAsyncBytes(metricsAsyncBytes)
+ // Set tunnel_id for metrics (WireGuard peer public key)
+ pm.SetTunnelID(wgData.PublicKey)
connected = true
+ // telemetry: record a successful site registration (omit region unless available)
+ telemetry.IncSiteRegistration(context.Background(), id, "", "success")
+
// add the targets if there are any
if len(wgData.Targets.TCP) > 0 {
updateTargets(pm, "add", wgData.TunnelIP, "tcp", TargetData{Targets: wgData.Targets.TCP})
@@ -611,10 +716,25 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/reconnect", func(msg websocket.WSMessage) {
logger.Info("Received reconnect message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
+ // Clear metrics attrs and sessions for the tunnel
+ if pm != nil {
+ pm.ClearTunnelID()
+ state.Global().ClearTunnel(wgData.PublicKey)
+ }
+
// Mark as disconnected
connected = false
@@ -631,6 +751,9 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/terminate", func(msg websocket.WSMessage) {
logger.Info("Received termination message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
diff --git a/proxy/manager.go b/proxy/manager.go
index bf10322..86c47a8 100644
--- a/proxy/manager.go
+++ b/proxy/manager.go
@@ -1,16 +1,22 @@
package proxy
import (
+ "context"
"fmt"
"io"
"net"
+ "os"
"strings"
"sync"
+ "sync/atomic"
"time"
+ "github.com/fosrl/newt/internal/state"
+ "github.com/fosrl/newt/internal/telemetry"
"github.com/fosrl/newt/logger"
"golang.zx2c4.com/wireguard/tun/netstack"
"gvisor.dev/gvisor/pkg/tcpip/adapters/gonet"
+ "go.opentelemetry.io/otel/attribute"
)
// Target represents a proxy target with its address and port
@@ -28,6 +34,52 @@ type ProxyManager struct {
udpConns []*gonet.UDPConn
running bool
mutex sync.RWMutex
+
+ // telemetry (multi-tunnel)
+ currentTunnelID string
+ tunnels map[string]*tunnelEntry
+ asyncBytes bool
+ flushStop chan struct{}
+}
+
+// tunnelEntry holds per-tunnel attributes and (optional) async counters.
+type tunnelEntry struct {
+ attrInTCP attribute.Set
+ attrOutTCP attribute.Set
+ attrInUDP attribute.Set
+ attrOutUDP attribute.Set
+
+ bytesInTCP atomic.Uint64
+ bytesOutTCP atomic.Uint64
+ bytesInUDP atomic.Uint64
+ bytesOutUDP atomic.Uint64
+}
+
+// countingWriter wraps an io.Writer and adds bytes to OTel counter using a pre-built attribute set.
+type countingWriter struct {
+ ctx context.Context
+ w io.Writer
+ set attribute.Set
+ pm *ProxyManager
+ ent *tunnelEntry
+ out bool // false=in, true=out
+ proto string // "tcp" or "udp"
+}
+
+func (cw *countingWriter) Write(p []byte) (int, error) {
+ n, err := cw.w.Write(p)
+ if n > 0 {
+ if cw.pm != nil && cw.pm.asyncBytes && cw.ent != nil {
+ if cw.proto == "tcp" {
+ if cw.out { cw.ent.bytesOutTCP.Add(uint64(n)) } else { cw.ent.bytesInTCP.Add(uint64(n)) }
+ } else if cw.proto == "udp" {
+ if cw.out { cw.ent.bytesOutUDP.Add(uint64(n)) } else { cw.ent.bytesInUDP.Add(uint64(n)) }
+ }
+ } else {
+ telemetry.AddTunnelBytesSet(cw.ctx, int64(n), cw.set)
+ }
+ }
+ return n, err
}
// NewProxyManager creates a new proxy manager instance
@@ -38,9 +90,46 @@ func NewProxyManager(tnet *netstack.Net) *ProxyManager {
udpTargets: make(map[string]map[int]string),
listeners: make([]*gonet.TCPListener, 0),
udpConns: make([]*gonet.UDPConn, 0),
+ tunnels: make(map[string]*tunnelEntry),
}
}
+// SetTunnelID sets the WireGuard peer public key used as tunnel_id label.
+func (pm *ProxyManager) SetTunnelID(id string) {
+ pm.mutex.Lock()
+ defer pm.mutex.Unlock()
+ pm.currentTunnelID = id
+ if _, ok := pm.tunnels[id]; !ok {
+ pm.tunnels[id] = &tunnelEntry{}
+ }
+ e := pm.tunnels[id]
+ e.attrInTCP = attribute.NewSet(attribute.String("tunnel_id", id), attribute.String("direction", "in"), attribute.String("protocol", "tcp"))
+ e.attrOutTCP = attribute.NewSet(attribute.String("tunnel_id", id), attribute.String("direction", "out"), attribute.String("protocol", "tcp"))
+ e.attrInUDP = attribute.NewSet(attribute.String("tunnel_id", id), attribute.String("direction", "in"), attribute.String("protocol", "udp"))
+ e.attrOutUDP = attribute.NewSet(attribute.String("tunnel_id", id), attribute.String("direction", "out"), attribute.String("protocol", "udp"))
+}
+
+// ClearTunnelID clears cached attribute sets for the current tunnel.
+func (pm *ProxyManager) ClearTunnelID() {
+ pm.mutex.Lock()
+ defer pm.mutex.Unlock()
+ id := pm.currentTunnelID
+ if id == "" { return }
+ if e, ok := pm.tunnels[id]; ok {
+ // final flush for this tunnel
+ inTCP := e.bytesInTCP.Swap(0)
+ outTCP := e.bytesOutTCP.Swap(0)
+ inUDP := e.bytesInUDP.Swap(0)
+ outUDP := e.bytesOutUDP.Swap(0)
+ if inTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP) }
+ if outTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP) }
+ if inUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP) }
+ if outUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP) }
+ delete(pm.tunnels, id)
+ }
+ pm.currentTunnelID = ""
+}
+
// init function without tnet
func NewProxyManagerWithoutTNet() *ProxyManager {
return &ProxyManager{
@@ -160,6 +249,57 @@ func (pm *ProxyManager) Start() error {
return nil
}
+func (pm *ProxyManager) SetAsyncBytes(b bool) {
+ pm.mutex.Lock()
+ defer pm.mutex.Unlock()
+ pm.asyncBytes = b
+ if b && pm.flushStop == nil {
+ pm.flushStop = make(chan struct{})
+ go pm.flushLoop()
+ }
+}
+func (pm *ProxyManager) flushLoop() {
+ flushInterval := 2 * time.Second
+	// Note: the OTel spec defines OTEL_METRIC_EXPORT_INTERVAL as a bare millisecond
+	// count (e.g. "60000"), which time.ParseDuration rejects; a Go duration string
+	// such as "30s" is what this parser actually accepts.
+	if v := os.Getenv("OTEL_METRIC_EXPORT_INTERVAL"); v != "" {
+ if d, err := time.ParseDuration(v); err == nil && d > 0 {
+ if d/2 < flushInterval { flushInterval = d / 2 }
+ }
+ }
+ ticker := time.NewTicker(flushInterval)
+ defer ticker.Stop()
+ for {
+ select {
+ case <-ticker.C:
+ pm.mutex.RLock()
+ for _, e := range pm.tunnels {
+ inTCP := e.bytesInTCP.Swap(0)
+ outTCP := e.bytesOutTCP.Swap(0)
+ inUDP := e.bytesInUDP.Swap(0)
+ outUDP := e.bytesOutUDP.Swap(0)
+ if inTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP) }
+ if outTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP) }
+ if inUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP) }
+ if outUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP) }
+ }
+ pm.mutex.RUnlock()
+ case <-pm.flushStop:
+ pm.mutex.RLock()
+ for _, e := range pm.tunnels {
+ inTCP := e.bytesInTCP.Swap(0)
+ outTCP := e.bytesOutTCP.Swap(0)
+ inUDP := e.bytesInUDP.Swap(0)
+ outUDP := e.bytesOutUDP.Swap(0)
+ if inTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP) }
+ if outTCP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP) }
+ if inUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP) }
+ if outUDP > 0 { telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP) }
+ }
+ pm.mutex.RUnlock()
+ return
+ }
+ }
+}
+
func (pm *ProxyManager) Stop() error {
pm.mutex.Lock()
defer pm.mutex.Unlock()
@@ -236,6 +376,14 @@ func (pm *ProxyManager) startTarget(proto, listenIP string, port int, targetAddr
return nil
}
+// getEntry returns per-tunnel entry or nil.
+func (pm *ProxyManager) getEntry(id string) *tunnelEntry {
+ pm.mutex.RLock()
+ e := pm.tunnels[id]
+ pm.mutex.RUnlock()
+ return e
+}
+
func (pm *ProxyManager) handleTCPProxy(listener net.Listener, targetAddr string) {
for {
conn, err := listener.Accept()
@@ -257,6 +405,9 @@ func (pm *ProxyManager) handleTCPProxy(listener net.Listener, targetAddr string)
continue
}
+			// Count sessions only once per accepted TCP connection
+			if pm.currentTunnelID != "" { state.Global().IncSessions(pm.currentTunnelID) }
+
go func() {
target, err := net.Dial("tcp", targetAddr)
if err != nil {
@@ -265,24 +416,33 @@ func (pm *ProxyManager) handleTCPProxy(listener net.Listener, targetAddr string)
return
}
+ // already incremented on accept
+
// Create a WaitGroup to ensure both copy operations complete
var wg sync.WaitGroup
wg.Add(2)
+ // client -> target (direction=in)
go func() {
defer wg.Done()
- io.Copy(target, conn)
- target.Close()
+				e := pm.getEntry(pm.currentTunnelID)
+				var dst io.Writer = target
+				if e != nil { // guard: entry may be nil before SetTunnelID / after ClearTunnelID
+					dst = &countingWriter{ctx: context.Background(), w: target, set: e.attrInTCP, pm: pm, ent: e, out: false, proto: "tcp"}
+				}
+				_, _ = io.Copy(dst, conn)
+				_ = target.Close()
}()
+ // target -> client (direction=out)
go func() {
defer wg.Done()
- io.Copy(conn, target)
- conn.Close()
+				e := pm.getEntry(pm.currentTunnelID)
+				var dst io.Writer = conn
+				if e != nil { // guard: entry may be nil before SetTunnelID / after ClearTunnelID
+					dst = &countingWriter{ctx: context.Background(), w: conn, set: e.attrOutTCP, pm: pm, ent: e, out: true, proto: "tcp"}
+				}
+				_, _ = io.Copy(dst, target)
+				_ = conn.Close()
}()
- // Wait for both copies to complete
-		// Wait for both copies to complete
+		// Wait for both copies to complete, then decrement the session count
 			wg.Wait()
+			if pm.currentTunnelID != "" { state.Global().DecSessions(pm.currentTunnelID) }
}()
}
}
@@ -326,6 +486,14 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
}
clientKey := remoteAddr.String()
clientsMutex.RLock()
targetConn, exists := clientConns[clientKey]
clientsMutex.RUnlock()
@@ -366,6 +534,15 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
return // defer will handle cleanup
}
+			// bytes from target -> client (direction=out)
+			if pm.currentTunnelID != "" && n > 0 {
+				if pm.asyncBytes {
+					if e := pm.getEntry(pm.currentTunnelID); e != nil { e.bytesOutUDP.Add(uint64(n)) }
+				} else {
+					if e := pm.getEntry(pm.currentTunnelID); e != nil { telemetry.AddTunnelBytesSet(context.Background(), int64(n), e.attrOutUDP) }
+				}
+			}
+
_, err = conn.WriteTo(buffer[:n], remoteAddr)
if err != nil {
logger.Error("Error writing to client: %v", err)
@@ -375,13 +552,19 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
}(clientKey, targetConn, remoteAddr)
}
- _, err = targetConn.Write(buffer[:n])
+ written, err := targetConn.Write(buffer[:n])
if err != nil {
logger.Error("Error writing to target: %v", err)
targetConn.Close()
clientsMutex.Lock()
delete(clientConns, clientKey)
clientsMutex.Unlock()
+	} else if pm.currentTunnelID != "" && written > 0 {
+		// bytes from client -> target (direction=in), counted on successful write
+		if pm.asyncBytes {
+			if e := pm.getEntry(pm.currentTunnelID); e != nil { e.bytesInUDP.Add(uint64(written)) }
+		} else {
+			if e := pm.getEntry(pm.currentTunnelID); e != nil { telemetry.AddTunnelBytesSet(context.Background(), int64(written), e.attrInUDP) }
+		}
}
}
}
diff --git a/util.go b/util.go
index 7d6da4f..c1f4915 100644
--- a/util.go
+++ b/util.go
@@ -17,6 +17,7 @@ import (
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/proxy"
"github.com/fosrl/newt/websocket"
+ "github.com/fosrl/newt/internal/telemetry"
"golang.org/x/net/icmp"
"golang.org/x/net/ipv4"
"golang.zx2c4.com/wireguard/device"
@@ -229,7 +230,7 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
return stopChan, fmt.Errorf("initial ping attempts failed, continuing in background")
}
-func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client) chan struct{} {
+func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client, tunnelID string) chan struct{} {
maxInterval := 6 * time.Second
currentInterval := pingInterval
consecutiveFailures := 0
@@ -292,6 +293,9 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
if !connectionLost {
connectionLost = true
logger.Warn("Connection to server lost after %d failures. Continuous reconnection attempts will be made.", consecutiveFailures)
+ if tunnelID != "" {
+ telemetry.IncReconnect(context.Background(), "", tunnelID, telemetry.ReasonTimeout)
+ }
stopFunc = client.SendMessageInterval("newt/ping/request", map[string]interface{}{}, 3*time.Second)
// Send registration message to the server for backward compatibility
err := client.SendMessage("newt/wg/register", map[string]interface{}{
@@ -318,6 +322,10 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
} else {
// Track recent latencies
recentLatencies = append(recentLatencies, latency)
+ // Record tunnel latency (limit sampling to this periodic check)
+ if tunnelID != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", tunnelID, "wireguard", latency.Seconds())
+ }
if len(recentLatencies) > 10 {
recentLatencies = recentLatencies[1:]
}
diff --git a/websocket/client.go b/websocket/client.go
index 0c0664a..c9ac264 100644
--- a/websocket/client.go
+++ b/websocket/client.go
@@ -18,6 +18,10 @@ import (
"github.com/fosrl/newt/logger"
"github.com/gorilla/websocket"
+
+ "context"
+ "github.com/fosrl/newt/internal/telemetry"
+ "go.opentelemetry.io/otel"
)
type Client struct {
@@ -287,6 +291,7 @@ func (c *Client) getToken() (string, error) {
}
resp, err := client.Do(req)
if err != nil {
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", classifyConnError(err))
return "", fmt.Errorf("failed to request new token: %w", err)
}
defer resp.Body.Close()
@@ -294,6 +299,18 @@ func (c *Client) getToken() (string, error) {
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
logger.Error("Failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "failure")
+ bin := "http_other"
+ if resp.StatusCode >= 500 {
+ bin = "http_5xx"
+ } else if resp.StatusCode >= 400 {
+ bin = "http_4xx"
+ }
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", bin)
+ // Reconnect reason mapping for auth failures
+ if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonAuthError)
+ }
return "", fmt.Errorf("failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
}
@@ -312,10 +329,33 @@ func (c *Client) getToken() (string, error) {
}
logger.Debug("Received token: %s", tokenResp.Data.Token)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "success")
return tokenResp.Data.Token, nil
}
+// classifyConnError maps common errors to low-cardinality error_type labels
+func classifyConnError(err error) string {
+ if err == nil {
+ return ""
+ }
+ msg := strings.ToLower(err.Error())
+ switch {
+ case strings.Contains(msg, "tls") || strings.Contains(msg, "certificate"):
+ return "tls"
+ case strings.Contains(msg, "timeout") || strings.Contains(msg, "i/o timeout"):
+ return "timeout"
+ case strings.Contains(msg, "no such host") || strings.Contains(msg, "dns"):
+ return "dns"
+ case strings.Contains(msg, "unauthorized") || strings.Contains(msg, "forbidden"):
+ return "auth"
+ case strings.Contains(msg, "broken pipe") || strings.Contains(msg, "connection reset") || strings.Contains(msg, "connection refused") || strings.Contains(msg, "use of closed network connection") || strings.Contains(msg, "network is unreachable"):
+ return "io"
+ default:
+ return "other"
+ }
+}
+
func (c *Client) connectWithRetry() {
for {
select {
@@ -337,6 +377,10 @@ func (c *Client) establishConnection() error {
// Get token for authentication
token, err := c.getToken()
if err != nil {
+ // telemetry: connection attempt failed before dialing
+ // site_id isn't globally available here; use client ID as site_id (low cardinality)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", classifyConnError(err))
return fmt.Errorf("failed to get token: %w", err)
}
@@ -369,7 +413,11 @@ func (c *Client) establishConnection() error {
q.Set("clientType", c.clientType)
u.RawQuery = q.Encode()
- // Connect to WebSocket
+ // Connect to WebSocket (optional span)
+ tr := otel.Tracer("newt")
+ spanCtx, span := tr.Start(context.Background(), "ws.connect")
+ defer span.End()
+
dialer := websocket.DefaultDialer
// Use new TLS configuration method
@@ -391,11 +439,23 @@ func (c *Client) establishConnection() error {
logger.Debug("WebSocket TLS certificate verification disabled via SKIP_TLS_VERIFY environment variable")
}
- conn, _, err := dialer.Dial(u.String(), nil)
+	conn, _, err := dialer.DialContext(spanCtx, u.String(), nil)
if err != nil {
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ etype := classifyConnError(err)
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", etype)
+ // Map handshake-related errors to reconnect reasons where appropriate
+ if etype == "tls" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonHandshakeError)
+ } else if etype == "timeout" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonTimeout)
+ } else {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonError)
+ }
return fmt.Errorf("failed to connect to WebSocket: %w", err)
}
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "success")
c.conn = conn
c.setConnected(true)
diff --git a/wg/wg.go b/wg/wg.go
index 3cee1a9..a765279 100644
--- a/wg/wg.go
+++ b/wg/wg.go
@@ -3,6 +3,7 @@
package wg
import (
+ "context"
"encoding/json"
"errors"
"fmt"
@@ -23,6 +24,8 @@ import (
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/wgctrl"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -298,6 +301,13 @@ func (s *WireGuardService) handleConfig(msg websocket.WSMessage) {
s.stopGetConfig = nil
}
+ // telemetry: config reload success
+ telemetry.IncConfigReload(context.Background(), "success")
+ // Optional reconnect reason mapping: config change
+ if s.serverPubKey != "" {
+ telemetry.IncReconnect(context.Background(), "", s.serverPubKey, telemetry.ReasonConfigChange)
+ }
+
// Ensure the WireGuard interface and peers are configured
if err := s.ensureWireguardInterface(config); err != nil {
logger.Error("Failed to ensure WireGuard interface: %v", err)
diff --git a/wgnetstack/wgnetstack.go b/wgnetstack/wgnetstack.go
index 6684c40..09f160e 100644
--- a/wgnetstack/wgnetstack.go
+++ b/wgnetstack/wgnetstack.go
@@ -1,6 +1,7 @@
package wgnetstack
import (
+ "context"
"crypto/rand"
"encoding/base64"
"encoding/hex"
@@ -26,6 +27,8 @@ import (
"golang.zx2c4.com/wireguard/tun"
"golang.zx2c4.com/wireguard/tun/netstack"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -240,14 +243,20 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
return service, nil
}
+// ReportRTT allows reporting native RTTs to telemetry, rate-limited externally.
+func (s *WireGuardService) ReportRTT(seconds float64) {
+ if s.serverPubKey == "" { return }
+ telemetry.ObserveTunnelLatency(context.Background(), "", s.serverPubKey, "wireguard", seconds)
+}
+
func (s *WireGuardService) addTcpTarget(msg websocket.WSMessage) {
logger.Debug("Received: %+v", msg)
// if there is no wgData or pm, we can't add targets
if s.TunnelIP == "" || s.proxyManager == nil {
logger.Info("No tunnel IP or proxy manager available")
 		return
 	}
targetData, err := parseTargetData(msg.Data)
if err != nil {

+ if regionEnv == "" {
+ flag.StringVar(&region, "region", "", "Optional region resource attribute (also NEWT_REGION)")
+ } else {
+ region = regionEnv
+ }
+
// do a --version check
version := flag.Bool("version", false, "Print the version")
@@ -286,6 +338,50 @@ func main() {
loggerLevel := parseLogLevel(logLevel)
logger.GetLogger().SetLevel(parseLogLevel(logLevel))
+ // Initialize telemetry after flags are parsed (so flags override env)
+ tcfg := telemetry.FromEnv()
+ tcfg.PromEnabled = metricsEnabled
+ tcfg.OTLPEnabled = otlpEnabled
+ if adminAddr != "" { tcfg.AdminAddr = adminAddr }
+ // Resource attributes (if available)
+ tcfg.SiteID = id
+ tcfg.Region = region
+ // Build info
+ tcfg.BuildVersion = "version_replaceme" // keep in sync with newtVersion, which is declared later in main
+ tcfg.BuildCommit = os.Getenv("NEWT_COMMIT")
+
+ tel, telErr := telemetry.Init(ctx, tcfg)
+ if telErr != nil {
+ logger.Warn("Telemetry init failed: %v", telErr)
+ }
+ if tel != nil {
+ // Admin HTTP server (exposes /metrics when Prometheus exporter is enabled)
+ mux := http.NewServeMux()
+ mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(200) })
+ if tel.PrometheusHandler != nil {
+ mux.Handle("/metrics", tel.PrometheusHandler)
+ }
+ admin := &http.Server{
+ Addr: tcfg.AdminAddr,
+ Handler: otelhttp.NewHandler(mux, "newt-admin"),
+ ReadTimeout: 5 * time.Second,
+ WriteTimeout: 10 * time.Second,
+ ReadHeaderTimeout: 5 * time.Second,
+ IdleTimeout: 30 * time.Second,
+ }
+ go func() {
+ if err := admin.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
+ logger.Warn("admin http error: %v", err)
+ }
+ }()
+ defer func() {
+ ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ defer cancel()
+ _ = admin.Shutdown(ctx)
+ }()
+ defer func() { _ = tel.Shutdown(context.Background()) }()
+ }
+
newtVersion := "version_replaceme"
if *version {
fmt.Println("Newt version " + newtVersion)
@@ -557,7 +653,10 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
}
// Use reliable ping for initial connection test
logger.Debug("Testing initial connection with reliable ping...")
- _, err = reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ lat, err := reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ if err == nil && wgData.PublicKey != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", wgData.PublicKey, "wireguard", lat.Seconds())
+ }
if err != nil {
logger.Warn("Initial reliable ping failed, but continuing: %v", err)
} else {
@@ -570,14 +669,20 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
// as the pings will continue in the background
if !connected {
logger.Debug("Starting ping check")
- pingStopChan = startPingCheck(tnet, wgData.ServerIP, client)
+ pingStopChan = startPingCheck(tnet, wgData.ServerIP, client, wgData.PublicKey)
}
// Create proxy manager
pm = proxy.NewProxyManager(tnet)
+ pm.SetAsyncBytes(metricsAsyncBytes)
+ // Set tunnel_id for metrics (WireGuard peer public key)
+ pm.SetTunnelID(wgData.PublicKey)
connected = true
+ // telemetry: record a successful site registration (omit region unless available)
+ telemetry.IncSiteRegistration(context.Background(), id, "", "success")
+
// add the targets if there are any
if len(wgData.Targets.TCP) > 0 {
updateTargets(pm, "add", wgData.TunnelIP, "tcp", TargetData{Targets: wgData.Targets.TCP})
@@ -611,10 +716,25 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/reconnect", func(msg websocket.WSMessage) {
logger.Info("Received reconnect message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
+ // Clear metrics attrs and sessions for the tunnel
+ if pm != nil {
+ pm.ClearTunnelID()
+ state.Global().ClearTunnel(wgData.PublicKey)
+ }
+
// Mark as disconnected
connected = false
@@ -631,6 +751,9 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/terminate", func(msg websocket.WSMessage) {
logger.Info("Received termination message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
diff --git a/util.go b/util.go
index 7d6da4f..c1f4915 100644
--- a/util.go
+++ b/util.go
@@ -17,6 +17,7 @@ import (
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/proxy"
"github.com/fosrl/newt/websocket"
+ "github.com/fosrl/newt/internal/telemetry"
"golang.org/x/net/icmp"
"golang.org/x/net/ipv4"
"golang.zx2c4.com/wireguard/device"
@@ -229,7 +230,7 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
return stopChan, fmt.Errorf("initial ping attempts failed, continuing in background")
}
-func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client) chan struct{} {
+func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client, tunnelID string) chan struct{} {
maxInterval := 6 * time.Second
currentInterval := pingInterval
consecutiveFailures := 0
@@ -292,6 +293,9 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
if !connectionLost {
connectionLost = true
logger.Warn("Connection to server lost after %d failures. Continuous reconnection attempts will be made.", consecutiveFailures)
+ if tunnelID != "" {
+ telemetry.IncReconnect(context.Background(), "", tunnelID, telemetry.ReasonTimeout)
+ }
stopFunc = client.SendMessageInterval("newt/ping/request", map[string]interface{}{}, 3*time.Second)
// Send registration message to the server for backward compatibility
err := client.SendMessage("newt/wg/register", map[string]interface{}{
@@ -318,6 +322,10 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
} else {
// Track recent latencies
recentLatencies = append(recentLatencies, latency)
+ // Record tunnel latency (limit sampling to this periodic check)
+ if tunnelID != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", tunnelID, "wireguard", latency.Seconds())
+ }
if len(recentLatencies) > 10 {
recentLatencies = recentLatencies[1:]
}
diff --git a/websocket/client.go b/websocket/client.go
index 0c0664a..c9ac264 100644
--- a/websocket/client.go
+++ b/websocket/client.go
@@ -18,6 +18,10 @@ import (
"github.com/fosrl/newt/logger"
"github.com/gorilla/websocket"
+
+ "context"
+ "github.com/fosrl/newt/internal/telemetry"
+ "go.opentelemetry.io/otel"
)
type Client struct {
@@ -287,6 +291,7 @@ func (c *Client) getToken() (string, error) {
}
resp, err := client.Do(req)
if err != nil {
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", classifyConnError(err))
return "", fmt.Errorf("failed to request new token: %w", err)
}
defer resp.Body.Close()
@@ -294,6 +299,18 @@ func (c *Client) getToken() (string, error) {
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
logger.Error("Failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "failure")
+ bin := "http_other"
+ if resp.StatusCode >= 500 {
+ bin = "http_5xx"
+ } else if resp.StatusCode >= 400 {
+ bin = "http_4xx"
+ }
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", bin)
+ // Reconnect reason mapping for auth failures
+ if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonAuthError)
+ }
return "", fmt.Errorf("failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
}
@@ -312,10 +329,33 @@ func (c *Client) getToken() (string, error) {
}
logger.Debug("Received token: %s", tokenResp.Data.Token)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "success")
return tokenResp.Data.Token, nil
}
+// classifyConnError maps common errors to low-cardinality error_type labels
+func classifyConnError(err error) string {
+ if err == nil {
+ return ""
+ }
+ msg := strings.ToLower(err.Error())
+ switch {
+ case strings.Contains(msg, "tls") || strings.Contains(msg, "certificate"):
+ return "tls"
+ case strings.Contains(msg, "timeout") || strings.Contains(msg, "i/o timeout"):
+ return "timeout"
+ case strings.Contains(msg, "no such host") || strings.Contains(msg, "dns"):
+ return "dns"
+ case strings.Contains(msg, "unauthorized") || strings.Contains(msg, "forbidden"):
+ return "auth"
+ case strings.Contains(msg, "broken pipe") || strings.Contains(msg, "connection reset") || strings.Contains(msg, "connection refused") || strings.Contains(msg, "use of closed network connection") || strings.Contains(msg, "network is unreachable"):
+ return "io"
+ default:
+ return "other"
+ }
+}
+
func (c *Client) connectWithRetry() {
for {
select {
@@ -337,6 +377,10 @@ func (c *Client) establishConnection() error {
// Get token for authentication
token, err := c.getToken()
if err != nil {
+ // telemetry: connection attempt failed before dialing
+ // site_id isn't globally available here; use client ID as site_id (low cardinality)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", classifyConnError(err))
return fmt.Errorf("failed to get token: %w", err)
}
@@ -369,7 +413,11 @@ func (c *Client) establishConnection() error {
q.Set("clientType", c.clientType)
u.RawQuery = q.Encode()
- // Connect to WebSocket
+ // Connect to WebSocket (optional span)
+ tr := otel.Tracer("newt")
+ spanCtx, span := tr.Start(context.Background(), "ws.connect")
+ defer span.End()
+
dialer := websocket.DefaultDialer
// Use new TLS configuration method
@@ -391,11 +439,23 @@ func (c *Client) establishConnection() error {
logger.Debug("WebSocket TLS certificate verification disabled via SKIP_TLS_VERIFY environment variable")
}
- conn, _, err := dialer.Dial(u.String(), nil)
+ conn, _, err := dialer.DialContext(spanCtx, u.String(), nil)
if err != nil {
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ etype := classifyConnError(err)
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", etype)
+ // Map handshake-related errors to reconnect reasons where appropriate
+ if etype == "tls" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonHandshakeError)
+ } else if etype == "timeout" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonTimeout)
+ } else {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonError)
+ }
return fmt.Errorf("failed to connect to WebSocket: %w", err)
}
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "success")
c.conn = conn
c.setConnected(true)
diff --git a/wg/wg.go b/wg/wg.go
index 3cee1a9..a765279 100644
--- a/wg/wg.go
+++ b/wg/wg.go
@@ -3,6 +3,7 @@
package wg
import (
+ "context"
"encoding/json"
"errors"
"fmt"
@@ -23,6 +24,8 @@ import (
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/wgctrl"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -298,6 +301,13 @@ func (s *WireGuardService) handleConfig(msg websocket.WSMessage) {
s.stopGetConfig = nil
}
+ // telemetry: config reload success
+ telemetry.IncConfigReload(context.Background(), "success")
+ // Optional reconnect reason mapping: config change
+ if s.serverPubKey != "" {
+ telemetry.IncReconnect(context.Background(), "", s.serverPubKey, telemetry.ReasonConfigChange)
+ }
+
// Ensure the WireGuard interface and peers are configured
if err := s.ensureWireguardInterface(config); err != nil {
logger.Error("Failed to ensure WireGuard interface: %v", err)


@@ -0,0 +1,466 @@
diff --git a/main.go b/main.go
index 12849b1..c223b75 100644
--- a/main.go
+++ b/main.go
@@ -1,7 +1,9 @@
package main
import (
+ "context"
"encoding/json"
+ "errors"
"flag"
"fmt"
"net"
@@ -22,6 +24,9 @@ import (
"github.com/fosrl/newt/updates"
"github.com/fosrl/newt/websocket"
+ "github.com/fosrl/newt/internal/state"
+ "github.com/fosrl/newt/internal/telemetry"
+ "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/device"
"golang.zx2c4.com/wireguard/tun"
@@ -116,6 +121,13 @@ var (
healthMonitor *healthcheck.Monitor
enforceHealthcheckCert bool
+ // Observability/metrics flags
+ metricsEnabled bool
+ otlpEnabled bool
+ adminAddr string
+ region string
+ metricsAsyncBytes bool
+
// New mTLS configuration variables
tlsClientCert string
tlsClientKey string
@@ -126,6 +138,10 @@ var (
)
func main() {
+ // Prepare context for graceful shutdown and signal handling
+ ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
+ defer stop()
+
// if PANGOLIN_ENDPOINT, NEWT_ID, and NEWT_SECRET are set as environment variables, they will be used as default values
endpoint = os.Getenv("PANGOLIN_ENDPOINT")
id = os.Getenv("NEWT_ID")
@@ -141,6 +157,13 @@ func main() {
useNativeInterfaceEnv := os.Getenv("USE_NATIVE_INTERFACE")
enforceHealthcheckCertEnv := os.Getenv("ENFORCE_HC_CERT")
+ // Metrics/observability env mirrors
+ metricsEnabledEnv := os.Getenv("NEWT_METRICS_PROMETHEUS_ENABLED")
+ otlpEnabledEnv := os.Getenv("NEWT_METRICS_OTLP_ENABLED")
+ adminAddrEnv := os.Getenv("NEWT_ADMIN_ADDR")
+ regionEnv := os.Getenv("NEWT_REGION")
+ asyncBytesEnv := os.Getenv("NEWT_METRICS_ASYNC_BYTES")
+
keepInterface = keepInterfaceEnv == "true"
acceptClients = acceptClientsEnv == "true"
useNativeInterface = useNativeInterfaceEnv == "true"
@@ -272,6 +295,35 @@ func main() {
flag.StringVar(&healthFile, "health-file", "", "Path to health file (if unset, health file won't be written)")
}
+ // Metrics/observability flags (mirror ENV if unset)
+ if metricsEnabledEnv == "" {
+ flag.BoolVar(&metricsEnabled, "metrics", true, "Enable Prometheus /metrics exporter")
+ } else {
+ if v, err := strconv.ParseBool(metricsEnabledEnv); err == nil { metricsEnabled = v } else { metricsEnabled = true }
+ }
+ if otlpEnabledEnv == "" {
+ flag.BoolVar(&otlpEnabled, "otlp", false, "Enable OTLP exporters (metrics/traces) to OTEL_EXPORTER_OTLP_ENDPOINT")
+ } else {
+ if v, err := strconv.ParseBool(otlpEnabledEnv); err == nil { otlpEnabled = v }
+ }
+ if adminAddrEnv == "" {
+ flag.StringVar(&adminAddr, "metrics-admin-addr", "127.0.0.1:2112", "Admin/metrics bind address")
+ } else {
+ adminAddr = adminAddrEnv
+ }
+ // Async bytes toggle
+ if asyncBytesEnv == "" {
+ flag.BoolVar(&metricsAsyncBytes, "metrics-async-bytes", false, "Enable async bytes counting (background flush; lower hot path overhead)")
+ } else {
+ if v, err := strconv.ParseBool(asyncBytesEnv); err == nil { metricsAsyncBytes = v }
+ }
+ // Optional region flag (resource attribute)
+ if regionEnv == "" {
+ flag.StringVar(&region, "region", "", "Optional region resource attribute (also NEWT_REGION)")
+ } else {
+ region = regionEnv
+ }
+
// do a --version check
version := flag.Bool("version", false, "Print the version")
@@ -286,6 +338,50 @@ func main() {
loggerLevel := parseLogLevel(logLevel)
logger.GetLogger().SetLevel(parseLogLevel(logLevel))
+ // Initialize telemetry after flags are parsed (so flags override env)
+ tcfg := telemetry.FromEnv()
+ tcfg.PromEnabled = metricsEnabled
+ tcfg.OTLPEnabled = otlpEnabled
+ if adminAddr != "" { tcfg.AdminAddr = adminAddr }
+ // Resource attributes (if available)
+ tcfg.SiteID = id
+ tcfg.Region = region
+ // Build info
+ tcfg.BuildVersion = "version_replaceme" // keep in sync with newtVersion, which is declared later in main
+ tcfg.BuildCommit = os.Getenv("NEWT_COMMIT")
+
+ tel, telErr := telemetry.Init(ctx, tcfg)
+ if telErr != nil {
+ logger.Warn("Telemetry init failed: %v", telErr)
+ }
+ if tel != nil {
+ // Admin HTTP server (exposes /metrics when Prometheus exporter is enabled)
+ mux := http.NewServeMux()
+ mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(200) })
+ if tel.PrometheusHandler != nil {
+ mux.Handle("/metrics", tel.PrometheusHandler)
+ }
+ admin := &http.Server{
+ Addr: tcfg.AdminAddr,
+ Handler: otelhttp.NewHandler(mux, "newt-admin"),
+ ReadTimeout: 5 * time.Second,
+ WriteTimeout: 10 * time.Second,
+ ReadHeaderTimeout: 5 * time.Second,
+ IdleTimeout: 30 * time.Second,
+ }
+ go func() {
+ if err := admin.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
+ logger.Warn("admin http error: %v", err)
+ }
+ }()
+ defer func() {
+ ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ defer cancel()
+ _ = admin.Shutdown(ctx)
+ }()
+ defer func() { _ = tel.Shutdown(context.Background()) }()
+ }
+
newtVersion := "version_replaceme"
if *version {
fmt.Println("Newt version " + newtVersion)
@@ -557,7 +653,10 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
}
// Use reliable ping for initial connection test
logger.Debug("Testing initial connection with reliable ping...")
- _, err = reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ lat, err := reliablePing(tnet, wgData.ServerIP, pingTimeout, 5)
+ if err == nil && wgData.PublicKey != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", wgData.PublicKey, "wireguard", lat.Seconds())
+ }
if err != nil {
logger.Warn("Initial reliable ping failed, but continuing: %v", err)
} else {
@@ -570,14 +669,20 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
// as the pings will continue in the background
if !connected {
logger.Debug("Starting ping check")
- pingStopChan = startPingCheck(tnet, wgData.ServerIP, client)
+ pingStopChan = startPingCheck(tnet, wgData.ServerIP, client, wgData.PublicKey)
}
// Create proxy manager
pm = proxy.NewProxyManager(tnet)
+ pm.SetAsyncBytes(metricsAsyncBytes)
+ // Set tunnel_id for metrics (WireGuard peer public key)
+ pm.SetTunnelID(wgData.PublicKey)
connected = true
+ // telemetry: record a successful site registration (omit region unless available)
+ telemetry.IncSiteRegistration(context.Background(), id, "", "success")
+
// add the targets if there are any
if len(wgData.Targets.TCP) > 0 {
updateTargets(pm, "add", wgData.TunnelIP, "tcp", TargetData{Targets: wgData.Targets.TCP})
@@ -611,10 +716,25 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/reconnect", func(msg websocket.WSMessage) {
logger.Info("Received reconnect message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
+ // Clear metrics attrs and sessions for the tunnel
+ if pm != nil {
+ pm.ClearTunnelID()
+ state.Global().ClearTunnel(wgData.PublicKey)
+ }
+
// Mark as disconnected
connected = false
@@ -631,6 +751,9 @@ persistent_keepalive_interval=5`, fixKey(privateKey.String()), fixKey(wgData.Pub
client.RegisterHandler("newt/wg/terminate", func(msg websocket.WSMessage) {
logger.Info("Received termination message")
+ if wgData.PublicKey != "" {
+ telemetry.IncReconnect(context.Background(), "", wgData.PublicKey, "server_request")
+ }
// Close the WireGuard device and TUN
closeWgTunnel()
diff --git a/util.go b/util.go
index 7d6da4f..c1f4915 100644
--- a/util.go
+++ b/util.go
@@ -17,6 +17,7 @@ import (
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/proxy"
"github.com/fosrl/newt/websocket"
+ "github.com/fosrl/newt/internal/telemetry"
"golang.org/x/net/icmp"
"golang.org/x/net/ipv4"
"golang.zx2c4.com/wireguard/device"
@@ -229,7 +230,7 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
return stopChan, fmt.Errorf("initial ping attempts failed, continuing in background")
}
-func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client) chan struct{} {
+func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client, tunnelID string) chan struct{} {
maxInterval := 6 * time.Second
currentInterval := pingInterval
consecutiveFailures := 0
@@ -292,6 +293,9 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
if !connectionLost {
connectionLost = true
logger.Warn("Connection to server lost after %d failures. Continuous reconnection attempts will be made.", consecutiveFailures)
+ if tunnelID != "" {
+ telemetry.IncReconnect(context.Background(), "", tunnelID, telemetry.ReasonTimeout)
+ }
stopFunc = client.SendMessageInterval("newt/ping/request", map[string]interface{}{}, 3*time.Second)
// Send registration message to the server for backward compatibility
err := client.SendMessage("newt/wg/register", map[string]interface{}{
@@ -318,6 +322,10 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
} else {
// Track recent latencies
recentLatencies = append(recentLatencies, latency)
+ // Record tunnel latency (limit sampling to this periodic check)
+ if tunnelID != "" {
+ telemetry.ObserveTunnelLatency(context.Background(), "", tunnelID, "wireguard", latency.Seconds())
+ }
if len(recentLatencies) > 10 {
recentLatencies = recentLatencies[1:]
}
diff --git a/websocket/client.go b/websocket/client.go
index 0c0664a..c9ac264 100644
--- a/websocket/client.go
+++ b/websocket/client.go
@@ -18,6 +18,10 @@ import (
"github.com/fosrl/newt/logger"
"github.com/gorilla/websocket"
+
+ "context"
+ "github.com/fosrl/newt/internal/telemetry"
+ "go.opentelemetry.io/otel"
)
type Client struct {
@@ -287,6 +291,7 @@ func (c *Client) getToken() (string, error) {
}
resp, err := client.Do(req)
if err != nil {
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", classifyConnError(err))
return "", fmt.Errorf("failed to request new token: %w", err)
}
defer resp.Body.Close()
@@ -294,6 +299,18 @@ func (c *Client) getToken() (string, error) {
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
logger.Error("Failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "failure")
+ bin := "http_other"
+ if resp.StatusCode >= 500 {
+ bin = "http_5xx"
+ } else if resp.StatusCode >= 400 {
+ bin = "http_4xx"
+ }
+ telemetry.IncConnError(context.Background(), c.config.ID, "auth", bin)
+ // Reconnect reason mapping for auth failures
+ if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonAuthError)
+ }
return "", fmt.Errorf("failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
}
@@ -312,10 +329,33 @@ func (c *Client) getToken() (string, error) {
}
logger.Debug("Received token: %s", tokenResp.Data.Token)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "auth", "success")
return tokenResp.Data.Token, nil
}
+// classifyConnError maps common errors to low-cardinality error_type labels
+func classifyConnError(err error) string {
+ if err == nil {
+ return ""
+ }
+ msg := strings.ToLower(err.Error())
+ switch {
+ case strings.Contains(msg, "tls") || strings.Contains(msg, "certificate"):
+ return "tls"
+ case strings.Contains(msg, "timeout") || strings.Contains(msg, "i/o timeout"):
+ return "timeout"
+ case strings.Contains(msg, "no such host") || strings.Contains(msg, "dns"):
+ return "dns"
+ case strings.Contains(msg, "unauthorized") || strings.Contains(msg, "forbidden"):
+ return "auth"
+ case strings.Contains(msg, "broken pipe") || strings.Contains(msg, "connection reset") || strings.Contains(msg, "connection refused") || strings.Contains(msg, "use of closed network connection") || strings.Contains(msg, "network is unreachable"):
+ return "io"
+ default:
+ return "other"
+ }
+}
+
func (c *Client) connectWithRetry() {
for {
select {
@@ -337,6 +377,10 @@ func (c *Client) establishConnection() error {
// Get token for authentication
token, err := c.getToken()
if err != nil {
+ // telemetry: connection attempt failed before dialing
+ // site_id isn't globally available here; use client ID as site_id (low cardinality)
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", classifyConnError(err))
return fmt.Errorf("failed to get token: %w", err)
}
@@ -369,7 +413,11 @@ func (c *Client) establishConnection() error {
q.Set("clientType", c.clientType)
u.RawQuery = q.Encode()
- // Connect to WebSocket
+ // Connect to WebSocket (optional span)
+ tr := otel.Tracer("newt")
+ spanCtx, span := tr.Start(context.Background(), "ws.connect")
+ defer span.End()
+
dialer := websocket.DefaultDialer
// Use new TLS configuration method
@@ -391,11 +439,23 @@ func (c *Client) establishConnection() error {
logger.Debug("WebSocket TLS certificate verification disabled via SKIP_TLS_VERIFY environment variable")
}
- conn, _, err := dialer.Dial(u.String(), nil)
+ conn, _, err := dialer.DialContext(spanCtx, u.String(), nil)
if err != nil {
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "failure")
+ etype := classifyConnError(err)
+ telemetry.IncConnError(context.Background(), c.config.ID, "websocket", etype)
+ // Map handshake-related errors to reconnect reasons where appropriate
+ if etype == "tls" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonHandshakeError)
+ } else if etype == "timeout" {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonTimeout)
+ } else {
+ telemetry.IncReconnect(context.Background(), "", c.config.ID, telemetry.ReasonError)
+ }
return fmt.Errorf("failed to connect to WebSocket: %w", err)
}
+ telemetry.IncConnAttempt(context.Background(), c.config.ID, "websocket", "success")
c.conn = conn
c.setConnected(true)
diff --git a/wg/wg.go b/wg/wg.go
index 3cee1a9..a765279 100644
--- a/wg/wg.go
+++ b/wg/wg.go
@@ -3,6 +3,7 @@
package wg
import (
+ "context"
"encoding/json"
"errors"
"fmt"
@@ -23,6 +24,8 @@ import (
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/wgctrl"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -298,6 +301,13 @@ func (s *WireGuardService) handleConfig(msg websocket.WSMessage) {
s.stopGetConfig = nil
}
+ // telemetry: config reload success
+ telemetry.IncConfigReload(context.Background(), "success")
+ // Optional reconnect reason mapping: config change
+ if s.serverPubKey != "" {
+ telemetry.IncReconnect(context.Background(), "", s.serverPubKey, telemetry.ReasonConfigChange)
+ }
+
// Ensure the WireGuard interface and peers are configured
if err := s.ensureWireguardInterface(config); err != nil {
logger.Error("Failed to ensure WireGuard interface: %v", err)
diff --git a/wgnetstack/wgnetstack.go b/wgnetstack/wgnetstack.go
index 6684c40..09f160e 100644
--- a/wgnetstack/wgnetstack.go
+++ b/wgnetstack/wgnetstack.go
@@ -1,6 +1,7 @@
package wgnetstack
import (
+ "context"
"crypto/rand"
"encoding/base64"
"encoding/hex"
@@ -26,6 +27,8 @@ import (
"golang.zx2c4.com/wireguard/tun"
"golang.zx2c4.com/wireguard/tun/netstack"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -240,14 +243,20 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
return service, nil
}
+// ReportRTT allows reporting native RTTs to telemetry, rate-limited externally.
+func (s *WireGuardService) ReportRTT(seconds float64) {
+ if s.serverPubKey == "" { return }
+ telemetry.ObserveTunnelLatency(context.Background(), "", s.serverPubKey, "wireguard", seconds)
+}
+
func (s *WireGuardService) addTcpTarget(msg websocket.WSMessage) {
logger.Debug("Received: %+v", msg)
// if there is no wgData or pm, we can't add targets
if s.TunnelIP == "" || s.proxyManager == nil {
logger.Info("No tunnel IP or proxy manager available")
- return
- }
+ return
+ }
targetData, err := parseTargetData(msg.Data)
if err != nil {


@@ -0,0 +1,44 @@
diff --git a/wgnetstack/wgnetstack.go b/wgnetstack/wgnetstack.go
index 6684c40..09f160e 100644
--- a/wgnetstack/wgnetstack.go
+++ b/wgnetstack/wgnetstack.go
@@ -1,6 +1,7 @@
package wgnetstack
import (
+ "context"
"crypto/rand"
"encoding/base64"
"encoding/hex"
@@ -26,6 +27,8 @@ import (
"golang.zx2c4.com/wireguard/tun"
"golang.zx2c4.com/wireguard/tun/netstack"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
+
+ "github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -240,14 +243,20 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
return service, nil
}
+// ReportRTT allows reporting native RTTs to telemetry, rate-limited externally.
+func (s *WireGuardService) ReportRTT(seconds float64) {
+ if s.serverPubKey == "" { return }
+ telemetry.ObserveTunnelLatency(context.Background(), "", s.serverPubKey, "wireguard", seconds)
+}
+
func (s *WireGuardService) addTcpTarget(msg websocket.WSMessage) {
logger.Debug("Received: %+v", msg)
// if there is no wgData or pm, we can't add targets
if s.TunnelIP == "" || s.proxyManager == nil {
logger.Info("No tunnel IP or proxy manager available")
- return
- }
+ return
+ }
targetData, err := parseTargetData(msg.Data)
if err != nil {


patches/HOWTO-APPLY.md

@@ -0,0 +1,25 @@
# How to apply patches
These patches were generated from the working tree without commits. You can apply them in one shot or in topic order.
One shot (recommended during review):
```bash
git apply patches/00_all_changes.patch
```
Topic order:
```bash
git apply patches/01_proxy_multitunnel.patch
git apply patches/02_reconnect_rtt.patch
git apply patches/03_constants_docs.patch
```
Rollback (restore to HEAD and clean untracked files):
```bash
git restore --source=HEAD --worktree --staged .
git clean -fd
```


@@ -1,18 +1,28 @@
package proxy
import (
"context"
"errors"
"fmt"
"io"
"net"
"os"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/fosrl/newt/internal/state"
"github.com/fosrl/newt/internal/telemetry"
"github.com/fosrl/newt/logger"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
"golang.zx2c4.com/wireguard/tun/netstack"
"gvisor.dev/gvisor/pkg/tcpip/adapters/gonet"
)
const errUnsupportedProtoFmt = "unsupported protocol: %s"
// Target represents a proxy target with its address and port
type Target struct {
Address string
@@ -28,6 +38,90 @@ type ProxyManager struct {
udpConns []*gonet.UDPConn
running bool
mutex sync.RWMutex
// telemetry (multi-tunnel)
currentTunnelID string
tunnels map[string]*tunnelEntry
asyncBytes bool
flushStop chan struct{}
}
// tunnelEntry holds per-tunnel attributes and (optional) async counters.
type tunnelEntry struct {
attrInTCP attribute.Set
attrOutTCP attribute.Set
attrInUDP attribute.Set
attrOutUDP attribute.Set
bytesInTCP atomic.Uint64
bytesOutTCP atomic.Uint64
bytesInUDP atomic.Uint64
bytesOutUDP atomic.Uint64
activeTCP atomic.Int64
activeUDP atomic.Int64
}
// countingWriter wraps an io.Writer and adds bytes to OTel counter using a pre-built attribute set.
type countingWriter struct {
ctx context.Context
w io.Writer
set attribute.Set
pm *ProxyManager
ent *tunnelEntry
out bool // false=in, true=out
proto string // "tcp" or "udp"
}
func (cw *countingWriter) Write(p []byte) (int, error) {
n, err := cw.w.Write(p)
if n > 0 {
if cw.pm != nil && cw.pm.asyncBytes && cw.ent != nil {
switch cw.proto {
case "tcp":
if cw.out {
cw.ent.bytesOutTCP.Add(uint64(n))
} else {
cw.ent.bytesInTCP.Add(uint64(n))
}
case "udp":
if cw.out {
cw.ent.bytesOutUDP.Add(uint64(n))
} else {
cw.ent.bytesInUDP.Add(uint64(n))
}
}
} else {
telemetry.AddTunnelBytesSet(cw.ctx, int64(n), cw.set)
}
}
return n, err
}
func classifyProxyError(err error) string {
if err == nil {
return ""
}
if errors.Is(err, net.ErrClosed) {
return "closed"
}
if ne, ok := err.(net.Error); ok {
if ne.Timeout() {
return "timeout"
}
if ne.Temporary() {
return "temporary"
}
}
msg := strings.ToLower(err.Error())
switch {
case strings.Contains(msg, "refused"):
return "refused"
case strings.Contains(msg, "reset"):
return "reset"
default:
return "io_error"
}
}
// NewProxyManager creates a new proxy manager instance
@@ -38,9 +132,94 @@ func NewProxyManager(tnet *netstack.Net) *ProxyManager {
udpTargets: make(map[string]map[int]string),
listeners: make([]*gonet.TCPListener, 0),
udpConns: make([]*gonet.UDPConn, 0),
tunnels: make(map[string]*tunnelEntry),
}
}
// SetTunnelID sets the WireGuard peer public key used as tunnel_id label.
func (pm *ProxyManager) SetTunnelID(id string) {
pm.mutex.Lock()
defer pm.mutex.Unlock()
pm.currentTunnelID = id
if _, ok := pm.tunnels[id]; !ok {
pm.tunnels[id] = &tunnelEntry{}
}
e := pm.tunnels[id]
// include site labels if available
site := telemetry.SiteLabelKVs()
build := func(base []attribute.KeyValue) attribute.Set {
if telemetry.ShouldIncludeTunnelID() {
base = append([]attribute.KeyValue{attribute.String("tunnel_id", id)}, base...)
}
base = append(site, base...)
return attribute.NewSet(base...)
}
e.attrInTCP = build([]attribute.KeyValue{
attribute.String("direction", "ingress"),
attribute.String("protocol", "tcp"),
})
e.attrOutTCP = build([]attribute.KeyValue{
attribute.String("direction", "egress"),
attribute.String("protocol", "tcp"),
})
e.attrInUDP = build([]attribute.KeyValue{
attribute.String("direction", "ingress"),
attribute.String("protocol", "udp"),
})
e.attrOutUDP = build([]attribute.KeyValue{
attribute.String("direction", "egress"),
attribute.String("protocol", "udp"),
})
}
// ClearTunnelID clears cached attribute sets for the current tunnel.
func (pm *ProxyManager) ClearTunnelID() {
pm.mutex.Lock()
defer pm.mutex.Unlock()
id := pm.currentTunnelID
if id == "" {
return
}
if e, ok := pm.tunnels[id]; ok {
// final flush for this tunnel
inTCP := e.bytesInTCP.Swap(0)
outTCP := e.bytesOutTCP.Swap(0)
inUDP := e.bytesInUDP.Swap(0)
outUDP := e.bytesOutUDP.Swap(0)
if inTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP)
}
if outTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP)
}
if inUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP)
}
if outUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP)
}
delete(pm.tunnels, id)
}
pm.currentTunnelID = ""
}
// NewProxyManagerWithoutTNet creates a ProxyManager without a netstack instance.
func NewProxyManagerWithoutTNet() *ProxyManager {
return &ProxyManager{
tcpTargets: make(map[string]map[int]string),
udpTargets: make(map[string]map[int]string),
listeners: make([]*gonet.TCPListener, 0),
udpConns: make([]*gonet.UDPConn, 0),
}
}
// SetTNet attaches a netstack instance to an existing ProxyManager.
func (pm *ProxyManager) SetTNet(tnet *netstack.Net) {
pm.mutex.Lock()
defer pm.mutex.Unlock()
pm.tnet = tnet
}
// AddTarget adds a new target for proxying
func (pm *ProxyManager) AddTarget(proto, listenIP string, port int, targetAddr string) error {
pm.mutex.Lock()
@@ -58,7 +237,7 @@ func (pm *ProxyManager) AddTarget(proto, listenIP string, port int, targetAddr s
}
pm.udpTargets[listenIP][port] = targetAddr
default:
return fmt.Errorf("unsupported protocol: %s", proto)
return fmt.Errorf(errUnsupportedProtoFmt, proto)
}
if pm.running {
@@ -107,13 +286,28 @@ func (pm *ProxyManager) RemoveTarget(proto, listenIP string, port int) error {
return fmt.Errorf("target not found: %s:%d", listenIP, port)
}
default:
return fmt.Errorf("unsupported protocol: %s", proto)
return fmt.Errorf(errUnsupportedProtoFmt, proto)
}
return nil
}
// Start begins listening for all configured proxy targets
func (pm *ProxyManager) Start() error {
// Register proxy observables once per process
telemetry.SetProxyObservableCallback(func(ctx context.Context, o metric.Observer) error {
pm.mutex.RLock()
defer pm.mutex.RUnlock()
for _, e := range pm.tunnels {
// active connections
telemetry.ObserveProxyActiveConnsObs(o, e.activeTCP.Load(), e.attrOutTCP.ToSlice())
telemetry.ObserveProxyActiveConnsObs(o, e.activeUDP.Load(), e.attrOutUDP.ToSlice())
// backlog bytes (sum of unflushed counters)
b := int64(e.bytesInTCP.Load() + e.bytesOutTCP.Load() + e.bytesInUDP.Load() + e.bytesOutUDP.Load())
telemetry.ObserveProxyAsyncBacklogObs(o, b, e.attrOutTCP.ToSlice())
telemetry.ObserveProxyBufferBytesObs(o, b, e.attrOutTCP.ToSlice())
}
return nil
})
pm.mutex.Lock()
defer pm.mutex.Unlock()
@@ -143,6 +337,75 @@ func (pm *ProxyManager) Start() error {
return nil
}
func (pm *ProxyManager) SetAsyncBytes(b bool) {
pm.mutex.Lock()
defer pm.mutex.Unlock()
pm.asyncBytes = b
if b && pm.flushStop == nil {
pm.flushStop = make(chan struct{})
go pm.flushLoop()
}
}
func (pm *ProxyManager) flushLoop() {
flushInterval := 2 * time.Second
if v := os.Getenv("OTEL_METRIC_EXPORT_INTERVAL"); v != "" {
if d, err := time.ParseDuration(v); err == nil && d > 0 {
if d/2 < flushInterval {
flushInterval = d / 2
}
}
}
ticker := time.NewTicker(flushInterval)
defer ticker.Stop()
for {
select {
case <-ticker.C:
pm.mutex.RLock()
for _, e := range pm.tunnels {
inTCP := e.bytesInTCP.Swap(0)
outTCP := e.bytesOutTCP.Swap(0)
inUDP := e.bytesInUDP.Swap(0)
outUDP := e.bytesOutUDP.Swap(0)
if inTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP)
}
if outTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP)
}
if inUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP)
}
if outUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP)
}
}
pm.mutex.RUnlock()
case <-pm.flushStop:
pm.mutex.RLock()
for _, e := range pm.tunnels {
inTCP := e.bytesInTCP.Swap(0)
outTCP := e.bytesOutTCP.Swap(0)
inUDP := e.bytesInUDP.Swap(0)
outUDP := e.bytesOutUDP.Swap(0)
if inTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inTCP), e.attrInTCP)
}
if outTCP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outTCP), e.attrOutTCP)
}
if inUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(inUDP), e.attrInUDP)
}
if outUDP > 0 {
telemetry.AddTunnelBytesSet(context.Background(), int64(outUDP), e.attrOutUDP)
}
}
pm.mutex.RUnlock()
return
}
}
}
func (pm *ProxyManager) Stop() error {
pm.mutex.Lock()
defer pm.mutex.Unlock()
@@ -174,13 +437,13 @@ func (pm *ProxyManager) Stop() error {
pm.udpConns = append(pm.udpConns[:i], pm.udpConns[i+1:]...)
}
// Clear the target maps
for k := range pm.tcpTargets {
delete(pm.tcpTargets, k)
}
for k := range pm.udpTargets {
delete(pm.udpTargets, k)
}
// // Clear the target maps
// for k := range pm.tcpTargets {
// delete(pm.tcpTargets, k)
// }
// for k := range pm.udpTargets {
// delete(pm.udpTargets, k)
// }
// Give active connections a chance to close gracefully
time.Sleep(100 * time.Millisecond)
@@ -210,7 +473,7 @@ func (pm *ProxyManager) startTarget(proto, listenIP string, port int, targetAddr
go pm.handleUDPProxy(conn, targetAddr)
default:
return fmt.Errorf("unsupported protocol: %s", proto)
return fmt.Errorf(errUnsupportedProtoFmt, proto)
}
logger.Info("Started %s proxy to %s", proto, targetAddr)
@@ -219,54 +482,84 @@ func (pm *ProxyManager) startTarget(proto, listenIP string, port int, targetAddr
return nil
}
// getEntry returns per-tunnel entry or nil.
func (pm *ProxyManager) getEntry(id string) *tunnelEntry {
pm.mutex.RLock()
e := pm.tunnels[id]
pm.mutex.RUnlock()
return e
}
func (pm *ProxyManager) handleTCPProxy(listener net.Listener, targetAddr string) {
for {
conn, err := listener.Accept()
if err != nil {
// Check if we're shutting down or the listener was closed
telemetry.IncProxyAccept(context.Background(), pm.currentTunnelID, "tcp", "failure", classifyProxyError(err))
if !pm.running {
return
}
// Check for specific network errors that indicate the listener is closed
if ne, ok := err.(net.Error); ok && !ne.Temporary() {
logger.Info("TCP listener closed, stopping proxy handler for %v", listener.Addr())
return
}
logger.Error("Error accepting TCP connection: %v", err)
// Don't hammer the CPU if we hit a temporary error
time.Sleep(100 * time.Millisecond)
continue
}
go func() {
tunnelID := pm.currentTunnelID
telemetry.IncProxyAccept(context.Background(), tunnelID, "tcp", "success", "")
telemetry.IncProxyConnectionEvent(context.Background(), tunnelID, "tcp", telemetry.ProxyConnectionOpened)
if tunnelID != "" {
state.Global().IncSessions(tunnelID)
if e := pm.getEntry(tunnelID); e != nil {
e.activeTCP.Add(1)
}
}
go func(tunnelID string, accepted net.Conn) {
connStart := time.Now()
target, err := net.Dial("tcp", targetAddr)
if err != nil {
logger.Error("Error connecting to target: %v", err)
conn.Close()
accepted.Close()
telemetry.IncProxyAccept(context.Background(), tunnelID, "tcp", "failure", classifyProxyError(err))
telemetry.IncProxyConnectionEvent(context.Background(), tunnelID, "tcp", telemetry.ProxyConnectionClosed)
telemetry.ObserveProxyConnectionDuration(context.Background(), tunnelID, "tcp", "failure", time.Since(connStart).Seconds())
return
}
// Create a WaitGroup to ensure both copy operations complete
entry := pm.getEntry(tunnelID)
if entry == nil {
entry = &tunnelEntry{}
}
var wg sync.WaitGroup
wg.Add(2)
go func() {
go func(ent *tunnelEntry) {
defer wg.Done()
io.Copy(target, conn)
target.Close()
}()
cw := &countingWriter{ctx: context.Background(), w: target, set: ent.attrInTCP, pm: pm, ent: ent, out: false, proto: "tcp"}
_, _ = io.Copy(cw, accepted)
_ = target.Close()
}(entry)
go func() {
go func(ent *tunnelEntry) {
defer wg.Done()
io.Copy(conn, target)
conn.Close()
}()
cw := &countingWriter{ctx: context.Background(), w: accepted, set: ent.attrOutTCP, pm: pm, ent: ent, out: true, proto: "tcp"}
_, _ = io.Copy(cw, target)
_ = accepted.Close()
}(entry)
// Wait for both copies to complete
wg.Wait()
}()
if tunnelID != "" {
state.Global().DecSessions(tunnelID)
if e := pm.getEntry(tunnelID); e != nil {
e.activeTCP.Add(-1)
}
}
telemetry.ObserveProxyConnectionDuration(context.Background(), tunnelID, "tcp", "success", time.Since(connStart).Seconds())
telemetry.IncProxyConnectionEvent(context.Background(), tunnelID, "tcp", telemetry.ProxyConnectionClosed)
}(tunnelID, conn)
}
}
@@ -279,6 +572,13 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
n, remoteAddr, err := conn.ReadFrom(buffer)
if err != nil {
if !pm.running {
// Clean up all connections when stopping
clientsMutex.Lock()
for _, targetConn := range clientConns {
targetConn.Close()
}
clientConns = nil
clientsMutex.Unlock()
return
}
@@ -302,6 +602,18 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
}
clientKey := remoteAddr.String()
// bytes from client -> target (direction=in)
if pm.currentTunnelID != "" && n > 0 {
if pm.asyncBytes {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
e.bytesInUDP.Add(uint64(n))
}
} else {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
telemetry.AddTunnelBytesSet(context.Background(), int64(n), e.attrInUDP)
}
}
}
clientsMutex.RLock()
targetConn, exists := clientConns[clientKey]
clientsMutex.RUnlock()
@@ -310,44 +622,117 @@ func (pm *ProxyManager) handleUDPProxy(conn *gonet.UDPConn, targetAddr string) {
targetUDPAddr, err := net.ResolveUDPAddr("udp", targetAddr)
if err != nil {
logger.Error("Error resolving target address: %v", err)
telemetry.IncProxyAccept(context.Background(), pm.currentTunnelID, "udp", "failure", "resolve")
continue
}
targetConn, err = net.DialUDP("udp", nil, targetUDPAddr)
if err != nil {
logger.Error("Error connecting to target: %v", err)
telemetry.IncProxyAccept(context.Background(), pm.currentTunnelID, "udp", "failure", classifyProxyError(err))
continue
}
tunnelID := pm.currentTunnelID
telemetry.IncProxyAccept(context.Background(), tunnelID, "udp", "success", "")
telemetry.IncProxyConnectionEvent(context.Background(), tunnelID, "udp", telemetry.ProxyConnectionOpened)
// Only increment activeUDP after a successful DialUDP
if e := pm.getEntry(tunnelID); e != nil {
e.activeUDP.Add(1)
}
clientsMutex.Lock()
clientConns[clientKey] = targetConn
clientsMutex.Unlock()
go func() {
go func(clientKey string, targetConn *net.UDPConn, remoteAddr net.Addr, tunnelID string) {
start := time.Now()
result := "success"
defer func() {
// Always clean up when this goroutine exits
clientsMutex.Lock()
if storedConn, exists := clientConns[clientKey]; exists && storedConn == targetConn {
delete(clientConns, clientKey)
targetConn.Close()
if e := pm.getEntry(tunnelID); e != nil {
e.activeUDP.Add(-1)
}
}
clientsMutex.Unlock()
telemetry.ObserveProxyConnectionDuration(context.Background(), tunnelID, "udp", result, time.Since(start).Seconds())
telemetry.IncProxyConnectionEvent(context.Background(), tunnelID, "udp", telemetry.ProxyConnectionClosed)
}()
buffer := make([]byte, 65507)
for {
n, _, err := targetConn.ReadFromUDP(buffer)
if err != nil {
logger.Error("Error reading from target: %v", err)
return
result = "failure"
return // defer will handle cleanup
}
// bytes from target -> client (direction=out)
if pm.currentTunnelID != "" && n > 0 {
if pm.asyncBytes {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
e.bytesOutUDP.Add(uint64(n))
}
} else {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
telemetry.AddTunnelBytesSet(context.Background(), int64(n), e.attrOutUDP)
}
}
}
_, err = conn.WriteTo(buffer[:n], remoteAddr)
if err != nil {
logger.Error("Error writing to client: %v", err)
return
telemetry.IncProxyDrops(context.Background(), pm.currentTunnelID, "udp")
result = "failure"
return // defer will handle cleanup
}
}
}()
}(clientKey, targetConn, remoteAddr, tunnelID)
}
_, err = targetConn.Write(buffer[:n])
written, err := targetConn.Write(buffer[:n])
if err != nil {
logger.Error("Error writing to target: %v", err)
telemetry.IncProxyDrops(context.Background(), pm.currentTunnelID, "udp")
targetConn.Close()
clientsMutex.Lock()
delete(clientConns, clientKey)
clientsMutex.Unlock()
} else if pm.currentTunnelID != "" && written > 0 {
if pm.asyncBytes {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
e.bytesInUDP.Add(uint64(written))
}
} else {
if e := pm.getEntry(pm.currentTunnelID); e != nil {
telemetry.AddTunnelBytesSet(context.Background(), int64(written), e.attrInUDP)
}
}
}
}
}
// PrintTargets logs the current TCP and UDP targets in the ProxyManager
func (pm *ProxyManager) PrintTargets() {
pm.mutex.RLock()
defer pm.mutex.RUnlock()
logger.Info("Current TCP Targets:")
for listenIP, targets := range pm.tcpTargets {
for port, targetAddr := range targets {
logger.Info("TCP %s:%d -> %s", listenIP, port, targetAddr)
}
}
logger.Info("Current UDP Targets:")
for listenIP, targets := range pm.udpTargets {
for port, targetAddr := range targets {
logger.Info("UDP %s:%d -> %s", listenIP, port, targetAddr)
}
}
}

Binary file not shown (image: 774 KiB before, 93 KiB after).

scripts/smoke-metrics.sh Normal file

@@ -0,0 +1,55 @@
#!/usr/bin/env bash
set -euo pipefail
NEWTHOST=${NEWTHOST:-localhost}
NEWTPORT=${NEWTPORT:-2112}
METRICS_URL="http://${NEWTHOST}:${NEWTPORT}/metrics"
probe() {
local name=$1
local pattern=$2
echo "[probe] ${name}"
curl -sf "${METRICS_URL}" | grep -E "${pattern}" || {
echo "[warn] ${name} not found"
return 1
}
}
# Basic presence
probe "newt_* presence" "^newt_" || true
# Site gauges with site_id
probe "site_online with site_id" "^newt_site_online\{.*site_id=\"[^\"]+\"" || true
probe "last_heartbeat with site_id" "^newt_site_last_heartbeat_timestamp_seconds\{.*site_id=\"[^\"]+\"" || true
# Bytes with direction ingress/egress and protocol
probe "tunnel bytes ingress" "^newt_tunnel_bytes_total\{.*direction=\"ingress\".*protocol=\"(tcp|udp)\"" || true
probe "tunnel bytes egress" "^newt_tunnel_bytes_total\{.*direction=\"egress\".*protocol=\"(tcp|udp)\"" || true
# Optional: verify absence/presence of tunnel_id based on EXPECT_TUNNEL_ID (default true)
EXPECT_TUNNEL_ID=${EXPECT_TUNNEL_ID:-true}
if [ "$EXPECT_TUNNEL_ID" = "false" ]; then
echo "[probe] ensure tunnel_id label is absent when NEWT_METRICS_INCLUDE_TUNNEL_ID=false"
! curl -sf "${METRICS_URL}" | grep -q "tunnel_id=\"" || { echo "[fail] tunnel_id present but EXPECT_TUNNEL_ID=false"; exit 1; }
else
echo "[probe] ensure tunnel_id label is present (default)"
curl -sf "${METRICS_URL}" | grep -q "tunnel_id=\"" || { echo "[warn] tunnel_id not found (may be expected if no tunnel is active)"; }
fi
# WebSocket metrics (when OTLP/WS used)
probe "websocket connect latency buckets" "^newt_websocket_connect_latency_seconds_bucket" || true
probe "websocket messages total" "^newt_websocket_messages_total\{.*(direction|msg_type)=" || true
probe "websocket connected gauge" "^newt_websocket_connected" || true
probe "websocket reconnects total" "^newt_websocket_reconnects_total\{" || true
# Proxy metrics (when proxy active)
probe "proxy active connections" "^newt_proxy_active_connections\{" || true
probe "proxy buffer bytes" "^newt_proxy_buffer_bytes\{" || true
probe "proxy drops total" "^newt_proxy_drops_total\{" || true
probe "proxy connections total" "^newt_proxy_connections_total\{" || true
# Config apply
probe "config apply seconds buckets" "^newt_config_apply_seconds_bucket\{" || true
echo "Smoke checks completed (warnings above are acceptable if the feature isn't exercised yet)."

stub.go

@@ -7,26 +7,28 @@ import (
"github.com/fosrl/newt/websocket"
)
func setupClients(client *websocket.Client) {
return // This function is not implemented for non-Linux systems.
func setupClientsNative(client *websocket.Client, host string) {
_ = client
_ = host
// No-op for non-Linux systems
}
func closeClients() {
// This function is not implemented for non-Linux systems.
return
func closeWgServiceNative() {
// No-op for non-Linux systems
}
func clientsHandleNewtConnection(publicKey string) {
// This function is not implemented for non-Linux systems.
return
func clientsOnConnectNative() {
// No-op for non-Linux systems
}
func clientsOnConnect() {
// This function is not implemented for non-Linux systems.
return
func clientsHandleNewtConnectionNative(publicKey, endpoint string) {
_ = publicKey
_ = endpoint
// No-op for non-Linux systems
}
func clientsAddProxyTarget(pm *proxy.ProxyManager, tunnelIp string) {
// This function is not implemented for non-Linux systems.
return
func clientsAddProxyTargetNative(pm *proxy.ProxyManager, tunnelIp string) {
_ = pm
_ = tunnelIp
// No-op for non-Linux systems
}

util.go

@@ -2,6 +2,7 @@ package main
import (
"bytes"
"context"
"encoding/base64"
"encoding/hex"
"encoding/json"
@@ -14,6 +15,7 @@ import (
"math/rand"
"github.com/fosrl/newt/internal/telemetry"
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/proxy"
"github.com/fosrl/newt/websocket"
@@ -21,8 +23,11 @@ import (
"golang.org/x/net/ipv4"
"golang.zx2c4.com/wireguard/device"
"golang.zx2c4.com/wireguard/tun/netstack"
"gopkg.in/yaml.v3"
)
const msgHealthFileWriteFailed = "Failed to write health file: %v"
func fixKey(key string) string {
// Remove any whitespace
key = strings.TrimSpace(key)
@@ -175,7 +180,7 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
if healthFile != "" {
err := os.WriteFile(healthFile, []byte("ok"), 0644)
if err != nil {
logger.Warn("Failed to write health file: %v", err)
logger.Warn(msgHealthFileWriteFailed, err)
}
}
return stopChan, nil
@@ -216,11 +221,13 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
if healthFile != "" {
err := os.WriteFile(healthFile, []byte("ok"), 0644)
if err != nil {
logger.Warn("Failed to write health file: %v", err)
logger.Warn(msgHealthFileWriteFailed, err)
}
}
return
}
case <-pingStopChan:
// Stop the goroutine when signaled
return
}
}
}()
@@ -229,7 +236,7 @@ func pingWithRetry(tnet *netstack.Net, dst string, timeout time.Duration) (stopC
return stopChan, fmt.Errorf("initial ping attempts failed, continuing in background")
}
func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client) chan struct{} {
func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Client, tunnelID string) chan struct{} {
maxInterval := 6 * time.Second
currentInterval := pingInterval
consecutiveFailures := 0
@@ -292,6 +299,9 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
if !connectionLost {
connectionLost = true
logger.Warn("Connection to server lost after %d failures. Continuous reconnection attempts will be made.", consecutiveFailures)
if tunnelID != "" {
telemetry.IncReconnect(context.Background(), tunnelID, "client", telemetry.ReasonTimeout)
}
stopFunc = client.SendMessageInterval("newt/ping/request", map[string]interface{}{}, 3*time.Second)
// Send registration message to the server for backward compatibility
err := client.SendMessage("newt/wg/register", map[string]interface{}{
@@ -318,6 +328,10 @@ func startPingCheck(tnet *netstack.Net, serverIP string, client *websocket.Clien
} else {
// Track recent latencies
recentLatencies = append(recentLatencies, latency)
// Record tunnel latency (limit sampling to this periodic check)
if tunnelID != "" {
telemetry.ObserveTunnelLatency(context.Background(), tunnelID, "wireguard", latency.Seconds())
}
if len(recentLatencies) > 10 {
recentLatencies = recentLatencies[1:]
}
@@ -467,7 +481,8 @@ func updateTargets(pm *proxy.ProxyManager, action string, tunnelIP string, proto
continue
}
if action == "add" {
switch action {
case "add":
target := parts[1] + ":" + parts[2]
// Call updown script if provided
@@ -493,7 +508,7 @@ func updateTargets(pm *proxy.ProxyManager, action string, tunnelIP string, proto
// Add the new target
pm.AddTarget(proto, tunnelIP, port, processedTarget)
} else if action == "remove" {
case "remove":
logger.Info("Removing target with port %d", port)
target := parts[1] + ":" + parts[2]
@@ -511,6 +526,8 @@ func updateTargets(pm *proxy.ProxyManager, action string, tunnelIP string, proto
logger.Error("Failed to remove target: %v", err)
return err
}
default:
logger.Info("Unknown action: %s", action)
}
}
@@ -558,3 +575,47 @@ func executeUpdownScript(action, proto, target string) (string, error) {
return target, nil
}
func sendBlueprint(client *websocket.Client) error {
if blueprintFile == "" {
return nil
}
// try to read the blueprint file
blueprintData, err := os.ReadFile(blueprintFile)
if err != nil {
logger.Error("Failed to read blueprint file: %v", err)
} else {
// First convert the YAML to JSON, and report an error if the YAML is invalid
var yamlObj interface{}
var blueprintJsonData string
err = yaml.Unmarshal(blueprintData, &yamlObj)
if err != nil {
logger.Error("Failed to parse blueprint YAML: %v", err)
} else {
// convert to json
jsonBytes, err := json.Marshal(yamlObj)
if err != nil {
logger.Error("Failed to convert blueprint to JSON: %v", err)
} else {
blueprintJsonData = string(jsonBytes)
logger.Debug("Converted blueprint to JSON: %s", blueprintJsonData)
}
}
// if we have valid json data, we can send it to the server
if blueprintJsonData == "" {
logger.Error("No valid blueprint JSON data to send to server")
return nil
}
logger.Info("Sending blueprint to server for application")
// send the blueprint data to the server
err = client.SendMessage("newt/blueprint/apply", map[string]interface{}{
"blueprint": blueprintJsonData,
})
}
return nil
}


@@ -6,6 +6,8 @@ import (
"crypto/x509"
"encoding/json"
"fmt"
"io"
"net"
"net/http"
"net/url"
"os"
@@ -17,6 +19,11 @@ import (
"github.com/fosrl/newt/logger"
"github.com/gorilla/websocket"
"context"
"github.com/fosrl/newt/internal/telemetry"
"go.opentelemetry.io/otel"
)
type Client struct {
@@ -34,12 +41,28 @@ type Client struct {
onConnect func() error
onTokenUpdate func(token string)
writeMux sync.Mutex
clientType string // Type of client (e.g., "newt", "olm")
tlsConfig TLSConfig
metricsCtxMu sync.RWMutex
metricsCtx context.Context
configNeedsSave bool // Flag to track if config needs to be saved
}
type ClientOption func(*Client)
type MessageHandler func(message WSMessage)
// TLSConfig holds TLS configuration options
type TLSConfig struct {
// New separate certificate support
ClientCertFile string
ClientKeyFile string
CAFiles []string
// Existing PKCS12 support (deprecated)
PKCS12File string
}
// WithBaseURL sets the base URL for the client
func WithBaseURL(url string) ClientOption {
return func(c *Client) {
@@ -47,9 +70,14 @@ func WithBaseURL(url string) ClientOption {
}
}
func WithTLSConfig(tlsClientCertPath string) ClientOption {
// WithTLSConfig sets the TLS configuration for the client
func WithTLSConfig(config TLSConfig) ClientOption {
return func(c *Client) {
c.config.TlsClientCert = tlsClientCertPath
c.tlsConfig = config
// For backward compatibility, also set the legacy field
if config.PKCS12File != "" {
c.config.TlsClientCert = config.PKCS12File
}
}
}
@@ -61,10 +89,30 @@ func (c *Client) OnTokenUpdate(callback func(token string)) {
c.onTokenUpdate = callback
}
// NewClient creates a new Newt client
func NewClient(newtID, secret string, endpoint string, pingInterval time.Duration, pingTimeout time.Duration, opts ...ClientOption) (*Client, error) {
func (c *Client) metricsContext() context.Context {
c.metricsCtxMu.RLock()
defer c.metricsCtxMu.RUnlock()
if c.metricsCtx != nil {
return c.metricsCtx
}
return context.Background()
}
func (c *Client) setMetricsContext(ctx context.Context) {
c.metricsCtxMu.Lock()
c.metricsCtx = ctx
c.metricsCtxMu.Unlock()
}
// MetricsContext exposes the context used for telemetry emission when a connection is active.
func (c *Client) MetricsContext() context.Context {
return c.metricsContext()
}
// NewClient creates a new websocket client
func NewClient(clientType string, ID, secret string, endpoint string, pingInterval time.Duration, pingTimeout time.Duration, opts ...ClientOption) (*Client, error) {
config := &Config{
NewtID: newtID,
ID: ID,
Secret: secret,
Endpoint: endpoint,
}
@@ -78,6 +126,7 @@ func NewClient(newtID, secret string, endpoint string, pingInterval time.Duratio
isConnected: false,
pingInterval: pingInterval,
pingTimeout: pingTimeout,
clientType: clientType,
}
// Apply options before loading config
@@ -96,6 +145,10 @@ func NewClient(newtID, secret string, endpoint string, pingInterval time.Duratio
return client, nil
}
func (c *Client) GetConfig() *Config {
return c.config
}
// Connect establishes the WebSocket connection
func (c *Client) Connect() error {
go c.connectWithRetry()
@@ -115,6 +168,7 @@ func (c *Client) Close() error {
// Set connection status to false
c.setConnected(false)
telemetry.SetWSConnectionState(false)
// Close the WebSocket connection gracefully
if c.conn != nil {
@@ -145,25 +199,39 @@ func (c *Client) SendMessage(messageType string, data interface{}) error {
c.writeMux.Lock()
defer c.writeMux.Unlock()
return c.conn.WriteJSON(msg)
if err := c.conn.WriteJSON(msg); err != nil {
return err
}
telemetry.IncWSMessage(c.metricsContext(), "out", "text")
return nil
}
func (c *Client) SendMessageInterval(messageType string, data interface{}, interval time.Duration) (stop func()) {
stopChan := make(chan struct{})
go func() {
count := 0
maxAttempts := 10
err := c.SendMessage(messageType, data) // Send immediately
if err != nil {
logger.Error("Failed to send initial message: %v", err)
}
count++
ticker := time.NewTicker(interval)
defer ticker.Stop()
for {
select {
case <-ticker.C:
if count >= maxAttempts {
logger.Info("SendMessageInterval timed out after %d attempts for message type: %s", maxAttempts, messageType)
return
}
err = c.SendMessage(messageType, data)
if err != nil {
logger.Error("Failed to send message: %v", err)
}
count++
case <-stopChan:
return
}
@@ -192,27 +260,52 @@ func (c *Client) getToken() (string, error) {
baseEndpoint := strings.TrimRight(baseURL.String(), "/")
var tlsConfig *tls.Config = nil
if c.config.TlsClientCert != "" {
tlsConfig, err = loadClientCertificate(c.config.TlsClientCert)
// Use new TLS configuration method
if c.tlsConfig.ClientCertFile != "" || c.tlsConfig.ClientKeyFile != "" || len(c.tlsConfig.CAFiles) > 0 || c.tlsConfig.PKCS12File != "" {
tlsConfig, err = c.setupTLS()
if err != nil {
return "", fmt.Errorf("failed to load certificate %s: %w", c.config.TlsClientCert, err)
return "", fmt.Errorf("failed to setup TLS configuration: %w", err)
}
}
// Check for environment variable to skip TLS verification
if os.Getenv("SKIP_TLS_VERIFY") == "true" {
if tlsConfig == nil {
tlsConfig = &tls.Config{}
}
tlsConfig.InsecureSkipVerify = true
logger.Debug("TLS certificate verification disabled via SKIP_TLS_VERIFY environment variable")
}
var tokenData map[string]interface{}
// Get a new token
tokenData := map[string]interface{}{
"newtId": c.config.NewtID,
"secret": c.config.Secret,
if c.clientType == "newt" {
tokenData = map[string]interface{}{
"newtId": c.config.ID,
"secret": c.config.Secret,
}
} else if c.clientType == "olm" {
tokenData = map[string]interface{}{
"olmId": c.config.ID,
"secret": c.config.Secret,
}
}
jsonData, err := json.Marshal(tokenData)
if err != nil {
return "", fmt.Errorf("failed to marshal token request data: %w", err)
}
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Create a new request
req, err := http.NewRequest(
req, err := http.NewRequestWithContext(
ctx,
"POST",
baseEndpoint+"/api/v1/auth/newt/get-token",
baseEndpoint+"/api/v1/auth/"+c.clientType+"/get-token",
bytes.NewBuffer(jsonData),
)
if err != nil {
@@ -232,13 +325,26 @@ func (c *Client) getToken() (string, error) {
}
resp, err := client.Do(req)
if err != nil {
telemetry.IncConnAttempt(ctx, "auth", "failure")
telemetry.IncConnError(ctx, "auth", classifyConnError(err))
return "", fmt.Errorf("failed to request new token: %w", err)
}
defer resp.Body.Close()
 	if resp.StatusCode != http.StatusOK {
-		logger.Error("Failed to get token with status code: %d", resp.StatusCode)
-		return "", fmt.Errorf("failed to get token with status code: %d", resp.StatusCode)
+		body, _ := io.ReadAll(resp.Body)
+		logger.Error("Failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
+		telemetry.IncConnAttempt(ctx, "auth", "failure")
+		etype := "io_error"
+		if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
+			etype = "auth_failed"
+		}
+		telemetry.IncConnError(ctx, "auth", etype)
+		// Reconnect reason mapping for auth failures
+		if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
+			telemetry.IncReconnect(ctx, c.config.ID, "client", telemetry.ReasonAuthError)
+		}
+		return "", fmt.Errorf("failed to get token with status code: %d, body: %s", resp.StatusCode, string(body))
 	}
var tokenResp TokenResponse
@@ -256,10 +362,55 @@ func (c *Client) getToken() (string, error) {
}
logger.Debug("Received token: %s", tokenResp.Data.Token)
telemetry.IncConnAttempt(ctx, "auth", "success")
return tokenResp.Data.Token, nil
}
// classifyConnError maps to fixed, low-cardinality error_type values.
// Allowed enum: dial_timeout, tls_handshake, auth_failed, io_error
func classifyConnError(err error) string {
if err == nil {
return ""
}
msg := strings.ToLower(err.Error())
switch {
case strings.Contains(msg, "tls") || strings.Contains(msg, "certificate"):
return "tls_handshake"
case strings.Contains(msg, "timeout") || strings.Contains(msg, "i/o timeout") || strings.Contains(msg, "deadline exceeded"):
return "dial_timeout"
case strings.Contains(msg, "unauthorized") || strings.Contains(msg, "forbidden"):
return "auth_failed"
default:
// Group remaining network/socket errors as io_error to avoid label explosion
return "io_error"
}
}
func classifyWSDisconnect(err error) (result, reason string) {
if err == nil {
return "success", "normal"
}
if websocket.IsCloseError(err, websocket.CloseNormalClosure) {
return "success", "normal"
}
if ne, ok := err.(net.Error); ok && ne.Timeout() {
return "error", "timeout"
}
if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseAbnormalClosure) {
return "error", "unexpected_close"
}
msg := strings.ToLower(err.Error())
switch {
case strings.Contains(msg, "eof"):
return "error", "eof"
case strings.Contains(msg, "reset"):
return "error", "connection_reset"
default:
return "error", "read_error"
}
}
func (c *Client) connectWithRetry() {
for {
select {
@@ -278,9 +429,13 @@ func (c *Client) connectWithRetry() {
}
func (c *Client) establishConnection() error {
ctx := context.Background()
// Get token for authentication
token, err := c.getToken()
if err != nil {
telemetry.IncConnAttempt(ctx, "websocket", "failure")
telemetry.IncConnError(ctx, "websocket", classifyConnError(err))
return fmt.Errorf("failed to get token: %w", err)
}
@@ -310,31 +465,72 @@ func (c *Client) establishConnection() error {
// Add token to query parameters
q := u.Query()
q.Set("token", token)
-	q.Set("clientType", "newt")
+	q.Set("clientType", c.clientType)
u.RawQuery = q.Encode()
-	// Connect to WebSocket
+	// Connect to WebSocket (optional span)
tr := otel.Tracer("newt")
ctx, span := tr.Start(ctx, "ws.connect")
defer span.End()
start := time.Now()
dialer := websocket.DefaultDialer
-	if c.config.TlsClientCert != "" {
-		logger.Info("Adding tls to req")
-		tlsConfig, err := loadClientCertificate(c.config.TlsClientCert)
+	// Use new TLS configuration method
+	if c.tlsConfig.ClientCertFile != "" || c.tlsConfig.ClientKeyFile != "" || len(c.tlsConfig.CAFiles) > 0 || c.tlsConfig.PKCS12File != "" {
+		logger.Info("Setting up TLS configuration for WebSocket connection")
+		tlsConfig, err := c.setupTLS()
 		if err != nil {
-			return fmt.Errorf("failed to load certificate %s: %w", c.config.TlsClientCert, err)
+			return fmt.Errorf("failed to setup TLS configuration: %w", err)
 		}
 		dialer.TLSClientConfig = tlsConfig
 	}
}
-	conn, _, err := dialer.Dial(u.String(), nil)
+	// Check for environment variable to skip TLS verification for WebSocket connection
+	if os.Getenv("SKIP_TLS_VERIFY") == "true" {
+		if dialer.TLSClientConfig == nil {
+			dialer.TLSClientConfig = &tls.Config{}
+		}
+		dialer.TLSClientConfig.InsecureSkipVerify = true
+		logger.Debug("WebSocket TLS certificate verification disabled via SKIP_TLS_VERIFY environment variable")
+	}
+	conn, _, err := dialer.DialContext(ctx, u.String(), nil)
lat := time.Since(start).Seconds()
if err != nil {
telemetry.IncConnAttempt(ctx, "websocket", "failure")
etype := classifyConnError(err)
telemetry.IncConnError(ctx, "websocket", etype)
telemetry.ObserveWSConnectLatency(ctx, lat, "failure", etype)
// Map handshake-related errors to reconnect reasons where appropriate
if etype == "tls_handshake" {
telemetry.IncReconnect(ctx, c.config.ID, "client", telemetry.ReasonHandshakeError)
} else if etype == "dial_timeout" {
telemetry.IncReconnect(ctx, c.config.ID, "client", telemetry.ReasonTimeout)
} else {
telemetry.IncReconnect(ctx, c.config.ID, "client", telemetry.ReasonError)
}
telemetry.IncWSReconnect(ctx, etype)
return fmt.Errorf("failed to connect to WebSocket: %w", err)
}
telemetry.IncConnAttempt(ctx, "websocket", "success")
telemetry.ObserveWSConnectLatency(ctx, lat, "success", "")
c.conn = conn
c.setConnected(true)
telemetry.SetWSConnectionState(true)
c.setMetricsContext(ctx)
sessionStart := time.Now()
// Wire up pong handler for metrics
c.conn.SetPongHandler(func(appData string) error {
telemetry.IncWSMessage(c.metricsContext(), "in", "pong")
return nil
})
// Start the ping monitor
go c.pingMonitor()
// Start the read pump with disconnect detection
-	go c.readPumpWithDisconnectDetection()
+	go c.readPumpWithDisconnectDetection(sessionStart)
if c.onConnect != nil {
err := c.saveConfig()
@@ -349,6 +545,69 @@ func (c *Client) establishConnection() error {
return nil
}
// setupTLS configures TLS based on the TLS configuration
func (c *Client) setupTLS() (*tls.Config, error) {
tlsConfig := &tls.Config{}
// Handle new separate certificate configuration
if c.tlsConfig.ClientCertFile != "" && c.tlsConfig.ClientKeyFile != "" {
logger.Info("Loading separate certificate files for mTLS")
logger.Debug("Client cert: %s", c.tlsConfig.ClientCertFile)
logger.Debug("Client key: %s", c.tlsConfig.ClientKeyFile)
// Load client certificate and key
cert, err := tls.LoadX509KeyPair(c.tlsConfig.ClientCertFile, c.tlsConfig.ClientKeyFile)
if err != nil {
return nil, fmt.Errorf("failed to load client certificate pair: %w", err)
}
tlsConfig.Certificates = []tls.Certificate{cert}
// Load CA certificates for remote validation if specified
if len(c.tlsConfig.CAFiles) > 0 {
logger.Debug("Loading CA certificates: %v", c.tlsConfig.CAFiles)
caCertPool := x509.NewCertPool()
for _, caFile := range c.tlsConfig.CAFiles {
caCert, err := os.ReadFile(caFile)
if err != nil {
return nil, fmt.Errorf("failed to read CA file %s: %w", caFile, err)
}
// Try to parse as PEM first, then DER
if !caCertPool.AppendCertsFromPEM(caCert) {
// If PEM parsing failed, try DER
cert, err := x509.ParseCertificate(caCert)
if err != nil {
return nil, fmt.Errorf("failed to parse CA certificate from %s: %w", caFile, err)
}
caCertPool.AddCert(cert)
}
}
tlsConfig.RootCAs = caCertPool
}
return tlsConfig, nil
}
// Fallback to existing PKCS12 implementation for backward compatibility
if c.tlsConfig.PKCS12File != "" {
logger.Info("Loading PKCS12 certificate for mTLS (deprecated)")
return c.setupPKCS12TLS()
}
// Legacy fallback using config.TlsClientCert
if c.config.TlsClientCert != "" {
logger.Info("Loading legacy PKCS12 certificate for mTLS (deprecated)")
return loadClientCertificate(c.config.TlsClientCert)
}
return nil, nil
}
// setupPKCS12TLS loads TLS configuration from PKCS12 file
func (c *Client) setupPKCS12TLS() (*tls.Config, error) {
return loadClientCertificate(c.tlsConfig.PKCS12File)
}
// pingMonitor sends pings at a short interval and triggers reconnect on failure
func (c *Client) pingMonitor() {
ticker := time.NewTicker(c.pingInterval)
@@ -364,6 +623,9 @@ func (c *Client) pingMonitor() {
}
c.writeMux.Lock()
err := c.conn.WriteControl(websocket.PingMessage, []byte{}, time.Now().Add(c.pingTimeout))
if err == nil {
telemetry.IncWSMessage(c.metricsContext(), "out", "ping")
}
c.writeMux.Unlock()
if err != nil {
// Check if we're shutting down before logging error and reconnecting
@@ -373,6 +635,8 @@ func (c *Client) pingMonitor() {
return
default:
logger.Error("Ping failed: %v", err)
telemetry.IncWSKeepaliveFailure(c.metricsContext(), "ping_write")
telemetry.IncWSReconnect(c.metricsContext(), "ping_write")
c.reconnect()
return
}
@@ -382,17 +646,26 @@ func (c *Client) pingMonitor() {
}
// readPumpWithDisconnectDetection reads messages and triggers reconnect on error
-func (c *Client) readPumpWithDisconnectDetection() {
+func (c *Client) readPumpWithDisconnectDetection(started time.Time) {
ctx := c.metricsContext()
disconnectReason := "shutdown"
disconnectResult := "success"
defer func() {
if c.conn != nil {
c.conn.Close()
}
if !started.IsZero() {
telemetry.ObserveWSSessionDuration(ctx, time.Since(started).Seconds(), disconnectResult)
}
telemetry.IncWSDisconnect(ctx, disconnectReason, disconnectResult)
// Only attempt reconnect if we're not shutting down
select {
case <-c.done:
// Shutting down, don't reconnect
return
default:
telemetry.IncWSReconnect(ctx, disconnectReason)
c.reconnect()
}
}()
@@ -400,23 +673,33 @@ func (c *Client) readPumpWithDisconnectDetection() {
for {
select {
case <-c.done:
disconnectReason = "shutdown"
disconnectResult = "success"
return
default:
var msg WSMessage
err := c.conn.ReadJSON(&msg)
if err == nil {
telemetry.IncWSMessage(c.metricsContext(), "in", "text")
}
if err != nil {
// Check if we're shutting down before logging error
select {
case <-c.done:
// Expected during shutdown, don't log as error
logger.Debug("WebSocket connection closed during shutdown")
disconnectReason = "shutdown"
disconnectResult = "success"
return
default:
// Unexpected error during normal operation
-	if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseAbnormalClosure, websocket.CloseNormalClosure) {
-		logger.Error("WebSocket read error: %v", err)
-	} else {
-		logger.Debug("WebSocket connection closed: %v", err)
+	disconnectResult, disconnectReason = classifyWSDisconnect(err)
+	if disconnectResult == "error" {
+		if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway, websocket.CloseAbnormalClosure, websocket.CloseNormalClosure) {
+			logger.Error("WebSocket read error: %v", err)
+		} else {
+			logger.Debug("WebSocket connection closed: %v", err)
+		}
 	}
return // triggers reconnect via defer
}
@@ -433,6 +716,7 @@ func (c *Client) readPumpWithDisconnectDetection() {
func (c *Client) reconnect() {
c.setConnected(false)
telemetry.SetWSConnectionState(false)
if c.conn != nil {
c.conn.Close()
c.conn = nil
@@ -453,7 +737,7 @@ func (c *Client) setConnected(status bool) {
c.isConnected = status
}
-// LoadClientCertificate Helper method to load client certificates
+// LoadClientCertificate Helper method to load client certificates (PKCS12 format)
func loadClientCertificate(p12Path string) (*tls.Config, error) {
logger.Info("Loading tls-client-cert %s", p12Path)
// Read the PKCS12 file


@@ -6,35 +6,54 @@ import (
"os"
"path/filepath"
"runtime"
"github.com/fosrl/newt/logger"
)
-func getConfigPath() string {
-	var configDir string
-	switch runtime.GOOS {
-	case "darwin":
-		configDir = filepath.Join(os.Getenv("HOME"), "Library", "Application Support", "newt-client")
-	case "windows":
-		configDir = filepath.Join(os.Getenv("APPDATA"), "newt-client")
-	default: // linux and others
-		configDir = filepath.Join(os.Getenv("HOME"), ".config", "newt-client")
-	}
-	if err := os.MkdirAll(configDir, 0755); err != nil {
-		log.Printf("Failed to create config directory: %v", err)
-	}
-	return filepath.Join(configDir, "config.json")
+func getConfigPath(clientType string) string {
+	configFile := os.Getenv("CONFIG_FILE")
+	if configFile == "" {
+		var configDir string
+		switch runtime.GOOS {
+		case "darwin":
+			configDir = filepath.Join(os.Getenv("HOME"), "Library", "Application Support", clientType+"-client")
+		case "windows":
+			logDir := filepath.Join(os.Getenv("PROGRAMDATA"), "olm")
+			configDir = filepath.Join(logDir, clientType+"-client")
+		default: // linux and others
+			configDir = filepath.Join(os.Getenv("HOME"), ".config", clientType+"-client")
+		}
+		if err := os.MkdirAll(configDir, 0755); err != nil {
+			log.Printf("Failed to create config directory: %v", err)
+		}
+		return filepath.Join(configDir, "config.json")
+	}
+	return configFile
 }
func (c *Client) loadConfig() error {
+	originalConfig := *c.config // Store original config to detect changes
+	configPath := getConfigPath(c.clientType)
-	if c.config.NewtID != "" && c.config.Secret != "" && c.config.Endpoint != "" {
+	if c.config.ID != "" && c.config.Secret != "" && c.config.Endpoint != "" {
 		logger.Debug("Config already provided, skipping loading from file")
+		// Check if config file exists, if not, we should save it
+		if _, err := os.Stat(configPath); os.IsNotExist(err) {
+			logger.Info("Config file does not exist at %s, will create it", configPath)
+			c.configNeedsSave = true
+		}
 		return nil
 	}
-	configPath := getConfigPath()
logger.Info("Loading config from: %s", configPath)
data, err := os.ReadFile(configPath)
if err != nil {
if os.IsNotExist(err) {
logger.Info("Config file does not exist at %s, will create it with provided values", configPath)
c.configNeedsSave = true
return nil
}
return err
@@ -45,8 +64,14 @@ func (c *Client) loadConfig() error {
return err
}
-	if c.config.NewtID == "" {
-		c.config.NewtID = config.NewtID
+	// Track what was loaded from file vs provided by CLI
+	fileHadID := c.config.ID == ""
+	fileHadSecret := c.config.Secret == ""
+	fileHadCert := c.config.TlsClientCert == ""
+	fileHadEndpoint := c.config.Endpoint == ""
+	if c.config.ID == "" {
+		c.config.ID = config.ID
 	}
if c.config.Secret == "" {
c.config.Secret = config.Secret
@@ -59,14 +84,37 @@ func (c *Client) loadConfig() error {
c.baseURL = config.Endpoint
}
// Check if CLI args provided values that override file values
if (!fileHadID && originalConfig.ID != "") ||
(!fileHadSecret && originalConfig.Secret != "") ||
(!fileHadCert && originalConfig.TlsClientCert != "") ||
(!fileHadEndpoint && originalConfig.Endpoint != "") {
logger.Info("CLI arguments provided, config will be updated")
c.configNeedsSave = true
}
logger.Debug("Loaded config from %s", configPath)
logger.Debug("Config: %+v", c.config)
return nil
}
func (c *Client) saveConfig() error {
-	configPath := getConfigPath()
+	if !c.configNeedsSave {
+		logger.Debug("Config has not changed, skipping save")
+		return nil
+	}
+	configPath := getConfigPath(c.clientType)
 	data, err := json.MarshalIndent(c.config, "", "  ")
 	if err != nil {
 		return err
 	}
-	return os.WriteFile(configPath, data, 0644)
+	logger.Info("Saving config to: %s", configPath)
+	err = os.WriteFile(configPath, data, 0644)
+	if err == nil {
+		c.configNeedsSave = false // Reset flag after successful save
+	}
+	return err
}


@@ -1,7 +1,7 @@
package websocket
type Config struct {
-	NewtID        string `json:"newtId"`
+	ID            string `json:"id"`
Secret string `json:"secret"`
Endpoint string `json:"endpoint"`
TlsClientCert string `json:"tlsClientCert"`

wg/wg.go

@@ -3,7 +3,9 @@
package wg
import (
"context"
"encoding/json"
"errors"
"fmt"
"net"
"os"
@@ -12,16 +14,19 @@ import (
"sync"
"time"
"math/rand"
"github.com/fosrl/newt/logger"
"github.com/fosrl/newt/network"
"github.com/fosrl/newt/websocket"
"github.com/vishvananda/netlink"
"golang.org/x/crypto/chacha20poly1305"
"golang.org/x/crypto/curve25519"
"golang.org/x/exp/rand"
"golang.zx2c4.com/wireguard/conn"
"golang.zx2c4.com/wireguard/wgctrl"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"github.com/fosrl/newt/internal/telemetry"
)
type WgConfig struct {
@@ -48,21 +53,24 @@ type PeerReading struct {
}
type WireGuardService struct {
-	interfaceName string
-	mtu           int
-	client        *websocket.Client
-	wgClient      *wgctrl.Client
-	config        WgConfig
-	key           wgtypes.Key
-	newtId        string
-	lastReadings  map[string]PeerReading
-	mu            sync.Mutex
-	Port          uint16
-	stopHolepunch chan struct{}
-	host          string
-	serverPubKey  string
-	token         string
-	stopGetConfig chan struct{}
+	interfaceName     string
+	mtu               int
+	client            *websocket.Client
+	wgClient          *wgctrl.Client
+	config            WgConfig
+	key               wgtypes.Key
+	keyFilePath       string
+	newtId            string
+	lastReadings      map[string]PeerReading
+	mu                sync.Mutex
+	Port              uint16
+	stopHolepunch     chan struct{}
+	host              string
+	serverPubKey      string
+	holePunchEndpoint string
+	token             string
+	stopGetConfig     func()
+	interfaceCreated  bool
}
// Add this type definition
@@ -102,7 +110,7 @@ func FindAvailableUDPPort(minPort, maxPort uint16) (uint16, error) {
}
// Fisher-Yates shuffle to randomize the port order
-	rand.Seed(uint64(time.Now().UnixNano()))
+	rand.Seed(time.Now().UnixNano())
for i := len(portRange) - 1; i > 0; i-- {
j := rand.Intn(i + 1)
portRange[i], portRange[j] = portRange[j], portRange[i]
@@ -148,26 +156,52 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
}
var key wgtypes.Key
var port uint16
// if generateAndSaveKeyTo is provided, generate a private key and save it to the file. if the file already exists, load the key from the file
if _, err := os.Stat(generateAndSaveKeyTo); os.IsNotExist(err) {
// generate a new private key
key, err = wgtypes.GeneratePrivateKey()
if err != nil {
logger.Fatal("Failed to generate private key: %v", err)
key, err = wgtypes.GeneratePrivateKey()
if err != nil {
return nil, fmt.Errorf("failed to generate private key: %v", err)
}
// Load or generate private key
if generateAndSaveKeyTo != "" {
if _, err := os.Stat(generateAndSaveKeyTo); os.IsNotExist(err) {
keyData, err := os.ReadFile(generateAndSaveKeyTo)
if err != nil {
return nil, fmt.Errorf("failed to read private key: %v", err)
}
key, err = wgtypes.ParseKey(strings.TrimSpace(string(keyData)))
if err != nil {
return nil, fmt.Errorf("failed to parse private key: %v", err)
}
} else {
err = os.WriteFile(generateAndSaveKeyTo, []byte(key.String()), 0600)
if err != nil {
return nil, fmt.Errorf("failed to save private key: %v", err)
}
}
// save the key to the file
err = os.WriteFile(generateAndSaveKeyTo, []byte(key.String()), 0644)
if err != nil {
logger.Fatal("Failed to save private key: %v", err)
}
// Get the existing wireguard port
device, err := wgClient.Device(interfaceName)
if err == nil {
port = uint16(device.ListenPort)
// also set the private key to the existing key
key = device.PrivateKey
if port != 0 {
logger.Info("WireGuard interface %s already exists with port %d\n", interfaceName, port)
} else {
port, err = FindAvailableUDPPort(49152, 65535)
if err != nil {
fmt.Printf("Error finding available port: %v\n", err)
return nil, err
}
}
} else {
keyData, err := os.ReadFile(generateAndSaveKeyTo)
port, err = FindAvailableUDPPort(49152, 65535)
if err != nil {
logger.Fatal("Failed to read private key: %v", err)
}
key, err = wgtypes.ParseKey(string(keyData))
if err != nil {
logger.Fatal("Failed to parse private key: %v", err)
fmt.Printf("Error finding available port: %v\n", err)
return nil, err
}
}
@@ -177,24 +211,12 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
client: wsClient,
wgClient: wgClient,
key: key,
Port: port,
keyFilePath: generateAndSaveKeyTo,
newtId: newtId,
host: host,
lastReadings: make(map[string]PeerReading),
stopHolepunch: make(chan struct{}),
stopGetConfig: make(chan struct{}),
}
-	// Get the existing wireguard port (keep this part)
-	device, err := service.wgClient.Device(service.interfaceName)
-	if err == nil {
-		service.Port = uint16(device.ListenPort)
-		logger.Info("WireGuard interface %s already exists with port %d\n", service.interfaceName, service.Port)
-	} else {
-		service.Port, err = FindAvailableUDPPort(49152, 65535)
-		if err != nil {
-			fmt.Printf("Error finding available port: %v\n", err)
-			return nil, err
-		}
-	}
// Register websocket handlers
@@ -203,22 +225,13 @@ func NewWireGuardService(interfaceName string, mtu int, generateAndSaveKeyTo str
wsClient.RegisterHandler("newt/wg/peer/remove", service.handleRemovePeer)
wsClient.RegisterHandler("newt/wg/peer/update", service.handleUpdatePeer)
if err := service.sendUDPHolePunch(service.host + ":21820"); err != nil {
logger.Error("Failed to send UDP hole punch: %v", err)
}
// start the UDP holepunch
go service.keepSendingUDPHolePunch(service.host)
return service, nil
}
func (s *WireGuardService) Close(rm bool) {
-	select {
-	case <-s.stopGetConfig:
-		// Already closed, do nothing
-	default:
-		close(s.stopGetConfig)
+	if s.stopGetConfig != nil {
+		s.stopGetConfig()
+		s.stopGetConfig = nil
 	}
s.wgClient.Close()
@@ -229,14 +242,29 @@ func (s *WireGuardService) Close(rm bool) {
}
-		// Remove the private key file
-		if err := os.Remove(s.key.String()); err != nil {
-			logger.Error("Failed to remove private key file: %v", err)
-		}
+		// if s.keyFilePath != "" {
+		// 	if err := os.Remove(s.keyFilePath); err != nil {
+		// 		logger.Error("Failed to remove private key file: %v", err)
+		// 	}
+		// }
}
}
-func (s *WireGuardService) SetServerPubKey(serverPubKey string) {
+func (s *WireGuardService) StartHolepunch(serverPubKey string, endpoint string) {
// if the device is already created dont start a new holepunch
if s.interfaceCreated {
return
}
s.serverPubKey = serverPubKey
s.holePunchEndpoint = endpoint
logger.Debug("Starting UDP hole punch to %s", s.holePunchEndpoint)
s.stopHolepunch = make(chan struct{})
// start the UDP holepunch
go s.keepSendingUDPHolePunch(s.holePunchEndpoint)
}
func (s *WireGuardService) SetToken(token string) {
@@ -244,47 +272,74 @@ func (s *WireGuardService) SetToken(token string) {
}
func (s *WireGuardService) LoadRemoteConfig() error {
-	// Send the initial message
-	err := s.sendGetConfigMessage()
-	if err != nil {
-		logger.Error("Failed to send initial get-config message: %v", err)
-		return err
-	}
-	// Start goroutine to periodically send the message until config is received
-	go s.keepSendingGetConfig()
+	s.stopGetConfig = s.client.SendMessageInterval("newt/wg/get-config", map[string]interface{}{
+		"publicKey": s.key.PublicKey().String(),
+		"port":      s.Port,
+	}, 2*time.Second)
+	logger.Info("Requesting WireGuard configuration from remote server")
go s.periodicBandwidthCheck()
return nil
}
func (s *WireGuardService) handleConfig(msg websocket.WSMessage) {
ctx := context.Background()
if s.client != nil {
ctx = s.client.MetricsContext()
}
result := "success"
defer func() {
telemetry.IncConfigReload(ctx, result)
}()
var config WgConfig
-	logger.Info("Received message: %v", msg)
+	logger.Debug("Received message: %v", msg)
+	logger.Info("Received WireGuard clients configuration from remote server")
jsonData, err := json.Marshal(msg.Data)
if err != nil {
logger.Info("Error marshaling data: %v", err)
result = "failure"
return
}
if err := json.Unmarshal(jsonData, &config); err != nil {
logger.Info("Error unmarshaling target data: %v", err)
result = "failure"
return
}
s.config = config
-	close(s.stopGetConfig)
-	// Ensure the WireGuard interface and peers are configured
-	if err := s.ensureWireguardInterface(config); err != nil {
-		logger.Error("Failed to ensure WireGuard interface: %v", err)
-	}
+	if s.stopGetConfig != nil {
+		s.stopGetConfig()
+		s.stopGetConfig = nil
+	}
+	// telemetry: config reload success
+	// Optional reconnect reason mapping: config change
+	if s.serverPubKey != "" {
+		telemetry.IncReconnect(ctx, s.serverPubKey, "client", telemetry.ReasonConfigChange)
+	}
+	// Ensure the WireGuard interface and peers are configured
+	start := time.Now()
+	if err := s.ensureWireguardInterface(config); err != nil {
+		logger.Error("Failed to ensure WireGuard interface: %v", err)
+		telemetry.ObserveConfigApply(ctx, "interface", "failure", time.Since(start).Seconds())
+		result = "failure"
+	} else {
+		telemetry.ObserveConfigApply(ctx, "interface", "success", time.Since(start).Seconds())
+	}
+	startPeers := time.Now()
+	if err := s.ensureWireguardPeers(config.Peers); err != nil {
+		logger.Error("Failed to ensure WireGuard peers: %v", err)
+		telemetry.ObserveConfigApply(ctx, "peer", "failure", time.Since(startPeers).Seconds())
+		result = "failure"
+	} else {
+		telemetry.ObserveConfigApply(ctx, "peer", "success", time.Since(startPeers).Seconds())
+	}
}
@@ -298,6 +353,7 @@ func (s *WireGuardService) ensureWireguardInterface(wgconfig WgConfig) error {
if err != nil {
logger.Fatal("Failed to create WireGuard interface: %v", err)
}
s.interfaceCreated = true
logger.Info("Created WireGuard interface %s\n", s.interfaceName)
} else {
logger.Fatal("Error checking for WireGuard interface: %v", err)
@@ -315,9 +371,16 @@ func (s *WireGuardService) ensureWireguardInterface(wgconfig WgConfig) error {
s.Port = uint16(device.ListenPort)
logger.Info("WireGuard interface %s already exists with port %d\n", s.interfaceName, s.Port)
s.interfaceCreated = true
return nil
}
// stop the holepunch its a channel
if s.stopHolepunch != nil {
close(s.stopHolepunch)
s.stopHolepunch = nil
}
logger.Info("Assigning IP address %s to interface %s\n", wgconfig.IpAddress, s.interfaceName)
// Assign IP address to the interface
err = s.assignIPAddress(wgconfig.IpAddress)
@@ -328,7 +391,10 @@ func (s *WireGuardService) ensureWireguardInterface(wgconfig WgConfig) error {
// Check if the interface already exists
_, err = s.wgClient.Device(s.interfaceName)
if err != nil {
-		return fmt.Errorf("interface %s does not exist", s.interfaceName)
+		if errors.Is(err, os.ErrNotExist) {
+			return fmt.Errorf("interface %s does not exist", s.interfaceName)
+		}
+		return fmt.Errorf("failed to get device: %v", err)
}
// Parse the private key
@@ -447,7 +513,7 @@ func (s *WireGuardService) ensureWireguardPeers(peers []Peer) error {
}
func (s *WireGuardService) handleAddPeer(msg websocket.WSMessage) {
-	logger.Info("Received message: %v", msg.Data)
+	logger.Debug("Received message: %v", msg.Data)
var peer Peer
jsonData, err := json.Marshal(msg.Data)
@@ -520,7 +586,7 @@ func (s *WireGuardService) addPeer(peer Peer) error {
}
func (s *WireGuardService) handleRemovePeer(msg websocket.WSMessage) {
-	logger.Info("Received message: %v", msg.Data)
+	logger.Debug("Received message: %v", msg.Data)
// parse the publicKey from the message which is json { "publicKey": "asdfasdfl;akjsdf" }
type RemoveRequest struct {
PublicKey string `json:"publicKey"`
@@ -568,7 +634,7 @@ func (s *WireGuardService) removePeer(publicKey string) error {
}
func (s *WireGuardService) handleUpdatePeer(msg websocket.WSMessage) {
-	logger.Info("Received message: %v", msg.Data)
+	logger.Debug("Received message: %v", msg.Data)
// Define a struct to match the incoming message structure with optional fields
type UpdatePeerRequest struct {
PublicKey string `json:"publicKey"`
@@ -629,7 +695,7 @@ func (s *WireGuardService) handleUpdatePeer(msg websocket.WSMessage) {
}
// Only update AllowedIPs if provided in the request
-	if request.AllowedIPs != nil && len(request.AllowedIPs) > 0 {
+	if len(request.AllowedIPs) > 0 {
var allowedIPs []net.IPNet
for _, ipStr := range request.AllowedIPs {
_, ipNet, err := net.ParseCIDR(ipStr)
@@ -917,17 +983,30 @@ func (s *WireGuardService) encryptPayload(payload []byte) (interface{}, error) {
}
func (s *WireGuardService) keepSendingUDPHolePunch(host string) {
logger.Info("Starting UDP hole punch routine to %s:21820", host)
// send initial hole punch
if err := s.sendUDPHolePunch(host + ":21820"); err != nil {
logger.Debug("Failed to send initial UDP hole punch: %v", err)
}
ticker := time.NewTicker(3 * time.Second)
defer ticker.Stop()
timeout := time.NewTimer(15 * time.Second)
defer timeout.Stop()
for {
select {
case <-s.stopHolepunch:
logger.Info("Stopping UDP holepunch")
return
case <-timeout.C:
logger.Info("UDP holepunch routine timed out after 15 seconds")
return
case <-ticker.C:
if err := s.sendUDPHolePunch(host + ":21820"); err != nil {
-				logger.Error("Failed to send UDP hole punch: %v", err)
+				logger.Debug("Failed to send UDP hole punch: %v", err)
}
}
}
@@ -949,33 +1028,3 @@ func (s *WireGuardService) removeInterface() error {
return nil
}
-func (s *WireGuardService) sendGetConfigMessage() error {
-	err := s.client.SendMessage("newt/wg/get-config", map[string]interface{}{
-		"publicKey": fmt.Sprintf("%s", s.key.PublicKey().String()),
-		"port":      s.Port,
-	})
-	if err != nil {
-		logger.Error("Failed to send get-config message: %v", err)
-		return err
-	}
-	logger.Info("Requesting WireGuard configuration from remote server")
-	return nil
-}
-
-func (s *WireGuardService) keepSendingGetConfig() {
-	ticker := time.NewTicker(3 * time.Second)
-	defer ticker.Stop()
-
-	for {
-		select {
-		case <-s.stopGetConfig:
-			logger.Info("Stopping get-config messages")
-			return
-		case <-ticker.C:
-			if err := s.sendGetConfigMessage(); err != nil {
-				logger.Error("Failed to send periodic get-config: %v", err)
-			}
-		}
-	}
-}

wgnetstack/wgnetstack.go (new file; diff suppressed because it is too large)

@@ -8,6 +8,8 @@ import (
"time"
"github.com/fosrl/newt/logger"
"golang.zx2c4.com/wireguard/tun/netstack"
"gvisor.dev/gvisor/pkg/tcpip/adapters/gonet"
)
const (
@@ -26,7 +28,9 @@ const (
// Server handles listening for connection check requests using UDP
type Server struct {
-	conn         *net.UDPConn
+	conn         net.Conn     // Generic net.Conn interface (could be *net.UDPConn or *gonet.UDPConn)
+	udpConn      *net.UDPConn // Regular UDP connection (when not using netstack)
+	netstackConn interface{}  // Netstack UDP connection (when using netstack)
serverAddr string
serverPort uint16
shutdownCh chan struct{}
@@ -34,6 +38,8 @@ type Server struct {
runningLock sync.Mutex
newtID string
outputPrefix string
useNetstack bool
tnet interface{} // Will be *netstack.Net when using netstack
}
// NewServer creates a new connection test server using UDP
@@ -44,6 +50,21 @@ func NewServer(serverAddr string, serverPort uint16, newtID string) *Server {
shutdownCh: make(chan struct{}),
newtID: newtID,
outputPrefix: "[WGTester] ",
useNetstack: false,
tnet: nil,
}
}
// NewServerWithNetstack creates a new connection test server using WireGuard netstack
func NewServerWithNetstack(serverAddr string, serverPort uint16, newtID string, tnet *netstack.Net) *Server {
return &Server{
serverAddr: serverAddr,
serverPort: serverPort + 1, // use the next port for the server
shutdownCh: make(chan struct{}),
newtID: newtID,
outputPrefix: "[WGTester] ",
useNetstack: true,
tnet: tnet,
}
}
@@ -59,18 +80,30 @@ func (s *Server) Start() error {
//create the address to listen on
addr := net.JoinHostPort(s.serverAddr, fmt.Sprintf("%d", s.serverPort))
-	// Create UDP address to listen on
-	udpAddr, err := net.ResolveUDPAddr("udp", addr)
-	if err != nil {
-		return err
-	}
-	// Create UDP connection
-	conn, err := net.ListenUDP("udp", udpAddr)
-	if err != nil {
-		return err
-	}
-	s.conn = conn
+	if s.useNetstack && s.tnet != nil {
+		// Use WireGuard netstack
+		tnet := s.tnet.(*netstack.Net)
+		udpAddr := &net.UDPAddr{Port: int(s.serverPort)}
+		netstackConn, err := tnet.ListenUDP(udpAddr)
+		if err != nil {
+			return err
+		}
+		s.netstackConn = netstackConn
+		s.conn = netstackConn
+	} else {
+		// Use regular UDP socket
+		udpAddr, err := net.ResolveUDPAddr("udp", addr)
+		if err != nil {
+			return err
+		}
+		udpConn, err := net.ListenUDP("udp", udpAddr)
+		if err != nil {
+			return err
+		}
+		s.udpConn = udpConn
+		s.conn = udpConn
+	}
s.isRunning = true
go s.handleConnections()
@@ -93,7 +126,27 @@ func (s *Server) Stop() {
s.conn.Close()
}
s.isRunning = false
-	logger.Info(s.outputPrefix + "Server stopped")
+	logger.Info("%sServer stopped", s.outputPrefix)
}
// RestartWithNetstack stops the current server and restarts it with netstack
func (s *Server) RestartWithNetstack(tnet *netstack.Net) error {
s.Stop()
// Update configuration to use netstack
s.useNetstack = true
s.tnet = tnet
// Clear previous connections
s.conn = nil
s.udpConn = nil
s.netstackConn = nil
// Create new shutdown channel
s.shutdownCh = make(chan struct{})
// Restart the server
return s.Start()
}
// handleConnections processes incoming packets
@@ -108,18 +161,34 @@ func (s *Server) handleConnections() {
// Set read deadline to avoid blocking forever
err := s.conn.SetReadDeadline(time.Now().Add(1 * time.Second))
if err != nil {
-			logger.Error(s.outputPrefix+"Error setting read deadline: %v", err)
+			logger.Error("%sError setting read deadline: %v", s.outputPrefix, err)
continue
}
-		// Read from UDP connection
-		n, addr, err := s.conn.ReadFromUDP(buffer)
+		// Read from UDP connection - handle both regular UDP and netstack UDP
+		var n int
+		var addr net.Addr
+		if s.useNetstack {
+			// Use netstack UDP connection
+			netstackConn := s.netstackConn.(*gonet.UDPConn)
+			n, addr, err = netstackConn.ReadFrom(buffer)
+		} else {
+			// Use regular UDP connection
+			n, addr, err = s.udpConn.ReadFromUDP(buffer)
+		}
if err != nil {
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
// Just a timeout, keep going
continue
}
-			logger.Error(s.outputPrefix+"Error reading from UDP: %v", err)
+			// Check if we're shutting down and the connection was closed
+			select {
+			case <-s.shutdownCh:
+				return // Don't log error if we're shutting down
+			default:
+				logger.Error("%sError reading from UDP: %v", s.outputPrefix, err)
+			}
continue
}
@@ -150,14 +219,23 @@ func (s *Server) handleConnections() {
copy(responsePacket[5:13], buffer[5:13])
// Log response being sent for debugging
-		logger.Debug(s.outputPrefix+"Sending response to %s", addr.String())
+		logger.Debug("%sSending response to %s", s.outputPrefix, addr.String())
-		// Send the response packet directly to the source address
-		_, err = s.conn.WriteToUDP(responsePacket, addr)
-		if err != nil {
-			logger.Error(s.outputPrefix+"Error sending response: %v", err)
-		} else {
-			logger.Debug(s.outputPrefix + "Response sent successfully")
-		}
+		// Send the response packet - handle both regular UDP and netstack UDP
+		if s.useNetstack {
+			// Use netstack UDP connection
+			netstackConn := s.netstackConn.(*gonet.UDPConn)
+			_, err = netstackConn.WriteTo(responsePacket, addr)
+		} else {
+			// Use regular UDP connection
+			udpAddr := addr.(*net.UDPAddr)
+			_, err = s.udpConn.WriteToUDP(responsePacket, udpAddr)
+		}
+		if err != nil {
+			logger.Error("%sError sending response: %v", s.outputPrefix, err)
+		} else {
+			logger.Debug("%sResponse sent successfully", s.outputPrefix)
+		}
}
}