fix: override default fleet-agent metrics and health bind addresses by anmazzotti · Pull Request #2173 · rancher/turtles

anmazzotti · 2026-03-02T08:26:56Z

What this PR does / why we need it:

I found that on GKE the fleet-agent fails to initialize correctly because the metrics port 8080 is already bound (see also the reserved port list).

{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"setup","msg":"problem running manager","error":"failed to start metrics server: failed to create listener: listen tcp :8080: bind: address already in use","stacktrace":"github.com/rancher/fleet/internal/cmd/agent.start\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/operator.go:181\ngithub.com/rancher/fleet/internal/cmd/agent.(*FleetAgent).Run.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/root.go:143"}
{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"setup","msg":"failed to start agent","error":"failed to start metrics server: failed to create listener: listen tcp :8080: bind: address already in use","stacktrace":"github.com/rancher/fleet/internal/cmd/agent.(*FleetAgent).Run.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/root.go:144"}
{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"controller-runtime.source.Kind","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1alpha1.BundleDeployment Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:80\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.35.0/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:68"}
{"level":"error","ts":"2026-02-26T15:33:05Z","logger":"clusterstatus","msg":"failed to report initial cluster status","cluster":"cluster-gke-pq8v6b","interval":900,"error":"client rate limiter Wait returned an error: context canceled","stacktrace":"github.com/rancher/fleet/internal/cmd/agent/clusterstatus.Ticker.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/clusterstatus/ticker.go:42"}
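
The first error in these logs can be reproduced in isolation. The following is a minimal Python sketch, not fleet-agent code: it uses a throwaway loopback port chosen by the OS instead of 8080, and the helper name `try_bind` is hypothetical. It shows a second listener on an already-bound port failing with EADDRINUSE, the condition reported as "bind: address already in use".

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind and listen on (host, port).

    Returns (socket, None) on success or (None, errno) on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


# Stand-in for whatever already owns the port on the node.
# Port 0 asks the OS to pick any free port.
first, first_err = try_bind(0)
taken_port = first.getsockname()[1]

# Stand-in for the metrics server trying to listen on the same port.
second, second_err = try_bind(taken_port)
print("second bind failed:", second is None)
print("errno is EADDRINUSE:", second_err == errno.EADDRINUSE)

first.close()
```

The first listener plays the role of the process that already holds 8080 on the GKE node; the agent's metrics server is in the position of the second call.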

The bind addresses can be configured using fleet-agent environment variables; however, there are two issues:

The FleetAddonConfig is embedded in the rancher-turtles-providers chart (previously it was embedded in the turtles chart).
CAAPF does not allow changing the configuration per Cluster (see "Allow different agent configurations per Cluster", cluster-api-addon-provider-fleet#428).

So I see no other way than changing this for all Clusters and for all rancher-turtles-providers chart users.
This is an opinionated choice; however, since we also use the hostNetwork setting, binding to 18080 and 18081 is probably safer in most cases.
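
Because a Pod with hostNetwork shares the node's network namespace, whether a port is usable depends on what else is listening on that node. A rough way to compare the old and new defaults on a given host is sketched below. The helper name `port_is_free` is hypothetical and not part of Turtles or Fleet. It checks IPv4 only, and it reports the state only at the moment it runs, on the machine where it runs.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE; a permission error is also treated as "not usable".
            return False
        return True


if __name__ == "__main__":
    # 8080/8081 are the previous defaults, 18080/18081 the ones set by this PR.
    for port in (8080, 8081, 18080, 18081):
        print(port, "free" if port_is_free(port) else "in use")
```

Running this on a node (or in a hostNetwork debug Pod) gives a quick indication of whether a given pair of ports would collide there; it does not guarantee the ports stay free afterwards.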

However, this causes the fleet-agent to be rolled out again on already provisioned Clusters so that it binds to the new ports, which will likely be an unexpected change for current users.
Chart configuration values have been added so that users can revert to 8080 and 8081 if they wish.

Test run that includes this change: https://github.com/rancher/turtles/actions/runs/22565380279/job/65360516383

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

squashed commits into logical changes
includes documentation
adds unit tests
adds or updates e2e tests

Signed-off-by: Andrea Mazzotti <andrea.mazzotti@suse.com>

yiannistri

Tested with

 helm template rancher-turtles-providers

and

 helm template rancher-turtles-providers --set extras.addonFleet.config.enabled=false

Thank you!

fix: override default fleet-agent metrics and health bind addresses

391026b

Signed-off-by: Andrea Mazzotti <andrea.mazzotti@suse.com>

anmazzotti requested a review from a team as a code owner March 2, 2026 08:26

anmazzotti marked this pull request as draft March 2, 2026 08:27

anmazzotti self-assigned this Mar 2, 2026

anmazzotti added kind/bug Something isn't working area/fleet labels Mar 2, 2026

anmazzotti added this to CAPI / Turtles Mar 2, 2026

anmazzotti moved this to In Progress (8 max) in CAPI / Turtles Mar 2, 2026

anmazzotti added area/fleet and removed area/fleet labels Mar 2, 2026

anmazzotti moved this from In Progress (8 max) to PR to be reviewed in CAPI / Turtles Mar 2, 2026

anmazzotti marked this pull request as ready for review March 2, 2026 11:13

yiannistri approved these changes Mar 3, 2026

salasberryfin approved these changes Mar 3, 2026

salasberryfin merged commit 23f6794 into rancher:main Mar 3, 2026
21 of 38 checks passed

github-project-automation bot moved this from PR to be reviewed to Done in CAPI / Turtles Mar 3, 2026

salasberryfin merged 1 commit into rancher:main from anmazzotti:update_fleet_agent_default_ports
