I found out that on GKE the fleet-agent fails to start because its metrics server cannot bind to port 8080:

{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"setup","msg":"problem running manager","error":"failed to start metrics server: failed to create listener: listen tcp :8080: bind: address already in use","stacktrace":"github.com/rancher/fleet/internal/cmd/agent.start\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/operator.go:181\ngithub.com/rancher/fleet/internal/cmd/agent.(*FleetAgent).Run.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/root.go:143"}
{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"setup","msg":"failed to start agent","error":"failed to start metrics server: failed to create listener: listen tcp :8080: bind: address already in use","stacktrace":"github.com/rancher/fleet/internal/cmd/agent.(*FleetAgent).Run.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/root.go:144"}
{"level":"error","ts":"2026-02-26T15:32:50Z","logger":"controller-runtime.source.Kind","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1alpha1.BundleDeployment Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:80\n..."}
{"level":"error","ts":"2026-02-26T15:33:05Z","logger":"clusterstatus","msg":"failed to report initial cluster status","cluster":"cluster-gke-pq8v6b","interval":900,"error":"client rate limiter Wait returned an error: context canceled","stacktrace":"github.com/rancher/fleet/internal/cmd/agent/clusterstatus.Ticker.func1\n\t/home/runner/_work/fleet/fleet/internal/cmd/agent/clusterstatus/ticker.go:42"}

The port can be configured via fleet-agent environment variables; however, there are two issues:

So I see no other way than changing this for all Clusters and for all rancher-turtles-providers chart users. This, however, has the consequence of rolling out the
Signed-off-by: Andrea Mazzotti <andrea.mazzotti@suse.com>
Thanks @anmazzotti.
With #2176 all Turtles clusters will have a corresponding Fleet cluster. Does this mean that the opt-in can be removed and this check can be applied generally?
Changes look good to me and this is not a blocker.
No. This is testing CAAPF integration mostly.
salasberryfin left a comment:
Thanks @anmazzotti.
In this case I think it would be more accurate to move (only) the Fleet cluster check out of the CAAPF-specific validation, as this behavior will no longer be exclusive to CAAPF. This will also help when non-CAAPF tests are added in the coming weeks/months.
What do you think?
I don't know whether this PR is compatible with #2176.
I think this is a valid check. CAAPF is not going away anytime soon, and it's good that we can improve test coverage and reduce flakiness. However, while CAAPF co-exists with stand-alone Turtles, we'll have to progressively add tests that do not rely on the add-on provider. The simplest approach is probably a separate management cluster that re-uses the existing import GitOps setup; in those cases the creation of the Fleet cluster is still a valid verification. My suggestion is that we re-structure the checks to make them future-proof for tests that do not use CAAPF but will co-exist with it.
What this PR does / why we need it:
This PR adds validation for the Fleet integration.
This is a 1:1 port of CAAPF's `_test-import-all` recipe: https://github.com/rancher/cluster-api-addon-provider-fleet/blob/release-0.14/justfile#L229. The reason this is an opt-in feature is that it is not relevant for all users, as in the integration-suite-example.
Test run: https://github.com/rancher/turtles/actions/runs/22659509280/job/65676505647
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #1529
Special notes for your reviewer:
Checklist: