Bump gha resources to avoid over-committing splice cluster#4170

Open
julientinguely-da wants to merge 3 commits intomainfrom
julien/7390-bump-gha-resources

Conversation

@julientinguely-da
Contributor

fixes https://github.com/DACH-NY/cn-test-failures/issues/7390

see https://grafana.splice.network.canton.global/d/ae9lqwimiigw0d/resource-utilization?orgId=1&from=now-12h&to=now&timezone=UTC&var-test_suite=$__all

Pull Request Checklist

Cluster Testing

  • If a cluster test is required, comment /cluster_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.
  • If a hard-migration test is required (from the latest release), comment /hdm_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.

PR Guidelines

  • Include any change that might be observable by our partners or affect their deployment in the release notes.
  • Specify fixed issues with Fixes #n, and mention issues worked on using #n
  • Include a screenshot for frontend-related PRs - see README or use your favorite screenshot tool

Merge Guidelines

  • Make the git commit message look sensible when squash-merging on GitHub (most likely: just copy your PR description).


Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>
```diff
 requests: {
   cpu: '4',
-  memory: '10Gi',
+  memory: '8Gi',
```
Contributor

can we make this stuff configurable through config instead of hardcoding it? That way changing it becomes much easier than having to jump between splice and internal
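For reference, a configurable layout could look something like this (a hypothetical sketch only; the key names `runnerSpecs` and `resources` are illustrative and not the actual splice config schema, though the values match the ones discussed in this PR):

```yaml
# Hypothetical config fragment: runner resources become data that
# internal can override, instead of values hardcoded in splice.
runnerSpecs:
  default:
    nodeType: c4-standard-16
    resources:
      requests:
        cpu: '4'
        memory: '10Gi'
      limits:
        cpu: '4'
        memory: '10Gi'
```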

Contributor Author

good idea

Contributor Author

Now we have to be careful and first define the runnerSpecs in internal before merging this PR, right?

Contributor Author


Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>
Contributor

@moritzkiefer-da left a comment

The configurability makes sense. For actually changing resources, I suggest reading the discussion in #4047 and checking with Itai and Nicu what makes sense. I haven't followed this closely enough, and some of what you seem to do here goes against what Nicu suggested there, which I believe was to reduce some of the requests, while you are increasing them. Maybe for now just merge only the configurability but keep the values as before.

```yaml
nodeType: c4-standard-16
minNodes: 0
maxNodes: 1
gha:
```
Contributor

why not put this in the default config.yaml? Then we can override it in internal, but it still works if we don't override
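As a sketch of the default-plus-override idea (hypothetical layout; only the `gha`, `nodeType`, `minNodes`, and `maxNodes` keys appear in this PR):

```yaml
# Hypothetical default config.yaml: these values apply whenever the
# internal repo does not override them. An internal overlay declaring
# the same keys would take precedence under a simple deep-merge scheme,
# so nothing breaks if internal stays silent.
gha:
  nodeType: c4-standard-16
  minNodes: 0
  maxNodes: 1
```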

Contributor Author

because it seems kind of weird to have half of the config in internal/main and the other half in the default. I usually just paste it in mock to have a reference, but here it doesn't update the expected mock output

Contributor Author

@julientinguely-da Feb 27, 2026

Oh thanks for the refs

> Maybe for now just merge only the configurability but keep the values as before.

I'll just wait until Monday to see what @nicu says on that. To me, the Grafana dashboard shows that the resources are tight

Contributor Author

@julientinguely-da Feb 27, 2026

ok yes, we could indeed lower the requests a bit more while keeping the limits 👍

Contributor

> because it seems kind of weird to have half of the config in internal/main and the other half in the default. I usually just paste it in mock to have a reference, but here it doesn't update the expected mock output

fair

Contributor

@julientinguely-da also check the discussion in #4047 around changing any resource requests. I think it makes sense but maybe check with @isegall-da as well

Contributor

I'm a bit worried that requests != limits means we are not dedicating resources, and things might start failing in non-deterministic and hard-to-investigate ways
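For context, the concern maps to Kubernetes QoS classes: when every container's requests equal its limits, the pod gets the "Guaranteed" QoS class, so the scheduler reserves exactly what the container may use and it is the last to be evicted under pressure. A minimal sketch with the values discussed in this PR:

```yaml
# Sketch: requests == limits for all resources gives the pod the
# Kubernetes "Guaranteed" QoS class. With requests < limits the pod is
# only "Burstable", so the node can be overcommitted and workloads may
# fail non-deterministically under contention.
resources:
  requests:
    cpu: '4'
    memory: '10Gi'
  limits:
    cpu: '4'
    memory: '10Gi'
```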

Contributor Author

@julientinguely-da Mar 2, 2026

I've made the changes to make them equal and bump their value: https://github.com/DACH-NY/canton-network-internal/pull/3879

@julientinguely-da julientinguely-da requested review from nicu-da and removed request for OriolMunoz-da February 27, 2026 13:49

Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>