Bump gha resources to avoid over-committing splice cluster #4170
julientinguely-da wants to merge 3 commits into main from
Conversation
[static] Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>
cluster/pulumi/gha/src/runners.ts (outdated)
```diff
  requests: {
    cpu: '4',
-   memory: '10Gi',
+   memory: '8Gi',
```
can we make this stuff configurable through config instead of hardcoding it? That way changing it becomes much easier than having to jump between splice and internal
good idea
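A minimal sketch of what that configurability could look like (the names `RunnerSpec`, `resolveRunnerSpec`, and the config shape are hypothetical illustrations, not the actual splice/internal code):

```typescript
// Hypothetical sketch: runner resources come from config, with the current
// hardcoded values kept as defaults so nothing breaks when no override is set.
interface ResourceSpec {
  cpu: string;
  memory: string;
}

interface RunnerSpec {
  requests: ResourceSpec;
  limits: ResourceSpec;
}

// Allow overriding only the fields a cluster actually wants to change.
type DeepPartial<T> = { [K in keyof T]?: DeepPartial<T[K]> };

const defaultRunnerSpec: RunnerSpec = {
  requests: { cpu: '4', memory: '10Gi' },
  limits: { cpu: '4', memory: '10Gi' },
};

// Merge an optional override (e.g. parsed from internal's config.yaml)
// over the defaults, field by field.
function resolveRunnerSpec(override?: DeepPartial<RunnerSpec>): RunnerSpec {
  return {
    requests: { ...defaultRunnerSpec.requests, ...override?.requests },
    limits: { ...defaultRunnerSpec.limits, ...override?.limits },
  };
}
```

Changing a memory request then becomes a one-line config change in internal rather than a code change in splice.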
Now we have to be careful and first define the runnerSpecs in internal before merging this PR, right?
[static] Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>
moritzkiefer-da left a comment
The configurability makes sense. For actually changing resources, I suggest reading the discussion in #4047 and checking with Itai and Nicu what makes sense. I haven't followed this closely enough, and some of what you seem to do here goes against what Nicu, for example, suggested there, which was to reduce some of the requests, while you are increasing them, I believe. Maybe for now just merge only the configurability but keep the values as before.
```yaml
nodeType: c4-standard-16
minNodes: 0
maxNodes: 1
gha:
```
why not put this in the default config.yaml? Then we can overwrite in internal but it still works if we don't overwrite
because it seems kind of weird to have half of the config in internal/main and the other half in the default. I usually just paste it into mock to have a reference, but here it doesn't update the mock expected output
Oh thanks for the refs
> Maybe for now just merge only the configurability but keep the values as before.
I'll just wait until Monday to see what @nicu says on that. To me the Grafana shows that the resources are tight
ok yes, we could indeed lower the requests a bit more while keeping the limits 👍
> because it seems kind of weird to have half of the config in internal/main and the other half in the default. I usually just paste it in mock to have a reference, but here it doesn't update the mock expected
fair
@julientinguely-da also check the discussion in #4047 around changing any resource requests. I think it makes sense but maybe check with @isegall-da as well
I'm a bit worried that requests != limits means we are not dedicating resources and things might start failing in non-deterministic and hard to investigate ways
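For context, on Kubernetes this concern maps to the pod QoS class: a pod is only classified as Guaranteed (and thus last in line for eviction, with its resources effectively reserved) when every container sets requests equal to limits for both cpu and memory. A sketch of that shape (values illustrative, not the actual runner spec):

```yaml
# Illustrative only: requests == limits => Guaranteed QoS class,
# so the scheduler reserves exactly what the runner may use.
resources:
  requests:
    cpu: '4'
    memory: '10Gi'
  limits:
    cpu: '4'
    memory: '10Gi'
```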
I've made the changes to make them equal and bump their value: https://github.com/DACH-NY/canton-network-internal/pull/3879
[static] Signed-off-by: Julien Tinguely <julien.tinguely@digitalasset.com>
fixes https://github.com/DACH-NY/cn-test-failures/issues/7390
see https://grafana.splice.network.canton.global/d/ae9lqwimiigw0d/resource-utilization?orgId=1&from=now-12h&to=now&timezone=UTC&var-test_suite=$__all
Pull Request Checklist

Cluster Testing
- Comment `/cluster_test` on this PR to request it, and ping someone with access to the DA-internal system to approve it.
- Comment `/hdm_test` on this PR to request it, and ping someone with access to the DA-internal system to approve it.

PR Guidelines
- Mention issues worked on using `Fixes #n` or `#n`.

Merge Guidelines