Context
- Karpenter is our node autoscaler
- For VMs, virt-handler is a DaemonSet that starts after a node is provisioned and mutates the node with labels and resource capacities that Karpenter can't know upfront, which makes VMs unschedulable from Karpenter's point of view (see the sketch after this list)
- We maintained a patch for 2 years that ignores static resource/capacity requirements and regex-based node selectors; we don't want to keep maintaining it
- Karpenter just released a NodeOverlay CRD that lets us declare static custom resource/capacity requirements upfront, through well-known labels, for the nodes we want eligible; that solves half of the problem
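For illustration, here is a minimal sketch of the kind of virt-launcher pod KubeVirt creates for a VM. The extended resource names are the ones listed in our patch configuration below; the pod name, container name, and request values are illustrative assumptions, not taken from a real cluster. Without the patch or a NodeOverlay, Karpenter sees no instance type advertising these resources and won't provision a node for such a pod:

```yaml
# Illustrative virt-launcher pod excerpt (names and quantities are assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: virt-launcher-myvm-abcde   # hypothetical pod name
spec:
  containers:
    - name: compute                # KubeVirt's main VM container
      resources:
        requests:
          # Extended resources served by the KubeVirt device plugins; a node only
          # advertises these capacities after virt-handler has run on it.
          devices.kubevirt.io/kvm: "1"
          devices.kubevirt.io/tun: "1"
          devices.kubevirt.io/vhost-net: "1"
```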
Goals
Upgrade Karpenter from our "old" patched v1.3.3x to v1.7.x (latest), replacing:
- IGNORED_RESOURCE_REQUESTS with NodeOverlay, now that it's built in
- IGNORED_NODE_SELECTOR_REQUIREMENTS with an additive way of handling node selectors that have dynamic label keys (e.g., scheduling.node.kubevirt.io/tsc-frequency-2999998000=true), and initiate buy-in with the Karpenter team
Patch we applied to Karpenter
Diff is available here
Forks are available on the following repos:
How we configured the custom patch
```yaml
env:
  - name: IGNORED_RESOURCE_REQUESTS
    value: "devices.kubevirt.io/kvm,devices.kubevirt.io/tun,devices.kubevirt.io/vhost-net"
  - name: IGNORED_NODE_SELECTOR_REQUIREMENTS
    value: "scheduling.node.kubevirt.io/tsc-frequency-*"
```
(source)
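For completeness, a hedged sketch of how that env block could be wired in through Helm values, assuming the fork keeps the upstream chart's controller.env passthrough for extra environment variables (the field name is an assumption; the values are copied from the snippet above):

```yaml
# values.yaml for the forked Karpenter chart
controller:
  env:   # assumption: the fork keeps the upstream chart's controller.env field
    - name: IGNORED_RESOURCE_REQUESTS
      value: "devices.kubevirt.io/kvm,devices.kubevirt.io/tun,devices.kubevirt.io/vhost-net"
    - name: IGNORED_NODE_SELECTOR_REQUIREMENTS
      value: "scheduling.node.kubevirt.io/tsc-frequency-*"
```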
We then run this configured Karpenter fork in the clusters where we need it, as on:
What's new with Karpenter 1.7.x
This tells Karpenter that these custom capacities are available on any eligible node returned by the AWS API; in this case, we make any metal instance size eligible:
```yaml
apiVersion: karpenter.sh/v1alpha1
kind: NodeOverlay
metadata:
  name: kubevirt
spec:
  requirements:
    - key: karpenter.k8s.aws/instance-size
      operator: In
      values: ["metal"]
  capacity:
    devices.kubevirt.io/kvm: 1k
    devices.kubevirt.io/tun: 1k
    devices.kubevirt.io/vhost-net: 1k
```
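The overlay's requirement uses the same well-known label a NodePool would use, so the extra capacities apply to any node Karpenter could provision from a metal-restricted pool. A minimal sketch of such a pairing NodePool (the name and nodeClassRef are illustrative assumptions, not our actual configuration):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: kubevirt               # illustrative name
spec:
  template:
    spec:
      requirements:
        # Same well-known label the NodeOverlay above targets
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["metal"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default          # illustrative
```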
How to reproduce the scheduling.node.kubevirt.io/tsc-frequency-* issue
- To interact with VMs, install virtctl
- Start a VM (no node selector constraint yet, because it's the first boot of the VM; at that point the libvirt XML is initialized and hardcodes the value read from the node)
- Stop your VM: kubectl virt stop $vm_name
- Resume your VM: kubectl virt start $vm_name (the injected node selector can then be inspected as sketched below)
(I don't remember if this tsc-frequency node selector constraint appears only on Windows VMs)
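After the restart, the dynamic node selector shows up on the VM's virt-launcher pod. A hedged sketch of what it looks like (the exact frequency value and the kubevirt.io/schedulable entry are illustrative; the tsc-frequency key format matches the example in the goals above):

```yaml
# Excerpt of the restarted virt-launcher pod's spec (illustrative)
spec:
  nodeSelector:
    kubevirt.io/schedulable: "true"                               # standard KubeVirt selector
    scheduling.node.kubevirt.io/tsc-frequency-2999998000: "true"  # dynamic key Karpenter can't anticipate
```

Because the label key embeds the frequency value read from the original node, Karpenter has no static requirement to match it against, which is what IGNORED_NODE_SELECTOR_REQUIREMENTS papered over with a glob.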