Upgrade Karpenter #304

@seanmorton

Context

  • Karpenter is our node autoscaler. For VMs, virt-handler is a DaemonSet that starts only after a node is provisioned and then mutates the node with labels and resource capacities. Karpenter can't know these upfront, which makes VMs unschedulable
  • For 2 years we have maintained a patch that ignores static resource/capacity requirements and regex-based node selectors. We don't want to keep maintaining it
  • Karpenter just released a NodeOverlay CRD that lets us declare static custom resource/capacity requirements upfront for the nodes we want eligible, selected through well-known labels. That solves half of the problem

Goals

Upgrade Karpenter from our "old" patched v1.3.3x to v1.7.x (latest), replacing:

  • IGNORED_RESOURCE_REQUESTS with NodeOverlay, now that it's built in
  • IGNORED_NODE_SELECTOR_REQUIREMENTS with an additive way of handling node selectors that have dynamic label keys (e.g., scheduling.node.kubevirt.io/tsc-frequency-2999998000=true), and initiate buy-in with the Karpenter team

Patch we applied to Karpenter

Diff is available here

Forks are available on the following repos:

How we configured the custom patch

env:
  - name: IGNORED_RESOURCE_REQUESTS
    value: "devices.kubevirt.io/kvm,devices.kubevirt.io/tun,devices.kubevirt.io/vhost-net"
  - name: IGNORED_NODE_SELECTOR_REQUIREMENTS
    value: "scheduling.node.kubevirt.io/tsc-frequency-*"

(source)
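
The effect of these two variables can be sketched roughly as follows. This is only an illustration in Python, not the patch itself (see the diff for the real change). The function names `strip_ignored_requests` and `strip_ignored_selectors` are hypothetical, the sample pod values are made up, and the sketch assumes the selector setting is applied as an unanchored regular expression, as the "regex-based" wording above suggests.

```python
import re

# Values copied from the env configuration above.
IGNORED_RESOURCE_REQUESTS = (
    "devices.kubevirt.io/kvm,devices.kubevirt.io/tun,devices.kubevirt.io/vhost-net"
)
IGNORED_NODE_SELECTOR_REQUIREMENTS = "scheduling.node.kubevirt.io/tsc-frequency-*"


def parse_list(value):
    """Split a comma-separated env value into its non-empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def strip_ignored_requests(requests, ignored):
    """Drop resource requests whose name appears in the ignore list."""
    ignored_names = set(parse_list(ignored))
    return {name: qty for name, qty in requests.items() if name not in ignored_names}


def strip_ignored_selectors(node_selector, ignored):
    """Drop node selector keys matching any ignored pattern.

    Assumption: patterns are unanchored regular expressions.
    """
    patterns = [re.compile(p) for p in parse_list(ignored)]
    return {
        key: value
        for key, value in node_selector.items()
        if not any(p.search(key) for p in patterns)
    }


# Made-up pod data, for illustration only.
pod_requests = {"cpu": "2", "memory": "4Gi", "devices.kubevirt.io/kvm": "1"}
pod_selector = {
    "kubernetes.io/arch": "amd64",
    "scheduling.node.kubevirt.io/tsc-frequency-2999998000": "true",
}

visible_requests = strip_ignored_requests(pod_requests, IGNORED_RESOURCE_REQUESTS)
visible_selector = strip_ignored_selectors(pod_selector, IGNORED_NODE_SELECTOR_REQUIREMENTS)
```

With the ignored entries removed, only the requests and selectors that can be evaluated before virt-handler runs are left for the scheduling simulation.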

We then reference this configured Karpenter fork in the clusters where we need it, as in:

What's new with Karpenter 1.7.x

The NodeOverlay below tells Karpenter that these custom capacities are available on any eligible node returned by the AWS API. In this case, we make every metal instance size eligible.

apiVersion: karpenter.sh/v1alpha1
kind: NodeOverlay
metadata:
  name: kubevirt
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-size
    operator: In
    values: ["metal"]
  capacity:
    devices.kubevirt.io/kvm: 1k
    devices.kubevirt.io/tun: 1k
    devices.kubevirt.io/vhost-net: 1k
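
The matching semantics of this overlay can be sketched in a few lines of Python. This is a rough model under stated assumptions, not Karpenter's implementation. The helpers `parse_quantity`, `overlay_matches` and `apply_overlay` are hypothetical names, only the `In` operator is handled, and `1k` is read as the Kubernetes decimal quantity 1000. The `cpu` figures are placeholders.

```python
# Decimal SI suffixes used by Kubernetes quantities (a subset, enough for this sketch).
SI_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9}


def parse_quantity(value):
    """Parse an integer with an optional decimal SI suffix, e.g. "1k"."""
    text = str(value)
    if text and text[-1] in SI_SUFFIXES:
        return int(text[:-1]) * SI_SUFFIXES[text[-1]]
    return int(text)


def overlay_matches(overlay, instance_labels):
    """Return True when every `In` requirement is satisfied by the labels."""
    for req in overlay["requirements"]:
        if req["operator"] != "In":
            raise ValueError("this sketch only handles the In operator")
        if instance_labels.get(req["key"]) not in req["values"]:
            return False
    return True


def apply_overlay(overlay, instance_labels, capacity):
    """Return a copy of capacity, extended by the overlay when it matches."""
    merged = dict(capacity)
    if overlay_matches(overlay, instance_labels):
        for name, qty in overlay["capacity"].items():
            merged[name] = parse_quantity(qty)
    return merged


# Same content as the NodeOverlay spec above, as a plain dict.
kubevirt_overlay = {
    "requirements": [
        {"key": "karpenter.k8s.aws/instance-size", "operator": "In", "values": ["metal"]}
    ],
    "capacity": {
        "devices.kubevirt.io/kvm": "1k",
        "devices.kubevirt.io/tun": "1k",
        "devices.kubevirt.io/vhost-net": "1k",
    },
}

metal = apply_overlay(
    kubevirt_overlay, {"karpenter.k8s.aws/instance-size": "metal"}, {"cpu": 96}
)
large = apply_overlay(
    kubevirt_overlay, {"karpenter.k8s.aws/instance-size": "large"}, {"cpu": 2}
)
```

In this model only the metal instance ends up advertising the three KubeVirt device resources; the non-metal one keeps its original capacity.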

How to reproduce the scheduling.node.kubevirt.io/tsc-frequency-* issue

To interact with VMs, install virtctl

  1. Start a VM. There is no node selector constraint on the first boot; at that point the libvirt XML is initialized and hardcodes the value read from the node
  2. Stop your VM: kubectl virt stop $vm_name
  3. Start your VM again: kubectl virt start $vm_name

(I don't remember if this tsc-frequency node selector constraint appears only on Windows VMs)
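
After step 3, one way to confirm that the constraint is present is to look at the pod's `spec.nodeSelector`. The sketch below assumes pod JSON shaped like the output of `kubectl get pod <name> -o json`. The helper `tsc_frequency_selectors` is a hypothetical name, and the sample pod is illustrative input, not captured output.

```python
import json
import re

# Key prefix taken from this issue; the numeric suffix is the dynamic part.
TSC_KEY = re.compile(r"^scheduling\.node\.kubevirt\.io/tsc-frequency-(\d+)$")


def tsc_frequency_selectors(pod):
    """Return {label_key: frequency} for each tsc-frequency key in spec.nodeSelector."""
    selector = pod.get("spec", {}).get("nodeSelector") or {}
    found = {}
    for key in selector:
        match = TSC_KEY.match(key)
        if match:
            found[key] = int(match.group(1))
    return found


# Illustrative input only, shaped like `kubectl get pod <name> -o json`.
sample_pod = json.loads(
    """
    {
      "spec": {
        "nodeSelector": {
          "kubernetes.io/arch": "amd64",
          "scheduling.node.kubevirt.io/tsc-frequency-2999998000": "true"
        }
      }
    }
    """
)

pinned = tsc_frequency_selectors(sample_pod)
```

A non-empty result means the pod carries a dynamic label key that Karpenter cannot predict before the node exists.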
