OCM-20625 | feat: Managed Policy additions for Karpenter on ROSA HCP #2581

Draft
robpblake wants to merge 1 commit into openshift:master from robpblake:ocm-20625-karpenter-managed-policy

@robpblake robpblake commented Nov 11, 2025

What type of PR is this?

Feature

What this PR does / why we need it?

This PR adds the following:

  • A new Managed Policy for the Karpenter Controller on ROSA HCP
  • Additions to the Control Plane Operator managed policy to allow for tagging of SecurityGroups as a day-2 operation
  • Additions to the installer role managed policy to allow for validation of user provided SQS queue URLs when configuring Karpenter Spot instance interruptions

Which Jira/Github issue(s) this PR fixes?

Fixes #

Special notes for your reviewer:

Pre-checks (if applicable):

  • Tested latest changes against a cluster

  • Included documentation changes with PR

  • If this is a new object that is not intended for the FedRAMP environment (if unsure, please reach out to team FedRAMP), please exclude it with:

    matchExpressions:
    - key: api.openshift.com/fedramp
      operator: NotIn
      values: ["true"]

@robpblake robpblake marked this pull request as draft November 11, 2025 15:48
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 11, 2025
openshift-ci bot commented Nov 11, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: robpblake
Once this PR has been reviewed and has the lgtm label, please assign iamkirkbater for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

}
}
},
{

Author

Required because the Control Plane Operator adds the karpenter.sh/discovery tags to the SecurityGroup of the cluster when AutoNode is enabled as a day-2 operation on a cluster.
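
For illustration only, a tagging statement of this kind might look like the sketch below. It is not the statement from this PR: the Sid is hypothetical, and both the security-group ARN pattern and the reuse of the red-hat-managed resource tag as a guard are assumptions. The aws:TagKeys condition limits the tag keys that may be written to karpenter.sh/discovery.

```json
{
  "Sid": "TagSecurityGroupForKarpenterDiscovery",
  "Effect": "Allow",
  "Action": "ec2:CreateTags",
  "Resource": "arn:aws:ec2:*:*:security-group/*",
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/red-hat-managed": "true"
    },
    "ForAllValues:StringEquals": {
      "aws:TagKeys": ["karpenter.sh/discovery"]
    }
  }
}
```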

"aws:ResourceTag/red-hat-managed": "true"
}
}
},

Author

This will allow Cluster Service to validate that the user provided SQS queue for spot interruption handling exists in the account, preventing basic misconfiguration errors.
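
As a rough sketch of what such a validation permission could look like: the thread does not show which SQS action the PR actually grants, so the choice of sqs:GetQueueAttributes, the Sid, and the unscoped queue ARN below are all assumptions.

```json
{
  "Sid": "ValidateSpotInterruptionQueue",
  "Effect": "Allow",
  "Action": "sqs:GetQueueAttributes",
  "Resource": "arn:aws:sqs:*:*:*"
}
```

A read-only call like this is enough to confirm that the queue exists and is visible to the role, without granting any ability to send, receive, or delete messages.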

}
},
{
"Sid": "PassInstanceRole",

Contributor

This permission has rather large implications. Is it a requirement?

Contributor

Yes, iam:PassRole is required for Karpenter to launch EC2 instances with an IAM instance profile. This is documented in Karpenter's prerequisites.
However, we should constrain this following the pattern in ROSAInstallerPolicy:

  "Condition": {
    "StringEquals": {
      "iam:PassedToService": ["ec2.amazonaws.com"]
    }
  }

We can also scope the Resource ARN to Karpenter-specific roles (e.g., arn:aws:iam::*:role/*karpenter*).
Reference: ROSAInstallerPolicy uses this exact pattern for PassRole to EC2.
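
Putting the two suggestions together, the constrained statement might look roughly like this. The Sid comes from the diff under review; the role-name pattern is only the example given above and would need to match whatever naming convention the Karpenter node roles actually use.

```json
{
  "Sid": "PassInstanceRole",
  "Effect": "Allow",
  "Action": "iam:PassRole",
  "Resource": "arn:aws:iam::*:role/*karpenter*",
  "Condition": {
    "StringEquals": {
      "iam:PassedToService": ["ec2.amazonaws.com"]
    }
  }
}
```

With both constraints in place, the role can only be handed to EC2, and only roles matching the name pattern can be passed at all.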

Comment on lines +8 to +18
"ec2:DescribeAvailabilityZones",
"ec2:DescribeImages",
"ec2:DescribeInstances",
"ec2:DescribeInstanceTypeOfferings",
"ec2:DescribeInstanceTypes",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSnapshots",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs"

Contributor

Are we sure all of these cannot be conditioned by tag?

Contributor

Unfortunately, most EC2 Describe* actions do not support resource-level permissions or tag-based conditions. This is an AWS API limitation, not a design choice.

Per the AWS Service Authorization Reference for EC2:

  • ec2:DescribeAvailabilityZones - Resource type: None (no resource-level permissions)
  • ec2:DescribeImages - Resource type: None
  • ec2:DescribeInstances - Resource type: None (filtering happens at API response level, not IAM)
  • ec2:DescribeInstanceTypes - Resource type: None
  • ec2:DescribeSubnets - Resource type: None
  • ec2:DescribeVpcs - Resource type: None

These are read-only discovery actions required for Karpenter to find available capacity. The ROSAInstallerPolicy similarly uses Resource: "*" for EC2 read permissions.

Reference: AWS EC2 Actions, Resources, and Condition Keys
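
For context, a read-only statement along these lines would carry the eleven actions from the commented range with an unscoped resource. This is a sketch, not the PR's exact statement: the Sid is hypothetical and the surrounding fields are assumed.

```json
{
  "Sid": "KarpenterEC2ReadOnlyDiscovery",
  "Effect": "Allow",
  "Action": [
    "ec2:DescribeAvailabilityZones",
    "ec2:DescribeImages",
    "ec2:DescribeInstances",
    "ec2:DescribeInstanceTypeOfferings",
    "ec2:DescribeInstanceTypes",
    "ec2:DescribeLaunchTemplates",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeSnapshots",
    "ec2:DescribeSpotPriceHistory",
    "ec2:DescribeSubnets",
    "ec2:DescribeVpcs"
  ],
  "Resource": "*"
}
```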

Comment on lines +51 to +64
"Sid": "KMSGrantPermissions",
"Effect": "Allow",
"Action": [
"kms:CreateGrant",
"kms:ListGrants",
"kms:RevokeGrant"
],
"Resource": "*",
"Condition": {
"Bool": {
"kms:GrantIsForAWSResource": "true"
}
}
},

Contributor

This is another permission that will go through some scrutiny. We have this in our other managed policies IIRC?

Contributor

kms:CreateGrant permissions exist in multiple ROSA managed policies. For reference:

HCP CAPA Controller (openshift_hcp_capa_controller_manager_credentials_policy.json, lines 239-257):

{
  "Sid": "CreateGrantRestricted",
  "Effect": "Allow",
  "Action": ["kms:CreateGrant"],
  "Resource": "*",
  "Condition": {
    "Bool": { "kms:GrantIsForAWSResource": true },
    "StringEquals": { "aws:ResourceTag/red-hat": "true" },
    "StringLike": { "kms:ViaService": "ec2.*.amazonaws.com" }
  }
}

Machine API (openshift_machine_api_aws_cloud_credentials_policy.json):

Uses kms:GrantIsForAWSResource: true condition.

For Karpenter, we can follow the more restrictive HCP CAPA pattern with all three conditions.

For further reference: AWS KMS Condition Keys
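
Applied to the Karpenter statement under review, the more restrictive pattern could look like the sketch below. It keeps the Sid and the three actions from the diff and adds the two extra conditions from the HCP CAPA example. Whether the red-hat tag is present on the KMS keys Karpenter needs to use is an assumption that would have to be confirmed.

```json
{
  "Sid": "KMSGrantPermissions",
  "Effect": "Allow",
  "Action": [
    "kms:CreateGrant",
    "kms:ListGrants",
    "kms:RevokeGrant"
  ],
  "Resource": "*",
  "Condition": {
    "Bool": { "kms:GrantIsForAWSResource": "true" },
    "StringEquals": { "aws:ResourceTag/red-hat": "true" },
    "StringLike": { "kms:ViaService": "ec2.*.amazonaws.com" }
  }
}
```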

"iam:GetInstanceProfile",
"iam:ListInstanceProfiles"
],
"Resource": "*"

Contributor

Conditions?

Contributor

iam:GetInstanceProfile and iam:ListInstanceProfiles do not support tag-based conditions per the AWS IAM Service Authorization Reference.

However, we can scope the Resource ARN for GetInstanceProfile:

"Resource": "arn:aws:iam::*:instance-profile/*karpenter*"

For ListInstanceProfiles, AWS requires Resource: "*" as it's a list operation that enumerates across all profiles. This matches the pattern in ROSAInstallerPolicy which uses Resource: "*" for IAM read operations.
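
One way to express that split is as two statements, shown here as a fragment of a policy's Statement array. Both Sids are hypothetical, and the instance-profile name pattern is the example from above rather than a confirmed naming convention.

```json
[
  {
    "Sid": "GetKarpenterInstanceProfile",
    "Effect": "Allow",
    "Action": "iam:GetInstanceProfile",
    "Resource": "arn:aws:iam::*:instance-profile/*karpenter*"
  },
  {
    "Sid": "ListInstanceProfiles",
    "Effect": "Allow",
    "Action": "iam:ListInstanceProfiles",
    "Resource": "*"
  }
]
```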

References:

@joshbranham joshbranham self-requested a review February 11, 2026 03:58