Feat: add AgentRuntime CRD types and documentation #212

Merged
cwiklik merged 2 commits into kagenti:main from varshaprasad96:feat-agent-runtime
Mar 11, 2026

@varshaprasad96
Contributor

Summary

Introduce the Agent Runtime custom resource for configuring agent parameters including identity (SPIFFE, IdP client registration) and observability (OTEL traces) on agent and tool workloads via targetRef-based binding.

  • Add AgentRuntimeSpec with type (agent|tool), targetRef, identity, and trace fields
  • Add AgentRuntimeStatus with phase, configuredPods, and conditions
  • Generate CRD manifest and deepcopy functions
  • Enable allowDangerousTypes for float64 sampling rate
  • Document AgentRuntime in api-reference.md and architecture.md
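
As a rough illustration of the shape described by the bullets above, the spec and status could look something like the following. This is only a sketch, not the types from the PR: the `TargetRef` fields, the opaque maps used for `identity` and `trace`, and the `ValidateSpec` helper are all assumptions made for the example, and conditions are left out for brevity.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TargetRef is a simplified stand-in for the TargetRef type reused from
// AgentCard. The three fields here are assumptions for illustration.
type TargetRef struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
}

// AgentRuntimeSpec carries the four spec fields named in the summary.
// Identity and Trace are opaque maps because their exact shape is not
// spelled out in the summary.
type AgentRuntimeSpec struct {
	Type      string         `json:"type"` // "agent" or "tool"
	TargetRef TargetRef      `json:"targetRef"`
	Identity  map[string]any `json:"identity,omitempty"`
	Trace     map[string]any `json:"trace,omitempty"`
}

// AgentRuntimeStatus carries two of the status fields named in the
// summary; conditions are omitted to keep the sketch short.
type AgentRuntimeStatus struct {
	Phase          string `json:"phase,omitempty"`
	ConfiguredPods int32  `json:"configuredPods,omitempty"`
}

// ValidateSpec is a hypothetical helper that mirrors the agent|tool
// restriction and requires a named target.
func ValidateSpec(s AgentRuntimeSpec) error {
	if s.Type != "agent" && s.Type != "tool" {
		return fmt.Errorf("type must be \"agent\" or \"tool\", got %q", s.Type)
	}
	if s.TargetRef.Name == "" {
		return errors.New("targetRef.name must be set")
	}
	return nil
}

func main() {
	spec := AgentRuntimeSpec{
		Type:      "agent",
		TargetRef: TargetRef{APIVersion: "apps/v1", Kind: "Deployment", Name: "weather-agent"},
	}
	if err := ValidateSpec(spec); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	out, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(out))
}
```

In the real CRD the agent|tool restriction would more likely be expressed as a kubebuilder enum marker than as Go code; the helper only makes the constraint visible in a runnable form.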

Next step: Configure the controller.

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>

Related issue(s)

Related: kagenti/kagenti#862

(Optional) Testing Instructions

Fixes #

varshaprasad96 requested a review from a team as a code owner March 10, 2026 23:47
.PHONY: manifests
manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
-	$(CONTROLLER_GEN) rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases
+	$(CONTROLLER_GEN) rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases
Contributor Author

varshaprasad96 commented Mar 10, 2026

By default, controller-gen rejects float32 and float64 types in CRD schemas because floating-point numbers can lose precision when serialized to and from JSON. Enabling allowDangerousTypes was intentional, because float64 seemed a reasonable choice for the sampling rate. The alternatives are resource.Quantity (a string-based decimal) or an integer percentage. Open to any specific suggestions.

Contributor

K8s best practice here seems to be representing floats as strings when possible. resource.Quantity seems OK to me.
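
To make the two float-free alternatives from this thread concrete, here is a minimal sketch using only the Go standard library. It does not use resource.Quantity itself; `ParseSamplingRate` and `PercentToRate` are hypothetical helper names, and the [0, 1] range is an assumption about what a sampling rate should allow.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// ParseSamplingRate converts a string-encoded rate such as "0.25" into a
// float64 and checks that it lies in [0, 1]. Storing the value as a string
// in the API avoids a float field in the CRD schema.
func ParseSamplingRate(s string) (float64, error) {
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid sampling rate %q: %w", s, err)
	}
	// ParseFloat accepts "NaN" and "Inf", so reject them explicitly.
	if math.IsNaN(v) || v < 0 || v > 1 {
		return 0, fmt.Errorf("sampling rate %q is outside [0, 1]", s)
	}
	return v, nil
}

// PercentToRate converts an integer percentage (0 to 100) into a fraction.
// An integer field is the other float-free option raised in the thread.
func PercentToRate(p int32) (float64, error) {
	if p < 0 || p > 100 {
		return 0, fmt.Errorf("percentage %d is outside 0..100", p)
	}
	return float64(p) / 100, nil
}

func main() {
	for _, in := range []string{"0.25", "1.5", "abc"} {
		v, err := ParseSamplingRate(in)
		fmt.Println(in, v, err)
	}
	v, err := PercentToRate(25)
	fmt.Println(v, err)
}
```

Either form keeps the stored value exact; the conversion to float64 happens only at the point where the tracer is configured.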

@varshaprasad96
Contributor Author

varshaprasad96 commented Mar 10, 2026

cc: @cwiklik ptal

pdettori added the safe-to-test label (Maintainer reviewed - safe to run E2E tests) Mar 11, 2026
Contributor

pdettori left a comment

Review Summary

Clean introduction of the AgentRuntime CRD for per-workload identity (SPIFFE, IdP client registration) and observability (OTEL traces). Good API design that reuses the existing TargetRef type from AgentCard. Documentation is thorough with YAML examples and kubectl usage.

Areas reviewed: Go API types, CRD YAML, Deep Copy, Makefile, go.mod, Docs
Commits: 1 commit, signed-off ✓
CI status: Build/E2E/Lint/Unit pending, DCO passed

One item to address in the architecture diagram (see inline comment). Rest are minor.

🤖 Reviewed with Claude Code

// If empty, the operator flag value is used.
// +optional
// +kubebuilder:validation:Pattern=`^[a-zA-Z0-9]([a-zA-Z0-9\-\.]*[a-zA-Z0-9])?$`
TrustDomain string `json:"trustDomain,omitempty"`

Is this something that will vary per-workload? (Rather than an aspect of how SPIRE is configured for the cluster as a whole)

Contributor Author

This follows the same pattern already established in AgentCard's identityBinding.trustDomain (TrustDomain string `json:"trustDomain,omitempty"`). One scenario could be federated SPIRE deployments where workloads in the same cluster may belong to different trust domains. I'm not sure if this is a use case we are addressing in Kagenti. Open to removing it if that makes more sense.

Contributor

usize commented Mar 11, 2026

Right. In the AgentCard, my thinking was to support federation scenarios: for example, someone represents an agent from outside the cluster and wants to verify its signature. This came up during discussions where folks suggested representing out-of-cluster agents in the fashion of a MultiClusterService.
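
The behavior discussed in this thread (a per-workload trust domain that falls back to the operator flag when empty) can be sketched as follows. The regular expression is the validation pattern quoted above; `EffectiveTrustDomain` and its parameter names are hypothetical, and the real controller may resolve the value differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// trustDomainPattern is the kubebuilder validation pattern quoted in the
// review thread, compiled with Go's regexp package.
var trustDomainPattern = regexp.MustCompile(`^[a-zA-Z0-9]([a-zA-Z0-9\-\.]*[a-zA-Z0-9])?$`)

// EffectiveTrustDomain returns the per-workload trust domain when one is
// set and valid, and otherwise falls back to the operator-wide default.
func EffectiveTrustDomain(specValue, operatorDefault string) (string, error) {
	if specValue == "" {
		// Empty means "use the operator flag value".
		return operatorDefault, nil
	}
	if !trustDomainPattern.MatchString(specValue) {
		return "", fmt.Errorf("trust domain %q does not match the required pattern", specValue)
	}
	return specValue, nil
}

func main() {
	for _, in := range []string{"", "partner.example.org", "-bad"} {
		td, err := EffectiveTrustDomain(in, "cluster.local")
		fmt.Printf("spec=%q effective=%q err=%v\n", in, td, err)
	}
}
```

In a federated setup, only the workloads that belong to a different trust domain would set the field; everything else would inherit the cluster-wide value.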

}

// ClientRegistration configures IdP client registration for an AgentRuntime.
type ClientRegistration struct {

My understanding was that client registration was a temporary workaround until Keycloak supported authentication using kube and SPIFFE credentials (which it supposedly does as of 26.4).

Contributor Author

Removed ClientRegistration from IdentitySpec - identity now only covers SPIFFE configuration.

type ClientRegistration struct {
// Provider is the IdP provider type (e.g., "keycloak")
// +kubebuilder:validation:MinLength=1
Provider string `json:"provider"`

Would there not need to be a url at which the IdP could be found?

Contributor Author

IIUC client registration is currently handled externally via Helm jobs in the kagenti repo, removing it in favor of Keycloak.

r3v5 left a comment

Great work, @varshaprasad96! One suggestion, can we add some unit tests for new structures?

Collaborator

cwiklik left a comment

Clean introduction of the AgentRuntime CRD with well-structured API types, proper code generation, and thorough documentation. Good reuse of existing TargetRef from AgentCard and standard K8s patterns (metav1.Condition, status subresource, printer columns). All CI passes.

Adding supplementary notes alongside existing reviews from @pdettori, @grs, and @r3v5. I agree that @grs's questions about whether trust domain and client registration belong as per-workload config are important architectural decisions to resolve.

Also: thorough API reference docs with YAML examples and kubectl usage — well done.

Inline notes:

  • agentruntime_types.go:188: Nit — K8s convention is to not use omitempty on Spec since it should always be present. Status with omitempty is fine.
  • agentruntime_types.go:134: Security note for the controller implementation: corev1.SecretReference includes a namespace field, allowing cross-namespace secret references. When the controller is implemented, ensure RBAC and/or webhook validation prevent workloads from referencing secrets in namespaces they don't own. Aligns with @grs's broader questions about the identity section.

Areas reviewed: Go API types, CRD YAML, DeepCopy, Docs, Security
Commits: 1 commit, signed-off: yes
CI status: All 5 checks passing
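
The cross-namespace concern in the security note above could be handled by a check along these lines. This is a sketch under assumptions: `SecretReference` is a local copy of the two fields of corev1.SecretReference so the example needs no Kubernetes imports, `ValidateSecretRef` is a hypothetical helper, and treating an empty namespace as "same namespace" is a design choice made here, not something stated in the review.

```go
package main

import "fmt"

// SecretReference mirrors the Name and Namespace fields of
// corev1.SecretReference so that the sketch stays self-contained.
type SecretReference struct {
	Name      string
	Namespace string
}

// ValidateSecretRef rejects references that point outside the namespace
// of the owning AgentRuntime. An empty Namespace is treated as the
// owner's namespace (an assumption of this sketch).
func ValidateSecretRef(ownerNamespace string, ref SecretReference) error {
	if ref.Name == "" {
		return fmt.Errorf("secret reference must have a name")
	}
	if ref.Namespace != "" && ref.Namespace != ownerNamespace {
		return fmt.Errorf("secret %s/%s is outside namespace %q",
			ref.Namespace, ref.Name, ownerNamespace)
	}
	return nil
}

func main() {
	refs := []SecretReference{
		{Name: "idp-credentials"},
		{Name: "idp-credentials", Namespace: "team-a"},
		{Name: "idp-credentials", Namespace: "kube-system"},
	}
	for _, r := range refs {
		fmt.Printf("%+v -> %v\n", r, ValidateSecretRef("team-a", r))
	}
}
```

A check like this could live in a validating webhook or at the start of reconciliation; RBAC restrictions on the controller's service account would still be needed as a second layer.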

Introduce the Agent Runtime custom resource for configuring agent
parameters including identity (SPIFFE, IdP client registration) and
observability (OTEL traces) on agent and tool workloads via
targetRef-based binding.

- Add AgentRuntimeSpec with type (agent|tool), targetRef, identity, and
  trace fields
- Add AgentRuntimeStatus with phase, configuredPods, and conditions
- Generate CRD manifest and deepcopy functions
- Enable allowDangerousTypes for float64 sampling rate
- Document AgentRuntime in api-reference.md and architecture.md

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
- Remove ClientRegistration from IdentitySpec — Keycloak 26.4 federated
  client auth makes operator-managed registration unnecessary, and it is
  currently handled externally via Helm jobs
- Remove omitempty from Spec json tag per K8s convention
- Remove misleading webhook validation edge for AgentRuntime in
  architecture diagram (no webhook implemented yet)
- Revert prometheus/client_model back to indirect dependency

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
@varshaprasad96
Contributor Author

> One suggestion, can we add some unit tests for new structures?

@r3v5 This PR currently defines only the Go structs for the API, so there is no logic to test yet. A follow-up PR will implement the controller, and it will include unit tests covering the logic that performs CRUD operations on these API objects.

usize self-requested a review March 11, 2026 22:48
@varshaprasad96
Contributor Author

@grs @cwiklik @r3v5 - addressed review comments, ptal.

Contributor

usize left a comment

I added some context. This looks like a great start. I appreciate the scaffolding. I'm going to wait for @cwiklik to approve since he has touched more of the operator than I have.

Collaborator

cwiklik left a comment

LGTM. Let's start with this and modify as needed. Thank you.

cwiklik merged commit 8ad9f60 into kagenti:main Mar 11, 2026
5 checks passed

6 participants