Feat: add AgentRuntime CRD types and documentation#212
Conversation
| .PHONY: manifests | ||
| manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects. | ||
| $(CONTROLLER_GEN) rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases | ||
| $(CONTROLLER_GEN) rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases |
There was a problem hiding this comment.
Controller-gen rejects float32 and float64 types in CRD schemas because floating-point numbers can have precision issues when serialized to/from JSON. But this was intentional because float64 seemed a reasonable choice for sampling rate. The other option is to use resource.Quantity (string based decimal) or an integer percentage. Open to any specific suggestions.
There was a problem hiding this comment.
K8s best practice here seems to be representing floats as strings when possible. resource.Quantity seems OK to me.
|
cc: @cwiklik ptal |
pdettori
left a comment
There was a problem hiding this comment.
Review Summary
Clean introduction of the AgentRuntime CRD for per-workload identity (SPIFFE, IdP client registration) and observability (OTEL traces). Good API design that reuses the existing TargetRef type from AgentCard. Documentation is thorough with YAML examples and kubectl usage.
Areas reviewed: Go API types, CRD YAML, Deep Copy, Makefile, go.mod, Docs
Commits: 1 commit, signed-off ✓
CI status: Build/E2E/Lint/Unit pending, DCO passed
One item to address in the architecture diagram (see inline comment). Rest are minor.
🤖 Reviewed with Claude Code
| // If empty, the operator flag value is used. | ||
| // +optional | ||
| // +kubebuilder:validation:Pattern=`^[a-zA-Z0-9]([a-zA-Z0-9\-\.]*[a-zA-Z0-9])?$` | ||
| TrustDomain string `json:"trustDomain,omitempty"` |
There was a problem hiding this comment.
Is this something that will vary per-workload? (Rather than an aspect of how SPIRE is configured for the cluster as a whole)
There was a problem hiding this comment.
This follows the same pattern already established in AgentCard's identityBinding.trustDomain (
). One scenario could be federated SPIRE deployments where workloads in the same cluster may belong to different trust domains. I'm not sure if this is a use case we are addressing in Kagenti. Open to removing it if it makes more sense.There was a problem hiding this comment.
Right. In the AgentCard my thinking was to support federation scenarios like e.g., someone represents an agent from out of cluster and wants to verify its signature. This came up during discussions where folks pointed out the idea of representing out of cluster agents in the fashion of a MultiClusterService.
| } | ||
|
|
||
| // ClientRegistration configures IdP client registration for an AgentRuntime. | ||
| type ClientRegistration struct { |
There was a problem hiding this comment.
My understanding was that client registration was a temporary workaround until Keycloak supported authentication using kube and SPIFFE credentials (which it supposedly does as of 26.4).
There was a problem hiding this comment.
Removed ClientRegistration from IdentitySpec - identity now only covers SPIFFE configuration.
| type ClientRegistration struct { | ||
| // Provider is the IdP provider type (e.g., "keycloak") | ||
| // +kubebuilder:validation:MinLength=1 | ||
| Provider string `json:"provider"` |
There was a problem hiding this comment.
Would there not need to be a url at which the IdP could be found?
There was a problem hiding this comment.
IIUC client registration is currently handled externally via Helm jobs in the kagenti repo, removing it in favor of Keycloak.
r3v5
left a comment
There was a problem hiding this comment.
Great work, @varshaprasad96 ! One suggestion, can we add some unit tests for new structures?
cwiklik
left a comment
There was a problem hiding this comment.
Clean introduction of the AgentRuntime CRD with well-structured API types, proper code generation, and thorough documentation. Good reuse of existing TargetRef from AgentCard and standard K8s patterns (metav1.Condition, status subresource, printer columns). All CI passes.
Adding supplementary notes alongside existing reviews from @pdettori, @grs, and @r3v5. I agree that @grs's questions about whether trust domain and client registration belong as per-workload config are important architectural decisions to resolve.
Also: thorough API reference docs with YAML examples and kubectl usage — well done.
Inline notes:
agentruntime_types.go:188: Nit — K8s convention is to not useomitemptyonSpecsince it should always be present.Statuswithomitemptyis fine.agentruntime_types.go:134: Security note for the controller implementation:corev1.SecretReferenceincludes anamespacefield, allowing cross-namespace secret references. When the controller is implemented, ensure RBAC and/or webhook validation prevent workloads from referencing secrets in namespaces they don't own. Aligns with @grs's broader questions about the identity section.
Areas reviewed: Go API types, CRD YAML, DeepCopy, Docs, Security
Commits: 1 commit, signed-off: yes
CI status: All 5 checks passing
Introduce the Agent Runtime custom resource for configuring agent parameters including identity (SPIFFE, IdP client registration) and observability (OTEL traces) on agent and tool workloads via targetRef-based binding. - Add AgentRuntimeSpec with type (agent|tool), targetRef, identity, and trace fields - Add AgentRuntimeStatus with phase, configuredPods, and conditions - Generate CRD manifest and deepcopy functions - Enable allowDangerousTypes for float64 sampling rate - Document AgentRuntime in api-reference.md and architecture.md Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
- Remove ClientRegistration from IdentitySpec — Keycloak 26.4 federated client auth makes operator-managed registration unnecessary, and it is currently handled externally via Helm jobs - Remove omitempty from Spec json tag per K8s convention - Remove misleading webhook validation edge for AgentRuntime in architecture diagram (no webhook implemented yet) - Revert prometheus/client_model back to indirect dependency Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
6fd63e9 to
b3aa892
Compare
@r3v5 The PR currently defines only Go structs for the API, hence there are no tests. There is a controller implementation PR in the follow up. Will have unit tests in there, where we have the actual logic of calling/performing CRUD on these API instances. |
| // If empty, the operator flag value is used. | ||
| // +optional | ||
| // +kubebuilder:validation:Pattern=`^[a-zA-Z0-9]([a-zA-Z0-9\-\.]*[a-zA-Z0-9])?$` | ||
| TrustDomain string `json:"trustDomain,omitempty"` |
There was a problem hiding this comment.
Right. In the AgentCard my thinking was to support federation scenarios like e.g., someone represents an agent from out of cluster and wants to verify its signature. This came up during discussions where folks pointed out the idea of representing out of cluster agents in the fashion of a MultiClusterService.
| .PHONY: manifests | ||
| manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects. | ||
| $(CONTROLLER_GEN) rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases | ||
| $(CONTROLLER_GEN) rbac:roleName=manager-role crd:allowDangerousTypes=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases |
There was a problem hiding this comment.
K8s best practice here seems to be representing floats as strings when possible. resource.Quantity seems OK to me.
cwiklik
left a comment
There was a problem hiding this comment.
lgtm - lets start with this and modify as needed. Thank you.
Summary
Introduce the Agent Runtime custom resource for configuring agent parameters including identity (SPIFFE, IdP client registration) and observability (OTEL traces) on agent and tool workloads via targetRef-based binding.
Next step: Configure the controller.
Signed-off-by: Varsha Prasad Narsing varshaprasad96@gmail.com
Assisted-By: Claude (Anthropic AI) noreply@anthropic.com
Related issue(s)
Related: kagenti/kagenti#862
(Optional) Testing Instructions
Fixes #