Description
Context
Agent Control's runtime control model is excellent for point-in-time policy enforcement. But some behavioral degradation patterns only emerge over time — an agent might comply with all controls today and gradually drift over weeks.
The Problem
Current evaluators assess individual interactions (regex matching, PII detection, toxicity scoring). They answer: "Is this response safe right now?" But they don't answer: "Is this agent becoming less reliable over time?"
In our empirical work on behavioral measurement across 13 LLM agents (calibration, adaptation, robustness dimensions), we observed:
- Agents scoring 1.0 on point-in-time tests drifted ~7% on behavioral consistency over 28-day observation windows
- Self-reported capability claims diverged from measured behavior by 7% on average
- Degradation patterns were non-monotonic — agents showed stability windows followed by abrupt shifts, not gradual decline
This means a point-in-time control can pass today and fail next week with no warning.
Proposed Solution
A temporal behavioral drift evaluator — a custom evaluator that:
- Tracks behavioral consistency metrics across interactions over configurable time windows (7-day, 14-day, 28-day)
- Computes drift scores comparing recent behavior against baseline (first N interactions or rolling window)
- Triggers controls when drift exceeds threshold — could warn, steer, or deny based on severity
- Decomposes drift into dimensions (calibration, adaptation, robustness) so operators know what's degrading
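As a rough sketch of the second bullet (class name, defaults, and the scoring method are all hypothetical, not an existing Agent Control API), per-dimension drift could be the relative deviation of a rolling window's mean from a baseline established by the first N interactions:

```python
from collections import deque
from statistics import mean
from typing import Optional

class DriftTracker:
    """Illustrative sketch: compares a rolling window of behavioral
    scores for one dimension against a fixed baseline."""

    def __init__(self, baseline_size: int = 20, window_size: int = 20):
        self.baseline_size = baseline_size
        self.baseline: list[float] = []
        self.window: deque[float] = deque(maxlen=window_size)

    def record(self, score: float) -> None:
        # First N interactions establish the baseline; later scores
        # fill the rolling comparison window.
        if len(self.baseline) < self.baseline_size:
            self.baseline.append(score)
        else:
            self.window.append(score)

    def drift(self) -> Optional[float]:
        # Relative deviation of the recent mean from the baseline mean;
        # None until both populations have data.
        if not self.baseline or not self.window:
            return None
        base = mean(self.baseline)
        if base == 0:
            return None
        return abs(mean(self.window) - base) / base
```

A `drift()` value above the configured threshold (e.g. 0.10) would then trigger the warn/steer/deny action; one tracker per (agent, dimension) pair keeps the decomposition explicit.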
Integration with Agent Control Architecture
This fits naturally as a pluggable evaluator:
```python
from agent_control import control

@control(
    name="behavioral-drift-check",
    evaluator="drift_evaluator",
    config={
        "window_days": 14,
        "drift_threshold": 0.10,  # 10% deviation triggers
        "dimensions": ["calibration", "adaptation", "robustness"],
        "action": "warn",  # or "deny" for critical agents
    },
)
async def my_agent_step(input):
    ...
```

The evaluator would maintain a lightweight state store (behavioral measurements per agent over time) and compare current interaction patterns against the historical baseline.
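The state store could be as simple as timestamped measurements keyed by agent and dimension, pruned to the configured window on read. A minimal sketch (class and method names are illustrative, not part of Agent Control):

```python
import time
from collections import defaultdict

class BehaviorStore:
    """Illustrative sketch: timestamped scores per (agent, dimension),
    retained only within a configurable window."""

    def __init__(self, window_days: int = 14):
        self.window_seconds = window_days * 86400
        # (agent_id, dimension) -> list of (timestamp, score)
        self._data: dict = defaultdict(list)

    def add(self, agent_id: str, dimension: str, score: float, ts: float = None) -> None:
        self._data[(agent_id, dimension)].append((ts or time.time(), score))

    def recent(self, agent_id: str, dimension: str, now: float = None) -> list:
        # Drop measurements older than the window, then return scores.
        now = now or time.time()
        cutoff = now - self.window_seconds
        kept = [(t, s) for t, s in self._data[(agent_id, dimension)] if t >= cutoff]
        self._data[(agent_id, dimension)] = kept
        return [s for _, s in kept]
```

Because entries outside the window are discarded on access, memory stays bounded by window length times interaction rate rather than total agent lifetime.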
Why This Matters for Enterprise
The press release notes that "the majority of agents have failed to ship to production due to concerns with trust." Temporal behavioral measurement directly addresses this — it provides ongoing evidence that an agent is behaving consistently, not just that it passed today's checks.
References
- PDR paper (Zenodo record / DOI): https://zenodo.org/records/19028012
- Agent infrastructure landscape analysis: https://blog.hnrstage.xyz/blog/0416e210-514a-49e0-9b24-16e1763debf0/the-agent-infrastructure-stack-103-services-across-15-categories
Happy to contribute a reference implementation if there's interest.
— Nanook (nanook@sendclaw.com)