Agent State Serialization Enhancement Plan
Problem Statement
Currently, Strands agents require agent.state to be JSON-serializable, limiting it to basic Python types (str, int, dict, list, etc.). This creates two distinct problems:
Problem 1: Rich Type Serialization
Users cannot store:
- Python dataclasses
- Pydantic models
- Custom objects with complex behavior
- Rich types like datetime, Decimal, UUID
This limitation forces users to write boilerplate serialization/deserialization code.
Problem 2: Runtime-Only Resources
Even with serialization improvements, some objects fundamentally cannot be persisted:
- Database connections (sqlite3.Connection, psycopg2.connection)
- API clients (boto3.client, httpx.Client)
These are runtime resources, not data. They cannot be meaningfully serialized: they hold system handles that are valid only in the current process, and recreating them from serialized state would not restore the actual resource.
This prevents clean, Pythonic patterns that competing frameworks like Pydantic AI support.
Motivation
Pydantic AI and similar frameworks allow arbitrary Python objects in agent state, enabling cleaner code:
from dataclasses import dataclass
from datetime import datetime
from typing import Any
from uuid import UUID
from pydantic_ai import Agent

@dataclass
class AgentState:
    db_connection: DatabaseConnection  # Complex object (placeholder type, defined elsewhere)
    created_at: datetime               # Rich type
    user_id: UUID                      # Rich type
    cache: dict[str, Any]              # Still works

agent = Agent('openai:gpt-4', deps_type=AgentState)

Benefits
- No serialization boilerplate - Use Python objects directly without conversion code
- Richer type system - Leverage dataclasses, Pydantic models, enums, datetime, Decimal, etc.
- Complex state - Store database connections, API clients, file handles (runtime-only)
- Better type safety - Full IDE support and type checking for state objects
- Cleaner code - Business logic without serialization/deserialization
Design Decisions
1. Pluggable Serializers
We will introduce a pluggable serializer architecture that allows users to choose their serialization strategy:
- JSONSerializer (default) - Backward compatible, human-readable, validates on set()
- PickleSerializer - Supports any Python object, no validation on set()
- Custom serializers - Users can implement the StateSerializer protocol
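As an illustration of the third option, a custom serializer could keep the output human-readable while still accepting a few rich types. The sketch below is only one possible approach, not part of the proposal: the class name RichJSONSerializer, the "__type__" tag, and the set of supported types are all assumptions made for the example.

```python
import json
from datetime import datetime
from decimal import Decimal
from typing import Any
from uuid import UUID

# Hypothetical marker key for encoded rich values (an assumption of this sketch).
_TYPE_KEY = "__type__"


def _encode(value: Any) -> dict[str, str]:
    """json.dumps `default` hook: called only for values json cannot encode itself."""
    if isinstance(value, datetime):
        return {_TYPE_KEY: "datetime", "value": value.isoformat()}
    if isinstance(value, UUID):
        return {_TYPE_KEY: "uuid", "value": str(value)}
    if isinstance(value, Decimal):
        return {_TYPE_KEY: "decimal", "value": str(value)}
    raise TypeError(f"{type(value).__name__} is not serializable")


def _decode(obj: dict[str, Any]) -> Any:
    """json.loads `object_hook`: rebuild tagged values, pass other dicts through."""
    kind = obj.get(_TYPE_KEY)
    if kind == "datetime":
        return datetime.fromisoformat(obj["value"])
    if kind == "uuid":
        return UUID(obj["value"])
    if kind == "decimal":
        return Decimal(obj["value"])
    return obj


class RichJSONSerializer:
    """Hypothetical custom serializer with the serialize/deserialize/validate shape."""

    def serialize(self, data: dict[str, Any]) -> bytes:
        return json.dumps(data, default=_encode).encode("utf-8")

    def deserialize(self, data: bytes) -> dict[str, Any]:
        return json.loads(data.decode("utf-8"), object_hook=_decode)

    def validate(self, value: Any) -> None:
        # Fail early, at set() time, if the value cannot be encoded.
        try:
            json.dumps(value, default=_encode)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"Value is not serializable: {exc}") from exc
```

One limitation of this tagging scheme: a user dict that happens to contain a "__type__" key with one of the tag names would be misread on deserialization, so a real implementation would need a less collision-prone marker.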
2. Configuration Location
After considering multiple options, we decided:
- Agent construction: Accept state_serializer parameter for convenience
- AgentState object: All further access and modification via agent.state.serializer
- No property delegation: Agent won't have getter/setter for serializer
- Conflict handling: Raise error if both state: AgentState and state_serializer provided
# At construction
agent = Agent(model, state_serializer=PickleSerializer())
# Later modification
agent.state.serializer = JSONSerializer()

3. Validation Strategy
Validation is delegated to the serializer:
- JSONSerializer validates on set() to maintain current behavior
- PickleSerializer has no validation (accepts anything)
- Custom serializers can implement their own validation logic
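A minimal sketch of how the two built-in serializers might divide this responsibility is shown here. The method bodies are assumptions, since the proposal only fixes the names and the validation behavior. PickleSerializer is given a no-op validate so that it still matches a protocol that lists that method.

```python
import json
import pickle
from datetime import datetime
from typing import Any


class JSONSerializer:
    """Sketch: validates eagerly so a bad value fails at set(), not at save time."""

    def serialize(self, data: dict[str, Any]) -> bytes:
        return json.dumps(data).encode("utf-8")

    def deserialize(self, data: bytes) -> dict[str, Any]:
        return json.loads(data.decode("utf-8"))

    def validate(self, value: Any) -> None:
        try:
            json.dumps(value)
        except (TypeError, ValueError) as exc:
            raise ValueError(
                f"Value is not JSON serializable: {type(value).__name__}"
            ) from exc


class PickleSerializer:
    """Sketch: accepts anything; errors surface only when serialize() runs."""

    def serialize(self, data: dict[str, Any]) -> bytes:
        return pickle.dumps(data)

    def deserialize(self, data: bytes) -> dict[str, Any]:
        return pickle.loads(data)

    def validate(self, value: Any) -> None:
        return None  # Intentionally a no-op


# Usage: JSON rejects a datetime up front, pickle round-trips it.
json_serializer = JSONSerializer()
json_serializer.validate({"count": 42})  # passes silently

pickle_serializer = PickleSerializer()
restored = pickle_serializer.deserialize(
    pickle_serializer.serialize({"created": datetime(2024, 1, 1)})
)
```

The trade-off is when errors surface: with JSONSerializer a bad value is rejected immediately, while with PickleSerializer an unpicklable object is only detected when the state is serialized.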
4. Backward Compatibility
- Default serializer is JSONSerializer → existing code works unchanged
- JSONSerializableDict will be deprecated but kept for compatibility
- Current JSON validation behavior preserved for default case
5. Runtime-Only State (Transient Values)
To handle runtime resources that cannot be persisted (database connections, API clients, etc.), we will add a persist parameter to the set() method:
- Default behavior: persist=True - values are validated and serialized
- Runtime-only: persist=False - values are kept in memory but excluded from serialization
- Unified retrieval: Single get() method works for both persistent and transient values
- Explicit opt-out: Runtime-only must be explicitly marked at call site
This solves the Runtime-Only Resources problem while maintaining backward compatibility.
Implementation Architecture
Core Components
# src/strands/state/serializers.py
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class StateSerializer(Protocol):
    """Protocol for state serializers."""

    def serialize(self, data: dict[str, Any]) -> bytes:
        """Serialize state dict to bytes."""
        ...

    def deserialize(self, data: bytes) -> dict[str, Any]:
        """Deserialize bytes back to state dict."""
        ...

    # Optional validation method
    def validate(self, value: Any) -> None:
        """Validate a value can be serialized (optional)."""
        ...

AgentState Refactoring
# src/strands/agent/state.py
import copy
from typing import Any

# Serializers from the proposed module shown above
from strands.state.serializers import JSONSerializer, StateSerializer

class AgentState:
    """Flexible state container with pluggable serialization and runtime-only support."""

    def __init__(
        self,
        initial_state: dict[str, Any] | None = None,
        serializer: StateSerializer | None = None,
    ):
        self._data: dict[str, Any] = initial_state or {}
        self._transient_keys: set[str] = set()  # Track runtime-only keys
        self.serializer = serializer or JSONSerializer()

    def set(self, key: str, value: Any, *, persist: bool = True) -> None:
        """Set value with optional persistence.

        Args:
            key: The key to store the value under
            value: The value to store
            persist: If False, value is transient (not serialized). Default True.
        """
        self._validate_key(key)  # Existing key validation, not shown here
        if persist:
            # Validate serializable (existing behavior)
            if hasattr(self.serializer, 'validate'):
                self.serializer.validate(value)
            self._transient_keys.discard(key)
        else:
            # Mark as transient - skip validation
            self._transient_keys.add(key)
        self._data[key] = value

    def get(self, key: str | None = None) -> Any:
        """Get value - works uniformly for persistent and transient."""
        # Transient values (connections, clients) generally cannot be deep-copied,
        # so they are returned by reference; persistent values are copied.
        if key is None:
            return {k: v if k in self._transient_keys else copy.deepcopy(v)
                    for k, v in self._data.items()}
        if key in self._transient_keys:
            return self._data[key]
        return copy.deepcopy(self._data.get(key))

    def is_transient(self, key: str) -> bool:
        """Check if a key is transient (not persisted)."""
        return key in self._transient_keys

    def serialize(self) -> bytes:
        """Serialize only persistent keys."""
        persistent_data = {k: v for k, v in self._data.items()
                           if k not in self._transient_keys}
        return self.serializer.serialize(persistent_data)

    def deserialize(self, data: bytes) -> None:
        """Deserialize persistent state. Transient keys are preserved if in memory."""
        persistent_data = self.serializer.deserialize(data)
        # Keep transient keys in memory, replace persistent
        transient_data = {k: v for k, v in self._data.items()
                          if k in self._transient_keys}
        self._data = {**persistent_data, **transient_data}

Session Manager Updates
Session managers will use the agent's serializer for persistence:
# In FileSessionManager.sync_agent()
serialized_state = agent.state.serialize()  # Returns bytes
# Store serialized_state appropriately

Implementation Phases
Phase 1: Core Serialization Infrastructure
- Create serializer protocol and implementations
- Refactor AgentState to use pluggable serializers
- Update Agent constructor
Phase 2: Session Manager Integration
- Update session types to handle serialized state
- Modify FileSessionManager to use agent serialization
- Modify S3SessionManager similarly
- Update RepositorySessionManager
Phase 3: Exports and Types
- Update public API exports
- Add necessary type definitions
Phase 4: Testing
- Unit tests for serializers
- Unit tests for AgentState with different serializers
- Unit tests for Agent serializer configuration
- Integration tests with rich types
Phase 5: Cleanup and Documentation
- Deprecate JSONSerializableDict (kept for compatibility, per Design Decision 4)
- Update BidiAgent for consistency
Usage Examples
Basic Usage (Backward Compatible)
# Default behavior - JSON serialization with validation
from datetime import datetime
agent = Agent(model)
agent.state.set("count", 42)                # ✅ Works
agent.state.set("created", datetime.now())  # ❌ ValueError: not JSON serializable

Rich Types with Pickle
from strands import Agent, PickleSerializer
from datetime import datetime
from uuid import UUID
agent = Agent(model, state_serializer=PickleSerializer())
# Store rich Python types
agent.state.set("created_at", datetime.now())  # ✅
agent.state.set("user_id", UUID('...'))        # ✅
agent.state.set("config", MyConfigClass())     # ✅
# Session persistence just works
session_manager.sync_agent(agent)  # Automatically uses pickle

Custom State Class
@dataclass
class CustomerState:
    customer_id: UUID
    last_interaction: datetime
    preferences: dict[str, Any]

agent = Agent(
    model,
    state={"customer": CustomerState(...)},
    state_serializer=PickleSerializer()
)

Runtime-Only State (Transient Values)
from datetime import datetime
import sqlite3
import boto3
from strands import Agent, PickleSerializer, ToolContext, tool
agent = Agent(model="...", state_serializer=PickleSerializer())
# Persistent state (default behavior)
agent.state.set("user_id", "12345")             # ✅ Persisted
agent.state.set("session_data", {"cart": []})   # ✅ Persisted
agent.state.set("created_at", datetime.now())   # ✅ Persisted (with Pickle)
# Runtime-only state (explicit opt-out)
agent.state.set("db", sqlite3.connect(":memory:"), persist=False)  # ✅ Works
agent.state.set("s3_client", boto3.client("s3"), persist=False)    # ✅ Works
agent.state.set("temp_cache", {}, persist=False)                   # ✅ Works
# Unified retrieval - no need to know if transient
db = agent.state.get("db")            # Get runtime resource
user_id = agent.state.get("user_id")  # Get persistent data
# Check if transient
agent.state.is_transient("db")       # True
agent.state.is_transient("user_id")  # False
# Use in tools (the tool receives the agent through its context argument)
@tool(context=True)
def query_database(query: str, tool_context: ToolContext) -> str:
    db = tool_context.agent.state.get("db")
    if db is None:
        raise ValueError("Database not initialized")
    return str(db.execute(query).fetchall())
# Serialization behavior
checkpoint = agent.state.serialize()  # Only includes user_id, session_data, created_at
# Restoring into a fresh agent (e.g. in a new process): transient values are absent
restored = Agent(model="...", state_serializer=PickleSerializer())
restored.state.deserialize(checkpoint)
restored.state.get("db")       # None (not persisted)
restored.state.get("user_id")  # "12345" (persisted)
# Restoring into the same agent keeps transient values that are still in memory
agent.state.deserialize(checkpoint)
agent.state.is_transient("db")  # True (still held in memory)

Checkpointing Without Session Manager
# Serialize for checkpointing
checkpoint = agent.state.serialize()
save_to_file(checkpoint)
# Later restore
checkpoint = load_from_file()
agent.state.deserialize(checkpoint)

Security Considerations
Pickle Security Risks
Pickle can execute arbitrary code during deserialization. This is a known security risk. Users should:
- Only unpickle data from trusted sources
- Never unpickle data received over network from untrusted sources
- Consider using hmac to verify data integrity
- Use custom serializers with restricted functionality if needed
We will document these risks clearly in the API documentation.
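The hmac suggestion above could be realized as a thin wrapper that signs the pickle payload and checks the signature before unpickling. This is a sketch under stated assumptions: the class name SignedPickleSerializer and the tag-then-payload layout are hypothetical, and key management is left out entirely.

```python
import hashlib
import hmac
import pickle
from typing import Any

_DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


class SignedPickleSerializer:
    """Hypothetical serializer: prefixes the pickle payload with an HMAC-SHA256 tag."""

    def __init__(self, secret_key: bytes) -> None:
        self._key = secret_key

    def _sign(self, payload: bytes) -> bytes:
        return hmac.new(self._key, payload, hashlib.sha256).digest()

    def serialize(self, data: dict[str, Any]) -> bytes:
        payload = pickle.dumps(data)
        return self._sign(payload) + payload

    def deserialize(self, data: bytes) -> dict[str, Any]:
        tag, payload = data[:_DIGEST_SIZE], data[_DIGEST_SIZE:]
        # Verify BEFORE unpickling; compare_digest is a constant-time comparison.
        if not hmac.compare_digest(tag, self._sign(payload)):
            raise ValueError("State payload failed integrity check")
        return pickle.loads(payload)
```

A signature only shows that the payload was produced by someone holding the key and was not modified afterwards. It does not make pickle safe if the key leaks or if the signer itself pickles attacker-controlled objects.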
Migration Guide
For Existing Users
No changes required. Existing code continues to work:
# This still works exactly as before
agent = Agent(model)
agent.state.set("key", "value")

Upgrading to Rich Types
# Old way - manual serialization
agent.state.set("created", datetime.now().isoformat())        # Convert to string
created = datetime.fromisoformat(agent.state.get("created"))  # Parse back
# New way - direct storage
agent = Agent(model, state_serializer=PickleSerializer())
agent.state.set("created", datetime.now())  # Store directly
created = agent.state.get("created")        # Already a datetime

Testing Strategy
Unit Tests
- Test each serializer with valid/invalid inputs
- Test AgentState with different serializers
- Test validation delegation
- Test serialize/deserialize round-trips
- Test Agent constructor parameter handling
Integration Tests
- Session persistence with PickleSerializer
- Rich type round-trips (datetime, UUID, Decimal, dataclass, Pydantic)
- Multi-agent scenarios with different serializers
- Backward compatibility verification
Performance Tests
- Compare serialization speed (JSON vs Pickle)
- Memory usage with large state objects
- Session sync performance
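A serialization speed comparison could start from something as small as the sketch below, which times a serialize and deserialize round trip with the standard json and pickle modules directly. The state shape, the sizes, and the helper names make_state and time_round_trip are assumptions for illustration. A real benchmark would go through the proposed serializer classes and a session manager.

```python
import json
import pickle
import timeit
from typing import Any, Callable


def make_state(n_items: int) -> dict[str, Any]:
    """Build a JSON-compatible state dict so both formats can encode it."""
    return {
        f"key_{i}": {"id": i, "tags": ["a", "b"], "score": i * 0.5}
        for i in range(n_items)
    }


def time_round_trip(
    dumps: Callable[[Any], bytes],
    loads: Callable[[bytes], Any],
    state: dict[str, Any],
    repeats: int = 20,
) -> float:
    """Return the average seconds per serialize + deserialize round trip."""
    total = timeit.timeit(lambda: loads(dumps(state)), number=repeats)
    return total / repeats


state = make_state(1_000)
json_seconds = time_round_trip(
    lambda s: json.dumps(s).encode("utf-8"), json.loads, state
)
pickle_seconds = time_round_trip(pickle.dumps, pickle.loads, state)

print(f"json:   {json_seconds * 1e3:.3f} ms per round trip")
print(f"pickle: {pickle_seconds * 1e3:.3f} ms per round trip")
```

Results depend heavily on the shape of the state and the Python version, so the numbers are only meaningful relative to each other on the same machine.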
Documentation Updates
- Update Agent API docs with state_serializer parameter
- Add serialization guide to documentation
- Document security considerations for Pickle
- Provide migration examples
- Update tutorials with rich type examples