@prasadskarmarkar prasadskarmarkar commented Jan 26, 2026

Implements the Interactions API for the Java SDK, addressing feature request #749.
Based on the OpenAPI spec at ai.google.dev/static/api/interactions.openapi.json
and API documentation at ai.google.dev/api/interactions-api.

Core Clients

  • Synchronous Client (Interactions.java) - create, get, cancel, delete operations
  • Async Client (AsyncInteractions.java) - CompletableFuture-based operations
  • Integration - Both clients are wired into the existing Client.java
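
As a rough illustration of how the two clients relate, here is a minimal sketch of a CompletableFuture-based wrapper delegating to a synchronous client. All names (ClientSketch, SyncClient, AsyncClient, InteractionStub) and the status strings are hypothetical stand-ins backed by an in-memory map; they are not the classes or signatures added in this PR.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ClientSketch {

    /** Hypothetical stand-in for an interaction resource (id plus status). */
    static final class InteractionStub {
        final String id;
        final String status;

        InteractionStub(String id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    /** Hypothetical synchronous client; an in-memory map replaces the HTTP calls. */
    static final class SyncClient {
        private final Map<String, InteractionStub> store = new ConcurrentHashMap<>();
        private final AtomicInteger nextId = new AtomicInteger(1);

        InteractionStub create(String input) {
            // The input is ignored here; a real client would send it to the service.
            InteractionStub created =
                new InteractionStub("interaction-" + nextId.getAndIncrement(), "in_progress");
            store.put(created.id, created);
            return created;
        }

        InteractionStub get(String id) {
            InteractionStub found = store.get(id);
            if (found == null) {
                throw new IllegalArgumentException("unknown interaction: " + id);
            }
            return found;
        }

        InteractionStub cancel(String id) {
            get(id); // fails fast if the interaction does not exist
            InteractionStub cancelled = new InteractionStub(id, "cancelled");
            store.put(id, cancelled);
            return cancelled;
        }

        void delete(String id) {
            store.remove(id);
        }
    }

    /** Hypothetical async client: every call runs the sync call on the common pool. */
    static final class AsyncClient {
        private final SyncClient delegate;

        AsyncClient(SyncClient delegate) {
            this.delegate = delegate;
        }

        CompletableFuture<InteractionStub> create(String input) {
            return CompletableFuture.supplyAsync(() -> delegate.create(input));
        }

        CompletableFuture<InteractionStub> get(String id) {
            return CompletableFuture.supplyAsync(() -> delegate.get(id));
        }

        CompletableFuture<InteractionStub> cancel(String id) {
            return CompletableFuture.supplyAsync(() -> delegate.cancel(id));
        }

        CompletableFuture<Void> delete(String id) {
            return CompletableFuture.runAsync(() -> delegate.delete(id));
        }
    }
}
```

Delegating to the synchronous client keeps request building and validation in one place; whether the PR's async client follows this pattern is not stated in the description.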

Streaming Support (SSE)

  • Real-time Streaming - createStream() and getStream() methods for live interaction updates
  • 6 Event Types - InteractionEvent, ContentStart, ContentDelta, ContentStop, InteractionStatusUpdate, ErrorEvent
  • 19 Delta Types - Text, Image, Audio, Video, Document, Thought, FunctionCall, and 12 tool-specific deltas
  • Stream Resumption - Support for interrupted streams via lastEventId
  • Type-Safe Deserialization - Jackson polymorphic deserializer with @JsonTypeInfo discriminator
  • Auto-Closeable - try-with-resources support for proper resource management
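
To make the resumption and try-with-resources points concrete, the following is a minimal sketch of an SSE reader that tracks the last seen event id. It follows the generic Server-Sent Events wire format (event:, id:, data: fields, blank line as terminator) and uses only the JDK. SseSketch, SseStream, and SseEvent are hypothetical names, and how the PR's createStream() and getStream() parse events or pass lastEventId back to the service is an assumption not confirmed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class SseSketch {

    /** One parsed event: the id in effect, the event type, and the joined data lines. */
    static final class SseEvent {
        final String id;
        final String type;
        final String data;

        SseEvent(String id, String type, String data) {
            this.id = id;
            this.type = type;
            this.data = data;
        }
    }

    /** Hypothetical stream wrapper; AutoCloseable so try-with-resources releases the reader. */
    static final class SseStream implements AutoCloseable {
        private final BufferedReader reader;
        private String lastEventId = "";

        SseStream(Reader source) {
            this.reader = new BufferedReader(source);
        }

        /** Returns the next event, or null once the stream ends. */
        SseEvent next() {
            String type = "message"; // default event type in the SSE format
            StringBuilder data = new StringBuilder();
            boolean sawData = false;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.isEmpty()) {
                        if (sawData) {
                            return new SseEvent(lastEventId, type, data.toString());
                        }
                        type = "message"; // blank line without data: reset and continue
                        continue;
                    }
                    if (line.startsWith(":")) {
                        continue; // comment line, often used as a keep-alive
                    }
                    int colon = line.indexOf(':');
                    String field = colon < 0 ? line : line.substring(0, colon);
                    String value = colon < 0 ? "" : line.substring(colon + 1);
                    if (value.startsWith(" ")) {
                        value = value.substring(1); // strip one optional leading space
                    }
                    switch (field) {
                        case "event":
                            type = value;
                            break;
                        case "data":
                            if (sawData) {
                                data.append('\n');
                            }
                            data.append(value);
                            sawData = true;
                            break;
                        case "id":
                            lastEventId = value;
                            break;
                        default:
                            break; // unknown fields are ignored
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return null; // an unterminated trailing event is dropped
        }

        /** The id a caller would send back to resume an interrupted stream. */
        String lastEventId() {
            return lastEventId;
        }

        @Override
        public void close() {
            try {
                reader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

A caller that loses the connection could keep the value of lastEventId() and supply it when reopening the stream, so the service can skip events that were already delivered.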

Type System

Core Types

  • Interaction - Response type with status, outputs, usage metadata
  • CreateInteractionConfig - Rich configuration with model/agent selection, tools, generation config
  • GenerationConfig - Thinking level, speech/image config, response modality

Content Types (17)

TextContent, ImageContent, AudioContent, VideoContent, DocumentContent, ThoughtContent,
ThoughtSummaryContent, FunctionCallContent, FunctionResultContent, CodeExecutionCallContent,
CodeExecutionResultContent, GoogleSearchCallContent, GoogleSearchResultContent,
UrlContextCallContent, UrlContextResultContent, FileSearchCallContent, McpServerToolCallContent

Tool Types (7)

Function, GoogleSearch, CodeExecution, FileSearch, UrlContext, ComputerUse, McpServer

Key Features

✓ Multi-turn conversations via previousInteractionId
✓ Manual function calling workflow (no automatic function calling; application-side execution required)
✓ Background operation support with cancel capability
✓ Rich media support (text, images, audio, video, documents)
✓ Comprehensive tool ecosystem with 7 built-in tools
✓ SSE streaming with resumption support
✓ Agent configuration (Deep Research, Dynamic agents)
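
The manual function calling workflow and previousInteractionId chaining can be sketched roughly as follows. FakeService stands in for the real API, and Turn, run, and the get_weather tool are hypothetical names for illustration only; the PR's actual request and response types will differ.

```java
import java.util.Map;
import java.util.function.Function;

public class FunctionCallingSketch {

    /** Hypothetical, flattened view of one interaction result. */
    static final class Turn {
        final String id;
        final String previousInteractionId; // null on the first turn
        final String functionName;          // non-null when the model requests a call
        final String functionArg;
        final String text;                  // non-null when the model gives a final answer

        Turn(String id, String previousInteractionId,
             String functionName, String functionArg, String text) {
            this.id = id;
            this.previousInteractionId = previousInteractionId;
            this.functionName = functionName;
            this.functionArg = functionArg;
            this.text = text;
        }
    }

    /** Fake service: requests one tool call, then answers once it receives the result. */
    static final class FakeService {
        private int counter = 0;

        Turn create(String input, String previousInteractionId) {
            String id = "interaction-" + (++counter);
            if (previousInteractionId == null) {
                return new Turn(id, null, "get_weather", input, null);
            }
            return new Turn(id, previousInteractionId, null, null, "Answer based on: " + input);
        }
    }

    /**
     * Runs the loop the application owns: execute each requested function locally,
     * then send the result back in a new turn chained to the previous interaction id.
     */
    static String run(FakeService service,
                      Map<String, Function<String, String>> tools,
                      String prompt) {
        Turn turn = service.create(prompt, null);
        while (turn.functionName != null) {
            Function<String, String> tool = tools.get(turn.functionName);
            if (tool == null) {
                throw new IllegalStateException("no local implementation for " + turn.functionName);
            }
            String result = tool.apply(turn.functionArg);
            turn = service.create(result, turn.id); // chain to the previous turn
        }
        return turn.text;
    }
}
```

Because the SDK does not execute functions itself in this workflow, the tool registry and the loop live in application code; only the interaction id needs to be carried between turns.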

Testing

  • Unit Tests - Validation, type serialization, builders
  • Integration Tests - Mock-based client interaction tests
  • Streaming Tests - Event parsing, polymorphic deserialization, status transitions
  • Comprehensive Coverage - 25+ test classes covering all features

Examples (43 files)

Non-Streaming (19)

  • Basic operations (create, get, cancel, delete)
  • Content types (text, image, audio, video, document)
  • Tools (function calling, Google Search, URL context, code execution, file search, MCP server)
  • Multi-turn conversations
  • Agent configurations

Streaming (23)

  • Basic streaming iteration
  • Media streaming (image, audio, video, document)
  • Tool call streaming (functions, Google Search, URL context)
  • Thought/reasoning streaming
  • Multi-turn streaming conversations
  • Stream resumption
  • Async streaming with CompletableFuture

Pending Work

  • Replay tests - Setup is complete; the replay JSON files are still being generated

Fixes: #749
