1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@
_*
# '_' in src dir, ok.
!**/src/**/_*
!**/spec/**/_*

*.lock
*.lockb
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,13 @@
`.` minor | `-` Fix | `+` Addition | `^` improvement | `!` Change | `*` Refactor


## 2025-10-25 - [v0.4.3](https://github.com/jeremychone/rust-genai/compare/v0.4.2...v0.4.3)

- `!` Refactor ZHIPU adapter to ZAI with namespace-based endpoint routing (#95)
- `-` openai - stream tool - Fix streaming tool issue (#91)
- `.` added ModelName partial eq implementations for string types (#94)
- `.` anthropic - update model name for haiku 4.5

## 2025-10-12 - [v0.4.2](https://github.com/jeremychone/rust-genai/compare/v0.4.1...v0.4.2)

- `.` test - make the common_test_chat_stop_sequences_ok more resilient
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "genai"
-version = "0.4.3-wip"
+version = "0.4.4-WIP"
edition = "2024"
license = "MIT OR Apache-2.0"
description = "Multi-AI Providers Library for Rust. (OpenAI, Gemini, Anthropic, xAI, Ollama, Groq, DeepSeek, Grok)"
4 changes: 3 additions & 1 deletion README.md
@@ -18,7 +18,7 @@ Provides a single, ergonomic API to many generative AI providers, such as Anthro

**NOTE:** Big update with **v0.4.x** - More adapters, PDF and image support, embeddings, custom headers, and transparent support for the OpenAI Responses API (gpt-5-codex)

-## v0.4.0 Big Release
+## v0.4.x Big Release

- **What's new**:
- **PDF and Images** support (thanks to [Andrew Rademacher](https://github.com/AndrewRademacher))
@@ -39,6 +39,8 @@ See:

## Big Thanks to

- [Bart Carroll](https://github.com/bartCarroll) For [#91](https://github.com/jeremychone/rust-genai/pull/91) Fixed streaming tool calls for openai models
- [Rui Andrada](https://github.com/shingonoide) For [#95](https://github.com/jeremychone/rust-genai/pull/95) refactoring ZHIPU adapter to ZAI
- [Adrien](https://github.com/XciD) Extra headers in requests, seed for chat requests, and fixes (with [Julien Chaumond](https://github.com/julien-c) for extra headers)
- [Andrew Rademacher](https://github.com/AndrewRademacher) for PDF support, Anthropic streamer, and insight on flattening the message content (e.g., ContentParts)
- [Jesus Santander](https://github.com/jsantanders) Embedding support [PR #83](https://github.com/jeremychone/rust-genai/pull/83)
59 changes: 59 additions & 0 deletions dev/spec/_spec-rules.md
@@ -0,0 +1,59 @@
# Specification Guidelines

This document defines the rules for creating and maintaining specification files.

Important formatting rules

- Use `-` for bullet points.
- For numbered lists, leave an empty line between numbered items.


## Types of Specification Files

### `spec--index.md`

A single file providing a high-level summary of the entire system.

### `spec-module_name.md`

A specification file for each individual module.
- `module-path-name` represents the module’s hierarchy path, flattened with `-`.
- Each file documents the specification for a single module.

Make sure that `module_name` is the top-level module directly under `src/`.

For example, for `src/module_01/sub_mod/some_file.rs`, the spec file is `dev/spec/spec-module_01.md`.

(module_name is lowercase)
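
The naming rule above can be sketched as a small function. This is only an illustration: `spec_file_for` is a hypothetical helper, not something defined in the repository, and it assumes the source path starts with `src/` and that the top-level module is a directory.

```rust
/// Hypothetical helper (not part of the repository): maps a source file path
/// to the spec file that should document it.
fn spec_file_for(src_path: &str) -> Option<String> {
    // Only paths under `src/` have a spec file in this sketch.
    let rest = src_path.strip_prefix("src/")?;
    // The top-level module is the first path segment after `src/`.
    // Files directly under `src/` (no further `/`) yield `None` here.
    let (top_module, _remaining) = rest.split_once('/')?;
    if top_module.is_empty() {
        return None;
    }
    // The module name is lowercased, per the rule above.
    Some(format!("dev/spec/spec-{}.md", top_module.to_lowercase()))
}

fn main() {
    let spec = spec_file_for("src/module_01/sub_mod/some_file.rs");
    println!("{spec:?}");
}
```

Any file nested below `src/module_01/` maps to the same spec file, which is the point of using the top-level module name.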

## Required Structure for Module Specification Files

Each `spec-module-path-name.md` file must include the following sections.

<module_spec_template>

## module-path-name

### Goal

A clear description of the module’s purpose and responsibilities.

### Public Module API

A description of the APIs exposed by the module.
- Define what is exported and how it can be consumed by other modules.
- Include function signatures, data structures, or endpoints as needed.

### Module Parts

A breakdown of the module’s internal components.
- May reference sub-files or sub-modules.
- Should explain how the parts work together.

### Key Design Considerations

Key design considerations of this module and of its key parts.



</module_spec_template>
33 changes: 33 additions & 0 deletions dev/spec/spec-adapter.md
@@ -0,0 +1,33 @@
## adapter

### Goal

The `adapter` module is responsible for abstracting the communication with various Generative AI providers (e.g., OpenAI, Gemini, Anthropic, Groq, DeepSeek). It translates generic GenAI requests (like `ChatRequest` and `EmbedRequest`) into provider-specific HTTP request data and converts provider-specific web responses back into generic GenAI response structures. It acts as the translation and dispatch layer between the client logic and the underlying web communication.

### Public Module API

The primary public API exposed by the `adapter` module is:

- `AdapterKind`: An enum identifying the AI provider or protocol type (e.g., `OpenAI`, `Gemini`, `Anthropic`, `Cohere`). This type is used by the client and resolver layers to determine which adapter implementation should handle a specific model request.

### Module Parts

- `adapter_kind.rs`: Defines the `AdapterKind` enum. It includes implementation details for serialization, environment variable name resolution, and a default static mapping logic (`from_model`) to associate model names with a specific `AdapterKind`.

- `adapter_types.rs`: Defines the `Adapter` trait, which sets the contract for all concrete adapter implementations. It also defines common types like `ServiceType` (Chat, ChatStream, Embed) and `WebRequestData` (the normalized structure holding URL, headers, and payload before web execution).

- `dispatcher.rs`: Contains the `AdapterDispatcher` struct, which acts as the central routing mechanism. It dispatches calls from the client layer to the correct concrete adapter implementation based on the resolved `AdapterKind`.

- `inter_stream.rs`: Defines internal types (`InterStreamEvent`, `InterStreamEnd`) used by streaming adapters to standardize the output format from diverse provider streaming protocols. This intermediary layer handles complex stream features like capturing usage, reasoning content, and tool calls before conversion to public `ChatStreamResponse` events.

- `adapters/`: This submodule contains the concrete implementation of the `Adapter` trait for each provider (e.g., `openai`, `gemini`, `anthropic`, `zai`). These submodules handle the specific request/response translation logic for their respective protocols.

### Key Design Considerations

- **Stateless and Static Dispatch:** Adapters are designed to be stateless, with all methods in the `Adapter` trait being associated functions (static). Requests are routed efficiently using static dispatch through the `AdapterDispatcher`, minimizing runtime overhead and simplifying dependency management.

- **Request/Response Normalization:** The adapter layer ensures that incoming requests and outgoing responses conform to generic GenAI types, hiding provider-specific implementation details from the rest of the library.

- **Dynamic Resolution:** While `AdapterKind::from_model` provides a default mapping from model names (based on common prefixes or keywords), the system allows this to be overridden by custom `ServiceTargetResolver` configurations, enabling flexible routing (e.g., mapping a custom model name to an `OpenAI` adapter with a custom endpoint).

- **Stream Intermediation:** The introduction of `InterStreamEvent` is crucial for handling the variance in streaming protocols across providers. It ensures that complex data transmitted at the end of a stream (like final usage statistics or aggregated tool calls) can be correctly collected and normalized, regardless of the provider's specific event format.
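
A minimal sketch of the stateless, statically dispatched design described above, using simplified stand-ins rather than the crate's actual definitions. The two-variant enum, the prefix rules in `from_model`, the single `default_endpoint` method, and the placeholder URLs are all assumptions made for illustration.

```rust
/// Simplified stand-in for `AdapterKind` (only two variants in this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AdapterKind {
    OpenAI,
    Anthropic,
}

impl AdapterKind {
    /// Default model-name mapping. The prefixes are illustrative assumptions.
    fn from_model(model: &str) -> Option<Self> {
        if model.starts_with("gpt") {
            Some(Self::OpenAI)
        } else if model.starts_with("claude") {
            Some(Self::Anthropic)
        } else {
            None
        }
    }
}

/// Stateless contract: associated functions only, no `self` receiver.
trait Adapter {
    fn default_endpoint() -> &'static str;
}

struct OpenAIAdapter;
struct AnthropicAdapter;

impl Adapter for OpenAIAdapter {
    fn default_endpoint() -> &'static str {
        "https://openai.example.invalid/v1/" // placeholder URL
    }
}

impl Adapter for AnthropicAdapter {
    fn default_endpoint() -> &'static str {
        "https://anthropic.example.invalid/v1/" // placeholder URL
    }
}

/// Routes a call to the concrete adapter with a plain `match` (static dispatch,
/// no trait objects and no adapter instances).
struct AdapterDispatcher;

impl AdapterDispatcher {
    fn default_endpoint(kind: AdapterKind) -> &'static str {
        match kind {
            AdapterKind::OpenAI => OpenAIAdapter::default_endpoint(),
            AdapterKind::Anthropic => AnthropicAdapter::default_endpoint(),
        }
    }
}

fn main() {
    if let Some(kind) = AdapterKind::from_model("gpt-4o") {
        println!("{kind:?} -> {}", AdapterDispatcher::default_endpoint(kind));
    }
}
```

Because no adapter holds state, the dispatcher never needs to construct or store an adapter value; adding a provider in this sketch means adding an enum variant, an `impl Adapter`, and one `match` arm.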
66 changes: 66 additions & 0 deletions dev/spec/spec-chat.md
@@ -0,0 +1,66 @@
## chat

### Goal

The `chat` module provides the core primitives for constructing chat requests, defining messages (including multi-part content like text, binary, and tool data), and handling synchronous and asynchronous (streaming) chat responses across all supported AI providers. It standardizes the data structures necessary for modern LLM interactions.

### Public Module API

The module exports the following key data structures:

- **Request/Message Structure:**
  - `ChatRequest`: The primary structure for initiating a chat completion call, containing the history (`messages`), an optional system prompt (`system`), and tool definitions (`tools`).
  - `ChatMessage`: Represents a single interaction turn, comprising a `ChatRole`, `MessageContent`, and optional `MessageOptions`.
  - `ChatRole`: Enum defining message roles (`System`, `User`, `Assistant`, `Tool`).
  - `MessageContent`: A unified container for multi-part content, wrapping a list of `ContentPart`s.
  - `ContentPart`: Enum defining content types: `Text`, `Binary`, `ToolCall`, `ToolResponse`.
  - `Binary`, `BinarySource`: Structures defining binary payloads (e.g., images), sourced via base64 or URL.
  - `MessageOptions`, `CacheControl`: Per-message configuration hints (e.g., for cache behavior).

- **Configuration:**
  - `ChatOptions`: General request configuration, including sampling parameters (`temperature`, `max_tokens`, `top_p`, `seed`), streaming capture flags, and format control.
  - `ReasoningEffort`, `Verbosity`: Provider-specific hints for reasoning intensity or output verbosity.
  - `ChatResponseFormat`, `JsonSpec`: Defines desired structured output formats (e.g., JSON mode).

- **Responses:**
  - `ChatResponse`: The result of a non-streaming request, including final content, usage, and model identifiers.
  - `ChatStreamResponse`: The result wrapper for streaming requests, containing the `ChatStream` and model identity.

- **Streaming:**
  - `ChatStream`: A `futures::Stream` implementation yielding `ChatStreamEvent`s.
  - `ChatStreamEvent`: Enum defining streaming events: `Start`, `Chunk` (content), `ReasoningChunk`, `ToolCallChunk`, and `End`.
  - `StreamEnd`: Terminal event data including optional captured usage, content, and reasoning content.

- **Tooling:**
  - `Tool`: Metadata and schema defining a function the model can call.
  - `ToolCall`: The model's invocation request for a specific tool.
  - `ToolResponse`: The output returned from executing a tool, matched by call ID.

- **Metadata:**
  - `Usage`, `PromptTokensDetails`, `CompletionTokensDetails`: Normalized token usage statistics.

- **Utilities:**
  - `printer` module: Contains `print_chat_stream` for console output utilities.

### Module Parts

The functionality is divided into specialized files/sub-modules:

- `chat_message.rs`: Defines the `ChatMessage` fundamental structure and associated types (`ChatRole`, `MessageOptions`).
- `chat_options.rs`: Manages request configuration (`ChatOptions`) and provides parsing logic for provider-specific hints like `ReasoningEffort` and `Verbosity`.
- `chat_req_response_format.rs`: Handles configuration for structured output (`ChatResponseFormat`, `JsonSpec`).
- `chat_request.rs`: Defines the top-level `ChatRequest` and methods for managing the request history and properties.
- `chat_response.rs`: Defines synchronous chat response structures (`ChatResponse`).
- `chat_stream.rs`: Implements the public `ChatStream` and its events, mapping from the internal adapter stream.
- `content_part.rs`: Defines `ContentPart`, `Binary`, and `BinarySource` for handling multi-modal inputs/outputs.
- `message_content.rs`: Defines `MessageContent`, focusing on collection management and convenient accessors for content parts (e.g., joining all text).
- `tool/mod.rs` (and associated files): Defines the tooling primitives (`Tool`, `ToolCall`, `ToolResponse`).
- `usage.rs`: Defines the normalized token counting structures (`Usage`).
- `printer.rs`: Provides utility functions for rendering stream events to standard output.
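
To illustrate how `content_part.rs` and `message_content.rs` relate, here is a rough sketch of a multi-part content container. The types are simplified stand-ins, not the crate's definitions: the variant payloads, the `append` builder, and the `joined_text` accessor (including the newline separator) are assumptions.

```rust
/// Simplified stand-in for `ContentPart`; the real variants carry richer payloads.
#[derive(Debug, Clone)]
enum ContentPart {
    Text(String),
    Binary { content_type: String, base64: String },
    ToolCall { call_id: String, fn_name: String },
}

/// Simplified stand-in for `MessageContent`: an ordered list of parts.
#[derive(Debug, Clone, Default)]
struct MessageContent {
    parts: Vec<ContentPart>,
}

impl MessageContent {
    /// Hypothetical builder-style method that adds one part.
    fn append(mut self, part: ContentPart) -> Self {
        self.parts.push(part);
        self
    }

    /// Hypothetical accessor: joins all text parts with newlines and skips
    /// non-text parts. Returns `None` when there is no text at all.
    fn joined_text(&self) -> Option<String> {
        let texts: Vec<&str> = self
            .parts
            .iter()
            .filter_map(|part| match part {
                ContentPart::Text(text) => Some(text.as_str()),
                _ => None,
            })
            .collect();
        if texts.is_empty() {
            None
        } else {
            Some(texts.join("\n"))
        }
    }
}

fn main() {
    let content = MessageContent::default()
        .append(ContentPart::Text("Describe this image.".to_string()))
        .append(ContentPart::Binary {
            content_type: "image/png".to_string(),
            base64: "AAAA".to_string(),
        })
        .append(ContentPart::Text("Be brief.".to_string()));
    println!("{:?}", content.joined_text());
}
```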

### Key Design Considerations

- **Unified Content Model:** The use of `MessageContent` composed of `ContentPart` allows any message role (user, assistant, tool) to handle complex, multi-part data seamlessly, including text, binary payloads, and tooling actions.
- **Decoupled Streaming:** The public `ChatStream` is an abstraction layer over an internal stream (`InterStream`), ensuring a consistent external interface regardless of adapter implementation details (like internal handling of usage reporting or reasoning chunks).
- **Normalized Usage Metrics:** The `Usage` structure provides an OpenAI-compatible interface while allowing for provider-specific breakdowns (e.g., caching or reasoning tokens) via detailed sub-structures.
- **Hierarchical Options:** `ChatOptions` can be applied globally at the client level or specifically per request. The internal resolution logic ensures request-specific options take precedence over client defaults.
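
The precedence rule for hierarchical options can be sketched as a field-by-field merge. This is an assumption-laden illustration: the two-field `ChatOptions` stand-in and the `resolve_options` function are hypothetical and do not reflect the crate's internal resolution code.

```rust
/// Simplified stand-in for `ChatOptions` with only two of its fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct ChatOptions {
    temperature: Option<f64>,
    max_tokens: Option<u32>,
}

/// Hypothetical merge: for each field, the request value wins when present,
/// otherwise the client default is used.
fn resolve_options(
    client_defaults: Option<&ChatOptions>,
    request: Option<&ChatOptions>,
) -> ChatOptions {
    let client = client_defaults.cloned().unwrap_or_default();
    let request = request.cloned().unwrap_or_default();
    ChatOptions {
        temperature: request.temperature.or(client.temperature),
        max_tokens: request.max_tokens.or(client.max_tokens),
    }
}

fn main() {
    let client_defaults = ChatOptions {
        temperature: Some(0.2),
        max_tokens: Some(1000),
    };
    let request = ChatOptions {
        temperature: Some(0.9),
        max_tokens: None,
    };
    let resolved = resolve_options(Some(&client_defaults), Some(&request));
    println!("{resolved:?}");
}
```

Merging per field rather than replacing the whole struct lets a request override only `temperature` while still inheriting the client's `max_tokens`.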
59 changes: 59 additions & 0 deletions dev/spec/spec-client.md
@@ -0,0 +1,59 @@
## client

### Goal

The `client` module provides the core entry point (`Client`) for interacting with various Generative AI providers. It encapsulates configuration (`ClientConfig`, `WebConfig`), a builder pattern (`ClientBuilder`), request execution (`exec_chat`, `exec_embed`), and service resolution logic (e.g., determining endpoints and authentication).

### Public Module API

The `client` module exposes the following public types:

- **`Client`**: The main interface for executing AI requests (chat, embedding, streaming, model listing).
  - `Client::builder()`: Starts the configuration process.
  - `Client::default()`: Creates a client with default configuration.
  - Core execution methods: `exec_chat`, `exec_chat_stream`, `exec_embed`, `embed`, `embed_batch`.
  - Resolution/Discovery methods: `all_model_names`, `resolve_service_target`.

- **`ClientBuilder`**: Provides a fluent interface for constructing a `Client`. Used to set `ClientConfig`, default `ChatOptions`, `EmbedOptions`, and custom resolvers (`AuthResolver`, `ServiceTargetResolver`, `ModelMapper`).

- **`ClientConfig`**: Holds the resolved and default configurations used by the `Client`, including resolver functions and default options.

- **`Headers`**: A simple map wrapper (`HashMap<String, String>`) for managing HTTP headers in requests.

- **`ServiceTarget`**: A struct containing the final resolved components needed to execute a request: `Endpoint`, `AuthData`, and `ModelIden`.

- **`WebConfig`**: Configuration options specifically for building the underlying `reqwest::Client` (e.g., timeouts, proxies, default headers).

### Module Parts

The module is composed of several files that implement the layered client architecture:

- `builder.rs`: Implements `ClientBuilder`, handling the creation and configuration flow. It initializes or updates the nested `ClientConfig` and optionally an internal `WebClient`.

- `client_types.rs`: Defines the main `Client` struct and `ClientInner` (which holds `WebClient` and `ClientConfig` behind an `Arc`).

- `config.rs`: Defines `ClientConfig` and the core `resolve_service_target` logic, which orchestrates calls to `ModelMapper`, `AuthResolver`, and `ServiceTargetResolver` before falling back to adapter defaults.

- `client_impl.rs`: Contains the main implementation of the public API methods on `Client`, such as `exec_chat` and `exec_embed`. These methods perform service resolution and delegate to `AdapterDispatcher` for request creation and response parsing.

- `headers.rs`: Implements the `Headers` utility for managing key-value HTTP header maps.

- `service_target.rs`: Defines the `ServiceTarget` structure for resolved endpoints, authentication, and model identifiers.

- `web_config.rs`: Defines `WebConfig` and its logic for applying settings to a `reqwest::ClientBuilder`.
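
The `Client` / `ClientInner` layout noted for `client_types.rs` can be sketched as follows. The field names and the empty `WebClient` and `ClientConfig` stand-ins are assumptions; only the "shared state behind an `Arc`" shape comes from the description above.

```rust
use std::sync::Arc;
use std::thread;

/// Empty stand-ins for the real configuration and HTTP client types.
#[derive(Debug, Default)]
struct ClientConfig;
#[derive(Debug, Default)]
struct WebClient;

/// Shared state; field names are assumptions.
#[derive(Debug, Default)]
struct ClientInner {
    web_client: WebClient,
    config: ClientConfig,
}

/// Cloning a `Client` only bumps the `Arc` reference count.
#[derive(Debug, Clone, Default)]
struct Client {
    inner: Arc<ClientInner>,
}

impl Client {
    fn config(&self) -> &ClientConfig {
        &self.inner.config
    }
}

fn main() {
    let client = Client::default();

    // Hand a cheap clone to another thread; both handles share one `ClientInner`.
    let for_worker = client.clone();
    let worker = thread::spawn(move || {
        println!("worker sees config: {:?}", for_worker.config());
    });
    worker.join().expect("worker thread panicked");

    println!("main sees web client: {:?}", client.inner.web_client);
}
```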

### Key Design Considerations

- **Client Immutability and Sharing**: The `Client` holds its internal state (`ClientInner` with `WebClient` and `ClientConfig`) wrapped in an `Arc`. This design ensures that the client is thread-safe and cheaply cloneable, aligning with common client patterns in asynchronous Rust applications.

- **Config Layering and Resolution**: The client architecture employs a sophisticated resolution process managed by `ClientConfig::resolve_service_target`.
  - It first applies a `ModelMapper` to potentially translate the input model identifier.
  - It then consults the `AuthResolver` for authentication data. If the resolver is absent or returns `None`, it defaults to the adapter's standard authentication mechanism (e.g., API key headers).
  - It determines the adapter's default endpoint.
  - Finally, it applies the optional `ServiceTargetResolver`, allowing users to override the endpoint, auth, or model for complex scenarios (e.g., custom proxies or routing).

- **WebClient Abstraction**: The core HTTP client logic is delegated to the `WebClient` (from the `webc` module), which handles low-level request execution and streaming setup. This separation keeps the `client` module focused on business logic and AI provider orchestration.

- **Builder Pattern for Configuration**: `ClientBuilder` enforces configuration before client creation, simplifying object construction and ensuring necessary dependencies are set up correctly.

- **Headers Simplification**: The `Headers` struct abstracts HTTP header management, ensuring that subsequent merges or overrides result in a single, final header value, which is typical for API key authorization overrides.
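
The four resolution steps listed under "Config Layering and Resolution" can be sketched as a single function. Everything here is a simplified assumption: the resolvers are modeled as plain function pointers, `ServiceTarget` holds strings instead of `Endpoint`, `AuthData`, and `ModelIden`, and the default auth value and URLs are placeholders.

```rust
/// Simplified stand-in: the real struct holds `Endpoint`, `AuthData`, `ModelIden`.
#[derive(Debug, Clone, PartialEq)]
struct ServiceTarget {
    endpoint: String,
    auth: String,
    model: String,
}

// Resolvers modeled as plain function pointers (an assumption for this sketch).
type ModelMapper = fn(String) -> String;
type AuthResolver = fn(&str) -> Option<String>;
type ServiceTargetResolver = fn(ServiceTarget) -> ServiceTarget;

#[derive(Default)]
struct ClientConfig {
    model_mapper: Option<ModelMapper>,
    auth_resolver: Option<AuthResolver>,
    service_target_resolver: Option<ServiceTargetResolver>,
}

impl ClientConfig {
    fn resolve_service_target(&self, model: &str) -> ServiceTarget {
        // 1. Apply the optional model mapper.
        let model = match self.model_mapper {
            Some(map) => map(model.to_string()),
            None => model.to_string(),
        };
        // 2. Ask the auth resolver; fall back to the adapter default when it
        //    is absent or returns `None`.
        let auth = self
            .auth_resolver
            .and_then(|resolve| resolve(&model))
            .unwrap_or_else(|| "adapter-default-auth".to_string());
        // 3. Use the adapter's default endpoint (placeholder URL).
        let endpoint = "https://adapter-default.example.invalid/v1/".to_string();
        let target = ServiceTarget { endpoint, auth, model };
        // 4. Let the optional service target resolver override anything.
        match self.service_target_resolver {
            Some(resolve) => resolve(target),
            None => target,
        }
    }
}

/// Example mapper: expands a hypothetical alias to a concrete model name.
fn alias_mapper(model: String) -> String {
    if model == "fast" { "gpt-4o-mini".to_string() } else { model }
}

/// Example resolver: reroutes every request through a hypothetical proxy.
fn proxy_resolver(mut target: ServiceTarget) -> ServiceTarget {
    target.endpoint = "https://proxy.example.invalid/v1/".to_string();
    target
}

fn main() {
    let config = ClientConfig {
        model_mapper: Some(alias_mapper as ModelMapper),
        auth_resolver: None,
        service_target_resolver: Some(proxy_resolver as ServiceTargetResolver),
    };
    println!("{:?}", config.resolve_service_target("fast"));
}
```

Running the service target resolver last is what lets it see, and override, the fully resolved endpoint, auth, and model.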
36 changes: 36 additions & 0 deletions dev/spec/spec-common.md
@@ -0,0 +1,36 @@
## common

### Goal

The `common` module provides fundamental data structures used throughout the `genai` library, primarily focusing on identifying models and adapters in a clear and efficient manner.

### Public Module API

The module exposes two main types: `ModelName` and `ModelIden`.

- `ModelName`: Represents a generative AI model identifier (e.g., `"gpt-4o"`, `"claude-3-opus"`).
  - It wraps an `Arc<str>` for efficient cloning and sharing across threads.
  - Implements `From<String>`, `From<&String>`, `From<&str>`, and `Deref<Target = str>`.
  - Supports equality comparison (`PartialEq`) with various string types (`&str`, `String`).

- `ModelIden`: Uniquely identifies a model by coupling an `AdapterKind` with a `ModelName`.
  - Fields:
    - `adapter_kind: AdapterKind`
    - `model_name: ModelName`
  - Constructor: `fn new(adapter_kind: AdapterKind, model_name: impl Into<ModelName>) -> Self`
  - Utility methods for creating new identifiers based on name changes:
    - `fn from_name<T>(&self, new_name: T) -> ModelIden`
    - `fn from_optional_name(&self, new_name: Option<String>) -> ModelIden`

### Module Parts

The `common` module consists of:

- `model_name.rs`: Defines the `ModelName` type and related string manipulation utilities, including parsing optional namespaces (e.g., `namespace::model_name`).
- `model_iden.rs`: Defines the `ModelIden` type, which associates a `ModelName` with an `AdapterKind`.
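
As a rough illustration of the optional `namespace::model_name` form mentioned for `model_name.rs`, the split could look like the sketch below. The function name `split_namespace`, its return shape, and the `openai` namespace in the example are assumptions, not the crate's API.

```rust
/// Hypothetical helper: splits `namespace::model_name` at the first `::`.
/// Returns `(None, model)` when no namespace separator is present.
fn split_namespace(model: &str) -> (Option<&str>, &str) {
    match model.split_once("::") {
        Some((namespace, name)) => (Some(namespace), name),
        None => (None, model),
    }
}

fn main() {
    // With a namespace prefix.
    println!("{:?}", split_namespace("openai::gpt-4o"));
    // Without one.
    println!("{:?}", split_namespace("gpt-4o"));
}
```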

### Key Design Considerations

- **Efficiency of ModelName:** `ModelName` uses `Arc<str>` to ensure that cloning the model identifier is cheap, which is crucial as model identifiers are frequently passed around in request and response structures.
- **Deref Implementation:** Implementing `Deref<Target = str>` for `ModelName` allows it to be used naturally as a string reference.
- **ModelIden Immutability:** `ModelIden` is designed to be immutable and fully identifiable, combining the model string identity (`ModelName`) with the service provider identity (`AdapterKind`).
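
The `ModelName` design points above can be sketched with the standard library only. This is a stand-in written for illustration, assuming a tuple-struct layout; the crate's actual definition may differ in derives and details.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Stand-in for `ModelName`: a cheaply cloneable, shareable string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ModelName(Arc<str>);

impl From<&str> for ModelName {
    fn from(s: &str) -> Self {
        Self(Arc::from(s))
    }
}

impl From<String> for ModelName {
    fn from(s: String) -> Self {
        Self(Arc::from(s))
    }
}

impl From<&String> for ModelName {
    fn from(s: &String) -> Self {
        Self(Arc::from(s.as_str()))
    }
}

/// Lets a `ModelName` be used wherever a `&str` method is expected.
impl Deref for ModelName {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}

/// Equality against common string types.
impl PartialEq<&str> for ModelName {
    fn eq(&self, other: &&str) -> bool {
        &*self.0 == *other
    }
}

impl PartialEq<String> for ModelName {
    fn eq(&self, other: &String) -> bool {
        &*self.0 == other.as_str()
    }
}

fn main() {
    let name = ModelName::from("gpt-4o");
    let copy = name.clone(); // shares the same allocation

    println!("same allocation: {}", Arc::ptr_eq(&name.0, &copy.0));
    println!("equals &str: {}", name == "gpt-4o");
    println!("starts with gpt: {}", name.starts_with("gpt")); // via Deref
}
```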