Problem

HTTPAgent was returning the entire OpenAI API response JSON as a string instead of extracting the actual message content. This caused agent validation failures when using OpenAI-compatible servers like vLLM.

Current Behavior

  • Agent responses contain raw API JSON: {'id': 'chatcmpl-xxx', 'object': 'chat.completion', 'choices': [...]}
  • AgentBench tasks cannot parse these responses, leading to validation failures
  • Users must manually modify the agent code to work with OpenAI-compatible servers

Expected Behavior

  • Agent responses should contain only the text content: the actual LLM output
  • Content should be extracted from choices[0].message.content for OpenAI format
  • No manual intervention required
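
For illustration, this is roughly what the mismatch looks like (the payload below is a hypothetical vLLM chat-completion response and the content string is made up for the example):

# Hypothetical OpenAI-format response returned by a vLLM server
resp = {
    "id": "chatcmpl-xxx",
    "object": "chat.completion",
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Action: answer(42)"}}
    ],
}

# Before this change the agent's reply was effectively the whole payload as a string
broken_reply = str(resp)

# What AgentBench tasks need is just the message text
expected_reply = resp["choices"][0]["message"]["content"]  # "Action: answer(42)"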

Solution

This PR modifies HTTPAgent.inference() to detect and unwrap OpenAI-compatible API responses (see Implementation Details below):

  1. Detect OpenAI format: Check for the presence of a choices field in the response
  2. Extract content: Read the actual message text from choices[0].message.content
  3. Fallback mechanism: If the response is not in OpenAI format, keep the original return_format behavior

Changes

  • File: src/client/agents/http_agent.py
  • Lines: 212-222 (inference method)
  • Impact: 9 lines added (detection + extraction logic)

Effects

Enables vLLM Integration

  • AgentBench can now work directly with vLLM's OpenAI-compatible server
  • No need for custom wrappers or response transformation layers
  • Supports one of the most widely used open-source LLM inference engines

Eliminates Agent Validation Failures

  • Agents receive properly formatted text responses
  • Task validation logic works correctly
  • Benchmark results become meaningful and accurate

Backward Compatibility

  • Existing configurations continue to work without modification
  • Non-OpenAI API formats remain supported via fallback mechanism
  • Zero breaking changes for current users

Broader Compatibility

  • Works with any OpenAI-compatible inference server:
    • vLLM
    • Text Generation Inference (TGI)
    • LocalAI
    • LiteLLM
    • And others following OpenAI's response format

Implementation Details

The fix adds format-aware response parsing:

# Extract content from OpenAI-compatible API response (vLLM)
if isinstance(resp, dict) and "choices" in resp and len(resp["choices"]) > 0:
    message = resp["choices"][0].get("message", {})
    content = message.get("content", "")
    if content:
        return content

# Fallback to return_format if not OpenAI format
return self.return_format.format(response=resp)
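
As a self-contained sketch of how the two branches behave (the helper name, the sample responses, and the return_format value are made up for illustration, not taken from the repository):

def extract_reply(resp, return_format="{response}"):
    # Mirrors the logic above as a standalone function for demonstration
    if isinstance(resp, dict) and "choices" in resp and len(resp["choices"]) > 0:
        message = resp["choices"][0].get("message", {})
        content = message.get("content", "")
        if content:
            return content
    return return_format.format(response=resp)

# OpenAI-format response (e.g. from vLLM): the message text is extracted
print(extract_reply({"choices": [{"message": {"content": "Hello!"}}]}))      # Hello!
# Any other shape: the configured return_format is applied as before
print(extract_reply({"text": "Hello!"}, return_format="{response[text]}"))   # Hello!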

This approach ensures that:

  • OpenAI-format responses are properly parsed
  • Other formats continue working as before
  • No configuration changes are needed
  • The added code is short and clearly commented

Use Case

This fix is particularly important for researchers and practitioners who:

  • Use vLLM for efficient LLM serving
  • Want to benchmark open-source models with AgentBench
  • Need cost-effective alternatives to proprietary APIs
  • Require high-throughput inference for large-scale evaluations

Related

This addresses a common pain point when using AgentBench with modern open-source inference engines. The OpenAI API format has become the de facto standard for LLM serving, and this fix ensures AgentBench works seamlessly with that ecosystem.
