Bug Report: Duplicate simultaneous requests to LM Studio with structured output
Description
When using Char with LM Studio as an OpenAI-compatible provider, two identical /v1/chat/completions requests are sent simultaneously when generating structured output: one includes response_format: { "type": "json_object" } and the other is identical except that it omits this parameter.
Expected Behavior
Only one request should be sent. The generateStructuredOutput function should attempt JSON mode first, and only fall back to plain text if the first request fails.
Actual Behavior
Both requests arrive at LM Studio within the same second, suggesting they're being sent concurrently rather than sequentially.
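Timing like this would be consistent with the fallback request being created eagerly instead of inside the catch handler. A minimal sketch of such an anti-pattern (hypothetical names, not the actual Char code):

```typescript
// Hypothetical anti-pattern that would produce two simultaneous requests:
// both promises are created up front, so both HTTP requests fire immediately,
// even though the fallback result is only used if the first attempt rejects.
async function buggyGenerate(
  send: (jsonMode: boolean) => Promise<string>,
): Promise<string> {
  const jsonAttempt = send(true);   // request with response_format
  const plainAttempt = send(false); // fallback request, fired at the same time
  try {
    return await jsonAttempt;
  } catch {
    return await plainAttempt;
  }
}
```

Because both promises exist before either is awaited, the server sees two requests within the same second regardless of whether the first one succeeds.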
Environment
- Provider: LM Studio (OpenAI-compatible)
- Model: qwen3.5-9b (supports tool use)
- Endpoint: /v1/chat/completions
- Char Version: v1.0.11-nightly.2
Steps to Reproduce
- Configure Char to use LM Studio with qwen3.5-9b model
- Trigger any feature that uses structured output (e.g., note enhancement)
- Observe LM Studio logs showing two simultaneous requests
Technical Details
The issue appears to be in the generateStructuredOutput function in enhance-workflow.ts. The function should:
- First attempt with Output.object({ schema })
- Only fall back to plain text in the catch block
However, both requests are being sent concurrently, indicating a possible race condition or issue with the AI SDK's handling of response_format for OpenAI-compatible providers.
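For reference, a minimal sketch of the intended sequential fallback, with a generic send helper standing in for the AI SDK call (hypothetical names; the real function uses Output.object({ schema }) in enhance-workflow.ts):

```typescript
// Hypothetical sketch: issue the fallback request only after the first fails.
type ChatRequest = {
  messages: string[];
  response_format?: { type: "json_object" };
};

async function generateWithFallback(
  send: (req: ChatRequest) => Promise<string>,
  messages: string[],
): Promise<string> {
  try {
    // First attempt: JSON mode via response_format.
    return await send({ messages, response_format: { type: "json_object" } });
  } catch {
    // Fallback: plain-text request, created only after the JSON-mode attempt rejects.
    return await send({ messages });
  }
}
```

With this shape, a model that supports JSON mode sends exactly one request, and the fallback request never exists unless the first one actually fails.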
Impact
- Duplicate inference requests waste computational resources
- Unnecessary load on local inference server
- Potential for race conditions in AI responses
Additional Context
The fallback mechanism is intentional for models that don't support structured output, but qwen3.5-9b does support tool use, so the first request should succeed.