feat: add multi-provider audio transcription support #134
Summary
This PR adds multi-provider audio transcription support to HUF using LiteLLM, following the same pattern as image generation. Agents can now transcribe audio files using OpenAI, Groq, Deepgram, Azure, and other providers through a unified tool interface.
Features
Core Functionality
Technical Implementation
handle_transcribe_audio() in sdk_tools.py
create_transcribe_audio_tool() with auto-update support
Implementation Details
Tool Parameters
file_id
file_url (e.g. /files/audio.mp3)
language (e.g. en, es, fr, etc.)
model
*At least one of file_id or file_url is required
Default Models by Provider
{
  "openai": "whisper-1",
  "azure": "whisper-1",
  "groq": "groq/whisper-large-v3",
  "deepgram": "deepgram/nova-2",
  "default": "whisper-1"
}
Request Flow
litellm.transcription() with file path
new_agent_message for real-time UI update
Response Format
{
  "success": true,
  "text": "Transcribed audio content...",
  "file_id": "abc123",
  "file_url": "/files/audio.mp3",
  "language": "en",
  "model": "whisper-1",
  "message_id": "msg-xyz",
  "conversation_id": "conv-123"
}
Usage Examples
Example 1: Basic Transcription
Example 2: With Language Hint
Example 3: With Model Override
Example 4: Agent Workflow
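As a rough illustration of the tool parameters and provider defaults described above, the sketch below validates the arguments and picks a model. The helper names `resolve_model` and `build_request` are hypothetical and are not taken from the PR. The only details drawn from the description are the default-model table, the parameter names, and the rule that at least one of `file_id` or `file_url` must be supplied.

```python
from typing import Any, Dict, Optional

# Default models per provider, copied from the PR description.
DEFAULT_MODELS: Dict[str, str] = {
    "openai": "whisper-1",
    "azure": "whisper-1",
    "groq": "groq/whisper-large-v3",
    "deepgram": "deepgram/nova-2",
    "default": "whisper-1",
}


def resolve_model(provider: Optional[str], model: Optional[str] = None) -> str:
    """Return an explicit model override if given, else the provider default.

    Hypothetical helper: unknown or missing providers fall back to "default".
    """
    if model:
        return model
    key = (provider or "").lower()
    return DEFAULT_MODELS.get(key, DEFAULT_MODELS["default"])


def build_request(
    provider: Optional[str] = None,
    file_id: Optional[str] = None,
    file_url: Optional[str] = None,
    language: Optional[str] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate tool arguments and assemble a request dict (assumed shape)."""
    if not file_id and not file_url:
        raise ValueError("At least one of file_id or file_url is required")
    request: Dict[str, Any] = {
        "model": resolve_model(provider, model),
        "file_id": file_id,
        "file_url": file_url,
    }
    # The language hint is optional, so only include it when provided.
    if language:
        request["language"] = language
    return request


if __name__ == "__main__":
    # Basic transcription, with a language hint, and with a model override.
    print(build_request(provider="openai", file_url="/files/audio.mp3"))
    print(build_request(provider="groq", file_id="abc123", language="es"))
    print(build_request(provider="openai", file_id="abc123", model="deepgram/nova-2"))
```

In this sketch an explicit `model` always wins over the provider default. That is an assumption about the intended precedence rather than something the description states.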
Comparison with Image Generation
litellm.image_generation()
litellm.transcription()
generated_image field
Testing
Tested Scenarios
file_id
file_url
file_name (fallback)
Test Commands
Migration Notes
For Existing Installations:
For New Installations:
Tool is automatically created during bench install-app huf.
Checklist