A compilation of resources (model profiles, benchmarks, docs) for multimodal AI models with audio understanding, with a focus on ASR and transcription use cases
Updated Dec 8, 2025
MCP for Gemini multimodal audio transcription with built-in post-processing