From 4099b1fe6b3297d2f6bf08de802a65b795d9084f Mon Sep 17 00:00:00 2001
From: Deeptanshu Singh
Date: Thu, 19 Feb 2026 14:46:03 -0500
Subject: [PATCH 1/3] Update README with token match rate on text backbone

---
 contrib/models/Qwen2.5-Omni-7B/README.md | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/contrib/models/Qwen2.5-Omni-7B/README.md b/contrib/models/Qwen2.5-Omni-7B/README.md
index 5f3d273..79e0703 100644
--- a/contrib/models/Qwen2.5-Omni-7B/README.md
+++ b/contrib/models/Qwen2.5-Omni-7B/README.md
@@ -12,6 +12,8 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.
 
 ## Architecture Details
 
+- **Type:** Multimodal (omni — vision, audio, text) model — text backbone validated only
+- **Text Backbone:** Decoder-only transformer (Qwen2-based)
 - **Layers:** Check model config
 - **Hidden Size:** Check model config
 - **Attention Heads:** Check model config
@@ -28,7 +30,7 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.
 | Test | Status | Result |
 |------|--------|--------|
 | Smoke Test | ✅ PASS | Model loads successfully |
-| Token Matching | ⚠️ N/A | **0.0% match** |
+| Token Matching | ✅ PASS | **100% match** (text backbone) |
 | TTFT (P50) | ✅ PASS | 50.15ms (threshold: 100ms) |
 | Throughput | ✅ PASS | 19.82 tok/s (threshold: 10 tok/s) |
 
@@ -39,9 +41,12 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.
 | TTFT (P50) | 50.15ms |
 | Throughput | 19.82 tokens/s |
 
-
 **Status:** ✅ VALIDATED
 
+### Multimodal Validation Notes
+
+Qwen2.5-Omni is a multimodal model supporting vision, audio, and text. The NeuronX port validates the text backbone only. `AutoModelForCausalLM` does not work for multimodal models — the specific text backbone class must be used to load the HF reference for token matching. Some multimodal configs may be missing attributes expected by the text backbone (e.g., `output_attentions`) and require config patching. With the correct text backbone extraction, the model achieves 100% token match.
+
 ## Usage
 
 ```python

From 0d087d2beefff7fec06d65a6a7120aa1e98d3b7c Mon Sep 17 00:00:00 2001
From: Deeptanshu Singh
Date: Mon, 23 Feb 2026 15:18:42 -0500
Subject: [PATCH 2/3] Removing local reference

---
 contrib/models/Qwen2.5-Omni-7B/src/modeling_qwen2_5_omni.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/contrib/models/Qwen2.5-Omni-7B/src/modeling_qwen2_5_omni.py b/contrib/models/Qwen2.5-Omni-7B/src/modeling_qwen2_5_omni.py
index 065407f..89e929c 100644
--- a/contrib/models/Qwen2.5-Omni-7B/src/modeling_qwen2_5_omni.py
+++ b/contrib/models/Qwen2.5-Omni-7B/src/modeling_qwen2_5_omni.py
@@ -141,7 +141,6 @@ def from_pretrained(cls, model_path: str, **kwargs) -> "Qwen2_5OmniInferenceConf
         if neuron_config is None:
             possible_paths = [
                 os.path.join(model_path, "neuron_config.json"),
-                "./agent_artifacts/data/qwen2_5_omni_compiled/neuron_config.json", # Compiled directory
                 "neuron_config.json", # Current directory
             ]
 

From cef88e4733d141cc6cede04099c8b36e1fd85a50 Mon Sep 17 00:00:00 2001
From: Deeptanshu Singh
Date: Thu, 26 Feb 2026 13:44:55 -0500
Subject: [PATCH 3/3] Removing internal names

---
 contrib/models/Qwen2.5-Omni-7B/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/models/Qwen2.5-Omni-7B/README.md b/contrib/models/Qwen2.5-Omni-7B/README.md
index 79e0703..998260e 100644
--- a/contrib/models/Qwen2.5-Omni-7B/README.md
+++ b/contrib/models/Qwen2.5-Omni-7B/README.md
@@ -111,6 +111,6 @@ python3 test/integration/test_model.py
 
 ## Maintainer
 
-Neuroboros Team - Annapurna Labs
+Annapurna Labs
 
 **Last Updated:** 2026-01-29
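
Review note on the "Multimodal Validation Notes" added in PATCH 1/3: the config patching mentioned there might look roughly like the sketch below. This is an illustration under stated assumptions, not the code used in this port. The helper name `patch_missing_attrs`, the `ASSUMED_DEFAULTS` table, and the stand-in config object are all hypothetical; only `output_attentions` is named in the README text, and the other two attribute names are placeholders chosen for the example.

```python
from types import SimpleNamespace

# Hypothetical defaults. Only `output_attentions` comes from the README note;
# the other entries are illustrative assumptions.
ASSUMED_DEFAULTS = {
    "output_attentions": False,
    "output_hidden_states": False,
    "use_return_dict": True,
}


def patch_missing_attrs(config, defaults=ASSUMED_DEFAULTS):
    """Set each default on `config` only when the attribute is absent.

    Existing values are never overwritten. Returns the names that were added,
    so the caller can log exactly what was patched.
    """
    added = []
    for name, value in defaults.items():
        if not hasattr(config, name):
            setattr(config, name, value)
            added.append(name)
    return added


# Stand-in for a text sub-config pulled out of a multimodal config.
text_config = SimpleNamespace(output_hidden_states=True)
added = patch_missing_attrs(text_config)
print("patched attributes:", added)
```

Patching only absent attributes keeps any value the checkpoint already defines, and returning the list of added names makes it easy to record which attributes the reference config lacked when reporting token-match results.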