Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions contrib/models/Qwen2.5-Omni-7B/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.

## Architecture Details

- **Type:** Multimodal (omni — vision, audio, text) model — text backbone validated only
- **Text Backbone:** Decoder-only transformer (Qwen2-based)
- **Layers:** Check model config
- **Hidden Size:** Check model config
- **Attention Heads:** Check model config
Expand All @@ -28,7 +30,7 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.
| Test | Status | Result |
|------|--------|--------|
| Smoke Test | ✅ PASS | Model loads successfully |
| Token Matching | ⚠️ N/A | **0.0% match** |
| Token Matching | ✅ PASS | **100% match** (text backbone) |
| TTFT (P50) | ✅ PASS | 50.15ms (threshold: 100ms) |
| Throughput | ✅ PASS | 19.82 tok/s (threshold: 10 tok/s) |

Expand All @@ -39,9 +41,12 @@ NeuronX Distributed Inference implementation of Qwen2.5 Omni 7B.
| TTFT (P50) | 50.15ms |
| Throughput | 19.82 tokens/s |


**Status:** ✅ VALIDATED

### Multimodal Validation Notes

Qwen2.5-Omni is a multimodal model supporting vision, audio, and text. The NeuronX port validates the text backbone only. `AutoModelForCausalLM` does not work for multimodal models — the specific text backbone class must be used to load the HF reference for token matching. Some multimodal configs may be missing attributes expected by the text backbone (e.g., `output_attentions`) and require config patching. With the correct text backbone extraction, the model achieves 100% token match.

## Usage

```python
Expand Down Expand Up @@ -106,6 +111,6 @@ python3 test/integration/test_model.py

## Maintainer

Neuroboros Team - Annapurna Labs
Annapurna Labs

**Last Updated:** 2026-01-29
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,6 @@ def from_pretrained(cls, model_path: str, **kwargs) -> "Qwen2_5OmniInferenceConf
if neuron_config is None:
possible_paths = [
os.path.join(model_path, "neuron_config.json"),
"./agent_artifacts/data/qwen2_5_omni_compiled/neuron_config.json", # Compiled directory
"neuron_config.json", # Current directory
]

Expand Down