feat: Add dockerfiles for strixhalo AMD APU #312

Merged
AnkushMalaker merged 10 commits into SimpleOpenSoftware:dev from 0xrushi:fix/addstrixhalo2
Feb 27, 2026

Conversation

Contributor

@0xrushi 0xrushi commented Feb 24, 2026


Summary by CodeRabbit

Release Notes

  • New Features

    • Added AMD Strix Halo (ROCm) hardware support for ASR services, including optimized VibeVoice and NeMo providers.
    • Added speaker recognition service variant for AMD Strix Halo.
    • New hardware profile selection in setup wizard enabling users to choose between standard (CPU/NVIDIA) and AMD Strix Halo paths.
  • Bug Fixes

    • Improved conversation end reason tracking and persistence.
  • Tests

    • Added test coverage for Strix Halo hardware configurations and new ASR provider variants.
  • Chores

    • Updated Docker configurations and dependency management for ROCm support.

Contributor

coderabbitai bot commented Feb 24, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This pull request introduces support for AMD Strix Halo ROCm GPUs across ASR and speaker recognition services. Changes include new Docker services and Dockerfiles with ROCm configuration, CLI parameters for hardware profile selection, dependency groups for Strix Halo variants, and updates to the service orchestration wizard to handle hardware-specific setup paths.

Changes

Cohort / File(s) — Summary
Docker Compose Infrastructure
backends/advanced/docker-compose-test.yml, extras/asr-services/docker-compose.yml, extras/speaker-recognition/docker-compose.yml
Port mapping updated to explicit publish config in test service; added two new ROCm-enabled ASR services (vibevoice-asr-strixhalo, nemo-asr-strixhalo) with device access and environment configuration; introduced speaker-service-strixhalo with GPU device binding and Strix Halo-specific environment variables.
Strix Halo Dockerfiles
extras/asr-services/providers/nemo/Dockerfile.strixhalo, extras/asr-services/providers/vibevoice/Dockerfile.strixhalo, extras/speaker-recognition/Dockerfile.strixhalo
Multi-stage builds targeting gfx1151 with ROCm PyTorch installation via uv, system dependency provisioning, and runtime configuration for Strix Halo environments.
ASR Services Configuration
extras/asr-services/init.py, extras/asr-services/pyproject.toml
Added two new provider variants (vibevoice-strixhalo, nemo-strixhalo) with capabilities and model mappings; introduced nemo-rocm and strixhalo dependency groups; extended CLI to accept new providers and ROCm-specific options; added setup logic for hardware-specific configuration.
Speaker Recognition Configuration
extras/speaker-recognition/init.py, extras/speaker-recognition/pyproject.toml
Introduced --pytorch-cuda-version parameter with strixhalo variant support; added conditional dependencies and git-based pyannote.audio for strixhalo extra; updated Python version constraint and GPU/CPU mode detection logic.
Service Orchestration
services.py, wizard.py
Added strixhalo profile mapping and provider-to-service routes for new ASR variants in services.py; introduced select_hardware_profile() function and hardware_profile parameter in wizard.py for user-driven hardware selection during setup; integrated hardware profile branching into run_service_setup for ASR and speaker recognition services.
Test Coverage
extras/asr-services/tests/test_cuda_version_config.py, tests/unit/test_wizard_strixhalo.py, tests/resources/plugin_keywords.robot
Added fixtures and validation tests for Strix Halo Dockerfile PyPI index configuration; introduced comprehensive unit test module exercising wizard workflows for strixhalo providers, hardware profiles, and embeddings configurations; updated test plugin config path reference.
Backend Worker Logic
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
Added end_reason persistence logic to open_conversation_job, fetching the Conversation by ID and assigning end_reason from the determined value before the post-processing flow.
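The fallback behavior described for end_reason can be sketched independently of the Beanie model. This is a minimal illustration only: the EndReason members shown here are assumptions, not the backend's actual values.

```python
from enum import Enum


class EndReason(Enum):
    """Hypothetical end-reason enum; the real members live in the backend models."""
    TIMEOUT = "timeout"
    USER_STOPPED = "user_stopped"
    UNKNOWN = "unknown"


def coerce_end_reason(raw: str) -> EndReason:
    """Map a raw string to EndReason, falling back to UNKNOWN on invalid input."""
    try:
        return EndReason(raw)
    except ValueError:
        # Mirrors the warn-and-default pattern discussed in the review:
        # an unrecognized value should not abort conversation persistence.
        return EndReason.UNKNOWN
```

The point of the try/except is that a bad stored value degrades to UNKNOWN instead of raising out of the worker job.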

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning — Docstring coverage is 57.14%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Description Check — ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
Title Check — ✅ Passed — The title 'feat: Add dockerfiles for strixhalo AMD APU' clearly and concisely summarizes the primary change: adding Docker build configurations for AMD Strix Halo support.


@0xrushi 0xrushi marked this pull request as draft February 24, 2026 22:16
@AnkushMalaker AnkushMalaker marked this pull request as ready for review February 24, 2026 22:18
@AnkushMalaker AnkushMalaker marked this pull request as draft February 24, 2026 22:18
@0xrushi 0xrushi changed the title feat: Add dockerfiles for strixhalo AMD APU feat: Add dockerfiles for strixhalo AMD APU and Embedding model for OpenMemory Feb 24, 2026
@0xrushi 0xrushi changed the title feat: Add dockerfiles for strixhalo AMD APU and Embedding model for OpenMemory feat: Add dockerfiles for strixhalo AMD APU Feb 24, 2026
@0xrushi 0xrushi marked this pull request as ready for review February 24, 2026 23:17
@AnkushMalaker
Collaborator

@coderabbitai review

Contributor

coderabbitai bot commented Feb 24, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (8)
extras/asr-services/providers/vibevoice/Dockerfile.strixhalo (1)

44-46: Trim compiler toolchain from runtime image.

Line 45 installs gcc and libc6-dev in the runtime layer even though build steps happen in the builder stage.

Suggested runtime package trim
 RUN apt-get update && apt-get install -y --no-install-recommends \
-        curl ffmpeg git libsndfile1 gcc libc6-dev \
+        curl ffmpeg git libsndfile1 \
     && rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/providers/vibevoice/Dockerfile.strixhalo` around lines 44
- 46, The runtime image currently installs build toolchain packages (gcc,
libc6-dev) in the RUN apt-get step; remove those packages from the runtime
install and ensure any compilation happens in the builder stage (or add them
only to the builder stage). Modify the RUN apt-get install line that lists curl
ffmpeg git libsndfile1 gcc libc6-dev so it only installs the runtime packages
(curl ffmpeg git libsndfile1) and move gcc and libc6-dev to the builder stage
Dockerfile RUN/apt-get step where source/build commands run.
extras/asr-services/providers/nemo/Dockerfile.strixhalo (1)

45-47: Avoid build toolchain in runtime layer.

Line 46 includes build-essential in runtime; this usually belongs only in builder images.

Suggested runtime package trim
 RUN apt-get update && apt-get install -y --no-install-recommends \
-        libsndfile1 build-essential portaudio19-dev curl \
+        libsndfile1 portaudio19-dev curl \
     && rm -rf /var/lib/apt/lists/*
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/providers/nemo/Dockerfile.strixhalo` around lines 45 -
47, The Dockerfile RUN installs build tooling (build-essential) into the runtime
image; remove that from the runtime layer by moving installation of
build-essential and any compilation steps into a build stage (multi-stage build)
or install it only in a temporary builder image and copy artifacts into the
final image, and ensure the RUN line that currently lists build-essential is
updated to exclude "build-essential" so the final image only contains runtime
packages like libsndfile1, portaudio19-dev, curl, etc.; update any related build
steps to occur in a stage that references the builder stage artifacts.
extras/speaker-recognition/Dockerfile.strixhalo (1)

38-41: Pin git install to a fixed ref for reproducible builds.

Line 40 installs pyannote.audio from moving HEAD, so image contents can change without code changes.

Suggested pinning pattern
+ARG PYANNOTE_AUDIO_REF=<commit-sha>
@@
     if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then \
         uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps \
-            "git+https://github.com/pyannote/pyannote-audio.git"; \
+            "git+https://github.com/pyannote/pyannote-audio.git@${PYANNOTE_AUDIO_REF}"; \
     fi; \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/speaker-recognition/Dockerfile.strixhalo` around lines 38 - 41, The
Dockerfile is installing pyannote.audio from the moving HEAD which breaks
reproducible builds; change the git URL used in the uv pip install line (the
"git+https://github.com/pyannote/pyannote-audio.git" reference) to pin to a
fixed tag or commit (e.g. append @<tag-or-commit> to the URL) so the image
always installs the same version when PYTORCH_CUDA_VERSION equals "strixhalo";
update the install string in that conditional to use the chosen immutable ref.
extras/speaker-recognition/pyproject.toml (1)

107-114: Consider pinning pyannote.audio git source to an explicit revision for robustness.

The pyproject.toml doesn't pin a specific commit for the git dependency. While the uv.lock file currently locks the resolved commit (78c0d16a5f558adb1024d112e6857b3204b450b3), explicitly declaring the revision in the source definition makes the intent clear and protects against unexpected changes if the lock is regenerated or the project is built without it.

Suggested pinning approach
-"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", extra = "strixhalo" }
+"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", rev = "78c0d16a5f558adb1024d112e6857b3204b450b3", extra = "strixhalo" }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/speaker-recognition/pyproject.toml` around lines 107 - 114, The
pyproject.toml git dependency for "pyannote.audio" is unpinned; update the
dependency entry for "pyannote.audio" (the line with git =
"https://github.com/pyannote/pyannote-audio.git", extra = "strixhalo") to
include an explicit revision (e.g., add rev =
"78c0d16a5f558adb1024d112e6857b3204b450b3") so the git source is pinned to a
specific commit; make this change in the pyproject.toml dependency definition
for "pyannote.audio" to ensure reproducible installs.
services.py (1)

187-194: Consider extracting the duplicated profile derivation logic into a helper.

The same PYTORCH_CUDA_VERSION → profile mapping appears in both the build path (Lines 187-194) and the up/down path (Lines 298-305). A small helper would keep them in sync.

♻️ Suggested helper
+def _derive_speaker_profile(env_values: dict) -> str:
+    """Derive docker-compose profile from PYTORCH_CUDA_VERSION."""
+    pytorch_version = env_values.get("PYTORCH_CUDA_VERSION", "cpu")
+    if pytorch_version == "strixhalo":
+        return "strixhalo"
+    if pytorch_version.startswith("cu"):
+        return "gpu"
+    return "cpu"

Then replace both occurrences with:

profile = _derive_speaker_profile(env_values)

Also applies to: 298-305

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services.py` around lines 187 - 194, The PYTORCH_CUDA_VERSION → profile
mapping is duplicated; extract it into a single helper (e.g., def
_derive_speaker_profile(env_values):) that reads
env_values.get("PYTORCH_CUDA_VERSION", "cpu") and returns "strixhalo" if value
== "strixhalo", "gpu" if value.startswith("cu"), otherwise "cpu"; then replace
both existing blocks in services.py (the build path block around variables
pytorch_version/profile and the up/down path blocks) with profile =
_derive_speaker_profile(env_values) so both locations use the same logic.
extras/asr-services/init.py (1)

242-261: Strixhalo choices are listed before the standard non-ROCm providers.

Choices 2 and 3 are strixhalo variants, pushing standard qwen3-asr to 4 and nemo to 5. For the majority of users (NVIDIA/CPU), the most common choices are now further down the list. This is a UX consideration — if most users don't have Strix Halo hardware, they'll need to scroll past niche options.

Consider moving the strixhalo choices to the end (e.g., choices 6 and 7), keeping the common providers at the top.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/init.py` around lines 242 - 261, The provider ordering
places StrixHalo options ("vibevoice-strixhalo", "nemo-strixhalo") before common
non-ROCm providers, degrading UX; update the provider_choices and
choice_to_provider mappings so common providers (e.g., "vibevoice", "qwen3-asr",
"nemo", "transformers", "faster-whisper") appear first and move
"vibevoice-strixhalo" and "nemo-strixhalo" to the end of both dictionaries, then
ensure the default passed to prompt_choice("Choose ASR provider:",
provider_choices, "1") still points to the intended common provider.
wizard.py (1)

899-943: streaming_provider in strix_capable_providers will never be true in practice.

strix_capable_providers contains {"parakeet", "vibevoice"}, which are both batch-only (not in STREAMING_CAPABLE at line 689). streaming_provider is only set from select_streaming_provider which offers deepgram, smallest, and qwen3-asr. The check is harmless but unreachable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wizard.py` around lines 899 - 943, The condition in select_hardware_profile
uses streaming_provider in strix_capable_providers which is unreachable because
parakeet/vibevoice are batch-only; remove streaming_provider from the
needs_hardware_choice expression or replace it with a capability-aware check
(e.g., test that streaming_provider is in strix_capable_providers AND in
STREAMING_CAPABLE) so the logic only considers providers that can actually be
streaming-capable; update references to strix_capable_providers,
streaming_provider, select_streaming_provider, and STREAMING_CAPABLE
accordingly.
tests/unit/test_wizard_strixhalo.py (1)

1-65: Module-level wizard loading is effective but fragile across test suites.

The sys.modules.setdefault approach avoids overwriting real modules, but if this file happens to be imported first in a pytest session, the stubs will persist for later test modules that expect the real rich or setup_utils. Consider scoping the wizard load inside a pytest.fixture(scope="module") that also handles cleanup, or at least documenting the isolation assumption.

That said, for a dedicated test module this works fine in practice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_wizard_strixhalo.py` around lines 1 - 65, The test currently
loads the wizard at module import time using _load_wizard_module (which calls
_install_rich_stubs and _install_setup_utils_stub) and mutates
sys.modules/sys.path, which can leak stubs into other tests; change this to a
pytest fixture (e.g. fixture function named wizard_module with scope="module")
that performs the stub installs, inserts repo_root into sys.path, imports the
wizard via _load_wizard_module logic, yields the loaded module (replace global
wizard usage), and then in teardown removes the inserted sys.path entry and
deletes the stub entries from sys.modules ("rich", "rich.console",
"rich.prompt", "setup_utils") to avoid cross-test contamination. Ensure the
fixture name replaces any direct references to the module in the tests.
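The cleanup discipline the comment asks for can be sketched as a context manager (a module-scoped pytest fixture would simply wrap it and yield inside the `with`). The module names are illustrative; the point is installing stubs only where no real module is loaded and removing exactly those stubs afterward:

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def stubbed_modules(*names: str):
    """Install empty module stubs for missing modules, remove them on exit.

    Real, already-imported modules are never overwritten, and only the stubs
    this context installed are deleted, avoiding cross-test contamination.
    """
    installed = []
    for name in names:
        if name not in sys.modules:
            sys.modules[name] = types.ModuleType(name)
            installed.append(name)
    try:
        yield
    finally:
        for name in installed:
            sys.modules.pop(name, None)
```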
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py`:
- Around line 706-720: The early DB I/O (Conversation.find_one / setting
end_reason / conversation_for_end_reason.save) is executed before the
try/finally that invokes handle_end_of_conversation(), so failures can skip
cleanup; move the retrieval and persistence of conversation_for_end_reason (the
Conversation.find_one call, assigning Conversation.EndReason or
EndReason.UNKNOWN, and await conversation_for_end_reason.save()) inside the try
block that surrounds handle_end_of_conversation(), and if converting end_reason
raises ValueError keep the existing logger.warning but still allow any
exceptions from find_one/save to propagate (or be caught and logged) so the
finally block always runs to call handle_end_of_conversation().

In `@extras/asr-services/init.py`:
- Around line 524-530: Remove the unnecessary f-string prefixes flagged by Ruff
F541 in the console output calls: update the two self.console.print calls inside
the init logic (the ones printing "[yellow][WARNING][/yellow] defaults.yml not
found, cannot add missing models" and "[blue][INFO][/blue] Model definitions
already present in config.yml") to plain string literals without the leading f;
keep the messages and formatting intact and only drop the leading f before the
opening quote.

In `@extras/asr-services/providers/nemo/Dockerfile.strixhalo`:
- Around line 41-65: The runtime stage runs as root—harden by creating a
non-root user and switching to it before CMD: in the runtime stage (FROM
python:3.12-slim-bookworm AS runtime) add steps to create a dedicated user/group
(e.g., appuser), chown the WORKDIR (/app) and any copied files (providers/nemo/,
common/, site-packages) to that user, set USER to that non-root account, and
ensure CMD ["python", "-m", "providers.nemo.service", "--port", "8765"] runs
under that user; reference the Dockerfile runtime stage, WORKDIR /app, the COPY
lines, and the CMD when making these changes.

In `@extras/asr-services/providers/vibevoice/Dockerfile.strixhalo`:
- Around line 39-72: The Dockerfile's runtime stage runs the service as root
because there is no USER directive; create a non-root user and switch to it in
the runtime stage: add steps in the runtime stage (after copying files and
creating /models) to create a user/group (e.g., vibeuser), create a
home/workdir, chown the application directories (/app and /models) to that user,
and add a USER vibeuser directive before the CMD so providers.vibevoice.service
runs unprivileged.

In `@extras/speaker-recognition/docker-compose.yml`:
- Around line 63-90: The strixhalo profile currently only enables
speaker-service-strixhalo, which means webui and caddy won't start when using
--profile strixhalo; update the docker-compose profiles for the webui and caddy
services (the services named "webui" and "caddy") to include "strixhalo" in
their profiles array so they launch together, or alternatively add a short note
in your README explaining that "strixhalo" is intended as a speaker-service-only
profile and that webui/caddy require the "cpu" or "gpu" profiles to run; modify
the "profiles" entries for the webui and caddy service definitions (or add the
documentation note) accordingly.

In `@extras/speaker-recognition/Dockerfile.strixhalo`:
- Around line 1-65: The image runs the service as root because the Dockerfile
lacks a USER directive; create a non-root user and switch to it before CMD
(e.g., add a user/group like app with a fixed UID/GID, chown important dirs
/app, /models, /app/audio_chunks, /app/debug, /app/data to that user, and then
add USER app) so /opt/venv/bin/simple-speaker-service and runtime files are
executed without root privileges; ensure the added user can access VIRTUAL_ENV
(/opt/venv) and LF_LIBRARY_PATH locations or drop to the non-root user after
setting up permissions so the CMD runs as that user.


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 06367a2 and 31abd5d.

⛔ Files ignored due to path filters (2)
  • extras/asr-services/uv.lock is excluded by !**/*.lock
  • extras/speaker-recognition/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • backends/advanced/docker-compose-test.yml
  • backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
  • extras/asr-services/docker-compose.yml
  • extras/asr-services/init.py
  • extras/asr-services/providers/nemo/Dockerfile.strixhalo
  • extras/asr-services/providers/vibevoice/Dockerfile.strixhalo
  • extras/asr-services/pyproject.toml
  • extras/asr-services/tests/test_cuda_version_config.py
  • extras/speaker-recognition/Dockerfile.strixhalo
  • extras/speaker-recognition/docker-compose.yml
  • extras/speaker-recognition/init.py
  • extras/speaker-recognition/pyproject.toml
  • services.py
  • tests/resources/plugin_keywords.robot
  • tests/unit/test_wizard_strixhalo.py
  • wizard.py

Comment on lines +524 to +530
                    self.console.print(
                        f"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
                    )
            else:
                self.console.print(
                    f"[blue][INFO][/blue] Model definitions already present in config.yml"
                )

⚠️ Potential issue | 🟡 Minor

Remove extraneous f prefix from strings without placeholders.

Flagged by Ruff (F541). These are plain strings that don't need f-string interpolation.

🔧 Proposed fix
-                    self.console.print(
-                        f"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
-                    )
+                    self.console.print(
+                        "[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
+                    )
             else:
-                self.console.print(
-                    f"[blue][INFO][/blue] Model definitions already present in config.yml"
-                )
+                self.console.print(
+                    "[blue][INFO][/blue] Model definitions already present in config.yml"
+                )
🧰 Tools
🪛 Ruff (0.15.2)

[error] 525-525: f-string without any placeholders

Remove extraneous f prefix

(F541)


[error] 529-529: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/asr-services/init.py` around lines 524 - 530, Remove the unnecessary
f-string prefixes flagged by Ruff F541 in the console output calls: update the
two self.console.print calls inside the init logic (the ones printing
"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
and "[blue][INFO][/blue] Model definitions already present in config.yml") to
plain string literals without the leading f; keep the messages and formatting
intact and only drop the leading f before the opening quote.

Comment on lines +63 to +90
  # AMD Strix Halo (gfx1151 / Ryzen AI Max) Profile
  speaker-service-strixhalo:
    <<: *base-speaker-service
    profiles: ["strixhalo"]
    build:
      context: .
      dockerfile: Dockerfile.strixhalo
    image: speaker-recognition-strixhalo:latest
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    group_add:
      # Use numeric GIDs - group names (video/render) don't exist in vllm-therock image.
      # Find your system's GIDs with: getent group video render
      - "${VIDEO_GID:-985}" # video group (Arch default; Ubuntu typically 44)
      - "${RENDER_GID:-989}" # render group (Arch default; Ubuntu typically 107)
    environment:
      - HF_HOME=/models
      - HF_TOKEN=${HF_TOKEN}
      - SIMILARITY_THRESHOLD=${SIMILARITY_THRESHOLD:-0.15}
      - SPEAKER_SERVICE_HOST=${SPEAKER_SERVICE_HOST:-0.0.0.0}
      - SPEAKER_SERVICE_PORT=${SPEAKER_SERVICE_PORT:-8085}
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
      - HSA_OVERRIDE_GFX_VERSION=11.5.1
    networks:
      default:
        aliases:
          - speaker-service

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== All declared compose profiles ==="
rg -n 'profiles:\s*\[' extras/speaker-recognition/docker-compose.yml | head -20

echo ""
echo "=== Services with their profiles (full list) ==="
rg -B 2 'profiles:\s*\[' extras/speaker-recognition/docker-compose.yml | grep -E '^\s*[a-z-]+:|profiles:' | head -40

echo ""
echo "=== Checking for UI/proxy/other services ==="
rg -n '^\s*[a-z-]+:' extras/speaker-recognition/docker-compose.yml | head -30

echo ""
echo "=== How compose is launched ==="
find extras/speaker-recognition -type f \( -name "*.sh" -o -name "*.md" -o -name "*.yml" -o -name "*.yaml" \) | xargs rg -l 'docker.*compose.*up|--profile|COMPOSE_PROFILES' 2>/dev/null || echo "No matches found"

Repository: SimpleOpenSoftware/chronicle

Length of output: 1479


🏁 Script executed:

#!/bin/bash

echo "=== quickstart.md (how to launch) ==="
cat extras/speaker-recognition/quickstart.md

echo ""
echo "=== README.md (profile usage guidance) ==="
head -100 extras/speaker-recognition/README.md

echo ""
echo "=== Check for strixhalo-specific launch instructions ==="
rg -i 'strixhalo' extras/speaker-recognition/README* extras/speaker-recognition/quickstart.md -A 3 -B 1 | head -50

Repository: SimpleOpenSoftware/chronicle

Length of output: 9291


Add webui and caddy to strixhalo profile, or document that strixhalo is speaker-service-only.

The speaker-service-strixhalo profile on line 66 exists, but webui (line 98) and caddy (line 125) are limited to ["cpu", "gpu"] profiles. If launched with docker compose --profile strixhalo up, the web UI and reverse proxy will not start, breaking the full-stack system. Either include strixhalo in both services' profile lists, or clarify in documentation that strixhalo is intended for speaker-service-only deployment without the UI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/speaker-recognition/docker-compose.yml` around lines 63 - 90, The
strixhalo profile currently only enables speaker-service-strixhalo, which means
webui and caddy won't start when using --profile strixhalo; update the
docker-compose profiles for the webui and caddy services (the services named
"webui" and "caddy") to include "strixhalo" in their profiles array so they
launch together, or alternatively add a short note in your README explaining
that "strixhalo" is intended as a speaker-service-only profile and that
webui/caddy require the "cpu" or "gpu" profiles to run; modify the "profiles"
entries for the webui and caddy service definitions (or add the documentation
note) accordingly.
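
One way to apply the first option from that finding, sketched here rather than taken from the PR's actual diff, is to widen the existing profiles lists so all three services start together:

```yaml
# Hypothetical edit to extras/speaker-recognition/docker-compose.yml:
# let webui and caddy also start under "docker compose --profile strixhalo up".
webui:
  profiles: ["cpu", "gpu", "strixhalo"]

caddy:
  profiles: ["cpu", "gpu", "strixhalo"]
```

Compose starts a profiled service only when at least one of its listed profiles is activated, so any service whose list omits "strixhalo" stays down under that flag.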

Comment on lines 1 to 65
FROM docker.io/kyuz0/vllm-therock-gfx1151:sha-039484a

ARG PYTORCH_CUDA_VERSION=strixhalo
ENV PYTORCH_CUDA_VERSION=${PYTORCH_CUDA_VERSION}

# Install system dependencies
RUN set -eux; \
    dnf -y install \
        gcc gcc-c++ make git ffmpeg curl libjpeg-turbo-devel zlib-devel libpng-devel; \
    dnf -y clean all

WORKDIR /app

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

# Copy dependency files first (for better caching)
COPY pyproject.toml uv.lock ./

# Use the base image's prebuilt venv (contains torch stack for gfx1151)
RUN test -x /opt/venv/bin/python
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies, but never install/override torch/torchvision/torchaudio in this image.
# We avoid `uv sync` here because it recreates the venv and drops `--system-site-packages`.
# For strixhalo, we install pyannote.audio from git with --no-deps separately.
RUN set -eux; \
    uv export --frozen --format requirements.txt --no-dev --no-hashes \
        --extra ${PYTORCH_CUDA_VERSION} --no-emit-project \
        --prune torch --prune torchvision --prune torchaudio \
        --prune rocm --prune rocm-sdk-core --prune rocm-sdk-devel --prune rocm-sdk-libraries-gfx1151 \
        --prune torchcodec \
        $(if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then echo "--prune pyannote.audio"; fi) \
        --output-file /tmp/requirements.txt; \
    uv pip install --python /opt/venv/bin/python --no-managed-python --prerelease=if-necessary-or-explicit \
        --requirements /tmp/requirements.txt; \
    if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then \
        uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps \
            "git+https://github.com/pyannote/pyannote-audio.git"; \
    fi; \
    uv cache clean

# Copy the full source code (after dependencies are cached)
COPY src/ src/

# Install the project (editable so the compose bind-mount workflow still works)
RUN uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps --editable .

# Verify we can import the base image's torch stack (and that torchvision ops are present)
RUN python -c "import torch, torchvision; print('torch', torch.__version__, torch.__file__); print('torchvision', torchvision.__version__, torchvision.__file__)"

# Create directories
RUN mkdir -p /app/audio_chunks /app/debug /app/data /models

# Set environment variables
ENV HF_HOME=/models
ENV PYTHONPATH=/app
ENV LD_LIBRARY_PATH=/opt/rocm/lib:/opt/rocm/lib/host-math/lib:/opt/rocm/lib/rocm_sysdeps/lib:/opt/venv/lib/python3.13/site-packages/torch/lib:/usr/lib64:${LD_LIBRARY_PATH}

# Expose port
EXPOSE 8085

# Run the service
CMD ["/opt/venv/bin/simple-speaker-service"]
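
The `--prune` flags in the export step exist so the generated requirements file cannot drag the torch stack back in over the base image's prebuilt gfx1151 wheels. As a rough illustration only (not what `uv export --prune` literally does internally), the effect on the requirements list resembles a line filter:

```shell
# Toy demo: drop the GPU stack from an exported requirements list so the
# base image's prebuilt wheels survive. File path and pins are made up.
printf 'torch==2.5.0\ntorchaudio==2.5.0\nnumpy==2.1.0\nrequests==2.32.0\n' > /tmp/reqs.txt
grep -vE '^(torch|torchvision|torchaudio)([=<>]|$)' /tmp/reqs.txt
# Prints: numpy==2.1.0
#         requests==2.32.0
```

The real flag prunes at the resolution level rather than by text matching, so packages required only by the pruned ones are dropped too, which is why the Dockerfile uses it instead of post-filtering the exported file.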

⚠️ Potential issue | 🟠 Major

Container should not run as root.

There is no USER directive, so the service runs as root throughout runtime.

Proposed hardening patch
 # Create directories
 RUN mkdir -p /app/audio_chunks /app/debug /app/data /models
+
+RUN useradd --create-home --uid 10001 appuser \
+    && chown -R appuser:appuser /app /models
+USER appuser
 
 # Set environment variables
 ENV HF_HOME=/models
🧰 Tools
🪛 Trivy (0.69.1)

[error] 1-1: Image user should not be 'root'

Specify at least 1 USER command in Dockerfile with non-root user as argument

Rule: DS-0002

Learn more

(IaC/Dockerfile)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extras/speaker-recognition/Dockerfile.strixhalo` around lines 1 - 65, The
image runs the service as root because the Dockerfile lacks a USER directive;
create a non-root user and switch to it before CMD (e.g., add a user/group like
app with a fixed UID/GID, chown important dirs /app, /models, /app/audio_chunks,
/app/debug, /app/data to that user, and then add USER app) so
/opt/venv/bin/simple-speaker-service and runtime files are executed without root
privileges; ensure the added user can access VIRTUAL_ENV (/opt/venv) and
LD_LIBRARY_PATH locations, or drop to the non-root user after setting up
permissions so the CMD runs as that user.

0xrushi and others added 6 commits February 24, 2026 20:10
…cies

- Added user management for `appuser` in the Nemo and VibeVoice Dockerfiles, enhancing security by running services as non-root users.
- Included `sudo` in the installation of runtime dependencies across multiple Dockerfiles to facilitate user permissions.
- Updated `pyproject.toml` to specify a new dependency on `av` for audio processing.
- Enhanced `uv.lock` with additional package resolution markers for better compatibility across different environments.
- Incremented revision number in `uv.lock` to 3 and added new dependencies: `openinference-instrumentation-openai`, `opentelemetry-api`, and `opentelemetry-sdk`.
- Modified `docker-compose.yml` to allow dynamic device and dtype configuration via environment variables.
- Updated `pyproject.toml` to include `einops` as a required dependency.
- Refactored `Dockerfile.strixhalo` for improved dependency management and installation process.
- Enhanced error handling in `transcriber.py` to validate device configuration and prevent runtime failures.
- Cleaned up API documentation in `__init__.py` and improved logging in `identification.py` for better traceability.
- Added a new Docker Compose file `docker-compose-test-strixhalo.yml` for testing the Parakeet ASR service with ROCm support.
- Updated `.gitignore` to exclude model cache files specific to Strix Halo tests.
- Modified `uv.lock` to include `einops` as a new dependency for improved functionality.
- Refactored `Dockerfile.strixhalo` for better dependency management and installation process.
- Updated test script to support dynamic configuration of service URLs and timeouts, enhancing test flexibility.
- Added a new Makefile target `test-asr-strix-parakeet` for running Strix Halo integration tests.
@@ -200,9 +205,11 @@ def run_compose_command(service_name, command, build=False, force_recreate=False
# Map provider to docker service name
provider_to_service = {

I think this is fine cuz strixhalo is so popular but im thinking how this would work if many people wanna add their device. Dockerfiles are ok, but these are changes to code 🤔


@AnkushMalaker AnkushMalaker left a comment


LGTM

@AnkushMalaker AnkushMalaker merged commit d2b2300 into SimpleOpenSoftware:dev Feb 27, 2026
5 of 6 checks passed