feat: Add dockerfiles for strixhalo AMD APU #312
AnkushMalaker merged 10 commits into SimpleOpenSoftware:dev from
Conversation
Important: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI. Use the checkbox below for a quick retry:
📝 Walkthrough
This pull request introduces support for AMD Strix Halo ROCm GPUs across ASR and speaker recognition services. Changes include new Docker services and Dockerfiles with ROCm configuration, CLI parameters for hardware profile selection, dependency groups for Strix Halo variants, and updates to the service orchestration wizard to handle hardware-specific setup paths.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 6
🧹 Nitpick comments (8)
extras/asr-services/providers/vibevoice/Dockerfile.strixhalo (1)
44-46: Trim compiler toolchain from runtime image. Line 45 installs gcc and libc6-dev in the runtime layer even though build steps happen in the builder stage.

Suggested runtime package trim:

```diff
 RUN apt-get update && apt-get install -y --no-install-recommends \
-    curl ffmpeg git libsndfile1 gcc libc6-dev \
+    curl ffmpeg git libsndfile1 \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/providers/vibevoice/Dockerfile.strixhalo` around lines 44 - 46, The runtime image currently installs build toolchain packages (gcc, libc6-dev) in the RUN apt-get step; remove those packages from the runtime install and ensure any compilation happens in the builder stage (or add them only to the builder stage). Modify the RUN apt-get install line that lists curl ffmpeg git libsndfile1 gcc libc6-dev so it only installs the runtime packages (curl ffmpeg git libsndfile1) and move gcc and libc6-dev to the builder stage Dockerfile RUN/apt-get step where source/build commands run.

extras/asr-services/providers/nemo/Dockerfile.strixhalo (1)
45-47: Avoid build toolchain in runtime layer. Line 46 includes build-essential in runtime; this usually belongs only in builder images.

Suggested runtime package trim:

```diff
 RUN apt-get update && apt-get install -y --no-install-recommends \
-    libsndfile1 build-essential portaudio19-dev curl \
+    libsndfile1 portaudio19-dev curl \
     && rm -rf /var/lib/apt/lists/*
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/providers/nemo/Dockerfile.strixhalo` around lines 45 - 47, The Dockerfile RUN installs build tooling (build-essential) into the runtime image; remove that from the runtime layer by moving installation of build-essential and any compilation steps into a build stage (multi-stage build) or install it only in a temporary builder image and copy artifacts into the final image, and ensure the RUN line that currently lists build-essential is updated to exclude "build-essential" so the final image only contains runtime packages like libsndfile1, portaudio19-dev, curl, etc.; update any related build steps to occur in a stage that references the builder stage artifacts.

extras/speaker-recognition/Dockerfile.strixhalo (1)
38-41: Pin git install to a fixed ref for reproducible builds. Line 40 installs pyannote.audio from a moving HEAD, so image contents can change without code changes.

Suggested pinning pattern:

```diff
+ARG PYANNOTE_AUDIO_REF=<commit-sha>
@@
     if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then \
         uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps \
-            "git+https://github.com/pyannote/pyannote-audio.git"; \
+            "git+https://github.com/pyannote/pyannote-audio.git@${PYANNOTE_AUDIO_REF}"; \
     fi; \
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/speaker-recognition/Dockerfile.strixhalo` around lines 38 - 41, The Dockerfile is installing pyannote.audio from the moving HEAD which breaks reproducible builds; change the git URL used in the uv pip install line (the "git+https://github.com/pyannote/pyannote-audio.git" reference) to pin to a fixed tag or commit (e.g. append @<tag-or-commit> to the URL) so the image always installs the same version when PYTORCH_CUDA_VERSION equals "strixhalo"; update the install string in that conditional to use the chosen immutable ref.

extras/speaker-recognition/pyproject.toml (1)
107-114: Consider pinning the pyannote.audio git source to an explicit revision for robustness. The pyproject.toml doesn't pin a specific commit for the git dependency. While the uv.lock file currently locks the resolved commit (78c0d16a5f558adb1024d112e6857b3204b450b3), explicitly declaring the revision in the source definition makes the intent clear and protects against unexpected changes if the lock is regenerated or the project is built without it.

Suggested pinning approach:

```diff
-"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", extra = "strixhalo" }
+"pyannote.audio" = { git = "https://github.com/pyannote/pyannote-audio.git", rev = "78c0d16a5f558adb1024d112e6857b3204b450b3", extra = "strixhalo" }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/speaker-recognition/pyproject.toml` around lines 107 - 114, The pyproject.toml git dependency for "pyannote.audio" is unpinned; update the dependency entry for "pyannote.audio" (the line with git = "https://github.com/pyannote/pyannote-audio.git", extra = "strixhalo") to include an explicit revision (e.g., add rev = "78c0d16a5f558adb1024d112e6857b3204b450b3") so the git source is pinned to a specific commit; make this change in the pyproject.toml dependency definition for "pyannote.audio" to ensure reproducible installs.

services.py (1)
187-194: Consider extracting the duplicated profile derivation logic into a helper. The same PYTORCH_CUDA_VERSION → profile mapping appears in both the build path (lines 187-194) and the up/down path (lines 298-305). A small helper would keep them in sync.

♻️ Suggested helper:

```diff
+def _derive_speaker_profile(env_values: dict) -> str:
+    """Derive docker-compose profile from PYTORCH_CUDA_VERSION."""
+    pytorch_version = env_values.get("PYTORCH_CUDA_VERSION", "cpu")
+    if pytorch_version == "strixhalo":
+        return "strixhalo"
+    if pytorch_version.startswith("cu"):
+        return "gpu"
+    return "cpu"
```

Then replace both occurrences with:

```python
profile = _derive_speaker_profile(env_values)
```

Also applies to: 298-305
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services.py` around lines 187 - 194, The PYTORCH_CUDA_VERSION → profile mapping is duplicated; extract it into a single helper (e.g., def _derive_speaker_profile(env_values):) that reads env_values.get("PYTORCH_CUDA_VERSION", "cpu") and returns "strixhalo" if value == "strixhalo", "gpu" if value.startswith("cu"), otherwise "cpu"; then replace both existing blocks in services.py (the build path block around variables pytorch_version/profile and the up/down path blocks) with profile = _derive_speaker_profile(env_values) so both locations use the same logic.

extras/asr-services/init.py (1)
242-261: Strixhalo choices are listed before the standard non-ROCm providers. Choices 2 and 3 are strixhalo variants, pushing standard qwen3-asr to 4 and nemo to 5. For the majority of users (NVIDIA/CPU), the most common choices are now further down the list. This is a UX consideration: if most users don't have Strix Halo hardware, they'll need to scroll past niche options.

Consider moving the strixhalo choices to the end (e.g., choices 6 and 7), keeping the common providers at the top.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/init.py` around lines 242 - 261, The provider ordering places StrixHalo options ("vibevoice-strixhalo", "nemo-strixhalo") before common non-ROCm providers, degrading UX; update the provider_choices and choice_to_provider mappings so common providers (e.g., "vibevoice", "qwen3-asr", "nemo", "transformers", "faster-whisper") appear first and move "vibevoice-strixhalo" and "nemo-strixhalo" to the end of both dictionaries, then ensure the default passed to prompt_choice("Choose ASR provider:", provider_choices, "1") still points to the intended common provider.

wizard.py (1)
899-943: `streaming_provider in strix_capable_providers` will never be true in practice. `strix_capable_providers` contains `{"parakeet", "vibevoice"}`, which are both batch-only (not in `STREAMING_CAPABLE` at line 689). `streaming_provider` is only set from `select_streaming_provider`, which offers deepgram, smallest, and qwen3-asr. The check is harmless but unreachable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wizard.py` around lines 899 - 943, The condition in select_hardware_profile uses streaming_provider in strix_capable_providers which is unreachable because parakeet/vibevoice are batch-only; remove streaming_provider from the needs_hardware_choice expression or replace it with a capability-aware check (e.g., test that streaming_provider is in strix_capable_providers AND in STREAMING_CAPABLE) so the logic only considers providers that can actually be streaming-capable; update references to strix_capable_providers, streaming_provider, select_streaming_provider, and STREAMING_CAPABLE accordingly.

tests/unit/test_wizard_strixhalo.py (1)
1-65: Module-level wizard loading is effective but fragile across test suites. The `sys.modules.setdefault` approach avoids overwriting real modules, but if this file happens to be imported first in a pytest session, the stubs will persist for later test modules that expect the real `rich` or `setup_utils`. Consider scoping the wizard load inside a `pytest.fixture(scope="module")` that also handles cleanup, or at least documenting the isolation assumption.

That said, for a dedicated test module this works fine in practice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_wizard_strixhalo.py` around lines 1 - 65, The test currently loads the wizard at module import time using _load_wizard_module (which calls _install_rich_stubs and _install_setup_utils_stub) and mutates sys.modules/sys.path, which can leak stubs into other tests; change this to a pytest fixture (e.g. fixture function named wizard_module with scope="module") that performs the stub installs, inserts repo_root into sys.path, imports the wizard via _load_wizard_module logic, yields the loaded module (replace global wizard usage), and then in teardown removes the inserted sys.path entry and deletes the stub entries from sys.modules ("rich", "rich.console", "rich.prompt", "setup_utils") to avoid cross-test contamination. Ensure the fixture name replaces any direct references to the module in the tests.
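One way to keep the stubbing scoped is a small context manager that removes only the stubs it actually installed; a sketch (the helper name `stubbed_modules` and the idea of calling the existing `_load_wizard_module` inside it are assumptions, not code from the PR):

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def stubbed_modules(*names):
    """Temporarily install empty module stubs; on exit, remove only the
    stubs this call added, leaving pre-existing real modules untouched."""
    added = []
    for name in names:
        if name not in sys.modules:
            sys.modules[name] = types.ModuleType(name)
            added.append(name)
    try:
        yield
    finally:
        for name in added:
            sys.modules.pop(name, None)
```

A `pytest.fixture(scope="module")` could enter this context manager for `"rich"`, `"rich.console"`, `"rich.prompt"`, and `"setup_utils"`, import the wizard inside it, yield the module, and pop `wizard` from `sys.modules` on teardown.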
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py`:
- Around line 706-720: The early DB I/O (Conversation.find_one / setting
end_reason / conversation_for_end_reason.save) is executed before the
try/finally that invokes handle_end_of_conversation(), so failures can skip
cleanup; move the retrieval and persistence of conversation_for_end_reason (the
Conversation.find_one call, assigning Conversation.EndReason or
EndReason.UNKNOWN, and await conversation_for_end_reason.save()) inside the try
block that surrounds handle_end_of_conversation(), and if converting end_reason
raises ValueError keep the existing logger.warning but still allow any
exceptions from find_one/save to propagate (or be caught and logged) so the
finally block always runs to call handle_end_of_conversation().
In `@extras/asr-services/init.py`:
- Around line 524-530: Remove the unnecessary f-string prefixes flagged by Ruff
F541 in the console output calls: update the two self.console.print calls inside
the init logic (the ones printing "[yellow][WARNING][/yellow] defaults.yml not
found, cannot add missing models" and "[blue][INFO][/blue] Model definitions
already present in config.yml") to plain string literals without the leading f;
keep the messages and formatting intact and only drop the leading f before the
opening quote.
In `@extras/asr-services/providers/nemo/Dockerfile.strixhalo`:
- Around line 41-65: The runtime stage runs as root—harden by creating a
non-root user and switching to it before CMD: in the runtime stage (FROM
python:3.12-slim-bookworm AS runtime) add steps to create a dedicated user/group
(e.g., appuser), chown the WORKDIR (/app) and any copied files (providers/nemo/,
common/, site-packages) to that user, set USER to that non-root account, and
ensure CMD ["python", "-m", "providers.nemo.service", "--port", "8765"] runs
under that user; reference the Dockerfile runtime stage, WORKDIR /app, the COPY
lines, and the CMD when making these changes.
In `@extras/asr-services/providers/vibevoice/Dockerfile.strixhalo`:
- Around line 39-72: The Dockerfile's runtime stage runs the service as root
because there is no USER directive; create a non-root user and switch to it in
the runtime stage: add steps in the runtime stage (after copying files and
creating /models) to create a user/group (e.g., vibeuser), create a
home/workdir, chown the application directories (/app and /models) to that user,
and add a USER vibeuser directive before the CMD so providers.vibevoice.service
runs unprivileged.
In `@extras/speaker-recognition/docker-compose.yml`:
- Around line 63-90: The strixhalo profile currently only enables
speaker-service-strixhalo, which means webui and caddy won't start when using
--profile strixhalo; update the docker-compose profiles for the webui and caddy
services (the services named "webui" and "caddy") to include "strixhalo" in
their profiles array so they launch together, or alternatively add a short note
in your README explaining that "strixhalo" is intended as a speaker-service-only
profile and that webui/caddy require the "cpu" or "gpu" profiles to run; modify
the "profiles" entries for the webui and caddy service definitions (or add the
documentation note) accordingly.
In `@extras/speaker-recognition/Dockerfile.strixhalo`:
- Around line 1-65: The image runs the service as root because the Dockerfile
lacks a USER directive; create a non-root user and switch to it before CMD
(e.g., add a user/group like app with a fixed UID/GID, chown important dirs
/app, /models, /app/audio_chunks, /app/debug, /app/data to that user, and then
add USER app) so /opt/venv/bin/simple-speaker-service and runtime files are
executed without root privileges; ensure the added user can access VIRTUAL_ENV
(/opt/venv) and LD_LIBRARY_PATH locations or drop to the non-root user after
setting up permissions so the CMD runs as that user.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (2)
- extras/asr-services/uv.lock is excluded by !**/*.lock
- extras/speaker-recognition/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
- backends/advanced/docker-compose-test.yml
- backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
- extras/asr-services/docker-compose.yml
- extras/asr-services/init.py
- extras/asr-services/providers/nemo/Dockerfile.strixhalo
- extras/asr-services/providers/vibevoice/Dockerfile.strixhalo
- extras/asr-services/pyproject.toml
- extras/asr-services/tests/test_cuda_version_config.py
- extras/speaker-recognition/Dockerfile.strixhalo
- extras/speaker-recognition/docker-compose.yml
- extras/speaker-recognition/init.py
- extras/speaker-recognition/pyproject.toml
- services.py
- tests/resources/plugin_keywords.robot
- tests/unit/test_wizard_strixhalo.py
- wizard.py
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
```diff
         self.console.print(
             f"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
         )
     else:
-        self.console.print(f"[blue][INFO][/blue] Model definitions already present in config.yml")
+        self.console.print(
+            f"[blue][INFO][/blue] Model definitions already present in config.yml"
+        )
```
Remove extraneous f prefix from strings without placeholders.
Flagged by Ruff (F541). These are plain strings that don't need f-string interpolation.
🔧 Proposed fix

```diff
-        self.console.print(
-            f"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
-        )
+        self.console.print(
+            "[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
+        )
     else:
-        self.console.print(
-            f"[blue][INFO][/blue] Model definitions already present in config.yml"
-        )
+        self.console.print(
+            "[blue][INFO][/blue] Model definitions already present in config.yml"
+        )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        self.console.print(
            "[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
        )
    else:
        self.console.print(
            "[blue][INFO][/blue] Model definitions already present in config.yml"
        )
```
🧰 Tools
🪛 Ruff (0.15.2)
[error] 525-525: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 529-529: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/asr-services/init.py` around lines 524 - 530, Remove the unnecessary
f-string prefixes flagged by Ruff F541 in the console output calls: update the
two self.console.print calls inside the init logic (the ones printing
"[yellow][WARNING][/yellow] defaults.yml not found, cannot add missing models"
and "[blue][INFO][/blue] Model definitions already present in config.yml") to
plain string literals without the leading f; keep the messages and formatting
intact and only drop the leading f before the opening quote.
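For context on why the fix is safe: Ruff's F541 only fires when an f-string contains no replacement fields, in which case the `f` prefix is inert and the literal evaluates to exactly the same string:

```python
# An f-string with no {placeholders} evaluates to the same value as the
# plain literal, so dropping the prefix cannot change behavior.
plain = "[blue][INFO][/blue] Model definitions already present in config.yml"
with_f = f"[blue][INFO][/blue] Model definitions already present in config.yml"  # Ruff: F541

assert plain == with_f
```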
```yaml
  # AMD Strix Halo (gfx1151 / Ryzen AI Max) Profile
  speaker-service-strixhalo:
    <<: *base-speaker-service
    profiles: ["strixhalo"]
    build:
      context: .
      dockerfile: Dockerfile.strixhalo
    image: speaker-recognition-strixhalo:latest
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    group_add:
      # Use numeric GIDs - group names (video/render) don't exist in vllm-therock image.
      # Find your system's GIDs with: getent group video render
      - "${VIDEO_GID:-985}"   # video group (Arch default; Ubuntu typically 44)
      - "${RENDER_GID:-989}"  # render group (Arch default; Ubuntu typically 107)
    environment:
      - HF_HOME=/models
      - HF_TOKEN=${HF_TOKEN}
      - SIMILARITY_THRESHOLD=${SIMILARITY_THRESHOLD:-0.15}
      - SPEAKER_SERVICE_HOST=${SPEAKER_SERVICE_HOST:-0.0.0.0}
      - SPEAKER_SERVICE_PORT=${SPEAKER_SERVICE_PORT:-8085}
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY}
      - HSA_OVERRIDE_GFX_VERSION=11.5.1
    networks:
      default:
        aliases:
          - speaker-service
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== All declared compose profiles ==="
rg -n 'profiles:\s*\[' extras/speaker-recognition/docker-compose.yml | head -20
echo ""
echo "=== Services with their profiles (full list) ==="
rg -B 2 'profiles:\s*\[' extras/speaker-recognition/docker-compose.yml | grep -E '^\s*[a-z-]+:|profiles:' | head -40
echo ""
echo "=== Checking for UI/proxy/other services ==="
rg -n '^\s*[a-z-]+:' extras/speaker-recognition/docker-compose.yml | head -30
echo ""
echo "=== How compose is launched ==="
find extras/speaker-recognition -type f \( -name "*.sh" -o -name "*.md" -o -name "*.yml" -o -name "*.yaml" \) | xargs rg -l 'docker.*compose.*up|--profile|COMPOSE_PROFILES' 2>/dev/null || echo "No matches found"

Repository: SimpleOpenSoftware/chronicle
Length of output: 1479
🏁 Script executed:
#!/bin/bash
echo "=== quickstart.md (how to launch) ==="
cat extras/speaker-recognition/quickstart.md
echo ""
echo "=== README.md (profile usage guidance) ==="
head -100 extras/speaker-recognition/README.md
echo ""
echo "=== Check for strixhalo-specific launch instructions ==="
rg -i 'strixhalo' extras/speaker-recognition/README* extras/speaker-recognition/quickstart.md -A 3 -B 1 | head -50

Repository: SimpleOpenSoftware/chronicle
Length of output: 9291
Add webui and caddy to strixhalo profile, or document that strixhalo is speaker-service-only.
The speaker-service-strixhalo profile on line 66 exists, but webui (line 98) and caddy (line 125) are limited to ["cpu", "gpu"] profiles. If launched with docker compose --profile strixhalo up, the web UI and reverse proxy will not start, breaking the full-stack system. Either include strixhalo in both services' profile lists, or clarify in documentation that strixhalo is intended for speaker-service-only deployment without the UI.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/speaker-recognition/docker-compose.yml` around lines 63 - 90, The
strixhalo profile currently only enables speaker-service-strixhalo, which means
webui and caddy won't start when using --profile strixhalo; update the
docker-compose profiles for the webui and caddy services (the services named
"webui" and "caddy") to include "strixhalo" in their profiles array so they
launch together, or alternatively add a short note in your README explaining
that "strixhalo" is intended as a speaker-service-only profile and that
webui/caddy require the "cpu" or "gpu" profiles to run; modify the "profiles"
entries for the webui and caddy service definitions (or add the documentation
note) accordingly.
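The gap is easy to check mechanically. A sketch of the profile-activation rule (the inlined compose dict mirrors the services discussed above; as in real docker compose, services with no `profiles` key are always started):

```python
compose = {
    "services": {
        "speaker-service-strixhalo": {"profiles": ["strixhalo"]},
        "webui": {"profiles": ["cpu", "gpu"]},
        "caddy": {"profiles": ["cpu", "gpu"]},
    }
}


def services_enabled(compose: dict, profile: str) -> set[str]:
    """Names of services that `docker compose --profile <profile> up` would
    start: those listing the profile, plus any with no profiles key."""
    return {
        name
        for name, svc in compose["services"].items()
        if "profiles" not in svc or profile in svc["profiles"]
    }


# With --profile strixhalo, webui and caddy stay down:
missing = {"webui", "caddy"} - services_enabled(compose, "strixhalo")
```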
```dockerfile
FROM docker.io/kyuz0/vllm-therock-gfx1151:sha-039484a

ARG PYTORCH_CUDA_VERSION=strixhalo
ENV PYTORCH_CUDA_VERSION=${PYTORCH_CUDA_VERSION}

# Install system dependencies
RUN set -eux; \
    dnf -y install \
        gcc gcc-c++ make git ffmpeg curl libjpeg-turbo-devel zlib-devel libpng-devel; \
    dnf -y clean all

WORKDIR /app

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

# Copy dependency files first (for better caching)
COPY pyproject.toml uv.lock ./

# Use the base image's prebuilt venv (contains torch stack for gfx1151)
RUN test -x /opt/venv/bin/python
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies, but never install/override torch/torchvision/torchaudio in this image.
# We avoid `uv sync` here because it recreates the venv and drops `--system-site-packages`.
# For strixhalo, we install pyannote.audio from git with --no-deps separately.
RUN set -eux; \
    uv export --frozen --format requirements.txt --no-dev --no-hashes \
        --extra ${PYTORCH_CUDA_VERSION} --no-emit-project \
        --prune torch --prune torchvision --prune torchaudio \
        --prune rocm --prune rocm-sdk-core --prune rocm-sdk-devel --prune rocm-sdk-libraries-gfx1151 \
        --prune torchcodec \
        $(if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then echo "--prune pyannote.audio"; fi) \
        --output-file /tmp/requirements.txt; \
    uv pip install --python /opt/venv/bin/python --no-managed-python --prerelease=if-necessary-or-explicit \
        --requirements /tmp/requirements.txt; \
    if [ "${PYTORCH_CUDA_VERSION}" = "strixhalo" ]; then \
        uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps \
            "git+https://github.com/pyannote/pyannote-audio.git"; \
    fi; \
    uv cache clean

# Copy the full source code (after dependencies are cached)
COPY src/ src/

# Install the project (editable so the compose bind-mount workflow still works)
RUN uv pip install --python /opt/venv/bin/python --no-managed-python --no-deps --editable .

# Verify we can import the base image's torch stack (and that torchvision ops are present)
RUN python -c "import torch, torchvision; print('torch', torch.__version__, torch.__file__); print('torchvision', torchvision.__version__, torchvision.__file__)"

# Create directories
RUN mkdir -p /app/audio_chunks /app/debug /app/data /models

# Set environment variables
ENV HF_HOME=/models
ENV PYTHONPATH=/app
ENV LD_LIBRARY_PATH=/opt/rocm/lib:/opt/rocm/lib/host-math/lib:/opt/rocm/lib/rocm_sysdeps/lib:/opt/venv/lib/python3.13/site-packages/torch/lib:/usr/lib64:${LD_LIBRARY_PATH}

# Expose port
EXPOSE 8085

# Run the service
CMD ["/opt/venv/bin/simple-speaker-service"]
```
Container should not run as root.
There is no USER directive, so the service runs as root throughout runtime.
Proposed hardening patch:

```diff
 # Create directories
 RUN mkdir -p /app/audio_chunks /app/debug /app/data /models
+
+RUN useradd --create-home --uid 10001 appuser \
+    && chown -R appuser:appuser /app /models
+USER appuser

 # Set environment variables
 ENV HF_HOME=/models
```

🧰 Tools
🪛 Trivy (0.69.1)
[error] 1-1: Image user should not be 'root'
Specify at least 1 USER command in Dockerfile with non-root user as argument
Rule: DS-0002
(IaC/Dockerfile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/speaker-recognition/Dockerfile.strixhalo` around lines 1 - 65, The
image runs the service as root because the Dockerfile lacks a USER directive;
create a non-root user and switch to it before CMD (e.g., add a user/group like
app with a fixed UID/GID, chown important dirs /app, /models, /app/audio_chunks,
/app/debug, /app/data to that user, and then add USER app) so
/opt/venv/bin/simple-speaker-service and runtime files are executed without root
privileges; ensure the added user can access VIRTUAL_ENV (/opt/venv) and
LD_LIBRARY_PATH locations or drop to the non-root user after setting up
permissions so the CMD runs as that user.
…cies
- Added user management for `appuser` in the Nemo and VibeVoice Dockerfiles, enhancing security by running services as non-root users.
- Included `sudo` in the installation of runtime dependencies across multiple Dockerfiles to facilitate user permissions.
- Updated `pyproject.toml` to specify a new dependency on `av` for audio processing.
- Enhanced `uv.lock` with additional package resolution markers for better compatibility across different environments.

- Incremented revision number in `uv.lock` to 3 and added new dependencies: `openinference-instrumentation-openai`, `opentelemetry-api`, and `opentelemetry-sdk`.
- Modified `docker-compose.yml` to allow dynamic device and dtype configuration via environment variables.
- Updated `pyproject.toml` to include `einops` as a required dependency.
- Refactored `Dockerfile.strixhalo` for improved dependency management and installation process.
- Enhanced error handling in `transcriber.py` to validate device configuration and prevent runtime failures.
- Cleaned up API documentation in `__init__.py` and improved logging in `identification.py` for better traceability.

- Added a new Docker Compose file `docker-compose-test-strixhalo.yml` for testing the Parakeet ASR service with ROCm support.
- Updated `.gitignore` to exclude model cache files specific to Strix Halo tests.
- Modified `uv.lock` to include `einops` as a new dependency for improved functionality.
- Refactored `Dockerfile.strixhalo` for better dependency management and installation process.
- Updated test script to support dynamic configuration of service URLs and timeouts, enhancing test flexibility.
- Added a new Makefile target `test-asr-strix-parakeet` for running Strix Halo integration tests.
```diff
@@ -200,9 +205,11 @@ def run_compose_command(service_name, command, build=False, force_recreate=False
     # Map provider to docker service name
     provider_to_service = {
```
I think this is fine cuz strixhalo is so popular, but I'm thinking how this would work if many people wanna add their device. Dockerfiles are ok, but these are changes to code 🤔
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests
Chores