Add an optional HF persistent cache configuration by sshlyapn · Pull Request #55 · ROCm/madengine

sshlyapn · 2025-11-21T10:21:49Z

Motivation

This PR introduces support for persistent Hugging Face model caching within Docker containers. This prevents models from being re-downloaded every time the script runs, which is especially useful during manual testing on private machines from Conductor. By allowing cache reuse, it reduces network overhead and speeds up repeated test cycles.

Technical Details

Added logic to check for persistent_hf_cache_dir in the context.
Created a dedicated Docker volume (hf_models_cache_volume) when persistent_hf_cache_dir is provided.
Mounted the HF cache volume to the specified directory inside the container.
Modified model execution command to append --hf_cache_dir argument when persistent cache is enabled.
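
The volume-related steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: only the volume name `hf_models_cache_volume`, the `persistent_hf_cache_dir` key, and the `-v` mount form come from this page, while the helper names `volume_create_argv` and `hf_cache_mount_option` are hypothetical.

```python
from typing import List, Optional

# Volume name as given in the PR description.
HF_MODELS_CACHE_VOLUME = "hf_models_cache_volume"


def volume_create_argv(volume_name: str = HF_MODELS_CACHE_VOLUME) -> List[str]:
    """Return the argv for creating a named Docker volume (hypothetical helper)."""
    return ["docker", "volume", "create", volume_name]


def hf_cache_mount_option(
    persistent_hf_cache_dir: Optional[str],
    volume_name: str = HF_MODELS_CACHE_VOLUME,
) -> str:
    """Return the extra `-v` option, or an empty string when caching is disabled.

    Mirrors the shape of the diff shown in the review: the named volume is
    mounted at the configured directory inside the container.
    """
    if not persistent_hf_cache_dir:
        return ""
    return f" -v {volume_name}:{persistent_hf_cache_dir} "
```

The argv list could be handed to `subprocess.run(..., check=True)` before the container starts. Because the feature is optional, the mount helper returns an empty string when `persistent_hf_cache_dir` is not set, so the existing `docker_options` string is left unchanged.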

Should be merged after:

Test Plan

Validated with manual job: https://mlseqa-ci.amd.com:9090/view/Experimental/job/Test-MAD-akochin/118/

Test Result

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Copilot

Pull Request Overview

This PR adds optional persistent Hugging Face model caching support to prevent models from being re-downloaded on every script execution, particularly useful for manual testing scenarios. The implementation creates a Docker volume for cache persistence when configured and passes the cache directory to model execution scripts.

Key Changes:

Added logic to check for and utilize persistent_hf_cache_dir configuration from context
Implemented Docker volume creation and mounting for persistent HF cache storage
Refactored model execution command building to append --hf_cache_dir argument when persistent cache is enabled


src/madengine/tools/run_models.py

Copilot · 2025-11-21T12:23:37Z

src/madengine/tools/run_models.py

        docker_options += self.get_mount_arg(mount_datapaths)
+
+        if persistent_hf_cache_dir:
+            docker_options += f" -v {hf_models_cache_volume_name}:{persistent_hf_cache_dir} "


The persistent_hf_cache_dir value from context is used directly in the Docker mount without validation. This could lead to arbitrary directory mounting if the context contains malicious input. Add validation to ensure the path is safe and doesn't contain dangerous characters or paths like /, /etc, or similar sensitive directories.
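
One way the validation suggested here might look is sketched below. It is an assumption-laden illustration, not part of the PR: the function name `validate_container_cache_dir` and the deny-list `SENSITIVE_PREFIXES` are hypothetical, and the exact set of rejected directories would need to be decided by the maintainers.

```python
import posixpath

# Hypothetical deny-list of container paths that should never be shadowed by a mount.
SENSITIVE_PREFIXES = ("/etc", "/bin", "/sbin", "/usr", "/lib", "/proc", "/sys", "/dev", "/boot")


def validate_container_cache_dir(path: str) -> str:
    """Return a normalized container path, or raise ValueError if it looks unsafe."""
    if not path or not path.startswith("/"):
        raise ValueError("cache dir must be an absolute path")
    # ':' separates fields in `-v volume:path`; whitespace and NUL would break the option.
    if ":" in path or "\0" in path or any(ch.isspace() for ch in path):
        raise ValueError("cache dir contains unsupported characters")
    # Collapse '..', '.', and repeated or trailing slashes before checking prefixes.
    normalized = "/" + posixpath.normpath(path).lstrip("/")
    if normalized == "/":
        raise ValueError("cache dir must not be the filesystem root")
    for prefix in SENSITIVE_PREFIXES:
        if normalized == prefix or normalized.startswith(prefix + "/"):
            raise ValueError(f"cache dir must not be under {prefix}")
    return normalized
```

Normalizing before the prefix check matters: a value such as `/data/../etc` would otherwise pass a naive string comparison and still mount over `/etc`.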

Copilot · 2025-11-21T12:23:38Z

src/madengine/tools/run_models.py

+            model_args = self.context.ctx.get("model_args", info["args"])
+
+            if persistent_hf_cache_dir:
+                model_args += f" --hf_cache_dir {persistent_hf_cache_dir} "


The persistent_hf_cache_dir is concatenated directly into the command string without shell escaping. If the path contains spaces or special characters, it could break the command or introduce command injection vulnerabilities. Use proper shell escaping (e.g., shlex.quote()) when building the command string.
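
A minimal sketch of the escaping suggested above, using `shlex.quote()` from the Python standard library. The helper name `append_hf_cache_arg` is hypothetical; only the `--hf_cache_dir` flag and the `persistent_hf_cache_dir` value come from the diff.

```python
import shlex
from typing import Optional


def append_hf_cache_arg(model_args: str, persistent_hf_cache_dir: Optional[str]) -> str:
    """Append a shell-quoted --hf_cache_dir argument when a cache dir is configured."""
    if not persistent_hf_cache_dir:
        return model_args
    # shlex.quote wraps the value in single quotes if it contains spaces or
    # shell metacharacters, so it reaches the script as a single argument.
    return f"{model_args} --hf_cache_dir {shlex.quote(persistent_hf_cache_dir)}"
```

For example, a value like `/tmp/hf cache; rm -rf /` is passed through as one quoted argument rather than being interpreted by the shell as a second command.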

gowthamcr-amd · 2025-11-26T11:28:57Z

LGTM

Add optional HF persistent cache configuration

97d7a57

andrei-kochin requested review from GeneDer, Copilot and gargrahul and removed request for GeneDer, Copilot and gargrahul November 21, 2025 12:22

Copilot AI reviewed Nov 21, 2025


gowthamcr-amd self-requested a review November 26, 2025 11:25

gowthamcr-amd approved these changes Nov 26, 2025


Merge branch 'main' into persistent_hf_cache_dir

dda6af0

sshlyapn wants to merge 2 commits into main from persistent_hf_cache_dir


3 participants
