job-exporter docker image clean #140
base: dev
    @@ -16,75 +16,120 @@
    # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

    ############################
    # builder: only for compiling python wheels
    ############################
    FROM mcr.microsoft.com/mirror/nvcr/nvidia/cuda:12.0.1-runtime-ubuntu22.04 AS builder

    ARG TARGETARCH

    RUN set -eux; \
        apt-get update; \
        DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
            ca-certificates \
            python3-pip \
            python3-dev \
            build-essential \
            gcc; \
        rm -rf /var/lib/apt/lists/*

    WORKDIR /w

    # build wheels once
    COPY requirements.txt /w/requirements.txt
    RUN python3 -m pip install --no-cache-dir -U pip wheel && \
        python3 -m pip wheel --no-cache-dir --wheel-dir /w/wheels \
            -r /w/requirements.txt \
            prometheus_client psutil filelock

    ############################
    # runtime: final image
    ############################
    FROM mcr.microsoft.com/mirror/nvcr/nvidia/cuda:12.0.1-runtime-ubuntu22.04

    ARG TARGETARCH
    # Register the ROCM package repository, and install rocm-dev package
    ARG ROCM_VERSION=6.2.2
    ARG AMDGPU_VERSION=6.2.2
    ARG DCGM_TARGET_VERSION=1:4.4.1-1

    RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        autoconf \
        automake \
        bash \
        build-essential \
        cmake \
        curl \
        file \
        g++ \
        git \
        gnupg \
        ibverbs-utils \
        kmod \
        libc++-dev \
        libcap-dev \
        libelf1 \
        libgflags-dev \
        libgtest-dev \
        libnuma-dev \
        libtool \
        numactl \
        pkg-config \
        python3-dev \
        python3-pip \
        sudo \
        unzip && \
        if [ "$TARGETARCH" = "amd64" ]; then \
            printf "Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600" | tee /etc/apt/preferences.d/rocm-pin-600 && \
            curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add - && \
            echo "deb https://repo.radeon.com/rocm/apt/$ROCM_VERSION/ jammy main" | tee /etc/apt/sources.list.d/rocm.list && \
            echo "deb https://repo.radeon.com/amdgpu/$AMDGPU_VERSION/ubuntu jammy main" | tee /etc/apt/sources.list.d/amdgpu.list && \
            apt-get update && \
            DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends rocm-dev; \
        fi

    COPY src/Moneo /Moneo
    # --------------------------
    # base + REQUIRED apt upgrade
    # --------------------------
    RUN set -eux; \
        apt-get update; \
        apt-get upgrade -y; \
        DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
            bash \
            ca-certificates \
            curl \
            gnupg \
            wget \
            python3 \
            python3-pip; \
        apt-get clean; \
        rm -rf /var/lib/apt/lists/* /var/cache/apt/*

    # Install RDC
    RUN if [ "$TARGETARCH" = "amd64" ]; then sudo bash Moneo/src/worker/install/amd.sh; fi
    # --------------------------
    # ROCm (runtime only)
    # --------------------------
    RUN set -eux; \
        if [ "$TARGETARCH" = "amd64" ]; then \
            printf "Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600" \
                > /etc/apt/preferences.d/rocm-pin-600; \
            curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -; \
            echo "deb https://repo.radeon.com/rocm/apt/$ROCM_VERSION/ jammy main" \
                > /etc/apt/sources.list.d/rocm.list; \
            echo "deb https://repo.radeon.com/amdgpu/$AMDGPU_VERSION/ubuntu jammy main" \

Comment on lines +80 to +83
Original:

    curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -; \
    echo "deb https://repo.radeon.com/rocm/apt/$ROCM_VERSION/ jammy main" \
        > /etc/apt/sources.list.d/rocm.list; \
    echo "deb https://repo.radeon.com/amdgpu/$AMDGPU_VERSION/ubuntu jammy main" \

Suggested change:

    curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor -o /usr/share/keyrings/rocm-archive-keyring.gpg; \
    echo "deb [signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/rocm/apt/$ROCM_VERSION/ jammy main" \
        > /etc/apt/sources.list.d/rocm.list; \
    echo "deb [signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/amdgpu/$AMDGPU_VERSION/ubuntu jammy main" \

Copilot AI, Jan 20, 2026
The AMD RDC installation has been significantly simplified from building from source (with a specific cherry-picked commit 660c5afaf49630781c1059ba6d30bae21743c32f from amd-staging branch) to installing the prebuilt rdc package. This changes the RDC version and removes the custom patches that were previously applied. Verify that the packaged RDC version includes the necessary functionality from the cherry-picked commit, or that the commit is no longer needed for the target ROCm version 6.2.2. This change could impact AMD GPU monitoring functionality if the custom patches were critical.
Copilot AI, Jan 20, 2026
The DCGM Python bindings path in nvidia_exporter.py is incompatible with DCGM 4. The new Dockerfile installs datacenter-gpu-manager-4 which places Python bindings at /usr/share/datacenter-gpu-manager-4/bindings/python3, but nvidia_exporter.py still references /usr/local/dcgm/bindings/python3. The update-dcgm.py script that previously handled this path migration has been removed, which will cause the NVIDIA exporter to fail with ImportError when trying to import dcgm_fields and DcgmReader. The sys.path.append line needs to be updated to match the new DCGM 4 installation path.
Original:

    rm -rf /var/lib/apt/lists/*

Suggested change:

    rm -rf /var/lib/apt/lists/*
    ENV PYTHONPATH=/usr/share/datacenter-gpu-manager-4/bindings/python3:${PYTHONPATH}

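
As an alternative to relying on `PYTHONPATH` alone, the exporter could resolve the bindings directory at import time. The following is a minimal sketch, assuming only the two paths named in the comment above; `add_dcgm_bindings` and `DCGM_BINDING_DIRS` are hypothetical names and do not appear in the PR or in `nvidia_exporter.py`.

```python
import os
import sys

# Candidate locations taken from the review comment: the DCGM 4 package path
# first, then the legacy path that nvidia_exporter.py currently references.
DCGM_BINDING_DIRS = (
    "/usr/share/datacenter-gpu-manager-4/bindings/python3",
    "/usr/local/dcgm/bindings/python3",
)


def add_dcgm_bindings(candidates=DCGM_BINDING_DIRS, path=None):
    """Append the first existing candidate directory to `path`.

    `path` defaults to sys.path. Returns the chosen directory, or None when
    no candidate exists. Hypothetical helper, not part of the PR.
    """
    path = sys.path if path is None else path
    for directory in candidates:
        if os.path.isdir(directory):
            if directory not in path:
                path.append(directory)
            return directory
    return None


if __name__ == "__main__":
    chosen = add_dcgm_bindings()
    print("DCGM bindings directory:", chosen)
```

Calling the helper before `import dcgm_fields` would let the same exporter code run against either layout; if it returns `None`, the import is still expected to fail, so the caller may want to log that case explicitly.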
Copilot AI, Jan 20, 2026
The nerdctl binary is downloaded from GitHub without checksum verification. This could pose a security risk if the download is compromised or if the download fails partially. Consider adding checksum verification using sha256sum or downloading and verifying the checksum file provided by the nerdctl releases. For example, the nerdctl releases include SHA256SUMS files that can be used to verify the integrity of the downloaded archive.
Original:

    mkdir -p /tmp/nerdctl; \
    tar -xzf /tmp/nerdctl.tar.gz -C /tmp/nerdctl; \
    mv /tmp/nerdctl/nerdctl /usr/local/bin/nerdctl; \
    rm -rf /tmp/nerdctl* /tmp/nerdctl.tar.gz

Suggested change:

    wget -O /tmp/nerdctl-SHA256SUMS \
        https://github.com/containerd/nerdctl/releases/download/v${NERDCTL_VERSION}/SHA256SUMS; \
    grep " nerdctl-${NERDCTL_VERSION}-linux-${TARGETARCH}.tar.gz$" /tmp/nerdctl-SHA256SUMS | sha256sum -c -; \
    mkdir -p /tmp/nerdctl; \
    tar -xzf /tmp/nerdctl.tar.gz -C /tmp/nerdctl; \
    mv /tmp/nerdctl/nerdctl /usr/local/bin/nerdctl; \
    rm -rf /tmp/nerdctl* /tmp/nerdctl.tar.gz /tmp/nerdctl-SHA256SUMS

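
One detail to check in the suggestion: `sha256sum -c` opens the file under the name listed in the checksum file, while the archive here is stored as `/tmp/nerdctl.tar.gz`, so the check may fail unless the archive is downloaded under its release name. The sketch below shows the same verification against an explicit path. It assumes the SHA256SUMS file uses the usual `sha256sum` output layout (`<digest>  <filename>`); `expected_digest` and `verify_archive` are hypothetical helper names.

```python
import hashlib


def expected_digest(sums_text, filename):
    """Return the SHA-256 hex digest listed for `filename` in SHA256SUMS-style text."""
    for line in sums_text.splitlines():
        parts = line.split()
        # sha256sum writes "<digest>  <name>"; binary mode prefixes the name with "*".
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    raise KeyError(f"{filename} is not listed in the checksum file")


def verify_archive(archive_path, sums_text, filename):
    """Raise ValueError if the file at `archive_path` does not match the listed digest."""
    digest = hashlib.sha256()
    with open(archive_path, "rb") as fh:
        # Read in 1 MiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    wanted = expected_digest(sums_text, filename)
    if actual != wanted:
        raise ValueError(f"checksum mismatch for {filename}: {actual} != {wanted}")
```

Here `archive_path` would be `/tmp/nerdctl.tar.gz` and `filename` the release name `nerdctl-${NERDCTL_VERSION}-linux-${TARGETARCH}.tar.gz`, which keeps the lookup key independent of where the download was saved.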
Copilot AI, Jan 20, 2026
There is trailing whitespace at the end of this line after the backslash continuation character. This should be removed for consistency and to avoid potential issues with shell script parsing.
Original (with trailing whitespace after the backslash):

    RUN python3 -m pip install --no-cache-dir -U pip && \

Suggested change:

    RUN python3 -m pip install --no-cache-dir -U pip && \

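
Because this kind of whitespace is invisible in a rendered diff, a small lint step can catch it before review. This is a minimal sketch, not part of the PR; `find_broken_continuations` is a hypothetical name, and the check only looks for a backslash followed by spaces or tabs at the end of a line.

```python
import re

# A backslash followed by one or more spaces/tabs at the end of a line.
_TRAILING_WS_AFTER_BACKSLASH = re.compile(r"\\[ \t]+$")


def find_broken_continuations(text):
    """Return 1-based line numbers where a continuation backslash has trailing whitespace."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _TRAILING_WS_AFTER_BACKSLASH.search(line)
    ]
```

Running it over the Dockerfile text and failing the build when the returned list is non-empty would flag the line discussed above.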
Copilot AI, Jan 20, 2026
The packages prometheus_client, psutil, and filelock are being installed redundantly. They were already installed from requirements.txt on line 124 (prometheus_client is pinned to 0.20.0 in requirements.txt). Installing them again without version specifications may cause version conflicts or simply waste build time. Consider removing these redundant installations or ensuring all package versions are consistently managed through requirements.txt.
Original:

    RUN python3 -m pip install --no-cache-dir -U pip && \
        python3 -m pip install --no-cache-dir \
            --no-index --find-links=/wheels \
            -r /job_exporter/requirements.txt && \
        python3 -m pip install --no-cache-dir \
            --no-index --find-links=/wheels \
            prometheus_client psutil filelock && \

Suggested change:

    RUN python3 -m pip install --no-cache-dir -U pip && \
        python3 -m pip install --no-cache-dir \
            --no-index --find-links=/wheels \
            -r /job_exporter/requirements.txt && \

This file was deleted.
This file was deleted.
This file was deleted.
The package prometheus_client is being built twice as a wheel - once from requirements.txt (which pins it to version 0.20.0) on line 42, and again without version specification on line 43. This may cause version conflicts or unnecessary duplication. Similarly, the same packages (prometheus_client, psutil, filelock) are being installed again in lines 125-127 after already being installed from requirements.txt on line 124. Consider consolidating these package specifications into requirements.txt to ensure consistent versioning and avoid redundant installations.
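
To decide which of the extra package names can be dropped from the `pip` command line, one option is to compare them against the names already listed in requirements.txt. The sketch below is illustrative only: `redundant_extras` and its helpers are hypothetical, the parser handles plain `name==version` style lines rather than the full requirements grammar, and the real contents of requirements.txt are not shown in this PR beyond the `prometheus_client` pin mentioned above.

```python
import re


def normalize(name):
    """Normalize a project name as pip does (PEP 503): lowercase, runs of -_. become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text):
    """Collect normalized project names from simple requirements.txt lines."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options such as -r / -e
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def redundant_extras(requirements_text, extras):
    """Return the extras that are already covered by requirements.txt."""
    listed = requirement_names(requirements_text)
    return [name for name in extras if normalize(name) in listed]
```

Any name this returns is already pinned through requirements.txt; the remaining names would need to be added there (ideally with versions) before the extra arguments are removed from the `pip wheel` and `pip install` commands.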