[FEATURE] auto-detect libcudart.so path / dynamic dependency detection #26

@ethangraham2001

Motivation

eBPF can monitor when a binary opens a file. Oracle, for instance, uses this for dependency detection in their cloud service. We could do something like this:

def on_file_open(path):
    # pseudocode: attach_uprobes stands in for the daemon's probe-attachment step
    if "libcudart.so" in path:
        attach_uprobes(path)

Basically, this automatically detects when a running binary uses a libcudart.so. It would be particularly useful when running jobs that access different libcudart.so files, for example PyTorch jobs that each run inside their own conda env and load their own copy of the runtime dynamic lib. It removes the burden from users who just want to run their jobs without spawning multiple gpuprobe-daemon instances or digging through their env to find where libcudart.so is.
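To make the idea a bit more concrete, here is a rough Python sketch of the matching and de-duplication part only. It is not gpuprobe code: `LibcudartTracker` and the `attach` callback are hypothetical names, the regex assumes versioned file names such as libcudart.so.12 should also match, and resolving symlinks with `os.path.realpath` so that each underlying file is attached only once is my assumption about the desired behaviour.

```python
import os
import re
from typing import Callable, Set

# Assumption: match "libcudart.so" plus optional numeric version suffixes,
# e.g. "libcudart.so.12" or "libcudart.so.12.1.105".
_LIBCUDART_RE = re.compile(r"libcudart\.so(\.\d+)*")


def is_libcudart(path: str) -> bool:
    """Return True if the final path component looks like a CUDA runtime library."""
    return _LIBCUDART_RE.fullmatch(os.path.basename(path)) is not None


class LibcudartTracker:
    """Calls `attach` once per distinct libcudart file seen in file-open events."""

    def __init__(self, attach: Callable[[str], None]) -> None:
        self._attach = attach  # hypothetical hook that would attach the uprobes
        self._seen: Set[str] = set()

    def on_file_open(self, path: str) -> bool:
        """Handle one file-open event; return True if probes were newly attached."""
        if not is_libcudart(path):
            return False
        resolved = os.path.realpath(path)  # collapse symlinks to a single target
        if resolved in self._seen:
            return False  # already attached to this copy of the library
        self._seen.add(resolved)
        self._attach(resolved)
        return True
```

With this shape, two jobs in separate conda envs would each trigger one attach for their own copy of the library, and repeated opens of the same copy would be ignored.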

Proposal

Add an option --autodetect-libcudart that monitors such file-open events. This needs to be opt-in, as performing a string comparison whenever a file is opened is pretty expensive.
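For illustration, the opt-in behaviour could look like the sketch below. Only the flag name comes from this proposal; the use of Python's `argparse`, the help text, and the `autodetect_enabled` helper are assumptions made for the example and say nothing about how gpuprobe-daemon actually parses its arguments.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gpuprobe-daemon")
    # Off by default: file-open monitoring is only enabled when requested.
    parser.add_argument(
        "--autodetect-libcudart",
        action="store_true",
        help="monitor file-open events and attach to any libcudart.so that is opened",
    )
    return parser


def autodetect_enabled(argv: Optional[List[str]] = None) -> bool:
    """Return True only if the user passed --autodetect-libcudart."""
    return build_parser().parse_args(argv).autodetect_libcudart
```

The file-open monitor would then only be set up when `autodetect_enabled()` returns True, so users who do not ask for it pay nothing extra.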

This is pretty important for usability imho.

Labels: enhancement (New feature or request)
