Description
Motivation
eBPF can monitor when a file is opened by a binary. Oracle, for instance, uses this for dependency detection in their cloud service. We could do something like this:

    # Pseudocode: called for every file-open event reported by eBPF
    def on_file_open(path):
        if "libcudart.so" in path:
            attach_uprobes(path)

Basically, automatically detect when a libcudart.so is used by a running binary. This would be particularly useful when running jobs that access different libcudart.so files, for example PyTorch jobs running inside their own separate conda envs, each accessing its own copy of the runtime dynamic lib. It removes the burden from users who just want to run their jobs without having to spawn multiple gpuprobe-daemon instances or dig through their env to find where libcudart.so lives.
Proposal
Add an option --autodetect-libcudart that monitors these file-open events and attaches uprobes to any libcudart.so that gets opened. This should be opt-in, since performing a string comparison on every file open is fairly expensive.
This is pretty important for usability imho.
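To make the filtering step a bit more concrete, here is a rough Python sketch of what the user-space side of the handler could look like. It is only an illustration and not how gpuprobe-daemon works today: `is_libcudart`, `LibcudartAutodetector`, and the `attach_uprobes` callback are hypothetical names, and the eBPF side that actually delivers the file-open events is left out. The sketch matches on the file's basename rather than the whole path, and remembers resolved paths so that each copy of the library only gets probes attached once.

```python
import os
import re
from typing import Callable, Set

# Matches libcudart.so, libcudart.so.12, libcudart.so.12.1.105, and so on.
_LIBCUDART_RE = re.compile(r"libcudart\.so(\.\d+)*")


def is_libcudart(path: str) -> bool:
    """Return True if the basename of path looks like the CUDA runtime shared lib."""
    return _LIBCUDART_RE.fullmatch(os.path.basename(path)) is not None


class LibcudartAutodetector:
    """Hypothetical handler for file-open events (event source not shown)."""

    def __init__(self, attach_uprobes: Callable[[str], None]) -> None:
        # attach_uprobes is an assumed callback that attaches probes to one library.
        self._attach = attach_uprobes
        self._seen: Set[str] = set()

    def on_file_open(self, path: str) -> bool:
        """Attach probes the first time a given libcudart copy is opened."""
        if not is_libcudart(path):
            return False
        # Resolve symlinks so libcudart.so -> libcudart.so.12 is handled once.
        real = os.path.realpath(path)
        if real in self._seen:
            return False
        self._seen.add(real)
        self._attach(real)
        return True
```

With this shape, two conda envs that each ship their own copy would trigger two attachments, while repeated opens of the same copy would be ignored after the first. The per-open cost is still a basename match, which is why keeping this behind the flag seems reasonable.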