
feat: optional TensorRT provider for ONNX runtime issue #353 #371

Open

saivarunkonda wants to merge 1 commit into NeptuneHub:main from saivarunkonda:feat/tensorrt-onnx

Conversation


@saivarunkonda saivarunkonda commented Mar 14, 2026

Closes #353

Summary

Adds optional TensorRT execution provider support for ONNX Runtime while keeping default behavior unchanged (USE_TENSORRT=false).

What changed

  • Added centralized ONNX provider selector (TensorRT -> CUDA -> CPU) in tasks/onnx_providers.py.
  • Refactored MusiCNN, CLAP, and MuLan ONNX session initialization to use shared provider logic.
  • Added USE_TENSORRT config flag and wired it into NVIDIA compose templates + .env.example.
  • Added TensorRT runtime libraries in NVIDIA Docker builds.
  • Updated GPU/parameters docs for opt-in TensorRT usage.
  • Added tests for provider selection and TensorRT visibility.
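The centralized selector described above could be sketched as follows. This is a hypothetical illustration of the TensorRT -> CUDA -> CPU priority logic; the actual helper in tasks/onnx_providers.py may use different names and signatures:

```python
def select_onnx_providers(use_tensorrt: bool, available: list[str]) -> list[str]:
    """Return ONNX Runtime execution providers in priority order.

    Priority: TensorRT (only when opted in and available) -> CUDA -> CPU.
    CPUExecutionProvider is always appended as the final fallback.
    """
    providers = []
    if use_tensorrt and "TensorrtExecutionProvider" in available:
        providers.append("TensorrtExecutionProvider")
    if "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers
```

With USE_TENSORRT=false the TensorRT branch is never taken, so the provider list is identical to the previous CUDA-first behavior.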

Validation

  • Local NVIDIA container verified:
    • USE_TENSORRT=true selects TensorrtExecutionProvider first.
    • USE_TENSORRT=false keeps CUDAExecutionProvider first (existing behavior preserved).
  • Confirmed TensorRT libs exist in container (libnvinfer.so.10, libnvinfer_plugin.so.10).
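The validation above toggles the USE_TENSORRT flag via the environment. A defensive parser for such a boolean flag might look like this (a hypothetical sketch; the real config wiring in the compose templates may differ):

```python
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Parse a boolean environment flag such as USE_TENSORRT.

    Accepts common truthy spellings ("1", "true", "yes", "on"),
    case-insensitively; anything else, or an unset variable,
    falls back to the default.
    """
    value = os.getenv(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}
```

Defaulting to False keeps the change strictly opt-in: an unset or malformed variable never enables TensorRT.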

Notes

  • Change is strictly opt-in and backward-compatible.
  • Increased Docker dependency-install HTTP timeout to reduce transient RAPIDS download timeout failures.

@saivarunkonda saivarunkonda changed the title feat: optional TensorRT provider for ONNX runtime issue #353 feat: optional TensorRT provider for ONNX runtime issue Mar 14, 2026
@saivarunkonda saivarunkonda changed the title feat: optional TensorRT provider for ONNX runtime issue feat: optional TensorRT provider for ONNX runtime issue #353 Mar 14, 2026
@saivarunkonda

@NeptuneHub/maintainers Could you please approve workflows for this fork PR and review it? This PR addresses #353 (optional TensorRT EP, default behavior unchanged with USE_TENSORRT=false), with local NVIDIA validation completed.

@NeptuneHub

Hi, and thanks for raising this PR. Before proceeding with the merge I want to be extra sure that there is a real advantage and that it is all properly tested.

  1. Did you test analyzing the same song with both providers? What is the speed increase for MusiCNN? For CLAP? In total per song?
  2. Did you check that the embedding vectors created with the TensorRT provider are exactly the same as the ones created with CUDA?
  3. The clustering algorithm in the GPU image also has an implementation that uses the GPU. Is it affected by this change?
  4. Which tests did you run, even manually, to be sure you don't introduce regressions?



Development

Successfully merging this pull request may close these issues.

[FEATURE] Add TensorRT support to the nvidia image?
