Add comprehensive benchmark suite comparing cuforest against sklearn, XGBoost, and LightGBM native inference #32
Open — dantegd wants to merge 16 commits into rapidsai:main from dantegd:fea-bench
Commits (16):

- 9ae25df (dantegd) FEA Make optimize() return a new instance instead of mutating in-place
- 656548d (hcho3) fix formatting
- 444abfe (dantegd) FIX for Python 3.10
- ff1f476 (dantegd) Merge branch 'fea-optimize-new-instance' of github.com:dantegd/cufore…
- 5a02152 (dantegd) Apply suggestions from code review
- ff92aa8 (dantegd) ENH make _create_with_layout a classmethod
- c7eb560 (dantegd) FEA Add comprehensive benchmark suite comparing cuforest against skle…
- 60aeacd (dantegd) ENH Move benchmark to Python package and add README documentation
- a25d1ab (dantegd) ENH Add pytests, improve readme and some code improvements
- 9cec4c0 (dantegd) Merge main
- 4ac9152 (dantegd) ENH LightGBM GPU code improvements
- bf1b24d (hcho3) Merge branch 'main' into fea-bench
- 1c952ed (hcho3) Merge remote-tracking branch 'origin/main' into fea-bench
- 5379142 (hcho3) Merge branch 'main' into fea-bench
- 71b1209 (hcho3) Rename cuforest -> nvforest
- 5f8c30a (hcho3) Add missing GPU check for LightGBM
# nvforest Benchmark Suite

Comprehensive benchmark comparing nvforest inference performance against native ML framework inference (sklearn, XGBoost, LightGBM).

## Quick Start

```bash
# Dry run - see what will be benchmarked
python -m nvforest.benchmark.benchmark run --dry-run

# Quick test - verify setup with minimal parameters
python -m nvforest.benchmark.benchmark run --quick-test

# Full benchmark
python -m nvforest.benchmark.benchmark run
```

## Usage

### Running Benchmarks

```bash
python -m nvforest.benchmark.benchmark run [OPTIONS]
```

**Options:**

| Option | Short | Description |
|--------|-------|-------------|
| `--framework` | `-f` | Framework(s) to benchmark: `sklearn`, `xgboost`, `lightgbm`. Repeatable. Default: all available |
| `--dry-run` | `-n` | Print the configuration without running |
| `--quick-test` | `-q` | Run with minimal parameters for quick verification |
| `--device` | `-d` | Device: `cpu`, `gpu`, or `both`. Default: `both` |
| `--model-type` | `-m` | Model type: `regressor`, `classifier`, or `both`. Default: `both` |
| `--output-dir` | `-o` | Output directory for results. Default: `benchmark/data/` |

**Examples:**

```bash
# Benchmark only sklearn on CPU
python -m nvforest.benchmark.benchmark run --framework sklearn --device cpu

# Benchmark XGBoost and LightGBM classifiers only
python -m nvforest.benchmark.benchmark run -f xgboost -f lightgbm -m classifier

# Quick test with a specific framework
python -m nvforest.benchmark.benchmark run --quick-test --framework sklearn
```

### Analyzing Results

```bash
python -m nvforest.benchmark.analyze RESULTS_FILE [OPTIONS]
```

**Options:**

| Option | Short | Description |
|--------|-------|-------------|
| `--output` | `-o` | Output file for the speedup heatmap plot |
| `--framework` | `-f` | Filter results to a specific framework |
| `--device` | `-d` | Filter results to a specific device (`cpu` or `gpu`) |
| `--plot-only` | | Only generate the plot, skip the summary |
| `--summary-only` | | Only print the summary, skip the plot |

**Examples:**

```bash
# Analyze results and generate plots
python -m nvforest.benchmark.analyze data/final_results.csv

# Summary only for GPU results
python -m nvforest.benchmark.analyze data/final_results.csv --device gpu --summary-only
```

## Parameter Space

### Full Benchmark

| Parameter | Values |
|-----------|--------|
| `num_features` | 8, 32, 128, 512 |
| `max_depth` | 2, 4, 8, 16, 32 |
| `num_trees` | 16, 128, 1024 |
| `batch_size` | 1, 16, 128, 1024, 1,048,576, 16,777,216 |

### Quick Test

| Parameter | Values |
|-----------|--------|
| `num_features` | 32 |
| `max_depth` | 4 |
| `num_trees` | 16 |
| `batch_size` | 1024 |
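The full benchmark sweeps the Cartesian product of the parameters in the table above. A minimal sketch of what that grid looks like (illustrative only, not the benchmark's actual code):

```python
from itertools import product

# Parameter values from the "Full Benchmark" table above
num_features = [8, 32, 128, 512]
max_depth = [2, 4, 8, 16, 32]
num_trees = [16, 128, 1024]
batch_size = [1, 16, 128, 1024, 1_048_576, 16_777_216]

# Every benchmark configuration is one point in the Cartesian product
configs = list(product(num_features, max_depth, num_trees, batch_size))
print(len(configs))  # 4 * 5 * 3 * 6 = 360 configurations per framework/model/device
```

This is why a full run is expensive: 360 configurations are repeated per framework, model type, and device, which is what `--quick-test` (a single configuration) short-circuits.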
## Device Handling

The `--device` parameter affects how both the native frameworks and nvforest run inference:

### nvforest
- **CPU**: Uses the CPU inference backend
- **GPU**: Uses the GPU inference backend with CuPy arrays

### XGBoost
- **CPU**: Standard CPU inference with DMatrix
- **GPU**: Models are trained with `device="cuda"` and use `inplace_predict` for GPU inference
### LightGBM
- **CPU**: Standard CPU inference
- **GPU**: LightGBM GPU training and inference require the library to be built with GPU support:
```bash
# Option 1: Build from source with OpenCL support
cmake -DUSE_GPU=1 ..

# Option 2: Recent official PyPI wheels ship with OpenCL GPU support built in
pip install lightgbm
```
When GPU is requested and LightGBM GPU support is available, models are trained with `device="gpu"` and inference uses the GPU-trained model. If GPU support is not available, training and inference fall back to CPU with a warning.
### sklearn
- **CPU**: Standard CPU inference
- **GPU**: sklearn is CPU-only. For GPU benchmarks, native inference runs on CPU as a baseline, so the speedup comparison reflects nvforest GPU vs sklearn CPU.

> **Note**: When `device=gpu`, XGBoost models are trained on the GPU, which enables GPU-native inference. This provides a fair comparison between XGBoost GPU inference and nvforest GPU inference.

## Output

Results are saved as CSV files in the output directory:

- `checkpoint_N.csv` - Periodic checkpoints written during the benchmark
- `final_results.csv` - Complete results

**Columns:**

| Column | Description |
|--------|-------------|
| `framework` | ML framework (sklearn, xgboost, lightgbm) |
| `model_type` | regressor or classifier |
| `device` | cpu or gpu |
| `num_features` | Number of input features |
| `max_depth` | Maximum tree depth |
| `num_trees` | Number of trees in the ensemble |
| `batch_size` | Inference batch size |
| `native_time` | Native framework inference time (seconds) |
| `nvforest_time` | nvforest inference time (seconds) |
| `optimal_layout` | Layout selected by nvforest `optimize()` |
| `optimal_chunk_size` | Chunk size selected by nvforest `optimize()` |
| `speedup` | `native_time / nvforest_time` |
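Beyond the bundled `analyze` command, the CSV schema above is easy to slice directly with pandas. A small sketch, using made-up toy rows in place of a real `final_results.csv`:

```python
import pandas as pd

# Toy rows shaped like final_results.csv (the timings are invented for illustration)
df = pd.DataFrame({
    "framework": ["sklearn", "sklearn", "xgboost", "xgboost"],
    "device": ["gpu", "gpu", "gpu", "gpu"],
    "batch_size": [1024, 1_048_576, 1024, 1_048_576],
    "native_time": [0.080, 4.0, 0.020, 0.90],
    "nvforest_time": [0.010, 0.25, 0.008, 0.30],
})
df["speedup"] = df["native_time"] / df["nvforest_time"]

# Geometric mean is the conventional way to average speedup ratios,
# since it weights 2x slowdowns and 2x speedups symmetrically
gmean = df.groupby("framework")["speedup"].apply(lambda s: s.prod() ** (1 / len(s)))
print(gmean.round(2))
```

For real results, replace the toy DataFrame with `pd.read_csv("data/final_results.csv")` and add `model_type`/`device` to the `groupby` keys as needed.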
## Dependencies

Required:
- `click`
- `pandas`
- `numpy`

Optional (for specific frameworks):
- `scikit-learn` - for sklearn benchmarks
- `xgboost` - for XGBoost benchmarks
- `lightgbm` - for LightGBM benchmarks
- `matplotlib`, `seaborn` - for result visualization
A second new file (6 lines) contains the package's SPDX license header and module docstring:

```python
#
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0
#

"""Benchmark suite for nvforest comparing against native ML framework inference."""
```
**Review comment (LightGBM GPU installation):**

The `pip install lightgbm --install-option=--gpu` command in the README no longer works: the `--install-option` flag was deprecated in pip 21.3 and removed in pip 23.1. Recent official LightGBM wheels on PyPI ship with OpenCL GPU support built in (used via `device_type="gpu"`), so a plain `pip install lightgbm` is sufficient for the OpenCL backend. If the separate CUDA backend (`device_type="cuda"`) is needed, LightGBM must be built from source with CMake using `-DUSE_CUDA=ON` (Linux only; the CUDA backend is not supported on Windows). The README should also state which GPU backend the benchmark targets and any driver prerequisites.