python/nvforest/nvforest/benchmark/README.md (161 additions, 0 deletions)

# nvforest Benchmark Suite

Comprehensive benchmark comparing nvforest inference performance against native ML framework inference (sklearn, XGBoost, LightGBM).

## Quick Start

```bash
# Dry run - see what will be benchmarked
python -m nvforest.benchmark.benchmark run --dry-run

# Quick test - verify setup with minimal parameters
python -m nvforest.benchmark.benchmark run --quick-test

# Full benchmark
python -m nvforest.benchmark.benchmark run
```

## Usage

### Running Benchmarks

```bash
python -m nvforest.benchmark.benchmark run [OPTIONS]
```

**Options:**

| Option | Short | Description |
|--------|-------|-------------|
| `--framework` | `-f` | Framework(s) to benchmark: `sklearn`, `xgboost`, `lightgbm`. Repeatable. Default: all available |
| `--dry-run` | `-n` | Print configuration without running |
| `--quick-test` | `-q` | Run with minimal parameters for quick verification |
| `--device` | `-d` | Device: `cpu`, `gpu`, or `both`. Default: `both` |
| `--model-type` | `-m` | Model type: `regressor`, `classifier`, or `both`. Default: `both` |
| `--output-dir` | `-o` | Output directory for results. Default: `benchmark/data/` |

**Examples:**

```bash
# Benchmark only sklearn on CPU
python -m nvforest.benchmark.benchmark run --framework sklearn --device cpu

# Benchmark XGBoost and LightGBM classifiers only
python -m nvforest.benchmark.benchmark run -f xgboost -f lightgbm -m classifier

# Quick test with specific framework
python -m nvforest.benchmark.benchmark run --quick-test --framework sklearn
```

### Analyzing Results

```bash
python -m nvforest.benchmark.analyze RESULTS_FILE [OPTIONS]
```

**Options:**

| Option | Short | Description |
|--------|-------|-------------|
| `--output` | `-o` | Output file for speedup heatmap plot |
| `--framework` | `-f` | Filter results to specific framework |
| `--device` | `-d` | Filter results to specific device (`cpu` or `gpu`) |
| `--plot-only` | | Only generate plot, skip summary |
| `--summary-only` | | Only print summary, skip plot |

**Examples:**

```bash
# Analyze results and generate plots
python -m nvforest.benchmark.analyze data/final_results.csv

# Summary only for GPU results
python -m nvforest.benchmark.analyze data/final_results.csv --device gpu --summary-only
```
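The summary that `analyze` prints can be approximated by hand with pandas. Below is a minimal sketch (not the tool's actual implementation) that groups a results CSV by framework and averages the speedup, assuming only the `framework` and `speedup` columns documented under Output:

```python
import io

import pandas as pd

# Inline stand-in for data/final_results.csv; real files contain the
# full column set described in the Output section.
csv = io.StringIO(
    "framework,device,speedup\n"
    "sklearn,cpu,3.2\n"
    "sklearn,cpu,4.8\n"
    "xgboost,gpu,1.5\n"
)
df = pd.read_csv(csv)

# Mean speedup per framework, analogous to the printed summary.
summary = df.groupby("framework")["speedup"].mean()
print(summary.to_dict())  # {'sklearn': 4.0, 'xgboost': 1.5}
```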

## Parameter Space

### Full Benchmark

| Parameter | Values |
|-----------|--------|
| `num_features` | 8, 32, 128, 512 |
| `max_depth` | 2, 4, 8, 16, 32 |
| `num_trees` | 16, 128, 1024 |
| `batch_size` | 1, 16, 128, 1024, 1,048,576, 16,777,216 |
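The full grid is the Cartesian product of the values above. This sketch (variable names are illustrative, not the benchmark's internal configuration objects) shows how many configurations that yields per framework/device/model type:

```python
from itertools import product

# Parameter grid mirroring the full-benchmark table.
grid = {
    "num_features": [8, 32, 128, 512],
    "max_depth": [2, 4, 8, 16, 32],
    "num_trees": [16, 128, 1024],
    "batch_size": [1, 16, 128, 1024, 1_048_576, 16_777_216],
}

# One dict per configuration, in grid order.
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]
print(len(configs))  # 360 (= 4 * 5 * 3 * 6)
```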

### Quick Test

| Parameter | Values |
|-----------|--------|
| `num_features` | 32 |
| `max_depth` | 4 |
| `num_trees` | 16 |
| `batch_size` | 1024 |

## Device Handling

The `--device` parameter affects how both native frameworks and nvforest run inference:

### nvforest
- **CPU**: Uses CPU inference backend
- **GPU**: Uses GPU inference backend with cupy arrays

### XGBoost
- **CPU**: Standard CPU inference with DMatrix
- **GPU**: Models are trained with `device="cuda"` and use `inplace_predict` for GPU inference

### LightGBM
- **CPU**: Standard CPU inference
- **GPU**: LightGBM GPU (OpenCL) training and inference require GPU support in the installed library:
```bash
# Option 1: Build from source with the OpenCL backend
cmake -DUSE_GPU=1 ..

# Option 2: Official PyPI wheels ship with OpenCL GPU support built in;
# just make sure OpenCL drivers/runtime are installed
pip install lightgbm
```
  Note: LightGBM's separate CUDA backend (`device="cuda"`) requires building from source with `-DUSE_CUDA=ON`; the benchmark uses the OpenCL backend (`device="gpu"`).
When GPU is requested and LightGBM GPU support is available, models are trained with `device="gpu"` and inference uses the GPU-trained model. If GPU support is not available, training/inference falls back to CPU with a warning.

### sklearn
- **CPU**: Standard CPU inference
- **GPU**: sklearn is CPU-only. For GPU benchmarks, native inference runs on CPU as a baseline. The speedup comparison reflects nvforest GPU vs sklearn CPU.

> **Note**: When `device=gpu`, XGBoost models are trained on GPU which enables GPU-native inference. This provides a fair comparison between XGBoost GPU inference and nvforest GPU inference.

## Output

Results are saved as CSV files in the output directory:

- `checkpoint_N.csv` - Periodic checkpoints during benchmark
- `final_results.csv` - Complete results

**Columns:**

| Column | Description |
|--------|-------------|
| `framework` | ML framework (sklearn, xgboost, lightgbm) |
| `model_type` | regressor or classifier |
| `device` | cpu or gpu |
| `num_features` | Number of input features |
| `max_depth` | Maximum tree depth |
| `num_trees` | Number of trees in ensemble |
| `batch_size` | Inference batch size |
| `native_time` | Native framework inference time (seconds) |
| `nvforest_time` | nvforest inference time (seconds) |
| `optimal_layout` | Layout selected by nvforest optimize() |
| `optimal_chunk_size` | Chunk size selected by nvforest optimize() |
| `speedup` | native_time / nvforest_time |
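The `speedup` column is derived directly from the two timing columns; with hypothetical timings:

```python
# Hypothetical timings illustrating the speedup column definition.
native_time = 0.84    # seconds, native framework inference
nvforest_time = 0.07  # seconds, nvforest inference

speedup = native_time / nvforest_time
print(f"{speedup:.1f}x")  # 12.0x
```

Values above 1.0 mean nvforest was faster than the native framework for that configuration.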

## Dependencies

Required:
- `click`
- `pandas`
- `numpy`

Optional (for specific frameworks):
- `scikit-learn` - for sklearn benchmarks
- `xgboost` - for XGBoost benchmarks
- `lightgbm` - for LightGBM benchmarks
- `matplotlib`, `seaborn` - for result visualization

python/nvforest/nvforest/benchmark/__init__.py (6 additions, 0 deletions)

#
# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0
#

"""Benchmark suite for nvforest comparing against native ML framework inference."""