Predictions Mismatch #802

Hello ✋🏻 
I have tried several ways of converting a LightGBM `LGBMRanker` model to ONNX, but in every case the converted model's predictions differ from the original model's.

Here is an example using Hummingbird:
```python
import lightgbm as lgb
import pandas as pd
import numpy as np
from hummingbird.ml import convert, load

# Define the number of samples
num_samples = 100

# Generate random data for numerical features
num_data = {
    "col1": np.random.randint(1, 100, num_samples),
    "col2": np.random.uniform(10.0, 100.0, num_samples),
    "col3": np.random.uniform(1.0, 10.0, num_samples),
    "col4": np.random.uniform(0.0, 0.5, num_samples),
}

# Generate random data for categorical features
cat_data = {
    "col5": np.random.choice(["dummy1", "dummy2", "dummy3"], num_samples),
    "col6": np.random.choice(["dummy1", "dummy2", "dummy3"], num_samples),
    "col7": np.random.choice(["dummy1", "dummy2", "dummy3"], num_samples),
}

target = {"target": [i % 2 for i in range(num_samples)]}

data = num_data | cat_data | target

df = pd.DataFrame(data)

cat_mapping = {
    col: {val: idx for idx, val in enumerate(df[col].unique())}
    for col in cat_data.keys()  # Iterate through the column names in cat_data
}
for cat_col in list(cat_data.keys()):
    df[cat_col] = df[cat_col].map(cat_mapping[cat_col])

# Copy so the dtype conversion below acts on an independent frame, not a slice of df
X = df[list(num_data.keys()) + list(cat_data.keys())].copy()
Y = df[list(target.keys())]

for cat_col in cat_data.keys():
    X[cat_col] = X[cat_col].astype('category')

model = lgb.LGBMRanker(
    objective="lambdarank",
    metric=["ndcg", "map"],
    boosting_type="gbdt",
    categorical_feature=list(cat_data.keys()),
    n_estimators=100,
    # Set explicitly to avoid the overhead of LightGBM auto-choosing col-wise multi-threading
    force_col_wise=True,
    random_state=42
)

# Train the model as an example, using 50 query groups of 2 rows each
model.fit(
    X, 
    Y, 
    group=[2]*50,
)

# .loc[:10] is label-based and inclusive, so this selects 11 rows
test_input = np.array(X.loc[:10].values, dtype=np.float32)

onnx_model = convert(
    model, 
    "onnx", 
    test_input=test_input,
)

model.predict(test_input)
# array([-1.03617303,  1.31771085, -0.04840754,  1.2865519 ,  0.49542698,
#         1.49092876, -1.36244258,  0.15192526, -1.50055302, -0.34809177,
#        -0.31897834])

onnx_model.predict(test_input)
# array([-0.83041805,  1.2326361 , -0.518405  ,  0.79480076,  0.24598463,
#         0.7074484 , -1.706661  , -0.18985131, -0.98669803,  0.00681482,
#         0.1253777 ], dtype=float32)
```
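To show that this is more than float32 rounding noise, the gap between the two outputs can be quantified. The snippet below is only a minimal sketch: `compare_predictions` is a hypothetical helper written for this report (it is not part of Hummingbird or LightGBM), the tolerances are arbitrary assumptions, and the values are copied from the outputs above.

```python
import math


def compare_predictions(expected, actual, rel_tol=1e-5, abs_tol=1e-5):
    """Return (max absolute difference, indices that differ beyond tolerance).

    Hypothetical helper; the default tolerances are assumptions, not library defaults.
    """
    expected = [float(v) for v in expected]
    actual = [float(v) for v in actual]
    if len(expected) != len(actual):
        raise ValueError("prediction sequences differ in length")
    diffs = [abs(e - a) for e, a in zip(expected, actual)]
    mismatched = [
        i
        for i, (e, a) in enumerate(zip(expected, actual))
        if not math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    return max(diffs, default=0.0), mismatched


# Values copied from the two outputs reported above.
lgbm_preds = [-1.03617303, 1.31771085, -0.04840754, 1.2865519, 0.49542698,
              1.49092876, -1.36244258, 0.15192526, -1.50055302, -0.34809177,
              -0.31897834]
onnx_preds = [-0.83041805, 1.2326361, -0.518405, 0.79480076, 0.24598463,
              0.7074484, -1.706661, -0.18985131, -0.98669803, 0.00681482,
              0.1253777]

max_diff, mismatched = compare_predictions(lgbm_preds, onnx_preds)
print(f"max abs diff: {max_diff:.4f}")
print(f"rows outside tolerance: {len(mismatched)} of {len(lgbm_preds)}")
```

With the values above, every row falls outside the tolerance and the largest gap is roughly 0.78, so the mismatch is well beyond what float32 precision alone would explain.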

Does anyone know why? Is there anything in the conversion that I am missing?
