Skip to content

enh - add Y transformation keyword argument in NormativeModel to enforce positivity#372

Open
contsili wants to merge 2 commits intodevfrom
log_positivity
Open

enh - add Y transformation keyword argument in NormativeModel to enforce positivity#372
contsili wants to merge 2 commits intodevfrom
log_positivity

Conversation

@contsili
Copy link
Collaborator

@contsili contsili commented Feb 12, 2026

Closes #356

If Y<-1 then log(Y+1) cant be computed (ValueError) but I expect Y to always be >=-1 (since the functionality i added in this pr is only for data that are always positive). If we want to be conservative we can add though a check like:

# Validate that values are > -1 (required for log1p)
min_val = float(np.min(data[var].values))
if min_val <= -1.0:
    raise ValueError(
        f"y_transform='log1p' requires all {var} values > -1 but your {var} are not"
    )

We can still consider to add a test and/or an example.

@AuguB
Copy link
Collaborator

AuguB commented Feb 12, 2026

Will this solve the problem addressed in issue #365? As I see it, since the standardization happens after the log transform, this can and probably will still yield negative values, resulting in invalid input to WarpLog.

Additionally, why would we apply a log transform as a preparation for a log transform?

Some ideas:

  • Explicitly disallowing standardize+WarpLog. A minmaxscaler can be used instead. Informative messages to the user can be helpful here.
  • Adding a new scaler that only divides the data by its standard deviation, which amounts to a scaling around y=0; positive data stays positive. Such a scaler could be automatically applied when all of the following hold:
  1. Y is strictly positive
  2. BLR with Warplog as regressionmodel
  3. The selected outscaler is standardize
    Then we can safely overwrite the outscaler with the new scaler (e.g. scale_only_standardize)

I do agree the option to add a log transform on Y as it is implemented now is a good idea, and it should be kept somehow. I just don't think it really solves the problem of using WarpLog + standardize. EDIT: I realize now that this PR is not intended to solve #365.

@contsili
Copy link
Collaborator Author

contsili commented Feb 13, 2026

as you mentioned in the end of your previous comment, this pr is not meant to solve #365, but #356. #356 was proposed by @amarquand. It is meant to not allow for strictly positive response variables (like brain volumes or WM hypoinstensities) to have negative centiles (see for example, fig. 6(e) in [1]

For #365 for now we implemented something close to your first idea: see #366. However, your second idea is also a good one. We can make an issue to implement it.

@amarquand
Copy link
Owner

Indeed. We have a user that wants to apply the shash model to strictly positive data, and get the centiles back in the original space. But @AuguB is right that the interaction with the inscaler and outscaler needs to be carefully considered.

Has this been tested?

@contsili
Copy link
Collaborator Author

contsili commented Feb 13, 2026

I wrote an example script which i have locally.

I used inscaler and outscaler = 'standardise', dataset fcon1000:

1. BLR(heteroskedastic=True)
Not enforced positivity:
image

Enforced positivity:
image

2. HBR() (default parameters)
Not enforced positivity
image

Enforced positivity:
image

@amarquand how do these results look?

@contsili
Copy link
Collaborator Author

contsili commented Feb 24, 2026

I will add a test script for this feature and then we are ready to merge it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants