
Conversation

@Jamie001129 (Contributor) commented on Oct 24, 2025

Implementation Details
(1) 4 variants implemented: none, sparsity, proximity, plausibility
(2) Source: Adapted from official NICE repository (https://github.com/DBrughmans/NICE)
(3) Paper: Brughmans et al. (2024) "NICE: an algorithm for nearest instance counterfactual explanations" Data Mining and Knowledge Discovery
(4) Dataset: Adult
(5) Predictive models: Random Forest and MLP

Potential Differences
(1) The original autoencoder architecture was not provided, so we built our own; a minimal sketch of the kind of model we mean is shown after this list.
(2) The 200 test samples were originally chosen at random, so our samples may differ from the paper's.
(3) Runtimes differ across machines, but the relative ranking of the four variants is the same.
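
A minimal sketch of such an autoencoder (layer sizes and the PyTorch formulation are illustrative choices on our part, not taken from the paper and not necessarily identical to `library/autoencoder.py`):

```python
import torch
import torch.nn as nn


class TabularAutoencoder(nn.Module):
    """Small fully connected autoencoder; the per-instance reconstruction
    error is used as the plausibility (AE error) score."""

    def __init__(self, n_features: int, hidden: int = 16, latent: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, latent), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(),
            nn.Linear(hidden, n_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

    def reconstruction_error(self, x: torch.Tensor) -> torch.Tensor:
        # Mean squared reconstruction error per instance.
        with torch.no_grad():
            return ((self.forward(x) - x) ** 2).mean(dim=1)
```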

Reproduced Results
(1) RF as predictive model (updated on 11/13/2025)

| Variant | Coverage | CPU (ms) | Sparsity | Proximity (L1) | Plausibility |
|---|---|---|---|---|---|
| none | 200/200 | 20.11 | 3.19 ± 1.07 | 0.52 ± 0.52 | 0.0885 ± 0.0209 |
| sparsity | 200/200 | 53.58 | 1.61 ± 0.91 | 0.34 ± 0.42 | 0.0886 ± 0.0220 |
| proximity | 200/200 | 56.70 | 1.77 ± 1.01 | 0.32 ± 0.41 | 0.0887 ± 0.0224 |
| plausibility | 200/200 | 70.33 | 2.15 ± 1.08 | 0.38 ± 0.43 | 0.0892 ± 0.0227 |

(2) MLP as predictive model

| Variant | Coverage | CPU (ms) | Sparsity | Proximity (HEOM) | Plausibility (AE error) |
|---|---|---|---|---|---|
| none | 200/200 | 6.70 | 3.89 ± 1.37 | 3.14 ± 1.34 | 0.2044 ± 0.0319 |
| sparsity | 200/200 | 10.80 | 1.22 ± 0.50 | 1.06 ± 0.52 | 0.2008 ± 0.0360 |
| proximity | 200/200 | 11.34 | 1.41 ± 0.69 | 1.09 ± 0.67 | 0.2034 ± 0.0346 |
| plausibility | 200/200 | 19.68 | 2.37 ± 1.37 | 2.04 ± 1.24 | 0.2014 ± 0.0342 |

Files Added/Modified
Main implementation:
methods/catalog/nice/model.py - Main NICE wrapper class implementing RecourseMethod interface
methods/catalog/nice/reproduce.py - Comprehensive test reproducing the paper's results (part of Table 6)

Library components:
methods/catalog/nice/library/__init__.py - Library exports
methods/catalog/nice/library/autoencoder.py - Autoencoder for plausibility measurement
methods/catalog/nice/library/data.py - Data handling and candidate filtering
methods/catalog/nice/library/distance.py - HEOM distance metric implementation
methods/catalog/nice/library/heuristic.py - Best-first greedy search
methods/catalog/nice/library/reward.py - Three reward functions (sparsity, proximity, plausibility)

Integration:
Updated methods/__init__.py to export NICE
Updated methods/catalog/__init__.py to include NICE
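
For reference, a quick usage sketch of the wrapper. The import paths for the catalogs, the ModelCatalog constructor arguments, and the `df_test` attribute are assumptions; the DataCatalog and NICE calls mirror the ones in reproduce.py:

```python
# Import paths for the catalogs are assumptions -- adjust to the actual package layout.
from data.catalog import DataCatalog
from models.catalog import ModelCatalog
from methods import NICE

data = DataCatalog("adult", model_type="mlp", train_split=0.7)  # same call as in reproduce.py
model = ModelCatalog(data, model_type="mlp")                    # constructor arguments are an assumption

nice = NICE(mlmodel=model, hyperparams={"optimization": "sparsity"})
factuals = data.df_test.head(200)                      # attribute name is an assumption
counterfactuals = nice.get_counterfactuals(factuals)   # RecourseMethod interface
```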

std_ae_error = ae_errors.std()

# ============================================
# PRINT ALL FOUR METRICS (like Table 5 in paper)
Collaborator:

I ran these tests locally, but the printed results don’t match the values reported in the paper. Please turn these print statements into assertions using the numbers from the table (a small tolerance is acceptable).
Also, these tests use the Random Forest model, so the correct reference is Table 6.
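
For example, something along these lines, using the reproduced RF numbers above as reference values (the `rf_metrics` fixture is a placeholder for whatever computes the per-variant means; the tolerance is illustrative):

```python
import pytest

# Reference values taken from the reproduced RF table above.
EXPECTED_RF = {
    "none":         {"sparsity": 3.19, "proximity_l1": 0.52},
    "sparsity":     {"sparsity": 1.61, "proximity_l1": 0.34},
    "proximity":    {"sparsity": 1.77, "proximity_l1": 0.32},
    "plausibility": {"sparsity": 2.15, "proximity_l1": 0.38},
}


@pytest.mark.parametrize("variant", list(EXPECTED_RF))
def test_rf_sparsity_matches_table6(variant, rf_metrics):
    # rf_metrics is a placeholder fixture returning per-variant mean metrics.
    assert rf_metrics[variant]["sparsity"] == pytest.approx(
        EXPECTED_RF[variant]["sparsity"], abs=0.15
    )
```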

elif optimization == "none":
# None should be very plausible (it's an actual instance!)
# But we allow some tolerance since we measure on test set
assert avg_ae_error <= 0.02, \
Collaborator:

I couldn’t find where the paper reports the average error rate. The only place I see something similar is in Table 7, but that value seems different from what’s being checked here. Could you point me to the exact reference?

Contributor Author (@Jamie001129), Nov 1, 2025:

There is an online_appendix.xlsx table in the NICE_experiments repo (https://github.com/DBrughmans/NICE_experiments/online_appendix.xlsx) that contains raw results instead of ranks. The dataset I access through DataCatalog is normalized, while the author uses a different preprocessing workflow, so my AE error comes out much smaller. I'm still working on it; do we need to match the author's preprocessing in our implementation?

for opt in ["none", "sparsity", "proximity", "plausibility"]:
nice = NICE(mlmodel=model, hyperparams={"optimization": opt})

# Measure CPU time
Collaborator:

CPU time isn’t a reliable metric for unit tests, since it depends on the hardware and environment where the code is executed.
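
If the timing is still worth reporting, it could be logged without asserting on it, e.g. (sketch; `nice` and `factuals` are placeholder fixtures, and it assumes counterfactuals come back as a DataFrame with NaN rows for failures):

```python
import logging
import time

logger = logging.getLogger(__name__)


def test_nice_full_coverage(nice, factuals):
    start = time.perf_counter()
    counterfactuals = nice.get_counterfactuals(factuals)
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    # Timing is logged for information only; no assertion on hardware-dependent values.
    logger.info("NICE returned %d rows in %.2f ms", len(counterfactuals), elapsed_ms)

    # Assert only on coverage, which is hardware-independent.
    assert len(counterfactuals.dropna()) == len(factuals)
```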

Contributor Author:

I have removed the CPU time assertions.

print(f" NICE({opt:<12}): {metrics['cpu_time_total_ms']:>8.2f} ms total "
f"({metrics['cpu_time_avg_ms']:>6.2f} ms per instance)")

# Verify expectations
Collaborator:

I’d suggest making each of these assertions a separate unit test for better clarity.
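
For example (the `metrics` fixture is a placeholder keyed by variant, with metric names following the tables above):

```python
def test_sparsity_variant_reduces_sparsity(metrics):
    # The sparsity-optimised variant should change fewer features than plain NICE.
    assert metrics["sparsity"]["sparsity"] < metrics["none"]["sparsity"]


def test_proximity_variant_reduces_l1_distance(metrics):
    # The proximity-optimised variant should stay closer to the factual than plain NICE.
    assert metrics["proximity"]["proximity_l1"] < metrics["none"]["proximity_l1"]
```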

print(f"✓ NICE integrates correctly with {dataset_name} dataset")


if __name__ == "__main__":
Collaborator:

This script runs tests manually with print statements, but we should convert it into proper unit tests (e.g., using pytest) instead of using "print" outputs.
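
A possible shape for that conversion, reusing the existing loop over the four variants (the `model` and `factuals` fixtures are placeholders):

```python
import pytest

from methods import NICE


@pytest.mark.parametrize("optimization", ["none", "sparsity", "proximity", "plausibility"])
def test_nice_quality(optimization, model, factuals):
    nice = NICE(mlmodel=model, hyperparams={"optimization": optimization})
    counterfactuals = nice.get_counterfactuals(factuals)

    # An assertion replaces the old print, e.g. full coverage as reported in the tables above.
    assert len(counterfactuals.dropna()) == len(factuals)
```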

Collaborator:

These three tests failed when I tried to run them; please fix them so they pass:
test_nice_quality[mlp-proximity]
test_nice_quality[mlp-plausibility]
nice_variants_comparison[mlp]

Collaborator:

Please avoid having multiple assertions in a single unit test. I’d suggest keeping one assertion per test to make it clearer and easier to debug later.

"""
Test that NICE produces quality counterfactuals with all metrics in expected ranges.
"""
data = DataCatalog("adult", model_type=model_type, train_split=0.7)
Collaborator:

Please use pytest fixtures to build the DataCatalog/ModelCatalog and the AutoEncoder once per dataset/model, then reuse them across tests. Each test can still create a fresh NICE instance and slice the same factuals for isolation.
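
Something like this (the DataCatalog call matches the diff; the ModelCatalog and Autoencoder constructors, the `df_test` attribute, and the "forest" identifier are assumptions):

```python
import pytest

# Import paths and constructor signatures are assumptions -- adjust to the repo layout.
from data.catalog import DataCatalog
from models.catalog import ModelCatalog
from methods import NICE
from methods.catalog.nice.library.autoencoder import Autoencoder


@pytest.fixture(scope="module", params=["forest", "mlp"])  # the RF identifier is a guess
def setup(request):
    """Build DataCatalog, ModelCatalog and the autoencoder once per model type."""
    data = DataCatalog("adult", model_type=request.param, train_split=0.7)
    model = ModelCatalog(data, model_type=request.param)  # signature is an assumption
    autoencoder = Autoencoder(data)                       # signature is an assumption
    factuals = data.df_test.head(200)                     # attribute name is an assumption
    return model, autoencoder, factuals


def test_nice_returns_counterfactuals(setup):
    model, _autoencoder, factuals = setup
    # Each test still builds a fresh NICE instance for isolation.
    nice = NICE(mlmodel=model, hyperparams={"optimization": "sparsity"})
    assert nice.get_counterfactuals(factuals) is not None
```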

Collaborator (@zkhotanlou) left a comment:

Also, please fetch the changes from the main branch so that the pre-commit hooks can run successfully.
