Hi everyone,
I’m doing a small workflow study at Course Correct Labs on how practitioners handle cases where different explanation methods (like LIME, SHAP, or Integrated Gradients) give conflicting feature importances for the same prediction.
When that happens in your work:
Do you just pick one method?
Try to understand why they disagree (something like the quick check sketched below)?
Average or ensemble their outputs?
Or ignore the disagreement entirely?
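To make the scenario concrete, here's a rough sketch of the kind of quick disagreement check I have in mind. The feature names and attribution values are made up for illustration (standing in for, say, SHAP and LIME outputs on one tabular prediction); it just compares rank order, top-k overlap, and sign agreement of two attribution vectors.

```python
# Minimal sketch of a disagreement check between two attribution methods.
# The feature names and attribution vectors are made-up placeholders
# standing in for, e.g., SHAP values and LIME weights for one prediction.
import numpy as np
from scipy.stats import spearmanr

feature_names = ["age", "income", "tenure", "balance", "num_products"]  # hypothetical

# Placeholder per-feature attributions (signed) for a single prediction.
shap_attr = np.array([0.42, -0.10, 0.05, 0.31, -0.02])
lime_attr = np.array([0.05, -0.38, 0.12, 0.30, 0.01])

# 1) Rank agreement on magnitude: do the methods order features similarly?
rho, _ = spearmanr(np.abs(shap_attr), np.abs(lime_attr))

# 2) Top-k overlap: do the methods nominate the same "most important" features?
k = 2
top_shap = set(np.argsort(np.abs(shap_attr))[::-1][:k])
top_lime = set(np.argsort(np.abs(lime_attr))[::-1][:k])
overlap = len(top_shap & top_lime) / k

# 3) Sign agreement: where both methods assign non-trivial weight,
#    do they at least agree on the direction of the effect?
mask = (np.abs(shap_attr) > 0.05) & (np.abs(lime_attr) > 0.05)
sign_agreement = np.mean(np.sign(shap_attr[mask]) == np.sign(lime_attr[mask]))

print(f"Spearman rank correlation of |attributions|: {rho:.2f}")
print(f"Top-{k} feature overlap: {overlap:.0%}")
print(f"Sign agreement on shared strong features: {sign_agreement:.0%}")
```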
I’m not promoting anything. Just trying to document how people handle this in real projects. Even short anecdotes (“we usually trust SHAP more for tabular models”) are really helpful.
Thanks in advance — I’ll post a short summary of findings once I have a few replies.