arXiv:2004.11165 [stat.ML]

Multi-Objective Counterfactual Explanations

Susanne Dandl, Christoph Molnar, Martin Binder, Bernd Bischl

Published 2020-04-23 (Version 1)

Counterfactual explanations are one of the most popular methods for making the predictions of black-box machine learning models interpretable by providing explanations in the form of 'what-if' scenarios. Current approaches can compute counterfactuals only for certain model classes or feature types, or they generate counterfactuals that are not consistent with the observed data distribution. To overcome these limitations, we propose the Multi-Objective Counterfactuals (MOC) method, which translates the counterfactual search into a multi-objective optimization problem and solves it with a genetic algorithm based on NSGA-II. It returns a diverse set of counterfactuals with different trade-offs between the proposed objectives, enabling either a more detailed post-hoc analysis to facilitate better understanding or more options for actionable user responses to change the predicted outcome. We show the usefulness of MOC in concrete cases and compare our approach with state-of-the-art methods for counterfactual explanations.
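
The optimization setup described in the abstract can be illustrated with a small sketch, assuming an off-the-shelf NSGA-II from the pymoo library rather than the paper's own implementation. The four objectives below (closeness of the prediction to the target, proximity to the original instance, sparsity of changes, plausibility with respect to observed data) loosely mirror MOC's, but use simplified L1/L2 distances where the paper uses the Gower distance, and predict, X_train, and x_orig are synthetic stand-ins invented for the example.

import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))        # synthetic stand-in for the observed data
w = np.array([1.5, -2.0, 0.5, 1.0])

def predict(x):
    # Stand-in black-box model; the method only needs query access to predictions.
    return 1.0 / (1.0 + np.exp(-x @ w))

x_orig = np.array([0.5, 0.8, -0.2, 1.1])   # instance whose prediction we want to change
target = 0.9                               # desired prediction

class CounterfactualSearch(ElementwiseProblem):
    def __init__(self):
        super().__init__(n_var=4, n_obj=4, xl=-3.0, xu=3.0)

    def _evaluate(self, x, out, *args, **kwargs):
        o1 = abs(predict(x) - target)                     # prediction close to target
        o2 = np.abs(x - x_orig).sum()                     # stay close to the original instance
        o3 = np.sum(~np.isclose(x, x_orig, atol=1e-2))    # sparsity: number of changed features
        o4 = np.linalg.norm(X_train - x, axis=1).min()    # plausibility: near observed data
        out["F"] = [o1, o2, o3, o4]

res = minimize(CounterfactualSearch(), NSGA2(pop_size=40), ("n_gen", 60),
               seed=1, verbose=False)
print(len(res.X), "nondominated counterfactual candidates")  # one candidate per row of res.X

Each row of res.X is a candidate counterfactual with a different trade-off between the four objectives, which is what enables the post-hoc analysis and the actionable choices described above. MOC itself additionally uses tailored initialization and mutation operators and handles mixed feature types via the Gower distance, none of which this sketch reproduces.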

Related articles:
arXiv:1910.13376 [stat.ML] (Published 2019-10-29)
How Much Can We See? A Note on Quantifying Explainability of Machine Learning Models
arXiv:2310.03112 [stat.ML] (Published 2023-10-04)
Leveraging Model-based Trees as Interpretable Surrogate Models for Model Distillation
arXiv:2403.10250 [stat.ML] (Published 2024-03-15)
Interpretable Machine Learning for Survival Analysis