arXiv Analytics

arXiv:1907.03419 [cs.LG]

The Price of Interpretability

Dimitris Bertsimas, Arthur Delarue, Patrick Jaillet, Sebastien Martin

Published 2019-07-08, Version 1

When quantitative models are used to support decision-making on complex and important topics, understanding a model's "reasoning" can increase trust in its predictions, expose hidden biases, or reduce vulnerability to adversarial attacks. However, the concept of interpretability remains loosely defined and application-specific. In this paper, we introduce a mathematical framework in which machine learning models are constructed in a sequence of interpretable steps. We show that for a variety of models, a natural choice of interpretable steps recovers standard interpretability proxies (e.g., sparsity in linear models). We then generalize these proxies to yield a parametrized family of consistent measures of model interpretability. This formal definition allows us to quantify the "price" of interpretability, i.e., its tradeoff with predictive accuracy. We demonstrate practical algorithms for applying our framework to real and synthetic datasets.
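The abstract's example of sparsity as an interpretability proxy for linear models can be illustrated concretely. Below is a minimal sketch, not the paper's algorithm: it traces an accuracy-versus-sparsity curve by sweeping a Lasso penalty on synthetic data, where sparser coefficient vectors stand in for more interpretable models. The dataset, the use of scikit-learn's Lasso, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): trace the tradeoff
# between sparsity (an interpretability proxy for linear models) and
# predictive accuracy, using Lasso regularization on synthetic data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem with 5 truly informative features out of 20.
X, y = make_regression(n_samples=500, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stronger alpha -> sparser (more "interpretable") linear model,
# typically at some price in out-of-sample accuracy.
for alpha in [0.01, 0.1, 1.0, 5.0, 20.0, 50.0]:
    model = Lasso(alpha=alpha).fit(X_tr, y_tr)
    n_nonzero = int(np.sum(model.coef_ != 0))
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"alpha={alpha:>5}: {n_nonzero:2d} nonzero coefficients, "
          f"test MSE={mse:.1f}")
```

Printing the number of nonzero coefficients alongside test error for each penalty level gives one simple, concrete view of the "price" the abstract refers to: how much accuracy must be given up to reach a given level of sparsity.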

Related articles:
arXiv:1802.06552 [cs.LG] (Published 2018-02-19)
Are Generative Classifiers More Robust to Adversarial Attacks?
arXiv:2002.03839 [cs.LG] (Published 2020-02-10)
Adversarial Attacks on Linear Contextual Bandits
arXiv:1902.10755 [cs.LG] (Published 2019-02-27)
Adversarial Attacks on Time Series