arXiv Analytics

arXiv:1907.03419 [cs.LG]

The Price of Interpretability

Dimitris Bertsimas, Arthur Delarue, Patrick Jaillet, Sebastien Martin

Published 2019-07-08, Version 1

When quantitative models are used to support decision-making on complex and important topics, understanding a model's "reasoning" can increase trust in its predictions, expose hidden biases, or reduce vulnerability to adversarial attacks. However, the concept of interpretability remains loosely defined and application-specific. In this paper, we introduce a mathematical framework in which machine learning models are constructed in a sequence of interpretable steps. We show that for a variety of models, a natural choice of interpretable steps recovers standard interpretability proxies (e.g., sparsity in linear models). We then generalize these proxies to yield a parametrized family of consistent measures of model interpretability. This formal definition allows us to quantify the "price" of interpretability, i.e., its tradeoff with predictive accuracy. We demonstrate practical algorithms for applying our framework to real and synthetic datasets.
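The abstract's example of sparsity as an interpretability proxy for linear models can be illustrated concretely. Below is a minimal sketch, not the paper's algorithm: it traces an accuracy-versus-sparsity curve by sweeping a Lasso penalty on synthetic data, where sparser coefficient vectors stand in for more interpretable models. The dataset, the use of scikit-learn's Lasso, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): trace the tradeoff
# between sparsity (an interpretability proxy for linear models) and
# predictive accuracy, using Lasso regularization on synthetic data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression problem with 5 truly informative features out of 20.
X, y = make_regression(n_samples=500, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stronger alpha -> sparser (more "interpretable") linear model,
# typically at some price in out-of-sample accuracy.
for alpha in [0.01, 0.1, 1.0, 5.0, 20.0, 50.0]:
    model = Lasso(alpha=alpha).fit(X_tr, y_tr)
    n_nonzero = int(np.sum(model.coef_ != 0))
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"alpha={alpha:>5}: {n_nonzero:2d} nonzero coefficients, "
          f"test MSE={mse:.1f}")
```

Printing the number of nonzero coefficients alongside test error for each penalty level gives one simple, concrete view of the "price" the abstract refers to: how much accuracy must be given up to reach a given level of sparsity.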

Related articles:
arXiv:1802.06552 [cs.LG] (Published 2018-02-19)
Are Generative Classifiers More Robust to Adversarial Attacks?
arXiv:2002.03839 [cs.LG] (Published 2020-02-10)
Adversarial Attacks on Linear Contextual Bandits
arXiv:1902.10755 [cs.LG] (Published 2019-02-27)
Adversarial Attacks on Time Series