arXiv:2311.14581 [cs.LG]

Example-Based Explanations of Random Forest Predictions

Henrik Boström

Published 2023-11-24, Version 1

A random forest prediction can be computed as the scalar product of the labels of the training examples and a set of weights determined by the leaves of the forest into which the test object falls; each prediction can hence be explained exactly by the set of training examples with non-zero weights. The number of examples used in such explanations is shown to vary with the dimensionality of the training set and the hyperparameters of the random forest algorithm. This means that the number of examples involved in each prediction can, to some extent, be controlled by varying these parameters. However, for settings that achieve the required predictive performance, the number of examples involved in each prediction may be unreasonably large, preventing the user from grasping the explanations. In order to provide more useful explanations, a modified prediction procedure is proposed, which includes only the top-weighted examples. An investigation on regression and classification tasks shows that the number of examples used in each explanation can be substantially reduced while maintaining, or even improving, predictive performance compared to the standard prediction procedure.
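The weight-based view of a forest prediction described above can be sketched with scikit-learn. This is an illustration, not the paper's code: `bootstrap=False` is assumed so that each leaf value is the mean of the full training set's labels in that leaf and the weighted-sum identity holds exactly, and the renormalized top-k sum is one plausible reading of the proposed modified procedure.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy data: hold out one example as the test object.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
X_train, y_train, x_test = X[:199], y[:199], X[199:]

# bootstrap=False (an assumption for this sketch) makes each tree's leaf
# value the mean of the training labels in that leaf, so the prediction is
# exactly a weighted sum of training labels.
rf = RandomForestRegressor(
    n_estimators=50, max_features=0.5, bootstrap=False, random_state=0
).fit(X_train, y_train)

# Leaf indices per tree: shape (n_samples, n_trees).
train_leaves = rf.apply(X_train)
test_leaves = rf.apply(x_test)[0]

# Each training example sharing a leaf with the test object in tree t gets
# weight 1 / (leaf size * n_trees); the weights sum to one.
weights = np.zeros(len(X_train))
for t in range(rf.n_estimators):
    in_leaf = train_leaves[:, t] == test_leaves[t]
    weights[in_leaf] += 1.0 / (in_leaf.sum() * rf.n_estimators)

weighted_pred = weights @ y_train          # scalar product of labels and weights
standard_pred = rf.predict(x_test)[0]      # agrees with weighted_pred

# Modified procedure (hypothetical reading): keep only the k top-weighted
# examples and renormalize their weights.
k = 10
top = np.argsort(weights)[-k:]
approx_pred = (weights[top] / weights[top].sum()) @ y_train[top]
```

The non-zero entries of `weights` identify exactly the training examples that explain the prediction, and `top` gives the reduced explanation set used by the modified procedure.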

Comments: Submitted to 22nd International Symposium on Intelligent Data Analysis, IDA 2024
Categories: cs.LG
Related articles:
arXiv:2407.08649 [cs.LG] (Published 2024-07-11)
Confidence-based Estimators for Predictive Performance in Model Monitoring
arXiv:2402.05007 [cs.LG] (Published 2024-02-07)
Example-based Explanations for Random Forests using Machine Unlearning
arXiv:1907.07207 [cs.LG] (Published 2019-07-16)
Online Local Boosting: improving performance in online decision trees