arXiv:1903.05179 [stat.ML]
Unbiased Measurement of Feature Importance in Tree-Based Methods
Published 2019-03-12 (Version 1)
We propose a modification that corrects the bias in split-improvement variable importance measures in Random Forests and other tree-based methods. These measures have been shown to inflate the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out-of-sample data, this bias can be corrected, yielding better summaries and screening tools.
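The core idea can be sketched in code. The snippet below is a minimal illustration (not the authors' implementation): it fits a single regression tree, then re-evaluates each split's impurity decrease on held-out data rather than on the training data, accumulating the decreases per feature. The function name `oos_split_importance` and the specific normalization are assumptions for this sketch; scikit-learn's tree internals (`tree_.feature`, `tree_.threshold`, `decision_path`) are used to walk the fitted splits.

```python
# Sketch: out-of-sample split-improvement importance for a fitted sklearn tree.
# Training-data impurity importances are biased toward high-cardinality features;
# recomputing the impurity decreases on held-out data removes that optimism.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

def oos_split_importance(tree, X_test, y_test):
    """Per-feature sum of held-out SSE reductions over the tree's splits."""
    t = tree.tree_
    # node_indicator[i, j] != 0 iff test sample i passes through node j
    node_indicator = tree.decision_path(X_test)
    importance = np.zeros(X_test.shape[1])

    def sse(y):  # sum of squared errors around the mean (variance impurity)
        return float(((y - y.mean()) ** 2).sum()) if len(y) else 0.0

    for node in range(t.node_count):
        feat = t.feature[node]
        if feat < 0:  # leaf node, no split
            continue
        mask = node_indicator[:, node].toarray().ravel().astype(bool)
        y_node = y_test[mask]
        if len(y_node) == 0:
            continue
        left = X_test[mask, feat] <= t.threshold[node]
        # Held-out impurity decrease; can be negative for spurious splits,
        # which is what drives irrelevant features toward zero importance.
        importance[feat] += sse(y_node) - sse(y_node[left]) - sse(y_node[~left])

    total = importance.sum()
    return importance / total if total > 0 else importance

# Example: only feature 0 is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] + rng.normal(size=500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_tr, y_tr)
imp = oos_split_importance(model, X_te, y_te)
```

For a Random Forest, the same correction would be applied per tree (e.g. on each tree's out-of-bag samples) and averaged, but a single tree suffices to show the mechanism.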
Related articles:
arXiv:2106.08217 [stat.ML] (Published 2021-06-15)
RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests
arXiv:2204.13916 [stat.ML] (Published 2022-04-29)
A study of tree-based methods and their combination
arXiv:2310.18814 [stat.ML] (Published 2023-10-28)
Stability of Random Forests and Coverage of Random-Forest Prediction Intervals