Up next


Improved Feature Importance Computation for Tree Models Based on the Banzhaf Value

587 Views
Google TechTalks
1
Published on 07 Apr 2023 / In News & Politics

A Google TechTalk, presented by Piotr Sankowski, 2023-03-30 ABSTRACT: The Shapley value -- a fundamental game-theoretic solution concept -- has recently become one of the main tools used to explain predictions of tree ensemble models. Another well-known game-theoretic solution concept is the Banzhaf value. Although the Banzhaf value is closely related to the Shapley value, its properties w.r.t. feature attribution have not been understood equally well. This paper shows that, for tree ensemble models, the Banzhaf value offers some crucial advantages over the Shapley value while providing similar feature attributions. In particular, we first give an optimal O(TL+n) time algorithm for computing the Banzhaf value-based attribution of a tree ensemble model's output. Here, T is the number of trees, L is the maximum number of leaves in a tree, and n is the number of features. In comparison, the state-of-the-art Shapley value-based algorithm runs in O(TLD^2+n) time, where D denotes the maximum depth of a tree in the ensemble. Next, we experimentally compare the Banzhaf and Shapley values for tree ensemble models. Both methods deliver essentially the same average importance scores for the studied datasets using two different tree ensemble models (the sklearn implementation of Decision Trees or xgboost implementation of Gradient Boosting Decision Trees). However, our results indicate that, on top of being computable faster, the Banzhaf is more numerically robust than the Shapley value. Joint work with A. Karczmarz, A. Mukherjee, P. Wygocki and T. Michalak. About the Speaker: Piotr Sankowski is a professor at the Institute of Informatics, University of Warsaw, where he received his habilitation in 2009 and where he received a doctorate in computer science in 2005. His research interest focuses on practical application of algorithms, ranging from economic applications, through learning data structures, to parallel algorithms for data science. In 2009, Piotr Sankowski received also a doctorate in physics in the field of solid state theory at the Polish Academy of Sciences. In 2010 he received ERC Starting Independent Researcher Grant, in 2015 ERC Proof of Concept Grant, and in 2017 ERC Consolidator Grant. He is a president of IDEAS NCBR – a research and development centre operating in the field of artificial intelligence and digital economy. Piotr Sankowski is also a co-founder of the spin-off company MIM Solutions. A Google Talk Series on Algorithms, Theory, and Optimization

Show more
0 Comments sort Sort By

Up next