- Statistical Inference for Variable Importance
- Article comments:
- Published 2006 in International Journal of Biostatistics, Vol. 2, Issue 1.
- Abstract:
- Many statistical problems involve learning the importance/effect
of a variable for predicting an outcome of interest, based on a
sample of n independent and identically distributed observations
on a list of input variables
and an outcome. For example, though prediction/machine learning
is, in principle, concerned with learning the optimal unknown
mapping from input variables to an outcome from the data, the
typical reported output is a list of importance measures for each
input variable. The typical approach in prediction has been to
learn the unknown optimal predictor from the data and derive, for
each of the input variables, the variable importance from the
obtained fit. In this article we propose a new approach which
involves, for each variable separately, 1) carefully defining the
desired variable importance as a real-valued parameter, 2) deriving
the efficient influence curve, and thereby the optimal estimating
function, for this parameter in the assumed (possibly
nonparametric) model, and 3) developing a corresponding locally
efficient estimator of this variable importance, obtained by
substituting data-adaptive estimators for the nuisance parameters
in the optimal estimating function. We illustrate this methodology
in the context of prediction, and obtain in this manner locally
optimal estimators of marginal variable importance and
covariate-adjusted variable importance, accompanied by p-values
and statistical inference. We also propose a road map for
statistical analysis based on this approach. Finally, we
generalize this methodology to variable importance parameters for
time-dependent variables.
- Subject Area:
- General Biostatistics, Multivariate Analysis
- Suggested Citation:
- Mark J. van der Laan,
"Statistical Inference for Variable Importance"
(August 2005).
U.C. Berkeley Division of Biostatistics Working Paper Series.
Working Paper 188.
http://www.bepress.com/ucbbiostat/paper188
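
The three steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the variable A is binary, the single covariate W is discrete, and the importance parameter is chosen as psi = E[E(Y|A=1,W) - E(Y|A=0,W)]. For that parameter, the standard efficient influence curve in the nonparametric model is (A/g(W) - (1-A)/(1-g(W))) (Y - Q(A,W)) + Q(1,W) - Q(0,W) - psi, where Q(a,w) = E(Y|A=a,W=w) and g(w) = P(A=1|W=w) are the nuisance parameters. The function names `variable_importance` and `two_sided_p_value` are hypothetical, and cell means stand in for the data-adaptive nuisance estimators.

```python
import math
import numpy as np

def variable_importance(y, a, w):
    """Estimate psi = E[E(Y|A=1,W) - E(Y|A=0,W)] and its standard error.

    Illustrative assumptions: A is binary (0/1), W is discrete, and every
    level of W contains observations with A=0 and with A=1.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=int)
    w = np.asarray(w)

    q1 = np.empty_like(y)  # estimate of Q(1, W) = E(Y | A=1, W)
    q0 = np.empty_like(y)  # estimate of Q(0, W) = E(Y | A=0, W)
    g1 = np.empty_like(y)  # estimate of g(W) = P(A=1 | W)
    for level in np.unique(w):
        idx = w == level
        q1[idx] = y[idx & (a == 1)].mean()
        q0[idx] = y[idx & (a == 0)].mean()
        g1[idx] = a[idx].mean()

    q_obs = np.where(a == 1, q1, q0)
    weight = a / g1 - (1 - a) / (1 - g1)

    # Estimating function with nuisance estimates plugged in; its sample
    # mean solves the estimating equation for psi.
    d = weight * (y - q_obs) + q1 - q0
    psi = d.mean()

    # Standard error from the sample variance of the estimated influence curve.
    se = (d - psi).std(ddof=1) / math.sqrt(len(y))
    return psi, se

def two_sided_p_value(psi, se):
    """Normal-approximation p-value for the null hypothesis psi = 0."""
    z = abs(psi) / se
    return math.erfc(z / math.sqrt(2.0))

# Usage on simulated data (the simulation design is also an assumption).
rng = np.random.default_rng(0)
n = 500
w_sim = rng.integers(0, 2, size=n)
a_sim = rng.binomial(1, 0.3 + 0.4 * w_sim)
y_sim = 1.0 * a_sim + 0.5 * w_sim + rng.normal(size=n)
psi_hat, se_hat = variable_importance(y_sim, a_sim, w_sim)
print(psi_hat, se_hat, two_sided_p_value(psi_hat, se_hat))
```

With a continuous or high-dimensional W, the cell means would be replaced by fitted regressions for Q and g; the estimating-equation step and the influence-curve-based standard error would keep the same form.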