Selecting Instrumental Variables in a Data Rich Environment

Serena Ng, Columbia University
Jushan Bai, New York University

Abstract

Practitioners often have at their disposal a large number of instruments that are weakly exogenous for the parameter of interest. However, not every instrument has the same predictive power for the endogenous variable, and using too many instruments can induce bias. We consider two ways of handling these problems. The first is to form principal components from the observed instruments, and the second is to reduce the number of instruments by subset variable selection. For the latter, we consider boosting, a method that does not require an a priori ordering of the instruments. We also suggest a way to pre-order the instruments and then screen the instruments using the goodness of fit of the first stage regression and information criteria. We find that the principal components are often better instruments than the observed data except when the number of relevant instruments is small. While no single method dominates, a hard-thresholding method based on the t test generally yields estimates with small biases and small root-mean-squared errors.
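As a rough illustration of two ideas named in the abstract, the sketch below screens instruments by hard thresholding on first-stage t statistics and, separately, forms principal components from the full instrument set. It is not the authors' code. The simulated design (500 observations, 50 candidate instruments of which 5 are relevant, a true coefficient of 1), the critical value of 1.96, and the function names t_stats, hard_threshold, principal_components, and tsls are all assumptions made for this example.

```python
import numpy as np


def t_stats(x, Z):
    """Absolute t statistic from regressing x on each column of Z separately.

    Both x and Z are demeaned first, so each regression is a simple
    one-regressor fit with an implicit intercept.
    """
    x = x - x.mean()
    Z = Z - Z.mean(axis=0)
    n = len(x)
    szz = (Z ** 2).sum(axis=0)
    beta = Z.T @ x / szz                      # one slope per instrument
    resid = x[:, None] - Z * beta             # residuals, column by column
    s2 = (resid ** 2).sum(axis=0) / (n - 2)   # residual variance per regression
    return np.abs(beta) / np.sqrt(s2 / szz)


def hard_threshold(x, Z, crit=1.96):
    """Indices of instruments whose first-stage |t| exceeds crit (assumed cutoff)."""
    return np.flatnonzero(t_stats(x, Z) > crit)


def principal_components(Z, k):
    """First k principal components of the standardized instruments."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    U, S, _ = np.linalg.svd(Zs, full_matrices=False)
    return U[:, :k] * S[:k]


def tsls(y, x, W):
    """Two-stage least squares for one endogenous regressor x, instruments W."""
    y = y - y.mean()
    x = x - x.mean()
    W = W - W.mean(axis=0)
    x_hat = W @ np.linalg.lstsq(W, x, rcond=None)[0]  # first-stage fitted values
    return (x_hat @ y) / (x_hat @ x)


# Hypothetical simulated design, chosen only to exercise the functions.
rng = np.random.default_rng(0)
n, N, n_relevant = 500, 50, 5
Z = rng.normal(size=(n, N))
v = rng.normal(size=n)
u = 0.8 * v + 0.6 * rng.normal(size=n)        # u correlated with v: x is endogenous
x = Z[:, :n_relevant] @ np.full(n_relevant, 0.5) + v
y = 1.0 * x + u                               # true coefficient is 1

selected = hard_threshold(x, Z)
beta_ht = tsls(y, x, Z[:, selected])          # 2SLS with the screened instruments
pcs = principal_components(Z, k=5)
beta_pc = tsls(y, x, pcs)                     # 2SLS with principal components
print("selected instruments:", selected)
print("2SLS (hard threshold):", beta_ht, " 2SLS (principal components):", beta_pc)
```

In this toy design the instruments are independent, so the principal components have no common-factor structure to exploit. That corresponds to the case the abstract singles out, where the number of relevant instruments is small and the observed instruments may do better than the components.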

Recommended Citation

Ng, Serena and Bai, Jushan (2009) "Selecting Instrumental Variables in a Data Rich Environment," Journal of Time Series Econometrics: Vol. 1 : Iss. 1, Article 4.
DOI: 10.2202/1941-1928.1014
Available at: http://www.bepress.com/jtse/vol1/iss1/art4

ISSN: 1941-1928 ©1999-2009 The Berkeley Electronic Press™ All rights reserved.
