- Multiple Testing Procedures: R multtest Package and Applications to Genomics
- Katherine S. Pollard, Center for Biomolecular Science and Engineering, University of California, Santa Cruz
- Sandrine Dudoit, Division of Biostatistics, School of Public Health, University of California, Berkeley
- Mark J. van der Laan, Division of Biostatistics, School of Public Health, University of California, Berkeley
- Article comments:
- Published in Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer, 2005 (Chapter 15, pp. 249-271).
- Abstract:
- The Bioconductor R package multtest implements widely applicable
resampling-based single-step and stepwise multiple testing procedures
(MTPs) for controlling a broad class of Type I error rates, in testing
problems involving general data generating distributions (with arbitrary
dependence structures among variables), null hypotheses, and test
statistics.
The current version of multtest provides MTPs for tests concerning means,
differences in means, and regression parameters in linear and Cox
proportional hazards models.
Procedures are provided to control Type I error rates defined as tail
probabilities for arbitrary functions of the numbers of false positives
and rejected hypotheses.
These error rates include tail probabilities for the number of false
positives (generalized family-wise error rate, gFWER) and for the
proportion of false positives among the rejected hypotheses (TPPFP).
Single-step and step-down common-cut-off (maxT) and common-quantile (minP)
procedures that take into account the joint distribution of the test
statistics are proposed to control the family-wise error rate (FWER), that
is, the chance of at least one Type I error.
In addition, augmentation multiple testing procedures are provided to
control the gFWER and TPPFP, based on any initial FWER-controlling
procedure.
The results of a multiple testing procedure can be summarized using
rejection regions for the test statistics, confidence regions for the
parameters of interest, or adjusted p-values.
A key ingredient of our proposed MTPs is the test statistics null
distribution (and estimator thereof) used to derive rejection regions and
corresponding confidence regions and adjusted p-values.
Both bootstrap and permutation estimators of the test statistics null
distribution are available.
The S4 class/method object-oriented programming approach was adopted to
summarize the results of an MTP.
The modular design of multtest allows interested users to readily extend
the package's functionality.
Typical testing scenarios are illustrated by applying various MTPs
implemented in multtest to the Acute Lymphoblastic Leukemia (ALL) dataset
of Chiaretti et al. (2004), with the aim of identifying genes whose
expression measures are associated with (possibly censored) biological and
clinical outcomes.
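To make the single-step maxT and augmentation ideas in the abstract concrete, the following is a minimal Python sketch, not the multtest implementation. It assumes a two-group comparison with Welch t-statistics and a permutation estimate of the null distribution (the abstract notes that a bootstrap estimator is also available). All function names (`welch_t`, `stat_vector`, `maxt_adjusted_pvalues`, `augment`) are hypothetical and are not part of the multtest API; the step-down and minP variants are not shown.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Two-sample Welch t-statistic (unequal variances)."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


def stat_vector(data, labels):
    """Absolute t-statistic for each variable (row of data), given 0/1 labels."""
    stats = []
    for row in data:
        g0 = [v for v, lab in zip(row, labels) if lab == 0]
        g1 = [v for v, lab in zip(row, labels) if lab == 1]
        stats.append(abs(welch_t(g1, g0)))
    return stats


def maxt_adjusted_pvalues(data, labels, n_perm=500, seed=0):
    """Single-step maxT adjusted p-values from a permutation null.

    The same label permutation is applied to every variable, so the
    dependence among variables is carried into the null distribution
    of the maximum statistic.
    """
    rng = random.Random(seed)
    observed = stat_vector(data, labels)
    exceed = [0] * len(data)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        max_null = max(stat_vector(data, perm))
        for j, t in enumerate(observed):
            if max_null >= t:
                exceed[j] += 1
    return [count / n_perm for count in exceed]


def augment(adj_p, alpha, extra):
    """Augmentation sketch for gFWER(k): reject the hypotheses rejected by the
    initial FWER procedure plus the next `extra` most significant ones."""
    order = sorted(range(len(adj_p)), key=adj_p.__getitem__)
    n_initial = sum(p <= alpha for p in adj_p)
    return order[: min(len(adj_p), n_initial + extra)]


# Illustrative simulated data (an assumption, not the ALL dataset):
# 6 variables, 10 samples per group, variable 0 shifted in group 1.
rng = random.Random(1)
labels = [0] * 10 + [1] * 10
data = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(6)]
data[0] = [v + 5.0 * lab for v, lab in zip(data[0], labels)]

adj_p = maxt_adjusted_pvalues(data, labels, n_perm=200)
print(adj_p)
print(augment(adj_p, alpha=0.05, extra=1))
```

Here `augment(adj_p, alpha, extra=k)` returns the indices rejected when allowing up to `k` additional rejections beyond the initial FWER-controlling set, which is the basic idea behind gFWER control by augmentation; the TPPFP case would choose the number of extra rejections from the allowed proportion instead.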
- Subject Area:
- Laboratory and Basic Science Research, Multivariate Analysis, Statistical Theory and Methods, Survival Analysis
- Suggested Citation:
- Katherine S. Pollard, Sandrine Dudoit, and Mark J. van der Laan,
"Multiple Testing Procedures: R multtest Package and Applications to Genomics"
(December 2004).
U.C. Berkeley Division of Biostatistics Working Paper Series.
Working Paper 164.
http://www.bepress.com/ucbbiostat/paper164