- Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data
- Merrill D. Birkner, Division of Biostatistics, School of Public Health, University of California, Berkeley
- Alan E. Hubbard, Division of Biostatistics, School of Public Health, University of California, Berkeley
- Mark J. van der Laan, Division of Biostatistics, School of Public Health, University of California, Berkeley
- Christine F. Skibola, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley
- Christine M. Hegedus, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley
- Martyn T. Smith, Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley
- Article comments:
- Published 2006 in Statistical Applications in Genetics and Molecular Biology 5, article 11.
- Abstract:
- We describe a new data filtering method for SELDI-TOF MS proteomic spectra. We examined technical repeats (two per subject) of intensity versus m/z (mass/charge) from bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the choice of data processing, as well as experimental variability, can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a sequence of processing and multiple testing techniques: 1) correction for background drift; 2) filtering using smooth regression with cross-validated bandwidth selection; 3) peak finding; and 4) correction for multiple testing (van der Laan et al. (2005)). The result is a list of proteins (indexed by m/z) whose average expression differs significantly among disease (or treatment, etc.) groups. The procedures are intended to provide a sensible, statistically driven algorithm, which we argue yields a list of proteins with a significant difference in expression. Provided there are no sources of unmeasured bias (such as confounding of experimental conditions with disease status), proteins found to be statistically significant using this technique have a low probability of being false positives.
- Subject Area:
- General Biostatistics, Statistical Theory and Methods
- Suggested Citation:
- Merrill D. Birkner, Alan E. Hubbard, Mark J. van der Laan, Christine F. Skibola, Christine M. Hegedus, and Martyn T. Smith,
"Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data"
(December 2005).
U.C. Berkeley Division of Biostatistics Working Paper Series.
Working Paper 200.
http://www.bepress.com/ucbbiostat/paper200
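
Steps 2 and 3 of the abstract (smooth regression with cross-validated bandwidth selection, then peak finding) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a smoother, so a Nadaraya-Watson estimator with a Gaussian kernel and leave-one-out cross-validation is assumed here. The function names (`nw_smooth`, `cv_bandwidth`, `find_peaks`), the candidate bandwidths, the height threshold, and the synthetic spectrum are all hypothetical. Background drift correction and multiple testing are not covered.

```python
import numpy as np

def nw_smooth(x, y, h, leave_one_out=False):
    """Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h (assumed smoother)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    if leave_one_out:
        # Exclude each point from its own fit so the error is out-of-sample.
        np.fill_diagonal(w, 0.0)
    return (w @ y) / w.sum(axis=1)

def cv_bandwidth(x, y, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out squared error."""
    scores = [
        np.mean((y - nw_smooth(x, y, h, leave_one_out=True)) ** 2)
        for h in candidates
    ]
    return candidates[int(np.argmin(scores))]

def find_peaks(x, y_smooth, min_height):
    """Return x positions that are strict local maxima of y_smooth above min_height."""
    interior = y_smooth[1:-1]
    is_peak = (
        (interior > y_smooth[:-2])
        & (interior > y_smooth[2:])
        & (interior > min_height)
    )
    return x[1:-1][is_peak]

# Synthetic spectrum (hypothetical): two Gaussian peaks plus noise.
rng = np.random.default_rng(0)
mz = np.linspace(1000.0, 2000.0, 400)
signal = (
    5.0 * np.exp(-0.5 * ((mz - 1300.0) / 15.0) ** 2)
    + 3.0 * np.exp(-0.5 * ((mz - 1700.0) / 15.0) ** 2)
)
intensity = signal + rng.normal(scale=0.3, size=mz.size)

candidates = [2.0, 5.0, 10.0, 20.0, 40.0]  # hypothetical bandwidth grid, in m/z units
h = cv_bandwidth(mz, intensity, candidates)
smoothed = nw_smooth(mz, intensity, h)
peaks = find_peaks(mz, smoothed, min_height=1.0)
print("selected bandwidth:", h)
print("peak locations (m/z):", peaks)
```

The leave-one-out criterion penalizes bandwidths that merely reproduce the noise, which is the usual motivation for choosing the bandwidth by cross-validation rather than by eye. The pairwise weight matrix is quadratic in the number of m/z points, so a real spectrum would likely need a windowed or binned variant.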